Encyclopedia of Virology, Volume 4 Viruses as Infectious Agents: Bacterial, Archaeal, Fungal, Algal, and Invertebrate Viruses [4 ed.] 0128234083, 9780128234082


English Pages [975] Year 2021

Table of contents :
Cover
ENCYCLOPEDIA OF VIROLOGY FOURTH EDITION
EDITORS IN CHIEF
EDITORIAL BOARD
SECTION EDITORS
FOREWORD
PREFACE
HOW TO USE THE ENCYCLOPEDIA
LIST OF CONTRIBUTORS
CONTENT OF ALL VOLUMES
BACTERIAL VIRUSES
History of Virology: Bacteriophages
Discovery of Phages
Phage Therapy
Phage Typing
Phage Genetics
Phage Biochemistry
Phage and Biotechnology
Phage Ecology and Evolution
Conclusion
Further Reading
Icosahedral Phages – Single-Stranded DNA (φX174)
Glossary
History
Virion Morphology and Genome Content
Host Cell Recognition, Attachment, Eclipse and Penetration
DNA Replication
Gene Expression
Morphogenesis
DNA Packaging and the DNA Binding Protein
Lysis
Evolution and Evolutionary Studies
Evolution of a Two Scaffolding Protein System
φX174-Like Viruses as a Model System for Experimental Evolution
Further Reading
Single-Stranded RNA Bacterial Viruses
Glossary
Icosahedral ssRNA Bacterial Viruses
Background
Phage lifecycle
Phage Structure and Assembly
Biotechnological Applications
Further Reading
Enveloped Icosahedral Phages – Double-Stranded RNA (φ6)
Glossary
Introduction
The Cystoviridae Family
Classification and Host Range
Virion Structure and Properties
Physico-Chemical Properties
Architecture of the Polymerase Complex and the Nucleocapsid
The Protein Components of the Cystovirus Portal Complex and Nucleocapsid
P1
P2
P4
P7
P8
Membrane Envelope
Host Cell Attachment Complex
Genome Organization and Sequence Similarity
Replication Cycle
Plasma Membrane Penetration and NC Entrance into the Cytoplasm
Transcription
PC Assembly and Packaging
NC maturation
Membrane acquisition and host cell lysis
Recombination and Reassortment
Carrier State
Reverse Genetics and Self Assembly
Applications and Intellectual Property
Further Reading
Relevant Websites
Membrane-Containing Icosahedral DNA Bacteriophages
Glossary
Introduction
Virion Structure and Properties
Overall Structure
Membrane and DNA
Life Cycle
Receptor Recognition and DNA Delivery
Genome Replication
Particle Assembly
Cell Lysis
Genomes and Genomics
Acknowledgment
Further Reading
Relevant Websites
Tailed Double-Stranded DNA Phages
Glossary
Introduction
The Structure of Tailed dsDNA Bacteriophages
Bacteriophage Structure and Function
Capsids
Tails and Tail Fibers
Temperate Versus Virulent Bacteriophages
Injection
DNA Injection
Protein Injection
Host Interactions and Regulation of Gene Expression
Classes of Genes
Early Genes
Late Genes
Virion Assembly
Assembly Pathways
Tail Assembly
Capsid Assembly
Lysis
Genomes and Genomics
Chromosome Diversity and Replication
Diversity in Genome Size and Organization
Common Themes in Genome Structure
Horizontal Exchange of Genes is Widespread
Common Ancestry
Further Reading
Relevant Websites
Helical and Filamentous Phages
Glossary
Introduction
Biotopes: Prolific, Temperate and Pathogenic
Structure of Filamentous Phages
Infection Process
Replication by the Rolling Circle Mechanism
Biosynthesis of M13 (fd, f1)
The Coat Proteins
The Major Coat Protein gp8
The Minor Coat Proteins gp7 and gp9
The Minor Coat Proteins gp3 and gp6
Assembly and Secretion
Biotechnological Applications of Filamentous Phage
Further Reading
Replication of Bacillus Double-Stranded DNA Bacteriophages
Glossary
Introduction
DNA Replication of SPO1-Like Viruses
DNA Replication of SPP1-Like Viruses
DNA Replication of phi29-Like Viruses
References
Relevant Websites
Lytic Transcription
Glossary
Introduction
Transcription Initiation and Elongation by Host RNA Polymerase
Developmental Pathways and Transcription Control by T7, T4, and N4
T4
T7
Early transcription
Phage RNAP and late gene transcription
The transcription cycle
Elongation
Termination
N4
Inhibition of Host RNAP by T4, T7, and N4, and Other Members of Those Families
Summary
Further Reading
Lysogeny
Glossary
Introduction
Why Lysogeny?
Persistence of DNA
Integration Into the Chromosome
Extrachromosomal Persistence of DNA
Maintenance of the Lysogenic State
Bacteriophage λ
Bacteriophage 186
Integration-Dependent Bacteriophage Immunity
Use of a Host Protein as an Immunity Repressor
RNA to Maintain Lysogeny
Prophage Induction
Evolutionary and Phenotypical Effects of Lysogeny
Lysogenic Conversion
Gene Disruption
Genomic Rearrangement
Conclusion
Acknowledgments
Further Reading
Relevant Websites
Decision Making by Temperate Phages
Glossary
Introduction
The Post-Infection Decision of Phage Lambda
Counting by Infecting Phages
The View From the Single Cell
The Decision to Remain Dormant
Conclusion
Acknowledgments
Further Reading
Mobilization of Phage Satellites
Glossary
Introduction
Genetic Organization
Integration and Excision, Gene Induction, and Replication
Packaging and Horizontal Transmission
Strategies PICIs Use to Interfere With Their Helper Phages
Prevalence of Phage Satellites
Counter-Evolution by Phages to Avoid Parasitism by Satellites
Consequences of Phage Satellites in Human Health and Disease
Conclusion
Further Reading
Portal Vertex
Glossary
Introduction
Formation of a Prohead
DNA Packaging into the Prohead
Virion Maturation
The Structural Features of the Portal Proteins
Functions of the Portal Proteins
The Viral Portal Protein as a Nucleator in the Assembly of the Prohead With Proper Morphology
Portal Proteins Provide Binding Sites for Terminases
Portal Proteins Bound With Adapter Proteins Facilitate Tail Attachment
Portal Proteins Help to Spool Condensed Genomic DNA in the Capsid to Form a Highly Ordered Structure
Portal Proteins as a Check-Valve Preventing DNA From Slipping Out
Portal Proteins as a Sensor to Control the DNA Packaging
The Portal Proteins Actively Assist DNA Packaging Coupling With ATP Hydrolysis
The terminase-driven mechanism plays a key role at the initial stage of DNA packaging
The portal proteins actively assist DNA packaging at the late stage of DNA packaging
The portal directly-mediated packaging mode
Rotation of the portal protein may apply forces to DNA
Lengthwise channel expansion and contraction in the portal protein may apply forces to DNA
Sequential movement of tunnel loops may apply force to DNA
The portal indirectly-mediated packaging mode
DNA translocation driven by the energy from the DNA deformation in the portal protein
DNA translocation driven by a Brownian motor inherent in the portal protein
Main Techniques Used to Study Portal Protein Structure and Function
Biochemical methods
Techniques for structural analysis
The optical tweezers: a technique to observe DNA packaging in real time
Molecular dynamics simulation
Planar bilayer membrane (BLM) technology
Concluding Remarks
Further Reading
Outline placeholder
Glossary
Introduction
Major Capsid Protein Before Capsid Assembly
Prohead Assembly
Scaffolding Protein
Accelerating assembly
Promote correct geometry
Delay MCP transformation
Exclude host factors
Scaffold involvement in portal incorporation
The Portal
Structure of the Prohead
Stability of the Prohead
Assembly Parasites
Further Reading
Enzymology of Viral DNA Packaging Machines
Glossary
Introduction
Unit Length Versus Headful Packaging Mechanisms
The Viral Terminase Enzymes
The TerS Subunits
The TerL Subunits
The Terminase Holoenzymes
The Lambda System
The Terminase Enzyme of Phage Lambda
The lambda TerS Subunit
The TerL DNA Packaging Domain
The lambda TerL Maturation Domain
Escherichia coli Integration Host Factor
The Genome Maturation Complex
The Lambda Cohesive End Site and Assembly of the Genome Maturation Complex
Model for the Assembly of the Genome Maturation Complex
Genome Maturation: The cos-Cleavage Reaction
Genome Maturation Summary
The Genome Packaging Complex
Assembly of the Packaging Motor
cos-Clearance: Transition to a Translocating Motor
A Nucleotide Switch for cos-Clearance?
The Translocating Motor
Structure of the Translocating Motors
Coordinated ATP Hydrolysis by the Terminase Motors
Mechanochemical Coupling and DNA Translocation
Termination of Packaging
Unit Length Packaging Motors
Terminase Ejection and Virion Completion
Conclusion
Further Reading
DNA Packaging: DNA Recognition
Glossary
Introduction
Genome Replication and Selection Mechanisms
Monomeric replication products
Concatemeric replication products
The Shape of DNA
Protein-DNA Binding
A Question of Fidelity: Sequence Specific and Non-Specific Binding
DNA Binding Domains
Strength in Numbers: Multiple Binding Events Regulate Complex Assembly
Recognition of Viral Genomic DNA
Monomeric Genomes and Terminal Proteins
Concatemeric DNA and Terminase Proteins
Direct terminal repeats: T3 and T7
Cohesive ends: λ-like
Genome recognition in headful bacteriophages
DNA Wrapping/Bending is a Common Feature in Selective Packaging
Further Reading
Relevant Websites
DNA Packaging: The Translocation Motor
Glossary
Introduction
Terminase and Phi29 Motors are Branches Within the ASCE Family of ATPases
Phi29 Motors
Terminase Motors
The “Inchworm” Translocation Model
The “Lever” Translocation Model
The DNA “Crunching” or “Compression” Translocation Model
Termination of Packaging
Outlook and Future Studies
Further Reading
Biophysics of DNA Packaging
Glossary
Introduction
Molecular Structure
DNA Packaging Characteristics
Packaging Speed
Force Generation
Pausing and Slipping
DNA Packaging Mechanisms
Mechanochemical Coupling
Motor-substrate Interactions
Force Generation
Motor Coordination
Further Reading
Energetics of the DNA-Filled Head
Glossary
Introduction
Equilibrium Energy and Structure of Intracapsid DNA
Metastability of Intra-Capsid DNA Facilitating or Inhibiting Viral DNA Ejection
The Mobility of Packaged Genome Controls Ejection Dynamics
Pressure-Driven Release of Viral Genome Into a Host Cell is a Mechanism Leading to Infection
Concluding Remarks
Further Reading
Bacteriophage Receptor Proteins of Gram-Negative Bacteria
Glossary
History and Overview
Techniques and Methods for Studying Phage Receptors
Properties of Receptor Proteins
Ecology and Evolution of Phage-Host Interactions
Relevance of Phage-Receptor Interactions to Phage Therapy
Summary
Appendix: Methods for Receptor Identification or Characterization
Acknowledgments
Further Reading
Relevant Websites
Tail Structure and Dynamics
Glossary
Introduction
Caudovirales
Organization and Assembly of Long Tails
The Tube
The Sheath
Baseplates of Long Contractile Tails
Tail Tip Complex of Long Non-Contractile Tails
Short Non-Contractile Tails: Tube-Like Tail Structure
The Tailspikes, Tail Fibers and Phage-Host Interaction
Acknowledgments
Further Reading
Relevant Website
Bacteriophage Tail Fibres, Tailspikes, and Bacterial Receptor Interaction
Glossary
Introduction
Structures of Bacteriophage Tail Fibers
Structures of Bacteriophage Tailspikes and Other Receptor-Binding Proteins
Host Cell Recognition
Conclusion and Perspectives
Acknowledgments
Further Reading
Phage Genome and Protein Ejection In Vivo
Glossary
Protein Ejection and Trans-Envelope Channel Formation
Proteins are Ejected into the Host Cell at the Initiation of Infection
Tail-Less Phages
Short-Tailed Phages (Podoviridae)
Non-Contractile Long-Tailed Phages (Siphoviridae)
Contractile Long-Tailed Phages
Factors Affecting Phage Genome Translocation
Transcription-Mediated Genome Internalization In Vivo
Non-Transcriptional Genome Internalization In Vivo
Models of Phage Genome Ejection In Vivo
Acknowledgments
See also
Further Reading
Dealing With the Whole Head: Diversity and Function of Capsid Ejection Proteins in Tailed Phages
Glossary
Introduction
Tailed Phage Capsids With Low Numbers of E Proteins in Specific Locales
Low copy number E proteins with specific locales in the capsid
Podoviral E proteins
Capsids With Large Numbers of E Proteins Distributed Throughout the DNA
Capsids With Large Numbers of Co-Localized E Proteins
The Enigmas of E Proteins
Phages with yet to be identified E proteins
What mechanisms ensure encapsidation of E proteins?
Impacts of E proteins on capsid DNA packaging and ejection
Impacts of E proteins on DNA ejection
Potential of E protein delivery from capsid derived nanocontainers
Conclusions
References
Jumbo Phages
Glossary
Introduction
History
Isolation and Characterization
Virion Structure
Morphotypes
Icosahedral Geometries
Virion DNA Density
Head and Tail Fibers
Genome Features
Terminal Redundancies
Genome Nucleotide Composition
Nucleotide Modifications and Substitutions
ORFan Genes
Transfer RNA Genes
Advanced Capabilities
RNA Polymerases
DNA Repair Enzymes
NAD+ Salvage Pathway
LPS Biosynthesis
Phage Nucleus
Evolution
Concluding Remarks
Further Reading
CRISPR-Cas Systems and Anti-CRISPR Proteins: Adaptive Defense and Counter-Defense in Prokaryotes and Their Viruses
Glossary
The Phage-Host Arms Race
CRISPR-Cas, A Brief History
CRISPR-Cas Diversity and Classification
Mechanisms of CRISPR-Cas Immunity
Adaptation
Defense
Anti-CRISPR Proteins
Prophage-Dependent Functional Approach
Guilt-By Association Bioinformatic Approach
Self-Targeting Bioinformatic Approach
Lytic Phage Dependent Functional Approach
Concluding Thoughts
Further Reading
Relevant Website
Bacteriophage: Therapeutics and Diagnostics Development
Glossary
Introduction
The Development of Phage as Therapeutics for Bacterial Infections
The Development of Phage as Diagnostics for Bacterial Infections
Major Advantages of Phage Technology
Specificity
Self-Replication
Major Limitations of Phage Technology
Narrow Host Range
Phage Resistance
Additional Challenges for Phage Therapy
Preclinical Data Translation
Safety and Efficacy
Dosing Strategies
Chemistry, Manufacturing, and Controls (CMC)
Regulatory Pathway
Environmental Impact
Additional Challenges for Phage Diagnostics
Phage Manipulation
Multiplex Testing
Intracellular Bacteria
Sample Interference
Market Acceptance
Conclusions
Further Reading
Bacteriophage Vaccines
Introduction
Architecture of Phage T4
Phage T4 as a Vaccine Platform
In Vivo VLP Assembly
In Vitro VLP Assembly
Phage T4 VLP Vaccines
Other Phage Vaccine Platforms
Conclusion Statement
Acknowledgments
Further Reading
Relevant Website
Bacteriophage Diversity
Glossary
Introduction
Understanding the Nature of Tailed Phage Diversity and Phage Classification
Strategies for Studying Phage Diversity
Examples of Phage Diversity
Top Down Studies
Human phageome
Marine phageome
Bottom Up Studies
Two large diversity studies
The Enterobacteriales tailed phage example
Contribution of Prophages to Phage Diversity
Phage Diversity and Horizontal Exchange of Genetic Information
The Nature of Horizontally Exchangeable Mosaic Section Alleles
How did Genome Mosaicism Arise?
Genetic Exchange Among Superclusters
Phage-host Relationships and Phage Diversity
Phage Cluster-Host Species Relationships With the Enterobacteriales
Narrow and Wide Host Range Phages
Summary
Further Reading
Relevant Websites
Genetic Mosaicism in the Tailed Double-Stranded DNA Phages
Glossary
Introduction
Evidence for Genome Mosaicism Before the Advent of DNA Sequencing
DNA Sequencing Illuminates Properties of Bacteriophage Genomes
Comparative Genomics of Complete Bacteriophage Sequences Shows Diversity and Mosaicism
Mosaicism in the Siphoviridae
Mosaicism in the Myoviridae
Mosaicism in the Podoviridae
Clusters and Superclusters: Large Scale Comparative Genome Analysis
Different Phage Clusters Acquire New Genes at Different Rates
Mosaicism, Hybrids, and Single Gene Surveys
Further Reading
Relevant Websites
Bacteriophages of the Human Microbiome
Glossary
Introduction
Human Phageome Distribution and Composition
Dynamics and Implications of the Human Phage Community
Clinical Utility of Bacteriophages of the Human Microbiome
Concluding Remarks
Further Reading
Bacteriophage: Red Recombination System and the Development of Recombineering Technologies
Glossary
Introduction
Bacteriophage
Homologous Recombination Systems in E. coli and λ Phage
Biological Roles of the λ Red System
Role in DNA Replication
Genetic Exchange Among Similar Phages – Generating Genetic Diversity
DNA Repair
Classical Models of λ Recombination
Development of Recombineering
Recombineering With dsDNA
Markerless Gene Deletions and Counterselection Schemes
Recombineering With ssDNA
In vivo Cloning
Mechanism of Oligo-Mediated Recombineering
Mechanism of dsDNA Recombineering
Recombineering in Pathogenic Bacteria
Combining Recombineering With Other Gene Modification Technologies
Recombinase-Independent Recombination
Final Thoughts
Further Reading
Nanotechnology Application of Bacteriophage DNA Packaging Nanomotors
Glossary
Introduction
The Revolving Mechanism of Biomotors in Bacteriophages
The Revolution Mechanism Without Rotation Common to the dsDNA Packaging Motors of all the dsDNA Bacteriophages
Revolving Mechanisms Determined by Motor Channel Sizes
Revolving Motors Distinguished by Chirality
Special Aspects of Revolving Motor Actions
The Application of Bacteriophage DNA Packaging Motors in Single Pore Sensing of DNA, RNA, Chemicals, and Proteins
Phi29 Connector
SPP1 Connector
T7 Connector
The Mechanism of Single Pore Sensing
Applications of the Biological Nanopore Channel as a Conduit for Single Pore Sensing
Summary
Acknowledgments
Further Reading
General Ecology of Bacteriophages
Glossary
Introduction
Phage Existence Within Environments
Overview
Phage Isolation and Host Range
Microscopic Determination
Phage Ecology
Overview
Phage Organismal Ecology
Bacteria-like versus virion-emphasizing modes of existence
Phage Population Ecology
Phage Community Ecology
Phage Ecosystem Ecology
Further Reading
Relevant Websites
Marine Bacteriophages
Glossary
A Short Introduction to Marine Phages
Morphological Diversity of Marine Phages
Genomic Diversity of Marine Phages and How do we Study it
Phage Marker Genes
Viral Metagenomics (Viromics)
Viral Contigs From Fosmid Libraries
Single Virus Genomics
Long Read Viromics
Metagenome Assembled Viral Genomes (MAVGs)
Linking Phage Genomes With Host Identity
ssDNA Phages in the Marine Environment
Cultivated Phages Infecting Main Groups of Marine Bacteria
Phages of Marine Cyanobacteria - Cyanophages
Phages of unicellular cyanobacteria
Phages of filamentous cyanobacteria
Phages of Marine Alphaproteobacteria
Phages of marine Pelagibacterales
SAR116 phages
Phages of marine Rhodobacteraceae - Roseophages
Phages of Marine Gammaproteobacteria
Vibriophages
Pseudoalteromonas phages
Phages of Marine Bacteroidetes
Marine Phage Ecology
Phage Micro- and Macro-diversity in the Marine Environment
Marine Phages as Factors Driving Bacterial Mortality and Diversity
Diel Rhythms of Phage Infections in the Marine Environment
Auxiliary Metabolic Genes in Marine Phages
Transfer RNAs (tRNAs) in Phages and Their Role in Phage-Host Interactions
Conclusions
References
Further Reading
Ecology of Phages in Extreme Environments
Glossary
Introduction
Phages in Hypersaline Environments
Phages in Thermal Environments
Terrestrial Hot Springs
Deep-Sea Hydrothermal Vents
Hot Deserts
Phages in Polar and Other Cold Environments
Polar Oceans
Glaciers
Polar Lakes
Sea Ice
Permafrost
Other Cold Environments
Atmosphere
Conclusions
Further Reading
Diversity of Hyperthermophilic Archaeal Viruses
Glossary
Introduction
Morphology and Structure
Viruses with particular morphologies
Family Ampullaviridae (from Latin ampulla for “bottle”)
Family Bicaudaviridae (from Latin bi, “two”, and cauda for “tail”)
Family Clavaviridae (from Latin clava for “club”, “stick”)
Family Fuselloviridae (from Latin fusello for “little spindle”)
Family Guttaviridae (from Latin gutta for “droplet”)
Family Spiraviridae (from Latin spira for “coil”)
Spherical viruses
Family Globuloviridae (from Latin globulus for “small ball”)
Family Ovaliviridae (from Latin ovalis for “oval”)
Family Portogloboviridae (from Latin porto for “to bear”, and globus for “ball”)
Family Turriviridae (from Latin turris for “tower”)
Filamentous viruses
Proposed class Tokiviricetes: order Ligamenvirales: family Rudiviridae (from the Latin rudis for “small rod”) order...
General comments
Genomes
Family Fuselloviridae
Order Ligamenvirales
Evolutionary relationships
Further Reading
ARCHAEAL VIRUSES
Euryarchaeal Viruses
Glossary
Introduction
Icosahedral Tailed Viruses
Icosahedral Tailed Viruses of Halophilic Euryarchaea
Tailed Viruses of Methanogenic Euryarchaea
Icosahedral Internal Membrane-Containing Viruses
Pleomorphic, Spindle-Shaped, and Spherical Viruses
Pleomorphic Viruses
Spindle-Shaped Viruses
Spherical Virus MetSV
Culture-Independent Studies
Conclusions
Further Reading
Relevant Websites
Vesicle-Like Archaeal Viruses
Glossary
Introduction
An Overview of the Virion Structure and Viral Life Cycle
Genomic Characteristics
Gene Content and Conserved Genes
Family Pleolipoviridae: Related Viruses With Different Genome Types
New Isolates and Haloarchaeal Genomic Regions
Stability of Pleolipovirus Infectivity
Salinity
Temperature and Other Significant Factors
Structural Proteins
The Spike Protein
Further Reading
Relevant Website
Virus-Host Interactions in Archaea
Glossary
Introduction
Virus Entry
Interaction With Cellular Appendages
Interaction With Cell-Surface
Kinetics of Virus Entry
Genome Replication
Genome Integration
Transcription
Transcription Regulators
Transcriptional Control
Virion Egress
Cell Membrane Disruption
Viral Release Without Membrane Disruption
Antiviral Defense and Viral Counterdefense Mechanisms
Antiviral Defense Mechanisms
CRISPR-Cas systems
Toxin-antitoxin system
Counterdefense Mechanisms
Conclusions
Further Reading
Antiviral Defense Mechanisms in Archaea
Glossary
Introduction
CRISPR-Cas: The Prokaryotic Adaptive Immune System
Spacer Acquisition and its Regulation in Archaea
Expression of CRISPR Loci and crRNA Biogenesis
Archaeal Type I CRISPR-Cas Systems
Archaeal Type III CRISPR-Cas Systems
Novel Archaeal CRISPR-Cas Systems
Anti-CRISPR Proteins of Archaeal Viruses
Archaeal Innate Antiviral Systems
Future Perspectives
Further Readings
Discovery of Archaeal Viruses in Hot Spring Environments Using Viral Metagenomics
Introduction
Sample Collection and Processing
Next Generation Sequencing of Samples
Post Sequence Analysis
What Have we Learned?
Looking Forward
Further Reading
Metagenomes of Archaeal Viruses in Hypersaline Environments
Glossary
Introduction
How can Archaeal Viruses be Studied Using Viral Metagenomes?
What Viral Metagenomes are Available From Hypersaline Environments and What are Their General Characteristics?
Viruses of Haloquadratum walsbyi
Viruses of the Nanohaloarchaea
Concluding Remarks
Further Reading
Extreme Environments as a Model System to Study How Virus–Host Interactions Evolve Along the Symbiosis Continuum
Glossary
Aspects of Viral Fitness
Environment Impact on Host-Virus Interaction
Viruses of Acidic Hot Springs
Complexity of Host Virus-Interactions
How do Ecology and Evolution Shape Archaeal Viruses?
How do Virus Interactions Impact Ecology and Evolution?
Acknowledgments
References
Further Reading
FUNGAL VIRUSES
An Introduction to Fungal Viruses
Glossary
Introduction
Biological Properties
Host Range
Symptom Expression
Transmission
Mixed Infections
Fungal Virus Taxonomy
dsRNA Mycoviruses
Single-Stranded RNA Viruses
Unassigned ssRNA Viruses
Replication and Gene Expression Strategy
Recent Technical Advances in Fungal Virology
Future Perspectives
Cross-Kingdom Infection by Viruses and Viroids
Yeast as a (Model) Host to Study Viral Replication
Virus Neo-Lifestyles
Host Defense Against Mycoviruses and Their Counter-Defense
Role of Mycoviruses in Plant-Fungal Mutualistic Associations
Mycovirus as Biocontrol Agents and as Tools for Fundamental Studies
Further Reading
Relevant Websites
Cross-Kingdom Virus Infection
Introduction
Viruses That Replicate in Both Plants and Insects
Closely Related Viruses Independently Infect Plants and Fungi
Artificially Established Viral Cross Infections Between Plants and Fungi
Evidence of the Natural Transmission of a Plant Virus to a Fungus
A Fungal DNA Virus That Replicates in an Insect and Uses It as a Vector
The Contribution of Cross-Kingdom Viral Infection to Virus Evolution
Concluding Remarks
Further Reading
Diversity of Mycoviruses in Aspergilli
Glossary
Introduction
Prevalence
Transmission
Genomes
The Aspergillus foetidus Mycovirus Complex
Partitiviruses
Chrysoviruses
Narnaviruses and Mitoviruses
Polymycoviruses
Phenotypes
Further Reading
Evolution of Mycoviruses
Glossary
Introduction
Evolutionary History and Drivers of Evolution
Effects of Host Divergence in the Evolution
Coinfection and Evolution
Structural Evolution of Mycoviruses
Further Readings
Relevant Website
Mixed Infections of Mycoviruses in Phytopathogenic Fungus Sclerotinia sclerotiorum
Glossary
Introduction
Strain Ep-1PN Harbors Two Positive-Single Stranded RNA Mycoviruses
Strain SX247 is Co-Infected With Two +ssRNA Mycoviruses
Strain AH98 is a Mix-Infection by Two Unrelated Single-Stranded RNA Mycoviruses
The Hypovirulent Strain SZ-150 Contains Three Mycoviruses and a Satellite RNA
Four Mycoviruses Co-Infect a Hypovirulent Strain SCH941
Megabirnavirus Infects the Hypovirulent Strain SX466 With a Mitovirus and a Partitivirus
Strain Sunf-M is Coinfected With Two Unrelated dsRNA Mycoviruses
Mitoviruses Widely Exist in and Co-Infect S. sclerotiorum
Three Strains of S. sclerotiorum Harbor Multiple Unassigned Mycoviruses or dsRNA Elements
The Possible Reasons for the Co-Infections of Mycoviruses in S. sclerotiorum
References
Further Reading
Mycovirus-Mediated Biological Control
Glossary
Introduction
Disease Cycle of Plant Pathogenic Fungi
Basic Concepts for Mycovirus-Mediated Biological Control
General Procedure for Developing a Mycovirus-Mediated Biological Control System
Virus Detection and Characterization
Testing for Phenotypic Effects of the Mycovirus
Transmission Properties
Biocontrol Testing in the Field
Biosafety Issues
Molecular Approaches to Improve the Use of Mycoviruses as Biological Control Agents
Selected Examples for Mycovirus-Mediated Biological Control
Hypovirulence in Cryphonectria parasitica
The “exclusive transmissible hypovirulence”
Artificial application of hypovirulence
Hypovirulence in Ophiostoma novo-ulmi
Hypovirulence in Sclerotinia sclerotiorum
Hypovirulence in Botrytis cinerea
Hypovirulence in Rosellinia necatrix
Hypovirulence in Helminthosporium victoriae
Hypovirulence in Fusarium graminearum
Future Perspectives
Further Reading
Mycoviruses With Filamentous Particles
Glossary
Introduction
Botrytis virus F (BotV-F)
Family - Gammaflexiviridae
Genus - Mycoflexivirus
Genome Structure
Biological Properties
Virion Morphology
Phylogenetic Relationships
Botrytis Virus X (BotV-X)
Family - Alfaflexiviridae
Genus - Botrexvirus
Genome Structure
Biological Properties
Virion Morphology
Phylogenetic Relationships
Sclerotinia sclerotimonavirus
Family - Mymonaviridae
Genus - Sclerotimonavirus
Genome Structure
Biological Properties
Virion Morphology
Phylogenetic Relationships
Fusarium graminearum Negative-Stranded RNA Virus 1
Family - Mymonaviridae (?)
Genus - Unclassified
Genome Structure
Biological Properties
Particle Morphology
Phylogenetic Relationships
Colletotrichum camelliae Filamentous Virus 1
Family - Unclassified
Genus - Unclassified
Genome Structure
Biological Properties
Virion Morphology
Phylogenetic Relationships
Relationships Between Filamentous Viruses From Fungi and Plants
Further Reading
Prions of Yeast and Fungi
Glossary
Introduction and History
Genetic Signature of a Prion
Self-Propagating Amyloid as the Basis for Most Yeast Prions
Shuffling Prion Domains and Amyloid Structure
Shuffleable Prion Domains Suggest Parallel In-Register beta-Sheet Structure
Infectious Prion Amyloids Have In-Register Parallel Folded beta-sheet Architecture
Prion Variants and the Species Barrier
Prion Variant Information Templating Mechanism
Chaperones and Prions
Prion Generation, and [PIN]: A Prion That Gives Rise to Prions
Biological Roles of Prions: A Help or a Hindrance?
Anti-Prion Systems
Inositol Polyphosphates and Prion Propagation
Enzyme as Prion
Conclusions
Acknowledgment
See also
Further Reading
Single-Stranded DNA Mycoviruses
Glossary
Introduction
The Host of the DNA Mycovirus
Discovery of DNA Mycoviruses
The Genome and Proteins of SsHADV-1
Host Range of SsHADV-1
Extracellular Entry of SsHADV-1
Mutualistic Interaction Between SsHADV-1 and Mushroom Sciarid Fly
Transmission of SsHADV-1
Distribution of SsHADV-1
The Taxonomy of SsHADV-1
Explore SsHADV-1 to Control Fungal Disease
Further Reading
Structure of Double-Stranded RNA Mycoviruses
Introduction
Structure of dsRNA Virus Capsids
Totiviruses
Chrysoviruses
Quadriviruses
Partitiviruses
Megabirnaviruses
Evolutionary Relationships Based on Structural Comparisons
RdRp and dsRNA Organization Within Mycovirus Capsids
Concluding Remarks and Future Perspectives
See also
Further Reading
Ustilago maydis Viruses and Their Killer Toxins
The Totiviruses
The Killer Phenomena
KP4
Effects of KP4 on U. maydis Cells
KP4 Blocks L-type Voltage Gated Ca2+ Channels
Effect of KP4 on Plants
Possible Application of KP4: Fungal Resistance in Plants
Evolutionary Origin of KP4
KP6
The Atomic Structure of KP6
KP1
Comparison of the Killer Proteins
Further Reading
Vegetative Incompatibility in Filamentous Fungi
Glossary
Introduction
Characteristics of the Vegetative Incompatibility Reaction
Microscopic and Macroscopic Analyses of Hyphal Fusion
Identification of VCGs
Barrage assay
Microscopy
Visualization of labeled proteins during co-culture
Auxotrophic complementation
Detecting the PCD of fused cells
The Genetics of Vegetative Incompatibility
The Signaling Pathways of Vegetative Incompatibility
HET Protein Involved in NLR-Mediated Innate Immunity in Fungi
Mycovirus Transmission
Mycoviruses can be Horizontally Transmitted Among VCGs
Potential Approaches to Enhance Mycovirus Transmission Between VCGs
Further Reading
Viral Diseases of Agaricus bisporus, the Button Mushroom
Glossary
Introduction
La France Disease - Agaricus bisporus Virus 1 (AbV1)
Genome Organization
Virion Structure and Composition
Taxonomy and Classification
Biological Properties (Virus Host Relationships)
Epidemiology
Diagnostics and Identification
Control
Brown Cap Mushroom Disease - Agaricus bisporus Virus 16 (AbV16)
Genome and Virion Structure
Classification
Viral Expression and Disease Development
Epidemiology
Diagnosis
Further Reading
Viral Killer Toxins
Glossary
Introduction
dsRNA Viruses and Killer Phenotype Expression in S. cerevisiae
Viral Replication Cycle
Viral Preprotoxin Processing and Toxin Maturation
Endocytosis and Intracellular Transport of the K28 Virus Toxin
ER Exit and Nuclear Entry of the K28 Virus Toxin
K28 Affects DNA Synthesis, Cell-Cycle Progression, and Induces Apoptosis
Lethality of Membrane Damaging Viral Killer Toxins
Self-Protection in Killer Virus-Infected Yeast - Toxin Immunity
Further Reading
Alternaviruses (Unassigned)
Glossary
Introduction
Genome Organization
Virion Properties
Biological Effects of Alternavirus on Their Host
Proteins Encoded by Alternaviruses
3′ Poly(A) Structure of Alternaviruses
Evolutionary Relationships Among Chrysoviruses
Reference
Further Reading
Barnaviruses (Barnaviridae)
Glossary
Introduction
MBV Virion Properties
MBV Virion Structure and Composition
MBV Genome Organization and Expression
MBV Evolutionary Relationships
MBV Transmission and Host Range
Further Reading
Botybirnaviruses (Botybirnavirus)
Glossary
Introduction
Virion Properties
Genome Organization and Replication
Taxonomy and Similarity With Other Viruses
Transmission and Distribution
Biology
Further Reading
Chrysoviruses (Chrysoviridae) - General Features and Chrysovirus-Related Viruses
Glossary
Introduction
Virion Properties
Virion structure and composition
Genome Organization
Genome Expression and Replication
Chrysovirus RdRps
Chrysovirus CPs
Alphachryso-P3 Shares a ‘Phytoreo S7 Domain’ With Core Proteins of Phytoreoviruses
Chryso-P4 is a Putative Protease and Virion Associated as a Minor Protein
Replication of Chrysoviruses
Taxonomy and Phylogenetic Analysis
Biology and Effects of Chrysoviruses on Fungal Hosts
Acknowledgment
Further Reading
Fungal Partitiviruses (Partitiviridae)
Glossary
Introduction
Classification
Virion Structure
Genome
Life Cycle
Epidemiology
Pathogenesis
Taxonomic and Phylogenetic Considerations
Further Reading
Relevant Websites
Fusariviruses (Unassigned)
Glossary
Introduction
Taxonomy and Classification
Virion Property
Genome Organization
Gene Expression
Virus Transmissions
Virus-Host Interactions
Summary and Future Prospective
Further Reading
Relevant Website
Giardiavirus (Totiviridae)
History
Taxonomy, Classification and Evolution
Host Range and Geographic Distribution
Physical and Biochemical Characteristics
Organization and Molecular Biology of GLV and GCV dsRNA Genomes
Infection and Replication
References
Hypoviruses (Hypoviridae)
Glossary
Introduction
Taxonomy and Genetic Organization
Hypovirus Gene Expression Strategy
Hypovirus Functional Domains
Anti-Hypovirus Defense Mechanisms
Further Reading
Megabirnaviruses (Megabirnaviridae)
Glossary
Introduction
Virion Properties
Genome Organization
Genome Expression and Replication
Virion Structure
Biological Properties
Taxonomic and Phylogenetic Considerations
Future Perspectives
Functions of Megabirna-P3 and -P4
Further Reading
Mitoviruses (Mitoviridae)
Glossary
Genome Structure
Accessory RNAs Associated With Mitovirus Infections
Phenotypic Effects of Mitovirus Infection
Phylogenetic Relationships
Taxonomy and Nomenclature
Transmission
Engineering Mitoviruses for Infectivity
Host Defense
Codon Usage and Implications for Mitovirus Biology and Evolution
Origin and Evolution
Plant Mitoviruses
Further Reading
Mycoreoviruses (Reoviridae)
Glossary
Structure-Function Relationships
Taxonomy and Nomenclature
Genome Structures, Organizations, and Relationships
Mycoreovirus 1
Expression of MyRV1 Gene Products
Mycoreovirus 2
Mycoreovirus 3
Sclerotinia Sclerotiorum Reoviruses
Mycoreovirus 4
Sclerotinia Sclerotiorum Reovirus 1
Effects of Mycoreoviruses on Fungal Gene Expression
Coinfections of Mycoreovirus and Other Viruses
Mycoreovirus Genome Rearrangements
Further Reading
Mymonaviruses (Mymonaviridae)
Glossary
Introduction
Phylogenetic Status of Mymonaviridae and its Related Families
Species and Tentative Species in the Family Mymonaviridae
Virion
The Genomic Structure
The Impact on the Host
Host and Distribution
Further Reading
Narnaviruses (Narnaviridae)
Glossary
Introduction
Historical Background
Viral Genomes
Ribonucleoprotein Complexes as a Viral Entity
Replication Intermediates
Generation of Narnaviruses in Vivo
Cis-Acting Signals for Replication
Cis-Signals for Formation of Ribonucleoprotein Complexes
Narnavirus Persistence in the Host
Further Reading
Phlegiviruses (Unassigned)
Introduction
Virion Properties
Genome Organization
Biological Properties
Evolutionary Relationships among Phlegiviruses
Future Perspectives
Further Reading
Plant and Protozoal Partitiviruses (Partitiviridae)
Glossary
Introduction
Current Taxonomy of the Partitiviridae
Virion Properties
Alphapartitivirus
Betapartitivirus
Cryspovirus
Deltapartitivirus
Genome Organization and Replication Strategy
Genome Organization
5′ and 3′ UTRs
RNA1 & 2
Replication Strategy
Transmission
Virus-Host-Interaction
Further Reading
Relevant Websites
Quadriviruses (Quadriviridae)
Glossary
Introduction
Virion Structure and Composition
Genome Organization and Expression
Molecular and Biological Properties
Phylogenetic and Evolutionary Relationships of RnQV1 to Other dsRNA Mycoviruses
See also
Further Reading
Relevant Website
Totiviruses (Totiviridae)
Glossary
Introduction
Taxonomy of and Evolutionary Relationships Among Totivirids
Virion Properties
Virion Structure and Composition
Genome Organization and Expression
Virus Replication Cycle
Biological Properties
Virus-Host Relationships
Conclusions
Further Reading
Yado-kari Virus 1 and Yado-nushi Virus 1 (Unassigned)
Glossary
Introduction
Virion Morphology
Genome Characteristics of YnV1
Genome Characteristics of YkV1
Phylogenetic Placements of YnV1 and YkV1
Proposed Replication Model for YnV1 and YkV1
Molecular Entities Sharing Similar YkV1/YnV1-Like Interactions
YkV1/YnV1-Like Virus Combinations in Other Fungi
Predicted Past and Future of YkV1
Concluding Remarks and Future Directions
Further Reading
Yeast L-A Virus (Totiviridae)
Glossary
Introduction
History
Virion Structure
Genome Organization
Replication Cycle
Transcription
RNA Packaging
RNA Replication
Viral Translation
L-A Genetics
Other RNA Replicons in Yeast: L-BC, 20S RNA, 23S RNA
See also
Further Reading
ALGAL VIRUSES
Algal Marnaviruses (Marnaviridae)
RNA Viruses Infecting Marine Protists
Development of a Taxonomic Framework and Resulting Changes in Taxonomy
Marine RNA Virus Quasispecies
Importance of RNA Viruses in Marine Microbial Ecology
Summary
Further Reading
Algal Mimiviruses (Mimiviridae)
Glossary
History
General Properties
Proposed Subfamily Mesomimivirinae
Unclassified Algae-Infecting Members in the Family Mimiviridae
Further Reading
Miscellaneous Algal Viruses (Alvernaviridae, Bacilladnaviridae, Dinodnavirus, Reoviridae)
Introduction
Algal single-stranded RNA Viruses
Genus Dinornavirus
Algal double-stranded RNA Viruses
Genus Mimoreovirus
Algal single-stranded DNA Viruses
Genus Bacilladnavirus
Algal double-stranded DNA Viruses (not Belonging to Either of the Families: Phycodnaviridae or Mimiviridae)
Genus Dinodnavirus
Reference
Further Reading
Phycodnaviruses (Phycodnaviridae)
History
Taxonomy and Classification
Virion Structure and Composition
Morphology
Physicochemical and Physical Properties
Nucleic Acids
Proteins
Lipids
Carbohydrates
Genomes
Virus Replication
Virus Transcription
Ecology
Resistance to Phycodnavirus Infections
Phycodnavirus Genes Encode Some Interesting and Unexpected Proteins
Perspectives
Acknowledgments
Further Reading
INVERTEBRATE VIRUSES
An Introduction to Viruses of Invertebrates
Glossary
Introduction
Artoviridae
Ascoviridae
Baculoviridae
Bidnaviridae
Birnaviruses of Invertebrates (Entomobirnavirus)
Dicistroviridae
Hytrosaviridae
Iflaviridae
Iridoviruses of Invertebrates (Betairidovirinae)
Malacoherpesviridae
Mesoniviridae
Nimaviridae
Nodaviridae
Nudiviridae
Nyamiviridae
Parvoviruses of Insects (Densovirinae and Hamaparvovirinae)
Polydnaviridae
Poxviruses of Invertebrates (Entomopoxvirinae)
Reoviruses of Invertebrates (Sedoreovirinae and Spinaviridae)
Rhabdoviruses of Invertebrates (Rhabdoviridae, Almendravirus, Alphanemrhavirus, Caligrhavirus, Sigmavirus)
Roniviridae
Sarthroviridae
Solinviviridae
Tetraviruses (Families, Alphatetraviridae, Carmotetraviridae, Permutotetraviridae)
Taxa of Other Viruses of Invertebrates
Arboviruses in Their Invertebrate Vectors
Plant Viruses in Their Invertebrate Vectors
Retrotransposons Associated With Invertebrates
Acknowledgments
See also
Further Reading
Relevant Websites
Ascoviruses (Ascoviridae)
Glossary
Introduction
History
Distribution and Taxonomy
Virion Structure and Composition
Transmission and Ecology
Host Range
Pathology and Pathogenesis
Signs of Disease
Cytopathology and Cell Biology
Tissue Tropism
Replication and Virion Assembly
Origin and Evolution
Future Perspectives
Further Reading
Baculovirus–Host Interactions: Repurposing Host-Acquired Genes (Baculoviridae)
Glossary
Introduction
Viruses Acquiring Genes From Their Hosts
Hosts Acquiring Genes From Infecting Viruses
Baculovirus Replication
Baculovirus Acquisition of Genes
Baculovirus Host Range Genes
Virus Defense and Resistance to Baculoviruses
Concluding Remarks
Classification (Compact)
Virion Structure
Genome
Replication Cycle
Epidemiology
Pathogenesis
Diagnosis
Further Reading
Baculoviruses: General Features (Baculoviridae)
Glossary
Historical Perspective
Nomenclature, Taxonomy, and Classification
Morphology
Genomes, Gene Content, Organization
Evolution
Infection Cycle
Baculovirus Transmission and Host Behavioral Manipulation
Baculovirus Expression Technology
Baculoviral Insecticides
Further Reading
Relevant Website
Baculoviruses: Molecular Biology and Replication (Baculoviridae)
Glossary
Introduction
Infection Cycle
Two Virion Phenotypes
Oral and Systemic Infection
Dissemination of OBs
Genome Organization and Content
Baculovirus Gene Expression
Temporal Regulation of Transcription
Early gene expression
Late and very late gene expression
Viral DNA Replication
Virion Morphogenesis and Protein Composition
Nucleocapsids
BV Morphogenesis
GP64 glycoprotein characteristics
F protein properties
ODV Assembly and ODV Envelopes
Tegument Proteins
Appendix A Supplementary Material
See also
Further Reading
Bidensoviruses (Bidnaviridae)
Classification
Genome Organization and Expression Strategy
Pathology of Silkworm Associated With BmBDV
Viral Non-Structural Proteins
Viral Structural Protein
Viral Replication
Virus Evolution
Further Reading
Bunyaviruses of Arthropods (Mypoviridae, Nairoviridae, Peribunyaviridae, Phasmaviridae, Phunuiviridae, Wupedeviridae)
Glossary
Introduction
Virion Structure
Genome and Coding Strategies of the Viral Genomes
Viral Replication Cycle
Host Associations, Virus Maintenance Cycles and Pathogenicity
Further Reading
Dicistroviruses (Dicistroviridae)
Glossary
Introduction
Taxonomy and Classification
Biophysical Properties
Organization of the Dicistrovirus Genome
Virus Replication and Genome Expression
Host Range
Pathology and Transmission
Geographic and Strain Variation
Relationships Within the Family
Similarity With Other Taxa
Acknowledgments
Further Reading
Entomobirnaviruses (Birnaviridae)
Glossary
Classification (Compact)
Virion Structure
Genome
Life Cycle
Epidemiology
Clinical Features
Pathogenesis
Further Reading
Relevant Website
Hytrosaviruses (Hytrosaviridae)
Glossary
Introduction
Taxonomy and Classification
Similarities With Other Virus Taxa
Virion Structure
Genome Organization
Modes of Infection and Gene Expression
Major Structural Proteins
Host Range
Pathogenesis and Tissue Tropism
Pathogenesis in the Salivary Glands
Pathogenesis in Other Host Tissues
Viral Latency
Transmission and Epidemiology
GpSGHV Transmission Dynamics in the Tsetse Fly
MdSGHV Transmission Dynamics in the Housefly
Diagnosis and Management of SGHV Infections
Diagnosis
Management
Conclusions
Further Reading
Relevant Website
Iflaviruses (Iflaviridae)
Glossary
Classification
Virion Structure
Genome
Life Cycle
Epidemiology
Clinical Signs
Pathogenesis
Diagnosis/Detection Methods
Prevention
Further Reading
Relevant Websites
Iridoviruses of Invertebrates (Iridoviridae)
Glossary
Introduction
Classification of Iridovirids
Morphology and Composition
IIV-6 Persistence and Sensitivity to External Factors
Host Range and Pathology
Genome Organization and Codon Usage
Virion Proteins
Viral Entry, Replication, and Release Strategy
Transcriptional Regulation
Promoter Elements and Transcriptional Regulation
Induction/Inhibition of Apoptosis in Infections
Concluding Remarks
Further Reading
Relevant Website
Mesoniviruses (Mesoniviridae)
Classification
Virion Structure
Genome
ORF1a/ORF1b
Structural Proteins
Life Cycle
Reference
Further Reading
Nimaviruses (Nimaviridae)
Glossary
Introduction
Taxonomy
Virion Structure and Composition
Genome and Phylogeny
Life Cycle
Apoptosis
Hemocyte Responses to WSSV Infection
Epidemiology
Clinical Features and Pathology
Protection of Shrimp Against WSSV Infection Using Vaccination and RNAi Strategies
Further Reading
Relevant Website
Nodaviruses of Invertebrates and Fish (Nodaviridae)
Classification (Compact): Family Nodaviridae, genera Alphanodavirus and Betanodavirus
Provisional Nodaviruses
Proposed Six Clade Taxonomic Structure
Virion Structure
Genome
Proteins
RNA-Dependent RNA Polymerase (RdRp)
Capsid Proteins
Proteins B1 and B2
Physical Properties
Life Cycle
Epidemiology
Clinical Features
VNN
WTD
VCMD
Pathogenesis
Immune Responses to Nodavirus Infection
Innate Immune Responses
Adaptive Immune Responses in Nodavirus Infection
Diagnosis
VNN
WTD
VCMD
Treatment
Prevention
VNN
WTD
VCMD
Further Reading
Nudiviruses (Nudiviridae)
Glossary
Classification
History
Criteria for Classification
Members of the Nudiviridae
Phylogeny (Evolution of Nudiviridae and Other dsDNA Arthropod Viruses)
Virion Structure
Transmission and Pathogenesis
Virus Life Cycle
Genome
General Features
Gene Content
Gene Regulation for Switching Productive and Latent Infections
Negative Impacts and Potential Applications
Further Reading
Parvoviruses of Invertebrates (Parvoviridae)
Glossary
Introduction
Discovery, Taxonomy and Evolution of Densoviruses; of Polyphyly, Paraphyly and its Tangled Relationship With...
Biology, Pathology and Host Range
General Features
Shrimp Densoviruses
Penaeus stylirostris penstyldensovirus 1 (PstDV1)
Fenneropenaeus chinensis hepandensovirus (FcHDV)
New invertebrate DVs
Biochemical Properties and Purification of Densoviruses
Structural Features of Virions
Biophysical Features and Functions Associated With the Densovirus Capsid
Densovirus Genome Structure and Replication
Expression Strategies
Conclusions
Acknowledgments
Reference
Further Reading
Relevant Websites
Polydnaviruses (Polydnaviridae)
Glossary
Classification (Compact)
Life Cycle of PDVs
Virion Structure
Morphogenesis of PDV Particles
The PDV Packaged Genome
Function of the Genes Encoded by PDV Packaged Genome
Fate of the PDV Packaged Genome in the Parasitized Insect
The Proviral Genome Maintained in the Wasp Genome
Organization of PDV Proviral Segments in the Wasp Genome
The PDV Viral Machineries
Concluding Remarks
Further Reading
Poxviruses of Insects (Poxviridae)
Glossary
Introduction
Classification
Virus/Midgut Interactions
Replication in Larvae
Genomics
Conserved and Shared Genes
Gene Families
Entomopoxvirus Phylogeny
Acknowledgment
Further Reading
Reoviruses of Invertebrates (Reoviridae)
Glossary
Taxonomy
Introduction
Spinareovirinae
Historical Overview
Host Range, Diseases, Transmission, and Distribution
Cypoviruses
Idnoreoviruses
Dinovernavirus
Aquareovirus
Fijivirus
Orthoreovirus
Virion Properties, Genome, and Replication
Cypoviruses
Idnoreoviruses
Dinovernavirus
Antigenic and Genetic Relationships
Cypoviruses
Idnoreoviruses
Dinovernaviruses
Sedoreovirinae
Cardoreovirus
Phytoreovirus
Seadornavirus
See also
Further Reading
Rhabdoviruses of Insects (Rhabdoviridae)
Classification
Virion Structure
Genome
Sigmavirus Genome
Almendraviruses Genome
Other Insect Rhabdovirus Genomes
Life Cycle
Integration Into the Host Genome
Epidemiology
Distribution, Spatial and Temporal
Transmission
Prevalence
Population Dynamics
Clinical Features
Pathogenesis
Diagnosis
Further Reading
Relevant Websites
Sarthroviruses (Sarthroviridae)
Glossary
Introduction
Geographical Distribution
Clinical Signs of WTD and Histopathology
Morphology of Virus
Genome Organization
Taxonomic Position
Pathogenicity and Transmission of Disease
Host Range
Susceptibility of Cell Line to XSV
Diagnostic Tools
Control Measures
Acknowledgment
References
Relevant Website
Solinviviruses (Solinviviridae)
Glossary
Introduction
Classification (Compact)
Virion Structure
Genome
Life Cycle
Epidemiology (Host Specificity/Prevalence/Transmission)
Clinical Features
Pathogenesis
Diagnosis (Detection)
Prevention (Biocontrol)
Further Reading
Relevant Websites
Tetraviruses (Alphatetraviridae, Carmotetraviridae, Permutotetraviridae)
Introduction
Genome Organization
Alphatetraviridae
Permutotetraviridae
Carmotetraviridae
Viral Replication
The Tetravirus Capsid
Capsid Structure
Capsid Maturation, Auto-Proteolysis and Dynamics
Pathology
Symptoms and Transmission
Host Range
Persistent Infections
Concluding Remarks
Reference
Further Reading

ENCYCLOPEDIA OF VIROLOGY FOURTH EDITION

Volume 4

EDITORS IN CHIEF

Dennis H. Bamford Molecular and Integrative Biosciences Research Programme Faculty of Biological and Environmental Sciences University of Helsinki, Helsinki, Finland

Mark Zuckerman South London Specialist Virology Centre King’s College Hospital NHS Foundation Trust London, United Kingdom and Department of Infectious Diseases School of Immunology and Microbial Sciences, King’s College London Medical School London, United Kingdom


AMSTERDAM  BOSTON  HEIDELBERG  LONDON  NEW YORK  OXFORD  PARIS  SAN DIEGO  SAN FRANCISCO  SINGAPORE  SYDNEY  TOKYO

Academic Press is an imprint of Elsevier

ACADEMIC PRESS

Academic Press is an imprint of Elsevier
Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
50 Hampshire Street, 5th Floor, Cambridge MA 02139, United States

Copyright © 2021 Elsevier Ltd. unless otherwise stated. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers may always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

ISBN 978-0-12-814515-9

For information on all publications visit our website at http://store.elsevier.com

Publisher: Oliver Walter
Acquisitions Editor: Priscilla Braglia
Content Project Manager: Katarzyna Miklaszewska
Associate Content Project Manager: Gayathri S.
Designer: Matthew Limbert

EDITORS IN CHIEF

Dennis H. Bamford, PhD, is Professor Emeritus of Virology at the Faculty of Biological and Environmental Sciences, University of Helsinki, Finland. He obtained his PhD in 1980 from the Department of Genetics, University of Helsinki. During 1981–1982 he was an EMBO postdoctoral fellow at the Public Health Research Institute of the City of New York, United States, and during 1983–1992 he worked as a Senior Scientist at the Academy of Finland. In 1993 he was appointed Professor of General Microbiology at the University of Helsinki. He was awarded the esteemed Academy Professorship twice, in 2002–2007 and 2012–2016, and he also served twice as the Director of the Finnish Center of Excellence (in Structural Virology, 2000–2005, and in Virus Research, 2006–2011). Prof. Bamford has had continuous external research funding (e.g., from several European Union, Academy of Finland, TEKES and Jusélius Foundation funds, as well as the Human Frontier Science Program). He is an EMBO member and has held several positions of trust in scientific and administrative organizations. Prof. Bamford has published approx. 400 articles in international peer-reviewed journals in virology, microbiology, biochemistry, and molecular biology (36 of them in high impact journals). About half of the primary articles have been published with international collaborators showing high international integration. He has also been invited to give 56 keynote and plenary presentations in major international meetings. Prof. Bamford has supervised over 35 Master’s and over 40 PhD theses. Seven of his graduate students or post docs have obtained a professorship and a similar number have a principal investigator status. Prof. Bamford has studied virus evolution from a structure-centered perspective, showing that seemingly unrelated viruses, such as bacteriophage PRD1 and human adenovirus have similar virion architecture. 
When the coronavirus virion architecture was gradually revealed, its structural elements proved to be close to those seen in the RNA bacteriophage phi6, so phi6 has been actively used as a surrogate for pathogenic viruses, which came as quite a surprise.

Dr. Mark Zuckerman is Head of Virology, Consultant Medical Virologist, and Honorary Senior Lecturer at South London Specialist Virology Centre, King’s College Hospital NHS Foundation Trust and King’s College London Medical School, Department of Infectious Diseases, School of Immunology and Microbial Sciences in London, United Kingdom. His interests include the clinical interface between developing molecular diagnostic tests relevant to the local population of patients, respiratory virus infections, herpesvirus infections in immunocompromised patients and blood-borne virus transmission incidents in the healthcare setting. He has chaired the UK Clinical Virology Network, Royal College of Pathologists Virology Specialty Advisory Committee and Virology Examiners Panel and is a member of the Specialty Advisory Committee on Transfusion Transmitted Viruses. He is a co-author on four editions of the “Mims’ Medical Microbiology” textbook, has written chapters in a number of other textbooks, has over 100 publications in international peer-reviewed journals, and is an associate editor for two journals.


EDITORIAL BOARD

Editors in Chief

Dennis H. Bamford
Molecular and Integrative Biosciences Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland

Mark Zuckerman
South London Specialist Virology Centre, King’s College Hospital NHS Foundation Trust, London, United Kingdom and Department of Infectious Diseases, School of Immunology and Microbial Sciences, King’s College London Medical School, London, United Kingdom

Section Editors

Claude M. Fauquet
St Louis, MO, United States

Michael Feiss
Department of Microbiology and Immunology, Carver College of Medicine, University of Iowa, Iowa City, IA, United States

Elizabeth E. Fry
Department of Structural Biology, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom

Said A. Ghabrial†
Department of Plant Pathology, University of Kentucky, Lexington, KY, United States

Eric Hunter
Department of Pathology and Laboratory Medicine, Emory University School of Medicine and Emory Vaccine Center, Emory University, Atlanta, GA, United States

Ilkka Julkunen
Institute of Biomedicine, University of Turku, Turku, Finland

Peter J. Krell
Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON, Canada

Mart Krupovic
Archaeal Virology Unit, Institut Pasteur, Paris, France

Maija Lappalainen
HUS Diagnostic Center, HUSLAB, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland

Hubert G.M. Niesters
Department of Medical Microbiology and Infection Prevention, Division of Clinical Virology, University Medical Center Groningen, Groningen, The Netherlands

Massimo Palmarini
MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom

David Prangishvili
Institut Pasteur, Paris, France and Ivane Javakhishvili Tbilisi State University, Tbilisi, Georgia

David I. Stuart
Department of Structural Biology, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom and Diamond Light Source, Didcot, United Kingdom

Nobuhiro Suzuki
Institute of Plant Stress and Resources (IPSR), Okayama University, Kurashiki, Japan



† Deceased.


SECTION EDITORS

Claude Fauquet received his PhD in biochemistry from University Louis Pasteur in Strasbourg, France in 1974. Dr. Fauquet joined the Institut de Recherche pour le Développement (IRD) and worked there as a plant virologist for 28 years, and served in Ivory Coast, West Africa for 14 years. In 1991, he founded the International Laboratory for Tropical Agricultural Biotechnology (ILTAB) at The Scripps Research Institute, CA, United States. ILTAB was then hosted by the Donald Danforth Plant Science Center, St. Louis, MO, from 1999 to 2012. In 2003, he co-founded the Global Cassava Partnership for the 21st Century (GCP21), which he directed until 2019 and whose goal is to improve the cassava crop worldwide. Dr. Fauquet is an international leader in plant virology including taxonomy, epidemiology, molecular virology, and in gene-silencing as an antiviral strategy. He was Secretary of the International Committee on Taxonomy of Viruses (ICTV) for 18 years and the editor of several ICTV Reports including the VIIIth ICTV Report in 2005. He has published more than 300 research papers in reviewed journals and books. He is a fellow of the American Association for the Advancement of Science, of the American Phytopathological Society and a member of the St. Louis Academy of Sciences. In 2007, Dr. Fauquet was knighted “Chevalier de l’Ordre des Palmes Académiques” by the French Minister of High Education and Research.

Dr. Michael Feiss is Professor Emeritus in the Department of Microbiology and Immunology of the Carver College of Medicine at the University of Iowa, IA, United States. Dr. Feiss received his PhD in Genetics at the University of Washington followed by a postdoctoral traineeship in the laboratory of Dr. Allan Campbell at Stanford. Dr. Feiss is a microbial geneticist who studies virus assembly with an emphasis on how a DNA virus, bacteriophage lambda, packages viral DNA into the empty prohead shell. The lab investigates how sites in the viral DNA orchestrate the initiation and termination of the DNA packaging process. This work includes comprehensive examination of the DNA recognition sites. A related interest is study of terminase, the viral DNA packaging enzyme, including the functional domains for protein–DNA and protein–protein interactions. A second focus has been the roles of the bacterial host’s IHF and DnaJ proteins in the lytic life cycle of the virus. More recent work has involved a genetic dissection of the role of terminase’s ATPase center that powers translocation of viral DNA into the prohead. This interest in the ATP hydrolysis-driven packaging motor involves a multidisciplinary collaboration examining the kinetics of DNA packaging during individual packaging events. Finally, recent studies have also looked at how the packaging process has diverged among several lambda-like phages, including phages 21, N15, and Gifsy-1.

Elizabeth E. Fry is a senior postdoctoral scientist in structural biology at the University of Oxford, Oxford, United Kingdom, where she received her DPhil. for studies relating to the structure determination of Foot-and-Mouth Disease Virus. Specializing in structural virology, Dr. Fry has studied many virus/viral protein structures but her primary focus is on picornavirus structure and function, in particular receptor interactions and virus uncoating. She is particularly interested in rationally designing virus-like-particles as next generation vaccines to reduce the inherent risks in handling live viruses.


Said A. Ghabrial† received his BSc in 1959 from Cairo University, Cairo, Egypt, and his PhD from Louisiana State University, Baton Rouge, LA, United States, in 1965. Dr. Ghabrial did postdoctoral research at the University of California, Davis, CA, United States, before returning to Cairo, where he served as a plant virologist in the Ministry of Agriculture. He returned to the United States in 1970 to do postdoctoral research at Purdue University, West Lafayette, IN. In 1972, he joined the Plant Pathology Department at the University of Kentucky, Lexington, KY, United States, where he rose to the rank of professor in 1986 and worked until 2013. Dr. Ghabrial has served as an associate and senior editor of Phytopathology. He served on the editorial boards of the Encyclopedia of Virology, 3rd edition and Encyclopedia of Plant Pathology, and edited a thematic issue of Advances in Virus Research on “Mycoviruses”. He was a member of the American Phytopathological Society (APS) and the American Society for Virology (ASV); in July 2002 he was elected as a Fellow of the American Phytopathological Society. He also acted as Chair of the ICTV Subcommittee on Fungal Viruses in 1987–1993 and 2011–2014. His long professional career allowed him to make many scientific achievements in phytopathology and virology. Among them are molecular dissection of a legume-infecting RNA virus, bean pod mottle virus (BPMV), development of BPMV-based vectors, discovery of a transmissible debilitation disease of the phytopathogenic ascomycete, Helminthosporium victoriae (Cochliobolus victoriae), establishment of a viral etiology of the H. victoriae disease, and advancement of structural biology of diverse fungal viruses.

Eric Hunter, PhD, is Professor of Pathology and Laboratory Medicine at Emory University, Atlanta, GA, United States. He serves as Co-Director of the Emory Center for AIDS Research and is a Georgia Research Alliance Eminent Scholar. Dr. Hunter’s research focus has been the molecular virology and pathogenesis of retroviruses, including human immunodeficiency virus. He has made significant contributions to the understanding of the role of retroviral glycoprotein structural features during viral entry and providing unique insights into the assembly and replication of this virus family. In recent years the emphasis of his research has been on HIV transmission and pathogenesis, defining the extreme genetic bottleneck and selection of viruses with unique traits during HIV heterosexual transmission. He has described the selection of fitter viruses at the target mucosa, a gender difference in the extent of selection bias, and a role for genital inflammation in reducing selection. His research has defined the impact of HIV adaptation to the cellular immune response on immune recognition and control of HIV after transmission, as well as on virus replicative fitness in vitro and in vivo. Recent work highlights the roles that virus replicative fitness and sex of the host play in defining disease progression in a newly infected individual. His bibliography includes over 300 peer-reviewed articles, reviews, and book chapters. He has also been the recipient of four NIH merit awards for his work on retrovirus and HIV molecular biology. Dr. Hunter served as the Editor in Chief of the journal AIDS Research and Human Retroviruses for 10 years. 
He was Chair of the AIDS Vaccine Research Subcommittee which is charged with providing advice and consultation on AIDS vaccine research to the National Institute of Allergy and Infectious Diseases and continues to serve on editorial boards for several academic journals and on external advisory committees for several government, academic, and commercial institutions.

Ilkka Julkunen graduated as an MD/PhD in 1984 from the Department of Virology, University of Helsinki, Helsinki, Finland. He worked as a postdoctoral research fellow at Memorial Sloan Kettering Cancer Center in New York, United States, in 1986–1989, followed by positions as a senior scientist, group leader and research professor at Finnish Institute for Health and Welfare in 1989–2013. In 2013 he became a Professor of Virology at the University of Turku, Turku, Finland. The research interests of Dr. Julkunen have concentrated on innate and adaptive humoral immunity in viral and microbial infections. He has studied intracellular signaling and RIG-I and TLR-mediated activation of the interferon system in human macrophages and dendritic cells and stable cell lines in response to human and avian influenza, Sendai, Zika and coronavirus infections. In addition, he has analyzed the downregulation of innate immunity by viral regulatory proteins from influenza, HCV, flavi-, filo- and coronaviruses. He has expertise in vaccinology, biotechnology and development of methods to analyze antiviral immunity. He has also been actively involved in research training and collaborations with the biotechnology industry.



Deceased.


Peter Krell started his career in virology early as a summer high school student working for the Canadian Forestry Service studying the resistance of nuclear polyhedrosis viruses (now called baculoviruses) to environmental exposure with Dr. Fred T. Bird at the Insect Pathology Research Institute in Sault Ste. Marie, ON, Canada. He received his BSc and MSc in biology from Carleton University studying the iridovirus Tipula Iridescent Virus with Dr Peter Lee, in Ottawa, the Canadian capital. For his PhD he headed east to Dalhousie University in Halifax, Nova Scotia on the Atlantic coast. In addition to enjoying the salt sea air, fresh cod, lobster and mussels, he studied the molecular biology of polydnaviruses under the guidance of Dr Don Stoltz. Heading south to Texas A&M University in College Station, TX, United States, as a Postdoctoral Fellow he worked with Dr. Max Summers, of baculovirus fame, and Dr. Brad Vinson, continuing to study polydnaviruses, but also became steeped in the early days of molecular baculovirology. He then accepted a faculty position in the Department of Microbiology and Immunology at the University of Guelph in Guelph, ON, Canada. There he switched to baculovirus research, which was more tractable, due in part to available cell cultures, and focused on viral DNA replication and functional genomics, particularly on chitinase, cathepsin and ME53. In collaboration with Dr. Eva Nagy he studied the molecular biology of different animal viruses, notably Fowl Avian adenoviruses and their development as vaccine vectors, but also the birnavirus infectious pancreatic necrosis virus, the coronavirus porcine epidemic diarrhea virus, fowlpox virus and the paramyxovirus Newcastle disease virus.
He has been involved extensively with virus taxonomy, being active in the International Committee on Taxonomy of Viruses (ICTV) as member of the Polydnaviridae and Baculoviridae study groups, national representative of Canada on the ICTV, member of the Executive Committee for the ICTV and Chair of the ICTV Invertebrate Virus Subcommittee. In terms of governance, Peter Krell was President of the Canadian Society of Microbiology, Secretary and later President of the Society for Invertebrate Pathology, as well as being on the Editorial Boards of the Canadian Journal of Microbiology and the ASM Journal of Virology. While at the University of Guelph, he rose through the ranks to Professor and is currently University Professor Emeritus.

Mart Krupovic is the Head of the Archaeal Virology Unit in the Department of Microbiology at the Institut Pasteur of Paris, France. He received his MSc in Biochemistry in 2005 from the Vilnius University, Vilnius, Lithuania and PhD in 2010 in general microbiology from the University of Helsinki, Helsinki, Finland. His current research focuses on the diversity, origin, and evolution of viruses, as well as molecular mechanisms of virus–host interactions in archaea. He has published over 170 journal articles and serves as an editor or on the editorial boards of Biology Direct, Research in Microbiology, Scientific Reports, Virology, and Virus Evolution. He is also a member of the Executive Committee of the International Committee on Taxonomy of Viruses (ICTV) and chairs the Archaeal Viruses Subcommittee of the ICTV.

Maija Lappalainen, MD, PhD, Associate Professor of Clinical Microbiology, is the Head of Clinical Microbiology in the HUS Diagnostic Center, HUSLAB, University of Helsinki and Helsinki University Hospital, Helsinki, Finland. In her thesis during the years 1987–1992 she studied the incidence and diagnostics of congenital toxoplasmosis. After PhD, her research interest has been in diagnostic clinical virology, viral hepatitis, respiratory infections, viral infections in the immunocompromised patients and viral infections during pregnancy.


Hubert G.M. Niesters (1958) studied biology and chemistry in Nijmegen, the Netherlands. After obtaining his PhD in Utrecht (Prof. dr. M. Horzinek and Prof. dr. B. van der Zeijst, 1987) on the molecular epidemiology of infectious bronchitis virus, he worked as a post-doctoral fellow with Prof. dr. Jim Strauss at the California Institute of Technology (Pasadena, United States) on the replication of Alphaviruses. He received a Niels Stensen fellowship (The Netherlands) and an E.S. Gosney fellowship (Caltech) during this period. After returning to the Netherlands (1989), he became a research associate in medical microbiology at the Diagnostic Medical Center (Delft) but moved back to clinical virology as a senior research associate in 1991 at the Erasmus University Medical Center Rotterdam (Head Prof. dr. Ab Osterhaus). From 1993 to 2007, he was responsible for the molecular diagnostics unit. During this period, he was involved in the discovery and characterization of several new viruses and variants. In 2007, he became full professor and director of the Laboratory of Clinical Virology within the Department of Medical Microbiology at the University Medical Center Groningen and University of Groningen. He has been actively involved in the implementation and development of new technologies like real-time amplification and automation within clinical virology. He has been focusing on molecular diagnostics and its use and clinical value in a transplant setting, as well as in monitoring treatment of hepatitis viruses. Recently, his interest has focused on rapid regional epidemiology, automation including MiddleWare solutions for molecular diagnostics, as well as the cost–benefit of rapid point-of-impact molecular testing. A special interest is raising awareness of the detection of enteroviruses (enterovirus D68) and their relationship with acute flaccid myelitis (AFM).
Since 2017, he has been the Chair of the executive board of QCMD (Quality Control of Molecular Diagnostics, Glasgow). He is an auditor and team leader for the Dutch Council of Accreditation and Co-Editor in Chief of the Journal of Clinical Virology. He is the (co-)author of more than 250 peer-reviewed papers, chapters and reviews, including on emerging viruses such as enterovirus D68 and hepatitis E virus (H-index 80). For his entire work, he received in 2016 the “Ed Nowakowski Senior Memorial Clinical Virology Award” from the Pan American Society for Clinical Virology.

Massimo Palmarini is the Director of the MRC-University of Glasgow Centre for Virus Research and Chair of Virology at the University of Glasgow, Glasgow, United Kingdom. A veterinarian by training, his research programs focus on the biology, evolution and pathogenesis of arboviruses and the mechanisms of virus cross-species transmission. His work is funded by the MRC and the Wellcome Trust. Massimo Palmarini has been elected Fellow of the Academy of Medical Sciences, of the Royal Society of Edinburgh and of the Royal Society of Biology and he was a Wolfson-Royal Society Research Merit Awardee. He is a Wellcome Trust Investigator.

David Prangishvili, PhD, Honorary Professor at the Institut Pasteur, Paris, France, and Professor at Tbilisi State University, Tbilisi, Georgia, is one of the pioneers in studies on the biology of Archaea and their viruses. His scientific career spans ex-USSR (Institute of Molecular Biology, Moscow; 1970–1976), Georgia (Georgian National Academy of Sciences, Tbilisi; 1976–1991), Germany (Max-Planck Institute for Biochemistry, Munich; University of Regensburg; 1991–2004) and France (Institut Pasteur, Paris, 2004–2020). In the research groups headed by him, several dozens of new species and eight new families of archaeal viruses have been discovered and characterized, which display remarkable diversity of unique morphotypes and exceptional genome contents. The results of his research contribute to the knowledge on viral diversity on our planet and change the field of prokaryotic virology, leading to the notion that viruses of hyperthermophilic Archaea form a particular group in the viral world, distinctive from viruses of Bacteria and Eukarya, and to the recognition of the virosphere of Archaea as one of the distinct features of this Domain of Life. David Prangishvili is a member of the Academia Europaea, the European Academy of Microbiology, and the Georgian National Academy of Sciences.


David I. Stuart is MRC Professor of Structural Biology in the Nuffield Department of Medicine, Oxford University, Oxford, United Kingdom, Life Science Director at Diamond Light Source and Director of Instruct-ERIC (a pan-European organisation providing shared access to infrastructure and methods for structural biology). He has diverse interests in structural virology, ranging from picornaviruses to double-stranded RNA viruses and enveloped RNA viruses. His drive to develop structural techniques led to the determination of the structure of Bluetongue virus (1995) and then of the first membrane-containing virus, PRD1. More recently, he has been at the forefront of bringing Cryo-EM technology to bear on virus structure determination and its future role in visualizing virus function in cellulo. In addition to basic science he has a strong commitment to structural vaccinology and the development of antiviral drugs.

Dr. Nobuhiro Suzuki, PhD, received his MSc (1985) in phytopathology and PhD (1989) in virology from Tohoku University in Sendai, Japan. Dr. Suzuki currently serves as a full Professor of the Institute of Plant Stress and Resources, formerly the Institute of Plant Sciences and Bioresources, at Okayama University and as an Editor of Virus Research, Frontiers in Virology, Journal of General Plant Pathology, Virology Journal, and Biology. He has also been Guest Editor to PLoS Pathogens, PNAS, and mBio, and an Editorial Board member of Virology and Journal of Virology. The Suzuki Laboratory focuses on characterization of diverse viruses infecting phytopathogenic fungi and exploration of their interplays. Recent achievements include the discovery of a neo-virus lifestyle exhibited by a (+)ssRNA virus and an unrelated dsRNA virus in a plant pathogenic fungus and of multilayer antiviral defense in fungi involving Dicer. Prior to coming to Kurashiki, Okayama Prefecture, he was a visiting fellow of the Center for Agricultural Biotechnology at the University of Maryland Biotechnology Institute (UMBI), College Park, MD, United States, for 4 years (1997–2001) to study molecular biology of hypoviruses in the laboratory of Professor Donald L. Nuss. Before visiting UMBI, he served as an assistant professor and a lecturer of the Biotechnology Institute at the Akita Prefectural College of Agriculture, Japan, for 11 years (1988–1998) where he was engaged in a project on molecular characterization of rice dwarf phytoreovirus, a member of the family Reoviridae. He received awards from the Phytopathological Society of Japan and the Japanese Society for Virology for his outstanding achievements in plant and fungal virology.

FOREWORD

I am delighted to write the foreword to this wonderful Fourth Edition of the Encyclopedia of Virology. The Third Edition was published in 2008; how the world has changed in the intervening years. The release of the updated fourth edition could not be more timely or more prescient. It is superb and a huge tribute to the authors, Elsevier the publisher, and to the brilliant editors, Dennis Bamford and Mark Zuckerman. SARS-CoV-2 has dominated the world since it emerged in 2019 and affected every continent and every aspect of life. A reminder, if it were needed, of the impact of infectious diseases, the importance of virology and the vulnerability and interconnectivity of our world. There is no doubt that with rapidly changing ecology, urbanization, climate change, increased travel, and fragile public health systems, epidemics and pandemics will become more frequent, more complex and harder to prevent and contain. Most of these epidemics will be caused by viruses, those we know about and may be able to predict and some we do not know of that will emerge from animals, plants or the environment. Our changing climate will change the epidemiology of viruses, their vectors and the infections they cause, hence the critical importance of this totally revised Fourth Edition of the Encyclopedia of Virology, which brings together research and an understanding of viruses in animals, plants, bacteria and fungi, the environment, and among humans. Never has a holistic, one-health understanding been more important. That starts with an understanding of the fundamentals of virology, a field of science that has been transformed in the years since the Third Edition. An understanding transformed by embracing traditional fields of molecular and structural biology, genomics, and influenced by immunology, genetics, pharmacology and increasingly by epidemiology and mathematics.
Events of 2020 and 2021 also show why it is so important to integrate within traditional virology an understanding of animal and human health and behavior, of climate change and its impact on the ecology of viruses, plant sciences and vectors. And why we must understand the viruses we think we know well, and those viruses less extensively studied. Research is critical to this, research that pushes the boundaries of what we know, has the humility to seek answers to things we do not understand and shares that knowledge with the widest possible community. That research will be most exciting at the interface between disciplines, most impactful when dynamic, open, inclusive, global, and collaborative. This is what the Fourth Edition of the Encyclopedia of Virology, the largest reference source of research in virology, sets out to achieve. It is a wonderful contribution to a critical field of knowledge. It contains new chapters, every chapter revised and updated by a dedicated global community who have come together to provide what is a brilliant and inspiring reference. It is an honor to contribute in a very small way to the timely release of the Fourth Edition of the Encyclopedia of Virology.

Jeremy Farrar


PREFACE

The fourth edition of the Encyclopedia of Virology is encyclopedic, but we wanted to move away from an alphabetical list, apart from where it was more logical, to a vision that encompassed a different structure. Articles describing novel trends as well as original discoveries in specific subfields of virology have been distributed into a set of five volumes, namely Fundamentals of Virology, Human and Animal Viruses, Plant Viruses, Bacterial, Archaeal, Fungal, Algal and Invertebrate Viruses, and Diagnosis, Treatment and Prevention of Virus Infections. We had hoped that the new edition would ‘go viral’ but it was ironic that the time to publication, 12 years after the previous edition, had been made a bit longer due to a virus infection. The world encountered a devastating global pandemic, COVID-19, caused by a new type of coronavirus, SARS-CoV-2. Scientists in many disciplines all over the world started immediate efforts to discover solutions as to how to mitigate and stop the spread of the pandemic. Virology moved from being a highly specialized subject to one in which everyone became a virologist, proving just how significant the different aspects of virology are in terms of understanding the nature of viral infection. Since the previous edition, the growth in the field of general virology has been enormous, including huge advances in basic science, identification of novel viruses, diagnostic methods, treatment and prevention. Taking this into account, the introduction of the articles within the Encyclopedia is very timely and crucial for providing a wealth of knowledge of the latest findings in the field of virology to a vast range of people, whether school students, undergraduates, postgraduates, teachers, scientists, researchers, journalists or others interested in infections and the conflict between the host and the pathogen. Pandemic viruses have become a serious public concern in the changing world.
We can ask ourselves whether we have reached the point in which nature can no longer cope with the consequences of increased population density and human activities that are harmful to the environment. Although several pandemics have threatened mankind before, this COVID-19 pandemic has highlighted the massive adverse economic consequences towards the wellbeing of society and the importance of research in virology. We aimed to produce a Major Reference Work that differs in approach to others and binds all the virology disciplines together. Chapters have been included on origin, evolution and emergence of viruses, environmental virology and ecology, epidemiology, techniques for studying viruses, viral life cycles, structure, entry, genome and replication, assembly and packaging and taxonomy and viral–host interactions. Information has been included on all known species of viruses infecting bacteria, fungi, plants, vertebrates and invertebrates. Additional topics include antiviral classification and examples of their use in management of infection, diagnostic assays and vaccines, as well as the economic importance of viral diseases of crops and their control. This edition used viral classification according to the 9th Report of the International Committee on Taxonomy of Viruses published in 2012. Updating it to the 10th Report in 2020 was affected by the pandemic and can be found online at http://ictv.global/report/. We wish to acknowledge the hard work, interest, flexibility and patience, during such difficult times both socially and professionally, of everybody involved in the process of writing this edition of the Encyclopedia of Virology, especially Katarzyna Miklaszewska, Priscilla Braglia, Sam Crowe and colleagues at Elsevier. We sincerely thank all the authors and section editors for their excellent contributions to this edition.

Book Cover Image: Viruses are obligate parasites and all cells have their own viruses, increasing the total number of viruses to the estimated astronomical number of 10³¹, which exceeds the number of stars in the universe. The viral string illustrates how pandemic viruses surround the globe. The original picture was created by Dr. Nina Atanasova (Finnish Meteorological Institute and University of Helsinki) and amended by Matthew Limbert at Elsevier.

Dennis H. Bamford
Mark Zuckerman


HOW TO USE THE ENCYCLOPEDIA

Structure of the Encyclopedia

All articles in the encyclopedia are arranged thematically as a series of entries within subjects/sections, apart from volume 2 where it was more logical to have articles arranged alphabetically. There are three features to help you easily find the topic you are interested in: a thematic contents list, a full subject index, and a list of contributors.
1. Thematic contents list: The thematic contents list, which appears at the front of each volume, lists the entries in the order that they appear in the encyclopedia.
2. Index: The index appears at the end of volume 5 and includes page numbers for quick reference to the information you are looking for. The index entries differentiate between references to a whole entry, a part of an entry, and a table or figure.
3. Contributors: At the start of each volume there is a list of the authors who contributed to all volumes.


LIST OF CONTRIBUTORS Stephen T. Abedon The Ohio State University, Mansfield, OH, United States Peter Abrahamian Agricultural Research Service, US Department of Agriculture, Beltsville, MD, United States Jônatas S. Abrahão Federal University of Minas Gerais, Belo Horizonte, Brazil Florence Abravanel Toulouse University Hospital, Toulouse, France and Toulouse University Paul Sabatier, Toulouse, France Nicola G.A. Abrescia Center for Cooperative Research in Biosciences, Basque Research and Technology Alliance, Derio, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain; and Center for Biomedical Research in the Liver and Digestive Diseases Network, Carlos III Health Institute, Madrid, Spain Gian Paolo Accotto Institute for Sustainable Plant Protection, National Research Council of Italy, Torino, Italy

Aleksandra Alimova The City University of New York (CUNY), School of Medicine, The City College of New York, New York, NY, United States Juan C. Alonso National Biotechnology Center–Spanish National Research Council, Madrid, Spain Imran Amin National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan Stephanie E. Ander University of Colorado School of Medicine, Aurora, CO, United States Danielle E. Anderson Duke-NUS Medical School, Singapore, Singapore Ida Bagus Andika Qingdao Agricultural University, Qingdao, China Ana C.d.S.P. Andrade Federal University of Minas Gerais, Belo Horizonte, Brazil Juana Angel Pontifical Javeriana University, Bogota, Colombia

Elisabeth Adderson St. Jude Children’s Research Hospital, Memphis, TN, United States and University of Tennessee Health Sciences Center, Memphis, TN, United States

Vanesa Anton-Vazquez King’s College Hospital, London, United Kingdom

Mustafa Adhab University of Baghdad, Baghdad, Iraq

Guido Antonelli Sapienza University of Rome, Rome, Italy

Alexey A. Agranovsky Lomonosov Moscow State University, Moscow, Russia Nasim Ahmed National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan Maher Al Rwahnih University of California, Davis, CA, United States Olufemi J. Alabi Texas A&M AgriLife Research and Extension Center, Weslaco, TX, United States Aurélie A. Albertini Institute for Integrative Biology of the Cell (I2BC), French Alternative Energies and Atomic Energy Commission, French National Center for Scientific Research, Paris-Sud University, University of Paris-Saclay, Gif-sur-Yvette, France

Josefa Antón University of Alicante, Alicante, Spain Nanako Aoki Tokyo University of Agriculture and Technology, Fuchu, Japan Timothy D. Appleby King’s College Hospital, London, United Kingdom Miguel Arenas Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain and CINBIO (Biomedical Research Center), University of Vigo, Vigo, Spain Basil Arif Laboratory for Molecular Virology, Great Lakes Forestry Centre, Sault Ste Marie, ON, Canada


Vicente Arnau Institute for Integrative Systems Biology (I2SysBio), University of Valencia–Spanish National Research Council, Valencia, Spain Gaurav Arya Duke University, Durham, NC, United States Leyla Asadi University of Alberta, Edmonton, AB, Canada Sassan Asgari The University of Queensland, Brisbane, QLD, Australia Nina S. Atanasova Finnish Meteorological Institute, Helsinki, Finland and University of Helsinki, Helsinki, Finland Houssam Attoui UMR1161 Virologie, INRAE – French National Research Institute for Agriculture, Food and Environment, ANSES, Ecole Nationale Vétérinaire d’Alfort, University of Paris-Est, Maisons-Alfort, France Silvia Ayora National Biotechnology Center–Spanish National Research Council, Madrid, Spain

Xiaoyong Bao The University of Texas Medical Branch, Galveston, TX, United States Yiming Bao Beijing Institute of Genomics, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing, China Alan D.T. Barrett The University of Texas Medical Branch, Galveston, TX, United States Diana P. Baquero Archaeal Virology Unit, Institut Pasteur, Paris, France and Sorbonne University, Paris, France Moshe Bar-Joseph Agricultural Research Organization, Volcani Center, Bet Dagan, Israel Rachael S. Barr Bristol Royal Hospital for Children, Bristol, United Kingdom Ralf Bartenschlager Heidelberg University, Heidelberg, Germany

Walid Azab Free University of Berlin, Berlin, Germany

David L.V. Bauer Francis Crick Institute, London, United Kingdom

Sasha R. Azar The University of Texas Medical Branch, Galveston, TX, United States

Oliver W. Bayfield University of York, York, United Kingdom

Fengwei Bai The University of Southern Mississippi, Hattiesburg, MS, United States Dalan Bailey The Pirbright Institute, Pirbright, United Kingdom S.C. Baker Loyola University of Chicago, Maywood, IL, United States Fausto Baldanti University of Pavia, Pavia, Italy and Scientific Institute for Research, Hospitalization and Healthcare, San Matteo Polyclinic Foundation, Pavia, Italy Logan Banadyga Public Health Agency of Canada, Winnipeg, MB, Canada Ashley C. Banyard Animal and Plant Health Agency, Addlestone, United Kingdom; University of West Sussex, Falmer, United Kingdom; and St. George's Medical School, University of London, London, United Kingdom

Sally A. Baylis Paul-Ehrlich-Institute, Langen, Germany Philippa M. Beard The Pirbright Institute, Pirbright, United Kingdom and The Roslin Institute, University of Edinburgh, United Kingdom Paul Becher University of Veterinary Medicine, Hannover, Germany Björn Becker Saarland University, Saarbrücken, Germany Karen L. Beemon Johns Hopkins University, Baltimore, MD, United States Martin Beer Friedrich-Loeffler-Institute, Insel Riems, Germany Jose Miguel Benito Health Research Institute of the Jiménez Díaz Foundation, Autonomous University of Madrid and Rey Juan Carlos University Hospital, Móstoles, Spain Mária Benkő Institute for Veterinary Medical Research, Center for Agricultural Research, Budapest, Hungary


Max Bergoin National Institute of Scientific Research – Armand-Frappier Health Research Centre, Laval, QC, Canada Sabrina Bertin Council for Agricultural Research and Economics, Research Center for Plant Protection and Certification, Rome, Italy Shweta Bhatt University of Copenhagen, Copenhagen, Denmark Dennis K. Bideshi California Baptist University, Riverside, CA, United States and University of California, Riverside, CA, United States Yves Bigot INRAE – French National Research Institute for Agriculture, Food and Environment, Nouzilly, France Richard J. Bingham University of York, York, United Kingdom

Maxime Boutier University of Liège, Liège, Belgium P.R. Bowser Cornell University, Ithaca, NY, United States Daniel Bradshaw Public Health England, London, United Kingdom Claude Bragard University of Louvain, Louvain-la-Neuve, Belgium Aaron C. Brault Centers for Disease Control and Prevention, Fort Collins, CO, United States Nicolas Bravo-Vasquez St. Jude Children’s Research Hospital, Memphis, TN, United States Rob W. Briddon University of Agriculture, Faisalabad, Pakistan Thomas Briese Columbia University, New York, NY, United States

Vera Bischoff Institute for Chemistry and Biology of the Marine Environment, Oldenburg, Germany

Paul Britton The Pirbright Institute, Pirbright, United Kingdom

Kate N. Bishop Francis Crick Institute, London, United Kingdom

Thomas J. Brouwers Athena Institute, VU Amsterdam, Amsterdam, The Netherlands

Lindsay W. Black The University of Maryland School of Medicine, Baltimore, MD, United States Romain Blanc-Mathieu Institute for Chemical Research, Kyoto University, Kyoto, Japan Soile Blomqvist National Institute for Health and Welfare, Helsinki, Finland Bryony C. Bonning University of Florida, Gainesville, FL, United States Lisa M. Bono Rutgers, The State University of New Jersey, New Brunswick, NJ, United States Alexia Bordigoni Aix-Marseille University, CNRS, IRD, Mediterranean Institute of Oceanography, Marseille, France and Aix-Marseille University, IRD257, Assistance-Publique des Hôpitaux de Marseille, UMR Microbes, Evolution, Phylogeny and Infections (MEPHI), IHU Méditerranée Infection, Marseille, France Mihnea Bostina University of Otago, Dunedin, New Zealand


Kevin E. Brown Frimley Park Hospital, Frimley, United Kingdom and Immunisation and Countermeasures Division, Public Health England, London, United Kingdom Corina P.D. Brussaard NIOZ Royal Netherlands Institute for Sea Research, Den Burg, Texel, The Netherlands and Utrecht University, Utrecht, The Netherlands Harald Brüssow Laboratory of Gene Technology, Department of Biosystems, KU Leuven, Leuven, Belgium Joachim J. Bugert Bundeswehr Institute of Microbiology, Munich, Germany Jozef J. Bujarski Northern Illinois University, DeKalb, IL, United States and Polish Academy of Sciences, Poznan, Poland Laura Burga University of Otago, Dunedin, New Zealand Sara H. Burkhard University Hospital of Zurich, Zurich, Switzerland Cara C. Burns Centers for Disease Control and Prevention, Atlanta, GA, United States


Felicity Burt University of the Free State, Bloemfontein, South Africa Kerry S. Burton Leamington Spa, United Kingdom Sarah J. Butcher University of Helsinki, Helsinki, Finland Mathias Büttner Leipzig University, Leipzig, Germany Jesse Cahill Sandia National Labs, Albuquerque, NM, United States Marianna Calabretto Sapienza University of Rome, Rome, Italy Thierry Candresse The National Research Institute for Agriculture, Food and the Environment, University of Bordeaux, Villenave d’Ornon, France Alan J. Cann University of Leicester, Leicester, United Kingdom Lorenzo Capucci The Lombardy and Emilia Romagna Experimental Zootechnic Institute, Brescia, Italy Irene Carlon-Andres University of Oxford, Oxford, United Kingdom José M. Casasnovas National Center for Biotechnology, Spanish National Research Council (CSIC), Madrid, Spain J.W. Casey Cornell University, Ithaca, NY, United States R.N. Casey Cornell University, Ithaca, NY, United States Sherwood R. Casjens University of Utah, Salt Lake City, UT, United States Antonella Casola The University of Texas Medical Branch, Galveston, TX, United States José R. Castón National Center for Biotechnology, Spanish National Research Council, Madrid, Spain

Patrizia Cavadini The Lombardy and Emilia Romagna Experimental Zootechnic Institute, Brescia, Italy Supranee Chaiwatpongsakorn Nationwide Children’s Hospital, Columbus, OH, United States Supriya Chakraborty Jawaharlal Nehru University, New Delhi, India Yu-Chan Chao Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan Tyler P. Chavers Centers for Disease Control and Prevention, Atlanta, GA, United States Keping Chen Jiangsu University, Zhenjiang, China Xiaorui Chen Genomics Research Center, Academia Sinica, Taipei, Taiwan Yanping Chen Bee Research Laboratory, Agricultural Research Service, US Department of Agriculture, Beltsville, MD, United States Dayna Cheng National Cheng Kung University, Tainan, Taiwan Quentin Chesnais University of Strasbourg, Colmar, France Sotaro Chiba Nagoya University, Nagoya, Japan Wah Chiu Stanford University, Stanford, CA, United States David Chmielewski Stanford University, Stanford, CA, United States Irma E. Cisneros The University of Texas Medical Branch, Galveston, TX, United States Lark L. Coffey University of California, Davis, CA, United States Alanna B. Cohen Rutgers University, New Brunswick, NJ, United States

Carlos E. Catalano University of Colorado Anschutz Medical Campus, Skaggs School of Pharmacy and Pharmaceutical Sciences, Aurora, CO, United States

Jeffrey I. Cohen Laboratory of Infectious Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, United States

Roberto Cattaneo Mayo Clinic, Rochester, MN, United States

Seth Coleman Rice University, Houston, TX, United States


Miquel Coll Institute for Research in Biomedicine, Barcelona, Spain and Institute for Molecular Biology of Barcelona, Barcelona, Spain John Collinge UCL Institute of Prion Diseases, London, United Kingdom Carina Conceicao The Pirbright Institute, Pirbright, United Kingdom Gabriela N. Condezo National Center for Biotechnology, Spanish National Research Council, Madrid, Spain


Amy Davis St Jude Children’s Research Hospital, Memphis, TN, United States William O. Dawson Citrus Research and Education Center, Lake Alfred, FL, United States and University of Florida, Lake Alfred, FL, United States Erik De Clercq Rega Institute for Medical Research, KU Leuven, Leuven, Belgium Raoul J. de Groot Utrecht University, Utrecht, The Netherlands

Michaela J. Conley MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom

Juan C. de la Torre The Scripps Research Institute, La Jolla, CA, United States

Charles A. Coomer University of Oxford, Oxford, United Kingdom

Marcelo De las Heras University of Zaragoza, Zaragoza, Spain

Anne K. Cordes Hannover Medical School, Institute of Virology, Hannover, Germany

Juliana Gabriela Silva de Lima Federal University of Rio Grande do Norte, Natal, Brazil

Mauricio Cortes Jr. Department of Chemistry, College of Arts and Sciences, Fort Wayne, IN, United States Robert H.A. Coutts University of Hertfordshire, Hatfield, United Kingdom Jeff A. Cowley CSIRO Livestock Industries, Brisbane, QLD, Australia Robert W. Cross The University of Texas Medical Branch, Galveston, TX, United States Henryk Czosnek The Hebrew University of Jerusalem, Rehovot, Israel Håkon Dahle Department of Biological Sciences, University of Bergen, Bergen, Norway Janet M. Daly University of Nottingham, Sutton Bonington, United Kingdom Subha Das Okayama University, Kurashiki, Japan

Athos S. de Oliveira University of Brasília, Brasília, Brazil Nicole T. de Stefano University of Florida, Gainesville, FL, United States Greg Deakin NIAB-EMR, East Malling, United Kingdom Philippe Delfosse University of Luxembourg, Esch-sur-Alzette, Luxembourg Natacha Delrez University of Liège, Liège, Belgium Tatiana A. Demina Molecular and Integrative Biosciences Research Program, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland Ismail Demir Department of Biology, Karadeniz Technical University, Trabzon, Turkey Zihni Demirbağ Department of Biology, Karadeniz Technical University, Trabzon, Turkey

Indranil Dasgupta University of Delhi, New Delhi, India

X. Deng Loyola University of Chicago, Maywood, IL, United States

Sibnarayan Datta Defence Research Laboratory, Defence Research and Development Organisation (DRDO), Tezpur, Assam, India

Cécile Desbiez Plant Pathology Unit, INRAE – French National Research Institute for Agriculture, Food and Environment, Montfavet, France

Christelle Desnues Aix-Marseille University, CNRS, IRD, Mediterranean Institute of Oceanography, Marseille, France and Aix-Marseille University, IRD 257, Assistance-Publique des Hôpitaux de Marseille, UMR Microbes, Evolution, Phylogeny and Infections (MEPHI), IHU Méditerranée Infection, Marseille, France

Lucy Dorrell University of Oxford, Oxford, United Kingdom

Samantha J. DeWerff University of Illinois at Urbana-Champaign, Urbana, IL, United States

Andreas Dotzauer University of Bremen, Bremen, Germany

Daniele Di Carlo Sapienza University of Rome, Rome, Italy Arturo Diaz La Sierra University, Riverside, CA, United States Alfredo Diaz-Lara University of California, Davis, CA, United States Ralf G. Dietzgen The University of Queensland, St. Lucia, QLD, Australia Michele Digiaro International Center for Advanced Mediterranean Agronomic Studies (CIHEAM), Mediterranean Agronomic Institute of Bari, Valenzano, Italy Michael Dills Montana State University, Bozeman, MT, United States Wayne Dimech National Serology Reference Laboratory, Fitzroy, VIC, Australia Savithramma P. Dinesh-Kumar University of California, Davis, CA, United States Linda K. Dixon The Pirbright Institute, Pirbright, United Kingdom Valerian V. Dolja Oregon State University, Corvallis, OR, United States Aušra Domanska University of Helsinki, Helsinki, Finland Leslie L. Domier Agricultural Research Service, US Department of Agriculture, Urbana, IL, United States Pilar Domingo-Calap Institute for Integrative Systems Biology (I2SysBio), University of Valencia-CSIC, Valencia, Spain Tatiana Domitrovic Federal University of Rio de Janeiro, Rio de Janeiro, Brazil Sarah M. Doore Michigan State University, East Lansing, MI, United States

Rosemary A. Dorrington Rhodes University, Grahamstown, South Africa Andor Doszpoly Hungarian Academy of Sciences, Budapest, Hungary

Simon B. Drysdale St George’s University Hospitals NHS Foundation Trust, London, United Kingdom and St George’s, University of London, London, United Kingdom Robert L. Duda University of Pittsburgh, Pittsburgh, PA, United States Carol Duffy University of Alabama, Tuscaloosa, AL, United States Siobain Duffy Rutgers, The State University of New Jersey, New Brunswick, NJ, United States David D. Dunigan University of Nebraska–Lincoln, Lincoln, NE, United States Stéphane Duquerroy University of Paris-Saclay, Orsay, France and Institut Pasteur, Paris, France Bas E. Dutilh Utrecht University, Utrecht, The Netherlands and Radboud University Medical Center, Nijmegen, The Netherlands Michael Edelstein Faculty of Medicine, Bar Ilan University, Ramat Gan, Israel Herman K. Edskes National Institutes of Health, Bethesda, MD, United States Rosina Ehmann Bundeswehr Institute of Microbiology, Munich, Germany Toufic Elbeaino International Center for Advanced Mediterranean Agronomic Studies (CIHEAM), Mediterranean Agronomic Institute of Bari, Valenzano, Italy Joanne B. Emerson University of California, Davis, CA, United States Ann Emery University of North Carolina at Chapel Hill, Chapel Hill, NC, United States

Christine E. Engeland University Hospital Heidelberg and German Cancer Research Center, Heidelberg, Germany and Witten/Herdecke University, Witten, Germany

Elvira Fiallo-Olivé Institute for Mediterranean and Subtropical Horticulture “La Mayora”–Spanish National Research Council–University of Malaga, Algarrobo-Costa, Málaga, Spain

Luis Enjuanes National Center for Biotechnology – Spanish National Research Council (CNB-CSIC), Madrid, Spain

Andrew E. Firth University of Cambridge, Cambridge, United Kingdom

Katri Eskelin University of Helsinki, Helsinki, Finland Rosa Esteban Institute of Biology and Functional Genomics, CSIC/University of Salamanca, Salamanca, Spain Mary K. Estes Baylor College of Medicine, Houston, TX, United States Cassia F. Estofolete São José do Rio Preto School of Medicine, São José do Rio Preto, Brazil Alyssa B. Evans National Institutes of Health, Hamilton, MT, United States Øystein Evensen Norwegian University of Life Sciences, Oslo, Norway Alex Evilevitch Department of Experimental Medical Science, Lund University, Lund, Sweden Montserrat Fàbrega-Ferrer Institute for Research in Biomedicine, Barcelona, Spain and Institute for Molecular Biology of Barcelona, Barcelona, Spain Francesco Faggioli Council for Agricultural Research and Economics, Research Center for Plant Protection and Certification, Rome, Italy Bentley A. Fane University of Arizona, Tucson, AZ, United States Brian A. Federici University of California, Riverside, CA, United States F. Fenner Australian National University, Canberra, ACT, Australia Isabel Fernández de Castro Cell Structure Laboratory, National Center for Biotechnology – Spanish National Research Council (CNB-CSIC), Madrid, Spain Giovanni Ferrara University of Alberta, Edmonton, AB, Canada

Roland A. Fleck King’s College London, London, United Kingdom Ricardo Flores Polytechnic University of Valencia, Higher Council of Scientific Research, Valencia, Spain Ervin Fodor University of Oxford, Oxford, United Kingdom Anthony R. Fooks Animal and Plant Health Agency, Addlestone, United Kingdom; University of Liverpool, Liverpool, United Kingdom; and St. George's Medical School, University of London, London, United Kingdom Patrick Forterre Archaeal Virology Unit, Institut Pasteur, Paris, France and French National Center for Scientific Research, Institute of Integrative Biology of the Cell, University of Paris-Saclay, Gif sur Yvette, France Rennos Fragkoudis University of Nottingham, Sutton Bonington, United Kingdom and University of Edinburgh, Edinburgh, United Kingdom Manuel A. Franco Pontifical Javeriana University, Bogota, Colombia Giovanni Franzo Department of Animal Medicine, Production and Health (MAPS), Padua University, Padua, Italy Graham L. Freimanis The Pirbright Institute, Pirbright, United Kingdom Juliana Freitas-Astúa Brazilian Agricultural Research Corporation (Embrapa) Cassava and Fruits, Cruz das Almas, Brazil Elizabeth E. Fry Department of Structural Biology, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom Marc Fuchs Cornell University, Geneva, NY, United States Tsutomu Fujimura Institute of Biology and Functional Genomics, CSIC/University of Salamanca, Salamanca, Spain

Kuko Fuke Tokyo University of Agriculture and Technology, Fuchu, Japan

Said A. Ghabrial† Department of Plant Pathology, University of Kentucky, Lexington, KY, United States

Toshiyuki Fukuhara Tokyo University of Agriculture and Technology, Fuchu, Japan

Clément Gilbert Evolution, Genomes, Behavior and Ecology Laboratory, CNRS University of Paris-Sud UMR 9191, IRD UMR 247, Gif-sur-Yvette, France

To S. Fung South China Agricultural University, Guangzhou, China Yahya Z.A. Gaafar Julius Kuehn Institute – Federal Research Center for Cultivated Plants, Braunschweig, Germany Toni Gabaldon Barcelona Supercomputing Center-National Center for Supercomputing, Institute of Research in Biomedicine, and Catalan Institution for Research and Advanced Studies, Barcelona, Spain Morgan Gaïa University of Paris-Saclay, Evry, France José Gallardo National Center for Biotechnology, Spanish National Research Council, Madrid, Spain Hernan Garcia-Ruiz University of Nebraska–Lincoln, Lincoln, NE, United States Juan A. García National Center for Biotechnology-Spanish National Research Council, Madrid, Spain Matteo P. Garofalo The University of Texas Medical Branch, Galveston, TX, United States Yves Gaudin Institute for Integrative Biology of the Cell (I2BC), French Alternative Energies and Atomic Energy Commission, French National Center for Scientific Research, Paris-Sud University, University of Paris-Saclay, Gif-sur-Yvette, France Andrew D.W. Geering The University of Queensland, St. Lucia, QLD, Australia Thomas W. Geisbert The University of Texas Medical Branch, Galveston, TX, United States Andrea Gentili Council for Agricultural Research and Economics, Research Center for Plant Protection and Certification, Rome, Italy Volker Gerdts University of Saskatchewan, Saskatoon, SK, Canada

Robert L. Gilbertson University of California, Davis, CA, United States Efstathios S. Giotis Imperial College London, London, United Kingdom and University of Essex, Colchester, United Kingdom Laurent Glais French Federation of Seed Potato Growers/Research, Development, Promotion of Seed Potato, Paris, France and Institute for Genetics, Environment and Plant Protection, Agrocampus West, French National Institute for Agriculture, Food and Environment, University of Rennes 1, Le Rheu, France Miroslav Glasa Biomedical Research Center, Slovak Academy of Sciences, Bratislava, Slovakia Ido Golding University of Illinois at Urbana-Champaign, Urbana, IL, United States Esperanza Gomez-Lucia Complutense University of Madrid, Madrid, Spain Zheng Gong Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China Andrea González-González University of Florida, Gainesville, FL, United States Michael M. Goodin University of Kentucky, Lexington, KY, United States Alexander E. Gorbalenya Leiden University Medical Center, Leiden, The Netherlands Paul Gottlieb The City University of New York (CUNY), School of Medicine, The City College of New York, New York, NY, United States M.-A. Grandbastien INRAE – French National Research Institute for Agriculture, Food and Environment, Versailles, France

† Deceased.

Meritxell Granell National Center for Biotechnology, Madrid, Spain and Institute of Chemical Research of Catalonia (ICIQ), Tarragona, Spain

Sébastien Halary National Museum of Natural History, UMR 7245 CNRS/MNHN Molécule de Communication et Adaptation des Micro-organismes, Paris, France

Patrick L. Green The Ohio State University, Columbus, OH, United States

Aron J. Hall Centers for Disease Control and Prevention, Atlanta, GA, United States

Sandra J. Greive University of York, York, United Kingdom

John Hammond Floral and Nursery Plants Research, Agricultural Research Service, US Department of Agriculture, Beltsville, MD, United States

Diane E. Griffin Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States Jonathan M. Grimes University of Oxford, Oxford, United Kingdom Nigel Grimsley Integrative Biology of Marine Organisms Laboratory, Banyuls-sur-Mer, France and Sorbonne University, Banyuls-sur-Mer, France Bruno Gronenborn Institute for Integrative Biology of the Cell, CNRS, University of Paris-Sud, CEA, Gif sur Yvette, France Julianne H. Grose Brigham Young University, Provo, UT, United States Scott Grytdal Centers for Disease Control and Prevention, Atlanta, GA, United States

Rosemarie W. Hammond Agricultural Research Service, US Department of Agriculture, Beltsville, MD, United States Virginia Hargest St Jude Children’s Research Hospital, Memphis, TN, United States and University of Tennessee Health Science Center, Memphis, TN, United States Scott J. Harper Washington State University, Prosser, WA, United States Balázs Harrach Institute for Veterinary Medical Research, Center for Agricultural Research, Budapest, Hungary Masayoshi Hashimoto The University of Tokyo, Tokyo, Japan Muhammad Hassan University of Agriculture, Faisalabad, Pakistan

Duane J. Gubler Duke-NUS Medical School, Singapore, Singapore

Asma Hatoum-Aslan University of Alabama, Tuscaloosa, AL, United States

Peixuan Guo College of Pharmacy, The Ohio State University, Columbus, OH, United States

Philippa C. Hawes The Pirbright Institute, Pirbright, United Kingdom

Tongkun Guo Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China Anne-Lise Haenni Institut Jacques Monod, French National Center for Scientific Research, Paris Diderot University, Paris, France Susan L. Hafenstein Pennsylvania State University, Hershey, PA, United States Ahmed Hafez Biotechvana, Valencia, Spain; Pompeu Fabra University, Barcelona, Spain; and Minia University, Minya, Egypt Marie Hagbom Linköping University, Linköping, Sweden

Janelle A. Hayes University of Massachusetts Medical School, Worcester, MA, United States Guijuan He Virginia Tech, Blacksburg, VA, United States Klaus Hedman University of Helsinki, Helsinki, Finland and Helsinki University Hospital, Helsinki, Finland Albert Heim Hannover Medical School, Hannover, Germany Gary L. Hein University of Nebraska–Lincoln, Lincoln, NE, United States Manfred Heinlein IBMP-CNRS, University of Strasbourg, Strasbourg, France

Mercedes Hernando-Pérez National Center for Biotechnology, Spanish National Research Council, Madrid, Spain Carmen Hernández Institute for Plant Molecular and Cell Biology (Spanish National Research Council–Polytechnic University of Valencia), Valencia, Spain Etienne Herrbach University of Strasbourg, Colmar, France Stephen Higgs Biosecurity Research Institute, Kansas State University, Manhattan, KS, United States Bradley I. Hillman Rutgers University, New Brunswick, NJ, United States Deborah M. Hinton National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, United States Judith Hirsch Plant Pathology Unit, INRAE – French National Research Institute for Agriculture, Food and Environment, Montfavet, France Jody Hobson-Peters Australian Infectious Diseases Research Centre, School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD, Australia

Elisabeth Huguet Research Institute on Insect Biology, French National Center for Scientific Research, University of Tours, Tours, France Roger Hull John Innes Centre, Norwich, United Kingdom Kiwamu Hyodo Okayama University, Kurashiki, Japan Eugénie Hébrard Interactions Plantes Microorganismes Environnement, Institut de Recherche pour le Développement, Centre de coopération internationale en recherche agronomique pour le développement, University of Montpellier, Montpellier, France Martin Hölzer University of Jena, Jena, Germany Tetsuro Ikegami The University of Texas Medical Branch at Galveston, Galveston, TX, United States Niina Ikonen Finnish Institute for Health and Welfare, Helsinki, Finland Cihan İnan Department of Molecular Biology and Genetics, Karadeniz Technical University, Trabzon, Turkey

Natalie M. Holste University of Nebraska–Lincoln, Lincoln, NE, United States

İkbal Agah İnce Department of Medical Microbiology, Acıbadem University School of Medicine, Istanbul, Turkey

Jin S. Hong Kangwon National University, Chunchon, South Korea

Katsuaki Inoue Diamond Light Source, Didcot, United Kingdom

Margaret J. Hosie MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom

Toru Iwanami Tokyo University of Agriculture, Tokyo, Japan

Olivia G. Howell University of Alabama, Tuscaloosa, AL, United States

Jacques Izopet Toulouse University Hospital, Toulouse, France and Toulouse University Paul Sabatier, Toulouse, France

Liya Hu Baylor College of Medicine, Houston, TX, United States Zhaoyang Hu Jiangsu University, Zhenjiang, China Kuan-Ying A. Huang Chang Gung Memorial Hospital, Taoyuan, Taiwan Yu Huang Peking University, Beijing, China Natalia B. Hubbs Hanover College, Hanover, IN, United States

Fauziah Mohd Jaafar UMR1161 Virologie, INRAE – French National Research Institute for Agriculture, Food and Environment, ANSES, Ecole Nationale Vétérinaire d’Alfort, University of Paris-Est, Maisons-Alfort, France Andrew O. Jackson China Agricultural University, Beijing, China Daral J. Jackwood The Ohio State University/OARDC, Wooster, OH, United States

Jean-Rock Jacques Cellular and Molecular Epigenetics (GIGA), Liège, Belgium and Molecular Biology (TERRA), Gembloux, Belgium Tiffany Jenkins Nationwide Children’s Hospital, Columbus, OH, United States and The Ohio State University, Columbus, OH, United States Jeffrey D. Jensen Arizona State University, Tempe, AZ, United States Daohong Jiang Huazhong Agricultural University, Wuhan, China Zhihao Jiang China Agricultural University, Beijing, China

Laura Kakkola University of Turku, Turku, Finland Hannimari Kallio-Kokko University of Helsinki and Helsinki University Hospital, Helsinki, Finland Nassim Kamar Toulouse University Hospital, Toulouse, France and Toulouse University Paul Sabatier, Toulouse, France Phyllis J. Kanki Harvard T.H. Chan School of Public Health, Boston, MA, United States Peter Karayiannis University of Nicosia, Nicosia, Cyprus

Allison R. Jilbert The University of Adelaide, Adelaide, SA, Australia

Henry M. Kariithi Kenya Agricultural and Livestock Research Organization, Nairobi, Kenya

Peng Jing Department of Chemistry, College of Arts and Sciences, Fort Wayne, IN, United States

Brian A. Kelch University of Massachusetts Medical School, Worcester, MA, United States

Xixi Jing Central South University, Changsha, China

Karen E. Keller Horticultural Crops Research Unit, Agricultural Research Service, US Department of Agriculture, Corvallis, OR, United States

Meesbah Jiwaji Rhodes University, Grahamstown, South Africa Kyle L. Johnson The University of Texas at El Paso, El Paso, TX, United States and CQuentia, Memphis, TN, United States Welkin E. Johnson Boston College, Chestnut Hill, MA, United States Ian M. Jones University of Reading, Reading, United Kingdom and London School of Hygiene and Tropical Medicine, London, United Kingdom Ramon Jordan Agricultural Research Service, US Department of Agriculture, Beltsville, MD, United States Thomas Joris Cellular and Molecular Epigenetics (GIGA), Liège, Belgium and Molecular Biology (TERRA), Gembloux, Belgium Ilkka Julkunen Institute of Biomedicine, University of Turku, Turku, Finland Sandra Junglen Charité - University Medicine Berlin, Berlin, Germany Masanori Kaido Kyoto University, Kyoto, Japan

Japhette E. Kembou-Ringert University of Tel Aviv, Tel Aviv, Israel Peter J. Kerr University of Sydney, Sydney, NSW, Australia and CSIRO Health and Biosecurity, Black Mountain Laboratories, Canberra, ACT, Australia Tiffany King Nationwide Children’s Hospital, Columbus, OH, United States and The Ohio State University College of Medicine, Columbus, OH, United States Andrea Kirmaier Boston College, Chestnut Hill, MA, United States Thomas Klose Purdue University, West Lafayette, IN, United States Barbara G. Klupp Friedrich-Loeffler-Institute, Greifswald-Insel Riems, Germany David M. Knipe Harvard Medical School, Boston, MA, United States Nick J. Knowles The Pirbright Institute, Pirbright, United Kingdom Guus Koch Wageningen Bioveterinary Research, Lelystad, The Netherlands

Renate Koenig Julius Kühn Institute – Federal Research Center for Cultivated Plants, Braunschweig, Germany Susanne E. Kohalmi The University of Western Ontario, London, ON, Canada Hideki Kondo Okayama University, Kurashiki, Japan Jennifer L. Konopka-Anstadt Centers for Disease Control and Prevention, Atlanta, GA, United States Eugene V. Koonin National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, United States and National Institutes of Health, Bethesda, MD, United States Marion P.G. Koopmans Erasmus Medical Center, Rotterdam, The Netherlands Richard Kormelink Wageningen University and Research, Wageningen, The Netherlands Ioly Kotta-Loizou Imperial College London, London, United Kingdom Peter J. Krell Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON, Canada Mart Krupovic Archaeal Virology Unit, Institut Pasteur, Paris, France

Manish Kumar Jawaharlal Nehru University, New Delhi, India Gael Kurath US Geological Survey, Western Fisheries Research Center, Seattle, WA, United States Satu Kurkela University of Helsinki and Helsinki University Hospital, Helsinki, Finland Wan-Chun Lai Chang Gung Memorial Hospital, Taoyuan, Taiwan Kevin Lamkiewicz University of Jena, Jena, Germany Rebecca K. Lane University of Texas Health Science Center at San Antonio, San Antonio, TX, United States Andrew S. Lang Memorial University of Newfoundland, St. John’s, NL, Canada Daniel Carlos Ferreira Lanza Federal University of Rio Grande do Norte, Natal, Brazil Maija Lappalainen HUS Diagnostic Center, HUSLAB, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland Katherine LaTourrette University of Nebraska–Lincoln, Lincoln, NE, United States

Andreas Kuhn University of Hohenheim, Stuttgart, Germany

Chris Lauber TWINCORE – Center for Experimental and Clinical Infection Research, Hannover, Germany

Jens H. Kuhn National Institutes of Health, Frederick, MD, United States

Antonio Lavazza The Lombardy and Emilia Romagna Experimental Zootechnic Institute, Brescia, Italy

Richard J. Kuhn Purdue University, West Lafayette, IN, United States

C. Martin Lawrence Montana State University, Bozeman, MT, United States

Suvi Kuivanen University of Helsinki, Helsinki, Finland

Hervé Lecoq Plant Pathology Unit, INRAE – French National Research Institute for Agriculture, Food and Environment, Montfavet, France

Ranjababu Kulasegaram Guy’s and St Thomas’ NHS Foundation Trust, London, United Kingdom Raghavendran Kulasegaran-Shylini Department of Pathogen Infection, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom Gaurav Kumar University of Delhi, New Delhi, India

Young-Min Lee Utah State University, Logan, UT, United States Kristen N. LeGault University of California, Berkeley, CA, United States James Legg International Institute of Tropical Agriculture, Dar es Salaam, Tanzania

Anne Legreve University of Louvain, Louvain-la-Neuve, Belgium

Walter Ian Lipkin Columbia University, New York, NY, United States

Petr G. Leiman The University of Texas Medical Branch, Galveston, TX, United States

Jan G. Lisby Copenhagen University Hospital Hvidovre, Hvidovre, Denmark

Stanley M. Lemon Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill, NC, United States and Department of Microbiology and Immunology, The University of North Carolina at Chapel Hill, NC, United States

Ding X. Liu South China Agricultural University, Guangzhou, China

Sebastian Leptihn Zhejiang University-Edinburgh University Institute, Zhejiang University, Haining, China Dennis J. Lewandowski University of Florida, Lake Alfred, FL, United States Sébastien Lhomme Toulouse University Hospital, Toulouse, France and Toulouse University Paul Sabatier, Toulouse, France Dawei Li China Agricultural University, Beijing, China Guangdi Li Central South University, Changsha, China Guoqing Li Huazhong Agricultural University, Wuhan, China Yi Li Peking University, Beijing, China Zhefeng Li College of Pharmacy, The Ohio State University, Columbus, OH, United States Zhenghe Li Zhejiang University, Hangzhou, China

Qiang Liu University of Saskatchewan, Saskatoon, SK, Canada Sijun Liu Iowa State University, Ames, IA, United States Carlos Llorens Biotechvana, Scientific Park University of Valencia, Valencia, Spain L. Sue Loesch-Fries Purdue University, West Lafayette, IN, United States George P. Lomonossoff John Innes Centre, Norwich, United Kingdom L. Letti Lopez The University of Texas at Austin, Austin, TX, United States Alan T. Loynachan University of Kentucky, Lexington, KY, United States Garry A. Luke University of St. Andrews, St. Andrews, United Kingdom M. Luo University of Alabama at Birmingham, Birmingham, AL, United States Juan J. López-Moya Center for Research in Agricultural Genomics and Spanish National Research Council, Barcelona, Spain

Jia Q. Liang South China Agricultural University, Guangzhou, China

Che Ma Genomics Research Center, Academia Sinica, Taipei, Taiwan

Sebastian Liebe Institute of Sugar Beet Research, Göttingen, Germany

Stuart A. MacFarlane The James Hutton Institute, Invergowrie, United Kingdom

João Paulo Matos Santos Lima Federal University of Rio Grande do Norte, Natal, Brazil

Saichetana Macherla J. Craig Venter Institute, La Jolla, CA, United States

Bruno Lina HCL Department of Virology, National Reference Center for Respiratory Viruses, Institute of Infectious Agents, Croix-Rousse Hospital, Lyon, France and Virpath Laboratory, International Center of Research in Infectiology (CIRI), INSERM U1111, CNRS—UMR 5308, École Normale Supérieure de Lyon, University Claude Bernard Lyon, Lyon University, Lyon, France

Kensaku Maejima The University of Tokyo, Tokyo, Japan Fabrizio Maggi University of Pisa, Pisa, Italy and University of Insubria, Varese, Italy Melissa S. Maginnis The University of Maine, Orono, ME, United States

Edgar Maiss Leibniz University Hannover, Hannover, Germany

Chikara Masuta Hokkaido University, Sapporo, Japan

Kira S. Makarova National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, United States

Carlos P. Mata University of Leeds, Leeds, United Kingdom

Ariana Manglli Council for Agricultural Research and Economics, Research Center for Plant Protection and Certification, Rome, Italy

Jelle Matthijnssens Rega Institute for Medical Research, KU Leuven, Leuven, Belgium

Annette Mankertz Robert Koch-Institute, Berlin, Germany

Claire P. Mattison Centers for Disease Control and Prevention, Atlanta, GA, United States and Cherokee Nation Assurance, Arlington, VA, United States

Pilar Manrique The Ohio State University, Wexner Medical Center, Columbus, OH, United States

William McAllister Rowan University School of Osteopathic Medicine, Stratford, NJ, United States

Shahid Mansoor National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan

Alison A. McBride National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, United States

Marco Marklewitz Institute of Virology, Charité – University Medicine Berlin, Berlin, Germany Giovanni P. Martelli† University of Bari Aldo Moro, Bari, Italy Darren P. Martin University of Cape Town, Cape Town, South Africa Robert R. Martin Horticultural Crops Research Unit, Agricultural Research Service, US Department of Agriculture, Corvallis, OR, United States Manuel Martinez-Garcia University of Alicante, Alicante, Spain Francisco Martinez-Hernandez University of Alicante, Alicante, Spain Natalia Martín-González Autonomous University of Madrid, Madrid, Spain Joaquín Martínez Martínez Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, United States Manja Marz University of Jena, Jena, Germany Andrea Marzi National Institute of Allergy and Infectious Diseases, National Institutes of Health, Hamilton, MT, United States Hema Masarapu Sri Venkateswara University, Tirupati, India

† Deceased.

Michael McChesney University of California, Davis, CA, United States Elaine McCulloch Quality Control for Molecular Diagnostics (QCMD), Glasgow, United Kingdom Andrew J. McMichael University of Oxford, Oxford, United Kingdom Alexander McPherson University of California, Irvine, CA, United States Irene K. Meki French National Center for Scientific Research, Montpellier, France Ulrich Melcher Oklahoma State University, Stillwater, OK, United States Tomas A. Melgarejo University of California, Davis, CA, United States Michael J. Melzer Department of Plant and Environmental Protection Sciences, University of Hawaii, Honolulu, HI, United States Luiza Mendonça University of Oxford, Oxford, United Kingdom Xiang-Jin Meng Virginia Polytechnic Institute and State University, Blacksburg, VA, United States Peter P.C. Mertens University of Nottingham, Sutton Bonington, United Kingdom

Thomas C. Mettenleiter Friedrich-Loeffler-Institute, Greifswald-Insel Riems, Germany Philip D. Minor St Albans, United Kingdom Ali Mirazimi National Veterinary Institute, Uppsala, Sweden and Karolinska University Hospital, Huddinge, Sweden Nischay Mishra Columbia University, New York, NY, United States Edward S. Mocarski Emory University School of Medicine, Atlanta, GA, United States Florian Mock University of Jena, Jena, Germany Volker Moennig University of Veterinary Medicine, Hannover, Germany Ian J. Molineux The University of Texas at Austin, Austin, TX, United States Aderito L. Monjane Norwegian Veterinary Institute, Oslo, Norway Jacen S. Moore University of Tennessee Health Science Center, Memphis, TN, United States Marc C. Morais The University of Texas Medical Branch, Galveston, TX, United States Cristina Moraru Institute for Chemistry and Biology of the Marine Environment, Oldenburg, Germany Hiromitsu Moriyama Tokyo University of Agriculture and Technology, Tokyo, Japan Sergey Y. Morozov Lomonosov Moscow State University, Moscow, Russia Thomas E. Morrison University of Colorado School of Medicine, Aurora, CO, United States Léa Morvan University of Liège, Liège, Belgium Benoît Moury Plant Pathology Unit, INRAE – French National Institute for Agriculture, Food and Environment, Montfavet, France

Muhammad Mubin University of Agriculture, Faisalabad, Pakistan Nicolas J. Mueller University Hospital of Zurich, Zurich, Switzerland Emmanuelle Muller The French Agricultural Research Center for International Development, Joint Research Units–Biology and Genetics of Plant-Pathogen Interactions, Montpellier, France and Biology and Genetics of Plant-Pathogen Interactions, University of Montpellier, The French Agricultural Research Center for International Development, French National Institute for Agricultural Research, Montpellier SupAgro, Montpellier, France John S. Munday Massey University, Palmerston North, New Zealand Jacob H. Munson-McGee Montana State University, Bozeman, MT, United States and Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, United States Hacer Muratoğlu Department of Molecular Biology and Genetics, Karadeniz Technical University, Trabzon, Turkey Kenan C. Murphy University of Massachusetts Medical School, Worcester, MA, United States Ugrappa Nagalakshmi University of California, Davis, CA, United States Keizo Nagasaki Kochi University, Nankoku, Japan Nazia Nahid GC University, Faisalabad, Pakistan and University of Agriculture, Faisalabad, Pakistan Venugopal Nair The Pirbright Institute, Pirbright, United Kingdom Remziye Nalçacıoğlu Department of Molecular Biology and Genetics, Karadeniz Technical University, Trabzon, Turkey Shigetou Namba The University of Tokyo, Tokyo, Japan Rubab Z. Naqvi National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan Rachel Nash The Pirbright Institute, Surrey, United Kingdom C.K. Navaratnarajah Purdue University, West Lafayette, IN, United States

Maria A. Navarrete-Muñoz Biotechvana, Madrid, Spain; Institute of Health Research-Jiménez Díaz Foundation, Autonomous University of Madrid; and Rey Juan Carlos University Hospital, Móstoles, Spain Jesús Navas-Castillo Institute for Mediterranean and Subtropical Horticulture “La Mayora”–Spanish National Research Council– University of Malaga, Algarrobo-Costa, Málaga, Spain Muhammad S. Nawaz-ul-Rehman University of Agriculture, Faisalabad, Pakistan Christopher L. Netherton The Pirbright Institute, Pirbright, United Kingdom Thu V.P. Nguyen Baylor College of Medicine, Houston, TX, United States Annette Niehl Julius Kühn Institute – Federal Research Center for Cultivated Plants, Braunschweig, Germany Hubert G.M. Niesters Department of Medical Microbiology and Infection Prevention, Division of Clinical Virology, University Medical Center Groningen, Groningen, The Netherlands Jozef I. Nissimov University of Waterloo, Waterloo, ON, Canada Norman Noah London School of Hygiene and Tropical Medicine, London, United Kingdom Mauricio L. Nogueira São José do Rio Preto School of Medicine, São José do Rio Preto, São Paulo, Brazil Johan Nordgren Linköping University, Linköping, Sweden C. Micha Nübling Paul-Ehrlich-Institute, Langen, Germany Visa Nurmi University of Helsinki, Helsinki, Finland

Hanna M. Oksanen Molecular and Integrative Biosciences Research Program, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland Graziele P. Oliveira Federal University of Minas Gerais, Belo Horizonte, Brazil Francesco Origgi University of Bern, Bern, Switzerland Nikolaus Osterrieder Free University of Berlin, Berlin, Germany Robert A. Owens Beltsville Agricultural Research Center, Beltsville, MD 20705, United States Emine Özsahin University of Guelph, Guelph, ON, Canada Sergi Padilla-Parra University of Oxford, Oxford, United Kingdom; Department of Infectious Diseases, Faculty of Life Sciences and Medicine, King’s College London, London, United Kingdom; and Randall Division of Cell and Molecular Biophysics, King’s College London, London, United Kingdom Joshua Pajak Duke University, Durham, NC, United States Massimo Palmarini MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom Amanda R. Panfil The Ohio State University, Columbus, OH, United States Marcus Panning Institute of Virology, Freiburg University Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany

Donald L. Nuss University of Maryland, Rockville, MD, United States

Vitantonio Pantaleo National Research Council, Research Unit of Bari, Bari, Italy

M. Steven Oberste Centers for Disease Control and Prevention, Atlanta, GA, United States

Anna Papa Aristotle University of Thessaloniki, Thessaloniki, Greece

Hiroyuki Ogata Institute for Chemical Research, Kyoto University, Kyoto, Japan Ane Ogbe University of Oxford, Oxford, United Kingdom

Nikolaos Pappas Utrecht University, Utrecht, The Netherlands Hanu R. Pappu Washington State University, Pullman, WA, United States

Kristin N. Parent Michigan State University, East Lansing, MI, United States

Jean-Marie Peron Toulouse University Hospital, Toulouse, France and Toulouse University Paul Sabatier, Toulouse, France

Colin R. Parrish Cornell University, Ithaca, NY, United States

Karin E. Peterson National Institutes of Health, Hamilton, MT, United States

A. Lorena Passarelli Kansas State University, Manhattan, KS, United States

Karel Petrzik Biology Center CAS, Institute of Plant Molecular Biology, České Budějovice, Czech Republic

Basavaprabhu L. Patil ICAR–Indian Institute of Horticultural Research, Bengaluru, India Jade Pattyn University of Antwerp, Antwerp, Belgium T.A. Paul Cornell University, Ithaca, NY, United States Lillian Pavlik Laboratory for Molecular Virology, Great Lakes Forestry Centre, Sault Ste Marie, ON, Canada Susan L. Payne Texas A&M University, College Station, TX, United States Michael N. Pearson The University of Auckland, Auckland, New Zealand Mark E. Peeples Nationwide Children’s Hospital, Columbus, OH, United States and The Ohio State University College of Medicine, Columbus, OH, United States Ben Peeters Wageningen Bioveterinary Research, Lelystad, The Netherlands Joseph S.M. Peiris The University of Hong Kong, Pok Fu Lam, Hong Kong Malik Peiris The University of Hong Kong, Pok Fu Lam, Hong Kong Judit J. Pénzes National Institute of Scientific Research – Armand-Frappier Health Research Centre, Laval, QC, Canada Miryam Pérez-Cañamás Institute for Plant Molecular and Cell Biology (Spanish National Research Council–Polytechnic University of Valencia), Valencia, Spain

Mahtab Peyambari Pennsylvania State University, State College, PA, United States Sujal Phadke J. Craig Venter Institute, La Jolla, CA, United States Hanh T. Pham National Institute of Scientific Research – Armand-Frappier Health Research Centre, Laval, QC, Canada Mauro Pistello University of Pisa, Pisa, Italy Daniel Ponndorf John Innes Centre, Norwich, United Kingdom Leo L.M. Poon The University of Hong Kong, Pok Fu Lam, Hong Kong Welkin H. Pope University of Pittsburgh, Pittsburgh, PA, United States Minna M. Poranen University of Helsinki, Helsinki, Finland Claudine Porta The Pirbright Institute, Pirbright, United Kingdom and University of Oxford, Oxford, United Kingdom Samuel S. Porter National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, United States and University of Maryland, College Park, MD, United States Frank A. Post King's College Hospital NHS Foundation Trust, London, United Kingdom

Marta Pérez-Illana National Center for Biotechnology, Spanish National Research Council, Madrid, Spain

Nils Poulicard Interactions Plantes Microorganismes Environnement, Institut de Recherche pour le Développement, Centre de coopération internationale en recherche agronomique pour le développement, University of Montpellier, Montpellier, France

Jaume Pérez-Sánchez Institute of Aquaculture Torre de la Sal, Spanish National Research Council, Castellon, Spain

David Prangishvili Institut Pasteur, Paris, France and Ivane Javakhishvili Tbilisi State University, Tbilisi, Georgia

B. V. Venkataram Prasad Baylor College of Medicine, Houston, TX, United States Lalita Priyamvada Centers for Disease Control and Prevention, Atlanta, GA, United States

Chris M. Rands University of Geneva Medical School and Swiss Institute of Bioinformatics, Geneva, Switzerland Venigalla B. Rao The Catholic University of America, Washington, DC, United States

Simone Prospero Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland

Janne J. Ravantti University of Helsinki, Helsinki, Finland

Elisabeth Puchhammer-Stöckl Medical University of Vienna, Vienna, Austria

Mandy Ravensbergen Wageningen University and Research, Wageningen, The Netherlands

Jianming Qiu University of Kansas Medical Center, Kansas City, KS, United States

Georget Y. Reaiche-Miller The University of Adelaide, Adelaide, SA, Australia

S.L. Quackenbush Colorado State University, Fort Collins, CO, United States

D.V.R. Reddy International Crops Research Institute for the Semi-Arid Tropics, Hyderabad, India

Killian J. Quinn King’s College Hospital, London, United Kingdom

Vishwanatha R.A.P. Reddy The Pirbright Institute, Pirbright, United Kingdom

Diego F. Quito-Avila Department of Life Sciences, ESPOL Polytechnic University, Guayaquil, Ecuador

Juan Reguera Aix-Marseille University, French National Center for Scientific Research, Marseille, France and French National Institute of Health and Medical Research, Marseille, France

Frank Rabenstein Julius Kühn Institute, Quedlinburg, Germany Sheli R. Radoshitzky United States Army Medical Research Institute of Infectious Diseases, Frederick, MD, United States

William K. Reisen University of California, Davis, CA, United States Jingshan Ren University of Oxford, Oxford, United Kingdom

Saleem U. Rahman National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan

Renato O. Resende University of Brasilia, Brasilia, Brazil

Mbolarinosy Rakotomalala FOFIFA, Antananarivo, Madagascar

Peter A. Revill The Peter Doherty Institute of Infection and Immunity, Royal Melbourne Hospital, Melbourne, VIC, Australia

Norma Rallon Institute of Health Research-Jiménez Díaz Foundation, Autonomous University of Madrid and Rey Juan Carlos University Hospital, Móstoles, Spain

Félix A. Rey Institut Pasteur, Paris, France

Robert P. Rambo Diamond Light Source, Didcot, United Kingdom

Simone G. Ribeiro Embrapa Genetic Resources and Biotechnology, Brasília, Brazil

Bertha Cecilia Ramirez The Institute for Integrative Biology of the Cell, The French Alternative Energies and Atomic Energy Commission, French National Center for Scientific Research, University of Paris-Sud, University of Paris-Saclay, Gif-sur-Yvette, France María D. Ramos-Barbero University of Alicante, Alicante, Spain

Lara Rheinemann University of Utah, Salt Lake City, UT, United States

Daniel Rigling Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Birmensdorf, Switzerland Cristina Risco Cell Structure Laboratory, National Center for Biotechnology – Spanish National Research Council (CNB-CSIC), Madrid, Spain

Efraín E. Rivera-Serrano Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill, NC, United States and Department of Microbiology and Immunology, The University of North Carolina at Chapel Hill, NC, United States Cécile Robin INRAE – French National Research Institute for Agriculture, Food and Environment, UMR BIOGECO, Cestas, France Rodrigo A.L. Rodrigues Federal University of Minas Gerais, Belo Horizonte, Brazil Elina Roine University of Helsinki, Helsinki, Finland

Polly Roy Department of Pathogen Infection, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom and University of Reading, Reading, United Kingdom Aaron P. Roznowski The University of Texas at Austin, Austin, TX, United States and University of Arizona, Tucson, AZ, United States Luisa Rubino Institute for Sustainable Plant Protection, National Research Council, Bari, Italy Olli Ruuskanen Turku University Hospital, Turku, Finland

Maria R. Rojas University of California, Davis, CA, United States

Eugene V. Ryabov USDA, Agricultural Research Service, Beltsville, MD, United States

Marilyn J. Roossinck Pennsylvania State University, State College, PA, United States

Martin D. Ryan University of St. Andrews, St. Andrews, United Kingdom

Vera I.D. Ros Wageningen University and Research, Wageningen, The Netherlands

Ki H. Ryu Seoul Women’s University, Seoul, South Korea

Cristina Rosa Pennsylvania State University, University Park, PA, United States Hanna Rose Leibniz University Hannover, Hannover, Germany David A. Rosenbaum University of Florida, Gainesville, FL, United States Shannan L. Rossi The University of Texas Medical Branch, Galveston, TX, United States Michael G. Rossmann† Purdue University, West Lafayette, IN, United States

Hanns-Joachim Rziha Eberhard Karls University of Tübingen, Tübingen, Germany Sead Sabanadzovic Mississippi State University, Starkville, MS, United States Roghaiyeh Safari Cellular and Molecular Epigenetics (GIGA), Liège, Belgium and Molecular Biology (TERRA), Gembloux, Belgium Azeez Sait Sahul Hameed C. Abdul Hakeem College, Melvisharam, India

L. Roux University of Geneva Medical School, Geneva, Switzerland

Nicole Samies University of Alabama at Birmingham, Birmingham, AL, United States

Simon Roux US Department of Energy Joint Genome Institute, Walnut Creek, CA, United States

Carmen San Martín National Center for Biotechnology, Spanish National Research Council, Madrid, Spain

J. Rovnak Colorado State University, Fort Collins, CO, United States David J. Rowlands University of Leeds, Leeds, United Kingdom

† Deceased.

Ruth-Anne Sandaa Department of Biological Sciences, University of Bergen, Bergen, Norway Hélène Sanfaçon Agriculture and Agri-Food Canada, Summerland, BC, Canada

Rafael Sanjuán Institute for Integrative Systems Biology (I2SysBio), University of Valencia-CSIC, Valencia, Spain

Declan C. Schroeder University of Reading, Reading, United Kingdom and University of Minnesota, St. Paul, MN, United States

Neeraja Sankaran Utrecht University, Utrecht, The Netherlands

Stacey Schultz-Cherry St. Jude Children’s Research Hospital, Memphis, TN, United States

Fernando Santos University of Alicante, Alicante, Spain Cecilia Sarmiento Tallinn University of Technology, Tallinn, Estonia Takahide Sasaya National Agriculture and Food Research Organization, Fukuyama, Japan Preethi Sathanantham Virginia Tech, Blacksburg, VA, United States Panayampalli S. Satheshkumar Centers for Disease Control and Prevention, Atlanta, GA, United States Yukiyo Sato Okayama University, Kurashiki, Japan Andreas Sauerbrei Jena University Hospital, Jena, Germany Eugene I. Savenkov Swedish University of Agricultural Sciences, Uppsala, Sweden and Linnean Center for Plant Biology, Uppsala, Sweden Carita Savolainen-Kopra National Institute for Health and Welfare, Helsinki, Finland Kay Scheets Oklahoma State University, Stillwater, OK, United States Uffe V. Schenider Copenhagen University Hospital Hvidovre, Hvidovre, Denmark Richard H. Scheuermann J. Craig Venter Institute, La Jolla, CA, United States; University of California, San Diego, CA, United States; La Jolla Institute for Immunology, La Jolla, CA, United States; and Global Virus Network, Baltimore, MD, United States Manfred J. Schmitt Saarland University, Saarbrücken, Germany James E. Schoelz University of Missouri, Columbia, MO, United States Jason R. Schrad Michigan State University, East Lansing, MI, United States

Thomas F. Schulz Hannover Medical School, Institute of Virology, Hannover, Germany and German Center for Infection Research, Hannover-Braunschweig Site, Braunschweig, Germany Catherine A. Scougall The University of Adelaide, Adelaide, SA, Australia Kimberley D. Seed University of California, Berkeley, CA, United States Joaquim Segalés Department of Animal Health and Anatomy, Faculty of Veterinary Medicine, Autonomous University of Barcelona, Barcelona, Spain; Animal Health Research Center (CReSA) – Institute of Agrifood Research and Technology (IRTA), Campus UAB, Barcelona, Spain; and OIE Collaborating Center for the Research and Control of Emerging and Re-emerging Swine Diseases in Europe (IRTA-CReSA), Barcelona, Spain Mateo Seoane-Blanco National Center for Biotechnology, Madrid, Spain Madhumati Sevvana Purdue University, West Lafayette, IN, United States Kazım Sezen Department of Biology, Karadeniz Technical University, Trabzon, Turkey Arvind Sharma Institut Pasteur, Paris, France Sumit Sharma Linköping University, Linköping, Sweden James M. Sharp University of Zaragoza, Zaragoza, Spain and Edinburgh, United Kingdom Qunxin She Shandong University, Qingdao, China Keith E. Shearwin The University of Adelaide, Adelaide, SA, Australia Hanako Shimura Hokkaido University, Sapporo, Japan Reina S. Sikkema Erasmus Medical Center, Rotterdam, The Netherlands

Aaron Simkovich Agriculture and Agri-Food Canada, London, ON, Canada and The University of Western Ontario, London, ON, Canada Peter Simmonds University of Oxford, Oxford, United Kingdom Tarja Sironen University of Helsinki, Helsinki, Finland Susanna Sissonen Finnish Institute for Health and Welfare, Helsinki, Finland Michael A. Skinner Imperial College London, London, United Kingdom Douglas E. Smith University of California, San Diego, La Jolla, CA, United States Melvyn Smith Viapath Analytics, Specialist Virology Centre, King’s College NHS Foundation Trust, London, United Kingdom Thomas J. Smith The University of Texas Medical Branch, Galveston, TX, United States Teemu Smura Helsinki University Hospital and University of Helsinki, Helsinki, Finland Eric J. Snijder Leiden University Medical Center, Leiden, The Netherlands Gisela Soboll Hussey Michigan State University, East Lansing, MI, United States Maria Söderlund-Venermo University of Helsinki, Helsinki, Finland Merike Sõmera Tallinn University of Technology, Tallinn, Estonia Eun G. Song Seoul Women’s University, Seoul, South Korea Milan J. Sonneveld Erasmus University Medical Center, Rotterdam, The Netherlands Beatriz Soriano Biotechvana, Scientific Park University of Valencia and Institute for Integrative Systems Biology (I2SysBio), University of Valencia–Spanish National Research Council, Valencia, Spain

Thomas E. Spencer University of Missouri, Columbia, MO, United States Pothur Sreenivasulu Sri Venkateswara University, Tirupati, India Ashley L. St. John Duke-NUS Medical School, Singapore, Singapore David K. Stammers University of Oxford, Oxford, United Kingdom John Stanley John Innes Centre, Colney, United Kingdom Glyn Stanway University of Essex, Colchester, United Kingdom Thilo Stehle University of Tuebingen, Tuebingen, Germany and Vanderbilt University School of Medicine, Nashville, TN, United States Gregory W. Stevenson Iowa State University, Ames, IA, United States Lucy Rae Stewart Agricultural Research Service, US Department of Agriculture, Wooster, OH, United States C.C.M.M. Stijger Wageningen University and Research Center, Bleiswijk, The Netherlands Peter G. Stockley University of Leeds, Leeds, United Kingdom David Stone Weymouth Laboratory, Centre for Environment, Fisheries and Aquaculture Science, Weymouth, United Kingdom Ashley E. Strother The University of Texas Medical Branch, Galveston, TX, United States Sundharraman Subramanian Michigan State University, East Lansing, MI, United States William C. Summers Yale University, New Haven, CT, United States Liying Sun Northwest A&F University, Yangling, China Wesley I. Sundquist University of Utah, Salt Lake City, UT, United States Petri Susi University of Turku, Turku, Finland Curtis A. Suttle University of British Columbia, Vancouver, BC, Canada

Nobuhiro Suzuki Institute of Plant Stress and Resources (IPSR), Okayama University, Kurashiki, Japan Lennart Svensson Linköping University, Linköping, Sweden and Karolinska Institute, Stockholm, Sweden Ronald Swanstrom University of North Carolina at Chapel Hill, Chapel Hill, NC, United States

Nicholas M.I. Taylor University of Copenhagen, Copenhagen, Denmark Xu Tengzhi University of California, Davis, CA, United States Raquel Tenorio Cell Structure Laboratory, National Center for Biotechnology – Spanish National Research Council (CNB-CSIC), Madrid, Spain

Daniele M. Swetnam University of California, Davis, CA, United States

Robert B. Tesh The University of Texas Medical Branch, Galveston, TX, United States

Moriah L. Szpara Pennsylvania State University, University Park, PA, United States

Vaskar Thapa Pennsylvania State University, State College, PA, United States

Keisuke Tabata Heidelberg University, Heidelberg, Germany

John E. Thomas The University of Queensland, Brisbane, QLD, Australia

Anna Taglienti Council for Agricultural Research and Economics, Research Center for Plant Protection and Certification, Rome, Italy

Julie A. Thomas Rochester Institute of Technology, Rochester, NY, United States

Naoki Takeshita Tokyo University of Agriculture and Technology, Fuchu, Japan Kana Takeshita Urayama Tokyo University of Agriculture and Technology, Fuchu, Japan Michael E. Taliansky The James Hutton Institute, Dundee, United Kingdom Pan Tao The Catholic University of America, Washington, DC, United States

Lynn C. Thomason Frederick National Laboratory for Cancer Research, Frederick, MD, United States Elizabeth Ashley Thompson The University of Southern Mississippi, Hattiesburg, MS, United States Jeremy R. Thompson Cornell University, Ithaca, NY, United States Antonio Tiberini Council for Agricultural Research and Economics, Research Center for Plant Protection and Certification, Rome, Italy

Jacqueline E. Tate Centers for Disease Control and Prevention, Atlanta, GA, United States

Peter Tijssen National Institute of Scientific Research – Armand-Frappier Health Research Centre, Microbiology and Immunology, Laval, QC, Canada

Satyanarayana Tatineni Agricultural Research Service, US Department of Agriculture, Lincoln, NE, United States and University of Nebraska–Lincoln, Lincoln, NE, United States

Yuji Tomaru Japan Fisheries Research and Education Agency, Kanagawa, Japan

Sisko Tauriainen University of Turku, Turku, Finland Norbert Tautz University of Luebeck, Luebeck, Germany Paulo Tavares Institute for Integrative Biology of the Cell, CEA, CNRS, University of Paris-Sud, University of Paris-Saclay, Gif-sur-Yvette, France

Laura Tomassoli Council for Agricultural Research and Economics, Research Center for Plant Protection and Certification, Rome, Italy Ruben Torres National Biotechnology Center–Spanish National Research Council, Madrid, Spain Jia Q. Truong The University of Adelaide, Adelaide, SA, Australia

Erkki Truve† Tallinn University of Technology, Tallinn, Estonia Chih-Hsuan Tsai Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan Roman Tuma University of Leeds, Leeds, United Kingdom and University of South Bohemia, České Budějovice, Czech Republic Topi Turunen Infectious Disease Unit, Espoo, Finland and Finnish Institute for Health and Welfare, Helsinki, Finland Reidun Twarock University of York, York, United Kingdom Ioannis E. Tzanetakis University of Arkansas, Fayetteville, United States Antti Vaheri University of Helsinki, Helsinki, Finland Eeva J. Vainio Natural Resources Institute Finland (Luke), Helsinki, Finland Anna M. Vaira Institute for Sustainable Plant Protection, National Research Council of Italy, Torino, Italy Steven M. Valles Center for Medical, Agricultural and Veterinary Entomology, Agricultural Research Service, US Department of Agriculture, Gainesville, FL, United States Adrián Valli National Center for Biotechnology-Spanish National Research Council, Madrid, Spain Rodrigo A. Valverde Louisiana State University Agricultural Center, Baton Rouge, United States Pierre Van Damme University of Antwerp, Antwerp, Belgium Rene A.A. van der Vlugt Wageningen University and Research Center, Wageningen, The Netherlands Bernard A.M. Van der Zeijst Leiden University Medical Center, Leiden, The Netherlands Koenraad Van Doorslaer University of Arizona, Tucson, AZ, United States

James L. Van Etten University of Nebraska–Lincoln, Lincoln, NE, United States Suzanne van Meer University Medical Center Utrecht, Utrecht, The Netherlands Monique M. van Oers Wageningen University and Research, Wageningen, The Netherlands Mark J. van Raaij National Center for Biotechnology, Madrid, Spain Marc H.V. Van Regenmortel University of Strasbourg, Strasbourg, France Piet A. van Rijn Wageningen Bioveterinary Research, Lelystad, The Netherlands and North-West University, Potchefstroom, South Africa Alain Vanderplasschen University of Liège, Liège, Belgium Dana L. Vanlandingham College of Veterinary Medicine, Kansas State University, Manhattan, KS, United States Olli Vapalahti Helsinki University Hospital and University of Helsinki, Helsinki, Finland Mark Varrelmann Institute of Sugar Beet Research, Göttingen, Germany Nikos Vasilakis The University of Texas Medical Branch, Galveston, TX, United States Michael Veit Free University of Berlin, Berlin, Germany Česlovas Venclovas Vilnius University, Vilnius, Lithuania H. Josef Vetten Julius Kühn Institute, Braunschweig, Germany Marli Vlok University of British Columbia, Vancouver, BC, Canada Anne-Nathalie Volkoff Diversity, Genomes and Insects-Microorganisms Interactions, National Institute of Agricultural Research, University of Montpellier, Montpellier, France Ian E.H. Voorhees Cornell University, Ithaca, NY, United States Alex Vorsters University of Antwerp, Antwerp, Belgium

Jonathan D.F. Wadsworth UCL Institute of Prion Diseases, London, United Kingdom

Kerstin Wernike Friedrich-Loeffler-Institute, Insel Riems, Germany

Peter J. Walker The University of Queensland, St. Lucia, QLD, Australia

Rachel J. Whitaker University of Illinois at Urbana-Champaign, Urbana, IL, United States

Paul Wallace Quality Control for Molecular Diagnostics (QCMD), Glasgow, United Kingdom

K. Andrew White York University, Toronto, ON, Canada

Aiming Wang Agriculture and Agri-Food Canada, London, ON, Canada

Anna E. Whitfield North Carolina State University, Raleigh, NC, United States

Jen-Ren Wang National Cheng Kung University, Tainan, Taiwan

Richard Whitley University of Alabama at Birmingham, Birmingham, AL, United States

Lin-Fa Wang Duke-NUS Medical School, Singapore, Singapore Nan Wang Institute of Biophysics, Chinese Academy of Sciences, Beijing, China Xiangxi Wang Institute of Biophysics, Chinese Academy of Sciences, Beijing, China Xiaofeng Wang Virginia Tech, Blacksburg, VA, United States Katherine N. Ward University College London, London, United Kingdom Matti Waris University of Turku, Turku, Finland Ranjit Warrier Purdue University, West Lafayette, IN, United States Daniel Watterson Australian Infectious Diseases Research Centre, School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD, Australia Marta L. Wayne University of Florida, Gainesville, FL, United States

Reed B. Wickner National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, United States Luc Willems Cellular and Molecular Epigenetics (GIGA), Liège, Belgium and Molecular Biology (TERRA), Gembloux, Belgium Brian J. Willett MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom Alexis Williams The University of Texas Medical Branch, Galveston, TX, United States Stephen A. Winchester Frimley Park Hospital, Frimley, United Kingdom and Immunisation and Countermeasures Division, Public Health England, London, United Kingdom Clayton W. Winkler National Institutes of Health, Hamilton, MT, United States

Friedemann Weber FB 10 – Institute for Virology, Justus Liebig University Giessen, Giessen, Germany

Stephan Winter Leibniz Institute – DSMZ – German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

Sung-Chan Wei Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan

William M. Wintermantel Agricultural Research Service, US Department of Agriculture, Salinas, CA, United States

Robin A. Weiss University College London, London, United Kingdom

Jennifer Wirth Montana State University, Bozeman, MT, United States

Tao Weitao Southwest Baptist University, Bolivar, MO, United States

Yuri I. Wolf National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, United States

Thorsten Wolff Robert Koch Institute, Berlin, Germany

Lawrence S. Young University of Warwick, Coventry, United Kingdom

Blaide Woodburn University of North Carolina at Chapel Hill, Chapel Hill, NC, United States

Mark J. Young Montana State University, Bozeman, MT, United States

Michael E. Woodson The University of Texas Medical Branch, Galveston, TX, United States Courtney Woolsey The University of Texas Medical Branch, Galveston, TX, United States Chien-Fu Wu Tokyo University of Agriculture and Technology, Fuchu, Japan Mingde Wu Huazhong Agricultural University, Wuhan, China Songsong Wu Huazhong Agricultural University, Wuhan, China Yan Xiang University of Texas Health Science Center at San Antonio, San Antonio, TX, United States Jiatao Xie Huazhong Agricultural University, Wuhan, China Zhuang Xiong Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China Hajime Yaegashi Institute of Fruit Tree and Tea Science, NARO, Morioka, Japan Mehtap Yakupoğlu Trabzon University, Trabzon, Turkey Yasuyuki Yamaji The University of Tokyo, Tokyo, Japan Meng Yang China Agricultural University, Beijing, China Teng-Chieh Yang Scarsdale, NY, United States Qin Yao Jiangsu University, Zhenjiang, China Tianyou Yao Baylor College of Medicine, Houston, TX, United States Nobuyuki Yoshikawa Iwate University, Morioka, Japan George R. Young Francis Crick Institute, London, United Kingdom

Ry Young Texas A&M University, College Station, TX, United States Isaac T. Younker University of Alabama, Tuscaloosa, AL, United States Qian Yu School of Life Sciences, Jiangsu University, Zhenjiang, China Sang-Im Yun Utah State University, Logan, UT, United States Fauzia Zarreen University of Delhi, New Delhi, India Francisco M. Zerbini Federal University of Viçosa, Viçosa, Brazil Dong-Xiu Zhang University of Maryland, Rockville, MD, United States Jianqiang Zhang Iowa State University, Ames, IA, United States Junjie Zhang Texas A&M University, College Station, TX, United States Long Zhang College of Pharmacy, The Ohio State University, Columbus, OH, United States Pan Zhang Central South University, Changsha, China Peijun Zhang University of Oxford, Oxford, United Kingdom and Electron Bio-Imaging Centre, Diamond Light Source, Didcot, United Kingdom Tao Zhang Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China Yongliang Zhang China Agricultural University, Beijing, China Zhenlu Zhang Shandong Agricultural University, Tai’an, China Lixia Zhou College of Pharmacy, The Ohio State University, Columbus, OH, United States

Ling Zhu Institute of Biophysics, Chinese Academy of Sciences, Beijing, China Heiko Ziebell Julius Kühn Institute – Federal Research Center for Cultivated Plants, Braunschweig, Germany John Ziebuhr The Queen's University of Belfast, Belfast, United Kingdom

Jeffrey J. Zimmerman Iowa State University, Ames, IA, United States Falk Zucker Institute for Chemistry and Biology of the Marine Environment, Oldenburg, Germany

CONTENT OF ALL VOLUMES

Editors in Chief

Editorial Board

Section Editors

Foreword

Preface

Guide to Use

List of Contributors

VOLUME 1

The Virus as a Concept – Fundamentals of Virology

A Brief History of Virology David J Rowlands

The Origin of Viruses Patrick Forterre and Morgan Gaïa

The Virocell Concept Patrick Forterre

Virus Taxonomy Jens H Kuhn

The Greater Virus World and Its Evolution Eugene V Koonin and Valerian V Dolja

The Virus Species Concept Peter Simmonds

Genetic Diversity and Evolution of Viral Populations Rafael Sanjuán and Pilar Domingo-Calap

Mechanisms of RNA Virus Evolution Lisa M Bono and Siobain Duffy

Mechanisms of DNA Virus Evolution Moriah L Szpara and Koenraad Van Doorslaer

Paleovirology Clément Gilbert

Evolution Steered by Structure Nicola GA Abrescia

Pairwise Sequence Comparison in Virology Tao Zhang, Zheng Gong, Tongkun Guo, Zhuang Xiong, and Yiming Bao

Computational Analysis of Recombination in Viral Nucleotide Sequences Miguel Arenas

Phylogeny of Viruses Alexander E Gorbalenya and Chris Lauber

Virus Bioinformatics Nikolaos Pappas, Simon Roux, Martin Hölzer, Kevin Lamkiewicz, Florian Mock, Manja Marz, and Bas E Dutilh

Metagenomics in Virology Simon Roux, Jelle Matthijnssens, and Bas E Dutilh

Database and Analytical Resources for Viral Research Community Sujal Phadke, Saichetana Macherla, and Richard H Scheuermann

Classification of the Viral World Based on Atomic Level Structures Janne J Ravantti and Nicola GA Abrescia

Isolating, Culturing, and Purifying Viruses With a Focus on Bacterial and Archaeal Viruses Katri Eskelin and Hanna M Oksanen

High Throughput Sequencing and Virology Graham L Freimanis and Nick J Knowles

Single-Virus Genomics: Studying Uncultured Viruses, One at a Time Manuel Martinez-Garcia, Francisco Martinez-Hernandez, and Joaquín Martínez Martínez

184

Biophysical Characterizations in the Solution State Robert P Rambo and Katsuaki Inoue

191

Virus Crystallography Jonathan M Grimes

199

Advanced Light and Correlative Microscopy in Virology Sergi Padilla-Parra, Charles A Coomer, and Irene Carlon-Andres

208

Atomic Force Microscopy (AFM) Investigation of Viruses Alexander McPherson

218

Cryo-Electron Microscopy (CEM) Structures of Viruses David Chmielewski and Wah Chiu

233

Analysis of Viruses in the Cellular Context by Electron Tomography Peijun Zhang and Luiza Mendonça

242

Mathematical Modeling of Virus Architecture Reidun Twarock

248

Principles of Virus Structure Madhumati Sevvana, Thomas Klose, and Michael G Rossmann†

257

Structures of Small Icosahedral Viruses Elizabeth E Fry, Jingshan Ren, and Claudine Porta

278

Structural Principles of the Flavivirus Particle Organization and of Its Conformational Changes Stéphane Duquerroy, Arvind Sharma, and Félix A Rey

290

Reoviruses (Reoviridae) and Their Structural Relatives Liya Hu, Mary K Estes, and B V Venkataram Prasad

303



† Deceased.

Structures of Tailed Phages and Herpesviruses (Herpesviridae) Montserrat Fàbrega-Ferrer and Miquel Coll

318

Adenoviruses (Adenoviridae) and Their Structural Relatives Gabriela N Condezo, Natalia Martín-González, Marta Pérez-Illana, Mercedes Hernando-Pérez, José Gallardo, and Carmen San Martín

329

Negative Single-Stranded RNA Viruses (Mononegavirales): A Structural View Juan Reguera

345

Structure of Retrovirus Particles (Retroviridae) David K Stammers and Jingshan Ren

352

Structure of Helical Viruses C Martin Lawrence

362

Giant Viruses and Their Virophage Parasites Rodrigo AL Rodrigues, Ana CdSP Andrade, Graziele P Oliveira, and Jônatas S Abrahão

372

Viral Replication Cycle AJ Cann

382

Viral Receptors José M Casasnovas and Thilo Stehle

388

Bacterial and Archaeal Virus Entry Minna M Poranen and Aušra Domanska

402

Nonenveloped Eukaryotic Virus Entry Ian M Jones and Polly Roy

409

Enveloped Virus Membrane Fusion Aurélie A Albertini and Yves Gaudin

417

Genome Replication of Bacterial and Archaeal Viruses Česlovas Venclovas

429

Viral Transcription David LV Bauer and Ervin Fodor

439

Translation of Viral Proteins Martin D Ryan and Garry A Luke

444

Recombination Jozef J Bujarski

460

Assembly of Viruses: Enveloped Particles CK Navaratnarajah, R Warrier, and RJ Kuhn

468

Assembly of Viruses: Nonenveloped Particles M Luo

475

Virion Assembly: From Small Picornaviruses (Picornaviridae) to Large Herpesviruses (Herpesviridae) Ling Zhu, Nan Wang, and Xiangxi Wang

480

Genome Packaging Richard J Bingham, Reidun Twarock, Carlos P Mata, and Peter G Stockley

488

Virus Factories Isabel Fernández de Castro, Raquel Tenorio, and Cristina Risco

495

Release of Phages From Prokaryotic Cells Jesse Cahill and Ry Young

501

Virus Budding Lara Rheinemann and Wesley I Sundquist

519

Vesicle-Mediated Transcytosis and Export of Viruses Efraín E Rivera-Serrano and Stanley M Lemon

529

Vector Transmission of Animal Viruses Houssam Attoui, Fauziah Mohd Jaafar, Rennos Fragkoudis, and Peter PC Mertens

542

The Human Virome Alexia Bordigoni, Sébastien Halary, and Christelle Desnues

552

Epidemiology of Human and Animal Viral Diseases Michael Edelstein

559

Zoonosis, Emerging and Re-Emerging Viral Diseases Janet M Daly

569

Antiviral Innate Immunity: Introduction Friedemann Weber

577

Humoral and T Cell-Mediated Immunity to Viruses Ane Ogbe and Lucy Dorrell

584

Antigenicity and Antigenic Variation Kuan-Ying A Huang, Xiaorui Chen, Che Ma, Dayna Cheng, Jen-Ren Wang, and Wan-Chun Lai

597

Antigen Presentation Andrew J McMichael

601

Defense Against Viruses and Other Genetic Parasites in Prokaryotes Kira S Makarova, Yuri I Wolf, and Eugene V Koonin

606

Defective-Interfering Viruses L Roux

617

Ecology and Global Impacts of Viruses Joanne B Emerson

621

The Role of Retroviruses in Cellular Evolution Andrea Kirmaier and Welkin E Johnson

627

The Role of Bacteriophages in Bacterial Evolution Chris M Rands and Harald Brüssow

633

Viruses and Their Potential for Bioterrorism Dana L Vanlandingham and Stephen Higgs

644

The Use of Viral Promoters in Expression Vectors Ian M Jones

652

Oncolytic Viruses Laura Burga and Mihnea Bostina

658

Biotechnology Approaches to Modern Vaccine Design George P Lomonossoff and Daniel Ponndorf

662

Viruses: Impact on Science and Society Neeraja Sankaran and Robin A Weiss

671

VOLUME 2 Viruses as Infectious Agents: Human and Animal Viruses

Adenoviruses (Adenoviridae) Balázs Harrach and Mária Benkő

3

African Horse Sickness Virus (Reoviridae) Piet A van Rijn

17

African Swine Fever Virus (Asfarviridae) Linda K Dixon, Rachel Nash, Philippa C Hawes, and Christopher L Netherton

22

Akabane Virus and Schmallenberg Virus (Peribunyaviridae) Martin Beer and Kerstin Wernike

34

Alphaviruses Causing Encephalitis (Togaviridae) Diane E Griffin

40

Anelloviruses (Anelloviridae) Fabrizio Maggi and Mauro Pistello

48

Animal Lentiviruses (Retroviridae) Esperanza Gomez-Lucia

56

Animal Morbilliviruses (Paramyxoviridae) Carina Conceicao and Dalan Bailey

68

Animal Papillomaviruses (Papillomaviridae) John S Munday

79

Astroviruses (Astroviridae) Virginia Hargest, Amy Davis, and Stacey Schultz-Cherry

92

Avian Hepadnaviruses (Hepadnaviridae) Allison R Jilbert, Georget Y Reaiche-Miller, and Catherine A Scougall

100

Avian Herpesviruses (Herpesviridae) Vishwanatha RAP Reddy and Venugopal Nair

112

Avian Influenza Viruses (Orthomyxoviridae) Nicolas Bravo-Vasquez and Stacey Schultz-Cherry

117

Avian Leukosis and Sarcoma Viruses (Retroviridae) Karen L Beemon

122

Bluetongue Virus (Reoviridae) Raghavendran Kulasegaran-Shylini and Polly Roy

127

Borna Disease Virus and Related Bornaviruses (Bornaviridae) Susan L Payne

137

Bovine Leukemia Virus (Retroviridae) Thomas Joris, Roghaiyeh Safari, Jean-Rock Jacques, and Luc Willems

144

Bovine Viral Diarrhea, Border Disease, and Classical Swine Fever Viruses (Flaviviridae) Paul Becher, Volker Moennig, and Norbert Tautz

153

Capripoxviruses, Parapoxviruses, and Other Poxviruses of Ruminants (Poxviridae) Philippa M Beard

165

Chikungunya Virus (Togaviridae) Thomas E Morrison and Stephanie E Ander

173

Circoviruses (Circoviridae) Giovanni Franzo and Joaquim Segalés

182

Coronaviruses: General Features (Coronaviridae) Paul Britton

193

Coronaviruses: Molecular Biology (Coronaviridae) X Deng and SC Baker

198

Crimean-Congo Hemorrhagic Fever Virus and Nairoviruses of Medical Importance (Nairoviridae) Ali Mirazimi, Felicity Burt, and Anna Papa

208

Dengue Viruses (Flaviviridae) Ashley L St. John and Duane J Gubler

218

Ebola Virus (Filoviridae) Andrea Marzi and Logan Banadyga

232

Enteroviruses (Picornaviridae) Carita Savolainen-Kopra, Soile Blomqvist, and Petri Susi

245

Enveloped, Positive-Strand RNA Viruses (Nidovirales) L Enjuanes, AE Gorbalenya, RJ de Groot, JA Cowley, J Ziebuhr, and EJ Snijder

256

Epstein–Barr Virus (Herpesviridae) Lawrence S Young

267

Equine Herpesviruses (Herpesviridae) Gisela Soboll Hussey, Nikolaus Osterrieder, and Walid Azab

278

Equine, Canine, and Swine Influenza (Orthomyxoviridae) Janet M Daly and Japhette E Kembou-Ringert

287

Feline Calicivirus (Caliciviridae) Margaret J Hosie and Michaela J Conley

294

Feline Leukemia and Sarcoma Viruses (Retroviridae) Brian J Willett and Margaret J Hosie

300

Fish and Amphibian Alloherpesviruses (Herpesviridae) Maxime Boutier, Léa Morvan, Natacha Delrez, Francesco Origgi, Andor Doszpoly, and Alain Vanderplasschen

306

Fish Retroviruses (Retroviridae) TA Paul, RN Casey, PR Bowser, JW Casey, J Rovnak, and SL Quackenbush

316

Fish Rhabdoviruses (Rhabdoviridae) Gael Kurath and David Stone

324

Foot-and-Mouth Disease Viruses (Picornaviridae) David J Rowlands

332

Fowlpox Virus and Other Avipoxviruses (Poxviridae) Efstathios S Giotis and Michael A Skinner

343

Hantaviruses (Hantaviridae) Tarja Sironen and Antti Vaheri

349

Henipaviruses (Paramyxoviridae) Lin-Fa Wang and Danielle E Anderson

355

Hepatitis A Virus (Picornaviridae) Andreas Dotzauer

362

Hepatitis B Virus (Hepadnaviridae) Peter Karayiannis

373

Hepatitis C Virus (Flaviviridae) Ralf Bartenschlager and Keisuke Tabata

386

Hepeviruses (Hepeviridae) Xiang-Jin Meng

397

Herpes Simplex Virus 1 and 2 (Herpesviridae) David M Knipe and Richard Whitley

404

History of Virology: Vertebrate Viruses F Fenner

414

Human Boca- and Protoparvoviruses (Parvoviridae) Maria Söderlund-Venermo and Jianming Qiu

419

Human Coronavirus-229E, -OC43, -NL63, and -HKU1 (Coronaviridae) Ding X Liu, Jia Q Liang, and To S Fung

428

Human Cytomegalovirus (Herpesviridae) Edward S Mocarski

441

Human Immunodeficiency Virus (Retroviridae) Blaide Woodburn, Ann Emery, and Ronald Swanstrom

460

Human Metapneumovirus (Pneumoviridae) Antonella Casola, Matteo P Garofalo, and Xiaoyong Bao

475

Human Norovirus and Sapovirus (Caliciviridae) Sumit Sharma, Marie Hagbom, Lennart Svensson, and Johan Nordgren

483

Human Papillomaviruses (Papillomaviridae) Alison A McBride and Samuel S Porter

493

Human Parainfluenza Viruses (Paramyxoviridae) Elisabeth Adderson

502

Human Pathogenic Arenaviruses (Arenaviridae) Sheli R Radoshitzky and Juan C de la Torre

507

Human Polyomaviruses (Polyomaviridae) Melissa S Maginnis

518

Human T-Cell Leukemia Virus-1 and -2 (Retroviridae) Amanda R Panfil and Patrick L Green

528

Infectious Bursal Disease Virus (Birnaviridae) Daral J Jackwood

540

Infectious Pancreatic Necrosis Virus (Birnaviridae) Øystein Evensen

544

Influenza A Viruses (Orthomyxoviridae) Laura Kakkola, Niina Ikonen, and Ilkka Julkunen

551

Influenza B, C and D Viruses (Orthomyxoviridae) Thorsten Wolff and Michael Veit

561

Jaagsiekte Sheep Retrovirus (Retroviridae) James M Sharp, Marcelo De las Heras, Massimo Palmarini, and Thomas E Spencer

575

Japanese Encephalitis Virus (Flaviviridae) Sang-Im Yun and Young-Min Lee

583

Kaposi’s Sarcoma-Associated Herpesvirus (Herpesviridae) Anne K Cordes and Thomas F Schulz

598

Marburg and Ravn Viruses (Filoviridae) Courtney Woolsey, Thomas W Geisbert, and Robert W Cross

608

Measles Virus (Paramyxoviridae) Roberto Cattaneo and Michael McChesney

619

Molluscum Contagiosum Virus (Poxviridae) Joachim J Bugert and Rosina Ehmann

629

Mumps Virus (Paramyxoviridae) Stephen A Winchester and Kevin E Brown

634

Murine Leukemia and Sarcoma Viruses (Retroviridae) George R Young and Kate N Bishop

643

Newcastle Disease Virus (Paramyxoviridae) Ben Peeters and Guus Koch

648

Orthobunyaviruses (Peribunyaviridae) Alyssa B Evans, Clayton W Winkler, and Karin E Peterson

654

Parapoxviruses (Poxviridae) Hanns-Joachim Rziha and Mathias Büttner

666

Parechoviruses (Picornaviridae) Sisko Tauriainen and Glyn Stanway

675

Parvoviruses of Carnivores, and the Emergence of Canine Parvovirus (Parvoviridae) Colin R Parrish, Ian EH Voorhees, and Susan L Hafenstein

683

Polioviruses (Picornaviridae) Philip D Minor

688

Porcine Reproductive and Respiratory Syndrome Virus and Equine Arteritis Virus (Arteriviridae) Jianqiang Zhang, Alan T Loynachan, Gregory W Stevenson, and Jeffrey J Zimmerman

697

Prions of Vertebrates Jonathan DF Wadsworth and John Collinge

707

Pseudorabies Virus (Herpesviridae) Thomas C Mettenleiter and Barbara G Klupp

714

Rabbit Hemorrhagic Disease Virus and European Brown Hare Syndrome Virus (Caliciviridae) Lorenzo Capucci, Patrizia Cavadini, and Antonio Lavazza

724

Rabbit Myxoma Virus and the Fibroma Viruses (Poxviridae) Peter J Kerr

730

Rabies and Other Lyssaviruses (Rhabdoviridae) Ashley C Banyard and Anthony R Fooks

738

Respiratory Syncytial Virus (Pneumoviridae) Tiffany King, Tiffany Jenkins, Supranee Chaiwatpongsakorn, and Mark E Peeples

747

Rhinoviruses (Picornaviridae) Matti Waris and Olli Ruuskanen

757

Rift Valley Fever Virus and Other Phleboviruses (Phenuiviridae) Tetsuro Ikegami

765

Roseoloviruses: Human Herpesviruses 6A, 6B and 7 (Herpesviridae) Katherine N Ward

778

Rotaviruses (Reoviridae) Juana Angel and Manuel A Franco

789

Rubella Virus (Togaviridae) Annette Mankertz

797

Saint Louis Encephalitis Virus (Flaviviridae) William K Reisen, Lark L Coffey, Daniele M Swetnam, and Aaron C Brault

805

Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS) (Coronaviridae) Joseph SM Peiris and Leo LM Poon

814

Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) (Coronaviridae) Malik Peiris

825

Simian Immunodeficiency Virus (SIV) and HIV-2 (Retroviridae) Phyllis J Kanki

827

Sindbis Virus (Togaviridae) Satu Kurkela

837

Tick-Borne Encephalitis Virus (Flaviviridae) Teemu Smura, Suvi Kuivanen, and Olli Vapalahti

843

Transmissible Gastroenteritis Virus of Pigs and Porcine Epidemic Diarrhea Virus (Coronaviridae) Qiang Liu and Volker Gerdts

850

Vaccinia Virus (Poxviridae) Yan Xiang and Rebecca K Lane

854

Varicella-Zoster Virus (Herpesviridae) Jeffrey I Cohen

860

Variola and Monkeypox Viruses (Poxviridae) Lalita Priyamvada and Panayampalli S Satheshkumar

868

Vesicular Stomatitis Virus and Bovine Ephemeral Fever Virus (Rhabdoviridae) Peter J Walker and Robert B Tesh

875

West Nile Virus (Flaviviridae) Fengwei Bai and Elizabeth Ashley Thompson

884

Yellow Fever Virus (Flaviviridae) Ashley E Strother and Alan DT Barrett

891

Zika Virus (Flaviviridae) Nikos Vasilakis, Shannan L Rossi, Sasha R Azar, Irma E Cisneros, Cassia F Estofolete, and Mauricio L Nogueira

899

VOLUME 3 Viruses as Infectious Agents: Plant Viruses

An Introduction to Plant Viruses Roger Hull

3

Emerging and Re-Emerging Plant Viruses Sabrina Bertin, Francesco Faggioli, Andrea Gentili, Ariana Manglli, Anna Taglienti, Antonio Tiberini, and Laura Tomassoli

8

Emerging Geminiviruses (Geminiviridae) Muhammad S Nawaz-ul-Rehman, Nazia Nahid, and Muhammad Mubin

21

Movement of Viruses in Plants Manfred Heinlein

32

Plant Antiviral Defense: Gene-Silencing Pathways Vitantonio Pantaleo, Chikara Masuta, and Hanako Shimura

43

Plant Resistance to Viruses: Engineered Resistance Marc Fuchs

52

Plant Resistance to Viruses: Natural Resistance Associated With Dominant Genes Mandy Ravensbergen and Richard Kormelink

60

Plant Resistance to Viruses: Natural Resistance Associated With Recessive Genes Masayoshi Hashimoto, Kensaku Maejima, Yasuyuki Yamaji, and Shigetou Namba

69

Plant Viral Diseases: Economic Implications Basavaprabhu L Patil

81

Retrotransposons of Plants M-A Grandbastien

98

Vector Transmission of Plant Viruses Etienne Herrbach and Quentin Chesnais

106

Viral Suppressors of Gene Silencing Hernan Garcia-Ruiz

116

Virus-Induced Gene Silencing (VIGS) Xu Tengzhi, Ugrappa Nagalakshmi, and Savithramma P Dinesh-Kumar

123

Alfalfa Mosaic Virus (Bromoviridae) L Sue Loesch-Fries

132

Alphaflexiviruses (Alphaflexiviridae) Sergey Y Morozov and Alexey A Agranovsky

140

Alphasatellites (Alphasatellitidae) Rob W Briddon and Muhammad S Nawaz-ul-Rehman

149

Amalgaviruses (Amalgaviridae) Ioannis E Tzanetakis, Sead Sabanadzovic, and Rodrigo A Valverde

154

Badnaviruses (Caulimoviridae) Andrew DW Geering

158

Banana Bunchy Top Virus (Nanoviridae) John E Thomas

169

Barley Yellow Dwarf Viruses (Luteoviridae) Leslie L Domier

176

Bean Common Mosaic Virus and Bean Common Mosaic Necrosis Virus (Potyviridae) Ramon Jordan and John Hammond

184

Bean Golden Mosaic Virus and Bean Golden Yellow Mosaic Virus (Geminiviridae) Francisco M Zerbini and Simone G Ribeiro

192

Beet Curly Top Virus (Geminiviridae) Robert L Gilbertson, Tomas A Melgarejo, Maria R Rojas, William M Wintermantel, and John Stanley

200

Beet Necrotic Yellow Vein Virus (Benyviridae) Sebastian Liebe, Annette Niehl, Renate Koenig, and Mark Varrelmann

213

Benyviruses (Benyviridae) Annette Niehl, Sebastian Liebe, Mark Varrelmann, and Renate Koenig

219

Betaflexiviruses (Betaflexiviridae) Nobuyuki Yoshikawa and Hajime Yaegashi

229

Betasatellites and Deltasatellites (Tolecusatellitidae) Muhammad S Nawaz-ul-Rehman, Nazia Nahid, Muhammad Hassan, and Muhammad Mubin

239

Bluner-, Cile-, and Higreviruses (Kitaviridae) Diego F Quito-Avila, Juliana Freitas-Astúa, and Michael J Melzer

247

Brome Mosaic Virus (Bromoviridae) Guijuan He, Zhenlu Zhang, Preethi Sathanantham, Arturo Diaz, and Xiaofeng Wang

252

Bromoviruses (Bromoviridae) Jozef J Bujarski

260

Bymoviruses (Potyviridae) Annette Niehl and Frank Rabenstein

268

Cacao Swollen Shoot Virus (Caulimoviridae) Emmanuelle Muller

274

Carmo-Like Viruses (Tombusviridae) Miryam Pérez-Cañamás and Carmen Hernández

285

Cassava Brown Streak Viruses (Potyviridae) Basavaprabhu L Patil

293

Cassava Mosaic Viruses (Geminiviridae) James Legg and Stephan Winter

301

Caulimoviruses (Caulimoviridae) James E Schoelz and Mustafa Adhab

313

Cheraviruses, Sadwaviruses and Torradoviruses (Secoviridae) Toru Iwanami and René AA van der Vlugt

322

Citrus Tristeza Virus (Closteroviridae) Moshe Bar-Joseph, Scott J Harper, and William O Dawson

327

Closteroviruses (Closteroviridae) Marc Fuchs

336

Comoviruses and Fabaviruses (Secoviridae) George P Lomonossoff

348

Cotton Leaf Curl Disease (Geminiviridae) Nasim Ahmed, Imran Amin, and Shahid Mansoor

355

Cowpea Mosaic Virus (Secoviridae) George P Lomonossoff

364

Cucumber Mosaic Virus (Bromoviridae) Judith Hirsch and Benoît Moury

371

Dianthovirus (Tombusviridae) Kiwamu Hyodo and Masanori Kaido

383

Endornaviruses (Endornaviridae) Toshiyuki Fukuhara

388

Fimoviruses (Fimoviridae) Toufic Elbeaino and Michele Digiaro

396

Furoviruses (Virgaviridae) Annette Niehl and Renate Koenig

405

Geminiviruses (Geminiviridae) Jesús Navas-Castillo and Elvira Fiallo-Olivé

411

Hordeiviruses (Virgaviridae) Zhihao Jiang, Meng Yang, Yongliang Zhang, Andrew O Jackson, and Dawei Li

420

Idaeoviruses (Mayoviridae) Robert R Martin and Karen E Keller

430

Ilarviruses (Bromoviridae) Aaron Simkovich, Susanne E Kohalmi, and Aiming Wang

439

Luteoviruses (Luteoviridae) Leslie L Domier

447

Machlomovirus and Panicoviruses (Tombusviridae) Kay Scheets

456

Maize Streak Virus (Geminiviridae) Darren P Martin and Aderito L Monjane

461

Nanoviruses (Nanoviridae) Bruno Gronenborn and H Josef Vetten

470

Necro-Like Viruses (Tombusviridae) Luisa Rubino and Giovanni P Martelli†

481

Nepoviruses (Secoviridae) Hélène Sanfaçon

486

Ophioviruses (Aspiviridae) Anna M Vaira and John Hammond

495

Orthotospoviruses (Tospoviridae) Renato O Resende and Hanu R Pappu

507

Ourmiaviruses (Botourmiaviridae) Gian Paolo Accotto and Cristina Rosa

516

Papaya Ringspot Virus (Potyviridae) Cécile Desbiez and Hervé Lecoq

520

Pecluviruses (Virgaviridae) Hema Masarapu, Pothur Sreenivasulu, Philippe Delfosse, Claude Bragard, Anne Legreve, and DVR Reddy

528

Pepino Mosaic Virus (Alphaflexiviridae) Rene AA van der Vlugt and CCMM Stijger

539

Plant Reoviruses (Reoviridae) Yu Huang and Yi Li

545

Plant Resistance to Geminiviruses Basavaprabhu L Patil, Supriya Chakraborty, Henryk Czosnek, Elvira Fiallo-Olivé, Robert L Gilbertson, James Legg, Shahid Mansoor, Jesús Navas-Castillo, Rubab Z Naqvi, Saleem U Rahman, and Francisco M Zerbini

554

Plant Rhabdoviruses (Rhabdoviridae) Ralf G Dietzgen, Michael M Goodin, and Zhenghe Li

567

Plant Satellite Viruses (Albetovirus, Aumaivirus, Papanivirus, Virtovirus) Mart Krupovic

581

Plum Pox Virus (Potyviridae) Miroslav Glasa and Thierry Candresse

586

Poleroviruses (Luteoviridae) Hernan Garcia-Ruiz, Natalie M Holste, and Katherine LaTourrette

594

Pomoviruses (Virgaviridae) Eugene I Savenkov

603

Potato Virus Y (Potyviridae) Laurent Glais and Benoît Moury

612

Potexviruses (Alphaflexiviridae) Ki H Ryu, Eun G Song, and Jin S Hong

623



Potyviruses (Potyviridae) Adrián Valli, Juan A García, and Juan J López-Moya

631

Quinviruses (Betaflexiviridae) Ki H Ryu and Eun G Song

642

Reverse-Transcribing Viruses (Belpaoviridae, Metaviridae, and Pseudoviridae) Carlos Llorens, Beatriz Soriano, Maria A Navarrete-Muñoz, Ahmed Hafez, Vicente Arnau, Jose Miguel Benito, Toni Gabaldon, Norma Rallon, Jaume Pérez-Sánchez, and Mart Krupovic

653

Rice Tungro Disease (Secoviridae, Caulimoviridae) Gaurav Kumar, Fauzia Zarreen, and Indranil Dasgupta

667

Rice Yellow Mottle Virus (Solemoviridae) Eugénie Hébrard, Nils Poulicard, and Mbolarinosy Rakotomalala

675

Satellite Nucleic Acids and Viruses Olufemi J Alabi, Alfredo Diaz-Lara, and Maher Al Rwahnih

681

Secoviruses (Secoviridae) Jeremy R Thompson

692

Sequiviruses and Waikaviruses (Secoviridae) Lucy Rae Stewart

703

Solemoviruses (Solemoviridae) Cecilia Sarmiento, Merike Sõmera, and Erkki Truve†

712

Tenuiviruses (Phenuiviridae) Bertha Cecilia Ramirez and Anne-Lise Haenni

719

Tobacco Mosaic Virus (Virgaviridae) Marc HV Van Regenmortel

727

Tobamoviruses (Virgaviridae) Ulrich Melcher, Dennis J Lewandowski, and William O Dawson

734

Tobraviruses (Virgaviridae) Stuart A MacFarlane

743

Tomato Leaf Curl New Delhi Virus (Geminiviridae) Supriya Chakraborty and Manish Kumar

749

Tomato Spotted Wilt Virus (Tospoviridae) Hanu R Pappu, Anna E Whitfield, and Athos S de Oliveira

761

Tomato Yellow Leaf Curl Viruses (Geminiviridae) Henryk Czosnek

768

Tombusvirus-Like Viruses (Tombusviridae) K Andrew White

778

Tombusviruses (Tombusviridae) Luisa Rubino and Kay Scheets

788

Tritimoviruses and Rymoviruses (Potyviridae) Satyanarayana Tatineni and Gary L Hein

797

Triviruses (Betaflexiviridae) Yahya ZA Gaafar and Heiko Ziebell

805

Tymoviruses (Tymoviridae) Rosemarie W Hammond and Peter Abrahamian

818



Umbraviruses (Tombusviridae) Eugene V Ryabov and Michael E Taliansky

827

Varicosaviruses (Rhabdoviridae) Takahide Sasaya

833

Virgaviruses (Virgaviridae) Eugene I Savenkov

839

Viroids (Pospiviroidae and Avsunviroidae) Ricardo Flores and Robert A Owens

852

Watermelon Mosaic Virus and Zucchini Yellow Mosaic Virus (Potyviridae) Cécile Desbiez and Hervé Lecoq

862

VOLUME 4 Viruses as Infectious Agents: Bacterial, Archaeal, Fungal, Algal, and Invertebrate Viruses

Bacterial Viruses

History of Virology: Bacteriophages William C Summers

3

Icosahedral Phages – Single-Stranded DNA (φX174) Bentley A Fane and Aaron P Roznowski

10

Single-Stranded RNA Bacterial Viruses Peter G Stockley and Junjie Zhang

21

Enveloped Icosahedral Phages – Double-Stranded RNA (φ6) Paul Gottlieb and Aleksandra Alimova

26

Membrane-Containing Icosahedral DNA Bacteriophages Roman Tuma, Sarah J Butcher, and Hanna M Oksanen

36

Tailed Double-Stranded DNA Phages Robert L Duda

45

Helical and Filamentous Phages Andreas Kuhn and Sebastian Leptihn

53

Replication of Bacillus Double-Stranded DNA Bacteriophages Silvia Ayora, Paulo Tavares, Ruben Torres, and Juan C Alonso

61

Lytic Transcription William McAllister and Deborah M Hinton

69

Lysogeny Keith E Shearwin and Jia Q Truong

77

Decision Making by Temperate Phages Ido Golding, Seth Coleman, Thu VP Nguyen, and Tianyou Yao

88

Mobilization of Phage Satellites Kristen N LeGault and Kimberley D Seed

98

Portal Vertex Peng Jing and Mauricio Cortes Jr.

105

Prohead, the Head Shell Pre-Cursor Marc C Morais and Michael E Woodson

115

Enzymology of Viral DNA Packaging Machines Carlos E Catalano

124

DNA Packaging: DNA Recognition Sandra J Greive and Oliver W Bayfield

136

DNA Packaging: The Translocation Motor Janelle A Hayes and Brian A Kelch

148

Biophysics of DNA Packaging Joshua Pajak, Gaurav Arya, and Douglas E Smith

160

Energetics of the DNA-Filled Head Alex Evilevitch

167

Bacteriophage Receptor Proteins of Gram-Negative Bacteria Sarah M Doore, Kristin N Parent, Sundharraman Subramanian, Jason R Schrad, and Natalia B Hubbs

175

Tail Structure and Dynamics Shweta Bhatt, Petr G Leiman, and Nicholas MI Taylor

186

Bacteriophage Tail Fibres, Tailspikes, and Bacterial Receptor Interaction Mateo Seoane-Blanco, Mark J van Raaij, and Meritxell Granell

194

Phage Genome and Protein Ejection In Vivo Ian J Molineux, L Letti Lopez, and Aaron P Roznowski

206

Dealing With the Whole Head: Diversity and Function of Capsid Ejection Proteins in Tailed Phages Lindsay W Black and Julie A Thomas

219

Jumbo Phages Isaac T Younker and Carol Duffy

229

CRISPR-Cas Systems and Anti-CRISPR Proteins: Adaptive Defense and Counter-Defense in Prokaryotes and Their Viruses Asma Hatoum-Aslan and Olivia G Howell

242

Bacteriophage: Therapeutics and Diagnostics Development Teng-Chieh Yang

252

Bacteriophage Vaccines Pan Tao and Venigalla B Rao

259

Bacteriophage Diversity Julianne H Grose and Sherwood R Casjens

265

Genetic Mosaicism in the Tailed Double-Stranded DNA Phages Welkin H Pope

276

Bacteriophages of the Human Microbiome Pilar Manrique, Michael Dills, and Mark J Young

283

Bacteriophage: Red Recombination System and the Development of Recombineering Technologies Lynn C Thomason and Kenan C Murphy

291

Nanotechnology Application of Bacteriophage DNA Packaging Nanomotors Tao Weitao, Lixia Zhou, Zhefeng Li, Long Zhang, and Peixuan Guo

302

General Ecology of Bacteriophages Stephen T Abedon

314

Marine Bacteriophages Vera Bischoff, Falk Zucker, and Cristina Moraru

322

Ecology of Phages in Extreme Environments Tatiana A Demina and Nina S Atanasova

342

Archaeal Viruses

Diversity of Hyperthermophilic Archaeal Viruses David Prangishvili, Mart Krupovic, and Diana P Baquero

359

Euryarchaeal Viruses Tatiana A Demina and Hanna M Oksanen

368

Vesicle-Like Archaeal Viruses Elina Roine and Nina S Atanasova

380

Virus–Host Interactions in Archaea Diana P Baquero, David Prangishvili, and Mart Krupovic

387

Antiviral Defense Mechanisms in Archaea Qunxin She

400

Discovery of Archaeal Viruses in Hot Spring Environments Using Viral Metagenomics Jennifer Wirth, Jacob H Munson-McGee, and Mark J Young

407

Metagenomes of Archaeal Viruses in Hypersaline Environments Fernando Santos, María D Ramos-Barbero, and Josefa Antón

414

Extreme Environments as a Model System to Study How Virus–Host Interactions Evolve Along the Symbiosis Continuum Samantha J DeWerff and Rachel J Whitaker

419

Fungal Viruses

An Introduction to Fungal Viruses Nobuhiro Suzuki

431

Cross-Kingdom Virus Infection Liying Sun, Hideki Kondo, and Ida Bagus Andika

443

Diversity of Mycoviruses in Aspergilli Ioly Kotta-Loizou

450

Evolution of Mycoviruses Mahtab Peyambari, Vaskar Thapa, and Marilyn J Roossinck

457

Mixed Infections of Mycoviruses in Phytopathogenic Fungus Sclerotinia sclerotiorum Jiatao Xie and Daohong Jiang

461

Mycovirus-Mediated Biological Control Daniel Rigling, Cécile Robin, and Simone Prospero

468

Mycoviruses With Filamentous Particles Michael N Pearson

478

Prions of Yeast and Fungi Reed B Wickner and Herman K Edskes

487

Single-Stranded DNA Mycoviruses Daohong Jiang

493

Structure of Double-Stranded RNA Mycoviruses José R Castón, Nobuhiro Suzuki, and Said A Ghabrial†

504



Ustilago maydis Viruses and Their Killer Toxins Alexis Williams and Thomas J Smith

513

Vegetative Incompatibility in Filamentous Fungi Songsong Wu, Daohong Jiang, and Jiatao Xie

520

Viral Diseases of Agaricus bisporus, the Button Mushroom Kerry S Burton and Greg Deakin

528

Viral Killer Toxins Manfred J Schmitt and Björn Becker

534

Alternaviruses (Unassigned) Hiromitsu Moriyama, Nanako Aoki, Kuko Fuke, Kana Takeshita Urayama, Naoki Takeshita, and Chien-Fu Wu

544

Barnaviruses (Barnaviridae) Peter A Revill

549

Botybirnaviruses (Botybirnavirus) Mingde Wu, Guoqing Li, Daohong Jiang, and Jiatao Xie

552

Chrysoviruses (Chrysoviridae) - General Features and Chrysovirus-Related Viruses Ioly Kotta-Loizou, Robert HA Coutts, José R Castón, Hiromitsu Moriyama, and Said A Ghabrial†

557

Fungal Partitiviruses (Partitiviridae) Eeva J Vainio

568

Fusariviruses (Unassigned) Sotaro Chiba

577

Giardiavirus (Totiviridae) Juliana Gabriela Silva de Lima, João Paulo Matos Santos Lima, and Daniel Carlos Ferreira Lanza

582

Hypoviruses (Hypoviridae) Dong-Xiu Zhang and Donald L Nuss

589

Megabirnaviruses (Megabirnaviridae) Yukiyo Sato and Nobuhiro Suzuki

594

Mitoviruses (Mitoviridae) Bradley I Hillman and Alanna B Cohen

601

Mycoreoviruses (Reoviridae) Bradley I Hillman and Alanna B Cohen

607

Mymonaviruses (Mymonaviridae) Daohong Jiang

615

Narnaviruses (Narnaviridae) Rosa Esteban and Tsutomu Fujimura

621

Phlegiviruses (Unassigned) Karel Petrzik

627

Plant and Protozoal Partitiviruses (Partitiviridae) Hanna Rose and Edgar Maiss

632

Quadriviruses (Quadriviridae) Hideki Kondo, José R Castón, and Nobuhiro Suzuki

642

Totiviruses (Totiviridae) Bradley I Hillman and Alanna B Cohen

648



Yado-kari Virus 1 and Yado-nushi Virus 1 (Unassigned) Subha Das and Nobuhiro Suzuki

658

Yeast L-A Virus (Totiviridae) Reed B Wickner, Tsutomu Fujimura, and Rosa Esteban

664

Algal Viruses

Algal Marnaviruses (Marnaviridae) Marli Vlok, Curtis A Suttle, and Andrew S Lang

671

Algal Mimiviruses (Mimiviridae) Ruth-Anne Sandaa, Håkon Dahle, Corina PD Brussaard, Hiroyuki Ogata, and Romain Blanc-Mathieu

677

Miscellaneous Algal Viruses (Alvernaviridae, Bacilladnaviridae, Dinodnavirus, Reoviridae) Keizo Nagasaki, Yuji Tomaru, and Corina PD Brussaard

684

Phycodnaviruses (Phycodnaviridae) James L Van Etten, David D Dunigan, Keizo Nagasaki, Declan C Schroeder, Nigel Grimsley, Corina PD Brussaard, and Jozef I Nissimov

687

Invertebrate Viruses

An Introduction to Viruses of Invertebrates Peter Krell

699

Ascoviruses (Ascoviridae) Sassan Asgari, Dennis K Bideshi, Yves Bigot, and Brian A Federici

724

Baculovirus–Host Interactions: Repurposing Host-Acquired Genes (Baculoviridae) A Lorena Passarelli

732

Baculoviruses: General Features (Baculoviridae) Vera ID Ros

739

Baculoviruses: Molecular Biology and Replication (Baculoviridae) Monique M van Oers

747

Bidensoviruses (Bidnaviridae) Qin Yao, Zhaoyang Hu, and Keping Chen

759

Bunyaviruses of Arthropods (Mypoviridae, Nairoviridae, Peribunyaviridae, Phasmaviridae, Phenuiviridae, Wupedeviridae) Sandra Junglen

764

Dicistroviruses (Dicistroviridae) Yanping Chen and Steven M Valles

768

Entomobirnaviruses (Birnaviridae) Marco Marklewitz

776

Hytrosaviruses (Hytrosaviridae) Henry M Kariithi and Irene K Meki

780

Iflaviruses (Iflaviridae) Bryony C Bonning and Sijun Liu

792

Iridoviruses of Invertebrates (Iridoviridae) İkbal Agah İnce

797

Mesoniviruses (Mesoniviridae) Jody Hobson-Peters and Daniel Watterson

804

Nimaviruses (Nimaviridae) Peter Krell and Emine Ozsahin

808

Nodaviruses of Invertebrates and Fish (Nodaviridae) Kyle L Johnson and Jacen S Moore

819

Nudiviruses (Nudiviridae) Yu-Chan Chao, Chih-Hsuan Tsai, and Sung-Chan Wei

827

Parvoviruses of Invertebrates (Parvoviridae) Judit J Pénzes, Hanh T Pham, Qian Yu, Max Bergoin, and Peter Tijssen

835

Polydnaviruses (Polydnaviridae) Anne-Nathalie Volkoff and Elisabeth Huguet

849

Poxviruses of Insects (Poxviridae) Basil Arif, Lillian Pavlik, Remziye Nalçacıoğlu, Hacer Muratoğlu, Cihan İnan, Mehtap Yakupoğlu, Emine Özsahin, Ismail Demir, Kazım Sezen, and Zihni Demirbağ

858

Reoviruses of Invertebrates (Reoviridae) Peter Krell

867

Rhabdoviruses of Insects (Rhabdoviridae) Andrea González-González, Nicole T de Stefano, David A Rosenbaum, and Marta L Wayne

883

Sarthroviruses (Sarthroviridae) Azeez Sait Sahul Hameed

888

Solinviviruses (Solinviviridae) Steven M Valles and Andrew E Firth

892

Tetraviruses (Alphatetraviridae, Carmotetraviridae, Permutotetraviridae) Rosemary A Dorrington, Tatiana Domitrovic, and Meesbah Jiwaji

897

VOLUME 5 Diagnosis, Treatment and Prevention of Virus Infections

Diagnosis

Introduction to Virus Diagnosis and Treatment Maija Lappalainen and Hubert GM Niesters

3

Electron Microscopy for Viral Diagnosis Roland A Fleck

5

Serological Approaches for Viral Diagnosis Klaus Hedman and Visa Nurmi

15

A Brief History of the Development of Diagnostic Molecular-Based Assays Hubert GM Niesters

22

Sequencing Strategies Sibnarayan Datta

27

Validating Real-Time Polymerase Chain Reaction (PCR) Assays Melvyn Smith

35

Rapid Point-of-Care Assays Jan G Lisby and Uffe V Schenider

45

lxvi

Content of all Volumes

Standardization of Diagnostic Assays Sally A Baylis, C Micha Nübling, and Wayne Dimech

52

Quality Assurance in the Clinical Virology Laboratory Paul Wallace and Elaine McCulloch

64

Biosafety and Biosecurity in Diagnostic Laboratories Hannimari Kallio-Kokko and Susanna Sissonen

82

Screening for Viral Infections Walter Ian Lipkin, Nischay Mishra, and Thomas Briese

91

Clinical Diagnostic Virology Marcus Panning

98

Virus Diagnosis in Immunosuppressed Individuals Elisabeth Puchhammer-Stöckl and Fausto Baldanti

105

Diagnosis; Future Prospects on Direct Diagnosis Marianna Calabretto, Daniele Di Carlo, Fabrizio Maggi, and Guido Antonelli

112

Treatment

Antiviral Classification Guangdi Li, Xixi Jing, Pan Zhang, and Erik De Clercq

Antiretroviral Therapy – Nucleoside/Nucleotide and Non-Nucleoside Reverse Transcriptase Inhibitors Timothy D Appleby and Killian J Quinn

Protease Inhibitors Vanesa Anton-Vazquez and Frank A Post

HIV Integrase Inhibitors and Entry Inhibitors Daniel Bradshaw and Ranjababu Kulasegaram

Management of Respiratory Syncytial Virus Infections (Pneumoviridae) Rachael S Barr and Simon B Drysdale

Management of Influenza Virus Infections (Orthomyxoviridae) Bruno Lina

Management of Herpes Simplex Virus Infections (Herpesviridae) Nicole Samies and Richard Whitley

Management of Varicella-Zoster Virus Infections (Herpesviridae) Andreas Sauerbrei

Treatment and Prevention of Herpesvirus Infections in the Immunocompromised Host Sara H Burkhard and Nicolas J Mueller

Management of Adenovirus Infections (Adenoviridae) Albert Heim

Management of Hepatitis A and E Virus Infection Sébastien Lhomme, Florence Abravanel, Jean-Marie Peron, Nassim Kamar, and Jacques Izopet

Management of Patients With Chronic Hepatitis B (Hepadnaviridae) and Chronic Hepatitis D Infection (Deltavirus) Milan J Sonneveld and Suzanne van Meer

Studying Population Genetic Processes in Viruses: From Drug-Resistance Evolution to Patient Infection Dynamics Jeffrey D Jensen

Virus-Based Cancer Therapeutics Roberto Cattaneo and Christine E Engeland

Prevention

Surveillance of Infectious Diseases Norman Noah

Preparing for Emerging Zoonotic Viruses Reina S Sikkema and Marion PG Koopmans

Use of Immunoglobulins in the Prevention of Viral Infections Leyla Asadi and Giovanni Ferrara

Vaccine Production, Safety, and Efficacy Thomas J Brouwers and Bernard AM Van der Zeijst

Vaccines Against Viral Gastroenteritis Scott Grytdal, Tyler P Chavers, Claire P Mattison, Jacqueline E Tate, and Aron J Hall

Human Papillomavirus (HPV) Vaccines and Their Impact Jade Pattyn, Pierre Van Damme, and Alex Vorsters

Influenza Vaccination Topi Turunen

Polio Eradication M Steven Oberste, Cara C Burns, and Jennifer L Konopka-Anstadt

Subject Index

BACTERIAL VIRUSES

History of Virology: Bacteriophages

William C Summers, Yale University, New Haven, CT, United States

© 2021 Elsevier Ltd. All rights reserved.

This is an update of H.-W. Ackermann, History of Virology: Bacteriophages, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00590-2.

Discovery of Phages

Although bacteriophages are the most abundant organisms on our planet, and they often exert drastic effects (lysis) on their bacterial hosts, it was not until the second decade of the 20th century that they were recognized and their basic biological nature was studied. While the literature contains several dozen earlier reports that, in retrospect, may describe phenomena associated with bacteriophages, the first clear recognition and description of what was at the time called “the bacteriophage phenomenon” was by a rather mysterious French-Canadian autodidact, Félix Hubert d’Herelle. D’Herelle, working at the Pasteur Institute during the First World War, was investigating the problem of virulence of bacterial strains. The basis for differing virulence of different isolates of the same bacterial species was of both theoretical and practical interest. Basing his reasoning on earlier (later shown to be incorrect) ideas that microbial virulence was modulated by “associated” organisms, d’Herelle fractionated a bacterial culture from a particularly virulent outbreak of dysentery by filtration to obtain a bacteria-free filtrate that he added back to a pure culture of the original organism. To his surprise, rather than increasing the virulence of this culture, the filtrate caused lysis of the bacterial culture and, on solid media, the bacterial growth was inhibited in small spots. Material recovered from the spots was sterile with respect to bacteria, but could cause sequential lysis of fresh cultures. This “lytic principle” could be serially propagated indefinitely. D’Herelle reasoned that this “lytic principle” was a filter-passing, particulate agent that could multiply at the expense of the bacterial cell. He called it “bacteriophage” and classed it as an “ultravirus”, meaning that it was smaller than the usual “virus”, a term used at the time to indicate any agent of transmissible disease. In his words, bacteriophage was an invisible obligate parasite.
D’Herelle was more expansive than just describing a new type of microbe, however. Since the bacteriophage was able to kill bacteria, and it appeared to him that the quantity of bacteriophage in patient samples increased as the patient was recovering, he reasoned that the recovery itself was caused by the takeover by the bacteriophage. Again, in his words, it was the agent of recovery and immunity. In the pre-antibiotic era this concept promised heretofore unavailable opportunities to treat infectious diseases. Almost immediately after d’Herelle’s initial report in 1917 and his assertion that phages were involved in recovery from infectious diseases as well as being the agent of what he called natural immunity, his discovery was explored by others and controversy ensued. The initial controversy concerned the biological nature of phage, that is, was it an ultravirus or a cellular product? This controversy became acute when d’Herelle challenged the theory of immunity recently advocated by the eminent director of the Pasteur Institute in Brussels, and recent Nobel prize winner, Jules Bordet. In his 1923 monograph on phage and disease, d’Herelle referred to Bordet’s work on bacteriolytic antibodies as “an error”. Bordet and his young protégé, André Gratia, took up the challenge, initially not by defending the antibody work but by attacking d’Herelle’s interpretation of the nature of phage. This approach was of little effect, so when they discovered Twort’s rather obscure paper on ultramicroscopic viruses from 1915, Bordet and Gratia launched an attack on d’Herelle’s priority of discovery. This dispute was intense for almost a decade, and was probably a deterrent and distraction to phage research for several decades. The priority dispute survives to the present. D’Herelle’s response to the priority challenge was to try to demonstrate that what he had discovered was not the same phenomenon as observed by Twort.
This work did help to clarify the nature of phage but did not convince most observers, for the simple reasons that Twort’s paper was vague and that he published nothing later in the way of clarifying experiments.

Two key problems were at the center of d’Herelle’s work on the nature of phage in the 1920s. The first was whether phage is a unique organism that shows various manifestations, or whether there are many species of phage. D’Herelle argued for the “unicity” of phage, and claimed that under different conditions, for example, when supplied with different host bacteria, it adapted to such changes. This notion was not so strange as it seems today. In the 19th century, bacteriologists argued the same way about bacterial species: one species with variable forms (polymorphism) or many species, each with a stable form (monomorphism). Also, in the 1920s, a theory called “cyclogeny” was popular that suggested microbial cultures could change form and behavior depending on culture conditions. Examples include bacterial shape changes observed in exponentially growing cultures versus stationary phase cultures and changes observed upon bacterial sporulation and germination.

The second key problem involved the physical nature of phage. D’Herelle argued that the fact that phage produced plaques, and that the number of plaques produced upon dilution followed a Poisson distribution, proved that phage were particulate in nature. He even enlisted Einstein, with whom he discussed the matter, on his side in the argument: “I was very glad to see how this deservedly-famous mathematician evaluated my experimental demonstration, for I do not believe that there are a great many biological experiments whose nature satisfies a mathematician”. Still, most bacteriologists of the day thought of phage as some sort of cellular product with lytic properties, such as the recently discovered enzyme, lysozyme. The “corpuscular nature of phage” was not definitively settled until they were visualized by electron microscopy in 1939.

A third controversy, which smoldered for many years, surrounded the status of phages (and filterable viruses in general) as “living” beings. Microbiologists in particular struggled with various definitions of “life”, probably a legacy of earlier vitalism. While viruses seemed to “reproduce”, which was one key property of living things, it was not clear that they could “assimilate” material from their environment in order to grow. Their dependence on a full-fledged cellular host was confusing in terms of the existing criteria for “life”. At this period, autocatalytic enzymes such as pepsinogen and trypsinogen, which could convert themselves from inactive to active pepsin and trypsin, were discovered. Could phage “growth” be such an autocatalytic phenomenon? Nobelist John Northrop and his colleagues in the Rockefeller Virus Unit at Princeton thought so. Biologists were intrigued by things “at the borderline of life”. D’Herelle even developed an extensive theory of evolution based on phage as the chemical basis for life, asserting that phage were a colloidal form of matter that coalesced into protoplasm and cells. He undertook a search for phages in sulfur hot springs and other places where the “origin of life” might have occurred. He termed the study of phage “protobiology” and assigned the Linnaean binomial Bacteriophagus protobios early in his work on “protobes”. Indeed, during his tenure at Yale in the late 1920s and early 1930s, he referred to himself as Professor of Protobiology. The question of the organismal nature of viruses led scientists such as Jacques Bronfenbrenner to examine the respiration of phages using techniques such as the newly devised Warburg apparatus.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20950-3
To add to the mystery of viruses, in 1935 Wendell Stanley managed to obtain viruses in crystalline form, the sine qua non of chemical purity. Stanley’s chemical analysis of the crystals of tobacco mosaic virus found them to be made of protein (he missed the 7% phosphorus content, an indication of the presence of nucleic acid, later found by Bawden and Pirie in 1936). The question of whether viruses are “living or dead” has by now been relegated to late-night undergraduate debating, and in 1957, André Lwoff simply stated “viruses are viruses”.
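D’Herelle’s statistical argument for the particulate nature of phage can be made concrete. If a lysate contains discrete infectious particles, the number of plaques on replicate plates at a given dilution should follow a Poisson distribution, and the fraction of plates with no plaques at all should equal exp(-mean). The Python sketch below is only an illustration of that reasoning: the function names, the titer, the dilution, and the plated volume are all hypothetical and are not taken from d’Herelle’s experiments.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of counting exactly k plaques when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def expected_plaques(titer_per_ml: float, dilution: float, volume_ml: float) -> float:
    """Expected plaque count: stock titer x dilution factor x plated volume."""
    return titer_per_ml * dilution * volume_ml


def titer_from_sterile_fraction(sterile_fraction: float, dilution: float, volume_ml: float) -> float:
    """Estimate the stock titer from the fraction of plates with zero plaques.

    Uses P(0) = exp(-mean), so mean = -ln(sterile_fraction).
    """
    mean = -math.log(sterile_fraction)
    return mean / (dilution * volume_ml)


if __name__ == "__main__":
    # Hypothetical lysate: 2e8 particles per ml, diluted 1e-8, 1 ml plated.
    mean = expected_plaques(2.0e8, 1e-8, 1.0)
    for k in range(6):
        print(f"P({k} plaques) = {poisson_pmf(k, mean):.3f}")
    print(f"fraction of plates with no plaques = {poisson_pmf(0, mean):.3f}")
```

A soluble lytic substance would be expected to dilute smoothly rather than in this all-or-none fashion, which is why agreement with the Poisson prediction was taken as evidence for discrete particles.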

Phage Therapy

One of d’Herelle’s earliest observations of phage involved dysentery patients, and he noted that the quantity and virulence of the phages in stool samples seemed to increase as the patients were recovering from the disease. This, as noted above, suggested that the phage was contributing to the recovery. It was no big leap to suggest that phages, isolated in the laboratory, might be used to treat such diseases. In the era before antibiotics, such therapeutic potential generated great interest, and the interest in phage in the 1920s and 1930s focused mainly on various attempts to exploit this therapeutic promise. The first trials of phage therapy were carried out in the veterinary setting. D’Herelle carried out a trial of phages in an epidemic of avian typhosis, a bacterial disease of chickens that was becoming endemic in rural France in 1918. He isolated a phage that was virulent against the pathogen as isolated from the diseased birds, and in both field trials and laboratory tests, found that phage had both therapeutic and prophylactic usefulness in actual disease outbreaks. He continued his search for opportunities to field-test his ideas and became a veritable globe trotter in this endeavor, spreading phages in North Africa, Argentina, French Indochina, India, and Egypt, as well as the hospitals of Paris. In the pre-antibiotic era, his work attracted much attention and many attempts to employ phage therapy. After over a decade of clinical experience, the American Medical Association’s Council on Pharmacy and Chemistry reviewed the literature on phage therapy, and the 1934 report written by Monroe Eaton and Stanhope Bayne-Jones concluded that the efficacy of phage therapy was still not established because of the conflicting reports, ambiguous results, and lack of understanding of the biological nature of phage.
This report coincided with d’Herelle’s departure from the US to set up a phage institute in Tbilisi in the USSR, an ill-fated venture that ended with his return to France in 1937 after the execution of his host and fellow phage worker, Georgi Eliava, who was caught up in the Stalinist purges of that period. This tragedy, shortly followed by the events leading up to World War 2, curtailed the influence of the most ardent and effective advocate of phage therapy. The discovery and effectiveness of antibiotics (and chemotherapy such as the sulfa drugs) in the late 1930s and on into the next decade led to the eclipse of interest in phage therapy, especially in the West. In the Eastern Bloc of nations, partly because of the difficulty in developing the capacity to produce antibiotics and partly because of the lingering interest in phage therapy, research on phage therapy continued in several places, especially in Tbilisi and Poland. As understanding of phage biology increased, these centers profited from new knowledge as applied to phage therapy and have emerged in recent years as leaders in the renewed interest in phage therapy in this era of multi-drug antibiotic resistance. By the mid-1980s, especially with the widespread recognition of MRSA and the ubiquity of horizontal gene transmission of drug resistance, the search for solutions to this problem became acute. New antibiotic discovery was recognized as both economically non-viable and likely to be futile. Phage therapy was again seriously considered. In the West, phage therapy seemed to have been discarded for two reasons, one theoretical and one political. Since phages were known to be antigenic, the development of immune resistance was considered a fatal drawback, as was the well-known ability of bacteria to mutate to phage resistance. Further, cold-war ideology militated against any recognition of scientific work from the Soviets.
A key study leading to the resurgence of interest in phage therapy was published in 1996 by Carl Merril and his colleagues, who showed that the problem of immune clearance of therapeutic phages was not insurmountable. They showed that by repeated cycles of selection in vivo, phage strains (of Lambda bacteriophage) could be derived that were invisible to the host (mouse) immune recognition system. Detailed understanding of the nature of phage receptors and phage resistance mechanisms suggested that these were not such daunting impediments. By 2000, phage therapy had become respectable again.

Phage Typing

Soon after the discovery of bacteriophages by d’Herelle in 1917, it became clear that there was some sort of specificity of phages for particular host strains of bacteria. While the issue of “unicity” of phage was unresolved, it was still observed that phage isolates could be obtained (whether by adaptation or by isolation of pre-existing strains) that could distinguish between different bacterial hosts by their “host range”. D’Herelle and his colleagues noted some correlation between phage strains and the immunological properties of the host bacteria. Other phage researchers, especially James Craigie in Canada, inverted this notion to devise phage strains that could be used to classify bacteria based on their susceptibility to known tester phage strains. Classification of the Salmonellae was a particularly fraught problem, and in 1938, Yen and Craigie published a landmark classification of S. typhi (B. typhosus) by a set of “typing phages” that had host ranges that correlated precisely with the Vi surface antigen types. Phage typing became a widespread technique in diagnostic bacteriology and epidemiological investigations, yet it also remained a distinct branch of phage research. Phages were isolated and characterized for their utility in discriminating one bacterial variant from another, and standard sets of phages were developed for microbiological reference laboratories. Phage typing continues as a useful and reliable tool in diagnostic bacteriology even today. Phage typing is relatively rapid, easy, and low tech, and the diversity of phages has made it possible to find phages with the needed fine discriminatory properties. Interestingly, phage typing was, and continues to be, successful even though its practitioners and developers have been rather separate from the basic phage researchers who were attacking the fundamental biology of bacteriophages.

Phage Genetics

The very first defining property of phages was their serial transmissibility, a property that implied some sort of hereditary mechanism. Indeed, as H.J. Muller famously noted in 1922, “if these d’Herelle bodies were really genes, fundamentally like our chromosome genes, they would give us an utterly new angle from which to attack the gene problem”. The hereditary properties of phage were understood only slowly, partly because the concepts of adaptation, mutation, and speciation were just being extended to the microbial world. By the mid-1930s it was becoming clear that certain observable characters of viruses, such as host range and lesion morphology, behaved as stable, heritable properties. This was as true for phages as for animal and plant viruses. One oddity that phage presented was the ability of some phages to associate with their host bacteria in mysterious ways that at times appeared to be a kind of latency (later termed the lysogenic state) and at times appeared to be a kind of genetic modifier that changed one or more bacterial properties, for example, toxin production or surface antigens (later termed phage conversion or phage transduction). The debates over the biologic nature of phage (microbe or self-activating enzyme) hampered the clarification of the genetic nature of phage biology. In the 1930s, Eugène and Elisabeth Wollman at the Pasteur Institute advanced the understanding of the lysogenic state and the relationship of phage infection to the introduction of new genes into the host cells. In the immediate post-war period, with a renewed interest in the fundamental biology of phage (as opposed to their potential as anti-infectious agents for therapy), several groups saw the utility of phages as simple genetic organisms analogous to other microbes such as bacteria and fungi.
Different “strains” of phage had been isolated from the earliest research that differed in their size, host range, and plaque appearance on a uniform “lawn” of a given host. Further, it was noted that rare variants arose which had the characteristics of mutant forms of the same strain. In 1929 Vladimir Sertic was the first to report that it was possible to obtain phages that apparently adapted to be able to grow on strains that were normally resistant to such phages. Salvador Luria made a systematic study of host range mutants in 1945 and Alfred Hershey described plaque type mutations in the same year. Phage biologists quickly adopted the obvious analogies to higher organisms and spoke of bacteriophage “strains”, “races”, “species”, “variants”, and “mutants”. As phage biologists sought to understand phage reproduction by testing for intracellular interference and competition between phages for the cellular processes involved in reproduction, they also noted that phages of differing properties but of the same strain could recombine or “cross” in the same way that higher organisms behaved. A kind of mating between strains with different genetic “markers” was developed to construct recombination frequency maps and to devise models to explain various oddities of phage biology (such as multiplicity reactivation, lysogeny, and lateral gene transfer or transduction). The technical simplicity of phage genetics together with the statistical power of the large numbers of progeny phage and effective selection methods led to very high resolution genetic mapping experiments as well as detailed characterization of certain kinds of mutations. The ability to detect very rare recombinants meant that very close markers could be studied. In the famous T4rII gene, mutations that were in adjacent base pairs could be observed as separate markers in the mapping experiments of Seymour Benzer, thus clarifying and reconceiving the very concept of the gene. 
The use of mutagens with known mechanisms (insertion mutations produced by acridines) allowed Crick and Brenner to cross phage mutations in such a way as to establish, by genetic means, the triplet nature of the genetic code. Because, with phages, the genes under study (in the phage) were separable from the environment in which they were expressed (the bacterial host), it became possible to study the rather mysterious phenomenon of suppressor mutations. Suppressors are, in general, genes that change (usually revert) other mutations but do not map in the gene of the function that was originally mutated. They are sometimes called extragenic suppressors. Bacteria that carried certain kinds of suppressor genes were used to isolate phage variants that required suppression for growth. These “sus” mutants of phage (suppressor sensitive, later called amber, ocher, and umber mutants for trivial reasons) turned out to be important in understanding the role of transfer RNA and the stop codons of the genetic code in protein synthesis.
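The logic behind the genetic demonstration of a triplet code can be illustrated with a toy reading-frame model. A single-base insertion shifts every downstream codon; a nearby deletion, or two further insertions, brings the reading frame back into register. The Python sketch below is a simplified illustration only: the repeating CAT message, the helper names, and the chosen positions are hypothetical and do not reproduce the actual rII crosses.

```python
def codons(seq: str) -> list[str]:
    """Read a sequence in triplets from position 0, ignoring an incomplete final codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def insert_base(seq: str, pos: int, base: str = "G") -> str:
    """Return seq with one base inserted before index pos (a '+' frameshift)."""
    return seq[:pos] + base + seq[pos:]


def delete_base(seq: str, pos: int) -> str:
    """Return seq with the base at index pos removed (a '-' frameshift)."""
    return seq[:pos] + seq[pos + 1:]


def frame_restored(wild_type: str, mutant: str, tail: int = 3) -> bool:
    """True if the last `tail` codons of the mutant match those of the wild type."""
    return codons(wild_type)[-tail:] == codons(mutant)[-tail:]


# Hypothetical message made of a repeating triplet.
wild_type = "CAT" * 8
plus = insert_base(wild_type, 4)            # one insertion
plus_minus = delete_base(plus, 10)          # insertion followed by a nearby deletion
double_plus = insert_base(plus, 9)          # two insertions
triple_plus = insert_base(double_plus, 14)  # three insertions

if __name__ == "__main__":
    for name, seq in [("wild type", wild_type), ("+", plus), ("+ -", plus_minus),
                      ("+ +", double_plus), ("+ + +", triple_plus)]:
        print(f"{name:10s} {' '.join(codons(seq))}  restored={frame_restored(wild_type, seq)}")
```

In this model only the combinations whose net change in length is a multiple of three read correctly downstream, which is the pattern expected if the code is read in non-overlapping triplets.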

The early observation that phage infection sometimes modified the properties of the few bacterial survivors of infection suggested that phages might be mutagens of some sort. This turned out to be true in the broad sense that they caused genetic changes, but by acting as vectors to bring in new genes rather than by direct changes to existing genes. This mode of what is now called “lateral gene transfer” turns out to be of major significance in evolution and gene flow in unicellular organisms in general. Many strains of pathogenic bacteria owe their virulence to toxin genes that are associated with temperate phages (phages that can become latent without lysing their hosts) that have carried these genes into an otherwise benign organism. Common examples include the toxin genes in cholera carried on the temperate phage CTXφ, and the toxin genes in the diphtheria bacteria. In addition to toxins, temperate phages can control cell adhesion, invasion, and colonization properties of the bacterial host. For molecular biologists, the detailed genetic study of a single phage, Lambda, a temperate phage of E. coli, provided a trove of information on both the mechanism of lateral gene transfers (such as the mechanism of genome integration and recombination) and the molecular processes of gene expression and regulation. These discoveries with bacteriophage led directly to the current understanding of retrovirus biology (HIV integration and viral oncogenesis, for example) as well as the entire field of transcriptional control of gene expression.

At the time phage workers were exploring host range mutants and the strain specificity of various phages, two groups observed a rather strange sort of host-range phenomenon. A phage grown on one particular host strain sometimes could grow efficiently on that host but fail to grow well on a closely related host (measured by the “efficiency of plating”, e.o.p.; a theoretical e.o.p. of 1 was taken to mean, without much evidence, that every physical phage particle was infective and produced one plaque on a lawn of sensitive bacteria). While the e.o.p. on the “restrictive host” was not zero, it was often in the range of 10⁻⁴. The phages that escaped this restriction, however, were now somehow adapted to grow with an e.o.p. of near 1 again on the so-called restricting host. Sometimes they grew as well on the earlier host, too. It was as if the phage somehow remembered the last host in which it grew. This phenomenon seemed like some sort of unstable mutation or adaptation to the growth requirements of the new host. It was observed in both T4 phage by Luria and Human and in phage Lambda by Weigle and Bertani in 1952 and was termed host-induced variation (later called modification) to distinguish it from stable hereditary mutation. Host-induced modification of phages was a rather specialized and esoteric aspect of phage biology that did not attract much attention at first. The modification involved seemed to reside in the genome of the phage, but its chemical nature and physiological significance were obscure. By the mid-1960s, however, it became clear that some chemical modifications of DNA were involved: methylation of certain bases in some cases, hydroxymethylation and glycosylation in others. It appeared, then, that some, but not all, bacterial hosts could modify phage DNA in ways that could be recognized by other host bacteria. How this mechanism worked, however, was unclear. Also in the mid-1960s, experiments on the fate of phage DNA upon entering the bacteria showed that the phage DNA that was restricted entered the bacterial cell but was degraded rapidly after entry. Two lines of research soon clarified this phenomenon.
First, in 1969 Herbert Boyer and Daisy Roulland-Dussoix dissected the genetics of bacteria that could modify and restrict phages, and identified a three gene system (ramABC) that controlled this process: one gene controlled modification, one gene controlled restriction, and a third gene controlled both. The interpretation was that there was one gene product each for marking and degrading the DNA, and a third gene product which directed the other two gene products to specific sequences in the phage DNA, some sort of sequence recognition system. Second, starting in 1968, numerous investigators demonstrated nucleases in various systems that showed specificity for DNA which was destined for restriction in vivo. These various enzymes, surprisingly ubiquitous, came to be called “restriction endonucleases” (later shortened to “restriction enzymes”). In 1970, Thomas Kelly, Jr., and Hamilton Smith found that one of these enzymes, a nuclease from H. influenzae, cleaved DNA at a specific palindromic sequence of 6 base pairs. These nucleases, with their specificity and diversity, soon were recognized as powerful new tools for gene manipulation and formed the basis for many rapid developments in biotechnology as well as basic scientific research. From an esoteric problem in phage biology of interest to a tiny handful of researchers in the 1960s, this work led to a multibillion dollar global enterprise. For anyone who thinks that support for science can and should be predicated on prospects of future payoff, the story of phage restriction and modification is certainly a cautionary tale. The small size of phage genomes, coupled with their homogeneity and ease of preparation, made them the organisms of choice for the application of nucleotide sequence studies starting in the 1960s, when the hypothesis of a genetic code was being explored. Phages provided a ready source of “genes” of limited complexity compared to cellular DNAs.
Except for the indirect approach in which the frequencies of pairs of neighboring nucleotides were compared, the first attempts at sequence comparisons were based on the formation of stable hybrid DNA duplexes between two DNAs to be tested. Ingenious variations of this method were devised, including observing DNA hybrids in the electron microscope (“heteroduplex mapping”) and later hybridizations with one strand immobilized on a solid matrix (filter hybridization and “Southern” blotting). With the discovery of the sequence-specific restriction endonucleases in the 1970s, the mapping of cleavage sites on phage and small virus genomes provided a significant advance in sequence knowledge and comparisons that became a way to characterize and manipulate phage genes and genomes. When methods were developed for complete DNA sequence analysis, phage genomes were the obvious first choices for application of these powerful new techniques, and the DNA of phage φX174 became the first DNA genome to be completely sequenced in 1977 by Fred Sanger and his colleagues at Cambridge University.
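The idea of a palindromic recognition site, and of mapping cleavage sites along a genome, can be sketched in a few lines. A site is palindromic in the molecular sense when it equals its own reverse complement, so that both strands read identically 5' to 3'. In the Python sketch below the function names and the short test sequence are hypothetical, and the site GAATTC (the well-known EcoRI recognition sequence, cut after the first G) is used purely as a familiar illustration; it is not the H. influenzae enzyme discussed in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same on both strands (equals its reverse complement)."""
    site = site.upper()
    return site == reverse_complement(site)


def find_palindromic_sites(dna: str, length: int = 6) -> list[tuple[int, str]]:
    """List (position, sequence) for every palindromic window of the given length."""
    dna = dna.upper()
    windows = ((i, dna[i:i + length]) for i in range(len(dna) - length + 1))
    return [(i, w) for i, w in windows if is_palindromic(w)]


def digest(dna: str, site: str, cut_offset: int) -> list[str]:
    """Cut the top strand cut_offset bases into each occurrence of site; return fragments."""
    dna, site = dna.upper(), site.upper()
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy_genome = "AAGAATTCGGGAATTCTT"  # hypothetical sequence with two sites
    print(find_palindromic_sites(toy_genome))
    print(digest(toy_genome, "GAATTC", cut_offset=1))
```

Listing the cut positions and fragment lengths in this way is, in miniature, what a restriction map of a phage genome records.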

Phage Biochemistry

The chemical study of bacteriophage seems much simpler than the genetic and biological work on phage, probably because phage are relatively simple entities, and the techniques needed were perfected rather later than other methods. As a consequence of their small size yet delicate composition, it was hard to isolate and prepare phage preparations that were suitable for the chemists until the availability of the ultracentrifuge in the post World War 2 period. Early study of phages relied on rather crude preparations that were probably mostly inactive because the harsh methods of organic chemistry were unsuitable for most complex biological materials. Modern nucleoprotein chemistry was essentially unknown before World War 2. That phages contained protein and nucleic acid was clearly shown in the 1930s by Martin Schlesinger, first in Germany and later, when he was a refugee, in England. His phage preps contained amino acid polymers by chemical analysis, and nucleic acids by certain diagnostic reactions used by the histochemists (the Feulgen reaction). These analyses were semiquantitative at best. The physical chemist William J. Elford, working with the virologist Christopher Andrewes in England, used calibrated dialysis membranes as filters to get rough estimates of the sizes of different phages. The widespread introduction in 1949 of the model L preparative ultracentrifuge by Spinco (Specialty Instrument Company, later acquired by Beckman), the spinoff company of E.G. Pickels of the University of Virginia, greatly advanced the chemical and biochemical study of phages as well as other biological materials. With the ability to sediment virus particles and separate them from other cellular material (“debris”), biochemists could analyze the whole and subviral components of specific phages. Soon it was found that there was great diversity among different phages.
The DNA in phages was found to be a single polymer chain, not occurring in segments such as the chromosomes of higher cells. The DNA in some small phages was found to be a single strand of DNA in the form of a covalently closed circle (φX174), and some phages even contained RNA rather than DNA (f2, MS2, R17, Qβ). Cases of unusual nucleic acid components, for example hydroxymethyl cytosine and deoxyuridine were found in some phages. The finding of methylation and glycosylation of DNA in phages was crucial to understanding the important discovery of phage restriction and modification. As the biochemists elucidated new metabolic events that occurred after infection, new phage-induced enzymes were discovered, and later shown to be encoded by phage genes (rather than phage-activated cell genes). These enzymes included new DNA polymerases, phage-specific RNA polymerases and transcription factors, polynucleotide kinase, DNA ligase, and several enzymes of nucleotide metabolism. Several of these activities were unknown as cellular functions and became useful tools in nucleic acid biochemistry. Phage biology was central to the problem of protein biosynthesis from the early 1940s. Since phage reproduction yields a burst of new extracellular material in a rapid and synchronous fashion, it was an ideal model to study the production of new protein. Early experiments with radiolabeled substrates settled the question as to the origin of the material in the new phages: it came from precursors in the cell by de novo synthesis after infection, rather than from repurposing pre-existing proteins. In the case of DNA, some phages produced enzymes that degraded the host DNA and reutilized the precursor nucleotides for phage DNA synthesis. These pathways became the first such steps in the central dogma to be dissected by in vitro study of the replication machinery with small phage φX174 DNA as the template. 
The discovery of phages containing RNA genomes provided an ideal source of RNA that could be used to test the new ideas about messenger RNA and ribosome function in the 1960s.

Phage and Biotechnology

The first inkling that phages might be exploited for technological applications (excluding phage for medical therapies) was the demonstration that a phage could be used to isolate and purify a cellular gene. In 1969 Jon Beckwith and his colleagues isolated pure lac operon DNA from E. coli using Lambda phages that had incorporated that cellular DNA to become “transducing phages”. While this method was complicated and apparently of limited application, the program to isolate and purify specific genes for biotechnological applications had been launched. The subsequent discovery that restriction endonucleases could be used to prepare genes and gene fragments at will, along with the use of phages and plasmids as vectors to introduce genes into cells, meant that this new technology had come of age. Because phage biologists had developed a good understanding of the process of transduction (finding phages that had incorporated cell genes into their genomes), phages such as Lambda were rapidly exploited as vectors for introducing a specific DNA sequence into a single host cell. That cell could then be grown up into a mass culture from a single, genetically defined progenitor, a process that came to be called “molecular cloning” or simply cloning (derived from the original meaning from plant biology, “a shoot” or “offspring”). The understanding of gene regulation in phages, obtained from years of basic phage research, provided the new generation of biotechnologists with tools to control the expression of inserted genes at will. A particularly important technology depended on the discovery that one E. coli phage, T7, encoded an entirely new RNA polymerase that transcribed only certain genes expressed late in the phage infectious cycle. The transcription signals for this polymerase were linked to the genes to be cloned, and thus their expression would occur only if the T7 RNA polymerase was present and active.
By controlling this RNA polymerase in turn, the genetic engineer could turn the cloned gene on and off as needed. This system was particularly useful since many genes made products that were detrimental to the cell growth when fully expressed, making it hard to grow large cultures with high expression levels. The T7 system, based on rather esoteric phage research, turned out to make possible the overproduction of cloned gene products in vast excess for other biotechnological uses. In some cases, up to half of the intracellular protein could be forced from


the cloned gene, in effect making the crude cell extract 50% pure to begin with. Much of the current progress in structural biology owes its success to these phage expression systems, which produce abundant quantities of proteins normally found in minuscule amounts in the cell. Another ingenious application of phage biology to biotechnological problems was the development of “phage display” methods, pioneered by George Smith in 1985. Since phages have multiple copies of identical coat proteins on their surface, and since these coat proteins are encoded by the phage genome, it was possible to insert a mixture of foreign gene sequences into the coat protein genes of individual phages in such a way that the phages produced fusion proteins that could still function as coat protein molecules yet at the same time “display” some of the foreign protein domains on the surface of the phage. A mixture of phages, each containing a different introduced foreign sequence attached to its coat protein gene, was called a “phage display library”. The power of this method was based on various ways of selecting for the properties of the foreign, “displayed” protein. Affinity methods with a chemical ligand or a protein target linked to a solid matrix could then be used to adsorb only those phages which “displayed” a specific functional cloned gene product. Antibody affinity methods were sometimes used as well. The beauty of this method is that an individual phage, selected because it carries the specific gene of interest, could be plaque-purified and grown into a pure stock of phage with only the desired cloned gene. Another recent tool for genetic engineers and biotechnologists, one that has only a peripheral relation to the history of bacteriophage, is the set of gene-editing techniques based on the CRISPR-Cas9 system.
CRISPR is an acronym for “clustered regularly interspaced short palindromic repeats”, a description of DNA sequence regularities noted in bacterial DNA sequences by several research groups between 1987 and 2000. The ubiquity of these regularities and their relation to small RNAs led researchers to seek some function for these sequences. Eventually, they were found to be related to a DNA cutting function that was connected to one of the many ways that prokaryotic cells seem to defend against lateral gene transfer, in particular, bacteriophages. The exploitation of this system for simple alteration of gene sequences represents a major advance in gene manipulation technology. It is historically incorrect, however, to present the history of CRISPR as part of the history of bacteriophages, since this mechanism was applied to phage biology only post hoc, and phage research had essentially no role in its discovery or elucidation, serving only as a subsequent biological justification for its presence in cells.

Phage Ecology and Evolution

Phages are ubiquitous in the biosphere, but only relatively recently have the diversity and ecological importance of bacteriophages attracted much attention. Roger Hendrix noted that, given that the average phage titer of the world’s sea waters is about 10⁸ per ml, there are about 75 million blue whale units of phage on earth. That is, the combined global mass of phages equals the mass of about 75 million blue whales, clearly an impressive number. Placed end-to-end, these tiny phages would stretch about 100 million light years, roughly 1000 times the diameter of our own galaxy. Why are phages so abundant? What is their function in diverse ecosystems? How many kinds of phages are there? Only recently have these questions been studied. Early phage researchers were troubled by the differing results they obtained with phages isolated from nature. D’Herelle and his colleagues tabulated some of their isolates and noted that they differed (in no obvious or systematic way) in size, immunological properties, host range, growth rates, and so on. Later, when phages were examined under the electron microscope, diversity but no clear systematics emerged. For years, since its first report in 1971 under Peter Wildy, the International Committee on the Nomenclature of Viruses (ICNV) has struggled (with essentially no success) to systematize the classification of viruses including phages. Every year, the diversity of phages continues to surprise phage biologists, especially with more study of the archaea and their phages. Microbes are now recognized as more than just passive parasites; rather, they are essential symbionts or key components of complex ecosystems. Phages are now recognized as playing an important role in many biological communities. For example, the interplay of phages and various cyanobacteria in the oceans is crucial to the global carbon cycle.
Without massive phage lysis of cyanobacteria every day, atmospheric carbon would be sequestered in these bacteria, sink to the ocean floor (eventually becoming oil) but in the meantime, the depletion of atmospheric CO2 would reduce the greenhouse effect so much that the entire planet could freeze. Another example of phage-host equilibrium is found in the Ganges River in India. For millennia Hindus have used this sacred river for interment of the remains of the dead. In a traditionally endemic cholera region, it is surprising that the Ganges water is not a source of major cholera infections. Over centuries, apparently, there had been an evolution of anti-cholera phages in the Ganges that effectively suppresses the cholera vibrio populations despite the episodic contaminations during lethal disease outbreaks. This “Ganges paradox” was noted long before phages were discovered, but this observation cannot be considered as an earlier “discovery” of phages.
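The end-to-end estimate quoted earlier in this section can be checked with a few lines of arithmetic. The sketch below is only an order-of-magnitude check: the global particle count (about 10³¹) and the particle length (about 100 nm) are assumptions chosen for illustration and are not given in the text.

```python
# Order-of-magnitude check of the "placed end-to-end" estimate.
# ASSUMPTIONS (not stated in the article): ~1e31 phage particles on earth
# and an average particle length of ~100 nm.
METERS_PER_LIGHT_YEAR = 9.4607e15


def end_to_end_light_years(particle_count: float, particle_length_m: float) -> float:
    """Length, in light years, of a line of particles laid end to end."""
    return particle_count * particle_length_m / METERS_PER_LIGHT_YEAR


if __name__ == "__main__":
    length_ly = end_to_end_light_years(1e31, 100e-9)
    print(f"{length_ly:.2e} light years")
```

Under these assumptions the result is on the order of 10⁸ light years, in line with the figure of about 100 million light years; a different assumed particle length scales the answer proportionally.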

Conclusion Bacteriophages, discovered just over a century ago, are the most prevalent biological agents on our planet. They are also likely the most diverse. Their small size, simple genetic and physical structure, and rapid reproduction seem to account for their adaptability, their multiple ecological roles and impact, and their utility for diverse scientific purposes. Some phages were exploited from their


first discovery as anti-infectious disease agents in human and animal medicine. Others, because of their seeming simplicity, were useful biological models that led to deep insights into basic biological problems such as heredity and cellular physiology. Knowledge of phage biology and biochemistry, together with their ease of laboratory manipulation, led to diverse applications in biotechnology far removed from their natural biological activities. The gradual recognition of the diversity and complexities of microbial ecology has recently brought phages once again into a central role in the global biological economy, extending even as far as having a part to play in the eventual attack on the problem of global climate change.


Icosahedral Phages – Single-Stranded DNA (φX174)
Bentley A Fane, University of Arizona, Tucson, AZ, United States
Aaron P Roznowski, The University of Texas at Austin, Austin, TX, United States and University of Arizona, Tucson, AZ, United States
© 2021 Elsevier Ltd. All rights reserved.
This is an update of B.A. Fane, M. Chen, J.E. Cherwa, A. Uchiyama, Icosahedral ssDNA Bacterial Viruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00456-8.

Glossary

Ångström (Å) 10⁻¹⁰ m. The unit used when describing distances within protein structures.
T = 1 icosahedron A geometric shape consisting of 20 faces and 12 vertices. A T = 1 virion is composed of 60 viral coat proteins.
π-π stacking Attractive, noncovalent interactions between aromatic rings.
Procapsid A viral assembly intermediate containing a full complement of scaffolding proteins but devoid of a genome.
Scaffolding protein A protein that directs the assembly of the virus, found in the procapsid assembly intermediate but not in the mature virion.
S value or Svedberg A unit describing the rate at which a particle sediments through a gradient.

History

Sometime in the 1920s, Les Années Folles, Sertic and Bulgakov isolated bacteriophage φX174, the best-known virus of the Microviridae (micro: Greek for small). From the outset it was an enfant terrible, so different from the other bacteriophages isolated during this period of extensive phage discovery. It was unusually tiny, readily passing through the smallest of ultrafilters. This biophysical characteristic defined its “race,” race X (Roman numeral ten), and the isolate was placed in a vial labeled #174, from which the phage derived its name: the race X phage (φ) in test tube #174. And there it sat, relatively unperturbed, for decades. The first electron micrographs revealed small, vague isometric particles, vastly different from the tailed morphologies typically associated with bacteriophages. The atypical capsid encapsulated an equally atypical genome, which shared qualities with both single-stranded (ss) RNA and double-stranded (ds) DNA. In 1959, Robert Sinsheimer determined that the φX174 genome was a single-stranded DNA molecule, the first one defined. While genetic maps of other phages consisted of orderly linear gene progressions, the φX174 map was most peculiar, with genes located within genes. When Sanger and colleagues sequenced the genome, the complex arrangement of overlapping reading frames was confirmed. It was so beguiling that many suspected that φX174 had an extraterrestrial origin. As the New York Times reported the theory, an advanced race engineered φX174 and disseminated it into the cosmos where it would “persist until the evolution of intelligent life and finally of investigators interested in the genetics of phage.” Although attempts were actually made to decipher the hypothesized hidden message, they all failed. However, rumors persist that the code has been broken: it reads, “Behold this marvelous little thing.” This small virus would continue to have a large impact on molecular biology.
Arthur Kornberg and colleagues used it to elucidate the molecular mechanism of prokaryotic DNA replication. Masaki Hayashi and colleagues defined the φX174 assembly pathway and were the first to demonstrate that viral DNA packaging could be achieved in vitro. As the genome sequence initially defied imagination in 1978, the atomic structure of the viral procapsid, the first such structure solved at this high resolution, beguiled structural virologists, with its unusual external scaffolding protein lattice. And when events were at their most bizarre, the skeletons in the family closet emerged: the gokushoviruses (Japanese for very small). These viruses escaped detection for decades, hiding within obligate intracellular parasitic bacterial hosts, Chlamydia and Bdellovibrio, or mollicutes, which are difficult to propagate in a laboratory setting. With the advent of environmental sequencing, these distant φX174 relatives appear to be quite common, especially in marine environments. However, their hosts have yet to be identified. The φX174-like phages’ most distinguishing features have always been their extensive overlapping reading frames, in which new genes have been discovered as recently as 2014, and their tailless structure. Double-stranded phages use their tails to deliver their genomes. Lacking that structure, the mechanism of microvirus DNA penetration was entirely vague. In the 1970s, Jazwinski and colleagues recognized the critical involvement of protein H, a minor capsid protein, calling it the DNA pilot protein. They speculated that “protein H facilitates the transit of nucleic acid, a polyelectrolyte, through the membrane lipid bilayers, by the creation of a pore.” Thirty-five years later, their hypothesis was shown to be exceptionally insightful. In 2014, the atomic structure of the H protein’s central domain was determined. Ten H monomers form a tube that emerges from the capsid while on the host cell’s surface. 
Its unique biophysical features suggest energy utilization mechanisms that may be common during DNA transport through conduits.

Virion Morphology and Genome Content

All members of the Microviridae package circular, ssDNA genomes of positive polarity within T = 1 icosahedral capsids. High resolution atomic structures of several microvirus capsids, including φX174, have been determined. The bulk of the capsid is composed of 60 copies of the coat protein (Fig. 1 and Table 1), which exhibits the common β-barrel fold, or Swiss jellyroll


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20944-8


Fig. 1 Virion structures. (A) A space filling rendering of the φX174 virion. The major coat protein is depicted in grey. The major spike protein is depicted in blue. (B) The arrangement of five φX174 coat proteins around a five-fold axis of symmetry. Numbers indicate the axis of symmetry and the number of coat proteins interacting at that axis. (C) A space filling rendering of the φX174 procapsid. The major coat, major spike, and external scaffolding proteins are respectively depicted in grey, blue, and aquamarine. (D) A cryo-EM image reconstruction of gokushovirus Chp2, courtesy of S. Kailasan and M. Agbandje-McKenna.

motif, found in many icosahedral virion capsid proteins (Fig. 1(B)). In addition to the coat protein, φX174-like viruses also contain 60 copies each of the major spike protein G and the DNA binding protein J. There are also 12 copies of a fourth structural protein, the DNA pilot protein H, which is not visible within the capsid’s overall icosahedral symmetry. Although the exact location of the H proteins is not known, 10 of them form a tube that emerges from the capsid once the virion has attached to the host cell’s surface. There are several major differences between microvirus and gokushovirus capsid superstructure and protein content. In the 250 Å diameter φX174-like virions (Fig. 1(A)), 70 Å wide spikes, each composed of five G proteins, rise 30 Å above the capsid surface. By contrast, the gokushoviruses lack major spike proteins; hence, the fivefold axes of symmetry are not decorated. However, the cryoelectron microscopy image reconstruction of gokushovirus Chp2 shows mushroom-shaped protrusions at the threefold axes of symmetry. These structures, which rise approximately 55 Å above the 270 Å diameter capsid (Fig. 1(D)), are composed of amino acid sequences from three interacting coat proteins (VP1). The gokushovirus VP2 is also a structural protein. Although its function has not been experimentally defined, bioinformatic analysis strongly suggests it is analogous to the DNA pilot protein H. Finally, gokushoviruses likely contain another protein, VP8, which is similar to the microvirus DNA binding protein. Both are short, highly basic peptides. However, VP8 has yet to be detected in virions. This may be due to the protein’s small size and the difficulties associated with propagating large quantities of gokushoviruses for biochemical characterization. The genetic maps of φX174 and Chp2 are depicted in Fig. 2. Gokushoviral genomes encode neither an external scaffolding nor a major spike protein. The absence of these genes accounts for their smaller genomes.
The external scaffolding protein has at least three known functions in φX174 morphogenesis. It nucleates procapsid assembly, mediates the subsequent elongation reaction, and stabilizes the assembled procapsid at the two- and threefold axes of symmetry (Fig. 1(C)). These functions are either not required or performed by different proteins in the gokushoviruses. Gokushovirus procapsid


Table 1 Microviridae gene products

φX174-like protein | Gokushovirus | Protein function
A | VP4 | Stage II and Stage III DNA replication.
A* | UD^a | An unessential protein for viral propagation. It may play a role in the inhibition of host cell DNA replication and superinfection exclusion.
B | VP3 | Internal scaffolding protein, required for procapsid morphogenesis and the assembly of early morphogenetic intermediates. 60 copies present in the procapsid.
C | VP5?^b | Facilitates the switch from Stage II to Stage III DNA replication. Required for Stage III DNA synthesis.
D | NP^c | External scaffolding protein, required for procapsid morphogenesis. 240 copies present in the procapsid.
E | UD | Host cell lysis.
F | VP1 | Major coat protein. 60 copies present in the virion and procapsid.
G | NP | Major spike protein. 60 copies present in the virion and procapsid.
H | VP2? | DNA pilot protein, needed for DNA injection, also called the minor spike protein. 12 copies in the procapsid and virion.
J | VP8? | DNA binding protein, needed for DNA packaging. 60 copies present in the virion.
K | UD | An unessential protein for viral propagation. It may play a role in optimizing burst sizes in various hosts.

a UD, undetermined.
b A “?” indicates a hypothesized function based on bioinformatic data.
c NP, not present in gokushoviruses.
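The copy numbers in Table 1 follow from the icosahedral geometry and can be tallied directly. The sketch below is a bookkeeping aid only, using the Table 1 values for φX174-like particles; the dictionary and function names are made up for illustration.

```python
# Copy numbers per particle for φX174-like viruses, transcribed from Table 1.
VIRION = {"F": 60, "G": 60, "J": 60, "H": 12}
PROCAPSID = {"B": 60, "D": 240, "F": 60, "G": 60, "H": 12}


def coat_subunits(t_number: int) -> int:
    """Coat proteins in an icosahedral capsid with triangulation number T."""
    return 60 * t_number


def total_copies(particle: dict) -> int:
    """Total number of protein molecules listed for a particle."""
    return sum(particle.values())


def spike_count(g_copies: int, g_per_spike: int = 5) -> int:
    """Number of pentameric spikes formed by the major spike protein G."""
    return g_copies // g_per_spike


if __name__ == "__main__":
    print(coat_subunits(1))                   # 60 coat proteins for T = 1
    print(total_copies(VIRION))               # 192 proteins in the virion
    print(total_copies(PROCAPSID))            # 432 proteins in the procapsid
    print(spike_count(VIRION["G"]))           # 12 spikes, one per fivefold vertex
    print(PROCAPSID["D"] // PROCAPSID["F"])   # 4 external scaffolding proteins per coat protein
```

The 12 pentameric spikes match the 12 vertices of an icosahedron, and the 240 copies of protein D work out to four external scaffolding proteins for every coat protein.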

Fig. 2 Genetic maps. (Top) gokushovirus Chp2. (Middle) φX174. (Bottom) transcripts found in φX174 infected cells. The promoters and transcription terminators are indicated on the linear map of φX174. Line thickness indicates relative transcript abundance. The gene A transcript is very unstable; the terminator for this transcript is unknown. For protein functions, see Table 1.

assembly is most likely mediated by the internal scaffolding protein VP3, which is analogous to the φX174 internal scaffolding protein B. In the φX174-like viruses, genes A and C encode proteins mediating DNA replication. Gokushovirus gene 4 and possibly gene 5 encode their respective homologues. Lastly, gene E encodes a lysis protein. A gokushoviral equivalent has yet to be identified. All φX174-like genomes contain two additional genes, A* and K. They encode proteins with unknown or poorly defined function. Furthermore, neither is required for growth under laboratory conditions. Both genes are contained within larger genes. Thus, maintaining them does not increase genome size. The φX174-like phages fall into three major clades, each one represented by a canonical virus: φX174, G4 and α3. Genome size corresponds to the three clades: ~5.3 kb, ~5.7 kb, and ~6.1 kb for the φX174, G4 and α3 clades, respectively. Although genomes vary in size, the capsids’ internal volumes are nearly identical. The α3-clade genomes are more complex. They contain two versions of gene C: Cbig and Csmall. Both proteins are synthesized in infected cells. Although viability does not require both, optimal fitness does. In addition, three small ORFs are found in the α3-clade H-A intercistronic regions, which are significantly larger than those found in the other clades. Whether the encoded proteins are synthesized, and what their functions might be, is unknown.
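How a gene can sit inside a larger gene without adding to genome length is easiest to see by translating one stretch of DNA in two reading frames. The sketch below uses the standard genetic code and a short, invented toy sequence; it is not φX174 sequence, and the function names are illustrative only.

```python
# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from(seq: str, start: int) -> str:
    """Translate codon by codon from `start` until a stop codon or the end."""
    peptide = []
    for pos in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # stop codon
            break
        peptide.append(residue)
    return "".join(peptide)


def open_reading_frames(seq: str) -> list:
    """Return (start index, reading frame, peptide) for every ATG in `seq`."""
    return [
        (i, i % 3, translate_from(seq, i))
        for i in range(len(seq) - 2)
        if seq[i:i + 3] == "ATG"
    ]


# Hypothetical toy sequence with two start codons in different frames.
TOY_SEQUENCE = "ATGAATGGCTAAGTGA"

if __name__ == "__main__":
    for start, frame, peptide in open_reading_frames(TOY_SEQUENCE):
        print(start, frame, peptide)
```

In this toy case the same 16 nucleotides encode two different short peptides, one read from position 0 and one from position 4, which is the sense in which an overlapping gene costs no extra genome length.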


Fig. 3 Surface residues and conformational changes associated with host range, attachment, and eclipse. (A) Space filling rendering of the α3 X-ray structure, with one G protein spike removed. The coat protein is depicted in white, the spike protein in magenta. The 6-carbon sugar binding site, which likely mediates initial attachment, is depicted in purple. Dark grey residues denote the regions within the coat protein that may merge with the lipid bilayer (see panel C). Mutations that alter host range are depicted in blue. One G protein spike has been removed to illustrate the location of these substitutions within the coat–spike protein interface. (B) The clustering of mutations affecting host range within the coat– spike interface. Proteins are depicted as described above. Mutations at blue residues alter host range. (C) Cryo-EM image reconstruction of φX174 interacting with a lipid bilayer. (D and E) Interactions with the lipid bilayer at the point of cell-virus contact removes the spike protein pentamer. Before DNA ejection the coat protein pentamer is closed (panel D), whereas a pore later forms for the release of the H-tube and DNA (panel E).

Host Cell Recognition, Attachment, Eclipse and Penetration

φX174-like phages recognize terminal glucose and galactose moieties on host lipopolysaccharides (LPSs). The structure of the LPS is both necessary and sufficient to mediate both attachment and penetration. In X-ray structures of φX174, six coat protein residues constitute a six-carbon sugar binding site when Ca²⁺ is present (Fig. 3(A)). Due to the icosahedral symmetry of the capsids, each virion has 60 such sites, which likely mediate attachment to host LPS. This is a reversible reaction: infectious virions can be dissociated from host cell surfaces. During eclipse, also called irreversible attachment, DNA-filled virions can still be visualized on the surface of host cells. However, particles are no longer infectious when removed from membranes. Cryo-EM structures of a DNA-filled particle attached to a lipid bilayer elucidate the conformational switches associated with eclipse (Fig. 3(C, D, and E)). In this structure, the major spike pentamer at the lipid-interacting vertex has been removed, and an annulus of coat protein residues merges with the lipid bilayer, anchoring the infecting virion to the membrane. An additional conformational change was observed in a post-DNA release structure: a pore formed at the attached vertex (Fig. 3(E)). The DNA translocating conduit, or H-tube, most likely emerged through this opening during penetration (see below). The results of host range studies underscore the importance of coat-spike protein interactions during early infection (Fig. 3(A) and (B)). Despite ~90% amino acid identity, the natural host of microvirus α3 is E. coli C, whereas microvirus ST-1 is an E. coli K12-specific phage. Virions attached to and eclipsed on both native and unsusceptible hosts; however, they breached only their native host’s cell wall. Thus, unsusceptible host-phage interactions promote off-pathway reactions, inactivating particles without penetration.
Single amino acid substitutions in either the α3 coat or spike proteins expand host range. The substitutions cluster around the outer circumference of the coat-spike interface (Fig. 3(A) and (B)). During DNA penetration, bacteriophages must propel their genomes across a formidable barrier: the cell wall. The Gram-negative prokaryotic cell wall is composed of two lipid membranes that surround a peptidoglycan layer. The emblematic tail structures found


Fig. 4 The atomic structure of the φX174 DNA translocating conduit, also called the H-tube. (A) Two of the ten H protein monomers are depicted in cyan. The structure is divided into two domains, which are distinguished by repeating heptad or hendecad motifs. A heptad is a group of seven, whereas a hendecad is a group of eleven. An example of each motif is highlighted in magenta. (B) Looking through the H-tube from its C-terminal opening. The inward facing amide and guanidinium containing side chains are depicted as stick models. As above, two monomers are depicted in cyan.

on double-stranded (ds) phages often perform this function. However, the φX174-like phages are strictly icosahedral and lack an external tail structure. Although early studies (circa 1975) indicated that the DNA pilot protein H mediated penetration, efforts to visualize the protein within the virion structure have been dismally unsuccessful. Thus, the penetration mechanism remained unknown for almost 40 years. Then, a 13-year effort to both purify and generate diffracting H protein crystals finally yielded results. Although protein H is a monomer in all viral assembly intermediates (see below), it crystallized as an oligomer (Fig. 4(A)). Ten α-helical monomers formed a 170 Å-long, coiled-coil, helical barrel. This barrel, or tube, is long enough to span membrane adhesion sites, regions of the cell wall where the inner and outer membranes make contact. Infecting particles tend to congregate at these sites. The H-tube’s internal diameter is 22 Å, which can accommodate two single-stranded (ss), un-base-paired DNA strands. Since the genome is circular, two ssDNA strands must simultaneously pass through the conduit. The helices run parallel to each other, and a structural kink divides the tube into two domains. Domain A has an 11,3 coiled-coil structure (11 residues per 3 helical turns), whereas domain B has a 7,2 coiled-coil structure (7 residues per 2 helical turns). The tube has been visualized in situ by cryo-EM tomography. Upon cell contact, a fully formed tube emerges from the capsid and descends into the host cell wall. After DNA delivery, the tube dissociates within the cell wall, which is necessary to preserve the cell’s membrane potential. The tube’s inner surface is lined with glutamine, asparagine, and arginine amino acid side chains (Fig. 4(B)). Containing amide and guanidinium groups, these side chains can form hydrogen bonds with nucleic acids. The side chains point toward the tube’s center and are angled towards the host cell’s cytoplasm.
This orientation may prevent the genome from moving backwards during transit, thus creating a one-way passage. In addition, these side chains may play two other roles during DNA transport. Although the tube’s internal diameter can accommodate two ssDNA strands, it is not wide enough to accommodate the hair-pin structures formed by ss nucleic acids. By competition, these side chains, which interact with nitrogenous bases, may eliminate hair-pin structures during translocation. These tube-DNA interactions may also modulate the release of potential energy used when transporting DNA.
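The coiled-coil notation used for the two H-tube domains reduces to simple ratios. The sketch below converts the 11,3 and 7,2 repeats into residues per helical turn and compares them with the textbook figure of 3.6 residues per turn for an undistorted α-helix; that reference value and the variable names are assumptions added for illustration and do not come from the structure itself.

```python
from fractions import Fraction

# Textbook periodicity of an undistorted α-helix (assumption added here).
ALPHA_HELIX = Fraction(18, 5)  # 3.6 residues per turn

# Repeat notation from the text: (residues, helical turns) per repeat.
H_TUBE_DOMAINS = {
    "domain A (hendecad, 11,3)": (11, 3),
    "domain B (heptad, 7,2)": (7, 2),
}


def residues_per_turn(residues: int, turns: int) -> Fraction:
    """Average number of residues per helical turn for a given repeat."""
    return Fraction(residues, turns)


if __name__ == "__main__":
    for name, (residues, turns) in H_TUBE_DOMAINS.items():
        periodicity = residues_per_turn(residues, turns)
        offset = periodicity - ALPHA_HELIX
        print(f"{name}: {float(periodicity):.3f} residues/turn "
              f"({float(offset):+.3f} relative to 3.6)")
```

The hendecad repeat comes out slightly above 3.6 residues per turn and the heptad repeat slightly below it, so the two domains deviate from the undistorted helix in opposite directions, consistent with a structural kink separating them.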


Mature virions, which contain a highly compacted, volumetrically constrained genome, may represent the particle’s highest energy state. This inherent potential energy may be used to initiate genome delivery and begin moving the genome into the host cell. In most biological systems, potential energy performs work more efficiently if released in small, usable increments. For example, during oxidative phosphorylation, electrons are not passed directly from NADH to O2. The energy released by a direct transfer would be too great for the cell to harness effectively. Instead, the electrons flow through the electron transport system, in which energy is liberated in a series of exploitable reactions. The amide and guanidinium groups may regulate genome delivery by slowing the transport process, creating a frictional force that counters the pressure driving the genome into the cell. The system can be compared to a machine driven by compressed gas. In this analogy, the gas cylinder is the capsid, the packaged genome is the compressed gas, the amide and guanidinium groups are the regulatory gauge, and the H-tube is the pipe connected to the cylinder. If too much gas (energy) is released at one time, the pipe could fail. The frictional force created by the amide and guanidinium groups may prevent the transport of the entire genome, which appears to require an additional energy source. This source could be linked to DNA replication. If cells are not actively synthesizing DNA, the entire genome is not efficiently transferred. Thus, the first stage of DNA synthesis and penetration may be concurrent processes. As the ssDNA enters the cell, it is converted into a double-stranded molecule. The phosphate bond hydrolyzed within the deoxynucleotide triphosphate (dNTP) contains more energy than the newly formed phosphate bond in the DNA backbone. Using this small energy differential, the host cell DNA polymerase may help pull the ssDNA genome into the cell.
Due to their recent discovery, the details of gokushovirus attachment and penetration have yet to be defined. However, unlike the φX174-like viruses, gokushovirus Chp2 appears to utilize a host cell protein as a receptor molecule, as opposed to LPS. The results of bioinformatic approaches combined with chlamydiaphage host range studies suggest that the threefold related protrusion may govern host cell specific attachment. However, there is no direct experimental evidence to support this hypothesis.

DNA Replication
The studies of φX174 DNA replication are historically significant. Replication of positive polarity ssDNA is complex, occurring in three separate stages, which are described below (Fig. 5). Kornberg and colleagues reconstituted the first two stages in vitro, while Hayashi and colleagues reconstituted the third stage, which includes concurrent packaging of the single-stranded genome. Collectively, these studies established the first viral genome replication process to be defined at the biochemical level.

As single-stranded (ss) viral DNA is delivered into the infected cell via the H-tube (Fig. 5), stage I DNA replication commences. During this stage, the ss genome is converted into a covalently closed, double-stranded (ds), circular molecule, called replicative form one (RFI) DNA. Since purified ssDNA produces progeny when transfected, host cell proteins are both necessary and sufficient for stage I replication in vivo. In vitro, a stem–loop structure in the FG intercistronic region serves as the host cell protein recognition site, which initiates the assembly of the primosome. Whether other regions of the genome can perform this function in vivo has yet to be determined. After primosome assembly, the complex migrates along the ssDNA in a 5′→3′ direction, synthesizing the requisite RNA primers for DNA replication. Addition of the holoenzyme leads to chain elongation. The 13 host cell proteins required for this stage of synthesis are the same proteins involved in replicating the host cell chromosome.

During stage II DNA synthesis, RFI DNA is amplified. In addition to the host cell proteins required for stage I replication, stage II replication is dependent on the viral A protein and the host cell Rep protein, which functions as a helicase. The viral A protein binds to the stage II/III origin of replication and nicks it to initiate (+) strand synthesis, which occurs via a rolling circle mechanism. After nicking, protein A forms a covalent ester bond with the DNA, forming a relaxed circular molecule called RFII. The host cell Rep protein unwinds the helix and the host ssDNA binding protein (Ssb) stabilizes the separated strands. After one round of rolling circle synthesis, protein A cuts the newly generated origin and acts as a ligase, generating a covalently closed circular molecule. Minus strand synthesis is mechanistically similar to stage I DNA synthesis.

Stage III DNA synthesis involves the concurrent synthesis and packaging of the ssDNA genome. Procapsids and viral protein C are required for this reaction, along with all the proteins involved in the previous stages of replication with the exception of Ssb. Thus, a single-stranded genome is not synthesized unless there is a procapsid to accept it. As the dsDNA is unwound at the stage II/III origin of replication, competition between Ssb and protein C for binding determines whether dsDNA or ssDNA will be synthesized. This has been demonstrated both in vitro and in vivo. In vitro, excess Ssb promotes another round of stage II DNA synthesis. By contrast, excess protein C inhibits stage II DNA replication, and single-stranded genomes are concurrently synthesized and packaged. In vivo, altering the Ssb/protein C ratio by exogenously over-expressing cloned genes is detrimental to plaque formation. If the ratio increases, ssDNA synthesis is delayed. Thus, fewer mature virions are formed before programmed cell lysis. Decreases in the ratio result in fewer RF molecules, which are the templates for both genome biosynthesis and transcription. Thus, the detrimental effects are more global.

When packaging occurs, proteins A, C, and Rep form a complex on the dsDNA that docks to procapsids, presumably in a groove that spans one of the twofold axes of symmetry. Procapsid binding does not occur in the absence of protein C. Mechanistically, stage III DNA synthesis is similar to stage II (+) strand rolling circle synthesis. As the new (+) strand is synthesized, the displaced strand is translocated into the procapsid. After one round of synthesis, protein A, which is covalently attached to the origin of replication, cuts the newly synthesized origin and acts as a ligase to generate a covalently closed circular molecule from the displaced strand.
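
The competition between Ssb and protein C described above can be pictured with a deliberately simple toy calculation. The Python sketch below assumes that each protein captures the unwound origin in proportion to its concentration multiplied by a relative binding weight. This proportional rule, the function name, and the weight parameters are all illustrative assumptions, not a model taken from the studies discussed here.

```python
def stage_iii_fraction(protein_c: float, ssb: float,
                       c_weight: float = 1.0, ssb_weight: float = 1.0) -> float:
    """Fraction of origin-unwinding events captured by protein C (toy model).

    Assumes simple proportional competition: each protein wins in proportion
    to (concentration x relative binding weight). The weights are hypothetical
    placeholders, not measured affinities.
    """
    if protein_c < 0 or ssb < 0:
        raise ValueError("concentrations must be non-negative")
    if c_weight <= 0 or ssb_weight <= 0:
        raise ValueError("binding weights must be positive")
    c_term = protein_c * c_weight
    ssb_term = ssb * ssb_weight
    total = c_term + ssb_term
    if total == 0:
        raise ValueError("at least one competitor must be present")
    # A high value favors stage III (ssDNA synthesis and packaging);
    # a low value favors another round of stage II (RF amplification).
    return c_term / total
```

In this sketch, raising the amount of protein C relative to Ssb pushes the fraction toward 1 (stage III), whereas excess Ssb pushes it toward 0 (stage II), which mirrors the qualitative in vitro behavior only.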

Fig. 5 Morphogenesis and DNA replication of φX174.

Gene Expression
Since Microviridae contain single-stranded genomes of positive polarity, stage I DNA synthesis, which generates the negative strand, must occur before transcription. Unlike that of large bacteriophages, Microviridae gene expression is neither temporally regulated nor mediated by trans-acting mechanisms. Thus, the timing and relative production of viral proteins are entirely dependent on cis-acting regulatory signals: promoters, transcription terminators, mRNA stability sequences, and ribosome binding sites. In the φX174-like viruses, promoters are found upstream of genes A, B, and D. Terminators are found after genes J, F, G, and H (Fig. 2). The terminators are not 100% efficient; thus, a wide variety of transcripts are produced. There is a rough correlation between gene transcript abundance and the amount of encoded protein required for the viral life cycle. For example, the virus requires more copies of protein D, the external scaffolding protein, than of any other viral protein, and gene D transcripts are the most abundant in the cell. Similarly, there are more transcripts of genes F, J, and G than of gene H. The relative stoichiometry of these structural proteins is 5:5:5:1, respectively. Protein expression is also affected by mRNA stability. Each mRNA species decays at a characteristic rate. Transcripts of gene A decay very rapidly, ensuring that this non-structural protein is not overexpressed. Finally, regulation can also be achieved at the translational level. Despite gene E’s location within gene D on the most abundant transcript, very few E proteins, which mediate cell lysis, are translated because the gene has an extremely weak ribosome binding site.
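
The 5:5:5:1 ratio follows from the particle architecture described elsewhere in this article: a T = 1 capsid built from 12 pentamers, each associated with five copies of F, G, and J and one copy of H. The short Python sketch below makes that arithmetic explicit. The variable names are illustrative only.

```python
from functools import reduce
from math import gcd

PENTAMERS_PER_CAPSID = 12  # a T = 1 icosahedral capsid has 12 pentameric vertices

# Copies of each structural protein associated with one pentamer, as described
# in the text (one H per 12S* particle; 60 J proteins per virion).
COPIES_PER_PENTAMER = {"F": 5, "J": 5, "G": 5, "H": 1}

copies_per_virion = {
    protein: n * PENTAMERS_PER_CAPSID for protein, n in COPIES_PER_PENTAMER.items()
}

# Reduce the copy numbers to the smallest whole-number ratio.
divisor = reduce(gcd, copies_per_virion.values())
ratio = {protein: n // divisor for protein, n in copies_per_virion.items()}

print(copies_per_virion)  # {'F': 60, 'J': 60, 'G': 60, 'H': 12}
print(ratio)              # {'F': 5, 'J': 5, 'G': 5, 'H': 1}
```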

Morphogenesis
Despite extensive searches, a dependence on host molecular chaperones, such as GroEL and GroES, has never been documented. Thus, chaperone-independence is another characteristic that distinguishes the φX174-like viruses from dsDNA bacteriophages. The first virally encoded assembly intermediates are the 9S and 6S particles, respective pentamers of the major coat F and spike G proteins (Fig. 5). Unlike most viruses, which use a single scaffolding protein to mediate morphogenesis, φX174 uses two scaffolding proteins to orchestrate assembly, temporally dividing the pathway into two discernible phases.

The early phase is mediated by the internal scaffolding protein B. After or during 9S particle formation, five B proteins bind to the pentamer’s underside, which becomes the capsid’s internal surface. The B proteins also recruit one copy of the DNA pilot protein H. However, the DNA pilot protein is not required for assembly, as H-less particles can be assembled into uninfectious, virus-like particles. The finished product is the 9S* particle. Early assembly then concludes when a major spike protein G pentamer, the 6S particle, binds to the top of the 9S* particle, forming the 12S* assembly intermediate. As previously discussed, the DNA pilot protein H forms a decameric tube for DNA delivery. However, the 9S* and 12S* assembly intermediates each contain one copy of the H protein. Thus, the H-tube’s discovery was not expected, and tube formation must occur after capsid assembly.

Coat-internal scaffolding protein contacts are mediated primarily by aromatic interactions, creating an elaborate π-π stacking network (Fig. 6(D)). While the X-ray structure elucidates the molecular details of the coat-scaffolding protein interactions, genetic and biochemical analyses elucidate the functional consequences. The π-π stacking network has been mutationally modified to either eliminate or alter B protein binding. The former leads to 9S particle aggregation, whereas the latter kinetically traps 9S* or 12S* assembly intermediates. Thus, the B protein temporally guides the coat protein through assembly. It suppresses interactions between coat protein pentamers until late assembly and ensures that 9S* and 12S* particles are primed to associate with 6S pentamers and the external scaffolding protein D, respectively. Mutations within the B protein’s N-terminus, which is not part of the π-π stacking network, affect H protein incorporation.

During the second phase of assembly, which is orchestrated by the external scaffolding protein D, 240 copies of protein D, arranged as 120 dimers, organize twelve 12S* particles into the procapsid. External scaffolding proteins are rare. The only other known external scaffolding proteins belong to parasitic satellite virus systems such as bacteriophage P4. The satellite virus encodes an external scaffolding protein that hijacks the helper virus’s capsid proteins, forcing them to construct a smaller capsid, which is only suitable for the satellite virus’s smaller genome. Thus, the φX174-like viruses are the only non-satellite viruses known to utilize an external scaffolding protein. This protein performs many of the functions typically associated with internal scaffolding proteins in single scaffolding protein systems: it nucleates procapsid assembly, organizes procapsid assembly precursors during elongation, and stabilizes the procapsid during DNA packaging.

In addition to being unique in nature, the φX174 external scaffolding protein is biophysically unique, having the inherent ability to adopt at least six different structures. There are four D subunits (D1, D2, D3 and D4) per coat protein in the procapsid (Fig. 6(A) and (C)). They are arranged as two similar, but not identical, asymmetric dimers (D1D2 and D3D4). Each subunit makes a unique set of contacts with the underlying coat protein, the spike protein, and neighboring D protein subunits. Accordingly, the structure of each subunit is unique. The atomic structure of the assembly-naïve D protein dimer has also been determined. The subunits within that dimer, DA and DB, appear poised to achieve the four structures found in the procapsid. DA has a structure somewhat between D1 and D3, while DB has a structure midway between D2 and D4.

The 12S* + D protein → procapsid reaction has been analyzed both in vitro and in vivo. Capsid assembly reactions are typically divided into two phases: nucleation and elongation. Nucleation reactions, which yield a short-lived nucleation complex, are generally higher order reactions and/or require higher critical concentrations than elongation reactions. Therefore, once the slower nucleation step has occurred, the elongation reactions quickly go to completion. This can be regarded as an evolutionary mandate.

Fig. 6 Cryo-EM and atomic renderings of the φX174 procapsid. (A) Space-filling rendering of the φX174 procapsid. The four structurally unique external scaffolding proteins (D1-D4) are depicted using the colors indicated in panel C. (B) Cryo-EM rendering of the procapsid with the external scaffolding protein removed. (C) The four unique D proteins, depicted in shades of blue, are associated with the coat and spike proteins, colored white and periwinkle, respectively. (D) Aromatic interactions between the φX174 coat and internal scaffolding proteins. The internal scaffolding protein is depicted in magenta. The coat protein is depicted in two shades of blue. Dark blue residues contain aromatic side chains, whereas residues with positively charged side chains are light blue. The most extensive π-π stacking interactions (the attractive, noncovalent interactions between aromatic rings) occur between the last amino acid in protein B and three aromatic coat protein residues.

If nucleation reactions occurred more readily than elongation reactions, subunits would be divided among too many nucleation complexes. This would quickly exhaust the assembly subunit pool and result in partially formed capsids. In this regard, the 12S* + external scaffolding protein D → procapsid transition is very similar to single scaffolding protein-mediated assembly systems. However, φX174 has two scaffolding proteins, which has resulted in an interesting role reversal among the reactants. Compared to other systems, the external scaffolding protein behaves more like a viral coat protein, whereas the coat protein, a component of the 12S* particle, acts more like a scaffolding protein. Typically, viral coat proteins form aberrant assemblies in the absence of scaffolding proteins, whereas scaffolding proteins alone are relatively inert. Nucleation reactions require higher critical concentrations of coat protein than of scaffolding protein. During elongation, the scaffolding proteins control fidelity. By contrast, the φX174 external scaffolding protein D self-associates to form large heterogeneous spherical complexes. The coat proteins in the 12S* particles do not self-associate in vitro, in vivo, or in the assembled procapsid, which is held together almost exclusively by D-D protein contacts across the two-fold axes of symmetry. Remarkably, there appear to be no associations between coat protein pentamers (Fig. 6(B)). Lastly, nucleation is dependent on a high scaffolding:coat protein ratio, and the 12S* particles in the reaction control elongation fidelity.

Gokushovirus genomes do not encode an external scaffolding protein and their assembly pathway remains to be elucidated. A gokushovirus particle containing VP3 in addition to the structural coat and DNA pilot proteins has been isolated. Unlike virions, these particles are devoid of DNA. This particle most likely represents a gokushovirus procapsid and indicates that VP3, which is not found in virions, functions as an internal scaffolding protein.
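
The argument that overly easy nucleation exhausts the subunit pool can be illustrated with a small piece of bookkeeping. The Python sketch below is a toy calculation, not a kinetic model: it assumes a fixed pool of 12S* particles shared evenly among all nucleated shells, and the function name and example numbers are hypothetical.

```python
PENTAMERS_PER_PROCAPSID = 12  # twelve 12S* particles per procapsid (from the text)


def assembly_outcome(pool_size: int, nuclei: int) -> dict:
    """Share a fixed pool of 12S* particles evenly among `nuclei` growing shells.

    Toy bookkeeping only: every nucleus is assumed to draw equally from the
    pool, and growth stops when the pool is exhausted or the shell is complete.
    """
    if pool_size < 0 or nuclei <= 0:
        raise ValueError("pool_size must be >= 0 and nuclei must be > 0")
    per_shell = min(pool_size // nuclei, PENTAMERS_PER_PROCAPSID)
    complete = nuclei if per_shell == PENTAMERS_PER_PROCAPSID else 0
    return {
        "particles_per_shell": per_shell,
        "complete_procapsids": complete,
        "partial_shells": nuclei - complete,
    }


# Rare nucleation: 1200 particles, 100 nuclei -> every shell can be completed.
print(assembly_outcome(1200, 100))
# Frequent nucleation: the same pool spread over 400 nuclei -> only partial shells.
print(assembly_outcome(1200, 400))
```

With the same pool, the first call yields 100 complete procapsids, whereas the second leaves 400 shells stalled at three particles each.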

DNA Packaging and the DNA Binding Protein
Genome biosynthesis and packaging are concurrent processes in φX174. The pre-initiation complex, consisting of the host cell Rep protein and the viral A and C proteins, associates with the procapsid, forming the 50S complex. As described above, the viral A protein binds the origin of replication in replicative form DNA. The results of genetic studies indicate that the pre-initiation docking site resides along a two-fold axis of symmetry. The DNA binding protein J enters the procapsid during packaging and is absolutely required for genome encapsidation. The N-termini of microvirus J proteins are extremely basic and bind the genome via simple charge-charge interactions. Once in the procapsid, the J protein’s C-termini, which are very hydrophobic and aromatic, compete with the internal scaffolding protein for binding to a cleft in the viral coat protein. This competition results in the extrusion of the internal scaffolding protein during the packaging reaction. Although the majority of the packaged genome appears to exist as a dense core, the J protein’s basic amino acids, along with a small cluster of basic capsid residues, tether the genome to the capsid’s inner surface. In the φX174 X-ray structure, the J protein forms an S-shaped polypeptide chain devoid of secondary structure. The C-terminus is tightly associated with the cleft located near the center of the coat protein. Each of the 60 J proteins traces a path toward the five-fold axis of symmetry, crosses over to the adjacent capsid protein, and veers toward the C-terminus of the adjacent J protein. Thus, the DNA binding protein guides the incoming genome into a somewhat icosahedrally ordered conformation, and a portion of the genome is ordered in the X-ray structure. The biophysical characterization of φX174 particles packaged with foreign genome-length DNA or mutant DNA binding proteins suggests that protein-DNA interactions influence the final stages of morphogenesis. Morphogenesis terminates with the provirion to virion transition: the dissociation of the external scaffolding protein and an 8.5 Å radial collapse of the capsid pentamers around the genome. The tethered genome constrains the spatial orientation and secondary structure of the remaining nucleotides. Therefore, altering the tether or the base composition of the packaged nucleic acid may affect the magnitude or integrity of the collapse. However, the role of DNA-capsid interactions in φX174 is not as dramatic as that seen in other viral systems, in which abrogating genome-capsid interactions leads to severely aberrant particles.

Lysis
Unlike large dsDNA bacteriophages, ssDNA microviruses do not have the genetic capacity to encode a two-component holin-endolysin system. Instead, they have evolved a small protein, lysis protein E, that inhibits peptidoglycan biosynthesis. Thus, lysis is dependent on host cell division, during which the cell becomes sensitive to osmotic pressure. The results of several elegant genetic studies elucidated protein E function. By selecting for lysis-resistant cells, Ryland Young and colleagues first uncovered the slyD (sensitivity to lysis) gene, which encodes a peptidyl-prolyl cis-trans isomerase, or PPIase. However, it is unlikely that the slyD gene product is the E protein target. Considering the function of PPIases in protein folding and the E protein’s five prolyl bonds, it seemed more likely that the E protein was a substrate for the host cell enzyme. In fact, gene E mutants that plate on slyD cells, termed Epos (plates on slyD), can be readily isolated. To determine the target of protein E, Epos proteins were used in a second round of selection with slyD cells. The surviving colonies contained mutations in the mraY gene, which encodes translocase I. This enzyme catalyzes the formation of the first lipid-linked intermediate in cell wall biosynthesis. In the presence of protein E, its activity is greatly reduced. Thus, translocase I is likely the E protein’s direct target.

Evolution and Evolutionary Studies
The evolutionary relationship between the gokushoviruses and the φX174-like viruses remains somewhat obscure. There is a deep evolutionary rift between the two groups, with no known intermediate species. The rift is likely a function of their hosts’ biology and not the hosts’ evolutionary relationship. The gokushoviruses have been isolated from obligate intracellular parasitic bacteria and mollicutes. For example, the Bdellovibrio host for the gokushovirus φMH2K is a proteobacterium, like the φX174-like phage hosts, but φMH2K is much more closely related to the phages of chlamydia, which is in an unrelated bacterial phylum. Two of the primary differences between the two groups are the external scaffolding and spike proteins. Many gokusho or gokusho-like viral genomes have been detected in environmental samples, primarily oceanic, and in microbiomes. However, their hosts have yet to be identified. Nonetheless, these viruses likely play a major role in the shape and functioning of ecosystems.

Evolution of a Two Scaffolding Protein System
While the existence of two scaffolding proteins is unique in any system, it is particularly peculiar for T = 1 capsids. No other T = 1 virus requires a scaffolding protein, let alone two. The results of an exhaustive search for new E. coli φX174-like phages indicate that the 47 known species can be divided into three separate clades, typified by bacteriophages φX174, G4 and α3. Although there is evidence for some horizontal gene transfer between the species, the extent is considerably lower than that observed for dsDNA viruses. The one gene that seems to be the most recent acquisition, at least in its present form, is the external scaffolding protein gene, which appears to have originated in the φX174 clade and spread to the other two. An accumulation of genetic, biochemical, and structural data indicates that the external scaffolding protein is the more essential of the two for morphogenesis. The internal scaffolding protein, on the other hand, may be best viewed as an efficiency protein, facilitating several morphogenetic reactions, but not absolutely essential for any one in particular. This hypothesis was tested by evolving a sextuple mutant strain that no longer requires the internal scaffolding protein. Although mutations in structural and external scaffolding proteins did arise, two mutations were found in promoters upstream of the external scaffolding gene, leading to its over-expression. These two regulatory mutations, along with one conferring a substitution in the external scaffolding protein, have a kinetic effect on virion assembly. This indicates that one function of the internal scaffolding protein is to lower the critical concentration of the external scaffolding protein needed to nucleate procapsid assembly. The morphogenesis of wild-type φX174 is extremely rapid: progeny virions can be detected as early as 5 min post infection, which may be critical for a small phage without the genome content to encode superinfection exclusion functions. At five minutes post infection, most other phages are just concluding early gene expression. Rapid morphogenesis may be a consequence of having two scaffolding proteins, allowing the φX174-like viruses to compete with the larger and vastly more prevalent dsDNA phages. In this evolutionary model, the external scaffolding protein is a recent acquisition. Those phages that did not acquire the gene, the gokushoviruses, may have persisted by finding a niche free of competition: obligate intracellular parasitic hosts like chlamydia.

φX174-Like Viruses as a Model System for Experimental Evolution
Due to their small genomes and the ability to interpret amino acid substitutions within the context of the virion and procapsid atomic structures, the φX174-like viruses have become one of the leading systems for molecular evolution analyses. In studies pioneered by Jim Bull, Holly Wichman and colleagues, viruses are placed under selective conditions and grown for numerous generations, either in chemostats or by passaging viruses between fresh cell cultures. Individual genomes are sequenced at various time intervals to monitor the appearance and disappearance of mutations. The mutations obtained differ depending on the selective conditions. While these studies identify beneficial changes in both structural and nonstructural proteins, many mutations are in genetic regulatory sequences, which most likely optimize the relative level of viral proteins synthesized under the experimental conditions.
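
The bookkeeping behind monitoring when mutations appear and disappear can be sketched in a few lines of Python. This is only an illustration of the idea: the function name, the mutation labels, and the sampling times are invented for the example and do not come from the studies mentioned above.

```python
def mutation_time_ranges(samples):
    """Map each mutation to the (first, last) sampling time at which it was seen.

    `samples` maps a sampling time to the set of mutations found in the
    genomes sequenced at that time.
    """
    ranges = {}
    for time in sorted(samples):
        for mutation in samples[time]:
            # Keep the earliest time already recorded; update the latest time.
            first, _ = ranges.get(mutation, (time, time))
            ranges[mutation] = (first, time)
    return ranges


# Hypothetical example data: labels and passage numbers are invented.
example = {
    0: set(),
    10: {"mutA"},
    20: {"mutA", "mutB"},
    30: {"mutB"},
}
print(mutation_time_ranges(example))
```

In this made-up series, "mutA" is seen from passage 10 to 20 and then lost, while "mutB" appears at passage 20 and persists to passage 30.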

Further Reading
Aoyama, A., Hamatake, R.K., Hayashi, M., 1983. In vitro synthesis of bacteriophage phiX174 by purified components. Proceedings of the National Academy of Sciences of the United States of America 80, 4195–4199.
Bernhardt, T.G., Struck, D.K., Young, R., 2001. The lysis protein E of phiX174 is a specific inhibitor of the MraY-catalyzed step in peptidoglycan synthesis. Journal of Biological Chemistry 276, 6093–6097.
Cherwa Jr., J.E., Tyson, J., Bedwell, G.J., et al., 2017. PhiX174 procapsid assembly: The effects of an inhibitory external scaffolding protein and resistant coat proteins in vitro. Journal of Virology 91. (pii: e01878-01816).
Clarke, I.N., Cutcliffe, L.T., Everson, J.S., et al., 2004. Chlamydiaphage Chp2, a skeleton in the phiX174 closet: Scaffolding protein and procapsid identification. Journal of Bacteriology 186, 7571–7574.
Dokland, T., McKenna, R., Ilag, L.L., et al., 1997. Structure of a viral procapsid with molecular scaffolding. Nature 389, 308–313.
Doore, S.M., Baird, C.D., Roznowski, A.P., Fane, B.A., 2014. The evolution of genes within genes and the control of DNA replication in microviruses. Molecular Biology and Evolution 31, 1421–1431.
Doore, S.M., Fane, B.A., 2016. The Microviridae: Diversity, assembly, and experimental evolution. Virology 491, 45–55.
Hayashi, M., Aoyama, A., Richardson, D.L., Hayashi, N.M., 1988. Biology of the bacteriophage φX174. In: Calendar, R. (Ed.), The Bacteriophages, vol. 2. New York: Plenum Press, pp. 1–71.
Kornberg, A., 1980. DNA Replication. San Francisco: Freeman.
Rokyta, D.R., Burch, C.L., Caudle, S.B., Wichman, H.A., 2006. Horizontal gene transfer and the evolution of microvirid coliphage genomes. Journal of Bacteriology 188, 1134–1142.
Roznowski, A.P., Fane, B.A., 2016. Structure-function analysis of the phiX174 DNA-piloting protein using length-altering mutations. Journal of Virology 90, 7956–7966.
Sun, L., Young, L.N., Zhang, X., et al., 2014. Icosahedral phiX174 forms a tail for DNA transport. Nature 505, 432–435.
Sun, Y., Roznowski, A.P., Tokuda, J.M., et al., 2017. Structural changes of tailless bacteriophage PhiX174 during penetration of bacterial cell walls. Proceedings of the National Academy of Sciences of the United States of America 114, 13708–13713.
Szekely, A.J., Breitbart, M., 2016. Single-stranded DNA phages: From early molecular biology tools to recent revolutions in environmental microbiology. FEMS Microbiology Letters 363, fnw027.
Wichman, H.A., Brown, C.J., 2010. Experimental evolution of viruses: Microviridae as a model system. Philosophical Transactions of the Royal Society B: Biological Sciences 365, 2495–2501.

Single-Stranded RNA Bacterial Viruses
Peter G Stockley, University of Leeds, Leeds, United Kingdom
Junjie Zhang, Texas A&M University, College Station, TX, United States
© 2021 Elsevier Ltd. All rights reserved.
This is an update of P.G. Stockley, Icosahedral ssRNA Bacterial Viruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00760-3.

Glossary
Pfu: Stands for plaque-forming unit; that is, the number of phage particles per host cell required to generate productive infections. Theoretically this value could be 1, but very few of the phage that attach to the edge of a pilus will gain access to the inside of a cell, so Pfu values in the 100s are not uncommon.
Polycistronic RNA: The term cistron was coined to mean any section of an mRNA that gets translated. The RNA phage genomes are “polycistronic” because they encode multiple protein products.
Quasi-equivalent conformers: These are different conformations of the identical polypeptide sequence that form at the differing symmetry-related positions of viral capsids.
Translational repression: Refers to the process whereby a protein product interacts at a defined site within an mRNA molecule, preventing translation of the downstream cistron.
Triangulation number, T: Defined mathematically as $T = h^2 + hk + k^2$, where h and k are any pair of non-negative integers. For instance, (h, k) = (1, 0) gives T = 1 and (h, k) = (1, 1) gives T = 3. The h and k values of real viruses can be determined by analyzing them in terms of networks of equilateral triangles that cover the surface of a sphere. The details of this analysis can be found in the article by Caspar and Klug listed under further reading.
UTR: Untranslated region, found at the 5′ and 3′ ends of the phage RNA genome, where they form specific secondary structures recognized by the replication enzyme.
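
As a quick illustration of the triangulation-number formula given in the glossary, the following Python sketch evaluates T for small (h, k) pairs. The function names and the search limit are illustrative choices and are not part of the original definition.

```python
def triangulation_number(h: int, k: int) -> int:
    """Return T = h^2 + h*k + k^2 for non-negative integers h and k."""
    if h < 0 or k < 0:
        raise ValueError("h and k must be non-negative integers")
    return h * h + h * k + k * k


def reachable_t_numbers(limit: int) -> list:
    """List the distinct T values (T >= 1) obtained with 0 <= h, k <= limit."""
    values = {
        triangulation_number(h, k)
        for h in range(limit + 1)
        for k in range(limit + 1)
    }
    values.discard(0)  # h = k = 0 does not describe a capsid
    return sorted(values)


print(triangulation_number(1, 0))  # 1
print(triangulation_number(1, 1))  # 3
print(reachable_t_numbers(2))      # [1, 3, 4, 7, 12]
```

Note that `reachable_t_numbers` only scans pairs up to the chosen limit, so it can miss allowed values that need a larger h or k (for example, T = 9 requires h = 3).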

Icosahedral ssRNA Bacterial Viruses
Background
Some E. coli cells contain a separate genetic element, the F factor plasmid, in addition to their chromosomal DNA. Such F+ bacteria produce a pilus protein that self-assembles into a hair-like appendage, the pilus, on the outside of the cell that is an essential part of conjugative DNA transfer between bacterial cells. This primitive version of sexual genetic transfer is known as conjugation, and the term “male” is used for bacteria carrying the F factor. A group of bacterial viruses, the RNA phages, exploit this feature, binding to the sides of the pilus and transferring their ssRNA genomes into the bacterium as the pilus retracts, the latter being part of its natural function. Tim Loeb, working in Norton Zinder’s laboratory at the Rockefeller University in New York, was the first to realize that there might be such male-specific bacterial viruses (bacteriophages). Together they isolated and began to characterize such phages from an initial sample of raw sewage. One intensively studied version known as MS2 is believed to have been named because it was male-specific factor 2, although there are unconfirmed stories that the MS also stands for Metropolitan Sewer, the source of some of the samples they screened for phage! Similar phages, all now grouped as Leviviruses, were rapidly isolated world-wide after this initial discovery, and we now know that such phages are extremely common in the environment, with up to 10^7 Pfu/mL in sewage. Both human and animal hosts appear to harbor E. coli populations able to support such phages, which are thus termed coliphages. Virtually identical phages have been found that are specific to Acinetobacter (AP205) and other gram-negative bacteria. In such phages the infection process still occurs via pili, but these are distinct from the sex pili of E. coli. Phage PRR1, for instance, infects hosts via pili encoded by incompatibility type P plasmids, which are widely dispersed, allowing the phage to infect a range of different bacterial cells.

On the basis of immunological cross-reactivity and genome organization, the Leviviridae are classified into two genera. These have each been further subdivided into two groups. All the phages are members of either the genus Levivirus or the genus Allolevivirus, the latter being distinguished by having slightly longer genomes and by the presence of read-through products from the CP gene (Fig. 1). Groups I and II are typified by phages MS2 and GA, respectively, whilst typical Group III and IV phages are Qβ and SP, respectively. They have a single-copy, positive-sense, single-stranded genomic RNA (gRNA) of ~3–4.5 kb in length that encodes a maturation protein (MP, also called A or A2, depending on the phage), a coat protein subunit (CP), a replicase subunit and a lysis protein (Fig. 1). The two most studied are canonical genetic systems specific for the E. coli F pilus: MS2, the first organism to have its genome sequenced, and Qβ, which provided the genome and replicase for the early studies of RNA-dependent RNA replication and in vitro evolution. Until recently only ~30 Leviviridae genomes had been deposited in the NCBI database, more than half of which are F-specific phages resembling MS2 or Qβ in genome organization. However, orders of magnitude more novel ssRNA phages have been identified in meta-transcriptomes; these may infect more diverse groups of hosts and represent a vast, under-characterized “dark matter”.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21311-3

Fig. 1 Genetic maps and structural organization of ssRNA phages. (A) Color-coded genomic maps of the ssRNA phages Qβ, AP205, MS2, PP7, ɸCb5, and M, from top to bottom. The gene for the lysis protein (black) is the most variable among different ssRNA phages. Note that for Qβ, the lysis protein also acts as the maturation protein, and the phage has an additional minor coat protein due to read-through of the gene for the major coat protein (Read-through: green). (B) Cross-sectional view of an ssRNA phage. The front part of the capsid has been removed to show the interior of the virion, and the various components are labeled. (C) Electron micrograph of an E. coli cell with two pili, each of which is decorated with MS2 phage, i.e., the initial complex with the host receptor; see inset for details. The magnification is ~16,000× and reveals the relative sizes of the host and its virus particles. Pili occasionally contract toward their host, and it is believed that during this process the bound phage are internalized, but for MS2 only the MP-RNA complex gets into the cell. Once there, a protease cleaves the MP, allowing the RNA to act as an mRNA for production of phage proteins. (D) Life-cycle post genome entry, including translation to produce phage proteins, genomic RNA replication, self-assembly of progeny phage particles and subsequent lysis of the host. Reproduced from Paranchych, W., 1975. Attachment, ejection and penetration stages of the RNA phage infectious process. In: Zinder, N.D. (Ed.), RNA Phages. Cold Spring Harbor, New York: Cold Spring Harbor Press, pp. 85–112, with permission from Cold Spring Harbor Laboratory Press (NY). Watson, J.D., Hopkins, N.H., Roberts, J.W., Steitz, J.A., Weiner, A.M., 1993. Molecular Biology of the Gene: General Principle, vol. I. fifth ed. New Delhi: Pearson Education, Inc., with permission.

Phage lifecycle

Infectious phage must contain at least one copy of the MP as well as a CP shell to protect the genome (Fig. 1). Infection is initiated by the interaction of MP with the edge of a bacterial pilus (Fig. 1(C)), the first observation that this component is both structural and at least partially exposed to the surface. Loss of MP results in particles that are no longer infectious and renders part of the

Fig. 2 The Regulation of Phage Gene Expression. (A) The amounts of each of the encoded proteins produced during an infection cycle. Although each gene is present in equal amounts on every copy of the genome, there is a dramatic difference in the amounts of each protein in the cell, reflecting the requirements for different numbers of these protein products for replication, self-assembly of progeny phage and lysis of the host. The implication is that the expression of these protein products is tightly controlled. (B) RNA-RNA and Translational Repression. At times in the phage lifecycle the stem-loop structures (right) encompassing the start codon for the phage-encoded replicase subunit are sequestered by long-range Watson-Crick base-pairing interactions with a segment of the CP gene, the Min Jou sequence (left). Translation of the CP gene releases this sequence, allowing the expression of replicase. However, as the CP concentration rises, CP binds sequence-specifically to a cognate stem-loop, sequestering the replicase start codon and leading to translational repression. Reproduced from Van Duin, J., 1988. Single-stranded RNA bacteriophages. In: Calendar, R. (Ed.), The Bacteriophages, vol. 1. New York & London: Plenum Press, pp. 117–167, with permission.

genome RNase accessible, implying a second function, namely MP-RNA interaction. Careful work by Paranchych and others shows that infection occurs when just the MP-gRNA complex enters the cell, the CP subunits of the capsid being left outside. The precise mechanism by which the MP-RNA complex is internalized has not been worked out in any detail. Once inside, however, the MP undergoes a single proteolytic cleavage event which initiates the phage lifecycle in the cell (Fig. 1(D)). Expression of phage proteins is tightly controlled, both in the amounts and timing of the appearance of each protein product, reflecting the need for differing amounts of phage proteins despite their genes all being present equally on the same RNA (Fig. 2). An essential early gene product is the RNA-dependent RNA polymerase. This contains only one phage-encoded subunit, from the 3′ replicase gene, which forms an active complex with host protein S1 and elongation factors Tu and Ts. This enzyme copies the entire positive-sense strand into its negative-sense complement, and then repeats the process on the negative strand to create progeny genomes. The complementary strands are only base-paired in the region being copied, preventing formation of dsRNA. These events require the enzyme to recognize distinct signals in the 3′ UTRs of each strand. The positive-sense progeny produced serve as mRNAs for the production of phage proteins and are subsequently repurposed as assembly substrates during the formation of progeny phage. The replication enzyme from Qβ was used to establish in vitro RNA transcription. Subsequent evolution studies identified a non-viral replicator that outcompeted the genome as a replication substrate, revealing the compromises required to encompass the multiple functions needed to create an infectious particle. CP expression occurs from the start of infection and its level increases throughout an infection, reflecting the need to assemble large numbers (>1000) of progeny phage.
There seems to be little regulation of this expression other than the sequence and stability of the initiator hairpin structure that binds the ribosome. Upstream flanking sequences may also promote translation, perhaps by forming structures that favor binding of the ribosome to this site. The phage makes more MP than is strictly necessary but its level is tightly controlled. In part this regulation is due to poor translational initiation, the start codon in MS2 being GUG, not AUG. Additionally, the ribosome binding site of the MP gene may base-pair with an internal region of its gene, preventing translation. This creates a window for expression: as the RNA is replicated the MP start site is free to bind ribosomes, but these must compete with internal base-pairing once the full MP gene sequence is present. The replicase is made at a level of ~5 copies/genome and its levels are also tightly controlled. This regulation is achieved in two distinct ways. As with the MP gene, the replicase ribosome binding site can be sequestered in a long-range base-pairing interaction, this time with a section of the CP gene. Ribosomes translating the CP gene occupy this complementary sequence, known as the Min Jou sequence after its discoverer, freeing the region around the start of the replicase gene so that it can fold to form a ribosome binding site (Fig. 2(B)). However, replicase expression is further controlled via a translational repression complex that forms with the phage CP. A stem-loop translational operator hairpin, TR (Fig. 2(B)), can form that is bound sequence-specifically by a CP dimer in a concentration-dependent fashion. This operator encompasses the start codon for replicase, and formation of this complex effectively switches off replicase translation. The complex has become a paradigm for studies of sequence-specific RNA-protein recognition. It also appears to be the site of assembly initiation of the phage CP shell around the genome (see below).
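The concentration dependence of translational repression can be illustrated with a simple one-site binding model. The following is a minimal sketch under stated assumptions, not a fitted description of the phage system: the dissociation constant `KD_NM`, the sampled CP dimer concentrations and both function names are hypothetical, and a TR operator bound by a CP dimer is assumed to be completely silent.

```python
# Toy one-site binding model of CP-dimer repression of replicase translation.
# KD_NM is a hypothetical dissociation constant, not a measured value.
KD_NM = 10.0


def fraction_tr_bound(cp_dimer_nm: float, kd_nm: float = KD_NM) -> float:
    """Equilibrium fraction of TR operators occupied by a CP dimer."""
    if cp_dimer_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return cp_dimer_nm / (kd_nm + cp_dimer_nm)


def replicase_translation_level(cp_dimer_nm: float, kd_nm: float = KD_NM) -> float:
    """Relative replicase translation, assuming a bound TR is fully repressed."""
    return 1.0 - fraction_tr_bound(cp_dimer_nm, kd_nm)


if __name__ == "__main__":
    # Hypothetical CP dimer concentrations (nM) spanning early to late infection.
    for conc in (0.0, 1.0, 10.0, 100.0, 1000.0):
        level = replicase_translation_level(conc)
        print(f"[CP dimer] = {conc:7.1f} nM -> relative replicase translation {level:.2f}")
```

In this sketch replicase translation falls smoothly as CP accumulates, which is the qualitative behavior described above; the real system also involves the Min Jou interaction and ribosome competition, which the model ignores.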
Regulating phage gene expression is vital because of the importance of timing the switch in the viral life-cycle from replication to assembly of progeny phage. During that process the viral mRNAs are repurposed as assembly substrates. Lysis (cell death) is the ultimate result of the infection, ensuring release of progeny phage. In MS2, this is achieved by the action of a phage-encoded lysis protein expressed inefficiently from the +1 reading frame within the CP gene (Fig. 1). The lysis

Fig. 3 Structure & Assembly of RNA Phages. (A) Asymmetric cryo-EM reconstructions of MS2 and Qβ shown with the front half of their CP shells removed to reveal the density for the genome (light blue) and the MP (purple). The encapsidated CP dimer in Qβ is shown in green. (B) CP quasi-conformations required to assemble the T = 3 capsids. CP dimers are either symmetric (C/C conformation) or asymmetric (A/B conformations). These different conformers interdigitate at particle 3-fold axes (triangle), but 5-fold axes (pentagon) are surrounded only by B-type conformers. (C) Proposed model for assembly. Multiple TR-like stem-loops across the MS2 genome (red cylinders) trigger CP dimer conformational change in favor of the A/B conformer. Multiple such cognate RNA-CP interactions collectively make assembly highly efficient, initially collapsing the CP-free RNA conformation and promoting the assembly pathway illustrated.

protein is the last gene product to begin expression, consistent with its functional role in the life-cycle. The delayed production of this gene product again appears to be a consequence of translational control: the ribosome binding site that allows initiation of translation is inaccessible until the ribosome translating the CP gene terminates at a defined site. Experiments in which the CP gene stop codons are moved further 3′ prevent lysis protein expression, implying that correctly regulated expression is due to a defined RNA folding event similar to those controlling replicase translational repression. Variations in the details of this mechanism may partially account for the frequency of read-through products of the CPs in some phages.
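The idea that one stretch of RNA can encode two different proteins in overlapping reading frames can be made concrete with a short sketch. The fragment below is a hypothetical 14-nucleotide sequence chosen for illustration, not a piece of the MS2 genome, and the helper name `codons` is likewise an assumption.

```python
from typing import List


def codons(rna: str, frame: int = 0) -> List[str]:
    """Split an RNA string into complete codons, starting at offset `frame`."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    body = rna[frame:]
    usable = len(body) - len(body) % 3  # drop any incomplete trailing codon
    return [body[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical fragment, not taken from a real phage genome.
fragment = "AUGGCUUCUAACUU"

print(codons(fragment, 0))  # frame of the enclosing gene
print(codons(fragment, 1))  # +1 frame read from the same nucleotides
```

Shifting the start by a single nucleotide yields an entirely different codon series, which is why a gene embedded in the +1 frame of another can specify an unrelated protein.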

Phage Structure and Assembly

Simple spherical viruses like the RNA phages were important at the beginning of the study of molecular biology since they posed the important question of how a gene product could encapsidate the RNA that encoded it. The solution, of course, is that the RNA genome is used multiple times as an mRNA, producing many small CP subunits that subsequently associate to form a closed container with which the nucleic acid is co-assembled. The simplest form of such a container would be highly symmetrical, allowing each CP to make identical intermolecular contacts with its neighbors. The container with the highest symmetry is an icosahedron, implying that viral shells should contain 60 CP subunits. Caspar & Klug extended this idea to include higher CP stoichiometries that are multiples of 60, arguing that CPs can occupy quasi-equivalent symmetry-related positions in the capsid. This concept has been central to virology for over 50 years. It solves the problem of how viruses can encode a container large enough for their genetic material. Very many simple viral capsids correspond to this arrangement, in which icosahedral facets are sub-triangulated by multiple CP subunits whose numbers and arrangements can be predicted to derive a Triangulation Number (see Glossary) for a viral particle. The earliest viral structures, starting with Harrison’s pioneering structure of the T = 3 Tomato Bushy Stunt Virus, seemed to confirm this concept. They used symmetry-averaging to calculate atomic resolution electron density maps. However, whilst such structures produce high resolution maps of viral CP shells, major non-symmetrical components such as the viral genome and any minor structural proteins are not seen. Indeed, flexible sections of the CPs, which are often involved in contacts to the genome, are also lost.
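The stoichiometries allowed by quasi-equivalence can be enumerated with the standard Caspar-Klug relation T = h² + hk + k², which the text above does not spell out. The sketch below is illustrative only; the function names are assumptions, and the subunit count of 60T applies to an idealized shell of chemically identical subunits.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def subunits_in_shell(t_number: int) -> int:
    """An idealized icosahedral shell contains 60*T identical subunits."""
    return 60 * t_number


if __name__ == "__main__":
    for h, k in [(1, 0), (1, 1), (2, 0), (2, 1)]:
        t = triangulation_number(h, k)
        print(f"h={h}, k={k}: T={t}, subunits={subunits_in_shell(t)}")
```

For the T = 3 capsids discussed in this article, (h, k) = (1, 1) gives 180 CP subunits per idealized shell.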
This anomaly is being addressed by using modern cryo-electron microscopy (cryo-EM), which unlike crystallography can calculate atomic resolution reconstructions of virions without symmetry averaging, resulting in electron density maps that reveal the previously unseen asymmetric components. Asymmetric cryo-EM structures at close to atomic resolution are now available for both MS2 and Qβ (Fig. 3). They reveal shells of CP dimers that roughly occupy the lattice positions expected for the A/B and C/C dimers needed to create an idealized T = 3 particle (see above). However, this lattice is interrupted by the presence of a single copy of the MP, which replaces a CP dimer at a C/C position. Also visible is ~80% of the electron density for the encapsidated genomes. Instead of an esthetically pleasing icosahedrally-symmetric protein shell of CPs we can now see the viral molecular machinery poised for host attachment and infection. Given that the encapsidated genome appears well-ordered, a reasonable assumption is that it consists of only a single conformer. Protein-free RNA molecules of that size (3569 nt for MS2) would be expected to sample multiple conformational states in an ensemble of structures. Thus the structures of these particles raise an important question: how can such a quasi-symmetrical structure be assembled around a highly asymmetric genome, incorporating the MP as it does so? This problem has been

addressed using a variety of biophysical techniques. Mass spectrometry and NMR spectroscopy showed that the TR operator-CP dimer interaction biases the conformational preference of the CP dimer towards the A/B state. The sequence-specifically bound RNA acts as an allosteric effector for this change. Sixty A/B dimers are required to make the T = 3 capsid of the virion, but there is only one copy of the TR sequence within the genome. Single molecule fluorescence measurements of the hydrodynamic radius of dye-labeled genomes in assembly reactions, however, have revealed that the RNA must be compacted during encapsidation. This compaction is driven by multiple CP dimers binding to sites across the genome, consistent with experiments triggering assembly in vitro with stem-loops other than TR. The multiple genomic sites being bound have been termed Packaging Signals (PSs), and assembly using such sites is PS-mediated assembly. They act to make the assembly process highly efficient in the biological environment by restricting the molecular components from exploring all possible assembly pathways. The asymmetric structure of MS2 confirms these predictions about its assembly mechanism. There are >50 stem-loops that point at CP dimers and could act as PSs, and 15 of these remain bound by their CP partners in the virion. Not all the PS sites need to stay bound following assembly, since they are only required to act as allosteric effectors during the assembly process itself. Whilst this mechanism accounts for assembly of a symmetric T = 3 particle, it does not explain how the unique MP gets incorporated, creating the biologically functional asymmetric protein shell. Cryo-EM and earlier biochemistry reveal that MP makes a specific genome contact with an additional stem-loop adjacent to the 3′ end of the genome. This may occur before or shortly after translational repression of the replicase cistron by binding of a CP dimer at TR (see above).
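The protein bookkeeping implied by this description can be checked with a few lines of arithmetic. The sketch below assumes an idealized T = 3 shell of 180 CPs arranged as dimers, takes the figure of sixty A/B dimers from the text, derives the number of C/C dimers by subtraction, and then applies the observation that a single MP replaces one C/C dimer in MS2. The constant and function names are hypothetical.

```python
T_NUMBER = 3
CP_IDEAL = 60 * T_NUMBER                 # coat proteins in an idealized T = 3 shell
DIMERS_IDEAL = CP_IDEAL // 2             # all CPs are organized as dimers
AB_DIMERS = 60                           # figure stated in the text
CC_DIMERS = DIMERS_IDEAL - AB_DIMERS     # derived here by subtraction


def ms2_virion_counts(mp_copies: int = 1) -> dict:
    """CP counts when each MP copy replaces one C/C dimer (MS2-like case)."""
    if not 0 <= mp_copies <= CC_DIMERS:
        raise ValueError("mp_copies must lie between 0 and the number of C/C dimers")
    cc = CC_DIMERS - mp_copies
    dimers = AB_DIMERS + cc
    return {"mp": mp_copies, "ab_dimers": AB_DIMERS, "cc_dimers": cc,
            "dimers": dimers, "coat_proteins": 2 * dimers}


if __name__ == "__main__":
    print("idealized shell:", CP_IDEAL, "CPs in", DIMERS_IDEAL, "dimers")
    print("MS2-like virion:", ms2_virion_counts())
```

Under these assumptions an MS2 virion contains 89 CP dimers (178 CPs) plus one MP. The calculation does not apply to Qβ, where the displaced CP dimer appears to be retained inside the shell.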
A fascinating insight into the assembly mechanism, and potentially virion evolution, comes from the fact that in the infectious phage particle the CP dimer at TR, which includes the start codon of the replicase, is in contact with the MP bound to the untranslated region at the end of that gene. Thus both structural proteins act to sequester the replicase gene, preventing the gene from being translated and preventing any existing replicase from beginning to make the minus strand. The importance of this arrangement is highlighted by a variation seen in the structure of Qβ, where instead of replacing a CP dimer it appears that the MP may have forced the CP dimer to become internalized within the phage protein shell.

Biotechnological Applications

RNA phage biology has been exploited in many differing areas of biotechnology. The sequence-specific interaction between the phage CP dimer and its RNA translational operator has been used to monitor RNA-protein interactions and RNA/RNA-binding protein localization in cells using fluorescently modified components/antibodies. In addition, the ease with which virus-like particles (VLPs) can be assembled has led to their use to create bespoke VLPs carrying artificial cargoes, such as toxins or siRNAs, that can then be used therapeutically. Bacteriophage MS2 CP has become the first viral protein to have every one of its natural amino acids mutated, creating a detailed structure-function map. It is therefore very likely that the list of applications of this system will continue to expand.

Further Reading

Borodavka, A., Tuma, R., Stockley, P.G., 2012. Evidence that viral RNAs have evolved for efficient, two-stage packaging. PNAS 109, 15769–15774.
Caspar, D.L.D., Klug, A., 1962. Physical principles in the construction of regular viruses. Cold Spring Harbor Symposium Quantitative Biology 27, 1–24.
Cui, Z., Gorzelnik, K.V., Chang, J., et al., 2017. Structures of Qβ virions, virus-like particles, and the Qβ–MurA complex reveal internal coat proteins and the mechanism of host lysis. PNAS 114, 11697–11702.
Dai, X., Li, Z., Lai, M., et al., 2017. In situ structures of the genome and genome-delivery apparatus in a single-stranded RNA virus. Nature 541, 112–116.
Gorzelnik, K.V., Cui, Z., Reed, C.A., et al., 2016. Asymmetric cryo-EM structure of the canonical Allolevivirus Qβ reveals a single maturation protein and the genomic ssRNA in situ. PNAS 113, 11519–11524.
Meng, R., Jiang, M., Cui, Z., et al., 2019. Structural basis for the adsorption of a single-stranded RNA bacteriophage. Nature Communications 10, 3130.
Paranchych, W., 1975. Attachment, ejection and penetration stages of the RNA phage infectious process. In: Zinder, N.D. (Ed.), RNA Phages. Cold Spring Harbor, New York: Cold Spring Harbor Press, pp. 85–112.
Twarock, R., Stockley, P.G., 2019. RNA-mediated virus assembly: Mechanisms and consequences for viral evolution and therapy. Annual Review of Biophysics 48, 495–514.
Valegård, K., Murray, J.B., Stockley, P.G., Stonehouse, N.J., Liljas, L., 1994. Crystal structure of a bacteriophage-RNA coat protein-operator complex. Nature 371, 623–626.
Witherell, G.W., Gott, J.M., Uhlenbeck, O.C., 1991. Specific interaction between RNA phage coat proteins and RNA. Progress in Nucleic Acid Research and Molecular Biology 40, 185–220.

Enveloped Icosahedral Phages – Double-Stranded RNA (φ6)

Paul Gottlieb and Aleksandra Alimova, The City University of New York (CUNY), School of Medicine, The City College of New York, New York, NY, United States

© 2021 Elsevier Ltd. All rights reserved.

Nomenclature

ATP Adenosine triphosphate
Cryo-EM Cryo-electron microscopy
Da Dalton, 1.67 × 10⁻²⁴ g
dsRNA Double-stranded RNA
LPS Lipopolysaccharide
mRNA Messenger RNA
NC Nucleocapsid
NMR Nuclear magnetic resonance
NTP Nucleoside triphosphate
NTPase Nucleoside triphosphatase
OM Outer membrane
PC Procapsid
RdRP RNA-dependent RNA polymerase
RLPS Rough lipopolysaccharide
S20,w Sedimentation coefficient, standardized to conditions corresponding to pure water at 20°C and extrapolated to zero protein concentration; expressed in Svedberg units (S), where 1 S = 10⁻¹³ s
ssRNA Single-stranded RNA
T Triangulation number

Glossary

Capsid The protein shell of the virus containing the nucleic acid genome.
Carrier state The complete or incomplete viral genome maintained in the host cell as a stable episome without cell lysis.
Genomic precursor Viral mRNA that is subsequently packaged in the capsid.
Genomic segment One of the 3 double-stranded RNA molecules in the virus core.
Nucleocapsid Expanded, RNA-filled capsid, covered by the outer shell matrix and capable of transcription.
Packaging Insertion of the mRNA genome precursors into the virus core.
Polymerase complex The viral protein complex that contains the RNA polymerase and the dsRNA genome segments. The structure is capable of replication and transcription of the viral genome.
Procapsid The viral capsid prior to acquisition of the genome precursors.
Reverse genetics Rescue of an RNA virus genome from cDNA recombinant clones.
Self-assembly Mechanism by which protein precursors organize into a defined structure utilizing local interactions and without the intervention of external direction.

Introduction

The Cystoviridae Family

The cystoviruses (family Cystoviridae) are a unique group of viruses that have proven of great utility in the study of many facets of molecular virology. First discovered in 1973 as a bacteriophage of phytopathogenic pseudomonads, φ6 proved unique in that it contained lipid and possessed a genome consisting of three segments of double-stranded RNA (dsRNA). The replicative mechanism and structure make cystoviruses analogous to members of the Reoviridae family, so much of the research interest was driven by the aim of shedding light on analogous systems in the pathogenic reoviruses. Until 1999, φ6 had been the only member identified in the genus, and the extensive investigation of its genomic sequence, replication cycle, structure, and RNA packaging system formed the foundation of this field. Additional viruses have since been isolated that share the physical and structural characteristics of φ6. All are chloroform-sensitive, indicating that they contain lipids, which are derived from the host cell. Their genome consists of three dsRNA segments of different sizes denoted ‘S’, ‘M’ and ‘L’ for ‘small’, ‘medium’, and ‘large’, respectively. (Transcripts from the three genome segments are designated in lower case, i.e., ‘s’, ‘m’, and ‘l’, respectively.) The overall viral organization is shown schematically in Fig. 1. All the cystoviruses are virulent and infect strains of the plant pathogen Pseudomonas phaseolicola. Based on genetic sequence and host range analysis, the well-classified cystovirus bacteriophages were divided into two major groups in which φ7, φ9, φ10, and φ11 are closely related to φ6, while φ8, φ12, and φ13 are distantly related to φ6. Additional cystovirus family members have subsequently been isolated from environmental samples throughout the United States, Finland and China, indicating that the genus is extremely widespread.
After the synthesis and isolation of a recombinant φ6 procapsid (PC) in 1990, seminal studies of the structure, replication, genome packaging, and transcriptional regulation of φ6 were performed. This research established the first in vitro RNA packaging and replication assay for this class of viruses and subsequently utilized it to rescue infectious particles.

doi:10.1016/B978-0-12-809633-8.20945-X

Fig. 1 Schematic organization of the layered structure of the φ6 virion. The inner layer, the procapsid (PC), includes the shell, composed of 120 copies of protein P1 arranged as 60 non-symmetrical P1 dimers. Inside the shell are the P2 and P7 proteins, which are localized at the portal on the 5-fold axis of symmetry, and the three dsRNA segments, which are organized in low-symmetry quasi-concentric shells. The P4 proteins form a hexameric ring around the 5-fold axis. The NC comprises the PC and the protein P8 matrix layer, and the complete virion has an envelope derived from the cellular lipid bilayer membrane with randomly distributed P3, P6, P9, P10 and P11 proteins.

Classification and Host Range

φ6 was originally isolated from bean straw infested with Pseudomonas syringae pv. phaseolicola, strain HB10Y, and until 1999 was the lone type in the family Cystoviridae. The isolation of eight additional lipid-containing, three-segmented dsRNA bacteriophages expanded the group with additional types, some of which were closely related and others distantly related to φ6. The host range of cystoviruses is governed by the structure of the attachment apparatus and by the cell surface receptor it utilizes. For φ6 and its close relatives, host infection requires binding to type IV pili. Therefore the φ6 group has a host range limited to Pseudomonas syringae (and, for several laboratory-selected φ6 mutants, Pseudomonas pseudoalcaligenes). In species φ8, φ12, and φ13, the attachment apparatus is a heteromeric protein assembly that utilizes the rough lipopolysaccharide (RLPS) as a receptor. These types also have the ability to infect other gram-negative bacteria containing an RLPS, although without plaque formation. In 2010 the Cystoviridae family was again expanded with the isolation of φ2954 from radish leaves. The base sequences for many of the genes and for the segment termini were similar but not identical to those of bacteriophage φ12. However, the host specificity was for the type IV pili of Pseudomonas syringae HB10Y rather than for the RLPS to which φ12 attaches. Next, φNN was isolated from a freshwater habitat, demonstrating that the virus group is quite diverse and widely distributed. This phage utilizes the type IV pilus of Pseudomonas syringae for attachment, and phylogenetic trees of the genome segments indicate a close relationship to φ6. Of great interest was the 2016 isolation in China of φYY from hospital sewage, a cystovirus that uses the opportunistic pathogen Pseudomonas aeruginosa, strain PAO38, as a host.

Virion Structure and Properties

Physico-Chemical Properties

The virus particle molecular weight is approximately 9.9 × 10⁷ Da, the sedimentation coefficient S20,w is approximately 405 S (or 4.05 × 10⁻¹¹ s) and the density is 1.24 g cm⁻³ in sucrose. The molecular weight of the nucleocapsid (NC) is approximately 4.0 × 10⁷ Da. The composition is 70% (w/w) protein, 14% dsRNA, and 16% phospholipids.
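The unit conversions behind these figures can be verified directly. The following sketch uses only values quoted in this article (1 S = 10⁻¹³ s and 1 Da = 1.67 × 10⁻²⁴ g from the Nomenclature list, plus the virion mass and composition above); the function and variable names are hypothetical, and the per-component masses are simple products of the quoted percentages rather than independently measured quantities.

```python
SVEDBERG_IN_SECONDS = 1e-13   # 1 S = 10^-13 s
DALTON_IN_GRAMS = 1.67e-24    # value quoted in the Nomenclature list

VIRION_MASS_DA = 9.9e7
COMPOSITION = {"protein": 0.70, "dsRNA": 0.14, "phospholipid": 0.16}  # w/w fractions


def svedberg_to_seconds(s_value: float) -> float:
    """Convert a sedimentation coefficient from Svedberg units to seconds."""
    return s_value * SVEDBERG_IN_SECONDS


def daltons_to_grams(mass_da: float) -> float:
    """Convert a mass from daltons to grams."""
    return mass_da * DALTON_IN_GRAMS


def component_masses_da(total_da: float, fractions: dict) -> dict:
    """Mass of each component implied by its weight fraction."""
    return {name: total_da * frac for name, frac in fractions.items()}


if __name__ == "__main__":
    print(f"405 S = {svedberg_to_seconds(405):.3e} s")
    print(f"virion mass = {daltons_to_grams(VIRION_MASS_DA):.3e} g")
    for name, mass in component_masses_da(VIRION_MASS_DA, COMPOSITION).items():
        print(f"{name}: {mass:.3e} Da")
```

The first line of output reproduces the quoted equivalence between 405 S and 4.05 × 10⁻¹¹ s.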

Architecture of the Polymerase Complex and the Nucleocapsid

All the cystovirus members are enveloped bacteriophages of multilayered structure (Fig. 1). The first species isolated, φ6, is the best characterized in regard to its structure and assembly mechanisms. The inner layer, the PC, is composed of 60 asymmetric dimers of protein P1 that form a dodecahedral T = 1 (triangulation number) shell. At each of the twelve dodecahedron faces, protein P4 hexamers (the packaging nucleoside triphosphatase, NTPase), localized at the 5-fold axis, protrude from the PC. The P2 and P7 proteins, localized at the portal at the 5-fold axis of symmetry, are within the P1 shell. The φ6 NC outer shell has icosahedral symmetry and is formed of 200 trimers of P8, a small membrane-associated protein, in a T = 13 matrix that partially covers the PC. The NC is enveloped by a lipid membrane, with the lipids derived from the host cell cytoplasmic membrane and the viral envelope surface proteins having an asymmetric distribution. The individual species of the cystoviruses are not entirely isomorphic but are similar in overall design. The NC structures are similar in P1, P2, P4 and P7 organization, but the P8 layer is quite different among the species. In φ8 the NC cover is virtually non-existent, consisting of only 60 copies of P8. In contrast to φ6 and φ8, NC architecture based on cryo-electron microscopy single particle reconstruction placed φ12 in an intermediate category, showing an incomplete T = 13 P8 organization with significant vacancies occupied by protein P7 dimers.
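The statement that the φ6 P8 layer only partially covers the PC can be put in numbers by comparing it with a fully occupied lattice. The sketch below assumes that an idealized shell holds 60T subunits grouped as trimers; the figure of 200 trimers comes from the text, while the function name and the even division of the vacancies among the twelve 5-fold vertices are illustrative assumptions and not claims made by the article.

```python
def trimers_in_complete_shell(t_number: int) -> int:
    """Trimers in an idealized, fully occupied lattice of 60*T subunits."""
    return (60 * t_number) // 3


OBSERVED_P8_TRIMERS = 200   # phi6 value quoted in the text
FIVE_FOLD_VERTICES = 12

complete = trimers_in_complete_shell(13)
vacant = complete - OBSERVED_P8_TRIMERS

if __name__ == "__main__":
    print(f"complete T = 13 lattice: {complete} trimers")
    print(f"observed in phi6: {OBSERVED_P8_TRIMERS} trimers")
    print(f"vacant positions: {vacant} "
          f"({vacant / FIVE_FOLD_VERTICES:.0f} per vertex if shared equally)")
```

Under these assumptions roughly a quarter of the trimer positions of a complete T = 13 lattice are unoccupied in φ6.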

Fig. 2 The 3D structure of the cystovirus PC before RNA packaging (unexpanded) and after RNA packaging and replication (expanded), and the structure of the mature NC. Data reproduced from Huiskonen, J.T., de Haas, F., Bubeck, D., et al., 2006. Structure of the bacteriophage φ6 nucleocapsid suggests a mechanism for sequential RNA packaging. Structure 14, 1039–1048.

Cystoviruses package their plus-sense messenger RNA (mRNA) sequentially into a preformed PC in the order s, m and l, and the PC expands to accommodate the genome (Fig. 2). In φ6, packaging relies upon signals at the RNA 5′ end called “pac”, containing about 200 nucleotides (with extensive secondary structure), while specific sequences at the 3′ end are RNA replication signals. The PC expansion is facilitated by hinge-like movement of the P1 subunits approximately parallel to the two-fold edges of the dodecahedral PC. The expansion and packaging activate the P2 RNA-dependent RNA polymerase (RdRP), which replicates the mRNA to dsRNA. The expanded genome-filled PC is finally enclosed within the 200 P8 trimer matrix, completing the assembly of the mature NC. The packaged dsRNA genome segments appear to be located in lower-symmetry quasi-concentric shells, based upon electron cryo-microscopy reconstructions.
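The strict s, m, l packaging order can be pictured as a small state machine. The following is a toy model only: the class `Procapsid`, its methods and the rule that an out-of-order segment is simply rejected are all hypothetical simplifications, and the model ignores pac-sequence recognition, P4-driven translocation and the conformational changes of the PC.

```python
PACKAGING_ORDER = ("s", "m", "l")


class Procapsid:
    """Toy model of a procapsid that accepts segments only in the order s, m, l."""

    def __init__(self) -> None:
        self.packaged = []

    @property
    def is_full(self) -> bool:
        return len(self.packaged) == len(PACKAGING_ORDER)

    def package(self, segment: str) -> bool:
        """Try to package a segment; return True if it was accepted."""
        if self.is_full:
            return False
        expected = PACKAGING_ORDER[len(self.packaged)]
        if segment != expected:
            return False  # out-of-order segments are rejected in this toy model
        self.packaged.append(segment)
        return True


if __name__ == "__main__":
    pc = Procapsid()
    for seg in ("m", "s", "l", "m", "l"):
        print(f"offer {seg}: accepted={pc.package(seg)}, packaged={pc.packaged}")
    print("full:", pc.is_full)
```

In the example run the first offer of m and the early offer of l are rejected, and the particle becomes full only after s, m and l have been accepted in that order.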

The Protein Components of the Cystovirus Portal Complex and Nucleocapsid

The cystovirus RNA portal complexes, consisting of protein elements P2, P4, and P7, are located at each apex of the assembled PC. There is a potential total of twelve RNA portal complexes on the dodecahedral PC; however, not all the sites are occupied. The dynamic interaction of these three portal proteins governs the packaging, replication, and transcription of the viral RNA genome. This intricate molecular machine can be described in terms of its separate proteins, and what is currently understood about each individual component is reviewed below.

P1

The PC organizing protein is P1, and it forms the framework of the entire complex. P1 interacts with the three portal proteins to position them properly within the PC structure. The PC is built from 120 identical P1 proteins organized into a T = 1 shell containing 60 non-symmetric dimers, where each dimer is composed of subunits A and B of the same 3-dimensional fold. The P1 crystal structure has been determined for viruses φ6 and φ8 and has a trapezoidal shape that accommodates the significant conformational changes that occur during RNA packaging (Fig. 3(A)). Intriguingly, the respective research groups reached different conclusions about the evolutionary history of the φ6 and φ8 P1 proteins. The φ6 P1 was reported to show no similarity to the known folds of other dsRNA viruses. The φ6 P1 structure analysis therefore led to the suggestion that cystoviruses derived from a different lineage from other dsRNA viruses. The second interpretation, derived from the φ8 P1 structural determination, reasoned that the protein was indeed similar to the corresponding reovirus coat proteins. This conclusion was reached notwithstanding the fact that reoviruses do not undergo particle expansion when the segmented dsRNA genome is packaged, as their coat proteins are already highly elongated.

P2

The P2 of cystoviruses exhibits RNA-dependent RNA polymerase (RdRP) activity, catalyzing de novo initiation and elongation of replication and producing double-stranded RNA from single-stranded RNA templates. The crystal structure of the φ6 RdRP revealed that it is highly similar to that of Hepatitis C Virus (HCV), suggesting an evolutionary link between double-stranded RNA viruses and flaviviruses. The basic architecture demonstrated that the φ6 polymerase conforms to the hand-like model, in that the 664 amino acid residues of the monomeric protein form 25 α-helices and 21 β-strands whose 3-dimensional geometry is reminiscent of the fingers, palm, and thumb of a human hand (Fig. 3(B)). The key aspartate residues characteristic of RNA polymerases are positioned analogously to those of the HCV polymerase. Co-crystallization of the φ6 RdRP with oligonucleotides resembling the 3′ ends of the viral RNA templates showed that they bind within a tunnel lined with mostly basic amino acids that leads to an active site. A binding site in the substrate pore orders triphosphate moieties by their attachment to key basic residues. The observations suggested a mechanism whereby the incoming

Fig. 3 Structure of individual proteins and the assembly of φ6. The structure reconstructions are derived from cryo-electron microscopy and X-ray crystallography. (A) The φ6 P1 protein, showing the non-symmetrical dimers P1A (red) and P1B (blue) (PDB #). The conformational mismatch of superimposed P1A and P1B is shown in the lower part of the image. The symmetry mismatch of the trapezoidal isomers allows for significant conformational changes during packaging. (B) The φ6 P2 protein (PDB #1HHT, data were provided by Butcher, S.J., Grimes, J.M., Makeyev, E.V., et al., 2001. A mechanism for initiating RNA-dependent RNA polymerization. Nature 410, 235). The arrow indicates the RNA template tunnel entrance. The cut-through rendering of the 3D structure shows the internal organization of the RNA template tunnel and substrate pore. (C) The φ6 P4 protein (PDB #5MUV, data provided by Sun, Z., El Omari, K., Sun, X., et al., 2017. Double-stranded RNA virus outer shell assembly by bona fide domain-swapping. Nature Communications 8, 14814). A hexameric toroid built of six identical subunits is localized around the 5-fold axis of symmetry, which creates a symmetry mismatch. The top and side views are shown. The monomer of the φ6 P4 is shown with the N- and C-termini indicated. (D) The φ6 P7 protein localization within the PC, based on difference maps between PC mutants lacking different structural proteins. The central cut of the PC complex shows the localization of P2-P7 complexes in proximity to the 5-fold axis. P2 protein is shown in green, P7 in red. P2 density is blue and P4 toroids are in cyan. The magnified section shows P2 surrounded by three P7 monomers, and the insert shows a docked atomic model of P7. Data reproduced from Katz, G., Wei, H., Alimova, A., et al., 2012. Protein P7 of the cystovirus φ6 is located at the three-fold axis of the unexpanded procapsid. PLoS ONE 7 (10). (E) The average φ6 P8 trimers, rendered as solid isosurface or mesh. Green rods are α-helices. The top and side views are shown. Data reproduced from Huiskonen, J.T., de Haas, F., Bubeck, D., et al., 2006. Structure of the bacteriophage φ6 nucleocapsid suggests a mechanism for sequential RNA packaging. Structure 14, 1039–1048.

Enveloped Icosahedral Phages – Double-Stranded RNA (φ6)

double-stranded RNA is unwound and fed through a template site to an active site, while nucleotides enter through a substrate pore. P2 is found closer to the 3-fold axis of symmetry in the unfilled PC. Cryo-EM data strongly suggest that the RdRP moves to the 5-fold axis upon RNA packaging and PC expansion.

P4
The cystoviruses package their own mRNA into the PC, and the energy for this translocation is provided by the P4 NTPase. This protein is a hexamer and, when assembled on each of the twelve 5-fold axes of the PC, creates a symmetry mismatch (Fig. 3(C)). The first high-resolution X-ray structure was determined for the φ12 P4 and subsequently for φ6, φ8, and φ13. The six identical P4 subunits are arranged in a ring and resemble a domed toroid (Fig. 3(C)). Six ATP binding sites are located near the external perimeter of the ring at the interfaces between adjacent subunits. The quaternary structure of the P4 toroid is similar to that of other hexameric helicases, and the φ8 and φ13 proteins both have 5′–3′ RNA-helicase activity. The hexamer contains a central region through which the mRNA being packaged is translocated. In addition, each monomer has a loop that lies at the center of the hexamer structure. This loop is involved in RNA translocation and is known to bind NTPs. Only one P4-based portal is thought to sequentially package the three RNA segments. This could be the statistical consequence of a low occupancy of P4 on the unexpanded procapsid. The tenuous connection between the P4 C-terminus and P1 is likely stabilized once all three genome segments are packaged and replicated and the P8 matrix assembles on the mature NC. Thus the fully assembled NC contains all twelve P4 hexamers.

P7
The P7 protein is the least characterized of the PC proteins with regard to precise function, but in vivo studies demonstrate that it is essential for virus viability. P7 has been referred to as an accessory packaging factor, but it is conceivable that this protein plays a direct role in RNA translocation and RNA packaging. The φ12 P7 crystal structure was resolved at 1.83 angstroms for the N-terminal portion of the protein. P7 forms a novel α/β fold, and solution NMR relaxation rates, chemical crosslinking, and static light scattering studies show that it exists as a dimer in solution (Fig. 3(D)). P7 has a C-terminal tail that is extremely flexible, and in vitro assembly studies showed that incorporation of the protein stabilizes PC assembly and, in φ6, greatly enhances its rate. As the PC is filled with the viral genome it undergoes expansion and, as noted above, the hinge of this conformational change lies along the 2-fold axis, where P7 was postulated to exist as an elongated dimer at the interface between two P1 dimers. It was considered that the flexible C-terminal tail of P7 plays a role in inducing the P1 conformational change by acting as the molecular hinge. Yet the position and function of the P7 packaging factor remain controversial. Cryo-EM studies of the unfilled φ6 PC suggest that the P7 density overlaps the P2 density, implying mutual exclusion of the two proteins at an inner 3-fold axis. However, difference maps generated between PC reconstructions and mutants lacking either P2 or P7 strongly suggest that P7 stabilizes P2 at the inner 3-fold axis prior to RNA packaging. This implies that P2 and P7 can interact and that they are located near each other in the vicinity of the inner 3-fold axis prior to packaging. Single-particle reconstructions of the mature φ12 NC strongly suggest that the P7 dimer is subsequently located at the portal 5-fold axis of symmetry, surrounding the P4 NTPase hexamer.

P8
The addition of the P8 matrix to the genome-filled PC completes the viral core maturation process, and the arrangement of the P8 trimers determines the architecture of the NC. The NC matrix contains two types of holes, designated I and II, with 60 of each type based on the arrangement of the P8 trimers. In the φ6 T = 13 lattice there are four classes of interdigitating P8 trimers, termed Q, R, S, and T, with Q closest to the 5-fold axis of symmetry (Fig. 3(E)). The 5-fold axis of the P8 layer of the NC is occupied by the hexameric P4 packaging protein. In φ12 the T = 13 lattice is considered to be incomplete, in that the position of a missing P8 density is occupied by densities corresponding to P7 dimers that surround the P4 hexamer. In φ6 two states of the P8 trimer, open and closed, were identified. In shifting between the two states, the P8 trimers swap domains (or interdigitate) to stabilize the outer shell.

Membrane Envelope
The third, outermost, layer of the cystoviruses is a lipid bilayer. The virions are 20% lipid by weight, which is enough to cover approximately 50% of the particle surface; the remainder is occupied by the envelope proteins. The φ6 lipid bilayer contains phospholipids derived from the host plasma membrane and includes four virally encoded integral membrane proteins, P6, P9, P10, and P13. P3 forms an external spike that binds the host receptor and is anchored to the bilayer by P6. In the viruses that bind the host RLPS, the envelope also originates from the host cell membrane, but the surface attachment proteins differ substantially from those of φ6, reflecting the different host-cell attachment mechanism. Nevertheless, in all cases, P6 appears to anchor the P3 complex to the membrane and is also able to mediate membrane fusion, releasing the uncovered NC into the periplasmic space. P12 is a non-structural protein that mediates membrane acquisition during assembly of the φ6 viral particle and presumably does the same for other cystoviruses.


Fig. 4 The φ12 attachment P3 protein complex. The complex consists of six globular copies of P3a and three copies of P3c. Data reproduced from Leo-Macias, A., Katz, G., Wei, H., et al., 2011. The toroidal surface complexes of bacteriophage φ12 are responsible for host-cell attachment. Virology 414 (2), 103–109.

Host Cell Attachment Complex
The φ6 spike protein, P3, extends from the viral outer envelope, where it is anchored to the integral membrane protein P6. The P3-P6 complex constitutes the viral attachment apparatus that binds to type IV pili of the host cell to initiate infection; the pilus then retracts to bring the virus particle to the outer cell membrane (OM) (Fig. 5). The outer surfaces of φ8, φ12, and φ13 differ radically from that of φ6: φ8, φ12, and φ13 bind to a truncated LPS O chain to initiate host cell infection. The φ12 attachment protein complex has been analysed in detail and has a toroidal shape consisting of six globular domains with six-fold symmetry (Fig. 4). Each of the globular domains is likely composed of one copy of protein P3a and three copies of protein P3c. In all the species, once the virion is delivered to the host cell OM, the viral envelope fuses with the OM by means of the fusogenic protein P6.

Genome Organization and Sequence Similarity
The Cystoviridae have three dsRNA segments that are designated according to size as L (large, with 6374 base pairs in φ6), M (medium, with 4063 base pairs in φ6), and S (small, with 2948 base pairs in φ6) (Fig. 6). Each virion contains only one copy of each segment, and the genes are clustered into functional groups. Each genomic segment is transcribed into a polycistronic mRNA that is designated by the corresponding lower case letter. The genes encoding the PC are located on the L segment in the order gene 7, gene 2, gene 4, and gene 1. Gene 14 precedes gene 7, and the function of its protein product remains incompletely described. While in most members of the Cystoviridae the order of the L segment genes is 14, 7, 2, 4, and 1, in φ8 the order is 14, H, 2, 4, 1, and 7. The H region is composed of a pair of genes, Ha and Hb, of which the latter is required for temporal control of the l transcript (see below). In φ6 the production of the RNA polymerase P2 is downregulated to approximately 10% of the level of the other three PC proteins by translational coupling to the P7 gene, as gene 2 lacks a ribosome binding site. The medium segment carries genes 10, 6, 3, and 13. Genes 6 and 3 are in a polar relationship with each other, in that mutants lacking production of one suppress expression of the other. Gene 3 encodes a single polypeptide, P3, in φ6, whereas in φ12 gene 3 encodes at least two proteins, P3a and P3c. The presence of two ORFs encoding P3 is also observed in φ8 and φ13, which like φ12 recognize RLPS as a host-cell receptor. This observation suggests that the oligomeric P3 complex extending from the viral surface uniquely determines host cell specificity (see below). For each virus member, each of the three segments has a short, highly conserved sequence at the 5′-end (i.e., designated from the plus sense strand of the dsRNA). The highly conserved sequence is followed by a unique packaging sequence that identifies each segment and ensures that it is packaged in the PC. In 2018, a detailed phylogenetic analysis of each of the genomic segments of the characterized cystovirus types was conducted and the degree of relatedness established. The analysis allowed the construction of phylogenetic trees showing the relationships among each of the three genomic segments for the characterized virus types. Notably, the phages φ6 and φNN were found to have significant nucleotide sequence similarity in the S and L segments, and the proteins encoded are nearly identical. Greater diversity was seen in the M segment, possibly reflecting host range adaptation due to distant habitats. Overall, a taxonomic description of the Cystoviridae was achieved, and with regard to genome characteristics (such as GC content, gene organization, and gene synteny) there is great similarity. It was proposed that, based upon a criterion of 95% nucleotide sequence identity for the demarcation of species in the Cystoviridae family, phages φ8, φ12, φ13, φ2954, φNN, and φYY can be classified as distinct species.
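The segment sizes and the 95% demarcation criterion quoted above can be illustrated with a small worked example. The following Python sketch sums the φ6 segment lengths given in the text and applies a 95% identity cut-off to a pair of pre-aligned sequences. The function names, the treatment of gap columns as mismatches, and the choice to treat identity at or above the cut-off as "same species" are illustrative assumptions; the text does not state how identity was computed in the 2018 analysis.

```python
# Segment lengths in base pairs for phi6, as quoted in the text.
PHI6_SEGMENTS_BP = {"L": 6374, "M": 4063, "S": 2948}

# Species demarcation criterion quoted in the text (percent nucleotide identity).
SPECIES_IDENTITY_CUTOFF = 95.0


def total_genome_bp(segments):
    """Return the summed length of all genome segments."""
    return sum(segments.values())


def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned sequences of equal length.

    Assumption: columns containing a gap ('-') count as mismatches.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned, equal in length, and non-empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def same_species(aligned_a, aligned_b, cutoff=SPECIES_IDENTITY_CUTOFF):
    """Assumption: identity at or above the cutoff means 'same species'."""
    return percent_identity(aligned_a, aligned_b) >= cutoff


print(total_genome_bp(PHI6_SEGMENTS_BP))  # 13385
```

With the lengths quoted for φ6, the three segments add up to 13,385 base pairs.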

Replication Cycle
The entire cystovirus replication cycle is represented schematically in Fig. 5. While the attachment of the viral P3 to the host cell (Fig. 5(A) and (B)) is via either the pili or the RLPS (depending on the specific cystovirus), the remaining steps are for the most part the same in all the classified types.

Plasma Membrane Penetration and NC Entrance into the Cytoplasm
The integral membrane protein P6 facilitates the fusion of the viral envelope with the host OM. The NC is then able to enter the periplasmic space, and the peptidoglycan layer is digested by the endopeptidase protein P5, which is loosely associated with the NC (Fig. 5(C)). NC penetration is mediated by the P8 lattice, which interacts with the host cytoplasmic membrane, resulting in containment of the NC in an intracellular vesicle (Fig. 5(D)). The viral core is released into the host cytoplasm, and disassembly of the P8 lattice initiates viral genome transcription (Fig. 5(E)).

Fig. 5 The φ6 replication cycle (details of each step are in the text). (A, B) P3 attachment to the pilus, followed by pilus retraction; (C) viral envelope fusion with the OM; (D) φ6 NC entry into an intracellular vesicle; (E) viral core release into the cytoplasm, P8 disassembly and initiation of transcription; (F) procapsid self-assembly; (G) packaging initiation in the order +s, +m, +l and PC expansion; (H) replication of the + strand RNA to the dsRNA genome; (I) P8 matrix loosely assembled around the PC; (J) incorporation of the cell-derived lipid bilayer membrane into the virion structure, followed by cell lysis and virion release.

Fig. 6 The three RNA segments of φ6 and the genes encoding the proteins.

Transcription
Once within the host cell cytoplasm, the NC transcribes the three dsRNA genomic segments to produce the message sense l, m, and s strands that are expelled from the particle (Fig. 5(E)). Transcription functions by a semiconservative strand displacement mechanism in which the nascent plus strand displaces and replaces the original plus strand, which is released from the NC. In φ6, transcription initially produces equimolar concentrations of all three segments. However, translation of the l transcript is favored, which allows assembly of new unfilled PC particles early in the infection. Later in the replication cycle, the l transcription level is reduced to 5%–10% of the initial concentration and translation of m and s predominates. The mechanism that regulates the temporal control of cystovirus transcription is incompletely understood but has been partially elucidated for φ6 and φ8. In φ6 the 5′ 18-base sequences of the three plus sense strands are identical, with the exception that m and s begin with GG while l begins with GU. The GU acts as the signal that controls the altered transcription profile of the l message RNA. Isolated PC particles containing dsRNA and primed for transcription were found to contain a host cell-derived protein, YajQ, that influenced temporal control of transcription by increasing l production. YajQ does not directly activate the P2 polymerase but was found to bind the PC framework protein P1. Phage φ2954 has adapted to utilize the host cell protein glutaredoxin 3 (GrxC) for the activation of L segment transcription, and mutations that resulted in GrxC-independent transcription were found in the gene for protein P1. In φ2954 (and φ12) the 5′ end of the L segment begins with ACAA as an identifier, while the S and M segments begin with GCAA (GAA in φ12). In φ8 the level of l transcription is independently controlled, without the need for a host cell protein. The three φ8 5′ plus sense RNA ends have short consensus sequences, in that only the first seven nucleotides are identical. Yet, in spite of the absence of a 5′ L sequence identifier, late in the infection cycle l message transcription is considerably lower than that of the m and s messages. The product of gene Hb, located at the 5′ end of the L segment (see above), acts with host cell RNase R to degrade excess l transcript, bringing it down to the concentration required for the late infection phase.
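The 5′ identifiers described above amount to a simple prefix rule, which the following Python sketch makes explicit. It is an illustration only: the lookup table contains just the prefixes quoted in the text for φ6, φ2954, and φ12, while the function name, the label "m/s", and the toy sequences in the usage lines are hypothetical and are not real genome sequences.

```python
# 5' plus-strand identifiers quoted in the text; "m/s" means the text gives
# one shared identifier for the m and s transcripts.
FIVE_PRIME_IDENTIFIERS = {
    "phi6": {"l": "GU", "m/s": "GG"},
    "phi2954": {"l": "ACAA", "m/s": "GCAA"},
    "phi12": {"l": "ACAA", "m/s": "GAA"},
}


def classify_transcript(virus, plus_strand):
    """Return 'l', 'm/s', or 'unknown' from the 5' end of a plus-strand sequence."""
    # Accept DNA-style input (T) as well as RNA (U), in either case.
    seq = plus_strand.upper().replace("T", "U")
    for label, prefix in FIVE_PRIME_IDENTIFIERS[virus].items():
        if seq.startswith(prefix):
            return label
    return "unknown"


# Toy sequences (placeholders, not real genome data).
print(classify_transcript("phi6", "GUAAAA"))     # l
print(classify_transcript("phi2954", "GCAAUU"))  # m/s
```

φ8 is deliberately absent from the table, since the text states that its three plus strands share their first seven nucleotides and carry no L-specific identifier.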

PC Assembly and Packaging
The φ6 replicative mechanism starts with the assembly of the empty PC (Fig. 5(F)). This is followed by the specific selection and packaging of the three segments of viral mRNA within the PC. The dodecahedral PC (Fig. 5(G) and (H)) contains, at each of its twelve portals, the RNA packaging and replication apparatus that constitutes the polymerase complex. The polymerase complex consists of P2, P4 and, based on cryo-electron microscopy reconstruction of the φ12 virus, P7. The viral mRNA strands are packaged in a specific size order and are replicated to the double-stranded form by the P2 RdRP only after packaging is completed (Fig. 5(H)). The P2 RdRPs are located at each of the 12 portals and appear to undergo a position shift from the 3-fold axis to the 5-fold axis after PC expansion. The PC undergoes a specific and sequential expansion during RNA packaging that selectively exposes the RNA binding sites for each of the three RNA segments. The noncoding sequences at the genome 3′ ends direct RNA replication and are discussed below.
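The sequential packaging described above can be pictured as a small state machine. The Python sketch below is a deliberately simplified model, not a description of the molecular mechanism: it assumes the s, m, l order given in the Fig. 5(G) caption, treats "binding site exposed" as a strict gate, and allows replication only once all three segments are inside. All function names are hypothetical.

```python
# Packaging order of the plus strands as given in the Fig. 5(G) caption.
PACKAGING_ORDER = ("s", "m", "l")


def next_segment(packaged):
    """Return the segment whose binding site is exposed next, or None when full."""
    if tuple(packaged) != PACKAGING_ORDER[:len(packaged)]:
        raise ValueError(f"segments packaged out of order: {packaged}")
    if len(packaged) == len(PACKAGING_ORDER):
        return None
    return PACKAGING_ORDER[len(packaged)]


def package(packaged, segment):
    """Package one segment, accepting it only if its binding site is exposed."""
    if segment != next_segment(packaged):
        raise ValueError(f"binding site for {segment!r} is not exposed yet")
    return list(packaged) + [segment]


def replication_can_start(packaged):
    """Minus-strand synthesis begins only after all three segments are inside."""
    return tuple(packaged) == PACKAGING_ORDER


state = []
for seg in PACKAGING_ORDER:
    state = package(state, seg)
print(state, replication_can_start(state))  # ['s', 'm', 'l'] True
```

The strict gate is a modelling choice for wild-type particles; it does not attempt to represent mutants with an altered packaging order.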

NC maturation
After packaging and RNA replication, a matrix composed of 200 trimers of protein P8 is assembled around the filled PC (Fig. 5(I)). The P8 matrix is an open structure that allows the passage of water, cations, and other small molecules. The PC, P8, and genome together form the NC.
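The trimer count can be related to the T = 13 lattice with a few lines of arithmetic, sketched here in Python. The 60 × T subunit count is the standard result for a complete icosahedral lattice; T = 13 and the figure of 200 trimers come from the text. Attributing the shortfall to the twelve 5-fold vertices is an interpretation consistent with the statement that the 5-fold axis of the P8 layer is occupied by P4, not a number given in the text. Variable names are illustrative.

```python
def subunits_in_complete_lattice(t_number):
    """A complete icosahedral lattice with triangulation number T has 60 * T subunits."""
    return 60 * t_number


P8_T_NUMBER = 13           # from the text
P8_TRIMERS_OBSERVED = 200  # from the text
FIVE_FOLD_VERTICES = 12    # an icosahedron has twelve 5-fold vertices

# Number of P8 monomers in the matrix.
p8_monomers = 3 * P8_TRIMERS_OBSERVED

# Trimers a complete T = 13 lattice would hold, and how many positions are vacant.
complete_trimers = subunits_in_complete_lattice(P8_T_NUMBER) // 3
vacant_trimers = complete_trimers - P8_TRIMERS_OBSERVED

# Assumption: the vacancies are shared equally among the 5-fold vertices.
vacant_per_vertex = vacant_trimers // FIVE_FOLD_VERTICES

print(p8_monomers, complete_trimers, vacant_trimers, vacant_per_vertex)  # 600 260 60 5
```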

Membrane acquisition and host cell lysis
A lipid bilayer envelope, derived from the Pseudomonas host cell, is assembled around the NC (Fig. 5(J)). The mechanism of viral envelope assembly is not entirely understood. It involves the non-structural P12 protein, which is not present in the completely assembled virion. The envelope contains four membrane proteins, P6, P9, P10, and P13, within a lipid bilayer. Based on cryo-electron microscopy of the φ12 virus, protein P5 is likely positioned between the NC and the envelope. P3 proteins form external spikes on the envelope, whose function is to bind to the host cell receptor. After the entire virus is assembled, host cell lysis releases approximately 150–200 virus particles. This process utilizes two lytic proteins: the P5 murein peptidase works in conjunction with the holin lytic factor P10, which destabilizes the host envelope.

Recombination and Reassortment
RNA strand recombination in cystoviruses is due to template switching and is not overly dependent on sequence similarity at the crossover point. Recombination occurs during minus strand synthesis on a plus sense template and is dependent on the conformation of the RNA 3′ end. The replication recognition signals of the plus sense RNA consist of a secondary structure composed of stem-loops near the 3′ ends, and removing these structures facilitates recombination. After deletion of the RNA 3′ structure, replication of the genome segment can only be restored if the template is rescued by recombination. In addition to genetic exchange by recombination, reassortment of the three genome segments occurs among the cystovirus species. This mechanism can occur naturally or in the laboratory setting. Investigation of the rates of reassortment in both natural and laboratory types allows analysis of viral evolution in multi-segmented RNA viruses. Experimentally, genetic exchanges have been visualized as alterations in whole-virus electrophoretic mobility, providing a useful selectable phenotype in studies of viral evolution and fitness.
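The combinatorics of reassortment for a three-segment genome can be enumerated directly. The Python sketch below lists every way of drawing the L, M, and S segments from co-infecting parents and separates the parental genotypes from the reassortants. It assumes that every combination is possible and makes no statement about viability or frequency, which the text does not quantify; the parent labels "A" and "B" are hypothetical.

```python
from itertools import product

SEGMENTS = ("L", "M", "S")


def segment_combinations(parents):
    """All ways of drawing each segment from one of the co-infecting parents."""
    return [dict(zip(SEGMENTS, combo))
            for combo in product(parents, repeat=len(SEGMENTS))]


def reassortants(parents):
    """Combinations that mix segments from more than one parent."""
    return [g for g in segment_combinations(parents) if len(set(g.values())) > 1]


genotypes = segment_combinations(["A", "B"])
print(len(genotypes), len(reassortants(["A", "B"])))  # 8 6
```

For two parents there are 2^3 = 8 segment combinations, of which two are parental and six are reassortants.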


Carrier State
Under certain conditions, a φ6 infection can produce a carrier state in which the virus replicates but does not cause lysis of the host cells; the phenomenon is often seen in viruses with mutant lytic factors. In the carrier state, the phage does not need to reinfect cells and can dispense with select genes. Essentially, the viral genome replicates as an episomal element in the host cell cytoplasm. The carrier state can be extremely stable, for instance when drug resistance selection is maintained. For example, a kanamycin resistance gene incorporated into the M segment, with host cell selection for the antibiotic, allowed the loss of the S segment. This carrier mutant clearly had an altered packaging order, dispensing with s, which is normally the first viral mRNA packaged. These mutants had an altered P1 protein, suggesting that RNA interaction and packaging order depend on the PC framework protein.

Reverse Genetics and Self Assembly
The advent of reverse genetics for both φ6 and φ8 provided the first example of recombinant genetic methods for segmented dsRNA viruses. Initially, the φ6 rescue system required the packaging of in vitro synthesized transcripts into a PC assembled in E. coli, to which the P8 matrix was added. This particle was then transfected into pseudomonad spheroplasts and viable recombinant progeny were recovered. This system demonstrated the feasibility of producing viable engineered segmented dsRNA virus particles; however, the in vitro assembly was not efficient. Subsequent methods utilized electroporation of φ6 and φ8 cDNA plasmids carrying the three genomic segments, which were introduced directly into pseudomonad hosts. The host cells expressed either SP6 or T7 RNA polymerase, which directed transcription of the recombinant viral ssRNA. The latter methodology has contributed to the understanding of the precise nature of the RNA packaging and replication signals and of the mechanisms for RNA segment recognition by the PC. The in vitro self-assembly of complete and partial PC particles has also been achieved by condensation of the purified protein components. The particles were found to be capable of packaging the viral ssRNA via NTP hydrolysis, replicating it to dsRNA, and subsequently transcribing nascent ssRNA. The self-assembly assay has facilitated the study of PC assembly kinetics and of the precise protein stoichiometry in the particles.

Applications and Intellectual Property
Since the time of its discovery, φ6 has been the subject of intellectual property applications. The observation that the phage contained dsRNA stimulated the filing and award of a patent in 1974 for the preparation of high yields of dsRNA for interferon production. In 2006 a patent was awarded for the rescue of recombinant φ6 that carried cDNA for the expression of therapeutic genes encoding vaccine antigens, catalytic or antisense RNAs, and immunoregulatory proteins. The claims of the invention suggested that the isolated recombinant phage NC could be induced to enter eukaryotic tissue, where the bioreactive gene would be expressed by a cap-independent translation enhancer after transcription by the φ6 P2 RdRP. While these two applications have not been brought to practice, recombinant φ6 P2 has been made commercially available. The capacity of the RdRP for primer-independent replication and transcription has made it a useful tool for the synthesis of large RNA molecules and for RNA-based sequencing.

Further Reading
Díaz-Muñoz, S.L., Tenaillon, O., Goldhill, D., et al., 2013. Electrophoretic mobility confirms reassortment bias among geographic isolates of segmented RNA phages. BMC Evolutionary Biology 13, 206.
El Omari, K., Sutton, G., Ravantti, J.J., et al., 2013. Plate tectonics of virus shell assembly and reorganization in phage φ8, a distant relative of mammalian reoviruses. Structure 21 (8), 1384–1395.
Ford, B.E., Sun, B., Carpino, J., et al., 2014. Frequency and fitness consequences of bacteriophage φ6 host range mutations. PLoS One 9 (11).
Gottlieb, P., Wei, H., Potgieter, C., Toporovsky, I., 2002. Characterization of φ12, a bacteriophage related to φ6: Nucleotide sequence of the small and middle double-stranded RNA. Virology 293 (1), 118–124.
Gottlieb, P., Wei, H., Potgieter, C., Toporovsky, I., 2002. Characterization of φ12, a bacteriophage related to φ6: Nucleotide sequence of the large double-stranded RNA. Virology 295 (2), 266–271.
Katz, G., Wei, H., Alimova, A., et al., 2012. Protein P7 of the cystovirus φ6 is located at the three-fold axis of the unexpanded procapsid. PLoS One 7 (10).
Leo-Macias, A., Katz, G., Wei, H., et al., 2011. Toroidal surface complexes of bacteriophage φ12 are responsible for host-cell attachment. Virology 414 (2), 103–109.
Mäntynen, S., Sundberg, L.R., Poranen, M.M., 2018. Recognition of six additional cystoviruses: Pseudomonas virus φ6 is no longer the sole species of the family Cystoviridae. Archives of Virology 163 (4), 1117–1124.
McDonald, S.M., Nelson, M.I., Turner, P.E., Patton, J.T., 2016. Reassortment in segmented RNA viruses: Mechanisms and outcomes. Nature Reviews Microbiology 14 (7), 448–460.
Mindich, L., 2012. Packaging in dsRNA viruses. Advances in Experimental Medicine and Biology 726, 601–608.
Nemecek, D., Boura, E., Wu, W., et al., 2013. Subunit folds and maturation pathway of a dsRNA virus capsid. Structure 21 (8), 1374–1383.
Nemecek, D., Qiao, J., Mindich, L., Steven, A.C., Heymann, J.B., 2012. Packaging accessory protein P7 and polymerase P2 have mutually occluding binding sites inside the bacteriophage φ6 procapsid. Journal of Virology 86 (21), 11616–11624.
Qiao, J., Mindich, L., 2013. The template specificity of bacteriophage φ6 RNA polymerase. Journal of Virology 87 (18), 10190–10194.
Sun, Y., Qiao, X., Qiao, J., Onodera, S., Mindich, L., 2003. Unique properties of the inner core of bacteriophage φ8, a virus with a segmented dsRNA genome. Virology 308 (2), 354–361.
Vidaver, A.K., Koski, R.K., Van Etten, J.L., 1973. Bacteriophage φ6: A lipid-containing virus of Pseudomonas phaseolicola. Journal of Virology 11 (5), 799–805.

Relevant Websites
https://talk.ictvonline.org/ictv-reports/ictv_online_report/dsrna-viruses/w/cystoviridae Cystoviridae.
https://viralzone.expasy.org/750 dsRNA virion.
https://emedicine.medscape.com/article/227348-overview Medscape (Reoviruses).
http://www.virology.net/Big_Virology The Big Picture Book of Viruses.
http://www.marinespecies.org/aphia.php?p=taxdetails&id=600254 WoRMS - World Register of Marine Species.

Membrane-Containing Icosahedral DNA Bacteriophages
Roman Tuma, University of Leeds, Leeds, United Kingdom and University of South Bohemia, České Budějovice, Czech Republic
Sarah J Butcher, University of Helsinki, Helsinki, Finland
Hanna M Oksanen, Molecular and Integrative Biosciences Research Program, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
dsDNA: Double-stranded DNA
ssDNA: Single-stranded DNA
T: Triangulation number

Glossary
DNA packaging: Energy-requiring process where the empty virus capsid is filled by the virus genome.
Protein-primed DNA replication: Duplication of a linear DNA genome utilizing a protein covalently linked to the DNA terminus to initiate the reaction.
Temperate: Bacteriophage can either lyse the cell or enter a resting state called lysogeny, staying dormant within the cell until induced into lytic cycle.
Triangulation number, T number (T): A parameter defined for icosahedral viruses that describes the geometrical arrangement of the protein subunits within the capsid shell.
Virulent: Bacteriophages lyse the cell, releasing new viral progeny.

Introduction
The icosahedral bacterial viruses with an internal membrane can have double-stranded (ds) or single-stranded (ss) DNA genomes. Currently these viruses are classified into four families: Tectiviridae, Corticoviridae, Matsushitaviridae (previously in Sphaerolipoviridae), and Finnlakeviridae (Table 1). In addition, there are virus isolates not yet assigned to any family and a newly proposed family, Autolykiviridae. The dsDNA genomes are either linear or circular, and the only known ssDNA phage with an internal membrane has a circular genome. The group encompasses both virulent and temperate viruses; their virions range from 49 nm to 80 nm in diameter and lack a tail but may exhibit a special vertex containing the DNA packaging machinery. Tectiviruses form the most numerous group and are classified into three genera: Alphatectivirus, Betatectivirus, and Gammatectivirus (a fourth genus, Deltatectivirus, has been newly proposed) (Fig. 1). Members of the genus Alphatectivirus (PRD1 and PR4) are virulent phages of Gram-negative hosts (e.g., Pseudomonas and Salmonella). These viruses, along with four extremely similar phages (PR3, PR5, PR772, and L17), have linear dsDNA genomes. Their sequence similarity is between 91.9% and 99.8%, which is surprising since they have been isolated from different parts of the world. The viruses of the genus Betatectivirus are temperate phages (AP50, Bam35, GIL16, and Wip1) replicating in Gram-positive Bacillus species. The third genus, Gammatectivirus, has a single member, GC1, the first temperate tectivirus with a Gram-negative host (Gluconobacter). The type virus of the Tectiviridae is PRD1, which infects a wide variety of Gram-negative bacteria. Its host range is limited to bacteria that contain a conjugative antibiotic resistance plasmid, since it utilizes the plasmid-encoded cell surface DNA-transfer complex as a receptor. Bam35 serves as a model virus for phages infecting Bacillus species.
The length of the genomes (about 15 kbp) and the order of the genes are conserved in all the tectiviruses, but there is no significant sequence similarity between the two major groups (Alphatectivirus and Betatectivirus). The genomes of the tectiviruses encode about 35 proteins, as exemplified by the PRD1 genes (Table 2). The type virus of the family Corticoviridae is PM2. PM2 was isolated in 1968 from seawater off the coast of Chile along with its host, the marine bacterium Pseudoalteromonas espejiana. PM2 was the first membrane-containing bacteriophage isolated. PM2 has a negatively supercoiled, circular dsDNA genome of about 10 kbp. It encodes about 17 proteins (Table 3). Recently, Pseudoalteromonas phage Cr39582 was induced from Pseudoalteromonas sp. strain Cr6751 and assigned to the family Corticoviridae. PM2 and Cr39582 have 85% nucleotide identity, but their spike protein genes are non-homologous. The entry and DNA-packaging mechanisms of the corticoviruses differ from those of tectiviruses, which have linear dsDNA genomes. Thermus bacteriophages P23-77 and IN93 (species Hukuchivirus P23-77 and Hukuchivirus IN93) are members of the family Matsushitaviridae and were recently moved from the family Sphaerolipoviridae (Table 1). Flavobacterium phage FLiP, with a circular ssDNA genome, has recently been assigned to the new family Finnlakeviridae. In addition, the isolates Salisaeta phage SSIP-1 and Helicobacter pylori phage KHP30, with circular dsDNA genomes, have yet to be assigned to families. These viruses share no significant sequence homology with any previously identified virus group. A group of 18 marine, virulent, non-tailed viruses with linear dsDNA genomes of 10 kbp, infecting members of the Vibrionaceae, was recently identified, and the family Autolykiviridae was proposed after phylogenetic analyses (Table 1). The proposed Autolykiviridae members have 21%–25% amino acid identity to PM2 in their major capsid protein, but the protein-primed DNA polymerase has 36%–37%

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00044-8

Table 1 Icosahedral internal membrane-containing DNA bacteriophages with complete genome sequence

Genome | Taxonomy, family | Taxonomy, virus species | Putative members
dsDNA, linear | Tectiviridae | Pseudomonas virus PRD1; Pseudomonas virus PR4; Bacillus virus AP50; Bacillus virus Bam35; Bacillus virus GIL16; Bacillus virus Wip1; Gluconobacter virus GC1 | Enterobacteria phage PR3, PR5, L17, and PR772; Thermus phage P37-14; Bacillus phage phiNS11, GIL01, and GIL16; Rhodococcus phage Toil; Streptomyces phage WheeHeim and Forthebois
dsDNA, linear | Proposed Autolykiviridae | | Vibrio phage 1.008.O, 1.040.O, 1.011.O, 1.069.O, 1.062.O, 1.125.O, 2.092.O, 1.020.O, 1.080.O, 1.141.A, 1.043.O, 1.044.O, 1.057.O, 1.095.O, 1.107.A, 1.102.O, 1.048.O, 1.249.A
dsDNA, circular | Corticoviridae | Pseudoalteromonas virus PM2; Pseudoalteromonas virus Cr39582 |
dsDNA, circular | Proposed Matsushitaviridae | Hukuchi virus P23-77; Hukuchi virus IN93 |
dsDNA, circular | Unassigned | | Salisaeta phage SSIP-1
dsDNA, circular | Unassigned | | Helicobacter pylori phage KHP30
ssDNA, circular | Finnlakeviridae | Flavobacterium virus FLiP | Flavobacterium phage FLiP

Fig. 1 Genomes of tectiviruses PRD1, Bam35, and GC1 representing Alphatectivirus, Betatectivirus, and Gammatectivirus genera, respectively. Similarity between translated nucleotide sequences is shown. Abbreviations and color codes: TP, terminal protein; DNAP, family B DNA polymerase (red); MCP, major capsid protein (blue); str. protein, structural protein; SSB, single-stranded DNA-binding protein; DBP, DNA-binding protein; muramidase (green), packaging ATPase (orange), transglycosylase (green). GC1 ORFs shared with alphatectiviruses are highlighted in light yellow. Gray ORFs do not have homologs in either of the other two genomes. Reprinted from Philippe, C., Krupovic, M., Jaomanjaka, F., et al., 2018. Bacteriophage GC1, a novel tectivirus infecting Gluconobacter cerinus, an acetic acid bacterium associated with wine-making. Viruses 10 (1), 39, doi:10.3390/v10010039. Copyright © 2018 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license http://creativecommons.org/licenses/by/4.0/.

amino acid identity to members of the Tectiviridae. Large numbers of viruses with double jelly-roll major capsid protein have been detected in the genomes of diverse major bacterial and archaeal phyla, and in marine water column and sediment metagenomes, indicating that the diversity of this virus group greatly exceeds the diversity that is currently recognized (Fig. 2; Table 1). The most studied model system in this group of viruses is bacteriophage PRD1, which is used here as an example to illustrate some of the most common characteristics of this diverse virus group.

Virion Structure and Properties

Overall Structure

The virion of PRD1 is an icosahedrally symmetric particle approximately 65 nm in diameter (Fig. 3). It is composed of about 70% (mass fraction) protein, 15% lipid, and 15% DNA and has a mass of about 66 MDa. The structure of the virion has been studied

Table 2 PRD1 genes, corresponding proteins, and protein functions

Gene     Protein  Mass (kDa)  Description [a]
I        P1       63.3        DNA polymerase (N)
II       P2       63.7        Receptor binding protein (S)
III      P3       43.1        Major capsid protein (C)
V        P5       34.2        Trimeric spike protein (S)
VI       P6       17.6        Unique vertex protein, DNA packaging accessory protein (C, P)
VII      P7       27.1        DNA delivery, holin, transglycosylase (L, M)
VIII     P8       29.5        Genome terminal protein (N)
IX       P9       25.8        Unique vertex protein, DNA packaging ATPase (C, P)
X        P10      20.6        Assembly (A, N)
XI       P11      22.2        DNA delivery (M)
XII      P12      16.6        ssDNA binding protein (N)
XIV      P14      15.0        DNA delivery (M)
XV       P15      17.3        Endolysin (L)
XVI      P16      12.6        Infectivity (M)
XVII     P17      9.5         Assembly (A, N)
XVIII    P18      9.8         DNA delivery (M)
XIX      P19      10.5        ssDNA binding protein (N)
XX       P20      4.7         Unique vertex protein (M, P)
XXII     P22      5.5         Unique vertex protein (M, P)
XXX      P30      9.0         Minor capsid protein (C)
XXXI     P31      13.7        Pentameric penton protein, base of spike (S)
XXXII    P32      5.4         DNA delivery (M)
XXXIII   P33      7.5         Assembly (A, N)
XXXIV    P34      6.7         (M)
XXXV     P35      12.8        Holin (L)
XXXVI    P36      13.2        Accessory lysis protein, spanin (L)
XXXVII   P37      9.9         Accessory lysis protein, spanin (L)

[a] (N) non-structural early protein; (M) integral membrane protein; (S) spike complex protein; (A) assembly protein; (P) packaging protein; (C) capsid protein; (L) lysis protein.

Table 3 PM2 genes, corresponding proteins, and protein functions

Gene     Protein  Mass (kDa)  Description
I        P1       37.5        Spike protein
II       P2       30.2        Major capsid protein
III      P3       10.8        Major membrane protein
IV       P4       4.4         Membrane protein
V        P5       17.9        Membrane protein
VI       P6       14.3        Major membrane protein
VII      P7       3.6         Membrane protein
VIII     P8       7.3         Membrane protein
IX       P9       24.7        Potential packaging ATPase
X        P10      29.0        Membrane protein
XII      P12      73.4        Replication initiation protein
XIII     P13      7.2         Transcription factor
XIV      P14      11.0        Transcription factor
XV       P15      18.1        Transcription factor
XVI      P16      10.3        Transcription factor
XVII     P17      6.0         Lysis
XVIII    P18      5.7         Lysis

extensively using cryo-electron microscopy and X-ray crystallography (Fig. 3(a)). X-ray crystallography results have indicated the roles of four proteins in controlling virus assembly. There are 240 hexagonally shaped trimers of the major capsid protein, P3 (Fig. 4(a)), occupying the surface of the capsid on a pseudo T = 25 lattice (Fig. 3(a)), an arrangement that is also found in the human adenovirus capsid. The capsid protein fold and its arrangement in the shell belong to the vertical β-barrel structural lineage (Fig. 4(c)), which is common among viruses with an internal membrane: not only bacterial viruses but also archaeal (STIV, STIV2) and eukaryotic (PBCV-1) viruses have one double β-barrel MCP. In addition, phages (P23-77, family Matsushitaviridae) and archaeal viruses (SH1, HCIV-1, HHIV-2, family Sphaerolipoviridae) have two major capsid proteins with a vertical single β-barrel fold and thus also belong to the vertical β-barrel structural lineage. In PRD1, a dimer of the linear glue protein P30 extends

Fig. 2 Diversity of viruses exhibiting the double jelly-roll (DJR) major capsid protein. (a) Phylogeny of bacterial and archaeal DJR virus capsid proteins. Group numbers are assigned to each branch for reference, colored blocks indicate hosts, and black circles on branches indicate approximate likelihood-ratio test branch support ≥0.9. (b) Element gene diagrams from each group show that prophage host genome neighborhoods and metagenome contigs often contain additional genes common to DJR elements (G+, Gram-positive hosts; G−, Gram-negative hosts). Reprinted with permission from Kauffman, K., Hussain, F.A., Yang, J., et al., 2018. A major lineage of non-tailed dsDNA viruses as unrecognized killers of marine bacteria. Nature 554, 118–122. © 2018 Springer Nature.

from one vertex to the next, cementing the P3 facets together (Fig. 4(b)) and determining the overall size of the shell. Eleven of the vertices are composed of the pentameric penton protein P31, which interlocks with P3 and the transmembrane protein P16, forming the base for the spike complex (Fig. 3(b)). The receptor-binding vertices have two proteins attached to the penton: one is a trimer of P5 (Fig. 5(a)) attached by its N-terminus, the other is a monomer of P2 (Fig. 5(b)), the receptor-binding protein. Both P5 and P2 are elongated molecules (P5 is 17 nm long, P2 is 15.5 nm long). P2 is a club-shaped molecule with a pseudo β-propeller head and a long tail formed from an extended β-sheet. The head, lying distal to the virus, is proposed to be the site of receptor binding. The twelfth PRD1 vertex is the unique or special vertex used for genome packaging and DNA delivery (Figs. 3(b) and 6). At the unique vertex, the capsid and the internal membrane are interconnected by a hexamer composed of heterodimers of proteins P20 and P22, which serves as a membrane conduit for the genome and as a nucleation site for assembly of the special vertex. The outer part of the unique vertex is formed by a complex of the packaging ATPase P9 and the packaging efficiency factor P6 (Fig. 6(c)). When packaging starts, protein P9 is recruited to the procapsid, and the PRD1 genome with the terminal protein P8 is packaged into the procapsid, powered by ATP hydrolysis. In contrast to the packaging ATPases of tailed dsDNA bacteriophages, the packaging ATPase P9 is part of the mature virion.

The overall size and structure of phage Bam35 are very similar to those of PRD1. However, the exact counterparts of many of the PRD1 structural proteins have not yet been clearly identified. Two of the major differences between PRD1 and Bam35 are, first, the presence of a large transmembrane protein complex in Bam35 that modulates the curvature of the membrane under the capsid facets and, second, the receptor-binding proteins of the attachment vertices.

The corticovirus PM2 particle is icosahedrally symmetric and measures 57 nm in diameter from vertex to vertex. The mass of the virion is ~45 MDa, and it is composed of protein (72%), lipid (14%), and DNA (14%). The capsid is composed of 200 trimers

Fig. 3 (a) Cryo-electron microscopy reconstruction of PRD1 virion. One facet of the T = 25 capsid is delineated together with position of the symmetry axes. Pentons are red, peripentonal hexons are yellow, three-fold hexon clusters are blue. (b) Schematics of protein, lipid and DNA location within the PRD1 virion. Reprinted from Peralta, B., et al., 2013. Mechanism of Membranous Tunneling Nanotube Formation in Viral Genome Delivery. PLOS Biology 11 (9), e1001667. © 2018 by the authors. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license http://creativecommons.org/licenses/by/4.0/.

Fig. 4 (a) Structure of the PRD1 capsid P3 protein trimer with a double β-barrel fold (blue), resulting in a hexameric shape, based on X-ray crystallography. (b) The virion structure revealed a dimer of the cementing protein P30 (also known as the tape measure protein) lying underneath the capsid shell along the twofold positions. The icosahedron represents the viral membrane. Dimers of P30 stabilize the virion structure. Reprinted from Bamford and Butcher (2008) Encyclopedia of Virology (Third Edition) Icosahedral dsDNA Bacterial Viruses with an Internal Membrane. pp 1–6. © 2008 Elsevier. (c) Structural clade tree for the vertical double barrel lineage. Reprinted from Abrescia, N.G., Grimes, J.M., Kivela, H.M., et al., 2008. Insights into virus evolution and membrane biogenesis from the structure of the marine lipid-containing bacteriophage PM2. Molecular Cell 31, 749–761. © 2008 Elsevier.

of the MCP P2 arranged on a pseudo T = 21 lattice. Pentameric receptor-binding spikes protrude from the vertices (Fig. 5(c)). The receptor-binding domain of the spike is structurally related to the PRD1 P5 head domain (Fig. 5(a) and (c)).
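The trimer counts quoted for PRD1 (240) and PM2 (200) follow from the lattice geometry. The sketch below is illustrative rather than taken from the source. It uses the standard Caspar-Klug relations, T = h² + hk + k² with 12 five-fold vertices and 10(T − 1) six-fold positions, and assumes that each six-fold position is filled by one trimer of a double β-barrel MCP. The function name and the returned field names are hypothetical.

```python
def capsomer_counts(h: int, k: int) -> dict:
    """Capsomer bookkeeping for an icosahedral lattice with indices (h, k)."""
    t = h * h + h * k + k * k   # triangulation number T
    pentons = 12                # one penton per icosahedral vertex
    hexons = 10 * (t - 1)       # six-fold (pseudo-hexameric) positions
    return {
        "T": t,
        "pentons": pentons,
        "hexons": hexons,
        # Assumption: every hexon is a trimer of a double beta-barrel MCP,
        # so the MCP copy number is three per hexon.
        "mcp_copies": 3 * hexons,
    }


prd1 = capsomer_counts(5, 0)   # pseudo T = 25
pm2 = capsomer_counts(4, 1)    # pseudo T = 21
print("PRD1:", prd1)
print("PM2: ", pm2)
```

Under these assumptions the hexon counts for the (5, 0) and (4, 1) lattices agree with the 240 and 200 MCP trimers stated in the text.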

Membrane and DNA

In PRD1, about half of the virion proteins are associated with the membrane (Table 2). The lipid headgroups are predominantly phosphatidylethanolamine (53%) and phosphatidylglycerol (43%), with 4% cardiolipin. The membrane is well ordered, following the icosahedral outline of the capsid. Many interactions occur between the membrane, the capsid, and the DNA. The average separation of the concentric layers of DNA is approximately 2.5 nm, similar to that found in other bacteriophages and animal viruses. Removal of the capsid and spike proteins by heat or guanidinium hydrochloride treatment results in aggregation of the membrane vesicle.

In PM2, the membrane, lying underneath the capsid, follows the shape of the capsid as in PRD1, and there are many interactions mediated by the major membrane proteins P3 and P6 (Table 3). The lipid composition is approximately 64% phosphatidylglycerol, 27% phosphatidylethanolamine, and 8% neutral lipids, with a small amount of acyl phosphatidylglycerol. Release of the capsid and spike proteins from the virion by freeze–thawing or by chelation of calcium ions with ethylene glycol tetraacetic acid (EGTA) results in a soluble vesicle called the lipid core.

Fig. 5 Structure of the vertex proteins determined by X-ray crystallography. (a) The trimeric PRD1 spike protein P5. (b) The monomeric PRD1 receptor-binding protein P2. Reprinted from Bamford and Butcher (2008) Encyclopedia of Virology (Third Edition) Icosahedral dsDNA Bacterial Viruses with an Internal Membrane. pp. 1–6. © 2008 Elsevier. (c) Pentameric vertex structure of PM2 virion. Reprinted from Abrescia, N.G., Grimes, J.M., Kivela, H.M., et al., 2008. Insights into virus evolution and membrane biogenesis from the structure of the marine lipid-containing bacteriophage PM2. Molecular Cell 31, 749–761. © 2008 Elsevier.

Fig. 6 Structure of the PRD1 packaging vertex obtained by asymmetric cryo-EM reconstruction. (a) View along the five-fold axis. (b) Cross-section through the packaging vertex and internal DNA density. (c) Schematics of protein localization within the vertex density with the packaging ATPase P9 capping the outside of the vertex. Reprinted from Hong, C., Oksanen, H.M., Liu, X., et al., 2014. A structural model of the genome packaging process in a membrane-containing double stranded DNA virus. PLoS Biology 12 (12), e1002024. © 2014 by the authors. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license http://creativecommons.org/licenses/by/4.0/.

Life Cycle

PRD1 is a virulent phage that exploits the transcription functions of its host. The host cell is selected by specific recognition of a plasmid-encoded cell surface receptor whose primary function is in bacterial conjugation. PRD1 belongs to the class of broad-host-range, donor-specific phages, which infect cells harboring one of the IncP-, IncN-, or IncW-type multiple-drug resistance

conjugative plasmids. Among the hosts are several opportunistic human pathogens such as Escherichia coli, Salmonella enterica, and Pseudomonas aeruginosa. After adsorption, the genome is injected into the host cell cytosol with the help of the inner membrane, leaving the protein capsid outside. After replication of the phage genome and production of the phage components, both virus- and host-encoded factors assist in particle assembly. Host cell lysis releases some 500 progeny viruses. Bam35 is a temperate phage that can either follow a lytic life cycle like PRD1 or exist in a dormant state within the host (lysogenic life cycle). In contrast to PRD1, the host range of Bam35 and its relatives GIL01 and GIL16 is limited to one species, Bacillus thuringiensis. AP50 infects Bacillus anthracis, and a defective phage replicating as a linear plasmid (pBClin15) has been described for Bacillus cereus. The corticovirus PM2, phage P23-77 of Thermus thermophilus, phage SSIP-1 of Salisaeta species, and the Vibrio phages are virulent viruses. Thermus phage IN93 and Streptomyces phages WheeHeim and Forthebois are temperate phages. Of all the DNA phages with an internal membrane, the life cycle of PRD1 is best understood and is described below.

Receptor Recognition and DNA Delivery

A single phage structural protein, P2, is responsible for PRD1 attachment to its host. Each of the eleven receptor-recognition vertices is a metastable structure and is possibly capable of recognizing the receptor. After binding to the receptor, the particle most probably re-orients (possibly guided by secondary binders, e.g., protein P5) so that the unique vertex is perpendicular to the cell surface. The association of P2 with the receptor activates the injection process, possibly through removal of the receptor-binding complex. This leads to irreversible binding. Both empty and DNA-containing particles bind equally tightly to cells, indicating that DNA injection is not a prerequisite for this tight interaction. Isolation and analysis of PRD1 mutants have identified eight phage-specific structural proteins essential for infectivity and DNA delivery. In addition to the spike complex proteins (P2, P5, and P31, of which P2 in particular is needed for adsorption), protein P11 starts the DNA delivery process, and the membrane proteins P14, P18, and P32 are involved in its later stages. Mutant particles lacking protein P7 (a lytic transglycosylase) are infectious, but the DNA entry process is delayed. The PRD1 membrane undergoes a structural transformation from a spherical vesicle to a tubular form, which penetrates the host cell envelope and acts as a conduit for DNA delivery. The DNA ejection tube protrudes from the unique vertex, the same vertex used for DNA packaging. The packaged genome (DNA–P8 complex) has no role in triggering or forming the DNA delivery tube, since procapsids devoid of the packaged genome can also form tubes. Similar tube formation is typical of all members of the Tectiviridae and is also seen in the Vibrio phages, but it has not been recognized in other isolates of internal membrane-containing DNA phages.

Genome Replication

The genome of PRD1 is a linear dsDNA molecule with a terminal protein (P8) covalently attached to each 5′ terminus and with 110 bp terminal repeats. PRD1 replicates its DNA by a protein-primed mechanism, as do other viruses with linear dsDNA genomes, including adenovirus and the φ29-type phages. PRD1 DNA replication starts with the formation of a covalent bond between the genome terminal protein, P8, and the 5′-terminal nucleotide, dGMP, in a reaction catalyzed by the phage DNA polymerase (protein P1). The minimal origin of replication resides in the first 20 terminal base pairs of both genome ends, and the fourth base from the 3′ end of the template directs, by base complementarity, the linking of the deoxyribonucleoside monophosphate (dNMP) to the terminal protein. The 3′-end DNA sequence is maintained by sliding back of the polymerase complex. After initiation, the same DNA polymerase elongates the initiation complex, producing full-length daughter DNA molecules. Two phage-encoded ssDNA-binding proteins, P12 and P19, are involved in replication in vivo.
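Why sliding back preserves the genome end can be seen in a deliberately simplified toy model. The sketch below is an illustration under stated assumptions, not a description of the enzymology. The template sequence is hypothetical, the function names are invented for this example, and sliding back is modeled as a single jump to template position 1, whereas the real process may proceed stepwise. Initiation is placed at the fourth template base, as in the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Base-by-base complement (no reversal)."""
    return "".join(COMPLEMENT[base] for base in seq)


def protein_primed_copy(template_3to5: str, init_pos: int = 4,
                        slide_back: bool = True) -> str:
    """Daughter strand (written 5'->3', terminal-protein end first).

    template_3to5 is the template read from its 3' end, so daughter
    position i pairs with template position i.
    """
    # The dNMP linked to the terminal protein is templated by an internal base.
    primer_nt = COMPLEMENT[template_3to5[init_pos - 1]]
    if slide_back:
        # Toy assumption: the TP-dNMP complex jumps back to template position 1,
        # which only works if it can still base-pair there.
        if COMPLEMENT[template_3to5[0]] != primer_nt:
            raise ValueError("priming nucleotide cannot pair at position 1")
        return primer_nt + complement(template_3to5[1:])
    # Without sliding back, elongation simply continues from the initiation site.
    return primer_nt + complement(template_3to5[init_pos:])


# Hypothetical template whose first four bases are identical.
template = "CCCCATGCAGT"
full = protein_primed_copy(template, slide_back=True)
short = protein_primed_copy(template, slide_back=False)
print(full, len(full))
print(short, len(short))
```

In this model the daughter strand made with sliding back is a complete complement of the template, whereas the strand made without it lacks the bases that precede the initiation site. This is the sense in which sliding back maintains the 3′-end sequence, and it also shows why the mechanism relies on a short run of identical bases at the template end.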

Particle Assembly

Approximately 15 min post infection, the major capsid protein P3 and the spike complex proteins P2, P5, and P31 are found soluble in the host cell cytosol, whereas the phage-encoded membrane proteins (e.g., P7, P11, P14, and P18) are associated with or inserted into the host cell cytoplasmic membrane (CM). Correct folding and assembly of several viral proteins depend on the host GroEL/ES chaperonins. During assembly, a virus-specific patch of the host CM is pinched off by the forming procapsid with the aid of the membrane-bound scaffolding protein P10. In addition, two small phage-encoded non-structural proteins, P17 and P33, function in the assembly process, possibly in association with the host GroEL/GroES chaperonin complex. Correct assembly results in a procapsid: an empty capsid enclosing a membrane rich in phage-specific proteins. The linear dsDNA genome is packaged into the procapsid by the packaging ATPase P9. Unlike the packaging ATPases of most other icosahedral dsDNA bacteriophages, P9 is part of the mature virus structure. It resides at a single vertex that also contains proteins P6, P20, and P22. P6 is a soluble protein needed for efficient DNA packaging, and P20 and P22 are integral membrane proteins connecting the portal structure to the viral membrane.

Cell Lysis

At the end of the infection cycle, the newly synthesized progeny virions are released by host cell lysis. A two-component holin–endolysin system operates in phage PRD1. The endolysin, P15, is a soluble β-1,4-N-acetylmuramidase that degrades the peptidoglycan of the Gram-negative cell, causing host cell lysis. The holin, P35, is a helper factor that makes pores in the inner membrane to give the endolysin access to the peptidoglycan in the cell wall, thereby controlling the timing of lysis. In addition, PRD1 has the accessory lysis proteins P36 and P37, which function as a pair (spanins) and are interchangeable analogs of Rz/Rz1 of bacteriophage lambda. Spanins act after cell-wall degradation, aggregating and disrupting the outer membrane.

Genomes and Genomics

The length of the PRD1 genome is 14,927 bp, and the guanine–cytosine (GC) content is 48.1%. It has 110 bp-long inverted terminal repeat (ITR) sequences at the ends, which are 100% identical. The majority of the PRD1 ORF products have been identified, and currently there are 27 known genes (Table 2). The genomes of the other tectiviruses infecting Gram-negative bacteria are similar, varying between 14,935 and 14,954 bp, and exhibit remarkably conserved nucleotide sequences, the overall identity being 91.9%–99.8%. The Bam35 genome is 14,935 bp long with a GC content of 39.7%. Its ITRs are 74 bp long with only 81% identity between the ends. PRD1, Bam35, and GC1 (representatives of the three genera of Tectiviridae) do not share much sequence similarity (Fig. 1). The discovery of the almost invariant genomes within the tectivirus groups contrasts sharply with the situation observed for the tailed dsDNA bacteriophages. The nucleotide identity is 100% between Bam35 and the related GIL01 and 83% between GIL01 and GIL16. The PRD1 genome is organized into two early and three late operons. Although there are no major sequence similarities between the PRD1, Bam35, and GC1 genomes, common genes for the DNA polymerase, packaging ATPase, muramidase, transglycosylase lytic enzyme, and major capsid protein can be recognized by the corresponding conserved amino acid sequence motifs in the databases or identified by comparing the N-terminal amino acid sequence of the major virion protein to the DNA sequence. Typically, the genes responsible for host cell recognition (the spike protein) are the most diverse, even between the most closely related viruses.

The circular PM2 genome is 10,079 bp long with a GC content of 42.2%. It contains 21 putative genes, of which 17 have so far been shown to be functional (Table 3). Promoter mapping by primer extension has revealed three operons, which are expressed in a temporally ordered fashion during infection. The first early operon is highly similar to the maintenance region of the Pseudoalteromonas plasmid pAS28. The second early operon contains genes for DNA replication and regulation of late phage functions. The PM2 genome replicates via a rolling-circle mechanism. Protein P12 has conserved sequence motifs common to superfamily I replication initiation proteins. This superfamily consists of the A proteins of certain bacteriophages, such as φX174 and G4, and the initiation proteins of cyanobacterial and archaeal plasmids. The late operon is activated by two phage-encoded transcription factors, P13 and P14. P14 has sequence similarity to the TFIIS-type general eukaryotic transcription factors, most closely resembling those of the archaeal organisms Thermococcus celer and Sulfolobus acidocaldarius. The structural proteins, encoded by the late genes, are similar to the corresponding proteins of tectiviruses in being either membrane associated or soluble. Based on the conserved amino acid sequence motifs deduced from the nucleotide sequence, one of the structural virion proteins is a putative packaging ATPase. The packaging of the circular supercoiled PM2 DNA is coupled with the formation of the particle.
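The genome lengths given here can be cross-checked against the virion masses and DNA mass fractions quoted in the Virion Structure and Properties section. The sketch below is a rough consistency check, not a measurement. It assumes an average mass of about 650 Da per base pair of dsDNA, a common rule of thumb that is not stated in the source, and it ignores the small contribution of the terminal proteins. The dictionary layout and function name are hypothetical.

```python
DA_PER_BP = 650.0  # assumed average mass of one dsDNA base pair, in daltons

# Values quoted in this article: virion mass (MDa), DNA mass fraction, genome length (bp).
virions = {
    "PRD1": {"mass_mda": 66.0, "dna_fraction": 0.15, "genome_bp": 14_927},
    "PM2": {"mass_mda": 45.0, "dna_fraction": 0.14, "genome_bp": 10_079},
}


def dna_mass_estimates(virion: dict) -> tuple:
    """Return (DNA mass from mass fraction, DNA mass from genome length), both in MDa."""
    from_fraction = virion["mass_mda"] * virion["dna_fraction"]
    from_length = virion["genome_bp"] * DA_PER_BP / 1e6
    return from_fraction, from_length


for name, data in virions.items():
    by_fraction, by_length = dna_mass_estimates(data)
    print(f"{name}: {by_fraction:.2f} MDa from composition, "
          f"{by_length:.2f} MDa from genome length")
```

With this assumed per-base-pair mass, the two estimates agree to within a few percent for both viruses, which suggests that the quoted compositions and genome sizes are mutually consistent.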

Acknowledgment

Supported in part by ERDF CZ.02.1.01/0.0/0.0/15_003/0000441 to R.T.

Further Reading

Aalto, A.P., Bitto, D., Ravantti, J.J., et al., 2012. Snapshot of virus evolution in hypersaline environments from the characterization of a membrane-containing Salisaeta icosahedral phage 1. Proceedings of the National Academy of Sciences of the United States of America 109, 7079–7084. doi:10.1073/pnas.1120174109.
Abrescia, N.G., Grimes, J.M., Kivela, H.M., et al., 2008. Insights into virus evolution and membrane biogenesis from the structure of the marine lipid-containing bacteriophage PM2. Molecular Cell 31, 749–761.
Abrescia, N.G., Cockburn, J.J., Grimes, J.M., et al., 2004. Insights into assembly from structural analysis of bacteriophage PRD1. Nature 432, 68–74.
Atanasova, N.S., Sencilo, A., Pietilä, M.K., et al., 2015. Comparison of lipid-containing bacterial and archaeal viruses. Advances in Virus Research 92, 1–61.
Azinas, S., Bano, F., Torca, I., et al., 2018. Membrane-containing virus particles exhibit the mechanics of a composite material for genome protection. Nanoscale 10, 7769–7779.
Cockburn, J.J., Abrescia, N.G., Grimes, J.M., et al., 2004. Membrane structure and interactions with protein and DNA in bacteriophage PRD1. Nature 432, 122–125.
Demina, T.A., Pietilä, M.K., Svirskaite, J., et al., 2017. HCIV-1 and other tailless icosahedral internal membrane-containing viruses of the family Sphaerolipoviridae. Viruses 9, 32.
Gill, J.J., Wang, B., Sestak, E., Young, R., Chu, K.H., 2018. Characterization of a novel tectivirus phage Toil and its potential as an agent for biolipid extraction. Scientific Reports 8, 1062.
Hong, C., Oksanen, H.M., Liu, X., et al., 2014. A structural model of the genome packaging process in a membrane-containing double stranded DNA virus. PLOS Biology 12, e1002024.
Kauffman, K.M., Hussain, F.A., Yang, J., et al., 2018. A major lineage of non-tailed dsDNA viruses as unrecognized killers of marine bacteria. Nature 554, 118–122.
Laanto, E., Mäntynen, S., De Colibus, L., et al., 2017. Virus found in a boreal lake links ssDNA and dsDNA viruses. Proceedings of the National Academy of Sciences of the United States of America 114, 8378–8383.
Laurinmäki, P.A., Huiskonen, J.T., Bamford, D.H., Butcher, S.J., 2005. Membrane proteins modulate the bilayer curvature in the bacterial virus Bam35. Structure 13, 1819–1828.
Mäntynen, S., Sundberg, L.R., Oksanen, H.M., Poranen, M.M., 2019. Half a century of research on membrane-containing bacteriophages: Bringing new concepts to modern virology. Viruses 11, 76.
Oksanen, H.M., Abrescia, N.G.A., 2019. Membrane-containing icosahedral bacteriophage PRD1: The dawn of viral lineages. Advances in Experimental Medicine and Biology 1215, 85–109.
Peralta, B., Gil-Carton, D., Castano-Diez, D., et al., 2013. Mechanism of membranous tunnelling nanotube formation in viral genome delivery. PLOS Biology 11, e1001667.
Rissanen, I., Grimes, J.M., Pawlowski, A., et al., 2013. Bacteriophage P23-77 capsid protein structures reveal the archetype of an ancient branch from a major virus lineage. Structure 21, 718–726.

Relevant Websites

https://talk.ictvonline.org/ictv-reports/ictv_online_report/dsdna-viruses/w/corticoviridae Corticoviridae.
https://talk.ictvonline.org/ictv-reports/ictv_online_report/ssdna-viruses/w/finnlakeviridae Finnlakeviridae.
https://talk.ictvonline.org/ictv-reports/ictv_9th_report/dsdna-viruses-2011/w/dsdna_viruses/133/tectiviridae Tectiviridae.

Tailed Double-Stranded DNA Phages
Robert L Duda, University of Pittsburgh, Pittsburgh, PA, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Capsid or Head The protein container for the Caudovirales phage genome, usually icosahedral in shape, that protects it from the environment until it is delivered to a new host.
Contractile tail The tail type that defines the Myoviridae, composed of a baseplate that contacts the host surface, an outer protein cylinder or sheath, and an inner tail tube or core. When a host is bound, the sheath contracts, driving the tube into the cell to deliver the genome to the host.
DNA packaging The energy-driven process by which a procapsid is filled with phage DNA to become a mature capsid.
GenBank The United States National Institutes of Health genetic sequence database, an annotated collection of all publicly available DNA sequences.
Horizontal gene exchange Transfer of genetic material between different species.
Portal A grommet-like ring composed of 12 copies of the portal protein that occupies one capsid vertex, and through which DNA is packaged and to which the tail is attached.
Procapsid or Prohead An immature capsid that has a spherical shape and contains no DNA.
T number The triangulation number (T), a formal descriptor for complex icosahedral structures that describes the geometric arrangement of subunits for a viral capsid. The number of capsid protein subunits is ~60 × T.
Tail The part of the virion responsible for attachment to the host and genome injection that is attached to one vertex of the capsid, at the portal.
Tail fiber An elongated protein fiber attached to a phage tail that helps recognize a host by binding to the host by the tip of the fiber.
Virion The entire Caudovirales viral particle composed of a capsid containing the viral genome, tail, tail fibers and other appendages.

Introduction

The icosahedral tailed double-stranded DNA bacterial viruses, or bacteriophages, of the order Caudovirales are among the most widely recognized icons of modern molecular biology. Anyone who has participated in an advanced biology, genetics, or molecular biology class is likely to have been exposed to images of the lunar-lander-shaped bacteriophage T4 virion (family Myoviridae) with its elongated icosahedral capsid and machine-like contractile tail (Fig. 1(A)) or the plain icosahedral capsid and elegant, gently curving tail of bacteriophage λ (family Siphoviridae, Fig. 1(B)). Both T4 and λ are bacteriophages that infect the enteric bacterial host Escherichia coli and have been studied extensively. Bacteriophages were originally discovered in the early 20th century (1915–17) by the English microbiologist Frederick W. Twort and the French-Canadian microbiologist Félix d'Herelle. Although there was much early interest in bacteriophages because of their potential for use in treating bacterial diseases (phage therapy), research on bacteriophages for disease therapy was largely abandoned after the discovery of antibiotics. Bacteriophage research was revived in the 1940s by Max Delbrück and colleagues (the well-known Phage Group that often met at the Cold Spring Harbor Laboratory on Long Island, NY), and the focused phage research started by this group of scientists led to many fundamental biological discoveries during the birth of molecular biology. The modern emergence of multiply antibiotic-resistant strains of human pathogens has rekindled an interest in developing phage therapy as a means of treating human, fish, insect (bee), and livestock diseases.

The Structure of Tailed dsDNA Bacteriophages

The invention and commercialization of the electron microscope led to the first images of phages as sperm-like particles with a head and tail. These images were obtained by metal shadowing, in which images were formed by heavy metal atoms such as uranium that were evaporated in a vacuum and allowed to strike a dried phage specimen at an angle. The later introduction of negative stains for electron microscopy (salts of heavy metals that dry as a thin layer without forming crystals and in which small particles such as phages can be embedded) resulted in images that were far richer in detail than those from earlier methods and revealed the complexity and variety of phage morphology. Fig. 1 shows electron micrographs of phages T4 and λ made using the negative stain technique. Cryo-electron microscopy of frozen hydrated phage samples is the latest innovation and is capable of revealing near-atomic details of phages and their structural components. Electron microscopy has remained a major tool in the study of bacteriophages and, in fact, the current taxonomic system for phages relies heavily on phage morphology as determined by electron microscopy as a major discriminating factor. Table 1 shows a taxonomic table for the order Caudovirales and some of their characteristics. Myoviridae have contractile tails and include bacteriophages T4 (Figs. 1(A) and 2) and P1. Siphoviridae have long non-contractile tails and include phages λ (Fig. 1(B)) and T5. Podoviridae have shorter stubby or stumpy tails and include phages P22 and T7. The utility of morphological classification as a basis for phage taxonomy has more recently come into question as new insights

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21230-2

Fig. 1 Electron micrographs of two bacteriophages. (A) Bacteriophage T4. (B) Bacteriophage Ur-λ. Ur-λ is the original phage λ isolate and has the side tail fibers that are missing in the widely used λ strains. The micrographs were made using the negative stain uranyl acetate to embed the phage particles and provide contrast. The samples appear lighter than the enclosing stain.

Table 1 Tailed dsDNA bacteriophages: Order Caudovirales

Family / Genus              Defining example              Capsid T-number     Genome size (bp)

Myoviridae (phages with contractile tails)
  “T4-like viruses”         Enterobacteria phage T4       T = 13; Q = 20      168,903
  –                         Bacillus phage G              T = 52              497,513
  “P2-like viruses”         Enterobacteria phage P2       T = 7               33,593
  “PhiKZ-like viruses”      PhiKZ                         T = 27              280,334
  “Mu-like viruses”         Enterobacteria phage Mu       T = 7               36,717
  “SPO1-like viruses”       Bacillus phage SPO1           T = 16              ~132,500
  “φH-like viruses”         Halobacterium virus φH        T = 7?              ~57,000

Siphoviridae (phages with thin non-contractile tails)
  “λ-like viruses”          Enterobacteria phage λ        T = 7               48,502
  “T1-like viruses”         Enterobacteria phage T1       T = 7?              48,836
  “T5-like viruses”         Enterobacteria phage T5       T = 13              121,750
  “c2-like viruses”         Lactococcus phage c2          T = 4?; Q = 7?      22,172
  “L5-like viruses”         Mycobacterium phage L5        T = 7               52,297
  “ψM1-like viruses”        Methanobacterium ψM1          ?                   ~23,246

Podoviridae (phages with short stubby tails)
  “T7-like viruses”         Enterobacteria phage T7       T = 7               39,937
  “φ29-like viruses”        Bacillus phage φ29            T = 3; Q = 5        19,366
  “P22-like viruses”        Enterobacteria phage P22      T = 7               41,724

into phage evolution have revealed that horizontal exchange of genes is widespread among phages, and further, that some structural genes of bacteriophages and viruses that infect organisms from other domains of life appear to share common ancestors.
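The capsid T-numbers listed in Table 1 are not arbitrary: in the Caspar-Klug scheme, T = h² + hk + k² for non-negative integers h and k. The short sketch below, which is illustrative and not part of the original article, searches for the lattice indices that produce a given T. The function name is hypothetical, and only solutions with h ≥ k are reported, so mirror-image lattices are not distinguished.

```python
def lattice_indices(t: int) -> list:
    """All (h, k) with h >= k >= 0 such that h*h + h*k + k*k == t."""
    solutions = []
    h = 0
    while h * h <= t:
        for k in range(h + 1):
            if h * h + h * k + k * k == t:
                solutions.append((h, k))
        h += 1
    return solutions


# T-numbers that appear in Table 1 (question marks in the table are ignored here).
for t in (3, 4, 7, 13, 16, 27, 52):
    pairs = lattice_indices(t)
    print(f"T = {t:2d}: (h, k) = {pairs}, about {60 * t} subunits")
```

An empty result, as for T = 5, means that no icosahedral lattice with that triangulation number exists. The elongation number Q quoted for prolate capsids such as T4 and φ29 describes the tubular mid-section and is not covered by this function.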

Bacteriophage Structure and Function

Capsids

In the dsDNA bacteriophages, the capsid serves as the container for the DNA chromosome and protects it from the environment until it is delivered to a new host by the phage tail, the organelle of attachment and genome delivery. In many other types of viruses, the capsid is taken up by cells with the genome still inside, and the virus particle uncoats or disassembles within the new host to initiate viral replication. The capsids of dsDNA bacteriophages do not have to disassemble in this manner to initiate infection, so they can be constructed to be highly stable – resistant to a wide variety of chemical and physical assaults from the environment and digestive tracts – as they travel from one host to another. The dsDNA genome in bacteriophage capsids is packed to a very high density (~0.5 g/ml), so the capsid shell has to be strong enough to contain this tightly packed and highly concentrated DNA. Most of the Caudovirales have symmetric icosahedral capsids, but some, like T4 (Figs. 1(A) and 2), have capsids with an elongated icosahedral shape, with icosahedral ends and an elongated tubular middle section in between.
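The quoted packing density can be turned into a rough estimate of the internal volume a capsid needs. The sketch below is a back-of-the-envelope calculation under stated assumptions: an average of about 650 Da per base pair (a rule of thumb, not a value from the source), a density of 0.5 g/ml as given above, and a spherical cavity in place of the real icosahedral one. The function names are hypothetical.

```python
import math

DA_PER_BP = 650.0            # assumed average mass of one dsDNA base pair, in daltons
GRAMS_PER_DALTON = 1.66053907e-24
NM3_PER_ML = 1e21            # 1 ml = 1 cm^3 = 1e21 nm^3


def packed_dna_volume_nm3(genome_bp: int, density_g_per_ml: float = 0.5) -> float:
    """Volume (nm^3) occupied by a genome packed at the given density."""
    mass_g = genome_bp * DA_PER_BP * GRAMS_PER_DALTON
    return mass_g / density_g_per_ml * NM3_PER_ML


def equivalent_sphere_radius_nm(volume_nm3: float) -> float:
    """Radius of a sphere with the given volume (a stand-in for the capsid cavity)."""
    return (3.0 * volume_nm3 / (4.0 * math.pi)) ** (1.0 / 3.0)


# Genome sizes taken from Table 1.
for name, genome_bp in (("lambda", 48_502), ("T4", 168_903)):
    volume = packed_dna_volume_nm3(genome_bp)
    radius = equivalent_sphere_radius_nm(volume)
    print(f"{name}: {volume:.3g} nm^3, equivalent radius {radius:.1f} nm")
```

The resulting radii are of the order of a few tens of nanometres, which is the scale of capsid interiors. The calculation only illustrates how genome length and packing density constrain capsid size and makes no claim about the measured dimensions of any particular phage.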

Tailed Double-Stranded DNA Phages


Fig. 2 Structural features of a typical bacteriophage. A drawing of a bacteriophage with a contractile tail, like phage T4. Many of the structural components that are discussed in the article are labeled.

Tails and Tail Fibers

Phage tails function to identify and bind to the correct host and to deliver the phage genome into that new host to initiate replication. Phage tails have a large variety of morphological forms with many parts and appendages, such as those labeled on the diagram of T4 in Fig. 2 and visible in the electron micrograph in Fig. 1. We know little about the functions of some tail parts, but others are understood in considerable detail. For example, in bacteriophage T4, both the long tail fibers and the short tail fibers (which are folded under the baseplate until after the long tail fibers are bound to the host; see Fig. 2) recognize specific host receptors. The long tail fibers of phage T4 are the primary determinants of host range, and mutations that change the T4 host range cause alterations in the fibers near their distal tips. The whiskers that are attached to the top of the T4 tail have multiple functions. They act as assembly jigs for adding the long tail fibers to the virion, and they also act as environmental sensors that sequester the long tail fibers under unfavorable conditions and thus prevent attachment to a host.

The long tail fibers of T4 have an analog in bacteriophage λ, but in the ordinary laboratory strains of λ these fibers are not made because of a mutation in the side tail fiber gene. The electron micrograph in Fig. 1 shows the long side tail fibers present on Ur-λ, a primordial λ that lacks this mutation. These long side tail fibers of Ur-λ speed up the adsorption of this phage to its host but are not required for infection, because the primary interaction of λ with its host is mediated by the central tail fiber (a trimer of the tail protein gpJ) protruding from the end of the λ tail. The central tail fiber of λ binds to a host maltose transport protein (called LamB or MalB) in the outer membrane of the host. Binding of phage λ to its receptor under suitable conditions acts as a signal that triggers the injection of the phage DNA genome into the host to initiate replication.

The structure and functions of many parts of the contractile tails of Myoviridae have been revealed in depth, especially for the phage T4 tail. After the T4 long tail fibers attach to the host, the short tail fibers unfold from the baseplate and bind to a second receptor on the host surface. The baseplate remains attached to the cell and undergoes a dramatic reorganization in which it changes from a hexagon to a star shape, causing the sheath to contract and drive the internal tail tube into the cell. As the iron-containing, nail-shaped tip of the inner tail tube passes through the outer membrane, tail lysozyme molecules are released to create a small hole in the peptidoglycan layer of the cell wall. These combined actions allow the tip of the tail tube to reach the inner membrane, where a channel is created that allows the dsDNA genome to enter the cell and initiate replication.

Temperate Versus Virulent Bacteriophages

Virulent bacteriophages, once they have infected a host, are committed to go through their entire growth cycle, including replication of the bacteriophage chromosome, production of progeny viral particles, and lysis of the host cell. Temperate phages, on the other hand, are able to grow lytically like their virulent counterparts, but they are also capable of going into a dormant state within the host. This dormant state is called lysogeny, and often involves integration of the phage chromosome into the host chromosome, as occurs with phage λ, or conversion of the phage chromosome into an autonomously replicating plasmid, as occurs with phage P1. The seemingly normal host containing a dormant phage chromosome is called a lysogen. The dormant phage chromosome is maintained in a mostly silent state until some change in the condition of the host (often related to stress, such as damage to the host DNA) acts as a signal to induce the phage to begin a lytic growth cycle, all the way through progeny release. The temperate phage in the lysogenic state makes a repressor protein that represses nearly all phage functions except the synthesis of the repressor itself. If the repressor is subsequently destroyed, as it may be after DNA damage occurs within the cell, the phage initiates a normal lytic growth cycle, usually excising the phage chromosome from the host chromosome as an early step.

How a temperate phage decides between a lytic and a lysogenic mode of infection is a complex process that appears to depend on the conditions within the host cell at the time of infection. In a simplified version of the process, the outcome is a balance between the production of sufficient repressor protein to shut down phage development and the antagonistic action of other factors that prevent repressor synthesis and function.

Injection

DNA Injection

As described above, all Caudovirales bacteriophage particles have tails with highly specialized parts that have specific roles in mediating attachment and infection. The fundamental goal of attachment is the introduction of the viral DNA into the host to begin production of progeny virions. The detailed mechanism of DNA entry varies widely from phage to phage. Although this process is called DNA injection, the phage particles do not act like syringes, as the name might suggest. In some cases chromosome entry appears to be largely passive, driven by the very high density at which the DNA is packed into the capsid; this appears to be true for many phages, including phage T4 and phage λ. Other Caudovirales, such as phage T7 (family Podoviridae) and phage T5 (family Siphoviridae), have more elaborate programs of DNA injection, with an initial stage that appears to be largely passive, followed by a pause after only a fraction of the chromosome has been injected. Such paused DNA injection resumes only after proteins (encoded by the genes injected in the initial stage) act to restart chromosome entry or to actively pull the rest of the chromosome into the host in a controlled manner.

Protein Injection

Many Caudovirales inject both DNA and proteins into their hosts. The function of a few of these injected proteins is known, but in many cases it remains mysterious. For example, phage T4 injects multiple copies of several proteins into the host. These include one protein that modifies the host RNA polymerase, another that inhibits a host nuclease that targets and digests T4’s highly modified DNA, and other proteins of unknown function. Surprisingly, most of phage T4’s injected proteins are not absolutely required for infection, and mutants that lack them are able to grow on many ordinary laboratory hosts. Phage N4 injects a very large and essential RNA polymerase that transcribes this phage’s early genes. Phage T7 was thought to inject several large proteins into its host, but these have now been shown to be extruded from the capsid interior and to form the channel through which DNA and other proteins pass into the host.

Host Interactions and Regulation of Gene Expression

Classes of Genes

A regulated program of gene expression begins soon after any member of the Caudovirales has injected its DNA. Phage genes can be grouped by function, such as host takeover, replication, virion assembly, or lysis, but more often they are sorted by their temporal order of expression. In most cases gene expression is divided into early genes and late genes, although for some phages there may be a cascade of gene expression modes that occur sequentially. In a cascading series, early phage gene products control the expression of a set of middle genes, whose products in turn control the expression of the late genes.


Early Genes

Phage early genes are usually transcribed by the normal host RNA polymerase, while the expression of late genes is dependent on the expression of the early genes. The earliest genes expressed might include ones that produce proteins to counteract the host’s anti-bacteriophage defensive systems, as occurs in phage T7. Early genes of temperate phages are those required to make the decision to become dormant or to grow lytically and to carry out the first steps of these processes, including, for example, the repressor gene and integration genes, or DNA replication and recombination genes. Early genes of virulent bacteriophages, such as phage T4, include genes for genome replication and genes that shut down unneeded host functions, such as host transcription and host protein synthesis. They may also include genes required to recycle host resources, for example, by specifically degrading the host chromosome to intermediates that can be re-used during the replication of the phage genome.

Late Genes

The proteins encoded by the late genes include the structural proteins required for the assembly of the bacteriophage virion, as well as those needed for packaging the genome into the virion and for cell lysis. Late gene expression is often dependent on early (or middle) gene expression because the late genes have transcriptional promoters that are not recognized by the normal host polymerase. The early genes may direct the synthesis of transcription factors that either (a) change the specificity of the host RNA polymerase or (b) modify the polymerase in other ways, in order to express the whole set of late genes at the appropriate time with high efficiency. Alternatively, one of the early genes may encode an entirely new RNA polymerase with a new specificity that recognizes late gene promoters. Expression of the structural proteins to high levels allows the production of a large number of progeny phages.

Virion Assembly

Assembly Pathways

The assembly of Caudovirales virions is a highly ordered process in which separate subassemblies of the virion are built and then joined to form the mature virion. The assembly of capsids, tails and tail fibers each follows an independent assembly pathway in which individual protein components are added, usually sequentially, to a growing structure until it is complete.

Tail Assembly

The major components of a Caudovirales tail are a baseplate or tail tip, major tail proteins (one for a non-contractile tail; two for contractile tails, an inner tube protein and an outer sheath protein), and termination or capping proteins that stabilize the completed tail and connect it to the head. Tail assembly begins with the formation of an initiator complex, which may be either a complete tail tip or a complete baseplate and which is assembled via a separate pathway. The initiator complex includes a template that specifies the length of the tail: the tape measure protein complex, in a compact form and bound by tape measure-specific chaperone proteins. The major tail proteins bind to the initiator complex and assemble into a tube around the tape measure protein until the full length of the tape measure protein is enclosed and the termination proteins can bind and stabilize the tail.

Capsid Assembly

Capsids of Caudovirales are built as precursor procapsids, into which the phage genome is later packed. Procapsids are initially assembled from several types of components: a portal complex, which will connect the capsid to the tail; a major capsid protein; and a scaffolding protein, which may be a separate protein or a disposable (removable) part of the major capsid protein. Phages may also utilize decoration or stabilization proteins, but these are not part of the procapsid; they add to and stabilize the capsid only after the genome is packed inside. The number of capsid protein subunits needed for assembly is approximately 60 × T, where T is the triangulation number listed in Table 1 (or, for phages with elongated capsids, approximately 30 × (T + Q), where T and Q are specified in Table 1). One vertex of a phage capsid is occupied by the portal complex, so the number of capsid proteins will be 5 less than calculated by these simplified formulae. So for a T = 7 virus, such as λ, 420 − 5 = 415 copies are needed. Some phages, including phage T4 and its relatives, also make a specialized minor capsid protein that forms the pentameric vertices of the icosahedral capsid, but most do not.

Capsid assembly in Caudovirales begins with the completion of the assembly initiator complex, the portal or connector, which is composed of 12 copies of the portal protein and occupies one of the 12 icosahedral vertices of the capsid. The major capsid protein(s), together with hundreds of copies of the scaffolding protein, co-assemble onto the initiator to produce a complete shell with the scaffolding protein on the inside. After assembly is complete, the scaffolding protein is expelled intact or digested by a special protease that is also incorporated into the procapsid. Procapsids appear spherical and usually have a smaller diameter than mature capsids. When DNA is packaged into the procapsid, the capsid usually expands and changes shape, transforming into the typical angular, icosahedral shape of mature capsids. If present, decoration proteins add to the outside of the capsid after it expands and help to stabilize the structure. Once DNA is packaged, the capsid is made ready to join to a tail by the addition of proteins that bind to the portal.
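The subunit arithmetic described in this section can be written out as a short sketch. The function names are hypothetical, and the Caspar-Klug relation T = h² + hk + k² used in the helper is the standard definition of the triangulation number rather than something stated in this article. The count returned covers all subunits of the shell, whether or not a separate vertex protein fills the pentamers.

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number, T = h^2 + h*k + k^2 (standard formula,
    assumed here; it is not spelled out in the article)."""
    return h * h + h * k + k * k


def capsid_protein_copies(t, q=None, portal_vertices=1):
    """Approximate number of capsid protein subunits in a tailed-phage capsid.

    Isometric capsid: 60 * T. Elongated capsid: 30 * (T + Q).
    Each portal vertex takes the place of one pentamer, i.e. 5 subunits.
    """
    total = 60 * t if q is None else 30 * (t + q)
    return total - 5 * portal_vertices


print(capsid_protein_copies(7))        # T = 7 isometric capsid: 415
print(capsid_protein_copies(13, 20))   # T = 13, Q = 20 elongated capsid: 985
print(triangulation_number(2, 1))      # lattice indices (2, 1) give T = 7
```

The first call reproduces the 415 copies worked out above for a T = 7 capsid; the second applies the elongated-capsid formula to one of the T and Q pairs listed in Table 1.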

Lysis

Lysis of a typical Caudovirales host requires breaching the bacterial envelope, which usually takes a separate factor for each of the envelope’s layers. For example, in Gram-negative hosts, holins are used to breach the inner membrane, a lysozyme or endolysin is used to break down the intermediate peptidoglycan layer, and spannins are used to breach the outer membrane. The endolysin is a soluble enzyme with the capacity to break the bonds holding the host cell wall together. The endolysin molecules accumulate within the host cytoplasm during the late stages of infection, but are unable to attack the cell wall because they are confined by the cytoplasmic membrane. Holin proteins allow the endolysin to get across the cytoplasmic membrane by forming holes in it. Holins are synthesized and inserted into the membrane in a form that does not form holes at first. As the bacteriophage infection proceeds, the holins accumulate in the membrane until a predetermined time, at which they suddenly and catastrophically induce membrane breakdown. The membrane holes produced by holins allow the endolysin molecules to attack the bacterial cell wall, causing rapid cell lysis and releasing the progeny Caudovirales from the cell. Spannins usually work as a pair (one each for the inner and outer membranes) and appear to disrupt the outer membrane by mediating fusions between the inner and outer membranes.

Genomes and Genomics

Chromosome Diversity and Replication

The chromosomes that are packaged into the capsids of the Caudovirales are linear dsDNA. Caudovirales virions always contain an entire genome or, alternatively, more than an entire genome, which ensures that every particle carries a complete genome and can initiate an infection. To achieve this, a site-specific mechanism may be used, in which DNA packaging begins and ends at defined sites that are recognized by the packaging machinery to create DNA chromosomes of exactly unit genome length. Alternatively, the amount of virion DNA may be regulated by a head-full packaging mechanism, which fills a capsid that has the capacity to hold slightly more than one genome’s worth of DNA. The head-full-packaged chromosomes have the same sequence at each end and are said to be terminally redundant.

After injection, the phage genomes often rearrange to form a circular chromosome that is the primary replicative form of the genome of many dsDNA phages; however, many other phages replicate without forming such DNA circles. Caudovirales chromosome replication often takes place bi-directionally from a single origin, but more complex replication schemes and multiple replication origins have also been observed. Late in infection, most phages switch to a mode of replication that produces DNA concatemers, or long strings of genome copies joined end to end, either by a rolling circle mode of replication from circular chromosomes or by recombination between multiple overlapping genome copies. Such DNA concatemers, whether linear or branched, are the usual substrate for DNA packaging for both head-full and site-specific mechanisms.

When it does occur, circularization of the phage chromosome takes place via one of two mechanisms. The first is the annealing of complementary single-stranded DNA sticky ends left by the packaging enzymes, in the cases where the packaged chromosomes have defined endpoints. The second is a recombination-like mechanism that joins the complementary regions of the overlapping, terminally redundant chromosome ends to form circular DNA molecules. Circular chromosomes are the most common form for bacterial chromosomes, but some bacteria also have linear chromosomes, and a small subset of the Caudovirales also form linear chromosomes that replicate as linear plasmids when in their lysogenic form.
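The relationship between head-full packaging and terminal redundancy can be illustrated with a toy model, sketched here under simplifying assumptions: the genome is a string of letters, the concatemer is a perfect linear repeat, and the capsid capacity (`headful_fraction`, a hypothetical parameter) is a fixed multiple of the genome length. Real packaging machinery is less precise than this.

```python
def headful_package(genome, headful_fraction=1.05, n_heads=4):
    """Cut a concatemer of `genome` into consecutive headfuls.

    `headful_fraction` (> 1) is the ASSUMED capsid capacity relative to one
    genome length; the actual value differs from phage to phage.
    """
    unit = len(genome)
    headful = round(unit * headful_fraction)
    if headful <= unit:
        raise ValueError("a headful must be longer than one genome")
    # Build a concatemer long enough to supply n_heads headfuls.
    copies = (headful * n_heads) // unit + 1
    concatemer = genome * copies
    return [concatemer[i * headful:(i + 1) * headful] for i in range(n_heads)]


def terminal_redundancy(chromosome, unit_length):
    """Length of the sequence repeated at both ends of a packaged chromosome."""
    overlap = len(chromosome) - unit_length
    if overlap <= 0 or chromosome[:overlap] != chromosome[-overlap:]:
        raise ValueError("chromosome is not terminally redundant")
    return overlap


if __name__ == "__main__":
    toy_genome = "ABCDEFGHIJKLMNOPQRST"  # 20 letters standing in for base pairs
    for chrom in headful_package(toy_genome, headful_fraction=1.1):
        print(chrom, terminal_redundancy(chrom, len(toy_genome)))
```

In this model every packaged chromosome carries the same two letters at both ends, and each one contains a complete genome. Successive headfuls cut from the same concatemer begin at different positions in the genome, which is a consequence of the capsid holding more than one genome length.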

Diversity in Genome Size and Organization

The genomes of Caudovirales range in size from ~20,000 base pairs in phage φ29, to ~170,000 base pairs in phage T4, and up to ~500,000 base pairs in other known examples. The smaller phages have proportionally fewer genes than the large phages, but are functional and successful phages nonetheless. Given this wide range of sizes and the diversity of the Caudovirales, it is not surprising that there is no common conserved genome structure for this order of viruses. Within a genus of phages there is often a common genetic structure, but within and across families there is often little resemblance in genome organization. There are notable exceptions to these generalizations; for example, the genus “P22-like viruses” and the genus “λ-like viruses” have rather closely related genomes.

Common Themes in Genome Structure

Despite the differences mentioned, there are many common general features in Caudovirales genomes. The genes for the structural proteins, such as those for capsids, tails, or tail fibers, tend to be found clustered together, and within these clusters, the genes that encode proteins that physically interact with each other also tend to be grouped together. For example, the capsid protein genes for the portal, the scaffolding protein and the major capsid protein usually occur together and in the order listed. Sets of late genes are often grouped together in clusters and, in some cases, the entire set of structural genes is grouped together and transcribed from a single promoter, as is the case for phage λ.

Horizontal Exchange of Genes is Widespread

A large number of bacteriophage genomes have been completely sequenced, and thousands of these have been deposited in GenBank and other databases. In addition, many temperate phages reside as prophages in the bacterial genomes deposited in GenBank, some defective (or cryptic), some complete and able to form viable phage. Analyses of these genome sequences have shown that many of the genes of the Caudovirales have highly similar counterparts in other genomes, and in many cases the similar genes can be inferred to be truly homologous. An important conclusion of these analyses is that there is extensive horizontal exchange of genes between phages that are not within the same species or genus or order. It appears that genetic exchange between phages by both homologous and non-homologous recombination mechanisms is widespread and that many phages are mosaic combinations of genes found elsewhere. Many of the homologous genes are within the Caudovirales, but some are from outside this group. In some cases, large modules of genes in a pair of phages are closely related, for example, the head genes of phage HK97 (Siphoviridae) and phage SfV (Myoviridae), even though the two phages belong to different taxonomic groups. In other cases, a single tail fiber gene in one phage may contain several distinct regions of homology with genes of several different phages.

Common Ancestry

The exact nature of the evolutionary relationships among phages and how it relates to phage taxonomy is a controversial subject, but at the level of individual protein coding genes, it is fairly certain that analogous proteins with a high degree of sequence similarity share a common evolutionary ancestor. Bioinformatic methods that tease out weak sequence similarities between distantly related proteins give the evolutionary scientist powerful tools to detect relationships that escape casual examination. However, the reliance on sequence similarity matches to detect homologous relationships breaks down when homologous proteins have diverged so far that sequence similarity is undetectable. A notable case is that of the major capsid proteins of bacteriophages and other viruses. The 3-dimensional structure and protein fold of the major capsid protein of phage HK97 was determined to high resolution by x-ray crystallography. Sensitive bioinformatic techniques were able to detect weak sequence similarities between the HK97 capsid protein sequence and the capsid proteins of a large number of other Caudovirales, suggesting that the HK97 capsid protein fold is quite common. Subsequently, structural determination of the capsid protein folds of phages T4 (Myoviridae), P22, φ29, and epsilon 15 (Podoviridae) and of many other phages has shown that all of these phages also have the same protein fold as HK97, despite the lack of any detectable sequence similarity. This structural similarity also extends to the capsid proteins of the herpesviruses, which use a variant of the HK97 fold for the floor domains of their much larger major capsid proteins and also have a portal protein with a fold similar to that of their bacteriophage homologs.



Relevant Websites

http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=28883 Complete genomes: Caudovirales, NCBI.
https://viralzone.expasy.org/256 Phage index, ViralZone page.
https://phagesdb.org/phages/ The Actinobacteriophage Database.
http://www.virology.ws Virology blog.

Helical and Filamentous Phages

Andreas Kuhn, University of Hohenheim, Stuttgart, Germany
Sebastian Leptihn, Zhejiang University-Edinburgh University Institute, Zhejiang University, Haining, China

© 2021 Elsevier Ltd. All rights reserved.

Glossary

Adsorption Binding of phage particles to the receptor on the bacterial cell surface.
Atomic force microscopy (AFM) High-resolution scanning microscopy with a laser-controlled cantilever.
Conjugation Sexual contact of bacteria to transfer genetic information from the donor to the recipient cell, involving an F-pilus of the donor cell.
Extremophilic bacteria Bacteria that live at high (thermophilic) or very low (psychrophilic) temperatures.
Filamentous Morphology of the phage that defines the Inoviridae. At the proximal end, 5 copies of gp3 adsorb to the tip of an F-pilus of the host cell.
Membrane insertion Incorporation of newly synthesized proteins into the membrane before their assembly into the phage particle.
Morphogenetic signal Sequence on the viral DNA that is responsible for initiating the DNA packaging reaction.
Pathogenicity Factors encoded by the viral DNA that are toxic for the human organism that the bacterial host colonizes.
Rolling circle replication Continuous replication mode that generates the single-stranded phage DNA to be packaged into the progeny particles.

Introduction

The first filamentous phage was isolated 60 years ago in Germany, when Hartmut Hoffmann-Berling took a sample from the backyard of a cattle farm close to Heidelberg. In the lab, he tested the sample for plaque formation on Escherichia coli Hfr. Since the selected phage contained DNA and appeared male-specific, it was named “fd”. At nearly the same time, Tim Loeb at the Rockefeller Institute isolated the male-specific “f1” phage, and Hans-Peter Hofschneider in Munich isolated phage M13 and also showed its filamentous morphology with the electron microscope. The “M” in M13 stands for Munich, and it was Hofschneider’s 13th isolate. Today we know that the three phages, fd, f1 and M13, are nearly identical in their DNA sequence and can therefore be considered the same phage (termed Ff phage). The fd genome, at 6408 bases, was one of the first to be sequenced. The single-stranded DNA (ssDNA) of these phages was an advantage for in vitro replication reactions and helped in determining the first DNA sequences in general. For this purpose, the DNA of interest was ligated into the M13 genome, and after transfection the phage was isolated to purify the ssDNA for sequencing. At that time, M13 was the common vehicle for DNA sequencing.

In the early work of Hoffmann-Berling it was recognized that the fd phage was “liberated” from cells without lysis, by an unknown “new phage liberation mechanism”. In addition, the first studies of its morphogenesis revealed no visible cytoplasmic intermediate structures such as had been observed with many other phages. Experiments by Rolf Knippers to assemble the phage in vitro by mixing the ssDNA with the major coat protein gp8 were unsuccessful. Instead, as we know today, the phage is assembled in the membrane during its secretion: the ssDNA is coated in an ordered secretion process that involves a motor protein complex in the membrane. The assembly and secretion of the phage requires the catalytic proteins gp1, gp4 and gp11, and thioredoxin of the host.

The filamentous phage again became famous when Johnny Mekalanos discovered that the cholera toxin gene was embedded in the genome of an integrated phage, CTX-φ. Following this observation, many more genome-integrated cryptic filamentous phages were detected in Vibrio cholerae. In fact, there might be a huge number of integrated filamentous phages in the metagenome that are not yet identified. Virulent filamentous phages have been found in Pseudomonas (Pf1 to Pf7), Xanthomonas (Xf, Cf, Lf, Xv2), Salmonella (Lineavirus Ike), Ralstonia (Habenivirus RSM1 to 3, RSS1, RS603), Vibrio (Fibrovirus fs1, Saetivirus fs2), and in Spiroplasma (Plectrovirus 1-C74, Vespertiliovirus 1-R8A2B), and recently also in Thermus thermophilus. Phylogenetic studies show that gene 1 is the most conserved among the known filamentous phage genes and can be taken as a basis for a phylogenetic tree. This analysis suggests that there are several distinct clades, in which the Inoviridae, together with Lineaviridae and Saetiviridae, are distinct from the known Vibrio and Pseudomonas phages.

Similar in structure but genetically unrelated to the filamentous phages are the helical viruses that were found in archaea. The Lipothrixviridae and Rudiviridae infect Crenarchaea and are 400–2200 nm long tubular particles with a diameter of about 20 nm. They contain linear dsDNA of 16–60 kbp that is packed with a helical capsid protein. Another helical archaeal virus, termed Clavavirus, has recently been discovered; it contains a circular dsDNA of 5.3 kbp, for which no sequence homology has been found in the current databases. The diameter of the virus is 16 nm, with a length of only 143 nm.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20986-2


Biotopes: Prolific, Temperate and Pathogenic The discovery of filamentous phages was delayed and still is hampered by a crucial technical step to remove larger debris including bacterial cells from aqueous samples by employing porous membranes or filters. Due to this step, spheroidal particles are more likely obtained than elongated ones, thus leading to a removal of filamentous phages from most samples. Therefore, it was estimated that filamentous phage representatives are by numbers only about 5% of those of spherical or head-and-tail phages. In contrast to this estimate, a recent publication in the field of (meta-) genomic sequencing indicated that there are many more filamentous phages than previously thought. According to the study, the currently described 56 members of the Inoviridae family represent only a small fraction of a highly diverse group that is proposed to even extend to the archaea kingdom. Astonishingly, Inovirus sequences have been identified in 35% of the metagenome sequences available until 2019. About 6% of the annotated genomes from bacteria and archaea contain filamentous phage genes. These sequences cover only the ones that are integrated into the host genome. Therefore, it can be assumed that many more filamentous phages exist often in a plasmid like state, as bacterial viruses are found as an episome or chromosomally integrated. Not only their DNA, but the filamentous phage particles themselves have been discovered in all biotopes on the planet. Most of the ecosystems where filamentous phages can be found are moderate in nature and are inhabited by mesophiles, i.e., environments with “normal” temperatures and pH, such as the gut of mammals including humans, where they infect proteobacteria, including enterobacteria (an estimated 90%). Surprising is the partnership that some filamentous phages have entered. 
While the phage benefits from using the cellular replication machinery, the bacterial host itself has in some cases been shown to gain an evolutionary advantage from being infected with the virus. Many of the filamentous viruses have a parasitic lifestyle, living at the expense of the host cell. In some cases, however, a mutualistic relationship has been proposed, as the filamentous virus was found to influence certain host characteristics to the benefit of both the bacterium and the phage. The most prominent example is the bacteriophage CTXφ, which infects Vibrio cholerae and delivers the genes for the cholera toxin to a comparably harmless bacterium. Once the phage is integrated into the genome, cholera toxin subunits A and B are produced, leading to a destruction of the mucosal membranes that results in a high influx of water and causes diarrhea. Release of the phage (excision) is triggered by a drop in pH, as found in the intestines of a human host. This allows other bacteria to be infected by the phage and turns them into highly virulent ones, a process called phage conversion that results in a so-called toxigenic Vibrio strain. It is thought that the cholera toxin genes were acquired by an evolutionary predecessor of the CTXφ phage, giving the virus a unique advantage, as the spreading of a highly virulent bacterial host is also beneficial for the phage. CTXφ is not the only Vibrio phage; there is a vast variety of phages, most of which are genome-integrated, such as VGJφ, which integrates into the same sites as CTXφ. Of particular interest is the occurrence of prophages with incomplete genomes, also known as satellite phages. These phages piggyback on some of the genes of their relatives, leading to the formation of so-called mosaic phages. An example is RS1, which has been reported to use the morphogenesis protein gp1 of CTXφ in order to form phage particles.
Yersinia pestis is another example where a filamentous phage contributes to the pathogenicity of a bacterium. Y. pestis evolved from the enteropathogen Yersinia pseudotuberculosis about three millennia ago and became infected by the filamentous phage YpfΦ, which remained an extrachromosomal element in many lineages. The Y. pestis strain in which the phage was integrated into the host genome caused the third plague epidemic. This major bubonic plague pandemic began in China in 1855 and spread to all inhabited continents, killing more than 12 million people in China and India alone. Interestingly, the phage YpfΦ infects not only Y. pestis but also Y. pseudotuberculosis and Yersinia enterocolitica. In addition, transduction extends to E. coli strains. In contrast to being genome-integrated in Y. enterocolitica, the phage exists as an extrachromosomal plasmid in Y. pseudotuberculosis and in E. coli. The presence of the phage confers on the host a better ability to colonize mammalian tissues. While YpfΦ is not a major virulence factor, it confers a higher fitness and a modest increase in pathogenicity on the bacterium.

Some commensal bacteria become opportunistic pathogens upon the genetic acquisition of virulence factors. E. coli is no exception: this enterobacterium can cause diarrhea and urinary tract infections, but also sepsis and meningitis. For example, the E. coli strain O18:K1:H7 is one of the leading causes of neonatal meningitis and cystitis in North America. This serotype carries a genetic element called CUS-1, which was identified as a filamentous prophage. The genome of the phage is almost identical to CUS-2, found in Y. pestis biovar orientalis. Both contain a gene called puvA which, when destroyed, affects the ability of E. coli K1 strains to propagate systemically or even survive in vivo, indicating a role in pathogenesis.
The gene is flanked by several other genes that show high similarity to those coding for coat proteins and for proteins crucial for the morphogenesis of other filamentous phages. Filamentous phage particles much like M13, later called CUSφ, were detected in the supernatant of cultures of both E. coli K1 and Y. pestis biovar orientalis CO92 cells. Another filamentous phage, MDAΦ, which infects Neisseria meningitidis, has been implicated in the virulence of its host when the virus is in a prophage state. While the presence of the phage was shown to have no influence on virulence during the bloodstream stage of N. meningitidis, it has been demonstrated that phage-infected bacteria are more efficient in colonizing host epithelial cells during the progression of the disease. Besides contributing to the pathogenicity of their host, filamentous phages can also affect the immune system of a human directly when that person is infected by bacteria harboring such a virus. In the case of the Pseudomonas aeruginosa phage Pf, the presence of the phage in its host had a negative impact on the healing process, leading to chronic, poorly and slowly healing wounds. In the mouse model, 50 times fewer Pseudomonas cells were required to cause a wound infection when phage-infected cells were compared with noninfected cells. The rate of morbidity was also increased when mice were infected with Pseudomonas harboring phage. Surprisingly, the effects were caused not by increased virulence of the bacteria but by the immune response of the mammalian host. The uptake


of phage particles by macrophages caused, among other cellular reactions, a Toll-like receptor response mediated by the RNA that is transcribed from the phage DNA, and also led to the inhibition of phagocytosis, suggesting that the phage can play a direct pathogenic role in the infection process.

Filamentous phages can be found on plants, in the soil, on food products, and in lakes, rivers and the sea. They have also been discovered in extreme environments where extremophilic bacteria are present. Examples are the volcanic hot springs found all over the world, including in Japan, where the phage φOH3 was discovered together with its host Thermus thermophilus. The conditions under which host and virus thrive are around 70 °C, with an almost neutral pH and low osmolarity. The stability and infectivity of the filamentous phage reflect this environment, as a higher temperature, high salt, or low or high pH are detrimental and lead to denaturation and inactivation of the virus. While it is not known whether the phage confers any benefit on the host or is purely parasitic, the existence of a thermostable phage might allow nanotechnological applications at elevated temperatures.

At the other temperature extreme, cold-adapted microorganisms were also found to harbor filamentous phages. A psychrophilic bacterium from the Pseudoalteromonas group, which thrives in the cold waters and in the ice of the Arctic, can be infected by a filamentous phage called f327. Prokaryotes in the Arctic ice and water are exposed to very low temperatures and to large fluctuations in nutrient levels, salt and gas concentrations, and pH. Interestingly, approximately 50% of the species living in permanent Arctic ice belong to the genus Pseudoalteromonas, of which more than 30% were found to contain filamentous phage genes. As one would expect from a parasite, phage infection resulted in slower growth rates and a decreased final cell density.
In addition, infected bacteria were less tolerant of changes in salinity and of higher hydrogen peroxide concentrations than uninfected cells. Surprisingly, infected bacteria showed a much higher motility, in contrast to the negative impacts on cell viability mentioned above. Possibly, the increased motility allows the cells to reach nutrient-enriched environments faster, e.g., when they are released from the ice at higher temperatures. Shewanella piezotolerans is psychrophilic and also a piezophile, i.e., a pressure-loving (or pressure-tolerant) bacterium that lives on the deep ocean floor, with optimal growth at a pressure of 20 MPa and a temperature of 20 °C. The S. piezotolerans strain WP3 has been found to contain a genome-integrated filamentous phage called SW1. SW1 is significantly induced at cold temperatures around 4 °C, the typical temperature in the deep sea. The presence of the phage strongly affects the expression of almost 50 host genes, many of them involved in flagella production. Interestingly, the phage does not influence the swimming motility of the host, while uninfected cells exhibit an increase in swarming motility at 4 °C. Because motility is energetically expensive for cells, it is thought that the phage reduces this energetic cost by suppressing flagella genes and thus increases the host's chance of survival.

While the above-mentioned hosts are all Gram-negative bacteria, filamentous phage-like particles have also been observed to infect Clostridium acetobutylicum, a Gram-positive bacterium. So far, only one filamentous phage that infects a Gram-positive bacterium has been characterized in detail with regard to its genome and proteins. This phage, B5, multiplies in Propionibacterium freudenreichii, an important microbial component in the maturation of certain cheeses, in particular Emmental, where the carbon dioxide the bacterium produces is responsible for forming the characteristic large holes.
Phage contamination, especially by lytic phages, can cause problems in industrial food processes in which microorganisms are involved. B5 infection, however, does not negatively influence the product. Whether this phage changes any pathways in the bacterial host is currently not known.

Until today, no filamentous phage has been isolated that infects the archaea, the third domain next to eukarya and bacteria. However, a recently published metagenomic study revealed Inovirus-like genes within archaeal genomes that show highly conserved sequences of filamentous phages. In addition, a circular plasmid-like episome of a filamentous virus has been isolated from Methanolobus profundi. The discovery of filamentous viruses in archaea is fascinating from an evolutionary perspective, as none of the currently characterized viruses infecting archaea are related to those of bacteria. Finding Inovirus genomes in both archaea and bacteria could be explained by the idea that this family arose very early, with a virus infecting a common ancestor. This conclusion is supported by the fact that the archaeal viruses display a high genetic diversity and lack strong similarity to bacterial phages, and are therefore unlikely to result from a recent genomic exchange.

Structure of Filamentous Phages

Filamentous phages are structurally very different from other phages, which usually have a head and a tail. Instead, they are long flexible filaments with two different ends: the base end, which adsorbs to the host, and the tip, which is the end that leaves the host cell first during the secretion event. Based on early fiber diffraction methods, the filamentous phages were divided into two structural classes; M13, f1 and fd are members of class 1, on which this review is focused. Fiber diffraction and other spectroscopic techniques showed the phage to be a rod of about 890 nm in length with a diameter of about 6 nm. The phage coat consists of an interdigitated helical arrangement of about 2750 copies of the major coat protein gp8. The slightly curved α-helical proteins cover the ssDNA with a tilt of about 20° relative to the helical axis of the phage particle. The ssDNA is packed as two individual antiparallel strands within the protein tube, where the phosphodiester backbone of the DNA may be electrostatically connected to the positively charged residues at the C-terminus of gp8. The helical arrangement of the coat proteins describes a C5S2 symmetry, which states that 5 proteins start at the same level, leading to a five-fold rotation axis. The coat proteins overlap such that the negatively charged N-terminal regions are exposed on the outside of the particle, and the coat actually consists of two α-helical layers. The two ends of the phage consist of the minor coat proteins. The base of the phage, which is used to adsorb to the host cell, is made of five copies each of gp3 and gp6. The two proteins form a defined structure visible as knobs with a diameter of about 50 Å


on the phage particles. The tip of the phage is formed by two small hydrophobic proteins gp7 and gp9, also with about five copies each. These two proteins are bound to the intergenic region of the ssDNA and initiate the packaging event.
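The coat dimensions quoted in this section can be cross-checked with a short calculation. The following is only a back-of-envelope sketch: it assumes a particle length of roughly 890 nm, about 2750 copies of gp8, and five gp8 subunits per layer (the five-fold axis of the C5S2 symmetry). The function and variable names are illustrative and not taken from any published tool.

```python
# Back-of-envelope geometry of the gp8 coat, using approximate figures from the text.
COPIES_GP8 = 2750            # approximate number of gp8 copies per virion
SUBUNITS_PER_LAYER = 5       # five gp8 subunits start at the same level (five-fold axis)
PARTICLE_LENGTH_NM = 890.0   # assumed particle length in nanometres


def coat_geometry(copies: int, per_layer: int, length_nm: float) -> tuple[float, float]:
    """Return (number of pentameric layers, axial rise per layer in nm)."""
    layers = copies / per_layer
    rise_nm = length_nm / layers
    return layers, rise_nm


layers, rise_nm = coat_geometry(COPIES_GP8, SUBUNITS_PER_LAYER, PARTICLE_LENGTH_NM)
print(f"{layers:.0f} pentameric layers, about {rise_nm:.2f} nm axial rise per layer")
```

Under these assumptions each ring of five gp8 subunits advances the filament by about 1.6 nm, which is one way to see why the particle length scales with the amount of DNA to be covered.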

Infection Process

The filamentous phages M13, fd and f1 are male-specific and infect only bacteria that carry an F plasmid encoding the F-pilus. The central D2 domain of the “attachment protein” gp3 interacts with the tip of the F-pilus, whereas the C-terminal D3 domain of gp3 is anchored to the phage capsid. The binding of the phage induces a retraction of the pilus, much as the binding of a recipient female cell does. The retraction brings the adsorbed phage close to the outer membrane surface and possibly also into the periplasm, where it interacts with TolQRA, an inner membrane protein complex with two large periplasmic domains. It is not known how the infecting phage actually crosses the outer membrane, but this might occur together with the F-pilus retraction process. In the periplasm, the C-terminal domain of TolA interacts with the N-terminal D1 domain of gp3. This binding is thought to trigger a conformational change in the C-terminal domain of gp3 and to support the insertion of gp3 into the inner membrane, possibly with help from TolQRA. Since the Tol complex is anchored in the inner membrane and spans the periplasm, it might offer a path for the infecting phage from the outer to the inner membrane. There, the phage particle dissociates and the coat proteins partition into the inner membrane. It has been proposed that the 5 copies of gp3 form a kind of DNA channel in the inner membrane, possibly together with TolR, that allows the single-stranded DNA to enter the bacterial cytoplasm.

Replication by the Rolling Circle Mechanism

Soon after the infecting phage DNA has entered the cytoplasm of the host, it becomes a target of DNA polymerase III, which converts the single-stranded DNA into a double strand. First, E. coli RNA polymerase binds to the intergenic region between gene 4 and gene 2 (bases 5484–5987) and synthesizes a short 30-base primer starting from base 5756. The primer is then extended with deoxyribonucleotides by DNA polymerase III, and the RNA primer is replaced with deoxynucleotides by DNA polymerase I of E. coli. DNA ligase of the host seals the ends of the double-stranded DNA into a closed circle, termed replicative form RF I. This first stage involves no viral protein, and the generation of the parental RF I is independent of the expression of phage proteins.

DNA replication is then initiated by gp2, a soluble globular protein of 46 kDa. It has an endonucleolytic activity, attacks the parental RF DNA at a specific site and cleaves the (+) strand between bases 5781 and 5782 in the intergenic region. Most likely, gp2 recognizes two hairpin structures (5692–5791) in the (+) strand and provides a free 5′ end (RF II). In the early phase of infection, E. coli polymerase III replicates the circular RF to about 100 copies per cell. Later in infection, the polymerization of the (−) strand is prevented by the binding of gp5, a single-strand binding protein. As a result of this “rolling circle”, only (+) strands decorated with gp5 are generated in the cells. Besides generating the endonucleolytic cut, gp2 has a second function: it produces unit-length ssDNA and seals the 5′ and 3′ ends of the ssDNA-gp5 complex together into a closed circular DNA molecule. Whether gp10 is also involved in this process is unknown, although in the absence of gp10 the proliferation of phage is greatly reduced. The ssDNA-gp5 complex has been thoroughly studied by many laboratories. gp5 is a 9.7 kDa protein and is mainly present as a dimer.
The globular protein consists of numerous β sheets that fold into anti-parallel loops and expose positively charged residues. These residues are involved in binding two strands of ssDNA, building a 3-dimensional helical structure that has been modeled. In this model, two strands of the circular ssDNA are packed into a left-handed helix that is surrounded by gp5 dimers. gp5 binds cooperatively to the phosphate backbone of the ssDNA, but a phenylalanine and a tyrosine residue are also involved in contacting bases of the bound ssDNA.
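The late, (+)-strand-only stage of rolling circle replication described in this section can be caricatured in a few lines of code. The sketch below is a toy model under strong simplifying assumptions: the genome is a short made-up sequence, the nick site is an arbitrary index, and the roles of gp2 (cutting at the regenerated origin and sealing the ends) and gp5 are reduced to comments. All names are hypothetical.

```python
def rolling_circle(plus_strand: str, nick_index: int, rounds: int) -> list[str]:
    """Toy model: unit-length (+) strands released after `rounds` turns of the circle.

    Copying proceeds around the circular template starting at the nick, so every
    released strand begins at `nick_index` and is exactly one genome long.
    """
    genome_length = len(plus_strand)
    released = []
    growing = ""
    position = nick_index
    for _ in range(rounds * genome_length):
        growing += plus_strand[position % genome_length]
        position += 1
        if len(growing) == genome_length:
            # One full turn: in the phage, gp2 cuts at the regenerated origin
            # and joins the two ends into a closed single-stranded circle.
            released.append(growing)
            growing = ""
    return released


def same_circle(a: str, b: str) -> bool:
    """Two linear strings describe the same circle if one is a rotation of the other."""
    return len(a) == len(b) and a in b + b


toy_genome = "ACGTTGCA"   # hypothetical 8-base "genome"
progeny = rolling_circle(toy_genome, nick_index=3, rounds=2)
print(progeny, all(same_circle(p, toy_genome) for p in progeny))
```

Each released strand is a rotation of the input, which mirrors the point made in the text that unit-length circular (+) strands are produced continuously from a single nicked template.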

Biosynthesis of M13 (fd, f1)

The Coat Proteins

Transcription of the filamentous phage DNA occurs early after the replicative RF I form has been generated. Early studies have shown that M13 is transcribed from the non-viral strand into 8 RNA species ranging from 8S to 20S. They are mostly generated from individual promoters that are recognized by the E. coli RNA polymerase holoenzyme. A strong rho-independent termination site, which forms a tight hairpin loop structure, is located shortly after the end of gene 8, and a rho-dependent termination site is located proximal to gene 1. Translation of the 11 viral genes by the host ribosomes occurs shortly after infection. The ribosome binding sites (RBS) are located a short distance before the start codon of genes 1 to 9; the RBS for gp7 is located within gene 5, the RBS for gp6 within gene 3 and the RBS for gp1 within gene 6. For gp9, the start codon overlaps with the stop codon of gp7. Interestingly, the coding sequences for gp1 and gp4 overlap by 7 codons and differ in their reading frame. Gene 10 is located within gene 2 in the same reading frame, and gene 11 likewise starts within gene 1 in the same frame. Therefore, gp10 has the same amino acid sequence as gp2 from residue 166–276, and gp11 is identical to gp1 from residue 241–348. This shows that the coding sequences of M13 are tightly packed and that only the 503 bp intergenic region lacks any protein-coding information.
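The relationship between a gene and a shorter gene nested in the same reading frame (as for gene 1 and gene 11) can be illustrated with a toy translation. The sketch below is illustrative only: the DNA string is invented, the standard genetic code is assumed, and the residue numbers used at the end are the gp1/gp11 figures quoted above.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, start: int = 0) -> str:
    """Translate from `start` until the first stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical mini-gene with a second, in-frame ATG (not a real M13 sequence).
TOY_GENE = "ATGAAACGTCTGATGGCTGAAGGTTAA"
internal_start = TOY_GENE.find("ATG", 3)   # position of the internal start codon

full_protein = translate(TOY_GENE)
nested_protein = translate(TOY_GENE, internal_start)

print(full_protein, nested_protein, internal_start % 3 == 0)

# Length of gp11 implied by the residue range quoted for gp1 (241 to 348, inclusive).
gp11_length = 348 - 241 + 1
print(gp11_length)
```

Because the internal start codon lies in the same frame, the shorter product is simply the C-terminal part of the longer one. An internal start in a shifted frame would instead read entirely different codons, as is the case where the gene 1 and gene 4 coding sequences overlap.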


The Major Coat Protein gp8

Essential for the assembly of the filamentous phage is the expression of the 5 coat proteins gp3, 6, 7, 8 and 9. They are all inserted into the membrane, where the assembly of the phage particles occurs. gp8 is the major coat protein, with more than 2500 copies per particle. It is first synthesized with a signal sequence of 23 amino acids that is cleaved off after membrane insertion. Leader peptidase (LepB) recognizes the transmembrane procoat protein at residues −1, −3 and −6 in the signal sequence, following the “von Heijne rules”. The membrane insertion of procoat requires an essential membrane protein of the host, the membrane insertase YidC. YidC binds to the hydrophobic sequences of procoat with its hydrophobic slide structure, leading to the integration of the procoat protein across the membrane. After cleavage by leader peptidase, the 50 amino acid long M13 coat protein exposes the N-terminal first 20 residues to the periplasmic space and the C-terminal 10 residues to the cytoplasm.

The exact mechanism of membrane insertion has been studied in vivo and in vitro with the major coat protein of Pf3. This protein lacks a signal sequence but also requires YidC for membrane insertion. The Pf3 coat protein has been purified and added to YidC-containing proteoliposomes, which inserted the coat protein efficiently. Atto-labeled mutants of Pf3 and YidC allowed the duration of the membrane insertion reaction to be measured by single molecule Förster resonance energy transfer (smFRET); it is in the range of 10 ms. Disulfide cross-linking showed that the coat protein enters YidC step-by-step up the hydrophobic slide of YidC, and that the membrane potential dependent translocation step then moves the N-terminal domain of Pf3 coat from the hydrophilic groove inside YidC into the periplasm. In the membrane, the M13 coat proteins oligomerize, as shown by disulfide cross-linking.
Possibly, these oligomeric forms interact with the gp1/11 packaging machine and cover the single-stranded DNA. The four positively charged amino acid residues in the C-terminal tail of the coat protein might interact electrostatically with the negatively charged phosphates of the DNA. When the number of lysines was reduced, the length of the phage particles increased, correlating with the altered charge density. On the phage particle, gp8 interacts with neighboring gp8 by hydrophobic interaction in a tilted, staggered arrangement (see the section on the phage particle structure). The N-terminal region of the coat protein is negatively charged but tolerates alterations quite well. Antigenic peptides and display peptides have been incorporated into this region and were exposed on the phage.
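The sequence features of procoat discussed in this section can be inspected with a small script. This is a hedged sketch, not a curated analysis: the two strings below are the M13 gene 8 signal and mature coat sequences as they are commonly reported, included here as an assumption that should be verified against a database entry before reuse. The charge count ignores termini and histidine, and the length estimate assumes simple charge balance between coat lysines and DNA phosphates, which is an illustrative simplification rather than a result stated in the text.

```python
# Assumed sequences (commonly reported for M13 procoat; verify before relying on them).
SIGNAL = "MKKSLVLKASVAVATLVPMLSFA"
MATURE = "AEGDDPAKAAFNSLQASATEYIGYAWAMVVVIVGATIGIKLFKKFTSKAS"


def net_charge(peptide: str) -> int:
    """Crude side-chain charge at neutral pH: K/R count +1, D/E count -1."""
    positive = sum(peptide.count(residue) for residue in "KR")
    negative = sum(peptide.count(residue) for residue in "DE")
    return positive - negative


# Residues at positions -1, -3 and -6 relative to the leader peptidase cleavage site.
cleavage_context = {position: SIGNAL[position] for position in (-1, -3, -6)}

n_terminal_charge = net_charge(MATURE[:20])   # periplasmic N-terminal region
c_tail_lysines = MATURE[39:].count("K")       # lysines from residue 40 onwards


def relative_length(wild_type_lysines: int, mutant_lysines: int) -> float:
    """Particle length relative to wild type if total coat charge must stay constant."""
    return wild_type_lysines / mutant_lysines


print(len(SIGNAL), len(MATURE), cleavage_context)
print(n_terminal_charge, c_tail_lysines, round(relative_length(4, 3), 2))
```

With these assumed sequences, small residues occupy positions −1 and −3, the N-terminal region carries a net negative charge and the C-terminal tail carries four lysines. Under the charge-balance assumption, removing one of the four lysines would require about one third more coat protein per genome, and hence a correspondingly longer particle.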

The Minor Coat Proteins gp7 and gp9

The assembly reaction is initiated with the capping proteins gp7 and gp9. Both proteins are synthesized without a signal sequence but fractionate with the inner membrane. gp7 is a small, 33 amino acid long, single-spanning membrane protein. Its N-terminal region of 8 residues includes 3 negatively charged residues, whereas the C-terminal residue is an arginine. For the display of foreign peptides, the PelB signal sequence was added to support the membrane insertion of gp7. Even in the absence of an added signal sequence, gp7 was useful for phage display. gp9 has only 32 residues and is the smallest coat protein. The C-terminal region of 15 residues is positively charged and located in the cytoplasm. It is unclear whether the N-terminus is exposed in the periplasm, since no hydrophilic residue is present. When the N-terminus was extended by 20 polar residues encompassing a T7 tag, the protein was fully functional and complemented an M13am9 infection. The membrane insertion of this extended gp9 was dependent on the membrane insertase YidC, and after insertion the N-terminal extension was digestible by proteinase K. It is possible that the non-extended gp9 protein also inserts spontaneously, as was shown with artificial membranes. Therefore, gp9 can be used for phage display, and if it is combined with a second display site at the base end of the phage particle (e.g., gp3), bi-functional phage can be designed. gp7 and gp9 presumably interact with the hairpin packaging signal to initiate DNA packaging at the membrane surface. Mutants in which the packaging signal had been deleted were compensated by plaque-forming phage mutations within genes 7, 9 and 1. gp7 can be co-immunoprecipitated with gp8, which suggests that once the formation of the packaging complex is initiated, the major coat protein is fed into the packaging machinery for assembly.

The Minor Coat Proteins gp3 and gp6

As the packaging process approaches the end of the single-stranded DNA circle, gp3 and gp6 bind and terminate the phage assembly. gp3 is first synthesized with a signal sequence of 18 amino acid residues and is inserted by the Sec translocase involving SecA. After cleavage it is a single-spanning membrane protein with a large N-terminal domain of 378 amino acids in the periplasm and a cytoplasmic region of 5 residues. The membrane anchoring of gp3 has been studied in detail. When the length of the membrane anchor was reduced from 23 residues to 17, the topology was unchanged. Larger deletions, however, resulted in a periplasmic location. The periplasmic region of gp3 contains two β-structured domains, termed D1 and D2. The two domains (later named N1 and N2) are connected by a flexible hinge region that is functional during the phage infection process. Whereas N1 (residues 1–68) folds and unfolds rapidly, the folding of N2 (residues 124–217) is slow and is followed by the folding of the hinge region and the tight docking of the two domains. This tight folding ensures that the phage particle is stable in changing environments. Following the two N-domains, the periplasmic portion of gp3 contains a C1 and a C2 domain. Whereas C1 is important for phage stability, C2 is involved in the release of the secreted nascent phage progeny. gp3 has been extensively used for phage display technology. The random sequences encoding the display peptides are introduced between the signal peptide and the mature sequence of gp3. The peptides are therefore displayed on the phage at the very N-terminus, without affecting the two β-structured domains N1 and N2.


At the base of the phage filament, gp3 is connected to gp6, a small, very hydrophobic protein of 112 amino acid residues. From its sequence it is predicted to span the membrane three times, with the C-terminus in the cytoplasm. In the phage, gp6 is required for the termination of the assembly reaction: amber mutations cause the generation of polyphage. Based on this observation and the stability of the polyphage, it has been concluded that gp6 binds first to the end of the nascent phage and that gp3 then follows. gp6 has also been used for phage display, with extensions at its C-terminus.

Assembly and Secretion

The assembly of filamentous phage is initiated at the “morphogenetic signal” on the replicated M13 single-stranded DNA (bases 5550–5577), a double-stranded hairpin structure close to the 5′ end of the replicated ssDNA. The tip of the hairpin, a GCA loop, might be the actual DNA structure that interacts with the membrane-inserted gp7/gp9 to initiate the assembly reaction. In the assembled phage, gp7 and gp9 are located at the tip of the particle and are present in about 5 copies each, together forming the phage cap structure. The membrane-based assembly process is controlled by gp1 and its shorter version gp11, which starts at an internal gp1 codon for methionine residue 241. The two proteins form a common oligomeric “preinitiation complex” in the membrane together with gp4, which is localized in the outer membrane. gp1 has a large N-terminal domain in the cytoplasm that contains an ATP-binding site with the typical Walker box elements. Mutations within these elements inhibit phage propagation, suggesting that ATP hydrolysis is required for phage extrusion. Close to the membrane-spanning regions of gp1 and gp11 is a characteristic pattern of positively charged residues, similar to the pattern found in gp8. These residues are presumably involved in the transient binding of the extruding phage DNA molecule. The morphogenetic signal of the phage DNA contacts the gp1/11 complex in the inner membrane together with the host thioredoxin TrxA, converting the preinitiation complex into an “initiation complex”. Since the enzymatic activity of TrxA is not required for phage propagation, the contribution of TrxA is likely only structural, as it is in phage T7 replication. The single-stranded DNA that interacts with the gp1/11 complex is covered with multiple copies of gp5.
The gp5 proteins are continuously displaced from the DNA as it enters the assembly machinery, and within the machinery gp5 is replaced by the major coat protein gp8. gp8 exists as oligomers in the membrane, and these are presumably fed laterally into the membrane-spanning assembly machine (Fig. 1). Since the coat proteins in the M13 phage particle are arranged as pentamers in so-called five-start helices, it is possible that the gp1/11 machinery has five lateral entrances for the gp8 proteins, so that all five bind to the DNA simultaneously. A model of a cooperative pentameric assembly of gp8 generating the phage was proposed early on. According to this model, each added coat protein is followed by the next coat protein at a fixed distance, leading to the staggered and interdigitated subunit packing of gp8. In the phage, this results in a shingle-like arrangement of the coat proteins covering the particle. In this way, the nascent filamentous phage particle drills itself out of the host cell (Fig. 1).

During the assembly process, the growing nascent phage filament passes through the outer membrane with the support of gp4. gp4 is a protein of 406 amino acid residues and is synthesized as a precursor of 426 residues. In the outer membrane, gp4 is assembled into a 14-meric secretin structure with a central cavity of 6–8 nm, forming a gated pore. This diameter allows the passage of the emerging phage particles, which have a diameter of about 6 nm. The channel gating of the gp4 secretin is of functional importance, since mutations in the respective regions of gp4 lead to a leaky phenotype of the host cells and to an enhanced sensitivity to antibiotics. In addition, the N-terminal portion of gp4, which is located in the periplasm, makes contact with the gp1/11 assembly machine located in the inner membrane, possibly forming a continuous structure.
In accordance with this, a mutant in gp1 was found to be partially compensated by a mutation in gp4. However, the interaction between gp4 and gp1/11 might be weak, since the two proteins do not co-purify when extracted from cells.
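The notion of a hairpin packaging signal with a short loop at its tip can be made concrete with a simple check for a perfect stem. The sketch below is purely illustrative: the test sequence is invented around a GCA loop and is not the M13 morphogenetic signal, and real hairpins tolerate mismatches that this check does not.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def is_perfect_hairpin(dna: str, stem_length: int, loop: str) -> bool:
    """True if `dna` is stem + loop + reverse complement of the stem, with no mismatches."""
    if len(dna) != 2 * stem_length + len(loop):
        return False
    left_arm = dna[:stem_length]
    middle = dna[stem_length:stem_length + len(loop)]
    right_arm = dna[stem_length + len(loop):]
    return middle == loop and right_arm == reverse_complement(left_arm)


# Hypothetical stem sequence; only the GCA loop is taken from the text.
toy_stem = "GGCTCAG"
toy_signal = toy_stem + "GCA" + reverse_complement(toy_stem)

print(toy_signal, is_perfect_hairpin(toy_signal, len(toy_stem), "GCA"))
```

A sequence of this form folds back on itself so that the two arms pair and the loop is left exposed at the tip, which is the part of the signal proposed above to contact gp7/gp9.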

Fig. 1 Model of M13 packaging. The membrane-inserted major coat proteins oligomerize and bind to the ssDNA in a shingle-like fashion catalyzed by the packaging motor gp1/11 and ATP.


Fig. 2 Infected cells extruding M13 phage particles visualized by AFM. Samples were taken at 0, 5, 10 and 16 min post infection and recorded with the atomic force microscope. The bar is 1 µm. Courtesy of Martin Ploss.

The extrusion of the filamentous phage was visualized by atomic force microscopy (AFM). With the culture synchronized by the infecting phage, the generation and extrusion of progeny phage from the host cells was observed as early as 5 min after infection and could be followed over time, reaching a maximum of extruding particles all over the cell surface after 16 min (Fig. 2). Judging from the length of the extruding particles, assembly and secretion are fast processes. The phage assembly process is terminated by the two proteins gp6 and gp3. Presumably, they both enter the gp1/11 assembly machine when the extruding DNA molecule is entirely covered by gp8. As mentioned above, gp6 is most likely assembled onto the nascent phage first, followed by the binding of gp3 with its C-terminal region. It has been shown that gp3 and gp6 co-purify when extracted from phage. Mutations in gp3 and gp6 can lead to multiple-length particles, also called polyphage. This clearly shows that the two minor coat proteins are involved in the termination step of the assembly. Normally, the size of the phage DNA determines the length of the filamentous phage.

Biotechnological Applications of Filamentous Phage

One of the first applications of M13 phage was its use as a DNA sequencing vector. Since the sequencing reaction in vitro is very efficient with single-stranded DNA, foreign genes were cloned into M13. The ssDNA was extracted from the phage particles and sequenced with the “universal” oligonucleotide primer using radiolabeled nucleotides. The universal primer anneals shortly before the multiple cloning site where the foreign genes were ligated.

Many applications are based on phage display technology. The coat proteins (namely gp3, 6, 7, 8 and 9) can be modified with protein domains that are exposed on the phage surface. These exposed domains can be used as binding components for a plethora of different targets and surfaces. The use of randomized oligonucleotides allows the generation of “display libraries” from which the optimal binders can be selected. Details are described in the recommended “further readings” section.

Due to their unique morphology, with a length-to-diameter ratio of ≈130, filamentous phages can be employed as scaffolds for nanotechnological applications. Many fascinating examples exist in which the long and thin structure of the bacterial virus has been used to create wire-like structures for various applications. Many groups, especially the research group of MIT scientist Angela Belcher, have shown the vast potential of creating virus-based hybrid materials. One such hybrid material consists of a genetically engineered multifunctional M13 phage that binds to carbon nanotubes via the coat protein gp8, which allowed amorphous FePO4 to be deposited in order to achieve high-efficiency electric conductivity. The phage-hybrid material improved electron transport between the highly resistive cathode material and the conductive elements of the battery. The material displayed a superior performance to any previously engineered amorphous FePO4 electrode in terms of power performance and capacity.
Filamentous bacteriophage can also be used as nucleation templates for the ambient-temperature synthesis of nanowires with phase-changing properties (materials capable of changing between two physical states). This is a particularly interesting application, as faster computers can be constructed by drastically reducing the time delays (usually milliseconds) that occur in current architectures from the transfer and storage of information between silicon- and magnetic-based memories. Nanosecond time scale information storage can be achieved with specialized segregating-binary-alloy-type phase-change materials (e.g., Ga-Sb-based systems) by employing a single memory structure. In purely inorganic form, the disadvantages of such materials are the high energy consumption and high temperature-induced elemental segregation. These can be overcome by using filamentous phages as templates: by forming a filamentous phage germanium–tin oxide hybrid material, achieved via electrostatic binding to the M13 virus coat proteins, wire-like phase-change materials can be obtained. These wire-like structures displayed reliable and controllable phase-changing signatures, with switching times of tens of nanoseconds and great potential for building new and highly advanced computers. Filamentous phage particles have also been used to build biological metal solar cell arrays that can convert light energy into electricity with high efficiency. The M13 phage was used to construct a 3D structure that proved to be highly efficient in light harvesting in specialized dye-sensitized solar cells (DSSCs). Again, the filamentous structure was used to deposit metals onto the phage coat proteins. In this case, gold nanoparticles were used, which in turn were encapsulated with titanium dioxide. This resulted in a plasmon-enhanced nanowire photoanode. Another fascinating example is the use of M13 to prepare low-density 3D aerogels, which are ultralight porous structures with excellent elastic behavior, allowing up to 90% material compression. By using peptide fusions on capsid proteins, inorganic materials can be attached, creating inorganic aerogels.
Such materials offer a plethora of possible applications, e.g., as bio-scaffolds, for the storage of hydrogen or energy, in thermoelectrics, and in catalysis. In an approach that does not use filamentous phages as a wire or wire network, a novel biocatalysis strategy for the production of semisynthetic enzymes was developed by incorporating an enzymatic motif into thermostable M13 phage coat proteins. Examples are phage particles that contain a site modeled after the active site of the enzyme carbonic anhydrase. Filamentous phages can also be employed as nanoparticles for imaging: for near-IR fluorescence imaging of tumors in deep tissues, genetically engineered multifunctional M13 phages were assembled with fluorescent single-walled carbon nanotubes as targeted probes for specific uptake in prostate tumors.

Further Reading

Duché, D., Houot, L., 2019. Similarities and differences between colicin and filamentous phage uptake by bacterial cells. EcoSal Plus. (ESP-0030-2018).
Hay, D.H., Lithgow, T., 2019. Filamentous phages: Masters of a microbial sharing economy. EMBO Reports. e47427.
Loh, B., Kuhn, A., Leptihn, S., 2019. The fascinating biology behind phage display: Filamentous phage assembly. Molecular Microbiology 111, 1132–1138.
Marvin, D.A., Symmons, M.F., Straus, S.K., 2014. Structure and assembly of filamentous bacteriophages. Progress in Biophysics and Molecular Biology 114, 80–122.
Ptchelkine, D., Gillum, A., Mochizuki, T., et al., 2017. Unique architecture of thermophilic archaeal virus APBV1 and its genome packaging. Nature Communications 8, 1436.
Rakonjac, J., Bennett, N.J., Spagnuolo, J., Gagic, D., Russel, M., 2011. Filamentous bacteriophage: Biology, phage display and nanotechnology applications. Current Issues in Molecular Biology 13, 51–76.
Scott, J.K., Smith, G.P., 1990. Searching for peptide ligands with an epitope library. Science 249, 386–390.

Replication of Bacillus Double-Stranded DNA Bacteriophages

Silvia Ayora, National Biotechnology Center–Spanish National Research Council, Madrid, Spain
Paulo Tavares, Institute for Integrative Biology of the Cell, CEA, CNRS, University of Paris-Sud, University of Paris-Saclay, Gif-sur-Yvette, France
Ruben Torres and Juan C Alonso, National Biotechnology Center–Spanish National Research Council, Madrid, Spain
© 2021 Elsevier Ltd. All rights reserved.

Glossary

θ replication: Circle-to-circle or θ-type DNA synthesis. It occurs when the replisome is assembled on a circular chromosome at the ori region, leading to its uni- or bidirectional replication. Its product is two catenated DNA molecules.
σ replication: Also referred to as rolling circle replication. Its product is a concatemer generated by recombination-dependent DNA replication (RDR).
Concatemer: A head-to-tail linear DNA molecule generated either by a replicative or by a recombination mechanism.
cos: A DNA sequence recognized by the viral terminase to initiate and terminate encapsidation of a unit-length genome.
DNA synthesis: Synthesis of both DNA strands in a concerted and semiconservative fashion in the 5′→3′ direction. One strand is synthesized mainly continuously (the leading strand) and the other is synthesized discontinuously (the lagging strand).
Headful packaging: Processive viral DNA packaging along a phage DNA concatemer whose first cycle is initiated by an endonucleolytic cleavage at pac and terminated by a sequence-independent cut that occurs when a threshold amount of packaged DNA is reached inside the phage capsid. The subsequent encapsidation cycle begins at the DNA end created by the headful cut, and processive headful encapsidation cycles ensue. The encapsidated DNA molecules are terminally redundant (usually 1.02–1.1 times the genome size) and the order of genes is partially circularly permuted to an extent that depends on the concatemer size.
pac: A DNA structure and sequence recognized by the viral terminase to initiate encapsidation by the headful packaging mechanism.
Preprimosome: A protein complex that loads the replicative DNA helicase onto DNA and is subsequently disassembled.
Primosome: A protein complex composed of the replicative DNA helicase and the DNA primase, or a single helicase-primase protein. In some cases, a hybrid RNA-DNA primer is synthesized in concert with a DNA polymerase (DnaE) to initiate DNA synthesis.
Replication origin (ori): A locus at which a replication initiation protein binds or an RNA polymerase transcript melts the duplex at an adjacent AT-rich region (DUE, for DNA-unwinding element). The resulting ssDNA regions act as the loading site of the replicative helicase.
Replisome: A protein machinery devoted to replicating DNA. In Bacilli it is composed of two replicative DNA polymerase holoenzymes (PolC and DnaE), the clamp loader DnaX-HolA-HolB (also termed τδδ′), the processivity factor DnaN (also termed β clamp), and the primosome.

Introduction

Bacteriophages (or phages) provide the largest unexplored genetic reservoir and are the most common biological entities on Earth. The large majority of phages with double-stranded (ds) DNA belong to the ancient Order Caudovirales (≈97%), and their division is probably older than the separation of the three domains of life (Bacteria, Archaea and Eukarya). The Order Caudovirales is composed of three major Families: Myoviridae (≈20% of total species of the Order), Siphoviridae (≈68%), and Podoviridae (≈11%) (Ackermann, 2003). Bacteria of the Firmicutes phylum (Gram-positive cells with low dC + dG content in their DNA), which are placed at the earliest division of Bacteria, are predominant in a healthy gut microbiota. Among Firmicutes, the Class Bacilli includes bacteria with a round (coccus) or rod-like (bacillus) form that represent a major health burden in the community as well as in hospitalized patients, e.g., Streptococcus spp, Enterococcus spp, Staphylococcus spp, Listeria spp and several Genera from the Bacillaceae family. Thus, the inherent ability of phages to infect those bacteria and control their biomass turnover makes them ideal candidates for developing antimicrobial agents for pathogen-specific remediation (e.g., phage therapy). On the other hand, Bacilli phages infecting Lactococcus spp, Lactobacillus spp, Leuconostoc spp, Oenococcus spp as well as certain Bacillaceae are a major concern for dairy, wine, and meat bioprocesses. Understanding how these phages multiply is an essential step to harness the full potential of phage-based technologies and to generate novel strategies to combat industrial phage contaminations. Here, we describe present knowledge on the different strategies that Bacilli phages use to replicate their DNA. A representative phage from each of the major families of the Order Caudovirales that infects the model bacterium Bacillus subtilis was chosen (Table 1) and compared with other Bacilli phages.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20970-9


Table 1. Phage- and host-encoded proteins involved in DNA replication

Function | SPO1 | T4 (a) | SPP1 | ϕ29 | Host
ssDNA binding | ? | gp32 | gp36 | gp5 | SsbA
Replisome organizer | – | – | gp38 | – | DnaA/PriA
Protein priming | NA | NA | NA | gp3/TP | NA
Replication mediator (RMP) | ? | gp59 | gp38-gp39 | gp6 | PriA-DnaB-D-I
Replicative helicase | gp21.1 | gp41 | gp40 | – | DnaC
Helicase loader | – | – | gp39 | – | DnaB-D-I
DNA primase | gp21.95 | gp61 | host | – | DnaG-DnaE
DNA polymerase (b) | gp31 (A) | gp43 (B) | host | gp2 (B) | PolC, DnaE (C)
Clamp loader | ? | gp44/62 | host | – | DnaX-HolAB
Sliding clamp | ? | gp45 | host | – | DnaN
Topoisomerase | J1 mutation | gp39/52/60 | host | – | GyrAB, ParCE
Rad50-Mre11 nuclease | gp21.3/21.9 | gp46/47 | ? | – | SbcCD
5′→3′ exonucleases | ? | – | gp34.1 | – | –
Recombinase mediator | – | UvsY | – | – | RecOR
Recombinase | – | UvsX | gp35 | – | RecA
Translocase | gp19.5 | UvsW | ? | – | RecG
DNA helicase | gp34.33 | ? | ? | – | PcrA
DNA helicase | – | Dda | ? | – | RecD2
HJ resolvase | gp24.1 | gp49 | gp44 | – | RecU

(a) Early and late replication of large dsDNA viruses of the Myoviridae family was reconstituted in vitro only for phage T4, which infects Escherichia coli. Therefore, T4 is shown for comparison with SPO1.
(b) The family to which the replicative DNA polymerase belongs is denoted within parentheses.
Note: NA, not applied; ?, unknown.
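
For readers who want to query Table 1 programmatically, a few of its rows can be transcribed into a small lookup structure. The sketch below is only an illustration: the names `PHAGES`, `TABLE1_EXCERPT` and `functions_with_entry` and the ASCII key `phi29` (standing for ϕ29) are hypothetical choices, the table's en dash is written as "-", and only nine of the nineteen rows are included.

```python
# Hand-transcribed excerpt of Table 1 (9 of 19 rows), for illustration only.
# "?" = unknown, "-" = the dash used in the table, "host" = supplied by the host.
PHAGES = ("SPO1", "T4", "SPP1", "phi29", "Host")

TABLE1_EXCERPT = {
    "ssDNA binding":        ("?", "gp32", "gp36", "gp5", "SsbA"),
    "Replicative helicase": ("gp21.1", "gp41", "gp40", "-", "DnaC"),
    "DNA primase":          ("gp21.95", "gp61", "host", "-", "DnaG-DnaE"),
    "DNA polymerase":       ("gp31 (A)", "gp43 (B)", "host", "gp2 (B)", "PolC, DnaE (C)"),
    "Clamp loader":         ("?", "gp44/62", "host", "-", "DnaX-HolAB"),
    "Sliding clamp":        ("?", "gp45", "host", "-", "DnaN"),
    "Topoisomerase":        ("J1 mutation", "gp39/52/60", "host", "-", "GyrAB, ParCE"),
    "Recombinase":          ("-", "UvsX", "gp35", "-", "RecA"),
    "HJ resolvase":         ("gp24.1", "gp49", "gp44", "-", "RecU"),
}


def functions_with_entry(phage, entry):
    """List the functions whose cell in the given phage column equals `entry`."""
    column = PHAGES.index(phage)
    return [name for name, row in TABLE1_EXCERPT.items() if row[column] == entry]


if __name__ == "__main__":
    # Functions that SPP1 borrows from its host, according to the excerpt.
    print(functions_with_entry("SPP1", "host"))
    # Functions for which no SPO1 candidate is known, according to the excerpt.
    print(functions_with_entry("SPO1", "?"))
```

Within this excerpt, the first query returns the primase, polymerase, clamp loader, sliding clamp and topoisomerase rows, which matches the list of host functions on which SPP1 replication depends.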

The majority of cells and their viruses initiate DNA replication by building an orisome. In the host cell, one or more ATP-dependent replisome organizer proteins bind to multiple sites at the ori region. This interaction leads to a conformational change that exerts a mechanical force for strand opening of an adjacent AT-rich region termed DUE (for DNA-unwinding element) and recruits, directly or indirectly, the replicative helicase to establish bidirectional replication forks. Most viruses of the Order Caudovirales use their own replication machinery or a combined action of viral and host functions, while a small number of them encode only one function (the ori region) that is essential for initiation of DNA replication and rely on the host machinery for replication elongation, such as lactococcal phage c2 (and perhaps sk1) of the Siphoviridae Family. The dsDNA phages have evolved an efficient and economical process for amplification of over 100 virus genome copies within minutes. They use three different mechanisms to initiate DNA synthesis. The majority of those phages use two different replication modes (early and late) to replicate their linear or circular genomes and generate the concatemeric DNA substrate for genome packaging into preformed viral procapsids (or proheads). A less common approach is the replication of linear unit-length genomes by a protein priming mechanism (Salas, 1991). These different strategies to initiate DNA replication, and the replication mode used, are not restricted to a specific Family of tailed phages. For example, within the Podoviridae Family, phage T7 uses RNA polymerase to initiate DNA replication, as described earlier for plasmid ColE1 (Masukata et al., 1987), phage P22 uses an ATP-independent replisome organizer protein to start DNA synthesis, whereas phage ϕ29 initiates DNA synthesis by a protein priming mechanism.
Viruses in general, and the different Families of dsDNA phages that infect bacteria of the Bacilli class in particular, do not have a universal set of genes. They have a core of ancient virus hallmark genes shared within viral families, combined with non-conserved genes that build genomes with a mosaic organization. Viral infection and horizontal gene transfer strongly contribute to spreading and swapping genetic information among viruses, leading to this mosaicism that obscures their phylogenetic relationship. Such an effect is also observed in the genome module encoding the DNA replication and recombination functions of Caudovirales. In this article we summarize current knowledge on the replication strategies of the large-genome SPO1-like viruses of the Myoviridae Family (with a contractile tail); the middle-sized SPP1-like viruses of the Siphoviridae Family (with a flexible, non-contractile tail); and the small ϕ29-like viruses of the Podoviridae Family (with a short, non-contractile tail). With the exception of ϕ29-like viruses, the replication of tailed phage genomes starts with an early replication mode that generates a template substrate for late replication. Late replication uses a recombination-dependent replication (RDR) mode to yield the head-to-tail concatemer that is cut at specific site(s), cos (to render a unit-length genome) or pac (redundant ends), for packaging into viral procapsids. Therefore, recombination is an essential step in phage DNA metabolism. Phages can use a pac site and a headful packaging mechanism to generate mature terminally redundant DNA molecules with or without circular permutation (Casjens, 2011; Rao and Feiss, 2015). The pac-containing phages constitute major vehicles for horizontal gene transfer, including the transfer of virulence and antibiotic resistance genes. Alternatively, tailed phages use cos sites in the replicated genome concatemer to initiate and terminate DNA encapsidation, which generates unit-length packaged DNA molecules (Casjens, 2011; Rao and Feiss, 2015).
Other phages like SPO1 package DNA molecules that have a non-permuted, several kb-long, terminal redundancy at their ends. Since the substrate used for processive packaging is a concatemer, their terminal redundancy must be generated by de novo DNA synthesis in concert with DNA packaging (Stewart et al., 2009). The underlying mechanism remains unknown.
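
The difference between cos and headful (pac) packaging can be made concrete with a toy model. The following Python sketch is an illustration under simplifying assumptions, not a description of any terminase: the genome is a string of ten letters standing for genes, the concatemer is four head-to-tail copies, the first cut is placed at position 0 as a stand-in for pac, and the headful is set to 1.2 genome lengths, which is deliberately exaggerated relative to the 1.02–1.1 range given in the Glossary so that the effect is visible. The function names are hypothetical.

```python
def cos_packaging(genome, copies):
    """Cut a head-to-tail concatemer at every cos site (one per genome unit)."""
    concatemer = genome * copies
    size = len(genome)
    return [concatemer[i:i + size] for i in range(0, len(concatemer), size)]


def headful_packaging(genome, copies, headful_factor, pac_offset=0):
    """Package processively: first cut at pac, then cut whenever the head is full."""
    concatemer = genome * copies
    headful = round(len(genome) * headful_factor)  # DNA length that fills one capsid
    chromosomes = []
    start = pac_offset
    # Each new cycle starts at the end left by the previous headful cut.
    while start + headful <= len(concatemer):
        chromosomes.append(concatemer[start:start + headful])
        start += headful
    return chromosomes


if __name__ == "__main__":
    genome = "ABCDEFGHIJ"  # toy genome: ten "genes"
    print(cos_packaging(genome, copies=4))
    print(headful_packaging(genome, copies=4, headful_factor=1.2))
```

In this toy run, cos packaging yields four identical unit-length molecules, whereas headful packaging yields ABCDEFGHIJAB, CDEFGHIJABCD and EFGHIJABCDEF: every molecule repeats its first two letters at its end (terminal redundancy), and each successive molecule starts two letters further along the gene order (circular permutation). The remaining concatemer tail is too short to fill a fourth capsid in this model.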

[Fig. 1 image not reproduced. The panels show gene maps of the replication modules of phages SPO1, SP10, Bc431v3, LP65 and Stau2, with gene labels Hel, RDH, SbcD, SbcC, Prim, Res, Pol, Exo, Lig, PcrA and Rec.]

Fig. 1 Genome overview of the replication module of SPO1-like phages. The arrows indicate the direction of gene transcription. Relatedness is derived from P-BLAST comparison of phage-encoded products. The color of each gene identifies its classification in functional groups: DNA recombination (red), DNA replication (blue), and related functions (black). The lengths of the interspersed regions are indicated. The names of the genes are labeled above the arrows and their putative function is annotated. Hel: helicase; RDH: replicative DNA helicase; SbcD and SbcC: nuclease SbcCD; Prim: primase; Res: Holliday junction resolvase; Pol: DNA polymerase; Exo: 5′→3′ exonuclease; Lig: DNA ligase; PcrA: helicase; Rec: RecA-like recombinase. The genome sequence accession numbers are NC_011421, MF422185, NC_020873, AY682195 and KP881332.

Finally, ϕ29-like viruses initiate DNA replication by protein priming that is extended by the DNA polymerase. The synthesized unit-length genome is subsequently packaged, with a terminal protein (TP) covalently linked at the 5′-end, into an empty prohead (Salas, 1991). Phages with homology to the SPO1, SPP1 or ϕ29 prototype genomes can show three degrees of evolutionary relatedness: (i) those whose complete genome has strong nucleotide sequence identity to the phage prototype; (ii) those with a mosaic genome organization, having discrete modules with a significant degree of nucleotide sequence homology to the phage prototype; and (iii) those whose sequence identity between the different replication proteins is low, but in which the order of the coding genes is conserved. Therefore, our analysis of the replication machinery of SPO1-, SPP1-, and ϕ29-like phages will not be restricted to those with close phylogenetic relatedness. We will use a sensu lato classification to include viruses with conserved synteny of DNA replication gene modules with respect to the prototype virus for the analysis of their replication machineries.

DNA Replication of SPO1-Like Viruses

The genus SPO1-like virus, including the prototype phage SPO1 that infects B. subtilis, is in the process of being renamed genus Okubovirus in honor of Shunko Okubo, who isolated SPO1 and carried out the first genetic studies of this myovirus. It includes large dsDNA viruses with a genome size >100 kb. B. subtilis phages SP82, 2C, SP8, ϕ25, H1, ϕe and SP5C show a significant degree of nucleotide sequence identity to phage SPO1 (group i). A comparative analysis of available phage genome sequences identified two other groups of phages based on their level of relatedness to SPO1: Bacillus phages SP10 and Bc431v3; Listeria spp phages P100 and A511; Staphylococcus spp phages K, G1, P1, S3K, φ812, Twort and Stau2; Lactobacillus spp phages LP65 and Lb338-1; and Enterococcus spp phage φEF24C. Phages SPO1 (representative of group i), SP10 and Bc431v3 (group ii) and LP65 and Stau2 (group iii) are depicted in Fig. 1. Replication and recombination genes are clustered in a region of their core genome. They are separated either by genes encoding a function that optimizes viral DNA amplification or by genes for proteins of unrelated or unknown function. The genome of SPO1-like viruses is large (130–160 kb long) and thymine is replaced by 5-hydroxymethyluracil (hmU). DNA replication starts from at least two replication origins. Although SPO1 is the best-characterized member of this group, there is limited information on its DNA replication mechanism. Here, SPO1 protein candidates participating in the process will be discussed, and potential mechanistic similarities with the well-studied DNA replication of myovirus T4 will be addressed. At early times of infection, SPO1 DNA replication initiates at specific origin sequences by RNA polymerase transcription to generate a stable untranslated transcript. This RNA remains attached to its template to form a strand displacement RNA (R-loop).
This structure maintains the DNA strands melted at the DUE region in a process that is independent of recombination proteins, and DNA replication starts. As infection progresses, an RDR mechanism becomes predominant and the origin-dependent replication mode is turned off. The SPO1-encoded DNA primase (Prim, gene product (gp) 21.95), replicative DNA helicase (RDH, gp21.1), DNA polymerase (Pol, gp31) and 5′→3′ exonuclease (Exo, gp32.85) (Fig. 1) may have their functional counterparts in the host DnaG, DnaC, DNA PolI and YpcP proteins, respectively, but these host functions are unable to substitute for the SPO1-encoded proteins (Table 1). When the RDR mechanism is turned on, phage SPO1 probably relies on phage- and host-encoded recombination machineries. Although phage T4 encodes proteins with single-stranded DNA-binding and recombinase activities essential for its replication, none of the SPO1-encoded products shows significant similarity to proteins with related functions (Table 1). However, other phages of this Family encode an ATP-dependent recombinase (Rec, counterpart of host RecA) downstream of the viral-encoded DNA polymerase (Fig. 1), suggesting that in these phages initiation of late replication is controlled by a viral protein. At present, it is unknown whether host RecA is able to substitute for the viral-encoded counterpart. Some of the SPO1 DNA replication functions for which no viral gene was identified could be performed by uncharacterized phage proteins. For example, no gene product shows significant sequence similarity to host DNA gyrase. However, SPO1 infection requires a DNA topoisomerase function and SPO1 mutants can replicate in the presence of DNA gyrase inhibitors, suggesting the presence of a viral topoisomerase (Table 1). In late replication, the host-encoded recombination machinery might initiate RDR. Then, the SPO1-encoded DNA helicases Hel and PcrA-like (gp19.5 and gp34.33) could prevent over-replication at displacement loops (D-loops) formed by homologous recombination or could be involved in the replication of branched intermediates. Branched replication intermediates could be resolved by the putative phage HJ-specific resolvase Res (gp24.1) (Table 1), although some viruses might have lost it and rely on a host function (Fig. 1). The SPO1 gp21.3 and gp21.9 proteins share homology with the host SbcCD complex, which cleaves DNA ends, hairpins, abnormal replication intermediates, and terminal DNA structures, preparing DNA ends for RDR in a way similar to the eukaryotic Rad50-Mre11 complex. Finally, the gp32.95 LigD-like DNA ligase could seal hmU-containing DNA ends. The majority of SPO1 genes are co-oriented with the direction of RDR. One strategy to avoid deleterious head-on replication-transcription conflicts is to use the helicase gp34.33 together with the gp21.3-gp21.9 complex to promote replisome progression. The detailed mechanism of DNA replication of SPO1-like viruses (Spounavirinae subfamily) is unknown. Their genome core is reminiscent of the one from E.
coli T4-like phages (Tevenvirinae subfamily). Since T4 DNA replication has been extensively studied and reconstituted in vitro (Nossal, 1983; Mosig, 1998; Kreuzer, 2000; Benkovic and Spiering, 2017; Barry et al., 2018), we revisit its molecular mechanism as a model for the large viruses of the Myoviridae Family. In phage T4, cytosine is replaced by hydroxymethylcytosine (hmC). All the proteins necessary for replication of its linear and/or circular genome are encoded by the phage, with the exception of the RNA polymerase, which is provided by the host cell. Early after infection, T4 DNA replication starts from the oriE locus close to an early promoter (major origin) and from four secondary origins (oriA, oriC, oriF and oriG) located near middle promoters. The primary and secondary origins are located just upstream of a DUE. RNA polymerase synthesizes an RNA that binds stably to its template, forming an R-loop. Following processing by T4-encoded RNaseH, this untranslated transcript serves as a primer to initiate DNA synthesis. In the R-loop the nascent RNA anneals to the coding DNA strand, leaving the displaced noncoding DNA coated by the single-stranded DNA binding protein (Ssb) gp32. The acidic C-terminal tail of gp32 interacts with the replication mediator protein (RMP) gp59, the family B DNA polymerase gp43, the recombinase UvsX, the recombination-mediator protein UvsY, the DNA helicase Dda, and the primase gp61 (Table 1). The interaction of gp32 bound to ssDNA with gp59 and the replicative gp41 helicase loads both proteins onto the viral linear and/or circular genome (Table 1). Gp59 binds to the R-loop-dependent replication origin. Gp41 then interacts with gp61, whose recruitment promotes assembly of the T4 primosome. The primosome loads the T4 DNA polymerase holoenzyme, composed of gp43, the sliding clamp gp45, and the clamp loader gp44/gp62. Leading strand synthesis is initiated at the 3′-OH end of the RNA primer by the gp43-gp45-gp44/gp62 complex.
After the gp59 mediator frees gp41, the primosome allows gp41 unwinding of the substrate in the 5′→3′ direction, gp61-mediated synthesis of the lagging strand primer, and gp32 binding to the ssDNA regions (Table 1) (Benkovic and Spiering, 2017; Barry et al., 2018). The gp43 that synthesizes the lagging strand remains in contact with the leading strand polymerase at the replication fork. This organization folds the lagging strand into a 700–1500-nucleotide loop, coupling leading and lagging strand synthesis according to the trombone model of replication. The encounter of the replisome with the 5′-end of the previously synthesized Okazaki fragment triggers dissociation of the lagging-strand replisome, which disengages from gp45, leaving the clamp protein bound to the lagging strand. Synthesis of a new Okazaki fragment ensues (Barry et al., 2018). In the large myoviruses of the T4 and SPO1 groups, early DNA replication starts from an internal origin of replication in a linear molecule. This implies that when viral replication reaches the chromosome ends, the synthesized DNA molecules are shorter than unit length. The leading strand is replicated to the 5′-end, yielding a 3′-tailed duplex, while lagging strand replication leaves a ssDNA region at both ends, resulting in a 5′- and 3′-tailed duplex replicated DNA molecule. The 3′-ssDNA ends coated by gp32 are competent for recombination by strand invasion. At late stages of phage T4 infection, the UvsW DNA helicase inactivates initiation from R-loops by unwinding potential primer transcripts that may have persisted or were synthesized during late infection, ensuring that RDR predominates. The ATP-dependent UvsX recombinase, loaded by the UvsY recombinase mediator, polymerizes on the 3′-tailed substrate generated by early replication. UvsX-mediated DNA pairing forms a displacement loop (D-loop) at a homologous region on the linear DNA template to initiate DNA synthesis by RDR (Table 1) (Kreuzer, 2000).
Viral-mediated RDR resembles eukaryotic break-induced replication, but the latter is mutagenic and prone to chromosome rearrangements (Llorente et al., 2008). Unidirectional replication, initiated from this invading 3′-end that serves as the primer for leading strand replication, was proposed to duplicate most of the T4 DNA molecule. Multiple strand invasion events generate a complex branched network as the infection progresses. This replication mode promotes the appearance of replicating DNA intermediates containing multiple covalently linked copies of the genome. The Holliday junction (HJ) resolvase gp49 (also termed Endo VII) cleaves the network of chromosomes, and such debranching generates linear concatemeric DNA molecules (Mosig, 1998). The viral terminase initiates packaging at pac within these concatemers into empty T4 phage proheads by a processive headful mechanism. DNA molecules slightly longer than the unit length of a T4 genome are encapsidated, resulting in a population of terminally redundant and circularly permuted packaged T4 chromosomes. This differs from SPO1, which uses a substrate concatemer to package, by an unknown mechanism, DNA molecules that have a terminal redundancy several kb long (Stewart et al., 2009).

[Fig. 2 image not reproduced. The panels show gene maps of the replication modules of phages SPP1, GBK2, 80α and JS01, with gene labels Exo, Rec, Ssb, Rep, RMP, RDH, Res and Rep-RMP and the origins oriL and ori.]

Fig. 2 Genome overview of the replication module of SPP1-like phages. The figure organization and symbols are as in Fig. 1. The protein functional annotation is: Exo: 5′→3′ exonuclease; Rec: recombinase; Ssb: single-stranded DNA binding protein; Rep: replisome organizer; RMP: replication mediator protein; RDH: replicative DNA helicase; Res: Holliday junction resolvase. oriL and ori are origins of replication. The genome sequence accession numbers are X97918, KJ159566, DQ517338 and KC342645.

Another mechanism proposed to generate the phage T4 DNA concatemer is that the 3′-ssDNA ends of the terminally redundant region pair, leading to a circular genome that becomes supercoiled by the action of the T4 DNA topoisomerase (the gp39-gp52-gp60 complex). Then, most T4 DNA replication forks are initiated by UvsX DNA pairing and D-loop formation at a homologous region on a supercoiled circular DNA template. The gp59 mediator loads the T4 primosome onto the displaced ssDNA coated by gp32. The primosome, in concert with gp32, recruits the T4 DNA polymerase holoenzyme. In the presence of the T4 DNA topoisomerase, the DNA polymerase holoenzyme catalyzes DNA synthesis that begins at the 3′-OH end of the invading strand. DNA is synthesized continuously on one template strand, as in the leading strand of a replication fork, and discontinuously on the other template strand (Okazaki fragments) by a semiconservative mechanism. In one of the models, the circular DNA is then nicked to generate, by σ replication, linear dsDNA concatemers (Barry et al., 2018) that are the substrate for DNA packaging. If the replication machinery stalls at an unrepaired DNA lesion, the phage RDR system rescues the stalled or collapsed fork. The UvsW DNA helicase can catalyze the regression of a stalled replication fork into an HJ-like structure that was postulated to be an intermediate in an error-free DNA damage tolerance pathway. The T4 gp46/gp47 exo/endonuclease and the Dda DNA helicase play a critical, although still poorly defined, role in replication restart. Finally, the gp30 DNA ligase could seal hmC-containing DNA ends.
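
How σ (rolling circle) replication turns a circular template into a head-to-tail concatemer can be illustrated with a short sketch. This is a deliberately abstract model: it assumes a circular genome written as a string, a hypothetical nick position, and a single strand that is extended continuously around the circle. Lagging strand synthesis, the proteins involved, and any real phage sequence are not represented, and the function names are invented for the example.

```python
def rolling_circle(circular_genome, nick_site, length):
    """Return the linear product made by copying `length` positions around the circle.

    Copying starts at `nick_site` and wraps around the template as often as needed,
    so the product is a head-to-tail repeat of the genome.
    """
    size = len(circular_genome)
    return "".join(circular_genome[(nick_site + i) % size] for i in range(length))


def genome_equivalents(product, genome_size):
    """Number of unit-length genomes contained in the linear product."""
    return len(product) / genome_size


if __name__ == "__main__":
    template = "ABCDE"  # toy circular genome with five "genes"
    concatemer = rolling_circle(template, nick_site=2, length=12)
    print(concatemer)                                      # CDEABCDEABCD
    print(genome_equivalents(concatemer, len(template)))   # 2.4
```

The product grows linearly with the distance travelled around the template, and every unit is joined head to tail with the next, which is the kind of substrate that the packaging machinery cuts into individual chromosomes.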

DNA Replication of SPP1-Like Viruses

B. subtilis phage SPP1 is the best-characterized member of the middle-sized viruses from the Siphoviridae Family that infect bacteria of the Bacilli Class. Currently, phage SPP1 is included among the unassigned viruses of the Family, in spite of the fact that its DNA replication and packaging machineries as well as the viral particle assembly mechanisms are among the best characterized within the Siphoviridae. This calls for the definition of an SPP1-like genus. A comparative analysis of the complete SPP1 genome shows that B. subtilis phages SF6, 41c, ρ15, Lurz2 and Lurz3 have strong nucleotide sequence identity to phage SPP1, defining group (i). The Bacillus spp phage PM1 has discrete modules with significant nucleotide sequence identity to phage SPP1. Phage GBK2 has no homology at the DNA sequence level but has extensive similarity at the protein level, particularly in the case of the DNA replication proteins, whose coding gene order is co-linear with the SPP1 genome (Fig. 2). Finally, the Staphylococcus spp phages 80α, ϕ11 and JS01, Lactobacillus spp phage LL-H, and Streptococcus spp phage Sfi11 show a conserved gene order of the DNA replication module with phage SPP1. Phages SPP1 (representative of group i), GBK2 (group ii) and 80α and JS01 (group iii) are depicted in Fig. 2. All these phages package their DNA by a headful mechanism, generating encapsidated chromosomes with terminal redundancy and partial permutation. Phages of this group most likely replicate their 40–50 kb genome using a mechanism similar to that of phage SPP1. At early times of infection, the SPP1 genome circularizes and replication is initiated at a specific and discrete replication origin sequence (oriL) in a process that resembles host chromosome replication. The early (θ-type) and late (σ-type) replication modes rely on early transcribed SPP1 genes.
Early replication requires the essential SPP1-encoded replisome organizer (Rep, gp38), replication mediator protein (RMP, gp39) and replicative DNA helicase (RDH, gp40). As the infection progresses, σ-type concatemeric DNA replication becomes predominant and θ-type ori-dependent replication is turned off. The switch of replication mechanism relies on recombination proteins: the SPP1-encoded ATP-independent recombinase (Rec, gp35), a 5′→3′ exonuclease (Exo, gp34.1), a single-stranded DNA binding protein (Ssb, gp36), and a Holliday junction resolvase (Res, gp44). In certain phages the initiator and mediator functions might be fused in a single polypeptide (Rep-RMP: phage 80α gp20 and phage JS01 gp33) (Fig. 2), while the single-stranded DNA binding protein, the 5′→3′ exonuclease and the HJ resolvase can be substituted by host-encoded functions. The essential recombinase (Rec) can belong either to the Rad52 superfamily of recombinases (defining the Redβ [SPP1 gp35, GBK2 gp39], Sak [80α gp16] and Erf [C2 gp8] subfamilies of recombinases) or to the RecA-like superfamily, as does the ATP-dependent Sak4 recombinase (ϕ11 gp12). The HJ resolvase (Res), like gp44 of SPP1, belongs to the RusA-like superfamily (Fig. 2).


Replication of Bacillus Double-Stranded DNA Bacteriophages

SPP1 relies on numerous host proteins (the SPP1 missing functions) that are essential for its DNA replication, namely PolC, DnaE, DnaN (β-clamp), the DnaX-HolA-HolB (τδδ′) complex and DNA gyrase. Cellular functions can also replace the non-essential gp34.1, gp36 and gp44 proteins. Although the lack of gp34.1 and gp36 is detrimental, the absence of gp44 has only a marginal effect on SPP1 multiplication. Initiation of circle-to-circle θ DNA replication and the late σ RDR (rolling circle replication) have been reconstituted in vitro (Seco and Ayora, 2017; Seco et al., 2013). After entry of the SPP1 linear chromosome into the B. subtilis cytosol, viral or host homologous recombination functions use the terminal redundancy of the viral chromosome to circularize the SPP1 genome, and the host DNA gyrase supercoils the circular DNA. The viral genome contains two replication origins (oriL and oriR) located ~13 kb apart in the circularized genome. The oriL sequence lies within the gene encoding the ATP-independent replisome organizer gp38, and oriR lies in an intergenic region transcribed late. oriL and oriR contain different numbers of gp38-binding boxes (boxes A and B). DNA replication is initiated by the preferential binding of gp38 to oriL, which induces a localized unwinding of the adjacent AT-rich region on the supercoiled DNA template. The replication mediator gp39, which is an intrinsically disordered protein, interacts with the replicative helicase gp40 and with the replisome organizer gp38. Gp40 promotes the folding of gp39, and the two proteins together form a double hexamer structure. The gp39-gp40 complex has neither ATPase nor helicase activity. Gp38 bound to oriL interacts with gp39 and loads the gp39-gp40 complex onto the DNA unwinding element (DUE) region opened when gp38 binds oriL. Gp39 loads and locks gp40 on the oriL lagging strand region until the gp38-gp39 complex is disassembled. 
Concomitantly with gp38-gp39 dissociation, gp36 binds to the ssDNA, and gp40 is released from the inhibition exerted by gp39. The hexameric replicative DNA helicase gp40 acts as a hub: it interacts with DnaX (the τ subunit), which is a component of the pentameric DnaX3-HolA-HolB (also termed τ3δδ′) complex, with the DnaG primase, and with the DnaE polymerase (Table 1). The τ3δδ′ complex then loads both family C DNA polymerases (PolC and DnaE) and two sliding clamps (DnaN) to assemble the functional replisome (Seco and Ayora, 2017). Unidirectional leading and lagging strand synthesis is initiated at oriL. Priming of both strands is carried out by the DnaG-DnaE complex (like eukaryotic DNA polymerase α), which synthesizes an RNA-DNA hybrid. This hybrid is further extended by PolC in the context of the full replisome (i.e., in a complex with DnaN-DnaX-HolA-HolB), which catalyzes unidirectional replication of both strands (Table 1). DNA gyrase is required to remove the positive supercoiling produced by the advance of the replication fork on the circular DNA template. The DnaG primase actively contributes to the DNA polymerase switch at the leading strand by inhibiting DnaE and stimulating PolC-mediated leading strand synthesis. On the lagging strand, gp36 and gp40 recruit the DnaG-DnaE complex. This complex synthesizes the RNA-DNA hybrid while gp36 inhibits DnaE and stimulates PolC (Seco and Ayora, 2017). The replisome catalyzes discontinuous synthesis on the lagging strand template (Okazaki fragments), coupled to leading strand synthesis, in a semiconservative mechanism of DNA synthesis (trombone model). The encounter of the replisome with the 5′-end of the previously synthesized Okazaki fragment triggers the dissociation of the lagging-strand replisome, which disengages from the DnaN clamp and leaves it bound to the lagging strand. 
After one or a few rounds of circle-to-circle DNA synthesis this θ mode of replication is turned off and SPP1 replication switches to linear concatemeric DNA replication (σ mode). The switch was proposed to be triggered by stalling of replication fork progression (road blocking) by gp38 bound to oriR. Seven SPP1-encoded proteins are involved in the process of resuming replication at the stalled fork: four essential proteins (gp38 and gp39, working as preprimosomal proteins, gp40 and gp35) and three partially dispensable proteins (gp34.1, gp36 and gp44). A branch migration translocase might regress the stalled fork, leading to a Holliday junction (HJ)-like structure. This extruded HJ can be cleaved by the gp44 HJ resolvase to generate a one-ended double strand break. The 5′→3′ exonuclease activity of gp34.1 generates the 3′-tailed substrate required for gp35 activity (Table 1). Gp35, in concert with gp36, catalyzes DNA pairing and D-loop formation at a homologous region on an intact DNA template to initiate homology-directed RDR. Host RecA recombinase cannot substitute for the gp35 activity. Gp39 in the preprimosome gp38-gp39 complex loads the replicative gp40 helicase, which orchestrates assembly of the active replisome as described above for the θ-type replication mode, initiating σ-type DNA replication. Two different mechanisms were proposed to achieve concatemeric DNA synthesis (Lo Piano et al., 2011). The first is a bubble migration mechanism that invokes conservative DNA synthesis. The second invokes the action of the gp44 resolvase, or its cellular equivalent, to cleave and rejoin the D-loop structure such that a “rolling-circle”-like mechanism of DNA replication (σ-mode) occurs. The σ replication mode generates a long linear concatemer. The SPP1-encoded terminase initiates DNA packaging by cleavage at the pac sequence, a cut that occurs only once along a substrate concatemer. 
The packaging cycle is terminated by a non-specific sequence cut determined by the amount of DNA inside the capsid, yielding packaged molecules longer than the phage genome (headful packaging mechanism) (Oliveira et al., 2013). Processive headfuls are then sequentially packaged from the concatemer, generating a population of mature chromosomes with terminally redundant and partially circularly permuted sequence.
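The headful logic described above can be illustrated with a toy model. The following is a minimal sketch, not a model of the SPP1 terminase: the function name `headful_packaging`, the ten-letter "genome", the pac position and the headful length are all hypothetical values chosen for illustration. It assumes a single sequence-specific cut at pac followed by sequence-independent cuts one headful apart.

```python
def headful_packaging(genome, pac_index, headful_len, n_heads):
    """Cut n_heads sequential chromosomes out of a concatemer of `genome`.

    Only the first cut is sequence-specific (at pac_index); every later cut
    falls one headful downstream of the previous one, whatever the sequence.
    """
    if headful_len <= len(genome):
        raise ValueError("a headful must be longer than one genome")
    # Build a concatemer long enough to supply every requested headful.
    copies = (pac_index + headful_len * n_heads) // len(genome) + 2
    concatemer = genome * copies
    chromosomes = []
    start = pac_index
    for _ in range(n_heads):
        chromosomes.append(concatemer[start:start + headful_len])
        start += headful_len  # the next headful begins where this one ended
    return chromosomes


# Hypothetical toy genome of 10 "bases" packaged in headfuls of 12.
heads = headful_packaging("ABCDEFGHIJ", pac_index=0, headful_len=12, n_heads=3)
for chromosome in heads:
    print(chromosome)
```

In this toy run each packaged molecule begins and ends with the same two letters (terminal redundancy), and the starting point shifts by two letters with each successive headful (partial circular permutation).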

DNA Replication of ϕ29-Like Viruses
B. subtilis phage ϕ29 is the best-characterized member of the small-sized viruses from the Podoviridae Family that infect bacteria of the Bacilli Class. Currently the Genus ϕ29-like virus is in the process of being renamed Salasvirus in honor of Margarita Salas, who has worked on ϕ29 for over five decades. A comparative analysis of the complete genomes of Bacillus spp phages Nf, Goe1, ϕ15 and PZA shows strong nucleotide sequence identity to the ϕ29 genome. All members of the ϕ29-like genus seem to have a similar core genome and a conserved organization, with phage PZA exhibiting the particularity of an inversion of the left early genome region (Fig. 3). In the ϕ29-like Genus there are phages with a mosaic genome organization, which have discrete modules with significant degree of


Fig. 3 Genome overview of the replication module of ϕ29-like phages. The figure organization and symbology are as in Fig. 1. The small filled square denotes the TP bound to the 5′-end of the genome. The protein functional annotation is: Pol: DNA polymerase; TP: terminal protein; Ssb: single-stranded DNA binding protein; DBP: double-stranded DNA binding protein. The genome sequence accession numbers are NC_011048, PZACG, X99260, MF156577 and MH817022.

nucleotide sequence identity, such as Bacillus spp phages GA-1, B103, Juan, BthP-Goe4, and Goe6; Staphylococcus spp phages P68 and ϕIPLA7; and Streptococcus spp phages Cp-1, Cp-5, Cp-7, and Cp-9. Phages ϕ29 and PZA (representatives of group i), B103 (group ii) and Juan and BthP-Goe4 (group iii) are depicted in Fig. 3. The protein-priming DNA replication mode of ϕ29-like viruses generates a linear unit-length molecule with a terminal protein (TP, also termed gp3 in ϕ29) at the 5′-end. The TP and an RNA molecule are essential for packaging of a unit-length viral genome into empty proheads. All ϕ29-like viruses have a linear dsDNA with a TP covalently linked to the DNA ends through a phosphoester bond between the OH group of a serine (e.g., ϕ29) or threonine (Cp-1) residue in the TP and the 5′-dAMP embedded within a short DNA inverted terminal repeat (ITR). The ITR length varies among phages of the ϕ29 Genus. The ϕ29 DNA replication system was reconstituted in vitro and will be described as the model for this group of phages (Salas, 1991; Salas et al., 2016). In contrast to the complexity of other reconstituted replication systems, efficient synthesis of the ϕ29 genome can be accomplished in vitro with only four phage proteins. They are the ϕ29 double-stranded DNA-binding protein (DBP) gp6, the single-stranded DNA binding protein (Ssb) gp5, the TP gp3, and the family B DNA polymerase (Pol) gp2. Gp2 couples DNA polymerization to strand displacement. This monomeric DNA polymerase also shows 3′→5′ exonucleolytic activity, enabling proofreading. Its helix-destabilizing activity makes strand displacement by an accessory DNA helicase unnecessary. Upon delivery of the ϕ29 DNA into the host cytoplasm, the parental TP covalently linked to each 5′-end of the genome associates with the bacterial nucleoid (Muñoz-Espin et al., 2010). The abundant replication mediator gp6 binds along the ϕ29 DNA to form a nucleoprotein complex on viral DNA. 
This complex unwinds the ends of the DNA helix to facilitate loading of the heterodimer formed between a free TP and gp2 at a TP-DNA end. This TP-DNA end acts as the origin of replication. The DNA polymerase then catalyzes the formation of a phosphoester bond between a dAMP and the hydroxyl group of a serine residue of the free TP, giving rise to the TP–dAMP initiation complex. The initiation reaction is directed by the second T at the 3′-end of the six-nucleotide-long ITR (3′-TTTCAT-5′) present at the extremity of the ϕ29 DNA molecule. Afterward, the TP–dAMP complex translocates one position backwards to recover the information corresponding to the first T on the template strand. The second T then serves again as template for the incorporation of the following nucleotide (sliding-back mechanism), followed by addition of five nucleotides. Any mis-incorporated nucleotide in the TP–dNMPs covalent complex is not a substrate for the 3′→5′ exonuclease proofreading activity of the ϕ29 DNA polymerase. Replication fidelity during the initiation reaction thus relies on the sliding-back mechanism in phages that initiate replication by protein priming. After initiation and sliding back, the ϕ29 DNA polymerase/primer TP heterodimer undergoes structural changes during incorporation of nucleotides 6–9 (transition stage). The DNA polymerase dissociates from the primer TP when nucleotide 10 is incorporated into the nascent DNA chain, shifting to the elongation mode. The polymerase dissociated from the TP-gp2 heterodimer catalyzes highly processive DNA synthesis coupled to strand displacement of the non-template strand, which is coated by gp5. The replication of both strands proceeds continuously from each terminal priming event (type I replicative intermediates). When the two replication forks moving in opposite directions meet, the type I replicative intermediate gives rise to two physically separated type II replicative intermediates. 
These molecules consist of full-length DNA in which a portion of the DNA, starting from one end, is dsDNA and the portion spanning to the other end is ssDNA. Once replication of both strands is accomplished, the two DNA polymerase molecules dissociate from the viral DNA to start initiation of replication on a new ϕ29 DNA molecule. The transcription machinery could be an obstacle to type I replicative intermediates, and head-on and co-directional collisions could stall the replisome, leading to reassembly of the replication machinery. The small-sized ϕ29-like viruses do not encode replication restart functions, suggesting that if replisome stalling occurs, re-initiation relies on host functions.
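The sliding-back step of protein-primed initiation described above can be sketched in a few lines. This is a simplified illustration, assuming a single one-position slide as described for ϕ29; the function name `protein_primed_strand` and its `slide_back` flag are hypothetical, and the model ignores the TP, the polymerase and all kinetics. It only tracks which template base directs each incorporated nucleotide.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def protein_primed_strand(template_3to5, slide_back=True):
    """Return the nascent strand (5'->3') copied from a template end.

    template_3to5 is the template strand read from its 3' end. Initiation
    is directed by the second template base (index 1).
    """
    first = COMPLEMENT[template_3to5[1]]
    nascent = [first]
    if slide_back:
        # Sliding back re-pairs the first nucleotide with template index 0,
        # which only works if the two terminal template bases are identical.
        if COMPLEMENT[template_3to5[0]] != first:
            raise ValueError("sliding back needs a repeated terminal base")
        next_pos = 1  # index 1 serves as template a second time
    else:
        next_pos = 2  # no sliding back: continue downstream of index 1
    nascent.extend(COMPLEMENT[base] for base in template_3to5[next_pos:])
    return "".join(nascent)


with_slide = protein_primed_strand("TTTCAT")
without_slide = protein_primed_strand("TTTCAT", slide_back=False)
print(with_slide, without_slide)
```

With the ϕ29 terminal sequence 3′-TTTCAT-5′ as input, the sliding-back path yields a six-nucleotide product, whereas the path without sliding back yields a product one nucleotide shorter, which illustrates how sliding back recovers the terminal template position.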


References
Ackermann, H.W., 2003. Bacteriophage observations and evolution. Research in Microbiology 154, 245–251.
Barry, J., Wong, M.L., Alberts, B., 2018. In vitro reconstitution of DNA replication initiated by genetic recombination: A T4 bacteriophage model for a type of DNA synthesis important for all cells. Molecular Biology of the Cell 30 (1), (mbc.E18-06-0386).
Benkovic, S.J., Spiering, M.M., 2017. Understanding DNA replication by the bacteriophage T4 replisome. Journal of Biological Chemistry 292, 18434–18442.
Casjens, S.R., 2011. The DNA-packaging nanomotor of tailed bacteriophages. Nature Reviews Microbiology 9, 647–657.
Kreuzer, K.N., 2000. Recombination-dependent DNA replication in phage T4. Trends in Biochemical Sciences 25, 165–173.
Llorente, B., Smith, C.E., Symington, L.S., 2008. Break-induced replication: What is it and what is it for? Cell Cycle 7, 859–864.
Lo Piano, A., Martinez-Jimenez, M.I., Zecchi, L., Ayora, S., 2011. Recombination-dependent concatemeric viral DNA replication. Virus Research 160, 1–14.
Masukata, H., Dasgupta, S., Tomizawa, J., 1987. Transcriptional activation of ColE1 DNA synthesis by displacement of the nontranscribed strand. Cell 51, 1123–1130.
Mosig, G., 1998. Recombination and recombination-dependent DNA replication in bacteriophage T4. Annual Review of Genetics 32, 379–413.
Nossal, N.G., 1983. Prokaryotic DNA replication systems. Annual Review of Biochemistry 52, 581–615.
Oliveira, L., Tavares, P., Alonso, J.C., 2013. Headful DNA packaging: Bacteriophage SPP1 as a model system. Virus Research 173, 247–259.
Rao, V.B., Feiss, M., 2015. Mechanisms of DNA packaging by large double-stranded DNA viruses. Annual Review of Virology 2, 351–378.
Salas, M., 1991. Protein-priming of DNA replication. Annual Review of Biochemistry 60, 39–71.
Salas, M., Holguera, I., Redrejo-Rodriguez, M., De Vega, M., 2016. DNA-binding proteins essential for protein-primed bacteriophage ϕ29 DNA replication. Frontiers in Molecular Biosciences 3, 37.
Seco, E.M., Ayora, S., 2017. Bacillus subtilis DNA polymerases, PolC and DnaE, are required for both leading and lagging strand synthesis in SPP1 origin-dependent DNA replication. Nucleic Acids Research 45, 8302–8313.
Seco, E.M., Zinder, J.C., Manhart, C.M., et al., 2013. Bacteriophage SPP1 DNA replication strategies promote viral and disable host replication in vitro. Nucleic Acids Research 41, 1711–1721.

Relevant Websites
http://millardlab.org/bioinformatics/bacteriophage-genomes/ Bacteriophage Genomes – millardlab.
http://genome.jouy.inra.fr/phagonaute/ Phagonaute.
http://biodev.extra.cea.fr/virfam/ VIRFAM – CEA/DSV.

Lytic Transcription
William McAllister, Rowan University School of Osteopathic Medicine, Stratford, NJ, United States
Deborah M Hinton, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, United States
Published by Elsevier Ltd.

Nomenclature

gp Gene product
IC Transcription initiation complex
NTD N-terminal domain of a protein
CTD C-terminal domain of a protein
EC Transcription elongation complex

Glossary
Lytic phage Phage that infect, take over the host, make progeny, and lyse the cell.
Non-template strand The DNA strand whose sequence is the same as that of the RNA; also called the coding strand.
Promoter DNA sequence that sets the start site for transcription initiation by RNA polymerase.
Template strand The DNA strand that is copied by RNA polymerase during transcription. Its sequence is the complement of the RNA sequence.

Introduction
Lytic phage, which infect, make progeny, and lyse the cell, need to rapidly take over host resources in order to reprogram the host for phage development. Regulation of phage and host gene expression through transcriptional control is the primary mechanism for this appropriation of the host machinery. To accomplish this, phage-encoded proteins can inhibit host gene expression, redirect the host RNA polymerase to phage genes, and/or function as a new RNA polymerase. Here we discuss mechanisms used by the model bacteriophages T4, T7, and N4, as well as phages within those families.

Transcription Initiation and Elongation by Host RNA Polymerase When a lytic phage enters the cell, it must first manage the host RNA polymerase (RNAP), by redirecting it to phage promoters and/or by inhibiting some or all of its host activity. Bacterial RNAPs belong to the family of multi-subunit polymerases that are highly conserved throughout biology. They consist of a multisubunit core that contains the RNA synthesizing activity and a specificity subunit, called σ factor, that recognizes the promoter sequence and selects the start site for transcription. In bacteria, the core is typically composed of 5 proteins: β, β′, two α′s, and ω. σ factors include a primary σ, such as σ70 in E. coli, which is used during exponential growth, and alternate σ′s, which are used under different growth conditions or times of stress. The σ70 family, the largest group of σ factors, shares regions of sequence and structural homology. For the primary σ′s, these are divided into Regions (R) 1, 2, 3, and 4, which can be further subdivided based on function (R1.1, R1.2, etc.). Structures of multi-subunit RNAPs, together with decades of biochemistry and genetics, have revealed molecular details about how bacterial RNAPs initiate, elongate, and terminate transcription. We use the well-studied E. coli RNAP as the example here (Figs. 1 and 2(A)). The β and β′ subunits of the core create a crab-claw like structure, with the active site for RNA synthesis at the center of the claw. The N-terminal domain (NTD) of one α (α1NTD) interacts with β, while α2NTD interacts with β′. Each αNTD is connected to its C-terminal domain (CTD) via a flexible linker (Fig. 2(A), top; αCTDs are not seen in the structure in Fig. 1). The small ω subunit interacts with β′. 
σ70 interacts extensively with core, primarily through a portion of R2 that interacts with a coiled-coil motif within β′, but in addition, σR1.1 lies within what will be the downstream DNA channel, σR3.2 follows the channel through which the synthesized RNA will extrude, and σR4 interacts with a structure within β, called the β-flap. The very C-terminus of σ70 (σCT) interacts with the β-flap tip (Fig. 2(A)). A competent transcription complex is formed when E. coli σ70-RNAP interacts stably with promoter DNA (Fig. 2(A)). Promoter recognition arises through interactions with double-stranded (ds) portions of the promoter: σR4.2 with the −35 element (consensus sequence −35TTGACA−30) and σR3.0/2.4 with the −15TGnT−12 motif, which includes the extended −10 sequence −15TG−14 and the first position of the −10 element (consensus −12TATAAT−7). The αCTDs can also interact with ds UP elements, AT-rich sequences between −40 and −60. However, not all of these elements are needed for recognition, and there are two main promoter classes: −10/−35 promoters (with matches to the −35 and −10 elements), and extended −10 promoters (with a good match to the −15TGnTATAAT−7 motif, but no recognizable −35 element). A single-stranded (ss) transcription bubble (from −11 to +3) is bound by σR2.3, which interacts with specific sequences within the nontemplate strand (consensus −11ATAAT−7), while the template strand is within the active site. Transcription initiation begins as ribonucleoside triphosphates enter the active site. The synthesized RNA exits through the RNA exit channel, a region that is filled by σR3.2 within the holoenzyme (Fig. 1). Thus, RNA

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20951-5


Fig. 1 Structure of E. coli RNAP [PDB 4IGC] modeled with promoter DNA. RNAP core subunits β (light gray-blue), β′ (dark blue), ω (magenta), α1NTD (teal), α2NTD (light cyan), the σ specificity subunit σ70 (yellow), and ds promoter DNA (template strand, red; non-template strand, orange) are shown. (The αCTDs are not observed in the structure). Features discussed in the text are indicated.

Fig. 2 Changes to E. coli RNAP as T4 infection proceeds. (A) Host RNAP used for T4 early transcription. (B) Host RNAP with T4 MotA and AsiA used for middle transcription. (C) Host RNAP with T4 gp33, gp55, and gp45 used for late transcription. In each case the schematic is shown above; for (A) and (B), structural models are shown below. RNAP core is designated in light gray, σ70 in yellow, MotA in green, AsiA in dark gray, gp33 in Carolina blue, gp55 in purple and gp45 as the gray triangle. In the bottom panels, the positions of σ70 R2, R3, and R4, αCTDs, σCT, MotA (NTD, linker, and CTD), AsiA, and the promoter elements (−10, −35) are shown.

extrusion destabilizes the interaction of σ with core, facilitating σ release. Typically, highly processive elongation continues until RNAP encounters an intrinsic termination signal (a stem-loop hairpin followed by a U-stretch) or a factor-dependent termination site, where the RNAP/DNA complex terminates and RNAP dissociates from the DNA.
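The two promoter classes recognized by σ70-RNAP, as described in this section, can be expressed as a simple consensus-matching rule. The sketch below is illustrative only: the function names, the input convention (a 6-nt −35 segment and a 9-nt segment spanning −15 to −7) and the threshold of four matches out of six are assumptions made here, not criteria given in the text.

```python
MINUS35 = "TTGACA"  # consensus -35 element (-35 to -30)
MINUS10 = "TATAAT"  # consensus -10 element (-12 to -7)


def count_matches(seq, consensus):
    """Number of positions at which seq agrees with the consensus."""
    return sum(a == b for a, b in zip(seq, consensus))


def classify_promoter(minus35, ext_minus10, min_matches=4):
    """Classify a promoter from its -35 hexamer and its -15..-7 segment.

    min_matches is an arbitrary illustrative cutoff, not a published value.
    """
    minus35, ext_minus10 = minus35.upper(), ext_minus10.upper()
    if len(minus35) != 6 or len(ext_minus10) != 9:
        raise ValueError("expected 6 nt (-35..-30) and 9 nt (-15..-7)")
    has_tg = ext_minus10[:2] == "TG"                # -15TG-14
    m10 = count_matches(ext_minus10[3:], MINUS10)   # -12..-7
    m35 = count_matches(minus35, MINUS35)
    if m10 >= min_matches and m35 >= min_matches:
        return "-10/-35"
    if m10 >= min_matches and has_tg:
        return "extended -10"
    return "unclassified"


print(classify_promoter("TTGACA", "CCATATAAT"))
print(classify_promoter("GCCAGG", "TGCTATAAT"))
```

The first call has consensus −35 and −10 elements and no TG motif; the second has a full TGnTATAAT motif but a poor −35 element. UP elements and spacer length, which also affect promoter strength, are deliberately left out of this sketch.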

Developmental Pathways and Transcription Control by T7, T4, and N4
T4
Bacteriophage T4, a member of the Myoviridae family, does not encode its own RNAP. Instead it encodes factors that change the specificity of the host polymerase as infection proceeds, allowing it to program a temporal pattern of phage development from


early to middle to late promoters. In general, early proteins are needed for host takeover and middle transcription; middle proteins are needed for replication, for factors involved in late transcription, and for additional host takeover functions; and late proteins make up the virion structure. However, the functions of dozens of T4 genes, especially early genes, are still unknown. Early promoters are active immediately after injection of the phage DNA. Host σ70-RNAP is diverted to these promoters in two major ways: (1) early T4 promoters, which have near ideal matches to the consensus −35 and extended −10 sequence motifs, compete exceptionally well with host promoters for the available RNAP, and (2) modification of the RNAP by the T4 Alt protein, which is present in the phage head and is injected with the DNA, further increases the activity of early promoters relative to those of the host. Alt ADP-ribosylates one of the α subunits at R265, a residue that is needed for αCTD recognition of the UP element promoter motif (Fig. 2(A)). As approximately 70% of transcription arises from the very strong host ribosomal promoters and UP element recognition contributes to ribosomal promoter strength, it is thought that this modification serves to immediately suppress a substantial amount of host transcription after T4 infection. It is not known which α is modified by Alt. T4 early transcription leads to the expression of phage early genes. Two of these encode MotA, a DNA-binding transcription activator, and AsiA, a transcription co-activator, which are required for T4 middle promoter activation through a process called σ appropriation (Fig. 2(B)). This process, which starts about 1 min after infection, redirects RNAP to T4 middle promoters, which contain the σ70 −10 element and a MotA box motif [consensus sequence 5′ (a/t)(a/t)(a/t)TGCTTtA 3′] centered at −30. AsiA first binds to R4 of free σ70 to form a complex in which this portion of σ is completely restructured. 
Consequently, upon binding to core, σ is unable to bind the β-flap or to interact with the −35 element of σ70-dependent promoters. (Curiously, the binding of AsiA does not appear to inhibit T4 early promoter activity.) Importantly, σCT, which would normally be held by the β-flap tip (Fig. 2(A)), is now available for interaction with its binding partner, MotANTD (Fig. 2(B)). The MotA box motif within the middle promoter is engaged by the MotACTD, which binds to the downstream portion of the MotA box in the major groove, while the MotAlinker (between the NTD and CTD) interacts with the upstream portion in the minor groove. The DNA binding motif of MotACTD represents a previously unrecognized way to interact with DNA: a double wing saddle structure in which both wings lie within the major groove. Overall, the binding of AsiA and MotA serves to switch promoter recognition from a standard −35/−10 promoter to the middle MotA box/−10 promoter. In addition to activation of middle promoters, T4 middle genes are also expressed by the extension of early transcription from early genes into downstream middle genes. Although this extension requires T4 protein synthesis, no specific T4 product, such as an anti-terminator, has been identified. Rather, it is thought that the process of translation itself promotes transcription elongation, stabilizes the extended RNA, or both. All cytosines within the T4 genome are modified by the addition of a 5-hydroxymethylated, α- or β-glucosylated moiety, whose presence protects the phage DNA from E. coli and T4 nucleases. Although this modification is not required for T4 transcription, it does provide significant advantages for phage gene expression. In the case of σ appropriation, modeling of the modification within the MotAlinker-CTD/DNA crystal structure indicates that the glucosyl sugar will make significant contact with the MotACTD saddle. 
Binding analyses demonstrate that the presence of the 5-hydroxymethyl, glucosyl moiety increases MotA affinity for the MotA box by >100-fold. The T4 Alc protein also uses this modification to distinguish between host and phage DNA. Alc, a T4 early protein of 16.7 kDa, specifically promotes transcription termination on unmodified DNA. This function is thought to arise through an interaction of Alc with a nonconserved region of the β subunit of E. coli RNAP (Dispensable Region 1) near a residue (R368) that lies close to the RNAP active site (Fig. 1). Interestingly, Alc function requires rapidly elongating RNAP, consistent with the optimal conditions for T4 infection in rapidly growing cells. The activation of middle promoters also results in middle products that severely disrupt host transcription. First, T4 nucleases, encoded by middle genes denA and denB, specifically degrade unmodified DNA, resulting in an ~50% decrease in the host genome and effectively destroying the ability of the host to express its DNA. In addition, the T4 Mod protein completes the ADP-ribosylation of RNAP by modifying the other α subunit (again at residue 265). This modification then prevents any interaction of the αCTDs with UP elements, presumably further inhibiting host promoter activity. About 5 min after infection at 37°C, T4 late promoters become active (Fig. 2(C)). These promoters are recognized by E. coli RNAP core together with two T4 middle products that together constitute a new σ factor: gp55 and gp33. Gp55, a distant member of the σ70 family, retains an R2 motif that interacts with the T4 late −10 promoter element, −13TATAATA−6. Gp33 bears no sequence resemblance to σ factors, but nevertheless interacts with the β-flap of core, just like σ70 R4. However, gp33 does not interact with the DNA, and late promoters do not have an upstream element like the −35 element of σ70-dependent promoters. 
Thus, both middle and late T4 transcription involve employing a ‘new’ σ factor, either by reconfiguring σ70 with MotA and AsiA (middle) or by replacing σ70 with gp55/gp33 (late). Besides gp55/gp33, late promoter activation is also dependent on DNA replication. This connection stems from the requirement for gp45, a replication protein called the ‘sliding clamp’, which can interact with T4 DNA polymerase, gp33, or gp55. The structure of gp45 is a triangular donut, which encircles the dsDNA (Fig. 2(C)). Gp45 is loaded onto the DNA by the T4 clamp loader complex (gp44/gp62) and then moves along the DNA. During replication the interaction between gp45 and DNA polymerase ‘clamps’ the replication machinery onto the DNA, ensuring processivity. However, the interaction between gp45 and gp33/gp55 allows RNAP to also move along the DNA, scanning for a late promoter. Consequently, gp45 behaves as a transcriptional enhancer that lacks a specific enhancer binding motif. The connection between late transcription and DNA replication allows the phage to produce an amount of virion assembly proteins that is consistent with the level of replicated phage DNA.
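The MotA box consensus quoted above, 5′ (a/t)(a/t)(a/t)TGCTTtA 3′, lends itself to a simple pattern search. The following is a minimal sketch using Python's standard `re` module. The function name `find_mota_boxes` and the example sequence are hypothetical, and treating the lowercase `t` of the consensus as "any base" is an assumption made here for illustration. The sketch does not check for the accompanying σ70 −10 element or the box position relative to the start site.

```python
import re

# (a/t)(a/t)(a/t)TGCTTtA, with the weakly conserved lowercase "t" position
# relaxed to any base (an assumption of this sketch). The lookahead lets
# overlapping candidate boxes be reported as well.
MOTA_BOX = re.compile(r"(?=([AT]{3}TGCTT[ACGT]A))")


def find_mota_boxes(seq):
    """Return (0-based start, matched sequence) for each candidate MotA box."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in MOTA_BOX.finditer(seq)]


# Hypothetical test sequence containing one box.
hits = find_mota_boxes("ggcatttgctttaccg")
print(hits)
```

A fuller scan of a real genome would additionally require a −10 element at the appropriate distance downstream, since the text defines a middle promoter by both motifs together.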


Besides gp33/gp55, other T4 middle products, RpbA and DsbA, bind to RNAP during late transcription. In vitro, the presence of RpbA together with ADP-ribosylation of the αCTDs improves late promoter activity while diminishing middle promoter function. However, RpbA is not essential, and whether this is its function in vivo is not known. DsbA is an ~10 kDa protein that binds weakly upstream of some late promoters. Despite an early report that the presence of DsbA enhances transcription at these promoters, it is now reported that it has no effect on late transcription in vitro. However, the dsbA ORF overlaps the 3′ end of gene 33, suggesting that translational coupling of these two genes could influence late transcription in vivo.

T7
Bacteriophage T7 is the prototype of the Podoviridae. The hallmark of this group is that most members encode an RNAP that is responsible for transcription of most of the phage DNA. The 40 kb T7 genome encodes Class I, Class II, and Class III genes (Fig. 3(A)). Class I genes are responsible for synthesis of the phage RNA polymerase and proteins required to protect the DNA from degradation (among other functions). Class II genes are largely involved in DNA replication and recombination. Class III genes encode proteins required for virion assembly, packaging of phage DNA, and lysis of the cell. All transcription proceeds unidirectionally, and only one strand of the DNA is transcribed. Initially, only the first ~850 bp of DNA is injected into the cell; the remainder is transported by transcription, first by the host RNAP and subsequently by the phage RNAP.

Early transcription
Transcription of the Class I genes by the host RNAP is initiated near the left end of the genome from a cluster of three strong promoters that have evolved to be optimal under different intracellular conditions (Fig. 3(A)). Termination occurs at the end of the early region at TE, which is similar to intrinsic terminators for E. coli RNAP and encodes an RNA with a potential stem-loop structure followed by a run of Us. The polycistronic early region transcript is processed by RNase III into individual mRNAs. Inhibition of transcription by the host RNAP involves the actions of gp0.7, a protein kinase that phosphorylates β′ (as well as other host proteins), resulting in increased factor-dependent transcription termination, and of gp2, which binds to σ70 R1.1, preventing its movement from the downstream DNA channel (Fig. 1) and thereby blocking initiation. In addition, the gp3 and gp6 nucleases efficiently degrade the host genome; over 80% of phage DNA is derived from host DNA.

Phage RNAP and late gene transcription
T7 RNAP is related structurally and by sequence homology to other members of the Pol I family of nucleotide polymerases, which include DNA Pol I, reverse transcriptase, and mitochondrial RNA polymerase. A catalytic core domain is conserved within this class, while other regions are specific to the transcription functions of the enzyme. The preservation of the core among these varied polymerases, which utilize different templates and substrates, is remarkable; the relationship to mitochondrial RNAP is thought to reflect the origin of this organelle as an early prokaryotic symbiont of eukaryotic cells. The T7 RNAP consists of 883 amino acids (Figs. 3(B) and 4(A)). The ‘cupped right hand’ fingers, thumb and palm subdomains, containing the catalytic core, are within the conserved C-terminal region, while the N-terminal domain is involved in promoter recognition, elongation, and termination. In addition to its role in transcription, the phage RNAP is involved in other essential functions, including transport of the phage DNA into the cell, priming of DNA replication, and maturation and packaging of the DNA. Promoters for the phage RNAP are related to a consensus sequence that extends from −17 to +6, where initiation is at +1 (Fig. 3(C)). Whereas the class II promoters differ from the consensus at a number of positions, class III promoters preserve this sequence. Control of late transcription involves T7 lysozyme (Lys, gp3.5), a dual function protein that includes an amidase that is involved in cell lysis (in its C-terminal portion) and a separable regulatory function that is involved in modulating activity of the RNAP during initiation and pausing (in the N-terminal portion). The mechanism of action of Lys is unclear. 
Binding of Lys to the RNAP alters the configuration of the C-terminus of the T7 RNAP, which is in close proximity to the active site, and increases Km for the initiating NTPs, suggesting that it may exert its effect by preferentially inhibiting initiation at the weaker class II promoters.

The transcription cycle T7 RNAP goes through the same steps in the transcription cycle as the multi-subunit RNAPs. These include promoter binding and initiation, promoter clearance, elongation, and termination. Extensive genetic, biochemical, and structural analyses have provided insight, in exquisite detail, into how these steps occur. Promoter binding involves interaction of the upstream region of a promoter with elements in the N-terminal domain of the RNAP (e.g., the AT-rich recognition loop and the intercalating loop (β-IH)) and with the “specificity loop” (a beta hairpin that projects from the fingers subdomain in the C-terminal core). Meanwhile, the downstream region of the promoter, which includes the start site for initiation (+1), interacts with elements in the catalytic core domain. Binding of the polymerase is accompanied by bending of the promoter that results in strand separation (“melting”) of the promoter from −4 to +3, exposing the template strand bases in the initiation region and allowing initiation and the incorporation of the first few nt into the nascent transcript. The initiation complex (IC) can only accommodate up to three base pairs of the growing RNA:DNA hybrid. As the hybrid is extended from 3 to 7 bp, steric clash with the N-terminal domain results in a rotation of this domain as a rigid body by about 40 degrees, without promoter release. The resulting strain in the complex can lead to release of abortive initiation products before the transition to a stable elongation complex (EC) occurs. Notably, as the length of the transcript approaches ~8 nt (the length of the hybrid in the EC), the 5′ end of the RNA comes into contact with the specificity loop, which is involved in upstream promoter contacts. This is reminiscent of the situation with E. coli RNAP in which the lengthening transcript displaces σR3.2 (see above), suggesting that such interactions may be a common feature of promoter release. Beyond this point (after synthesis of 9–12 nt), release of promoter contacts occurs, and is accompanied by major conformational changes that result in the formation of an RNA exit tunnel, and a stable EC (Fig. 3(B)).

Fig. 3 T7 transcription. (A) Transcription map of T7. All transcription is from left to right. Locations of promoters and the direction of transcription for host (green) and phage (red, blue) RNAPs are indicated by arrows. Termination/pause signals are denoted by vertical bars. The primary origin of DNA replication (ori) is indicated. The genetic and physical maps are at the bottom; not all genes are identified in this figure. (B) Structural changes that occur during the transition from IC (PDB 1QLN) to an EC (PDB 1MSW). The two structures are oriented similarly with regard to subdomains in the catalytic C-terminal core: palm (red), fingers (blue), and thumb (green). The N-terminal domain is in gray. Elements involved in upstream promoter recognition include the specificity loop (purple), the AT-rich recognition loop (cyan), and β-IH, the intercalation loop (yellow). (C) Consensus promoter sequence for T7 RNAP. Key elements of the RNAP that interact with the binding and initiation regions of the promoter are indicated. The region that is melted open during promoter binding is indicated by displaced template (bottom) and non-template (top) strands of the DNA. The initiation site is at +1.

Fig. 4 Comparison of the structures of the single-subunit RNAPs encoded by T7 and N4 complexed with DNA. (A) T7 RNAP (PDB 1CEZ). (B) N4 mini-vRNAP (PDB 3C3L). (C) N4 RNAP II (PDB 6DT7). In each case, the right ‘cupped hand’ composed of fingers (dark blue), palm (red), and thumb (green), the AT-recognition loop (cyan), the specificity loop (purple), and the β-IH (“intercalation loop”, yellow) are shown. DNA is shown in orange with position −1 indicated as a sphere.

Elongation Once the complex has isomerized into the EC, it is highly stable and will transcribe many kb of template without dissociation until it encounters a pause or a termination signal. Each cycle of nucleotide addition is accompanied by opening and closing of the active site in the catalytic core, and translocation of the EC along the template. Recognition of the next incoming nucleotide occurs in the “open” state by the formation of a nascent base-pair with the exposed template strand base. Closing of the active site places the incoming nucleotide and template base in a position that is suitable for catalysis. Formation of the phosphodiester bond results in the release of PPi, which destabilizes the closed complex, triggering a return to the open complex and translocation of the EC to the next position along the template.

Termination T7 RNAP pauses or terminates at two types of signals. The termination signal Tϕ, located downstream of gene 10, resembles intrinsic terminators for E. coli RNAP such as TE, but has a longer stem-loop structure and a longer run of U residues downstream. Termination at this site is not complete, and read through transcription is required for expression of genes 11 and 12. The mechanism of termination at Tϕ is not clear, but may involve steric clash of the stem-loop structure with the RNA exit pore, and/or induced slippage of the transcript at the run of U residues, resulting in misalignment in the active site. A second type of signal is present in the junction of replicating T7 DNA concatemers (CJ). This signal does not encode an RNA with secondary structure but involves recognition of a specific sequence (TATCTGTT) and results in pausing rather than termination. The efficiency and duration of the pause is dramatically increased in the presence of T7 Lys, and the halted complex is thought to recruit other phage proteins that are involved in processing and packaging of the DNA. The mechanism of pausing is not clear, but is likely to involve recognition and binding of the signal in the nontemplate strand. This feature may be a common mechanism for pausing/termination, as E. coli RNAP has also been found to recognize pause sites in this strand.
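The sequence-specific pause signal described above lends itself to a simple illustration. The following is a minimal sketch, assuming only that the signal is read as the literal motif TATCTGTT in the non-template strand, as quoted in the text; the function names and the example sequence are hypothetical and are not taken from the T7 genome.

```python
# Minimal sketch: locate the CJ-type pause motif on the non-template strand.
# The motif is the one quoted in the text; the example sequence is made up.
PAUSE_MOTIF = "TATCTGTT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_pause_sites(nontemplate: str, motif: str = PAUSE_MOTIF) -> list[int]:
    """Return 0-based start positions of the motif in the non-template strand."""
    seq = nontemplate.upper()
    width = len(motif)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == motif]


example_nontemplate = "GGCATATCTGTTACGT"  # hypothetical sequence
sites_forward = find_pause_sites(example_nontemplate)
sites_opposite = find_pause_sites(reverse_complement(example_nontemplate))
```

Scanning the opposite strand of the same duplex finds nothing here, which mirrors the strand-specific recognition proposed in the text. An exact-match scan is a deliberate simplification; it says nothing about the efficiency or duration of the pause.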

N4 The lytic Podoviridae phage N4 employs the host multi-subunit RNAP and two phage-encoded RNAPs, both of which are distantly related to the T7 family of single-subunit RNAPs. Expression of early N4 genes depends upon a virion-encapsulated RNAP (vRNAP) that is injected into the host upon infection. This injection results in the entry of the first 500 bp of the left end of the linear N4 DNA. The DNA is then constrained/supercoiled by the action of host gyrase to generate a template that extrudes hairpins on the template and non-template strands at three vRNAP promoter sequences. Binding of the E. coli single-stranded DNA binding protein (Eco SSB) generates a single-stranded (ss) non-template strand and stabilizes the template strand hairpin, which is recognized through the interaction of vRNAP with bases −11, −10, −8, and +1. Thus, Eco SSB has been described as an ‘architectural transcription factor’. Once transcription initiates, Eco SSB also binds the exiting transcript. vRNAP lacks a domain that, in T7 RNAP, binds the RNA and separates it from the template. Consequently, it is thought that the Eco SSB/transcript interaction serves to replace this function.

Despite the large size of vRNAP (3500 aa), only about one third of the protein (1106 aa) is required for transcription. Structural analyses indicate that this mini-vRNAP shares common structural features of the ‘cupped right hand’ single-subunit RNAPs (Fig. 4(B)). This structural similarity exists despite the near absence of sequence homology. In both vRNAP and T7 RNAP (Fig. 4(A)), the DNA is recognized similarly: a specificity loop interacts with the major groove of the ds DNA (the hairpin in the case of vRNAP promoters), the β-IH (intercalating loop) recognizes the ds/ss DNA junction present at the upstream edge of the transcription bubble, and the upstream recognition element recognizes position −11 in the N4 promoter or the upstream AT-rich region of a T7 promoter (Figs. 3(C) and 4). Once the N4 early genes are transcribed, the remaining portion of the N4 genome is injected and middle gene transcription commences. Middle transcription requires a second N4 RNAP, RNAP II, encoded by genes 15 and 16. Like vRNAP, RNAP II is a minimal, T7-like RNAP (Fig. 4(C)). It bears much structural and some sequence homology to the T7 protein. RNAP II associates tightly with the cytoplasmic membrane, and transcription requires either denatured DNA or a membrane-bound DNA containing the N4 ssDNA binding protein, gp2. While either substrate is active, only the gp2/membrane/DNA substrate results in specific transcription. It has been proposed that another N4 protein, perhaps gp1, denatures the promoter region, which then allows gp2 to bind and recruit RNAP II. In a perverse twist, N4 late transcription, which generates the RNAs needed for the production of morphological proteins, requires the host RNAP. Besides host σ70-RNAP, late transcription also requires a middle gene product, the N4 SSB protein. Although N4 SSB is required for N4 replication, a distinct region of the protein at the C-terminus interacts with the β′ subunit of RNAP. Thus, as in T4 late transcription, a part of the N4 late transcriptional machinery is directly connected to DNA replication.

Inhibition of Host RNAP by T4, T7, and N4, and Other Members of Those Families As detailed above for T4 and T7, phages produce products to eliminate or down-regulate transcription by the host RNAP. Recent work has identified novel phage inhibitors. Xp10, a Siphoviridae phage of the plant pathogen Xanthomonas oryzae, encodes a protein (p7), which both inhibits recognition of host −10/−35 promoters and functions as an antiterminator at intrinsic (factor-independent) termination sites. This results in efficient transcription from the phage extended −10 promoters while host transcription is inhibited. The primary binding site of p7 is the N-terminus of the β′ subunit (Fig. 1), but the β-flap domain and the ω subunit have also been implicated in transcription inhibition and anti-termination, respectively. The Thermus thermophilus phage P23–45 produces two inhibitors of T. thermophilus RNAP, gp76 (an early gene product) and gp39 (a middle gene product). Gp76 interacts with the downstream DNA channel of RNAP (Fig. 1) and the path that would be occupied by the upstream portion of the ss template strand (−11 to −4) within the transcription bubble. Gp39 binds to the β-flap and σR4, thus interfering with the σR4/β-flap interaction needed for −10/−35 promoter activity. This facilitates phage promoter activity since P23–45 middle and late promoters belong to the extended −10 class. Finally, gp67 of G1, a Myoviridae phage that infects Staphylococcus aureus, inhibits the host by yet another mechanism. In this case, gp67 binds to σR4, but this binding does not interfere with the ability of σR4 to recognize the −35 element. Instead, this binding allows gp67 to interfere with αCTD binding to UP elements between −40 and −60. Since the strong ribosomal promoters rely on αCTD/UP element interaction and ribosomal transcription accounts for the majority of overall bacterial transcription, this inhibition leads to a major down-regulation of transcription in the host.

Summary Despite their relatively small genomes, lytic phages employ highly sophisticated and varied strategies to maximize transcription of their own DNA while limiting host gene transcription. Understanding these processes at a molecular level has deepened our understanding of transcription mechanisms used by both multi-subunit and single subunit (Pol I-like) RNAPs. In particular, phages are providing a rich resource for studies of RNAP inhibitors. Interest in the detailed mechanisms of these inhibitors has surged in recent years as researchers search for ways to combat emerging and reemerging bacterial pathogens. It is likely that a vast array of other mechanisms will be discovered as new phage functions are identified.

Lysogeny Keith E Shearwin and Jia Q Truong, The University of Adelaide, Adelaide, SA, Australia © 2021 Elsevier Ltd. All rights reserved.

Glossary Lysogen A bacterial cell carrying one or more bacteriophage genomes, either integrated into the host chromosome or existing as independently replicating extrachromosomal elements. A true lysogen is able to exit lysogeny and re-enter lytic development. Lysogenic conversion The situation where a bacterial host acquires a new trait as a direct result of the expression of a gene or genes encoded by a prophage. Prophage The lysogenic form of a bacteriophage. Prophage induction The process of a bacteriophage exiting lysogeny and entering lytic development, resulting in lysis of the host cell.

Temperate bacteriophage A bacteriophage capable of entering either lytic development or lysogeny upon host cell infection. Terminally redundant DNA DNA that contains repeated sequences at each end called terminal repeats. These ends are used to join the ends of the linear DNA to form circular DNA. Transcriptional interference The suppressive influence of one transcriptional process, directly and in cis on a second transcriptional process.

Introduction Bacteriophages (phages) are obligate bacterial parasites. Many phages exhibit a purely lytic lifecycle, where following infection of a susceptible host, the phage DNA is replicated and the phage hijacks the bacterium’s cellular machinery to produce new virion particles, which are released upon lysis of the host cell. Other phages, the so-called temperate phages, are able to make a developmental ‘decision’ between two developmental regimes, the lytic and lysogenic cycles. In lysogeny, the phage persists indefinitely inside the bacterial host as a prophage, with the phage genome either integrated into the bacterial chromosome or existing extrachromosomally (Fig. 1). Lysogeny is a non-bacteriocidal state where the bacteriophage genome replicates without virion production. In this state, most of the phage genome is transcriptionally inactive, as the expression of lytic functions would be lethal to the host cell. A bacterial host cell lysogenic for a particular phage is also immune to further infection by the same phage. Filamentous phage, which have ssDNA genomes packaged into filament-like virions, are able to replicate without killing the host. Among the filamentous phage, some integrate in the host chromosome, while non-integrative filamentous phage replicate exclusively as extrachromosomal elements or episomes. Both classes of filamentous phages continually shed viral particles without host cell death, even while inserted into the bacterial genome as a prophage. Thus, bacteria infected permanently with filamentous phage represent a form of lysogeny, but don’t meet the definition of true temperate phage, in that there is no stage which brings about host cell lysis. In this article, we discuss specific examples of how true temperate phage establish and maintain lysogeny. The phenotypic effects of lysogeny on the host bacterium and the evolutionary impacts of lysogeny are also discussed.

Why Lysogeny? For a phage to successfully propagate, it must coax its host into manufacturing new virions. These new virions are released through host cell lysis, and need to infect another host bacterium in order for the phage to continue to propagate. From the phage perspective, during times of abundant resources and the presence of many host cells, lytic development allows for large numbers of phages to be produced, to subsequently infect other host cells and thus continue the cycle. However, when resources and/or susceptible host cells are scarce, the phage can benefit from entering lysogeny until conditions improve. The process of making a choice between the lytic and lysogenic pathways following host cell infection, is covered in more detail in another article (Golding). While there must be some (small) fitness cost in replicating extra genomic material, many phages have evolved such that lysogeny can provide a selective advantage to the host cell in a number of ways. Such benefits may include carrying genes which confer some growth advantage to the host, or by providing protection from subsequent infection, and potential lysis, by phage from the same or different families.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20963-1

Fig. 1 Lytic and lysogenic developmental cycles of a temperate bacteriophage. A bacteriophage infects a bacterial host by injecting its DNA (red) into the host cell. In the lytic lifecycle, phages produce new virions, lysing the cell to release them. The lysogenic lifecycle has the phage DNA integrated into the bacterial chromosome or persisting extra-chromosomally. This state is stable, with prophage DNA maintained in daughter cells over subsequent generations. Most temperate phage can efficiently exit lysogeny to reenter the lytic phase in response to specific environmental cues.

Persistence of DNA Integration Into the Chromosome The long term persistence of phage DNA inside their bacterial host cell is the key feature of a lysogen. Phage have evolved many mechanisms to achieve this long term persistence. In many cases, such as the paradigm bacteriophage λ, the phage DNA is integrated into the bacterial chromosome via site-specific recombination (Fig. 2). The phage-encoded integrase protein is a site-specific recombinase that recognizes attachment (att) sites on both the phage genome (attP) and the bacterial chromosome (attB). The integrase catalyses recombination between the attP and attB sites. This process generates two new att sites, attL and attR, that flank the phage genome. When the phage exits lysogeny, the excision machinery, consisting of at least the integrase and a phage-encoded recombination directionality factor (RDF), and often with additional host factors, recognizes the attL and attR sequences to catalyse the reverse reaction to excise the prophage genome. In general, integrase proteins belong to one of two families, the serine integrases or the tyrosine integrases. These proteins use a catalytic serine or tyrosine, respectively, to carry out the recombination reaction. Serine integrases create double-stranded breaks in both the phage and bacterial DNA, and mediate strand rearrangement to bring them into the recombinant configuration, followed by strand ligation. Tyrosine recombinases, on the other hand, mediate recombination via a Holliday junction intermediate. Single-stranded cuts are made in the DNA and the cut strands are rejoined to strands from the recombination partner. Both types of integrases have been used extensively for biotechnology and molecular biology applications. Some temperate phage integrate their DNA less specifically into the bacterial genome. Upon infection of a host cell, phage Mu initially integrates at random into the host chromosome. Only after integration is a decision on lytic or lysogenic development made. If the lytic pathway is followed, the phage genome undergoes successive rounds of replicative transposition. In each round, Mu DNA is duplicated, and a range of host genome rearrangements (deletions, inversions, translocations) are introduced. 
Viral genomes, along with variably sized segments of the host genome, are packaged directly from the integrated copies dispersed around the scrambled host genome. Similar to other temperate phages, the Mu lysogenic pathway (termed the latent pathway in Mu) is followed if the Repc repressor protein accumulates to a sufficiently high concentration. Repc competes for overlapping operators with DDE-recombinase A, which binds to an internal activation sequence (IAS), to initiate the lytic pathway. In Mu, the initial integration process occurs by random transposition (Fig. 3). This process utilizes transposition machinery consisting of phage-encoded and host-supplied components to cleave the Mu DNA and insert it into target sites in the host genome with little or no target specificity. When Mu injects its DNA into a new host cell, its DNA is linear and flanked by DNA that was acquired from the previous host from which it was assembled. The phage supplied N protein, injected alongside DNA, noncovalently circularizes the DNA. To integrate Mu DNA into the chromosome, the phage DNA is nicked by MuA transposase to generate two 3′ hydroxyl (-OH) ends on both strands at the Mu-Host DNA junctions. The 3′-OH ends attack two phosphodiester bonds spaced 5 bps apart, which join the 3′-OH to the 5′ phosphate of the host DNA at the attack site. The flanking segments of DNA from the previous host are removed and the DNA gap is repaired. A natural consequence of phage DNA integration into the host chromosome is that host DNA replication will also replicate the prophage DNA. Thus, all daughter cells of a lysogen will inherit the prophage. This form of lysogeny can be extremely stable, with very low rates of ‘spontaneous’ induction.

Fig. 2 Prophage DNA can be maintained by site-specific integration into the bacterial chromosome. This reaction is mediated by a phage-encoded integrase protein (Int). Initially upon injection into the host, the phage DNA is linear. Circularization of the viral DNA occurs through annealing of complementary ends, followed by ligation. Integrases recognize the attachment sites attP on the viral DNA (dark green) and attB on the host chromosome (light green) to catalyse the site-specific recombination. This generates two new att sites, attL and attR, which flank the viral DNA in the chromosome. The attachment sites have a conserved core, shown in black. To return to the lytic cycle, the viral DNA must be excised from the chromosome through a reversal of the site-specific recombination between attL and attR, mediated by the excision machinery, which includes the Int and excisionase (or recombination directionality factor, RDF) proteins.
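The bookkeeping of site-specific integration and excision can be sketched with plain strings. This is a minimal illustration under stated assumptions, not a model of the integrase chemistry: the core sequence, the flanking arms and the function names are all hypothetical, the chromosome is written as a linear string, and the crossover is reduced to a cut-and-join inside the shared core. Lower-case letters mark the arms and upper-case letters mark the core.

```python
# Minimal string model of attP x attB recombination (all sequences hypothetical).
CORE = "TTTATAC"  # placeholder for the conserved core shared by attP and attB


def integrate(chromosome: str, phage_circle: str, core: str = CORE) -> str:
    """Insert a circular phage genome at the chromosomal core (attB).

    The result carries two core copies: attL (bacterial arm, core, phage arm)
    and attR (phage arm, core, bacterial arm) flanking the prophage.
    """
    b = chromosome.index(core)
    p = phage_circle.index(core)
    after_core = phage_circle[p + len(core):]  # phage DNA following its core
    before_core = phage_circle[:p]             # phage DNA preceding its core
    return (chromosome[:b] + core + after_core
            + before_core + core + chromosome[b + len(core):])


def excise(lysogen: str, core: str = CORE) -> tuple[str, str]:
    """Reverse the reaction: recombine attL with attR.

    Returns the restored chromosome and the phage circle, written starting
    at its core.
    """
    first = lysogen.index(core)
    second = lysogen.index(core, first + len(core))
    prophage_body = lysogen[first + len(core):second]
    chromosome = lysogen[:first] + core + lysogen[second + len(core):]
    return chromosome, core + prophage_body


host = "gcgcgc" + CORE + "cgcgcg"    # attB with bacterial arms
phage = "aggagg" + CORE + "tcctcc"   # attP with phage arms (circular)
lysogen = integrate(host, phage)
restored_host, excised_phage = excise(lysogen)
```

In the lysogen string, the first core copy sits between a bacterial arm and a phage arm (attL) and the second between a phage arm and a bacterial arm (attR), as in Fig. 2. The excised circle is a rotation of the input phage string because a circle has no fixed start.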

Extrachromosomal Persistence of DNA Some prophages can also maintain their DNA separate from the host chromosome as a plasmid, either circular or linear. A well-characterized example of a circular extrachromosomal lysogen is bacteriophage P1. Infectious particles of P1 consist of a large (93 kb) linear double-stranded genome with terminal redundancy of 10–15 kb. Upon being injected into the P1 host E. coli, the ends of the DNA undergo homologous recombination to circularize into an autonomous plasmid. In lysogeny, the prophage/plasmid copy number is maintained to be the same as the number of bacterial chromosomes. This is achieved through the expression of P1 proteins involved in origin-specific initiation and partitioning. The phage also uses a toxin-antitoxin system to kill daughter cells that have failed to receive a P1 molecule upon cell division. The antitoxin is degraded rapidly and needs to be continually expressed to counteract the toxin’s activity. Loss of the P1 molecule in daughter cells results in a rapid depletion of antitoxin to a level where the more stable toxin will kill the host cell. Some prophages, such as E. coli bacteriophage N15, maintain their DNA as a linear plasmid. As with linear eukaryotic chromosomes, this presents a problem, as DNA polymerase fails to replicate all the way to the ends of the DNA. Bacteriophage N15 belongs to a group of phages that express a protelomerase to create covalently closed hairpins. After entering the cell, N15 DNA circularizes through its cohesive termini. The phage-encoded protelomerase cuts an inverted repeat sequence in the phage genome and ligates the phosphodiester bonds of each strand to form two closed hairpin ends (Fig. 4). The replication of such a linear plasmid also presents mechanistic challenges and requires prophage-encoded proteins. In the case of bacteriophage N15, the phage repA protein is necessary for replication. This multi-domain protein contains domains that resemble prokaryotic primases and helicases required for DNA replication.
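The logic of the P1 toxin-antitoxin system described above (a labile antitoxin and a stable toxin, both no longer synthesized once the plasmid is lost) can be illustrated with two exponential decays. This is a minimal sketch: the starting amounts, the half-lives and the rule that killing begins once toxin exceeds antitoxin are hypothetical assumptions chosen for illustration, not measured values for P1.

```python
# Minimal sketch of post-segregational killing after plasmid loss.
# All numbers are hypothetical; time is in minutes.
from typing import Optional


def level(initial: float, half_life: float, t: float) -> float:
    """Amount remaining after time t with no new synthesis."""
    return initial * 0.5 ** (t / half_life)


def time_to_killing(toxin0: float, antitoxin0: float,
                    toxin_half_life: float, antitoxin_half_life: float,
                    step: float = 1.0, t_max: float = 600.0) -> Optional[float]:
    """First sampled time at which toxin exceeds antitoxin, or None if it never does."""
    t = 0.0
    while t <= t_max:
        toxin = level(toxin0, toxin_half_life, t)
        antitoxin = level(antitoxin0, antitoxin_half_life, t)
        if toxin > antitoxin:
            return t
        t += step
    return None


# Plasmid-free daughter cell: antitoxin starts in 4-fold excess but decays faster.
killed_at = time_to_killing(toxin0=100.0, antitoxin0=400.0,
                            toxin_half_life=120.0, antitoxin_half_life=15.0)

# Control: if the antitoxin were as stable as the toxin, the excess would persist.
never_killed = time_to_killing(toxin0=100.0, antitoxin0=400.0,
                               toxin_half_life=120.0, antitoxin_half_life=120.0)
```

With these assumed values the toxin overtakes the antitoxin a little over half an hour after plasmid loss, whereas equal stabilities give no crossover at all. The difference in stability, rather than the absolute amounts, is what makes the system selective for plasmid-free cells.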

Fig. 3 Bacteriophage Mu integrates into the E. coli chromosome via random transposition. Linear Mu DNA is non-covalently circularized via the N protein. The MuA transposase generates single-stranded nicks at the junctions of the Mu (blue) and prior host DNA (red) sequences, revealing two 3′-OH ends. These ends attack a target site on the E. coli chromosome to join the 3′-OH of Mu DNA to the 5′ phosphate of the host DNA. Removal of the prior host DNA and subsequent repair of the DNA gap at the host-Mu DNA junction via limited replication results in successful integration of Mu into the chromosome.
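The geometry of the Mu strand-transfer step can be sketched on a single-strand string representation. Because the two 3′-OH ends attack target bonds staggered by 5 bp, filling in the resulting gaps leaves a short direct repeat of target sequence on either side of the insertion. The sketch below assumes this simple picture; the sequences, the placeholder for the Mu genome and the function names are hypothetical, and the random choice of site stands in for the low target specificity described in the text.

```python
# Minimal sketch of transpositional insertion with a 5-bp staggered target cut.
import random


def insert_with_stagger(target: str, element: str, site: int, stagger: int = 5) -> str:
    """Insert `element` at `site`; the staggered cut plus gap repair
    duplicates the `stagger` bases of target that lie between the two cuts."""
    if site < 0 or site + stagger > len(target):
        raise ValueError("site must leave room for the staggered cut")
    left = target[:site]
    between_cuts = target[site:site + stagger]
    right = target[site + stagger:]
    return left + between_cuts + element + between_cuts + right


def random_site(target: str, rng: random.Random, stagger: int = 5) -> int:
    """Pick an insertion site uniformly, reflecting little or no target specificity."""
    return rng.randrange(0, len(target) - stagger + 1)


host_dna = "acgtacGATTCtgcatg"  # hypothetical target; capitals mark the 5 bp between cuts
mu_dna = "[Mu]"                 # placeholder for the Mu genome
fixed_example = insert_with_stagger(host_dna, mu_dna, site=6)

rng = random.Random(0)          # seeded only to make the sketch repeatable
random_example = insert_with_stagger(host_dna, mu_dna, random_site(host_dna, rng))
```

In `fixed_example` the five capitalized bases appear on both sides of the placeholder, which is the signature expected from the staggered attack followed by gap repair.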

Maintenance of the Lysogenic State Temperate phages often contain natural examples of genetic switches which enable the developmental choice between the lytic or lysogenic lifecycles. The biochemical basis of this decision-making process will be covered in detail in a separate chapter of the encyclopedia. After entering lysogeny, true lysogens are able to exit back into the lytic cycle, either spontaneously or in response to environmental cues. One strategy to generate such a bistable decision making circuit is through mutually repressing transcription factors. The most well characterized bacteriophage employing this strategy is bacteriophage λ.

Fig. 4 A prophage of bacteriophage N15 maintains its DNA as a linear plasmid with closed hairpin loops. Upon injecting its DNA into E. coli, the phage initially circularizes through the annealing of complementary ends. A phage-encoded protelomerase binds and cleaves an inverted repeat sequence (telRL) within the genome. The phosphodiester backbones of the single strands are then ligated together to form two hairpin loops, creating one covalently closed DNA molecule. Linear plasmid replication requires a phage-encoded protein, repA.

Bacteriophage λ The bistable regulatory circuit in λ is generated by the CI and Cro repressors, which block each other’s transcription. To maintain lysogeny, the promoters that drive lytic functions must be repressed, whilst maintaining production of the CI repressor. In general, the repressor in temperate phages responsible for turning off lytic functions is termed the immunity repressor, as it prevents secondary infections by phage of the same family, by blocking the lytic functions of incoming phage. λ has a lysogenic promoter, PRM, positioned back to back with the adjacent lytic promoter PR, with a second lytic promoter, PL, located 2.3 kb away (Fig. 5). The transcriptional cascade in λ is such that repression of the early lytic promoters PL and PR will stop all of the lytic genes from being expressed, pushing the phage into lysogeny. This repression is achieved by the λ CI protein, which binds to the OR and OL operators that overlap the PR and PL promoters, respectively. In the lysogenic state, CI simultaneously represses transcription from PL and PR, while activating its own expression from PRM. The λ CI protein is a two-domain protein consisting of an N-terminal DNA binding domain connected by a cleavable linker to an oligomerization domain. The repressor homodimerizes and these dimers can bind to individual operators. However, repressor binding is cooperative such that pairs of dimers bound to adjacent operators (primarily OR1-OR2 and OL1-OL2) interact to improve overall binding affinity. In this state, PR and PL are repressed, whilst PRM is activated by a contact between the CI dimer bound at OR2 and the σ subunit of RNA polymerase. In this manner, λ CI produces positive feedback to increase its own expression. A further level of cooperativity also occurs, when a pair of dimers occupying two operators at OR interact with a pair of dimers bound at OL to form a CI octamer, looping out the intervening ~2.3 kb of DNA (Fig. 5). Octamer formation in turn positions the OL3 and OR3 operators to allow λ CI dimers bound at each of these sites to interact and repress PRM at high levels of λ CI. This negative autoregulation is thought to limit the repressor levels to allow for efficient prophage induction back into the lytic cycle (Fig. 5(D)). Cro is the second transcriptional regulator within the λ genetic switch. The cro gene is the first of the early genes expressed from PR following phage infection, and is required to enforce prophage induction, the point at which the phage switches from lysogeny towards lysis. Upon activation of the host SOS response, the CI repressor is degraded by a mechanism in which activated RecA stimulates an intrinsic but normally weak CI self-cleavage activity. Thus the lytic promoters are at least partially derepressed and Cro is made. Cro binds as a dimer to the same OL and OR operators as CI, but with different affinities, binding most strongly to OR3 within OR. This preferential and non-cooperative binding to OR3 dampens CI expression from the PRM promoter, while leaving the PL and PR promoters active. Thus, Cro repression of PRM is required to prevent synthesis of new CI during prophage induction, which would otherwise re-establish the prophage state.

Fig. 5 Regulation of the λ lytic and lysogenic promoters by λ CI involves cooperative interactions at multiple levels. A. The arrangement of early lytic and lysogenic promoters in bacteriophage λ. The regulation of these promoters is crucial in the lysis-lysogeny decision. Promoters are depicted as bent arrows. RNA transcripts are depicted as arrows. Early lytic transcripts from PL and PR are shown in yellow. OL and OR operators are depicted as red bars on the DNA. B. λ CI (depicted as red dumbbells) binds to operators at OL and OR. λ CI dimers at adjacent operators can further interact to form tetramers, mediated through the CI C-terminal domain. Binding of λ CI blocks transcription from PR and PL, by blocking RNA polymerase (RNAP) binding. Contacts between the N-terminal domain of CI at OR2 and the sigma subunit of RNAP activate the PRM promoter to drive expression of λ CI and other lysogenic functions. The lysogenic transcript from PRM is depicted in green. C. The repression of PR and PL is further enhanced by interactions between CI tetramers at OR and OL, assisted by DNA looping. D. Formation of the CI octamer allows CI dimers bound at OL3 and OR3 to interact cooperatively, to repress PRM and downregulate CI’s own expression.
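The bistability that arises from mutual repression can be illustrated with a generic two-repressor toggle model. This is a deliberately reduced sketch and not a quantitative model of the λ switch: it ignores operator-level cooperativity, DNA looping and PRM autoregulation, and the synthesis rate, Hill coefficient, cleavage rate and function name are hypothetical. CI cleavage during the SOS response is represented simply as an extra first-order loss term for CI.

```python
# Minimal toggle-switch sketch of mutual repression between CI and Cro.
# Dimensionless units; all parameter values are hypothetical.


def simulate(ci: float, cro: float, cleavage: float = 0.0,
             synthesis: float = 4.0, hill: int = 2,
             dt: float = 0.01, steps: int = 5000) -> tuple[float, float]:
    """Integrate the two-repressor model with forward Euler and return (CI, Cro)."""
    for _ in range(steps):
        # Each repressor is made at a rate reduced by the other repressor
        # and is lost by first-order decay/dilution.
        d_ci = synthesis / (1.0 + cro ** hill) - (1.0 + cleavage) * ci
        d_cro = synthesis / (1.0 + ci ** hill) - cro
        ci += dt * d_ci
        cro += dt * d_cro
    return ci, cro


lysogen_state = simulate(ci=3.0, cro=0.1)                   # settles with CI high
induced_state = simulate(*lysogen_state, cleavage=5.0)      # CI loss increased, Cro takes over
after_signal_state = simulate(*induced_state, cleavage=0.0)  # signal removed, Cro stays high
```

With these assumed parameters the same equations support two stable states, one dominated by CI and one by Cro. A transient increase in CI loss flips the system to the Cro-dominated state, and it remains there after the signal is withdrawn, which is the qualitative behaviour expected of a committed switch.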

Bacteriophage 186

Bacteriophage 186 is a temperate bacteriophage of the P2-related family, a group of phages evolutionarily distinct from the lambdoid phages. 186 and lambda infect the same host (E. coli) and have very similar temperate lifecycles, yet the mechanisms by which these two phages enter and maintain lysogeny are quite different. In 186, the early lytic promoter pR and the lysogenic promoter pL are in a convergent arrangement (Fig. 6), rather than divergent as is the case for the λ PR and PRM promoters. A consequence of this convergent layout is that RNA polymerases either bound at, or elongating from, the switch promoters can potentially collide with each other. In the absence of the immunity repressor (CI), transcription from the stronger lytic pR promoter represses transcription from the weaker lysogenic promoter pL by this mechanism of transcriptional interference (TI), favouring the lytic lifecycle. Conversely, in the presence of CI, pR is efficiently repressed, relieving transcriptional interference, and providing a form of positive feedback on CI expression.
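The relief of transcriptional interference can be illustrated with a toy calculation. This is a sketch under stated assumptions: the saturating form of the interference term, the parameter names and all numerical values are hypothetical and are not taken from measurements on phage 186.

```python
def pl_output(pl_intrinsic: float, pr_strength: float,
              ci_repression: float, ti_sensitivity: float = 0.9) -> float:
    """Relative pL activity under interference from the convergent pR.

    pl_intrinsic   -- pL activity in the absence of any interference
    pr_strength    -- unrepressed pR firing rate (arbitrary units)
    ci_repression  -- fraction of pR firing blocked by CI, from 0 to 1
    ti_sensitivity -- maximal fractional loss of pL activity (assumed)
    """
    if not 0.0 <= ci_repression <= 1.0:
        raise ValueError("ci_repression must lie between 0 and 1")
    pr_active = pr_strength * (1.0 - ci_repression)
    # Assumed saturating dependence: more pR firing means more collisions
    # with polymerases at or leaving pL, up to a ceiling.
    interference = ti_sensitivity * pr_active / (1.0 + pr_active)
    return pl_intrinsic * (1.0 - interference)


without_ci = pl_output(1.0, pr_strength=10.0, ci_repression=0.0)
with_ci = pl_output(1.0, pr_strength=10.0, ci_repression=1.0)
print(f"pL without CI: {without_ci:.2f}, pL with CI: {with_ci:.2f}")
```

Because CI raises pL output only by silencing pR, the more CI is present the more pL is relieved, which is the positive feedback on CI expression described above. The separate repression of pL by CI binding at its weak operator is not included in this sketch.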


Fig. 6 Relief of transcriptional interference is used to activate transcription of lysogenic promoters in bacteriophage 186. A. The early lytic promoter pR and the lysogenic promoter pL are arranged face-to-face. CI is the 186 immunity repressor. Apl acts both as a repressor of the pR and pL promoters and as the excisionase for phage 186. The 186 CI repressor binding sites are drawn as circles: there are three strong sites (red) over the pR promoter, a weak operator located over pL (pink) and two distant sites (FL and FR), located ~300 bp from pR. B. Model of transcriptional activation of 186 pL by 186 CI. CI is shown as a wheel-like structure where DNA wraps around the circumference of the wheel. CI operators are shown as circles. The weak operator at pL is sometimes wrapped onto the wheel (left), thus repressing pL, in an equilibrium with a form (middle) where the pL operator is not bound, allowing pL to be active. The flanking sites FL and FR can compete with pL for binding to the CI wheel (right), fine-tuning the response of the system to CI levels.

The 186 CI protein is the immunity repressor responsible for maintenance of lysogeny. Similarly to λ CI, it contains an N-terminal DNA binding domain and a C-terminal oligomerization domain and binds to its operators as a dimer. Crystal structures suggest 186 CI forms a wheel consisting of a heptamer of dimers. The repressor’s DNA binding domains face outwards from this wheel, allowing DNA to wrap around the protein (Fig. 6(B)) to regulate transcription of pR and pL. The CI wheel binds cooperatively to the three strong operators to repress pR and relieve transcriptional interference on pL. Binding of the weaker CI operator at pL to the CI wheel is in an equilibrium between wrapped (repressed) and unwrapped (active) forms. There is further fine-tuning of pL promoter activity by the two flanking CI operator sites (FL and FR, located ~300 bp away), which can loop to the wheel and displace pL from the wheel. When the CI operator at pL is displaced by either of the stronger flanking site operators, pL is transcriptionally active, maintaining CI levels. Interestingly, in phage 186, integrase is also expressed on the lysogenic transcript, such that integrase is continually present in a lysogen. When prophage induction is triggered, excision of the phage genome requires only expression of the excisionase (Apl), the first gene of the lytic transcript. Thus, the phage 186 CI repressor maintains an active lysogenic promoter by relieving transcriptional interference on pL, to drive its own expression and to maintain a stable, self-correcting lysogenic level of CI. The lytic pR promoter is tightly repressed in this state, leading to low levels of spontaneous induction.

Integration-Dependent Bacteriophage Immunity

A class of temperate mycobacteriophages has been described which uses site-specific integration as the key decision point of the genetic switch. In these phages, the immunity repressor and integrase are expressed from the same promoter, Prep. To the right of the repressor gene, under the control of a separate divergent promoter, PR, is a Cro-like protein. Both the immunity repressor and the Cro-like protein are predicted to have a DNA-binding helix-turn-helix (HTH) motif. Crucially, the bacteriophage attachment site attP is located within the open reading frame of the immunity repressor (Fig. 7). Unlike bacteriophages λ and 186, the promoter driving the immunity repressor function is not regulated by the immunity repressor. Rather, the key step needed to establish lysogeny is the integration of the phage DNA into the host chromosome. Integration of the phage genome through recombination of attP and attB sites results in a truncated immunity repressor without a short destabilizing C-terminal tag. This truncated repressor supplies sufficient repression activity to establish lysogeny. The tag, thought to target the viral repressor for degradation by host proteases, reduces its effective activity below the level required for lysogeny. The determining factor of the phage’s developmental fate is the stability of the integrase protein. If conditions favor high integrase activity, such as high multiplicity of infection (MOI) or when host protease levels are low, integration of the phage genome is favored. Integration subsequently establishes and maintains lysogeny via immunity repressor stabilization.
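The consequence of placing attP inside the repressor open reading frame can be sketched with genomes represented as ordered lists of labels. This is a schematic illustration only: the segment names (`rep_core`, `rep_tag`), the gene order and the helper functions are hypothetical, and gene orientation is ignored.

```python
def integrate(phage_circle: list, host: list) -> list:
    """Insert a circular phage genome into the host at attB.

    The circle is opened at attP, so segments that flanked attP end up
    at opposite ends of the prophage.
    """
    p = phage_circle.index("attP")
    b = host.index("attB")
    prophage = ["attL"] + phage_circle[p + 1:] + phage_circle[:p] + ["attR"]
    return host[:b] + prophage + host[b + 1:]


def rep_is_full_length(genome: list) -> bool:
    """True if the repressor core is still joined to its C-terminal tag."""
    return any(genome[i:i + 3] == ["rep_core", "attP", "rep_tag"]
               for i in range(len(genome) - 2))


# Hypothetical layout: attP splits rep into a core and a tag-encoding 3' end.
phage = ["int", "rep_core", "attP", "rep_tag", "cro"]
host = ["hostA", "attB", "hostB"]
lysogen = integrate(phage, host)

print(lysogen)
print("free phage encodes tagged repressor:", rep_is_full_length(phage))
print("prophage encodes tagged repressor:  ", rep_is_full_length(lysogen))
```

In the integrated form the tag-encoding segment is no longer contiguous with the repressor core, which corresponds to the truncated, more stable repressor described in the text.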

Use of a Host Protein as an Immunity Repressor

In most phages, a phage-encoded immunity repressor binds to promoters that drive lytic functions to repress them. GIL01, a phage that infects Bacillus thuringiensis, is an interesting example of a phage that does not encode an immunity repressor at all. It instead uses the B. thuringiensis host LexA repressor as its immunity repressor.


Fig. 7 The organization of integrase (int), immunity repressor (rep) and cro-like protein (cro) genes in integration-dependent immunity systems. Promoters are shown as bent arrows. Site-specific attachment sites are shown as green boxes. The attP attachment site lies within the rep open reading frame. Site-specific recombination of attP with attB results in a truncated repressor gene within the bacterial chromosome. This truncated gene encodes a more stable immunity repressor, driving the phage towards lysogeny.

It has been shown that the DNA binding activity of LexA in B. thuringiensis is regulated by GIL01 phage encoded proteins. When complexed with specific phage proteins, LexA represses lytic promoters by binding to a conserved LexA site within the promoter region. If these additional phage proteins are not expressed, LexA binding alone is too weak to repress the lytic promoter and GIL01 is unable to form stable lysogens.

RNA to Maintain Lysogeny

The mechanisms of maintenance of lysogeny that have been described thus far have been based on proteins, responsible for mediating either repression or integration. However, there are also examples of phages that use RNA-based mechanisms to maintain a stable lysogenic state. Lactobacillus casei phage A2 has a similar genetic arrangement to that of the OR region of lambda, with back-to-back promoters driving expression of CI and Cro repressors. However, in phage A2, the CI and Cro repressors have very similar affinities for their operators, leading to the expectation that the much stronger lytic pR promoter should dominate, leading to expression of Cro, and favoring the lytic cycle. Nevertheless, A2 is able to effectively establish and maintain stable lysogens. It has recently been shown that the second gene of the A2 lytic transcript encodes an RNA binding protein (gp25) that is able to specifically bind to a region between the ribosome binding site and the cro start codon on the lytic transcript. RNA binding by gp25 thus prevents translation of Cro, allowing establishment and maintenance of stable lysogens.

P4 is a so-called satellite phage that relies on an unrelated helper phage, such as P2 or 186, to supply the structural genes needed for its propagation. P4 has the ability to redirect the genetic network of the helper phage in order to redirect the capsid assembly process for its own purposes. The P4 replicon can exist as a plasmid or can be stably integrated into the host chromosome as a lysogen. The integrated lysogenic form is maintained by an RNA-based mechanism. In a P4 lysogen, transcription from a constitutive early promoter generates a ~300 nucleotide transcript that is processed to form the CI RNA. The CI RNA interacts with two specific regions in the untranslated leader sequence of the nascent transcript to mediate transcriptional termination, prevent expression of the P4 replication functions and thus maintain lysogeny.

Prophage Induction

Established lysogens are generally stable with low probabilities of reverting back to the lytic state. Certain environmental stresses can stimulate the prophage to revert to the lytic state. This process is called prophage induction. The immunity repressor must be degraded or otherwise inactivated during prophage induction in order to establish lytic gene expression.


Many temperate phages are inducible by DNA damage, such as through mitomycin C treatment or UV irradiation. Such DNA damage results in activation of the host SOS response, a global cascade of gene expression centered on DNA repair. As explained previously, the stress response genes are under the control of the host LexA protein. LexA contains a labile linker region that can be cleaved by a C-terminal protease domain within LexA itself. In the SOS response, the activation of the co-protease RecA, as a result of the accumulation of single-stranded DNA, promotes the auto-proteolysis of the host LexA repressor. Phage GIL01, which relies on LexA to directly repress lytic genes, will undergo prophage induction simply due to the removal of LexA. Some phage repressors, such as the λ CI repressor, have structural similarity to LexA, and undergo similar RecA-stimulated self-cleavage. Inactivation of CI allows the expression of lytic genes, including the genes for excision and replication. Other phages, such as bacteriophage 186, express specific anti-repressor proteins, which complex with their cognate repressor to relieve repression of the lytic genes. For example, bacteriophage 186 expresses the anti-repressor Tum, which is in turn under the control of a LexA-repressible promoter. Upon activation of the SOS response, RecA-mediated proteolysis of LexA results in expression of the Tum anti-repressor, which inactivates 186 CI, to drive prophage induction. The related coliphage P2 does not code for an anti-repressor, and does not induce in response to DNA damage. However, it has a somewhat higher level of spontaneous induction than 186, which must be sufficient for it to sustain a viable number of free phage. The satellite phage P4, which uses an RNA-based transcriptional termination mechanism for maintaining lysogeny, is induced by transcription from an alternative promoter activated by a protein from the helper phage. Translation of nested genes within this longer transcript leads to suppression of the CI RNA-mediated termination, and expression of the P4 genes needed for replication.

Evolutionary and Phenotypical Effects of Lysogeny

It is estimated that phages outnumber bacteria in the biosphere by at least an order of magnitude. Thus, it is unsurprising that phages have a large effect on the evolution of their hosts and play a significant role in shaping bacterial communities in almost all ecosystems. Here, we discuss how lysogeny has altered the phenotype and evolution of its hosts.

Lysogenic Conversion

Lysogenic conversion describes the situation where a bacterial host acquires a new trait as a direct result of the expression of a gene encoded by a prophage. Clinically relevant examples include the acquisition of virulence factors by bacterial pathogens. Many of the toxins contributing to virulence in bacteria such as Staphylococcus aureus, Shigella spp., Salmonella enterica and pathogenic E. coli have been acquired through lysogenic conversion. E. coli O157:H7, a causative agent of food poisoning, harbors Sp5 and Sp15, two Shiga toxin-expressing prophages. Vibrio cholerae, some strains of which cause Asiatic cholera, acquired the cholera toxin from its CTXϕ prophage. Besides toxins, other phage-encoded virulence factors include adhesion factors, superantigens and factors that aid immune system invasion. Various examples of phage-encoded virulence factors are summarized in Table 1. In many cases, the virulence genes are arranged as a cassette flanked by a σ70 promoter on one side and a transcription terminator on the other. Such a configuration reduces interference with the prophage genes adjacent to the cassette. It has been postulated that virulence genes confer some fitness advantage to their host, either offensive (e.g., toxin production, invasins), defensive (e.g., detoxification) or survival (such as improved nutrient uptake). Very recently, a type of temperate filamentous bacteriophage that infects and integrates into Pseudomonas aeruginosa (Pa) was found to be associated with chronic human wound infections. In mice, phage-infected Pseudomonas aeruginosa led to more severe and longer-lasting wounds compared to wounds colonized by Pa alone. The authors showed that uptake of phage-infected Pseudomonas aeruginosa by cells of the immune system resulted in phage RNA production, inappropriate antiviral immune responses and suppression of bacterial clearance.

Gene Disruption

Prophage integration into the host chromosome can alter bacterial phenotype by disruption of an open reading frame. The integration of phage may also disrupt sequences with regulatory roles within an intergenic region, altering how downstream genes are regulated. Prominent loci for attB phage integration sites are within tRNA open reading frames. The loss of a functional tRNA could potentially lead to a reduction in host fitness, so many phages have evolved such that integration will reconstitute a functional tRNA gene. This holds true for many other attB sites within important genes. There are cases where prophage integration does not reconstitute a functional gene and leads to loss of protein function. For example, the integration of L54a and ϕ13 phages into the S. aureus chromosome leads to a loss of a functional lipase and β-toxin, respectively.

Genomic Rearrangement

Bacteria harboring multiple prophages with similar DNA sequences can undergo homologous recombination between them, causing large genomic rearrangements. The locations of rearrangements in the chromosome are often correlated with the loci of prophages. For example, from a sequence alignment, it was observed that the genomic differences between an American and a Japanese


Table 1 Non-exhaustive list of phage-encoded virulence factors

Protein                      Phage                 Bacterial host   Gene

Extracellular toxins
Diphtheria toxin             β-Phage               C. diphtheriae   tox
Neurotoxin                   Phage C1              C. botulinum     C1
Shiga toxins                 H-19B                 E. coli          stx1, stx2
Enterohaemolysin             ϕFC3208               E. coli          hly2
Cytotoxin                    ϕCTX                  P. aeruginosa    ctx
Enterotoxin                  NA                    S. aureus        see, sel
Enterotoxin P                ϕN315                 S. aureus        sep
Enterotoxin A                ϕ13                   S. aureus        entA
Enterotoxin A                ϕMu50A                S. aureus        sea
Exfoliative toxin A          ϕETA                  S. aureus        eta
Toxin type A                 T12                   S. pyogenes      speA
Toxin type C                 CS112                 S. pyogenes      speC
Cholera toxin                CTXϕ                  V. cholerae      ctxAB
Enterotoxin                  CTXϕ                  V. cholerae      ace, zot
Leukocidin                   ϕPVL                  S. aureus        pvl
Superantigens                8232.1                S. pyogenes      speA1, speA3, speC, speI, speH, speM, speL, speK, ssa
Cytolethal distending toxin  Unnamed               E. coli          cdt

Proteins altering antigenicity
Membrane proteins            Pnm1                  N. meningitidis  Mu-like
Glucosylation                ε34                   S. enterica      rfb
Glucosylation                P22                   S. enterica      gtr
O-antigen acetylase          Sf6                   S. flexneri      oac
Glucosyl transferase         SfII, SfV, SfX        S. flexneri      gtrII

Effector proteins involved in invasion
Type III effector            SopEϕ                 S. enterica      sopE
Type III effector            GIFSY-2               S. enterica      sseI (gtgB)
Type III effector            GIFSY-3               S. enterica      sspH1

Enzymes
Superoxide dismutase         Sp4, 10               E. coli O157     sodC
Superoxide dismutase         GIFSY-2               S. enterica      sodC-I
Superoxide dismutase         Fels-1                S. enterica      sodC-III
Neuraminidase                Fels-1                S. enterica      nanH
Hyaluronidase                H4489A                S. pyogenes      hylP
Leukocidin                   ϕPVL                  S. aureus        pvl
Staphylokinase               ϕ13                   S. aureus        sak
Phospholipase                315.4                 S. pyogenes      sla
DNase/streptodornase         315.6, 8232.5         S. pyogenes      sdn, sda

Serum resistance
OMP                          λ                     E. coli          bor
OMP                          λ-like                E. coli          eib

Adhesions for bacterial host attachment
Vir                          MAV1                  M. arthritidis   vir
Phage coat proteins          SM1                   S. mitis         pblA, pblB

Others
Mitogenic factors            370.1, 370.3, 315.3   S. pyogenes      mf2, mf3, mf4
Mitogenic factor             Unnamed               P. multocida     toxA
Mitogenic factor             phisc 1               S. canis         Unnamed
Virulence                    GIFSY-2               S. enterica      gtgE
Antivirulence                GIFSY-2, Fels-1       S. enterica      grvA

Note: Table adapted and extended from Brüssow, H., Canchaya, C., Hardt, W., 2004. Phages and the evolution of bacterial pathogens: From genomic rearrangements to lysogenic conversion. Microbiology and Molecular Biology Reviews 68 (3), 560–602.


S. pyogenes M3 isolate could be largely explained by two sequential inversions. One of these inversions occurred between the lysis modules of two prophages. Genomic rearrangements may also aid phage evolution. In the example of S. pyogenes, the lysogenic conversion genes were outside of the inversion region. The inversion event shuffled virulence genes between the prophages. This could lead to new prophages with altered host specificity and behaviors.
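How an inversion between two lysis modules shuffles virulence genes that lie outside the inverted region can be seen in a toy example. The chromosome below is a hypothetical ordered list of labels, not the S. pyogenes M3 map, and gene orientation is ignored for simplicity.

```python
def invert_between(chromosome: list, left_site: str, right_site: str) -> list:
    """Invert the segment lying between two recombining sites.

    The two sites themselves stay in place; everything between them is
    reversed, as in an inversion by recombination between inverted repeats.
    """
    i = chromosome.index(left_site)
    j = chromosome.index(right_site)
    if i > j:
        i, j = j, i
    return chromosome[:i + 1] + chromosome[i + 1:j][::-1] + chromosome[j:]


# Two hypothetical prophages with homologous lysis modules; the virulence
# (lysogenic conversion) genes lie outside the region between them.
before = ["vir1", "lysis1", "body1", "hostX", "hostY", "body2", "lysis2", "vir2"]
after = invert_between(before, "lysis1", "lysis2")

print("before:", before)
print("after: ", after)
```

After the inversion, `vir1` sits next to the body of what was prophage 2 and `vir2` next to the body of prophage 1, so each virulence gene is now associated with a different set of phage functions.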

Conclusion

Lysogeny can change the phenotype, fitness and evolution of the bacterial host cell. In the case of bacterial pathogens, prophages allow the acquisition of new traits such as virulence factors, which in turn may improve the host’s fitness. This is emphasized by the observation that prophages have played a vital role in the emergence of several clinically relevant human pathogens, such as the food-borne pathogen E. coli O157:H7 and cholera-causing V. cholerae. Prophages also drive bacterial evolution through the introduction of new genes, the disruption of existing genes and the mediation of large genomic rearrangements within the host. In the laboratory, temperate bacteriophages have provided numerous tools and reagents for molecular biology, and have been used as models of regulatory networks and biological switches. Studies of bacteriophages λ, P2 and its relatives, Mu, P1 and N15, amongst others, have provided many seminal contributions to these fields. This article has touched on several of these mechanisms and simultaneously highlighted the diversity in means for achieving the persistence of DNA, maintaining the prophage in a quiescent state and exiting lysogeny. As the number of bacterial genomes sequenced continues to grow, new prophages using novel mechanisms to achieve these lysogenic functions will no doubt be discovered.

Acknowledgments

The Shearwin lab is supported by grants from the Australian Research Council (DP160101450) and the National Health and Medical Research Council (APP1100653).

Further Reading

Brüssow, H., Canchaya, C., Hardt, W., 2004. Phages and the evolution of bacterial pathogens: From genomic rearrangements to lysogenic conversion. Microbiology and Molecular Biology Reviews 68 (3), 560–602.
Carrasco, B., et al., 2016. Modulation of Lactobacillus casei bacteriophage A2 lytic/lysogenic cycles by binding of Gp25 to the early lytic mRNA. Molecular Microbiology 99 (2), 328–337.
Chandra, B., Ramisetty, M., Sudhakari, P.A., 2019. Bacterial ‘grounded’ prophages: Hotspots for genetic renovation and innovation. Frontiers in Genetics 10 (February), 1–17.
Christie, G.E., Calendar, R., 2016. Bacteriophage P2. Bacteriophage 7081 (November), e1145782.
Christie, G.E., Dokland, T., 2012. Pirates of the Caudovirales. Virology 434 (2), 210–221.
Dodd, I.B., Shearwin, K.E., Egan, J.B., 2005. Revisited gene regulation in bacteriophage lambda. Current Opinion in Genetics and Development 15 (2), 145–152.
Golding, I., 2016. Single-cell studies of phage λ: Hidden treasures under Occam’s rug. Annual Review of Virology 3, 7.1–7.20.
Harshey, R.M., 2014. Transposable phage Mu. Microbiology Spectrum 2 (5), 1–22.
Łobocka, M.B., Rose, D.J., Plunkett, G., et al., 2004. Genome of bacteriophage P1. Journal of Bacteriology 186 (21), 7032–7068.
Mai-Prochnow, A., Gee, J., Hui, K., et al., 2015. Big things in small packages: The genetics of filamentous phage and effects on fitness of their host. FEMS Microbiology Reviews 39 (February), 465–487.
Merrick, C.A., Zhao, J., Rosser, S.J., 2018. Serine integrases: Advancing synthetic biology. ACS Synthetic Biology 7, 299–310.
Ptashne, M., 2004. A Genetic Switch – Phage Lambda Revisited. New York: Cold Spring Harbor Laboratory Press.
Ravin, N.V., 2015. Replication and maintenance of linear phage-plasmid N15. Microbiology Spectrum 3 (1), 1–12.
Schubert, R.A., Dodd, I.B., Egan, J.B., Shearwin, K.E., 2007. Cro’s role in the CI-Cro bistable switch is critical for lambda’s transition from lysogeny to lytic development. Genes and Development 21 (19), 2461–2472.
Shao, Q., Trinh, J.T., Zeng, L., 2019. High-resolution studies of lysis-lysogeny decision-making in bacteriophage lambda. Journal of Biological Chemistry 294 (10), 3343–3349.

Relevant Websites

https://ecocyc.org/ EcoCyc.
https://viralzone.expasy.org/ ViralZone root – ExPASy.

Decision Making by Temperate Phages

Ido Golding, University of Illinois at Urbana-Champaign, Urbana, IL, United States
Seth Coleman, Rice University, Houston, TX, United States
Thu VP Nguyen and Tianyou Yao, Baylor College of Medicine, Houston, TX, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Induction: A process by which a temperate phage switches from the lysogenic state to the lytic pathway, typically in response to a perturbation such as damage to the host cell’s DNA.
Lysogenic pathway: One of the two possible pathways following temperate phage infection, characterized by suppression of viral replication and lytic functions. The bacterial cell carrying the dormant phage (e.g., in the form of a prophage) is termed a lysogen.
Lytic pathway: The other possible pathway following temperate phage infection, in which phages replicate rapidly and eventually kill (lyse) the host cell, releasing progeny viruses.
Multiplicity of infection (MOI): The number of viruses co-infecting the same cell.
Prophage: A phage in the lysogenic state, which has integrated into the host genome. Replication of the viral DNA occurs passively during replication of the bacterial chromosome.
Temperate phage: A bacteriophage that is capable of the dormant (lysogenic) pathway, in addition to the active (lytic) one.

Introduction

The defining lifestyle of most bacteriophages, analogous to the behavior of higher viruses, is one in which infection is followed by rampant viral replication and the release of numerous mature progeny, often accompanied by death of the host cell (lysis). However, a subset of phages, denoted as temperate, are capable of an alternative lifestyle called lysogeny, where the violent cataclysm is replaced by viral dormancy: following infection, the phage genome is maintained inside the bacterial host, either integrated (as a prophage) into the bacterial chromosome or replicating extra-chromosomally, with all virulent functions shut off. The dormant phage, now an integral part of the bacterial cell, is inherited from generation to generation. However, while dormant, the phage typically maintains the potential for lytic induction: a switch back to the virulent pathway in response to specific signals indicating stress to the host cells. Multiple aspects of the lysogenic lifestyle are the subject of current studies. These include the physiological costs and benefits, to both phage and host, of viral dormancy, as well as the ecological, medical, and evolutionary consequences of lysogeny. Here we focus on a single element: the decision between lysis and lysogeny, made by temperate phages upon infection of the host. In particular, we will devote most of our attention to the decision by bacteriophage lambda, which infects Escherichia coli (Fig. 1). Through more than half a century of genetic, biochemical, and biophysical studies, lambda has become arguably the best-characterized biological system, albeit one that still presents many open questions. As we review the lambda lysis/lysogeny decision, we will also provide examples for some of the ways that other temperate phages make this choice. In light of the overwhelming diversity of phage lifestyles, it is certain that many more ways in which phages pursue their lysis/lysogeny decisions remain to be discovered.
Owing to the relative compactness and tractability of bacteriophages, many of their functions have been studied not only for their intrinsic value, but also as simplified models for the behavior of higher biological systems. This is also the case for the lysis/lysogeny decision, which serves as a paradigm for the way genetic circuits, receiving external inputs, make binary cell-fate choices. In that role, for example, the decision by phage lambda has provided insights about the transition in and out of dormancy by HIV, and, beyond viruses, regarding the process of cellular differentiation during metazoan development. At the heart of lambda’s ability to inform us on higher organisms are common features of binary decision circuits across biological systems, notably, the utilization of auto-regulating, fate-determining genes to provide high cell-state stability while simultaneously allowing efficient fate switching in response to external signals. As part of its paradigmatic role, lambda has also served as a testbed for the idea of creating a quantitative description, formulated in mathematical terms, for the function of the genetic circuit, with the goal of predicting the decision outcome for given initial conditions. While efforts in this direction have yielded significant progress, they have also been limited by the phenomenon of cellular individuality, whereby genetically identical cells, in a uniform environment, nevertheless end up pursuing different paths. To explain the indeterminacy of single-cell choice, researchers invoked the effect of stochastic biochemical fluctuations (“noise”) on the decision circuit. With this concept, too, lambda has paved the way for the elucidation of cellular stochasticity and its consequences across the fields of microbiology, development, ecology and medicine.
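The idea that stochastic biochemical fluctuations make genetically identical cells differ can be illustrated with a standard Gillespie simulation of a single protein that is produced and degraded. This is a generic sketch rather than a model of the lambda circuit: the rates, the simulated time and the number of cells are arbitrary assumptions.

```python
import random


def simulate_protein_count(k_make: float, k_decay: float,
                           t_end: float, rng: random.Random) -> int:
    """Gillespie simulation of a birth-death process; returns the copy
    number of the protein at time t_end, starting from zero molecules."""
    t, n = 0.0, 0
    while True:
        total_rate = k_make + k_decay * n
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_end:
            return n
        if rng.random() < k_make / total_rate:
            n += 1  # synthesis event
        else:
            n -= 1  # degradation event (only possible when n > 0)


# Identical "cells": same rates, same duration, different random histories.
rng = random.Random(0)
counts = [simulate_protein_count(2.0, 0.1, 100.0, rng) for _ in range(200)]
mean_count = sum(counts) / len(counts)
print(f"mean = {mean_count:.1f}, min = {min(counts)}, max = {max(counts)}")
```

Although every simulated cell has the same parameters, the final copy numbers are spread around the deterministic expectation (k_make / k_decay = 20). If a fate decision depended on such a protein crossing a threshold, identical cells could end up on different paths.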


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20969-2


Fig. 1 The post-infection decision of bacteriophage lambda. Following infection of an E. coli cell, a binary choice is made between lysis, defined by rampant viral replication and cell death, and lysogeny, in which the viral DNA is integrated into the host’s genome to become a dormant prophage. The lysogenic state is stably maintained, but a switch back to lysis can be induced by cellular damage. Adapted with permission from Golding, I., 2016. Single-cell studies of phage λ: Hidden treasures under Occam’s rug. Annual Review of Virology 3 (1), 453–472, permission conveyed through Copyright Clearance Center, Inc.

The Post-Infection Decision of Phage Lambda

A decision between lysis and lysogeny first takes place following lambda infection, upon entry of the viral genome into the E. coli cell (Fig. 1). A genetic circuit encoded by the phage, integrating inputs from the host (and, indirectly, from the extracellular environment), converges on one of the two developmental pathways. If lysis is chosen, the decision is irreversible and, hence, the role of the decision circuit is complete. If, however, the choice is lysogeny, that fate needs to be actively maintained over the long term, using a small subset of the decision gene network. In addition to repressing all virulent functions, this maintenance circuit (known as a genetic “switch”) perpetuates a continuous version of the lysis/lysogeny decision: at any given time, whether to maintain stable dormancy or, if cellular conditions change, undergo lytic induction, switching back to the lytic pathway by relieving the repression of virulent functions. The choice between lysis and lysogeny can be seen as one of timescale, namely, whether to reproduce (and kill the host) immediately, or at a later, yet to be determined, time. An optimal choice, one which maximizes, in the long term, the expected number of progeny, requires knowledge of the state of the infected cell, e.g., how likely it is to support successful viral reproduction, as well as its environment, e.g., what the chances are that newly-released phages will find additional targets to infect. Below we discuss how lambda and a few other phages measure these critical parameters. From a theoretical point of view, the optimization problem that underlies decision making by temperate phages continues to be an area of considerable interest. The genetic circuit governing the lambda decision is depicted in Fig. 2. The decision involves a dense network of regulatory interactions between phage genes, as well as multiple inputs from various host functions.
The regulatory interactions involve diverse molecular mechanisms, modulating all stages of gene expression: transcription initiation (through transcription factors, alternative promoters, and interference between neighboring promoters) and elongation (anti-terminators), post-transcription (antisense RNA), translation, and post-translation (protein lifetime). The role of some of these interactions in the decision process has been elucidated through genetics, biochemistry, and mathematical modeling of the circuit. The utility of other interactions, however, is less clear. It is possible that those come into play under infection conditions that are not emulated by standard laboratory conditions, serving to increase the robustness of the decision to environmental and biochemical fluctuations. To gain insight into the decision process, it is helpful to follow the cascade of transcriptional events taking place on the lambda genome following infection, and how these events diverge en route to each of the two possible outcomes (Fig. 3). Upon entry of the viral genome into the E. coli cell, transcription begins from the left (PL) and right (PR) early promoters. Transcription is initially attenuated at the tL1 and tR1 terminators, such that only a single protein is expressed from each promoter: N from PL and Cro from PR. One of these proteins, N, is an anti-terminator that allows readthrough at tL1 and tR1, as well as at another terminator, tR2, further downstream of PR. This leads to the expression of additional lambda genes. These “delayed early” genes include those that allow progression of the lytic pathway: O and P, required for phage genome replication, and Q, which controls the expression of multiple lytic genes. However, cII and cIII, whose products drive the establishment of lysogeny, are also produced from those same left and right transcripts.
If the Q protein accumulates to a sufficient level, it abrogates termination at the tR′ site, to enable expression from the late lytic promoter PR′ of genes responsible for the production of new viral particles and lysis of the host cell, and thus completion of the lytic cycle. For the alternative route of lysogeny to be taken, Q production needs to be reduced. This is achieved by CII, which activates the PaQ promoter to produce an antisense transcript to Q, resulting in inhibition of Q expression and thus preventing the expression of the late lytic genes. In addition, CII promotes the lysogenic pathway by activating the PI promoter, whose product Int


Fig. 2 Key lambda genes and host factors involved in the lysis/lysogeny decision. Regulatory interactions between the various nodes rely on diverse molecular mechanisms, at the level of phage DNA, mRNA and proteins.

catalyzes integration of the phage genome into the bacterial chromosome. Finally, CII activates the repression establishment promoter PRE, to produce CI. CI shuts off transcription from PL and PR and regulates its own transcription, now from the repression maintenance promoter, PRM. In the lysogenic state, PRM expression of CI is the only transcriptional activity from the decision network of the dormant prophage. What makes the post-infection decision elusive, even after decades of meticulous interrogation, is that, qualitatively, the expression patterns of early genes are similar in lytic and lysogenic cells, and therefore appear unpredictive of the eventual outcome. Consider cII, the key driver of lysogenic choice. Its expression from PR takes place in a limited time window of ~15 min, regardless of the eventual choice. Whether this transient expression will result in lysogeny depends on the exact timing and on the maximal CII level obtained. CII kinetics are regulated at multiple levels – at the promoter (PR) and terminator (tR1), as well as the mRNA and protein lifetimes. This multi-layer regulation provides the means by which the lysis/lysogeny decision is modulated by host and phage parameters, as discussed later. Whereas the decision by lambda is by far the one best characterized, progress has also been made towards understanding the decision circuits of other phages. In multiple cases, the decision network is quite reminiscent of lambda’s, with the cI-cII-cIII module conserved, and an orchestrated gene expression cascade taking place. Even in phages that are otherwise very different from lambda, a dual repressor motif similar to the cI/cro pair (discussed in more detail below) is often present, and an auto-regulatory viral “repressor”, which both establishes and maintains the lysogenic state, appears to be near universal. 
Among temperate phages whose decision process has been studied in some detail, a few offer slight variations on lambda, for example, P22, a “lambdoid” phage (i.e., one with a similar genome architecture to lambda), where the Cro-analog (ant) directly inactivates the CI-analog (c2) by binding to it and preventing it from effectively binding to DNA. Others exhibit dramatically different behavior. P4, for example, is a “satellite phage”, a genetic element that requires other viruses for its own propagation. Rather than having a global repressor, which establishes and maintains lysogeny by regulation of transcription initiation, the outcome and maintenance of the decision in P4 are driven by control of transcription termination, mediated by RNA-RNA interactions. The default pathway in P4 appears to be lysogeny, with the alternative state, in the absence of a co-infection by a helper phage, being that of a multi-copy plasmid. In that state, infection by a helper phage induces the transition into lysis.

Counting by Infecting Phages The perceived role of the decision circuit, as stated above, is to choose between lysis and lysogeny based on the conditions of infection. A key question is, thus, how these conditions are sensed by the infecting phage and processed by the decision circuit to yield an optimal outcome. This question is far from settled. To begin with, which aspects of the infection event are pertinent to the outcome? The parameter best characterized in terms of its effect on the lambda lysis/lysogeny choice is the multiplicity of infection (MOI), namely, the number of phages co-infecting the cell. It was found long ago that, the higher the MOI, the higher the probability of lysogeny. Fig. 4 depicts the results of an experiment measuring the relation between the two observables. A known number of bacteria is mixed with varying concentrations of phages, and, once infection is allowed to proceed, the number of

Fig. 3 The cascade of transcriptional events during the lambda lysis/lysogeny decision. Regulatory elements and their interactions are depicted on the relevant region of the lambda genome. The expression pattern of early genes is qualitatively similar irrespective of the eventual fate. After a decision has been reached, different sets of genes are expressed to execute the lysis/lysogeny choice.

resulting lysogens is measured using selection for an antibiotic marker that was engineered into the viral genome. To interpret the experimental results, the measured values are compared to a simple mathematical model, in which random phage-bacteria encounters (following mass-action probability and Poisson statistics) result in lysogeny if the number of phages co-infecting a single cell reaches some number m*. As seen in Fig. 4, the measured data is consistent with a value of m* = 2, i.e., a scenario where infection by a single phage leads to lysis, whereas simultaneous infection by two or more phages results in lysogeny. The notion that infecting phages are able to count their numbers in the cell, and then decide on the mode of action based on that number, is intriguing both mechanistically – how do viruses count? – and in terms of its utility – why do they do so? In terms of mechanism, it is commonly held that counting is mediated through the level of CII reached during infection. The MOI affects both CII production (through increased gene dosage) and degradation (through production of CIII, which protects CII from degradation by FtsH, see below). As described earlier, high CII level triggers lysogeny by driving the expression of the integrase and CI, and inhibiting late-lytic gene expression from PR′. However, an alternative hypothesis, motivated by recent single-cell experiments and mathematical modeling, posits that phage counting simply reflects the number of cI gene copies, whereas CII levels respond only weakly to MOI, possibly due to the auto-repression of PR by Cro. If enough CI accumulates, it will then activate its own transcription from PRM and shut down expression of the lytic genes. Elucidating phage counting is complicated

Fig. 4 The dependence of lysogenization on the multiplicity of infection: bulk data. A known number of E. coli bacteria is infected with varying concentrations of lambda phage, and the number of resulting lysogens is measured using selection for an antibiotic marker that was engineered into the viral genome. The experimental trend is reproduced by a simple mathematical model, where infection by a single phage leads to lysis, whereas simultaneous infection by two or more phages results in lysogeny.
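
The threshold model described for the bulk experiment can be written out in a few lines. The following is a minimal sketch, assuming only what the text states: the number of phages infecting a cell is Poisson-distributed around the average phage-to-bacteria ratio, and a cell is lysogenized when that number reaches m*. The function names (poisson_pmf, lysogeny_fraction) and the example input values are illustrative choices, not taken from the original analysis.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a cell receives exactly k phages when the
    average number of phages per cell is `mean` (Poisson statistics)."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def lysogeny_fraction(avg_moi: float, m_star: int = 2) -> float:
    """Fraction of all cells that receive at least m_star phages.

    In the threshold model, these are the cells that become lysogens.
    m_star = 2 corresponds to the scenario described in the text.
    """
    p_below_threshold = sum(poisson_pmf(k, avg_moi) for k in range(m_star))
    return 1.0 - p_below_threshold


if __name__ == "__main__":
    # Compare a hypothetical m* = 1 scenario with the m* = 2 scenario.
    for avg_moi in (0.01, 0.1, 1.0, 5.0):
        f1 = lysogeny_fraction(avg_moi, m_star=1)
        f2 = lysogeny_fraction(avg_moi, m_star=2)
        print(f"average MOI {avg_moi:5.2f}: m*=1 -> {f1:.5f}, m*=2 -> {f2:.5f}")
```

At low average MOI the predicted fraction scales as the average MOI raised to the power m* (divided by m*!), so on a log-log plot the slope of the lysogenization curve distinguishes between candidate values of m*.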

considerably by the fact that the lambda genome begins replicating shortly after infection, i.e., during, not after, the lysis/lysogeny decision. That early replication plays a role in the decision is evidenced by the fact that replication-deficient lambda mutants require a higher number of initial viruses to achieve lysogeny, compared to wild type phages. As for the biological utility of choosing lysogeny at higher MOI, the common interpretation is that the number of co-infecting phages is used by lambda as a proxy for the ratio of phage-to-bacteria abundance in the surrounding environment. That ratio, in turn, serves to assess the chances of successful infection by the next generation of phages, should the lytic pathway be chosen. Specifically, high MOI indicates that phages outnumber bacteria and that, therefore, releasing more phages into the environment is futile, since they are unlikely to find new bacterial targets. Hence, high MOI advocates lysogeny. Conversely, low MOI indicates the availability in the environment of yet-uninfected bacteria, thus promoting the release of more viral particles through lysis. Consistent with the idea that lambda measures the multiplicity of infection in order to assess the abundance of uninfected bacteria in the environment, recent studies found additional ways by which phages can infer this kind of information. During initial rounds of infection of Bacillus subtilis, phage phi3T expresses the genes aimR and aimP. AimR binds to the phage DNA and activates the transcription of aimX, which blocks the lysogenic pathway in a mechanism not yet elucidated. The lytic pathway is thus favored in the initial infections. Meanwhile, AimP, a short peptide, is secreted into the extracellular medium, where it accumulates and is taken up by uninfected cells. When some of these cells are later infected, the intracellular AimP, now at high concentration, binds to and inactivates AimR, resulting in repression of aimX expression. 
Consequently, lysogeny is favored in later rounds of infection. Using this intercellular communication system (termed “arbitrium”), infecting phi3T phages are able to record past infections of other cells and tune their lysis/lysogeny decision based on this knowledge. The ability to hijack the host’s quorum sensing system is also found in VP882, a temperate phage that infects Vibrio cholerae and other Vibrio species. The outcome of infection by VP882 depends on a repressor (Gp59) that inhibits the expression of the lytic regulator (Gp62), thus promoting the lysogenic pathway. During infection of Vibrio cholerae, the phage also expresses VqmAPhage, a homolog of the host’s endogenous quorum sensing receptor, VqmA. When bound by Vibrio’s autoinducer, DPO, VqmAPhage promotes the expression of an antirepressor (Qtip), which sequestrates the repressor Gp59, allowing genes involved in the lytic pathways to be expressed. Consequently, when the bacterial density is high, the increased DPO level in the medium promotes the lytic pathway of infecting VP882. Note that, in contrast to phi3T above, which relies on a phage-specific secreted molecule, here, VP882 assesses the bacterial density via an autoinducer encoded by the host. The cases described above, in lambda and other phages, as well as the ecological argument mentioned, all support the idea that increased phage-to-bacteria ratio promotes lysogeny. However, whether this rule applies universally is still unclear. The propensity for lysogenization by bacteriophage P1, for example, is reported to be insensitive to the number of viruses infecting the host cell. In infections by phage Mu, the probability of lysogeny appears to decrease with the multiplicity of infection, although this trend may reflect the virus' toxicity to the host at high MOI. 
There is also an ongoing debate whether, outside the artificial lab environment, bacterial density is correlated – positively or negatively – with the occurrence of lysogeny, and how to interpret the observed trends. Ecological studies of lytic infections support a “kill the winner” model, in which viral infection increases host diversity by preventing overabundance. However, the dynamics of temperate viruses are much harder to interpret, and the relationship between host density and outcome frequencies is unclear. Previous studies of prophage induction found that lysogeny is more prevalent at low host density, consistent with the picture above of increased lysogenization at high MOI. A more recent work, using metagenomics analysis, reported an inverse trend, but the interpretation of these newer findings is a subject of some controversy. Whereas the multiplicity of infection is the best-characterized driver of the lambda lysis/lysogeny decision, it is definitely not the only one. Recall that multiple host factors interact with the decision circuit (Fig. 2 above). One such factor is FtsH, a membrane-bound ATP-dependent protease. During lambda infection, FtsH degrades CII and thus impacts the choice between lysis and lysogeny. Part of FtsH’s influence comes about through the MOI, specifically, the dosage-dependent production of CIII, which

is believed to protect CII by serving itself as a target for FtsH. However, beyond the response to MOI, the level and activity of FtsH are regulated by the physiological state of the cell, thus providing a means for the condition of the host cell to inform the lysis/lysogeny decision. For example, the increase in lysogenization at low temperature can be attributed to a decrease in FtsH levels, as well as an increase in the thermodynamic stability of CII, to which FtsH is highly sensitive. Temperature also impacts the decision circuit through its effect on another bacterial protease, Lon, which is expressed in a temperature-dependent manner and targets the N anti-terminator. Two other cellular sensors, cyclic adenosine monophosphate (cAMP) and guanosine tetraphosphate (ppGpp), have also been reported to affect the lysis/lysogeny decision, possibly by inhibiting FtsH. Cellular state also influences the lambda decision through RNase III, whose levels are modulated by the E. coli growth rate. RNase III promotes degradation of cII transcripts, stimulates ribosome binding and translation initiation of cIII, and blocks auto-repression of N translation during early infection, thus affecting multiple nodes of the decision network. The ways in which the decision circuit assesses the state of the cell via these and other physiological sensors remain a promising direction for future interrogation.

The View From the Single Cell Most of what we know about lambda’s post-infection decision comes from studies that used traditional genetic and biochemical assays, performed in bulk cultures, and thus involving the averaging of all measured observables over millions of cells. But these individual cells may, in fact, exhibit very different phenotypes. Over the last decade, traditional bulk assays have begun to be supplemented by microscopy-based studies, in which the infection process is followed in real time, at the level of individual cells and phages. Fig. 5(A) shows an example of such an experiment. Here, the lambda capsid was labeled using multiple copies of a fluorescent protein, such that each phage particle appears under the microscope as a diffraction-limited spot. The infected cells were simultaneously imaged using phase contrast microscopy. Time-lapse images show the infection and its outcome for two individual cells. The first cell, infected by a single phage, proceeds to produce more viral proteins (green) and lyse within two hours. The second cell, co-infected by three phages, survives to grow and divide. That cell has chosen the lysogenic pathway, as indicated by the production of a red fluorescent protein, here expressed from the lysogeny establishment promoter, PRE. Following many infection events in this manner allows one to determine how the decision outcome – lysis or lysogeny – depends on the infection parameters, such as the MOI. Fig. 5(B) shows that the fraction of cells choosing lysogeny increases with MOI, a trend consistent with the observations in bulk. However, in contrast to our original interpretation of the bulk data (Fig. 4 above), the single-cell data suggests that the MOI dependence is probabilistic rather than deterministic: At MOI of 2, for example, an infected cell has about a 50% chance of going either lytic or lysogenic. 
In other words, when we observe a cell co-infected by two lambda phages, we have no way of telling what route will be chosen! We are thus confronted with the indeterminacy of single-cell behavior: genetically identical cells, all subject to the same environment, exhibiting different phenotypes from each other. This phenomenon is observed throughout biology, and its origins are a subject of intensive interrogation. According to the prevailing picture, cellular individuality reflects the inherent randomness of biochemical reactions in the cell. In this view, the unavoidable fluctuations in molecular copy number and in the timing of events render the lambda decision “noisy” and unpredictable, rather than precise and deterministic. The plausibility of this argument was first demonstrated using a computational simulation of the lambda decision circuit, showing that fluctuations in biochemical reactions can result in diverging cell fates among infected cells. The concept of noise-driven decisions then evolved and was used to explain cell-fate indeterminacy in higher systems, including the transition in and out of HIV latency, as well as the differentiation and reprogramming of metazoan cells.

Fig. 5 The lysis/lysogeny decision at the single-cell level. (A) Images from a live-cell movie following the fate of two E. coli cells, infected by fluorescently-labeled lambda phages. The upper cell, infected by a single phage, proceeds to produce new viral particles and undergo lysis. The lower cell, co-infected by three phages, enters lysogeny, as indicated by a fluorescent reporter for PRE activity, and proceeds to divide normally. (B) The fraction of cells undergoing lysogeny as a function of the multiplicity of infection, as measured from >1000 infection events. In contrast to the original modeling of the bulk data (Fig. 4 above), the single-cell curve rises gradually, suggesting that the MOI dependence is probabilistic rather than deterministic. (C) Incorporating the effect of intracellular viral concentration (MOI divided by cell volume) captures the experimental data and yields a decision curve that is markedly more step-like. Adapted from Zeng, L., Skinner, S.O., Zong, C., et al., 2010. Decision making at a subcellular level determines the outcome of bacteriophage infection. Cell 141 (4), 682–691, Copyright 2010, with permission from Elsevier.

Fig. 6 Detecting the transcriptional activity of individual lambda phages. (A) Each phage is detected through the binding of fluorescently-tagged ParB proteins to the parS sequence, engineered into the lambda genome. mRNA molecules transcribed by the phage are simultaneously detected using single-molecule fluorescence in situ hybridization (smFISH). (B) Four lambda genomes (cyan) inside a single infected cell, at 10 min after infection. Individual phages vary in their transcriptional activity, with one transcribing cro (green) and two others producing cI (red).

But the fact that we can describe cell fate decisions probabilistically does not necessarily mean that we should settle for such a narrative and give up seeking a deterministic description of the decision process. While biochemical stochasticity is undisputedly present, automatically attributing all cellular indeterminacy to unknowable “noise” may be taking the easy path. One must consider the alternative hypothesis, namely, that our inability to predict the decision outcome reflects a failure to account for additional cellular variables that have a deterministic effect on the decision. So long as these “hidden variables” remain unknown to us, the decision will appear more random than it truly is, and our understanding of it will remain limited. And, in fact, a number of lambda studies suggest that incorporating additional variables can reveal a more precise decision at the single-cell level. It was first found that, for a given MOI, smaller cells are more likely to be lysogenized than larger ones. This should not surprise us, since decreasing the size of the infected cell is, to a first approximation, the same as infecting with a larger number of phages: both result in an increased concentration of viral gene products in the cell. Detailed analysis revealed that a unique arithmetic combination of the MOI and cell size yields a more step-like (and therefore, more deterministic) probability of lysogenization (Fig. 5(C)). The way in which the infection parameters combine to yield a sharp decision curve points to a nonlinear interaction between the co-infecting phages as they converge on the cell’s fate. Elucidating the nature of this interaction will require characterizing the spatiotemporal dynamics and genetic activity of individual phages within the infected cell. Fluorescent reporters for phage capsid, genome, RNA and protein products, needed for such an investigation, are now becoming available (Fig. 6).
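
The idea that cell size acts as a “hidden variable” can be illustrated with a toy calculation. The sketch below assumes that the probability of lysogeny is a Hill-type function of the intracellular viral concentration (MOI divided by cell volume). The functional form, the half-maximal concentration k_half, the steepness hill, and the normalized volume units are all illustrative assumptions; they are not the fitted model from the single-cell study.

```python
def lysogeny_probability(moi: int, cell_volume: float,
                         k_half: float = 2.0, hill: float = 4.0) -> float:
    """Toy decision curve: probability of lysogeny as a function of
    viral concentration (MOI / cell volume).

    k_half (concentration giving 50% lysogeny) and hill (steepness)
    are hypothetical parameters chosen for illustration only.
    cell_volume is in arbitrary units (1.0 = an average cell).
    """
    if cell_volume <= 0:
        raise ValueError("cell_volume must be positive")
    if moi <= 0:
        return 0.0
    concentration = moi / cell_volume
    return concentration**hill / (k_half**hill + concentration**hill)


if __name__ == "__main__":
    # Same MOI, different cell sizes: the outcome looks random if the
    # volume is ignored, but is largely predictable once it is known.
    for volume in (0.5, 1.0, 2.0):
        p = lysogeny_probability(moi=2, cell_volume=volume)
        print(f"MOI 2, volume {volume:.1f}: P(lysogeny) = {p:.3f}")
```

With these made-up parameters, a population of cells infected at MOI 2 would show roughly even odds of lysogeny when averaged over cell sizes, even though small cells almost always lysogenize and large cells almost never do.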

The Decision to Remain Dormant If, following infection, the lysogenic route is chosen, control of cell fate is then handed over from the post-infection decision circuit to a smaller circuit, whose role is to maintain dormancy by repressing all virulent functions. The maintenance circuit must also be able to trigger a switch back to lysis (induction) when cellular conditions change. Thus, the dormant virus continuously reevaluates its lysis/lysogeny decision. Some elements of lysogenic maintenance in lambda and other phages are covered by a separate article in this encyclopedia (Shearwin). Here we focus on the decision-making aspect of the maintenance circuit. In lambda, lysogenic maintenance is handled by a subset of the post-infection decision network discussed above. The smaller maintenance circuit, known as the lambda “switch”, consists of two phage genes, cI and cro, transcribed respectively from two diverging promoters, PRM and PR (Fig. 7). The two gene products, CI and Cro, compete for binding to six operator sites (OR1–3 and OL1–3) that regulate PRM and PR transcription, resulting in mutual repression by the two proteins. In the prophage state, high cellular level of CI represses transcription from PR and PL, thus maintaining viral dormancy. The lysogenic state is further stabilized by the formation of a DNA loop between OR1–3 and OL1–3, secured by oligomerization of CI dimers bound at the two loci. Perturbations that reduce the level of CI (such as activation of the bacterial SOS response, discussed below), can lead to lytic induction. In this process, the inhibition of PR and PL is relieved, leading to transcription of early lytic genes, including cro. Cro then represses PRM, leading to further reduction of CI level and allowing the lytic cascade to proceed. 
Despite decades of meticulous studies, recent experiments continue to reveal new features of the lambda maintenance circuit, such as the role of mechanical coupling between transcription, DNA supercoiling, and looping, and how this coupling may affect the stability of the lysogenic state. As in the case of infection, the phage (now, prophage)-encoded circuit requires input from the bacterial host in order to sense the state of the infected (now, lysogenized) cell and use that information to choose optimally between lysis and lysogeny. Specifically, during lysogenic maintenance, the role of host input is to alert the prophage when the bacterial cell is in danger, indicating that it is time to escape the host through the lytic pathway. Lambda receives this information via E. coli’s SOS system,

Fig. 7 The maintenance of lambda lysogeny. (A) The lysogenic state is maintained by a regulatory circuit consisting of CI and Cro, expressed from the PRM and PR promoters, respectively. CI and Cro compete for binding at six operator sites (OR1–3 and OL1–3), to determine which promoters (PRM, or PR and PL) are active, and thus decide whether lysogeny is maintained or, instead, lytic genes induction takes place. Two examples of binding configurations are shown. On the left, CI dimers bind to four operator sites, resulting in DNA looping that ensures repression of PR and PL during lysogeny. On the right, binding of Cro to OR3 represses transcription of CI from PRM and allows lytic genes to be expressed. (B) The regulatory interactions between CI and Cro form a bistable switch. The system can alter its state in response to a large perturbation, such as depletion of CI by RecA, leading to lytic induction, but is immune to small perturbations. Adapted with permission from Golding, I., 2016. Single-cell studies of phage λ: Hidden treasures under Occam’s rug. Annual Review of Virology 3 (1), 453–472, permission conveyed through Copyright Clearance Center, Inc.
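
The bistability described in panel (B) can be reproduced with a generic model of two mutually repressing genes. The following is a minimal sketch, assuming a symmetric, dimensionless toggle switch integrated with a simple Euler scheme. The function name simulate_switch and the parameters alpha and n are hypothetical, and the model deliberately omits features of the real lambda circuit (CI autoregulation, DNA looping, asymmetric binding affinities).

```python
def simulate_switch(ci0: float, cro0: float, alpha: float = 4.0, n: int = 2,
                    dt: float = 0.01, t_end: float = 50.0) -> tuple[float, float]:
    """Integrate a symmetric mutual-repression model (Euler method).

    d(CI)/dt  = alpha / (1 + Cro**n) - CI
    d(Cro)/dt = alpha / (1 + CI**n)  - Cro

    alpha (maximal production) and n (cooperativity) are illustrative,
    dimensionless values. Returns the final (CI, Cro) levels.
    """
    ci, cro = ci0, cro0
    for _ in range(int(t_end / dt)):
        d_ci = alpha / (1.0 + cro**n) - ci
        d_cro = alpha / (1.0 + ci**n) - cro
        ci += dt * d_ci
        cro += dt * d_cro
    return ci, cro


if __name__ == "__main__":
    # Small perturbation of the "lysogenic" state: CI recovers.
    print("small drop in CI:", simulate_switch(ci0=2.5, cro0=0.27))
    # Large perturbation (e.g., strong depletion of CI): Cro takes over.
    print("large drop in CI:", simulate_switch(ci0=0.2, cro0=0.27))
```

In this toy model the state with high CI and low Cro stands in for lysogeny, and the state with high Cro and low CI for the onset of lysis. Starting points on either side of the line CI = Cro relax to different stable states, which is the sense in which the switch ignores small perturbations but flips after a large one.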

which, in response to cellular DNA damage, halts progression of the cell cycle and triggers DNA repair and mutagenesis. Under normal growth, expression of the SOS response genes is repressed by LexA. However, in the presence of DNA damage (due to, e.g., UV radiation), regions of single-stranded DNA accumulate, leading to recruitment and activation of RecA. Activated RecA facilitates LexA self-cleavage, de-repressing SOS genes, including RecA itself. Phage lambda is coupled to the SOS response through the self-cleavage of CI, which is likewise facilitated by activated RecA. The consequent drop in cellular CI concentration leads to the relief of cro repression, activation of the lytic pathway, and prophage induction. Many other temperate bacteriophages are also induced following treatment with UV radiation or DNA-damaging agents. However, some, like P2, appear to be non-inducible, and immune to the bacterial SOS system. The cI/cro pair serves as a canonical example for a so-called “toggle switch”, a genetic module exhibiting two stable states, here corresponding to lysogeny and to the onset of lysis. Despite consisting of only two genes, the lysogeny maintenance circuit of lambda captures key features of cell-fate choice, as observed across the spectrum of biological complexity. Through the use of feedback (PRM autoregulation by CI), the system achieves extremely high stability in the absence of external perturbations, with fewer than one spontaneous switching event per 10^6 cell doublings. At the same time, almost 100% of lysogenic cells switch to lysis in response to an inducing signal. These properties have made the lambda maintenance circuit an attractive starting point for understanding cellular differentiation and reprogramming in metazoans. 
In particular, lambda has served as a fertile test ground for the attempt to formulate a detailed biophysical description of cellular behavior, in the form of a mathematical model that uses the known molecular interactions to predict the resultant cellular phenotype. This effort has been, at least partly, successful. For example, a thermodynamic model can be written, describing the different binding configurations of CI at the operator sites that control transcription from PRM, and this model used to predict the regulatory curve relating CI concentration in the cell to PRM activity. The theoretically predicted curve shows good agreement with experimental measurements of this regulatory relation (Fig. 8). The theoretical calculations can

Fig. 8 A theoretical biophysical model captures PRM regulation by CI. The regulatory curve relating CI concentration in the cell to PRM activity can be predicted from a thermodynamic description of the possible binding configurations of CI at the OR1–3 and OL1–3 operator sites. The theoretically predicted curve agrees with the experimental measurement of the regulatory relation. The average amount of CI protein present in each lysogenic cell can be estimated by requiring that CI production is exactly balanced by CI elimination (via dilution, due to cell growth and division). Adapted from Sepúlveda, L.A., Xu, H., Zhang, J., Wang, M., Golding, I., 2016. Measurement of gene regulation in individual cells reveals rapid switching between promoter states. Science 351, 1218–1222, reprinted with permission from AAAS; and from Golding, I., 2011. Decision making in living cells: Lessons from a simple system. Annual Review of Biophysics 40 (1), 63–80 with permission conveyed through Copyright Clearance Center, Inc.
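
The two steps named in the caption, a thermodynamic regulatory curve and a production-dilution balance, can be sketched as follows. This is a heavily simplified illustration, assuming only three promoter states (unbound, activated by CI, repressed by additional CI binding) rather than the full set of binding configurations at OR1–3 and OL1–3. All names and numerical values (prm_activity, steady_state_ci, k_act, k_rep, r_basal, r_act, dilution_rate) are hypothetical and are not the published parameters.

```python
def prm_activity(ci: float, k_act: float = 50.0, k_rep: float = 500.0,
                 r_basal: float = 1.0, r_act: float = 10.0) -> float:
    """Toy thermodynamic model of PRM activity versus CI concentration.

    Each promoter state gets a statistical weight; the activity is the
    weighted average of the state-specific production rates:
      unbound   : weight 1,      rate r_basal
      activated : weight w_act,  rate r_act
      repressed : weight w_rep,  rate 0
    The squared terms stand in for binding of CI dimers (an assumption).
    """
    w_act = (ci / k_act) ** 2
    w_rep = w_act * (ci / k_rep) ** 2
    partition = 1.0 + w_act + w_rep
    return (r_basal + r_act * w_act) / partition


def steady_state_ci(dilution_rate: float = 0.02, lo: float = 0.0,
                    hi: float = 1000.0, tol: float = 1e-9) -> float:
    """Find the CI level at which production equals dilution,
    prm_activity(ci) == dilution_rate * ci, by bisection."""
    def net(ci: float) -> float:
        return prm_activity(ci) - dilution_rate * ci

    if net(lo) <= 0 or net(hi) >= 0:
        raise ValueError("bracket does not enclose a steady state")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net(mid) > 0:
            lo = mid  # production still exceeds dilution: root is higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    ci_ss = steady_state_ci()
    print(f"steady-state CI (arbitrary units): {ci_ss:.1f}")
```

Because production falls while dilution rises with increasing CI near the crossing point, a cell that drifts slightly above or below this level is pushed back toward it, which is the negative feedback that stabilizes CI in the lysogen.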

next be utilized to estimate the amount of CI protein present in a lysogenic cell, by requiring that CI production is exactly balanced by CI elimination (via dilution, due to cell growth and division) (Fig. 8). The eventual test for a theory of the lambda switch is to successfully predict the key phenotype, namely, whether a given cell will remain in the lysogenic state or, instead, switch to lysis. In attempting to answer this question, we are again confronted with the challenge of cellular individuality: In the absence of an external signal, only one out of a million lysogens in a growing culture will spontaneously switch. Can we predict which cell this will be? It is widely believed that the answer is negative, and that we may only aspire to predict the probability of induction, not the actual fate of an individual cell. This is because spontaneous induction is considered a stochastic process, driven by random fluctuations of CI levels in the cell. Small drops in CI number will be corrected by the negative feedback in the PRM-CI circuit, raising CI level and thus reverting to the mean. However, rare, larger drops will overcome the feedback and lead to de-repression of PR, Cro production and onset of the lytic pathway. In this picture, spontaneous lytic induction is analogous to the way random thermal motion drives a physical system to transition from one stable state to another (Fig. 7 above). The challenge of theoretically predicting the behavior of the cI/cro switch pales in comparison to the larger goal of predicting cell fate following infection, when the full decision network (Fig. 2 above) comes into play.

Conclusion The prevailing narrative for the lambda lysis/lysogeny decision, both following infection and during lysogenic maintenance, offers only a probabilistic prediction, rather than a deterministic one, as to which path an individual cell will choose. This probabilistic point of view is similarly applied, further afield, to the choice of latency by mammalian viruses and to cellular differentiation and reprogramming. These decisions are all held to be indeterminate, noise-driven processes. Bacteriophage lambda, where this picture originally emerged, also bears the potential to challenge the probabilistic view by revealing previously-hidden variables that bias the decision outcome, or even determine it in full. Future studies, describing the lysis/lysogeny decision in individual phages and cells, in real time, will be key to delineating true randomness from the hidden precision of cellular decision-making.

Acknowledgments We are grateful to Ian Dodd and Keith Shearwin for commenting on an earlier draft of this article. Work in the Golding lab is supported by grants from the National Institutes of Health (R01 GM082837), the National Science Foundation (PHY 1147498, PHY 1430124 and PHY 1427654), the Welch Foundation (Q-1759) and the John S. Dunn Foundation (Collaborative Research Award). We gratefully acknowledge the computing resources provided by the CIBR Center of Baylor College of Medicine.

Further Reading
Arkin, A., Ross, J., McAdams, H.H., 1998. Stochastic kinetic analysis of developmental pathway bifurcation in phage lambda-infected Escherichia coli cells. Genetics 149 (4), 1633–1648.
Casjens, S.R., Hendrix, R.W., 2015. Bacteriophage lambda: Early pioneer and still relevant. Virology 479–480, 310–330.
Dodd, I.B., Shearwin, K.E., Perkins, A.J., et al., 2004. Cooperativity in long-range gene regulation by the λ CI repressor. Genes & Development 18 (3), 344–354.
Erez, Z., Steinberger-Levy, I., Shamir, M., et al., 2017. Communication between viruses guides lysis-lysogeny decisions. Nature 541 (7638), 488–493.
Golding, I., 2011. Decision making in living cells: Lessons from a simple system. Annual Review of Biophysics 40 (1), 63–80.
Golding, I., 2016. Single-cell studies of phage λ: Hidden treasures under Occam’s rug. Annual Review of Virology 3 (1), 453–472.
Knowles, B., Silveira, C., Bailey, B., et al., 2016. Lytic to temperate switching of viral communities. Nature 531 (7595), 466–470.
Oppenheim, A.B., Kobiler, O., Stavans, J., et al., 2005. Switches in bacteriophage lambda development. Annual Review of Genetics 39 (1), 409–429.
Ptashne, M., 2004. A Genetic Switch: Phage Lambda Revisited, third ed. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
Silpe, J.E., Bassler, B.L., 2019. A host-produced quorum-sensing autoinducer controls a phage lysis-lysogeny decision. Cell 176 (1–2), 268–280.
St-Pierre, F., Endy, D., 2008. Determination of cell fate selection during phage lambda infection. Proceedings of the National Academy of Sciences of the United States of America 105 (52), 20705–20710.
Tal, A., Arbel-Goren, R., Costantino, N., Stavans, J., 2014. Location of the unique integration site on an Escherichia coli chromosome by bacteriophage lambda DNA in vivo. Proceedings of the National Academy of Sciences of the United States of America 111 (20), 7308–7312.
Weitz, J.S., Beckett, S.J., Brum, J.R., et al., 2017. Lysis, lysogeny and virus–microbe ratios. Nature 549 (7672), E1–E3.
Zeng, L., Skinner, S.O., Zong, C., et al., 2010. Decision making at a subcellular level determines the outcome of bacteriophage infection. Cell 141 (4), 682–691.

Mobilization of Phage Satellites Kristen N LeGault and Kimberley D Seed, University of California, Berkeley, CA, United States © 2021 Elsevier Ltd. All rights reserved.

Glossary
Accessory gene A gene that is not required for the lifecycle of the mobile genetic element, but often confers a fitness advantage to the bacterial host, for example, by encoding functions involved in virulence.
Burst assay Quantification of the number of bacteriophage progeny produced by one infected host cell.
Capsid Protein shell in which the phage genome is packaged.
Helper phage A phage that induces transfer of a satellite phage.
Lytic (virulent) phage A phage that takes over the host to produce progeny phage, culminating in lysis of the host cell, and is unable to integrate into the host chromosome or otherwise lysogenize the bacterial host.
Mobile genetic element (MGE) DNA capable of moving within or between genomes.
Phage-inducible chromosomal island (PICI) A genetic element that is induced to excise, replicate autonomously and package itself into helper phage structural components.
Phage-inducible chromosomal island-like elements (PLEs) Anti-phage islands found in clinical isolates of Vibrio cholerae, specifically induced by, and inhibitory toward, the lytic phage ICP1.
Satellite phage Subviral element lacking genes encoding functions needed to complete the phage lifecycle. For their multiplication, satellites therefore parasitize a helper phage during co-infection of a host cell.
Staphylococcus aureus pathogenicity islands (SaPIs) Prototypical PICI elements that are ubiquitous in S. aureus and carry virulence genes.
Temperate phage (prophage) A phage that can either undergo a lytic cycle or integrate into the host genome, where it is transmitted vertically until induced by environmental cues to re-enter the lytic cycle.
Transduction Packaging and transmission of bacterial DNA inside phage virions to new host cells.

Introduction Phages profoundly impact the evolutionary landscape of their bacterial hosts, both through predation, which selects for hosts with defenses to overcome phage killing, and through mobilization and dissemination of genetic material. Phages are the most abundant biological entities on the planet, making phage-mediated genetic transfer a critically important mechanism of gene mobilization between bacteria. Phages with strictly lytic lifecycles infect susceptible bacterial cells and redirect cellular metabolism to promote phage replication, resulting in the release of infectious phage progeny through cell lysis. In contrast to lytic phages, temperate phages are able to integrate into their host genome where they can be transmitted vertically. It is well-established that integrated phages can dramatically alter bacterial phenotypes by providing their host with novel genes. One of the most clinically relevant examples is the CTX prophage integrated into the genome of epidemic Vibrio cholerae; CTX encodes cholera toxin, the virulence factor responsible for the profuse watery diarrhea that is the hallmark of the disease cholera. For many temperate phages, signals that trigger the prophage to re-enter into the lytic cycle culminate in the lysis of the host cell and the release of progeny phage. Both lytic and temperate phages can occasionally mis-package regions of the host genome and move it to new recipient cells in a process called generalized transduction. Additionally, temperate phage can excise along with adjacent areas in the host genome, and move these regions at a higher frequency in a process called specialized transduction. In sharp contrast to these forms of “passive” transduction, there are mobile genetic elements, the phage satellites, that have evolved to precisely exploit phages for high-frequency transfer. 
These subcellular elements are consequential for bacterial and phage evolution both as providers of novel genetic content and as parasites of phages. Their impact is often overlooked or misinterpreted, as observing the transfer of a satellite phage requires a biologically relevant tripartite system: the satellite phage, a compatible helper phage and a susceptible bacterium. It was initially thought that the genetic element encoding the toxic shock syndrome toxin was contained on a prophage in Staphylococcus aureus, though in the late 1990s it was discovered that this toxin-containing island was able to excise, replicate and package itself not as an independent temperate phage, but rather by parasitizing a temperate phage for mobilization. The island was demonstrated to be an autonomously replicating pathogenicity island, and was designated SaPI1, for “Staphylococcus aureus pathogenicity island 1”. Since this discovery, much has been learned about the SaPI lifecycle, including how SaPIs are induced by specific phages and how they replicate and efficiently package their genomes into virions, leading to the spread of virulence genes across large phylogenetic distances. Most S. aureus virulence factors are found on these phage satellites, highlighting the phage-dependent lifestyle as a successful means to efficiently spread genetic material. Importantly, SaPIs are the archetype of a larger group of MGEs known as phage-inducible chromosomal islands (PICIs), which have only recently been found to be widespread among Gram-positive bacteria and prevalent among Gram-negative Gammaproteobacteria. PICIs have several features in common, which will be discussed below. Other newly discovered PICI-like elements


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20968-0


Fig. 1 Genetic organization of representative phage-inducible chromosomal islands. Representative genomes of Gram-positive (SaPI1 and EfCIV583) PICIs and V. cholerae's PLEs (PLE1). All of these satellites share several features, including gene(s) involved in integration and excision from the bacterial chromosome (Int and Xis as well as the left and right attachment sites) in yellow, genes involved in regulation of gene expression in blue (although genes involved in PLE gene regulation have not yet been identified), a replication module consisting of an origin of replication (Ori) and replication genes in purple required for PICI replication, as well as genes involved in packaging and interference with their helper phage in grey. SaPIs additionally encode several genes involved in virulence, including production of toxic shock syndrome toxin (tst), enterotoxins K and Q (entK and entQ), and ear conferring ampicillin resistance. In black are genes encoding hypothetical proteins.

(PLEs) do not share obvious homology to known PICIs, yet are also induced by specific phages, suggesting that the known repertoire of genomic islands mobilized by phages represents only the tip of the iceberg when it comes to phage-mediated genome flux. Where there is experimental evidence, we will compare genetic elements that are mobilized by phages, highlighting what seem to be generalizable and convergent strategies that these elements use to effectively co-opt their inducing phage, as well as what is unique about these different islands. As we will see, the ability of phages to specifically mobilize regions of their host’s chromosome involved in pathogenesis and phage resistance cautions against the use of phage in treating certain bacterial infections, as such mobility has the potential to stimulate evolutionary innovation and undermine the apparent benefits of harnessing phage as antibacterial agents.

Genetic Organization PICIs and PLEs share many conserved features, which underscores several key elements of their lifecycle as phage satellites and also allows for identification of novel PICIs through sequence analysis. All PICI genomes are characterized by (1) specific attachment sites in their host’s genome, (2) a lack of phage structural or lytic genes and (3) a size of around 15 kb. The genetic organization of PICIs is analogous to that of phages, with genes clustered according to function. SaPIs have genes arranged in four modules: regulation, integration and excision, autonomous replication, and packaging (Fig. 1). Functionally, all PICIs discovered thus far also have strategies to interfere with helper phage propagation. In addition to the modules that allow for the propagation of PICIs, many contain accessory genes that result in the phenotypic conversion of a host carrying these mobile genetic elements to a more virulent state. PLEs also appear to have their genes arranged in operons, though PLE-encoded ORFs are not homologous to known PICI genes and few PLE-encoded gene products have known functions. Like the PICIs, PLEs lack obvious phage structural or lytic genes and they have unique attachment sites, though with genomes ranging from 18 to 19 kb, PLEs are larger than the archetypal SaPIs, and, as of now, have not been shown to be capable of autonomous replication in the absence of their inducing phage. Below we will discuss how PICIs and PLEs progress through their lifecycle starting with integration and induction.
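The three defining criteria listed above can be read as a simple screening rule. The following is a minimal sketch in Python, not a published pipeline: the `Island` record, the annotation labels and the 10 to 20 kb size window are all assumptions introduced for illustration, and a real search would rest on homology and synteny rather than hand-assigned labels.

```python
from dataclasses import dataclass, field

# Hypothetical annotation labels, for illustration only.
STRUCTURAL_OR_LYTIC = {"capsid", "tail", "portal", "holin", "endolysin"}


@dataclass
class Island:
    name: str
    length_kb: float
    has_att_sites: bool
    gene_functions: set = field(default_factory=set)


def looks_like_pici(island, min_kb=10.0, max_kb=20.0):
    """Apply the three criteria from the text.

    The size window around ~15 kb is an assumed tolerance, not a value
    taken from the article.
    """
    lacks_structural = not (island.gene_functions & STRUCTURAL_OR_LYTIC)
    right_size = min_kb <= island.length_kb <= max_kb
    return island.has_att_sites and lacks_structural and right_size


toy_satellite = Island("toy_satellite", 15.0, True,
                       {"integrase", "repressor", "primase", "terminase_small"})
toy_prophage = Island("toy_prophage", 42.0, True,
                      {"integrase", "repressor", "capsid", "tail", "holin"})

print(looks_like_pici(toy_satellite))  # True
print(looks_like_pici(toy_prophage))   # False
```

A rule of this kind would miss elements such as PLEs, which share the functional logic but not the sequence or size profile of SaPIs, so it is best viewed as a first filter.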

Integration and Excision, Gene Induction, and Replication Both PICIs and PLEs are integrated in a site-specific manner in their bacterial host genome. When integration occurs, the attachment site in the PICI, known as the attP site (attS for SaPIs), recombines with the bacterial chromosomal attachment site, attC, resulting in the creation of unique left and right attachment sites when integrated (attL and attR, respectively). An integrase (Int), a hallmark of MGEs such as satellite phage, drives the recombination between the bacterial chromosome and the incoming island. For SaPIs, Int is a tyrosine integrase and for PLEs, Int is a large serine recombinase. In both systems, integration of the


Fig. 2 The lifecycle of SaPI and PLE showing induction, replication and packaging. A schematic of the steps of a PICI lifecycle, contrasting an archetypal SaPI on the top and PLE on the bottom. SaPIs and PLEs are induced by specific helper phage, yet only SaPIs are induced by a temperate phage, which can be activated by the SOS response. SaPIs and PLEs replicate upon induction by their helper phage, reaching from several copies to nearly 1000 copies in the case of PLEs. Both PLEs and SaPIs transmit horizontally, but since it is unknown how PLE transduces, only the SaPI packaging is shown. SaPIs encode several mechanisms which interfere with helper phage, ensuring specific packaging and horizontal transmission of only the SaPI genome.

satellite phage is catalyzed by Int recognition of the attC site, and additional proteins are not required for acquisition. SaPIs integrate into one of six primary attC sites, while two attC sites are known for PLEs: PLEs 1, 3, 4 and 5 integrate into the V. cholerae repeat (VCR) while PLE 2 integrates into the hypothetical gene VCA581. The use of the VCRs as an attachment site for most PLEs provides immense flexibility for PLE integration, as VCRs are present in >100 copies in V. cholerae. Int is also required for excision of the satellite phage from the chromosome, but it is not sufficient. SaPIs encode an excisionase, Xis, which is required for excision and is only expressed upon helper phage mediated de-repression of SaPI genes. PLEs also require the presence of their inducing phage to excise, though the reason for this dependence differs. In contrast to SaPIs, which encode their own excisionase, PLEs utilize a phage-encoded recombination directionality factor to trigger excision. PLEs’ Int is constitutively expressed, and upon infection by the helper phage ICP1, Int binds to the ICP1-encoded recombination directionality factor to catalyze excision. For PLEs 1, 3, 4 and 5 the recombination directionality factor has been shown to be PexA, which is uniquely encoded by ICP1 and underpins the specificity of PLE induction by ICP1. The recombination directionality factor required for PLE 2 Int is not yet known, though it is not PexA, and evidence points to it also being uniquely encoded by ICP1. PICI and PLE gene expression is also responsive to the presence of their respective helper phages. Although it is unknown how exactly PLE gene expression is induced by ICP1 infection, induction of the Gram-positive PICI genetic program is well understood. Prior to infection or induction by a helper phage, Gram-positive PICIs, like SaPIs, are thought to be transcriptionally quiescent in the bacterial host’s genome. 
A SaPI-encoded master repressor Stl prevents gene expression from the left and right SaPI operons. Upon infection by a specific helper phage, a phage-encoded protein disrupts the Stl-DNA complex, resulting in SaPI de-repression (Fig. 2). It is thought that Gram-negative PICIs use a different induction strategy than that of the Gram-positive PICIs. Rather than rely on a master repressor which is then de-repressed by a phage-encoded inducer protein, PICIs in Gram-negative bacteria use an AlpA activator to induce expression, though it is unknown how AlpA responds to infection by helper phage. The mechanism through which ICP1 triggers PLE gene expression is not yet known, but PLE gene expression appears to be dependent on the presence of ICP1. The responsiveness of gene expression to the SOS response differentiates Gram-positive PICIs from Gram-negative PICIs and PLEs described thus far. The SOS response is a bacterial stress response triggered by the presence of ssDNA; it can be induced in the laboratory by UV irradiation or by treatment with certain drugs like Mitomycin C or antibiotics like fluoroquinolones. LexA is a global repressor of the SOS regulon, which comprises a large number of genes that are repressed by LexA in the absence of SOS induction. Once de-repressed, these genes change the state of the cell, resulting in the cessation of translation and cell division. For many prophages, a homolog of LexA directly represses the lytic pathway, maintaining the prophage in a quiescent state in un-stressed cells. In S. aureus hosts that contain an integrated helper prophage that is sensitive to the SOS response, SOS induction can trigger the excision and de-repression of SaPIs. SaPIs are therefore indirectly triggered by the SOS response to mobilize and horizontally transmit to naïve cells, creating a link between antibiotic treatment and increased spread of virulence factors. 
Expression of the Gram-negative PICI activator, AlpA, is not directly impacted by the SOS response,


however its expression is likewise dependent on a yet to be identified phage-encoded inducer, and so it is possible that AlpA-activated PICIs are also indirectly induced by the SOS response. V. cholerae's PLEs are not thought to be induced by the SOS response, as their induction is specific to ICP1, a lytic phage, rather than temperate phage. Upon phage-triggered induction and excision from the chromosome, PICIs and PLEs begin replicating. SaPIs utilize their own primase (Pri), replication initiation protein (Rep) and origin (ori) in conjunction with their host’s bacterial DNA polymerase to replicate as DNA concatemers. PLE replication is likewise dependent on the PLE-encoded replicon comprised of an ori and a replication initiator protein RepA. A striking difference between PLEs and PICIs is in their dependence on phage-encoded products for replication: PICIs replicate autonomously, with their helper phage playing a primary role in transcriptional activation. For example, expression of the pri-rep-ori module on a plasmid is sufficient to drive replication of that plasmid in several Gram-positive hosts, indicating that host machinery is capable of replicating PICIs from Gram-positive and some experimentally validated Gram-negatives. In contrast, in PLE 1 (for which replication has been characterized), replication itself is dependent on ICP1-encoded products, indicating that PLEs have evolved to parasitize phage-encoded replication proteins. In support of the model of PLE-mediated replication hijacking, PLE replication is associated with decreased ICP1 DNA replication, whereas that has not been reported to be the case for PICIs.
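The integration and excision logic described in this section can be illustrated with a toy string model. This is a sketch under strong simplifying assumptions: the core sequence `GATTACA`, the function names and the sequences are invented for illustration, real attachment sites are longer and asymmetric, and the recombinase chemistry is reduced to a string splice. The only features taken from the text are that Int alone suffices for integration at attC, and that excision needs Int together with a directionality factor (Xis for SaPIs, the ICP1-encoded PexA for PLEs 1, 3, 4 and 5).

```python
CORE = "GATTACA"  # hypothetical shared core sequence, for illustration only


def integrate(chromosome, element_body, core=CORE):
    """Insert a circular element (core + body) at the chromosomal attC core."""
    site = chromosome.find(core)
    if site < 0:
        raise ValueError("no attC site in chromosome")
    left, right = chromosome[:site], chromosome[site + len(core):]
    # The integrated element ends up flanked by two core copies,
    # one within attL and one within attR.
    return left + core + element_body + core + right


def excise(chromosome, element_body, has_int, has_rdf, core=CORE):
    """Reverse the integration; requires Int AND a directionality factor."""
    if not (has_int and has_rdf):
        return chromosome, None
    integrated = core + element_body + core
    site = chromosome.find(integrated)
    if site < 0:
        return chromosome, None
    restored = chromosome[:site] + core + chromosome[site + len(integrated):]
    # The excised circle is written linearly, starting at its att core.
    return restored, core + element_body


naive = "AAAA" + CORE + "TTTT"
lysogen = integrate(naive, "ccccgggg")
print(lysogen)                                     # AAAAGATTACAccccggggGATTACATTTT
print(excise(lysogen, "ccccgggg", True, False))    # unchanged: no directionality factor
print(excise(lysogen, "ccccgggg", True, True))     # attC restored, element released
```

In this picture, the difference between SaPIs and PLEs lies only in where `has_rdf` comes from: a satellite-encoded gene that is de-repressed by the helper, or a gene carried by the helper phage itself.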

Packaging and Horizontal Transmission All satellite phage known to date use their helper phage’s structural components to package and move their genomes to new bacteria in a form of transduction. Many SaPIs rely on modification, specifically re-sizing, of their helper phage’s capsid. These smaller capsids are about one-third the size of helper phage capsids, and therefore are only able to accommodate the SaPI genome, which for many SaPIs is about one-third the size of the helper phage genome. Morphogenesis of a smaller capsid for SaPIs is accomplished by SaPI-encoded proteins, which can act as capsid scaffolds. Capsid modification is one mechanism SaPIs use to interfere with their helper phage (discussed further below). PICIs, like all phages, use two strategies for packaging: a headful mechanism dependent on a terminase complex and pac site, or a cos-site dependent mechanism. Many sequenced PICIs of both Gram-positive and Gram-negative bacteria encode their own small terminase subunit, TerS, which recognizes the PICI-encoded pac site, bringing it to the phage-encoded TerL subunit to preferentially package the PICI genome. However, several PICIs lack a TerS homolog, yet package and transmit at a high frequency. It was first shown that some SaPIs, such as SaPIbov5, utilize both a headful mechanism for packaging and a cos-site mechanism. Indeed, PICIs that encode a phage cos-site are more common than initially appreciated, and are found in lactococci as well as in Gram-negative PICIs from Escherichia coli. PICIs that are packaged via cos-sites use an unrelated protein Ccm (for cos capsid morphogenesis) which interacts with the phage capsid protein to direct formation of smaller phage capsids that only accommodate the smaller SaPI genome, reducing the titers of helper cos phage. 
By bioinformatics analyses, several Gram-negative PICIs appear to encode their own capsid monomer, potentially explaining how these PICIs are packaged into smaller capsids despite not possessing the genes cpmAB. V. cholerae’s PLE has been shown to transduce to naïve cells, but since few PLE-encoded gene products possess homology to any known proteins, there are no clear homologs of Ccm or CpmAB and the structural makeup and morphology of PLE transducing particles remain to be elucidated. Interestingly, the disparity in genome size between PLEs and their only known helper phage ICP1 is much greater than in other PICI systems: the genomes of PLEs are one-sixth the size of the ICP1 genome, so it remains to be determined if the same tactics used by SaPIs to redirect capsid assembly are utilized by PLE. Nonetheless, horizontal transfer of PLEs requires that the recipient cells possess ICP1's surface receptor, a component of the lipopolysaccharide O-antigen, suggesting that PLE transduction relies on ICP1 structural components. Regardless of how PLEs transmit themselves horizontally, they likely interfere with ICP1 particle assembly, since no infectious phage are produced following ICP1 infection of PLE-containing V. cholerae. It is likely a generalizable phage satellite strategy to steal packaging resources to promote horizontal transmission, as this ploy both interferes with the helper phage and promotes dispersal of the satellite phage genome.
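The size argument behind capsid re-sizing can be made explicit with a little arithmetic. The sketch below assumes round numbers: the 15 kb satellite size comes from the text, while the helper genome size is simply inferred from the stated one-third ratio and the capsid capacities are idealized as "one genome's worth" of DNA. None of these values is a measurement for a specific phage.

```python
def packaged_intact(genome_kb, capsid_capacity_kb):
    """A genome can be packaged intact only if it fits the capsid capacity."""
    return genome_kb <= capsid_capacity_kb


SAPI_KB = 15.0                        # "around 15 kb" in the text
HELPER_KB = 3 * SAPI_KB               # assumption, from the stated one-third ratio
FULL_CAPSID_KB = HELPER_KB            # assumption: normal capsid holds one helper genome
SMALL_CAPSID_KB = FULL_CAPSID_KB / 3  # re-sized capsid, about one-third the size

outcomes = {
    ("helper", "full capsid"): packaged_intact(HELPER_KB, FULL_CAPSID_KB),
    ("helper", "small capsid"): packaged_intact(HELPER_KB, SMALL_CAPSID_KB),
    ("SaPI", "small capsid"): packaged_intact(SAPI_KB, SMALL_CAPSID_KB),
}
for (genome, capsid), ok in outcomes.items():
    print(f"{genome} genome in {capsid}: {'fits' if ok else 'does not fit'}")
```

The same comparison for PLEs would need a capsid roughly one-sixth the normal capacity, which is one way to see why it remains open whether PLEs use the SaPI strategy at all.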

Strategies PICIs Use to Interfere With Their Helper Phages All PICIs discovered thus far confer a fitness benefit to their host bacterium by interfering with the propagation of their helper phages. As we have discussed, the co-dependence of PICIs on their helper phage for transmission means that natural selection would favor PICIs that only target vulnerable nodes in the phage lifecycle which are not self-defeating. Since most phage proceed through their lifecycle in a temporally regulated manner, where events late in infection depend on the success of those early in infection, PICIs have evolved not to interfere with early stages of their helper phage lifecycle, as they rely on the production of late gene products for their own transmission. PICIs therefore have finely tuned mechanisms that target assembly of their helper phage virions. In well-studied SaPIs three interference mechanisms have been described, and are used to varying extents by different SaPIs. First, as discussed above, SaPIs alter their helper phage capsids to produce smaller particles, restricting the packaging of the larger helper phage genome so that only the smaller SaPI genome can be packaged. Additional mechanisms of interference are found in the same gene cluster as the cpmAB operon (Fig. 1). A second mechanism further ensures that only the SaPI genome is packaged by targeting the helper phage terminase. The Ppi protein binds to the phage TerS, directly blocking the packaging of


helper phage DNA. Ppi does not interact with the SaPI TerS protein, ensuring that SaPI genome packaging proceeds uninhibited. A homolog of ppi is found in all sequenced SaPIs, as well as in several Gram-positive bacteria that have not been functionally shown to carry a satellite phage, indicating that additional phage satellites are awaiting discovery. A third mechanism helps SaPIs interfere with certain helper phages through the PtiA, PtiB and PtiM proteins, which interact with the helper phage late gene cluster. The late gene cluster in the helper phage genome encodes structural components of the phage particles and host cell lysis functions. PICI helper phage regulate this late gene cluster by a member of the “late transcription regulator” (Ltr) superfamily of proteins. The SaPI protein PtiA directly binds to LtrC to inhibit its activation of late gene expression. The gene immediately upstream of PtiA, PtiM, is co-translated with PtiA and acts as a modulator of the LtrC-PtiA interaction (Fig. 2). PtiM binds to PtiA, decreasing its ability to bind LtrC and allowing production of some structural components for SaPI packaging, as well as the lysis genes required for release of transducing particles. A third protein, PtiB, is known to interact with PtiAM, but is toxic to the host cell when expressed and its role in modulating late gene expression is not known. As discussed above, available evidence suggests that PLEs also transmit horizontally using ICP1 structural components, although the molecular basis for PLE-mediated phage inhibition remains to be determined. PLEs may encode proteins with functions analogous to those of satellite phage that divert structural components to promote horizontal transmission, and those functions would likely also inhibit ICP1 assembly. However, there are several interesting differences between PICIs and PLEs that operate at the level of phage interference. 
The most striking of these differences is that PLEs completely block ICP1 production such that no new ICP1 progeny are produced during a burst assay. In contrast, SaPI1 reduces, but does not block, the reproduction of its helper phage 80α in a comparable burst assay. This could indicate that all PLEs target several distinct assembly hubs, such that absolutely no ICP1 particles can assemble, and/or that the PLE-mediated decrease in ICP1 DNA replication compounds the effects on assembly to ensure that no ICP1 can overcome interference. It has also been observed that PLEs can accelerate cell lysis following ICP1 infection, which again may act in concert with other strategies to ensure complete inhibition of ICP1. PLEs appear to only provide a benefit to their V. cholerae host at the population level, as the infected cell always dies, but since no infectious ICP1 are produced, uninfected neighboring cells survive. In contrast to PLEs, PICIs known to target phage DNA replication or phage-mediated lysis have not been reported.
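The contrast between partial and complete interference is easiest to see in terms of burst size, the quantity a burst assay measures. The sketch below uses invented counts purely for illustration; the function names are hypothetical and the numbers are not data from any 80α or ICP1 experiment.

```python
def burst_size(progeny_phage, infected_cells):
    """Average number of progeny phage released per infected cell."""
    if infected_cells <= 0:
        raise ValueError("infected_cells must be positive")
    return progeny_phage / infected_cells


def fractional_inhibition(burst_with_satellite, burst_without_satellite):
    """Fraction of helper phage production lost in satellite-carrying hosts."""
    return 1 - burst_with_satellite / burst_without_satellite


# Invented counts: 1000 infected cells in every condition.
no_satellite = burst_size(100_000, 1000)      # 100 progeny per cell
partial_block = burst_size(25_000, 1000)      # helper reduced, not eliminated
complete_block = burst_size(0, 1000)          # no helper progeny at all

print(fractional_inhibition(partial_block, no_satellite))   # 0.75
print(fractional_inhibition(complete_block, no_satellite))  # 1.0
```

Only the second case, with an inhibition of exactly 1, protects neighboring uninfected cells completely, which is the population-level benefit described for PLEs.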

Prevalence of Phage Satellites Satellite phage that use a helper phage to trigger excision and spread themselves through parasitizing phage structural components appear to be widespread in many Gram-positive bacteria that are important for human health and disease, such as staphylococci, streptococci and lactococci. In addition, a mobile genetic element in Enterococcus faecalis, EfCIV583, was recently shown to be functional and use F1 as an inducing and packaging helper phage. EfCIV583 possesses many of the defining features of SaPIs (int/xis function, replication mediated by the element’s Pri-Rep-ori cluster, repression by a master regulator Rpr), except that it lacks the TerS subunit that SaPIs use to direct packaging of their own genome. In addition to PLEs in V. cholerae, a recent study has identified PICIs in several Gram-negative bacteria. Putative PICIs have been identified in several strains of E. coli and Pasteurella multocida, with functionality demonstrated for EcCICFT073 and EcCIO42 from E. coli and for PmCIATCC43137 from P. multocida. Several differences between Gram-negative and Gram-positive phage satellites suggest that they evolved multiple times in the different lineages. These studies likely underestimate the prevalence of comparable islands, however, since bioinformatic approaches have relied on conserved genetic organization to identify PICIs in Gram-negative bacteria, whereas PLEs have no known homology to any PICIs and possess a distinct genetic organization (Fig. 2). Perhaps the earliest characterized phage parasite is the P4 satellite phage found in E. coli, which co-opts the phage P2 for key stages of its lifecycle. P4 relies on induction or infection by its helper phage P2 for activation and packaging, while encoding its own replicon. Familiarly, P4 interferes with P2 particle formation by altering capsid morphology to only allow successful packaging of the smaller P4 genome. 
The similarities between the P2-P4 system and the PICI-helper phage systems we have discussed here were once overlooked because the genomic structure of P4 is distinct from that of the Gram-positive SaPIs. However, as we have seen, the discovery of additional PICIs with alternative arrangements suggests that there is not one formula for a PICI, and the functional similarities between the P2-P4 system and known PICIs suggest these genetic elements should all uniformly be considered phage satellites. The convergent evolution of PICIs suggests that life as a phage satellite is a successful strategy to allow for genome reduction, and therefore would be selected for in multiple bacterial hosts where infection by a specific phage is a reliable feature of the host’s biology over time. Unbiased approaches to study genome flux following phage infection are needed to discover the true breadth of MGEs that harness phage for their own spread.

Counter-Evolution by Phages to Avoid Parasitism by Satellites In the known cases of helper phage parasitism by satellite phages, the excised elements limit the spread of the helper phage – partially in the case of PICIs, or completely inhibiting ICP1 in the case of PLEs. This partial or complete inhibition of helper phage propagation exerts strong purifying selection on the phage to overcome parasitism by its satellite phage. As discussed above, expression of the majority of SaPI genes is under the control of the global repressor Stl. Helper phages de-repress SaPIs by producing a non-essential phage protein that binds to Stl to disrupt the Stl-DNA complex leading to induction of the SaPI


program. Interestingly, these inducing proteins are structurally unrelated, yet all interact with the Stl repressor to perform the same function. Helper phage can therefore overcome SaPI interference by acquiring mutations in the inducing protein. Sequence analyses of known SaPI inducing proteins have shown that they are under purifying selection, which, interestingly, increases the allelic variance at this locus. Experimentally, 80α encodes at least four anti-repressor proteins, one of which induces three different SaPIs that all have the same Stl: SaPIbov1, SaPIbov2 and SaPI1. Within three passages of 80α on each of these SaPIs, phages that do not induce SaPIs are selected for, demonstrating that phage readily evolve mutations in the inducing proteins. Interestingly, although these proteins are not essential, phage that possess a mutant version of the SaPI-inducing protein are less fit than wild type phage in the absence of SaPI pressure. Dut is a phage-encoded dUTPase, which maintains low cellular dUTP relative to dTTP in order to prevent the introduction of uracil into DNA, and serves as the inducer for SaPIbov1 and SaPIbov5. Between phages, Duts show striking allelic variation and variation in their ability to induce SaPIs, suggesting the pressure to avoid inducing SaPIs has driven this diversification. Continued passaging of SaPIs with their mutant non-inducing helper phage results in selection for SaPIs that typically have a mutation in Stl, resulting in their constitutive expression, which can be detrimental to their bacterial host. Phage-encoded inducer proteins and SaPIs are in a dynamic arms race, where a helper phage evolves resistance to SaPIs by mutating or losing the inducer gene (assuming the fitness costs are not too high relative to benefits). SaPIs then acquire mutations in their repressor so they are constitutively expressed in the absence of their helper phage, meaning they are unable to be packaged and spread by their helper phage. 
This removes the adaptive advantage that the mutant SaPI allele has and such SaPIs are lost from the population, favoring again the SaPIs with compatible regulation which can then spread using their helper phage. Despite many parallels between PLEs and PICIs in terms of mobilization by specific inducing phage and phage interference, induction of PLEs appears to show little functional similarity to the canonical SaPIs. PLE excision requires the previously described, ICP1-encoded gene product PexA. It is possible to generate ΔpexA ICP1 mutants in the laboratory that fail to catalyze PLE excision; however, mutations in PexA are not seen in natural isolates of ICP1. PexA mutations are likely not selected for in nature because, surprisingly, PLEs still inhibit ICP1 without excision and circularization. In keeping with this observation, it has not been possible to successfully evolve ICP1 to avoid inducing PLEs in vitro with sequential passaging as has been done to study the SaPI-helper phage paradigm. These findings indicate that PLEs may have evolved to respond to several distinct ICP1-encoded inducers, such that ICP1 may not be able to overcome PLE-mediated inhibition through mutation and succeed in nature. However, PLEs do impose a significant evolutionary burden on ICP1 in nature, as evidenced by some isolates of ICP1 acquiring an anti-PLE defense strategy in the form of a Type I-F CRISPR-Cas system. Much in the way that CRISPR-Cas functions as an adaptive immune system to target foreign DNA for destruction in bacteria, the ICP1-encoded CRISPR-Cas system deploys CRISPR-RNAs transcribed from spacers that are complementary to PLE sequence to guide an effector nuclease to limit PLE activity. ICP1 is the first phage demonstrated to encode a functional CRISPR-Cas system, and such an unexpected evolutionary innovation highlights the significance of competition between a helper phage and its satellite in nature. 
The increasing prevalence of CRISPR-Cas+ ICP1 isolates in Bangladesh underscores the constant pressure ICP1 faces against PLEs in its V. cholerae host. Intriguingly, even though the ICP1 CRISPR-Cas system limits PLE replication, transmission and ICP1 interference, PLEs are still increasing in frequency in epidemic V. cholerae isolates, suggesting that although ICP1 has acquired a formidable weapon against PLEs, PLEs may still confer a fitness advantage to V. cholerae.
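The idea that a spacer directs the nuclease to a complementary stretch of the satellite can be sketched as a sequence search. The example below is a deliberately reduced illustration: the sequences are invented, exact matching on either strand stands in for crRNA pairing, and real targeting requirements such as a protospacer-adjacent motif or tolerance of mismatches are ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def spacer_matches(spacer, target):
    """True if the target carries an exact match to the spacer on either strand."""
    return spacer in target or reverse_complement(spacer) in target


# Invented toy sequences, not real ICP1 spacers or PLE sequence.
toy_satellite = "TTGACCGTAAGCTTGGCATCGATCCA"
spacers = {
    "spacer_1": "AAGCTTGGCA",                        # present on the top strand
    "spacer_2": reverse_complement("GGCATCGATC"),    # present on the bottom strand
    "spacer_3": "CCCCCCCCCC",                        # no match
}

for name, spacer in spacers.items():
    print(name, spacer_matches(spacer, toy_satellite))
```

On this view, a satellite can escape targeting by changing the matched sequence, and the phage can respond by acquiring new spacers, which fits the ongoing competition described above.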

Consequences of Phage Satellites in Human Health and Disease

The first characterized SaPI demonstrated that phages spread the gene for toxic shock syndrome toxin to naïve S. aureus cells through mobilization of a genetic island that has since come to be recognized as a phage satellite. The majority of S. aureus virulence factors are found on SaPIs, including toxic shock toxin and other superantigens, in leftward and rightward accessory regions with their own regulatory regions (Fig. 1). Accessory genes can make the host bacterium more pathogenic by encoding toxins (such as the tst gene product or the enterotoxins EntQ and EntK). Other accessory genes can confer resistance to aminoglycosides and fosfomycin, or encode multidrug transporters. Also relevant to host-pathogen interactions are genes involved in iron transport and biofilm formation that have been found on SaPIs, which are important for successful colonization of human hosts and evasion of the immune response. In addition to carrying SaPI-encoded virulence factors, these phage satellites have been shown to transfer portions of the host chromosome between bacteria through generalized transduction. This can greatly accelerate the speed at which bacteria evolve within their human hosts. With the rise of antibiotic resistance among human pathogens, there is renewed interest in phage therapy as a way of combating antibiotic-resistant infections. Current interest in phage therapy has focused on lytic phages, which do not integrate into the host chromosome. Lytic phages are often considered safe because they are not likely to transduce genetic material from host to host through specialized transduction. This simplified view of the potential impact of lytic phages on promoting genomic flux is quite misleading. As highlighted in this article, lytic phages are capable of inducing and horizontally transferring mobile genetic elements.
Although all known SaPIs and PICIs for which helper phages have been identified have evolved to parasitize temperate phages, PLEs in V. cholerae are mobilized by ICP1, a lytic myovirus. The dominance of temperate phages in mediating PICI mobilization is likely an artefact of the ease with which temperate phages can be identified (i.e., their genomes are present in single colony isolates) rather than a biological constraint that prohibits MGEs from parasitizing lytic phages. When the lytic phage ICP1 infects a PLE-containing V. cholerae cell, only PLE transducing particles are produced when the cell lyses, and PLEs can transduce naïve V. cholerae, transforming phage-sensitive hosts into dead ends for future ICP1 attack. Phage therapy employing ICP1 has been suggested as a viable means to limit cholera spread in endemic regions, though mechanistic insights demand caution, as phage resistance and perhaps other beneficial phenotypes are likely to become widespread. The notion of employing CRISPR-Cas+ ICP1 to overcome PLE-mediated resistance in V. cholerae is likewise oversimplified, as CRISPR targeting of PLE can permit ICP1 replication while still allowing for some PLE transmission.

Conclusion

PICIs are widespread within both Gram-positive and Gram-negative bacterial hosts, playing a key role in host virulence and phage resistance. These phage satellites do not encode their own structural and lysis genes, but rather parasitize a helper phage for excision, induction, and packaging, and in the case of PLEs, also for genome replication. The reliance on a helper phage allows the satellite to reduce its genome size, suggesting that perhaps phage satellites evolve from defunct temperate phages. However, it remains possible that diverse MGEs evolve de novo to parasitize specific phages to promote their own dissemination. PICIs are functionally analogous to the so-called integrative plasmid P4, which acts as a phage satellite of the inducing phage P2 infecting E. coli. This strategy of genome loss and parasitism of a helper phage has evolved independently multiple times, and these phage satellites are likely present in diverse bacterial hosts that have not yet been characterized. Once more is known about how PLEs are induced and packaged, they may be included within the larger umbrella of PICIs. PLEs are not induced by SOS induction of their cognate helper phage, and it is unknown whether PLEs encode virulence factors. With the discovery of CRISPR-Cas and the advent of a public health crisis caused by antibiotic-resistant bacterial infections, there has been renewed interest in phages, both as tools for biotechnology and as treatments for bacterial infections through phage therapy. There are many unknowns in how bacteria respond to the constant pressure of phage predation, and even less is known about how competing phage satellites battle for a bacterial host. The prevalence of PICIs indicates that phage infection can lead to unexpected outcomes, such as the spread of virulence, anti-phage, and antibiotic resistance genes.
With the increase in the number of sequenced genomes and improved bioinformatic pipelines, we will likely continue to find new genetic elements that parasitize phage through induction and co-option of replication and packaging materials. Discovery and investigation of PICIs and PICI-like elements will reveal how bacterial evolution can proceed at a pace unexpected by mutation alone – the acquisition of mobile genetic elements can dramatically re-shape the phenotype of their host in one generation. One consequence of acquiring a PICI is restriction of helper phage production when infecting that host. PICI interference can select for helper phage mutants or, in the case of ICP1, selection of phage that have acquired CRISPR-Cas to overcome PLE. This evolutionary arms-race between phage satellites and their inducing phage can yield unexpected outcomes, urging caution before applying the use of phage in treating bacterial infections.


Portal Vertex

Peng Jing and Mauricio Cortes Jr., Department of Chemistry, College of Arts and Sciences, Fort Wayne, IN, United States

© 2021 Elsevier Ltd. All rights reserved.

Glossary

ATPase: An ATPase is an enzyme that catalyzes the hydrolysis of a phosphate bond in adenosine triphosphate (ATP) to form adenosine diphosphate (ADP). The free energy harnessed from this reaction can be used to drive other cellular reactions.

Brownian motor: The Brownian motor is a device that can do mechanical or electrical work at the nanometer scale. In the device, externally sourced energy is needed to generate fluctuating anisotropic energetic potentials. The anisotropic potentials provide forces that bias the motion of a particle in one direction. Translocation of the particle is then a consequence of unidirectional Brownian motion.

Capsid: A capsid is the protein shell of a virus. It consists of several oligomeric structural subunits made of protein called protomers.

Coat protein: A predominant protein in the capsid that can form tiny channels in the major capsid proteins so that only ions or water can pass through.

Concatemeric DNA: Concatemeric DNA is a multimeric head-to-tail polymer of viral DNA that contains multiple copies of an entire genome linked in series. It serves as the substrate from which a bacteriophage terminase produces the viral genome during DNA assembly.

Portal protein: The portal protein is a dodecameric protein located at the portal vertex. It serves as a conduit through which dsDNA is translocated during DNA packaging in tailed bacteriophages. In bacteriophage Phi29, the portal protein is also called the connector.

Prohead: A prohead, or procapsid, is an immature viral capsid structure formed in the early stages of self-assembly of some bacteriophages, including tailed bacteriophages.

Scaffolding protein: The scaffolding protein is the protein that directly interacts with the portal protein to help form the empty prohead along with other proteins. After the formation of the prohead, this protein is usually expelled, leaving space for genomic DNA.

Terminase: Terminase is a viral enzyme that usually consists of two subunits of different sizes. The small subunit recognizes viral DNA by binding a specific or nonspecific site on the concatemer. The large subunit, docking on the portal vertex, contains the ATPase activity that powers translocation of the viral genome into the prohead and the endonuclease activity that cuts concatemeric DNA to a single genome length.

The Donnan's effect: The effect (also known as the Donnan law, Donnan equilibrium, or Gibbs–Donnan equilibrium) describes the uneven distribution of charged particles near a semi-permeable membrane. The usual cause is the presence of a different charged substance, e.g., DNA, that is unable to pass through the membrane and thus creates uneven electrical potentials across the membrane.

The head-tail connector: In a mature virion particle, the head-tail connector usually refers to a protein complex composed of the portal protein, the adapter protein, and the stopper protein, which connects the filled head to the tail.

The Tail (The tail protein complex): The bacteriophage tail is a molecular machine used during infection to recognize the host cell and ensure efficient genome delivery into the cytoplasm.

Introduction

The Caudovirales, an order of viruses, are known as the tailed double-stranded DNA (dsDNA) bacteriophages. Based on their tail morphology, they are divided into three families: Myoviridae, which have long contractile tails (e.g., phages T4 and P1); Siphoviridae, which have long noncontractile tails (e.g., λ and SPP1); and Podoviridae, which have short noncontractile tails (e.g., Phi29 and P22). A major strategy that bacteriophages use to protect their genomes from the environment is to preassemble icosahedral proheads (procapsids). By means of a powerful DNA packaging motor, viral genomes are subsequently pumped into the preassembled icosahedral heads through a viral portal protein located at a unique axis of 5-fold symmetry known as the portal vertex. The portal provides a unique site through which the viral genome enters and exits the prohead. Because tailed bacteriophages and certain human pathogens in the herpesvirus family share similar structural features, their DNA packaging processes are assumed to derive from an evolutionarily conserved mechanism. Owing to their ease of handling in research, tailed bacteriophages are typically used as model systems for studying this viral genome encapsidation mechanism. Over the past 50 years, the assembly mechanisms of several dsDNA bacteriophages have been investigated in depth, and the experimental results have shown that the portal vertex is formed by a structurally conserved dodecameric portal protein that plays key roles in viral chromosome packaging. Fig. 1 shows a typical, simplified viral assembly pathway in which the portal protein is directly involved in the following major stages in a tailed bacteriophage:

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21518-5

Fig. 1 A Typical and Simplified Pathway for the Viral Assembly of a Mature Virion Particle.

Formation of a Prohead

Before DNA packaging can start, an icosahedral prohead needs to be preassembled. Aided by scaffolding proteins, monomers of the portal protein may polymerize to form a dodecameric portal ring. The portal ring may work as a reaction center where both coat proteins and scaffolding proteins associate to form a preassembled prohead. When the assembly is finished, the scaffolding proteins trapped inside the prohead may diffuse out or be degraded, and they are not present in the completed virion.

DNA Packaging into the Prohead

In the majority of the tailed phages, a protein complex, terminase, is used to begin DNA packaging. The complex usually consists of two subunits: the large subunit (TerL) and the small subunit (TerS). In the first step, TerS recognizes a concatemeric DNA substrate in which many copies of the viral genome are linked end-to-end. After the DNA-TerS complex has formed, TerL docks on the portal protein of a prohead. Driven by the TerL ATPase, the concatemeric DNA is packaged into the prohead. When a genome length of DNA has been encapsidated, the portal protein may signal to TerL and activate its endonuclease to cut the concatemeric DNA substrate at a specific site (e.g., λ) or at nonspecific sites (e.g., T4, P22, SPP1), which stops the packaging process. The packaging mode with nonspecific cleavage is called headful packaging. Some tailed bacteriophages use different packaging modes. For example, in Phi29 the ATPase known as GP16 is equivalent to the ATPase of TerL in other phages, but GP16 does not have endonuclease activity. Instead, the Phi29 portal protein is bound to a packaging RNA (pRNA), which can form a pentameric ring on the portal protein. The pRNA provides a binding site for the ATPase and is therefore considered to be evolutionarily related to a part of TerL. The DNA substrate for the ATPase contains one unit of the viral genome with a protein (GP3) covalently bound at both ends. The ATPase stops the packaging process when the full genome has been packaged into the prohead.

Virion Maturation

As the prohead is filled with DNA, the icosahedral lattice may expand to mature the capsid shell. With the assistance of the portal protein, the packaged DNA adopts a highly ordered, spool-like arrangement. Once the terminase has dissociated, the portal protein may close its channel. Adapter proteins, stopper proteins, and the tail are subsequently assembled at the portal vertex, completing the maturation of the virion particle.

The Structural Features of the Portal Proteins

The subunits of the portal proteins vary in molecular mass and amino acid sequence among the tailed bacteriophages shown in Table 1. However, they share highly conserved structural features; for example, the portal channel in the virion is always composed of 12 subunits. In contrast, when portal proteins are purified and self-assembled in vitro, the portal rings may consist of 11–14 subunits (e.g., 12-mers and 13-mers are found for the Phi29 and SPP1 portal proteins). A similar inconsistency in oligomerization state is found for the pRNA of phage Phi29, which is observed by cryogenic electron microscopy (Cryo-EM) to have five subunits in virion particles, whereas in vitro assembly experiments showed a hexameric RNA complex. These results indicate that in vivo assembly of the portal ring may need assistance from other components such as the scaffolding proteins (Fig. 1). In addition to the 12-fold symmetry, the central channel of the portal proteins has a diameter of at least 2.8 nm at its narrowest constriction, so that dsDNA has sufficient space to pass through. The channel in the portal resembles a truncated cone (Fig. 2), in which the end of the channel where DNA enters during packaging is always narrower than the end where it exits. The amino acid residues on the interior wall of the channel are predominantly negatively charged, which may generate strong electrostatic repulsion against the DNA. The subunits of the portal proteins in the tailed bacteriophages have a similar core organization, which can be divided into four domains (Fig. 3): the clip, stem, wing, and crown domains. A brief summary of the main functions of the conserved structural motifs is provided in Table 2.
The clip domain: The domain consists of an α/β fold and may interact with adjacent portal subunits through hydrophobic interactions to stabilize the oligomerization state of the portal protein. The domain is exposed to the exterior of the capsid, where it provides an interface for contact with the coat proteins, a binding site for components of the DNA packaging motor (e.g., TerL), and a site for phage tail attachment.

The stem domain: This domain is sandwiched between the clip and wing domains and features two well-conserved α-helices. The central channel in the dodecamer is surrounded by the 24 stem helices, which may impart structural plasticity and induce conformational changes during DNA packaging. A conserved structure known as the channel loops is present in the stem domain. The channel loops have been shown to generate a constriction in the channel that prevents leakage of genomic DNA near the end of DNA packaging.

The wing domain: The domain is internal to the prohead and lies above the stem domain. It is a globular domain whose size differs among tailed bacteriophages. Hydrophobic amino acid residues in the domain, such as tryptophan, can form a conserved hydrophobic belt around the exterior surface of the portal protein. This region is assumed to interact with scaffolding proteins to form the ring structure and initiate the assembly of the prohead. Owing to the presence of the hydrophobic belt, the Phi29 portal protein can be readily inserted into a bilayer lipid membrane for channel property measurements and may potentially be used for nanopore sensing technology.

The crown domain: The crown domain in the portal proteins of the tailed bacteriophages is located at the C-terminus in the primary structure of the subunit. The crown domain in the portal proteins of phages SPP1, T4, and P22 is oriented towards the center of the capsid and is directly in contact with the packaged DNA inside the capsid. The structure of the crown domain varies somewhat among tailed bacteriophages. Because of its flexibility, the crown domain of the portal proteins in some phages, e.g., Phi29, cannot be directly visualized in crystal structures or in Cryo-EM images. On the other hand, in the portal protein of P22, a 20 nm left-handed coiled-coil helical barrel extends from the crown domain. Site-directed mutation studies have shown that the barrel may even be involved in the organization of the spool-like DNA structure inside the capsid.

Table 1

A Comparison of the Primary Structures of Several Portal Proteins in the Tailed Bacteriophages

Phage Family    Virus Name    Number of Amino Acids in a Protomer    I.D. in the PDB Database
Podoviridae     Phi29         309                                    1FOU, 1JNB, 1IJG, 1H5W
Podoviridae     P68           327                                    6Q3G
Podoviridae     T7            490                                    6QWP, 6QX5, 6QXM, 3J4A
Podoviridae     P22           725                                    3LJ5, 4V4K, 5JJ3, 5JJ1
Myoviridae      T4            455                                    3JA7
Siphoviridae    SPP1          503                                    2JES
Siphoviridae    G20C          426                                    4ZJN, 5NGD, 6IBG

Note: Only the portal proteins whose 3-D structures are available in the RCSB Protein Data Bank (PDB) are included in the table.
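To get a feel for the size of the assembled portal, the protomer lengths in Table 1 can be combined with the 12-subunit stoichiometry stated in the text. The sketch below is a rough estimate only: the average residue mass of 110 Da is a common rule of thumb introduced here as an assumption (it is not from the table), and the function names are hypothetical.

```python
# Residue counts are copied from Table 1; the mass conversion is an assumption.
RESIDUES_PER_PROTOMER = {
    "Phi29": 309,
    "P68": 327,
    "T7": 490,
    "P22": 725,
    "T4": 455,
    "SPP1": 503,
    "G20C": 426,
}

SUBUNITS_IN_VIRION = 12        # dodecameric portal ring, as stated in the text
AVG_RESIDUE_MASS_DA = 110.0    # assumption: rule-of-thumb average residue mass


def dodecamer_residues(phage: str) -> int:
    """Total number of amino acids in a 12-subunit portal ring."""
    return RESIDUES_PER_PROTOMER[phage] * SUBUNITS_IN_VIRION


def approx_dodecamer_mass_kda(phage: str) -> float:
    """Rough mass of the portal ring in kDa, using the average residue mass."""
    return dodecamer_residues(phage) * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    for name in RESIDUES_PER_PROTOMER:
        print(f"{name:6s} {dodecamer_residues(name):5d} residues  "
              f"~{approx_dodecamer_mass_kda(name):.0f} kDa")
```

Under these assumptions the smallest portal in the table (Phi29) and the largest (P22) differ by more than a factor of two in size, even though both assemble into the same 12-fold ring.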

Fig. 2 The Structure of the Portal Proteins Available from the PDB Database.

Functions of the Portal Proteins

The Viral Portal Protein as a Nucleator in the Assembly of the Prohead With Proper Morphology

Prohead assembly is the earliest stage in the assembly of the tailed bacteriophages. Three major proteins are involved in the process: the portal protein, the scaffolding proteins, and the coat proteins. The portal protein functions as a nucleator, and the scaffolding proteins may work as assembly chaperones that bind portal protein subunits and assist their oligomerization into the ring structure. The conserved hydrophobic belt encircling the portal ring at the wing domain may provide a site where scaffolding proteins bind to the portal protein. After the formation of the portal protein-scaffolding protein complexes, coat proteins copolymerize with the scaffolding proteins. As copolymerization proceeds, the shell of the prohead grows around the portal ring. Experiments show that a larger number of portal rings increases both the yield and the rate of prohead assembly. In addition, the portal ring was shown to be capable of correcting the morphology of viral particles even in the presence of a high ratio of scaffolding protein to coat protein. When the copolymerization ends, the scaffolding proteins may diffuse out of the prohead, or may be degraded and then leave it, making room for incoming genomic DNA. During the process, the portal protein may act as an exit for scaffolding proteins or their degradation products. This mechanism can explain why only one portal ring is found, at a unique portal vertex, in the capsid of the tailed bacteriophages.

Portal Proteins Provide Binding Sites for Terminases

The clip domain is a conserved structure that protrudes from the exterior of the capsid and is the site where the large subunit of the terminase (TerL) can bind. For example, in phage T4 the clip domain of the portal binds to the C-terminus of TerL. After the terminase is bound, genomic DNA is translocated into the prohead in an ATP-dependent process driven by the TerL ATPase. Some tailed bacteriophages differ in the binding process. For example, in Phi29 the clip domain of the portal protein provides a binding site for a pentameric pRNA. The ATPase known as GP16 (equivalent to TerL in other bacteriophages) docks on the pRNA and self-assembles into a ring structure around the pRNA to package the DNA into the prohead.

Fig. 3 The Conserved Structural Motifs in the Monomers of the Portal Proteins Available at the PDB Database.

Portal Proteins Bound With Adapter Proteins Facilitate Tail Attachment

At the end of DNA packaging, after the terminase dissociates from the portal protein, the clip domain of the portal protein becomes an interface where the adapter proteins dock; the stopper proteins then attach to plug the channel and prevent accidental leakage of the genomic DNA. The protein complex, composed of three proteins (the dodecameric portal protein, adapter proteins, and stopper proteins), forms a knob-like structure at the portal vertex that connects the head to the viral tail. Because of this function, the protein complex is also called the head-tail connector, which provides a site for attachment of the tail. It should be noted that, in the literature on Phi29, the term connector refers only to its portal protein, also known as GP10.

Portal Proteins Help to Spool Condensed Genomic DNA in the Capsid to Form a Highly Ordered Structure

The packaged DNA in the capsid can be condensed to a very high concentration, about 500 mg/mL, and Cryo-EM images have shown that the chromosomal DNA in the capsid is packed in a highly ordered, liquid-crystal-like manner. The highly ordered structure prevents entanglement between DNA segments, which benefits subsequent DNA ejection. The arrangement of the packaged DNA, however, varies among tailed bacteriophages. For example, T4 DNA is arrayed longitudinally along the head-tail axis with both packed ends located in the portal protein, whereas in T7 the packed DNA is spooled horizontally around the head-tail axis of the portal protein. The formation of the highly ordered structures of the packed chromosomal DNA in the capsid is assumed to be associated with the crown domain of the portal protein. However, how the domain helps to spool the DNA into different structures is still unknown. In P22, the portal protein is unique in that both the crown domain and its extension, a coiled-coil α-helical barrel, can be visualized clearly in the 3-D structure of the protein. The barrel is assumed to interact with genomic DNA and help the genome spool onto the interior surface of the capsid during genome packaging, working like a rotary sprinkler. To facilitate DNA ejection, the domain can also provide a docking site for ejection proteins and ensure DNA ejection without tangling. In phages that do not have the barrel structure, ejection proteins can assemble into a proteinaceous core bound to the portal crown domain. The resultant tunnel created by the stacked proteins, similar to the barrel structure in P22, is assumed to spool the DNA into a highly ordered structure inside the capsid.

Table 2

The Main Functions of the Conserved Structures in the Portal Proteins Among the Tailed Bacteriophages

Highly Conserved Structural Motifs and the Functions of the Structural Motifs

The Crown Domain as well as its extension (the barrel structure)
    Senses the pressure of the chromosome in the capsid to terminate DNA packaging.
    The barrel structure: Helps the DNA form a spool-like structure. May hold the trailing end of the packaged genome to keep the DNA primed for ejection.
    May serve as a packaging sensor for the pressure, the length, and the density of the packaged DNA, and as a valve to retain the DNA in the capsid at the end of DNA packaging.
    May help to spool the DNA inside the capsid.

The Wing Domain
    The hydrophobic residues encircling the wing domain are the most likely region that interacts with the scaffolding protein to form a dodecameric ring structure and that provides a binding region for coat proteins.
    Helps stabilize the interactions of the adjacent subunits in the portal ring structure.

The Stem Domain
    The exterior of the domain: May bind the terminase for DNA packaging in the prohead and the adapter proteins in the mature virion for tail assembly.
    The channel in the domain helps to grip and release DNA to assist DNA packaging.
    The tunnel loops help to translocate DNA.
    The motion of helix α5 on the loops may provide a path for signal and force transmission between the loops, the DNA, and the viral ATPase.
    The channel loops in the channel play a critical role in the retention of the DNA under high pressure.

The Clip Domain
    May bind to TerL to assist DNA packaging and play a role in the cross talk between the portal and the terminase.
    May bind to coat proteins.
    The negatively charged amino acid residues at the channel entrance at the clip domains can keep translocated DNA centered in the channel.
    Interlocks the adjacent subunits by hydrophobic interactions to impart additional stability to the oligomerization states.
    Provides a site for the docking of pRNA, which is subsequently bound by the ATPase.

The Portal Proteins Studied: Phi29, P22, T4, SPP1, HK97, Lambda, and T3 portal proteins.
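The packing density of about 500 mg/mL quoted for encapsidated DNA can be turned into more intuitive numbers with a short calculation. This is a back-of-the-envelope sketch, not a measurement: the average mass of 650 g/mol per base pair is a rule-of-thumb assumption, the genome length of 19,300 bp is an approximate, assumed value used as a Phi29-sized example, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23       # particles per mole (exact SI value)
BP_MASS_G_PER_MOL = 650.0      # assumption: rule-of-thumb mass of one base pair


def bp_molarity(conc_mg_per_ml: float,
                bp_mass: float = BP_MASS_G_PER_MOL) -> float:
    """Molar concentration of base pairs (mol/L); note that mg/mL equals g/L."""
    return conc_mg_per_ml / bp_mass


def packed_volume_nm3(genome_bp: int, conc_mg_per_ml: float,
                      bp_mass: float = BP_MASS_G_PER_MOL) -> float:
    """Volume (nm^3) occupied by one genome at the given DNA concentration."""
    mass_g = genome_bp * bp_mass / AVOGADRO
    volume_ml = mass_g / (conc_mg_per_ml / 1000.0)   # mg/mL -> g/mL
    return volume_ml * 1e21                          # 1 mL = 1 cm^3 = 1e21 nm^3


if __name__ == "__main__":
    print(f"{bp_molarity(500):.2f} mol/L of base pairs")
    print(f"{packed_volume_nm3(19_300, 500):.0f} nm^3 for a 19.3 kbp genome")
```

Under these assumptions, 500 mg/mL corresponds to roughly 0.77 mol/L of base pairs, which gives a sense of why the packaged DNA must adopt an ordered, spool-like arrangement rather than a random coil.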

Portal Proteins as a Check-Valve Preventing DNA From Slipping Out

A DNA packaging motor transports the viral DNA into the procapsid against an internal pressure that can reach 20–60 atm, and thus a check-valve mechanism is needed to prevent the genomic DNA from slipping out of the prohead. Molecular dynamics simulations of the Phi29 portal protein, together with in vitro viral assembly experiments, have shown that three positively charged residues (K234, K235, and R237) on the structurally conserved channel loop play a crucial role in the check-valve mechanism. As pressure builds up inside the prohead, these positively charged amino acids may interact strongly with the genomic DNA to prevent it from slipping out. Based on structural observations, two further rings of positively charged lysine residues (K200 and K209), located on the clip domain in the channel of the Phi29 portal protein, are also assumed to be involved in the check-valve mechanism. Evidence for the check-valve mechanism was obtained in an in vitro T4 DNA packaging experiment using optical tweezers. The experiments found that the terminase proteins did not grip the portal protein strongly when ATP was hydrolyzed into ADP. In this situation, the T4 portal tunnel loops are assumed to strongly "grip" the DNA to prevent it from slipping out. Similarly, the lysine residues K331 and K342 in the SPP1 portal protein are proposed to have gripping functions during DNA packaging.
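To put the quoted 20–60 atm into other units, the following sketch converts the pressure to megapascals and makes a deliberately crude estimate of the force pushing a DNA duplex back out of the channel. The force estimate rests on assumptions that are not from the text: the internal pressure is treated as acting uniformly on the cross-section of the duplex, and the duplex diameter is taken as about 2 nm. The function names are hypothetical.

```python
import math

ATM_TO_PA = 101_325.0      # pascals per standard atmosphere (exact definition)
DNA_DIAMETER_NM = 2.0      # assumption: approximate diameter of a DNA duplex


def atm_to_mpa(p_atm: float) -> float:
    """Convert a pressure from atmospheres to megapascals."""
    return p_atm * ATM_TO_PA / 1e6


def force_on_dna_pn(p_atm: float, diameter_nm: float = DNA_DIAMETER_NM) -> float:
    """Crude outward force (pN) = pressure x cross-sectional area of the duplex."""
    radius_m = diameter_nm * 1e-9 / 2.0
    area_m2 = math.pi * radius_m ** 2
    return p_atm * ATM_TO_PA * area_m2 * 1e12    # newtons -> piconewtons


if __name__ == "__main__":
    for p in (20, 60):
        print(f"{p} atm = {atm_to_mpa(p):.2f} MPa, "
              f"~{force_on_dna_pn(p):.1f} pN on the duplex cross-section")
```

Even this simplified picture yields forces of several piconewtons or more, the scale at which single-molecule instruments such as optical tweezers operate, and it illustrates why some gripping element is needed once the motor is no longer holding the DNA.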

Portal Proteins as a Sensor to Control the DNA Packaging

Bacteriophages T4, P22, T1, P1, and SPP1 all follow the headful packaging mechanism, in which DNA packaging proceeds until a threshold amount of DNA is reached inside the viral capsid. When the prohead is full, the internal pressure is high. This pressure may induce conformational changes at the clip domain of the portal protein, where TerL is docked. The TerL endonuclease is subsequently activated to cleave the DNA concatemer, generating a segment of DNA that contains one unit of the phage genome. In this process, the portal protein serves as a packaging sensor that may use the pressure in the capsid to regulate the maturation of the capsid. Mutation experiments on the phage P22 portal protein further suggest that the sensor could be located on the barrel of the portal protein.
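The difference between cutting a concatemer at a specific site and cutting it when the head is full can be sketched with a few lines of code. This is a toy model, not a description of any particular phage: positions are in arbitrary units, the function names are hypothetical, and the example capacity of 104 units for a 100-unit genome is an assumption chosen only to show how headful cuts drift relative to the genome boundaries.

```python
def site_specific_cuts(concatemer_len: int, genome_len: int) -> list:
    """Cut positions when the nuclease cuts at one fixed site per genome copy."""
    return list(range(genome_len, concatemer_len + 1, genome_len))


def headful_cuts(concatemer_len: int, head_capacity: int) -> list:
    """Cut positions when the nuclease cuts each time the head becomes full.

    Packaging starts at position 0 and each new head starts where the
    previous cut was made; a final partial headful is not packaged.
    """
    cuts = []
    position = 0
    while position + head_capacity <= concatemer_len:
        position += head_capacity
        cuts.append(position)
    return cuts


if __name__ == "__main__":
    # Hypothetical numbers: five genome copies of 100 units each,
    # and a head that holds 104 units before the sensor triggers a cut.
    print(site_specific_cuts(500, 100))
    print(headful_cuts(500, 104))
```

In the site-specific case every cut falls on a genome boundary, whereas in the headful case the cut positions depend only on how much DNA the head holds, which is why the portal's role as a fill sensor matters for where the DNA is cleaved.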

In Phi29, near the end of DNA packaging, the portal protein may act as a pressure sensor: it can change its conformation to detach the bound pentameric pRNA associated with the ATPase. The portal protein is believed to use a conserved structure, the channel loops (residues 229–246) located in the channel interior, as a clamp to retain the genomic DNA within the prohead before the tail components are attached. In an optical tweezers experiment on the Phi29 DNA packaging motor, the packaging sensor in the portal protein was proposed to sense changes in the density and conformation of the packaged DNA at the late stage of DNA packaging. As a result, the activity of the ATPase would be allosterically regulated.

The Portal Proteins Actively Assist DNA Packaging Coupled With ATP Hydrolysis

The terminase-driven mechanism plays a key role at the initial stage of DNA packaging

When DNA enters an empty prohead, or a prohead at low pressure, terminase plays the central role in DNA packaging, and a terminase-driven model has been proposed to explain the process. The terminase is a hetero-oligomeric complex composed of two subunits, TerS and TerL. TerS selectively binds the viral DNA concatemer from a pool that includes the host DNA. TerL uses its endonuclease activity to make a cut and then binds to the portal vertex of the head to initiate DNA packaging. TerL is assumed to form a pentameric ring around the portal protein. The ATPase on each of the five TerL subunits may also have a domain that is responsible for DNA gripping. A section of genomic DNA is gripped in the center of the TerL ring when ATP is bound to all five subunits. ATP hydrolysis occurs at one subunit at a time. When the ATPase on one subunit fires and hydrolyzes ATP, that subunit alters its conformation to release the DNA and initiates a motion that is transmitted to the adjacent ATP-bound subunit, which is still gripping the DNA. The conformational change of the ATP-hydrolyzing subunit results in an upward translocation of the DNA at the adjacent subunit. Sequential, interlaced ATP hydrolysis and binding on the TerL subunits may thus lead to unidirectional DNA translocation, which requires that the activities of the individual subunits be highly coordinated. The terminase-driven model can explain the 2.5 bp translocation per ATP measured by optical tweezers for the Phi29 motor at low packaging force.
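The sequential firing scheme of the terminase-driven model can be sketched as a small state machine. The 2.5 bp per ATP figure comes from the text above. The assumption that all five subunits fire in strict rotation, the function names, and the genome length used in the usage note are illustrative only and are not taken from the model's authors.

```python
import math

BP_PER_ATP = 2.5  # translocation per hydrolysis event, as quoted in the text


def simulate_ring(n_cycles: int, n_subunits: int = 5):
    """Fire the subunits of an ATPase ring one at a time, in order.

    Returns (total bp translocated, list of subunit indices in firing order).
    """
    atp_bound = [True] * n_subunits  # all subunits start ATP-bound, gripping DNA
    translocated_bp = 0.0
    firing_order = []
    for event in range(n_cycles * n_subunits):
        i = event % n_subunits
        atp_bound[i] = False           # subunit i hydrolyzes ATP and releases DNA
        translocated_bp += BP_PER_ATP  # motion is passed on via the gripping neighbor
        firing_order.append(i)
        atp_bound[i] = True            # subunit i rebinds ATP and grips again
    return translocated_bp, firing_order


def atp_needed(genome_bp: int) -> int:
    """Number of hydrolysis events needed to package genome_bp at 2.5 bp per ATP."""
    return math.ceil(genome_bp / BP_PER_ATP)
```

In this sketch, two full turns of a five-subunit ring translocate 25 bp, and a hypothetical 10,000 bp genome would require 4,000 hydrolysis events.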

The portal proteins actively assist DNA packaging at the late stage of DNA packaging

DNA packaging is an energetically unfavorable process. At the late stage of DNA packaging, a large amount of free energy is used to package DNA against the high pressure in the prohead. In addition, energy is consumed in helping the genomic DNA form a highly ordered structure in the capsid, similar to a liquid crystal state (an entropy-decreasing process). Cryo-EM images have shown that the portal proteins in all bacteriophages undergo conformational changes during DNA packaging. These conformational changes may have three possible origins: (1) they may be induced by the interior pressure in the capsid, as discussed above; (2) they may be associated with the mechanical cycles of the ATPase during ATP hydrolysis; (3) they may occur cyclically during packaging, induced by the electrochemical potentials across the capsid shell. A handful of models have been proposed for how the portal protein uses conformational changes to assist packaging. They can be roughly divided into two modes: the portal directly-mediated packaging mode and the portal indirectly-mediated packaging mode.

The portal directly-mediated packaging mode

The ATPase in the terminase uses energy from ATP dephosphorylation to generate movement through mechanical cycles. Because TerL is docked directly on the portal protein, the moving and pushing of the ATPase on the terminase may induce conformational changes in the portal protein. In the portal directly-mediated packaging mode, the resulting conformational changes are assumed to apply force directly to the genomic DNA and push it into the procapsid. Different models explain the conformational changes differently, so the description of how the portal protein assists DNA packaging varies among models. Several models in this mode are highlighted below.

Rotation of the portal protein may apply forces to DNA

Because the portal protein has 12-fold symmetry and is located at a vertex with 5-fold symmetry, this model assumes that the symmetry mismatch may facilitate rotation of the portal protein at the viral vertex. In this model, the energy from ATP hydrolysis was assumed to rotate the portal protein, which would screw the helical DNA into the capsid in the way a bolt moves through a nut. However, this model has since been discarded because experiments showed that the portal protein does not rotate during DNA packaging.

Lengthwise channel expansion and contraction in the portal protein may apply forces to DNA

The connector expansion and contraction model is based on structural observations of the Phi29 portal protein. The portal protein is assumed to have plasticity, and its channel may expand and contract lengthwise after the portal protein is rotated slightly, by 12 degrees, when the ATPase is activated. The movement in this model is described as a helical spring pumping the chromosomal DNA into the prohead.

Sequential movement of tunnel loops may apply force to DNA

This model is based on the structure of the SPP1 portal protein. The SPP1 portal has 12 long flexible loops that project into the central channel of the portal. These loops are conserved among other portal proteins and are assumed to fit tightly around the DNA during packaging. When the portal protein rotates slightly in coupling with ATP hydrolysis, the loops may move sequentially, like an undulating belt. The DNA then moves sequentially along this belt and is packaged into the prohead.

In summary, the models that follow the portal directly-mediated packaging mode are based on structural observations of portal proteins from Cryo-EM or X-ray crystallography. However, neither the way the portal protein applies force to assist DNA packaging nor the way the ATPase uses its mechanical cycles to induce conformational changes in the channel, as described in these models, has been visualized experimentally.

The portal indirectly-mediated packaging mode

In this mode, the portal protein is proposed to work as an energy transducer. The energy from cyclic and random conformational changes in the portal is converted into a different form of energy that is subsequently used to pack genomic DNA into the capsid. Two physical models that use the indirectly-mediated packaging mode are highlighted here.

DNA translocation driven by the energy from the DNA deformation in the portal protein

This model, named the scrunchworm model, is based on molecular dynamics simulations of the portal structures of P22, Phi29, and T4. The simulations assume that sufficiently large conformational changes in the portal protein can cause the DNA in the portal channel to switch between stretched and relaxed states. When the DNA is in its extended form, the portal protein needs to grip the DNA at the bottom of the channel through electrostatic interactions and release the rest of the DNA at the top. When the DNA is in its shortened form, the top of the channel may grip the DNA and the bottom grip opens. The portal protein cyclically changes its conformation and applies force to store energy in the deformation of the DNA. Like a spring, the DNA is compressed and released back and forth, which propels it into the prohead. The process simulated in silico requires that the ATPase alternately change the conformation of the portal protein in a well-controlled manner, a hypothesis that needs further experimental study.

DNA translocation driven by a Brownian motor inherent in the portal protein

In this model, the truncated-cone structure of the portal protein makes the electrostatic interactions between the DNA and the negatively charged residues in the interior of the channel asymmetrical along the direction of DNA movement. As the ATPase continuously packages DNA into the capsid, the electric potential across the semipermeable capsid shell increases, an effect known as the Donnan effect. This is the direct energy source for the Brownian motor: the electrochemical potentials can alternately change the channel conformation. Consequently, the portal protein works as a power rectifier, in which the asymmetrical and cyclic interactions between the portal channel and the DNA ratchet against Brownian diffusion of the DNA back toward the channel bottom and ensure unidirectional Brownian motion toward the prohead. The Brownian motor model was proposed on the basis of planar lipid bilayer membrane (BLM) experiments, in which the channel of the Phi29 portal protein, embedded in a lipid membrane, was shown to change its conformation randomly and cyclically under membrane potentials. Although the BLM measurements are indirect, they strongly suggest that the channel fluctuates under membrane potentials. The Brownian motor model has successfully explained many experimental results from optical tweezers, for example why the DNA packaging motor operates in a dwell-burst fashion and how it is affected at the late stage of DNA packaging. More importantly, the BLM experiments indicate that, besides the mechanical cycles of the ATPase during ATP hydrolysis, the membrane potentials on the prohead shell may also play a role in inducing cyclic and alternating conformational changes in the portal proteins.
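The rectifying role attributed to the portal in the Brownian motor model can be illustrated with a one-dimensional random walk. This is a toy sketch, not the published model: the unit step, the function name, and the rule that every backward step is blocked are assumptions, and the blocking rule merely stands in for the energy-consuming, potential-driven gating of the channel described above.

```python
import random


def brownian_walk(n_steps: int, ratchet: bool, seed: int = 0) -> int:
    """Return the net displacement (arbitrary units) after n_steps random kicks.

    With ratchet=True, backward steps are rejected, mimicking a channel that
    lets DNA diffuse toward the prohead but blocks diffusion back out.
    """
    rng = random.Random(seed)
    position = 0
    for _ in range(n_steps):
        step = rng.choice((-1, 1))  # unbiased thermal kick
        if ratchet and step < 0:
            continue                # rectifier: backward motion is blocked
        position += step
    return position


free = brownian_walk(1000, ratchet=False, seed=1)
rectified = brownian_walk(1000, ratchet=True, seed=1)
print(f"free diffusion: {free}, rectified diffusion: {rectified}")
```

With the same sequence of random kicks, the rectified walk can never end behind the free walk, which is the sense in which asymmetric gating converts undirected Brownian motion into net transport.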

Main Techniques Used to Study Portal Protein Structure and Function

Biochemical methods

Biochemical methods are traditional ensemble methods in which a mutant component, e.g., a prohead containing a mutant portal protein, is designed and prepared using molecular biological methods such as site-directed mutagenesis. The mutant proheads are then mixed with the other components for in vitro DNA packaging experiments. With the help of enzymes such as DNase I and proteinase K, the size of the DNA packaged inside the capsid can be quantified. The efficiency of conversion from filled prohead to phage can also be determined by mixing the packaging reactions with bacteria to quantify infectious virus. These methods are often used to investigate the functions of specific domains, e.g., the channel loops, in the portal proteins of tailed bacteriophages.

Techniques for structural analysis

Two methods are commonly used in structural biology: X-ray crystallography and Cryo-EM. In X-ray crystallography, the target protein must first be dissolved in an aqueous solution. When the solution reaches supersaturation under precisely controlled conditions, the protein may form single crystals. Many factors may affect the growth of the protein crystal, such as pH, temperature, ionic strength, and even gravity. Once properly grown, these crystals can be used to determine the molecular structure of the protein. Although X-ray crystallography can provide 3-D protein structures at atomic resolution, crystallization may pose a great challenge to the study of portal proteins, since many of them are known to be hydrophobic and may aggregate in aqueous solution. Another problem with X-ray crystallography is that the structure of a portal protein in its crystal form is not necessarily the same as its conformation during DNA packaging.

Cryo-EM can overcome some of the limitations of X-ray crystallography. The method rapidly freezes solutions containing virions or virion components, so that the intact structures of the macromolecular complexes are preserved for electron imaging. The 3-D structure of a virus particle in its native conformation can then be reconstructed by combining the images of many virion particles with image-processing software. Cryo-EM can provide virus structures at subnanometer resolution and is therefore a very useful imaging tool for inferring how different motor components move and interact with each other. However, Cryo-EM only provides snapshots that represent the average of large ensembles of individual virion particles, and upon freezing the DNA packaging motors in different particles are stalled at different stages of the enzymatic or mechanical cycle. This conformational heterogeneity is an obstacle to directly visualizing the conformational changes of the target components and the relevant interactions at different stages, e.g., how the terminase proteins interact with the portal protein to induce its conformational changes during ATP hydrolysis.

The optical tweezers: a technique to observe DNA packaging in real time

Optical tweezers use a focused laser beam to trap a micron-sized dielectric (plastic) microsphere in an optical microscope. In single-molecule studies of viral DNA packaging, a single DNA molecule is tethered to the microsphere. As the tethered DNA is packaged into a capsid, the position of the trapped microsphere changes, which can be detected with great precision from the light scattered by the microsphere. The light collected by a position-sensitive photodetector provides sub-nanometer resolution of displacements and sub-piconewton resolution of forces. Viral DNA packaging in phages such as Phi29, T4, and lambda has been studied in real time, and both the forces exerted by the motor on the DNA and the translocation dynamics can be measured. Because the measurement is performed at the single-molecule level, it can provide information that ensemble methods cannot, for example the dwell-burst behavior of the DNA packaging motor.
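How a bead displacement becomes a force and a packaged length can be sketched with two conversions. The sketch assumes that the trap behaves as a linear (Hookean) spring for small displacements and that the tether is fully stretched B-form DNA with a rise of 0.34 nm per base pair. Real analyses use a calibrated trap and a polymer elasticity model, and the function names and numerical values here are hypothetical.

```python
def trap_force_pn(displacement_nm: float, stiffness_pn_per_nm: float) -> float:
    """Restoring force on the bead, F = k * x, valid near the trap center."""
    return stiffness_pn_per_nm * displacement_nm


def packaged_bp(tether_shortening_nm: float, rise_nm_per_bp: float = 0.34) -> float:
    """Base pairs packaged for a given shortening of a fully stretched tether."""
    return tether_shortening_nm / rise_nm_per_bp


# Hypothetical readings: 100 nm bead displacement in a 0.1 pN/nm trap,
# and a tether that has shortened by 34 nm.
print(f"force on the DNA: {trap_force_pn(100.0, 0.1):.1f} pN")
print(f"DNA packaged: {packaged_bp(34.0):.0f} bp")
```

With these illustrative numbers the motor pulls with about 10 pN and has packaged about 100 bp.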

Molecular dynamics simulation

Molecular dynamics simulates the natural motion of molecular systems. In a molecular dynamics run, the atoms or molecules are allowed to interact for a defined period of time, giving a view of the dynamic evolution of the system, e.g., the patterns and strength of interactions and the behavior of the target components under various conditions. In the study of viral portal proteins, the method has usually been applied to the interactions between DNA and the amino acid residues on the interior surface of the channel in portal proteins, including those of Phi29, T4, and P22. These simulations have shown that, given sufficiently large cyclic conformational changes in the channel of the portal protein, the DNA can be transported unidirectionally into the capsid.
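At its core, a molecular dynamics run repeatedly integrates Newton's equations of motion over small time steps. The sketch below shows one widely used integrator, velocity Verlet, applied to a single particle on a harmonic spring. It is a generic illustration in arbitrary units, not the force field or software used in the portal protein studies, and all names and parameters are assumptions.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one particle in 1-D and return (list of positions, final velocity)."""
    trajectory = [x]
    a = force(x) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt  # advance position
        a_new = force(x) / mass             # force at the new position
        v = v + 0.5 * (a + a_new) * dt      # advance velocity with the mean acceleration
        a = a_new
        trajectory.append(x)
    return trajectory, v


def spring_force(x, k=1.0):
    """Harmonic restoring force, F = -k * x."""
    return -k * x


positions, velocity = velocity_verlet(1.0, 0.0, spring_force, mass=1.0, dt=0.01, n_steps=1000)
energy = 0.5 * velocity ** 2 + 0.5 * positions[-1] ** 2
print(f"x(t=10) = {positions[-1]:.3f} (exact {math.cos(10.0):.3f}), energy = {energy:.4f}")
```

The final position should track the exact solution cos(t) and the total energy should stay close to its initial value of 0.5, which is the kind of stability check applied before trusting longer simulations.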

Planar bilayer membrane (BLM) technology

According to the Brownian motor model, the portal protein in the bacteriophage can fluctuate under the influence of the membrane potentials and high pressure in the capsid (the Donnan effect). The resulting conformational changes may play a critical role in ratcheting the Brownian motion of the DNA. Because virion particles are too small (nanometer-sized), the patch clamp cannot be used to measure channel fluctuation under the in vivo potentials across the capsid shell. BLM technology is therefore the only available experimental method for indirectly observing the conformational changes of the portal protein under membrane potentials. In this method, a portal protein is inserted into a lipid bilayer membrane. Under the influence of the Donnan effect, the protein may fluctuate in response to the applied membrane potentials. BLM technology can potentially be used to look for the domains of the portal protein that are involved in sensing the membrane potentials and in channel fluctuation.

Concluding Remarks

The portal protein plays a core role in packaging the DNA of tailed bacteriophages. At the beginning, the portal protein acts as a nucleator that initiates the copolymerization of coat and scaffolding proteins to form an empty prohead with a unique portal vertex containing a dodecameric portal ring. The portal protein provides binding sites for the terminase protein, which packages the chromosomal DNA, and for the adapter proteins, which install the tail. To prevent the DNA from tangling in the capsid, the portal protein may help the DNA form a highly ordered structure. Because of its close contact with the DNA in the capsid, the portal protein may serve as a packaging sensor that participates in the regulation of ATPase activity and transmits a signal to the terminase to dissociate from the capsid when DNA packaging is complete. As the pressure inside the shell increases, the portal can also serve as a clamp that prevents the DNA from leaking out near the end of DNA packaging. Because it has a flexible structure, the portal protein may change its conformation cyclically and assist DNA packaging in the portal directly-mediated packaging mode or the portal indirectly-mediated packaging mode.

The conformational changes in the portal proteins are assumed to play critical roles in the control, regulation, and assistance of DNA packaging. How the conformational changes occur in vivo, and what energy source induces them, still lack direct experimental evidence. The in vivo conformational changes are therefore described differently in different models. In addition, the current experimental techniques are limited in the evidence they can provide. For example, methods for structural analysis such as Cryo-EM cannot provide details about how the terminase hydrolyzes ATP in its mechanical cycles to induce conformational changes in the portal proteins. Molecular dynamics simulations cannot answer whether conformational changes large enough to deform the DNA in a precisely controlled manner can still occur when the portal protein is embedded in a capsid shell composed of coat proteins. BLM technology is only an indirect method for measuring the conformational changes of the channel in a non-native environment. Optical tweezers have not yet been used to measure DNA packaging forces in proheads containing a mutant portal protein. Integrating the available experimental techniques is therefore necessary to explore how the portal protein of tailed bacteriophages changes its conformation under the influence of pressure, the ATPase, and the membrane potentials, and what roles the resulting conformational changes play.

Mammalian dsDNA viruses, such as human herpesviruses, are known to package their genomes through their portal proteins using a mechanism similar to that of the tailed bacteriophages. The structures of the portal proteins in human herpesviruses also share common features with the bacteriophage portal proteins. Studies of the roles of portal proteins in the assembly of tailed bacteriophages would therefore provide very useful information for preclinical studies that target the portal proteins of human herpesviruses.

Prohead, the Head Shell Pre-Cursor
Marc C Morais and Michael E Woodson, The University of Texas Medical Branch, Galveston, TX, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Capsid The protein shell that contains and protects the genetic material of a virus.
Capsomer A pentameric or hexameric sub-assembly unit of the capsid.
Icosahedron A regular polyhedron with 20 equilateral triangular faces. Each face consists of 3 equivalent smaller triangles such that an icosahedron is made from 60 identical asymmetric subunits.
Maturation The process by which a prohead transforms into its more stable mature form.
MCP The Major Capsid Protein, multiple copies of which are arranged as capsomers to make up the shell that protects the phage genome.
Quasi-equivalence The slight difference in bonding patterns and conformations assumed by different individual MCPs in the asymmetric units of icosahedra with T-numbers greater than one.
Triangulation number (T-number) The number of additional triangles each of the 60 icosahedral asymmetric subunits can be divided into. Thus, the total number of MCPs will be T × 60.

Introduction

The need for viruses to convert the limited resources of their host cells into the greatest number of progeny in the shortest time represents an optimization problem. Billions of years of evolution have produced a surprisingly small set of general solutions to this problem in the form of ‘assembly pathways’, i.e., the sequences of reactions and products that result in infectious virions. The most successful pathway, in terms of numbers of infectious particles, is that of dsDNA bacteriophages. A notable feature of this pathway, shown in Fig. 1, is that it goes through a structurally distinct intermediate en route to the assembly of mature infectious virions. This extra ‘prohead’ step allows efficient construction of environmentally stable protein capsules, or capsids, that contain and protect the viral genome. The prohead is formed when hundreds to thousands of copies of the protein building block of the capsid, the major capsid protein (MCP), self-assemble into a fully enclosed shell. After its completion, the prohead transforms into the next intermediate in the assembly pathway, the mature head. To successfully complete the self-assembly pathway, MCP has conflicting requirements: weak binding is initially needed such that MCPs can detach and re-attach as necessary to correct for errors that may occur during prohead self-assembly, yet strong binding interactions are ultimately required to construct a robust mature capsid that can survive the extracellular environment. This conflict is resolved by an MCP that has evolved to adopt distinct conformations appropriate for the two separate requirements. The prohead state coordinates the transformation between these two conformations so it only occurs after all MCPs have assembled into the correct geometry.
Understanding how these complex molecular rearrangements are regulated and coordinated at the atomic level will inform antiviral drug development, engineering of gene therapy vectors, and rational design of nanomaterials and catalysts, among other applications. Understanding of the viral lifecycle requires a consideration of the evolutionary pressures that influence it. Competition for resources between a virus, its host, and its rivals favors genetic efficiency. Hence viruses generally have small genomes that produce a limited number of multi-functional proteins that can rapidly assemble the structures needed to construct an infectious virus.

Fig. 1 Assembly Pathway of tailed dsDNA bacteriophages. A simplified assembly pathway for the tailed-phages is shown. In A, soluble capsid protein (blue wedges), scaffold protein (yellow rods) and portal complexes (pink cone) begin to associate. In B, these components have assembled into a closed shell, the prohead. In C, the scaffolding protein has decoupled from the capsid protein, which expands into the mature head. The packaging motor (orange shape) has attached to the portal and begun to feed DNA (purple coil) into the capsid. In D, the capsid is filled with DNA, the motor has detached, and the tail and accessory proteins have bound to the capsid, resulting in the mature, infection-competent virion.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21234-X

Viral capsids are therefore made of many copies of one or a few types of proteins that bind together to produce large MDa-scale macromolecular assemblies. Viruses generally, and bacteriophages especially, have limited access to sophisticated methods of component segregation, including a means of controlling the extent and timing of component movements. Typically, the protein components that assemble the virion must all coexist in solution simultaneously and perform their various tasks in the proper sequence, without interfering with one another. Double-stranded DNA bacteriophages avoid interference by assembling different structures as separate functional units called the ‘head’ and the ‘tail’, which contain/protect the viral genome and deliver that genome into the host cell, respectively. Because of these ‘tail’ structures, dsDNA bacteriophages are often referred to as tailed phages. The ‘head’ structure is thus a context-specific term for capsid and is not typically used for viruses that lack tails. In the tailed-phage assembly pathway an empty prohead, or procapsid, is assembled first. A phage-encoded molecular motor then assembles at a unique vertex of the prohead and actively pumps the phage’s genomic DNA into the head. In many phages, DNA packaging is believed to trigger a series of structural changes that transform the prohead into the mature head. After packaging is complete, the motor disassembles, and then the tail and related structures replace the motor at the unique vertex to complete assembly of the infectious virion. All tailed-phage heads are icosahedral in shape. A standard icosahedron has 20 equilateral triangular faces and twelve vertices where 5 triangular faces meet. Some capsids are prolate, meaning that they are elongated along one axis and 10 of the 20 triangular faces are elongated and no longer equilateral.
Icosahedral shapes are common among many different types of viruses because icosahedra have the greatest symmetry of any regular polyhedron; of the shapes that can be built from identical subunits, icosahedra use the largest number of subunits (60). Since a large number of subunits are used to construct an icosahedral capsid, the size of individual MCP subunits necessary to encapsidate a given volume can be smaller. Hence, an icosahedral capsid built from 60 copies of the MCP results in a large ratio of internal volume to encoding sequence length. Despite the genetic economy provided by high symmetry, an icosahedral capsid built from 60 copies of the MCP is not large enough to enclose the full genome of dsDNA bacteriophages, which must also encode proteins for the tail, DNA replication, DNA packaging, and other functions. Rather than simply coding for a larger MCP, tailed phages have evolved to build larger icosahedral capsids by using more than 60 copies of the MCP. To accomplish this while still maintaining overall icosahedral symmetry, phages have evolved to have more than one copy of the MCP in each of the 60 identical asymmetric units (ASU) in the icosahedron. The number of MCP copies in each asymmetric unit is called the triangulation number of the icosahedron. An icosahedral capsid with only one copy of the MCP per ASU has a triangulation number of T = 1 and a total of 60 MCPs per capsid, whereas a T = 7 capsid has 7 × 60 = 420 MCPs in total. While each individual copy of the MCP in the ASU has the same amino acid sequence, their structures are only quasi-equivalent, that is, similar but non-identical. It is important to note that the increased volume-to-sequence ratio gained by adopting quasi-equivalence comes at a cost. Because the quasi-equivalent structures are slightly different, they are capable of interacting differently with their neighbors.
Thus, while an MCP capable of binding together in different ways can produce larger icosahedra, it is also capable of producing an infinite variety of other shapes, very few of which will allow the virus to infect new hosts, and none of which are optimal. Further challenges arise when we consider that infectious virions must attach the tail to the capsid. As discussed above, icosahedra have 12 vertices, and in tailed-phage capsids 11 of these are occupied by pentamers of MCP while the 12th is a unique vertex occupied by a ‘portal’ complex which allows DNA to enter and exit the capsid. Viral particles that incorporate more than one portal vertex, or none at all, will have reduced or no infectivity. Despite the simplistic description we have given of the phage assembly pathway, in practice a large number of sequential steps and intermediate states are involved. While the overall pathway is consistent among tailed phage species, differences in host environment, infection strategy, genome size, and other factors result in variations in individual states and sequences in assembly. Only the most stable intermediates from these have been isolated and characterized. Further, the specific intermediate used to represent the ‘prohead’ state in one phage may not be a direct analog of the ‘prohead’ state of another. The term refers collectively to states in which MCP monomers have been assembled and arranged to form a fully enclosed shell, but have not yet transformed into the characteristic, highly stable mature conformation. All studied prohead states share a few common characteristics, and the “variations” observed may in fact be present in other phages but too short-lived to have been characterized. 
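The capsid arithmetic described above (60 asymmetric units, T copies of the MCP per unit, and 11 pentameric vertices plus one portal vertex) can be sketched as follows. The relation T = h² + hk + k² is the standard Caspar–Klug rule and is not stated in the text. The function names are illustrative, and the counts apply to isometric capsids only, not to prolate heads.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug relation T = h^2 + h*k + k^2 (an addition to the text above)."""
    return h * h + h * k + k * k


def capsid_composition(t_number: int, portal_vertex: bool = True) -> dict:
    """Count capsomers and MCP copies for an isometric icosahedral capsid."""
    vertices = 12
    # In tailed phages one of the 12 vertices holds the portal instead of a pentamer.
    pentamers = vertices - 1 if portal_vertex else vertices
    hexamers = 10 * (t_number - 1)
    return {
        "pentamers": pentamers,
        "hexamers": hexamers,
        "mcp_copies": 5 * pentamers + 6 * hexamers,
        "mcp_full_icosahedron": 60 * t_number,
    }


t7 = triangulation_number(2, 1)
print(t7, capsid_composition(t7))
```

For T = 7 this gives 60 hexamers and 11 pentamers, i.e., 415 MCP copies once the portal replaces one pentamer, compared with the 420 copies of a complete T = 7 icosahedron.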
Common features of known prohead states include (1) the presence of a “scaffolding” protein that will not be present in the mature virion, (2) a smaller, more spherical shape than the mature heads, (3) a rougher or more wrinkled surface than the mature state, and (4) the ability to bind a packaging motor at the unique portal vertex. Because of the large morphological differences among individual phages and the prohead’s nature as an intermediate state, understanding it is best served by describing features in relation to the mature virion. Hence, these and subsequent features of proheads are described relative to the mature head. The more angular icosahedral shape of the mature capsid results from the need for tight inter-MCP binding and a high volume-to-sequence ratio. Similarly, the different shape of the prohead is the result of constraints associated with a different set of requirements, i.e., self-assembly of a closed icosahedral shell and incorporation of a portal vertex that, by its very nature, is inconsistent with the otherwise icosahedral symmetry. The sets of requirements of thermodynamic stability in the mature head and kinetically accessible meta-stability in the prohead are conflicting, yet both must be satisfied by the same protein, the phage MCP. The capacity to resolve this conflict is provided by an MCP protein fold capable of adjusting its three-dimensional structure according to the requirements of both quasi-equivalence and the particular step in the phage’s assembly pathway. While significant variations of the MCP occur across all known tailed phages and herpesviruses, the canonical MCP fold, shown in Fig. 2, is always recognizable despite very little sequence conservation. The fold was first described in the E. coli phage HK97, which, coincidentally, also best represents the essential structural core.
Extra features and embellishments are present in all subsequently determined structures of phage MCPs, and in herpesviruses account for the majority of the MCP fold. To describe the general structure and function of the prohead state, it is necessary to refer to features of this conserved MCP structural core. Briefly, the HK97 protein

Prohead, the Head Shell Pre-Cursor


Fig. 2 The HK97-like fold of the MCP. This illustration of an idealized MCP shows the secondary structural elements and other features typical of all phage MCPs that have the HK97 fold as a structural core. Blue features are present in all known structures, yellow features are present in some conformations but not others, and pink features are present in some phage species but not others. Features are named according to the type of secondary structure present, if any, the domain in which they occur, and a number to differentiate similar elements. Number values are smaller for features that are present in more known structures. For example, the wedge-shaped A domain is composed of the five-stranded β sheet βB1, flanked by αA1 on the left and αA2 on the right. The position and conformation of αA2 vary more over different states and species than do those of αA1, while αA3 only exists in some states of some phages.

fold is divided into four domains: a wedge-shaped A-domain, a plank-shaped P-domain, an extended E-loop, and an N-terminal arm. All four domains change their structures as needed to meet the requirements of quasi-equivalence and maturational transformation. However, the A- and P-domains that make up the bulk of the shell of the capsid are less conformationally variable than the E-loop and N-arm which form bridges to bind neighboring MCP monomers together. Fig. 2 illustrates the HK97 fold and its conserved features and serves as a reference to the MCP features described below.

Major Capsid Protein Before Capsid Assembly The structure of the prohead is a consequence of its ability to self-assemble from MCP in solution in the cytoplasm of host cells. As such, it is useful to understand the structure of tailed phage MCP before it has polymerized into a prohead as well as the evolutionary pressures that have dictated its structural properties. The canonical HK97 fold of the MCP appears to be incapable of folding properly on its own, as it precipitates out in random aggregates. This is not surprising; like any protein capable of self-polymerizing, MCPs are in danger of misfolding. Phage MCP sequences must encode complementary stretches of residues that can bind tightly to their intended partner in another monomer to form a robust capsid. However, these stretches could also lead to misfolding if binding occurs intra- rather than inter-molecularly. HK97 itself avoids aggregation and assembles proheads only in the presence of the host GroEL/ES molecular chaperone system, as do lambda and other phages of E. coli. T4 codes for a particularly large MCP (521 AA), which is too large to utilize the host GroEL/ES system alone; T4 therefore encodes its own, more capacious co-chaperonin analogous to GroES. In P22 and related phages, assembly-competent folding is provided without the apparent need for host factors by the addition of an “insertion” I-domain that folds rapidly and nucleates folding in the rest of the MCP. Point mutations in the I-domain lead to MCP aggregation and precipitation, presumably due to misfolding of the domain and thus the protein as a whole. P22’s I-domain is similar to the Ig-like protein telokin. Other phages also include Ig-like domains in their MCPs, but a role for these in promoting proper folding has yet to be demonstrated.
However, proper folding requires more than just avoiding aggregation; initial inter-molecular interactions must be relatively weak so that MCPs in the prohead can freely dissociate and re-associate to find the lowest-energy arrangement of MCP subunits. This means that the MCP folds such that the stretches of polypeptide that bind strongly to each other in the final assembled structure are inaccessible until transformation into the mature head begins. There is likely a soluble intermediate state of the MCP after it has folded but before it has assembled into a prohead. However, that intermediate, i.e., its conformation and oligomerization state, remains unknown for any phage. In HK97, pentamers and hexamers appear to be the soluble units. This is important because MCP is arranged in the finished capsid as pentamers and hexamers, collectively termed capsomers. It seems logical for the capsid to be constructed from pre-assembled capsomers, but this appears to not always be the case. Investigations into P22 assembly indicate that its prohead forms monomer-by-monomer, and non-denaturing gel electrophoresis studies in T7 suggest that MCPs are present as tetramers prior to prohead assembly. Clearly, more definitive biochemical, biophysical, and structural characterization of these intermediate species would significantly illuminate the mechanism underlying capsid assembly.

Prohead Assembly Assembly of an icosahedral prohead with a triangulation number (T number) greater than one requires MCP to adopt different conformations and bind together in different ways according to the quasi-equivalent positions of the lattice. However, this capacity for different conformations and binding also allows MCP to link together in a large number of incorrect arrangements in addition to the correct one. Two levels of regulation are utilized to promote the assembly of the single MCP arrangement that yields functional proheads. In the first, termed ‘local rules’, interaction between MCP monomers is favored for specific pairs of different conformations and disfavored for others, as shown in Fig. 3. These rules may be satisfied either by monomers that hold a fixed conformation and sample many potential binding sites until one matches, or by monomers that bind at random sites and then switch conformation to match. The second level of regulation requires the participation of an additional protein known as the scaffold, which is encoded by all known tailed phages. As their name suggests, scaffolding proteins are necessary for proper prohead assembly but are not present in the mature phage. The scaffolding proteins act to protect against errors that would accumulate into fatal defects in an assembly process driven entirely by ‘local-rules’ MCP interactions. The combination of specific binding relationships dictated by the local rules of MCP and more global dynamic interactions of the scaffold ensures correct assembly of the prohead. Point mutations in either component can shift the equilibrium toward more misformed, nonfunctional proheads relative to the correct form.

Fig. 3 Local Rules and Prohead Assembly. Two examples of ‘local rules’ and their resulting assemblies are shown. On the top, monomer pair interactions that are more thermodynamically favorable are in a green box while pairs that do not quite fit together have less favorable interactions and are in the red box. These pairs with different interaction energies can be thought of as ‘rules’ which only allow interactions with more negative free energies of formation. The result of continuously adding monomers in compliance with the rules in the green and red boxes is a T = 3 icosahedron. A T = 7 icosahedron can be specified by a much larger set of rules, as shown in the lower half of the figure.
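One way to picture a ‘local rule’ is as a free-energy gap between a well-matched and a poorly matched monomer pair. The sketch below assumes simple equilibrium Boltzmann weighting between just two alternatives and, in the second function, assumes that mistakes are independent and never corrected. Neither assumption comes from the text, the energy values are hypothetical, and the function names are illustrative.

```python
import math

GAS_CONSTANT = 1.987e-3  # kcal/(mol*K)


def fraction_correct(ddg_kcal_per_mol: float, temperature_k: float = 310.0) -> float:
    """Equilibrium fraction of pairs in the favored arrangement.

    ddg_kcal_per_mol is how much less stable the disfavored arrangement is
    (a hypothetical number, used only for illustration).
    """
    weight = math.exp(ddg_kcal_per_mol / (GAS_CONSTANT * temperature_k))
    return weight / (1.0 + weight)


def all_contacts_correct(per_contact_fraction: float, n_contacts: int) -> float:
    """Chance that every one of n independent, uncorrected contacts is correct."""
    return per_contact_fraction ** n_contacts


per_contact = fraction_correct(2.0)             # a 2 kcal/mol preference
print(round(per_contact, 3))                    # most individual contacts are right
print(all_contacts_correct(per_contact, 420))   # but an error-free shell is very unlikely
```

Under these crude assumptions, even a strong per-contact preference leaves almost no chance of an error-free shell of several hundred subunits, which is consistent with the need for reversible binding and for the second, scaffold-mediated level of regulation described above.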

Scaffolding Protein The function of scaffolding proteins varies somewhat from phage to phage and has not been determined in full for any species. It has been investigated via in vivo studies of scaffold-deletion mutants and in vitro assembly experiments in which the scaffold was not included at all, or was included only as a truncated version. These experiments produce a wide range of phenotypes. T7 MCP produces no structures at all without scaffold, and the MCP remains in solution. The prolate phages T4 and phi29, in which wild-type (WT) capsids consist of both icosahedral and cylindrical regions, produce elongated cylindrical and icosahedral-only structures, respectively. Long cylindrical “polyheads” are also produced by lambda, although its WT capsid is not prolate and thus has no cylindrical regions. P22 deletion mutants are able to produce some complete proheads, although at a greatly reduced yield and rate compared to WT. P2 MCP precipitates as random aggregates when scaffold is not present, as do HK97 MCP mutants from which the scaffold domain has been deleted. These different phenotypes suggest that scaffolding proteins perform several different functions, and that the importance of each function varies among different phages. Some functions of scaffolding proteins that have been proposed include: (1) accelerating assembly, (2) promoting the correct overall curvature and shape of the prohead, (3) occupying the interior volume of the prohead to prevent host molecules from being trapped inside, (4) stabilizing MCP in its prohead conformation to prevent premature maturation, and (5) incorporating the portal protein at the unique vertex.

Accelerating assembly That wild-type scaffolding protein accelerates prohead assembly is demonstrated by the slower assembly of P22 MCP when scaffold is absent. The acceleration has been proposed to result from scaffold-bound MCPs being brought into close proximity via inherent oligomerization interactions of the scaffold. Specifically, scaffolding protein sequences in all phages have long stretches of sequence that promote alpha-helix formation and coiled-coil interactions. Indeed, the only two scaffolding proteins whose structures have been determined both form dimeric coiled-coils. Hence, if each monomer in a multimer can also bind MCP, then the inherent oligomerization of scaffold would bring MCPs into close proximity, increasing the local concentration and thus accelerating binding interactions. Experiments with truncated P22 scaffolding proteins show that acceleration of assembly occurs if and only if the region that promotes dimerization is present. In addition to the highly structured coiled-coil regions, most scaffolds also code for C-terminal regions of unstructured polypeptide. These partially unfolded regions would permit the polypeptide to sample a larger region of space, increasing the chance of finding its binding partners.
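The local-concentration argument above can be made concrete with a back-of-the-envelope estimate. The sketch below assumes that a scaffold dimer confines a second MCP to a sphere around the first; the 5 nm radius is a hypothetical value chosen only for illustration, not a measured one.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def effective_molarity(radius_nm: float) -> float:
    """Concentration (mol/L) of one molecule confined to a sphere of this radius."""
    radius_dm = radius_nm * 1e-8                       # 1 nm = 1e-8 dm; 1 dm^3 = 1 L
    volume_l = (4.0 / 3.0) * math.pi * radius_dm ** 3
    return 1.0 / (AVOGADRO * volume_l)


# Hypothetical tether radius of 5 nm.
print(f"{effective_molarity(5.0) * 1e3:.1f} mM")
```

Because the confining volume scales with the cube of the radius, halving the reach of the tether raises the effective concentration eightfold in this simple picture.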

Promote correct geometry In addition to the coiled-coil driven dimerization described above, the only crystal structures of scaffolding proteins indicate that two dimers associate to form tetramers. While the mechanisms by which scaffolding proteins promote correct assembly are not fully known, they may depend on this ability to form higher order oligomers. As described above, P22 MCP in the presence of mutant scaffold in which the dimerization region was deleted not only assembled proheads with much less accuracy than when WT scaffold was present, but also at a decreased assembly rate. In contrast, a scaffold mutant missing the putative tetramer-promoting region accelerated polymerization of MCP relative to WT scaffold, but produced a greater number of aberrant particles. In many phages, the most commonly seen aberrant particle is a ‘spiral’, which can be imagined as a particle wherein the curvature of the shell is not correct, and two leading edges have continued to grow past each other after failing to close. This defect illustrates the inadequacy of relying on local MCP/MCP interactions to define the geometry of the large number of subunits in the prohead; even small errors in local curvature can lead to a fatal misalignment when added over 360°. The larger volumes sampled by the scaffolding protein could bridge misaligned fronts and prevent larger-scale defects like the spiral shells (Fig. 4). By assuming this error correction function, the scaffold likely allowed the MCP more freedom to evolve, resulting in the wide range of observed capsid forms and likely contributing to the evolutionary success of tailed phages.

Fig. 4 The Role of the Scaffolding Protein in Correcting Errors During Prohead Assembly. Schematics of a commonly seen ‘spiral’ aberrant particle (left) and a closed prohead, assembled correctly with the aid of scaffolding protein (right) are pictured. As subunits are added to the growing assembly, differences in the curvature of the shell lead to two fronts failing to join and overlapping. The local interactions between MCP monomers cannot ‘detect’ the mismatch. Scaffolding protein monomers explore a large volume of space, so as the growth fronts approach each other they are brought into alignment by dimerization and tetramerization reactions. Proper joining is ensured and a high yield of closed shells results.
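How a small local curvature error grows into a closure failure can be illustrated with a two-dimensional toy model. The sketch assumes a flat ring of unit-length subunits in which every joint bends by the ideal angle plus the same small error; it is an analogy for the spiral defect, not a model of a real capsid, and the subunit count and error values are hypothetical.

```python
import math


def closure_gap(n_subunits: int, angle_error_deg: float) -> float:
    """Distance between the two ends of a ring of unit-length subunits.

    Each joint turns by the ideal angle (360/n degrees) plus a fixed error.
    A perfectly built ring returns to its start, giving a gap of zero.
    """
    turn = math.radians(360.0 / n_subunits + angle_error_deg)
    x = y = heading = 0.0
    for _ in range(n_subunits):
        x += math.cos(heading)
        y += math.sin(heading)
        heading += turn
    return math.hypot(x, y)


for error in (0.0, 0.1, 0.5):
    print(error, round(closure_gap(60, error), 2))
```

In this toy ring of 60 subunits, an error of half a degree per joint leaves the two growth fronts several subunit lengths apart, even though no single joint looks wrong to its immediate neighbors.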


Delay MCP transformation MCP monomers in partially assembled proheads must not transition to their tight-binding mature conformation before the subunits have assembled a closed shell with the thermodynamically favored arrangement. Scaffolding protein interactions with the N-arm of the MCP are involved in preventing premature expansion. In HK97-like phages, scaffold and major capsid protein are initially translated as a single polypeptide in which the approximately 100 N-terminal amino acids correspond to scaffold and the rest of the protein to MCP. In phages that have this covalently attached scaffold (δ-domain), it must be proteolytically cleaved before maturation can occur, thus providing a mechanism to control the timing of scaffold departure. In phages that employ a separate scaffolding protein, maturation does not occur until negatively charged DNA and its counter-ions enter the prohead. High salt concentrations have been found to reduce or completely prevent binding of scaffolding protein to MCP; similarly, DNA entry may disrupt critical electrostatic interactions between MCP and scaffold. Experimentally determined estimates of the stoichiometric ratios of scaffold to MCP in proheads are imprecise but are less than one in all cases where the scaffold is not a covalently bound δ-domain. This implies either that not all MCP monomers are directly affected by scaffolding interactions or that the scaffold dynamically dissociates and associates with different MCPs to affect the entire prohead. If the latter, the scaffold must rebind after unbinding from an MCP monomer significantly faster than the monomer transforms to its mature state when unbound.
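The requirement that rebinding outpace transformation can be expressed as a competition between two rates. The sketch below assumes that scaffold rebinding and MCP maturation are independent first-order processes acting on an unbound monomer; the rate constants are hypothetical, as no measured values are given in the text.

```python
def escape_probability(k_rebind: float, k_mature: float) -> float:
    """Chance that an unbound monomer matures before scaffold rebinds.

    Assumes two competing first-order processes with the given rate
    constants (same units, e.g. per second).
    """
    return k_mature / (k_mature + k_rebind)


def fraction_still_immature(p_escape: float, n_unbinding_events: int) -> float:
    """Chance that a monomer remains immature after n independent unbinding events."""
    return (1.0 - p_escape) ** n_unbinding_events


p = escape_probability(k_rebind=99.0, k_mature=1.0)   # rebinding 99x faster
print(p)
print(round(fraction_still_immature(p, 100), 3))
```

Even when rebinding is about a hundred times faster than maturation, repeated unbinding events add up in this simple model, so a dynamic scaffold would need a large kinetic advantage to hold the whole shell in the prohead state.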

Exclude host factors Crowding out host cellular components from the prohead interior is necessary to reserve the full volume of the capsid for storage of nucleic acid. A loss of volume to contaminants would prevent complete packaging of the phage genome, with an accompanying loss of infectivity. Scaffolding protein could prevent this scenario by simply occupying space inside the prohead. In experiments that reduce the length of a scaffolding protein, larger copy numbers compensate in the resulting proheads, resulting in a similar final scaffold-to-MCP mass fraction. Of course, scaffolding protein must also be able to exit the complete prohead to make room for the phage genome. Some phages have holes in their proheads’ capsomers and their scaffolding proteins exit intact and can participate in assembling new proheads, while other phages co-assemble a protease inside the prohead which digests the scaffold which then exits as smaller fragments. In some cases, the scaffold itself provides the proteolytic function.

Scaffold involvement in portal incorporation Inclusion of the portal into proheads assembled in vitro has been demonstrated to be promoted by the presence of scaffolding protein for many phages. Density near the portal in asymmetric cryo-EM maps of proheads has been attributed to scaffolding protein in HSV-1 and others. Mutations in the scaffolding protein of P22 have resulted in in vitro assembly of prohead particles that did not incorporate the portal. However, the mechanisms by which scaffolding proteins and portals interact have not been explored as thoroughly as the interactions that promote MCP assembly into the prohead. All these functions depend on the scaffold’s ability to bind phage MCP and to self-associate in a manner that promotes a particular internal volume. In phages where structural or mutagenic information is available, MCP binding typically involves a helix-turn-helix motif, approximately 100 amino acids from the C-terminus of the scaffolding protein, binding via non-covalent interactions to the N-arm of the MCP. The fact that scaffolding protein genes seem to immediately precede MCP genes in the phage genome suggests an evolutionary relationship between the N-terminal δ-domain of some phages and the separate but sequence-adjacent scaffolding proteins of others. Either the scaffolding protein and MCP genes fused in HK97 and some other phages, or the combined protein is ancestral and split into the two separate genes we now find in most phages.

The Portal In addition to the MCP and the scaffolding protein, viable proheads must include a portal protein. This protein assembles into a dodecameric funnel-shaped ‘portal’ structure, sometimes also referred to as the connector since it will ‘connect’ the head to the tail in the mature phage. The portal occupies the unique vertex of the icosahedral capsid and provides the pore through which DNA can enter and exit the phage during genome packaging and ejection, respectively. The outermost narrow end of the funnel-shaped portal interacts with the DNA-packaging ATPase motor in the prohead and with the tail in the mature virion. It has been hypothesized that the wide end of the portal funnel, which is located in the interior of the prohead, senses DNA filling and initiates an allosteric conformational change that (1) causes the motor to detach at the end of genome packaging and (2) constricts the central channel of the portal such that DNA will be blocked from exiting the prohead until the tail can bind and serve as a plug to prevent premature DNA ejection. The portal is therefore important for maturation. Proheads of T4 showed delayed expansion when the portal protein was modified. The portal also has an important role in prohead assembly, though the details are not consistent for all species of phage. In T4, phi29, HSV-1, and T7 the portal has been shown to be necessary for assembly of MCP into capsids with the correct geometry, and is proposed to serve as the nucleation site for procapsid assembly. This hypothesis is attractive since it provides a simple explanation as to why only a single portal is incorporated in the prohead structure. However, early experiments in phage P22 suggested that portal incorporation occurs as the last step of prohead assembly rather than the first.
Specifically, in vitro assembly of icosahedral P22 shells can proceed from MCP and scaffolding protein alone, while inclusion of an excess of portal protein results in aberrant spiral particles with multiple portals incorporated. However, recent studies indicate that P22 assembly proceeds similarly to other phages, with the portal nucleating assembly. The structure of the portal itself suggests it must be incorporated early in prohead assembly, since the wide end of the funnel, which sits in the prohead interior, is wider than the hole in the shell that the portal occupies. Hence, it is difficult to imagine how this wide funnel end could pass through the hole if the portal were incorporated into proheads as the last step in assembly.

Structure of the Prohead The prohead’s structure differs from that of the mature head in its overall shape and size, in the local arrangement of monomers, and in the conformation of the individual monomers themselves (Fig. 5). Compared to the mature head, the prohead is smaller in diameter, has rougher and thicker capsid walls, and more closely resembles a sphere than an angular icosahedron. The spherical shape of the prohead suggests that while the MCP monomers interact quasi-equivalently, the scaffolding proteins’ dynamic oligomerization interactions are equivalent and not sensitive to different positions in the icosahedral shell. The capsomers are cup-shaped as opposed to the flat capsomers in mature heads, which is what causes the prohead shell to have a rougher appearance. Cup-shaped capsomers result in shorter distances across capsomers and therefore smaller prohead diameters and internal volumes. The previously described functions of scaffolding proteins can be achieved more easily with a smaller internal volume. The magnitude of expansion varies significantly among phages (e.g., a 100% increase in diameter for phage lambda versus a 30% change for P22). Capsomers in proheads also tend to have lower symmetry, as typified by the skewed, two-fold symmetric hexons in T = 7 icosahedral phages, which transition into nearly six-fold symmetry during maturation. Reduced symmetry in prohead capsomers increases the ability of ‘local rules’ to regulate assembly since the energy difference between ‘correct’ and ‘incorrect’ hexamer/hexamer interactions will be greater. Many phages encode accessory proteins that bind to the mature capsid and provide further stabilization. For example, Hoc and Soc in T4, and gpD in lambda stabilize the mature head to the extent that maturation is an essentially irreversible event. These proteins are unable to bind to the MCP in the prohead state, and the detection of their binding has been used to assay prohead maturation.
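The diameter changes quoted above translate into much larger changes in volume. The sketch below assumes that volume scales with the cube of the diameter, ignoring the change in shape from sphere to icosahedron and the thinning of the capsid wall, so it is only a rough guide; the function name is illustrative.

```python
def volume_fold_change(diameter_increase_pct: float) -> float:
    """Fold change in enclosed volume for a given percentage increase in diameter.

    Assumes the shape is unchanged, so volume scales with diameter cubed.
    """
    scale = 1.0 + diameter_increase_pct / 100.0
    return scale ** 3


print(round(volume_fold_change(30.0), 2))    # a 30% increase in diameter
print(round(volume_fold_change(100.0), 2))   # a 100% increase in diameter
```

Under this simple scaling, a 30% increase in diameter roughly doubles the internal volume, which gives a sense of how much packaging capacity expansion can add.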
The differences between the prohead and mature capsids are illustrated in Fig. 5, using phage HK97 as a representative of tailed dsDNA phages generally. All of the differences in size and shape between proheads and mature capsids are the result of conformational transitions in the individual MCPs. These transformations are complex, and to simplify their description, we focus on the changes that occur at different inter-monomer junctions, categorized based on the number and parts of capsomers involved. We will examine the conformational changes that occur upon maturation (1) at capsomer vertices where three MCPs meet, (2) across capsomer edges where two MCPs interact, and (3) at capsomer centers. In addition to the structural rearrangements at monomer-monomer interfaces, we will also describe the changes in the basic fold of the MCP that accompany maturation (Fig. 6). Note that only five high-resolution prohead structures have been published at the time of this writing, so we should be cautious in drawing conclusions; features common to these structures may not be present in all subsequent structures. The interfaces where three capsomers meet at their vertices remain largely the same before and after maturation. In most phages, the P-loops of three monomers contact each other, and these contacts do not change during maturation. MCPs at the five- or six-fold symmetry axis at the center of a capsomer undergo significant refolding and rearrangements during maturation, and this seems to drive the transition from skewed 2-fold hexamers to the more symmetric 6-fold state observed in mature capsids. In the prohead state, the interactions between neighboring monomers at the centers of capsomers mostly involve helices αA1 and αA2. When skewed prohead hexons mature into the more 6-fold symmetric state, this is accompanied by an increase in the length of αA1 and, in some of the monomers, movement of αA1 results in a different set of interactions with αA2. The most substantial structural changes occur at the twofold interfaces joining the edges of two capsomers, resulting in entirely different networks of paired interactions. These maturation changes are consistent in all available prohead/mature head structure pairs, and likely drive the transition from prohead to expanded mature head. Most of these changes result from movement and refolding of the N-arm, which is uninvolved in the prohead 2-fold interface but prominently involved in the interface in the mature head state. The change in binding is most obvious at the position of the local twofold axis halfway along the interface. In the mature state neighboring N-arm strands cross each other, while in the prohead the ‘N-shoulders’ of opposite monomers meet. The term ‘N-shoulder’ refers to the loop connecting the lower strand of the E-loop to the bottom strand of βP2, which leads directly to the N-arm (Fig. 2). The thermophilic phage P23–45 is an exception in that the tips of the long spine helices meet at the 2-fold axes rather than the N-shoulders.

Fig. 5 Transformation from Prohead to Mature Head. The transformation of the HK97 prohead into a mature head is shown at several scales. In A and F, the entire capsid changes from a small bumpy spheroid into a larger, smoother icosahedron. In B and G, density simulated from atomic coordinates of the T = 7 asymmetric unit includes an entire hexon, which changes from a 2-fold symmetric skewed form into a more 6-fold symmetric hexagon. C and H show cross-sections of the hexons in which the change from a cup-shape to a more flat shape can be observed. D and I show ribbon diagrams of the atomic models of the asymmetric unit. In E and J, each monomer in the asymmetric unit has been superimposed for each state, showing the greater variability of the monomers in the prohead, especially at the tips of the A-domain and the P-domain.

Fig. 6 Inter-MCP interfaces before and after maturation. Conformational changes in the HK97-fold during the prohead/mature head transformation are categorized by their participation in different interfaces in the capsid. Threefold junctions, where the corners of three capsomers meet, remain mostly unchanged. Twofold junctions, where the edges of two capsomers meet, undergo significant unbinding, movement, refolding, and rebinding. Junctions at the centers of capsomers undergo some refolding in all monomers. In skewed hexons, αA1 of some monomers unbinds from its prior location and rebinds in a position that matches those of the other monomers.


The capsomers of P23–45 (and P74–26) are much larger than those of all known mesophilic phages (~170 versus ~120–140 Å from capsomer center to capsomer center), which is the result of large additions and extensions to the canonical HK97 fold. Features are thus shifted into a different alignment. Despite this, the same overall logic prevails at the twofold interface, with different pairs of features binding across the interface in the prohead and mature state. Non-conserved features that supplement binding across the prohead 2-fold interface are present in most phages. Prominent examples are the G-loop in HK97 and the D-loop of the I-domain in P22. In these cases, the complementary binding site is on the outer strand of βE1 and βP1. This complementary site is closely associated with a short helix on the N-arm in the mature head state. In summary, despite differences in the details, it seems that all interactions which stabilize the two-fold interfaces of the prohead are disrupted by the N-arm at some point during maturation. The residue pairs that interact in the prohead state are in proximity with one another when the monomers are in their more convex shape, and farther apart after capsomers have flattened out in the mature state. This change in curvature is especially pronounced in the long ‘spine’ helix of the P-domain. In proheads, a second helix formed by the N-arm lies alongside the spine helix and seems to stabilize its greater curvature. Hence the N-arm is important for both stabilizing the prohead state during assembly and destabilizing it during maturation.

Stability of the Prohead Proheads are metastable, that is, they are not in the lowest possible thermodynamic state, but the rate at which they transition to the more thermodynamically favored mature state is slow enough that they are ‘stable’ in the conversational sense of the word. The MCP in assembled proheads is in dynamic equilibrium with subunits in the solution phase. Indeed, purified proheads of P22 have been shown to partially disassemble if diluted below their dissociation constant of 5 mM. Proheads of several phage have also been found to expand spontaneously (i.e., before the scaffold δ-domain has been cleaved or DNA has been packaged) under certain environmental conditions. Elevated temperature, or the addition of 4 M urea or surfactant can trigger this process. In P22 and HK97, manipulation of the environment has produced ‘wiffle-ball’ capsids, which have all their hexons present and in the flattened expanded state, but which have lost their pentons leaving large holes that give the resulting particle the appearance of a wiffle ball. This may be due to the scaffold remaining bound to the N-arms such that the N-arms from pentameric MCPs are unable to stabilize the 2-fold interfaces as in fully matured capsids. As a result pentameric capsomers are bound too weakly to remain attached to the capsid and dissociate. In Phi-29, structures of partially matured capsids have been determined where there are large gaps between the edges of pentamers and adjacent hexamers. In these particles, scaffold remains attached to the N-arms of pentons after hexons have matured, resulting in weakly bound pentamers with very few contacts joining them to neighboring hexamers. Both these particles and the HK97 wiffle balls likely represent particles where hexamers transitioned to the mature state but pentamers did not.
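The idea of metastability, and the effect of temperature on it, can be illustrated with a generic transition-state estimate. The sketch below assumes that the prohead-to-mature transition can be treated as a single step governed by the Eyring equation, which is a strong simplification; the barrier heights are hypothetical and are not measured values for any phage.

```python
import math

BOLTZMANN = 1.380649e-23    # J/K
PLANCK = 6.62607015e-34     # J*s
GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def half_life_seconds(barrier_kcal_per_mol: float, temperature_k: float = 310.0) -> float:
    """Half-life of a state separated from its product by a free-energy barrier.

    Uses the Eyring equation k = (kB*T/h) * exp(-dG/(R*T)) for a single step.
    """
    prefactor = BOLTZMANN * temperature_k / PLANCK
    rate = prefactor * math.exp(-barrier_kcal_per_mol / (GAS_CONSTANT * temperature_k))
    return math.log(2.0) / rate


print(half_life_seconds(25.0) / 3600.0)   # hours, for a hypothetical high barrier
print(half_life_seconds(15.0))            # seconds, for a hypothetical lower barrier
print(half_life_seconds(25.0, 330.0) / 3600.0)   # same high barrier, warmer
```

In this picture a state can sit well above the thermodynamic minimum and still persist for hours, while a modest rise in temperature, or a perturbation such as urea that lowers the barrier, shortens its lifetime sharply.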

Assembly Parasites Scaffold-assisted assembly of a prohead followed by maturation has been an enormously successful assembly strategy for tailed phages, as evidenced by their prevalence throughout the biosphere. However, this strategy has a drawback in that it allows parasitism by virophages that encode scaffold proteins that can compete for another phage’s MCP. The Staphylococcus aureus pathogenicity island (SaPI) and E. coli satellite phage P4 both encode a scaffolding protein that induces the MCP of another ‘helper’ phage to form smaller capsids that cannot contain the genome of the helper phage and preferentially package the virophage genome. In the case of SaPI, the scaffolding protein is internal and analogous to those encoded by the helper phage, 80α, but has a higher affinity for the 80α MCP and outcompetes the 80α helper scaffold for MCP binding, leading to smaller T = 4 capsids rather than the T = 7 of WT 80α. Another SaPI-encoded protein may block MCP/helper scaffold binding by binding the helix-turn-helix region of the scaffold that binds to MCP. P4 encodes an external scaffold, Sid, which directs the P2 MCP into T = 4 capsids and can do so when the helper phage’s internal scaffold is not present. Given the high degree of gene exchange between host cells, viruses, and virophages, we might expect to eventually find HK97-like phages that use an external scaffold similar to P4’s Sid for capsid size determination instead of the ‘traditional’ internal scaffold protein.
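Why a T = 4 capsid excludes the helper genome can be estimated from geometry. The sketch below assumes that each MCP covers the same surface area in both capsids, so that surface area scales with T and enclosed volume with T to the power 3/2; it also uses the convention, drawn from the description of the portal vertex earlier in this article, that one of the twelve pentamers is replaced by the portal. The function names are illustrative.

```python
def relative_capacity(t_small: int, t_large: int) -> float:
    """Approximate internal volume of the smaller capsid relative to the larger.

    Assumes surface area is proportional to T, so volume scales as T**1.5.
    """
    return (t_small / t_large) ** 1.5


def mcp_copies_with_portal(t_number: int) -> int:
    """MCP copies in a shell where the portal replaces one pentamer (60T - 5)."""
    return 60 * t_number - 5


print(round(relative_capacity(4, 7), 2))
print(mcp_copies_with_portal(4), mcp_copies_with_portal(7))
```

On this estimate the T = 4 shell holds well under half the DNA of the T = 7 shell, which is consistent with the statement above that the small capsids cannot contain the helper genome.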

Further Reading

Dokland, T., 1999. Scaffolding proteins and their role in viral assembly. Cellular and Molecular Life Sciences 56, 580–603.
Johnson, J., 2010. Virus particle maturation: Insights into elegantly programmed nanomachines. Current Opinion in Structural Biology 20, 210–216.
Moody, M.F., 1999. Geometry of phage head construction. Journal of Molecular Biology 293 (2), 401–433.
Shin, H., 2013. Viruses as self-assembled nanocontainers for encapsulation of functional cargoes. Korean Journal of Chemical Engineering 30, 1359–1367.
Suhanovsky, M.M., 2015. Nature’s favorite building block: Deciphering folding and capsid assembly of proteins with the HK97-fold. Virology 479–480, 487–497.
Teschke, C., 2019. The amazing HK97 fold: Versatile results of modest differences. Current Opinion in Virology 36, 9–16.
Zlotnick, A., 2005. Theoretical aspects of virus capsid assembly. Journal of Molecular Recognition 18, 479–490.

Enzymology of Viral DNA Packaging Machines

Carlos E Catalano, University of Colorado Anschutz Medical Campus, Skaggs School of Pharmacy and Pharmaceutical Sciences, Aurora, CO, United States

© 2021 Elsevier Ltd. All rights reserved.

Glossary

AXP Denotes an ATP/ADP binding site.
bZIP Conserved basic leucine zipper DNA-binding motif.
IHF The Escherichia coli integration host factor protein.
NXP Denotes an ATP/ADP and GTP/GDP binding site.
TerL The large terminase subunit.
Terminase holoenzyme The heterooligomeric, catalytically competent terminase enzyme complex composed of TerL and TerS subunits.
Terminase protomer The stable, protomeric TerL·TerS2 heterotrimer of phage λ terminase.
TerS The small terminase subunit.
WA Conserved Walker A ATPase motif.
WB Conserved Walker B ATPase motif.
wHTH Winged helix-turn-helix DNA binding motif.

Introduction

Viruses are obligate intracellular parasites whose developmental pathways are initiated upon insertion of their genetic material into a host cell. Subsequently, viral genomes and structural proteins are synthesized and infectious particles are assembled utilizing a number of viral and host-encoded biomolecules. The assembly pathways follow sequential steps, generating transient intermediates that each transition to the next, typically in a non-reversible manner. The pathways are generally conserved within broad virus classes such as the large dsDNA viruses, which include the Caudovirales (tailed bacteriophages) and the Herpesviridae groups. In these cases, viral DNA is typically replicated via a rolling circle mechanism resulting in linear, head-to-tail concatemers of the viral genome (immature DNA). Expression of late viral genes produces structural proteins that self-assemble into procapsid shells into which the newly replicated DNA is actively packaged. Genome packaging represents the intersection between the DNA replication and capsid assembly pathways and involves processive excision of a single genome from the concatemer and simultaneous packaging of the “mature” DNA into a preassembled procapsid shell; both reactions are catalyzed by a virus-encoded terminase enzyme.

Genome packaging pathways are strongly conserved in the dsDNA viruses and can be summarized as outlined in Fig. 1. The terminase enzymes assemble at a specific packaging initiation site in the genome concatemer (pac or cos) to afford a stable, site-specifically bound maturation complex. This activates an endonuclease activity that cuts (matures) the duplex to generate the first end of the genome to be packaged. The post-cleavage complex binds to a portal situated at a unique vertex of a pre-assembled procapsid; the portal is a ring-like structure that provides a conduit for DNA entry into the capsid during packaging and for exit during infection. Terminase binding to the portal affords the packaging motor complex, which triggers a transition from a stable, site-specifically bound maturation complex to a dynamic motor that translocates DNA into the shell in a sequence-independent manner, fueled by ATP hydrolysis. Terminase motors are extremely processive and very powerful; they package the entire genome in a single binding event and to liquid crystalline density, which generates a remarkable internal shell pressure of >25 atmospheres. Upon packaging a genome-length of DNA, the enzyme transitions back to a static maturation complex that again cuts the duplex to terminate the packaging process (from which the enzymes derive their name). The binary terminase·DNA complex, formally equivalent to the maturation complex that initiated packaging, is displaced from the genome-filled capsid by “finishing” proteins and binds another empty procapsid to initiate a second round of processive genome packaging (Fig. 1). Addition of a pre-assembled tail to the nucleocapsid (Caudovirales) or a complex lipid envelopment process (Herpesviridae) ultimately affords an infectious virus.

The essential features described above are strongly conserved in the large dsDNA viruses, both prokaryotic and eukaryotic, and the terminase enzymes perform two essential functions in virus assembly: (1) nucleolytic excision of individual genomes from a concatemeric precursor (maturation reaction) and (2) translocation of viral DNA into the procapsid (packaging reaction). These two reactions are catalyzed by two distinct terminase complexes that sequentially alternate during processive genome packaging. Notable exceptions to this general paradigm are represented by the phage φ29 and adenovirus groups, which replicate monomeric genomes in a protein-primed manner and thus have no maturation requirement. Notwithstanding, these viruses utilize an analogous “packaging ATPase” enzyme to package viral DNA into a pre-assembled procapsid shell. Herein I focus on the enzymology of the dsDNA viruses that package genomes from a concatemeric DNA precursor, which include phages λ, HK97, T4, P74-26, P22 and SPP1, among others, and the eukaryotic herpesviruses.
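The alternation between the static maturation complex and the dynamic motor complex described above can be summarized as a two-state cycle. The sketch below is only a schematic bookkeeping device, not a kinetic model; the class name, function name and event strings are illustrative choices rather than terms from the literature.

```python
from enum import Enum


class TerminaseState(Enum):
    MATURATION = "static, site-bound maturation complex"
    MOTOR = "dynamic, ATP-fueled motor complex"


def processive_packaging(genomes: int) -> list:
    """List the ordered events for packaging `genomes` genomes from one concatemer."""
    events = [
        "assemble at initiation site (cos or pac)",
        "cut duplex: first genome end",
    ]
    state = TerminaseState.MATURATION
    for n in range(1, genomes + 1):
        # Portal binding converts the static complex into the translocating motor.
        state = TerminaseState.MOTOR
        events.append(f"genome {n}: bind portal -> {state.value}")
        events.append(f"genome {n}: translocate DNA into procapsid")
        # Once the shell is full, the enzyme reverts to a static complex and cuts again.
        state = TerminaseState.MATURATION
        events.append(f"genome {n}: terminal cut -> {state.value}")
        events.append(f"genome {n}: released from filled capsid by finishing proteins")
    return events


for line in processive_packaging(2):
    print(line)
```

The initial assembly and first cut occur once per concatemer, whereas the portal-binding, translocation, terminal-cut and release steps repeat for every genome packaged.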

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20981-3

Fig. 1 Processive genome packaging in the dsDNA viruses. Terminase enzymes are responsible for processive excision of an individual genome from a concatemeric packaging substrate (genome maturation) and for translocation of the duplex into a pre-formed procapsid shell (genome packaging). The functional enzymes are composed of large catalytic (TerL) and small DNA recognition (TerS) subunits, both of which possess conserved functional domains, as depicted at bottom, left. Two basic strategies for genome packaging, unit-length and headful, are summarized at bottom, right. Processive packaging requires that terminase cycles between a stable maturation complex and a dynamic motor complex. Details are provided in the text.

Unit Length Versus Headful Packaging Mechanisms

Within the broad class of dsDNA viruses, terminases carry out two basic strategies for genome packaging. The “unit-length” phages both initiate and terminate packaging at a “cos” sequence and thus package exactly 100% genome length (Fig. 1); the herpesviruses similarly initiate and terminate packaging at repeated “a” sequences in the concatemer. In contrast, while the “headful phages” initiate packaging at a “pac” sequence, the translocating motors bypass the downstream pac site and continue packaging until a “head full” of DNA has been inserted into the shell. This affords virions that contain DNA amounting to 104%–110% of the unit genome length and that possess “circularly permuted, terminally-redundant” ends (Fig. 1). In both cases, processive genome packaging from the concatemer requires that the terminase enzymes catalyze both genome maturation and DNA translocation reactions, alternating between distinct, but tightly coupled nucleoprotein complexes.
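The difference between the two strategies, and the origin of circularly permuted, terminally redundant ends, can be illustrated with a toy string model. In this sketch the 20-letter "genome", the function names and the 110% headful size are illustrative assumptions; real headful sizes are set by capsid volume rather than by a fixed percentage.

```python
def unit_length_packaging(genome: str, copies: int) -> list:
    """cos-to-cos cutting: each virion receives exactly one genome."""
    concatemer = genome * copies
    n = len(genome)
    return [concatemer[i:i + n] for i in range(0, len(concatemer), n)]


def headful_packaging(genome: str, copies: int, headful_percent: int = 110) -> list:
    """pac-initiated cutting: each virion receives slightly more than one genome."""
    concatemer = genome * copies
    head = len(genome) * headful_percent // 100
    virions = []
    start = 0  # the first cut is made at the pac site
    while start + head <= len(concatemer):
        virions.append(concatemer[start:start + head])
        start += head  # the next headful begins where the previous one ended
    return virions


genome = "ABCDEFGHIJKLMNOPQRST"  # toy 20-letter genome, purely illustrative
unit = unit_length_packaging(genome, 5)
headful = headful_packaging(genome, 5)

for dna in headful:
    extra = len(dna) - len(genome)
    # Terminal redundancy: the first `extra` letters are repeated at the end.
    print(dna, dna[:extra] == dna[-extra:])
```

Every unit-length virion carries an identical molecule, whereas successive headful virions start progressively further along the genome (circular permutation) and each repeats its first few letters at its far end (terminal redundancy).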

The Viral Terminase Enzymes

Terminase enzymes function as heterooligomeric complexes consisting of a large catalytic subunit that contains both maturation and packaging activities (TerL) and a small DNA recognition subunit that is required for site-specific assembly at the packaging initiation site (TerS; Fig. 1); both subunits are essential for virus development in vivo. In some cases, such as phages λ and P22, the holoenzymes can be isolated as stable complexes of both subunits, while in others, such as phages T4 and SPP1, the two subunits do not strongly interact until they assemble the maturation complex on DNA.

The TerS Subunits

The small DNA-recognition subunits (~17–23 kDa) are generally composed of conserved functional domains (see Fig. 1): (1) an N-terminal DNA binding domain containing a helix-turn-helix DNA binding motif, (2) a highly helical central self-association domain and (3) a C-terminal domain that also plays a role in DNA binding and that is required for binding to the TerL subunit in the holoenzyme. The isolated TerS subunits assemble into ring-like structures of 8 to 11 subunits with varying dimensions, and in some cases into double rings. Due to this structural diversity, there is debate as to whether TerS binding to DNA threads the duplex through the central channel of the rings or wraps the duplex around the perimeter of the rings. This is discussed further below.

The TerL Subunits

The large catalytic subunits (~50–80 kDa) are typically monomers in solution whose overall fold is conserved despite their divergent primary sequence; they are similarly composed of conserved domains (see Fig. 1). An N-terminal domain contains the packaging ATPase site that fuels motor movement; sequence, structural and biochemical data suggest that the packaging ATPase belongs to the Additional Strand, Conserved Glutamate (ASCE) family of ATPase enzymes. As such, they possess the conserved Walker A (WA) and Walker B (WB) motifs that participate in ATP binding and in positioning the γ-phosphate for hydrolysis. A conserved “catalytic glutamate” lies immediately downstream of the classical WB motif and is proposed to position and activate a water molecule for nucleophilic attack on the γ-phosphate in the hydrolysis transition state. The terminase family also possesses a trans-acting “arginine finger” residue that is conserved in ring ATPase enzymes; these residues are proposed to (1) catalyze ATP hydrolysis in an adjacent subunit and (2) communicate the nucleotide-bound state of one subunit in the motor to the next. These features serve to ensure inter-subunit communication and coordination of ATP hydrolytic events during motor movement. Finally, a “lid” domain that is likely involved in coupling ATP hydrolysis to duplex movement is also conserved in the terminase enzymes.

A short linker connects the packaging ATPase domain to a conserved C-terminal nuclease domain responsible for genome maturation. Structural interrogation suggests that the nuclease domains are related to the RNaseH family of nuclease enzymes. Non-specific nuclease activity has been demonstrated in several headful TerL subunits and their nuclease domains, including phages T4, SPP1 and P74-26, among others, and the nuclease domain of herpes simplex and cytomegaloviruses. Biochemical and kinetic characterization of nuclease activity of these proteins has generally been modest; nevertheless, similar features are apparent in both prokaryotic and eukaryotic systems, as follows. Duplex nicking requires divalent metal and structural studies suggest a two-metal nuclease mechanism. Nuclease activity is non-specific and often weak and, where examined, it is inhibited by the TerS subunit. Kinetic analysis of the nuclease activity of P74-26 TerL indicates that the enzyme possesses both single-strand nicking and double-strand duplex cleavage activities. In contrast to the headful phages discussed above, the nuclease activity of unit-length genome packaging terminases, such as those from phages λ, 21 and HK97, is site specific and strictly requires the TerS subunit. This is described in detail below.

The Terminase Holoenzymes

Genetic and biochemical data indicate that holoenzyme complexes composed of both TerL and TerS subunits are required to assemble the maturation complex at pac or cos to initiate genome packaging. Where characterized, C-terminal residues in the TerS subunits are required for binding to N-terminal residues in TerL to assemble functional holoenzymes. Unfortunately, structural studies performed to date have characterized the TerL and TerS subunits in isolation and, while they have provided significant insight into structure-function relationships of the isolated terminase subunits, there is little information on the holoenzyme complexes assembled from both subunits. Elegant single-molecule tweezer studies have revealed mechanistic details of DNA translocation by the packaging motors, again most often in the absence of the cognate TerS subunit. Moreover, there is limited information regarding the conserved genome maturation complexes, and the tightly coupled maturation and packaging catalytic activities remain largely unexplored at a mechanistic level. An exception is phage λ terminase, which is isolated as a stable heterotrimer of TerL and TerS subunits and has been extensively studied genetically, biochemically and by single-molecule techniques. This article focuses on the enzymology of phage λ terminase as a model system for interrogating the conserved nucleoprotein complexes that catalyze genome maturation and drive DNA translocation, and the cyclic interconversion between the two that is required for processive genome packaging. Biochemical and kinetic characterization of λ terminase has revealed complex allosteric interactions between the nucleotide and DNA binding sites that bear directly on these conserved transitions. While the focus will be on λ, comparisons to other viral systems are included in order to provide a broad discussion of the enzymology and biochemical features of virus DNA packaging.

The Lambda System

Bacteriophage λ is composed of a linear 48.5 kb dsDNA genome tightly packaged within an icosahedral capsid shell that contains a tail structure situated at a unique vertex of the icosahedron (the portal vertex). Injection of viral DNA into an Escherichia coli cell initiates the infection cycle, which ultimately affords linear concatemers of the λ genome and packaging-competent procapsids. Phage λ assembly has been fully reconstituted in vitro and an infectious virus may be assembled using purified proteins and commercially available λ DNA; this allows mechanistic interrogation of each step along the virus assembly pathway.

The Terminase Enzyme of Phage Lambda

λ terminase purifies as a stable, homogeneous heterotrimer composed of two TerS subunits tightly associated with a single TerL subunit [TerL·TerS2, the terminase protomer]. The two subunits possess the conserved functional domain organization described above and a number of structural motifs and catalytic sites have been characterized in each subunit. The protomer is catalytically silent and is activated only when assembled into a higher-order ring-like holoenzyme complex. Detailed biochemical interrogation of the enzyme has revealed complex allosteric interactions between the multiple substrate binding and catalytic sites identified in the enzyme, as outlined in Fig. 2.

Fig. 2 Domain organization and allosteric communication between the binding and catalytic sites in λ terminase holoenzyme. TerS and TerL are shown in cyan and blue, respectively. The TerS DNA binding domain contains a winged helix-turn-helix DNA binding motif (wHTH) and an allosteric NXP binding site (NXP denotes ATP/ADP and GTP/GDP). The TerL packaging ATPase and genome maturation domains are depicted, connected by a short linker domain. AXP denotes ATP/ADP and bZIP denotes a basic leucine-zipper motif. Characterized allosteric interactions between the nucleotide, DNA and catalytic sites are indicated. Details are provided in the text.

The λ TerS Subunit

Genetic, biochemical and structural studies identified an N-terminal DNA binding domain (DBD) in λ TerS that contains a winged helix-turn-helix (wHTH) motif and that assembles into a stable dimer. Based on structural modeling data, we proposed that the HTH motif interacts with the major groove of DNA and that the wing acts as a nucleotide-modulated switch that regulates cos-specific versus non-specific DNA binding interactions that are important for maturation and translocation activities, respectively. Consistent with this model, a low-affinity nucleotide binding site has also been identified in this domain, which binds both ATP/ADP and GTP/GDP nucleotides (NXP; KD,app ≈ 0.5–1 mM). While early studies reported ATP/GTP hydrolysis at this site, more recent studies with the purified and homogeneous protomer suggest that turnover at this site is very low. Rather, this nucleotide binding site appears to act as an allosteric regulator of terminase function; NXP binding to this site (1) modulates cos-specific versus non-specific DNA binding interactions, (2) inhibits the nuclease activity of TerL and (3) stimulates ATP hydrolysis by the packaging ATPase site in TerL (Fig. 2). These features have been incorporated into a “nucleotide switch” model that accounts for alternation between the maturation and packaging complexes during processive genome packaging (vide infra).

Following the DBD is a long (~100 residues) hydrophobic coiled-coil helix that plays an important role in TerS self-assembly interactions and that is likely involved in protomer and motor assembly. The helix connects to a C-terminal domain that similarly plays a role in higher-order self-association interactions, that is required for high-affinity DNA binding interactions and that is also responsible for binding to the TerL subunit in the λ protomer (Fig. 2). As noted above, the domain organization and structural features described for λ are generally conserved in all of the TerS subunits characterized thus far.

The TerL DNA Packaging Domain

The genome packaging activity of λ terminase is located in an N-terminal domain of TerL that contains the highly conserved ASCE ATPase catalytic site described in detail above. Kinetic studies reveal that the λ domain binds ATP with high affinity (KM = 5 μM) and that turnover at this site is stimulated by NXP binding to the TerS subunit in the holoenzyme (see Fig. 2). Further, non-specific DNA duplexes stimulate ATP hydrolysis at this site, while binding to specific cos-containing DNA sequences inhibits ATPase activity. These features are likely involved in regulating terminase ATPase activity in the maturation versus motor complexes.
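The practical consequence of a high-affinity catalytic site alongside a low-affinity regulatory site can be seen with a simple hyperbolic saturation estimate. The sketch below takes the packaging ATPase KM as 5 μM and the TerS NXP site KD,app as 0.5–1 mM, as read from the values quoted above. It assumes independent, single-site hyperbolic behavior, which deliberately ignores the allosteric coupling between the sites, and the ATP concentrations scanned are arbitrary illustrative values.

```python
def fractional_saturation(ligand_um: float, k_half_um: float) -> float:
    """Hyperbolic saturation: [L] / ([L] + K), with both values in micromolar.

    For a binding site K is KD; for the catalytic site K is KM and the result
    is v/Vmax rather than occupancy.
    """
    return ligand_um / (ligand_um + k_half_um)


KM_TERL_ATPASE_UM = 5.0        # packaging ATPase site in TerL
KD_TERS_NXP_RANGE_UM = (500.0, 1000.0)   # allosteric NXP site in TerS (0.5-1 mM)

for atp_um in (5.0, 50.0, 1000.0):  # illustrative ATP concentrations
    atpase = fractional_saturation(atp_um, KM_TERL_ATPASE_UM)
    nxp_low = fractional_saturation(atp_um, KD_TERS_NXP_RANGE_UM[1])
    nxp_high = fractional_saturation(atp_um, KD_TERS_NXP_RANGE_UM[0])
    print(f"ATP {atp_um:7.1f} uM: ATPase v/Vmax {atpase:.2f}, "
          f"TerS NXP site occupancy {nxp_low:.2f}-{nxp_high:.2f}")
```

On these assumptions the catalytic site is close to saturation at nucleotide concentrations where the regulatory TerS site is still largely empty, so the TerS site is positioned to respond to changes in nucleotide level across the sub-millimolar to millimolar range.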

The λ TerL Maturation Domain

As a unit-length packaging enzyme, λ terminase introduces site-specific nicks into the cos sequence 12 bases apart. This catalytic activity is contained within a C-terminal RNaseH-like nuclease domain of TerL that is conserved within the terminase family, as discussed above. In contrast to the TerL subunits of the headful phages, λ TerL does not possess non-specific nuclease activity. Moreover, site-specific cleavage at cos requires the TerS subunit and further assembly into a multimeric nuclease complex. A so-called “helicase” strand-separation activity is also centered in this domain, which separates the nicked, annealed strands generated by the cos-cleavage reaction; the upstream DR fragment is ejected from the complex and the DL strand, which is to be packaged, remains tightly bound to the enzyme (t1/2 ≈ 8 h). A second high-affinity ATP/ADP (AXP) binding site is situated in the nuclease domain (KD,app ≈ 1 μM) and AXP binding to this site strongly stimulates both cos-cleavage and strand-separation activities of the enzyme (see Fig. 2). Turnover at this site is not observed and AXP appears to act solely as an allosteric effector of TerL maturation activity. It is noteworthy that ATP stimulates the non-specific nuclease activity of TerL subunits in the P22, Sf6 and P74-26 phage systems, but this has not been studied in detail. Further, a second ATP binding site, distinct from the packaging ATPase site, was also proposed in the phage T4 TerL subunit based on sequence homology; however, genetic studies have not revealed a defined function for the putative site.

While the domain organization of λ terminase is strongly conserved among all of the terminase enzymes, catalytic variations exist. An interesting variation is the terminase enzyme from phage HK97, which also possesses a cos-specific nuclease activity that is dependent on TerL and TerS subunits; however, efficient duplex nicking further requires a nuclease-associated “HNH” protein that appears to be common among the long-tailed phages. The mechanism by which this HNH protein stimulates the cos-cleavage reaction remains obscure. Of note, the herpesviruses also appear to encode a three-subunit terminase, but the correspondence between the prokaryotic and eukaryotic terminase subunits is unclear.

Fig. 3 Phage λ TerS and E. coli IHF cooperatively bind and bend cos-DNA. The λ cos sequence is bi-partite, consisting of cosB terminase binding and cosN duplex nicking subsites. cosN possesses a pseudo-dyad symmetry and terminase introduces nicks into the duplex, 12 bases apart (red arrows). IHF binds to the I1 element while a TerS dimer binds to the R3 and R2 elements, both of which introduce a strong bend in the duplex. Characterized thermodynamic binding constants are indicated and a structural model for the ternary IHF·TerS·cosB complex is shown. Similar binding constants are obtained with the λ terminase protomer [TerL·TerS2], indicating that terminase assembly at cos is primarily mediated by the TerS subunit. The bent and likely wrapped duplex presumably positions a head-to-head TerL dimer symmetrically disposed at cosN as a catalytically competent duplex nicking complex. Structural details remain speculative at this time.

Escherichia coli Integration Host Factor

IHF was first identified as a host protein required for λ lysogeny. This 20 kDa α,β heterodimer plays an important role in host DNA replication, transcription and site-specific recombination. IHF is also required for lytic λ development and strongly stimulates the genome maturation activities of λ terminase in vitro. In all characterized cases, IHF binds to a specific recognition element and introduces a strong bend in the DNA (160°–180°); this provides a duplex architecture conducive to assembly of additional regulatory/catalytic proteins at that site.

Fig. 4 Model for λ maturation complex assembly at cos and duplex nicking. Four λ terminase protomers and IHF cooperatively bind to the cos sequence to assemble a dimer of dimers nuclease complex and position a head-to-head TerL dimer symmetrically bound to the cosN subsite; the nucleoprotein complex strongly bends and wraps the duplex. Assembly down-regulates the packaging ATPase site and activates nuclease activity, which introduces symmetric nicks into cosN, 12 bases apart (red arrows).

The Genome Maturation Complex

The Lambda Cohesive End Site and Assembly of the Genome Maturation Complex

The λ cos sequence spans ~260 bp and is bi-partite, consisting of a binding subsite (cosB) that is required for maturation complex assembly and a nicking subsite (cosN) into which terminase introduces symmetric nicks in the duplex (Fig. 3). Genetic and biochemical data indicate that the isolated TerS subunit specifically binds to three repeating “R-elements” in cosB and, similar to IHF, TerS binding introduces a strong bend in the duplex (160°). Cooperative binding and bending of cos-DNA by IHF and λ TerS has been demonstrated (KD,app = 22 nM, ΔΔG = 1 kcal/mol). A high-resolution NMR structure of the λ DNA-binding domain has provided insight into the molecular details of the IHF·TerS·cosB nucleoprotein complex, as depicted in Fig. 3; IHF binds to the I1 element in cosB, bending the duplex and juxtaposing the R3 and R2 TerS binding elements. The wHTH motifs in the TerS DBD are presented on opposite sides of the dimer such that each can appropriately bind to an R-element in the bent duplex. Ostensibly, the TerS·IHF·cosB bent duplex serves to appropriately position two TerL subunits symmetrically disposed at the cosN subsite to enable high-fidelity cos-cleavage activity (red arrows, Fig. 3). Similar to what is observed with the isolated TerS subunit, biophysical studies have demonstrated that the terminase protomer [TerL1·TerS2] and IHF also bind to cos-DNA with high affinity (KD,app = 20 nM) and cooperativity (ΔΔG = 1 kcal/mol). Comparison of the TerS and protomer thermodynamic binding data suggests that recognition and assembly of the terminase protomer at cos is primarily mediated by the TerS subunit. Consistent with these observations, while terminase and IHF cooperatively assemble at the cos sequence to engender a catalytically competent nuclease complex, the protomer alone does not discriminate between non-specific and cos-containing duplexes and is devoid of nuclease activity.

The physical nature of the duplex in the maturation complex is of interest. As discussed above, isolated TerS subunits and subdomains assemble into ring-like complexes with central channels of varying dimensions, which has led to a controversy as to whether DNA passes through the central channels or whether the complexes wrap the duplex around the TerS ring exterior; however, the role of TerS is to site-specifically assemble TerL into a catalytically competent holoenzyme complex at cos/pac. Thus, interpretation of the structural data is complicated by the fact that (1) the TerS structures were obtained in the absence of the cognate TerL subunit and (2) they were all examined in the absence of DNA, both of which are required for the assembly of a biologically relevant complex. It is further unclear whether the structurally characterized complexes are representative of TerS subunits assembled into a maturation complex or into the packaging motor complex. Thus, the duplex structure in the maturation complex remains speculative in all systems.
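A cooperative free energy of about 1 kcal/mol, as quoted above for IHF and terminase binding at cos, can be translated into a fold-enhancement of binding through the Boltzmann factor exp(ΔΔG/RT). The sketch below is a back-of-the-envelope conversion only; the temperature of 25 °C and the use of the magnitude of ΔΔG (without assigning a sign convention) are assumptions, since neither is specified in the text.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def cooperativity_factor(ddg_kcal_per_mol: float, temp_k: float = 298.15) -> float:
    """Fold-enhancement of binding corresponding to a cooperative free energy.

    Uses the magnitude of ddG; the temperature default (25 C) is an assumption.
    """
    return math.exp(abs(ddg_kcal_per_mol) / (R_KCAL_PER_MOL_K * temp_k))


omega = cooperativity_factor(1.0)
print(f"cooperativity factor at 25 C: {omega:.1f}-fold")
```

On these assumptions, 1 kcal/mol corresponds to roughly a five-fold mutual enhancement of binding, which is a modest but measurable cooperative effect.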

Model for the Assembly of the Genome Maturation Complex

The ensemble of structural, kinetic and biophysical data can be incorporated into a model for maturation complex assembly at cos and the activation of maturation activities in phage λ terminase, as depicted in Fig. 4. The terminase protomer, which is the relevant species during a productive infection in vivo, is essentially devoid of catalytic activity; however, the protomer and IHF at physiological concentrations cooperatively assemble at a cos sequence of a genome concatemer to afford a catalytically competent maturation complex. This provides a mechanism to avoid the potentially damaging effects of non-specific nuclease activity; the nuclease is activated only upon assembly at the packaging initiation site. Biophysical interrogation of the λ maturation complex suggests that four protomers and one IHF dimer assemble at the cos sequence to engender the catalytically competent ring-like nuclease complex [TerL4·TerS8·IHF] in which the duplex is highly bent and/or wrapped. Importantly, kinetic studies demonstrate that the packaging ATPase site in TerL is downregulated in the assembled maturation complex (see Fig. 2); this allosteric interaction serves to ensure a stable, site-specifically bound nucleoprotein complex and to prevent premature translocation from the cos sequence.

Given the conservation of structure and function in the terminase enzymes, this model is likely recapitulated, albeit with some variation, in the terminase enzyme family. Consistent with this hypothesis, TerS-mediated duplex bending at the pac site has been demonstrated in the SPP1 system, among others. In contrast, the TerS subunits inhibit the nuclease activity and stimulate the ATPase activity of TerL in the T4, SPP1 and P22 headful phage systems in vitro. These observations appear contrary to the role of TerS in assembling a stable, site-specifically bound nuclease complex to initiate genome packaging, and it is clear that further biochemical characterization of the headful terminase enzymes is necessary to clarify this discrepancy.

Fig. 5 Model for symmetry resolution and motor assembly. The head-to-head dimer of dimers nuclease complex must re-orient the protomers to engender a parallel ring-like motor complex for binding to the portal vertex (symmetry resolution). TerL and TerS subunits are depicted in blue and cyan, respectively. Details are provided in the text.

Genome Maturation: The cos-Cleavage Reaction

Once assembled at cos, λ terminase cuts the duplex within the cosN subsite. This site displays two-fold rotational symmetry and the nicks are symmetrically introduced into the upper and lower strands 12 bases apart (Fig. 4). Based on this symmetry and the presence of a basic leucine zipper (bZIP) DNA-binding/dimerization motif in the C-terminus of TerL (Fig. 2), it is not surprising that early models predicted that two TerL subunits are directly bound to the cosN half-sites, “head-to-head” in the maturation complex. This model bears analogy to the classical type II restriction endonuclease enzymes and similar head-to-head nuclease complexes have been proposed in the T4 and SPP1 systems; however, biophysical data suggest that the catalytically competent λ complex is composed of four terminase protomers. How can this be rationalized?

The cos-cleavage endonuclease reaction and the presumed stoichiometry of terminase protomers in the maturation complex are reminiscent of the Type IIE and Type IIF restriction endonucleases. In these enzymes, a dimeric head-to-head complex can assemble at a specific duplex nicking site, but this intermediate possesses weak nuclease activity. The catalytic activity of the dimer is strongly activated by direct association with a second DNA-bound dimer to yield a “dimer of dimers” nuclease complex through DNA looping. In the case of the IIE enzymes, the second DNA-bound dimer acts as an allosteric effector for cleavage at only one of the cognate sites. In analogy, we propose that two terminase protomers assemble as a head-to-head dimer of TerL subunits bound symmetrically at cosN. Bending of the duplex by IHF and TerS positions DNA such that it can interact with a second dimer to afford an activated dimer-of-dimers nuclease complex, as depicted in Fig. 4. In contrast, the type IIF enzymes similarly assemble as a dimer of dimers, each bound to a cognate recognition sequence, and both sites are cut by the activated enzyme. Within this context, “synapse” initiation models have been proposed in the phage T4 and T7 systems, where it has been suggested that the maturation complex binds to two sequential pac sequences in the DNA concatemer, separated by one genome length. Thus, it is feasible that in the context of a multi-genome concatemer, a λ terminase dimer of dimers interacts with two sequential cos sequences (separated by 48.5 kb) to engender an activated synapse maturation complex as proposed in other systems. Indeed, this model has been previously considered in phage λ.

Genome Maturation Summary

A central tenet is that all dsDNA viruses that package DNA from multi-genome concatemers, from phages to herpesviruses, initiate the process by assembling a maturation complex at a specific initiation site; however, these complexes remain poorly studied and ill-characterized in most systems. All of the characterized viruses encode a TerS subunit that is essential to virus development in vivo. Presumably, this reflects the role of TerS in recognition of a specific packaging sequence present only in the viral genome and the assembly of TerL into a catalytically competent nuclease complex, the first and essential step in genome packaging. In this regard, it is noteworthy that phage T4 destroys host DNA and synthesizes viral concatemers that contain 5-hydroxymethyl-cytosine in place of cytosine. In this case, a pac-like sequence is not formally required to specifically recognize viral DNA since host DNA does not exist. Nevertheless, T4 TerS is required for packaging concatemeric DNA both in vitro and in vivo, despite the fact that a defined pac sequence is apparently not required to initiate the process. This likely reflects a TerS requirement for maturation complex assembly and duplex cleavage in a concatemeric substrate and/or subsequent motor assembly. Whatever the case, given the conservation of structure and function of the terminase enzymes, the λ terminase maturation complex, which has been extensively characterized, provides a wealth of information that can be generalized to all of the dsDNA viruses.

The Genome Packaging Complex

Assembly of the Packaging Motor

The next step in the packaging pathway is binding of the maturation complex to the portal vertex of an empty procapsid to afford the packaging motor complex (see Fig. 1). While there is some controversy regarding the motor orientation in phage T4, genetic, biochemical and structural studies have localized procapsid binding residues to the C-terminus of the λ, P22 and T3 TerL subunits, and also of the φ29 packaging ATPase. Thus, it has long been presumed that the packaging motors are composed of TerL subunits assembled in a ring-like complex and oriented in a parallel manner, such that the C-terminal portal interaction residues bind directly to the portal ring. In contrast, it is generally presumed that the maturation complex orients the TerL subunits in a head-to-head arrangement for symmetric nicking of cos, as depicted in Fig. 4. This raises the question of how a head-to-head maturation complex resolves this symmetry issue in the transition to a parallel packaging motor complex. Based on kinetic and biochemical data, we propose the following plausible mechanism for “symmetry resolution” in phage λ (Fig. 5). Subsequent to duplex nicking, the strand-separation activity of the maturation complex ejects the upstream DR strand; this results in the loss of DNA binding energy by the upstream protomers. Next, the upstream protomers flip into a parallel orientation, with all of the TerL subunits bound to the newly formed DL end. This conformational reorganization is driven by cooperative binding of the TerL subunits to the downstream DNA, plus TerS self-association interactions mediated by the hydrophobic helix, as discussed above (see Fig. 2). This resolved conformation (1) projects the C-terminal procapsid binding residues of TerL in a parallel orientation for direct and cooperative binding to the dodecameric portal complex and (2) affords an oligomeric TerS ring-like structure that is conserved among the published structures of TerS subunits (vide supra).


Fig. 6 Model for cos-clearance – the transition to a translocating motor complex. Several changes must occur for cos-clearance, as summarized in the Figure. The allosteric NXP binding site in TerS may serve as a “nucleotide switch” that modulates DNA binding, nuclease and packaging ATPase catalytic activities to promote the transition. Details are described in the text.

cos-Clearance: Transition to a Translocating Motor

Once the motor complex has been correctly assembled at the portal, a series of events must occur to transition from a stable complex specifically bound at the genome end to a dynamic translocating motor complex. We refer to this transition as “cos-clearance”, in analogy to RNA polymerase clearance from the promoter and the transition to processive RNA synthesis. Again, the λ system provides insight into this essential process, as depicted in Fig. 6: (1) IHF, specifically bound at the I1 element of cosB, must release from the duplex; (2) TerS, site-specifically bound to the R3 and R2 elements of cosB, must switch to a non-specific DNA binding mode, which together with IHF release “un-bends” the DNA to provide a linear duplex for motor translocation from cos; (3) the TerL subunits specifically bound at cosN must also switch to a non-specific DNA binding mode to release the duplex end; and (4) the packaging ATPase activity of TerL, which is downregulated in the maturation complex, must be activated to fuel translocation. In addition, the nuclease activity of the motor must be downregulated to prevent damage to the duplex during translocation.

A Nucleotide Switch for cos-Clearance?

The molecular events leading to cos-clearance remain obscure, but it is clear that the transition requires coordinated allosteric interactions between the DNA and nucleotide binding sites in both the TerL and TerS subunits. As discussed above, several relevant kinetic interactions have been identified and characterized in λ terminase, as summarized in Fig. 2. Most relevant to cos-clearance is the observation that nucleotide binding to the low-affinity NXP binding site in TerS (1) modulates cos-specific vs. non-specific DNA binding interactions, (2) inhibits the nuclease activity in TerL and (3) stimulates ATP hydrolysis by the packaging ATPase site in TerL. These are precisely the features required for the cos-clearance transition, and we posit that the NXP binding site in TerS serves as a “nucleotide switch” that regulates site-specific DNA binding and nuclease activation versus non-specific DNA binding and motor translocation functions. Further, non-specific DNA also activates the packaging ATPase site in TerL, ostensibly by cooperative mechanochemical coupling events in the translocating motor. The mechanistic details linking this complex enzymology and the structural alterations mediating the transition to packaging remain a fertile area of investigation.

The Translocating Motor

In contrast to the maturation complexes discussed above, translocating packaging motors from several systems, including phages φ29, λ, T4 and P74-26, among others, have been examined in great detail. Several excellent reviews have described the structural and functional features of the motors, and only salient points will be discussed here. The activated packaging motors translocate along the duplex, fueled by ATP hydrolysis, inserting the DNA through the portal and into the procapsid shell. Ensemble biochemical and single molecule studies have shown that the translocating motors are highly processive and very fast. ATP hydrolysis is tightly coupled to motor translocation (mechanochemical coupling), and it is commonly accepted that 2 bp of DNA are packaged per ATP hydrolyzed by the terminase enzymes. The motor subunits are tightly coupled, and incorporation of a single defective subunit into the motor poisons translocation activity. The motors can generate an astounding ~60 pN of force and package DNA to near-crystalline density within the capsid shell.
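The coupling ratio quoted above lends itself to a quick back-of-envelope estimate. The sketch below, which is illustrative only, converts a genome length into an approximate number of ATP hydrolysis events using the 2 bp per ATP figure from the text. The function name and the 48,500 bp value used for a λ-sized genome are assumptions introduced for the example.

```python
def atp_per_genome(genome_bp: int, bp_per_atp: float = 2.0) -> float:
    """Approximate number of ATP molecules hydrolyzed to package one genome.

    Assumes perfectly tight mechanochemical coupling at a fixed ratio
    (2 bp per ATP by default, as quoted in the text) and ignores slips
    or pauses of the motor.
    """
    if genome_bp <= 0 or bp_per_atp <= 0:
        raise ValueError("genome_bp and bp_per_atp must be positive")
    return genome_bp / bp_per_atp


# Assumed, approximate lambda-sized genome length (illustrative value).
LAMBDA_GENOME_BP = 48_500

if __name__ == "__main__":
    print(atp_per_genome(LAMBDA_GENOME_BP))  # 24250.0
```

Under these assumptions a single packaging event consumes on the order of twenty-four thousand ATP molecules.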

Structure of the Translocating Motors

Despite the strict requirement for TerS in vivo, structural studies in the T4 and T7 systems suggest that functional headful packaging motors can be assembled as pentamers of TerL subunits at the portal vertex, in the absence of TerS. Pentameric TerL motors have also been proposed in the P74-26 and D6E systems. Similarly, the phage φ29 “ATPase” can package DNA without a TerS-like subunit, though the stoichiometry of the φ29 motor complex remains under debate, with pentameric and hexameric motors proposed on the basis of structural and biochemical data, respectively. In contrast, we have proposed that the functional λ motor is composed of four protomers, each of which includes the TerS subunit ([TerL·TerS2]4). This discrepancy may reflect inherent differences between headful and unit-length genome packaging motors. Alternatively, it is feasible that a fifth protomer is recruited to the tetrameric λ maturation complex at some point during the symmetry resolution, procapsid binding or cos-clearance steps to afford a pentameric motor (see Figs. 5 and 6). Then again, it is feasible that the λ maturation complex is, in fact, a pentamer of protomers, contrary to published biophysical data. We disfavor this model because, while there is precedent for tetrameric endonuclease enzymes, there is to our knowledge no precedent for a nuclease complex composed of five catalytic subunits. Notwithstanding, we cannot rigorously exclude these possibilities, and the physical nature of the λ maturation and motor complexes must await further structural interrogation.

Coordinated ATP Hydrolysis by the Terminase Motors

Single molecule studies of the phage φ29 and λ motors suggest that motor translocation is associated with non-cooperative ATP binding (Hill coefficient, n ≈ 1). This observation is a central feature of a mechanistic model describing mechanochemical coupling by the motors, based on high-resolution data in the φ29 system, as follows. A pentameric motor is assumed, and ATP binds to each of the subunits in a dwell phase. This is followed by a sequential but tightly coordinated hydrolysis of four ATPs, which is proposed to “mask” cooperativity in ATP binding and/or hydrolysis when measuring motor translocation. Each hydrolytic event is coupled to the translocation of 2.5 bp into the capsid, for a total of 10 bp per motor cycle; the fifth motor subunit is postulated to “maintain a grip” on the duplex during each catalytic cycle. In contrast to the single molecule studies, kinetic studies of solution-based steady-state ATP hydrolysis by the phage P74-26 TerL subunit (under non-packaging conditions) reveal modest cooperativity (Hill coefficient, n = 1.7). Similarly, solution-based ensemble studies have characterized ATP hydrolysis by the translocating λ motor, composed of both TerL and TerS subunits, and reveal strong cooperativity in ATP hydrolysis (n = 3.8). How can these observations be reconciled with the apparently non-cooperative ATP binding events observed in the single molecule studies? A critical distinction between the single molecule and solution-based ensemble biochemical studies is that the former do not actually measure ATP hydrolysis. Rather, the single molecule studies examine motor translocation as a function of ATP concentration. Thus, the discrepancy may reflect what is actually observed by the two approaches. Alternatively, it is feasible that the φ29 motor, which packages genome monomers, is mechanistically distinct from those that package DNA from a genome concatemer. Resolution of this question must await high-resolution single molecule studies of additional terminase enzymes, such as λ, T4 and P74-26, among others, to verify the results observed in the φ29 system.
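To make the Hill coefficients quoted above more tangible, the following minimal sketch evaluates the standard Hill equation, v = Vmax·[ATP]^n / (K^n + [ATP]^n), for the three values of n mentioned in the text (about 1, 1.7 and 3.8). The function name, the normalized units and the choice of test concentrations are assumptions made for illustration; the sketch does not reproduce any fitted parameters from the studies discussed.

```python
def hill_rate(atp: float, vmax: float, k_half: float, n: float) -> float:
    """Hill equation: rate as a function of ATP concentration.

    atp and k_half share the same (arbitrary) concentration units;
    n is the Hill coefficient (n = 1 is non-cooperative).
    """
    if atp < 0 or k_half <= 0 or n <= 0:
        raise ValueError("atp must be >= 0; k_half and n must be > 0")
    return vmax * atp**n / (k_half**n + atp**n)


if __name__ == "__main__":
    # Normalized units: Vmax = 1, half-saturation constant = 1.
    for n in (1.0, 1.7, 3.8):
        below = hill_rate(0.5, 1.0, 1.0, n)  # half the half-saturation level
        above = hill_rate(2.0, 1.0, 1.0, n)  # twice the half-saturation level
        print(f"n = {n}: rate at 0.5K = {below:.3f}, rate at 2K = {above:.3f}")
```

Whatever the value of n, the rate is half-maximal when the ATP concentration equals the half-saturation constant; what changes is how sharply the rate switches on around that point. A larger n gives a lower rate below the half-saturation constant and a higher rate above it, which is the signature of cooperativity seen in the ensemble measurements.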

Mechanochemical Coupling and DNA Translocation

While extensively studied, the molecular mechanism coupling ATP hydrolysis to duplex translocation remains speculative, and several models have been proposed. These include an “inchworm” translocation model, similar to those describing translocation by monomeric and dimeric helicase enzymes. In this model, the TerL nuclease and ATPase domains both bind DNA, and ATP hydrolysis drives conformational switching between a “tense” and a “relaxed” state; this mediates sequential steps of DNA binding and release by the two domains, which drives translocation. A second, “lever” hypothesis proposes that duplex translocation is mediated solely by the ATPase domain, which grips and translocates DNA in the central channel of a ring-like motor complex. In this model, ATP hydrolysis alters the conformation of the lid subdomain, and this change is propagated with a “lever-like” motion to the adjacent subunit in the motor, driving duplex movement. Within this context, recent studies in the λ system suggest that TerL undergoes a “tight binding transition” upon ATP binding that is required for both tight ATP binding and tight gripping of the DNA. Further, molecular dynamics studies suggest a general terminase mechanism wherein the conserved ASCE catalytic glutamate and a conserved Walker A (WA) arginine act in concert to facilitate this “open” to “closed” transition, which juxtaposes the catalytic glutamate and a water molecule proximate to the γ-phosphate of ATP for catalysis. After hydrolysis and Pi release, the arginine “toggles” to interact with a different glutamate in the lid subdomain to communicate the nucleotide-bound state, which has been implicated in mechanochemical coupling. Analogous models have been proposed for the φ29, T4 and P74-26 packaging motors based on single-molecule and biochemical data.

The above models presume that the DNA is an “inert” substrate during translocation; however, other models suggest that the DNA itself plays a direct energetic role in duplex translocation. A “crunching” model proposes that the motor compresses B-form DNA into a shorter A-conformation, mediated by ATP hydrolysis. Subsequent decompression back to a longer, B-form duplex propels the DNA into the shell. A related “scrunching” model proposes that the motor proteins repeatedly dehydrate and rehydrate sections of DNA, which undergo a cyclic shortening and lengthening transition. In both of these models, the motor, which includes both terminase and portal proteins, captures the shortening-lengthening transitions to bias movement of the duplex into the shell interior. Finally, a novel model invokes an “osmotic pump” mechanism wherein an expansion/contraction cycle of the capsid, mediated by ATP hydrolysis, “sucks” the duplex into the shell. Given that the mechanism of coupled ATP hydrolysis and DNA translocation in terminases remains vague, few of the above models are mutually exclusive.


Termination of Packaging

Upon insertion of a full-length genome into the capsid, the translocating motor stops, cuts the duplex and ejects the stable terminase-concatemer complex, shortened by one genome length, from the genome-filled nucleocapsid. Minimally, this requires downregulation of the packaging ATPase site (to avoid motor movement) and reactivation of the nuclease activity of TerL. Further, the parallel orientation of TerL subunits in the translocating motor must switch back to a head-to-head conformation (symmetry resolution), which is the presumed conformation for a catalytically competent nuclease complex (vide supra). As with all ill-characterized processes, several models have been proposed for this conserved process. The first, based on structural data in phage T4, is that the nuclease catalytic site of TerL is separate and distinct from the DNA binding site involved in duplex translocation. Upon packaging the genome, the duplex is transferred from the translocation binding site to the nuclease active site. Another model, based on structural data in the P74-26 system, proposes that the TerL nuclease domain is bound to the portal ring during packaging and is sequestered from the DNA. Upon insertion of the full genome length, the nuclease domains release the portal and bind to the duplex to catalyze cleavage. Both of these models suggest that nuclease activity is “regulated” by sequestration of the catalytic site from the DNA during translocation. This raises the question of what regulates the conformational changes that allow duplex access to the nuclease site. Single molecule studies have demonstrated a dramatic slowing of the packaging motors at high packaging densities. The simplest model is that motor slowing provides sufficient time for a slow conformational change that allows duplex transfer to the nuclease site. This kinetic model is analogous to DNA strand switching from the polymerase site to the exonuclease editing site in DNA polymerase enzymes. The model predicts that duplex cutting would occur whenever the translocating motor pauses; however, a number of studies do not show evidence for cutting by a stalled motor. Thus, the termination signals appear more complicated, and alternative models propose that shell filling density is directly communicated from the nucleocapsid to the translocating motor. These models suggest that the portal serves as a “packaging sensor” that allosterically activates TerL nuclease activity to allow for headful cleavage.

Unit Length Packaging Motors

The situation is more complicated with viruses that package unit-length genomes, such as λ, HK97 and φ80, among others. In these cases, upon arrival at cos and with attenuated motor velocity, the motor subunits must specifically engage the cos sequence prior to nuclease activation (cos-capture). Again, phage λ provides a useful model to understand cos-capture and the terminal cos-cleavage reaction, as depicted in Fig. 7. Upon arrival at the terminal cos, ATPase activity must be downregulated and nuclease activity must be activated in the switch from a dynamic motor to a stable nicking complex. Nuclease activation is likely concomitant with resolution of the motor symmetry to again bind to a symmetric cosN nicking site. Specifically, the “parallel” motor complex must switch back to a “head-to-head” complex symmetrically disposed at cosN for duplex nicking. We presume that the regulatory ATP binding site in λ TerS (Fig. 2) is involved in this transition, but data are limited. Feiss and co-workers have proposed that specific elements within the cos sequence also play important roles in the terminal symmetry resolution and cos-capture steps, as follows. Genetic and biochemical studies have identified a cosQ element upstream of cosN, and an I2 IHF binding element between cosN and cosB, that are essential for terminal cos-capture (Fig. 7). When mutations are introduced into cosQ, the translocating motor nicks the top strand of cosN but not the bottom strand, and the motor continues to package DNA until the capsid is filled to capacity. These mutations are lethal because uncut DNA protrudes from the portal and tails cannot be attached. The observation that only the top strand is nicked in the presence of cosQ mutations suggests that this element is required for the symmetry resolution steps in the transition from a parallel motor complex to a head-to-head nuclease complex capable of symmetric nicking of the duplex. Genetic characterization of cosQ mutations reveals three modes of suppression: (1) compensating mutations in cosQ, (2) an increase in genome length (4–5 kb), which requires IHF, and (3) missense mutations within the portal gene. It has been proposed that genome length and portal mutations serve to slow the translocating motor and increase the efficiency of cosQ-cosN-I2 capture. In this model, cosQ serves as a molecular “speed bump”. This is consistent with the observation that unit-length terminase enzymes retain a “headful” component to packaging termination; the efficiency of cleavage at the terminal cos sequence in λ decreases from 100% to 75% as the genome is shortened to ~80% of full length. Further, insertion of additional cos sequences upstream of the natural termination site does not induce premature nuclease activity, and the aggregate data indicate that cos-capture requires specific signals in the DNA plus sufficient capsid filling density. Genetic studies further suggest that the portal may serve as a packaging sensor, transmitting information on packaging density to the TerL subunits in the translocating motor, as has been proposed for the headful packaging motors. Thus, the termination signals include the cosQ-cosN-I2 elements, which are likely recognized by the translocating portal-terminase complex and IHF, together with attenuated motor velocity.

Fig. 7 cos-capture and the terminal cleavage reaction. The translocating λ motor slows at high packaging densities to allow terminase binding to the cosQ-cosN elements, presumably enhanced by IHF binding to the I2 element. This requires symmetry resolution of the TerL subunits from a parallel orientation in the motor to a head-to-head orientation in the activated nuclease complex. Finishing proteins likely displace the terminase-DNA complex from the nucleocapsid, which affords an infectious virus upon addition of a pre-assembled tail. Note that the matured genome end (DR) extends from the capsid into the tail tube of the assembled viral particle. The ejected terminase-DNA concatemer complex, formally equivalent to the maturation complex that initiated genome packaging, again undergoes a symmetry resolution transition and binds another procapsid to initiate a second round of DNA packaging.

Terminase Ejection and Virion Completion

Once terminase has nicked the strands within the terminal cosN element, the strand separation activity of terminase is poised to release the DNA-filled nucleocapsid and regenerate the maturation complex, as depicted in Fig. 7. This raises the question of how the highly pressurized packaged DNA is retained within the shell subsequent to terminase release. One model proposes that the portal ring binds tightly to the duplex as a “one-way valve” to prevent release. Additionally, it is feasible that “finishing proteins” that add to the portal of a DNA-filled shell play a direct role in ejection of the terminase-concatemer complex, simultaneously stabilizing the DNA-filled shell. Whatever the case, the ejected terminase-concatemer complex, formally equivalent to the complex that initiated genome packaging, again binds to an empty procapsid shell to initiate the next round of processive packaging. In this manner, terminase alternates between a stable maturation complex and a dynamic motor complex.
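The alternation between a stable nuclease complex and a dynamic motor described in this chapter can be summarized as a simple cyclic state machine. The sketch below is purely a bookkeeping aid under stated assumptions: the stage names, their granularity and the `Stage`, `NEXT_STAGE`, `advance` and `stages_in_one_round` identifiers are hypothetical labels introduced here, and the real transitions are regulated by the allosteric and kinetic signals discussed in the text rather than by a fixed lookup table.

```python
from enum import Enum, auto


class Stage(Enum):
    """Hypothetical labels for the stages of one round of packaging."""
    MATURATION_COMPLEX = auto()        # stable nuclease complex nicks cos
    SYMMETRY_RESOLUTION = auto()       # head-to-head -> parallel TerL
    PROCAPSID_BINDING = auto()         # motor assembles at the portal
    COS_CLEARANCE = auto()             # specific -> non-specific DNA binding
    TRANSLOCATION = auto()             # ATP-fueled DNA packaging
    COS_CAPTURE_AND_CLEAVAGE = auto()  # parallel -> head-to-head, terminal nicking
    TERMINASE_EJECTION = auto()        # terminase-concatemer complex released


# Each stage leads to exactly one successor; ejection regenerates a complex
# that is formally equivalent to the initial maturation complex.
NEXT_STAGE = {
    Stage.MATURATION_COMPLEX: Stage.SYMMETRY_RESOLUTION,
    Stage.SYMMETRY_RESOLUTION: Stage.PROCAPSID_BINDING,
    Stage.PROCAPSID_BINDING: Stage.COS_CLEARANCE,
    Stage.COS_CLEARANCE: Stage.TRANSLOCATION,
    Stage.TRANSLOCATION: Stage.COS_CAPTURE_AND_CLEAVAGE,
    Stage.COS_CAPTURE_AND_CLEAVAGE: Stage.TERMINASE_EJECTION,
    Stage.TERMINASE_EJECTION: Stage.MATURATION_COMPLEX,
}


def advance(stage: Stage) -> Stage:
    """Return the stage that follows the given one."""
    return NEXT_STAGE[stage]


def stages_in_one_round(start: Stage = Stage.MATURATION_COMPLEX) -> list[Stage]:
    """List the stages visited before the cycle returns to its start."""
    order = [start]
    stage = advance(start)
    while stage is not start:
        order.append(stage)
        stage = advance(stage)
    return order


if __name__ == "__main__":
    for stage in stages_in_one_round():
        print(stage.name)
```

Representing the pathway as a closed cycle makes the processive nature of packaging explicit: each round ends in a state from which the next procapsid can be engaged.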

Conclusion

Herein we have proposed specific molecular events in phage λ required for the site-specific assembly and activation of a stable maturation complex and its transition to a dynamic packaging motor, including the “symmetry resolution”, procapsid binding and cos-clearance steps. We discuss current models that describe motor translocation and the mechanochemical coupling of ATP hydrolysis to complex movement. Finally, we discuss the features and mechanistic complications involved in the terminal cos-cleavage reaction, including the signals required for cos-capture, symmetry resolution and activation of nuclease activity. The terminase enzymes must sequentially cycle between these two complexes, stable nuclease and dynamic motor, to allow processive genome packaging from a concatemeric DNA substrate. The proposed models derive from extensive genetic, kinetic, biochemical and biophysical data obtained over decades of study in our lab and by others. Kinetic and thermodynamic interrogation of λ terminase figures prominently in the development of the models and complements the structural and single-molecule data published in λ and many other systems. Analogous assembly steps and catalytic transitions must occur in all of the dsDNA viruses that package monomeric genomes from concatemeric precursors and, while details of these processes may differ, it is likely that the general features proposed for λ are recapitulated in all of the terminase family members. Thus, continued mechanistic interrogation of the packaging complexes using integrated genetic, structural, biophysical, enzymological and computational approaches is warranted. Indeed, the availability of multiple packaging systems, including λ, T4, SPP1, P74-26 and SF6, among others, provides an attractive palette from which to choose and define conserved features and system-specific mechanisms of packaging amongst the terminase enzymes.

Further Reading

Black, L.W., 2015. Old, new, and widely true: The bacteriophage T4 DNA packaging mechanism. Virology 479–480, 650–656.
Casjens, S.R., 2011. The DNA-packaging nanomotor of tailed bacteriophages. Nature Reviews Microbiology 9 (9), 647–657.
Casjens, S.R., Hendrix, R.W., 2015. Bacteriophage lambda: Early pioneer and still relevant. Virology 479–480, 310–330.
Catalano, C.E., 2005. Viral genome packaging machines: An overview. In: Catalano, C.E. (Ed.), Viral Genome Packaging Machines: Genetics, Structure, and Mechanism. New York, NY: Kluwer Academic/Plenum Publishers, pp. 1–4.
Cuervo, A., Daudén, M.I., Carrascosa, J.L., 2013. Nucleic acid packaging in viruses. In: Subcellular Biochemistry 68, pp. 361–394.
Oliveira, L., Tavares, P., Alonso, J.C., 2013. Headful DNA packaging: Bacteriophage SPP1 as a model system. Virus Research 173 (2), 247–259.
Ortiz, D., delToro, D., Ordyan, M., et al., 2018. Concerted action of ASCE glutamate and arginine residues in catalysis of ATP hydrolysis in the phage lambda DNA packaging motor. Nucleic Acids Research 47 (3), 1404–1415.
Rao, V.B., Feiss, M., 2015. Mechanisms of DNA packaging by large double-stranded DNA viruses. Annual Review of Virology 2 (1), 351–378.
Sharp, K.A., Lu, X.J., Cingolani, G., Harvey, S.C., 2019. DNA conformational changes play a force-generating role during bacteriophage genome packaging. Biophysical Journal 116 (11), 2172–2180.

DNA Packaging: DNA Recognition

Sandra J Greive and Oliver W Bayfield, University of York, York, United Kingdom

© 2021 Elsevier Ltd. All rights reserved.

Nomenclature

A Adenine
aa Amino acid
ATP Adenosine triphosphate
bp Base pairs
C Cytosine
cos Cohesive end sequence
DBD DNA binding domain
DNA Deoxyribonucleic acid
ds Double stranded
E. coli Escherichia coli
G Guanine
gp Gene product
HTH Helix-turn-helix
pac Packaging sequence
RNA Ribonucleic acid
ss Single stranded
T Thymine
TerL Large terminase protein
TerS Small terminase protein
TP Terminal protein
TR Terminally redundant
wHTH Winged helix-turn-helix

Glossary

Affinity: Description of the interaction between two components of a binary complex that can be qualitatively referred to as either strong/tight or weak. Usually this is defined quantitatively as the equilibrium dissociation constant.
Avidity: The increase in affinity that occurs as each new binary interaction is made between additional components of the final ternary complex.
Capsid or phage head: The protein shell that is preassembled from the major capsid protein and into which the viral genome is packaged. In non-enveloped viruses this forms the outside of the virus and eventually acts as the protective container for transfer of the viral genome to a new host.
Concatemer: The nascent dsDNA product from rolling circle replication of viral genomic DNA that is a continuous dsDNA strand with multiple copies of the genome fused head to tail.
Conformational recognition: The situation where assembly of a protein-DNA complex is facilitated by electrostatic attraction between oppositely charged surfaces that have complementary shapes, and by the ability of the target DNA sequence to form the matching shape. Since contacts between the protein and DNA base pairs are not directly sequence identifying, this type of binding is an indirect readout of the underlying sequence.
Genome maturation: The process where the concatemeric viral DNA is recognized and cut to create the free DNA end that is key for insertion into the capsid pore at the start of packaging.
Headful termination: The DNA sequence independent termination of the DNA translocation reaction that occurs once the capsid reaches maximum DNA capacity. The DNA stalled within the molecular motor is cut by an endonuclease domain of the motor protein. The entire motor assembly (DNA bound to ATPase) is then transferred to a new empty capsid and the packaging process is repeated.
Packaging: The process where the nascent genomic DNA is pumped into an empty capsid shell to form a new infectious virus particle or virion.
Replication: The process in which new copies of the viral genome are synthesized for the creation of new virus particles.
Sequence specific binding: The binding interaction between a protein and a defined sequence of DNA, mediated by specific hydrogen bonding contacts between the side chains of the recognition amino acids on the binding surface and the edges of the individual base pairs in the DNA sequence.

Encyclopedia of Virology, 4th Edition, Volume 4. doi:10.1016/B978-0-12-809633-8.20956-4

Introduction

Bacteriophages repurpose the host cell’s machinery to make the specialized molecular machines and macromolecular complexes required to produce new virus particles: a copy of the viral genome encased in a protective protein container known as a capsid or phage head. These new viruses are then released and can infect other bacteria. The nascent copies of the bacteriophage genome produced during replication need to be pumped, by a powerful DNA translocation motor, into pre-assembled capsids through a specialized pore. The packaged DNA forms concentric loops of decreasing diameter around the inside of the capsid until the strands of DNA are tightly packed against each other. As the phage head fills, the translocation motor continues to force DNA into the capsid, driving the ordered bending and stacking of the DNA against the massive physical forces that normally prevent such compaction. Indeed, the pressure of the restrained DNA inside the packaged virus is 10 times greater than the pressure contained in a champagne bottle. This DNA packaging process starts when one end of the viral genome is assembled into the ATP hydrolysis powered DNA translocation motor located on the capsid pore (known as either the portal or the connector, Fig. 1).

Fig. 1 Overview of dsDNA bacteriophage genome packaging systems. The phage genome is assembled with a powerful DNA translocation motor (red) on a unique pore (portal or connector protein; gray) in the preformed protein capsid. ATP hydrolysis powers the macromolecular machine that pumps the viral genomic DNA into the protective viral shell in a process known as packaging. (A) Protein-primed replication of linear dsDNA genomes (e.g., Bacillus phage φ29 and Tectiviridae such as PRD1) creates monomeric dsDNA products fused to terminal protein (green). (B, C) Rolling circle replication of a circularized template intermediate yields long polymeric concatemers of dsDNA genomes, fused head to tail, that need to be cut (arrows) to create the free DNA end required for packaging into empty capsids. Once the capsid is filled, the DNA is cleaved again, the motor-DNA complex is transferred to a new empty capsid and the process is repeated. (B) Packaging of unit length genomes mediated by cos sequences (E. coli phage λ or HK97; orange) or terminal repeat sequences (E. coli phage T3, T5, T7). (C) The headful packaging mechanism is initiated by a cleavage at the pac site (cyan) and terminated at non-specific sequences after ~104% of the genome has been packaged (arrows), producing virions containing slightly different circularly permuted genomes.

Like all cells, the inside of a bacterial cell is a densely crowded, dynamic environment in which the genomic DNA is stored in ordered structures. The macromolecular machinery that rearranges, copies, repairs and transcribes the information coded within the genome is integrated within this superstructure. Additionally, bacteria contain other large biomachines, such as ribosomes, proteasomes, and the complexes that drive metabolism, all of which are mixed together with individual protein molecules, nutrients and metabolites into a molecular stew with the consistency of a gummy bear candy. Virus replication and packaging occur within this complex environment. Consider the case of an average E. coli cell in exponential growth, containing ~2 copies of its own DNA genome; ~100 copies of a high copy number plasmid of average size (5 thousand base pairs, bp); and up to 300 new copies of a replicating phage DNA genome (~50 thousand bp each); equating to a total DNA content of ~25 million base pairs. How does an infecting bacteriophage recognize and selectively package its own genomic DNA?
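The example cell described above can be tallied explicitly. The following sketch sums the three DNA pools using the copy numbers and sizes quoted in the text; the host chromosome size of about 4.6 million bp is an assumption added here (the text does not state it), as are the identifier names.

```python
# Assumption: approximate size of one E. coli chromosome (not given in the text).
ECOLI_GENOME_BP = 4_600_000

# (copies per cell, size in bp) for each DNA pool in the example cell.
DNA_POOLS = {
    "host chromosome": (2, ECOLI_GENOME_BP),
    "plasmid": (100, 5_000),
    "phage genome": (300, 50_000),
}


def pool_totals(pools: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Total base pairs contributed by each pool (copies x size)."""
    return {name: copies * size for name, (copies, size) in pools.items()}


if __name__ == "__main__":
    totals = pool_totals(DNA_POOLS)
    grand_total = sum(totals.values())
    for name, bp in totals.items():
        print(f"{name}: {bp:,} bp ({bp / grand_total:.0%} of total)")
    print(f"total: {grand_total:,} bp")
```

Under these assumptions the replicating phage DNA is the largest single pool, yet roughly 40% of the DNA in the cell is still host or plasmid DNA, which is why a selective recognition mechanism is needed.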

Genome Replication and Selection Mechanisms

The problem of preferential packaging of phage genomic DNA, rather than host genomic or plasmid DNA, can be overcome through several different methods. While some bacteriophages (e.g., Pseudomonas chlororaphis phage 201φ2-1) keep their replication machinery and the newly synthesized genomes separate from the host cell genome within a virally encoded protein compartment, most rely on some form of selective recognition between host or viral DNA and virally encoded proteins. For example, E. coli phages T4, T3, and T7 produce nuclease enzymes that target and degrade the host cell DNA, leaving only the nascent viral DNA to be packaged. However, the most common mechanism uses dedicated viral proteins that selectively bind to bacteriophage genomic DNA and then direct the assembly of the packaging motor-DNA complex onto the portal of the capsid to initiate DNA packaging. Since double stranded (ds) DNA bacteriophages use different replication systems to make new copies of their genomic DNA, and thus produce diverse dsDNA products, specialized mechanisms are required for specific recognition of these replication products in order to initiate packaging.

Monomeric replication products

Bacteriophages that use protein-primed replication (Bacillus phage φ29 and Tectiviridae, e.g., PRD1) produce linear monomeric dsDNA genomes. The products of this type of replication are most obviously defined by the presence of a protein covalently attached to the terminal 5′ phosphate at each end of the genomic dsDNA. This terminal protein (TP) remains attached to the nascent DNA strand after acting as the protein primer for initiation of strand synthesis by the bacteriophage DNA polymerase (Fig. 1(A)).

Concatemeric replication products

New copies of most linear bacteriophage dsDNA genomes are produced through rolling circle replication of a circularized template intermediate (e.g., E. coli λ and T3 bacteriophages). This results in a concatemeric genome product: multiple copies of the genome, joined head to tail in a single long dsDNA strand. In a process known as genome maturation, these fused copies must be separated


DNA Packaging: DNA Recognition

at the start of each individual genomic unit, in order to create the nascent DNA end that is key for loading into the packaging motor. Since many of these viral packaging systems show no genome preference when provided with a free dsDNA end in vitro, the protein participants that generate the nascent DNA end play an important role in genome selectivity. Several different methods for achieving this genome maturation process have been noted. Some phages package exactly one genome unit length defined by cohesive end site (cos) sequences (E. coli phage λ or HK97) or packaging (pac) sequence encoded direct terminal repeat (E. coli phage T3 and T7) sequences. Other phages package circularly permutated genomes (e.g., Bacillus phages SF6 and SPP1, and Salmonella phage P22), with the initial cleavage defined by a pac sequence site, resulting in packaging of greater than 1 genome unit length. After the initiation event at the pac site, when at least one genome length of the concatemeric DNA has been pumped into the phage head, DNA translocation is terminated by another cleavage event and the motor-DNA end complex is transferred to a new empty capsid to begin the packaging process again (Fig. 1(B)). This secondary, ‘motor transfer’ mode of packaging initiation accounts for at least two thirds of all packaging events in bacteriophages that use rolling circle replication. Bacteriophages that package unit length genomes, such as λ or HK97, terminate DNA translocation and cut the DNA at the same specific sequence motif (cos) in the next adjacent genome. However, viruses that package more than one genome length (~104%) terminate packaging at non-sequence specific sites using a ‘headful’ mechanism (Fig. 1(C)). Headful cleavage is presumably mediated by the deceleration of the packaging rate that results from the mounting internal pressures that are induced as the capsid approaches maximal DNA capacity. 
Indeed, this headful signal for termination of translocation appears to be an underlying physicochemical property of dsDNA bacteriophage packaging systems, as it has also been observed in mutant versions of viruses that usually package unit length genomes. Despite these differences, in each case, initial genome end processing and motor assembly requires specific recognition of the viral genome by virally encoded proteins. This selective binding relies on the physical properties of the targeted DNA: the average base composition and/or modifications, local and global variations in shape, and the presence of specific sequence motifs.
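
The bookkeeping behind headful packaging and motor transfer can be illustrated with a short sketch. It is a toy model only: the 40,000 bp genome length and the function name headful_series are hypothetical, the ~104% headful size is the figure quoted above, and the model ignores the imprecision of real headful cuts. It shows how each successive packaging event starts where the previous one ended, so that the packaged ends drift along the genome in the direction of packaging.

```python
def headful_series(genome_len, headful_percent, n_events, pac_position=0):
    """Toy model of sequential headful packaging along a concatemer.

    Returns one (start, end, redundancy) tuple per packaging event, where
    start and end are positions on a single genome copy (concatemer
    coordinate modulo genome_len) and redundancy is the number of base
    pairs packaged beyond one genome length.
    All inputs are illustrative assumptions, not measured values.
    """
    headful_len = genome_len * headful_percent // 100  # integer arithmetic, exact
    events = []
    start = pac_position  # the first cut is made at the pac site
    for _ in range(n_events):
        end = start + headful_len  # headful cut, not sequence specific
        events.append((start % genome_len, end % genome_len,
                       headful_len - genome_len))
        start = end  # motor transfer: the next capsid is loaded from this end
    return events


if __name__ == "__main__":
    # Hypothetical 40,000 bp genome packaged at 104% per capsid.
    for i, (s, e, tr) in enumerate(headful_series(40_000, 104, 4), start=1):
        print(f"event {i}: start {s}, end {e}, terminal redundancy {tr} bp")
```

In this model only the first event begins at the pac site; every later start position is set by the preceding headful cut, which is why the packaged genomes are circularly permuted.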

The Shape of DNA

The helical architecture of dsDNA is stabilized by both the hydrogen bonds formed between the complementary Adenine:Thymine (A:T) and Guanine:Cytosine (G:C) base pairs at the center of the two sugar-phosphate backbone chains, and the helical stacking interactions between adjacent bases along each strand (Fig. 2(A)). Double stranded DNA is a conformationally dynamic molecule that can exist in many different shapes or forms. The most commonly adopted shape is the ‘B-form’ that has an average of 10.5 bp per helical turn, creating wide major and narrow minor grooves of equal depths. Since the minor groove is on the sugar-phosphate side of the bases, where access to the bases is more sterically hindered, sequence specific interactions occur in the major groove. The minor groove of ideal B-form DNA is also comparatively more electrostatically negative than the major groove. While B-form DNA is the most prevalent form within cells, DNA conformation is both dynamic, fluctuating locally between conformational variants, and dependent on sequence, so that some sequence patterns are more likely to enter different conformations. For instance, A-rich sequences form non-ideal B-form DNA with narrower minor grooves, while GC-rich DNA sequences are often found in ‘A-form’ shapes (Fig. 2(A)). A-form DNA is shorter and wider, with a looser helical turn (11 bp per turn) creating a deep narrow major groove and a wide shallow minor groove. The extent of hydration of DNA, and the resulting hydrogen bonds formed with water molecules along the phosphate backbone and within the major and minor grooves, is crucial in determining these conformations. Indeed, dehydration of B-form DNA causes a transition to A-form. Further conformational variants are formed by weakening of the hydrogen bonds between base pairs (breathing or melting), resulting in the formation of single stranded (ss) regions such as those found in bulges, bubbles or loops. 
Perturbation of the base stacking interactions and rotation of the backbone strand allows bases to ‘flip out’ of the helical structure. These variants can be further stabilised and/or induced by local DNA base sequence or protein binding interactions. Over longer distances, such sequence induced local conformational changes in the DNA helix can contribute to curvature of the DNA helix over a single base pair (kinking) or several base pairs (bending) (Fig. 2(B)), conformations which can be sharply bent by the binding of histone-like proteins in bacteria (Fig. 2(C)). When multiple repeats of these sequences occur in defined patterns, the DNA can be curved into circular or superhelical structures, as observed in nucleosome core particles (Fig. 2(D)). These conformations are also somewhat dependent on the base sequence composition, as well as the binding of counter-ions or proteins. Globally, the action of macromolecular machines that separate the DNA strands during RNA transcription or DNA replication causes the helical dsDNA to become supercoiled (Fig. 2(E)). Positive supercoils, where the DNA is overwound and less likely to melt into individual strands, are formed in front of the advancing enzyme complex, while negative supercoils (underwound DNA with a higher propensity for melting) are formed behind it. Negatively supercoiled DNA is thought to exist as Z-form DNA (Fig. 2(A)), where the phosphate backbone has a zigzag conformation and the stacked bases are exposed towards the outer edge of the fiber. These differently shaped DNA structures can also be specifically bound by specialized proteins using conformational recognition.

Double stranded DNA is a surprisingly versatile information storage device. At its most basic, it contains the ordered sequence of bases that comprise the code, or blueprint, for each of the RNA and protein components of the molecular machinery that perform the processes necessary for organism survival. 
Overlaid around, and sometimes over, this basic code are the sequence patterns, or motifs, that act as signals that are preferentially bound by the protein complexes that copy or transcribe the basic code, or control these processes. Redundancy in the code leads to variation in the average base composition between organisms, and the


Fig. 2 The conformational variation of DNA. (A) Cartoon depiction of the DNA double helix in A-, B- and Z-form conformations (top panel), showing the sugar-phosphate backbone (orange ribbon; and gray rings), base pairing between complementary bases (blue/white/red shaded rings) and the helical stacking interactions between bases along the same strand (bottom panel). Surface cartoons show the distance from the center of the DNA helix in a gradient colored from dark gray (center) to orange (10 Å from helix center). Models were created from fiber diffraction data using 3DNA (x3DNA.org). Cartoon depiction of: (B) bending (PDB 1LMB) or kinking (PDB 1JJ4) of dsDNA stabilized by proteins; (C) sharply bent DNA bound to Integration Host Factor (PDB 1IHF); and (D) tightly coiled circular DNA in nucleosome core particles (PDB 3LZ0). (E) Representation of positive and negative DNA (orange) supercoiling that occurs during transcription by RNA polymerase (blue).

over- or under- representation of certain patterns, the bases of which are sometimes specifically targeted for chemical modification (e.g., methylation) by dedicated proteins. The conformational repertoire of helical dsDNA introduced by particular sequences of base pairs, as well as the contribution to larger shape variations by repeating patterns of such sequences, can be considered as an additional ‘conformational code’ that can be specifically acted upon by protein cofactors with complementary shaped binding surfaces. Combination of sequence specific binding with conformational recognition can significantly expand the protein binding code available to a DNA sequence.
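
The helical parameters quoted for B-form and A-form DNA translate directly into simple geometric quantities. The sketch below is an illustration rather than part of the original analysis: the function names are hypothetical, and the use of a linking number to classify closed DNA as overwound or underwound is a standard topological convention that the text above does not spell out.

```python
# Base pairs per helical turn, as quoted in the text.
BP_PER_TURN = {"B-form": 10.5, "A-form": 11.0}


def twist_per_bp(bp_per_turn):
    """Average rotation, in degrees, between successive base pairs."""
    return 360.0 / bp_per_turn


def helical_turns(n_bp, bp_per_turn):
    """Number of helical turns spanned by n_bp base pairs."""
    return n_bp / bp_per_turn


def supercoil_sign(linking_number, n_bp, bp_per_turn=10.5):
    """Classify topologically closed DNA by comparing its linking number
    (an assumed input) with the value expected for relaxed DNA."""
    relaxed = helical_turns(n_bp, bp_per_turn)
    if linking_number > relaxed:
        return "positive (overwound)"
    if linking_number < relaxed:
        return "negative (underwound)"
    return "relaxed"


if __name__ == "__main__":
    for form, bpt in BP_PER_TURN.items():
        print(f"{form}: {twist_per_bp(bpt):.1f} degrees of twist per base pair")
    # Hypothetical 1050 bp closed segment with five turns removed.
    print(supercoil_sign(95, 1050))
```

The looser A-form helix corresponds to a smaller average twist per base pair than B-form, and a closed segment with fewer turns than its relaxed value is classed as negatively supercoiled.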


Fig. 3 Examples of preferential DNA-protein interactions. (A) Cartoon image of sequence specific binding to B-form DNA by a Helix-Turn-Helix (HTH) motif (PDB 1W0T), showing the recognition helix (green), the support helix (yellow) and an example of an arginine rich N-terminal arm (magenta). (B,C) Image showing modes of conformational recognition using positively charged protein surfaces with complementary shapes to the bound DNA. The electrostatic surface potential for each protein is depicted (positive, blue; and negative, red). (B) E. coli Integration Host Factor (IHF) bound to H-box DNA, and (C) a nucleosome core particle bound to bent DNA. In all frames the phosphate backbone (orange strands), ribose sugars (gray rings) and nucleotide bases (gray, red, blue rings) of the DNA are shown.

Protein-DNA Binding

A Question of Fidelity: Sequence Specific and Non-Specific Binding

Protein binding interactions with nucleic acid molecules are considered sequence-specific when they are stabilised by hydrogen bonding networks or hydrophobic interactions between the amino acid side chains of the protein and the individual base pairs of the DNA sequence motif. This is also known as direct readout. Additional non-sequence specific binding interactions occur around the sequence specific contacts. These largely electrostatic connections, between the positively charged amino acids forming the binding surface of the protein and the negatively charged sugar-phosphate backbone of the DNA, serve to stabilize the bound complex. Indeed, the comparatively weak non-sequence specific binding interactions can play an important role in localizing the protein to the DNA, allowing the protein to slide randomly along the dsDNA in a linear search for a sequence specific binding motif. Conformational changes in both the protein and the DNA allow direct contacts to be made with the bases of the binding sequence motif, increasing the affinity of the protein for the DNA at this sequence. Despite not making identifying contacts with the base pairs in the major groove, the overall shape of the positively charged surface of the protein can also play a role in selective binding to segments of DNA with a potential for matching this shape. This indirect readout mode has been noted in the sharply bent or superhelical protein-DNA complexes of genome architectural proteins.

DNA Binding Domains

Many proteins interact with DNA through discrete domains, known as DNA binding domains (DBDs). These are often folded independently of the main protein body and can be swapped between proteins or studied in isolation. DBDs are found in different shapes, also known as folds or structural motifs, depending on the functional requirements of the core protein or complex. DBDs that interact with specific DNA sequence patterns have folds that allow placement of the binding surface so that direct contacts can be made between the residues of this surface and the individual bases that comprise the binding sequence. These include Helix-Turn-Helix (HTH), Helix-Loop-Helix, Zinc fingers and Leucine zippers, among others. Additional non-sequence specific contacts are often made by extensions to the DBD that are not always considered part of the core DBD motif. Proteins that use a conformational recognition mode for binding to targeted DNA sequences have core structures that place the DNA binding features in a defined geometry to allow non-sequence specific contacts with the sugar-phosphate backbone or the minor groove of DNA of a distinct shape, such as the histone complexes of eukaryotes and the histone-like proteins of bacteria.

HTH motifs are highly conserved and widespread throughout nature, including in bacteriophages. The core domain is usually presented on a small helical bundle and consists of 2 α-helices, connected by a turn, and held tightly together in a cross shape by conserved hydrophobic contacts between the helices. Since an α-helix is just wide enough to lie along the major groove of B-form DNA when aligned roughly parallel to the backbone strands, the recognition helix (usually the second in the HTH) affords a compact scaffold from which to project the combination of amino acid side chains that will contact the base pairs within the major groove and bind specifically to the target sequence. 
The amino acid residues around this helix assist with correctly positioning the base specific contacts through additional non-sequence specific contacts with the dsDNA backbone. Common extensions to this core fold, such as N-terminal arms and/or anti-parallel β-sheet wings (as in the winged HTH, wHTH), make electrostatically mediated contacts with the sugar-phosphate backbone and/or the minor groove (Fig. 3(A)). Conformational recognition, such as that used by genome architecture proteins, uses larger protein assemblies with surfaces of defined geometry and patterns of positive charge to capture and stabilize transient DNA shapes. For instance, histone-like proteins


(e.g., Integration host factor, IHF) of bacteria bind as dimers in a non-sequence specific manner to DNA that has a propensity for forming sharp bends, stabilizing the bend and assisting in wrapping and condensing the DNA chromatid. The protein dimer creates a positively charged, saddle-shaped β-sheet surface that is suspended between two flexible extended β-ribbon arms that lie along the minor groove when bound to DNA. The tips of the arms display conserved proline residues that stabilize the kinked DNA conformation by projecting between two stacked T bases and distorting the helical assembly of base pairs. IHF binds preferentially to DNA sequences in a conformational recognition mode where the spacing between short sequences of conserved base pairs is required for placing narrowed minor grooves at the correct position for interaction with distinct features of the protein (arginine residues along the arms). Electrostatic interactions between the bent DNA strands and the saddle groove over the top of the protein and down the side of the protein further stabilize the bent conformation (Fig. 3(B)). In eukaryotes, the histone octamers in nucleosome core particles form a somewhat asymmetric disk-like shape that is highly positively charged around the circumference so that ~140 bp of DNA wraps twice around the disk to form a superhelical structure. The diameter of the core is such that it selectively binds to DNA with repeating patterns of sequences (A-tracts followed by CA bases) that are kinked, stretched and bent into a tightly wrapped helical coil (Fig. 3(C)). These shapes are further stabilised by precise placement of electrostatic arginine contacts in regions of narrowed minor grooves. The general properties of these DBDs that modulate interaction with their target DNA sequences are exploited and refined, both in phage encoded proteins and the target DNA sequences, to ensure selective packaging of the viral genomic DNA. 
Examples from bacteriophage packaging initiation systems are discussed below.

Strength in Numbers: Multiple Binding Events Regulate Complex Assembly

The macromolecular complexes that function on DNA assemble in a cooperative, or synergistic, manner through a network of interactions that form between the different protein components and their individual contacts with the DNA. Consequently, the DNA can be considered as acting both as a substrate, being the functional target of the macromolecular machine, and as the scaffold on which the functional protein complex is assembled. These multiple interaction events serve to increase the overall affinity, or avidity, of the complex for DNA above the initial binding level observed for the individual components. Such avidity stems from the apparent increase in the local concentration of the binding components that are tethered together through mutual contacts with other members of the complex. A systematic build-up of weak binding interactions in this modular fashion not only increases the number of different specific interactions that are possible for a particular genome, but also introduces more flexibility into the range of mechanisms for regulating the formation and disassembly of the functional complexes. For instance, ligand binding and/or chemical processing can trigger conformational changes that either positively or negatively modify the binding affinity of one component for DNA or for other complex components. This modulation of affinity allows the assembly, function and dissociation of the different macromolecular machines to be coordinated efficiently during replication and packaging.
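
How a set of individually weak contacts can add up to tight binding can be illustrated with an idealised thermodynamic sketch. It assumes that the contacts are independent and that their standard binding free energies simply add, which gives an upper bound on avidity; real complexes lose part of this gain to entropic and geometric constraints. The dissociation constants used are hypothetical, and the function names are illustrative only.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)
T = 298.15          # assumed temperature in K


def kd_to_dg(kd_molar):
    """Standard binding free energy (kJ/mol) for a dissociation constant
    given in mol/L, using a 1 M standard state."""
    return R * T * math.log(kd_molar)


def dg_to_kd(dg_kj_per_mol):
    """Inverse of kd_to_dg."""
    return math.exp(dg_kj_per_mol / (R * T))


def combined_kd(kds):
    """Idealised apparent Kd when independent contacts add their free
    energies. This is an upper bound on avidity, not a prediction."""
    return dg_to_kd(sum(kd_to_dg(kd) for kd in kds))


if __name__ == "__main__":
    weak_sites = [1e-6, 1e-6]  # two hypothetical micromolar contacts
    print(f"single contact: {kd_to_dg(weak_sites[0]):.1f} kJ/mol")
    print(f"idealised combined Kd: {combined_kd(weak_sites):.1e} M")
```

Under these assumptions, two micromolar contacts made simultaneously behave like a single picomolar interaction, and weakening any one contact (for example after a cleavage event separates the DNA scaffold) raises the apparent Kd sharply.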

Recognition of Viral Genomic DNA

The combinatorial power of DNA sequence and conformational recognition by specialized proteins is used to ensure selective packaging of bacteriophage genomic dsDNA. This process exploits the basic properties of the different genomic replication products, leading to the variety of specific recognition mechanisms described below.

Monomeric Genomes and Terminal Proteins

In φ29, the terminal protein (TP or gp3), in complex with the viral DNA polymerase, binds preferentially to the end of the dsDNA genome through recognition of the combination of the 6 bp inverted terminal repeat sequence (5′-AAAGTA-3′) and the TP covalently attached at the free 5′ end during the previous round of replication. The N-terminal 74 amino acids of TP are predicted to form α-helices that have been shown to interact non-specifically with either ss or dsDNA. Presumably this non-sequence specific interaction mediates formation of a loop or lariat structure where the 5′-end bonded TP loops back and binds to the same dsDNA genome molecule. Binding of the gp16 motor assembly (equivalent to the ATPase domain of large terminase) to the TP-lariat loop junction may induce or stabilize supercoiling of the DNA within the lariat loop. The nascent bacteriophage dsDNA genome is negatively supercoiled compared with the host genomic DNA by the action of replication complexes, and the formation of the lariat loop structure. This topology contributes to the conformationally selective binding of the viral genomic DNA to the capsid located connector assembly to make a complex where the DNA is wrapped around the connector. In a motion similar to a rope moving around a pulley, this wrapped DNA may slide around the connector complex until the gp16 bound TP-lariat loop junction reaches the connector and the complete motor assembly is formed at the connector/pRNA pore on the capsid (Fig. 4).
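
The meaning of an inverted terminal repeat can be made concrete with a small sketch: read 5′ to 3′ from either end of the duplex, both strands begin with the same sequence. The 6 bp repeat is the one quoted above; the internal sequence of the toy genome and the function names are hypothetical and serve only to illustrate the check.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def has_inverted_terminal_repeat(top_strand, repeat):
    """True if both 5' ends of the duplex begin with `repeat`."""
    left_end = top_strand.startswith(repeat)
    right_end = reverse_complement(top_strand).startswith(repeat)
    return left_end and right_end


if __name__ == "__main__":
    itr = "AAAGTA"  # repeat sequence quoted in the text
    # Hypothetical toy genome: repeat + arbitrary filler + inverted repeat.
    toy_genome = itr + "GGCATTCGAC" + reverse_complement(itr)
    print(toy_genome)
    print(has_inverted_terminal_repeat(toy_genome, itr))
```

Because the two ends are equivalent in this sense, a protein that recognizes the repeat together with the covalently attached TP would encounter the same signal at either end of the genome.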

Concatemeric DNA and Terminase Proteins

Packaging of the concatemeric product from rolling circle replication is initiated when the nascent genomic DNA end is produced by a virally encoded endonuclease. This enzyme is recruited to the cleavage site by at least one other viral protein that bind(s)


Fig. 4 Model for packaging of monomeric genomic dsDNA. Cartoon model of the features required for specific packaging of φ29 genomic dsDNA (orange). The 5′ covalently attached Terminal protein (TP, PDB 2EX3, green) binds non-specifically to an internal segment of the genomic DNA forming a looped lariat structure. The gp16 ATPase motor (red, PDB 5HD9) binds to the TP-DNA bound end of the lariat, forming or stabilizing negative supercoils in the DNA within the lariat. The connector protein (gray, PDB 1H5W; assembled into a unique vertex of the capsid, shaded background) binds the negatively supercoiled DNA to form a wrapped complex.

specifically to recognition signals in the genomic DNA. These proteins are usually referred to as the terminase, or Ter, proteins, as they help define the termini of packaged DNA. The small terminase protein (TerS) binds specifically to recognition signals that mark the genome end within the concatemeric replication product. This protein-DNA complex is bound and stabilised by the large terminase protein (TerL), which contains the nuclease domain that cleaves the end of the genome as well as the ATPase domain that powers the packaging motor. The terms ‘small’ and ‘large’ are in reference to their respective monomeric molecular weights. The nascent DNA end is assembled within a pentameric ring of large terminase subunits at the portal/connector pore complex of the empty capsid, forming the powerful DNA translocation motor that pumps the DNA into the capsid shell. The mechanisms by which the initiating terminase protein complexes of bacteriophages specifically recognize, and assemble, onto their cognate genomes are described below.

Direct terminal repeats: T3 and T7

The packaged linear genomes of T3 and T7 bacteriophages are bounded at each end by terminally redundant (TR) direct repeat sequences of 230 and 160 bp, respectively (Fig. 5(A)). In concatemeric DNA, a single copy of the TR sequence is preceded by the pac binding (pacB) sequence and bounded at each end by pac cleavage (pacCL and pacCR) sites. The pacB site defines phage specific promoter sequences for the cognate viral RNA polymerase, rather than the site of TerS binding, hence coupling packaging initiation to RNA transcription of the TR sequence. Genome end maturation is triggered when an RNA polymerase transcription complex becomes stalled at the concatemer junction (CJ) pause signal downstream of pacCR. In a process that is still poorly understood, the corresponding viral terminase proteins (gp18 and gp19) interact directly with the paused RNA polymerase complex in order to assemble at the TR sequence and subsequently nick the dsDNA on opposite strands at pacCL and pacCR, creating long complementary 5′ extensions over TR. However, given that the action of the transcription complex would also produce supercoiling in the dsDNA template, and that the T7 TerS forms an oligomeric complex of ~8 monomers, it is possible that the terminase protein complex also recognizes the global conformation of the DNA surrounding the paused transcription complex to form a stable and functional terminase assembly. The direct repeat sequences of the TR are subsequently generated through an end filling extension reaction, in which a DNA polymerase extends from the 3′ end of the nicked strand along each newly formed 5′ strand until the second nicked site is reached. The nascent dsDNA end at pacCR is then assembled into the macromolecular motor complex for the DNA packaging process.

Cohesive ends: λ-like

In phages λ and 21, the end of the genome is defined by the cos sequence; a complex 200 bp region comprised of the cosQ, cosN and cosB sites (Fig. 5(B)). The cosB (binding) site is bound specifically by the terminase complex, thought to be a tetramer of protomers, where each protomer is comprised of a heterotrimer of 2 gpNu1 proteins and gpA (TerS and TerL respectively). This proteo-DNA complex directs the nuclease domain of TerL to introduce staggered cuts in the cosN (nicking) sequence creating complementary 12 base ss 5′ overhangs at the end of each adjacent genome. The cosQ containing DNA is freed from the complex, and the remaining DNA end (with the cosB site) is assembled into the packaging motor. DNA translocation begins and continues


Fig. 5 Models of the packaging initiation processes for concatemeric genomes. In all images packaging proceeds from the cleavage site towards the right hand end of the genome as depicted. (A) Depiction of the genome recognition and packaging initiation events in T3 and T7 phages; inset: surface cartoon showing specific contacts between T7 RNA polymerase (PDB 1CEZ, light purple) and the T7 promoter pacB DNA (light orange) with the transcription start site depicted by the bent arrow. RNA polymerase pauses after transcription of the pacC Terminally Redundant (TR) sequences (dark orange) and is bound by TerL (red) which cuts the end of the pacC sequence. T7 DNA polymerase (green) extends the cleaved 3′ ends of the DNA to create the terminal repeat sequences. It is unknown how the cleavage of the opposite end of the TR sequence occurs, or what role the octameric TerS (gray circles) may play in this process (indicated by the arrows and question marks). (B) Cartoon image of bent cos DNA (orange) with bound IHF dimer (dark gray) and the tetramer of terminase protomers binding to I1 and R sites respectively (TerS2:TerL, white and red respectively); the sequence of both strands of the cosN site is shown with the arrows denoting the cleavage sites. cosN and R site sequences are shaded in orange. The sequence coding for TerS/gpNu1 is shown in blue. (C) Diagram of T4 packaging initiation showing a model for the synapsis complex formed by two stacked TerS oligomers (white) loaded on to two different genomic segments of the concatemer (gray and blue). TerL (red) interacts with gp55 (late sigma factor, yellow) bound to the T4 DNA sliding clamp (gp45, dark gray) and a transcriptional coactivator (gp33, green). (D) Cartoon representation of genome recognition and maturation events in P22 phage. A complex of 1–3 TerL proteins (red) assembled on the TerS nonamer (white) binds to the pac sequence (orange) located within the terS gene (blue). 
Potential binding site of pac DNA (orange) located near the C-terminal end of TerS is shown by a segment of DNA. Cleavage occurs in a defined pattern within a 120 bp segment around the pac sequence (arrows, length indicates cleavage frequency). (E) Representation of the recognition of Sf6 genomic DNA by the HTH domains of TerS (white), which bind the pac sequence (orange) located in the terS gene (blue). Cleavage occurs non-specifically over a ~2000 bp region of the genome centered on the pac sequence, with decreasing frequency depicted by the gradient from gray (high frequency) to white (low frequency). This non-specific cleavage may be mediated by sliding of TerS wrapped DNA along the DNA strand. (F) Depiction of SPP1/SF6 pac site showing the pacL (light orange) sequence which has a propensity for bending and covers the promoter (bent arrow) regulating terS (blue) expression. Two TerS assemblies (white) bind to pacL and pacR containing DNA in a wrapped complex that recruits TerL (red) to cleave at pacC. Cleavage (arrows) occurs within a few bases of the preferred cleavage site (large arrow).


Fig. 6 HTH domains and oligomeric TerS assemblies. Top Panel: Cartoon image of the HTH domains (yellow-orange) from TerS proteins of different phages (A: λ (PDB 1J9I); B: phBC6A51 (PDB 2A09); C: T4-like 44RR (PDB 3XTS); D: P22 (PDB 3P9A); E: Sf6 (PDB 4YDQ); F: SF6 (PDB 3ZQQ); G: G20c (PDB 4XVN)) aligned with the SF6 DBD (PDB 4ZC3, blue). The support and recognition helices for the SF6 DBD (α2 and α3 respectively) and P22 TerS (α3 and α4, respectively) are labeled. Middle Panel: Ribbon diagram of the oligomeric assemblies of the TerS with the DBDs highlighted by a yellow circle and the number of monomers in the assembly listed above. Bottom panel: Image of the electrostatic surface potentials of circular TerS oligomers, showing areas of positive (blue) and negative (red) charge.

until the cos sequence in the next adjacent genome reaches the motor. At this point, the cosQ motif helps to terminate the packaging process, promoting cleavage at the subsequent cosN site, and the DNA/motor complex is transferred to an empty capsid for the ensuing round of packaging. Assembly of the terminase complex onto the cosB sequence during the initiation process utilizes both DNA sequence specific and conformational recognition. Crucial to the assembly of the active terminase complex at the cosB site is the specific interaction of the IHF dimer with the I1 subsite (27 bp), located equidistant between the essential TerS binding sites (inverted repeats R3 and R2). As expected, IHF binding introduces a sharp hairpin-like bend (~160°) in the I1 segment of dsDNA which places the R2 and R3 motifs directly opposite each other on the inside surfaces of the bent dsDNA strands. This IHF stabilized conformation promotes the cooperative assembly of the final tetrameric protomer (TerS2-TerL) assembly that is tightly bound to the R2 and R3 sequences, possibly through one TerS subunit from each of two protomers (Fig. 5(B)). The remaining two protomers in the tetrameric complex are likely bound to DNA at the non-canonical R3 and cosN or cosQ (also known as R4) sites by weak DNA binding interactions and the increased local concentration generated by the tetrameric complex bound with high affinity to the canonical R2/R3 sequences on the same dsDNA hairpin scaffold. Cleavage at cosN by TerL creates an intermediate complex where the cosQ containing DNA fragment is only weakly bound to a single protomer and no longer benefits from the cooperative affinity mediated by a continuous uncleaved strand of dsDNA. This results in the release of the cosQ containing DNA from the assembled tetramer of TerS2-TerL protomers bound to the nascent end of the genome. 
Molecular detail of the contacts between components of the terminase complex and the cos DNA scaffold is limited to the N-terminal DBD of TerS, which forms a winged HTH domain (wHTH). This domain forms a dimer in solution with the helices of the HTH domain on the opposite protein surface to the dimerization interface (Fig. 6). While this model may not represent the natural oligomeric status in either the protomer or the tetrameric protomer assembly, it has proven to be a useful model system for analysis of λ TerS wHTH interaction with its cognate R2 or R3 DNA sequences. Residues on the recognition helix (helix 2) are proposed to form hydrogen bonds with base pairs in the major groove, while non-specific electrostatic contacts are likely made between the negatively charged phosphate backbone of the DNA and the positively charged amino acids on the N-terminal arm, wing and C-terminal end of helix 4 (Fig. 5(B)). Differences between the closely related λ and 21 phages can be used to infer the interactions that are crucial to genome selectivity. Within the 16 bp sequence of the conserved R motifs, two base positions determine packaging preference, where T7/C10 is bound specifically by λ TerS and G7/G10 is recognized by 21 TerS. However, precise mapping of the contacts requires more biophysical and structural experiments. Cleavage at cosN and release of the cosQ DNA end allows the reordering of the terminase proteins around the nascent end of the cosB DNA and the subsequent assembly of this motor-DNA complex to start the packaging process. The mature packaged genomes of E. coli phage HK97 and related viruses (e.g., HK022) have complementary 10 base 3′ ssDNA overhangs (3′ GCGGCCGGTTT 5′), defined by a conserved cos sequence around the maturation cleavage site. While some proteins are predicted to share >90% amino acid (aa) similarity with λ phage proteins, no similarities are predicted in the terminase proteins. 
However, weak homology has been noted with the terminase proteins from the Bacillus phage phBC6A51 (40% aa similarity), the TerS of which forms a circular oligomeric complex (Fig. 6). Other than the fact that cos cleavage requires the terminase proteins (gp1 and gp2) and a virally encoded endonuclease containing an HNH motif (gp74), little is known about how the viral genome is recognized and the nascent ssDNA 3′ overhang end is generated.
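Returning to the λ versus 21 comparison earlier in this section, the two-position selectivity rule can be written down as a small lookup. The following is a minimal sketch, not a validated predictor: it assumes that positions 7 and 10 are counted from 1 along the motif as written, the function name `predicted_ters` is hypothetical, and the example motifs are placeholders rather than real cosB sequences.

```python
# Toy lookup for the two specificity-determining positions of a 16 bp R motif.
# Assumptions (not from the source): 1-based numbering along the strand as
# written, and placeholder motifs made of 'N' apart from positions 7 and 10.

R_MOTIF_LENGTH = 16
SPECIFICITY = {
    ("T", "C"): "lambda TerS",    # T7/C10
    ("G", "G"): "phage 21 TerS",  # G7/G10
}


def predicted_ters(motif: str) -> str:
    """Return which TerS the bases at positions 7 and 10 would favour."""
    motif = motif.upper()
    if len(motif) != R_MOTIF_LENGTH:
        raise ValueError(f"expected a {R_MOTIF_LENGTH} bp motif, got {len(motif)}")
    key = (motif[6], motif[9])  # positions 7 and 10 as 0-based indices
    return SPECIFICITY.get(key, "no preference predicted")


if __name__ == "__main__":
    for placeholder in ("NNNNNNTNNCNNNNNN", "NNNNNNGNNGNNNNNN", "NNNNNNANNANNNNNN"):
        print(placeholder, "->", predicted_ters(placeholder))
```

A lookup of this kind captures only the genetic observation quoted above; as the text notes, the actual protein-DNA contacts remain to be mapped.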

DNA Packaging: DNA Recognition


Genome recognition in headful bacteriophages
Bacteriophages that use the headful packaging system pump slightly more than one genome length of the concatemeric DNA into each capsid. As a result, the packaged DNA is circularly permuted: the end of the genome that is packaged last overlaps somewhat into the adjacent genome copy, duplicating a short segment of the end of the genome that was packaged first. The iterative secondary transfer events after this initial 1+ packaging event cause the ends of the subsequently packaged genomes to migrate along the genome in the direction of packaging. This degeneracy in the genome composition relative to the ends of the packaged dsDNA is reflected in the variable requirement for sequence-specific genome binding and initial cleavage precision among the headful phages described below. Since the E. coli T4 bacteriophage produces nuclease enzymes that degrade the host genomic DNA during viral replication, specific binding of the terminase proteins (gp16 and gp17) to a pac sequence appears to be less critical for packaging of the correct genome. Despite this, weak preferential binding has been observed between TerS (gp16) and a GC-rich sequence at the end of its coding region, subsequently denoted as the pac sequence (Fig. 5(C)). However, in the absence of this sequence, cleavage of the concatemeric DNA, prior to packaging, can occur at alternative terminase binding sites. T4 phage TerS forms single circular oligomeric assemblies of 11 monomers, while TerS from a related phage, 44RR, is found as oligomeric assemblies of 11 or 12 monomers (Fig. 6). Single T4 TerS rings may also stack, and two such rings can bind to different sites on the concatemeric T4 genomic DNA, creating a ‘synapse’ that is thought to be important for terminase gene amplifications and packaging initiation. No DBD has been identified in these TerS proteins, and while the central oligomerisation domain does not bind to DNA, it is essential for genome packaging. 
This suggests a role for conformation recognition of genomic DNA. The T4 phage TerS oligomers support the stable assembly of higher order assemblies of TerL which exhibit enhanced ATPase activity. In vivo, genome end processing and packaging initiation of nascent concatemer DNA is coupled to genome replication and RNA transcription of the late genes required for viral survival. This occurs via the interaction of TerL with a large regulatory protein complex comprised of the DNA sliding clamp and two transcription cofactors (gp45/gp55/gp33), which assist in localizing the terminase complex to newly synthesized genomic DNA for end maturation (likely at the pac site in gp16) and subsequent packaging into the phage heads. Selective packaging of the Salmonella phage P22 genome relies on sequence specific binding of the terminase proteins to the 22 bp pac sequence in order to direct the cleavage event that generates the free dsDNA end required for translocation of the genome into the capsid (Fig. 5(D)). A single cleavage event occurs most often close to the first base of the pac sequence. However, this initial cleavage event can also occur at any one of several positions, approximately spaced by 20 bp, within the 120 bp region spanning the pac site, suggesting that endonuclease activity is not directly sequence dependent. The P22 TerS protein (gp3) forms a circular assembly of 9 monomers (Fig. 6) that is known to associate with up to 3 TerL (gp2) monomers (Fig. 5(D)). The N-terminal domain of TerS does not form a classical HTH motif as it contains a β-hairpin insertion, however, the most similar structure to this domain is the DBD from the Bacillus phage SF6 TerS (Fig. 6), an HTH motif. The first two α-helices after the P22 TerS β-hairpin (α3 and α4) are in a similar orientation to the support and recognition helices (α2 and α3, respectively) of the SF6 HTH (Fig. 6). 
Indeed, mutations in P22 TerS amino acid residues Leucine 80 and Glutamic acid 81, shown to alter pac specificity, map to helix α4 of the HTH-like motif, suggesting that this helix may also play a direct role in pac sequence recognition and binding (Fig. 5(D)). The flexible C-terminal segment of the TerS monomer is highly positively charged and can bind to the N-terminal domain of TerL to form the full terminase complex (TerS9:TerL2–3). Although this C-terminal TerS segment can interact nonspecifically with DNA in the absence of TerL, stable formation of TerS-TerL complexes, both in vivo and in vitro, precludes the requirement of this interaction for specific recognition of the P22 viral genome. While a threading model for DNA interaction has been proposed, where the DNA passes through the central channel of the nonamer, the 120 bp cleavage zone around the pac site suggests a wrapping model reminiscent of the nucleosome core particle (Fig. 3(C)) and is consistent with the involvement of α4 in pac binding (Fig. 5(D)). Assuming a roughly B-form DNA conformation with regular bends, as in the nucleosome-wrapped DNA, with the pac site centered within the bound DNA segment, evenly spaced sequences (where 20 bp is approximately equal to two turns of B-form DNA) would be presented facing outwards for cleavage by a TerL nuclease domain. This spacing, rather than the 10 bp of a single turn of B-form DNA, indicates that additional interactions between the DNA and TerS or TerL may stabilize the complex and occlude access to the DNA by the TerL nuclease active site. The packaging initiation endonuclease reaction of the Shigella phage Sf6 concatemer can happen at any one place over a 2000 bp range, centered around a 30 bp pac site identified in the TerS (gp1) gene (Fig. 5(E)). This sequence is recognized by the N-terminal HTH domains of the Sf6 TerS protein that are displayed externally around the central stabilising core of the circular octamer (Fig. 
6), suggesting that both sequence-specific and conformational recognition modes play a role in assembly of the TerS-pac DNA complex. While a pac sequence for preferential binding has been identified, Sf6 TerS shows high non-specific binding activity to both linear and supercoiled dsDNA, through positively charged residues located on the external surface of the circular array. The C-terminal domain of the DNA-bound TerS assembly interacts with TerL (gp2), bringing the nuclease domain into close proximity to the DNA for the maturation cleavage event. Since cleavage can occur anywhere up to 1000 bp from the pac site in either direction, it is possible that once assembled on the pac sequence, the DNA-wrapped TerS complex can slide randomly along the DNA, in a mechanism similar to nucleosome core particle sliding. A random sliding activity is consistent with the observation that the cleavage event takes place more often at pac-proximal sites than at sites located at the extreme ends of the 2000 bp cleavage zone. The concatemer DNA from the closely related Bacillus phages, SF6 and SPP1, is initially processed to produce DNA that is cleaved within a short 6 bp segment of the pacC site to produce single base 3′ overhangs (Fig. 5(F)). The 270 bp pac sequence comprises two main binding regions, pacL and pacR, on either side of the pacC site. The pacL sequence has a strong propensity


for bending and is bound by the nonameric TerS ring-like assembly through conformational recognition of the bent DNA, assisted by weak sequence-specific contacts. Nine HTH domains project a positively charged surface from the central oligomeric core (Fig. 6) that presumably allows the protein to form a complex with the DNA wrapped around it. Two TerS oligomers each bind to individual pacL and pacR sites, placing the associated TerL nuclease domain so that it cleaves accurately at pacC regardless of the local nucleotide sequence at the cut site. Atomic structures have been determined for TerS assemblies from understudied bacteriophages (Thermus phage G20c and Bacillus phage phBC6A51) for which there is little information on the genome maturation and packaging processes. These proteins also form circular assemblies of 9 monomers that display classical HTH domains around the central oligomerisation domain (Fig. 6); however, the mechanism of genome recognition and the packaging initiation cleavage event are yet to be defined.
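The circular permutation and terminal redundancy described at the start of this section on headful phages can be illustrated with a toy model. The sketch below is illustrative only: the 100-unit genome, the 104-unit headful, the first cut at position 0, and the function names are all assumptions chosen for clarity, not measured values for any phage.

```python
# Toy model of headful packaging from a head-to-tail concatemer.
# Assumptions (illustrative only): a 100-unit placeholder genome, a headful
# of 104 units, and a first cut exactly at genome position 0.


def headful_packages(genome: str, headful_len: int, n_heads: int) -> list[str]:
    """Cut successive headfuls from a concatemer built from `genome`."""
    if headful_len <= len(genome):
        raise ValueError("a headful must exceed one genome length")
    # Build a concatemer long enough to fill every head.
    copies = (headful_len * n_heads) // len(genome) + 2
    concatemer = genome * copies
    return [concatemer[i * headful_len:(i + 1) * headful_len] for i in range(n_heads)]


def start_offsets(genome_len: int, headful_len: int, n_heads: int) -> list[int]:
    """Genome coordinate at which each successive packaged DNA begins."""
    return [(i * headful_len) % genome_len for i in range(n_heads)]


if __name__ == "__main__":
    genome = "".join(chr(ord("a") + i % 26) for i in range(100))  # placeholder sequence
    heads = headful_packages(genome, headful_len=104, n_heads=4)
    offsets = start_offsets(len(genome), 104, 4)
    for i, dna in enumerate(heads):
        redundancy = len(dna) - len(genome)
        # Terminal redundancy: the last units repeat the first ones.
        assert dna[:redundancy] == dna[-redundancy:]
        print(f"head {i}: starts at genome position {offsets[i]}, redundancy {redundancy}")
```

In this model each successive head starts four units further along the genome, which is the migration of ends in the direction of packaging that the text describes.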

DNA Wrapping/Bending is a Common Feature in Selective Packaging
Bacteriophages have long served as relatively simple model systems in which to study the fundamental mechanisms used by the macromolecular machines that act on nucleic acids. This longstanding focus has yet to yield much of the molecular detail of protein-nucleic acid complex assembly and function during packaging initiation. Conspicuous among the gaps is the absence of atomic structural information on terminase-DNA complexes. However, the pieces of information that are available for the different phage systems highlight some general features of the mechanisms for selective binding and packaging of cognate genomic DNA. The N-terminal DBDs of TerS and φ29 TP are predicted to be α-helical. Indeed, many TerS DBDs form HTH or HTH-like motifs (Fig. 6(A)), supporting a model where sequence-specific binding might occur through the placement of a recognition helix along the major groove of the dsDNA at the preferred sequence motif. However, the affinity of the sequence-specific contacts between the cognate DNA targets and HTH motifs is generally weak, consistent with the observation that TerS proteins from many phage systems form circular oligomeric arrays of 8–11 monomers (Fig. 6(B)). Oligomeric assemblies of TerS have also been reported for other viruses (8 for T7 and 11 for T4). The positioning of the α-helical DBDs, displayed in a ring around a core oligomerisation domain, creates a ring of positive charges (Fig. 6(C)) around which the genomic DNA is wrapped. Additional non-sequence-specific contacts between the DNA and TerL protein(s) bound to the TerS oligomer may act like DBDs, assisting in stabilising the ternary complex in some viruses, such as P22. Flexible positioning of the DBDs relative to the oligomeric core, and to each other, provides a pliable DNA binding surface that can accommodate variation in the shape of the target DNA. 
Similar wrapping of negatively supercoiled DNA around the cylindrical connector protein has been proposed for initiation of φ29 packaging. This serves two purposes: providing a mechanism for recognizing the general topology of viral genomic DNA compared to host cell DNA (for both monomeric and concatemeric genome replication products); and, in concatemeric DNA, allowing the circular oligomeric TerS to form a stable complex with the continuous strand of nascent genomic DNA and to present a more accessible bent strand as the substrate for the cleavage event required for packaging initiation. Despite the variation in approaches used by different dsDNA bacteriophages to generate new copies of their genomes and to couple packaging to this replication process, specific genome binding and selective packaging generally combine both sequence-specific and conformational recognition modes, along with the avidity afforded by multiple contacts between the protein subunits of the oligomeric terminase assembly and DNA. This commonality exploits the basic topological properties of phage genomic DNA that, along with the most powerful molecular motor known in nature, allow the DNA to be packed tightly into the capsid, creating an efficient method for packaging and transferring these large viral genomes between host cells.

Further Reading Black, L.W., 2015. Old, new, and widely true: The bacteriophage T4 DNA packaging mechanism. Virology 479–480, 650–656. Djacem, K., Tavares, P., Oliveira, L., 2017. Bacteriophage SPP1 pac cleavage: A precise cut without sequence specificity requirement. Journal of Molecular Biology 429 (9), 1381–1395. Feiss, M., Reynolds, M., Schrock, M., Sippy, J., 2010. DNA packaging by λ-like bacteriophages: Mutations broadening the packaging specificity of terminase, the λ-packaging enzyme. Genetics 184 (1), 43–52. Grose, J., Jensen, G., Burnett, S., Breakwell, D.P., 2014. Genomic comparison of 93 Bacillus phages reveals 12 clusters, 14 singletons and remarkable diversity. BioMed Central Genomics 15 (1), 855–875. Jardine, P.J., Anderson, D.L., 2006. DNA packaging in dsDNA phages. In: Calendar, R.L. (Ed.), The Bacteriophages. Oxford University Press, pp. 49–65. Juhala, R.J., Ford, M.E., Duda, R.L., et al., 2000. Genomic sequences of bacteriophages HK97 and HK022: Pervasive genetic mosaicism in the lambdoid bacteriophages. Journal of Molecular Biology 299 (1), 27–51. Kala, S., Cumby, N., Sadowski, P.D., et al., 2014. HNH proteins are a widespread component of phage DNA packaging machines. Proceedings of the National Academy of Sciences of the United States of America 111 (16), 6022–6027. Leavitt, J.C., Gilcrease, E.B., Wilson, K., Casjens, S.R., 2013. Function and horizontal transfer of the small terminase subunit of the tailed bacteriophage Sf6 DNA packaging nanomotor. Virology 440 (2), 117–133. McNulty, R., Lokareddy, R.K., Roy, A., et al., 2015. Architecture of the complex formed by large and small terminase subunits from bacteriophage P22. Journal of Molecular Biology 427 (20), 3285–3299. Oliveira, L., Tavares, P., Alonso, J., 2013. Headful DNA packaging: Bacteriophage SPP1 as a model system. Virus Research 173 (2), 247–259.


Oram, M., Black, L.W., 2011. Mechanisms of genome packaging. In: Agbandje-McKenna, M., McKenna, R. (Eds.), Structural Virology. Royal Society of Chemistry, pp. 205–211. Rao, V., Feiss, M., 2008. The bacteriophage packaging motor. Annual Review of Genetics 42, 647–681. Rohs, R., Jin, X., West, S.M., et al., 2010. Origins of specificity in protein-DNA recognition. Annual Review of Biochemistry 79, 233–269. Salas, M., Redrejo-Rodriguez, M., de Vega, M., 2016. DNA-binding proteins essential for protein-primed bacteriophage φ29 DNA replication. Frontiers in Molecular Biosciences 3, 1–21. Yang, T., Ortiz, D., Yang, Q., et al., 2017. Physical and functional characterization of a viral genome maturation complex. Biophysical Journal 112 (8), 1551–1560.

Relevant Websites x3DNA.org 3DNA Homepage – Nucleic Acid Structures. pdb.org wwPDB: Worldwide Protein Data Bank.


DNA Packaging: The Translocation Motor
Janelle A Hayes and Brian A Kelch, University of Massachusetts Medical School, Worcester, MA, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Arginine finger A basic residue, most commonly an arginine, that works in trans during ATP hydrolysis within an oligomeric complex. In other words, the arginine finger of a subunit within a multimeric ATPase will participate in hydrolyzing the ATP bound in a neighboring subunit’s active site.
ASCE A member of the Additional Strand Conserved Glutamate family of ATPases.
Concatemer A single DNA strand that contains multiple identical copies of a DNA sequence in series.
Power stroke The force-generating step of a motor.
Rossmann fold A highly conserved nucleotide binding motif which is composed of a β-sheet sandwiched between multiple ɑ-helices.

Introduction
All viruses must encapsulate their genome into a capsid shell during the replication cycle. Many viruses use an ATPase motor to actively pump their DNA into the small volume inside the viral particle, strictly confining the genome. Viruses that use this method of maturation typically have large genomes, and the ATPase motor overcomes the entropic and enthalpic penalties for encapsulating this large amount of genetic material. In many of these motor-driven viruses, the genome is packaged to very high density, often approaching a quasi-crystalline state. For example, the ~48 kb genome of phage lambda is ~16 µm end-to-end, but is packaged into a head with a diameter of only ~60 nm. The motor needs to be extremely powerful in order to pump the genome against the internal pressure that builds during packaging, and also highly regulated to ensure complete and efficient encapsulation. Therefore, viral genome packaging motors are a particularly exciting model for understanding motor force-generation and regulation. Genome packaging motors are found in many viruses that infect organisms from all three domains of life. These range from the giant mimiviruses to small viruses such as adeno-associated viruses. For many of these motors, the molecular mechanisms and even the identities of motor components remain unknown. Here, we will focus on the best-understood classes of viral genome packaging motors: the terminases and phi29-type motors. These motors catalyze genome packaging in tailed bacteriophages, as well as herpesviruses. Because phages are the most numerous biological entities, these motors likely perform the bulk of viral genome packaging on Earth. Terminases are also important drug targets for combating herpesvirus infections, further highlighting the importance of these motors. Both terminase and phi29-type motors pump DNA through a dodecameric ring complex that sits at one of the five-fold vertices of the icosahedral shell. 
In terminase motors, this complex is called the ‘portal’, whereas it is called the ‘connector’ in phi29-type motors. Not only does DNA enter through the portal ring during packaging, but it also exits through the same channel. The portal also serves as the binding site for the neck and tail proteins that assemble upon maturation of the virus particle. Numerous crystal structures of portal complexes from diverse phages reveal that these proteins share a common core structure. This indicates that all portals are evolutionarily related, despite low similarity in their primary sequences. The portal ring has a central channel that is just wide enough for double-stranded DNA to pass through. This finding has led to the proposal that the portal strips the incoming genome of any proteins bound to the DNA. Moreover, this tight channel is suggested to play a critical role in the packaging process itself, perhaps by acting as a one-way valve or even as a primary component of the force-generation step. In all examined cases, the dodecameric portal complex creates a symmetry mismatch with the capsid by replacing a pentameric coat protein complex at the five-fold vertex of the icosahedron. Whether this symmetry mismatch plays a functional role in packaging remains unknown, although recent work suggests a possible role in genome packaging termination.
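The phage lambda figures quoted in the Introduction can be checked with a short back-of-the-envelope calculation. This is a rough sketch under stated assumptions: a B-form rise of 0.34 nm per base pair, a bare duplex radius of 1 nm (hydration ignored), and the ~60 nm head diameter treated as the interior diameter of a sphere, which overestimates the space actually available. The function names are illustrative.

```python
import math

BP_RISE_NM = 0.34     # B-form rise per base pair (textbook value)
DNA_RADIUS_NM = 1.0   # assumed bare duplex radius, hydration shell ignored


def contour_length_nm(genome_bp: int) -> float:
    """End-to-end length of the fully extended duplex."""
    return genome_bp * BP_RISE_NM


def dna_volume_nm3(genome_bp: int) -> float:
    """Volume of the duplex modelled as a long cylinder."""
    return math.pi * DNA_RADIUS_NM ** 2 * contour_length_nm(genome_bp)


def sphere_volume_nm3(diameter_nm: float) -> float:
    """Volume of a sphere, used here as a crude stand-in for the capsid interior."""
    return 4.0 / 3.0 * math.pi * (diameter_nm / 2.0) ** 3


if __name__ == "__main__":
    genome_bp, head_diameter_nm = 48_000, 60.0
    length = contour_length_nm(genome_bp)
    fraction = dna_volume_nm3(genome_bp) / sphere_volume_nm3(head_diameter_nm)
    print(f"contour length: {length / 1000:.1f} um")
    print(f"length / head diameter: {length / head_diameter_nm:.0f}x")
    print(f"volume fraction occupied by DNA: {fraction:.2f}")
```

Under these assumptions the extended genome is roughly 270 times longer than the head is wide, and bare DNA alone would fill a little under half of the sphere, which gives a sense of why the packaged state is described as quasi-crystalline.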

Terminase and Phi29 Motors are Branches Within the ASCE Family of ATPases
Phi29 motors primarily consist of the connector assembly, ATPase motor, and a specialized prohead RNA molecule, known as ‘pRNA’ (Fig. 1). During packaging, the connector protein binds to the ATPase motor through interactions with the pRNA, and the motor translocates the DNA through the connector into the capsid. In terminase motors, the portal assembly binds to the ATPase subunit (the large terminase, or TerL) directly without pRNA. Terminase motors have an additional endonuclease domain, which is used to initiate and terminate packaging (discussed below). Despite the differences in motor assembly, both phi29 and terminase motors rely on a similar catalytic core to power DNA translocation: the ASCE ATPase fold. Viral genome packaging motors are derived from the ASCE (additional strand conserved glutamate) division of ATPases. The ASCE division contains numerous superfamilies of ATPases, including widely studied AAA+ ATPases, ABC transporters, RecA, and


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20967-9


Fig. 1 CryoEM structure of the phi29 motor. Densities from difference maps isolating the connector (green), pRNA (magenta) and ATPase (blue) were combined to visualize the DNA packaging motor in phi29. (A) Close-up view of the phi29 motor. (B) The motor is shown in the context of the entire prohead. The front half of the prohead density has been removed so all the motor components can be seen. This figure has been adapted with permission from Morais, M.C., Koti, J.S., Bowman, V.D., et al., 2008. Defining molecular and domain boundaries in the bacteriophage φ29 DNA packaging motor. Structure 16 (8), 1267–1274.

SF1/SF2 helicases (Fig. 2). ASCE ATPases are ancient machines that function across all cellular life, as well as many viruses. Although their molecular mechanisms and roles are remarkably diverse, ASCE ATPases usually function as multimers that convert the chemical energy of ATP into mechanical work. ASCE ATPases also generally use a similar ATP hydrolysis mechanism. Thus, studies of the packaging motor structure and mechanism have broad relevance towards understanding the general principles of molecular machines. The ASCE nucleotide binding motif uses a Rossmann ‘βɑβ’ fold, which is composed of a β-sheet sandwiched between multiple ɑ-helices. Different clades within the ASCE tree may add variable elements to this base structure; however, four features are common to all ASCE ATPases: the Walker A and B motifs, a catalytic base that is usually a glutamate residue, and a trans-acting motif (Fig. 3). The Walker A motif (also known as the P-loop; consensus sequence [G/A]xxxxGK[T/S]) is a nucleotide binding motif that forms hydrogen bonds with the ATP phosphates in the active site. The threonine/serine residue in the Walker A motif is critical, as it coordinates a metal ion, often a magnesium, for ATP hydrolysis. Downstream in the protein sequence, the Walker B motif (consensus sequence φφφφ[D/E], where φ is a hydrophobic residue) further assists in binding this metal, which coordinates the β and γ phosphates of ATP. Immediately downstream of the Walker B motif, a glutamate residue acts as an essential catalytic base, activating water for nucleophilic attack on the γ-phosphate of ATP. (Often this ‘catalytic glutamate’ is included in the Walker B motif, which yields a consensus sequence φφφφ[D/E]E.) The fourth feature, a positively charged trans-acting element from a neighboring subunit (known as the arginine finger), interacts with the γ phosphate of ATP to stabilize the hydrolysis transition state. 
This trans-acting element also links nucleotide sensing and ATP hydrolysis between subunits, a necessity for regulating activity within oligomeric ATPases. Terminase and phi29-like motors have numerous features that are similar, suggesting a relatively close phylogenetic relationship. Previous phylogenetic analysis classified the terminases and the phi29-type motors as disparate families within the ASCE division of ATPases. The phi29-type motors were placed within the HerA/FtsK family of dsDNA translocases and the terminases as an offshoot of RecA ATPases. This classification was based purely on sequence analysis and secondary structure prediction. However, recent structural and mechanistic discoveries led to the conclusion that terminase and phi29-type motors are much more closely related and are two distinct clades within the same family (Fig. 2). The ATPase subunits of phi29 and terminase motors have a core Rossmann fold with additional elements appended. For example, there is an additional ɑ-helix and three antiparallel β-strands inserted between strand 2 and helix B of the core Rossmann fold, differentiating them from other ASCE ATPases. An N-terminal helix has also been added, where it packs atop helices A, B, and C of the core Rossmann fold. The phi29 ATPase domain also contains a short antiparallel β-strand inserted between helix D and β-strand 5. Additionally, terminase motors have a C-terminal nuclease domain, while phi29 motor proteins lack nuclease activity. This is not surprising, because the phi29 motor does not cleave DNA during packaging (see below for more details). Finally, the terminase motor has a three-helix bundle subdomain in between the core ATPase domain and the nuclease domain. We refer to this subdomain as the ‘lid’ because it caps the ATPase active site (others have referred to this substructure as a ‘linker’ domain or ATPase subdomain II). 
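The Walker A and Walker B consensus sequences given above translate directly into regular expressions, which is how such motifs are often located in a first pass over a protein sequence. The sketch below is a minimal illustration under stated assumptions: the residue set used for the hydrophobic positions (φ) is an assumed choice, the function name `find_walker_motifs` is hypothetical, and the test sequence is synthetic, invented for this example rather than taken from any real terminase or phi29 ATPase.

```python
import re

# Walker A consensus from the text: [G/A]xxxxGK[T/S]
WALKER_A = re.compile(r"[GA].{4}GK[TS]")
# Walker B consensus from the text: four hydrophobic residues, then D or E,
# optionally followed by the catalytic glutamate. The hydrophobic residue
# set is an assumption made for this sketch.
WALKER_B = re.compile(r"[AVILMFWY]{4}[DE](E?)")


def find_walker_motifs(protein: str) -> dict:
    """Locate the first Walker A match and the first Walker B match downstream of it."""
    protein = protein.upper()
    result = {"walker_a": None, "walker_b": None, "catalytic_glu": False}
    a = WALKER_A.search(protein)
    if a is None:
        return result
    result["walker_a"] = (a.start(), a.group())
    # Walker B lies downstream of Walker A, so search from the end of that match.
    b = WALKER_B.search(protein, a.end())
    if b is not None:
        result["walker_b"] = (b.start(), b.group())
        result["catalytic_glu"] = b.group(1) == "E"
    return result


# Synthetic toy sequence, invented for illustration; not a real protein.
TOY_SEQUENCE = "MKRLAGPSGSGKSTLAEQWRNPYTKHGSLVIVDEAHNMS"

if __name__ == "__main__":
    print(find_walker_motifs(TOY_SEQUENCE))
```

Positions are reported as 0-based offsets. A pattern match of this kind only flags candidates; short motifs occur by chance, so structural or biochemical evidence is still needed to confirm a functional ATPase site.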
The three-helix bundle lid subdomain is distinguished from the four-helix bundle lid subdomains of AAA+ and STAND ATPases. It is unclear whether phi29 has a similar lid subdomain, as the protein used for crystallization only contained the core Rossmann fold of the ATPase. In addition to similar ATPase domain structure, both terminase and phi29 proteins form oligomers when bound to the portal/connector. In the absence of the portal/connector proteins, the subunits of terminase and phi29 motors are generally monomeric. CryoEM and biochemical studies of the terminase motor show a pentameric assembly of subunits. However, the oligomeric state of the phi29 motor remains contentious, as biochemical studies suggest a hexameric assembly. Assembly of the oligomeric motor onto the portal/connector facilitates ATPase and packaging activity. Both terminase and phi29 motors use a trans-activated ATPase mechanism through an arginine finger motif, similar to most other ASCE ATPases. The


Fig. 2 Phylogenetic tree of ASCE ATPases. The core Rossmann fold is shown in green, while branch-specific structural features are shown in various colors. The terminases and phi29-type ATPase families likely evolved from AAA+ ATPases.

Fig. 3 The terminase ATPase active site. The P74-26 bacteriophage ATPase domain is shown. ADP-BeF3, a non-hydrolyzable ATP analog, is bound in the active site. The Walker A motif hydrogen bonds with the ATP phosphates, and the conserved serine residue coordinates a magnesium ion (purple sphere). The Walker B motif further assists in binding the magnesium, with the catalytic glutamate coordinating a water (not shown) for nucleophilic attack.


arginine finger is positioned within the interface between two subunits, directly contacting the ATP γ-phosphate to stabilize the developing negative charge during the ATP hydrolysis transition state. In phi29, the arginine finger also causes ejection of ADP from the post-hydrolysis state. Thus, the arginine finger residue not only plays a catalytic role, but also ‘communicates’ nucleotide status between subunits within the ring. Because the arginine finger is central to motor function, its position on the protein surface dictates the relative orientation of neighboring subunits within the ring. In terminase and phi29 motors, the arginine finger is located in different regions of the protein, suggesting that the overall orientation of ATPase subunits may differ between the two classes. Phi29 and terminase motors are further distinguished by the accessory subunits required for efficient packaging. Terminase motors use the small terminase protein (TerS), which specifically recognizes the viral genome and modulates large terminase ATPase and nuclease activity. Phi29 lacks a clear homolog of TerS, and instead uses the gp3 protein, which is covalently bound to the 5′ ends of the viral genome. The gp3-DNA complex is suspected to be the recognition element for packaging initiation. Additionally, phi29 uses the 174-nucleotide pRNA for binding the motor to the connector. The pRNA forms a pentameric assembly on the connector, binding to the phi29 motor through a spoke-like RNA secondary structure. Terminase motors do not require a pRNA to bind the portal, and instead bind through direct protein-protein interactions.

Phi29 Motors
The viral genome packaging machinery in phage phi29 is arguably the best understood system. The phi29 packaging motor was discovered in the 1970s and an in vitro packaging system was established in 1986. The ability to reconstitute genome packaging from purified components opened the door for years of incisive study. Furthermore, the phi29 motor chemomechanical cycle has been elucidated in great detail using an elegant single-molecule optical trapping assay for motor activity. Early optical trap experiments established phi29 as an exceptionally powerful motor, working against forces averaging 57 pN, a force close to that at which dsDNA fundamentally alters its conformation through over-stretching. Subsequent experiments identified individual subunit stepping, which provided remarkable detail on the mechanisms of force generation. Finally, recent structural analysis has visualized a functioning holoenzyme at low resolution and various individual components at high resolution. Phi29 genome replication and packaging begins with the gp3 protein. At the beginning of viral genome replication, gp3 covalently binds to the 3′ end of the genome, where it acts as a primer for replicating the 5′ strand. The gp3 priming domain dimensions and charge mimic DNA, allowing the polymerase to initiate replication by covalently linking a nucleotide to the hydroxyl of a serine within gp3. After DNA replication is complete, both ends of the DNA, known as left and right, remain bound to gp3. DNA-gp3 forms lariat loops, with the left end junction preferentially interacting with the packaging motor bound to the prohead. Binding to the motor causes DNA to supercoil, with some evidence suggesting that the supercoil wraps around the connector protein outside of the capsid. 
In an unknown mechanism, the motor begins packaging the gp3-DNA, possibly through packaging the loop in its entirety, or through gyrase activity that has been speculated but never directly observed, allowing the loop to resolve. The translocation of the genome into the empty phi29 capsid occurs in a complicated, ATP-dependent mechanism. Each translocation cycle can be separated into two phases: a ‘burst’ phase in which DNA physically pushes into the capsid, and a ‘dwell’ phase in which the motor resets (Fig. 4). At the beginning of the ATP hydrolysis cycle, each subunit of the phi29 pentameric motor binds ATP. Once the ring is fully loaded with ATP, ATP hydrolysis is stimulated in a sequential order around the ring. During the burst phase, DNA is translocated by a total of 10 bp. By slowing the rate of translocation, Bustamante and colleagues showed that the 10 bp burst actually consists of four discrete 2.5 bp substeps. Thus, the pentameric phi29 motor uses 4 translocation events per cycle, which suggests that not every subunit directly translocates DNA per cycle. One of the five motor subunits hydrolyzes ATP, but does not translocate DNA, and is designated the ‘special’ subunit. The exact role of the special subunit is still unclear; however, it is critical for regulating the motor during the dwell phase, which occurs in between each of the 10 bp translocation increments. It is thought that the special subunit grips DNA during the dwell phase, and releases ADP from its active site, allowing ATP to bind. The arginine finger of the special subunit and each subsequent subunit triggers ADP exchange for ATP sequentially within the ring, resulting in a fully ATP-loaded motor. The DNA-bound special subunit then initiates ATP hydrolysis, breaking its contact with DNA, and triggering the burst translocation phase. During the four subsequent ATP hydrolysis events, ATP is hydrolyzed and Pi released. 
Pi release induces the conformational change that drives the translocation step, packaging 2.5 bp of DNA before handing it to a neighboring subunit. Upon the power stroke conformational change, the arginine finger of the currently active subunit moves into the neighboring subunit’s active site, priming the next subunit for an ATP hydrolysis event. In this way, hydrolysis in the next subunit is triggered, and the DNA is processively packaged. During the various packaging steps, DNA needs to undergo multiple ‘grip-and-release’ cycles. DNA is held tightly by each subunit during the translocation substeps as well as during the dwell; however, each subunit must relinquish its hold on DNA to pass it off easily to a neighboring subunit within the ring. DNA is gripped mostly through the 5′-3′ strand, with less contact made with the 3′-5′ strand and minor contacts with sugars and nucleotide bases. Upon passing to the next subunit during translocation, the DNA undergoes rotation. As sequential hydrolysis and DNA translocation occur during the packaging reaction, translocation step size and DNA rotation per cycle start to vary. As more DNA is encapsulated, the step decreases from 2.5 to 2.3 bp, and the rotation of DNA increases from −14° to −48° per increment. This is in accord with increased packaging density within the capsid and subsequent elevated backpressure, which the motor must work harder against in order to continue the


DNA Packaging: The Translocation Motor

Fig. 4 The Phi29 chemomechanical ATPase cycle. At the end of the burst, all subunits are ADP-bound (“D” label). At the beginning of the dwell, the motor makes an electrostatic contact with two backbone phosphates (small red circles) on the dsDNA substrate (inside the ATPase ring). This unique contact determines the identity of the special subunit (yellow label “s”). The formation of the electrostatic contact facilitates ADP release by the special subunit. Subsequent ATP (“T” label) binding and ADP release events are interlaced, with ATP binding to one subunit enabling ADP release from its neighbor. After all five subunits have bound ATP, the special subunit hydrolyzes ATP (“D•Pi” label), releases Pi, and uses the hydrolysis free-energy to break the electrostatic contact with DNA, triggering the burst phase. During the burst, the remaining four ATP-bound subunits sequentially hydrolyze ATP, release Pi, and translocate DNA by 2.5 bp. The motor-DNA geometry (10.0 bp burst size versus 10.4 bp dsDNA helical pitch) favors a mechanism in which the same subunit is special in consecutive cycles. This figure is reprinted with permission from Chistol, G., Liu, S., Hetherington, C.L., et al., 2012. High degree of coordination and division of labor among subunits in a homomeric ring ATPase. Cell 151 (5), 1017–1028.

packaging reaction. Recent work has shown that this backpressure can be relieved by a temporary pause in packaging, which allows the DNA to relax in the capsid. Because this relaxation time is far longer than the packaging time, this result suggests that the fast speed of packaging causes DNA to adopt a metastable conformation. These findings illustrate that motor speed has an important role in determining capsid internal pressure and genome dynamics. The structure of the phi29 motor is beginning to come into focus. The connector assembly structure was first determined to high resolution by x-ray crystallography, along with a low-resolution cryoEM structure of the motor actively packaging. Subsequent cryoEM studies show that the portal undergoes large conformational changes in structure upon packaging and suggest a pentameric arrangement of the pRNA and ATPase components of the motor. The pentameric arrangement has been challenged, with the more commonly seen hexameric ATPase arrangement proposed. However, high-resolution asymmetric reconstructions of the phi29 motor structure exhibit a clear five-fold or pseudo-five-fold arrangement of subunits. Over the past decade, high-resolution structures of most of the motor components have been determined: a majority of the pRNA structure was determined by crystallography and NMR, as well as the ATPase domain of the gp16 motor protein. By docking the high-resolution structures into the cryoEM density, we are nearing a complete picture of the phi29 motor. High-resolution structures of the motor in action are still needed to understand the motor’s mechanism in detail. However, the rapidly developing field of cryoEM is nearly certain to profoundly impact our understanding of the phi29 motor in the future.
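The burst-dwell bookkeeping described above for phi29 can be summarized in a small toy model. The sketch below is only an accounting aid, not a kinetic simulation. The function names `run_cycle` and `rotation_mismatch_deg` are hypothetical, the subunit indices are arbitrary labels, and the 10.4 bp helical pitch is the value quoted in the Fig. 4 caption.

```python
# Toy bookkeeping model of one phi29 dwell + burst cycle (illustrative only).
SUBUNITS = 5              # pentameric ATPase ring
SUBSTEP_BP = 2.5          # bp moved by each translocating subunit
HELICAL_PITCH_BP = 10.4   # bp per helical turn of dsDNA (Fig. 4 caption)


def run_cycle(special=0):
    """Return (events, bp_translocated) for a single dwell followed by a burst."""
    order = [(special + i) % SUBUNITS for i in range(SUBUNITS)]
    events = []
    # Dwell: ADP is exchanged for ATP sequentially, starting at the special subunit.
    for subunit in order:
        events.append(("dwell", subunit, "ADP released, ATP bound"))
    # The special subunit hydrolyzes ATP first and lets go of DNA; it moves no DNA.
    events.append(("trigger", special, "ATP hydrolyzed, DNA contact broken"))
    # Burst: the other four subunits each hydrolyze ATP and move DNA by one substep.
    moved = 0.0
    for subunit in order[1:]:
        moved += SUBSTEP_BP
        events.append(("burst", subunit, f"ATP hydrolyzed, +{SUBSTEP_BP} bp"))
    return events, moved


def rotation_mismatch_deg(burst_bp, pitch_bp=HELICAL_PITCH_BP):
    """Helical phase left over after one burst, relative to a full 360 degree turn."""
    return 360.0 * (burst_bp / pitch_bp - 1.0)


if __name__ == "__main__":
    events, moved = run_cycle(special=0)
    for phase, subunit, action in events:
        print(f"{phase:8s} subunit {subunit}: {action}")
    print(f"DNA moved per cycle: {moved} bp")
    print(f"Phase mismatch per cycle: {rotation_mismatch_deg(moved):.1f} degrees")
```

With a 10 bp burst and a 10.4 bp pitch, the phase mismatch comes out close to −14°, similar in size to the DNA rotation per cycle quoted for low capsid filling. The sketch does not attempt to reproduce the larger rotation reported at high filling.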

Terminase Motors In comparison to phi29 motors, less is known about the terminase motor mechanism. Because terminases have been studied from multiple phages, here we synthesize the results from many model systems, such as phages T4, lambda, SPP1, P22, Sf6, and P74–26. Terminases initiate genome packaging using the small terminase protein. The small terminase binds to the viral genome and modifies the enzymatic activities of the large terminase protein. In some cases, such as found in phage lambda, host factors augment the small terminase’s role as a regulator of large terminase function. Additionally, terminase activity is modulated by the procapsid, which is required for terminase translocation. Other factors have been proposed to regulate terminase functions, such as the gpFI protein of phage lambda. However, more recent evidence suggests that regulation by gpFI is either unnecessary for phage replication or does not occur in vitro. Terminases are also controlled at the level of protein translation, as the terminase protein is often expressed late in the infection cycle. Finally, large terminase genes often contain self-splicing intein sequences in their ATPase active sites that are hypothesized to modulate terminase activity. Terminase motors translocate DNA at a faster rate than their phi29 counterpart, with the T4 motor taking 5–10 min to package its 171 kb genome. While terminases are faster translocases, they exhibit more variable speed. In vitro, the T4 motor translocates DNA at an average rate of 700 bp/s, peaking at 2000 bp/s with an ATP turnover rate of 300/s. The terminase of phage lambda peaks at ~600 bp/s at low capsid density, but this rate drops to ~200 bp/s towards the end of packaging. Meanwhile, phi29 packaging peaks at around 165 bp/s in vitro, making it significantly slower. Therefore, the motor speed seems to be correlated with


Fig. 5 Proposed ‘inchworm’ model for the mechanism of DNA translocation by the large terminase. Panels (A)–(D) relate to the sequence of events that occur in a single large terminase molecule. The large terminase N-terminal subdomain I, subdomain II, and C-terminal domain are represented as green, yellow, and cyan ovals, respectively. The five-pointed stars show the charge interactions between the N-terminal subdomain I and the C-terminal domain. The four-pointed stars show the charge interaction between the N-terminal subdomain II and the C-terminal domain. The flexible linker between N- and C-terminal domain is represented by a wiggly cyan line. (A) The large terminase C-terminal domain is ready to bind DNA. (B) The C-terminal domain, when bound to the DNA, brings the DNA closer to the N-terminal domain of the same subunit. Conformational change in the N-terminal domain causes Arg162 to be placed into the ATPase active center in preparation for hydrolysis. (C) Hydrolysis of ATP has rotated the N-terminal subdomain II by about 6°, thereby aligning the charge pairs resulting in an electrostatic attraction that moves the C-terminal domain and the DNA 6.8 Å (equivalent to the distance between two base pairs) closer to the N-terminal domain and into the capsid. (D) ADP and Pi are released and the C-terminal domain returns to its original position. DNA is released and is aligned to bind the C-terminal domain of the neighboring subunit. This figure is adapted with permission from Sun, S., Kondabagil, K., Draper, B., et al., 2008. The structure of the phage T4 DNA packaging motor suggests a mechanism dependent on electrostatic forces. Cell 135 (7), 1251–1262.

genome size: faster motors are found in phages with larger genomes such that the total packaging time is similar across multiple phages. In comparison to other ASCE ATPases, the T4 large terminase moves along DNA considerably quicker than the fastest known helicase, RecBCD, which unwinds DNA at a rate of ~1000 bp/s. However, it is slower than the FtsK and SpoIIIE dsDNA translocases, which move at a rate of 5–17 kb/s and ~4 kb/s respectively. The speed of the motor coupled with increased forces inside the capsid during packaging requires an incredibly powerful ATPase to complete the packaging reaction. It is estimated that the power density of the T4 motor is approximately 5000 kW/m³, which is twice that of a typical automobile engine. How terminases generate this tremendous power is one of the central questions that has been debated within the field for decades, and several models for terminase mechanism have been proposed to explain this phenomenon (see below). Terminases are generally composed of an N-terminal ATPase domain and a C-terminal nuclease domain. The two domains are connected by a short linker, allowing for some degree of flexibility between the two halves of the protein. In addition to the ASCE ATPase family features listed in the above section, the large terminase ATPase domain contains a subdomain known as the ‘lid subdomain’ immediately upstream of the linker. The lid subdomain forms the upper portion of the nucleotide binding pocket, and is essential for ATPase activity (Fig. 3). In related AAA+ proteins, the lid plays an important role in propagating nucleotide-dependent conformational changes to adjacent subunits within the ring. Although the mechanism of translocation in terminases remains unknown, many models have been proposed. Some of these models have been proven incorrect.
For example, an early hypothesis that the portal ring rotates within the capsid to drive translocation has been disproven by observations that the portal does not rotate during packaging or in the mature virion. Here we examine several of the current leading models in detail.
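The claim that faster motors pair with larger genomes, so that total packaging time stays similar, can be checked with a rough constant-rate estimate. The sketch below divides genome size by translocation rate. The rates and the 171 kb T4 genome come from the text, whereas the lambda (~48.5 kb) and phi29 (~19.3 kb) genome sizes are assumed round values that the text does not give. Because real motors slow down as the capsid fills, these figures are lower bounds.

```python
# Constant-rate packaging-time estimates (lower bounds; motors slow as the capsid fills).
# Rates are quoted in the text; lambda and phi29 genome sizes are ASSUMED round values.
CASES = [
    # (label, genome size in bp, translocation rate in bp/s)
    ("T4, average rate", 171_000, 700),
    ("T4, peak rate", 171_000, 2000),
    ("lambda, peak rate", 48_500, 600),   # genome size is an assumption
    ("phi29, peak rate", 19_300, 165),    # genome size is an assumption
]


def packaging_time_s(genome_bp, rate_bp_per_s):
    """Time in seconds to translocate genome_bp at a constant rate."""
    if rate_bp_per_s <= 0:
        raise ValueError("rate must be positive")
    return genome_bp / rate_bp_per_s


if __name__ == "__main__":
    for label, size, rate in CASES:
        t = packaging_time_s(size, rate)
        print(f"{label:20s} {t:7.1f} s  ({t / 60:.1f} min)")
```

Under these assumptions the T4 estimate at its average rate is about four minutes, near the lower end of the 5–10 min quoted above, and the peak-rate estimates for all three phages fall between roughly one and two minutes despite a nearly ninefold spread in genome size.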

The “Inchworm” Translocation Model One of the primary models for DNA translocation by terminases is an “inchworm” model in which DNA is pulled into the capsid via a spring-like motion (Fig. 5). This model was originally derived from biochemical and structural analysis of the T4 phage large terminase protein, including high-resolution crystal structures of the T4-TerL protein, and a low-resolution cryoEM structure of the T4 procapsid bound to TerL. A modified version of the inchworm hypothesis was proposed later from high-resolution crystal structures of Sf6 phage large terminase. In this model, the ATPase domains bind to the capsid and portal while the nuclease domains grip DNA in the center of the pore. In the inchworm mechanism, a key catalytic residue (Arg162) is repositioned into the active conformation from both DNA binding to the nuclease domain and ATP binding to the active site in the core ATPase domain (Fig. 5(A) and (B)). This induces ATP hydrolysis, which in turn is predicted to rotate the lid (also known as subdomain II) by ~6° and move the short flexible linker region between the ATPase and nuclease domains by about 3 Å (Fig. 5(C)). This conformational change is proposed to align a set of ion pairs between the ATPase and the nuclease domains, resulting in a stronger electrostatic attractive force between the two domains. The electrostatic interactions pull the domains towards each other, and this ‘relaxed’ to ‘tensed’ conformational change translocates 2 base pairs of DNA upward into the capsid. During the reset phase of translocation, ADP and/or Pi release from the active site reorganizes the terminase subunit back into the relaxed state via the loss of negative charges from ADP and Pi release (Fig. 5(D)). ATPase subdomain II rotates back, attenuating the electrostatic force between the


Fig. 6 Proposed ‘lever’ model for the mechanism of DNA translocation by the large terminase. (A) Large terminase ring is shown. The nuclease domains interact with the portal complex (translucent gray rectangle) and possibly the procapsid (black curve). DNA is not shown but interacts with the large terminase through the DNA interaction motif. Each subunit’s lid is bound tightly to the Rossmann fold of the adjacent subunit. (B) Upon ATP hydrolysis and release by the magenta subunit, the lid stays bound to the blue subunit and the Rossmann fold rotates 13° upward. To allow for this movement, the adjacent red subunit must also move in concert with the magenta subunit. To represent that the second site of symmetry breaking is unknown, the other three ATPase domains are faded. After hydrolyzing ATP, the magenta subunit releases DNA to the red subunit to translocate DNA upward through the pore; into the pore of the portal complex; and, ultimately, inside the procapsid. The release of DNA at each cycle by the ATP-hydrolyzing subunit allows for unidirectional DNA translocation. This figure is reprinted with permission from Hilbert, B.J., Hayes, J.A., Stone, N.P., et al., 2015. Structure and mechanism of the ATPase that powers viral genome packaging. Proceedings of the National Academy of Sciences of the United States of America 112 (29), E3792–E3799.

ATPase and nuclease domains. The nuclease domain releases its grip on DNA, which is presumably passed to a neighboring subunit preceding or during the release of ADP and/or Pi. While this model paints a comprehensive picture of the terminase packaging reaction, the proposed mechanism does not explain several observations. First, previous studies found the motor binds the portal in the reverse orientation, with the nuclease domains contacting the portal rather than the ATPase domains. Second, with the ATPase domains splayed radially from each other and bound to portal, the trans-activation mechanism required for ringed ASCE ATPase activity cannot occur. Third, the regions predicted to mediate interactions with DNA show particularly low conservation and are only found in T4-like phage. Finally, the observed conformational changes do not fully explain DNA translocation. The small changes in conformation do not match the ~2 bp step size that has been predicted for terminases. No significant conformational changes were observed in the structures of T4-TerL. This may be due to crystal contacts within the active site or the mutation of the Walker B motif in the crystallization constructs. Moreover, there is a discrepancy in the Sf6 structures wherein the ATPγS structure, which should be locked in an ‘ATP-like state’, is identical to the ADP-bound and apo states. This may be due to the fact that the Sf6 TerL was crystallized in the apo state, and then nucleotide ligands were soaked into the pre-assembled crystals. In many cases, the crystal lattice in the apo state does not allow for a ligand-induced conformational change in soaked crystals.

The “Lever” Translocation Model Another model for viral motor DNA translocation is the lever model (Fig. 6). In this model, a lever-like conformational change in the ATPase domain driven by ATP hydrolysis forms the power stroke of DNA translocation. In contrast to the aforementioned “inchworm” model, the “lever” model positions the ATPase domains as the central hub of the ring, while the nuclease domains protrude radially and attach to portal and capsid. Evidence for this assembly model comes from several crystal structures, molecular docking, and various biochemical analyses that identified the residues critical for motor function.


The identification of a trans-acting arginine finger was a critical step in developing the lever model. Identifying this residue led to reinterpretation of the “inchworm” structural model in which the ATPase domains do not contact neighboring subunits. The “lever” model places the ATPase domains into a ring to facilitate trans-activated ATP hydrolysis, placing the nuclease domains in contact with portal. Further, unbiased pentamer docking of a monomeric crystal structure of the P74–26 ATPase domain positions the predicted arginine fingers precisely for trans-activated ATP hydrolysis, providing a second layer of evidence for the inverted structural model. Interestingly, the pentameric model also fits well into the T4 packaging motor electron density map, indicating its potential biological relevance. Furthermore, the pore of the pentameric docked model is lined with several conserved basic residues, indicating a role in DNA binding. Mutating these residues abrogates TerL binding to DNA. In contrast, mutation of residues in the nuclease domain has no substantial effect on DNA binding affinity. In fact, the TerL ATPase domain can bind DNA with nearly the same affinity as full-length TerL. Assimilating these findings with the structural model yields a mechanistic model where the ATPase domains grip and translocate DNA in the central pore of the ring, while the nuclease domains bind to the portal and do not interact with the DNA during packaging (Fig. 6(A)). The lever model predicts that DNA is translocated by a lever-like motion of the ATPase domain while it grips DNA. These motions were identified by comparing crystal structures of P74–26 phage TerL in the apo state and bound to a non-hydrolyzable ATP analog, which show significant conformational differences. Comparing the apo and ATP analog bound states, the lid subdomain undergoes a 13° rigid-body rotation.
Because modeling and mutational analysis suggest that the lid subdomain forms the primary interaction surface between adjacent ATPase domains, this rotation has a substantial effect on the conformation of the ring. With the lid subdomain stabilized through neighboring contacts, the 13° rotation is transmitted to the Rossmann fold, which is the large domain of the ATPase module. This results in a lever-like movement of the ATPase domain (Fig. 6(B)). This movement is calculated to shift the DNA binding region 8 Å upward towards the capsid and rotate DNA an estimated 2.3°. This 8 Å shift is very similar in size to the 2.5 bp DNA translocation step measured in phi29 motors. It is predicted that upon either Pi or ADP release, the gripping subunit loses affinity for DNA and passes it off to the adjacent subunit. The adjacent subunit will now grip DNA tightly because it is in an ATP-bound state, which has been shown to have the highest affinity for DNA. This handoff would then allow the subunit that hydrolyzed ATP to release ADP and Pi so that it can bind ATP, thereby resetting for the next cycle. Because two subunits act in concert, the DNA can be translocated efficiently. Moreover, the motor produces very high force because the lever arm (the entire Rossmann fold) is quite long.
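The step sizes proposed by the inchworm and lever models are quoted in ångströms, while the phi29 substep is quoted in base pairs. A minimal conversion sketch follows, assuming the standard B-form rise of 3.4 Å per base pair. That value is a textbook figure rather than one stated in this section, and the helper name `angstrom_to_bp` is hypothetical.

```python
# Convert axial DNA displacements (in angstroms) to base pairs of B-form DNA.
B_FORM_RISE_A = 3.4  # angstroms per bp; textbook value, assumed here

# Displacements quoted for the two terminase models.
MODEL_DISPLACEMENTS_A = {
    "inchworm (Fig. 5 caption)": 6.8,
    "lever (P74-26 TerL)": 8.0,
}
PHI29_SUBSTEP_BP = 2.5  # measured phi29 substep, for comparison


def angstrom_to_bp(displacement_a, rise_a=B_FORM_RISE_A):
    """Number of base pairs spanned by an axial displacement."""
    return displacement_a / rise_a


if __name__ == "__main__":
    for model, d in MODEL_DISPLACEMENTS_A.items():
        print(f"{model:28s} {d:4.1f} A = {angstrom_to_bp(d):.2f} bp")
    print(f"{'phi29 substep':28s} {PHI29_SUBSTEP_BP:.2f} bp")
```

On this assumption the inchworm displacement corresponds to 2 bp and the lever displacement to a little under 2.4 bp, which is why the latter is described as close to the phi29 substep.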

The DNA “Crunching” or “Compression” Translocation Model A third DNA packaging model is the DNA “crunching” or “compression” model (Fig. 7). In this mechanistic model, the motor uses ‘torsion’ caused by DNA compression inside the terminase assembly to propel DNA into the capsid. In other words, the DNA itself plays an active role in the force generation process. Specifically, with the motor attached to portal and DNA bound in the center, both portal and the terminase subunits grip the DNA and hydrolyze ATP (Fig. 7(A)). ATP catalysis leads to a brief DNA compression which disrupts the B-form DNA structure, transiently adopting A-form DNA in what is referred to as a “crunched” state (Fig. 7(B)). The return of this compressed state into the B-form state is assumed to cause the DNA translocation power stroke during the packaging cycle (Fig. 7(C)). Evidence for this mechanism lies in several observations. A FRET experiment using a double-labeled fluorescent Y-DNA designed to stall the T4 packaging motor estimated 22%–24% DNA compression within the terminase motor when packaging was stalled. Additionally, subsequent packaging experiments using an intercalating dye, YOYO-1, showed dye release from the DNA during the packaging reaction. This observation is attributed to DNA compression inside the motor, as covalently attached DNA-binding labels are more readily packaged. Observations that the motor does not readily package RNA-DNA hybrids also suggest this mechanism, as RNA-DNA hybrid complexes assume a mostly A-form structure and therefore are not compressible, preventing the scrunched to relaxed transition from driving the translocation power stroke. Two slightly different variants of the scrunching model have been proposed, one hypothesizing that the scrunching occurs within the terminase central pore and the other that the compression occurs within the portal channel.
We note that the compression model is not mutually exclusive with either of the two other models we discuss here; it is possible that force generation is through a combination of the compression model and either the inchworm or lever models.
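The 22%–24% compression estimated by FRET can be compared with the shortening expected if B-form DNA transiently adopts an A-form-like rise. The sketch below is a consistency check only. The rise values are textbook approximations that do not appear in this section, reported A-form values vary (roughly 2.3 to 2.6 Å per base pair), and the function name `fractional_compression` is hypothetical.

```python
# Fractional shortening of a DNA segment when its rise per bp changes.
B_FORM_RISE_A = 3.4            # angstroms per bp; textbook approximation (assumed)
A_FORM_RISES_A = (2.6, 2.3)    # commonly quoted A-form values (assumed, they vary)
FRET_RANGE = (0.22, 0.24)      # compression estimated in the stalled T4 motor


def fractional_compression(rise_from_a, rise_to_a):
    """Fraction by which contour length shrinks going from one rise to another."""
    if rise_from_a <= 0:
        raise ValueError("rise must be positive")
    return 1.0 - rise_to_a / rise_from_a


if __name__ == "__main__":
    for a_rise in A_FORM_RISES_A:
        c = fractional_compression(B_FORM_RISE_A, a_rise)
        inside = FRET_RANGE[0] <= c <= FRET_RANGE[1]
        print(f"A-form rise {a_rise} A: {100 * c:.1f}% shorter, "
              f"within FRET range: {inside}")
```

With a 2.6 Å rise the predicted shortening falls inside the FRET range, whereas a 2.3 Å rise overshoots it, so the comparison is suggestive rather than conclusive.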

Termination of Packaging Phi29 and related bacteriophages replicate their genome as monomers rather than concatemers as most phages do. Thus, packaging is terminated when the entirety of the DNA strand is packaged and no DNA cleavage is required. A 16 Å resolution cryoEM structure of a mature phi29 viral particle shows the DNA packaged inside the prohead with density in the channel of the connector and within the tailtube, both of which were assigned to the right-end gp3. A second cryoEM structure of fiberless phi29 particles at 7.8 Å resolution confirmed the presence of gp3 within the tailtube, indicating that the right-end gp3 does not get fully packaged, while the connector-tailtube channel contains DNA bent in a toroid-like structure. This novel 60 Å-in-diameter highly-bent DNA structure appears to be the result of DNA under extreme pressure. While the function of the toroid structure is unknown, the authors speculate it plays a role in holding the DNA inside the tail during infection. Further studies are necessary to explore this idea.


Fig. 7 Phage DNA packaging via a torsional compression mechanism. (A) A fully assembled packaging complex is shown, with the empty prohead (red) carrying the portal ring (blue) shown at one vertex, and the DNA duplex drawn in black. Only two subunits of the large terminase (orange) are shown for clarity. The terminase subunits are sketched with a minor lobe, representing a flexible region that undergoes a conformational change (black dashed arrow) coupled with ATP binding and hydrolysis (white solid arrow). The exact steps and temporal order by which the binding and hydrolysis of ATP are coupled to movement of the large terminase during the reaction cycle are not specified. (B) Directed linear motion of the flexible arm of the large terminase subunit engages the DNA substrate and translocates it towards the prohead. This movement, coupled with interaction of the DNA with the portal region, induces changes to the helical pitch and temporarily stores energy. (C) The stored energy is released by translocation into the prohead (green arrow), restoring the B-form helical repeat. This figure is adapted with permission from Oram, M., Sabanayagam, C., Black, L.W., 2008. Modulation of the packaging reaction of bacteriophage T4 terminase by DNA structure. Journal of Molecular Biology 381 (1), 61–72.

Terminases package concatemeric genomes and therefore must cut the DNA at the end of packaging. For viruses in this class, the genomes are typically synthesized by rolling circle DNA replication, resulting in multiple copies of the genome linked in a series (i.e., concatemers). Packaging is accomplished in one of two ways. In the first mechanism, typified by the lambda phage, the terminase cuts a specific site in the genome, resulting in exactly one genome length being packaged into the head (referred to here as ‘unit-length packaging’). In the second mechanism (known as ‘headful packaging’), the volume of packaged DNA triggers the nuclease cleavage mechanism, rather than a specific DNA sequence. The result of a headful packaging mechanism is that each capsid is filled with slightly more than one genome of DNA, creating terminally redundant ends that can be circularly permuted. For ‘unit-length’ bacteriophage such as lambda phage, the motor nuclease domain cuts at a specific sequence, known as cosN in lambda phage. Staggered nicks are made on the two strands of the DNA, leaving a twelve-nucleotide 5′ overhang. For the initial cleavage event, a complex of the small terminase protein (gpNu1) and the E. coli protein Integration Host Factor (IHF) increases the specificity and affinity of the large terminase for the cosN site. IHF bends the DNA nearly 180°, positioning two gpNu1 binding sites adjacent to each other and presumably facilitating gpNu1 DNA binding, which in turn promotes large terminase nicking at cosN. During termination, an upstream site, cosQ, and a downstream site, I2, are necessary for accurate nicking, and mutations in cosQ lead to aberrant termination and formation of particles with uncut DNA protruding from the head. It is proposed that cosQ plays a role in properly positioning the terminase/nuclease domains for antiparallel DNA cleavage at the cosN site, and that the volume of packaged DNA may trigger its recognition.
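The geometry of a staggered cut that leaves twelve-nucleotide 5′ overhangs, as described for cosN, can be illustrated with a short string-based sketch. The sequence used here is arbitrary and is not the lambda cosN sequence. The function names and the nick position are hypothetical, and only standard string operations are used.

```python
# Staggered nicks on a duplex: which single-stranded ends result?
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def staggered_cut(top, nick_top, overhang=12):
    """Cut a duplex given its top strand (5'->3').

    The top strand is nicked just before index nick_top; the bottom strand is
    nicked `overhang` positions further along (in top-strand coordinates).
    Returns the two single-stranded 5' overhangs, each written 5'->3':
    (end of the left fragment, end of the right fragment).
    """
    nick_bottom = nick_top + overhang
    if nick_top <= 0 or nick_bottom >= len(top):
        raise ValueError("both nicks must fall inside the duplex")
    window = top[nick_top:nick_bottom]
    right_end = window                      # top strand protrudes on the right fragment
    left_end = reverse_complement(window)   # bottom strand protrudes on the left fragment
    return left_end, right_end


if __name__ == "__main__":
    arbitrary_duplex = "ATGCCGTAAGCTTGACCGGATTCAGGCTAACGTTAGC"  # not a real cos site
    left_end, right_end = staggered_cut(arbitrary_duplex, nick_top=10)
    print("left fragment 5' overhang: ", left_end)
    print("right fragment 5' overhang:", right_end)
    print("ends are complementary:", reverse_complement(left_end) == right_end)
```

The two overhangs are reverse complements of each other, which is what makes such ends cohesive.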


In comparison to ‘unit-length’ packaging bacteriophage, ‘headful’ bacteriophage do not terminate packaging at specific DNA sequences; instead, termination is determined by the volume of the head. Because headful phages generally package slightly more than one genome into the capsid, the genome is terminally redundant for circular permutation. This process can result in the phenomenon of generalized transduction, in which phages can carry segments of bacterial genome from one cell to another. There are several strategies for headful packaging. For example, phage P22 cleaves after ~2%–10% of the next genome in the concatemer, which results in the next packaging cycle starting with a region downstream from the initial packaging recognition sequence. Other cleavage strategies are seen across different phage, suggesting that termination (and initiation) are controlled idiosyncratically by different phage. Headful terminases must trigger nuclease activity through a mechanism that senses how full the capsid is. One hypothesis for how this is accomplished relies on the conformational state of the portal protein. In a recent study of the P22 portal ring, the authors found that during packaging the portal adopts an asymmetric assembly that binds tightly to the terminase motor (it is of note that the discovery of this asymmetric state also addressed how a pentameric motor could bind a dodecameric portal, a question that has eluded the field for several decades). However, once the head nears complete fullness, DNA binds around the capsid-enclosed portion of portal. DNA binding in this region triggers the asymmetric portal assembly to adopt a symmetric ring, thus losing affinity for the pentameric motor and presumably discharging it from the capsid head. After portal releases the motor, the motor nuclease domains cleave DNA, severing the packaged DNA from the rest of the concatemer, and terminating the packaging reaction.
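Terminal redundancy and circular permutation both follow directly from cutting a concatemer into pieces slightly longer than one genome. The sketch below illustrates this with a toy 20-letter "genome" in which each letter stands for a stretch of sequence. The function name `headful_packaging`, the genome string, and the choice of 10% extra DNA per head are illustrative assumptions. The extra amount sits at the upper end of the ~2%–10% quoted for P22.

```python
# Headful packaging of a concatemer: each head receives one genome plus a little extra.
def headful_packaging(genome, extra, n_heads):
    """Cut n_heads successive headfuls of len(genome) + extra from a concatemer."""
    unit = len(genome)
    if not 0 < extra < unit:
        raise ValueError("extra must be between 0 and one genome length")
    head_len = unit + extra
    copies = -(-(n_heads * head_len) // unit)   # ceiling division
    concatemer = genome * copies                # head-to-tail genome copies
    return [concatemer[i * head_len:(i + 1) * head_len] for i in range(n_heads)]


if __name__ == "__main__":
    toy_genome = "ABCDEFGHIJKLMNOPQRST"   # 20 markers standing in for a genome
    extra = 2                             # 10% of the next genome copy
    for head in headful_packaging(toy_genome, extra, n_heads=3):
        redundant = head[:extra] == head[-extra:]
        print(head, "| starts at", head[0], "| terminally redundant:", redundant)
```

Each packaged molecule begins and ends with the same markers (terminal redundancy), and each successive head starts further along the genome than the last (circular permutation).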
Headful packaging mechanisms necessitate strict control to prevent premature cleavage of DNA. The small terminase subunit may play a regulatory role in this process, as it suppresses large terminase nuclease activity (while increasing ATPase activity). How this mechanism works is unclear, and the role of the small terminase during DNA translocation has yet to be fully resolved. One model of DNA cleavage relies on separate DNA ‘binding’ and ‘cleavage’ interfaces of the nuclease domain. Terminases have nuclease domains of the RuvC/RNase-H family that use three acidic active site residues to coordinate two divalent metals (either Mg²⁺ or Mn²⁺) to catalyze the cleavage reaction. Crystal structures of the T4 nuclease domain show a patch of basic residues far from the active site that are presumed to make electrostatic contacts with the DNA backbone during ‘translocation mode’. Once endonuclease activity is triggered, the DNA is transferred through an unknown mechanism to the nuclease active site for cleavage. This putative DNA translocation interface is proposed to regulate the endonuclease activity by holding DNA away from the active site during translocation. However, it was later noted that no positively charged residues from this region are conserved throughout other bacteriophage nuclease domains, and that the nuclease domain has very poor affinity for DNA. Further structural studies attempt to explain nuclease regulation through nuclease domain conformational changes. Analysis of the SPP1 bacteriophage nuclease domain crystal structure predicts that a conserved extended β-hairpin clashes with DNA bound in an RNase-H-like conformation, and that the loop must reorient upon DNA binding. Heterogeneity of loop positioning in different bacteriophage species and normal mode analysis calculations support this idea, suggesting that loop flexibility may limit active site accessibility for DNA, and thus acts as a nuclease regulation mechanism.
Later studies using thermophilic terminases suggest that movement of this loop is not required if the terminase uses a different DNA binding mode. It was suggested that the nuclease domain cleaves DNA in a fashion similar to the thermophilic RuvC resolvase rather than RNaseH. In this model, the β-hairpin and residues surrounding the active site cradle DNA within a groove, positioning it in a desirable conformation for cleavage. Another suggested mechanism of cleavage regulation is that DNA packaging occurs too quickly for nucleolytic activity to take place. While this is formally possible, motor stalls for extended time periods (seconds to hours) do not result in DNA cleavage, indicating this ‘kinetic competition’ model is not a primary mechanism for nuclease regulation. Recently, the ‘steric hindrance model’ was proposed to explain regulation of terminase nucleolytic activity (Fig. 8). In this mechanism, the nuclease domains bind to portal and capsid during packaging, preventing the nuclease active site from engaging with DNA. Evidence suggests that the ATPase domains are solely responsible for gripping DNA, while the nuclease domains have no measurable affinity, supporting this arrangement. After ejection from portal, two nuclease domains bind and cleave DNA in an antiparallel arrangement, forming a double-stranded break. This rearrangement of the nuclease domains is made possible by a flexible linker connecting to the ATPase domain, allowing the nuclease domains to adopt the correct orientation for DNA cleavage. Once the DNA strands are cleaved, the tail proteins assemble on portal to create a mature virion. Once packaging is finished, the motor must be replaced with the virus neck and tail. It is not clear how premature release of the DNA is prevented during the exchange of the motor for the tail. Various mechanisms for maintaining the pressurized DNA have been proposed. For example, the portal could act as a one-way valve.
Additionally, the conformation of the DNA itself may help maintain packaged genome. Cryo-electron microscopy studies show the DNA tightly spooled in the capsid, typically with the central axis of the spool coincident with the long axis of the tail. The arrangement of DNA appears to be most homogenous at the periphery of the shell where the genome contacts the capsid directly, with the ‘layers’ of DNA becoming progressively more heterogeneous towards the center. DNA is often found to be ordered in the portal complex and even within the tail, as if it were locked down but poised for ejection. Typically, the DNA sequence that enters the head first is the last to leave the head upon infection, suggesting a simple unidirectional model for spooling of DNA in the head (T4-like phage are an exception; the first sequence in during packaging is the first sequence out during infection). Moreover, there are often proteins that also reside in or nearby this channel that are ejected soon after infection; these ‘ejection proteins’ presumably play an important role in early stages of infection, including the process of crossing the periplasm. It remains a mystery whether these proteins hold DNA in the capsid until infection or the motor plays a role in packaging these ejection proteins inside the capsid.


Fig. 8 Proposed ‘steric hindrance’ model for nuclease regulation. During ‘translocation mode’ the nuclease domain active site is sequestered from DNA by interactions of the large terminase with portal and capsid, preventing premature cleavage. The ATPase domain serves as the sole surface for gripping DNA during packaging. Upon completion of packaging the large terminase enters ‘cleavage mode’. The large terminase loosens its attachment to the portal and capsid, releasing the inhibition of the nuclease domains. The ATPase domains remain tightly bound to DNA. The nuclease domains rearrange to cleave each of the antiparallel DNA strands. Although depicted as a blunt cut, cleavage could also leave overhangs depending on how both nuclease domains engage DNA. This figure is reprinted with permission from Hilbert, B.J., Hayes, J.A., Stone, N.P., Xu, R.-G., Kelch, B.A., 2017. The large terminase DNA packaging motor grips DNA with its ATPase domain for cleavage by the flexible nuclease domain. Nucleic Acids Research 45 (6), 3591–3605.

Outlook and Future Studies

Despite decades of research, there remain numerous questions as to how viral genome packaging motors processively translocate DNA. High-resolution structures of actively packaging motors are necessary to elucidate the DNA translocation mechanism. Visualizing the conformational transitions, intersubunit interactions, and DNA gripping contacts will shed light on how the motor generates the power to package DNA against the immense back pressure in the capsid. Structures of both phi29 and terminase motors will determine their similarities and differences, as well as address how force is generated. While phi29 genome packaging has been investigated in fine detail, less is known about the terminase motor DNA translocation cycle. For instance, do terminases also hydrolyze ATP in a sequential order? Do they follow a similar chemomechanical cycle? Given that there are considerable differences in DNA translocation speeds of phi29 and terminases, it is possible that the two types of motors use different regulation strategies. Furthermore, it appears that the trans-acting arginine finger is different in terminases and phi29 motors. If so, how do two proteins with similar ATPase domain topologies employ different mechanisms? Having a clearer understanding of the terminase DNA translocation cycle will provide insight into these questions.

There are also many unknown regulation steps within the endonuclease reaction catalyzed by terminases. For example, how the motor senses when to cleave for both headful and unit-length filling mechanisms is unknown. For headful terminases, the portal has been implicated as a sensor, but whether the large terminase plays an active sensing role is still inconclusive. In unit-length bacteriophages such as lambda, the initial cleavage event has been elucidated in great detail, but how the termination cleavage occurs in the absence of gpNu1 and IHF is less clear. Some evidence supports the idea that lambda may use a headful-like sensing mechanism to trigger cosN cleavage, but whether this is regulated by the portal, the terminase itself, or the rate of DNA translocation at the end of packaging is undetermined. Finally, once the motor is released from the portal, what prevents the genome from immediately ejecting from the capsid before the pre-assembled tail binds? Answering these questions will undoubtedly require synthesizing structural studies with careful biochemical and single-molecule assays.

Moving forward, the study of motor structure and mechanism will likely lead to new biotechnological applications. For example, the pRNA of phi29 has already been used as a therapeutic delivery device. Are there applications for the motor itself? One could imagine roles in nanomaterials and targeted delivery of biological cargo. As an example, the portal was recently engineered as a nanopore for sensing various biomolecules. Future studies of the motor may lead to a plethora of applications in biotechnology.


Biophysics of DNA Packaging
Joshua Pajak and Gaurav Arya, Duke University, Durham, NC, United States
Douglas E Smith, University of California, San Diego, La Jolla, CA, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary
ASCE The additional strand, conserved E (glutamate) superfamily of ATPases. A classification of proteins based on sequence, structure, and function. Terminases and other viral DNA packaging ATPases are members of this superfamily.
ATPase A protein that catalyzes ATP hydrolysis.
Procapsid The viral capsid prior to complete maturation of the virion.
Terminase A viral protein which packages DNA and performs endonuclease activity.

Introduction

A crucial step in virus assembly is the encapsulation of the viral genome into the procapsid. Different viruses have evolved different strategies by which they package their genome. In many double-stranded DNA (dsDNA) viruses, such as some families of bacteriophages, DNA packaging is accomplished by packaging ATPases, powerful molecular motors that convert energy from ATP hydrolysis into mechanical work. This energy is used to overcome large internal forces arising from electrostatic self-repulsion, entropy loss, and DNA bending rigidity, which oppose DNA confinement within the capsid.

The packaging ATPase is a dimer of homo-oligomeric rings. One ring is composed of large packaging proteins, and the second ring is composed of small packaging proteins. The small packaging complex is responsible for recognizing the viral genome at a sequence-specific “pac site” to ensure that the virus packages only its own genome and not stray genome from the host bacteria. In some cases, such as the ϕ29 phage, the small packaging proteins are covalently attached to the end of the viral DNA, and recognition is attained via the assembly of the large and small packaging complexes. The large packaging motor protein is responsible for the ATPase activity which provides the necessary energy to translocate the genome into the viral procapsid.

The packaging motors have evolved various mechanisms for initiating packaging, translocating DNA, and terminating packaging. The mechanisms involved in DNA translocation are especially elaborate: mechanical actions such as DNA gripping and ungripping (and/or steric pushing of the DNA), force generation, and coordination of the multiple motor subunits must be coupled to the ATP binding, bond cleavage, and product release steps of the ATP-hydrolysis cycle. While there is currently no unified model that explains all aspects of DNA packaging for all phages, our understanding of packaging ATPases, especially their molecular structure, packaging characteristics, and packaging mechanism, has improved significantly through knowledge primarily gained from the three well-studied systems of the ϕ29, T4, and λ phages, which will mainly be highlighted here.

Molecular Structure

Viral DNA packaging proteins are members of the Additional Strand, Conserved Glutamate (ASCE) superfamily of ATPases, and are related to other ASCE ATPases such as ATP Binding Cassette (ABC) and ATPases Associated with diverse cellular Activities (AAA+) proteins. ASCE ATPases contain a conserved P-loop/Walker A (WA) motif involved in binding ATP, and a Walker B (WB) motif which helps catalyze ATP hydrolysis. Based on mechanisms deduced for other ATPases, catalysis is thought to be specifically dependent on a conserved glutamate residue located immediately downstream of the WB motif. Many packaging proteins also have endonuclease activity, which is necessary to initiate and terminate the packaging process when viral DNA replication results in concatenated strings of multiple genomes (concatemers); as such, these proteins are called “terminases”. Terminases have two distinct domains, the ATPase and nuclease domains, which are connected by a flexible linker. The T4 and λ phages use terminase proteins. The ϕ29 motor does not perform endonuclease activity and its motor enzymes are referred to by the more general term “packaging ATPase.”

The molecular structures of the monomer subunits from several packaging ATPases have been solved either in part or in whole via X-ray crystallography. The first near-complete monomer structure to be solved was from the T4 phage in 2008, followed by those from the Sf6 and D6E phages in 2013 and 2017, respectively. The structures of the ATPase domain of the P74-26 phage terminase protein and a partial ATPase from the ϕ29 phage have also been solved, as have the structures of the nuclease domains of the P74-26, SPP1, P22, G20c, and RB49 phage terminase proteins. There is significant structural homology across all of the ATPase and nuclease domains, although the relative positioning of the nuclease domain with respect to the ATPase domain is drastically different in the solved T4, Sf6, and D6E terminase structures (Fig. 1). The homology of the individual domains suggests that some aspects of the packaging mechanisms may be conserved across all bacteriophages; however, the reason for the observed differences in domain positions across the three proteins is unclear.


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20966-7


Fig. 1 (A) The pathway of virion maturation for the λ phage. (B) Cryo-EM data of the ϕ29 packaging motor with modeled DNA. The portal (or connector) density is colored green, the structural RNA density is colored magenta, and the packaging ATPase density is colored purple. (C) Three large terminase structures from the T4 (PDB: 3CPE), Sf6 (PDB: 5IDH), and D6E (PDB: 5OE8) phages are depicted to highlight the unique domain arrangements found in each crystal structure. The ATPase domain is colored green, the lid subdomain is colored yellow, and the nuclease domain is colored blue. Reproduced from (A) Andrews, B.T., Catalano, C.E., 2013. Strong subunit coordination drives a powerful viral DNA packaging motor. Proceedings of the National Academy of Sciences 110 (15), 5909–5914. doi:10.1073/pnas.1222820110. (B) Morais, M.C., Koti, J.S., Bowman, V.D., et al., 2008. Defining molecular and domain boundaries in the bacteriophage ϕ29 DNA packaging motor. Structure 16 (8), 1267–1274.

The packaging ATPase complex assembles at a unique vertex of the viral procapsid, which houses the portal protein complex, a 12-mer ring structure that forms the pore through which DNA passes during translocation. It is known that the large packaging ATPases dock to the portal complex. However, it is not known which domain of terminase proteins binds to the portal. Cryo-EM data from the T4 and T7 bacteriophages suggest that the ATPase domain binds to the portal, while Fluorescence Resonance Energy Transfer (FRET) and some genetic and biochemical data suggest that the nuclease domain binds to the portal. This is an important issue to resolve, as the two orientations imply different mechanisms for certain aspects of motor function, such as DNA-gripping residues and force-generation mechanisms.

DNA Packaging Characteristics

Biophysical experiments, particularly single DNA molecule manipulation experiments with optical tweezers, have produced a wealth of quantitative data on the packaging process (Fig. 2). These experiments involve trapping and manipulating dielectric microspheres using focused laser beams. In one approach, used to study ϕ29 and T4, an empty phage procapsid, with its motor complex attached, is tethered via antibodies to one of the microspheres while its genome is attached by one end via biotin-streptavidin labels to a second microsphere. Packaging is initiated when the laser brings these two spheres close enough for the packaging motor on one sphere to grab the free end of the DNA on the other sphere. In a second approach, used to study λ, a DNA-motor complex attached to one sphere docks onto an empty procapsid bound to a second sphere to initiate packaging. Subsequent packaging of DNA exerts a force on the DNA and the attached microspheres, and the magnitude of this force is determined by measuring the deflections (change in momentum) of the trapping laser beams. A commonly used measurement technique is the “force-clamp”, in which a feedback control system is used to change the separation between the two trapped beads so as to apply a small constant stretching force to the DNA as it is being packaged, enabling the length of the DNA packaged to be tracked continuously as a function of time. These measurements have high sensitivity and can resolve force changes <1 pN and displacements as small as ~1 bp in some experiments. Motor characteristics that have been determined include instantaneous and average packaging rates, motor pausing and slipping, and in some cases incremental translocation step sizes.
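As a rough illustration of how a force-clamp record (packaged DNA length tracked against time) could be reduced to an average packaging rate, the sketch below fits a straight line to the trace by least squares. The function name `packaging_rate_bp_per_s`, the sampling interval, and the synthetic trace are all hypothetical; they are not taken from any of the experiments described here, and real analyses typically also handle noise, pauses, and slips.

```python
def packaging_rate_bp_per_s(times_s, packaged_bp):
    """Least-squares slope of packaged DNA length (bp) versus time (s)."""
    n = len(times_s)
    if n < 2 or n != len(packaged_bp):
        raise ValueError("need two or more paired samples")
    mean_t = sum(times_s) / n
    mean_x = sum(packaged_bp) / n
    num = sum((t - mean_t) * (x - mean_x) for t, x in zip(times_s, packaged_bp))
    den = sum((t - mean_t) ** 2 for t in times_s)
    if den == 0:
        raise ValueError("time samples must not all be identical")
    return num / den


# Hypothetical, noise-free trace: DNA packaged at a steady 100 bp/s,
# sampled every 0.1 s (illustrative numbers only).
times = [0.1 * i for i in range(11)]
packaged = [100.0 * t for t in times]
rate = packaging_rate_bp_per_s(times, packaged)
print(f"average packaging rate: {rate:.1f} bp/s")
```

A linear fit over a sliding window, rather than over the whole trace, would give the instantaneous rates mentioned above; the window length would then trade noise against time resolution.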

Fig. 2 Single molecule experimental set-up and examples of data collected. (A) A schematic of the single molecule experimental set-up using optical tweezers to study the ϕ29 phage packaging process. In these early studies the bottom microsphere was held with a narrow glass pipette. In more recent studies both microspheres were trapped with optical tweezers. (B) DNA translocation distance vs. time data gathered on the ϕ29 system displaying dwell and burst phase behavior. During the dwell phase the motor does not package DNA while it waits for all subunits to be loaded with ATP. During the burst phase, four subunits hydrolyze ATP in quick succession to translocate 10 bp of DNA. (C) Nucleotide-dependent slipping of the T4 motor; black lines were recorded with a high concentration of a non- (or slowly) hydrolyzable ATP analog, blue lines with a high ADP concentration, and red lines with no nucleotides. Reproduced from (A) Smith, D.E., Tans, S.J., Smith, S.B., et al., 2001. The bacteriophage ϕ29 portal motor can package DNA against a large internal force. Nature, 413 (6857), 748–752. doi:10.1038/35099581 with permission from Springer publishing. (B) Moffitt, J.R., Chemla, Y.R., Aathavan, K., et al., 2009. Intersubunit coordination in a homomeric ring ATPase. Nature, 457 (7228), 446–450. doi:10.1038/nature07637 with permission from Springer publishing. (C) Ordyan, M., Alam, I., Mahalingam, M., Rao, V.B., Smith, D.E., 2018. Nucleotide-dependent DNA gripping and an end-clamp mechanism regulate the bacteriophage T4 viral packaging motor. Nature Communications, 9 (1). doi:10.1038/s41467-018-07834-2 with permission from Springer publishing.

Packaging Speed

Viral DNA packaging is fast and highly processive (Table 1). Single-molecule experiments showed that the ϕ29 phage packages at a maximum rate (at ~23°C) of ~200 base pairs per second (bp/s), the λ phage packages as fast as ~800 bp/s, and the T4 phage can package up to 2000 bp/s. These rates appear to be roughly proportional to the length of the virus’ genome: the ϕ29 phage genome is ~19 kilo base pairs (kbp), and the λ and T4 phage genomes are ~48.5 and ~170 kbp long, respectively. This proportionality ensures that each virus can package its genome within a relatively short window of time during the infection of the host cell.

Table 1 Packaging speed and genome length

Bacteriophage    Maximum packaging speed (bp s−1)    Genome length (kbp)
ϕ29              200                                 19
λ                800                                 48.5
T4               2000                                170

The experiments also showed that the packaging speeds of the ϕ29, T4, and λ phages decrease with increasing applied load force. Similarly, packaging speed was found to decrease as the amount of DNA packaged in the capsid increases, suggesting that internal forces which oppose DNA confinement also slow down the motor. Recent studies of ϕ29 indicate that both internal load force and allosteric regulation of ATP binding regulate the motor speed in response to the amount and conformation of packaged DNA. Single-molecule studies of the λ phage revealed a sharp dip in packaging speed when ~30% of the genome was packaged. This feature is interpreted as a signature of capsid expansion, which in that system is induced by the build-up of internal pressure from the confined DNA and which temporarily reduces the load force acting on the motor.

The ϕ29 motor has been best characterized in terms of the kinetics of the mechano-chemical cycle, and because the motor translocates relatively slowly, very high-resolution measurements of DNA translocation were possible. High-stability optical tweezers measurements at high load forces revealed that ϕ29 translocates DNA in incremental steps of 2.5 bp per ATP hydrolyzed. These 2.5 bp steps are further grouped into 10 bp “bursts”, during which four subunits package rapidly in sequence (Fig. 3). This sequence of rapid firing is termed the “burst phase”. After the burst phase, the motor continues to grip DNA but does not package it. During this “dwell phase”, the motor waits for ATP to bind to every subunit that had hydrolyzed ATP to package DNA during the burst phase. Interestingly, this implies that the multiple motor subunits are tightly coordinated and that one of the five subunits does not participate in DNA translocation; this subunit is understood to regulate the occurrence of bursts. It is currently unknown whether other phages also use a strictly coordinated dwell-burst packaging mechanism, as those phages package too quickly to resolve the dwell phase if it were to be present.
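The arithmetic behind the speed and step-size figures quoted in this section can be sketched in a few lines. The values below are the maximum speeds and genome lengths given in the text (Table 1) and the ϕ29 step sizes (2.5 bp per ATP, four steps per burst); the function names are illustrative only. Because packaging slows as the capsid fills, the computed times are lower bounds rather than measured packaging times.

```python
# Values quoted in the text (Table 1): maximum speed (bp/s), genome length (kbp)
PHAGES = {
    "phi29": {"max_speed_bp_s": 200, "genome_kbp": 19},
    "lambda": {"max_speed_bp_s": 800, "genome_kbp": 48.5},
    "T4": {"max_speed_bp_s": 2000, "genome_kbp": 170},
}


def min_packaging_time_s(genome_kbp, max_speed_bp_s):
    """Lower bound on packaging time: the whole genome at maximum speed."""
    return genome_kbp * 1000 / max_speed_bp_s


def phi29_cycle_counts(genome_kbp, step_bp=2.5, steps_per_burst=4):
    """ATP hydrolysis events and dwell-burst cycles needed for one genome."""
    atp_events = genome_kbp * 1000 / step_bp
    return atp_events, atp_events / steps_per_burst


for name, p in PHAGES.items():
    t = min_packaging_time_s(p["genome_kbp"], p["max_speed_bp_s"])
    print(f"{name}: at least {t:.1f} s to package {p['genome_kbp']} kbp")

atp, bursts = phi29_cycle_counts(PHAGES["phi29"]["genome_kbp"])
print(f"phi29: {atp:.0f} ATP hydrolysis events in {bursts:.0f} bursts")
```

Despite a roughly ninefold spread in genome length, the three lower bounds fall within about 60 to 95 s of one another, which is the sense in which the rates scale with genome length.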


Fig. 3 Kinetic model for the dwell phase/burst phase cycle of the ϕ29 packaging process. ATP-bound subunits are labeled as T, ADP/Pi-bound subunits are labeled as D·Pi, and ADP-bound subunits are labeled as D. Electrostatic contacts with the DNA backbone are indicated in red. The “Special Subunit” grips DNA, but does not translocate. Reproduced from Chistol, G., Liu, S., Hetherington, C.L., et al., 2012. High degree of coordination and division of labor among subunits in a homomeric ring ATPase. Cell, 151 (5), 1017–1028. doi:10.1016/j.cell.2012.10.031 with permission from Elsevier.

Force Generation

Viral DNA packaging motors are some of the strongest molecular motors identified in biology. Single-molecule studies on the ϕ29, λ, and T4 phages revealed that all three phages could overcome applied load forces in excess of 50–60 pN. For comparison, the skeletal muscle protein myosin can generate ~2 pN of force. In fact, the 50–60 pN measurements represent a lower bound on the generated force; at higher forces the procapsids were pulled off the microspheres, and at ~65 pN the DNA helix sharply deforms. At high prohead filling levels internal forces contribute an estimated ~20–30 pN and, combined with applied external forces ranging up to ~50 pN, suggest that these motors can produce forces of at least ~80 pN. ATP hydrolysis supplies ~18 kcal mol−1 (~130 pN nm) of free energy. If a motor were to convert this energy into useful work with 100% efficiency, almost 155 pN of force would be available to translocate DNA by the 2.5 bp (~8.5 Å) step size observed experimentally. Thus, the predicted packaging force magnitude of ~80 pN is within the realm of possibility, provided the motor achieves ~50% efficiency.
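The energy budget above can be reproduced with a short calculation. The sketch below converts the quoted free energy per mole of ATP into work per molecule and divides by the step length; it assumes a B-form rise of 0.34 nm per base pair (so that 2.5 bp is 0.85 nm, matching the ~8.5 Å in the text), and the function names are illustrative.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
JOULES_PER_KCAL = 4184.0
PN_NM_PER_JOULE = 1e21     # 1 J = 1e12 pN x 1e9 nm
NM_PER_BP = 0.34           # assumed B-form DNA rise per base pair


def atp_energy_pn_nm(kcal_per_mol):
    """Free energy per ATP molecule, expressed in pN nm."""
    return kcal_per_mol * JOULES_PER_KCAL / AVOGADRO * PN_NM_PER_JOULE


def max_force_pn(energy_pn_nm, step_bp):
    """Force available if all of the energy moved the DNA by one step."""
    return energy_pn_nm / (step_bp * NM_PER_BP)


energy = atp_energy_pn_nm(18.0)      # free energy per ATP quoted in the text
f_max = max_force_pn(energy, 2.5)    # 2.5 bp step observed for phi29
efficiency = 80.0 / f_max            # ~80 pN estimated motor force
print(f"{energy:.0f} pN nm per ATP, {f_max:.0f} pN maximum, "
      f"{efficiency:.0%} efficiency at 80 pN")
```

With these unrounded conversion factors the calculation gives about 125 pN nm per ATP and about 147 pN at 100% efficiency, slightly below the rounded ~130 pN nm and ~155 pN quoted above, and an efficiency of roughly 54% for an 80 pN motor. The conclusion that ~50% efficiency suffices is unchanged.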

Pausing and Slipping

Although these motors are highly processive and capable of producing large forces, the packaging process is not without defects. Pausing and slipping (DNA leaving the procapsid) are induced at high external loads and when the ATP concentration is lowered. Interestingly, pausing and slipping events are not induced to the same extent in each packaging motor. For example, under high load force, the λ phage packaging motor pauses more but slips less than the ϕ29 motor. While packaging the fastest overall, the T4 phage has the most variable packaging speed. Additionally, pausing and slipping are highly dependent upon the availability of nucleotides in the bulk environment. For instance, slippage is more frequent in solutions with low ATP concentration, while pausing is more frequent in conditions with small amounts of non-hydrolyzable ATP analog. Recent single-molecule experiments on the T4 phage in an ADP-rich environment showed that the motor appears to fluctuate between apo-like, ATP-like, and intermediate states.

DNA Packaging Mechanisms

The molecular mechanisms by which these enzymes package DNA are becoming better understood thanks to a combination of increased availability of high-resolution crystal structures, biochemical analysis of mutant enzymes, and computational modeling. These mechanisms include mechanochemical coupling, motor-substrate interactions, force generation, and motor coordination. However, these mechanisms are not independent of one another: perturbing one aspect often has a cascade effect that perturbs the others.

Mechanochemical Coupling

Single molecule experiments in which ϕ29 translocation was studied with varying ATP concentration, varying applied force, and with mixtures of ATP and non-hydrolyzable ATP analog demonstrate that ATP binding does not induce the DNA translocation step. Instead, translocation is likely coincident with product (inorganic phosphate) release. Because the non-hydrolyzable analog induces pausing and not slipping, ATP binding likely “cocks” the motor by causing the enzyme to grip DNA before translocation. Rapid solution exchange experiments with T4 also showed that the motor usually tightly grips the DNA when a non-hydrolyzable ATP analog is bound, while the DNA slips out rapidly when there is no nucleotide. Thus, it is evident that small changes within the ATP binding pocket must be sensed by the protein and then somehow converted into mechanical motion. The connection between the chemical steps of ATP hydrolysis and the mechanical motion of the motor is termed “mechanochemical coupling.” The mechanisms involved are generally expected to have features in common with related ATPases, such as AAA+ motors.

One such recently identified mechanism has been termed the “arginine toggle.” A unique structural element of packaging ATPase enzymes relative to other ASCE proteins is the strong conservation of an arginine residue at the 3 or 4 position of the canonical WA motif. Mutational studies of the λ, T4, P74-26, and D6E phage terminase proteins have provided evidence suggesting that the WA arginine is crucial for ATPase catalytic activity as well as mechanochemical coupling. Further computational studies of the T4, P74-26, and Sf6 terminase proteins predicted that the WA arginine toggles in and out of the binding pocket in response to ATP binding, and that this toggling motion is part of the signal transduction pathway which informs the rest of the protein that ATP has bound. In this sense, the “arginine toggle” appears to be conceptually analogous to the “sensor II motif” arginine found in related AAA+ motors.

Helicases, a related family of ATPase proteins, were found to use a “Q-motif” which also couples ATP binding to conformational change. This motif typically consists of a “YQ” amino acid pair, located upstream of the WA motif, that binds the adenine ring of ATP. Sequence alignment suggested that the λ phage terminase enzyme contains a Q-motif, namely 46-YQ-47. In single molecule experiments, a Y46F variant displayed a slower packaging rate and increased slipping, as well as higher sensitivity to applied force. This suggests that the Q-motif is responsible in part for mechanochemical coupling as well as force generation.

Motor-substrate Interactions

Many proteins that bind to DNA do so at specific sites dictated by short DNA sequence motifs. This allows for increased specificity and increased binding affinity. However, since these motors must package entire genomes, they must grip onto a wide variety of sequence motifs. Thus, sequence-specific interactions are not employed by these motors; instead they rely on non-specific interactions. Such interactions can be classified into two broad categories: electrostatic and steric. Although each base of DNA varies in hydrogen-bonding potential, each base is attached to an identical negatively charged phosphate backbone. Thus, positively charged residues located in the pore of the motor can bind non-specifically to these negatively charged groups via electrostatic interactions. Further, the double-helix structure of DNA creates a major groove and a minor groove into which bulky protrusions could fit, such that the motor grips DNA via steric interactions.

To test whether the motor primarily grips via electrostatic or steric interactions, single molecule experiments have been performed with structural modifications introduced to the substrate DNA. DNA containing 10 bp sections in which backbone methylation neutralizes the negative charge of the phosphate backbone was packaged with little change in kinetics compared with regular DNA. However, 11 bp methylated sections often caused stalling or lengthy pauses in packaging. This suggests that the motor makes crucial electrostatic contacts every 10 bp of DNA. This behavior fits with known structural features as follows: the motor has pentameric stoichiometry and packages in 2.5 bp steps, and one helical turn of DNA is ~10.5 bp. Thus, every 10 bp of DNA approximately aligns the charged phosphate backbone with the same subunit. Although the 11 bp insertions strongly affected packaging dynamics, the motor was able to accommodate insertions up to 30 bp long. This indicates that while electrostatic interactions are important, non-specific steric interactions are sufficient for DNA translocation.

To test whether electrostatic interactions alone are sufficient for DNA translocation, DNA with 10 bp abasic (phosphate backbone only) insertions was introduced to the system. This modification was also packaged under low applied load, indicating that electrostatic interactions are sufficient at low forces. However, other modifications such as single-strand insertions, unhybridized bulges, and polymer insertions could also be packaged. Taken together, this evidence demonstrates that both non-specific steric interactions and attractive electrostatic interactions play roles in the translocation process.

There remains debate over whether the ATPase domain or the nuclease domain of the terminase proteins interacts with DNA during translocation. The ϕ29 motor does not contain a functional nuclease domain, which suggests that the ATPase domain may primarily interact with the DNA. However, it is unclear if this is consistent across all packaging proteins. A model based on T4 cryo-electron microscopy data implicates the nuclease domain as gripping the DNA during translocation steps. However, biochemical data from mutant P74-26 terminase proteins identified crucial DNA gripping residues in the ATPase domain and not the nuclease domain. Thus, it is unclear whether one model can explain protein-DNA interactions for all packaging motors.
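The geometric argument for a contact every 10 bp can be checked with simple arithmetic. The sketch below treats the DNA as an ideal helix with the 10.5 bp repeat quoted above and computes how far the backbone has rotated, relative to a fixed point on the motor, after one 2.5 bp step and after a full four-step burst. The function name and the idealized-helix assumption are illustrative and are not part of the published analysis.

```python
HELICAL_REPEAT_BP = 10.5   # bp per helical turn, as quoted in the text
STEP_BP = 2.5              # bp translocated per ATP hydrolyzed
STEPS_PER_BURST = 4        # translocating subunits per burst


def phase_offset_deg(distance_bp, repeat_bp=HELICAL_REPEAT_BP):
    """Rotation of the backbone after distance_bp, folded into (-180, 180]."""
    angle = (distance_bp / repeat_bp * 360.0) % 360.0
    return angle - 360.0 if angle > 180.0 else angle


burst_bp = STEP_BP * STEPS_PER_BURST
for distance in (STEP_BP, burst_bp):
    print(f"{distance:4.1f} bp -> {phase_offset_deg(distance):6.1f} degrees")
```

In this idealization a single 2.5 bp step rotates the backbone by about 86 degrees, whereas a full 10 bp burst leaves it only about 17 degrees short of a complete turn, so the same face of the helix returns close to the same subunit once per burst.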

Force Generation

Several models have been proposed to explain how these molecular machines generate the incredible forces necessary for complete genome packaging. Most models suggest that the force is generated by conformational changes within the ATPase packaging proteins that actuate a “lever” motion (Fig. 4), but one model suggests that the DNA segment threaded through the motor channel itself could be involved in force generation.


Fig. 4 Three proposed models for the motor force generation mechanism. (A) The scrunchworm model, indicating the motor gripping events needed to facilitate unidirectional motion of the substrate DNA. (B) A model proposed based on crystal structures of the P74-26 terminase enzymes in apo and ATP-bound states. The ATPase domain (N-terminal) grips DNA, and the nuclease domain (C-terminal) attaches to the portal. Rotation about the lid subdomain drives translocation. (C) The extended-to-compact transition model is depicted, proposed based on cryo-EM and X-ray data for the T4 motor. The nuclease domain (C-terminal) grips DNA and the ATPase domain (N-terminal) attaches to the portal. Alignment of charged residues across the domain interface drives the compaction process and DNA translocation. Reproduced from (A) Waters, J.T., Kim, H.D., Gumbart, J.C., Lu, X.J., Harvey, S.C., 2016. DNA Scrunching in the packaging of viral genomes. Journal of Physical Chemistry B 120 (26), 6200–6207. doi:10.1021/acs.jpcb.6b02149. (B) Hilbert, B.J., Hayes, J.A., Stone, N.P., et al., 2015. Structure and mechanism of the ATPase that powers viral genome packaging. Proceedings of the National Academy of Sciences, 112 (29), E3792–E3799. doi:10.1073/pnas.1506951112. (C) Migliori, A.D., Smith, D.E., Arya, G., 2015. Molecular interactions and residues involved in force generation in the T4 viral DNA packaging motor. Journal of Molecular Biology, 426 (24), 4002–4017. doi:10.1016/j.jmb.2014.09.023.

The “scrunchworm” model suggests that the translocation machinery primarily grips and releases DNA while the DNA is compressed and expanded. Coordination between gripping/releasing and compression/expansion could result in directional motion of the DNA. In this model, the force generated is a restorative force stored within the DNA itself and the motor is speculated to induce the DNA to go into the compressed conformation. Support for this model comes from molecular simulations demonstrating that DNA is expanded/compressed within the portal complex. Further, dsRNA and RNA-DNA heteroduplexes could not be packaged by the T4 motor, presumably because they cannot undergo the same compression/expansion as dsDNA.


On the other hand, more conventional models assume that the force to drive translocation is generated inside the motor proteins. One such model proposed for the T4 motor posits that the alignment and misalignment of charged residue pairs at the ATPase/nuclease domain interface drives translocation. Alignment of the charged residue pairs induces compaction, and misalignment induces expansion. If compaction occurred while the protein gripped DNA and expansion occurred while the protein did not grip DNA, this would result in unidirectional movement, and thus packaging. In this model, mechanochemical coupling of the ATP hydrolysis cycle coordinates the alignment and misalignment of the charged residue pairs. This model is supported by single molecule studies of motor protein mutants in which charged amino acids at the interface were replaced with residues of opposite charge. These substitutions were found to lower the translocation velocity and the force generated by the motor.

Alternative models have been proposed for the P74-26 and Sf6 packaging motors based on solved crystal structures of the monomeric terminase subunits. In both cases the lid subdomain (a small motif within the ATPase domain) of the ATP-bound structure is rotated relative to the apo state. In these models, DNA is gripped by either the ATPase domain or a combination of the ATPase and nuclease domains. The rotation of the lid subdomain powers DNA translocation and passes DNA from one subunit to the next, and mechanochemical coupling of the ATP hydrolysis cycle coordinates the lid subdomain rotation.

Motor Coordination

As mentioned above, the ϕ29 motor has been shown to be highly coordinated overall, exhibiting a “dwell” phase during which no ATP is hydrolyzed, and a “burst” phase in which four subunits consecutively hydrolyze ATP and translocate DNA. Coordination within the burst phase is beneficial for motor function. For instance, if two subunits attempted to translocate DNA simultaneously, they might spend two ATP hydrolysis events to translocate DNA 2.5 bp. If those two subunits instead were to translocate DNA consecutively, they would spend the same two ATP hydrolysis events to translocate DNA 5 bp. Thus, it is in the motor’s best interest to coordinate its subunits’ actions. One identified mechanism for inter-subunit communication is a trans-acting “arginine finger.” The arginine finger protrudes from one subunit into a neighboring subunit’s ATP-binding pocket. This allows information about an ATP binding event in one subunit to be communicated to its neighbor, so that their actions can be coordinated. Biochemical studies have assigned the trans-acting arginine finger in several packaging motors, such as those from the ϕ29, P74-26, and D6E phages. Although sequence alignments suggest that the T4 packaging motor could also contain a trans-acting arginine finger, a pentameric model of the T4 packaging motor based on cryo-EM reconstruction places each subunit’s ATP-binding pocket distal from its neighbors, arguing against such a mechanism. Additionally, the Sf6 packaging motor does not appear to contain an arginine at the site predicted by sequence alignment, but it is possible that it utilizes a deviant lysine to accomplish the same goal.
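
The efficiency argument for coordination can be written out as a short calculation. The sketch below simply restates the arithmetic of the two-subunit example in the text (2.5 bp per translocation step, one ATP per subunit); the function names are hypothetical, and the assumption that simultaneous firing yields only a single step is the same one made in the paragraph above.

```python
BP_PER_STEP = 2.5  # step size used in the text's example


def bp_translocated(n_subunits, simultaneous):
    """DNA moved (bp) when n_subunits each hydrolyze one ATP."""
    # Simultaneous firing is assumed to produce one step in total;
    # consecutive firing produces one step per subunit.
    steps = 1 if simultaneous else n_subunits
    return steps * BP_PER_STEP


def bp_per_atp(n_subunits, simultaneous):
    """Translocation efficiency in bp per ATP hydrolyzed."""
    return bp_translocated(n_subunits, simultaneous) / n_subunits


for mode, simultaneous in (("simultaneous", True), ("consecutive", False)):
    print(mode, bp_translocated(2, simultaneous), "bp,",
          bp_per_atp(2, simultaneous), "bp/ATP")
```

Under these assumptions, consecutive firing doubles the distance covered for the same ATP cost in the two-subunit case.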

Further Reading

Black, L.W., 2015. Old, new, and widely true: The bacteriophage T4 DNA packaging mechanism. Virology 479, 650–656.
Chemla, Y.R., Smith, D.E., 2012. Single-molecule studies of viral DNA packaging. In: Rossmann, M., Rao, V.B. (Eds.), Viral Molecular Machines. Advances in Experimental Medicine and Biology, vol 726. Boston, MA: Springer.
Rao, V.B., Feiss, M., 2015. Mechanisms of DNA packaging by large double-stranded DNA viruses. Annual Review of Virology 2, 351–378.
Tafoya, S., Bustamante, C., 2018. Molecular switch-like regulation in motor proteins. Philosophical Transactions of the Royal Society B: Biological Sciences 373 (1749), 20170181.

Energetics of the DNA-Filled Head
Alex Evilevitch, Department of Experimental Medical Science, Lund University, Lund, Sweden
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Capsid The protein shell of a virus. It consists of oligomeric structural subunits made of proteins called protomers.
Enthalpy A property of a thermodynamic system equal to the system’s internal energy plus the product of its pressure and volume. In a system enclosed so as to prevent mass transfer, for processes at constant pressure, the heat absorbed or released equals the change in enthalpy.
Entropy An extensive property of a thermodynamic system. It is closely related to the number of microscopic configurations that are consistent with the macroscopic quantities that characterize the system.
Metastability A stable state of a dynamical system other than the system’s state of least energy.
Osmolyte A compound that influences the properties of biological fluids. The primary role of osmolytes is to maintain the integrity of cells by affecting the viscosity, freezing point, and ionic strength of the aqueous solution.
Osmotic pressure The minimum pressure that needs to be applied to a solution to prevent the inward flow of its pure solvent across a semipermeable membrane. It is also defined as the measure of the tendency of a solution to take in pure solvent by osmosis.
Persistence length A basic mechanical property quantifying the stiffness of a polymer.

Introduction

Different virus traits determine virus-host interaction, with resulting dynamics and heterogeneity of viral populations. Investigating these traits is of fundamental importance for understanding viral replication pathways, viral diversity and fitness, as well as for improving the potency of treatments and viral vaccines. Structural virology has been instrumental in providing a detailed morphological description and classification of viruses and their receptors, as required for studies of traits influencing virus-host interaction dynamics. However, most structural studies have focused on the viral protein shell (termed the capsid) because of its symmetry, rather than on the encapsidated genome, where the lack of symmetry prevents high-resolution analysis. Recent findings of pressure-driven viral DNA ejection into cells from phages, archaeal viruses and herpesviruses have raised strong interest in understanding the role that the structure of packaged double-stranded (ds) DNA plays in the dynamics of genome ejection and viral replication. Knowledge of the structure of the encapsidated viral DNA, and of the energetics directly coupled to it, reveals the mechanism of DNA packaging and release during viral replication. Furthermore, it provides insight into the dynamics of viral genome delivery into a host cell and subsequent replication dynamics. A significant amount of work has been done to describe the structural arrangement and resulting energy of dsDNA packaged in bacteriophage capsids. While the exact arrangement of dsDNA in the capsid has not been established, the current models combined with the experiments provide a reasonable estimate of the intracapsid DNA equilibrium energy. This energy, dominated by strong DNA-DNA repulsions and bending stress, results in a pressure of tens of atmospheres that powers DNA ejection into a cell.
We recently found that DNA packaged in phage λ and human herpes simplex virus type 1 undergoes a structural disordering transition from a solid-like to a fluid-like state. In a parallel study, it was shown that DNA packaged in a phage capsid does not have an equilibrium structure and is instead trapped in non-equilibrium conformational states, since genome relaxation in the capsid occurs on a slower time scale than the processivity of the packaging motor. These findings suggest that DNA packaged in a capsid is in a structurally and energetically metastable state. Therefore, knowing the intra-capsid DNA equilibrium energy alone is insufficient for understanding the mechanism and dynamics of viral DNA ejection. This article describes the nonequilibrium changes in energy, structure and associated mechanics of the encapsidated viral genome that facilitate its ejection during viral replication. As will be reviewed here, the metastable state of the packaged viral DNA is an important trait affecting virulence and the course of infection (where the term “infection” denotes the introduction of viral nucleic acid into a host cell by a virus). Phage and herpesvirus model systems are used interchangeably below due to major similarities in their DNA ejection and packaging mechanisms, as well as similar DNA packing densities resulting in essentially identical mechanical properties and structure of the encapsidated genome.

Equilibrium Energy and Structure of Intracapsid DNA

As mentioned above, many families of viruses, infecting all three domains of life, have stressed packaged genomes resulting in high internal capsid pressure. This pressure is generated by an ATP-driven packaging motor located at a unique capsid vertex, which has been shown to be the strongest molecular motor known. Structural features of packaging motor components are shared by bacterial and archaeal dsDNA viruses and eukaryotic herpesviruses. The viral genome organization resulting from intra-capsid confinement is closely

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21519-7


Fig. 1 Cutaway views from cryo-EM reconstructions of (A) HSV-1 C-capsid and (B) phage λ capsid. HSV-1 and λ capsids are shown to scale. Owing to the icosahedral symmetry imposed during the reconstruction, concentrically packed DNA within the capsid becomes shells of density (green). The density maps are downloaded from the Macromolecular Structure Database of the European Bioinformatics Institute with accession codes EMD-1354 (HSV-1) and EMD-5012 (phage λ). Modified from Sae-Ueng, U., Liu, T., Catalano, C.E., et al., 2014. Major capsid reinforcement by a minor protein in herpesviruses and phage lambda. Nucleic Acids Research, 42 (14), 9096–107. doi:10.1093/nar/gku634.

associated with its energetic state. The cryo-EM single particle reconstructions of DNA phage λ and human herpes simplex virus type 1 (HSV-1) shown in Fig. 1 reveal that the entire capsid volume is filled with DNA extending all the way to the center of the capsid. Starting from the capsid walls there are multiple well-ordered, concentric DNA layers. The layers are evenly spaced, indicating that DNA has adopted an ordered repetitive structure characteristic of a liquid crystalline state. However, toward the center of the capsid the ordered layers disappear, suggesting a less ordered DNA structure with lower packing density than in the periphery of the capsid. Similar dsDNA distributions within the capsids were also observed for other viruses. The DNA structure in viral capsids is mainly determined by DNA-DNA interactions, bending stress and packing defects. DNA-DNA interaction and DNA bending are the two main energy terms determining the structure of the encapsidated DNA. At high packaging densities, DNA-DNA interactions are mediated mainly by electrostatic and hydration forces. Hydration forces arise from the structuring of water on the macromolecular surface, with an interaction range of ~15–20 Å, corresponding to 3–4 water layers on each DNA surface. It was shown that DNA confined in a phage capsid behaves very similarly, in terms of interactions and structural ordering, to multiple DNA arrays confined by a polymer in a bulk solution at equal inter-strand separations. However, the encapsidated DNA, unlike the condensed DNA arrays in the bulk, is strongly bent by the confining capsid walls. This is because the dsDNA persistence length is of the same order of magnitude as the diameter of many viral capsids (~50 nm). This leads to an extra bending energy term affecting the structure of the DNA in the capsid.
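
The size of this bending term can be estimated with the standard worm-like chain expression, in which bending a contour length L of a polymer with persistence length P into an arc of constant radius R costs E = k_BT · P · L / (2R²). The sketch below is an illustration under that textbook model, not the calculation used in the studies discussed here. It applies the expression to a single circular loop of dsDNA using the ~50 nm persistence length; the loop radii are hypothetical values chosen only to show the trend.

```python
import math

PERSISTENCE_LENGTH_NM = 50.0  # dsDNA persistence length, as quoted in the text


def bending_energy_kT(length_nm, radius_nm, persistence_nm=PERSISTENCE_LENGTH_NM):
    """Worm-like chain bending energy, in units of k_B*T, for an arc of
    contour length `length_nm` bent at constant radius `radius_nm`."""
    return persistence_nm * length_nm / (2.0 * radius_nm ** 2)


def loop_energy_kT(radius_nm):
    """Bending energy (k_B*T) of one full circular loop of the given radius."""
    return bending_energy_kT(2.0 * math.pi * radius_nm, radius_nm)


# Hypothetical loop radii (nm): from near the capsid wall toward the center.
for radius in (25.0, 15.0, 5.0):
    print(f"R = {radius:4.1f} nm -> {loop_energy_kT(radius):5.1f} kT per loop")
```

Because the energy per loop scales as 1/R in this model, tightly curved loops near the capsid center are considerably more costly than loops lying against the wall.
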
(Persistence length defines the stiffness of a polymer, describing the minimum radius of curvature it can adopt using only the available thermal energy. Bending it to a smaller radius requires additional work.) To relieve the bending stress, helices are packed closer to the capsid wall, increasing the bending radius (providing the lowest curvature) but also decreasing the spacing, and therefore increasing the interaction energy. At the same time, the repulsive DNA-DNA interactions push DNA strands as far from each other as possible, filling the entire capsid volume and maximizing the inter-strand separations. There is thus a trade-off between bending and interaction energies. Furthermore, when DNA is bent inside the capsid, the initial correlation between two helices that have slightly different radii of curvature is lost, and the mutual orientation between helices must be re-established. This leads to packing defects that are absent for linear packaging of DNA in solution. Packing defects are required in order to re-establish a favorable phosphate-phosphate “phasing” of helices, reducing the repulsive interactions due to bending. In order to experimentally dissect the thermodynamic determinants of DNA packaging and ejection energetics, isothermal titration calorimetry (ITC) was used to directly measure the heat released by DNA ejection from a bacteriophage. This approach allows one to address experimentally the relative weight and origin of enthalpic and entropic contributions to the free energy of DNA packing inside a phage capsid. The values of free energy, ΔG, obtained for packaged DNA were one order of magnitude smaller than the values of the enthalpy measured with ITC; the calculated free energy of ejection (ΔGej) was on the order of 10⁻¹⁷ J/virion, while the measured enthalpy (ΔHej) was on the order of 10⁻¹⁶ J/virion for the wild-type (WT) DNA length phage λ at 25 °C.
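
The relation between these two quantities follows from ΔG = ΔH − TΔS. The sketch below back-calculates the entropic term from order-of-magnitude inputs only. The magnitudes come from the text; the negative signs for ejection (heat is released, and ejection is taken to be spontaneous) and the use of exactly 1 × 10ⁿ as a stand-in value are assumptions made for illustration, not measured data.

```python
T_KELVIN = 25.0 + 273.15  # 25 °C, the temperature quoted in the text

# Order-of-magnitude placeholders, NOT measured values (J/virion).
DELTA_G_EJ = -1e-17  # assumed negative: ejection is spontaneous
DELTA_H_EJ = -1e-16  # assumed negative: ejection releases heat


def entropy_term(delta_g, delta_h):
    """Return T*dS (J/virion) implied by dG = dH - T*dS."""
    return delta_h - delta_g


t_ds_ejection = entropy_term(DELTA_G_EJ, DELTA_H_EJ)
# Packaging is the reverse process, so every term changes sign.
t_ds_packaging = entropy_term(-DELTA_G_EJ, -DELTA_H_EJ)

print(f"T*dS (ejection)  = {t_ds_ejection:.1e} J/virion")
print(f"T*dS (packaging) = {t_ds_packaging:.1e} J/virion")
print(f"dS (packaging)   = {t_ds_packaging / T_KELVIN:.1e} J/(K*virion)")
```

With these inputs the entropic term for packaging is positive and close in magnitude to the enthalpy, which is why the free energy ends up an order of magnitude smaller.
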
The free energy is smaller than the enthalpy because the entropy associated with DNA packaging in the capsid is positive, reflecting an increase in the disorder of the system. This may appear surprising, because DNA becomes more ordered when it is packaged, leading to a reduction in conformational entropy. However, as the DNA helices come closer together in the capsid, the hydration properties of DNA change so that the ordered water molecules directly surrounding the DNA are released, increasing the net disorder of the system and thus its entropy. This observation from ITC measurements suggests that the entropy of packaged DNA is strongly related to hydration. The observations above describe the equilibrium properties of virally packaged DNA. However, it was found that intra-capsid DNA is not in its equilibrium state but shows metastable behavior. This metastability affects the mobility of the DNA and, with it, the dynamics of DNA ejection from the capsid during viral infection. While DNA is always condensed inside the cell, it is not condensed to the same extent as inside a viral capsid. Other than in sperm nuclei, in vivo packaging densities range from ~5%–10% by volume. DNA confined in viral capsids, on the other hand, is at the extreme end of the packaging scale, where it is confined to ~55% by volume, forming a hexagonally ordered structure. At only a few angstroms of DNA-DNA surface separation (e.g., 7 Å surface separation for the WT DNA length of 48,502 bp packaged in phage λ), hexagonally ordered DNA has been shown to have very restricted mobility. It has been proposed that so-called Coulomb sliding friction between neighboring DNA helices plays a


Fig. 2 (A) Enthalpy of DNA ejection per virion (J) versus temperature for WT-DNA phage λ titrated into LamB solution. Dashed lines are drawn to guide the eye. ΔHej values were obtained as an average of six independent measurements for each sample. Vertical error bars are SEs. (B) DNA disordering transition, which occurs when the temperature is increased, originates close to the center of the capsid, where DNA is more stressed due to stronger bending and larger packing defects. (Left) Cross-sections of the top view of the capsid. (Right) A side view cross-section of the capsid, which illustrates that DNA closer to the center of the capsid is likely to be ejected first since, due to dsDNA bending stress constraints, it is the last DNA portion to be packaged in the capsid during phage assembly. The schematic illustration of DNA inside the capsid shows the ordering of an averaged DNA structure and not the arrangement of individual DNA strands. Modified from Evilevitch, A., 2018. The mobility of packaged phage genome controls ejection dynamics. eLife 7. doi:10.7554/eLife.37345.002.

significant role in the mobility of DNA at high packing densities in viral capsids. Indeed, it was found that interhelical sliding friction leads to a kinetically trapped, glassy DNA state inside the capsid. This high-friction genome state was shown to significantly affect the rates of DNA packaging and ejection in vitro. The friction arises from dragging closely packed, negatively charged DNA helices past other helices. External environmental factors (e.g., ionic conditions and temperature), along with the DNA packing density in the capsid, strongly affect the metastable state of intra-capsid DNA and facilitate the required mobility of the hexagonally ordered viral genome during the initiation of its ultra-fast ejection, which reaches 60,000 bp/s. Transitions between the metastable states of the encapsidated viral genome structure, and the energetics associated with the ejection process, are reviewed below.
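
The packing figures quoted in this section can be cross-checked with simple geometry. The sketch below assumes idealized hexagonal packing of straight, parallel cylinders, a B-DNA radius of 1 nm and a rise of 0.34 nm per base pair (standard textbook values, not taken from this article), and combines them with the 7 Å surface separation and 48,502 bp genome length given above. It is a rough estimate only and ignores curvature and packing defects.

```python
import math

DNA_RADIUS_NM = 1.0          # assumed B-DNA helix radius
RISE_PER_BP_NM = 0.34        # assumed B-DNA rise per base pair
SURFACE_SEPARATION_NM = 0.7  # 7 Å, from the text (WT phage lambda)
GENOME_BP = 48_502           # WT lambda genome length, from the text


def hexagonal_volume_fraction(radius_nm, surface_separation_nm):
    """Volume fraction of parallel cylinders on a hexagonal lattice."""
    spacing = 2.0 * radius_nm + surface_separation_nm  # axis-to-axis distance
    unit_cell_area = (math.sqrt(3.0) / 2.0) * spacing ** 2  # area per helix
    return math.pi * radius_nm ** 2 / unit_cell_area


def contour_length_nm(n_bp, rise_nm=RISE_PER_BP_NM):
    """Contour length (nm) of a genome of n_bp base pairs."""
    return n_bp * rise_nm


fraction = hexagonal_volume_fraction(DNA_RADIUS_NM, SURFACE_SEPARATION_NM)
print(f"volume fraction ~ {fraction:.2f}")
print(f"contour length  ~ {contour_length_nm(GENOME_BP) / 1000:.1f} µm")
```

Under these assumptions the estimate comes out at roughly one half, in the same range as the ~55% quoted above; the exact value is sensitive to the effective DNA radius that is assumed.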

Metastability of Intra-Capsid DNA Facilitating or Inhibiting Viral DNA Ejection

Virion metastability is one of the central concepts in virology. It implies that the virus, in order to successfully replicate, must be sufficiently stable to prevent spontaneous release of its genome outside the cell between infection events, and at the same time be unstable enough to release its genome during infection. Viral particles are therefore not inert structures and, prior to cell attachment and entry, have not attained the minimum free energy conformation, from which they are separated by an energetic or kinetic barrier. Thus, viral structure plays an active role in genome delivery to the host cell. Viral metastability is mostly associated with structural transformations in the nucleocapsid and/or surrounding lipid envelope in response to changes in the virion’s environment. However, this does not apply to motor-packaged dsDNA viruses (e.g., dsDNA phages and herpesviruses), whose capsid remains intact after the genome is released into a cell through a portal opening in the capsid structure. In these viruses, it is the encapsidated DNA rather than the capsid itself that is metastable, and it can undergo a transition from a more ordered to a less ordered state. Atomic force microscopy (AFM) nano-indentation experiments have shown that these two states are separated by a mechanical transition of the intra-capsid genome from solid-like (ordered DNA) to fluid-like (disordered DNA). This provides a mechanical shift affecting genome mobility. This mechano-structural transition is influenced by temperature, ionic conditions and/or DNA packing density in the capsid, which either facilitate or inhibit genome ejection from a capsid into a host cell during viral replication.
By mapping both the energy (using ITC) and the structure (using solution small angle X-ray scattering, SAXS) of the packaged λ-genome, a remarkable paradigm of physical adaptation of viruses to the physiologic environment of their host was found. It was shown that the mobility of the packaged DNA in the λ-capsid is “switched on” to a fluid-like state, facilitating its release from the capsid, at the most favorable ionic and temperature (37 °C) conditions for E. coli infection. Correspondingly, under less favorable conditions for infection the intra-capsid DNA is in a solid-like state that inhibits its release and infection, preserving the infectious viral particle. Furthermore, the transition occurs only at the DNA density corresponding to the WT λ-DNA length (48.5 kbp) packaged in the phage λ capsid, as opposed to the shorter DNA lengths of phage λ mutants. ITC provides the most direct method to measure the internal energy of the confined viral genome. Using ITC, the enthalpy change (ΔH) associated with DNA ejection from phage λ was measured as the heat released when concentrated phage particles are titrated into a LamB receptor solution, which triggers DNA ejection in vitro. (LamB is an outer membrane maltoporin protein in Gram-negative bacteria which also serves as the phage λ receptor.) Since the total volume of the system does not change during the DNA ejection in the ITC measurement, and the pressure is constant, the internal energy and the enthalpy (provided by ITC) are approximately equal. The enthalpy change associated with DNA ejection from phage is $\Delta H_{ej}(T) = \Delta H(T)_{\text{DNA ejected}} - \Delta H(T)_{\text{DNA inside phage}}$. ΔHej(T) can be determined versus temperature. Fig. 2(A) reveals a discontinuity in the approximately linear dependence of ΔHej on temperature for WT DNA length (48.5 kbp) ejection from phage λ, occurring close to the physiologic temperature of infection and at the optimum ionic conditions for infection (10 mM MgCl2 Tris-buffer). The discontinuity demonstrates an abrupt transition that is attributed to the DNA


inside the capsid. Prior to the transition, the absolute value of the ejection enthalpy change |ΔHej(T)| shows a strong linear increase with increasing temperature. This increase in the internal energy indicates an increase in the stress of the confined genome as the temperature is raised. At the transition temperature, T*, the internal energy is reduced by almost half, suggesting partial relief of the stressed state. After the transition, |ΔHej(T)| shows only weak temperature dependence when the temperature is further increased to 42 °C, see Fig. 2(A). This observation demonstrates that the DNA inside the λ capsid can exist in two energy states. Thus, the critical genome stress is reached at temperature T* and is required for the structural transition to occur. There is a strong dependence between the internal genome stress and the structural DNA transition in phage λ. The DNA stress in the capsid is regulated both by the packaged DNA length and by the concentration and nature of the DNA counterions freely diffusing through the capsid wall of most viruses. For example, a lower DNA density in the capsid has larger spacings between packaged DNA helices, resulting in weaker repulsive interactions and lower genome stress. As a result, the critical genome stress required for the transition is not reached at DNA packing densities lower than the intra-capsid density corresponding to the WT λ-DNA length. Solution small angle X-ray scattering (SAXS) is a powerful technique that provides structural information about the encapsidated genome. In parallel with the increasing inter-strand repulsions, the increase in temperature will decrease the DNA persistence length, leading to less bending stress. If the bending stress decreases and if there is room to expand in the capsid, then DNA-DNA interstrand spacings would increase and the interaction energy would decrease.
However, with SAXS it was shown that the DNA-DNA spacing remains constant, suggesting that there is no room for DNA to expand in the λ-capsid. Instead, as a result of increasing repulsive interactions (due to packing defects) and decreasing bending stress with increasing temperature, the disordering DNA transition occurs, as confirmed by SAXS. This transition is suggested to take place closer to the center of the capsid, where DNA bending stress is stronger and packing defects are larger than for DNA closer to the capsid wall, making the DNA in the center more destabilized and therefore more sensitive to the increasing temperature. At the transition temperature, the DNA bending stress becomes sufficiently small, allowing a fraction of the ordered DNA layers closest to the capsid’s center to undergo a disordering transition. The disordered DNA will have a lower packing density than the ordered DNA, which maximizes DNA-DNA spacings and simultaneously reduces the repulsive interactions and DNA-DNA sliding friction. This yields an overall lower energy state of the encapsidated genome and increases DNA mobility in the center of the capsid, which is required for rapid DNA release during infection. DNA closer to the center of the capsid is likely to be ejected first since it is the last DNA portion to be packaged in the capsid during phage assembly, due to the dsDNA bending stress constraints. The schematic representation of this transition behavior versus temperature is shown in Fig. 2(B). While variation in the monovalent ion concentration has a small influence on intra-capsid DNA stress, polyvalent cations present in the bacterial cytoplasm, such as polyamines and Mg²⁺, have been shown to have a strong effect on the repulsive interactions between packaged DNA helices.
Since the free polyamine concentration in cells is very low, as most polyamines are bound to cellular DNA and RNA, there is a particular interest in the effect of the Mg-ion concentration on the DNA transition temperature in phage λ at concentrations similar to those of free Mg²⁺ in vivo. Mg-ions are essential both for cellular metabolism (e.g., enzyme activity, protein synthesis, preservation of ribosome and nucleic acid structures) and for the phage λ infectious cycle. It has been shown that a concentration of 10–20 mM of Mg²⁺ ions in the extracellular solution is critically important for optimum adsorption and infection of E. coli by phage λ. Mg-concentrations below or above 10–20 mM had a strong negative effect on the number of bacterial cells that could be infected. Interestingly, a similar Mg-concentration (~10 mM Mg²⁺) has also been shown to provide optimum conditions for DNA packaging in phage λ and also corresponds to the free Mg-concentration in E. coli cytoplasm. Since the Mg-concentration affects DNA stress in the capsid (by freely diffusing through the capsid walls and screening the negative charges in the packaged DNA), it also strongly influences the transition temperature of the encapsidated genome. While T* varies significantly with Mg²⁺ concentration, the most favorable Mg-concentration for phage λ infection of E. coli (~10–20 mM) triggers the DNA transition in the capsid precisely at the physiologic temperature of infection (~37 °C). This striking correlation shows that the intra-capsid DNA transition mechanism in λ is evolutionarily adapted to both the ionic environment and temperature of its host (phage λ replicates in E. coli in the human gut at 37 °C), suggesting its significance for viral replication. An analogous metastable DNA state inside the viral capsid is also present in HSV-1, where the solid-to-fluid-like DNA transition occurs close to 37 °C.
This suggests that the DNA transition mechanism regulating infectivity may be universal for many pressurized DNA viruses. These observations suggest that dynamics of DNA ejection can be affected by a transition in mobility of the encapsidated genome, which can also influence viral replication dynamics and course of infection. This is discussed in the next section.
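
A discontinuity of the kind described for Fig. 2(A) can be located numerically by fitting separate straight lines on either side of a candidate break and keeping the split with the smallest total squared error. The sketch below is a generic two-segment least-squares search written for illustration. It is not the analysis used in the work discussed here, and the function names are hypothetical. It expects distinct temperatures sorted in ascending order together with the corresponding ΔHej values.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, sum_of_squared_residuals). Requires at least
    two distinct x values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def find_breakpoint(temps, values, min_points=3):
    """Estimate the temperature at which a two-segment linear trend breaks.

    Tries every split that leaves at least `min_points` on each side and
    returns the midpoint between the two temperatures bracketing the best one.
    """
    best_sse, best_k = None, None
    for k in range(min_points, len(temps) - min_points + 1):
        sse_low = fit_line(temps[:k], values[:k])[2]
        sse_high = fit_line(temps[k:], values[k:])[2]
        total = sse_low + sse_high
        if best_sse is None or total < best_sse:
            best_sse, best_k = total, k
    if best_k is None:
        raise ValueError("need at least 2 * min_points data points")
    return 0.5 * (temps[best_k - 1] + temps[best_k])
```

Calling `find_breakpoint(temperatures, enthalpies)` on a measured series returns an estimate of T*; its resolution is limited by the temperature spacing of the data.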

The Mobility of Packaged Genome Controls Ejection Dynamics

Gaining insight into the dynamics of viral gene delivery in a host cell during infection is critical for understanding the virus replication dynamics and viral fitness. We recently found a direct link between the structure (and its associated mobility) of the encapsidated dsDNA in phage λ and the dynamics of initiation of its release from the capsid. Time-resolved ITC measurements provide ultra-sensitive detection of ejection dynamics from a phage population triggered by instant mixing with LamB λ-receptor in vitro, revealing the effect of the mechano-structural DNA transition in a viral capsid on DNA ejection dynamics from a phage λ population. ITC is mainly used for thermodynamic analysis. However, it was recently demonstrated that it can be successfully used to measure reaction kinetics reflected by heat flow (so-called kinITC) due to the very short response time of modern ITC instruments.


It was shown that the solid-to-fluid-like intra-capsid DNA transition is a determining factor for either rapid synchronized or slow desynchronized ejection dynamics from a phage λ population. At the optimum temperature for infection (i.e., ~37 °C) and Mg-concentration ≤10 mM, the encapsidated DNA is in a fluid-like state and ejection events from the majority of the phage population are synchronized, with DNA ejections occurring within ~10 s directly after phage adsorption to E. coli receptors (this time corresponds to the DNA translocation time from a capsid). However, deviations from these conditions for infection (e.g., low temperature, high Mg-concentration) result in intra-capsid DNA being in a solid-like state, where DNA is arrested in different conformations due to high interstrand sliding friction. This leads to a striking heterogeneity in phage ejection dynamics, displaying coexistence of populations with synchronized ejection events (where DNA is in a fluid-like state) and desynchronized, stochastically occurring ejection events (where DNA is in a solid-like state). The latter presents one to two orders-of-magnitude slower ejection dynamics, ranging from minutes to tens of minutes depending on salt conditions and temperature, compared to the DNA translocation time from a capsid. This in vitro ejection dynamics behavior is in good agreement with earlier observations of the effect of temperature on DNA injection dynamics from phage λ after preadsorption to E. coli culture. These findings also dismiss an earlier claim that the stochasticity of DNA ejections from phage is solely controlled by capsid portal opening dynamics. That claim was based on earlier light scattering measurements of phage ejection dynamics that could not distinguish between synchronized and desynchronized ejection events. The lytic-lysogenic decision is one of the most crucial factors determining virus population dynamics.
A lytic course of infection is one in which a virus replicates immediately, leading to cell lysis and progeny phage release, while lysogenic infection is a latent state in which phage DNA is integrated into a cell’s chromosome without killing the cell. There are three early viral genes (CI, Cro and CII) which have been previously found to be involved in the decision-making process. CII appears to provide positive and negative feedback in the gene regulatory network. High levels of CII promote production of CI, which leads to lysogeny, while low levels of CII promote production of Cro, leading to bacterial cell lysis. It was recently proposed that the lysogeny-lytic cell decision initially occurs at the phage level through the timing of DNA delivery from multiple phages into a cell during infection. The timing of DNA injection from phage is what determines the onset of viral genome replication and gene expression, because both phage DNA injection and DNA replication processes occur on comparable timescales. In turn, the replication and expression dynamics of injected phage genomes determine how phages interact with one another when multiple phage particles are infecting a single cell. It was shown that lysogeny is more likely to occur when infecting phages interact cooperatively during cell infection. Synchronized injections result in the immediate presence of DNA from multiple phages in a cell that can be integrated into the cell’s chromosome without competition for the progeny. Lysogeny results in the absence of selection, which allows various mutants to integrate and persist in the lysogenic state, leading to diversification of phage genes and adaptation to poor or new growth conditions. On the contrary, when DNA injections are desynchronized, a few phage particles deliver their genes faster than the rest of the phage population, resulting in competitive interactions.
If the first phage to deliver its genome is lytic, it will override lysogeny, rapidly replicate itself using the cell’s limited resources and therefore take a larger share of phage progeny. Asynchronous infection delays also decrease the chance of lysogeny by lowering the levels of the viral transcription factor CII in the cell. It was shown that the cutoff time, beyond which delayed phage infections of one cell no longer affect the lytic-lysogenic cell decision, varies from a few minutes to tens of minutes depending on the multiplicity of infection, MOI (higher MOI results in shorter cutoff time), comparable to the timescales for overall ejection delays observed in ITC measurements. This suggests that ejection delays resulting from a solid-like DNA state in the capsid will increase cell lysis probability. These in vitro findings suggest a new paradigm affecting virus population dynamics: mechano-signaling mediated by packaged DNA structure in phage λ. The environmental factors regulating the mechanical transition between solid- and fluid-like intracapsid DNA states strongly influence the timing of genome ejections. In vivo, desynchronized ejections from a phage population lead to competition between phage genomes, affecting the rate of phage gene replication and transcription as well as the lysogeny-lytic cell decision. Given the similarities between phage λ and herpesviruses in their mechanism of pressure-driven DNA ejection (where a temperature-induced solid-to-fluid-like DNA transition was observed in both viral systems), the new insights into the lysogeny-lytic decision process for phage provide intriguing prospects for understanding the latency mechanism in herpesviruses. Indeed, analogies between phage and herpesviruses in other regulatory factors that affect viral latency have been previously observed. Many viruses have been shown to have pressurized genomes inside their capsid.
However, in order to place this important observation in the biological context of viral infection, it is critical to demonstrate the role that capsid pressure plays in viral genome delivery into a cell. The next section describes the definitive demonstration of a pressure-dependent infection mechanism in viruses by showing that pressure-driven DNA ejection from HSV-1 capsids into a host cell nucleus is inhibited when intra-capsid DNA pressure is “turned off”.

Pressure-Driven Release of Viral Genome Into a Host Cell is a Mechanism Leading to Infection

All eukaryotic DNA viruses, with the exception of poxviruses, deliver and replicate their genomes within the nucleus. In addition, retroviruses transport their genomes across nuclear membranes in order to replicate after they have reverse-transcribed their single-stranded (ss) RNA to dsDNA. The mechanism of this most significant step of infection, viral DNA entry into the host nucleus, remains poorly understood for the majority of viruses. Despite previous measurements of intra-capsid DNA pressure in phage and herpesvirus, the role that capsid pressure plays in intracellular viral genome delivery has not been previously demonstrated. The reconstituted capsid-nuclei experiments below provide a demonstration of a nuclear entry mechanism of DNA from HSV-1, driven by high mechanical pressure of the encapsidated viral genome. Herpesviruses consist of a dsDNA genome packaged within an icosahedral capsid that is surrounded by an unstructured protein layer, the tegument, and a lipid envelope. Fig. 3 illustrates the HSV-1 infection process as observed by ultrathin-sectioning transmission electron microscopy (TEM). After binding at the outer membrane (Fig. 3(a)), viruses enter the cell cytoplasm and are transported toward the nucleus (Fig. 3(b)). The viral capsid ejects its genome upon docking to a nuclear pore complex (NPC), which forms a passageway for molecular traffic into the nucleus (Fig. 3(c)). To investigate the effect of capsid pressure on the single step of viral DNA ejection into a nucleus, cell nuclei were isolated and reconstituted with cytosol supplemented with an ATP-regeneration system. This reconstituted-nuclei assay avoids interference from other processes occurring within the cell during viral replication. The reconstituted nuclei system also accurately reproduces the capsid-nuclei binding and nuclear transport of the herpes genome that occur in living cells; it was previously shown that purified HSV-1 capsids bind to NPCs on isolated nuclei and eject their DNA into the nuclei. To provide evidence that intra-capsid DNA pressure is responsible for DNA release from a herpesvirus capsid into a cell nucleus, it was demonstrated that when the capsid pressure is “turned off” by the addition of an external osmolyte, herpes capsids bound to NPCs do not eject DNA into a nucleus, whereas ejection is completed successfully without osmolyte addition. The external osmotic pressure in the solution surrounding the capsids effectively eliminates genome pressure in the capsid when it matches the pressure of the packaged DNA. As shown in Fig. 4, the viral genome ejection from HSV-1 through the NPCs into a cell nucleus can be completely suppressed in the presence of the biologically inert osmolyte polyethylene glycol (PEG) 8 kDa at 30% w/w, corresponding to 20 atm of osmotic pressure, which matches the DNA pressure in the capsid. While the term “infection” usually refers to both viral genome transport into the cell and subsequent replication of the virus, the primary infection by several types of herpesviruses (including HSV-1) is latent (i.e., the herpes genome is translocated into the host nucleus, without subsequent genome replication). Thus, the osmotic suppression assay, combined with the reconstituted nucleus system, provides a demonstration of a pressure-dependent mechanism of herpesvirus infection focused on the viral genome translocation step.

Fig. 3 After binding at the outer membrane (Fig. 3(a)), viruses enter the cell cytoplasm and are transported toward the nucleus (Fig. 3(b)). The viral capsid ejects its genome upon docking to a nuclear pore complex (Fig. 3(c)). From Bauer, D.W., Huffman, J.B., Homa, F.L., Evilevitch, A., 2013. Herpes virus genome, the pressure is on. Journal of the American Chemical Society 135 (30), 11216–11221. doi:10.1021/ja404008r.
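
The pressure balance underlying the osmotic suppression assay can be sketched numerically. The short Python example below is only an illustration under simplifying assumptions: it treats ejection as an all-or-nothing outcome decided by the sign of the net pressure (packaged-DNA pressure minus external osmotic pressure), and it takes the osmotic pressure of the osmolyte-free solution to be zero. The function names are hypothetical; the 20 atm value is the one quoted above for HSV-1 and for 30% w/w PEG 8 kDa.

```python
ATM_TO_KPA = 101.325  # 1 atm = 101.325 kPa


def net_driving_pressure(dna_pressure_atm: float, osmotic_pressure_atm: float) -> float:
    """Net outward pressure on the packaged genome, in atm (hypothetical helper)."""
    return dna_pressure_atm - osmotic_pressure_atm


def ejection_expected(dna_pressure_atm: float, osmotic_pressure_atm: float) -> bool:
    """Simplified all-or-nothing rule: ejection proceeds only if the net pressure is positive."""
    return net_driving_pressure(dna_pressure_atm, osmotic_pressure_atm) > 0.0


HSV1_DNA_PRESSURE_ATM = 20.0     # capsid DNA pressure quoted in the text
PEG_OSMOTIC_PRESSURE_ATM = 20.0  # 30% w/w PEG 8 kDa, as quoted in the text
NO_OSMOLYTE_ATM = 0.0            # assumption: negligible external osmotic pressure without PEG

for label, external in [("no PEG", NO_OSMOLYTE_ATM),
                        ("30% PEG 8 kDa", PEG_OSMOTIC_PRESSURE_ATM)]:
    net = net_driving_pressure(HSV1_DNA_PRESSURE_ATM, external)
    print(f"{label}: net pressure {net:.1f} atm ({net * ATM_TO_KPA:.0f} kPa), "
          f"ejection expected: {ejection_expected(HSV1_DNA_PRESSURE_ATM, external)}")
```

In this toy picture the external osmolyte does not act on the capsid or the NPC; it simply removes the net driving pressure, which is the interpretation given for the suppression seen in Fig. 4.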

Fig. 4 Ultrathin-sectioning EM visualization of the osmotic suppression of DNA ejection from HSV-1 capsids into reconstituted cell nuclei (isolated cell nuclei with added cytosol and an ATP-regeneration system; the cytosol contains importin-β, required for efficient HSV-1 capsid binding to NPCs). The negative control at 4 °C, without added PEG and without the ATP-regenerating system, shows that no ejection from nuclei-bound C-capsids occurs. The positive control at 37 °C shows complete DNA ejection from C-capsids bound to isolated cell nuclei supplemented with cytosol and the ATP-regenerating system. EM shows that the addition of 30% PEG 8 kDa (corresponding to 20 atm of osmotic pressure) to the reconstituted capsid-nuclei system inhibits DNA ejection from HSV-1 C-capsids into host nuclei through the NPC. Thin arrows show DNA-filled capsids, and bold arrows show empty capsids that ejected DNA. 1. Bar 500 nm. 2. Bar 90 nm. Modified from Brandariz-Nunez, A., Liu, T., Du, T., Evilevitch, A., 2019. Pressure-driven release of viral genome into a host nucleus is a mechanism leading to herpes infection. eLife 8. doi:10.7554/eLife.47212.001.

Concluding Remarks

Correlating structure and energetics of the encapsidated genome with the efficiency of viral replication provides a unique approach to explore the connection between physical and genetic aspects of viral evolution, and creates novel opportunities for control of viral infections. Various aspects of genetic evolution have been investigated in the field of virology. The physical aspects of viral evolution, however, are less understood. Recently, an intimate relationship was found between the physical and genetic evolution of dsDNA viruses. Specifically, it was shown that the physical limit of DNA length imposed by the capsid volume has led to gene overlap evolving as a mechanism for producing more proteins from the same genome length. This demonstrates how a genetic mechanism has evolved from a physical constraint imposed by the capsid on the packaged genome density. In line with this finding, the data reviewed in this article suggest that the genome packing density in a viral capsid is unique and highly conserved. This precise packing density not only determines the number of genes required for replication, but also creates an internal stress in the capsid required for the solid-to-fluid-like DNA transition to occur at the temperature of infection. Such an intracapsid DNA transition facilitates initiation of DNA delivery from the viral capsid into a cell. For example, there is evidence that the packaged genome density in phage λ corresponding to the WT λ-DNA length (as opposed to the shorter λ-DNA length mutants) presents the most energetically and structurally optimized balance between genome mobility and its internal stress, both of which are required for efficient DNA ejection from the capsid. The DNA transition occurs once the critical DNA stress level in the capsid is reached, which is in turn controlled by the temperature and the ionic conditions of the surrounding solution. The DNA transition in the capsid occurs precisely at the physiologic temperature of infection (37 °C) and at the most favorable external Mg2+ concentration for phage λ adsorption to E. coli and subsequent replication. This suggests an evolutionary adaptation of the intracapsid DNA transition mechanism to the temperature and ionic environment of its host. The remarkable observation of mechano-regulation of phage genome ejection dynamics can be compared to gene expression regulation in meters-long, tightly packaged chromosomal DNA in eukaryotic cells, where gene expression is controlled not only by genetic regulation but also by mechanical properties of DNA in nucleosomes. Environmental stress factors, such as an abrupt change in temperature or in ionic conditions, were demonstrated to play a major role in the mechano-regulated gene activation process, turning certain genes on and off, in a process termed epigenetics. Similarly, temperature and ionic conditions control the mechanical properties of virally encapsidated DNA and act as an on-off switch between synchronized and desynchronized ejection dynamics. This can influence the timing of viral replication and the lytic-lysogenic cell decision process. Analogously to DNA-pressurized phage capsids, high intra-capsid DNA packing density resulting in tens of atmospheres of pressure is a distinctive trait of all nine human herpesviruses. This strongly suggests that pressure-driven entry of viral DNA into the host nucleus during infection is universal to all herpesviruses. Other types of viruses also involve replication steps dependent on the pressurized state of the intra-capsid genome. For instance, during genome packaging, reoviruses replicate ssRNA to dsRNA inside the capsid, which results in genome packaging densities similar to that of herpesviruses. Such intra-capsid replication could be impacted, at least in part, by generation of internal pressure resulting from the increasing genome packaging density as newly synthesized dsRNA continues to fill the internal capsid volume. Another example is HIV: similar to herpesviruses, HIV capsids dock to NPCs at the nucleus and release reverse-transcribed dsDNA through the NPC channel. It was recently shown that the reverse transcription process from ssRNA to dsDNA inside the HIV capsid is associated with increasing internal DNA pressure. The experiments summarized in this article provide the necessary tools for exploration of pressure-regulated replication in these viruses.


Further Reading

Bauer, D.W., Huffman, J.B., Homa, F.L., Evilevitch, A., 2013. Herpes virus genome, the pressure is on. Journal of the American Chemical Society 135 (30), 11216–11221. doi:10.1021/ja404008r.
Berndsen, Z.T., Keller, N., Grimes, S., Jardine, P.J., Smith, D.E., 2014. Nonequilibrium dynamics and ultraslow relaxation of confined DNA during viral packaging. Proceedings of the National Academy of Sciences of the United States of America 111 (23), 8345–8350. doi:10.1073/pnas.1405109111.
Booy, F.P., Newcomb, W.W., Trus, B.L., et al., 1991. Liquid-crystalline, phage-like packing of encapsidated DNA in herpes simplex virus. Cell 64 (5), 1007–1015. doi:10.1016/0092-8674(91)90324-R.
Brandariz-Nunez, A., Liu, T., Du, T., Evilevitch, A., 2019. Pressure-driven release of viral genome into a host nucleus is a mechanism leading to herpes infection. eLife 8. doi:10.7554/eLife.47212.001.
Catalano, C.E., 2005. Viral Genome Packaging Machines: Genetics, Structure, and Mechanism. New York: Bioscience/Eurekah.com.
Earnshaw, W.C., Harrison, S.C., 1977. DNA arrangement in isometric phage heads. Nature 268, 598–602.
Evilevitch, A., 2018. The mobility of packaged phage genome controls ejection dynamics. eLife 7. doi:10.7554/eLife.37345.002.
Evilevitch, A., Lavelle, L., Knobler, C.M., Raspaud, E., Gelbart, W.M., 2003. Osmotic pressure inhibition of DNA ejection from phage. Proceedings of the National Academy of Sciences of the United States of America 100 (16), 9292–9295. doi:10.1073/pnas.1233721100.
Goldhill, D.H., Turner, P.E., 2014. The evolution of life history trade-offs in viruses. Current Opinion in Virology 8, 79–84. doi:10.1016/j.coviro.2014.07.005.
Hanhijarvi, K.J., Ziedaite, G., Pietila, M.K., Haeggstrom, E., Bamford, D.H., 2013. DNA ejection from an archaeal virus – A single-molecule approach. Biophysical Journal 104 (10), 2264–2272. doi:10.1016/j.bpj.2013.03.061.
Kindt, J., Tzlil, S., Ben-Shaul, A., Gelbart, W.M., 2001. DNA packaging and ejection forces in bacteriophage. Proceedings of the National Academy of Sciences of the United States of America 98 (24), 13671–13674.
Lander, G.C., Evilevitch, A., Jeembaeva, M., et al., 2008. Bacteriophage lambda stabilization by auxiliary protein gpD: Timing, location, and mechanism of attachment determined by cryo-EM. Structure 16 (9), 1399–1406. doi:10.1016/j.str.2008.05.016.
Leikin, S., Parsegian, V.A., Rau, D.C., Rand, R.P., 1993. Hydration forces. Annual Review of Physical Chemistry 44, 369–395. doi:10.1146/annurev.pc.44.100193.002101.
Liu, T., Sae-Ueng, U., Li, D., et al., 2014. Solid-to-fluid-like DNA transition in viruses facilitates infection. Proceedings of the National Academy of Sciences of the United States of America 111 (41), 14675–14680. doi:10.1073/pnas.1321637111.
Mackay, D.J., Bode, V.C., 1976. Events in lambda injection between phage adsorption and DNA entry. Virology 72 (1), 154–166.
Petrov, A.S., Harvey, S.C., 2008. Packaging double-helical DNA into viral capsids: Structures, forces, and energetics. Biophysical Journal 95 (2), 497–502.
Purohit, P.K., Kondev, J., Phillips, R., 2003. Mechanics of DNA packaging in viruses. Proceedings of the National Academy of Sciences of the United States of America 100 (6), 3173–3178.
Riemer, S.C., Bloomfield, V.A., 1978. Packaging of DNA in bacteriophage heads: Some considerations on energetics. Biopolymers 17 (3), 785–794.
Smith, D.E., Tans, S.J., Smith, S.B., et al., 2001. The bacteriophage φ29 portal motor can package DNA against a large internal force. Nature 413 (6857), 748–752.
Zeng, L., Skinner, S.O., Zong, C., et al., 2010. Decision making at a subcellular level determines the outcome of bacteriophage infection. Cell 141 (4), 682–691. doi:10.1016/j.cell.2010.03.034.

Bacteriophage Receptor Proteins of Gram-Negative Bacteria
Sarah M Doore, Kristin N Parent, Sundharraman Subramanian, and Jason R Schrad, Michigan State University, East Lansing, MI, United States
Natalia B Hubbs, Hanover College, Hanover, IN, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Assay An experiment used to assess or measure a property of interest.
Genome The entirety of genetic material present in a cell or organism.
LPS Lipopolysaccharide; a molecule comprised of fatty acid and repeating sugar units that decorates the outer membrane of Gram-negative bacteria and varies widely between species and genera.
Omp Outer membrane protein; a protein found in the outermost membrane of Gram-negative bacteria.
Tail Phage structure comprised of several proteins through which many bacteriophages identify and bind to their host; the tail is bound to the phage capsid, which carries genomic information.
Transposon A sequence of DNA that can insert itself into or change its location in the genome.

History and Overview

Shortly after the discovery of bacteriophages, scientists became interested in multiple aspects of their infection cycle. However, the initial stages of infection – host recognition and attachment – received relatively little attention until three decades later in 1951, when Garen and Puck described two distinct steps during bacteriophage T2 attachment to its host Escherichia coli. This two-step process has since been described for many well-characterized bacteriophages. The first step is reversible, meaning the phage can unbind and has not committed to infection. The second step is irreversible and the particle commits to infecting the host. Subsequent experiments determined that the second step involves proteins in the bacterial membrane, but the identity of these proteins was not determined until the 1970s. Since those initial studies, our knowledge of phage receptors has grown exponentially. We now know that for many Gram-negative bacteria and their associated phages, the primary receptor is typically the lipopolysaccharide (LPS), capsular layer, flagellum, or pilus. At this stage, if the phage has attached to an incompatible host cell, it is still able to detach and search again for a suitable host. If the host is compatible, the phage reaches its secondary receptor by either degrading the LPS or the capsular layer, or by moving closer to the cell via rotary action of the host flagellum or retraction of the pilus. If the secondary receptor is compatible, the particle binds irreversibly to this host protein and its genetic material is ejected into the host (Fig. 1). Developing better techniques, as discussed later in this article, has been critical to our understanding of membrane proteins and how bacteriophages interact with them. While bacteriophages as a whole employ vastly different strategies for primary receptor recognition, secondary receptors tend to fall into a few main classes of membrane proteins.
This article will focus on secondary receptors by covering a few common techniques and methods to study membrane proteins, the characteristics of outer membrane proteins (Omps), and some of the known interactions between phages and host Omps. Within this context, the ecology and evolution of these interactions will be discussed, along with the importance of studying these interactions for use in phage applications.
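
The two-step attachment scheme described above (reversible binding followed by irreversible commitment) can be written as a minimal kinetic model. The Python sketch below is an illustration, not a model taken from the studies discussed in this article: it assumes host cells are in large excess, so that each step is first order, and the rate constants k_on, k_off, and k_irr are hypothetical parameters in arbitrary units. It tracks the fractions of free, reversibly bound, and irreversibly bound phage with a simple forward-Euler integration.

```python
def simulate_two_step_adsorption(k_on, k_off, k_irr, t_end, dt=0.001):
    """Integrate free <-> reversibly bound -> irreversibly bound phage fractions.

    All rate constants are hypothetical first-order rates (per unit time).
    Returns the (free, reversible, irreversible) fractions at time t_end.
    """
    free, reversible, irreversible = 1.0, 0.0, 0.0  # all phage start unbound
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Step 1 (reversible): binding to and release from the primary receptor.
        d_free = (-k_on * free + k_off * reversible) * dt
        # Step 2 (irreversible): commitment at the secondary receptor.
        d_irr = k_irr * reversible * dt
        # Whatever leaves the free and committed pools changes the reversible pool.
        d_rev = -(d_free + d_irr)
        free += d_free
        reversible += d_rev
        irreversible += d_irr
    return free, reversible, irreversible


# Example run with arbitrary, illustrative rate constants.
free, reversible, irreversible = simulate_two_step_adsorption(
    k_on=1.0, k_off=0.5, k_irr=0.2, t_end=10.0
)
print(f"free={free:.3f} reversible={reversible:.3f} irreversible={irreversible:.3f}")
```

Setting k_irr to zero in this sketch mimics attachment to an incompatible host: phage still bind and unbind, but none become committed.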

Fig. 1 Representation of tailed bacteriophages interacting with the outer membrane of a Gram-negative bacterium. Membrane proteins are colored using the scheme in Fig. 2. Representative myovirus (purple), siphovirus (orange), and podoviruses (blue) are shown interacting with LPS and/or Omps via their tail proteins.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20952-7


Techniques and Methods for Studying Phage Receptors

Since the initial studies from the 1950s, scientists have developed dozens of methods for identifying and studying the membrane proteins that bacteriophages use as secondary receptors. Some of these have been used for decades, including forward genetic screens and gel electrophoresis. Others are still being developed, such as single-gene deletion libraries and methods using advanced microscopy. Table 1 presents a summary of these methods along with strengths and weaknesses. A more in-depth discussion of each is presented at the end of this article. Once the receptor is identified from genetic or protein screens, and once its structure and/or function has been determined by atomic or low-resolution techniques, scientists can begin to visualize how the phage interacts with its receptor in the context of the native host. This question can be investigated by in vivo approaches such as time-lapse fluorescence microscopy and cryo-electron tomography (cryo-ET). These techniques visualize an entire population of phages interacting with their hosts or intact membranes. Time-lapse fluorescence microscopy uses fluorescent markers, which can selectively dye either the host or the phage. A series of images taken during the course of the infection is then stitched together to form a movie of the infection process. This method can show the timing of host binding and genome ejection, as well as the distribution of entry sites on hosts. Another direct visualization technique is cryo-ET. Similar to cryo-electron microscopy (cryo-EM), this involves flash-freezing populations of cells being infected with phage in vitreous ice and imaging them in the transmission electron microscope. Host mutants called “mini cells” or outer membrane vesicles are useful replacements for bona fide cells, as these limit the thickness of the specimen. The sample is then tilted to capture images from many different angles, and image reconstruction techniques are employed to create a more complete 3D picture. With cryo-ET, image reconstruction makes it possible to visualize the whole volume of the sample. From cryo-ET experiments, researchers can see details about the global structure of the phage tails and channels into the host membrane. Since flash freezing stops biological processes within milliseconds, this technique can capture phages at various stages of infection, letting scientists see the sequence of events more clearly.

Properties of Receptor Proteins

Through both classical and modern techniques, scientists have identified many common properties of phage receptor proteins of Gram-negative bacteria. Numerous well-studied phages infect intestinal – or enteric – bacteria. These include species of Escherichia, Klebsiella, Salmonella, Shigella, and Yersinia. The Omps used by many phages infecting these species are listed in Table 2. The vast majority of proteins found in the outer membrane are β-barrels, which are proteins rich in β-sheets that twist to form a barrel shape. These barrels are then embedded in the membrane, oriented as channels through which molecules can move into or out of a cell. Many of these proteins have α-helical or disordered loops connecting the β-sheets. Additionally, the majority have larger extracellular domains than periplasmic, or between-membrane, domains. Some of these proteins are involved in the transport of small molecules or nutrients, while others are able to sense membrane stress or contribute to membrane biogenesis. Many have unknown functions. Regardless of their role in bacteria, these proteins can be co-opted by bacteriophages to identify and gain access to their host. Representative structures from this list are shown in Fig. 2.

Bacteriophages use their aptly-named receptor binding proteins to bind their specific host receptor. These receptor binding proteins, which are located on the surface of the phage particle, can recognize a suitable host and trigger a conformational change, resulting in the phage genome moving across the membrane. Many bacteriophages have specialized protein machinery called tails to accomplish this, with receptor binding proteins located on tail fibers or tail spikes. Non-tailed phages can identify their host via spikes directly attached to the capsid. Tailed phages, which are classified according to morphology, use different mechanisms of host binding and genome ejection depending on the tail. The three most common families of bacteriophage are the Myoviridae (long, contractile tails), Siphoviridae (long, non-contractile tails), and Podoviridae (short tails). Several phage-host pairs have been studied in detail to determine the interactions between the host receptor and the phage’s receptor-binding proteins. Some of these include the siphovirus lambda (λ) and its receptor LamB; siphovirus T5 and its receptor FhuA; and the podovirus Sf6 and its receptors OmpA and OmpC. These model systems have given us a wealth of information about genetics, protein-protein interactions, and predator-prey dynamics.

Bacteriophage λ and its receptor LamB are a classic system for studying phage-host interactions. The LamB outer membrane protein forms a homotrimer consisting of three β-barrels and functions as a maltose and maltodextrin transporter. The initial model of λ and LamB binding was based on the isolation and characterization of λ-resistant bacteria by Thirion and Hofnung in 1972. These researchers observed that the bacteria were also unable to use maltose as a carbon source, but the exact mechanism for this was unknown. The following year, Randall-Hazelbauer and Schwartz fractionated bacteria, separating the membranes and other components of the bacterial cell via ultracentrifugation. The end result is “fractions” of the sample that each contain a specific biomolecule: only membranes, only DNA, only ribosomes, etc. By doing this, they learned where this “λ-resistance factor” was located in the cell: the outer membrane. Once the specific protein was known, researchers could focus on a more in-depth characterization. Initially, they used directed genetic screens to determine which mutations affected either antibody binding or protease digestion of LamB. This information told researchers which regions of the protein were likely exposed to the extracellular environment and available for antibody binding or protease digestion.
Table 1 Techniques for studying bacteriophage receptor proteins

Techniques for receptor identification

Genetics
Resistance mutants
Pros: Fast (<1 day); simple and inexpensive; can process large populations (>10^8 cells) in a single experiment
Cons: Mutants can arise from many steps in life cycle, not just receptors; resistant host may have multiple mutations complicating interpretation; binary results only: resistant or susceptible
Single gene deletion library
Pros: Systematic: checks all non-essential genes; “clean”: isogenic populations and a single, sequenced mutation per host
Cons: Redundant results if multiple genes contribute to a pathway; genetic tools are not available for all hosts; expensive and time consuming to generate
Transposon libraries
Pros: Libraries can be generated from a single starting population; non-binary: can also identify genes that delay and accelerate infection
Cons: Multiple genes contributing to the same pathway can make interpretation difficult

Protein-based
Binding assays, 2D gels, chemical crosslinking, mass spectrometry
Pros: Detects direct binding; narrows down a few possible candidates instead of screening an entire library
Cons: Non-specific host components can co-purify; receptor/phage interactions may be too transient to effectively capture; the receptor might be present but not detected

Techniques for receptor characterization

Atomic/near atomic resolution
Crystallography, nuclear magnetic resonance (NMR), cryo-electron microscopy (cryo-EM)
Pros: Provides detailed information about the receptor, including the position of all important atoms (amino acids or chemical bonds); shows regions that are flexible
Cons: Requires purification: not possible or practical for all cell components; not all substances readily crystallize; size may be a limiting factor: either too large for NMR or too small for cryo-EM

Lower resolution
Oligomeric state: gel shift, analytical ultracentrifugation, native mass spectrometry; optical: circular dichroism, fluorescence
Pros: Provides basic information about the overall degree of folding at the level of tertiary or secondary structure; can be used to determine the number of copies of a protein within a complex; can be used to assay how dynamic or stable a receptor is
Cons: Resolution is too low for specific details; membrane proteins often are purified in detergents, which may interfere with some of these methods; requires protein purification and extensive optimization

Techniques for characterization of phage-host interactions

Interaction location
Site directed mutagenesis, experimental evolution
Pros: Identifies specific amino acids or regions of a protein that contribute to binding
Cons: Some structural information is needed to guide the experiments; individual changes may result in subtle phenotypes; requires genetic tools such as engineering or sequencing approaches; directed approaches may be limited if the structure of the receptor is not known
Chemical crosslinking
Pros: Identifies specific amino acids or regions of a protein that contribute to binding
Cons: Non-specific crosslinking often occurs

Interaction dynamics
Time-lapse fluorescence microscopy, cryo-electron tomography (cryo-ET)
Pros: Native conditions; can directly visualize many distinct infection events
Cons: Expensive and time consuming; whole cells are often too thick for cryo-ET, so mini cells or outer membrane vesicles are needed (less-native)
Computation
Pros: Looks at very complex landscapes including lipids, proteins, LPS, etc.; accesses conditions not easily achieved in laboratory settings; easily repeatable
Cons: Currently limited to small time scales; some structural information is needed to guide the experiments
In vitro kinetics
Pros: Can be used to determine strength of binding interactions; rates of binding can be determined; if the receptor has enzymatic activity this can be studied as well
Cons: Requires protein purification; not all receptors remain active when purified

Mutations in LamB that made the bacteria resistant to λ could then be compared to these putative extracellular sites to see where the phage might be binding. Most of these resistant bacteria contained mutations that mapped to the disordered protein strands (loops) 4, 5, 6, and 8 of LamB. Although a majority of these mutations affected simple binding of λ to host, a single amino acid immediately adjacent to loop 8 was also identified as critical for infection. This mutation, which resulted in glycine 401 being changed to aspartate, allowed the phage to bind but prevented genome ejection. To examine how λ receptor binding proteins worked, researchers used a combination of methods similar to those described for LamB – examining antibody binding sites and genetic studies – as well as some early techniques in electron microscopy. They determined that λ binds to its host via protein J, which forms the needle-like tip of the tail. Images from the electron microscope showed gold-labeled J proteins interacting directly with the cell membrane of wild-type bacteria, but not with a mutant cell lacking lamB. Although λ was one of the earliest phages to be studied in detail, it is still highly relevant in modern techniques. These include dynamics experiments on λ “searching” for receptors using single-cell fluorescence microscopy, and extensive genetic screens made possible by the KEIO collection – a library of single gene deletion mutants including all non-essential genes in E. coli – to determine which genes are important for infection. Besides receptor interactions, this bacteriophage is also used extensively to investigate ecological and evolutionary questions.

After λ, one of the next rising stars in phage-host interactions was bacteriophage T5 and its receptor FhuA. From the late 1970s through the 1980s, several groups determined that the tail fiber, denoted pb1, binds to the O-antigen of LPS during the first step of attachment.
They also characterized two other proteins involved in host attachment and entry: pb5, which we now know binds specifically to FhuA; and pb2, which is involved in translocating the DNA into the host cell. The pb5 and FhuA binding interaction was first proposed by Heller and Bryniok in 1984. These researchers isolated a T5 mutant that infected wild-type cells more slowly than wild-type T5 and were completely unable to infect cells lacking the O-antigen: a specific arrangement of sugars on LPS that are often recognized by phages. Instead of a mutation in pb1, which they knew bound to the O-antigen, they mapped the mutation to a separate protein, pb5. They hypothesized that this mutant depended on pb1 sticking to O-antigen for a longer amount of time. This would increase the probability that the phage could bump into and bind to its protein receptor. Since the mutant pb5 could not bind as efficiently to the protein receptor, the phage relied on O-antigen to stick long enough to find its receptor. The researchers then treated cells with ferrichrome, which binds selectively to and inhibits FhuA transport of iron, and saw that this treatment caused the mutant phage to bind even less. From these experiments, they determined that FhuA, which the bacteria uses for iron transport, was also the protein receptor for phage T5. The role of pb5 as a receptor-binding protein was later confirmed by additional biochemical and structural experiments. Initially, this was investigated much like the phage λ and LamB interaction, using inhibitory peptides to determine which regions of the receptor were recognized by the phage. In 1995, Killmann et al. investigated the binding regions of FhuA-dependent phages T1, T5, and ϕ80. They determined that all three phages use the same large extracellular loop of FhuA, with some minor variation in the exact regions of binding. In 2001, Böhm et al. used cryo-EM to look at T5 bound to FhuA-containing vesicles. The following year, Plancon et al. 
purified pb5 and FhuA to look at their binding interaction in vitro. They performed a battery of assays to measure DNA ejection rates, the stoichiometry of binding, and the stability of the complex in different denaturants and at a range of temperatures. The combination of these studies has given us a much better picture of where, when, and how T5 recognizes this protein.

A number of bacteriophages can use more than one secondary receptor to infect their hosts, though these are less well studied than bacteriophages that rely on a single receptor because screening for phages that use more than one protein is complicated. In contrast to the systematic genetic approaches taken to identify receptors for λ and T5, the identification of the host receptors used by bacteriophage Sf6 was more serendipitous. When purified phage particles were analyzed by separating the proteins in a gel matrix, two mystery proteins appeared on the gel along with the expected structural proteins. These proteins were later identified by mass spectrometry as OmpA and OmpC by Parent et al. (2012). Since these proteins were caught in the act of associating with bacteriophage particles, the researchers tested this association further via in vitro genome ejection assays and mutational analyses. To confirm that OmpA was a true secondary receptor, all the components for genome ejection – LPS, OmpA, and phage – were purified individually,

Bacteriophage Receptor Proteins of Gram-Negative Bacteria

Table 2 Common bacteriophage receptors in Gram-negative bacteria

Outer membrane protein

Species

Bacteriophage(s)

OmpA

Escherichia coli

OmpC

Shigella flexneri Escherichia coli

TuII K4 Ox2 Ox4 M1 Sf6 T4 Me1 Hy2 434 PP01 Gifsy-1 S16 Sf6 GH-K3 T2 TP1 TP2 Yep-phi TG1 K10 λ Stx2ϕ-II T6 H8 K9 T5 E21 UC-1 mEp167 mEp410 mEp003 mEp043 mEp390 mEp506 ES18 fd M13 BF23 K6 K11 Ac4 SPC35 SPN9C SPN12C SPN17T BSP22a T2 Stx2ϕ-II H8 H8 TLS ST27 ST35 TC45 Yep-phi N4 vB_EcoS_IME347

Salmonella enterica

OmpF

Shigella flexneri Klebsiella pneumoniae Escherichia coli

LamB

Yersinia pestis Yersinia enterocolitica Escherichia coli

Tsx

Escherichia coli

FhuA (TonA)

Escherichia coli

TolA

Salmonella enterica Escherichia coli

BtuB

Escherichia coli

Salmonella enterica

FadL

Escherichia coli

FepA

Escherichia coli Salmonella enterica Escherichia coli Salmonella enterica

TolC

PhoE Ail NfrA, NfrB YncD


Escherichia coli Yersinia pestis Escherichia coli Escherichia coli

K3 K5 Ox3 Ox5 Ac3 TuIb PA2 SS4 SS1 TP2 Gifsy-2

TuIa K20

ϕR1-RT TP1 SS1 H3 Ox1 D T1 ϕ80 mEp237 mEp213 mEp023 mEp174 mEp416 HK022 f1 E15 K8 M3 EPS7 SPN7C SPN10H SPN14 SPN18 EPS7 Stx2ϕ-I

ST29 TC23

Note: Modified from Hubbs, N.B., 2017. Hijacking the Cell: How Bacteriophage Sf6 uses Shigella flexneri Outer Membrane Proteins for Infection. Michigan State University. doi:10.25335/M5J19S.


Fig. 2 Structures of representative outer membrane proteins, with extracellular domains shown in blue, transmembrane domains in red, and periplasmic regions in gray. Structures were obtained from the Protein Data Bank (PDB) with accession numbers as follows: OmpA (1BXW), Ail (3QRA), OmpX (1QJ9), OmpT (1I78), Tsx (1TLY), FadL (1T16), BtuB (1NQE), FepA (1FEP), FhuA (1QFG), OmpF (3POX), OmpC (3UPG), LamB (1AF6), PhoE (1PHO).

mixed in a test tube, and incubated. No genome ejection was observed in tubes with only phage, with phage and LPS, or with phage and OmpA alone; however, when a tube containing all three components was incubated, the phage genome was ejected. This indicates that LPS and OmpA together are necessary and sufficient to trigger genome ejection. In vivo, Sf6 plating efficiency was tested on wild-type cells, cells lacking either OmpA or OmpC, or cells lacking both OmpA and OmpC. Parent et al. determined that cell survival was much higher for the double ompA-/ompC- mutant than for either single mutant or the wild-type bacteria. This suggested that Sf6 can use either of two receptors to infect its host. In addition to plating experiments, cryo-EM studies showed that Sf6 can still bind to outer membrane vesicles produced from cells lacking either OmpA or OmpC but not from cells lacking both. Using plaque assays, they then screened for mutagenized ompA clones that would not restore plaque formation and determined that for Sf6 to recognize and bind to OmpA, the critical regions were extracellular loops extending from the β-barrel. Cells lacking both of these receptors are not fully immune to Sf6 and instead show a 10-fold reduction in plating efficiency, so Sf6 is probably able to use more than two secondary receptors. The identity of this mystery third receptor, and whether there are even more than three, is currently unknown.

Another system that has been characterized is the E. coli phage T2 and its two receptors, OmpF and FadL. Only bacteria lacking both membrane proteins OmpF and FadL are immune to T2. OmpF was the first protein receptor identified for T2, but the ability of T2 to still infect OmpF-negative cells led to the discovery of FadL by Morona and Henning in 1986. To do this, Morona and Henning generated a transposon library in a strain lacking OmpF, then screened the strains of their library to see which mutants could survive T2 infection.
Subsequent mapping of these T2-resistant strains indicated that the ttr locus was important for T2 infection. This locus encodes FadL, an outer membrane protein involved in fatty acid transport. Given the genetic data and location of the protein in the outer membrane, this was determined to be the other secondary receptor of T2. These researchers also determined the relative preference of T2 for its receptors. Outer membranes isolated from cells lacking OmpF were much more


effective in neutralizing T2 when compared to outer membranes isolated from cells lacking FadL, suggesting a much stronger interaction between T2 and FadL.

While many phages may not initially use more than one receptor, they can adapt to use a different or additional secondary receptor when propagated on mutant bacterial strains that lack the phage’s preferred receptor. The resulting phages, known as host range mutants, can use a different outer membrane protein for infection and may or may not lose the ability to use the original. These types of studies have mostly been done for the bacteriophages TuIa, M1, and Ox2, all of which infect E. coli. Bacteriophage Ox2 uses the outer membrane protein OmpA for infection and has been shown to develop mutations in its genome to use OmpC or OmpX. When Ox2 was propagated on a strain lacking OmpA or expressing a modified version of OmpA, mutants that could use OmpC for infection were isolated. One mutant, Ox2h5, could not infect a strain lacking OmpA, while the two other mutants, Ox2h10 and Ox2h12, could still infect this strain. These two mutants were further propagated on a strain lacking both OmpA and OmpF, which also expressed a modified version of OmpC. After a series of passages, five new host range mutants were isolated. These mutants were now able to use the outer membrane protein OmpX for infection.

Some phages may require the assistance of other membrane proteins for productive infection, which can complicate interpretation of results. One well-established example is the dependence of FhuA on TonB to support infection by T1. While T1 appears to be “TonB-dependent,” the activity of TonB is required for FhuA function rather than T1 binding to TonB directly. Though these confounding interactions appear to be limited, they can nevertheless make it difficult to understand the exact interactions between some membrane proteins and their phages.
With the renewed interest in phage therapy, phages able to use more than one receptor (e.g., OmpA and OmpC) or more than one type of receptor (e.g., TonB-dependent vs. non-TonB-dependent) are particularly useful. It is less likely that a host will become immune via single mutations or deletion of a single gene, thus rendering the phage more “flexible” in terms of its infectivity. Despite this, relatively few phages are known to use more than one receptor – or, if they are able to use more than one receptor, the additional receptor(s) have not yet been identified. This dual-receptor feature is particularly useful for phages and for our use of them in phage therapy, given the ability for phage and bacteria to co-evolve as discussed in the next section.
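The plating comparisons described earlier for Sf6 on wild-type, single-mutant, and double-mutant hosts are commonly summarized as an efficiency of plating (EOP): the phage titer on a test strain divided by the titer on a reference strain. The following is a minimal sketch of that arithmetic, not the published analysis. The function name and all titers are hypothetical; only the roughly 10-fold reduction on the double mutant reflects the text.

```python
def efficiency_of_plating(test_pfu_per_ml, reference_pfu_per_ml):
    """Return the titer on the test strain relative to the reference strain."""
    if reference_pfu_per_ml <= 0:
        raise ValueError("reference titer must be positive")
    return test_pfu_per_ml / reference_pfu_per_ml


# Hypothetical titers in plaque-forming units per mL (illustrative values only).
titers = {
    "wild type": 2.0e10,
    "ompA-": 1.5e10,
    "ompC-": 1.8e10,
    "ompA- ompC-": 2.0e9,  # chosen to give the ~10-fold drop described in the text
}

# Normalize every strain to the wild-type host.
eop = {
    strain: efficiency_of_plating(titer, titers["wild type"])
    for strain, titer in titers.items()
}

for strain, value in eop.items():
    print(f"{strain:12s} EOP = {value:.2f}")
```

An EOP near 1 for each single mutant together with a clearly reduced, but nonzero, EOP for the double mutant is the pattern that would suggest interchangeable receptors plus at least one further, unidentified route of entry.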

Ecology and Evolution of Phage-Host Interactions

Bacteria are capable of blocking a successful phage infection at multiple steps of the phage’s replication cycle. One key mechanism of phage immunity is to ensure the phage does not gain access to the host at all. To do this, the host can use four main strategies: (1) receptor loss, (2) receptor mutation, (3) receptor blocking or masking, or (4) producing decoys.

The first strategy is an extreme but effective way to eliminate possible phage infection: to reduce the abundance of whatever protein the phage is recognizing, or to stop producing it altogether. This is most often seen in laboratory experiments, where bacteria are already grown in nutrient-rich media at optimal conditions. A classic example is the phage λ receptor LamB, which is required for maltose uptake in the E. coli host. In typical laboratory growth media, plenty of other nutrients are available, so the loss of maltose transport does not significantly hinder the growth of the host.

In nature, however, the simple loss of a phage receptor may come with tradeoffs. For some pathogenic bacteria, the loss of a phage receptor actually reduces pathogenicity. This has been observed for Vibrio cholerae phage ICP2, which utilizes OmpU. Expression of ompU is regulated by the transcription factor ToxR, and deletion of the toxR gene renders the bacterium avirulent. In 2014, Seed and coworkers isolated strains of V. cholerae that were resistant to phage ICP2. Some of these strains contained mutations in either the ompU or toxR genes. When tested in an animal model, the ompU mutant strains showed slightly decreased fitness compared to wild-type V. cholerae. Strains containing the mutant toxR, however, were severely attenuated and unable to cause disease. The authors showed that these strains also did not express the ompU gene. These latter strains were indistinguishable from a strain completely lacking toxR.
Thus, although the toxR mutant strains were resistant to phage ICP2, they were unable to colonize their mammalian host. This type of interaction is of particular interest for phage therapy applications, as a tradeoff between phage resistance and virulence is a “win-win” scenario.

Another example of receptor loss or reduction is that of “phase variation”, where the levels of protein vary depending on environmental signals. An example of this is the phage KVP40, which infects Vibrio anguillarum via OmpK. Depending on the population density of V. anguillarum, individuals may up- or down-regulate expression of ompK. In dense populations, where many susceptible hosts are growing together, ompK is not expressed; however, as the population becomes less dense, the expression of ompK increases. While OmpK serves a function for the bacteria, this added “herd immunity” has a beneficial effect for the entire local population of V. anguillarum. Similarly, experiments done by Kim and Ryu in 2012 demonstrated that Salmonella enterica serovar Typhimurium will modify its LPS if the culture is shaken vigorously during growth. This modification can completely block phage infection, but the loss of this modification during non-shaking conditions will result in the bacteria becoming susceptible to phage infection once more.

Since the interaction between phage and host can be limited to a small region of the receptor protein, it is also possible for hosts to become immune to phage infection by simple mutations. Returning to the phage Sf6 and its host receptor OmpA, Porcek et al. demonstrated in 2015 that a change at just one amino acid in the attachment site could reduce phage infection by 80%. Similarly, mutations occurring in the LPS biosynthesis pathway often render the host immune to specific phages. Simple changes in LPS structure, such as an additional sugar group, can reduce phage infection by several orders of magnitude.
Even if the phage receptors themselves do not change, they can be blocked, masked, or sent into the extracellular space as decoys. Receptor access can be reduced by molecules that block them, such as some sugar molecules produced by species of


Pseudomonas. These produce a mucoid layer around the bacteria, preventing the phage from encountering any underlying receptors. Finally, many species of Gram-negative bacteria produce outer membrane vesicles, which are spheres that bleb off the membrane of the bacteria and are released into the surrounding environment. While these vesicles may not carry anything, they do have all components of the bacteria’s outer membrane. Thus, phage can bind to the LPS and receptors and initiate an infection process. Instead of infecting a live cell, however, their genomes remain unused inside this decoy host.

If bacteria were always successful in preventing phage infection, this article would not exist. To overcome bacterial strategies of receptor loss, change, blocking, or decoys, bacteriophages have a similar array of strategies to regain access to their hosts. These include: (1) being able to use multiple receptors, as discussed above; (2) targeting conserved regions of receptors; (3) mutations of the receptor-binding proteins, either by small changes or dramatic rearrangements; or (4) using a tail protein that changes conformation depending on which host protein(s) are available. In addition, phages can also use bacterial defense strategies to their benefit and prevent subsequent infection by other phages.

Phages that target conserved regions of receptors generally have a broad host range, as these regions in the membrane protein tend to be conserved due to functional requirements. Thus, they are conserved across many species and genera of bacteria. If the interaction can be modified by a single mutation in the phage’s tail protein, this is another way for phages to regain access to their host. Meyer et al. showed in 2012 that bacteriophage λ was capable of recognizing a completely new receptor after only eight days of co-culturing the phage with its host in minimal media.
The high number of replicates in this study showed that traversing this evolutionary landscape is highly parallel in laboratory settings, meaning that this evolutionary adaptation is highly favorable for the phage. The ability to recognize this new receptor, OmpF, was conferred by as few as four mutations in the phage’s J protein, which forms the needle-like tip of the tail. These mutations mapped to the region known to interact with LamB. Since OmpF and LamB have similar global structures, these few point mutations were sufficient to confer favorable interactions between the phage and this new receptor.

A more dramatic method is employed by T4 and related phages, which have hypervariable regions of their tail proteins. These can be rearranged or swapped between phages relatively easily, ensuring that at least some phages are still able to access a host if the host receptor protein becomes mutated. These tail fiber domains can allow the phage to use a different receptor or even a related receptor in a new host.

Even more innovative are the phages that can switch receptors depending on what is available in the environment. Some examples are the Salmonella phage SP6 and E. coli phages DT57C and DT571/2. These phages have V-shaped tailspikes that swivel to bind to whichever host is more prevalent in the environment. Similarly, phi92 has a radial array of at least 3 receptor-binding regions on its tail fibers. This allows the phage to infect both Salmonella and E. coli – two separate genera of enteric bacteria. Based on our current understanding, this is an unusual feature among bacteriophages.

Besides recognizing alternate receptors, phage can actually affect receptor structures by producing their own proteins to modify their host’s exterior surface. These modifications can occur in either primary or secondary receptor structures. For example, several Shigella phages and some Salmonella phages produce enzymes that add sugar moieties to the bacterial O-antigen.
This O-antigen modification, known as seroconversion, can prevent other phages of the same or related species from double-infecting a host cell. This strategy requires the host to produce an O-antigen, which is not the case for all strains of bacteria. In addition to LPS modifications, a few bacteriophages can encode “shielding proteins”, which bind to secondary protein receptors. The related E. coli phages EPS7 and T5 each produce a shielding protein that targets a separate Omp. Phage EPS7 utilizes protein BtuB as a secondary receptor, whereas T5 uses FhuA as discussed above. The shielding proteins produced by these phages bind specifically to their own secondary receptors: the EPS7 protein BF23 binds to BtuB and the T5 protein Llp binds to FhuA. Shielding their own receptors specifically likely serves as a superinfection exclusion mechanism to prevent secondary infection by the same phage species. The effects of phage-encoded proteins on the exterior surface of their hosts will likely need to be studied further, especially if phage therapy starts being used to replace or supplement antibiotics.

Relevance of Phage-Receptor Interactions to Phage Therapy

Phage therapy involves the use of bacteriophages to treat bacterial infections and could be a viable strategy to combat drug-resistant strains. Even though the use of phages for therapy was explored as early as 1917, very few countries have approved this treatment for infections. One of the key factors contributing to the under-development of phage therapy is the lack of extensive characterization of bacteriophages and their infection process. The ability of bacteria to evolve resistance against phages is another key challenge, and a successful treatment would require more than one kind of bacteriophage. The use of phage cocktails to treat bacterial infections has gained importance recently, as cocktails could be more effective against a diverse set of bacteria and can target a suite of different receptors. The challenge is to develop a phage treatment protocol with different sets of phages that can be effectively utilized against an evolving population of bacteria.

One of the strategies that bacteria employ to combat phages is to suppress the production of the Omp that a particular phage utilizes, especially if the protein is not critical for growth in the current environment. Knowing which Omp a particular phage uses is critical to address the issue of bacterial resistance to phages. In addition to isolating novel phages that target different receptors, previously characterized phages that utilize a set of Omps can be obtained by experimental evolution in the laboratory. Phage Ox2, which utilizes OmpA as its secondary receptor, can be evolved in the lab to utilize OmpC or OmpP/X instead of OmpA. Alternatively, screening for phages that can still infect a bacterial strain lacking the original receptor can be a mechanism for generating a diverse population and for investigating co-evolutionary dynamics.


A unique approach developed by Turner et al. exploits the tradeoff between antibiotic resistance and phage resistance in Pseudomonas aeruginosa to neutralize infections. Certain bacteria utilize membrane-based pumps to expel toxic antibiotics – for example, TolC in E. coli or the Mex system of P. aeruginosa. Turner et al. screened for phages that depend on outer membrane porin M (OprM), a component of the drug efflux system, for infection, and isolated the phage OMKO1. If exposed to a combination of OMKO1 and antibiotics, the bacteria are doomed: they must downregulate or lose the receptor OprM to become resistant to phage OMKO1, but losing OprM makes them susceptible to antibiotics by rendering the drug efflux pump ineffective. In this study, researchers demonstrated that both laboratory strains and clinical isolates of P. aeruginosa that were resistant to OMKO1 phage were sensitive to antibiotic treatment. Confirming this deadly combination, the addition of both phage OMKO1 and tetracycline was more effective in inhibiting growth than the use of tetracycline alone. This combination treatment has been successfully utilized to clear P. aeruginosa infection in a patient’s aortic graft. Continuing to identify phage outer membrane receptors that also serve as antibiotic efflux pumps in bacteria will greatly advance phage therapeutics and likely enhance the longevity of both types of treatments.

Summary

Our understanding of phage-host interactions has come a long way since the 1950s and is becoming increasingly direct and precise. Although receptors are now known and characterized for many phages, the catalog is far from complete. In addition to not having a phage-receptor pair for all – or even most – types of phages or species of bacteria, several confounding factors may complicate our progress. First, although phages that utilize more than one secondary receptor may be most beneficial for phage therapy, it can be difficult to identify which receptors are used and in which order of preference. Second, Omps that require an additional membrane protein, such as the TonB-dependent Omps, can complicate the interpretation of results. Third, since some phages are able to modify their own receptors – and possibly the receptors of other phages – this may have additional effects on the host that we do not yet realize. Investigating these and other relationships involving phage-receptor interactions is critical for understanding bacteriophage biology and phage-host evolution, and for the future development of phages as treatments.

Appendix: Methods for Receptor Identification or Characterization

To identify which receptor a phage recognizes, scientists can use genetic or protein-based screens. Genetic screens can be forward or reverse. In a forward genetic approach, scientists find naturally occurring mutants based on experimental conditions. For example, lytic phages and hosts can be co-cultured, sometimes in the presence of a mutagen, and any bacteria that survive phage attack would be isolated afterwards. If the surviving colonies “bred true” and were still resistant to the input phage, they would be analyzed further. Any mutations found in these survivors could indicate which genes and proteins are involved in phage susceptibility. This is a classic method, as it is possible to conduct screens for resistant mutants without immediately knowing an organism’s genome sequence.

In reverse genetic screens, the sequence of the bacterial genome is known, and scientists can therefore create mutants by making specific bacterial genes nonfunctional and testing the effect of each gene product in the process of phage infection. These screens make use of single gene deletion (SGD) or transposon libraries. These libraries are constructed in populations of bacteria, where each individual in the population contains a deletion or disruption of only one gene. The population is large enough to ensure that every non-essential gene in the genome is disrupted. To determine which genes are important for bacteriophage attachment, researchers can systematically test the growth of phage on the thousands of strains within the collection and identify which disrupted gene(s) prevent and/or inhibit phage production. For transposon libraries, each strain has one gene that has been disrupted by the insertion of a transposon, with the library as a whole usually covering every gene not required for bacterial survival.
A powerful aspect of using a transposon library is that it can identify non-binary effects; for example, it is possible to identify genes that either delay or accelerate the infection process when disrupted by a transposon. The KEIO collection in E. coli and SGD library in S. enterica sv. Typhimurium have been used to identify many phage receptors and other host factors that are important for phage infection.

Identifying a bacteriophage’s secondary receptor can also be done at the protein level. One very direct way to do this is to isolate phage-receptor protein complexes. To do this, researchers let the phage bind to its host, lyse the host, then re-isolate the phage with any host proteins still attached. The separation of phage and whichever host proteins, glycans, or even whole membrane vesicles are still bound can be achieved by several methods: ultracentrifugation (typically using a sucrose or cesium chloride gradient), precipitating the phage by crowding it out of solution (e.g., via polyethylene glycol), or size exclusion filtration. Once the phage-receptor complex is separated from the rest of the sample, the associated proteins or surface glycans can be identified through a variety of biochemical means. If the interactions are transient, chemical treatment can covalently link the phage to its receptor(s). Otherwise, it can be difficult to capture complexes that are stable enough to persist throughout the preparation and treatment of the sample. Another complicating factor is that phages can bind to very complex structures like intact membrane vesicles. Binding to these vesicles means an entire blob of membrane will be associated with the phage. In these vesicles, there will be a number of proteins and surface glycans, so follow-up studies are needed to confirm which of the host products are the true or critical receptor(s).
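The reverse genetic screens described above amount to comparing phage production on each single-gene mutant against the wild-type host, including the non-binary effects mentioned for transposon libraries. The sketch below is one hedged way to express that comparison. The function name, the 10-fold cutoff, the titers, and the placeholder gene names geneX, geneY, and geneZ are all assumptions made for illustration; only lamB as a gene required for λ infection comes from the text.

```python
import math


def classify_strains(titers, wild_type_titer, fold_cutoff=10.0):
    """Label each mutant by how its phage titer compares with wild type.

    titers maps a disrupted gene name to the phage titer (PFU/mL) obtained
    on that mutant. The fold cutoff is an arbitrary illustrative threshold.
    """
    if wild_type_titer <= 0:
        raise ValueError("wild-type titer must be positive")
    threshold = math.log10(fold_cutoff)
    calls = {}
    for gene, titer in titers.items():
        if titer <= 0:
            # No plaques at all: candidate receptor or essential host factor.
            calls[gene] = "no infection"
            continue
        log_ratio = math.log10(titer / wild_type_titer)
        if log_ratio <= -threshold:
            calls[gene] = "reduced"
        elif log_ratio >= threshold:
            calls[gene] = "enhanced"
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical screen results (PFU/mL); gene names other than lamB are placeholders.
wild_type = 1.0e9
library = {"lamB": 0.0, "geneX": 5.0e7, "geneY": 9.0e8, "geneZ": 2.0e10}

calls = classify_strains(library, wild_type)
for gene, call in calls.items():
    print(gene, call)
```

Working on a log scale keeps "reduced" and "enhanced" symmetric around the wild-type value, which suits a screen meant to catch both delays and accelerations of infection. Any hit would still need the follow-up confirmation described in the text.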


Once the receptor’s identity is known, it is useful to determine the structure of the protein and which regions are important for interaction with the virus. Structural studies can illustrate the function of a given protein and how it might interact with a phage. Both classic and modern structural studies require the receptor to be reasonably pure and stable during the experiment. Many phage receptors are transmembrane or membrane-associated proteins, which can be challenging to work with as discussed in Table 1. There are two main levels of structural studies: atomic or near-atomic resolution, which is useful for looking at fine details; and low resolution, which can indicate the oligomeric state and general secondary or tertiary structure of a protein.

Structures obtained from atomic resolution techniques are extremely detailed and contain information about most, if not all, atoms in the structure. A classic method to obtain this information is X-ray crystallography. Here, a purified sample is prepared in a way that causes the proteins to become highly ordered and densely packed, forming a solid crystal lattice of repeating subunits. When the crystal is exposed to X-ray beams, the angles and intensities of the resulting diffraction pattern can be used to produce a three-dimensional picture of the sample’s electron density. This describes the positions of all (or most) of the atoms and chemical bonds within the protein. Omp receptors – especially those with a β-barrel fold, as discussed below – are excellent proteins for this method due to their inherent stability. In most cases, stable proteins can be resolved at a high resolution, meaning more detailed analyses can follow.

Crystallography is not the only method to gain information about a protein’s structure and/or function. A newer technique to do this is cryo-electron microscopy (cryo-EM). Cryo-EM is a form of transmission electron microscopy where samples are first flash frozen in amorphous, or vitreous, ice.
These samples are then imaged under cryo temperatures (below −180°C) in an electron microscope to collect 2D images of the proteins. A common “single particle” experiment requires collection of thousands to millions of images of individual proteins. Ideally, the proteins are sitting in the ice in many different orientations so the structure is visible from every angle. Computational algorithms applied to a subset of individual particles are then used to combine the 2D images and reconstruct a 3D model of the proteins. Recent advances in instrumentation, camera design, and the ability to handle big data have led to the cryo-EM “resolution revolution”, where single particle cryo-EM studies are approaching the high resolution given by X-ray crystallography. Cryo-EM is especially useful for large, multimeric complexes, which are often a challenge in X-ray crystallography. It is also useful for less stable proteins or for proteins that do not crystallize easily. Despite these advantages, cryo-EM is currently limited by the size of the complex, so it is not possible for all samples. Smaller Omps, especially if they are monomers, are too small to be effectively imaged by cryo-EM. On the other hand, several of the giant viruses and jumbo phages are too big to effectively image while bound to cells or their purified receptors. With the cryo-EM field still evolving rapidly, this may not remain a barrier for long.

Not all Omps or phage-host complexes can be prepared or used for the atomic resolution techniques listed above. Instead, there are lower-resolution experiments that still provide information about the basic structure of a given phage receptor. For example, many phage receptors form a variety of oligomeric states that can be inferred from the molecular mass of the complexes. Different Omps may be found as monomers, trimers, or larger complexes.
The oligomeric state of receptor proteins is commonly studied by native gels or chromatography, analytical ultracentrifugation, and native mass spectrometry. Native gels or chromatography can be used to sort protein complexes by mass in their native state, and the complexes can then be compared to size standards to determine how many copies are found in each complex. Analytical ultracentrifugation combined with sucrose or cesium chloride gradients will separate complexes of different mass, shape, and/or density. Modifications of this technique can also quantitatively determine the affinity of proteins within multimeric complexes. Finally, native mass spectrometry – a newer and very powerful method – can actually be used to calculate the distribution of each complex type within a population. This allows researchers to determine whether protein complexes are highly dynamic and form many different oligomeric states, or if they are more uniform and predominantly have only one form.

Even if these specialized techniques are not available or practical, there are other methods that use optical measurements like fluorescence. Measuring the intrinsic fluorescence from aromatic amino acids, or targeted fluorescence from the addition of certain chemicals, is a relatively quick and easy way to probe a protein’s tertiary structure. Circular dichroism, which measures differences in the absorption of left- and right-handed circularly polarized light, is also a common technique to determine whether a protein is largely composed of α-helices or β-sheets. Both of these optical techniques can be combined with chemical or thermal denaturants to test conditions that promote unfolding of the protein, providing information about the receptor stability. From these types of experiments, scientists can learn the secondary or tertiary structural elements of a protein or its inherent stability alone or in a complex.
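The comparison to size standards described above for native gels, chromatography, and native mass spectrometry reduces to dividing the measured mass of a complex by the mass of one subunit. A minimal sketch of that arithmetic follows. The function name, the tolerance, and the masses are hypothetical, and the calculation assumes the measured mass comes from protein alone, which may not hold for membrane proteins that carry bound detergent or lipid.

```python
def oligomeric_state(complex_mass_kda, monomer_mass_kda, tolerance=0.2):
    """Estimate the number of subunits in a complex from its mass.

    Returns the nearest whole number of copies, or None when the mass ratio
    is further than `tolerance` from any whole number (an ambiguous result).
    """
    if monomer_mass_kda <= 0:
        raise ValueError("monomer mass must be positive")
    ratio = complex_mass_kda / monomer_mass_kda
    copies = round(ratio)
    if copies < 1 or abs(ratio - copies) > tolerance:
        return None
    return copies


# Hypothetical example: a 38 kDa porin subunit observed as a 112 kDa complex.
print(oligomeric_state(112.0, 38.0))  # ratio ~2.95, consistent with a trimer
print(oligomeric_state(60.0, 38.0))   # ratio ~1.58, reported as ambiguous
```

Returning None for in-between ratios is a deliberate choice in this sketch: a mass that does not fall near a whole multiple is more likely to signal a mixed population or a co-migrating partner than a fractional oligomer.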
Some properties of the phage-receptor binding interactions can be measured in vitro, including binding affinity, kinetics of binding, and genome ejection after receptor docking. These approaches rely on the ability to purify the receptor protein and for the receptor to stay active in test tubes. To measure binding affinity and kinetics, increasingly popular platforms use optical measurements like fluorescence to measure how fast receptors bind to phage particles at varying conditions. For other types of phage receptors such as the primary LPS receptor, hydrolysis by the phage tail draws the phage closer to the cell membrane. Enzyme assays are therefore a popular method that can measure how quickly the phage degrades LPS and how quickly it can reach its secondary receptor. For secondary protein receptors that have no associated enzymatic activity, measuring association and dissociation rates can indicate binding strength between the phage and host protein. Techniques such as surface plasmon resonance (SPR), biolayer interferometry (BLI), and fluorescence anisotropy can all be used for these measurements. All of the aforementioned experiments can be done with whole phage or with purified phage tail proteins that interact with the phage’s specific receptor. When used in combination, these techniques can paint a relatively complete picture of how the phage identifies and binds to its host.
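The link between the association and dissociation rates mentioned above and binding strength can be made concrete with the standard 1:1 binding model often used to interpret SPR or BLI traces. The sketch below assumes that simple model, which the text does not specify and which real phage-receptor pairs may not follow. The function names, rate constants, and signal scale are hypothetical.

```python
import math


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D = k_off / k_on (units of M)."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


def association_response(t, conc, k_on, k_off, r_max):
    """Signal at time t (s) while analyte at concentration conc (M) is flowing."""
    k_obs = k_on * conc + k_off                 # observed rate constant
    r_eq = r_max * conc / (conc + k_off / k_on)  # plateau at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r_start, k_off):
    """Signal at time t (s) after the analyte is washed away."""
    return r_start * math.exp(-k_off * t)


# Hypothetical rate constants for a tail protein binding its receptor.
k_on = 1.0e5    # per molar per second
k_off = 1.0e-3  # per second
k_d = dissociation_constant(k_on, k_off)
print(f"K_D = {k_d:.1e} M")

# At a concentration equal to K_D the plateau is half of the maximal signal.
plateau = association_response(1.0e6, k_d, k_on, k_off, r_max=100.0)
print(f"plateau at [analyte] = K_D: {plateau:.1f}")
```

A smaller K_D indicates tighter binding. Under this model, two interactions with the same K_D can still differ in how quickly they form and fall apart, which is why the kinetic rates are reported alongside the affinity.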


Finally, with advances in computational capabilities, in silico methods are becoming common and powerful techniques to study biological processes. Computational approaches such as molecular dynamics (MD) can model the complex landscapes of membranes, membrane protein receptors, and complex structures such as LPS. Acting as a virtual microscope, MD simulations describe the structural biophysics of, and the biochemical interactions between, all these components, and can be used predictively to determine how membrane biophysics changes in different environments. Which is the best method to use? Each has strengths and weaknesses, as discussed here and summarized in Table 1. As in all science, a combination of experimental approaches more accurately captures the complexities of biology.

Acknowledgments

This material is based upon work supported by the National Institutes of Health R01 GM110185, the National Science Foundation CAREER 1750125 and Cooperative Agreement No. DBI-0939454, and the JK Billman, Jr., MD Endowed Research Professorship.


Relevant Websites

https://cge.cbs.dtu.dk/services/HostPhinder/ HostPhinder.
http://blanco.biomol.uci.edu/mpstruc/ Membrane proteins of known structure.
https://opm.phar.umich.edu/ Orientations of Proteins in Membranes (OPM) database.
https://phred.herokuapp.com/ Phage Receptor Database (PhReD).
https://www.genome.jp/virushostdb/view/?VirusHostLineage=OtherVirus%7CBacteria Virus-Host Database.

Tail Structure and Dynamics

Shweta Bhatt, University of Copenhagen, Copenhagen, Denmark
Petr G Leiman, The University of Texas Medical Branch, Galveston, TX, United States
Nicholas MI Taylor, University of Copenhagen, Copenhagen, Denmark

© 2021 Elsevier Ltd. All rights reserved.

Nomenclature

ATP Adenosine triphosphate
DNA Deoxyribonucleic acid
dsDNA Double-stranded DNA
gp Gene product (protein expressed by a particular gene)
ssDNA Single-stranded DNA
TMP Tape measure protein

Glossary

Caudovirales Order of dsDNA bacteriophages with an external tail structure. Includes the Podoviridae, Siphoviridae and Myoviridae.
Contractile injection system Protein complex related to (and including) the contractile tail apparatus of Myoviridae. Also includes the type VI secretion system, R-type pyocins, the antifeeding prophage/Photorhabdus virulence cassette and other tailocins.
Myoviridae Bacteriophages with a long, contractile tail.
N-fold symmetric assembly A protein complex composed of N parts that are spatially related by a rotational axis of symmetry of order N (e.g., a complex composed of three polypeptide chains that are related by a threefold axis is a threefold symmetric complex).
Podoviridae Bacteriophages with a short, noncontractile tail.
Siphoviridae Bacteriophages with a long, noncontractile tail.

Introduction

To replicate, bacterial viruses, or (bacterio)phages, have to infect their microbial hosts. Unlike eukaryotic viruses, which are usually taken up through endocytosis or membrane fusion, bacteriophages are required to translocate their genome and certain proteins across the host cell envelope. To address this challenge, an overwhelming number of phages use a structure known as a tail. The tail creates a conduit between the phage capsid and the host cytoplasm and allows the phage particle to remain attached to the cell surface. Such phages are known as the tailed bacteriophages or Caudovirales (Fig. 1). The genome of all Caudovirales is a double-stranded DNA molecule, which is packaged into an icosahedral capsid. The proteinaceous tail structure is attached to a special vertex of the capsid through which the DNA is packaged during capsid assembly.

The tail is crucial in the infection process of Caudovirales: it is responsible for the initial host recognition and attachment, which may include digestion or modification of the cell surface polysaccharides; for the irreversible attachment to the cell; for the degradation of the peptidoglycan; for sensing the “injection signal”; and for the translocation of DNA and proteins into the host. Given the multitude of challenges that need to be overcome, it is not surprising that these tails are large, multiprotein assemblies of extreme complexity. For example, the tail of bacteriophage T4, one of the best studied and more complicated phages, consists of at least twenty different gene products present in varying stoichiometries. In gram-positive bacteria and archaea, the genome needs to be injected across the only membrane of the cell. In gram-negative bacteria, however, both the inner and outer membranes (as well as the peptidoglycan layer) need to be crossed, generating some additional requirements for the tail structures of phages infecting these organisms.
It should be noted that other viruses outside the order Caudovirales have been found to have structures that are reminiscent of tails and are used to inject the genome, e.g., the ssRNA virus MS2, the ssDNA virus φX174, the dsDNA phage PRD1, the algal virus PBCV-1, certain archaeal viruses such as Acidianus two-tailed virus (ATV), and some eukaryotic viruses such as herpes simplex virus. These viruses are discussed in other chapters of this collection. Here, we will discuss the function and dynamics of the tail of the Caudovirales. We will examine the similarities and differences of all three families belonging to this order and point out specific differences between the tails of bacteriophages targeting gram-positive and gram-negative bacteria.

Encyclopedia of Virology, 4th Edition, Volume 4
doi:10.1016/B978-0-12-809633-8.20965-5

Caudovirales

The order of the Caudovirales represents the largest group of bacteriophages, and therefore, of viruses. Caudovirales can infect either bacterial or archaeal hosts, suggesting that they share a common ancestor that is as old as the emergence of the bacteria circa 3.5 billion years ago. The common architecture and sheer complexity of these viruses make it nearly impossible that they arose by convergent evolution. The tail is attached to a special vertex of the capsid that is occupied not by a capsid protein but by a 12-fold symmetric mushroom- or cone-shaped protein called the head-to-tail connector or portal (Fig. 1). Most tails have sixfold symmetry, although pseudo-sixfold/true threefold symmetric tails exist. The portal protein is responsible for symmetry adjustment between the tail and the fivefold symmetric vertex of the capsid that houses the portal. During phage particle morphogenesis and prior to tail attachment, the portal protein is part of the phage-encoded machinery that uses ATP to package the DNA genome into the preformed capsid.

Tails can be divided into three different types, and tail morphology has been one of the most obvious and widely used characteristics for classifying phages. There are bacteriophages with a short, non-contractile tail (Podoviridae), with a long, non-contractile tail (Siphoviridae), and with a long, contractile tail (Myoviridae) (Fig. 1). This classification emphasizes host recognition, binding, attachment and genome delivery mechanisms, that is, all the processes that precede host takeover by the phage and production of new phage particles. In that respect, the post-DNA delivery behavior of phages that have different tail morphologies, e.g., P22 and lambda (Podoviridae and Siphoviridae, respectively), could be more similar to each other than that of phages of the same type, e.g., P2 and T4 (both Myoviridae).

Siphoviridae are the most prevalent family of tailed phages (Fig. 2). They are characterized by their long tail, which is tube-like in appearance. The length of the tail is regulated by a tape measure protein (TMP). Myoviridae are also a very common family (Fig. 3). They contain a tube that is structurally related to the tube of the Siphoviridae and which is also of defined length, controlled by a TMP. However, Myoviridae tails are more complex as they also contain a sheath structure wrapped around the tail tube. The high complexity of the sheath, the conservation of the tail sheath protein structure and the fact that Myoviridae can be found in both bacteria and archaea strongly suggest that Myoviridae evolved only once. The Podoviridae, on the other hand, are less prevalent (Fig. 4). They only target certain types of bacteria, which suggests that they might have evolved multiple times from the Siphoviridae and/or Myoviridae.

Fig. 1 Cryo-EM reconstructions of archetypal phages representing the three families in the order Caudovirales. From left to right: T4 (Myoviridae), which is characterized by the presence of a long tail surrounded by a contractile sheath and a terminal baseplate; P22 (Podoviridae), with a short tail and appendages; and the lactococcal siphophage p2 (Siphoviridae), with a long, flexible, non-contractile tail and a distal baseplate. Labels in the original figure: capsid proteins; head (icosahedral); neck (decorated with collar and whiskers composed of fibritin); tail tube; tail sheath (contractile) enclosing the tail tube, which itself encloses the tape measure protein (TMP); baseplate; tail; tail spike; tail terminator protein; tail tube (non-contractile), with the TMP present inside. Figure adapted from Veesler, D., Cambillau, C., 2011. A common evolutionary origin for tailed-bacteriophage functional modules and bacterial machineries. Microbiology and Molecular Biology Reviews 75, 423–433.

Organization and Assembly of Long Tails

Certain features of Siphoviridae and Myoviridae are similar, which is why we discuss them here together. In these phages, the capsid and the tail assemble independently from each other. Upon completion of DNA packaging, one or two head completion proteins bind to the portal vertex of the capsid, making the latter competent for tail attachment. In the absence of tail completion proteins, the DNA can leak from the capsid. Tail assembly starts from the assembly of its capsid-distant part, which is called the tail tip complex in Siphoviridae or the central hub-spike complex in Myoviridae. In Myoviridae, the central hub-spike complex forms the centerpiece of a roughly planar structure called the baseplate. In both systems, this complex interacts with a “ruler” protein called the TMP, its chaperone, and other chaperones whose conservation and functions are unclear. The TMP determines the eventual length of the tail. The tube is then assembled onto the tail tip/central hub-spike complex, first by binding one or two special tube initiator proteins and then by subsequent binding of multiple copies of the tube protein around the TMP (Figs. 2 and 3). The tube initiator proteins and the tube proteins are hexamers, so the trimeric tail tip/central hub-spike complex contains three-to-six-fold adapter domains. Binding of the tail tube terminator protein to the end of the growing tube completes tube assembly. The central hub protein, tube initiator proteins, tube protein and terminator protein are all structurally and evolutionarily related.

Fig. 2 Schematic representation of the conformational changes in the lactococcal phage p2 baseplate upon adsorption to the host cell wall components. The RBPs interact with the lipoteichoic acids through the host recognition domains and six RBP trimers undergo a 200° downward rotation, enabling them to interact with the host-specific “pellicle” phosphopolysaccharides. Adapted from Sciara, G., Bebeacua, C., Bron, P., et al., 2010. Structure of lactococcal phage p2 baseplate and its mechanism of activation. Proceedings of the National Academy of Sciences of the United States of America 107 (15), 6852–6857.

Fig. 3 (A) Organization and architecture of the Myoviridae tail. Schematic showing the components of the contractile tail, which have been labeled according to the gene products in bacteriophage T4. (B) Organization of the contractile tail baseplate. Labels in the original figure: gp3, gp5, gp5.4, gp6A, gp6B, gp7, gp15, gp18, gp19, gp25, gp27, gp29, gp48, gp53, gp54, tail fiber 1, tail fiber 2. Figures have been adapted from Taylor, N.M.I., van Raaij, M.J., Leiman, P.G., 2018. Contractile injection systems of bacteriophages and related systems. Molecular Microbiology 108, 6–15.

Fig. 4 Cryo-EM reconstructions of different phages of the Podoviridae family showcasing the diversity in size, tail structure and components: (A) N4, (B) φ29, (C) T7. All the phages share a common structural feature in the form of fibers/appendages attached generally to the proximal part of the tail. Labels in the original figure: distal tail knob (gp9/gp13), tail spikes (gp12), tail tube (gp11), portal/connector (gp59), tail fiber (gp66), gp65, plug, tail fiber (gp17), portal/connector (gp8), tail tube (gp7.3/11/12); scale bar label, 30 nm. Fig. A is adapted from Choi, K.H., McPartland, J., Kaganman, I., et al., 2008. Insight into DNA and protein transport in double-stranded DNA viruses: The structure of bacteriophage N4. Journal of Molecular Biology 378, 726–736. Fig. B is adapted from Xiang, Y., Morais, M.C., Cohen, D.N., Bowman, V.D., Anderson, D.L., Rossmann, M.G., 2008. Crystal and Cryo-EM structural studies of a cell wall degrading enzyme in the bacteriophage φ29 tail. Proceedings of the National Academy of Sciences of the United States of America 105 (28), 9552–9557. Fig. C is adapted from Hu, B., Margolin, W., Molineux, I.J., Liu, J., 2013. The bacteriophage T7 virion undergoes extensive structural remodeling during infection. Science 339 (6119), 576–579.

In the Siphoviridae, binding of the tail tube terminator marks the end of tail assembly. In the Myoviridae, however, tube completion is followed by the attachment of the tail sheath. One of the baseplate proteins (gp25 in bacteriophage T4) is homologous to one of the domains of the sheath protein (Fig. 3). It forms an integral part of the baseplate-proximal layer of the sheath and can therefore be thought of as the tail sheath initiator protein. The sheath itself is assembled as a six-start helix with a symmetry matching that of the tube. In the very last step of contractile tail assembly, the tail sheath completion protein binds to the tube terminator and to the last disk of the sheath. In Siphoviridae, the head–tail interface is formed by the head completion proteins and the tail tube terminator. In Myoviridae, the head completion proteins instead bind to the tail sheath terminator. In bacteriophage T4, the final step in the assembly is the attachment of the long tail fibers to the tail. However, in some other systems, such as the R-type pyocin, the tail fiber is required for assembly of the baseplate, showing that it acts early in the assembly process.

The Tube

The tube is organized around the TMP. The TMP is mainly alpha-helical in structure and assembles as a multimer of unknown stoichiometry. Given the length of the tail, the TMP is likely crucial for transmitting the “injection signal” from the tail tip complex to the capsid. It has been shown that the length of the TMP (in number of amino acids) correlates with the length of the tail across different phages. Furthermore, deletion of fragments of the TMP results in a concomitant shortening of the tail. The TMP likely interacts with both the tail tip complex and the tube terminator protein (Fig. 3). As mentioned previously, structural homology between the proteins positioned at both ends of the tube and the major tube protein has been established for both Myoviridae and Siphoviridae. For bacteriophage T4, a myovirus, the structures of the pseudo-hexameric trimeric hub protein gp27, the two tube initiator proteins gp48 and gp54 (each forming a hexamer), and the tube protein gp19 (forming a tube of hexameric rings) have been determined experimentally. Even though the experimental structure of the tail tube terminator gp3 is unknown, its sequence is similar to that of the tube protein. For the Siphoviridae, several structures of the corresponding proteins are also available, exemplified by p2 ORF18 (pseudo-hexameric trimeric hub), p2 ORF15 (hexameric tube initiator protein), the N-terminal domain of bacteriophage λ gpV (major tail protein) and bacteriophage λ gpU (tail terminator protein). The tube of most Siphoviridae is rather flexible. In contrast, the tube of bacteriophage T4 (and probably most myoviruses) is rigid, even in the absence of the tail sheath. One exception that has been described, however, is bacteriophage A511, which has a flexible tail although it is a myovirus.
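The correlation between TMP length and tail length can be illustrated with a small Python sketch. It assumes, purely for illustration, that the TMP spans the tail as one continuous alpha-helix with the canonical rise of 1.5 Å per residue; real TMPs need not satisfy this, and the residue counts below are hypothetical rather than taken from any phage.

```python
ALPHA_HELIX_RISE = 1.5  # Å per residue along the axis of an ideal alpha-helix

def helix_length(n_residues, rise=ALPHA_HELIX_RISE):
    """Length (Å) of n_residues arranged as a single continuous alpha-helix."""
    return n_residues * rise

# Hypothetical TMP of 800 residues, and a mutant lacking 100 internal residues.
FULL_TMP = 800
DELETED = 100

full_length = helix_length(FULL_TMP)
mutant_length = helix_length(FULL_TMP - DELETED)

print(f"Full-length TMP: {full_length:.0f} Å")
print(f"Deletion mutant: {mutant_length:.0f} Å")
print(f"Predicted shortening: {full_length - mutant_length:.0f} Å")
```

Under this simplified ruler model, every deleted residue shortens the predicted tail by the same amount, which is the qualitative behavior described for TMP deletion mutants.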

The Sheath

The tail sheath of Myoviridae is a complex and intricate molecular machine. Its initial extended state is a high-energy conformation, whereas the contracted state found in the particle after injection is a low-energy conformation. The contraction powers a drill-like motion of the tube that results in breaching the integrity of the cell envelope of the host. The contraction is an irreversible process, which makes it especially important that it is only initiated at the right time, when the bacteriophage is correctly positioned on the cell. The sheath is assembled starting from the baseplate, and the tube–baseplate complex is a necessary primer for sheath assembly. In T4, the sheath consists of 23 hexameric layers of sheath protein gp18 (six copies per layer). The extended sheath is a six-start helix with a rise of 40.6 Å and a rotation of 17.2°; it is 240 Å wide and 925 Å long. The symmetry of the sheath repeats the symmetry of the tube. This is important, as it suggests the mechanistic basis for sheath assembly: the tube acts as a template for the assembly of the sheath in the extended, high-energy state. Indeed, when gp18 is overexpressed in the absence of the baseplate–tail tube complex, the sheath assembles into a low-energy state similar to the contracted tail sheath, which is only 420 Å long and 330 Å wide. The rise of the contracted sheath is 16.4 Å, and the rotation is 32.9°. No atomic structure of an assembled bacteriophage sheath is available, but the structures of the sheath in the extended and contracted states are known for the related R-type pyocin and the type VI secretion system. Sequence similarity shows that the structure of the sheath subunit is conserved in all contractile injection systems, although the size of the subunit varies owing to additional surface-exposed domains in some systems. Also of note is that the sheath subunit of the type VI secretion system is encoded by two sequential genes. Each sheath subunit donates and accepts two long linker arms that interconnect all the subunits into a cylindrical mesh resembling a protective bottle sleeve. The connectivity of this mesh is the same in the extended and contracted states. In the contracted state, the entire mesh is simply widened and twisted.
The linker arms of the first layer of the sheath subunits extend into the baseplate (where they interact with a T4 gp25-like protein) whereas the tail sheath terminator protein donates its linker arms to the last layer of the sheath subunits.
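The helical parameters quoted above for the T4 sheath (23 layers; a rise of 40.6 Å and rotation of 17.2° per layer when extended, 16.4 Å and 32.9° when contracted) can be checked against the reported overall dimensions with a short Python sketch. The calculation is a consistency check under a simplifying assumption, namely that the sheath length is just the rise per layer multiplied by the number of layers, ignoring end effects and the finite size of the subunits.

```python
LAYERS = 23  # hexameric layers of gp18 in the T4 sheath

# Rise (Å) and twist (degrees) per layer, with the reported overall dimensions (Å).
EXTENDED = {"rise": 40.6, "twist": 17.2, "length": 925.0, "width": 240.0}
CONTRACTED = {"rise": 16.4, "twist": 32.9, "length": 420.0, "width": 330.0}

def stacked_height(state, layers=LAYERS):
    """Rise per layer times the number of layers (Å); ignores end effects."""
    return state["rise"] * layers

def cumulative_twist(state, layers=LAYERS):
    """Rotation (degrees) of the last layer relative to the first."""
    return state["twist"] * (layers - 1)

for name, state in (("extended", EXTENDED), ("contracted", CONTRACTED)):
    print(f"{name}: rise x layers = {stacked_height(state):.1f} Å, "
          f"reported length = {state['length']:.0f} Å, "
          f"cumulative twist = {cumulative_twist(state):.1f} degrees")

length_ratio = CONTRACTED["length"] / EXTENDED["length"]
extra_twist = cumulative_twist(CONTRACTED) - cumulative_twist(EXTENDED)
print(f"Contracted/extended length ratio: {length_ratio:.2f}")
print(f"Additional twist accumulated on contraction: {extra_twist:.1f} degrees")
```

The simple product reproduces the reported extended length to within about 1% (roughly 934 Å versus 925 Å) but deviates more for the contracted sheath (roughly 377 Å versus 420 Å), so it should be read as a rough check rather than a structural model. Both the reported lengths and the per-layer rise indicate that the sheath shortens to less than half of its extended length while its top layer rotates by nearly a full turn relative to the baseplate-proximal layer.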

Baseplates of Long Contractile Tails

The baseplate is by far the most complex part of the contractile bacteriophage (Fig. 3). The best-characterized baseplate is the one from bacteriophage T4. It is believed that all contractile bacteriophages share a common architecture similar to that of T4. However, each myophage has very specific adaptations that are important for infecting its host. These include, in particular, enzymes that degrade cell walls and tail fibers that specifically bind the cells of choice. All known contractile tail baseplates are hexameric. They are built up of six “wedges”. In T4, the wedge consists of seven gene products, in different stoichiometries, which assemble in a specific order: gp11, gp10, gp7, gp8, gp6, gp53, and gp25. Once wedge assembly is complete, the wedges can assemble around the central hub complex, consisting of gp27 (the central hub protein), gp29 (the TMP), gp5 and gp5.4. Gp26 and gp28 likely perform a chaperone function as they are not found in the fully assembled baseplate-tube complex. The resulting high-energy baseplate has a “dome” shape. In vitro, in the absence of the central hub complex, the baseplates will assemble in a lower energy “star-shaped” form (see next section). After assembly of the dome-shaped baseplate, the tube initiators (gp48 and gp54) can bind, followed by assembly of the tail tube and incorporation of the tail tube terminator as described in the previous section. A minimal baseplate wedge has been proposed to consist of homologs of the following T4 proteins: gp6, gp7, gp25, gp53, and at least one type of tail fiber protein (usually trimeric). Two chemically identical copies of gp6 interact with gp7 to form a heterotrimer that comprises the main part of the wedge.
Starting from the center and moving towards the periphery, the gp6–gp7 heterotrimer consists of a part where the three chains interdigitate or interact extensively (the core bundle and the trifurcation unit) and a part where they go their separate ways (two opposing dimerization domains of gp6 and the fiber attachment domain of gp7). Gp25 is positioned at the tip of the core bundle and gp53 is wrapped around the bundle. The trifurcation unit sends the three chains comprising it in three different directions. The dimerization domains of two neighboring wedges interact to link the baseplate into a ring. The fiber attachment domain of gp7 forms a radial protrusion on this ring. The N-terminal part of T4 gp7 (which is upstream of the core bundle domain) also interacts with the tail fibers. However, in simpler baseplates (e.g., in phage P2, phage Mu and the R-type pyocin), and even in some complex baseplates (e.g., of SPO1-like phages), orthologs of gp7 do not contain equivalent N-terminal extensions. To summarize, gp6 has an important role in the circularization of the baseplate and gp7 is crucial for the interaction with the tail fiber network (Fig. 3(B)). Apart from these conserved baseplate proteins, each bacteriophage has a specific set of tail fibers and other structural proteins and/or adaptations. In the case of T4, gp7 is an extended protein, and together with gp8 and gp9 (a dimer and a trimer, respectively), it makes up the intermediate or less conserved part of the baseplate, which contacts the tail fiber network. In T4, the tail fiber network consists of two different types of tail fibers: the short and the long tail fibers. The short tail fibers (trimers of gp12) are “curled up” around the periphery of the baseplate and form part of the short tail fiber network, which also consists of two other trimeric proteins (gp10 and gp11). The long tail fibers, which have the stoichiometry (gp34)₃(gp35)₁(gp36)₃(gp37)₃, bind to the gp9 trimer.
These viral adhesins have a tremendous length (~1600 Å) and recognize both LPS and the outer membrane protein OmpC. In the assembled T4 particle, the long tail fibers can interact with the neck protein fibritin (gp wac) when the former are in the retracted state. Upon infection, the long tail fibers need to become extended. This regulation might prevent the inadvertent binding of the tail fibers to the host under unfavorable conditions. In vitro, retraction and extension can be controlled by pH, ionic strength, temperature and polyethylene glycol concentration. During the infection process, it is thought that the binding of the long tail fibers to the cell surface induces a conformational change in the baseplate (going from the high-energy, dome-shaped state to the low-energy, star-shaped state), which also allows binding of the short tail fibers. Although the precise sequence of events is not clear, the “end state” of this process is structurally well understood from the structure of baseplates in the “star-shaped” state (Fig. 5). In this post-attachment-mimicking state, the long tail fibers make extended interactions with the baseplate, especially gp11. The short tail fibers are then extended and can interact with the LPS. This is accompanied by a slight opening of the iris formed by gp6 and gp7 and results in an outward movement of the gp6-gp7 core bundles. As gp25 is attached to the tip of the bundles, it also moves outward, pulling on the sheath and initiating its contraction. As already alluded to, even though the architecture of the baseplate is universally conserved, adaptation to different hosts results in tail fibers and receptor binding proteins (RBPs) of extreme complexity. In fact, the size and complexity of RBPs dwarf the actual baseplate in many large bacteriophages, e.g., in Twort-like phages (e.g., Listeria monocytogenes phage A511, Staphylococcus aureus phage φ812) or ViI-like phages (e.g., Escherichia coli phage CBA120).

Fig. 5 Structure of the complex baseplate of phage T4 in a pre-attachment (from cryo-EM), intermediate (modeled) and a post-host-attachment state (based on cryo-EM). The insets are focused on the central part of the baseplate and demonstrate a release of the tail tube–central spike complex, which remains in a fixed position throughout the transformation. Labeled components: gp5, central spike protein; gp6 (gp6A, gp6B), inner baseplate protein; gp7, inner and intermediate baseplate protein; gp9, long tail fiber attachment protein; gp10, peripheral baseplate protein; gp11, peripheral baseplate protein; gp12, short tail fiber protein; gp19, tail tube protein; gp25, inner baseplate protein; gp27, central hub protein; gp48, tail tube initiator 1 protein; gp53, inner baseplate protein; gp54, tail tube initiator 2 protein. Figure reproduced from Taylor, N.M.I., Prokhorov, N.S., Guerrero-Ferreira, R.C., et al., 2016. Structure of the T4 baseplate and its function in triggering sheath contraction. Nature 533, 346–352.

Tail Tip Complex of Long Non-Contractile Tails

The tail tip complex and the adjacent part of the tail in siphophages are similar to the basal part of the tail tube and the baseplate hub of myophages. The tip protein, called Tal (tail-associated lysin) in siphophages, is homologous to T4 gp27 (Fig. 2). The protein that forms the interface between Tal and the rest of the tail is called Dit (distal tail protein). Dit is homologous to the T4 tube initiator protein gp48. The tail fibers are attached directly to this complex. In certain phages infecting Lactococcus lactis, the RBP network is so extensive that it forms a large planar structure that, by analogy with myophages, is called a ‘baseplate’. However, this ‘baseplate’ appears to be structurally and evolutionarily unrelated to the myophage baseplate. In some of these phages (e.g., lactophage p2) the RBPs perform an acrobatic move during attachment to the host cell surface: they rotate by 200° upon binding to the cell surface polysaccharides (Fig. 2). This conformational change is accompanied by opening of the Tal trimer, which allows the TMP to exit the tail. The TMP forms a channel across the cell envelope through which the DNA is translocated into the host cytoplasm. Of note, whether the TMP participates in forming a membrane channel in myophages is unclear. In the case of the coliphage T5, three L-shaped fibers (pb1) are attached to the sides of the baseplate-hub protein (BHP) pb3 and one straight fiber containing a single RBP is attached to its center. Pb3 is the Tal protein of T5, as it adopts the same protein fold as that seen in the BHP (gp27) of the T4 myophage. At the end of the tail tube, a two-domain tail tip protein (TTP), pb9, is located in the upper region of the tail tip cone. X-ray crystallography studies have revealed structural similarities between domain A of pb9 and the N-terminal domains of known Dit proteins, such as those present in the lactophage p2.
It has hence been proposed that Dit protein modules are likely to be a conserved structural motif among siphophages infecting both gram-positive and gram-negative hosts. It has also been proposed that pb2 forms the TMP and possesses fusogenic and muralytic activity. In T5, host recognition occurs via the reversible binding of three L-shaped fibers (pb1) to the O-antigen of the lipopolysaccharide. The irreversible interaction of the RBP (pb5), which forms the tip of the central tail fiber, with the outer membrane iron-siderophore receptor FhuA initiates a cascade of conformational changes, leading to the expulsion of the TMP, which digests the peptidoglycan and perforates the cell membrane. The TMP plays a role in signal transduction by initiating the opening of the head-tail connector, which leads to DNA release from the capsid. The TMP likely forms a channel that extends the tail through the periplasm and allows the DNA to reach the host cytoplasm.


Short Non-Contractile Tails: Tube-Like Tail Structure

Bacteriophages belonging to the family Podoviridae are characterized by the presence of short non-contractile tails varying in length in the range of 10–30 nm (Fig. 4). They are the smallest family in the Caudovirales, considering the number of different phages discovered. In podophages, the tail is assembled directly onto the portal protein of the capsid after DNA packaging is complete. The Podoviridae tail is generally composed of two different proteins: a dodecameric protein that interacts with the portal protein (also called, confusingly, the head-to-tail connector in some phages), and a second, hexameric protein that forms the rest of the tail tube. One of the most extensively studied podophages is the Bacillus phage φ29, which infects a gram-positive host (Fig. 4). In φ29, the tail is approximately 380 Å long. Its portal-proximal part consists of dodecameric gp11 and is called the lower collar. It is a funnel-like structure to which twelve receptor-binding tailspikes (called ‘appendages’ in the φ29 literature) are connected. In another podophage, P22, X-ray crystallography has revealed a dodecameric toroidal gp4 complex (called the ‘head-tail connector’ in the P22 literature) to which six P22 tailspike complexes are attached. Phage RBPs that carry enzymatic domains, and thus appear stockier than fibers in electron microscopy images, owe their ‘tailspike’ name to the P22 tailspikes, which looked like small spikes attached to the P22 tail. A combination of the location, the stoichiometry and the predicted high α-helical content suggests that gp11 of φ29, gp4 of P22 and the head completion proteins of the long-tailed phages could be evolutionarily related. The distal part of the φ29 tail tube is thicker than the portal-proximal part. It is called the tail “knob” and is formed by a hexamer of gp9.
It plays a key role in the infection process because the distal end of the φ29 tube is blocked by a long loop of gp9 prior to DNA ejection. Interestingly, not only is the structure of the gp12 tail knob of the Streptococcus podophage C1 similar, but the tubular domain of φ29 gp9 and C1 gp12 is also found in the tail tube, neck and baseplate proteins of myophages and siphophages, such as the gp27 baseplate hub protein of T4 and the λ tail tube protein gpV. This is an indication that the tails of all three families of phages (Myoviridae, Siphoviridae and Podoviridae) have evolved from a common ancestor. The tail tip or distal part of the tail tube typically contains proteins that aid in digesting the host cell wall, in host recognition, or in both. φ29 degrades the Bacillus cell wall with the help of an enzyme located at the distal end of its tail knob (gp13), capable of cleaving the polysaccharide backbone and the peptide cross-links of the peptidoglycan layer. In phage P22, a long needle extending outwards from the center of the tail base is constructed by a single trimer of gp26. It has been observed that in addition to making the first contact with the host surface, the N-terminal end of the needle acts as a “plug” and prevents DNA leakage.

The Tailspikes, Tail Fibers and Phage-Host Interaction

As the host receptors continuously evolve, the genes encoding the tail fibers, tailspikes or tail appendages are under selective pressure to adapt to the ever-changing target, which can range from peptide sequences to polysaccharide moieties. The thin, long tail fibers are rigid and are flexibly attached to the tail. They scout the cell surface for a specific binding target and initiate adsorption. Tailspike proteins, on the other hand, are shorter, stubbier and possess enzymatic activity (although the substrates are known only for a subset of tailspikes). Tailspikes digest or modify cell surface polysaccharides, thus irreversibly binding the phage particle to the bacterial surface and allowing the tail tube to reach the cell membrane. This event triggers conformational changes in the phage, leading to the formation of a channel or a pore through which genetic material and proteins located in the capsid can traverse the bacterial cell wall. The tailspikes or fibers hence play a crucial role in the initial host recognition and subsequently in triggering the transfer of the genome. There is considerable diversity in length, shape and specific functionality, not only between the fibers and tailspikes of myo-, sipho-, and podophages, but even within each group. The majority of these spikes and fibers are homotrimeric, such as gp12 of φ29 and gp17 of T7. In phages such as φ29, the tailspike appendages recognize and cleave the glucosylated poly-glycerol phosphate teichoic acid by a mechanism similar to that seen in some ribonucleases. In phage C1, the appendages are more tail fiber-like and form a skirt around the tail. These appendages carry a flexible globular domain at their distal end. The dynamics of phage-host interaction depend on the nature of the bacterial host.
Gram-positive and gram-negative bacteria differ in cell wall thickness and components, which has led phages to adopt different infection strategies. Gram-positive bacteria lack an outer membrane, and phages typically interact with cell wall-associated teichoic acids and other glycopolymers embedded in the thick peptidoglycan layer. They must overcome this physical barrier to reach the cell membrane and deliver their genome into the bacterial host cytoplasm. After the phage has reached the cell membrane, it is thought that some protein–membrane fusion event is needed. The archetypal phage φ29 relies on the formation of a cone-shaped structure constructed out of the long loop of gp9 that exits from the tail knob upon genome release. This structure functions as a membrane pore, in a manner similar to the hydrophobic fusion peptides of enveloped eukaryotic viruses and the membrane-active peptides of non-enveloped viruses. This shared mechanism employed by eukaryotic and prokaryotic viruses is likely a consequence of convergent evolution. For phages of gram-negative hosts, the DNA needs to cross the peptidoglycan and the inner membrane layers. As discussed above, the tails of podo- and siphophages have to be extended by phage and, possibly, host proteins that connect the tail with the cytoplasm. In myophages, the tail tube crosses the outer membrane and the periplasmic space to contact the cytoplasmic membrane, which is invaginated near the tip of the tube. The involvement of other phage proteins (e.g., the TMP) or host proteins in creating the channel spanning the cytoplasmic membrane is unclear, and the tip of the tube might be capable of crossing the membrane.


The outer membrane of gram-negative bacteria is dotted with highly conserved porin proteins that serve as reversible binding sites for phage tail fibers. These fibers are also known to interact with the inner (e.g., fibers of bacteriophage T4) or outer (e.g., fibers of bacteriophage φCTX from the Myoviridae family) core of LPS. Phage tailspikes, on the other hand, recognize and degrade the O-antigen (the polysaccharide part of LPS), which brings the particle closer to the bacterial cell surface. This enables the recognition of a second receptor in the outer membrane, which paves the way for irreversible binding and triggers DNA release upon the proper orientation of the phage particle on the cell surface. A number of phages carry multiple sets of tailspike proteins on the particle, which allow them to expand their host range from different bacterial strains to different bacterial species (e.g., phage K1–5 infects E. coli K1 and K5, phage SP6 infects different Salmonella enterica serogroups, and phage SFP10 infects S. enterica and E. coli O157:H7). Phages from the T7 supergroup infect most E. coli strains (Fig. 4). The T7 particle contains six trimeric fibers of gp17 attached to the proximal end of the tail tube (to the dodecameric gp11 ring). Even though the other end of the fiber is free, and that part will eventually interact with a cell surface receptor, in the free state of the phage most of the fibers are bound to the capsid. Similar to the more complex T4 fibers and the side fibers of the siphophage T5, the T7 fibers are not straight but are bent roughly in the middle. In fact, the T7 fibers resemble the letter “L”. The rough LPS forming the bacterial outer membrane is the main receptor for T7 fiber binding. This interaction is accomplished by rigid body rotation of the fibers and leads to ejection of the capsid core proteins. The latter form an extension to the tail and connect it to the cytoplasm.
The extended tail then acts as the conduit for DNA transfer across the bacterial membranes while also protecting it from periplasmic nucleases. N4-like bacteriophages usually encode at least two different tailspike/tail fiber proteins with the larger protein directly attached to the phage tail and the smaller protein bound to the longer one (Fig. 4). The shorter tailspike (gp63.1) of the N4-like phage G7C is a deacetylase, and the host recognition of G7C depends not only on the main chain structure of the O-antigen, but on the presence of the acetyl group at a certain position of that particular O-antigen. N4 was the first phage that was shown to contain a large virion-encapsidated RNA polymerase (vRNAP) inside its capsid. This enzyme exits the capsid through the tail during infection and participates in genome transfer. The accumulating body of data shows that many phages contain vRNAPs in their capsids that are delivered (together with the genome) into the host cell cytoplasm where they initiate RNA synthesis, although such phages are usually much larger than N4 (e.g., AR9 and PBS1, giant phages of Bacillus subtilis).

Acknowledgments The Novo Nordisk Foundation Center for Protein Research is supported financially by the Novo Nordisk Foundation (NNF14CC0001). This work was also supported by an NNF Hallas-Møller Emerging Investigator grant (NNF17OC0031006) to NMIT. NMIT is a member of the Integrative Structural Biology Cluster (ISBUC) at the University of Copenhagen.


Relevant Website https://www.khanacademy.org/science/biology/biology-of-viruses/virus-biology/a/bacteriophages Bacteriophages ‒ Viruses ‒ Khan Academy.

Bacteriophage Tail Fibres, Tailspikes, and Bacterial Receptor Interaction Mateo Seoane-Blanco and Mark J van Raaij, National Center for Biotechnology, Madrid, Spain Meritxell Granell, National Center for Biotechnology, Madrid, Spain and Institute of Chemical Research of Catalonia (ICIQ), Tarragona, Spain r 2021 Elsevier Ltd. All rights reserved.

Glossary
Beta-structure Structural element of a protein made up of beta-strands (elongated parts of protein chains connected to each other via multiple main-chain hydrogen bonds; they can be parallel or anti-parallel). Beta-structured elements can take the form of beta-hairpins, beta-sheets or beta-barrels, among others.
Lipopolysaccharide (LPS) Cell wall component of Gram-negative bacteria which consists of a lipid A membrane anchor and a carbohydrate extracellular part. The carbohydrate part is composed of an inner core, an outer core and the variable O-antigen. Lipopolysaccharide is also known as endotoxin; its toxicity is due to the immune system reacting to it.
Peptidoglycan Bacterial copolymer consisting of carbohydrates and amino acids that forms a mesh-like layer outside the plasma membrane. It contains alternating residues of beta-(1,4) linked N-acetylglucosamine and N-acetylmuramic acid. A short peptide chain joins N-acetylmuramic acid residues together, leading to a three-dimensional network. Also called murein.
Tail fiber Long, thin receptor-binding proteins or protein complexes attached to the tail of a bacteriophage. Typical examples are the six long tail fibers of bacteriophage T4.
Tailspike Shorter, stubby receptor-binding proteins attached to the tail of a bacteriophage. A typical example is the tailspike of bacteriophage P22.
Teichoic acid Bacterial copolymers of glycerol phosphate (or ribitol phosphate) and carbohydrates that are linked to each other through phosphodiester bonds, fortifying the cell wall of Gram-positive bacteria.

Introduction

Tailed bacteriophages, belonging to the Caudovirales order (“cauda” is Latin for tail), consist of an icosahedral or prolate capsid, a neck complex, and a short or long tail (Fig. 1). For some phages, these tails are short (podoviruses), for others they are long and flexible (siphoviruses) and for yet others they are composed of two concentric tubes, of which the outer one is rigid and contractile (myoviruses). Phage tails are very important for correct host recognition and efficient DNA transfer into the bacterium. For phages infecting Gram-negative bacteria, the genome must avoid the periplasmic space, where many nucleases reside. Just before DNA ejection, members of the Podoviridae family (like T7 and P22) expel multiple copies of internal capsid proteins, which then assemble into a tube, spanning the cell wall and contacting the cytoplasmic membrane (Fig. 1(e) and (f)). Members of the Myoviridae family contract the outer tail sheath, driving the inner tail tube into the cell wall (Fig. 1(c) and (d)). Less is known about the exact infection mechanism for the Siphoviridae family, but they may recognize existing membrane-spanning protein complexes and subvert these to promote DNA transfer into the cytoplasm (Fig. 1(a) and (b)). Before DNA transfer can take place, however, the correct bacterium has to be identified. Bacteriophages recognize exposed structures of the bacterium they are going to infect (Fig. 1(g) and (h)). These structures may be proteins, like flagella, pilins, porins or transporters, or oligosaccharides, like lipopolysaccharide (LPS) for Gram-negative bacteria and teichoic acid for Gram-positive bacteria. Accurate host cell recognition is essential, because a phage that ejects its genome into the medium or into the wrong cell cannot replicate. Phage receptor-binding proteins perform this selection of encountered bacteria. These receptor-binding proteins are usually present in multiple copies.
Furthermore, each of these is in general homo-trimeric, i.e., it consists of three identical subunits. The presence of multiple trimeric copies leads to a strong avidity effect, allowing relatively weak individual interactions to contribute to strong binding of the phage. Bacteriophages may first encounter a potential host cell with their heads instead of their tails, and maintaining contact with this bacterium could be useful. Candidates for these head-host interactions are the head fibers of phi29, the immunoglobulin-like domains found in structural proteins of many phages and the Dec protein on the bacteriophage L head. These interactions may allow additional time for the tail fibers or tailspikes to recognize the primary receptor on the cell surface. Alternatively, these head proteins may also serve to maintain phages in an environment where suitable host bacteria may be encountered, for example by binding to intestinal mucus. Adsorption of the phage tail to the host often takes place in two steps: first a reversible interaction with a specific primary receptor on the surface of the target cell, and then irreversible binding to a secondary receptor. The phage proteins that mediate these interactions are usually tailspikes or tail fibers. Phage receptor-binding proteins can be divided into several classes (Fig. 2). Fibers are long and thin, and sometimes composed of more than one type of protein chain, as in phage T4 (Fig. 2(a)). Their length provides reach, perhaps necessary to bind distant receptors on the bacterial surface. Tailspikes are shorter and stubby and may contain an enzymatic function, helping to digest host capsule polysaccharides and allowing the phage to reach the host cell membrane. Examples are the tailspikes of phage P22 (Fig. 2(e)). Atomic structures of phage receptor-binding proteins have revealed new protein folds consisting mainly of beta-sheets with different topologies.
Often, the three monomers are intertwined, stabilizing the proteins and explaining their resistance to heat and denaturants.
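The avidity effect mentioned above can be illustrated with a deliberately simple toy model. The sketch below assumes that every receptor-binding site on a particle is occupied independently and with the same probability, which real fibers sharing one baseplate will not strictly satisfy; the function name `prob_attached` and the 20% occupancy value are hypothetical choices for illustration, not measured quantities.

```python
def prob_attached(p_single: float, n_sites: int) -> float:
    """Probability that at least one of n_sites independent sites is bound.

    p_single is the (assumed) probability that one individual site is
    occupied at a given moment; independence between sites is an assumption.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie between 0 and 1")
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    # The particle is detached only if every site is unbound at once.
    return 1.0 - (1.0 - p_single) ** n_sites


if __name__ == "__main__":
    # Hypothetical weak site, occupied 20% of the time.
    # 3 sites = one trimer; 18 sites = six trimeric fibers.
    for n in (1, 3, 6, 18):
        print(n, round(prob_attached(0.2, n), 3))
```

Under these assumptions a particle with many weak sites is far more likely to remain in contact with the cell than any single site would suggest, which is the qualitative point of the avidity argument.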


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00152-1


Fig. 1 Tailed bacteriophages and their infection mechanisms. Schematic drawing of a siphovirus attached to the cell wall (in brown) in pre-infection conformation (a) and in post-infection conformation (b). The capsid wall is colored blue, the DNA yellow, the portal complex red, the sheath dark blue, the baseplate dark brown, the tail fibers green and the spike in orange. Schematic drawing of a myovirus in pre-infection conformation (c) and in post-infection conformation (d). Coloring is as for the siphovirus, with the addition of the tail tube in dark red and the short tail fibers in pink. Schematic drawing of a podovirus in pre-infection conformation (e) and in post-infection conformation (f). The tail tube is colored dark blue and the core proteins are colored dark red. Schematic representations of a typical cell wall of a Gram-negative (g) and of a Gram-positive (h) bacterium are also shown, with the different components labeled (LPS, lipopolysaccharide; OM, outer membrane; IM, inner membrane; WTA, wall teichoic acid; LTA, lipoteichoic acid).

In this article, we will review known structures of tail fibers and tailspikes. Detailed knowledge of these receptor-binding proteins will inform us on their receptor-binding properties and allow us to better understand phage infection. It may also allow site-directed mutation to direct phages to different receptors, or the generation of chimeric receptor-binding proteins for the same purpose. Possible applications are detection and elimination of bacteria and incorporation into protein-based biotechnological devices.

Fig. 2 Tailspikes and tail fibers. (a) Bacteriophage T4 (EMDB entry EMD-2774). Tail fiber proteins are shown in blue (proximal tail fiber protein gp34), yellow ("knee" protein gp35), dark green (distal tail fiber protein gp36) and red (distal tail fiber protein gp37). The structures of the C-terminal regions of gp34 and gp37 (ribbon diagrams) are shown in their approximate positions. The baseplate (light blue) contains the six short tail fibers, which are made up of gp12 (one is shown as a ribbon diagram). The carboxy-terminal part of gp37 (b; PDB entry 2XGF), the carboxy-terminal part of gp34 (c; PDB entry 5NXH) and full-length gp12 (d; from PDB entry 5IV5) are shown as ribbon diagrams. (e) Bacteriophage P22 (EMDB entry EMD-1220). P22 tailspikes (gp9) are in yellow. A close-up of a gp9 trimer with each chain colored differently is also shown (f; PDB entry 2XC1). The Salmonella O-antigen receptor (ball representation in red and yellow; from PDB entry 1TYX) is shown bound to one of the chains. Amino- and carboxy-termini are indicated for each protein (Nt and Ct, respectively).

Structures of Bacteriophage Tail Fibers

Although most podoviruses appear to have tailspikes, some have relatively long and thin fibrous appendages (Figs. 2 and 3). Coliphage T7, a well-studied example, has six kinked fibers. The T7 tail fibers are each formed by a parallel homotrimer of gp17. The kinked fibers are comprised of a rather large (when compared to other fibers) amino-terminal tail attachment domain, a thin shaft, a flexible kink, and a carboxy-terminal domain that is apparently composed of four nodules. The structure of the distal three nodules is known (Fig. 3(c)). This structure revealed a pyramid domain and a globular carboxy-terminal domain. The trimeric pyramid domain is composed of three interacting beta-strands that become shorter towards the carboxy-terminus, yielding a pointed shape. Each beta-sheet is formed by one beta-strand from the first monomer, followed by five beta-strands from the next monomer and three strands from the third monomer. A short alpha-helical neck connects the pyramid domain to the carboxy-terminal globular domain, which is composed of three beta-sandwiches, one contributed by each of the three gp17 molecules. The gp17 beta-sandwich has a unique topology, although it closely resembles the beta-sandwich head domain of adenovirus fibers. The Salmonella podovirus epsilon15 has six fibers, each made up of a trimer of gp20. Like the T7 fibers, they have a slender amino-terminal part and a flexible kink, but they have a much thicker and somewhat longer carboxy-terminal part. The carboxy-terminal part consists of a tailspike-shaped domain and ends in a flower-like structure with three petals. The tail of myovirus bacteriophage T4 is the best-studied contractile system. The tail consists of a sheath, an internal tail tube and a baseplate which is situated at the distal end of the tail (Fig. 2(a)). Attached to the baseplate are the fibers. There are six long or side tail fibers, which interact with the primary receptor, and six short or central tail fibers, which interact with the secondary receptor. The long tail fibers of bacteriophage T4 are kinked structures composed of four different proteins: gp34, gp35, gp36 and gp37. The proximal half-fiber is formed by a parallel homo-trimer of gp34.
The amino-terminal end of gp34 is attached to the baseplate, while the carboxy-terminal end interacts with the distal half-fiber, presumably with gp35 and/or gp36. A monomer of gp35 forms the “knee”. The distal half-fiber is composed of trimers of gp36 and gp37. The gp36 protein subunit is located at the proximal end of the distal half-fiber, while gp37 makes up the distal part, including the receptor-recognizing tip. The short tail fibers are trimers of gp12.
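The fiber composition just described implies a copy number per virion for each fiber protein. The short sketch below works this out under one explicit assumption that the text does not state outright: each long tail fiber contains exactly one trimer each of gp34, gp36 and gp37 plus one gp35 monomer. The dictionary and function names are hypothetical and serve only as bookkeeping.

```python
# Six long and six short tail fibers per T4 particle, as stated in the text.
FIBERS_PER_VIRION = {"long": 6, "short": 6}

# Polypeptide chains per fiber. ASSUMPTION: one trimer of each trimeric
# component per fiber; gp35 is a single monomer forming the "knee".
CHAINS_PER_FIBER = {
    "long": {"gp34": 3, "gp35": 1, "gp36": 3, "gp37": 3},
    "short": {"gp12": 3},
}


def copies_per_virion() -> dict:
    """Multiply chains per fiber by fibers per virion for every protein."""
    totals = {}
    for kind, n_fibers in FIBERS_PER_VIRION.items():
        for protein, chains in CHAINS_PER_FIBER[kind].items():
            totals[protein] = totals.get(protein, 0) + n_fibers * chains
    return totals


if __name__ == "__main__":
    for protein, copies in sorted(copies_per_virion().items()):
        print(protein, copies)
```

If the one-trimer-per-fiber assumption were wrong for gp36 or gp37, only the corresponding entry in `CHAINS_PER_FIBER` would need to change.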


Fig. 3 Structures of tail fibers. Ribbon diagrams of the distal regions of the T5 fiber (a; PDB entry 4UW7), the T5 fiber with its intramolecular chaperone domain (b; PDB entry 4UW8), the T7 fiber (c; PDB entry: 4A0T), the phage S16 fiber consisting of three chains of gp37 and one chain of gp38 (d; PDB entry 6F45) and the Mu fiber with its chaperones (e; PDB entry 5YVQ). Fiber chains are colored magenta, cyan and orange and their chaperones in light pink, purple and dark yellow. Gp38 of phage S16 is colored green. Amino- and carboxy-termini are indicated for each protein (Nt and Ct, respectively).

For the correct folding of gp12, gp34 and gp37, a molecular chaperone, gp57, is necessary. It is a small protein of 79 residues that lacks aromatic amino acids, cysteines and prolines. In vitro, it adopts different oligomeric states. Another chaperone, gp38, must be present for the correct trimeric assembly of gp37. The molecular bases of the gp38 and gp57 chaperone activities are unclear, but it has been proposed that gp57 functions to keep fiber protein monomers from aggregating non-specifically before folding is completed, while gp38 may bring together the carboxy-terminal ends of gp37 monomers to start the folding process. In the related coliphages T2 and T6, in the Salmonella phage S16, and presumably in many other as yet undiscovered T4-like phages, gp38 plays another role. Instead of being just a chaperone, gp38 stays bound to the end of gp37 and serves as the de facto receptor-binding protein. Structures are known for the carboxy-terminal region of gp34 (Fig. 2(c)) and the receptor-binding needle of gp37 (Fig. 2(b)). The carboxy-terminal part of gp34 is important for folding, which is generally true for viral fiber proteins. Crystal structures of carboxy-terminal parts of gp34 revealed three structural repeats containing three central alpha-helices preceded by intertwined loops. Seven more of these repeats are likely present in the unknown amino-terminal part of the structure. Carboxy-terminal to the repeat region, a short triple-helical region followed by two turns is present. The rest of the structure is a long triple beta-helix, interspersed with three tower domains. The tower domains are formed by three anti-parallel beta-sheets, one contributed by each of the three monomers. The first tower domain is decorated with loops, while the loops of the second tower domain are very short and mainly beta-turns. The last tower domain is extensively decorated, including several short alpha-helices and a long beta-hairpin extending towards the carboxy-terminus.
The carboxy-terminal part of gp37 revealed an elaborately interwoven trimeric structure formed by a globular collar domain (about 45 Å wide, at the bottom in Fig. 2(b)), an elongated needle domain (around 15 Å wide) and a small head domain with a width of around 22 Å (at the top of Fig. 2(b)). Each of the three chains runs from the collar domain to the end of the tip and twists around a neighboring chain before turning back, with both the amino- and carboxy-terminus located near the bottom of the collar domain. In the collar domain, each monomer contains a small beta-sandwich of anti-parallel sheets with an alpha-helix inserted into a loop of the outside beta-sheet. The three inner sheets provide the trimer interactions. Next to the collar domain is a small intertwined region. The carboxy-terminus of each gp37 is folded back into the collar domain, making the needle domain an insertion into the collar domain, projecting outwards. The needle domain is a 150 Å long six-stranded antiparallel right-handed twisted circular sheet, with each of the chains completing one-and-a-half turns (about 540°) around the fiber axis. In the core of the needle domain, hydrophobic and hydrophilic regions alternate, with the latter forming metal-binding sites. Seven iron ions are coordinated octahedrally by two histidine residues from each chain (six histidines per iron ion). Twenty-six residues from each of the three chains form a compact, interwoven head domain of 22 Å in diameter and 18 Å high. The first fourteen amino acids loop around a neighboring chain; the chain then threads through the loop of a neighbor, before turning back into the needle domain. The head domain, being located at the extreme distal end of the long tail fiber, is likely to play a primary role in receptor binding. The short tail fibers of bacteriophage T4 are parallel homotrimers of gp12. Morphologically, they consist of an amino-terminal virus-binding region, a long thin shaft domain and a more globular carboxy-terminal receptor-binding domain (Fig. 2(d)). The amino-terminal part of the short tail fiber is made up of a baseplate-binding region or domain of unknown structure, followed by several of the repeats also present in gp34 (described above). The repeats are followed by a triple beta-helix domain, which is connected to a collar domain by a short alpha-helical neck. From the collar domain, the globular receptor-binding domain protrudes, just as the needle domain protrudes from the gp37 collar domain. The carboxy-terminal globular part has a complicated "knitted" fold, in which the three protein chains interact extensively with each other. A metal ion was found to be buried deep inside the carboxy-terminal domain.
It is positioned on the threefold axis of the protein and is coordinated by the side chains of two histidines from each of the three chains, like the seven iron ions in the long tail fiber. The distal carboxy-terminal end of the trimeric Salmonella phage S16 gp37 protein forms a triangular beta-prism domain and ends in a short triple beta-helix (Fig. 3(d)). The monomeric gp38 adhesin has a modular design. Its amino-terminal attachment domain binds to the tip of gp37 via a hydrophobic platform, whereas the carboxy-terminal specificity domain determines affinity for the host cell receptor. The small amino-terminal gp37 attachment domain forms a helical bundle of three short alpha-helices, from which three tryptophan residues intercalate into three sites in the gp37 platform. Around the hydrophobic platform, a ring of three-fold symmetric hydrogen bonds completes the stable interface. The specificity domain contains a single beta-helix, from which loops project in the direction away from the phage (Fig. 3(d)). Two of these loops are long and contain glycine-rich motifs. There are ten of these glycine-rich motifs, two in one loop and the remaining eight in the other. These motifs form poly-glycine type II helices that are folded into a packed lattice, forming a PGII sandwich. Sequence variability between related phages is limited to surface-exposed helices and distal loops that form putative receptor-binding sites. The structure of the bacteriophage Mu fiber (gpS) revealed an alpha-helical shaft, a central domain containing beta-sheet domains (called “tower” domains) and a triple beta-helix domain, plus a carboxy-terminal beta-strand domain (Fig. 3(e)). The carboxy-terminal domain is relatively thin and elongated, and has distal protruding loops which are postulated to form the receptor-binding site. The amino-terminal phage-binding domain was missing from the structure.
Interestingly, upon co-expression, the Mu fiber chaperone gpU remained bound to the carboxy-terminal domain of gpS and part of its structure could be resolved. GpU contains two domains, an amino-terminal domain that binds to gpS and a carboxy-terminal domain (only partly visible in the structure) that is presumed to be responsible for chaperone oligomerization; this organization is proposed to be general to gpU-like chaperones. Phage particles with and without gpU are both infectious, and both the fiber and the chaperone appear to specifically bind LPS. The structures of the side tail fibers of other “simple” myoviruses like phage P2 are likely similar to the fiber of phage Mu. This is based on the known structures of the fibers from pyocins R1 and R2, which in turn show sequence homology with the fiber of P2. Both the R1 and R2 fibers contain tower domains, thin shaft domains and a carboxy-terminal lectin-like domain. The shaft domains are composed of helix-plus-turn motifs, which intertwine with their counterparts in the trimer. The carboxy-terminal lectin-like domain may be involved in receptor binding. Siphoviruses lambda and T5, which infect the Gram-negative bacterium Escherichia coli, possess fibers at the bottom and sides of their tail tips. The side tail fiber proteins are involved in recognition of cell surface molecules, while the central fiber proteins play an active role in membrane penetration. No high-resolution structure of a siphovirus central tail fiber has been determined, but structural information on some siphovirus side tail fibers is available. Interestingly, the carboxy-terminal part of the lambda side tail fiber has high sequence identity to the carboxy-terminal part of phage T4 gp37, and thus must also form a collar and needle domain. An extra metal-binding motif suggests that the needle domain of the lambda side tail fiber contains eight instead of seven iron ions.
The receptor-binding head domain is not homologous and has probably mutated to bind a different receptor. Both the lambda and T5 side tail fibers need a chaperone for folding, but while the lambda chaperone is a separate protein, the T5 chaperone is intramolecular (see below). Interestingly, the lambda side tail fiber chaperone can functionally substitute for the T4 chaperone required for gp37 assembly, gp38. Bacteriophage T5 has a flexible, three-fold symmetric tail, to which three L-shaped side tail fibers are attached. These fibers recognize oligo-mannose units on the bacterial cell surface prior to infection and are composed of homotrimers of the pb1 protein. Pb1 has 1396 amino acids, of which the carboxy-terminal 133 residues form a trimeric intra-molecular chaperone that is auto-proteolyzed after correct folding. The structure of a trimer of the preceding residues 970–1263 resembles a bullet, with the walls formed by partially intertwined beta-sheets, conferring stability to the structure (Fig. 3(a)). The topology of the structure is unique to pb1 and the protein inhibits T5 infection by competition. A site-directed mutant (Ser1264 to Ala), in which autoproteolysis is impeded, revealed the structure of the additional chaperone domain (residues 1263–1396; Fig. 3(b)). It consists of a central trimeric alpha-helical coiled-coil flanked by a mixed alpha-beta domain. Three long beta-hairpin tentacles, one from each chaperone monomer, extend into long curved grooves of the bullet-shaped domain.

Bacteriophage Tail Fibres, Tailspikes, and Bacterial Receptor Interaction

Fig. 4 Structures of beta-helical tailspikes. (a) Salmonella phage P22 tailspike (PDB entry 2XC1). The amino-terminal phage-binding domain (PDB), the central beta-helical receptor-binding domain (RBD) and the carboxy-terminal interdigitated domain (CTD) are indicated. The three chains are colored green, magenta and cyan. (b) The host recognition apparatus of coliphage CBA120, consisting of tailspikes 1 (purple, PDB entry 4OJ5), 2 (green, PDB entry 5W6P), 3 (blue, PDB entry 5W6F) and 4 (brown, PDB entry 5W6H). Domains for which high-resolution structures are unknown, but that link the four tailspikes to each other, are shown as circles or squares. (c) Bacillus phage Phi29 appendix (PDB entry 3SUC). The locations of the linker (D1) that links to the amino-terminal phage-binding domain, the beta-helical domains D2 and D3, and the intramolecular chaperone domain D4 are indicated. Coloring is as in part (a).

High-resolution structures of fibers from bacteriophages that infect Gram-positive bacteria are not yet known. Bacillus bacteriophages PBP1 and AR9 have long helical fibers with which they ensnare bacterial flagella. SPO1-like phages (e.g., A511) have fibers that fold back under the baseplate, similarly to the way T4 gp12 does, but whether there are structural similarities remains to be seen.

Structures of Bacteriophage Tailspikes and Other Receptor-Binding Proteins

The first high-resolution tailspike structure to be solved was that of Salmonella phage P22 (Fig. 4(a)). In each tailspike, a small trimeric beta-structured amino-terminal domain connects with the rest of the virus. These virus-binding domains are connected by a short alpha-helical coiled-coil linker to a central domain, allowing for some flexibility. The central domain is a trimer of parallel left-handed beta-helices, with each helix containing thirteen complete turns. This beta-helix domain is responsible for carbohydrate binding and cleavage (endorhamnosidase activity). The carboxy-terminal domain of the P22 tailspike consists of interdigitated beta-strands and has been shown to be important for correct protein trimerisation and folding. Many of the lessons learnt from the P22 tailspike structure have turned out to be general. The parallel beta-helix fold present in other tailspike proteins is the most obvious example of this. This fold appears to be especially suitable for carbohydrate cleavage reactions. A monomeric beta-helix fold was first discovered in the enzyme pectate lyase; bacteriophage tailspikes have adopted the fold for efficient recognition and cleavage of host cell wall polysaccharides. Another general lesson is that folding of tailspikes (and of tail fibers and most other viral receptor-binding proteins) starts at the carboxy-terminus. The carboxy-termini of three monomers come together, sometimes with the help of a chaperone, and folding starts. It might appear to be counterproductive to wait until the protein chain is synthesized completely before folding starts. However, for proteins containing repetitive structural motifs, folding the carboxy-terminal domains first makes out-of-register interactions, which might lead to unproductive misfolding, less likely.

Tailspike structures have also been determined for other phages. The Salmonella myovirus Det7 and siphovirus 9NA both contain a tailspike with high sequence identity to the Salmonella podovirus P22 tailspike (except for the phage-binding amino-terminal domain). Not surprisingly, the three structures turned out to be very similar, containing the same beta-helix and carboxy-terminal domains. The Shigella flexneri podovirus Sf6 also has a tailspike with a beta-helix domain composed of 13 complete turns, although its sequence has no homology to the beta-helix of the P22 tailspike. The Sf6 tailspike beta-helices are wound around each other with a slight left-handed twist. At the distal carboxy-terminal end of the tailspike, each monomer forms a beta-sandwich domain. A similar carboxy-terminal beta-sandwich domain is also observed in the Escherichia coli podovirus HK620, although with the domain rotated by about 180° around the trimer axis. The tailspike of Pseudomonas podovirus LKA1 has a beta-barrel insertion domain in its beta-helix, which contributes to the active site. Here, the carboxy-terminal domain is a longitudinal beta-sandwich. A similar longitudinal beta-sandwich is also present in a second tailspike of the aforementioned phage Det7, which in fact has a total of four different tailspikes. In this second Det7 tailspike the beta-sandwich domain has homology to a lectin domain, so apart from a putative role in trimerisation and folding, this carboxy-terminal domain may also help the tailspike bind to the LPS O-antigen. The myovirus CBA120 also has four tailspikes, attached to the side of the baseplate (Fig. 4(b)). The four tailspikes allow binding to and infection of different hosts. Tailspike 1 recognizes Salmonella enterica serovar Minnesota, tailspike 2 Escherichia coli O157, tailspike 3 Escherichia coli O77 and tailspike 4 Escherichia coli O78 strains.
All four tailspikes have a central beta-helix domain, followed by a bend and another beta-helix domain at the carboxy-terminus (although the carboxy-terminal beta-helix of tailspike 3 is very small). The central beta-helix domains are involved in receptor-binding, while the carboxy-terminal beta-helix domains may be responsible for correct folding. Tailspike 4 has five small amino-terminal domains with which it attaches to the baseplate, to tailspike 2 and to tailspike 1 (Fig. 4(b)). The amino-terminal domain of tailspike 3, in turn, attaches to an amino-terminal domain of tailspike 2. Examples of podoviruses with multiple tailspikes are Salmonella phage SP6, Escherichia coli phage G7C and Klebsiella phage KP32. Phage SP6 has two tailspikes, recognizing the O-antigens of Salmonella enterica serovars Typhimurium and Newport, respectively. The two tailspikes are attached to the phage via an adapter protein. This adapter protein can rotate, placing the appropriate tailspike in the correct orientation for host recognition and infection. In the case of coliphage G7C, one tailspike, gp66, binds to the phage, and the second tailspike, gp63, binds to an amino-terminal region of gp66. Gp63 has six domains, of which the first two resemble those of the phage CBA120 tailspike 1. These domains have thus evolved to organize the relative arrangement of tailspikes in myovirus baseplates and in podovirus tails. Domain 3 is a connector domain, also seen in tailspikes with endosialidase activity (see below). Domain 4 is an esterase domain, and it has been shown that the esterase activity is necessary for infection. Domains 5 and 6 are homologous to non-catalytic carbohydrate binding domains. This is an example of a tailspike without a beta-helix fold. Klebsiella phage KP32 has a similar organization to coliphage G7C, with one tailspike, gp37, binding to the phage, and the other, gp38, binding to an amino-terminal domain of gp37. The crystal structure of gp38 is known.
A flexible amino-terminal region is followed by a catalytic right-handed parallel beta-helix domain, and two non-catalytic carboxy-terminal carbohydrate binding domains. Each monomer of the trimeric tailspike of the Acinetobacter baumannii phage PhiAB6 has a small amino-terminal beta-structured phage-binding domain. This domain is followed by a short alpha-helix and a beta-helical exopolysaccharide cleavage domain. The carboxy-terminal part is formed by a triangular beta-prism, an interdigitated region, another triangular beta-prism and a carboxy-terminal beta-sandwich. Structures of other Acinetobacter phage tailspikes have also been solved. The appendices of the Gram-positive Bacillus subtilis phage Phi29 (Fig. 4(c)) also contain parallel left-handed beta-helices. Each appendix has an amino-terminal domain D1 of unknown structure, which is bound to the phage neck. A short triple coiled-coil domain connects the amino-terminal domain to a barrel-shaped domain D2 composed of three parallel left-handed beta-helices. The beta-helices in this domain contain around 24 residues per turn and, as in the P22 tailspike, have inserts of variable lengths that form loops. This domain looks similar to the central domain of the P22 tailspike and may thus be involved in receptor-binding. Supporting this is a bound buffer molecule in one of the crystal structures (PDB entry 3GQ8). Carboxy-terminal to this domain lies another, thinner and much more regular beta-helix domain D3, with turns that contain only around fifteen residues (without significant inserts). Between the two beta-helical domains, a cyclical swap of the polypeptide chains takes place. To infect Escherichia coli K1, which is surrounded by a polysialic-acid capsule, bacteriophage K1F is equipped with receptor-binding proteins that have an endosialidase activity. These proteins are also trimers but lack the typical beta-helices of conventional tailspikes.
Instead, the receptor-binding proteins of phage K1F have an amino-terminal phage-binding domain of largely unknown structure, central beta-propeller and beta-barrel domains and carboxy-terminal beta-prism and triple beta-helix domains (Fig. 5(a)). Endosialidases depend on alpha2,8-glycosidically linked sialic acid oligomers. The beta-barrel domain is where the enzymatic polysialic-acid cleavage reaction takes place. Interestingly, the phage K1F tailspikes and phage Phi29 appendices have intramolecular chaperones like that of the phage T5 fiber (and were actually structurally analyzed in these phages first). The central parts of the chaperones are structurally homologous to each other, although the chaperone from Phi29 (D4 in Fig. 4(c)) is smaller and lacks the carboxy-terminal triple coiled-coil present in the intramolecular chaperones of the K1F tailspike and the T5 fiber. The well-studied lactococcal phages p2, TP901-1 and Tuc2009 all have large baseplate assemblies, containing arrays of six or 18 trimeric globular receptor-binding proteins. These receptor complexes can bind various saccharidic moieties and likely interact with teichoic or lipoteichoic acid on the cell surface. Although these receptor-binding proteins are different in structure from tail fibers, they still share a trimeric structure and also possess a short region of beta-helix as is seen in tailspike proteins. Interestingly, phage p2 appears to bind to the cell surface using only its receptor-binding proteins, while phage TP901-1 possesses both receptor-binding proteins and a central fiber protein, which may also be involved in cell surface recognition and/or cell-wall degradation. The receptor-binding proteins achieve high-affinity binding through avidity resulting from the large number of receptor-binding proteins displayed on the baseplate. The receptor-binding protein of lactococcal phage p2 has an amino-terminal shoulder domain, a central neck domain and a globular head domain (Fig. 5(c)). The shoulder domain has a beta-sandwich fold and serves to anchor the protein to the baseplate. The neck domain is a triple-stranded beta-helix, with the three protein chains of the trimer winding around each other. Finally, each monomer of the head domain forms a seven-stranded beta-barrel with the same topology as that of the adenovirus and reovirus fiber head domains (and different from the coliphage T7 fiber head domain). The receptor-binding protein of the lactococcal phage TP901-1 has the same head and neck domains as the receptor-binding protein of phage p2, but the shoulder domains are replaced by a parallel bundle of three helices (Fig. 5(d)). Two staphylococcal phages, phi11 and P68, have been shown to have receptor-binding proteins with very similar structures. This is likely to be another example of horizontal gene transfer between different phage types. In the siphovirus phi11, the protein is folded away into the baseplate, while in the podovirus P68, the protein extends away from the phage. Both structures consist of a long, kinked, three-membered parallel alpha-helical bundle. This stalk domain is followed by a five-bladed propeller domain, which may harbor the receptor-binding site. At the carboxy-terminal end, two tower domains are present, which probably have a role in trimerization and structural stability.
Appreciable differences between the two structures are the hinge, which in the phi11 receptor-binding protein (Fig. 5(b)) is more acutely folded, and the presence of an octahedrally coordinated iron ion in the phi11 receptor-binding protein, which is absent in the P68 receptor-binding protein.

Fig. 5 Atypical tailspikes and other receptor-binding proteins. (a) Coliphage K1F endosialidase (PDB entry 3JU4). (b) Receptor-binding protein from staphylococcal phage phi11 (PDB entry 5EFV). (c) Receptor-binding protein of lactococcal phage p2 (PDB entry 1ZRU). (d) Receptor-binding protein of lactococcal phage TP901-1 (PDB entry 3EJC). For each panel, the three protein chains are colored green, magenta and cyan. Domains, amino-termini (Nt) and carboxy-termini (Ct) are indicated, where appropriate.

Host Cell Recognition

Fig. 6 Structures of bacteriophage receptor-binding proteins in complex with bacterial receptors. (a) Salmonella phage P22 tailspike complexed with O-antigen octasaccharide (PDB entry 3TH0). (b) Tailspike protein of Sf6 bacteriophage bound to Shigella flexneri O-antigen octasaccharide fragment (PDB entry 4URR). (c) Endosialidase of bacteriophage K1F in complex with oligomeric alpha-2,8-sialic acid (PDB entry 1V0F). (d) Receptor-binding protein of Lactococcus phage 1358 in complex with a bacterial cell wall trisaccharide (PDB entry 4RGA). Approximate locations of active sites for polysaccharide cleavage are shown with asterisks where appropriate. In panels (a), (b) and (d), the receptor-binding proteins are viewed from the side, with the carboxy-terminal ends at the bottom of the figure; in panel (c) the protein is viewed from the bottom, i.e., down from the carboxy-terminal end towards the amino-terminal end.

Cell-surface recognition by bacteriophages is a two-step process. While the two steps of interaction with the cell surface may involve only one type of surface molecule, many phages interact with more than one receptor. The phage T5 tail first interacts nonspecifically with the O-antigen of E. coli LPS through its L-shaped tail fibers (pb1) and then makes strong specific interactions with its cell-surface receptor, FhuA, through the central tail fiber, pb5. In an analogous manner, bacteriophage SPP1 binds reversibly to cell-wall teichoic acids and irreversibly to a specific cell surface receptor, YueB. The tailspikes or fibers mediate initial, reversible contact of the virion with the target cell, but in general do not trigger DNA release from the virion. The role of primary receptor binding is to speed up infection by allowing a two-dimensional search on the bacterial surface, rather than a three-dimensional search of a large volume. Once the phage is in the correct orientation and/or a secondary receptor is encountered, irreversible binding takes place. At this point, the phage is committed to DNA transfer into the host. The phage, as a microbiological particle, ceases to exist, but its genetic material will direct the genesis of multiple daughter phages, provided that the bacterium is a suitable host and does not successfully defend itself by a post-infection mechanism (such as a restriction enzyme or CRISPR-Cas system). Most of the fibers of podovirus T7 normally point upwards, interacting with the phage head. Occasionally, a detached fiber may bind to the host cell LPS. When this happens, the phage may “walk” over the host cell surface, alternately detaching fibers that recognize the host LPS. When all six fibers are detached and bound to the bacterium, the phage is in the correct orientation for infection, with the short conical tail pointing towards and contacting the bacterial outer membrane. Whether there is a specific secondary receptor interaction between the tail and the membrane or a membrane protein is currently unknown. Core proteins then pass through the tail, forming a protective tube through the periplasm for the phage genome to pass through. Podovirus P22 tailspikes bind glycans in long grooves along their length.
Each of the three monomers has a long groove that binds and cleaves the Salmonella O-antigen receptor through endo-rhamnosidase activity (Fig. 6(a)). Other tailspikes, such as that of Shigella phage Sf6, have the binding site (and active site) in grooves between monomers, with contributing active site residues from both neighboring monomers for each of the three sites in the trimer (Fig. 6(b)). The O-antigen degradation activity allows the phage to get near the bacterial outer membrane and position itself in the right orientation for infection (with the tail tube pointing towards the cell wall). Apart from these two examples, additional structures for tailspikes in complex with their oligosaccharide receptors have been reported. Furthermore, tailspikes often have additional carbohydrate binding domains, which may help increase the affinity for the bacterial cell wall. Where bacteria have developed different cell wall structures, phages have adapted to bind and, where necessary, digest these. The endosialidase of bacteriophage K1F is an example of this. Poly-alpha2,8-sialic acid binds to the beta-barrel domains of the endosialidase and to the triple beta-helix domain (Fig. 6(c)). Only the former site is thought to be involved in polysaccharide cleavage, with the active site on the surface of the beta-propeller domain of a neighboring monomer (Fig. 6(c)). With regard to the enzymatic reactions, endo-depolymerases appear to be preferred over exo-depolymerases, which prevents bacteria from inhibiting their activity by "capping" the polysaccharides. The siphoviruses lambda and T5 recognize host LPS with their long, L-shaped tail fibers. The exact receptor for the coliphage lambda long tail fibers has not been determined, but the coliphage T5 long tail fibers bind to alpha(1,2)-linked oligo-mannose of the O-antigen of O8- and O9-type Escherichia coli.
Interestingly, it appears that before cleavage of the intramolecular chaperone, the L-shaped fibers of phage T5 do not bind their receptors, suggesting that the long beta-fingers cover up the oligo-mannose binding site. However, it has to be kept in mind that both lambda and T5 can infect perfectly well in the laboratory without their long tail fibers, utilizing only the interaction of the central tail fiber with their secondary receptor proteins LamB and FhuA, respectively, even though the rate of irreversible adsorption of the phage to its host cell is substantially reduced.

Many bacteriophages that infect Gram-positive bacteria are thought to recognize teichoic acid moieties, but, unfortunately, no structures of receptor-binding proteins with teichoic acid analogs have yet been determined. Some Gram-positive bacteria form cell wall structures that cover the teichoic acid layer. An example is Lactococcus lactis strain SMQ388, which forms a polysaccharide cell wall "pellicle" made up of specifically linked N-acetyl-glucosamine, galactose and glucose units. The siphovirus 1358 can recognize this polysaccharide, and the structure of the receptor-binding protein in complex with a trisaccharide analog has been determined (Fig. 6(d)). This protein has a structure related to that of the receptor-binding protein of lactococcal phage p2, with a beta-sandwich shoulder domain and a larger head domain, but lacking the triple beta-helical neck. In the case of the myovirus T4, in adverse conditions for phage multiplication, the long tail fibers are in a retracted conformation, lying against the tail sheath and head of the bacteriophage. In the extended conformation, only the proximal end of the fiber is attached to the baseplate. Primary host cell interaction is mediated by the six long tail fibers. As gp37 is known to interact with the glucosyl-alpha-1,3-glucose terminus of LPS and protein-saccharide interactions almost always involve stacking of sugar residues onto aromatic amino acid side chains, aromatic surface amino acids are attractive candidates for receptor binding. Basic residues like lysine and arginine may also interact with the LPS phosphate groups. Gp37 may also bind reversibly to the outer membrane protein OmpC. If a certain number of long tail fibers are bound to the host, estimated to be three or four, they initiate a conformational change of the baseplate.
In the uncontracted phage, the short tail fibers form a garland around the baseplate, in which the amino-terminal domain, a region forming a kink in the shaft domain and the receptor-binding domain are bound to the baseplate. During infection, the conformation of the baseplate changes and only the amino-terminal domain remains bound to the baseplate, releasing the rest of the protein and allowing the receptor-binding domain to bind strongly to the LPS oligosaccharide core. When six of the short tail fibers bind to LPS, the phage becomes effectively bound irreversibly to the host, ready for the extensive conformational changes necessary for DNA transfer. The conformational change of the baseplate triggers contraction of the tail sheath, which drives the tail tube through the cell membrane while the tail lysozyme digests the peptidoglycan layer, and ejection of the phage DNA through the tail tube channel into the bacterial cytoplasm. Phages are not restricted to having only one type of receptor-binding protein, and several mechanisms to increase the number of hosts that can be infected are known. Phages can carry two or more types of tailspikes that have different target specificities, either of which can function in primary receptor binding. For example, coliphage K1-5 is a T7-like phage that has two different tailspikes in its virion, binding alpha-polysialic acid (K1) or poly-glucuronic acid-N-acetylglucosamine (K5) capsular polysaccharides. The Bordetella phage BPP-1 has a reverse transcriptase mechanism with which it creates sequence variability in the receptor-binding site of its fiber. This variability helps the phage keep up with phase variation of the host, allowing it to evolve fibers that recognize the changed host cell wall.
The modular nature of tailspikes and tail fibers favors such gene duplication, domain exchange and mutational events, because changes in the receptor-binding domains do not affect the interaction of the spikes and fibers with the rest of the virion. Some bacteriophages alter the carboxy-terminal regions of tail fiber proteins by DNA inversion, changing their host range. An example is bacteriophage Mu, where the invertible gene segment is called G, and the corresponding site-specific DNA invertase enzyme is Gin (the gene for Gin maps outside the G-segment). The G-segment starts inside the long tail fiber gene (S) and also contains the tail fiber adapter gene (U) and, in inverted form, the genes U' and S'. U' codes for an alternative tail fiber adapter and S' for an alternative carboxy-terminal end of the long tail fiber. The orientation of the G-segment thus determines which carboxy-terminal end of the fiber is expressed and which bacterial hosts can be infected. Mu G(+) phage particles expressing S and U are only infectious for Escherichia coli, while Mu G(-) phage particles expressing S' and U' have a wider host range and can also infect Citrobacter freundii, Enterobacter cloacae, and Serratia marcescens.
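
The G-segment switch of phage Mu described above can be summarized as a small lookup, given here as a minimal sketch rather than a model of the underlying biochemistry. The names `GState`, `invert` and `can_infect` are hypothetical and introduced only for illustration; the alternative genes are written as S' and U', and the host lists simply restate the hosts named in the text (the G(-) list includes Escherichia coli on the assumption that "also" in the text means in addition to E. coli).

```python
from dataclasses import dataclass

# Hypothetical illustration of the Mu G-segment switch; the gene names and
# host lists restate the text and are not an exhaustive host-range table.
@dataclass(frozen=True)
class GState:
    orientation: str    # "G(+)" or "G(-)"
    fiber_gene: str     # gene supplying the carboxy-terminal end of the fiber
    adapter_gene: str   # tail fiber adapter gene expressed in this orientation
    hosts: tuple        # hosts reported as infectable in this orientation

G_PLUS = GState("G(+)", "S", "U", ("Escherichia coli",))
G_MINUS = GState(
    "G(-)", "S'", "U'",
    ("Escherichia coli", "Citrobacter freundii",
     "Enterobacter cloacae", "Serratia marcescens"),
)

def invert(state: GState) -> GState:
    """Gin-mediated inversion flips the G-segment to the other orientation."""
    return G_MINUS if state is G_PLUS else G_PLUS

def can_infect(state: GState, host: str) -> bool:
    """True if the text lists `host` for particles in this orientation."""
    return host in state.hosts
```

Calling `invert` twice returns the original state, reflecting that the inversion is reversible; which particles are produced depends only on the orientation of the segment at the time the fiber genes are expressed.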

Conclusion and Perspectives

The increasing occurrence of pathogenic antibiotic-resistant bacteria calls for alternative approaches to combat pathogenic bacteria. The use of bacteriophages and their proteins is one of a number of promising strategies. Phages or their endolysin proteins may be used to kill pathogenic bacteria directly, while phage tailspikes and fibers can be used to detect them specifically. Enzymatic tailspikes may also be used to weaken their cell walls. Atomic structures of phage tailspikes and fibers have yielded new protein folds and insights into their stability, their interaction with bacterial receptors and O-antigen degradation mechanisms. These insights have been obtained mainly by X-ray crystallography. The rapid development of cryo-electron microscopy is allowing the structure determination of entire phage tail complexes, although for detailed receptor interactions X-ray crystallography and NMR spectroscopy still have a major role to play. In general, the genes of receptor-binding proteins are those that show most sequence variability between related bacteriophages. This is presumably the result of adaptation to mutations in the receptor of the phage host, of adaptation to a new receptor, or of adaptation to a new bacterial host strain or species. The fact that receptor-binding proteins protrude from the phage capsid allows for these mutations not to affect phage structure, even if whole domains are replaced. Phage-binding domains of receptor-binding proteins of closely related phages, which do interact with the rest of the virus, usually show more sequence similarity than the receptor-binding domains. In some cases, different phages have very similar receptor-binding proteins. Examples are Salmonella phages P22, 9NA and Det7 and the carboxy-terminal domains of the lambda and T4 fibers mentioned before.
Where these phages infect the same or a very similar host, horizontal gene transfer between co-infecting phages or an infecting phage and a prophage in the genome is a likely mechanism. For many fibers and tailspikes no sequence similarity is detectable between domains of their receptor-binding proteins, but the folds are nevertheless the same, even for phages that infect very different hosts. An example is the tower domain in the fiber protein gp34 of coliphage T4 and in the receptor-binding proteins of phages infecting Staphylococcus. These similarities may be the result of convergent evolution of different phages finding the same structural solution independently, or divergent evolution of a phage from which one of the branches adapted to a very different host. Given the large number of phages present in nature and the rapid evolutionary race with their host bacteria, it is likely that more new folds will be discovered for their receptor-binding proteins. Knowledge of these folds will allow the structure and, in favorable cases, the receptor-binding properties of newly discovered genes to be predicted. Experimentally determined and reliably predicted structures will also allow the tailoring of phage tailspikes and fibers to new hosts (Fig. 7). This may be done by site-directed mutation, as has been done for the tail fiber of coliphage T3. Replacing entire domains, i.e., generating chimeric receptor-binding proteins, is another exciting possibility. It is possible that in the relatively near future, a battery of phage receptor-binding proteins, plus a plethora of natural and synthetic bacteriophages, will be available for detection and elimination of most bacterial pathogens.

Fig. 7 Schematic drawing of site-directed mutated and chimeric phages. In the top row (a), three phages are shown with site-directed mutations in their tailspikes. These mutations modify the shape of the binding site, enabling it to fit different bacterial receptors. In the bottom row (b), three phages with chimeric tailspikes are shown. The carboxy-terminal domains are from other phages with different host ranges.

Acknowledgments

The authors thank the Spanish Ministry of Science, Innovation and Universities for financial support (grant number BFU2017-87022-P; MCIU/AEI/FEDER, EU and Severo Ochoa program SEV 2017-0712).


Phage Genome and Protein Ejection In Vivo
Ian J Molineux and L Letti Lopez, The University of Texas at Austin, Austin, TX, United States
Aaron P Roznowski, The University of Texas at Austin, Austin, TX, United States and University of Arizona, Tucson, AZ, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Continuum mechanics theory A theory that considers packaged phage DNA as a uniformly charged rod with a specified persistence length (rigidity) organized in a toroidal spool on a hexagonal lattice.

Contractile tail Myophage tails have an internal tail tube encased in an outer sheath. Only the sheath contracts.

Cryo-electron microscopy/tomography (cryoEM/cryoET) Sometimes referred to as electron cryomicroscopy. In both techniques, aqueous samples on a carbon grid are instantaneously frozen in liquid ethane at −196°C. CryoEM, usually specifically referred to as single particle cryoEM, takes a single picture of thousands of particles, which are in different orientations in the liquid film. These images are then computationally combined to generate a 3-D image. CryoET is used for larger samples but has much lower resolution. A series of images is obtained from a single sample at different tilt angles to the electron beam. Computation can then generate a 3-D image of the sample. Higher resolution structures can subsequently be obtained by combining data from different samples.

DNA-condensing agents Among other compounds, low concentrations of the polyamines spermine and spermidine, and the trivalent cobalt hexammine chloride, condense DNA in solution to densities that approach that in phage heads. PEG and high [NaCl] also induce condensation to form a similar DNA structure.

Extensible tail Some podophages eject internal proteins from their capsid that functionally extend tail length.

Proton motive force The central aspect of Mitchell’s chemiosmotic theory. The pmf has two components: a membrane potential (electrochemical gradient) and a pH gradient across the two sides of the cytoplasmic membrane. The protonophore FCCP (carbonyl cyanide-p-trifluoromethoxyphenylhydrazone) is an aromatic, negatively charged compound that can delocalize the charge of bound protons, enabling proton transfer across lipid bilayers. FCCP collapses both the electrochemical and pH gradients of the cytoplasmic membrane.

Protein Ejection and Trans-Envelope Channel Formation

The first observations relating to how phage genomes enter the infected cell were those of Anderson, who obtained electron micrographs showing that phages adsorbed to cells through their tails (although øX174 was known at the time, it had not been recognized as being tailless). The following year the classic Hershey-Chase experiment showed that the majority of 32P-labeled T2 DNA but only a minority of 35S-labeled protein became cell-associated, i.e., Waring blender-resistant, after infection. The Hershey-Chase experiment is often described as proving that DNA is the genetic material; however, that had actually been conclusively demonstrated eight years earlier by Avery et al. Their demonstration that highly purified DNA constituted the transforming principle was incontrovertible. However, Avery only made the claim privately that DNA was likely to be the genetic material, possibly because the scientific community at the time was not ready to accept his conclusion. The prevailing feelings are perhaps best reflected in a 1946 Nobel Prize being awarded for the structure of Tobacco Mosaic Virus’s protein shell. TMV’s RNA genome had not been detected, thus protein “must be” the genetic material. Surprisingly, even though their conclusion was in agreement with Avery’s data, the Hershey and Chase paper failed to cite Avery et al. Although simple and elegant, the Hershey-Chase experiment has been grossly over-interpreted. That single experiment showed that blending released 75%–82% of T2 35S-labeled proteins from infected cells but only 15%–35% of T2 32P-DNA. The paper also reported that less than 1% of the blender-resistant 35S, but a substantial proportion of the 32P, was found in progeny phage. Several additional experiments were also presented in the paper and all the data, not just the usually cited single experiment, are essential for the solid conclusions Hershey and Chase made.
However, their conclusion that “protein [of the infecting virion] has no function in the growth of intracellular phage” requires further qualification. That conclusion is largely correct if one only considers the phage’s major structural proteins, which protect the genetic material from environmental nucleases. In addition, long phage tails usually protect the nucleic acid during its translocation into the host’s cytoplasm. However, this often involves ejection of virion proteins into the host cell that can protect the infecting virus’s genetic material from well-known periplasmic and sometimes cytoplasmic nucleases. These proteins are all likely to be “blender-resistant” in a Hershey-Chase experiment. It remains a conundrum that, at least when using laboratory hosts and laboratory conditions, the DNA genome of the Bacillus subtilis phage SPP1 is transiently exposed to the environment and is nuclease-sensitive during infection. As a final comment on phage proteins entering the infected cell, some phages eject their own RNA polymerase, unequivocally essential for phage development, into the host. Thus, some proteins contained within the infecting virion do indeed have a function during intracellular growth. Examples of this phenomenon will be described throughout this article. We will utilize the phagocentric term “ejection”; phage proteins and DNA are ejected from the virion during infection. The term is much preferable to its counterpart “injection” because it avoids the misleading implication that a phage acts as a hypodermic syringe; unlike the barrel of a syringe, the volume inside a phage capsid does not change significantly during infection.
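The blender figures quoted above can be restated as the fraction of each label that stays cell-associated. The short Python sketch below does only that arithmetic; the function name and the assumption that released and retained label sum to 100% are illustrative choices made here, not part of the Hershey and Chase paper.

```python
def retained_range(released_low, released_high):
    """Convert a range of label released by blending (%) into the range
    that stays cell-associated (%), assuming the two sum to 100."""
    return (100 - released_high, 100 - released_low)

# Percentages released by blending, as quoted in the text.
s35_released = (75, 82)   # 35S-labeled protein
p32_released = (15, 35)   # 32P-labeled DNA

s35_retained = retained_range(*s35_released)
p32_retained = retained_range(*p32_released)

print(f"35S protein remaining cell-associated: {s35_retained[0]}-{s35_retained[1]}%")
print(f"32P DNA remaining cell-associated: {p32_retained[0]}-{p32_retained[1]}%")
```

On these numbers a non-trivial share of the protein label stays with the cells, which is consistent with the point made above that some virion proteins are likely to be blender-resistant.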

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21560-4

Proteins are Ejected into the Host Cell at the Initiation of Infection

Hershey himself first showed that “the germinal substance of bacteriophage T2” was ejected into the infected cell; this was most likely the IP (internal proteins) of T-even phages. T4 IP proteins are non-essential for growth in common laboratory hosts but one has been shown to inhibit a restriction enzyme in a hospital-derived strain. Avoiding host restriction by ejecting internal virion proteins is a strategy used by other phages, including the workhorse Escherichia coli transducing phage P1. Another T4 IP is gpAlt, which, among other target proteins, ADP-ribosylates one α-subunit of bacterial RNA polymerase in a process that leads to preferential transcription from T4 promoters after infection. It is not known if these proteins leave the phage capsid before, concomitant with, or after genome ejection, but if the latter there cannot be a significant delay as they need to function rapidly to promote phage infection. Recently, some giant phages (defined as possessing a genome >~230 kb, compared to the ~170 kb T4 genome) have been shown to harbor all the subunits of a bacteria-like multi-subunit RNA polymerase inside their capsid. Growth of the Pseudomonas aeruginosa phage øKZ has been demonstrated to be independent of the host transcriptional apparatus. The virion-encapsidated RNA polymerase subunits must therefore be ejected into the cell in order to transcribe the phage genome. Whether enzyme ejection precedes, occurs simultaneously with, or follows genome ejection is not known. It should be noted that proteins ejected into the infected cell cytoplasm in an unfolded (perhaps partially) state have the possibility of being refolded by host chaperones. However, unfolded proteins that have to refold in the cell envelope probably do not have this option.
Only limited chaperone activity has been observed in the periplasm, and most ejected proteins likely spontaneously fold into their active conformation. All long-tailed phages must eject at least one protein into the cell before genome translocation into the cytoplasm can occur. Tail length is controlled and made constant by a protein tape measure that occupies the lumen and extends through most of the tail in mature particles. The lumen of the tail is also used as the conduit for genome ejection, and thus the tape measure protein must be removed in order to allow DNA egress. Although there is not yet any direct experimental evidence, there are suggestions that tape measure proteins may facilitate formation of the channel across the cytoplasmic membrane and allow the phage genome to access the cell cytoplasm. Some tape measure proteins have also been shown to harbor a cell wall-degrading activity; these proteins therefore do have at least one defined function inside the infected cell.

Tail-Less Phages

Microviruses, e.g., øX174 and St-1, which lack any external tail on their isometric capsids, have perhaps the most obvious requirement for proteins to be ejected from the virion before any DNA. The capsid of these phages, like those of the more common dsDNA tailed phages, remains on the infected cell surface after genome ejection. øX174 ejects its gpH protein into the cell in order to allow genome transfer into the cell cytoplasm (Fig. 1). gpH was originally called a pilot protein because it appeared to escort the single-stranded (ssDNA) circular genome into the cell cytoplasm. Several molecules of gpH have now been shown to form a tube, functionally equivalent to a tail, through which the ssDNA is transported. Phage protein synthesis is not required but the activity of host proteins, which convert the ssDNA to supercoiled dsDNA (replicative form, RF), may be necessary for complete genome internalization. The filamentous (ff) ssDNA phages (f1, fd, M13; Inoviridae) were once also described as possessing a pilot protein gp3. However, gp3 mainly functions as an adsorption protein that is brought to the surface of the infected cell as the F pilus retracts. Gp3 then interacts with TolA, an inner membrane-anchored component of several interacting proteins spanning the cell envelope of E. coli. The TolQRA protein complex mediates disassembly of the f1 coat protein in the inner membrane (where it is subsequently reused for progeny phage particles) and the ff genome may therefore simply enter the cytoplasm by default. The single-strand replication origin is then utilized by pre-existing host enzymes to create RF DNA before mRNA synthesis is initiated.

Fig. 1 øX174 infection pathway. Pathway of øX174 genome ejection. After øX174 adsorbs to the bacterial LPS, one of the gpG spikes (green) dissociates. Dissociation causes conformational changes in the newly exposed capsid proteins that result in the opening of a pore at the attached vertex. The pilot proteins gpH (red) emerge and form a tube across the cell envelope through which the ssDNA genome can pass into the cytoplasm. Adapted from Sun, Y., Roznowski, A.P., Tokuda, J.M., et al., 2017. Structural changes of tailless bacteriophage ΦX174 during penetration of bacterial cell walls. Proceedings of the National Academy of Sciences of the United States of America 114, 13708–13713.

Fig. 2 PRD1 infection pathway. After its initial collision with a cellular receptor, PRD1 undergoes random movement over the cell surface, eventually orienting its unique vertex almost perpendicular to the cell surface. De-capping of this vertex triggers reshaping of the internal viral membrane by an osmotic flow from the environment into the cell. Reshaping includes formation of a membrane tube that traverses the cell envelope. Once in the cytoplasm, the tube tip is opened to allow DNA delivery. Reproduced from Peralta, B., Gil-Carton, D., Castaño-Díez, D., et al., 2013. Mechanism of membranous tunnelling nanotube formation in viral genome delivery. PLOS Biology 11 (9), e1001667.

Little is known about genome internalization of the small single-stranded RNA (ssRNA) phages (Leviviridae and Alloleviviridae), the best known of which are MS2 and Qβ, which infect F plasmid-containing bacteria. Similar phages are known to adsorb to other conjugative plasmid pili. The single copy of the maturation protein per particle is essential for infectivity; it is cleaved into two fragments and leaves the virion together with the RNA genome. Surprisingly, at this stage of infection, at least in the laboratory, the genome is sensitive to extrinsic RNases. The final location of the maturation protein fragments after infection is not known; how the genome avoids degradation by the periplasmic enzyme RNase I, and how the RNA crosses the cytoplasmic membrane are also poorly understood processes. However, recent studies of both MS2 and Qβ by cryo-electron microscopy (cryoEM) are beginning to reveal more details. Internalization of double-stranded RNA (dsRNA) phage genomes (Cystoviridae) is more akin to that of animal viruses. These bacteriophages have segmented genomes packaged as a nucleoprotein complex that is inside a protein shell. The latter is surrounded by an external membrane consisting of phospholipids and proteins. The membrane fuses with the outer membrane of the host bacterium, and the nucleocapsid enters the periplasm. An endopeptidase that is exposed on the nucleocapsid surface likely hydrolyzes the cell wall to allow passage across the periplasm. An invagination of the cytoplasmic membrane and intracellular vesicle formation is thought to allow the nucleocapsid to enter the cytoplasm where the protein shell is removed.
The remaining nucleoprotein core is then active for RNA-dependent transcription and replication. There is therefore no genome or protein ejection per se by this class of phages. The marine Pseudoalteromonas spp. phage PM2 (Corticoviridae) has an outer protein capsid underlain by a lipid membrane. This phage is novel not only in its mode of genome entry but also in that no virion protein is known to enter the infected cell. PM2 adsorbs to an unknown outer membrane component(s) that in some way senses the cytoplasmic ATP concentration - and not the proton motive force. If the [ATP] is high, the capsid protein dissociates, exposing the lipid core that then fuses with the outer membrane. If the [ATP] is low, the infection is aborted. Fusion of the lipid core releases the genome into the periplasm where it enters the cytoplasm through a membrane pore likely composed of cellular proteins. Interestingly, Ca²⁺ ions are essential, not only to stabilize the outer membrane of the infected cell but also to facilitate localized cell wall hydrolysis by a host-encoded muralytic enzyme. Although the structure of PM2 is similar to the Tectivirus PRD1, the mode of PRD1 infection is strikingly different (Fig. 2). Receptor recognition by PRD1 triggers structural rearrangements and decapping of a capsid vertex. This allows the internal membrane to form a tube-like structure that penetrates the infected cell envelope, forming a channel for phage genome translocation. A virion protein that is embedded in the tube facilitates localized degradation of the cell wall. The energy for tube extension is provided by the flow of liquid along the osmotic gradient from the environment (growth medium, low osmolarity), through the filled capsid, into the higher osmolarity composition of the periplasm/cytoplasm of the cell. The empty PRD1 capsid remains attached to the cell surface. Genome internalization of the related phage Bam35, which infects Gram-positive bacteria, has also been studied.

Short-Tailed Phages (Podoviridae)

Short-tailed phages infecting Gram-negative hosts also obviously require proteins to be ejected from the virion before DNA. The short, stubby tails of podophages are too short to span the cell envelope from their adsorption site on the cell surface. The coliphage T7 was the first podophage observed to eject proteins (E proteins) into the cell. In mature virions these proteins constitute a distinct morphological structure inside the capsid abutting the portal and they are ejected into the cell before the genome. Inside the cell they form a dramatically different structure that extends from the tip of the phage tail through the outer membrane, and across both the periplasm and the cytoplasmic membrane (Fig. 3(A)). This elongated structure comprises the channel for DNA transport into the cell. T7 has therefore been described as having an extensible tail, in contrast to the contractile tail sheath of myophages. Other podophages, e.g., salmonellaphages SP6 and P22 (Fig. 3(B)), have also been shown to have extensible tails. This feature is likely common to this phage morphotype, at least of phages infecting Gram-negative hosts.

Clearly, these ejected proteins, as does the tape measure protein of long-tailed phages, have a defined and essential function in facilitating genome entry into the cytoplasm and thus the subsequent growth of intracellular phage. A major question remains in understanding how protein ejection is accomplished. Three proteins constitute the T7 extended tail and the process must be tightly coordinated as 12 copies of gp14 localize exclusively to the outer membrane of the infected cell, whereas the eight gp15 and four gp16 molecules function together in spanning the periplasm and the cytoplasmic membrane. T7 gp16 contains the only predicted helical transmembrane domain of the ejected proteins (E proteins), and disrupting that predicted helix abolishes infectivity. It has also been shown to bind membranes in vitro. Gp16 is also responsible for the localized enzymatic degradation of the cell wall that facilitates phage genome internalization. However, the structure of packaged gp16 is far too large to exit through the phage portal and tail intact; the four copies of this 144 kDa protein must at least disaggregate and partially unfold during ejection. Gp16 must therefore also spontaneously refold, without the help of energy-dependent chaperonins, to form an active enzyme in the periplasm. A similar argument can be made for the eight copies of the 84 kDa gp15. Gp15 has been shown to bind DNA, and may thus escort the leading end of the phage genome into the infected cell. T7 is not unique in having three E proteins; those of P22 are also present in multiple copies in the virion. Their ejection is clearly controlled as distinct intermediate extended tail structures have been observed during infection when one E protein is defective. Presumably all copies of each E protein are ejected together and separately from other E proteins. How this is accomplished is unknown. P22 also discards its tail needle after it has penetrated the outer membrane.
The needle, as with the central tail fiber of the Siphophages described below, blocks the exit channel for phage genome translocation. In Gram-negative bacteria, the cell wall is composed of a single layer of peptidoglycan, an open network that allows passage of globular proteins up to 45–50 kDa. As mentioned above for some specific phages, many virions contain a cell wall-degrading activity. Note that this cell wall-degrading activity is distinct from the endolysin that elicits cell lysis at the end of the phage growth cycle. Lysozyme, lytic transglycosylase and endopeptidase (cleaving the cross-links in peptidoglycan) activities have been found as part of structural virion proteins. These must therefore penetrate into the periplasm of the infected cell. In T4 and T7, enzyme activity is not essential for infection when the host bacteria are growing exponentially at or above 30°C. However, as cells approach stationary phase or are growing at low temperature, plaque formation requires enzyme activity. In its absence the phage latent period during growth in liquid is greatly extended. T-even phages release their tail-associated lysozyme into the periplasm where it can diffuse and cause extensive degradation. High multiplicities of infection by T-even phages lead to lysis-from-without, a phenomenon where the newly infected cell immediately lyses without producing progeny. Lysis-from-without is primarily a property of T-even phages; most phages infecting Gram-negative cells that contain virion-associated muralytic enzymes constrain that activity to the infecting particle and thus the cell wall is only degraded at the site of infection. However, cell-wall degrading activity has not been detected in all phage virions: λ and its relative HK022, P22, and øX174 appear not to degrade peptidoglycan during the initiation of infection.
It is unknown whether the genome ejection apparatus of these phages can freely pass through the peptidoglycan layer or whether the infecting phage merely waits for the normal turnover of peptidoglycan during cell growth to generate a sufficiently large gap. The thick cell walls of Gram-positive cells require that virion enzymes degrade either or both peptidoglycan and teichoic acid to allow phage access to the cytoplasmic membrane. Such enzymes are often present on the baseplate or tailspikes of phages. The podophage ø29, for example, hydrolyzes both polymers as it drills its way through the B. subtilis cell envelope so that its tail knob protein gp9 can fuse with the cytoplasmic membrane (Fig. 3(C)).
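The copy numbers and subunit masses quoted in this section for the T7 ejected proteins give a feel for how much protein has to leave the capsid. The Python sketch below simply multiplies them out and compares each subunit with the upper size quoted for globular proteins passing through peptidoglycan. The dictionary layout and names are illustrative assumptions, gp14 is left out of the mass sum because its mass is not given in the text, and the comparison is only indicative since the 45–50 kDa figure applies to globular proteins.

```python
# Copy number and subunit mass (kDa) for T7 E proteins, as quoted in the text.
# gp14 (12 copies) is omitted: its subunit mass is not stated in this article.
e_proteins = {
    "gp15": {"copies": 8, "mass_kda": 84},
    "gp16": {"copies": 4, "mass_kda": 144},
}

# Total mass contributed by all copies of each protein.
totals = {name: p["copies"] * p["mass_kda"] for name, p in e_proteins.items()}
combined = sum(totals.values())

# Upper size of globular proteins said to pass through peptidoglycan (kDa).
PEPTIDOGLYCAN_LIMIT_KDA = 50

for name, p in e_proteins.items():
    ratio = p["mass_kda"] / PEPTIDOGLYCAN_LIMIT_KDA
    print(f"{name}: {totals[name]} kDa in total; "
          f"one subunit is {ratio:.1f}x the ~50 kDa limit")
print(f"gp15 + gp16 combined: {combined} kDa")
```

Both subunits are well above the quoted limit, which fits the description above of virion-associated muralytic activity and of gp16 having to disaggregate and partially unfold during ejection.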

Non-Contractile Long-Tailed Phages (Siphoviridae)

In several long-tailed phages, including λ, T5, and the contractile tailed P2, it has been shown that mature virions have the leading end of the genome extending into the lumen of the tail. Linear phage genomes always enter the cell with a fixed polarity. Most phages that have been studied exhibit a first-in (during packaging), last-out (during ejection) strategy. An exception is T4, which uses a first-in, first-out strategy. In either scenario, the end that exits first must be positionally constrained during packaging. Cross-linking studies with long-tailed phages have shown that DNA penetrates the portal channel and can extend into the lumen of the tail. As these phages assemble their tails independently of the capsid and of DNA packaging, the plug that closes the portal channel prior to tail addition must be remodeled or removed in the final assembly steps. Recent cryoEM reconstructions have extended the observation of a genome end in the portal channel to short-tailed phages. These are significant observations suggesting that triggering of genome ejection may only require conformational changes at the tail tip and, for long-tailed phages, ejection of the tape measure protein. The latter may serve to transduce a signal to head-proximal components to initiate genome ejection. It should also be noted that positioning one end of the genome inside the tail lumen during virion assembly provides a direct and efficient path for DNA ejection; it could be difficult for a free genome end inside the capsid to find the exit channel. Some early studies of phage genome ejection employed λ. A repeat of the “Hershey-Chase experiment” allowed the same conclusions originally made with T2: the majority of λ proteins could be removed by blending while the majority of DNA remained cell-associated. It was also shown that λ can eject its genome in vitro by the simple addition of Mg²⁺ and the purified receptor protein LamB.
Shigella sonnei LamB is effective by itself, although E. coli LamB also requires the addition of CHCl3 to cause genome ejection. Pioneering in vitro studies using electron microscopy of LamB-containing liposomes or crude membranes prepared from λ-infected cells revealed the presence of two distinct complexes: one showed the central gpJ fiber interacting with the outer membrane (or LamB in the proteoliposome), whereas the second lacked gpJ and the phage tail contacted the membrane surface. In this type II complex, gpJ is sensitive to proteolysis, suggesting that it does not enter a proteoliposome whereas the tape measure gpH*, which is protease-sensitive in free phage, becomes resistant and thus is likely buried in the membrane. This same complex was subsequently shown to form a transmembrane channel, providing the first suggestion that a phage tape measure protein could be ejected from an infecting virion and form a conduit for genome translocation. Unfortunately, perhaps due to methodological limitations of the time, these experiments were not pursued further. The structures of the counterpart proteins to gpJ from the Gram-positive phages SPP1 and p2 are now known; the structures support the idea that the central tail fiber bends out of the way during infection to allow genome release, a feature that may thus be common to many siphophages.

Fig. 3 Podophage infection pathways. 3A. T7: After initial adsorption to outer core LPS, T7 is thought to “walk” over the cell surface using a subset of fibers. At a preferred site, “walking” stops and all tail fibers bind, allowing the tail protein gp12 to interact with its receptor in the lipid A-KDO region of LPS. This interaction initiates ejection of the small gp6.7 and gp7.3 proteins into the outer membrane, where they are normally degraded. The internal core proteins gp14, gp15 and gp16 then disassemble and eject into the cell to form a trans-envelope channel from the tip of the tail extending into the cytoplasm. The T7 genome is then translocated into the cell using molecular motors. For more details, see Hu et al. 3B. P22: Tailspikes bind to O antigen and by hydrolysis bring the phage to the cell surface. Needle length causes the virion to initially bind at an oblique angle. The virion then reorients, forcing the needle through the outer membrane where it is released. In turn, needle release triggers ejection of the P22 E proteins that form a complete channel for DNA transport extending from the phage baseplate into the cell cytoplasm. For more details, see Wang et al. 3C. ø29: ø29 uses its gp12* appendages (called tailspikes in other phages) to adsorb to and then hydrolyze B. subtilis wall teichoic acid to bring the virion close to the cell surface. As with P22, the ø29 tail causes the virion to bind obliquely but the gp13 tail tip, which harbors a muralytic activity, can now interact with the cell wall. Continued enzyme activity, both of gp12* and gp13, causes reorientation of the virion and fusion of the gp9 knob protein with the cytoplasmic membrane to form a channel for DNA ejection. For more details, see Farley et al. and Xu et al. Reproduced from (A) Hu, B., Margolin, W., Molineux, I.J., Liu, J., 2013. The bacteriophage T7 virion undergoes extensive structural remodeling during infection. Science 339, 576–579. (B) Wang, C., Tu, J., Liu, J., Molineux, I.J., 2019. Structural dynamics of bacteriophage P22 infection initiation revealed by cryo-electron tomography. Nature Microbiology 4, 1049–1056. (C) Farley, M.M., Tu, J., Kearns, D.B., Molineux, I.J., Liu, J., 2017. Ultrastructural analysis of bacteriophage Phi29 during infection of Bacillus subtilis. Journal of Structural Biology 197, 163–171. Xu, J., Gui, M., Wang, D., Xiang, Y., 2016. The bacteriophage ø29 tail possesses a pore-forming loop for cell membrane penetration. Nature 534, 544–547.

Contractile Long-Tailed Phages

The prototype contractile tailed phage is obviously T4, and we now have a good understanding of the initial steps of infection, particularly for the process of tail tube penetration of the cell envelope (Fig. 4). Other well-studied myophages likely follow a similar pathway, although their baseplate structures are simpler than that of T4 and not all may possess cell wall-degrading activity. Atomic structures of many T4 proteins have been obtained by X-ray crystallography, and detailed structures of particles both before and after genome ejection in vitro have been solved by cryoEM. Furthermore, cryoET has elucidated trans-envelope channel formation for genome ejection in vivo. Reversible adsorption of at least three long tail fibers to the cell surface destabilizes the baseplate. This results in the short fibers rotating downwards where they contact the cell and irreversibly bind. In addition, baseplate destabilization allows the tail sheath to spontaneously contract, which drives the inner tail tube through the outer membrane. The tail tube tip then dissociates, releasing the gp5* lysozyme complex into the periplasm. Degradation of the cell wall facilitates further penetration of the tail tube into the cell. Following tail sheath contraction, the end of the T4 tail tube (now lacking its tip complex) is in the periplasm; despite innumerable cartoons to the contrary it was clearly shown 50 years ago that the tube only extends to the outer leaflet of the cytoplasmic membrane. CryoET now reveals that the membrane blebs outwards in fusing with the tail tube. For DNA to be transported into the cell cytoplasm, the tail tape measure protein must be ejected and it is currently thought that this protein forms the transmembrane channel. Blebbing may then be the simple consequence of the membrane moving towards the fixed end of the tail tube during the fusion process.
Only in myophages has the source of energy been clearly established for changes in protein conformation that occur during infection initiation. The outer sheath of the contractile tailed phages is assembled in a metastable form. When infection is triggered, the sheath spontaneously contracts to a lower energy state structure, one that may be identical to that obtained with purified protein. Although sheath contraction is an essential step during infection, it does not necessarily lead to genome ejection. Sheath contraction can be induced in vitro by treating virions with urea but the resulting non-infective virions retain an encapsidated genome. Comparable detailed studies have not been performed with podophages but the structure of purified and complexed T7 gp15 and gp16 bears more resemblance to the extended tail than their structures in the mature virion. Perhaps the internal E proteins of T7 are also assembled in a metastable form.

Factors Affecting Phage Genome Translocation

In 1956, the first quantitative model of ejection hypothesized that simple Brownian motion was responsible for genome ejection, but it was later calculated that the time that would be spent for complete genome ejection by diffusion greatly exceeded the phage latent period. A theoretical proposal, one that was supported using a 10⁵-scale glass model of the T4 virion with nylon thread representing the DNA (!), invoked osmotic pressure inside the phage head as driving the ejection process. However, the most widespread description derives from the first molecular genetics textbook authored by Gunther Stent in 1963: “One may imagine, therefore, that the DNA is packed into the phage head under constraint and forces its own way out through the sheath [Myoviridae] after the contraction and “uncorking” reactions of the tail are triggered”. Although usually depicted in textbooks as an instantaneous event, complete internalization of most phage genomes may occupy a substantial part of the latent period. Some specific values are given below, but cryoET studies of infection initiation by several different phages have all visualized partially emptied capsids, where DNA is presumably in the process of entering the cell cytoplasm. Such structures would not be detected if genome transport is extremely fast but there is a caveat to those experiments: all but one study used minicells rather than intact bacteria. Minicells, which are smaller than whole cells and thus allow acquisition of higher resolution structures, have been shown to support growth of some phages. However, minicells lack a chromosome and cannot grow, and additional experiments are therefore underway in which exponentially growing cells are infected. It should be remembered that many of the other assays that have been employed to estimate genome translocation rates are indirect and require interpretations that can also be questioned.
For example, genome ejection, once initiated, may continue even at low temperature during disruption of the phagebacterium complex, and although a genome may be shown to have exited the virion, it may not have entered the cell cytoplasm. A second source of variability lies in the nature of the assay – are populations being measured, where the fastest fraction determines the earliest time recorded, or are single phage-cell complexes being observed?

Fig. 4 T4 infection pathway. Sequential binding of a subset of long tail fibers (LTF) allows T4 to “walk” over the cell surface. When three or more LTF are simultaneously bound, the strain in each LTF–gp9 baseplate junction triggers conformational changes that release some short tail fibers (STF) from the baseplate. The baseplate, in this transient intermediate, rapidly transitions from its hexagonal to a star configuration, which triggers contraction of the tail sheath and the release and binding of the remaining STF and LTF to the cell. Sheath contraction also pushes the tail tube and needle complex through the outer membrane; this complex then dissociates, allowing the gp5* lysozyme to freely digest the cell wall. The gp27 tail tube, now the distal end of the ejection machinery, contacts the cytoplasmic membrane, which bulges out from its normal plane. Bulging may be caused by ejection of the gp29 tape measure that is thought to make the transmembrane channel for DNA translocation into the cytoplasm. For more details, see Hu et al. and Taylor et al. Reproduced from Hu, B., Margolin, W., Molineux, I.J., Liu, J., 2015. Structural remodeling of bacteriophage T4 and host membranes during infection initiation. Proceedings of the National Academy of Sciences of the United States of America 112, E4919–E4928. Taylor, N.M., Prokhorov, N.S., Guerrero-Ferreira, R.C., et al., 2016. Structure of the T4 baseplate and its function in triggering sheath contraction. Nature 533, 346–352.

In addition to the possible presence of DNA phosphate counterions, a function for ion movement has often been proposed during genome ejection. Determining this role has been difficult, in part because ionic strength also affects particle adsorption. Many phages do not adsorb, or do not progress beyond the reversible adsorption step, in low ionic strength environments. A very high ionic strength environment is usually also inimical to infection, as the initial adsorption steps are governed by electrostatic interactions. In addition, many phages have a specific requirement for Mg²⁺ or Ca²⁺, again both for initial adsorption and for the conformational rearrangements of virion proteins necessary to allow genome delivery into the cytoplasm. As examples, Ca²⁺ triggers opening of the lactococcal phage p2 baseplate and stabilizes the post-ejection conformation of the distal end of the SPP1 tail in vitro. Ca²⁺ and/or Mg²⁺ are also necessary to stabilize stocks of some, but not all, phages. λ provides an interesting example of the complexities of specific ion requirements for both stability and infection. Free virions and isolated heads are freely permeable to small ions, including the common bacterial polyamines putrescine and spermidine. Isolated tails are slowly (hours) inactivated by Mg²⁺ and are usually stored in buffers containing EDTA. Conversely, isolated heads, which have four nucleotides from the single-stranded right end of the genome protruding from the portal, are rapidly inactivated by both EDTA and Mg²⁺ due to premature genome release. They are stabilized by putrescine but not by spermidine or spermine, both of which actually markedly increase the rate of genome loss. This is surprising, as spermidine and spermine, unlike putrescine, are DNA-condensing agents and a priori would be expected to stabilize the DNA inside heads. Isolated tails and heads can be mixed in vitro to make infective virions that are stable to all polyamines, although Mg²⁺ is usually employed for long-term viability. As final complications, λ growth, either by infection or induction of a lysogen, is inhibited in cells grossly deficient in polyamines, but in low (mM) concentrations of Mg²⁺ all three common polyamines have been shown to inhibit infection. Cytoplasmic ions, in particular K⁺, have been shown to leak from cells during the adsorption and genome penetration steps of infection by several phages. It has been suggested that these ion fluxes are responsible for the transient membrane depolarization of the cell that is also associated with the initiation of infection by most phages. Leakage from T5-infected cells occurs only during the two distinct steps of DNA translocation (see below), suggesting that DNA transport and leakage are intimately coupled. However, T5-induced leakage also occurs in cells that have been depolarized, and membrane depolarization can be uncoupled from transport of the SPP1 genome. In addition, K⁺ leakage does not require membrane depolarization during PRD1 infection. Ion leakage and/or the transient reduction in membrane potential may simply be a consequence of a phage breaching the cell envelope, an idea first proposed in 1954. Alternatively, membrane depolarization was suggested to be due to protons associating with T1 and T4 DNA as it translocated into the cell. However, in the case of T4, the membrane potential was later shown to be required only for opening or maintaining the channel across the cytoplasmic membrane. In contrast to the above, initiation of infection by T3 or T7 causes neither a reduction in membrane potential nor ion leakage, observations that are currently confined to this phage group.
This difference between T7-like and other phages may be the key to understanding their fundamentally different mechanisms of genome ejection.

Transcription-Mediated Genome Internalization In Vivo

The detailed mechanism of DNA translocation from the phage head into the cell cytoplasm has only been established for two phage types: the majority of the T7 and N4 genomes is internalized by transcription. RNA polymerases (RNAP) pull the genome from the phage head into the cell. NTP hydrolysis thus provides the energy both for RNA synthesis and for translocation of the genome out of the phage head into the cytoplasm. Close relatives of these prototype phages likely utilize a similar mechanism of genome internalization, but it has so far only been demonstrated experimentally for the coliphage T3 and the cyanobacteriophage Pf-WMP4. About 1 kb of the T7 genome is ejected into the cell at 140 bp/sec (37°C) by a process that is consistent with it being enzyme-catalyzed. This reaction then stops, and entry of the remainder of the genome requires transcription. It is not known what limits the initial genome internalization process; there is no specific stopping nucleotide sequence, and the ~1 kb appears to be a length measurement. Perhaps the leading end of the genome is directly escorted into the cell together with the E proteins gp14, gp15 and gp16. Regardless, promoters for E. coli (and T7) RNAP lie on the initial DNA segment that enters the cell, and transcription by the E. coli enzyme normally pulls ~7 kb of the genome into the cell at 90 bp/sec (Fig. 5). Termination of E. coli RNAP-catalyzed transcription at the early terminator TE is not 100% efficient, and in the absence of phage protein synthesis the E. coli enzyme internalizes the entire T7 genome at a constant rate. However, T7 RNAP is encoded on the ~7 kb of DNA transcribed by E. coli RNAP; in normal infections, T7 RNAP is synthesized and the trailing 32 kb of the phage genome is internalized by transcription using the faster phage enzyme. Complete genome internalization takes 5–6 min at 37°C, about 1/3 of the latent period.
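The figures quoted above for T7 can be cross-checked with simple arithmetic. The sketch below adds up the time taken by the initial ~1 kb and by the ~7 kb transcribed by E. coli RNAP, then asks what average rate the phage RNAP would need for the trailing 32 kb to fit within the stated 5–6 min. It assumes that the three phases follow one another without pauses; that assumption, and the function names, are introduced here for illustration and are not part of the original account, so the resulting rate is an implied average rather than a measured value.

```python
# Back-of-the-envelope timeline for T7 genome entry at 37 °C.
# All lengths and the first two rates are the figures quoted in the text.
# Assumption (not from the text): the three phases run back to back with
# no pauses, so the T7 RNAP rate computed here is only an implied average.

GENOME_BP = 40_000    # T7 genome, ~40 kb
INITIAL_BP = 1_000    # ~1 kb ejected before transcription is required
INITIAL_RATE = 140.0  # bp/s for the initial segment
ECOLI_BP = 7_000      # ~7 kb pulled in by E. coli RNAP
ECOLI_RATE = 90.0     # bp/s for E. coli RNAP
T7_BP = GENOME_BP - INITIAL_BP - ECOLI_BP  # trailing segment, 32 kb


def phase_time_s(length_bp: float, rate_bp_per_s: float) -> float:
    """Time in seconds to move length_bp at a constant rate."""
    return length_bp / rate_bp_per_s


def implied_t7_rnap_rate(total_time_s: float) -> float:
    """Average bp/s the phage RNAP would need for the whole genome
    to be internalized within total_time_s."""
    early = (phase_time_s(INITIAL_BP, INITIAL_RATE)
             + phase_time_s(ECOLI_BP, ECOLI_RATE))
    return T7_BP / (total_time_s - early)


if __name__ == "__main__":
    for minutes in (5, 6):
        rate = implied_t7_rnap_rate(minutes * 60)
        print(f"{minutes} min total -> ~{rate:.0f} bp/s for the trailing 32 kb")
    # For comparison: E. coli RNAP alone covering the remaining 39 kb.
    alone_min = phase_time_s(GENOME_BP - INITIAL_BP, ECOLI_RATE) / 60
    print(f"E. coli RNAP alone: ~{alone_min:.1f} min after the initial 1 kb")
```

Under these assumptions the implied phage RNAP rate falls between roughly 115 and 150 bp/sec, above the 90 bp/sec of the host enzyme, which agrees with the description of the phage enzyme as the faster of the two.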
Separately measured times of T7 mRNA and late protein syntheses are consistent with the genome internalization rates determined by genome entry experiments. The transcriptional mode of genome internalization is part of the regulatory process of T7 gene expression. Genes expressed early in infection are the first to enter the cell, whereas late genes, largely encoding structural proteins, enter the cell later and thus can only be expressed after early gene expression. However, this regulation of gene expression is not essential for phage growth. If T7 RNAP is provided in the cell prior to infection, the leftmost T7 promoter (present on the initially ejected ~1 kb) is utilized by the exogenously provided T7 RNAP and late mRNAs are synthesized much earlier. Conversely, delaying expression of T7 gene 1 (RNAP) by placing it in the distal 40% of the genome delays genome entry and, as a result, synthesis of late RNAs and proteins is delayed. Mutant virions containing substitutions in two regions of gp16 allow the entire genome to enter the infected cell without the need for transcription. With all substitutions tested, DNA entry occurs at the same constant rate over the entire 40 kb genome without detectable pauses. The substitutions thus appear to inactivate the mechanism that normally stops DNA translocation after the initial ~1 kb has been internalized. Importantly, the rate of genome internalization by this mechanism varies with temperature and the data can be linearly fitted on an Arrhenius plot, consistent with the process being enzyme-catalyzed. Truncations that remove six or more C-terminal residues of gp16 prevent any DNA from entering the cell cytoplasm; this defect can be suppressed, and viability restored, by substitutions affecting gp15. Collapsing the pmf with the protonophore FCCP while the genome is being actively translocated into the cell immediately stops further internalization.
Purely by analogy with the F1Fo ATP synthase, it was suggested that the ejected gp15 and gp16 proteins formed a rotary pmf-dependent motor that ratchets the genome into the cell base-pair by base-pair. However, it was not determined whether the pmf is directly involved in DNA translocation or in maintenance of the channel across the membrane. RNA polymerase could complete genome internalization (using ATP as its energy source) in the absence of a pmf but only at a much lower rate than in its presence. In contrast to T7, coliphage N4 virions contain four copies of a phage-specific RNAP (vRNAP). The initial ~1 kb of the N4 genome is internalized by transcription but the host RNAP is not involved. The vRNAP must therefore be ejected into the cell cytoplasm before or concomitantly with the leading end of the genome. Extensive genome internalization requires the synthesis of N4 gp1, whose gene is encoded on the initial ~1 kb. About 40 kb of the genome is thought to be internalized by this vRNAP-gp1 complex; the remaining ~30 kb is internalized by transcription using N4 RNAP II and its transcription factor N4 gp2. For many years, N4 was an orphan with no known relatives. However, this phage type now has many examples that infect Pseudomonas spp.

Fig. 5 Transcription-mediated T7 genome internalization. T7 ejects ~1 kb of its 40 kb genome into the cell using a molecular motor (see text). On the leading end are three strong promoters for host RNA polymerase (RNAP) and a T7 RNAP promoter (not shown). A. Transcription by RNAP from a promoter proceeds normally. B. Transcription continues unabated once RNAP encounters the cell membrane. Rather than the enzyme moving along its template, duplex DNA must now be pulled through the enzyme. The fate of the superhelical turns that are created ahead of the transcription complex is not known, although topoisomerases do not appear to be involved.

The conservation of the two phage-specific RNAPs and their cofactors gp1 and gp2 makes it likely that the description of N4 genome internalization applies to the entire group. Transcription-mediated genome internalization is currently limited to these two phages and their close relatives, and it seems unlikely that many different phage groups will be shown to employ a similar strategy. It is unsuitable for the ~50% of tailed phages that package by a headful mechanism (pac-type) and therefore have terminally redundant sequences; appropriate promoters for RNAP would not necessarily be correctly located on the varying genome ends of individual virions. The same argument precludes any sequence-specific DNA-binding protein, be it phage- or host-encoded, from facilitating internalization of pac-type genomes after only a short segment has entered the cell. Furthermore, some phage genomes that have fixed genome ends (cos-type packaging), e.g., λ and T5, are known not to utilize transcription to catalyze genome entry. However, in addition to T7 RNAP and N4 RNAP II, other phage-encoded proteins are known to function in multi-step mechanisms of internalization of cos-type genomes.

Non-Transcriptional Genome Internalization In Vivo

The influence of Stent’s 1963 textbook, which informed the molecular biologists and geneticists of the time and then their descendants, coupled with the technological inability to test experimentally the idea that a phage genome forces its own way out of the capsid into the cell cytoplasm, led to that idea becoming widely accepted as “fact” for all phages. However, even in 1963 it was known that T5 genome internalization could not be driven by the release of pressure in the capsid arising from the densely packaged DNA. More than five decades later, mechanisms of genome translocation from the capsid into the bacterial cell remain unresolved for most phage groups. Phage T5 genome entry takes place in two distinct steps, called first-step transfer (FST) and second-step transfer (SST). In FST, the first 10 kb of the 121 kb genome enters the cell in ~2 min. The transient leakage of cytoplasmic ions and the accompanying decrease in membrane potential stop after FST completion, without a requirement for phage protein synthesis. FST is normally followed by a ~5 min pause during which the membrane potential is restored and the A1 and A2 genes, which are essential for complete genome internalization, are transcribed and translated. What causes the pause after FST is unknown. The naturally occurring nicks in the T5 genome, the first of which is close to the end of FST, are not required for pausing. However, the DNA sequence surrounding the first nick is replete with direct and inverted repeats that can be imagined to be involved in FST termination; host proteins that bind to the repeats could also have a role.

After synthesis of A1 and A2, SST begins and is associated with a second transient leakage of cytoplasmic ions and a second drop in membrane potential. A2 is a simple dsDNA-binding protein but A1 has many functions that include binding to A2, host membranes, and RNAP. The actual role(s) of A1 and A2 in facilitating SST has not been determined but genome internalization proceeds at a rate of ~500 bp/sec and does not require transcription. The assays used to measure T5 genome internalization would not have been adequate to detect pauses in SST in vivo like those demonstrated in vitro, but if any do occur they are not as pronounced as the termination of FST. The only mutants known to affect SST lie in the A1 and A2 genes. Internalization of the entire 121 kb genome thus occupies ~10 min at 37°C. Use of metabolic poisons both to deplete intracellular ATP and to abolish the pmf had no effect on the DNA transport process of either FST or SST. It was thus concluded that genome translocation occurred by passive, or perhaps facilitated, diffusion. Significantly, removal of the capsid after FST (and after synthesis of A1 and A2) still allowed the now naked ~110 kb of SST DNA to translocate into the cell cytoplasm and at about the same rate as in a normal infection. These data are obviously inconsistent with the idea that any physical forces associated with DNA densely packaged in phage heads cause genome ejection into the cell cytoplasm. The ø29 genome also enters a B. subtilis cell in two phases, but unlike T5, the second requires metabolic activity. The first 65% of the genome is ejected into the cytoplasm in what was described as the “push” phase, where internal virion pressure was considered the driving force. Two proteins synthesized early in infection, gp17 and gp16.7, which are coded on the leading part of the genome, are necessary to complete DNA transfer.
This second step was called the “pull” phase as energy from the membrane potential was shown to be required, although no detailed mechanism was proposed. An early study of phage genome entry in vivo, again using the Hershey-Chase approach, also employed B. subtilis. Using an infective center assay, complete internalization of the 140 kb SP82G (Spounavirinae, a relative of SPO1) genome took ~6 min at 33°C after an initial adsorption step and was independent of phage protein synthesis. Again using blending or sonication to disrupt phage-cell complexes, a complex marker rescue assay showed that the rate of genome internalization is constant: ~2 kb/sec at 33°C, ~1 kb/sec at 28°C and ~0.6 kb/sec at 25°C; these data were linearly fitted on an Arrhenius plot. No enzymatic mechanism that could catalyze genome internalization was proposed but the constant rate of DNA transfer at a given temperature and the major effect on rate of varying the temperature strongly suggest that something other than, or in addition to, physical processes is critical. λ genome ejection has been followed in vivo using single cell-phage complexes. λ virions containing fluorescently stained DNA were observed by microscopy in real time during infection. The gradual reduction in fluorescence of the phage was used to estimate the amount of DNA ejected. Individual phages showed significant variability in the time for complete ejection into the cell with a mean of ~5 min, one to two orders of magnitude longer than for in vitro ejection into buffer. Pausing events, where genome ejection temporarily halted, were observed in some but not all phage-bacterium complexes. Comparable experiments were also performed using the λ deletion mutant b221, whose genome lacks 22% of the DNA in wild-type phage.
Continuum mechanics theory, popular because it is amenable to analytic procedures, considers packaged phage DNA as a uniformly charged rod with a certain persistence length (rigidity) organized in a toroidal spool on a hexagonal lattice. This theory predicts that after 22% of the wild-type genome had entered the cell it should display the same dynamics as the b221 mutant genome. Although in vitro DNA ejection dynamics for both genomes are captured by this theory, the in vivo results disagree. The experimental data suggest that the ejection dynamics reflect the amount of DNA that has entered the cell, rather than the amount remaining inside the capsid. As currently formulated, continuum mechanics theory cannot explain phage DNA ejection in vivo. The experimental results are also inconsistent with two-step models, in which energy stored in the capsid controls ejection of the leading half of the genome with the distal half being internalized by another mechanism (e.g., ø29). It was suggested that unspecified bacterium-internal processes dominate ejection in vivo, but this is difficult to reconcile with the observation that the rate of ejection declines after ~50% of the genome is inside the cell, when the bacterial internal forces proposed to dominate the process would be expected to be increasing. This clever in vivo assay comes with few assumptions beyond any effects of the intercalating fluorescent dye, but more experimentation is clearly necessary before any mechanistic conclusions can be made. The observation that T-even capsids contain polyamines sufficient to neutralize 30%–50% of the DNA phosphates, plus the suggestion that electrostatic repulsion between packaged DNA strands could cause genome ejection, was generalized in Stent’s model that the phage genome was packaged in the capsid under constraint.
An alternative idea, perhaps inspired by Mitchell’s Nobel-winning chemiosmotic theory, was proposed by Grinius et al.: T4 DNA entered the cell by associating with protons (symport) traveling down their electrochemical gradient. It was estimated that the rate of DNA transfer was 10³–10⁴ bp/sec and thus complete internalization of the entire ~170 kb genome would take a few seconds to ~3 min. These data were obtained using variations of the Hershey-Chase experiment; however, the time taken to form the trans-envelope channel was not considered and the major assumption was made that no DNA transport occurred during the separation of phage-cell complexes. The precise rate of T4 genome translocation may therefore be inaccurate but it does appear to be faster than that of other phage genomes. Furthermore, although the cellular proton motive force, specifically the membrane potential component, was clearly demonstrated to be important, it was subsequently shown that the potential was required for trans-membrane channel opening/maintenance rather than DNA transport per se. The mechanism of T4 genome ejection therefore remains undetermined.
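The rates and times quoted in this section can be checked against one another, and the SP82G temperature series can be turned into an apparent activation energy. The sketch below does both using only the numbers given above. The helper names are hypothetical, the Arrhenius fit is an ordinary least-squares line through just three points, and the activation energy it returns is an illustration derived here rather than a value reported in the studies described.

```python
import math

R_GAS = 8.314  # J/(mol*K), molar gas constant


def transit_time_s(length_bp, rate_bp_per_s):
    """Seconds needed to move length_bp at a constant rate."""
    return length_bp / rate_bp_per_s


def t5_total_minutes():
    """FST (~2 min) + pause (~5 min) + SST of the remaining DNA at ~500 bp/s."""
    fst_s = 2 * 60
    pause_s = 5 * 60
    sst_s = transit_time_s(121_000 - 10_000, 500)
    return (fst_s + pause_s + sst_s) / 60


def t4_time_range_s():
    """Fastest and slowest entry times for ~170 kb at 10^4 and 10^3 bp/s."""
    return transit_time_s(170_000, 1e4), transit_time_s(170_000, 1e3)


def arrhenius_fit(temps_c, rates):
    """Least-squares fit of ln(rate) against 1/T (T in kelvin).

    Returns (slope in K, apparent activation energy in J/mol)."""
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, -slope * R_GAS


# SP82G internalization rates quoted in the text (kb/s at each temperature).
SP82G_TEMPS_C = (33, 28, 25)
SP82G_RATES_KB_S = (2.0, 1.0, 0.6)

if __name__ == "__main__":
    print(f"T5 total: ~{t5_total_minutes():.1f} min")
    fast, slow = t4_time_range_s()
    print(f"T4 entry: {fast:.0f} s to {slow:.0f} s")
    slope, ea = arrhenius_fit(SP82G_TEMPS_C, SP82G_RATES_KB_S)
    print(f"SP82G apparent activation energy: ~{ea / 1000:.0f} kJ/mol")
```

The T5 total of just under 11 min and the T4 range of 17 s to just under 3 min match the approximate values in the text. The steep temperature dependence of the SP82G rates, corresponding to an apparent activation energy on the order of 100 kJ/mol under this simple fit, is the kind of behavior that the text takes as evidence against a purely physical process.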

Models of Phage Genome Ejection In Vivo

The dsDNA genome of all phages examined is packaged at near crystalline density, ~500 mg/ml. Beginning with Stent’s 1963 textbook and the 1969 model building study of T4 genome ejection by Zárybnický, the idea that phage genomes are packaged under pressure that is subsequently used to drive genome ejection has become commonly accepted. However, almost all the experiments conducted in order to substantiate the idea have been performed in vitro. These experiments have focused on inhibiting λ, T5, and ø29 genome ejection by imposing an opposing osmotic pressure with concentrated solutions of polyethylene glycol, PEG 6000–8000. However, phage genome ejection into cells in vivo is fundamentally different from the in vitro experiments and cannot be explained by pressure internal to the phage capsid alone. Bacteria maintain an internal osmotic pressure that is necessarily higher than that of the surrounding environment so that they can enlarge during growth. This excess osmotic pressure is called turgor. Estimates of turgor in Gram-negative cells range from 1 to 5 atmospheres while that in Gram-positive cells is estimated to be ~20 atm. Just as suspending phage particles in buffered PEG solutions inhibits genome ejection, turgor achieves the same result in vivo. Crucially, turgor opposes the ejection of DNA from the phage head into a bacterial cell. For example, when ~50% of a λ genome has been internalized, the osmotic pressures of the cell cytoplasm and phage capsid were calculated to have equalized; ejection using only pressure internal to the capsid must therefore stop. Under current continuum mechanics models, a separate process must always operate in vivo to complete genome internalization and allow a productive infection. Various theoretical solutions to the problem have been put forward but none has yet been supported by an actual in vivo experiment. It is worthwhile considering what biophysical characteristics have been established about mature phage particles. Their genomes occupy about half the available volume inside the capsid, the remainder being filled by internal proteins (in some phages only) and buffer.
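Two of the quantities in this section can be put on a common footing with simple arithmetic. The sketch below converts the packaging density of ~500 mg/ml into a concentration of DNA phosphate groups, and uses the ideal van 't Hoff relation to express the quoted turgor values as an excess solute concentration. The average nucleotide mass of 330 g/mol and the ideal-solution relation are assumptions introduced here, not values from the text, so the output is an order-of-magnitude guide only.

```python
# Rough conversions for the packaging density and turgor values in the text.
# Assumptions (not from the text): an average nucleotide residue mass of
# ~330 g/mol with one phosphate per nucleotide, and ideal-solution
# (van 't Hoff) behavior, pressure = concentration * R * T.

R_L_ATM = 0.082057            # L*atm/(mol*K), molar gas constant
DNA_G_PER_L = 500.0           # ~500 mg/ml packaged DNA, from the text
NUCLEOTIDE_G_PER_MOL = 330.0  # assumed average mass per nucleotide


def phosphate_molarity(dna_g_per_l=DNA_G_PER_L,
                       nucleotide_g_per_mol=NUCLEOTIDE_G_PER_MOL):
    """Moles of DNA phosphate per liter at the given packaging density."""
    return dna_g_per_l / nucleotide_g_per_mol


def excess_osmolarity(pressure_atm, temp_c=37.0):
    """Ideal excess solute concentration (osmol/L) giving pressure_atm."""
    return pressure_atm / (R_L_ATM * (temp_c + 273.15))


if __name__ == "__main__":
    print(f"DNA phosphate: ~{phosphate_molarity():.1f} M")
    for label, atm in (("Gram-negative, low", 1), ("Gram-negative, high", 5),
                       ("Gram-positive", 20)):
        print(f"{label}: {atm} atm ~ {excess_osmolarity(atm):.2f} osmol/L")
```

Under these assumptions the phosphate concentration comes out near 1.5 M, which sets the amount of monovalent counterion that would be needed if every phosphate were ionized.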
Capsids of the T-even phages are osmotically sensitive; they are slowly permeable to small ions, but polyamines and larger ions are maintained in the capsid at the concentration found in their last host. Osmotic shock-resistant T4 mutants have been isolated that can exchange polyamines by dialysis. Small ions and polyamines contained within most other phage capsids can be readily exchanged by dialysis and some phages are stable in EDTA solutions, at least at low temperature. If the DNA phosphates of the genome are ionized then ~1.5 M monovalent (or their equivalent) counterions must be present. However, the presence of cations that neutralize ionized DNA phosphates in mature virions has recently been questioned; the packaged genomes of P22 and the eukaryotic virus AAV are reported to be in a largely non-ionic state. If confirmed and generalized by extension to other phages, these observations have the potential to profoundly affect long-standing thinking about genome ejection. If the DNA phosphates are actually neutralized as –P-O-H, not only is charge repulsion between helices negated but the lack of free counterions necessarily reduces the osmotic pressure inside the virion. Several observations, together with the long-term stability of most phages in simple buffers, are also not consistent with the view that phages are “pressurized”. Substantial indentations in a mature λ capsid can be made with the cantilever of an atomic force microscope and the deformation is reversible. A reduction in capsid volume with the concomitant expulsion of internal water is therefore tolerated without loss of viability. The packaged highly supercoiled PM2 genome is reported to exert no detectable pressure on the membrane internal to the phage capsid. Remodeling of the PRD1 internal membrane occurs before genome ejection and is driven by exchange of osmolytes between the environment and the infected cell.
Proteins have been shown to be mobile inside the T4 head, and the inner body of giant phages is not found at the same location in different virions and is presumably likewise mobile. Several different phage capsids trapped immediately after rupture on an electron microscope grid reveal that the released genome retains the same dimensions as in the intact head. None of these observations is readily reconciled with a high pressure inside the mature capsid. Furthermore, a single P4 genome, only 30% the length of P2 DNA, has been packaged in vitro inside a normal-sized P2 head. The calculated internal pressure of such chimeric particles is less than the osmotic pressure inside the bacterial cytoplasm; in these particles internal capsid pressure cannot therefore drive any genome ejection, yet the chimeras were shown to be infective in vivo. By contrast, analysis of electron micrographs of the Staphylococcus aureus phage P68 revealed that ~2% of particles had ejected an apparently fixed amount of DNA; it is not known what causes DNA release nor what stops the entire P68 genome from exiting those particles. Quantized partial genome ejection in vitro has also been observed from T3 virions. An alternative process, termed the hydrodynamic model for phage genome ejection in vivo, posits that extracellular water flows along the osmotic pressure gradient between the environment and the cell cytoplasm, which, because of turgor, must be at a higher osmotic pressure (Fig. 6). In contrast to phage DNA ejection in vitro, which involves only two compartments (phage virion and surrounding buffer) with differing osmotic pressures, and where water may enter the phage head only as DNA leaves, phage-infected cells have three compartments (environment, phage head and cell cytoplasm). Each compartment has its own independent osmotic pressure.
However, because water transfer is ultimately between the environment and the cytoplasm of the infected cell, the osmotic pressure of the intermediate compartment, the phage head, is thermodynamically unimportant. Water flowing through the phage head and tail into the cytoplasm transports the genome much as logs are moved downstream by a river current. The DNA in the phage tail exerts a drag force on the water flowing past it; in turn, because action and reaction are equal and opposite, the DNA will be moved by the flowing water into the cytoplasm. It is not immediately obvious how to design experiments that would rigorously test the hydrodynamic model for in vivo genome ejection. In vitro experiments could demonstrate that water flow from a low osmotic pressure environment can transport DNA out of a phage head into a separate compartment of higher osmotic pressure, but such experiments have not yet been reported. The fundamental difference between the continuum mechanics model for in vitro genome ejection and this idea is that the hydrodynamic mechanism involves three compartments: the bacterial cell, the infecting phage, and the environment. The hydrodynamic model thus posits that the energy required for transporting the phage genome lies not inside the capsid but in the environment surrounding the phage and the bacterial cell. Formal mathematical analyses of all the pressures involved in this three-component situation show that if the osmotic pressure of the cell is artificially set equal to that of the environment, the equations and the overall system reduce to be equivalent to the two-body continuum mechanics model. However, in the in vivo situation where bacterial cells have a higher osmotic pressure, water will flow from the environment, through the phage head and tail, and into the cytoplasm until the channel across the inner membrane is closed. Under the hydrodynamic model, the osmotic pressure inside the capsid will equilibrate with the environment and not with the cell cytoplasm, as posited by the continuum mechanics model. In theory therefore, the entire phage genome can be transported into the cell by a constant force.

Fig. 6 Hydrodynamic model of DNA ejection. (A). There are two osmotic gradients: low-to-high between the growth medium and the phage, and low-to-medium between the growth medium and the cell cytoplasm (turgor). (B). During infection, the two gradients are connected through the phage and only the overall osmotic gradient between the growth medium and the cytoplasm is of thermodynamic importance. When the tail plug (A) is opened during infection (B), water will osmose across the capsid shell and eventually flow into the cell cytoplasm. Hydrodynamic drag on DNA in the tail tube will cause DNA translocation into the cytoplasm. As long as the channel is open and turgor is maintained, continued water flow will completely internalize the genome and any soluble proteins in the virion. Note that this mechanism cannot apply if phage infection (e.g., T7-likes) does not result in transient K⁺ leakage (and thus water import) from cells. See text for more details.

DNA is much more condensed in phage capsids than in the bacterial chromosome. In order to package a complete genome in vivo, the packaging motor must create room in the capsid by displacing water molecules from the DNA. Some of the work expended by the motor is thus used in expelling water across the capsid shell. DNA inside a mature phage head, although it retains its B-form configuration, is therefore dehydrated relative to the bacterial chromosome and especially relative to free DNA in solution. External water molecules thus have a thermodynamic incentive to enter a mature capsid but they cannot do so until infection is triggered and an open channel is created. Opening the channel allows water to flow into the capsid and out through the tail into the cell cytoplasm. DNA that is already inserted into the long tails of sipho- and myo-phages will then be propelled into the cell. The hydrodynamic model requires that the channel from the capsid into the cell cytoplasm is wide enough for both DNA and water molecules to pass at the same time. This requirement is reasonable: most phage infections are associated with a transient leakage of intracellular cations, and if ions can leak out from the cytoplasm through a DNA-filled channel, water molecules should be readily able to enter the same cytoplasm.
Significantly for this model of genome ejection, the only phage group (T7-likes) currently known not to leak cations at infection initiation, and thus may not be able to use water flow along an osmotic gradient, is known to require molecular motors to transport the entire phage genome into the cell. T7 has presumably found this mode of genome ejection advantageous for its specific life-style. The hydrodynamic model also provides a potential explanation for how proteins are ejected into the infected cell, a topic that cannot be addressed by DNA-centric continuum mechanics models. Water flow from the environment, through the phage virion and into the bacterium could provide the necessary force for ejection of the tape measure protein that is common to all long-tailed phages. In this case, the initial open channel may extend only into the periplasm, which, however, is iso-osmotic with the cytoplasm. Some tapemeasure proteins have actually been suggested to penetrate the cytoplasmic membrane, completing the channel from the phage capsid into the cytoplasm. Water flow could help transport the intra-capsid proteins that form the extended tail of the short-tailed phage P22, and perhaps facilitate formation of the tail-less øX174 ejection tube. Water flow into the cell cytoplasm is also suggested to cause the extended membrane tube formed by PRD1 during infection initiation. By contrast, as concluded from K þ leakage experiments T7 does not allow water flow; and as previously suggested, the T7 ejected proteins could be assembled into virions in a metastable state and thus contain the intrinsic energy necessary for their ejection into the infected cell. It seems unlikely that all the B1000 copies of the T-even IP proteins, and internal proteins of other phages packaged at high copy number are similarly metastable. Unless they are transported while bound to the phage genome, such proteins may require


an external force in order to enter the cell. Importantly, however, there is no temporal constraint on when proteins are ejected under the hydrodynamic model: they can leave the capsid before and/or after genome ejection or, if there is adequate space between the internal wall of the tail and the DNA, at the same time. The only requirement is that a channel from the phage capsid into the infected cell is open to allow water flow.

Genome and protein ejection from phage virions remains the least understood aspect of phage development. The continued development of cryoEM and cryoET techniques is yielding much-needed structural information on the DNA translocation channel before, during, and after genome ejection. Traditional genetics and physiological studies are therefore now beginning to be fitted into a structural environment. Seeing can be believing, but visualizing structures in situ is also instructive: it both informs and constrains interpretations of mechanistic data on genome ejection that by their very nature may be indirect and can thus be analyzed in different ways. Finally, the development of better kinetic assays that directly measure how much of a phage genome is internalized in the bacterial cytoplasm would certainly increase our understanding of the mechanisms of DNA translocation used by phages.

Acknowledgments

We thank YeRin Kim for Figs. 1, 5 and 6. Work in the Molineux laboratory is supported by NIH grants GM110243 and GM124378. APR was supported by NSF MCB-1408217 to B. A. Fane at the University of Arizona.

See also: Energetics of the DNA-Filled Head

Further Reading

Casjens, S.R., Molineux, I.J., 2012. Short noncontractile tail machines: Adsorption and DNA delivery by podoviruses. Advances in Experimental Medicine and Biology 726, 143–179.
Cvirkaitė-Krupovič, V., Krupovič, M., Daugelavičius, R., Bamford, D.H., 2010. Calcium ion-dependent entry of the membrane-containing bacteriophage PM2 into its Pseudoalteromonas host. Virology 405, 120–128.
Daugelavičius, R., Cvirkaitė-Krupovič, V., Gaidelytė, A., et al., 2005. Penetration of enveloped double-stranded RNA bacteriophages φ13 and φ6 into Pseudomonas syringae cells. Journal of Virology 79, 5017–5026.
Davidson, A.R., Cardarelli, L., Pell, L.G., Radford, D.R., Maxwell, K.L., 2012. Long noncontractile tail machines of bacteriophages. Advances in Experimental Medicine and Biology 726, 115–142.
Farley, M.M., Tu, J., Kearns, D.B., Molineux, I.J., Liu, J., 2017. Ultrastructural analysis of bacteriophage Phi29 during infection of Bacillus subtilis. Journal of Structural Biology 197, 163–171.
Fernandes, S., Labarde, A., Baptista, C., et al., 2016. A non-invasive method for studying viral DNA delivery to bacteria reveals key requirements for phage SPP1 DNA entry in Bacillus subtilis cells. Virology 495, 79–91.
Gaidelytė, A., Cvirkaitė-Krupovič, V., Daugelavičius, R., Bamford, J.K.H., Bamford, D.H., 2006. The entry mechanism of membrane-containing phage Bam35 infecting Bacillus thuringiensis. Journal of Bacteriology 188, 5925–5934.
Hrebík, D., Štveráková, D., Škubník, K., et al., 2019. Structure and genome ejection mechanism of Staphylococcus aureus phage P68. Science Advances 5, eaaw7414.
Hu, B., Margolin, W., Molineux, I.J., Liu, J., 2013. The bacteriophage T7 virion undergoes extensive structural remodeling during infection. Science 339, 576–579.
Hu, B., Margolin, W., Molineux, I.J., Liu, J., 2015. Structural remodeling of bacteriophage T4 and host membranes during infection initiation. Proceedings of the National Academy of Sciences of the United States of America 112, E4919–E4928.
Kemp, P., Gupta, M., Molineux, I.J., 2004. Bacteriophage T7 DNA ejection into cells is initiated by an enzyme-like mechanism. Molecular Microbiology 53, 1251–1265.
Meng, R., Jiang, M., Cui, Z., et al., 2019. Structural basis for the adsorption of a single-stranded RNA bacteriophage. Nature Communications 10 (1), 3130.
Molineux, I.J., 2006. Fifty-three years since Hershey and Chase; much ado about pressure but which pressure is it? Virology 344, 221–229.
Molineux, I.J., Panja, D., 2013. Popping the cork: Mechanisms of phage genome ejection. Nature Reviews Microbiology 11 (3), 194–204.
Panja, D., Molineux, I.J., 2010. Dynamics of bacteriophage genome ejection in vitro and in vivo. Physical Biology 7.
Peralta, B., Gil-Carton, D., Castaño-Díez, D., et al., 2013. Mechanism of membranous tunnelling nanotube formation in viral genome delivery. PLOS Biology 11 (9), e1001667.
Roessner, C.A., Ihler, G.M., 1986. Formation of transmembrane channels in liposomes during injection of λ DNA. Journal of Biological Chemistry 261, 386–390.
Samire, P., Serrano, B., Duche, D., et al., 2020. Decoupling filamentous phage uptake and energy of the TolQRA Motor in Escherichia coli. Journal of Bacteriology 202, e00428.
Stent, G.S., 1963. Molecular Biology of Bacterial Viruses. San Francisco: W. H. Freeman.
Sun, Y., Roznowski, A.P., Tokuda, J.M., et al., 2017. Structural changes of tailless bacteriophage ΦX174 during penetration of bacterial cell walls. Proceedings of the National Academy of Sciences of the United States of America 114, 13708–13713.
Taylor, N.M., Prokhorov, N.S., Guerrero-Ferreira, R.C., et al., 2016. Structure of the T4 baseplate and its function in triggering sheath contraction. Nature 533, 346–352.
Van Valen, D., Wu, D., Chen, Y., et al., 2012. A single-molecule Hershey-Chase experiment. Current Biology 22, 1339–1343.
Wang, C., Tu, J., Liu, J., Molineux, I.J., 2019. Structural dynamics of bacteriophage P22 infection initiation revealed by cryo-electron tomography. Nature Microbiology 4, 1049–1056.
Xu, J., Gui, M., Wang, D., Xiang, Y., 2016. The bacteriophage φ29 tail possesses a pore-forming loop for cell membrane penetration. Nature 534, 544–547.
Xu, J., Wang, D., Gui, M., Xiang, Y., 2019. Structural assembly of the tailed bacteriophage φ29. Nature Communications 10, 2366.
Zárybnický, V., 1969. Mechanism of T-even DNA ejection. Journal of Theoretical Biology 22, 33–42.

Dealing With the Whole Head: Diversity and Function of Capsid Ejection Proteins in Tailed Phages

Lindsay W Black, The University of Maryland School of Medicine, Baltimore, MD, United States
Julie A Thomas, Rochester Institute of Technology, Rochester, NY, United States

© 2021 Elsevier Ltd. All rights reserved.

Glossary

Ejection or E protein Protein that is packaged within the DNA of a phage capsid and, upon host infection, exits the capsid with the DNA and enters the host cell.

Prohead Spherical, protein-only precursor form of the capsid.

Introduction

Tailed phages of the Caudovirales are considered to be among the most genetically diverse viruses, a trait that is reflected in a diversity of virion structures and infection mechanisms. As their name suggests, the virions of all tailed phages are composed of a tail and a head of icosahedral, or similar, symmetry. Tails vary dramatically between the main phage taxa: members of the Podoviridae have short, non-contractile tails; members of the Siphoviridae have long, non-contractile tails; and members of the Myoviridae have long, doubled, contractile tails. The prime function of the tail is the identification of a suitable bacterium and the initiation of its infection, usually via baseplate and fiber structures, followed by delivery of the genome and capsid ejection proteins, or E proteins, into the host cell. The major role of the head is to package, protect, and deliver both the double-stranded DNA (dsDNA) genome (~12 to ~500 kb) and, more frequently than previously realized, capsid internal proteins. Phage heads are remarkable for so successfully protecting their DNA and protein cargos, as illustrated by the virions of some phages having been shown to be stable and infectious after decades of storage.

In contrast to tail structures across phage families, the icosahedral heads of all tailed phages have shared features because the basic head structure is formed by the actions and interactions of only three structurally conserved and essential proteins: the major capsid protein (MCP); the portal protein, which forms a structure for DNA entry and egress between the head and tail; and the large terminase protein, which actively packages DNA into the head. These proteins’ ancestries date back to a primordial virus in existence prior to the split of eukaryotes and prokaryotes. Many copies of the MCP form a thin (~20 Å), but remarkably strong, protein outer shell.
The MCP copy number varies as 60T, where the icosahedral triangulation number, or T number, increases from 1 to more than 50. Increasing capsid T numbers correlate with the increased capsid dimensions necessary to house a longer genome and, in some instances, extremely large cargos of internal proteins (see below). The stability of the capsid shell can also in some cases be enhanced by MCP crosslinking or surface decoration proteins. Twelve copies of the portal protein form a turbine-like structure located at one unique capsid vertex. The portal has a number of essential functions, including (1) nucleation of prohead assembly and structure, (2) genome packaging into the prohead, (3) attachment of the tail to the head after the genome has been packaged, and (4) acting as a conduit for capsid internal contents as they exit the capsid into the tail upon infection of the host. The third head protein that is conserved in all tailed phages is the terminase, which packages the DNA into the head through the inner channel of the portal. The terminase docks to the portal structure as a multimeric motor complex and hydrolyzes ATP to package, and often also to measure and cut out, one genome’s worth of DNA from a multi-genomic, replicated and joined head-to-tail DNA concatemer. The mature genome is packed into the prohead to an extraordinarily high, liquid crystalline density (~500 mg/ml), whether or not there are also cargo proteins packaged before the DNA. In tailed phages, the terminase protein(s) generally dissociate from the portal after packaging and are not part of the mature virion. Inside the MCP shell, in addition to the highly condensed genome, many phages have internal proteins. The existence of these internal head proteins illustrates beautifully the remarkable diversity that exists between different phages, as internal protein content can vary in sequence, length, copy number, location, and function in the capsid.
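The 60T rule stated above can be made concrete with a short calculation. The sketch below assumes the standard Caspar-Klug relation T = h² + hk + k² (h and k non-negative integers, not both zero), which this article does not spell out; the function names are illustrative only. It follows the 60T rule as stated and does not correct for the portal, which in tailed phages occupies one vertex in place of MCP subunits.

```python
from typing import List


def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2 (assumed relation)."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def mcp_copies(t: int) -> int:
    """Idealized MCP copy number for a closed icosahedral shell: 60 * T."""
    return 60 * t


def allowed_t_numbers(limit: int) -> List[int]:
    """All permitted T numbers up to and including `limit`, in ascending order."""
    found = set()
    h = 0
    while h * h <= limit:
        for k in range(0, h + 1):  # k <= h avoids counting mirror pairs twice
            t = h * h + h * k + k * k
            if 0 < t <= limit:
                found.add(t)
        h += 1
    return sorted(found)


if __name__ == "__main__":
    for h, k in [(1, 0), (2, 1), (3, 3)]:
        t = triangulation_number(h, k)
        print(f"h={h}, k={k}: T={t}, idealized MCP copies={mcp_copies(t)}")
    print(allowed_t_numbers(52))
```

Under these assumptions, h = k = 3 gives T = 27 and an idealized 1620 MCP copies; only certain integers are permitted T numbers, so capsid sizes increase in discrete steps.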
The existence of internal proteins also highlights the fact that genome packaging occurs successfully in spite of their diversity, which results in numerous final internal capsid conditions and states. However, although in every case studied these internal proteins are packaged into the prohead before the DNA, exactly how they impact prohead structure and assembly, DNA packaging, head stability and DNA ejection is incompletely understood for most phages. The presence of internal proteins in different phages also raises the question: why are they there? With the exception of a handful of model phages, knowledge of internal head protein function(s) is limited. Based on knowledge acquired from model phages, such as T7, N4, P22, P1 and T4, there is an expectation that proteins packaged within the capsid outer shell are ejected into the host cell at the time of infection along with the DNA and have roles in host takeover. Consequently, internal head proteins in tailed phages are often referred to as ejection, or E, proteins, and in this review we

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20987-4

Fig. 1 Tailed phages whose capsids have ejection, or E, proteins packaged within their DNA; these proteins have a diversity of functions (where known). All example phages infect Gram-negative bacteria, but there are also phages infecting Gram-positive bacteria that have E proteins.

will use both terms interchangeably, while acknowledging that the ejection status of many such proteins requires experimental confirmation. We previously delineated tailed phage heads into four subcategories based on the amount and location of E proteins within them (see Black and Thomas, 2012). Continued research has led us to modify these categories slightly into the following: (1) capsids that contain only packaged DNA and no, or very few, proteins within them; (2) capsids that contain a relatively small number (fewer than 50) of E proteins that have a specific locale within the capsid; (3) capsids that contain many (over one thousand) E proteins embedded within the DNA in no identified order; and (4) capsids that have a large mass of co-localized E proteins that occupy an appreciable fraction of the inner capsid volume, sometimes in a structure with a defined shape and volume (see Fig. 1 for examples). We note that this categorization of E proteins is simply for ease of discussion, and has no taxonomic or functional basis. Examples of each of these categories of E proteins can be found in different phage families. Conversely, some phages, such as T4, have different classes of E proteins, each with different functions, locations and/or abundances. All these E proteins, regardless of their structure and locales in the capsid, are able to exit the capsid through the narrow (~30 Å diameter) channel in the center of the portal complex and then must pass through the tail structure of their respective virion [sometimes a long (1000 Å), narrow (~30 Å diameter) structure!] for delivery to their bacterial target. That target depends upon the E protein’s function: for some it is the host outer membrane (e.g., the podoviral core structures discussed below), whereas for others it is the host cytoplasm (e.g., the T4 internal proteins).
It should also be noted that all icosahedral phage DNAs in capsids have small-molecule counterions, such as polyamines and divalent cations, to neutralize the highly negatively charged DNA phosphodiester backbone; as these molecules likely have no specific biological phage function, they will not be discussed as a part of this review. In this review, we will summarize the major features and functions, where known, of examples of phages whose E proteins fall into the different categories listed above (Fig. 1). While these examples are mostly from a limited number of extremely well-characterized phages, they serve to demonstrate the eclectic nature of E proteins and their functions. In addition, we will include examples from recent research that hint at there being a virtual smorgasbord of E proteins in the virosphere, and finish with a discussion of some of the most pressing questions raised by the existence of E proteins.

Tailed Phage Capsids With Low Numbers of E Proteins in Specific Locales

A number of unrelated phages have been shown to contain small numbers of E proteins (fewer than 50) that have specific locales within the densely packaged DNA of their capsids, often close to the portal structure. Some low copy number E proteins are highly organized into structures consisting of multiple proteins, whereas others may be present as just a handful, or one or two copies, of a single localized protein. Despite their generally low copy number relative to the highly abundant E proteins present in some other phage types (see below), characterization of a relatively small number of examples has illustrated a remarkable range of


functions (a number are apparently multi-functional), from promoting phage replication (often at the transcriptional level) to the manipulation and modification of cellular structures and molecules to allow or enhance infection. Given this, it is not surprising that many of these low abundance E proteins have been shown to be essential.

Low copy number E proteins with specific locales in the capsid

A number of E proteins are not thought to be part of a specific structure but nonetheless have specific locales within the capsid, typically in close proximity to the portal complex. Examples of localized internal proteins are found in E. coli phages N4 and T4, and Bacillus phages Phi29 and SPP1. Despite having similar locales, these localized inner capsid proteins have a remarkably eclectic set of functions, and several of them are apparently multi-functional. One localized protein that stands out for its unusualness is the virion RNA polymerase (vRNAP) gp50 of the podovirus N4, which was among the first phage E proteins discovered (see Lenneman and Rothman-Denes, 2015). This enzyme is essential, transcribing early genes from atypical promoters containing hairpin/stem structures whose formation is assisted by host proteins (E. coli DNA gyrase and single-stranded DNA-binding protein). The single-subunit gp50 is surprisingly large (~380 kDa) relative to its homologs in mitochondria and other phages (e.g., the T7 gp1 RNAP, which is not an E protein, is ~99 kDa). In addition to the central domain of gp50 that is responsible for RNAP activity, the enzyme has a C-terminal domain that targets the vRNAP into the head and an N-terminal domain that is required for the ejection of the first 500 bp of the genome. Only about four copies of the vRNAP are present in each N4 virion, and cryo-EM comparison of wild-type and gp50-minus mutant particles indicated that the vRNAP is located at the base of an inner protein core, just above the internal entrance to the portal channel. Considering its molecular weight, it is remarkable that this vRNAP can pass through the portal channel and other structures to enter the cell. The myovirus T4 also has two low abundance E proteins, Alt and gp2, whose positions within the capsid are localized.
Alt is a large protein (~75 kDa) that ADP-ribosylates host RNAP subunits and a number of other host proteins, e.g., MazF toxin (see Alawneh et al., 2016). Although Alt is nonessential, its absence from the prohead results in capsids of varying size, both smaller and larger than those of the wild-type phage, suggesting it has an additional, assembly-related function. Alt is also located close to, possibly on, the portal complex, as supported by the fact that in giant elongated T4 heads formed by certain head mutations the copy number of Alt remains constant, at about 40 copies. This is despite increases in the copy numbers of other internal proteins (IPI, IPII, and IPIII, see below) that are proportional to the increases in head size in these mutant particles. T4 gp2 is a low copy (~1–2/head) DNA end protection protein that is required to block the E. coli exonuclease V (RecBCD); in E. coli lacking this nuclease, phages lacking gp2 are fully infective. Both ends of the mature T4 DNA are located 80–90 Å apart and close to the portal, indicating that gp2 must also be in this region and that it, like Alt, may have a secondary function. For gp2 this is likely that of a genome plug, as mutations in it produce unstable heads that leak DNA. Genome plugs are proteins that function to prevent the packaged genome from escaping through the portal vertex until the appropriate time for DNA ejection or descent into the tail tube.

Podoviral E proteins

The short tails of podoviruses present a unique problem with regard to the delivery of their genome to the cytoplasm of the host cell: unlike the tails of phages from other families, they are too short to span the cell wall. Podoviruses overcome this multilayered barrier (consisting of inner and outer membranes, and periplasm) by having a set of E proteins, each typically present in relatively small copy number (fewer than 10), that are ejected ahead of the DNA and form a remarkable channel in the cell wall through which the genome passes, i.e., within their capsids podoviruses package an evertible tail of sorts! Structural evidence for the formation of this E-protein-derived channel has been observed in only two podoviruses to date (E. coli phage T7 and Salmonella phage P22), using cryo-electron tomography and exploiting minicells for phage adsorption to make visualization feasible. In T7, the E proteins that form the evertible tail are derived from a multi-protein structure, or core, that is centralized within the densely packaged DNA in the capsid (Fig. 2). Although the proteins that form the core are technically E proteins, owing to their presence in the core and the multi-functionality that this positioning affords them, they are usually referred to as core proteins. The T7 core is roughly cylindrical in shape, 265 Å high and 175 Å wide, and positioned immediately above the portal complex. Although the core is not composed of large numbers of proteins, it does occupy a considerable volume in the 510 Å diameter capsid, large enough to be visualized by conventional metal staining electron microscopy. The T7 core is not a simple cylinder, being composed of three domains, each with a different symmetry: twelve-, eight-, and four-fold.
The twelve-fold symmetry domain most probably represents the connector complex, whereas the other two domains are composed of three essential proteins: gps 14 (21 kDa), 15 (84 kDa) and 16 (144 kDa), present in ~10–12, 8 and 4 copies, respectively. Along the length of the core is an internal channel/cavity (diameter varies from 35 to 110 Å) that is continuous through to the outer side of the capsid because the core structure is positioned immediately above the portal complex. With regard to the genome, the T7 core is multi-functional, acting as a conduit through which the DNA enters the capsid (via the core inner channel). In addition, the core goes through an astonishing deconstruction, with the E proteins exiting the capsid via the portal and then reassembling into a channel that traverses the E. coli cell wall. This channel is formed by gp14, gp15 and gp16, with gp14 localizing in the outer membrane of the cell, and gp15 and gp16 localizing in the periplasm and cytoplasmic membrane. The packaged genome is then able to enter the cell via this channel, whose development bypasses the need for the specific phage or host structures employed by other phages to facilitate genome entry into the host cell (e.g., the long tail and LamB outer membrane porin required by Lambda phage). Even phages without a tail, such as PhiX174, transfer their genome into a host cell by forming such an evertible tail.
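The statement that the T7 core occupies a considerable volume in the capsid can be checked with rough geometry. The sketch below is a back-of-the-envelope estimate only: it assumes the core is a solid cylinder (265 Å high, 175 Å wide, as quoted in this section) and idealizes the capsid as a sphere of 510 Å outer diameter. The optional shell thickness is a further assumption, with 20 Å taken from the approximate MCP shell thickness given in the Introduction; the function names are hypothetical.

```python
import math


def cylinder_volume(height: float, diameter: float) -> float:
    """Volume of a right circular cylinder (same length unit cubed)."""
    return math.pi * (diameter / 2.0) ** 2 * height


def sphere_volume(diameter: float) -> float:
    """Volume of a sphere given its diameter."""
    return (4.0 / 3.0) * math.pi * (diameter / 2.0) ** 3


def core_volume_fraction(core_height: float, core_diameter: float,
                         capsid_diameter: float,
                         shell_thickness: float = 0.0) -> float:
    """Fraction of the (idealized, spherical) capsid interior taken up by the core.

    shell_thickness is subtracted from both sides of the outer diameter;
    treating the capsid as a sphere is an assumption made for illustration.
    """
    inner_diameter = capsid_diameter - 2.0 * shell_thickness
    if inner_diameter <= 0:
        raise ValueError("shell thickness leaves no interior volume")
    return cylinder_volume(core_height, core_diameter) / sphere_volume(inner_diameter)


if __name__ == "__main__":
    # Dimensions in angstroms, taken from the text.
    outer = core_volume_fraction(265.0, 175.0, 510.0)
    inner = core_volume_fraction(265.0, 175.0, 510.0, shell_thickness=20.0)
    print(f"core / capsid (outer diameter): {outer:.1%}")
    print(f"core / capsid (20 A shell removed): {inner:.1%}")
```

With these idealizations the core takes up roughly 9% of the volume enclosed by the outer capsid diameter, or roughly 12% once an assumed 20 Å shell is subtracted. The real core has an internal channel and the capsid is icosahedral, so these figures only indicate the order of magnitude.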

[Fig. 2 image, panels A–F. Labels: capsid shell, dsDNA, core, portal, channel, OM, IM.]

Fig. 2 Cryo-electron microscopy of T7 phage. Central slices of the T7 virion (A, B and C) and orthogonal surface views of the virion (D, E and F). Cryo-electron tomography of the T7 virion attached to the E. coli cell wall shows the transformation of its inner core (A and B) into a transmembrane channel (C and F) that facilitates genome ejection into the host cytoplasm. IM, inner membrane; OM, outer membrane. Images are reproduced from EMD-1164 (A and D), EMD-5535 (B and E), and EMD-5534 (C and F). Reproduced from Agirrezabala, X., Martin-Benito, J., Caston, J.R., et al., 2005. Maturation of phage T7 involves structural modification of both shell and inner core components. The EMBO Journal 24, 3820–3829, and Hu, B., Margolin, W., Molineux, I.J., Liu, J., 2013. The bacteriophage T7 virion undergoes extensive structural remodeling during infection. Science 339. doi:10.1126/science.1231887.

Core structures are likely conserved features of podoviruses related to T7, such as Prochlorococcus phage P-SSP7, as well as of more distantly related phages, such as Salmonella phage Epsilon15 and E. coli phage N4. There is considerable variation in the size and dimensions of the cores among these and other phages, likely due to variations in the core proteins that interact with the cell walls of different bacterial hosts. Overall, however, it is likely that most cores serve a function in genome delivery analogous to that of the T7 core. Phage P22 has three essential E proteins, gp7, gp16 and gp20, that also create a channel structure, one that spans completely from the bottom of the baseplate through the Salmonella cell wall, a total distance of ~40 nm! The diameter of this channel varies from ~5 to 10 nm, and it has an internal cavity of ~4–8 nm through which the DNA passes. The exact functions of the P22 E proteins are unknown, but analyses of mutants suggest that the mature form of gp7 (gp7*, which is cleaved by a Salmonella protease) forms the extracellular component of the channel and that the majority of the channel in the periplasm is formed by gp20. In contrast to T7, the locations of the P22 E proteins inside the head are less well defined, but the fact that these proteins are ejected ahead of the DNA, and in a specific order, suggests they likely have some sort of consistent locale between particles, possibly residing close to the portal complex; defining these locales represents an exciting avenue for future research (see Casjens and Molineux, 2012; Hu et al., 2013; Wang et al., 2019; Jin et al., 2015; Agirrezabala et al., 2005).

Capsids With Large Numbers of E Proteins Distributed Throughout the DNA

In contrast to the low copy number E proteins with very specific locales in the capsid, some phages have E proteins that are present in many (hundreds!) copies with no specific structural locale, of which the capsid internal proteins (IPs) of T4 are the best studied. There are more than one thousand molecules of three small (~8 to ~20 kDa) proteins, IPI, IPII and IPIII, and several lines of evidence indicate that they are dispersed throughout the packaged, but highly structurally mobile, 500 mg/ml DNA. These include: (1) packaged IP-nuclease fusion proteins, whether present in ~200 or ~10 copies per head, are able to cleave the genome into the same capsid-sized fragments, either quickly or slowly, suggesting they are spread throughout the DNA and are mobile within it; (2) packaged IPIII-GFP fusion proteins show gradual formation of fluorescence, suggesting mobility of the fusion protein within the capsid; (3) IP proteins sediment together with the genome after both have been released by treatments that disrupt the MCP outer shell; and (4) solid state NMR has shown that there are specific close structural contacts between T4 IP lysines and DNA phosphates (see Yu and Schaefer, 2008).


T4 can form mature heads without any IP proteins, as discovered by genetic analyses. Mutants with knockouts in all three IP genes showed that elimination of these three proteins had only minor effects on capsid volume and did not significantly affect the amount of DNA packaged within the normal size capsid. These studies demonstrating that the T4 IPs were dispensable were initially surprising, as their high copy numbers had led researchers to infer they had a major function. A major inference from that discovery was that the IP proteins are not essential for the stability of packaged DNA. But if the IP proteins were not required for the stability of packaged DNA, what was their main function? As it was known that the T4 IPs are all injected into the host cell, this led to the hypothesis that the IPs interact with the host in a manner that enhances infection. It took a number of years to show that IPI binds and protects T4 DNA against a highly specific DNA modification-dependent restriction enzyme (RE), which is only present in some, mostly pathogenic, strains of E. coli. About 350 copies of IPI are ejected before or with the DNA, and appear to function immediately to protect the DNA they accompany in the RE-containing host, since phenotypic mixing shows a functional IPI gene is not required for the RE protection. A major question still to be resolved is what the targets or functions are of the other two abundant injected proteins, IPII and IPIII. Another E. coli phage with E proteins that have an anti-restriction endonuclease role is the myovirus P1. P1 has two large (60 and 200 kDa) E proteins, DarA and DarB, along with four other Dar accessory proteins, whose functions are also to protect the ejected genome from E. coli REs. These proteins enter the host through the long and narrow P1 tail and likely require refolding for the anti-RE functions to emerge.
Interestingly, the absence of the Dar proteins reveals that they also play a major role in P1 capsid size determination, correlating with the formation of small, middle, and normal size capsids (see Piya et al., 2017; Black et al., 1994; Mullaney and Black, 2014).

Capsids With Large Numbers of Co-Localized E Proteins

In recent years, increasing numbers of phages have been identified as having large masses of co-localized E proteins inside their capsids. These masses are expected to be composed of many different proteins, of which several are present in extremely high copy numbers per capsid. To date these masses have only been identified in myoviruses with capsids with high T numbers and long >200 kb genomes, so-called “giant” or “jumbo” phages [Nb., the terms “giant” and “jumbo” have no taxonomic meaning with regard to phages, and are used interchangeably by researchers. These terms were first introduced into the literature in 2002 (for “giant” phage (Mesyanzhinov et al., 2002)) and 2009 (for “jumbo” phage (Hendrix, 2009)). The authors prefer “giant” as the term has historically been used to describe large phage heads, which were the product of head gene mutations in T-even and other phages, e.g., (Doermann et al., 1973; Ackermann et al., 1975)]. These internal protein masses occupy an appreciable fraction of the inner capsid volume, sometimes in a structure with a defined shape and volume. The first of these masses, called the inner body (IB), was identified inside the capsid of Pseudomonas aeruginosa phage φKZ in 1984 by Victor Krylov’s group. It took another 25 years, and the ensuing development of high resolution cryoEM imaging and proteomics via mass spectrometry, before the structure and major components of this mass were delineated (Fig. 3). The φKZ IB is ~90 nm in length and 35 nm in diameter, considerably larger than the T7 inner core, with a total volume that is greater than the total volume of some smaller capsid phages. The φKZ IB has a spring- or spool-like appearance and the long 280 kb genome is spooled around it. Despite a mass of ~15–20 MDa, the exact components of the IB require clarification.
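The comparison between the inner body and smaller phage capsids can be illustrated with dimensions quoted in this article. The sketch below assumes the φKZ IB is a solid cylinder (90 nm long, 35 nm in diameter) and uses the T7 capsid, idealized as a sphere of 51 nm (510 Å) diameter, as an example of a smaller phage. Both shapes are simplifications, and the choice of T7 as the comparison is an assumption made for this illustration rather than something the authors specify.

```python
import math

# Dimensions in nanometres, taken from the text; the geometric idealizations
# (solid cylinder, perfect sphere) are assumptions for illustration only.
IB_LENGTH_NM = 90.0
IB_DIAMETER_NM = 35.0
T7_CAPSID_DIAMETER_NM = 51.0  # 510 angstroms


def inner_body_volume(length: float, diameter: float) -> float:
    """Volume of the inner body modelled as a solid cylinder."""
    return math.pi * (diameter / 2.0) ** 2 * length


def capsid_volume(diameter: float) -> float:
    """Volume of a capsid modelled as a sphere of the given outer diameter."""
    return (4.0 / 3.0) * math.pi * (diameter / 2.0) ** 3


def volume_ratio() -> float:
    """Ratio of the idealized IB volume to the idealized T7 capsid volume."""
    return (inner_body_volume(IB_LENGTH_NM, IB_DIAMETER_NM)
            / capsid_volume(T7_CAPSID_DIAMETER_NM))


if __name__ == "__main__":
    print(f"IB volume: {inner_body_volume(IB_LENGTH_NM, IB_DIAMETER_NM):,.0f} nm^3")
    print(f"T7 capsid volume: {capsid_volume(T7_CAPSID_DIAMETER_NM):,.0f} nm^3")
    print(f"ratio IB / T7 capsid: {volume_ratio():.2f}")
```

Under these assumptions the ratio comes out above 1, consistent with the statement that the IB alone exceeds the total volume of some smaller phage capsids.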
Proteomic analyses of the φKZ head supported the IB being composed of a number of, possibly many, different proteins, which range in abundance. Several high abundance proteins are excellent candidates for the major IB components, including gps 93, 95 and 162, all of which are paralogs and members of Pfam12699 [Pfam is a database of protein families that, for each included conserved protein, contains information about that protein (e.g., function if known) and an alignment of its homologous proteins, i.e., its family]. The IB proteins are apparently ejected into the host cell, based on the IB disappearing after phage particles adsorb to bacteria, and on its absence from particles with contracted tails and DNA-free heads. Whether the IB has an inner channel and/or is anchored to the portal complex, as in podoviral cores, is yet to be determined. That an internal mass of E proteins is not an unusual feature is a fairly recent discovery among “giant” or “jumbo” (genomes >200 kb) environmental phages related to φKZ that infect a range of Gammaproteobacteria. The isolation of these new giant phages highlights a need for further studies on the E protein content and structure within their capsids. Data to date, derived from bioinformatic, proteomic, genetic and structural analyses, support that different giant phages related to φKZ, such as Salmonella phage SPN3US, all have large masses of co-localized E proteins. Intriguingly, despite these phages all sharing a core set of head genes, including counterparts to the abundant proteins in φKZ, there is likely to be great variability in the structures and E protein content between different phages. For instance, the number of Pfam12699 paralogs in each phage varies, from two in SPN3US to seven in Pseudomonas phage φPA3. In addition, the amount of E proteins in SPN3US is estimated to be >40 MDa, approximately twice the mass of the φKZ IB.
Since the SPN3US genome is ~40 kb shorter than that of φKZ, the increase in E protein content in its capsid raises the question of whether some of the E protein mass functions to maintain the density of packaged DNA in their similarly sized capsids. Although the function(s) of the E proteins of φKZ, SPN3US and relatives are mostly a mystery, genetic studies of SPN3US indicate that a number of them have essential roles. Intriguingly, several minor, essential SPN3US E proteins have predicted transmembrane domains, suggesting they may interact with one of the host membranes, possibly assisting the genome and/or other E proteins to enter the cytoplasm. Other potential functions of the E proteins include assisting with the assembly of such a large capsid, possibly regulating capsid size (i.e., analogous to the shape-determining T4 core or to a tail tape measure protein),


Dealing With the Whole Head: Diversity and Function of Capsid Ejection Proteins in Tailed Phages

Fig. 3 Cryoelectron micrographs of purified φKZ virions: (A) initial low-dose exposure; (B) subsequent exposure of the same field, with bubbling showing the locale of the E proteins in radiation-damaged virions. (C) Three-dimensional reconstruction of the φKZ capsid viewed along the axis of fivefold symmetry that passes through the portal. The capsid has T = 27 icosahedral symmetry. The central axis of the inner body passes through the center of the capsid. (D) Central section of the φKZ head sampled in the plane in which the inner body axis lies. (E) Multitiered structure of the inner body shown in surface rendering (left, magenta) and central gray-scale section (right). Reproduced with permission from Wu, W., Thomas, J.A., Cheng, N., Black, L.W., Steven, A.C., 2012. Bubblegrams reveal the inner body structure of φKZ. Science 335, 182. doi:10.1126/science.1214120.

DNA packaging, DNA compaction in the mature head and/or roles in host takeover. The latter is supported by each of the giant phages related to φKZ and SPN3US having an essential encapsidated multi-subunit vRNAP (with diverged homology to prokaryotic RNAP β and β′, rather than the single-subunit podoviral RNAPs discussed earlier), which, based on studies of φKZ, is required for transcription of early phage genes. Interestingly, the SPN3US vRNAP is encapsidated into the DNA-free prohead in an assembled form, since knockout of a single one of the 5 vRNAP subunits prevents any subunit of this ~3.2 MDa enzyme from being incorporated into the procapsid. Possibly, following DNA packaging by the SPN3US terminase, the vRNAP subunits are unfolded and/or disassembled, which would provide an additional function for the terminase. At the very least, the vRNAP subunits must be in that state by the time they exit the capsid and traverse the long, narrow SPN3US tail into the host to begin phage development. That the heads of φKZ and SPN3US each have ~50 different proteins suggests that their E proteins collectively have a multitude of functions, and represent a rich resource for future studies and novel structural precedents (see Wu et al., 2012; Ali et al., 2017; Ceyssens et al., 2014).

The Enigmas of E Proteins

Phages with yet to be identified E proteins

A major question to resolve is how many tailed phages have E proteins. It seems feasible that many phages have E proteins that are yet to be detected, or formally annotated in their genomes. This is because E protein identification is often not straightforward and requires specialized techniques, such as mass spectrometry, which is especially valuable for identifying low copy number E proteins, or cryo-EM, which is especially useful for detecting large masses of co-localized E proteins, such as those in φKZ and its relatives. Homology searches are probably the most straightforward method for inferring the presence of an E protein if there is an existing, characterized homolog in the databases, such as homologs to the core proteins of T7, or the IP proteins of T4. However, such results may be of limited use, especially for small E proteins, even ones with homologs, depending upon the extent of sequence divergence from the characterized protein.


Obviously, homology searches are of no practical use for the identification of E proteins in new phage taxa whose genomes have a high percentage of novel genes, which is not an uncommon occurrence given the overall genetic diversity of tailed phages. In such instances, at least one representative needs to be analyzed to assist with the inference of E protein content for related phages. As noted above, such analyses are not trivial, and we note that much of the existing knowledge regarding E proteins is derived from numerous and elegant studies on model phages, such as T4, T7, P22, P1, SPP1 and N4, conducted over many years. In addition, much of this knowledge is the by-product of these phages having genetic systems, which are not currently available for many phages. Despite these challenges, and based on the existing precedents, the analyses of E proteins in novel phage types are likely to reveal many new functional classes of E proteins, possibly including those which provide defense against prokaryotic CRISPR-Cas adaptive immune systems.

What mechanisms ensure encapsidation of E proteins?

Remarkably, E proteins in all the categories discussed above are not transferred into the procapsid after it forms, as the DNA is via the portal. Instead, E proteins are internalized into the head at the stage when it is a protein-only prohead, and only associate with the DNA after it is packaged. This knowledge is derived from characterization of the proheads of a number of model phages, as well as DNA packaging studies which have shown that proteins are excluded by the terminase from being packaged with the DNA, even proteins with high DNA-binding affinities that normally bind to metabolically active cellular DNA. It can then be inferred that potentially all E proteins are incorporated into the capsid via events based on protein-protein interactions. However, for many phages it is not at all clear what the specific determinants of these often essential interactions are. Possibly, the mechanisms of E protein encapsidation have evolved to be as diverse as the proteins themselves. There are several phages for which some of the determinants of E protein encapsidation are better understood, such as T4 (Fig. 4). In these phages, E proteins are targeted into the procapsid via attraction to proteins that are critical for prohead formation (e.g., scaffold and core proteins), which adds a further functional aspect to those proteins. The scaffold protein is typically present in hundreds of copies in the prohead and, as the name suggests, creates a structure around which the MCP shell can assemble, and therefore has an important role in determining head dimensions. During head maturation the scaffold protein typically exits the capsid and by doing so creates space within the capsid for the DNA to be packaged.
In phages with more complex prohead structures, such as T4, there is a main scaffold protein, gp22, and several other core proteins that are also present in many copies and together fulfil the role played by a single scaffold protein in other phage types. The T4 IP proteins are incorporated into DNA-free proheads based on the affinity of a short (10–19 residue), highly conserved propeptide consensus sequence found at their N-terminus (the capsid targeting sequence, or CTS) for the major essential gp22 core proteins in the procapsid. The affinity of the T4 CTS for the core is so strong that this sequence alone can be added to foreign, non-phage proteins, which results in their being packaged into the capsid. Once IP or other targeted proteins are packaged into the core, their interaction with

[Fig. 4, panel A labels: Prohead; Proteolysis & Expansion; DNA Packaging; Mature Head. Panels B, C and D are micrographs.]

Fig. 4 Maturation of the T4 head containing densely packaged dsDNA (171 kb) and E proteins (IPI, IPII, IPIII, Alt and gp2). (A) Scheme showing T4 head maturation from a protein-only spherical core to the expanded, prolate, DNA- and E protein-containing mature head. (B) Transmission electron micrographs of thin sections of E. coli infected with T4 showing proheads formed on the inner membrane, (C) expanding proheads moving away from the membrane after proteolysis by the prohead protease has removed scaffold protein (gp22) and the E protein and major capsid protein propeptides, and (D) proheads undergoing DNA packaging to become mature heads.


the temporary scaffold protein is no longer necessary. At this point, the prohead protease, gp21, which the bulk of the evidence suggests nucleates as a kernel at the center of the prohead core, is activated to resolve these CTS-scaffold interactions. The prohead protease also cleaves an N-terminal propeptide from the major capsid protein and cleaves all transient core proteins into small peptides, most of which exit the capsid, although a few are retained within the mature capsid. Despite major differences in capsid dimensions, structure and composition relative to the T4 capsid, giant phages related to SPN3US and φKZ likely target many of their E proteins into the procapsid via a mechanism similar to the T4 CTS system. Support for this includes that these giant phages all have diverged homologs to the T4 major capsid protein, portal, large terminase and prohead protease proteins, suggesting they share major, anciently derived head assembly steps. In addition, proteomic analyses of three phages, SPN3US, φKZ, and the Pseudomonas chlororaphis phage 201φ2-1, identified head proteins from which N-terminal propeptides had been removed, with cleavage in all cases being C-terminal to a glutamate residue after a short motif comparable to the cleavage substrate for the T4 protease. Proteins identified as being cleaved in all three giant phages included the major capsid protein, portal and E proteins belonging to Pfam12699. The propeptides of many highly abundant E proteins are considerably longer than the T4 CTS, i.e., 125 and 156 residues are removed from the N-termini of the highly abundant E proteins gp53 in SPN3US and gp93 in φKZ, respectively. These propeptides are longer than the entire T4 IPI (95 residues) and IPII (100 residues) proteins. In vitro and in vivo data for several of the long giant phage propeptides indicate they are cleaved at additional internal cleavage motifs.
This apparent redundancy in cleavage sites is likely to ensure the propeptide fragments are small enough to leave the capsid; however, the need for the longer propeptides in giant phage head proteins compared to those of T4 is unknown. Possibly these long propeptides ensure correct localization of the E proteins in the mature head and/or act as supplements to a scaffold protein/core. Giant phages related to SPN3US and φKZ are likely to have an equivalent to the T4 CTS despite their E protein propeptides being considerably longer than those of T4. Support for this includes that the N-terminal 15 residues (MANFVKSKLARESVE) of the highly abundant SPN3US E protein paralogs, gp53 and gp54, are identical and contain sequences (AXE-12, SXE-15) consistent with the known cleavage motifs for the SPN3US and φKZ proteases. In addition, the diverged homologous abundant E protein in φKZ, gp93, is processed at SLE-13, as well as at other processing sites. There are many questions yet to be resolved regarding the role(s) of the longer E protein propeptides; however, it is likely there are a number of significant variations in giant phage head assembly mechanisms relative to those of T4, which, in conjunction with the functions of their novel E protein cargos, represent likely sources for future novel discoveries. In contrast to E proteins that are targeted to scaffold/core structures, some E proteins are likely targeted into the prohead via interactions with the portal complex. This inference is based on the location of a number of different proteins immediately above the portal complex, such as the N4 vRNAP and T4 Alt. This location of Alt is a yet to be resolved puzzle, as Alt also has a CTS like the IP proteins, albeit slightly shorter. Possibly, the processing of Alt’s N-terminal CTS, together with our recent discovery that its C-terminus is also a CTS-like propeptide removed by gp21, indicates that scaffold targeting is followed by portal targeting.
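The motif positions quoted above (AXE-12 and SXE-15 in the shared gp53/gp54 N-terminus) can be checked with a few lines of Python. This is a minimal sketch, assuming the cleavage motif can be written as the pattern [A/S]-X-E with cleavage C-terminal to the glutamate; that pattern is generalized here from the AXE, SXE and SLE examples in the text and may not capture the true protease specificity. The function name find_cleavage_motifs is hypothetical.

```python
import re

# N-terminal 15 residues shared by SPN3US gp53 and gp54 (quoted in the text).
SPN3US_NTERM = "MANFVKSKLARESVE"

# ASSUMPTION: motif = Ala or Ser, any residue, then Glu ([A/S]-X-E).
# A lookahead is used so that overlapping candidate motifs are all reported.
MOTIF = re.compile(r"(?=([AS].E))")


def find_cleavage_motifs(seq):
    """Return (motif, glu_position) pairs; positions are 1-based residue numbers."""
    # m.start() is the 0-based index of the first motif residue, so the
    # glutamate (third residue of the motif) sits at 1-based position start + 3.
    return [(m.group(1), m.start() + 3) for m in MOTIF.finditer(seq)]


if __name__ == "__main__":
    for motif, pos in find_cleavage_motifs(SPN3US_NTERM):
        # Print in the text's notation, e.g. AXE-12 for Ala-X-Glu ending at residue 12.
        print(f"{motif[0]}X{motif[2]}-{pos} ({motif})")
```

Run on the 15-residue sequence, the scan reports ARE ending at residue 12 and SVE ending at residue 15, matching the AXE-12 and SXE-15 sites named in the text. A pattern this short will also match by chance in longer sequences, so hits would need experimental support such as the proteomic data described above.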

Impacts of E proteins on capsid DNA packaging and ejection

The presence of E proteins raises a number of interesting questions, notably how DNA packaging occurs around or through them, particularly those E proteins which are portal-localized. Do any of these proteins have specific interactions with packaged DNA? Or conversely, do some of these proteins have limited DNA interactions so as not to hinder their ejection into the host cell, possibly ahead of the DNA? Some E proteins, such as the N4 vRNAP, Alt, and P1 Dar proteins, must impact both the prohead structure and the packaging of DNA and its final structure, based on the volume they would occupy in the phage head. Other proteins with specialized locations within the head potentially have less impact on the overall structure of packaged DNA as they are present in low copy number and/or have low molecular weights. However, even their presence is likely to influence the structure of the packaged DNA in specific regions, particularly near or within the portal complex, especially proteins that are bound to genome termini, such as T4 gp2.

Impacts of E proteins on DNA ejection

The successful translocation of the phage genome through the tail and into the bacterial cell is critical for infection; however, with the exception of the T7 and podoviral E protein-to-channel transformations which facilitate DNA ejection (discussed above), the impacts of E proteins on this process are poorly understood. This fundamental gap in knowledge exists in part because of the difficulty of defining these processes experimentally, which is further complicated by the number of variables in play. These include the characteristics of the packaged DNA itself (length, packaged density and structure), and the structures of the tail and of the cell wall channel employed by the phage (whether it be bacterial or phage encoded). Notably, during the ejection process the DNA must undergo a structural transition from a liquid-crystalline-like to a fluid-like state, but even the speed of this transition must vary between different capsids. Different E proteins themselves likely have different impacts on DNA ejection, simply due to dramatic variation in their types, numbers, and structures within the head. For instance, the 10 kDa IPI of T4 is apparently ejected through the portal and tail tube with the DNA and without unfolding, thereby promoting immediate inhibition of its nuclease target in the infected host. This does not seem feasible for other E proteins, particularly larger ones such as the N4 vRNAP and P1 Dar proteins, which, based on their dimensions versus the internal diameters of the portal channel and tail, must undergo unfolding during the ejection process and refolding in the bacterial cytoplasm. That such transitions can occur was demonstrated by the packaging of β-galactosidase into the T4 head via an IPIII-β-galactosidase fusion and its subsequent ejection into, and activity within, the host.
Physically, it is not feasible that packaged β-galactosidase could exit the capsid in its folded state, especially considering the structure of the myoviral tail through which it passes. The double-tube contractile tail structure (typically ~100 nm long, ~0.3 nm inner diameter) would seemingly make tail expansion to allow ejection of bulky E proteins unlikely. In addition, the appearance of the 540 kDa tetramer-dependent activity in the infected cell requires host chaperones. Hence, the ejection of the unfolded, or possibly semi-unfolded, protein must be followed by refolding and multimerization within the host. Possibly, unfolding of non-native and native E proteins, such as β-galactosidase and Alt, that are folded and active upon synthesis prior to encapsidation, is a secondary function of the high force-generating packaging motor and the resulting DNA pressure in the full capsid. A further intriguing question is whether or not E proteins resist CRISPR-Cas nuclease attack. A number of small acr proteins (anti-CRISPR proteins) comparable in size to the T4 IPs have been described that counter CRISPR-Cas attack, but thus far these proteins have been found only to be synthesized from phage-origin genes in the host. Thus they are thought to function in an altruistic or cooperative fashion, unlike T4 IPI, which is an egoistic anti-RE protein. Could this be related to the apparent lysogeny associated with the acr proteins versus lytic phage proteins such as IPI? Or do E-acr’s still await discovery? (see Nussenzweig and Marraffini, 2018).

Potential of E protein delivery from capsid derived nanocontainers

A related, intriguing feature of E proteins to be resolved is the mechanism by which large proteins such as Alt, Dar, the N4 vRNAP, the multi-subunit vRNAPs of giant phages, and even the 450 kDa non-phage protein β-galactosidase, are apparently unfolded within the capsid, probably while the DNA is being packaged, permitting them to adopt a conformation that allows them to be ejected through the narrow portal and tail tube channels. Could this unfolding be a secondary but essential function of the high-energy DNA packaging motor mechanism that, through high DNA pressure within the fully packaged capsid, promotes the ejection of unfolded or partially unfolded large proteins? Evidently these E proteins are then refolded upon ejection to display often essential enzymatic functions within the host cell. Ejection and refolding of β-galactosidase in an E. coli host cell argues against the necessity of special E protein structural evolution to permit transfer into a host. This high flexibility of E proteins potentially provides a means of transferring a variety of enzymes, singly or together (e.g., Cre recombinase, CRISPR-targeted Cas9, Lambda beta and exonuclease recombination proteins etc., see (Liu et al., 2014)), into a host cell for gene repair or gene knockout.

Conclusions

In addition to the conserved features of all tailed phage heads (the portal, major capsid shell and DNA packaging enzyme(s)) required to form an icosahedral capsid full of densely packaged DNA, there are clearly many variations in the internal E protein composition between different phages. E proteins vary significantly in number and in locale within the head: some are dispersed throughout the DNA, whereas others form distinct structures. The reasons for these variations are in many cases still obscure but are likely linked to their functions which, based on existing precedents, potentially for most E proteins, are linked to the infection of and replication within a new bacterial host. In addition, whether fully packaged ~500 mg/ml DNA in different E protein-containing heads is spooled or folded is uncertain and could vary among these viruses. As our knowledge of internal head or E proteins comes from the phages isolated and studied to date, which represent a relatively narrow pool of total phage diversity, it is clear, and exciting to consider, that a myriad of internal protein compositions, structures and functions are yet to be discovered. Given that a number of known E proteins have roles in infection, and given the potential of phages for biocontrol applications in a multi-drug resistant era, phage E protein research is a relatively unexplored area that merits concentrated effort.

References

Ackermann, H.W., Caprioli, T., Kasatiya, S.S., 1975. A large new Streptococcus bacteriophage. Canadian Journal of Microbiology 21, 571–574.
Agirrezabala, X., Martin-Benito, J., Caston, J.R., et al., 2005. Maturation of phage T7 involves structural modification of both shell and inner core components. The EMBO Journal 24, 3820–3829.
Alawneh, A.M., Qi, D., Yonesaki, T., Otsuka, Y., 2016. An ADP-ribosyltransferase Alt of bacteriophage T4 negatively regulates the Escherichia coli MazF toxin of a toxin–antitoxin module. Molecular Microbiology 99, 188–198.
Ali, B., Desmond, M.I., Mallory, S.A., et al., 2017. To be or not to be T4: Evidence of a complex evolutionary pathway of head structure and assembly in giant Salmonella virus SPN3US. Frontiers in Microbiology 8, 2251.
Black, L.W., Showe, M.K., Steven, A.C., 1994. Morphogenesis of the T4 head. In: Karam, J.D. (Ed.), Molecular Biology of Bacteriophage T4. Washington, DC: ASM Press.
Black, L.W., Thomas, J.A., 2012. Condensed genome structure. In: Rossmann, M.G., Rao, V.B. (Eds.), Viral Molecular Machines. US: Springer.
Casjens, S.R., Molineux, I.J., 2012. Short noncontractile tail machines: Adsorption and DNA delivery by podoviruses. Advances in Experimental Medicine and Biology 726, 143–179.
Ceyssens, P.-J., Minakhin, L., Van Den Bossche, A., et al., 2014. Development of giant bacteriophage φKZ is independent of the host transcription apparatus. Journal of Virology 88, 10501–10510.
Doermann, A.H., Eiserling, F.A., Boehner, L., 1973. Genetic control of capsid length in bacteriophage T4. I. Isolation and preliminary description of four new mutants. Journal of Virology 12, 374–385.
Hendrix, R.W., 2009. Jumbo bacteriophages. In: Van Etten, J. (Ed.), Lesser Known Large dsDNA Viruses. Berlin Heidelberg: Springer.
Hu, B., Margolin, W., Molineux, I.J., Liu, J., 2013. The bacteriophage T7 virion undergoes extensive structural remodeling during infection. Science 339. doi:10.1126/science.1231887.
Jin, Y., Sdao, S.M., Dover, J.A., et al., 2015. Bacteriophage P22 ejects all of its internal proteins before its genome. Virology 485, 128–134.
Lenneman, B.R., Rothman-Denes, L.B., 2015. Structural and biochemical investigation of bacteriophage N4-encoded RNA polymerases. Biomolecules 5, 647–667.
Liu, J.L., Dixit, A.B., Robertson, K.L., Qiao, E., Black, L.W., 2014. Viral nanoparticle-encapsidated enzyme and restructured DNA for cell delivery and gene expression. Proceedings of the National Academy of Sciences of the United States of America 111, 13319–13324.
Mesyanzhinov, V.V., Robben, J., Grymonprez, B., et al., 2002. The genome of bacteriophage phiKZ of Pseudomonas aeruginosa. Journal of Molecular Biology 317, 1–19.
Mullaney, J.M., Black, L.W., 2014. Bacteriophage T4 capsid packaging and unpackaging of DNA and proteins. Methods in Molecular Biology 1108, 69–85.
Nussenzweig, P.M., Marraffini, L.A., 2018. Viral teamwork pushes CRISPR to the breaking point. Cell 174, 772–774.
Piya, D., Vara, L., Russell, W.K., Young, R., Gill, J.J., 2017. The multicomponent antirestriction system of phage P1 is linked to capsid morphogenesis. Molecular Microbiology 105 (3), 399–412.
Wang, C., Tu, J., Liu, J., Molineux, I.J., 2019. Structural dynamics of bacteriophage P22 infection initiation revealed by cryo-electron tomography. Nature Microbiology 4, 1049–1056.
Wu, W., Thomas, J.A., Cheng, N., Black, L.W., Steven, A.C., 2012. Bubblegrams reveal the inner body structure of φKZ. Science 335, 182.
Yu, T.Y., Schaefer, J., 2008. REDOR NMR characterization of DNA packaging in bacteriophage T4. Journal of Molecular Biology 382, 1031–1042.

Jumbo Phages

Isaac T Younker and Carol Duffy, University of Alabama, Tuscaloosa, AL, United States

© 2021 Published by Elsevier Ltd.

Nomenclature

bp Base pairs
kb Kilobase pairs
nm Nanometers

Glossary

CRISPR-Cas system Prokaryotic acquired immune system that confers resistance to foreign genetic elements and phages via degradation of recognized nucleic acid sequences.
ORFan gene Predicted open reading frame without detectable homology to coding sequences in the genomes of other organisms or viruses.
Restriction-modification system Prokaryotic innate immune system that confers resistance to foreign genetic elements and phages via degradation of invading DNA while precluding degradation of host DNA.
Terminal redundancy Repeated genetic sequence present at both ends of a linear DNA molecule.
Triangulation number (T-number) The number of unique environments occupied by icosahedral capsid subunits; generally representative of the size and complexity of the capsid.

Introduction

Bacteriophages are the most numerous biological entities on the planet. Outnumbering their bacterial hosts by a factor of ten, there are an estimated 10³¹ phages on Earth. Ubiquitous in nature, phages are found in soil, composts, freshwater and marine environments, plants and animals, and even in the free atmosphere. With their varying lifestyles and prominent contributions to horizontal gene transfer, phages are a driving force shaping bacterial populations. As such, they fundamentally influence biogeochemical cycles, the composition of microbial communities, bacterial evolution, and plant and animal health. Phages vary greatly in their morphology, virion size, genome composition and length, and have been grouped by various systems. Grouping by genome size, phages with genomes over 500 kb in length have been collectively designated ‘megaphages’ whereas phages with genomes of 200–500 kb were originally referred to as ‘giant phages’ and later termed ‘jumbo phages’. Whereas all megaphages discovered to date were identified via analysis of viral metagenomic data and have not been physically isolated, an ever-growing list of jumbo phages has been isolated, characterized, and sequenced. Increased interest in jumbo phage biology has occurred hand-in-hand with a resurgence in both basic and applied phage research. These giants of the phage world offer unique insights into virus evolution, present novel genetic and molecular features, and can be especially useful in phage therapy and biocontrol applications. In this contribution we present a brief overview of jumbo phage history, discuss the challenges and methods used in the isolation and characterization of these oversized viruses and highlight a variety of unique genomic features and advanced capabilities identified in some of the 160+ jumbo phages isolated to date.
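The size-based grouping described above reduces to two thresholds, which can be illustrated with a short Python sketch. The function name size_class, the label used for phages below 200 kb, and the treatment of the exact boundary values (200 kb counted as jumbo, exactly 500 kb counted as jumbo rather than mega) are assumptions made for illustration; the text only gives the ranges "200–500 kb" and "over 500 kb".

```python
# Thresholds taken from the text: jumbo = 200-500 kb, megaphage = over 500 kb.
JUMBO_MIN_BP = 200_000
MEGA_MIN_BP = 500_000


def size_class(genome_bp):
    """Classify a phage by genome length in base pairs (boundary handling is assumed)."""
    if genome_bp > MEGA_MIN_BP:
        return "megaphage"
    if genome_bp >= JUMBO_MIN_BP:
        return "jumbo phage"
    return "below jumbo threshold"


if __name__ == "__main__":
    examples = {
        "phage G": 497_513,                  # sequenced genome length given in this article
        "phiCbK": 215_710,                   # sequenced genome length given in this article
        "hypothetical metagenomic contig": 600_000,  # made-up value for illustration only
    }
    for name, bp in examples.items():
        print(f"{name}: {bp:,} bp -> {size_class(bp)}")
```

With these thresholds phage G, at 497,513 bp, falls just under the megaphage cutoff, which is consistent with it being the largest isolated phage while megaphages are known only from metagenomic data.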

History

The story of jumbo phage biology begins, from a research standpoint, in 1968 when Gianfranco Donelli reported the isolation and initial characterization of a Bacillus megaterium bacteriophage named phage G. Long before the modern convenience of next generation sequencing technologies, Donelli and his colleagues recognized phage G was incredibly large, estimating the molecular weight of its double-stranded DNA (dsDNA) genome at 4.9 × 10⁸ daltons. Phage G’s icosahedral head is 160 nm in diameter and its tail is 453 nm long. In 2011, the genome of phage G was sequenced and found to be 497,513 bp. To date, phage G holds the record for the largest isolated phage based on both virion size and genome length. Another early discovery in jumbo phage history, the Caulobacter crescentus phage phiCbK was first reported by Nina Agabian-Keshishian and Lucille Shapiro in 1970. Originally characterized with head dimensions of 64 nm × 195 nm and a tail 275 nm long, phiCbK virions were more recently measured at 56 nm × 205 nm for the head and 300 nm for the tail length. The phiCbK dsDNA genome was sequenced in 2012 and found to be 215,710 bp. Due to its large head size, phiCbK helped advance the field of structural biology by serving as one of the first phages for which fine capsid structure was determined by electron microscopic image reconstruction. PhiCbK also aided in studies of the unique developmental program of Caulobacter, serving as a cell cycle and morphological indicator through its use of swarmer cells’ flagella and polar pili for adsorption. The Pseudomonas aeruginosa phage phiKZ, originally described by Krylov and Zhazykov in 1978, is one of the most thoroughly studied jumbo phages and claims several firsts in jumbo phage history. In 2001, phiKZ became the first jumbo phage to have its

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21522-7


Fig. 1 GenBank jumbo phage genome submissions. Number of unique jumbo phage genome sequences submitted to GenBank each year from 2001 through 2019.

genome completely sequenced and entered in the National Center for Biotechnology Information’s GenBank. PhiKZ was the first jumbo phage found to possess a proteinaceous spool-like structure, termed the inner body, inside its head and the first for which the capsid structure was resolved by cryo-electron microscopy (cryo-EM). In addition, although Pseudomonas chlororaphis jumbo phage 201phi2-1 was the first phage determined to form a nucleus-like compartment in infected cells, phiKZ has led the way in studies aimed at determining the assembly and function of this fascinating structure. Changes in isolation methods and advancements in sequencing technologies have greatly eased the discovery of new jumbo phages. Thus, GenBank has seen a dramatic increase in the number of jumbo phage genomes submitted over the past few years. Whereas 0–3 new jumbo phage genomes were submitted to GenBank each year from 2001 to 2009, the past decade shows an unmistakable growth trend in submissions with a record 38 genome sequences from new and unique jumbo phages submitted in 2019 (Fig. 1). All jumbo phages isolated and characterized to date possess linear dsDNA genomes and belong to the Myoviridae and Siphoviridae families of the order Caudovirales. As more of these giant viruses are discovered, only time will tell whether jumbo phages exist that belong to other taxonomic families. This is an exciting time for phage biology; the accessibility and growing abundance of genomic sequence data allows questions to be asked and comparisons to be made that were previously impossible. The growing number of new phages reported each year has also necessitated a standardization and formalization of the naming process. The history of phage biology includes multiple instances of duplicate names. For example, prior to isolation of the jumbo phage G by Donelli in 1968 a much smaller phage G was reported by James Murphy in 1957. 
The current naming system, proposed in 2017, aims to avoid this confusion by incorporating the common name into a longer formal name for each newly discovered virus. An example of a formal phage name using this system is vB_DsoM_JA11. This naming scheme tells us this is a virus of Bacteria isolated on Dickeya solani that belongs to the Myoviridae family. The last portion of the name, in this example JA11, is the unique common name given by the researcher(s) who isolated and reported the phage. In this work we have introduced phages with their formal names when available. However, most of the jumbo phages discussed herein were first reported prior to the proposal of this naming scheme and thus are referred to by their published common names. To aid the reader in further exploration of the topic, we have also indicated the host bacterium for all phages discussed.
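The structure of a formal name such as vB_DsoM_JA11 can be illustrated by splitting it into its parts in Python. This is a minimal sketch, not an official parser: the function name parse_phage_name and the regular expression are hypothetical, the three-letter host abbreviation is assumed to be one capital letter followed by two lowercase letters (as in "Dso" for Dickeya solani), and only the family code "M" (Myoviridae) is confirmed by the text; the codes "S" and "P" are assumptions based on the general convention.

```python
import re

# "M" -> Myoviridae is stated in the text. "S" and "P" are ASSUMED codes
# for Siphoviridae and Podoviridae and are included for illustration only.
FAMILY_CODES = {"M": "Myoviridae", "S": "Siphoviridae", "P": "Podoviridae"}

# vB_ + host abbreviation (assumed: 1 capital + 2 lowercase letters)
# + family letter + _ + common name chosen by the isolating researchers.
NAME_RE = re.compile(r"^vB_([A-Z][a-z]{2})([A-Z])_(.+)$")


def parse_phage_name(name):
    """Split a formal phage name into its parts; return None if it does not fit."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    host_abbreviation, family_code, common_name = match.groups()
    return {
        "host_domain": "Bacteria",  # the "vB" prefix: virus of Bacteria
        "host_abbreviation": host_abbreviation,
        "family": FAMILY_CODES.get(family_code, "unknown"),
        "common_name": common_name,
    }


if __name__ == "__main__":
    print(parse_phage_name("vB_DsoM_JA11"))
    print(parse_phage_name("phiKZ"))  # a common name only, so no formal parts
```

Mapping the host abbreviation back to a full species name (for example "Dso" to Dickeya solani) is not attempted here, since that would need a lookup table that the naming scheme itself does not provide.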

Isolation and Characterization

Although jumbo phages represent a minority of the total collection of bacteriophages isolated in the past 100 years, they are similarly ubiquitous in nature. Jumbo phages have been successfully isolated and propagated from soil, freshwater and marine environments, sewage, and fecal samples from around the globe. The majority of jumbo phages isolated to date infect gram-negative bacteria including species from Achromobacter, Acinetobacter, Aeromonas, Burkholderia, Caulobacter, Cronobacter, Dickeya, Erwinia, Escherichia, Klebsiella, Pectobacterium, Prevotella, Prochlorococcus, Pseudomonas, Ralstonia, Salicola, Salmonella, Serratia, Sinorhizobium, Synechococcus, Tenacibaculum, Vibrio, Xanthomonas, and Yersinia (Table 1). Only six jumbo phages infecting gram-positive bacteria have been isolated, all of which infect Bacillus species. It is unknown whether this phenomenon simply reflects a bias in isolation efforts or if Bacillus species possess features unique among the gram-positive Bacteria that make them amenable to hosting very large phages. The most common phage isolation technique is the double-agar overlay method in which a filtered environmental sample is mixed with a liquid bacterial culture and molten agar and then poured onto a solid agar plate. Historically, a 0.7%–0.8% molten agar solution has been used, resulting in a final top agar concentration of ~0.5%. As phages infect, lyse, and diffuse through the top agar-bacteria mixture, visible plaques appear as holes in the bacterial lawn and can be picked for further propagation. With this method, diffusivity, adsorption rate and burst size can all contribute to plaque size. Thus, Philip Serwer and colleagues reasoned the great plaque count anomaly (the small number of unique phages successfully isolated relative to the large number observed by electron microscopy) could be due, at least in part, to top agar concentrations that preclude diffusion of very large phages. By using

Table 1 Jumbo phages and bacterial hosts. Jumbo phage genome sequences submitted to GenBank as of December 31st, 2019. Phages are sorted by host name and by genome size within each host. Information regarding hosts and isolation materials was obtained from GenBank and/or published articles.

| Host | Genome length (bp) | Phage name | Isolation source | Accession number |
|---|---|---|---|---|
| Achromobacter animicus | 221,431 | Motura | Soil | MN094788.1 |
| Acinetobacter baumannii | 234,900 | vB_AbaM_ME3 | Wastewater effluent | NC_041884.1 |
| Aeromonas hydrophila | 262,021 | D3 | Sewage water | MN102098.1 |
| | 260,310 | LAh10 | Not provided | MK838116.1 |
| | 259,833 | D6 | Sewage water | MN131137.1 |
| | 240,447 | PS2 | River water | MN453779.2 |
| | 238,150 | CF8 | Aquaculture pond water | MK774614.2 |
| | 237,367 | PS1 | River water | MN032614.2 |
| | 233,234 | Aeh1 | Sewage | NC_005260.1 |
| | 231,743 | CC2 | Sewage | NC_019538.1 |
| | 221,116 | Ah1 | Wastewater | MG250483.1 |
| Aeromonas salmonicida | 236,567 | 65.2 | Not provided | KY290955.1 |
| | 235,229 | 65 | Not provided | NC_015251.1 |
| | 229,957 | As-szw | River water | MF498773.1 |
| | 229,929 | AS-zj | River water | MF448340.1 |
| | 225,268 | PhiAS5 | River water | NC_014636.1 |
| | 222,006 | PX29 | Not provided | NC_023688.1 |
| Agrobacterium tumefaciens | 490,380 | Atu_ph07 | Soil sample | NC_042013.1 |
| Bacillus megaterium | 497,513 | Phage G | Not provided | NC_023719.1 |
| Bacillus pumilus | 255,569 | vB_BpuM_BpSp | Not provided | KT895374.1 |
| Bacillus subtilis | 252,197 | PBS1 | Not provided | MF360957.1 |
| | 251,042 | AR9 | Not provided | NC_031039.1 |
| | 221,908 | SP-15 | Soil | KT624200.1 |
| Bacillus thuringiensis | 218,948 | 0305phi8-36 | Soil | NC_009760.1 |
| Burkholderia cenocepacia | 262,652 | BcepSauron | Soil | MK552141.1 |
| Caulobacter crescentus | 279,967 | CcrColossus | Surface water | NC_019406.1 |
| | 229,319 | Ccr29 | Pond/stream | KY555145.1 |
| | 223,720 | CcrRogue | Surface water | NC_019408.1 |
| | 221,828 | CcrKarma | Surface water | NC_019410.1 |
| | 220,299 | Ccr2 | Pond/stream | KY555143.1 |
| | 219,348 | Ccr10 | Pond/stream | KY555142.1 |
| | 219,216 | CcrSwift | Surface water | NC_019411.1 |
| | 218,929 | CcrMagneto | Surface water | NC_019407.1 |
| | 218,729 | Ccr5 | Pond/stream | KY555144.1 |
| | 216,240 | Ccr34 | Pond/stream | KY555147.1 |
| | 215,779 | Ccr32 | Pond/stream | KY555146.1 |
| | 205,504 | phiCbK | Not provided | JX163858.1 |
| Caulobacter vibrioides | 322,272 | CcrBL9 | Not provided | MH588546.1 |
| | 317,488 | CcrSC | Not provided | MH588547.1 |
| | 308,141 | CcrPW | Not provided | MH588545.1 |
| | 220,934 | CcrBL10 | Not provided | MH588544.1 |
| Cronobacter sakazakii | 358,663 | vB_CsaM_GAP32 | Wastewater | JN882285.1 |
| | 223,989 | CR5 | Farm soil | JX094500.1 |
| Dickeya solani | 261,165 | vB_DsoM_AD1 | River water | MH460463.1 |
| | 255,356 | vB_DsoM_JA11 | River water | MH389777.1 |
| | 255,356 | vB_DsoM_JA33 | River water | MH460462.1 |
| | 254,061 | vB_DsoM_JA13 | River water | MH460460.1 |
| | 253,323 | vB_DsoM_JA29 | River water | MH460461.1 |
| Erwinia amylovora | 275,000 | vB_EamM_Madmel | Soil | MG655269.1 |
| | 273,914 | vB_EamM_Mortimer | Soil/foliage near infected tree | MG655270.1 |
| | 273,731 | Rebecca | Tree | MK514281.1 |
| | 273,501 | vB_EamM_Deimos-Minion | Branches, blossoms | NC_041972.1 |
| | 273,224 | vB_EamM_Special G | Branches, blossoms | NC_041975.1 |
| | 272,458 | vB_EamM_Desertfox | Soil | NC_042098.1 |
| | 272,228 | vB_EamM_Bosolaphorus | Orchard dirt | MG655267.1 |
| | 271,182 | vB_EamM_RAY | Leaves, stem | NC_041973.1 |
| | 271,088 | vB_EamM_Simmy50 | Bark | NC_041974.1 |
| | 271,084 | Ea35-70 | Pear tree soil | NC_023557.1 |
| | 266,532 | vB_EamM_Alexandra | Soil | MH248138.1 |
| | 261,365 | vB_EamM_Y3 | Soil | KY984068.1 |
| | 259,700 | vB_EamM_Yoloswag | Soil/foliage near infected tree | KY448244.1 |

(Continued)

Table 1 Continued

| Host | Genome length (bp) | Phage name | Isolation source | Accession number |
|---|---|---|---|---|
| Erwinia amylovora (continued) | 246,390 | vB_EamM_Kwan | Soil/foliage near infected tree | NC_031010.1 |
| | 246,290 | vB_EamM_Asesino | Soil | KX397364.2 |
| | 244,950 | Wellington | Not provided | MH426724.1 |
| | 244,840 | vB_EamM_ChrisDB | Soil/foliage near infected tree | KX397366 |
| | 243,953 | vB_EamM_Stratton | Soil/foliage near infected tree | KX397373 |
| | 243,050 | phiEaH2 | Soil | NC_019929.1 |
| | 241,654 | vB_EamM_Machina | Soil/foliage near infected tree | NC_042056.1 |
| | 241,147 | vB_EamM_Caitlin | Soil/foliage near infected tree | KX397365 |
| | 241,050 | vB_EamM_Parshik | Soil/foliage near infected tree | KX397371 |
| | 240,761 | vB_EamM_Huxley | Soil/foliage near infected tree | KX397368 |
| | 235,374 | vB_EamM_Joad | Apple Blossoms | MF459647.1 |
| | 235,108 | vB_EamM_Rising Sun | Apple Blossoms | NC_042018.1 |
| | 229,501 | vB_EamM_Phobos | Soil/foliage near infected tree | KX397372 |
| | 223,950 | Derbicus | Pear tree | MK514282.1 |
| | 223,935 | vB_EamM_EarlPhillipIV | Soil near infected apple tree | NC_031007.1 |
| | 218,339 | PhiEaH1 | Soil/foliage near infected tree | NC_023610.1 |
| Escherichia coli | 386,442 | CMSTMSU | Shrimp farm wastewater | MH494197.1 |
| | 370,817 | vB_EcoM_G17 | Pig manure | MK327931.1 |
| | 353,081 | UB | Orangutan feces | MH383160.1 |
| | 352,598 | vB_EcoM_phAPEC6 | Not provided | MK817115.1 |
| | 348,532 | 121Q | Sewage | KM507819.1 |
| | 348,113 | PBECO4 | Sewage | NC_027364.1 |
| | 348,043 | vB_Eco_slurp01 | Porcine feces | LT603033.1 |
| | 347,152 | SP27 | Not provided | LC494302.1 |
| | 237,307 | vB_EcoM_Goslar | Duck feces | MK327938.1 |
| Klebsiella (a) | 345,809 | vB_KleM-Rak2 | Sewage polluted pond | JQ513383.1 |
| Klebsiella pneumoniae | 346,602 | K64-1 | Wastewater | NC_027399.1 |
| Pectobacterium carotovorum | 378,379 | CBB | Wastewater | NC_041878.1 |
| Photobacterium damselae | 237,509 | PDCC-1 | Lollipop catshark | MN562221.1 |
| Prochlorococcus marinus | 252,013 | P-SSM5 | Subtropical open ocean | HQ632825.1 |
| Prochlorococcus sp. NATL1A | 252,401 | P-SSM2 | Sargasso Sea | NC_006883.2 |
| Pseudomonas aeruginosa | 309,208 | PhiPA3 | Sewage | NC_028999.1 |
| | 304,671 | PA1C | Not provided | MK599315.1 |
| | 301,543 | vB_PaeM_PS119XW | Not provided | MN103543.1 |
| | 288,170 | vB_PaeM_MIJ3 | Manure runoff | LR588166.1 |
| | 286,783 | vB_PaeM_PA5oct | Sewage near farm | MK797984.1 |
| | 280,334 | phiKZ | Sewage | NC_004629.1 |
| | 279,696 | SL2 | Hospital sewage | NC_042081.1 |
| | 279,593 | KTN4 | Sewage from irrigation field | KU521356.1 |
| | 279,095 | PA02 | Sewage | AP019418.1 |
| | 258,139 | PaBG | Lake water | KF147891.1 |
| | 211,215 | EL | Pond water | NC_007623.1 |
| Pseudomonas chlororaphis | 316,674 | 201phi2-1 | Soil | NC_010821.1 |
| Pseudomonas fluorescens | 309,157 | Phabio | Compost | MF042360.1 |
| | 284,757 | OBP | Compost | NC_016571.1 |
| | 278,136 | Noxifer | Compost | NC_041994.1 |
| Pseudomonas putida | 280,538 | Lu11 | Mango tree soil | NC_017972.1 |
| Pseudomonas syringae | 305,260 | Psa21 | Leaf litter in kiwi orchard | MK552327.1 |
| Ralstonia solanacearum | 279,845 | RP12 | Tomato soil | NC_041911.1 |
| | 276,958 | RP31 | Tomato soil | AP017925.1 |
| | 231,255 | RSL1 | Crop soil | AB366653.2 |
| | 223,932 | RSL2 | Soil | AP014693.1 |
| | 222,888 | RSF1 | Soil | NC_028899.1 |

(a) Host genus but not species was provided.

Table 1 Continued

| Host | Genome length (bp) | Phage name | Isolation source | Accession number |
|---|---|---|---|---|
| Salicola sp. PV4 | 440,001 | SCTP-2 | Solar saltern | MF360958.1 |
| Salmonella (a) | 350,103 | Munch | Cattle feedlot | MK268344.1 |
| Salmonella enterica | 348,718 | 7t3 | Wastewater | MK773491.1 |
| | 250,739 | SPAsTU | Sewage water | MH221129.1 |
| | 240,413 | SPN3US | Chicken fecal sample | JN641803.1 |
| | 239,461 | SEGD1 | Not provided | KU726251.1 |
| Salmonella typhimurium | 242,624 | SPFM1 (b) | Not provided | LR535901.1 |
| | 241,405 | SPFM13 (b) | Not provided | LR535910.1 |
| | 240,198 | SPFM6 (b) | Not provided | LR535905.1 |
| Serratia (a) | 273,933 | Moabite | Mixed swine farm sample | MK994515.1 |
| | 212,807 | PCH45 | Sewage | MN334766.1 |
| Serratia marcescens | 357,154 | BF | Compost | NC_041917.1 |
| | 276,025 | 2050HW | Not provided | MF285618.1 |
| Sinorhizobium meliloti | 206,713 | phiN3 | Not provided | KR052482.1 |
| Sphingomonas paucimobilis | 219,372 | PAU | Silkworms | NC_019521.1 |
| Synechococcus (a) | 244,930 | B3 | Sub-arctic lake | MN695334.1 |
| | 243,633 | B23 | Sub-arctic lake | MN695335.1 |
| | 232,883 | S-SSM6a | Sargasso Sea | HQ317391.1 |
| | 232,878 | S-SSM7 | Sargasso Sea | NC_015287.1 |
| | 222,619 | ACG-2014f Syn7803C11 (c) | Not provided | KJ019142.1 |
| | 216,121 | S-CAM7 0910CC49 (c) | Pacific Ocean | NC_031927.1 |
| | 213,993 | S-B43 | Bohai Sea | MN018232.1 |
| | 208,857 | S-B05 | Bohai Sea | MK799832.1 |
| | 208,007 | S-SKS1 | North Atlantic Ocean | HQ633071.1 |
| | 204,930 | Bellamy | Seawater | MF351863.1 |
| Tenacibaculum discolor | 234,670 | pT24 | Shrimp culture pond water | LC168164.1 |
| Tenacibaculum maritimum | 226,876 | Ptm5 | Fish aquaculture seawater | AP019525.1 |
| | 224,680 | Ptm1 | Fish aquaculture seawater | AP019524.1 |
| Vibrio alginolyticus | 250,485 | phi-ST2 | Coastal seawater | KT919973.1 |
| | 248,605 | phi-Grn1 | Coastal seawater | KT919972.1 |
| | 248,088 | ValKK3 | Marine sediment | NC_028829.1 |
| | 247,619 | ValB1 HC | Not provided | MK568540.1 |
| | 242,446 | 2 TSL-2019 | Not provided | MK368614.1 |
| | 238,099 | USC-1 | Marine water | MK905543.1 |
| | 238,053 | 5 | Not provided | MK358448.1 |
| | 237,722 | Aphrodite1 | Not provided | NC_042100.1 |
| | 231,998 | pVa-21 | Not provided | KY499642.1 |
| Vibrio coralliilyticus | 288,967 | BONAISHI | Sea water near coral reef | MH595538.1 |
| Vibrio mediterranei | 278,116 | vB_VmeM-Yong MS32 (b) | Not provided | MK308677.1 |
| | 290,532 | vB_VmeM-Yong XC31 (b) | Not provided | MK308674.1 |
| Vibrio natriegens | 247,511 | nt-1 | Salt marsh mud samples | NC_021529.2 |
| Vibrio parahaemolyticus | 246,421 | phi-pp2 | Aquaculture waterways | JN849462 |
| | 244,834 | KVP40 | Polluted sea water | NC_005083.2 |
| | 239,276 | pTD1 | Not provided | NC_041916.1 |
| Vibrio strain 7D | 246,964 | VH7D | Abalone farm seawater | NC_023568.1 |
| Xanthomonas citri | 384,670 | XacN1 | Orange grove soil | AP018399.1 |
| Yersinia (a) | 352,596 | fHe-Yen9-03 | Sewage | LT960552.1 |
| Yersinia enterocolitica | 262,391 | PhiR1-37 | Sewage | NC_016163.1 |
| Not provided | 257,877 | NCTB | Surface water | LT598654.1 |

(a) Host genus but not species was provided.
(b) Phage listed is one of multiple phage genome sequences submitted to GenBank with >98% sequence identity.
(c) Phage listed is one of multiple isolates of an individual phage that have been sequenced and submitted to GenBank.

top agar concentrations as low as 0.15%, they successfully isolated two new jumbo phages, P. chlororaphis phage 201phi2-1 and Bacillus thuringiensis phage 0305phi8-36, and opened the door to an era of jumbo phage discovery. The use of dilute top agar solutions, typically 0.1%–0.5%, is now widely practiced by jumbo phage biologists. For example, Agrobacterium tumefaciens phage Atu_ph07, which has the second largest genome of all jumbo phages sequenced to date (490 kb), was initially isolated on 0.3% top agar and later propagated on 0.15% top agar to promote the development of larger, clearer plaques.

Additional modifications to standard methods also aid the isolation and propagation of jumbo phages. Double-agar plates may be incubated at a temperature lower than the host's optimal growth temperature to prevent bacterial growth from outpacing plaque formation. Additionally, debris may be removed from environmental samples prior to plating by differential centrifugation rather than filtration to avoid inadvertent removal of very large phages.

While sequencing is the only method to precisely determine the genome length, determining the packaged DNA lengths of jumbo phages is also of interest. Packaged DNAs purified from jumbo phage virions are typically longer than the corresponding genome lengths due to genomic terminal redundancies that vary from phage to phage and can be quite extensive. Knowledge of the virion packaged DNA length, rather than the genome length, is necessary to calculate DNA packaging densities and is of interest in contemplating mechanisms of jumbo phage evolution. Pulsed-field gel electrophoresis (PFGE) is typically used to determine the packaged DNA length. By alternating the direction of the electrical field at regular time intervals, PFGE allows the resolution of DNA molecules greater than 50 kb. To avoid shearing DNA prior to electrophoresis, jumbo phage virions are concentrated, mixed with agarose to form plugs, then treated with proteinase K to digest virion proteins within the plug. The plug containing the liberated packaged DNA then serves as the sample and is loaded directly into a well of the gel.

Many jumbo phage genomes are heavily modified, which can cause an overestimation of the packaged DNA length. To address this issue, Jianfei Hua and colleagues developed a two-dimensional agarose gel electrophoresis (2D-AGE) protocol. Taking advantage of the fact that length does not detectably affect the mobility of very long (>50 kb) dsDNA molecules during electrophoresis under constant voltage (nonpulsed electrophoresis), the technique involves a typical PFGE run followed by a perpendicular nonpulsed run. The first run separates DNA by length and mass/charge ratio, and the second run separates by mass/charge ratio only. The distance traveled by a phage DNA band in the second dimension relative to the distance traveled by unmodified marker bands is then used to calculate a corrected migration position predicted to occur in the first dimension if the phage DNA were not modified.

When preparing jumbo phages for PFGE, it is important to note that some are sensitive to the concentrated cesium chloride solutions used in step gradients and isopycnic separations. Thus, polyethylene glycol precipitation and sucrose gradient velocity sedimentation are often used to concentrate jumbo phage solutions.

In addition to providing annotated genome sequences, packaged DNA lengths, and genomic terminal redundancy percentages, phage biologists have done an excellent job of characterizing the virion morphologies, plaque morphologies, host ranges, growth cycles, stabilities, virion protein compositions, head packaging densities, and even the transcriptional programs for many of the numerous jumbo phages isolated to date. Each of these characters aids the field in exciting comparative and evolutionary studies. In addition, novel morphological, genetic, and molecular features of several jumbo phages have been examined in greater depth and are highlighted below.

Virion Structure

Morphotypes

Bacteriophages are categorized into three orders: Caudovirales, Ligamenvirales, and Unassigned. The order Caudovirales represents the most numerous and widespread group of bacteriophages characterized to date. Three families belong to the Caudovirales: the Myoviridae, Siphoviridae, and Podoviridae. Members of these three families all possess unenveloped icosahedral capsids and are differentiated, morphologically, by their tail structures. Myoviruses have long contractile tails (morphotype A), siphoviruses have long noncontractile tails (morphotype B), and podoviruses have short tails (morphotype C). Each of the three morphotypes also displays a range of head lengths, designated 1, 2, and 3 (Fig. 2). The vast majority of jumbo phages isolated to date belong to the Myoviridae family and display A1 and A2 morphologies. Exceptions include 16 different siphoviruses of C. crescentus, all of which belong to the genus Phicbkvirus. This genus of jumbo phages includes the group's namesake, phiCbK, and all members display the B3 morphotype with elongated heads and long, flexible, noncontractile tails.

Icosahedral Geometries

Jumbo phage capsids display a variety of icosahedral geometries, with some possessing previously unseen triangulation numbers (T-numbers). Sphingomonas paucimobilis phage PAU and E. coli phage 121Q were the first examples of true T = 25 and T = 28 geometries, respectively, and B. megaterium phage G was the first example of a virus with T = 52 geometry. Accordingly, jumbo phages also display a fairly broad range of capsid sizes, from the 90 nm × 50 nm prolate head of Vibrio parahaemolyticus phage KVP40 to the 160 nm isometric head of phage G. The head structures of several jumbo phages have been determined via cryo-EM at resolutions sufficient for atomic model fitting. Interestingly, all utilize the canonical HK97 capsid fold conserved among smaller phages and simply assemble greater numbers of capsid protein subunits to generate larger heads. In addition, several different arrangements of decoration proteins have been noted among the jumbo phage capsids examined. For example, the heads of phages N3 (Sinorhizobium meliloti), PBS1 (Bacillus subtilis), G, and Bellamy (Synechococcus sp. WH8109) possess extra structures on their penton vertices that may help stabilize these large capsids.
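The T-numbers quoted above follow the standard Caspar-Klug relation T = h² + hk + k², where h and k are non-negative integer lattice indices. The sketch below is an illustration rather than anything taken from the cited structural studies: it enumerates the (h, k) pairs that yield a given T. The function names are hypothetical, and the subunit count of 60T is the idealized value for a closed shell that ignores the portal vertex, so it should be read as an approximation for tailed phages.

```python
def triangulation_indices(t_number: int) -> list[tuple[int, int]]:
    """Return all (h, k) with h >= k >= 0 such that h*h + h*k + k*k == t_number."""
    pairs = []
    h = 0
    while h * h <= t_number:
        for k in range(0, h + 1):
            if h * h + h * k + k * k == t_number:
                pairs.append((h, k))
        h += 1
    return pairs


def idealized_subunit_count(t_number: int) -> int:
    """Idealized number of capsid protein subunits in a closed shell (60 * T)."""
    return 60 * t_number


for t in (25, 28, 52):
    print(t, triangulation_indices(t), idealized_subunit_count(t))
```

Under these assumptions, T = 25 corresponds to (h, k) = (5, 0), T = 28 to (4, 2), and T = 52 to (6, 2), which illustrates how larger heads arise from more copies of the same fold rather than from a new fold.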

Fig. 2 Caudovirales morphotypes. The order Caudovirales comprises the Myoviridae, Siphoviridae, and Podoviridae families. Jumbo phages characterized to date belong to the Myoviridae and Siphoviridae and exhibit A1, A2, and B3 morphologies. Modified from Hull, R., Brown, F., Payne, C., 1989. Virology: A Directory and Dictionary of Animal, Bacterial and Plant Viruses. London: The Macmillan Press Ltd., p. 165.

Virion DNA Density

Although viruses with larger capsids are typically capable of packaging larger genomes, jumbo phage head size does not always correlate with genome length. For example, P. aeruginosa phage phiPA3 has a 100 nm head and a 309,208 bp genome while phage EL (also of P. aeruginosa) possesses a larger 140 nm head but a shorter 211,215 bp genome. The density at which DNA is packaged in phage heads is inherently important to the process of DNA injection during the initial stages of infection. Thus, jumbo phages with small genomes relative to their head sizes package additional DNA in the form of genomic terminal redundancies. Virion packaged DNA lengths, determined via PFGE or 2D-AGE, and head volumes, calculated from cryo-EM measurements, are used in estimating packaged DNA densities. Such estimates have been made for several jumbo phages and found to vary substantially, from 0.39 bp/nm³ for phage G to 0.55 bp/nm³ for phage PAU. Differences in packaged DNA densities are likely due to variabilities in capsid structures and associated strengths.

In addition, some jumbo phages possess an inner capsid structure that decreases the head volume available for packaged DNA. The heads of phiKZ and related Pseudomonas jumbo phages 201phi2-1, phiPA3, EL, and OBP contain a highly ordered, proteinaceous, cylindrical structure that spans the inner length of the capsid. This 'inner body' was originally visualized by transmission electron microscopy (TEM) in 1978, and its protein composition has since been determined via proteolysis and mass spectrometry. The inner body is encased in DNA in virions. This poses a problem for structural studies, as the inner body is indistinguishable from the surrounding DNA in conventional cryo-electron micrographs. However, the inner body is sensitive to radiation damage, exploding into bubbles of gaseous radiation products at electron doses that leave the surrounding capsid structure intact. Using the information from these 'bubblegrams', the structure of the inner body and its localization within phiKZ capsids were determined and used to generate three-dimensional reconstructions. The phiKZ inner body is ~24 nm wide and ~105 nm long, is tilted ~22° relative to the portal axis, and is composed of multiple stacked tiers with some regions possessing six-fold symmetry. It is hypothesized to play a role in organizing the packaged genome, thereby reducing internal pressure on the capsid and promoting efficient and rapid genome ejection. Interestingly, the inner body is not detected following infection, indicating its constituent proteins may be injected into host cells along with the phage genome.
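The packaged DNA density estimate described above is a ratio of packaged DNA length to the head volume available for DNA. The following sketch shows that arithmetic under stated assumptions: the function names are hypothetical, the inner body is approximated as a simple cylinder using the ~24 nm by ~105 nm phiKZ dimensions from the text, and the head volume in the usage lines is a placeholder rather than a measured value.

```python
import math


def cylinder_volume_nm3(width_nm: float, length_nm: float) -> float:
    """Volume of a cylinder, used here as a crude model of an inner body."""
    radius_nm = width_nm / 2.0
    return math.pi * radius_nm ** 2 * length_nm


def packaged_dna_density(packaged_bp: float,
                         head_volume_nm3: float,
                         inner_body_volume_nm3: float = 0.0) -> float:
    """Packaged DNA density in bp/nm^3 for the volume actually available to DNA."""
    available_nm3 = head_volume_nm3 - inner_body_volume_nm3
    if available_nm3 <= 0:
        raise ValueError("available head volume must be positive")
    return packaged_bp / available_nm3


# Placeholder head volume (not a measured value) to show the effect of an inner body.
head_volume = 1.0e6
inner_body = cylinder_volume_nm3(24.0, 105.0)
print(packaged_dna_density(390_000, head_volume))
print(packaged_dna_density(390_000, head_volume, inner_body))
```

Subtracting the inner body volume raises the computed density, which is why head volume alone can understate how tightly DNA is packed in phiKZ-like virions.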

Head and Tail Fibers

As with small phages, jumbo phage virions frequently contain fibers. Whereas in small phages such fibers are typically found attached to the tail baseplate and/or collar, some jumbo phages display fibers in unique locations and configurations. Phage PAU virions contain a narrow, long baseplate to which several tail fibers are attached. While the baseplate fibers of many jumbo phages are fairly short (20–30 nm), those of phage PAU are longer (~65 nm). These fibers extend in various directions from the bottom of the baseplate and appear somewhat flexible in electron micrographs, giving the tail complex a string mop appearance. The tail of Pectobacterium carotovorum jumbo phage vB_PcaM_CBB contains several long (~120 nm) whisker-like structures. Rather than emanating from the baseplate, these whiskers cover the length of the tail sheath. Phage CBB whiskers are highly stable, remaining attached during 1.5 years of storage at 4 °C, implying they serve an indispensable role.

A few jumbo phages have also been shown to possess head fibers. Phage 121Q virions contain numerous long (~100 nm), curly, hair-like structures attached to both the capsid surface and all along the sheath section of the tail. The number of these fibers varies, ranging from 0 to >20 per virion. The distal ends of the fibers appear more electron dense in TEM images, suggesting they may terminate in a globular domain. Phage 121Q head and tail fibers likely interact tightly with the host cell, as it is difficult to separate virion particles from host debris during phage purification. Jumbo phage Atu_ph07 virions also contain hair-like fibers that extend from both the capsid and along the length of the tail. In TEM images, the head fibers range in length, with some appearing close to 200 nm, while the tail fibers are generally shorter (~50 nm). Finally, the Tenacibaculum maritimum jumbo phages PTm1 and PTm5 both possess fiber-like appendages emanating from the upper regions of their heads. These fibers are 50–100 nm long and appear to function in host recognition: electron micrographs of phage-host cell mixtures show some PTm5 virions attached to cells via the tops of their heads, while others are attached via their tails. These head fibers are easily lost during phage purification and may serve to enhance infection in ocean currents by providing an additional initial host binding mechanism prior to tail attachment.

Genome Features

The genomes of over 160 unique jumbo phages have now been sequenced and range in length from the 204,930 bp genome of phage Bellamy to the 497,513 bp genome of phage G. Jumbo phages with the 25 largest genomes infect a range of bacterial genera including Bacillus, Agrobacterium, Salicola, Escherichia, Xanthomonas, Pectobacterium, Cronobacter, Serratia, Yersinia, Salmonella, Klebsiella, Caulobacter, and Pseudomonas (Table 2). Comparative genomic analyses of jumbo phage genomes have aided immensely in our understanding of these giant viruses by revealing both broader commonalities and features unique to specific groups.

Terminal Redundancies

Genomic terminal redundancies are commonly seen in the packaged DNAs of jumbo phages. In smaller phages with linear, circularly permuted genomes, such as phage T4, terminal redundancies result from a head-full packaging mechanism in which concatemeric DNA serves as the packaging substrate. Once initiated, concatemer packaging continues until the capsid is full. Because the phage T4 head accommodates more DNA than a single genome length, the packaged DNA is terminally redundant. A similar head-full packaging mechanism is thought to occur in jumbo phages, which also contain linear, circularly permuted, terminally redundant genomes. Terminal redundancies, typically expressed as a percentage of the genome length, are determined by subtracting the genome length (ascertained via sequencing) from the packaged DNA length (estimated by PFGE or 2D-AGE). Whereas the terminal redundancy for phage T4 packaged DNA is 3%, the terminal redundancies for the jumbo phages PBS1, G, and PAU have been estimated at 26%, 26%, and 35%, respectively. Other jumbo phages display smaller terminal redundancies, such as 5% for phage N3, 8% for phage 121Q, and 9% for phage Bellamy. The extent of genomic terminal redundancy is largely a result of the difference between head volume and genome size. Thus, shorter terminal redundancies may be indicative of more evolved phages that have lengthened their genomes through successive acquisitions of additional unique sequences.
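The terminal redundancy calculation described above can be written out as a short worked example. This is a minimal sketch with hypothetical function names. The phage PAU inputs are the 219,372 bp genome length listed in Table 1 and the 296 kb corrected packaged DNA length reported later in this article, and they reproduce the ~35% figure quoted in the text.

```python
def terminal_redundancy_percent(packaged_bp: float, genome_bp: float) -> float:
    """Terminal redundancy as a percentage of the genome length."""
    if genome_bp <= 0:
        raise ValueError("genome length must be positive")
    return 100.0 * (packaged_bp - genome_bp) / genome_bp


def expected_packaged_length(genome_bp: float, redundancy_percent: float) -> float:
    """Packaged DNA length implied by a genome length and a redundancy percentage."""
    return genome_bp * (1.0 + redundancy_percent / 100.0)


# Phage PAU: 219,372 bp genome, ~296 kb packaged DNA (2D-AGE corrected estimate).
pau_redundancy = terminal_redundancy_percent(296_000, 219_372)
print(round(pau_redundancy, 1))

# Phage G: 497,513 bp genome with an estimated 26% terminal redundancy.
print(round(expected_packaged_length(497_513, 26.0)))
```

Because the packaged length comes from gel estimates, the resulting percentage carries the uncertainty of that measurement and should not be read as more precise than the input.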

Table 2 Largest jumbo phage genomes. The 25 largest jumbo phages based on lengths of phage genome sequences submitted to GenBank as of December 31, 2019. Does not include megaphage sequences obtained via metagenomic analyses.

| Genome (bp) | Phage name | Host | Accession number |
|---|---|---|---|
| 497,513 | Phage G | Bacillus megaterium | NC_023719.1 |
| 490,380 | Atu_ph07 | Agrobacterium tumefaciens | NC_042013.1 |
| 440,001 | SCTP-2 | Salicola sp. PV4 | MF360958.1 |
| 386,442 | CMSTMSU | Escherichia coli | MH494197.1 |
| 384,670 | XacN1 | Xanthomonas citri | AP018399.1 |
| 378,379 | CBB | Pectobacterium carotovorum | NC_041878.1 |
| 370,817 | vB_EcoM_G17 | Escherichia coli DSM 103244 | MK327931.1 |
| 358,663 | vB_CsaM_GAP32 | Cronobacter sakazakii | JN882285.1 |
| 357,154 | BF | Serratia marcescens | NC_041917.1 |
| 353,081 | UB | Escherichia coli O157:H7 | MH383160.1 |
| 352,598 | vB_EcoM_phAPEC6 | Escherichia coli | MK817115.1 |
| 352,596 | fHe-Yen9-03 | Yersinia | LT960552.1 |
| 350,103 | Munch | Salmonella | MK268344.1 |
| 348,718 | 7t3 | Salmonella enterica ssp. enterica | MK773491.1 |
| 348,532 | 121Q | Escherichia coli | KM507819.1 |
| 348,113 | PBECO4 | Escherichia coli O157:H7 | NC_027364.1 |
| 348,043 | vB_Eco_slurp01 | Escherichia coli MG1655 | LT603033.1 |
| 347,152 | SP27 | Escherichia | LC494302.1 |
| 346,602 | K64-1 | Klebsiella pneumoniae | NC_027399.1 |
| 345,809 | vB_KleM-Rak2 | Klebsiella isolate KV-3 | JQ513383.1 |
| 322,272 | CcrBL9 | Caulobacter vibrioides | MH588546.1 |
| 317,488 | CcrSC | Caulobacter vibrioides | MH588547.1 |
| 316,674 | 201phi2-1 | Pseudomonas chlororaphis | NC_010821.1 |
| 309,208 | PhiPA3 | Pseudomonas aeruginosa | NC_028999.1 |
| 309,157 | Phabio | Pseudomonas fluorescens | MF042360.1 |

Genome Nucleotide Composition

Although not exclusive to jumbo phages, it is worth noting that many of the jumbo phage genomes sequenced to date possess a low GC-content, particularly in comparison to that of their bacterial hosts. The phiKZ-like jumbo phages phiKZ, 201phi2-1, and OBP have genomic GC-contents of 36%–48%, significantly lower than the 60%–66% GC-contents of their Pseudomonas hosts. Likewise, the Aeromonas salmonicida jumbo phages 65.2, PX29, and AS5 have genomic GC-contents of 37%, 42%, and 43%, respectively, while their host's genome possesses a GC-content of 58%. Whereas phage genomes are on average 4% richer in A/T than their hosts, this difference is 15%–30% for the above-mentioned jumbo phages. Although common among sequenced jumbo phages, this contrast in phage-host GC-content is not universal, and several jumbo phages have genomic GC-contents similar to those of their bacterial hosts. The higher energy cost of synthesizing GTP and CTP combined with the greater abundance of intracellular ATP and TTP may result in increased replication of virulent phages with AT-rich genomes. Thus, higher genomic AT-contents may provide phages and other intracellular pathogens an evolutionary advantage.
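The phage-host comparison above rests on two simple quantities: the GC-content of each genome and the difference between them. The sketch below illustrates both with hypothetical helper names. Ignoring ambiguous bases such as N when computing the percentage is a choice made here for the example, not a convention stated in the text.

```python
def gc_percent(sequence: str) -> float:
    """GC-content as a percentage of unambiguous bases (A, C, G, T) in the sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def at_enrichment(phage_gc_percent: float, host_gc_percent: float) -> float:
    """Percentage points by which the phage genome is richer in A/T than its host."""
    return host_gc_percent - phage_gc_percent


# Toy sequence, not real genome data.
print(gc_percent("ATGCATTTAA"))

# Aeromonas salmonicida phage 65.2 (37% GC) versus its host (58% GC), values from the text.
print(at_enrichment(37.0, 58.0))
```

For the A. salmonicida example the difference is 21 percentage points, which falls inside the 15%–30% range quoted above.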

Nucleotide Modifications and Substitutions

Many jumbo phage genomes are heavily modified while others incorporate nucleotide substitutions. Restriction-modification systems comprise a well-studied and important component of the bacterial arsenal used against infecting phages. To thwart this defense mechanism, the genomes of many phages, both large and small, contain nucleotide modifications and/or substitutions that go unrecognized by restriction endonucleases. The first identified DNA base modification, 5-hydroxymethylcytosine, was discovered in T-even bacteriophages in 1953. Since then, various modifications in which chemical groups are biologically appended to the nucleobase have been shown for all four DNA nucleotides. These chemical additions range from simple methyl groups to amino acids, polyamines, and mono- and disaccharides. Some of the host- and phage-encoded enzymes that perform nucleobase modifications utilize free nucleotides as their substrates, while others perform base modifications post-replication.

Genes encoding putative DNA modification enzymes have been found in several jumbo phage genomes. Jumbo phages SPN3US (Salmonella enterica) and phiEaH2 (Erwinia amylovora) encode adenine methylase homologs, E. amylovora jumbo phage Y3 encodes both adenine and cytosine methylase homologs, and Klebsiella jumbo phage vB_KleM_RaK2 encodes two putative DNA methyltransferase homologs. Genome modification in some jumbo phages is quite extensive. For example, whereas the length of jumbo phage PAU packaged virion DNA was estimated at 690 kb via PFGE, 2D-AGE provided a calculated corrected length of 296 kb. Thus, the phage PAU genome modifications impart a charge difference great enough to result in a 26% decrease in DNA electrophoresis migration rate.

Several jumbo phages utilize nucleotide substitutions rather than nucleobase modifications. Yersinia enterocolitica phage phiR1-37 and the B. subtilis jumbo phages PBS1, PBS2, and AR9 all have genomes in which thymidine is >99% replaced by deoxyuridine. This pyrimidine substitution is accomplished by both altering the composition of intracellular deoxynucleotide pools and protecting the newly replicated uracil-containing phage genomes from host DNA repair pathways. Because the incorporation of deoxyuridine precludes direct sequencing, these phages presented unique challenges during genome sequencing. Nuclear magnetic resonance or liquid chromatography-mass spectrometry was first employed to determine the chemical makeup of the phage genomes and was followed by different creative solutions. For phiR1-37, phage DNA was cloned using E. coli CJ236, a strain lacking dUTPase and uracil N-glycosylase, which made partial sequencing possible. Phage AR9 was amplified using a Uracil+ polymerase for library prep prior to sequencing on the Illumina MiSeq platform. As additional jumbo phages are isolated and chemically analyzed, it will be interesting to discover the novel base modifications and/or nucleotide substitutions utilized by these evolutionarily innovative viruses.

ORFan Genes

One commonality seen across the spectrum of jumbo phage genomes sequenced to date is the high proportion of predicted open reading frames (ORFs) that currently lack detectable homology in public databases. These ORFan genes encode hypothetical proteins and constitute a large percentage of many jumbo phage genomes. For example, 70%–80% of the genomes of P. aeruginosa phage PaBG, Aeromonas hydrophila phage CC2, Acinetobacter baumannii phage ME3, S. enterica phage SPN3US, and E. amylovora phage phiEaH2 code for hypothetical proteins. This percentage is even higher in the jumbo phages phiCbK, PBECO4 (E. coli), and Ea35-70 (E. amylovora), with 88%–89% of their genomes encoding hypothetical proteins. As an increasing number of jumbo phage genome sequences are deposited in public databases, homologs are identified and the number of ORFan genes decreases. However, the functions of the predicted proteins remain largely unknown. This is both an exciting aspect and a challenge of jumbo phage biology: it currently makes functional predictions difficult while presenting phage biologists with thousands of new and interesting proteins to explore.

Transfer RNA Genes

In addition to the phage-host genome nucleotide composition differences noted above, virulent phages also typically encode more tRNA genes than temperate phages. Phage-encoded tRNAs tend to correspond to codons that are highly prevalent in phage ORFs and simultaneously rare in the coding sequences of their hosts. Although the inclusion of tRNA genes in their genomes is not unique to jumbo phages, the sheer number of tRNAs encoded by some of these giant viruses is nonetheless remarkable. For example, whereas phage T4 encodes eight tRNA genes, the V. parahaemolyticus jumbo phages KVP40 and phipp2 encode 30 tRNA genes and the C. crescentus jumbo phages CcrKarma, CcrMagneto, and CcrColossus encode 26, 27, and 28 tRNA genes, respectively. As with smaller phages, the tRNAs encoded by jumbo phages are thought to correspond to codons that are abundant in phage coding sequences, especially those encoding structural proteins, and serve to increase translation efficiency and virion production.
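The idea that phage-encoded tRNAs match codons common in phage ORFs but rare in host genes can be illustrated by comparing codon frequencies between the two sets of coding sequences. The sketch below is a simplified illustration under stated assumptions: the function names, the ratio threshold of 2.0, and the small pseudocount are all hypothetical choices, the input sequences are toy strings rather than real genes, and a real analysis would work from annotated in-frame coding sequences.

```python
from collections import Counter


def codon_frequencies(orfs: list[str]) -> dict[str, float]:
    """Relative frequency of each codon across in-frame coding sequences."""
    counts: Counter = Counter()
    for orf in orfs:
        orf = orf.upper()
        usable = len(orf) - len(orf) % 3  # drop any trailing partial codon
        for i in range(0, usable, 3):
            counts[orf[i:i + 3]] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


def phage_enriched_codons(phage_orfs: list[str],
                          host_orfs: list[str],
                          min_ratio: float = 2.0,
                          pseudocount: float = 1e-6) -> list[str]:
    """Codons at least min_ratio times more frequent in phage ORFs than in host ORFs."""
    phage_freq = codon_frequencies(phage_orfs)
    host_freq = codon_frequencies(host_orfs)
    return sorted(
        codon for codon, freq in phage_freq.items()
        if freq / (host_freq.get(codon, 0.0) + pseudocount) >= min_ratio
    )


# Toy sequences only: AAA is over-represented in the "phage" ORF.
print(phage_enriched_codons(["ATGAAAAAATAA"], ["ATGGGGGGGAAATAA"]))
```

Codons flagged this way are candidates for the kind of phage-encoded tRNA complement described above, although the threshold used here is arbitrary.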


Advanced Capabilities

With the advanced functionalities encoded by their large genomes, jumbo phages and other giant viruses are beginning to blur the line between what we consider living and non-living. Many of these advanced capabilities provide a level of host independence not seen in smaller viruses, while others aid virus replication by boosting intracellular nucleotide pools, countering host defense systems, or precluding superinfection/secondary infections by competing phages.

RNA Polymerases A dependence on host transcriptional machinery is a hallmark of many dsDNA viruses. Most small dsDNA phages utilize the host’s RNA polymerase (RNAP) throughout infection, modulating its function via various mechanisms to ensure ordered viral gene expression. Notable exceptions include the T7-like phages and the N4-like phages. T7-like phages rely on the host RNAP immediately upon infection to transcribe the phage RNAP gene, then utilize the phage-encoded RNAP to transcribe the viral middle and late genes. In a reverse strategy, the N4-like phages inject a virion-encapsulated RNAP (vRNAP) into the host cell upon infection. The vRNAP is used to express phage early genes including a non-virion RNAP (nvRNAP) that is used to transcribe the phage middle genes. Finally, the host RNAP, modified by the phage, is used to carry out late gene transcription. Thus, even though T7-like and N4-like phages encode their own RNAPs they are still dependent on the host RNAP at particular stages of their replication cycles. Jumbo phage phiKZ and its giant relatives, however, are known to replicate completely independently of the host RNAP. The phiKZ genome encodes multiple ORFs with homology to RNAP subunits. RNA polymerases can be divided into two families on the basis of subunit composition, conserved amino acid motifs, and three-dimensional structures. The family of singlesubunit RNAPs (ssRNAPs) are known to transcribe phage, mitochondrial, and chloroplast genomes, while the family of multisubunit RNAPs (msRNAPs) transcribe genes in cells of all three domains of life. Whereas all other known phage-encoded RNAPs belong to the ssRNAP family, phage phiKZ encodes two sets of proteins homologous to amino-(N-) and carboxy-(C-) terminal fragments of the largest msRNAP subunits. Four proteins from one set are present in phiKZ virions and likely comprise a vRNAP that is injected into host cells upon infection and functions to transcribe phage early genes. 
The second set of proteins is thought to comprise an nvRNAP that functions in transcription of phage middle and late genes. Similar to the nvRNAP of the N4-like phages, the phiKZ-encoded nvRNAP is found in infected cells but not virions. The phiKZ nvRNAP has been purified from infected cells and analyzed by mass spectrometry. The enzyme is composed of five polypeptides, all products of early phage genes. Two polypeptides are homologous to N- and C-terminal portions of msRNAP β subunits, two are homologous to N- and C-terminal portions of msRNAP β′ subunits, and the fifth polypeptide currently possesses no homologs in public databases other than counterparts in related phiKZ-like viruses. Curiously, the nvRNAP holoenzyme lacks similarity to σ factors utilized by bacterial msRNAPs for promoter specificity, yet specifically transcribes from late phiKZ promoters in vitro. Thus, this jumbo phage may encode a novel mechanism for RNAP promoter recognition. To examine dependence on the host RNAP, phiKZ transcription levels were quantified at time points throughout infection of P. aeruginosa pre-treated with rifampicin, an antibiotic that inhibits bacterial RNAPs. Whereas bacterial transcription was decreased by the addition of rifampicin, phiKZ gene transcription was resistant to rifampicin treatment. The B. subtilis jumbo phages PBS2 and AR9 and the Ralstonia solanacearum jumbo phage phiRP31 are also resistant to rifampicin. Like phiKZ, phages AR9 and phiRP31 encode two sets of RNAP β and β′ subunits. Also like phiKZ, the RNAP purified from phage PBS2-infected cells is composed of five subunits with molecular weights roughly corresponding to those of phiKZ’s nvRNAP. Because the genome of phage PBS2 has not yet been sequenced, its relationship to phiKZ is unknown at this time. Phages AR9 and phiRP31, however, are known to belong to the genus phiKZlikevirus.
Thus, it appears multiple phiKZ-related phages have evolved a transcriptional apparatus that provides increased host independence.

DNA Repair Enzymes

Viruses are often characterized as being generally ineffective at DNA repair because they typically do not encode the extensive and complex repair mechanisms found in their hosts. Jumbo phages, however, are expanding our knowledge of virus-encoded DNA repair mechanisms. The genomes of Cronobacter sakazakii jumbo phage CR5, S. enterica jumbo phage SPN3US, and E. amylovora jumbo phage vB_EamM_Deimos-Minion all encode homologs of the SbcC and SbcCD nucleases involved in double-strand break repair in E. coli and B. subtilis. In addition, the genome of E. amylovora jumbo phage Y3 encodes homologs of RecA, UvsE, and the MmcB-like family of DNA repair proteins. RecA is well known for its role in DNA maintenance and repair, UvsE is an endonuclease involved in removing nucleotides damaged by ultraviolet radiation, and MmcB-like proteins are endonucleases thought to function in DNA repair by generating the substrates for translesion synthesis. Functional in vivo analyses of these phage-encoded proteins have not yet been performed, but jumbo phages are thought to encode and employ DNA repair mechanisms to increase the number of viable progeny virions generated during infections. The larger genome and virion sizes of jumbo phages result in fewer infectious particles assembled prior to cell lysis. For example, phage Deimos-Minion has a reported burst size of only 5 plaque-forming particles per infected bacterial cell. Therefore, jumbo phages may have evolved to encode DNA repair mechanisms to help ensure sufficient progeny production.

NAD+ Salvage Pathway

To boost nucleotide synthesis in their hosts, many large phages encode a nicotinamide adenine dinucleotide (NAD+) salvage pathway. Over twenty jumbo phages encode enzymes that catalyze a two-reaction NAD+ salvage pathway, which was first identified in the genome of V. parahaemolyticus jumbo phage KVP40. The pathway is not exclusive to jumbo phages, but the smallest phage found to encode these enzymes, Ruegeria pomeroyi phage DSS3phi8, still possesses a genome greater than 146 kb. The phage-encoded NAD+ salvage pathway, proposed by Lee and colleagues, transforms nicotinamide (NAm) into NAD+ via a nicotinamide mononucleotide (NMN) intermediate (Fig. 3). Genes encoding the nicotinamide phosphoribosyltransferase (NAmPRTase) and nicotinamide mononucleotide adenylyltransferase (NMNATase) enzymes that catalyze the pathway are found in the genomes of several Caulobacter phiCbK-like jumbo phages, jumbo phages that infect Sinorhizobium, Ralstonia, Aeromonas, Cronobacter, and Klebsiella species, and half of the Vibrio jumbo phages sequenced to date. The phage KVP40 NAmPRTase and NMNATase homologs were purified and found to possess the requisite activities in vitro. Transcriptional analyses showed these enzymes were expressed early in the phage infection cycle, during the metabolic phase when intracellular NAD+ and NADH levels are most pertinent for phage replication. Phage DNA replication rates can be greater than four times those of host DNA replication, necessitating increased deoxynucleotide production during infection. With their exceedingly large genomes, this need is even more pronounced during jumbo phage infections. It appears several of these viruses have taken this matter into their own hands, so to speak, by encoding a mechanism to provide increased levels of DNA precursors.
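The two-reaction pathway described above can be written out as a small bookkeeping sketch. The release of pyrophosphate (PPi) at each step is taken from standard biochemistry rather than from this article, and the names `REACTIONS` and `net_reaction` are illustrative only.

```python
from collections import Counter

# Each entry: (enzyme, substrates, products).
# PPi release is an assumption from standard biochemistry; it is not
# spelled out in the text above.
REACTIONS = [
    ("NAmPRTase", {"NAm": 1, "PRPP": 1}, {"NMN": 1, "PPi": 1}),
    ("NMNATase", {"NMN": 1, "ATP": 1}, {"NAD+": 1, "PPi": 1}),
]

def net_reaction(reactions):
    """Sum the reactions and cancel species that appear on both sides."""
    consumed, produced = Counter(), Counter()
    for _enzyme, substrates, products in reactions:
        consumed.update(substrates)
        produced.update(products)
    # Counter subtraction keeps only positive counts, so the NMN
    # intermediate drops out of both sides.
    return dict(consumed - produced), dict(produced - consumed)

substrates, products = net_reaction(REACTIONS)
print(substrates)  # {'NAm': 1, 'PRPP': 1, 'ATP': 1}
print(products)    # {'PPi': 2, 'NAD+': 1}
```

Under these assumptions the net conversion is NAm + PRPP + ATP to NAD+ + 2 PPi, with NMN appearing only as an intermediate.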

LPS Biosynthesis

Various temperate phages are known to encode enzymes involved in lipopolysaccharide (LPS) biosynthesis. Such enzymes are thought to function in altering the surface composition of host cells, thereby precluding superinfection/secondary infection by competing phages. Intriguingly, large clusters of genes involved in LPS biosynthesis have now been found in the genomes of two lytic jumbo cyanophages. Their genomic inclusion was originally thought to simply serve as stuffer DNA for headful packaging. However, Synechococcus jumbo phage S-SSM7 and Prochlorococcus jumbo phage P-SSM2 were isolated from geographically distant locations hundreds of miles apart in the Atlantic Ocean yet encode seven identical homologs of LPS biosynthesis genes. Thus, it is likely these LPS genes are functionally linked. Upon infection of cells undergoing suboptimal growth, some T4-like lytic phages can enter a reversible dormant state in which lytic replication is stalled following early and middle gene expression and resumes once the necessary nutrients become available. It has been suggested a similar condition, referred to as pseudolysogeny, occurs with some marine phages. If this is the case, altering the surface composition of their hosts via phage-encoded LPS biosynthesis may serve to prevent secondary infections by other cyanophages, particularly smaller phages capable of faster replication, thereby ensuring production of jumbo phage progeny upon resumption of lytic replication.

Fig. 3 Jumbo phage-encoded NAD+ salvage pathway. Nicotinamide adenine dinucleotide (NAD+) is synthesized from nicotinamide (NAm), phosphoribosyl pyrophosphate (PRPP), and adenosine triphosphate (ATP) in a two-step pathway catalyzed by phage-encoded nicotinamide phosphoribosyltransferase (NAmPRTase) and nicotinamide mononucleotide adenylyltransferase (NMNATase) enzymes. Modified from Lee, J.Y., Li, Z., Miller, E.S., 2017. Vibriophage KVP40 Encodes a Functional NAD(+) Salvage Pathway. Journal of Bacteriology 199 (9), e00855.

Phage Nucleus

Phages and their bacterial hosts are in a continuous coevolutionary arms race in which bacteria evolve to prevent phage infection and phages evolve to counter host defense mechanisms. Bacteria encode various defense mechanisms such as DNA restriction-modification systems and CRISPR-Cas systems. In addition, bacteria often evolve changes in cell surface receptors utilized by phages for attachment. Phages have countered these strategies with genome chemical modifications and/or nucleotide substitutions, anti-CRISPR proteins, and changes to virion structures used in host attachment. Jumbo phages have also been shown to utilize a unique and intriguing counter-defense mechanism coined the ‘phage nucleus’. This phage-encoded structure forms a proteinaceous barrier in infected cells that separates the phage genome from intracellular defense mechanisms. The compartment is centered in the cell by a bipolar spindle composed of a phage-encoded tubulin-like protein. Remarkably, the proteinaceous shell of the phage nucleus allows selective entry and/or retention of specific host and phage proteins. Enzymes involved in transcription and DNA replication are enclosed within the phage nucleus, while ribosomes and metabolic enzymes are relegated to the surrounding cytoplasm. Thus, in many aspects this phage-encoded structure is indeed reminiscent of eukaryotic nuclei. During infection, phage capsids assemble on the bacterial membrane, then migrate to the phage nucleus where they dock for DNA packaging. Although the phage nucleus was originally described in P. chlororaphis cells infected with phage 201phi2-1, the role of this structure as a counter-defense mechanism has been more extensively studied with the jumbo phage phiKZ. While examining anti-CRISPR capacities of P. aeruginosa phages, Mendoza and colleagues noticed phiKZ was resistant to a variety of CRISPR systems despite a lack of anti-CRISPR homologs in its genome. Purified phiKZ DNA was cleaved by restriction endonucleases in vitro, thus ruling out genomic chemical modifications and/or nucleotide substitutions as a counter-defense mechanism.
When confocal microscopy was used to investigate the mechanism by which phiKZ resists host degradation of its genome, phage DNA was found localized to the center of infected cells, while fluorescently tagged Cas9 protein was clearly excluded from the phage nucleus. The phage nucleus also excludes host restriction enzymes, thereby protecting phage DNA from both defense mechanisms. It is currently unclear how widespread this counter-defense mechanism is among the various jumbo phage lineages. In addition to phiKZ and 201phi2-1, a phage nucleus is also encoded by phiPA3, and public database alignment tools reveal that several other jumbo phages, including vB_PaeM_PS119XW, Phabio, and KTN4, encode homologs of the phiKZ tubulin and phage nucleus shell proteins. All these phages infect Pseudomonas species and belong to the PhiKZlikevirus genus, indicating the evolutionary distribution of this counter-defense mechanism might be limited. Recently, however, a Serratia jumbo phage (PCH45) highly divergent from all jumbo phages sequenced to date has been shown to encode and utilize a phage nucleus structure. Thus, this counter-defense mechanism may be more widespread than previously thought. Although the phage nucleus structures assembled in phiKZ- and PCH45-infected cells protect against Type I CRISPR-Cas systems that target phage DNA, they do not protect against Type III CRISPR-Cas systems, which target phage mRNAs exported to the cytoplasm for translation. Many bacteria encode both DNA- and RNA-targeting CRISPR-Cas systems and, thus, the arms race continues.

Evolution

Phages belonging to the order Caudovirales vary dramatically in genome lengths and head sizes, ranging from the tiny Rhodococcus rhodochrous phage RRH1 to the giant phage G. Across this range, two common rules apply. First, genome lengths are limited by head capacities and second, high packaged DNA densities are necessary for successful DNA injection during the initial stages of infection. As noted above, jumbo phage heads utilize the canonical phage HK97 capsid fold and simply use greater numbers of capsid proteins to build larger heads. Noting this conservation at the structural level, Roger Hendrix suggested an elegantly simple ‘ratchet model’ of phage evolution that accommodates the common rules. In this model, mutations in the major capsid and/or scaffold proteins lead to a larger capsid with an increased T-number while packaged DNA density is maintained through headful packaging with a concomitant increase in terminal redundancy length. The enlarged head size allows for successive incorporation of additional coding sequences into the phage genome until the genome fills the capsid. Importantly, the newly added sequences need not displace preexisting phage genes and instead simply result in a reduction in terminal redundancy length. At this point, reversion to a smaller capsid size would likely be evolutionarily disfavored as it would require deletion of genes that encode selective advantages. Thus, the larger capsid size would become locked in, as by a ratchet, and evolution could repeat the process over and over, generating larger and larger phages in a stepwise manner. The upper limit of phage size is currently unknown. Several megaphages with genomes of 540,000–735,000 bp have recently been identified during metagenomic studies. Phylogenetic analyses of their terminase genes place these behemoths in the Myoviridae and all are predicted to infect members of the Bacteroidetes.
As methodologies are developed to isolate these megaphages, capsid analyses will reveal whether structural conservation exists across the continuum of phage head sizes. Interestingly, megaphage genomes analyzed to date possess an apparently low (<70%) coding density when annotated according to the standard genetic code. Fragmentation of predicted coding sequences indicates these phages may use an alternative genetic code in which the canonical TAG stop codon is repurposed to encode glutamine. As larger and larger phages are discovered, additional interesting and exciting evolutionary innovations will surely be revealed.
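To illustrate why a recoded stop codon makes predicted genes look fragmented, the sketch below translates the same toy sequence under the standard genetic code and under a table in which TAG is read as glutamine (Q). The sequence and the function name `translate` are invented for illustration; real annotation pipelines are considerably more involved.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Alternative table: TAG is read as glutamine (Q) instead of stop.
TAG_RECODED = dict(STANDARD, TAG="Q")

def translate(dna, table):
    """Translate from the first base until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

toy_gene = "ATGGCTTAGGAATAA"  # hypothetical sequence: ATG GCT TAG GAA TAA
print(translate(toy_gene, STANDARD))     # MA   (truncated at TAG)
print(translate(toy_gene, TAG_RECODED))  # MAQE (read through to TAA)
```

Annotated with the standard code, every in-frame TAG ends a predicted gene early, which is consistent with the low apparent coding density noted above.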

Concluding Remarks

In 2019 the World Health Organization listed the growing concern of antibiotic resistance as one of the top ten greatest threats to global health. Phage therapy, successfully developed and utilized in Eastern Europe over the past century, is experiencing a resurgence in interest among Western government agencies, health care workers, and the food industry. The typically lytic lifestyles, advanced counter-defense mechanisms, and adequately broad host ranges of many jumbo phages make them good candidates for phage therapy and/or biocontrol applications. However, detailed knowledge of any phage therapy candidate is a crucial prerequisite to broad clinical use. There is much work to be done in characterizing jumbo phage genomes and further exploring the unique capabilities, host ranges, and infection cycles of these phages. With continuing innovations in technology and the synergistic expertise of phage enthusiasts from diverse fields, the coming decades are sure to be an inspiring and productive time for jumbo phage biology.

Further Reading

Al-Shayeb, B., Sachdeva, R., Chen, L.X., et al., 2020. Clades of huge phages from across Earth’s ecosystems. Nature 578, 425–431. doi:10.1038/s41586-020-2007-4.
Chaikeeratisak, V., Nguyen, K., Khanna, K., et al., 2017. Assembly of a nucleus-like structure during viral replication in bacteria. Science 355, 194–197.
Hendrix, R.W., 2009. Jumbo bacteriophages. In: Van Etten, J. (Ed.), Lesser Known Large dsDNA Viruses. Current Topics in Microbiology and Immunology 328. Berlin Heidelberg: Springer-Verlag, pp. 229–240.
Hua, J., Huet, A., Lopez, C.A., et al., 2017. Capsids and genomes of jumbo-sized bacteriophages reveal the evolutionary reach of the HK97 fold. mBio 8 (5), e01579.
Kawato, Y., Istiqomah, I., Gaafar, A.Y., et al., 2020. A novel jumbo Tenacibaculum maritimum lytic phage with head-fiber-like appendages. Archives of Virology 165, 303–311.
Lavysh, D., Sokolova, M., Minakhin, L., et al., 2016. The genome of AR9, a giant transducing Bacillus phage encoding two multisubunit RNA polymerases. Virology 495, 185–196.
Lee, J.Y., Li, Z., Miller, E.S., 2017. Vibrio phage KVP40 encodes a functional NAD(+) salvage pathway. Journal of Bacteriology 199 (9), e00855.
Malone, L.M., Warring, S.L., Jackson, S.A., et al., 2020. A jumbo phage that forms a nucleus-like structure evades CRISPR-Cas DNA targeting but is vulnerable to type III RNA-based immunity. Nature Microbiology 5, 48–55.
Mendoza, S.D., Nieweglowska, E.S., Govindarajan, S., et al., 2019. A bacteriophage nucleus-like compartment shields DNA from CRISPR nucleases. Nature 577 (7789), 244–248.
Saad, A.M., Soliman, A.M., Kawasaki, T., et al., 2019. Systemic method to isolate large bacteriophages for use in biocontrol of a wide-range of pathogenic bacteria. Journal of Bioscience and Bioengineering 127 (1), 73–78.
Yakunina, M., Artamonova, T., Borukhov, S., et al., 2015. A non-canonical multisubunit RNA polymerase encoded by a giant bacteriophage. Nucleic Acids Research 43, 10411–10420.
Yuan, Y., Gao, M., 2017. Jumbo bacteriophages: An overview. Frontiers in Microbiology 8, 403.

CRISPR-Cas Systems and Anti-CRISPR Proteins: Adaptive Defense and Counter-Defense in Prokaryotes and Their Viruses
Asma Hatoum-Aslan and Olivia G Howell, University of Alabama, Tuscaloosa, AL, United States
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
Cas CRISPR-associated
CRISPR A genetic locus composed of clustered regularly-interspaced short palindromic repeats
crRNA CRISPR RNA
PAM Protospacer adjacent motif
PFS Protospacer flanking site

Glossary
Conjugation A process by which conjugative plasmids are transferred from one prokaryote to another.
Homologs A group of related proteins that share a common ancestor and likely perform similar functions.
Horizontal gene transfer The exchange of genetic material from one organism to another.
Lysogen A prokaryotic cell that harbors the genome of one or more prophages.
Natural transformation A process by which microorganisms acquire DNA from their environment.
Nuclease An enzyme that degrades nucleic acids (RNA and/or DNA).
Nucleic acid Macromolecules such as DNA or RNA which are composed of mononucleotide building blocks.
Phage A virus that specifically infects prokaryotic organisms.
Prokaryotes Single-celled organisms which lack a nucleus and belong to the domain of bacteria or archaea.
Prophage A phage genome that has been integrated into a prokaryotic chromosome.

The Phage-Host Arms Race

Phages comprise a diverse range of viruses capable of infecting specific bacterial and archaeal hosts. Independently discovered in 1915 and 1917 by Frederick Twort and Félix d’Hérelle, phages are now recognized as the most abundant entities on the planet. In fact, there are an estimated 10³¹ phage particles in the biosphere, which equates to about ten billion times more phages than there are stars in the observable universe! Phages have been found to persist ubiquitously alongside their prokaryotic hosts and outnumber them by a factor of ten to one. Phages are parasites that rely on their hosts to reproduce. Most phages replicate in one of two manners: some undergo a strictly lytic cycle, while others may choose to enter into a lysogenic cycle (Fig. 1). The former are referred to as virulent or lytic phages, while the latter are called temperate phages. Both types begin their replication cycle by attaching to a specific receptor on their host’s surface and injecting their genetic material into the cell. Virulent phages strictly adhere to the lytic replication cycle, during which injection is promptly followed by the synthesis of the phage genome and proteins, and the assembly of progeny phages. These processes rely upon the host’s enzymes and reservoir of chemical building blocks. Following a defined latent period, the progeny exit the cell, an event that typically leads to cell lysis and the death of the host. In contrast, following injection, temperate phages decide whether to enter into the lytic cycle or persist in a state of dormancy through the lysogenic cycle. The lysogenic cycle usually entails the integration of phage DNA into the host chromosome immediately after entry into the cell. The integrated phage is termed a prophage, and prokaryotes harboring one or more prophages are referred to as lysogens.
After integration, the prophage is replicated along with the host chromosome until a stressor or other signal triggers its excision. Following excision, the prophage completes the lytic replication cycle.

Fig. 1 The lytic (left) and lysogenic (right) cycles of phage replication.

Encyclopedia of Virology, 4th Edition, Volume 4
doi:10.1016/B978-0-12-809633-8.20962-X

While these varying strategies of replication follow distinct time courses and carry differing evolutionary ramifications, it has been estimated that nearly 10³⁰ such phage infections occur each day. Therefore, phages impose a substantial selective pressure upon their hosts, who must quickly respond and adapt in order to survive. This relationship between phages and their hosts is a classic example of the Red Queen Hypothesis. Originally posited by Leigh Van Valen in 1973, the Red Queen hypothesis states that in order to maintain fitness, both the host and parasite must continually co-evolve in a veritable arms race to surmount the defenses of each party involved. Such a process can lead to the near-extinction of one side or the other; however, a series of evolutionary adaptations within host and parasite ensures that both continue to co-exist, albeit in a rather uneasy equilibrium. Accordingly, numerous studies have reported upon the various strategies phages and bacteria have acquired to ensure their continued survival. Prokaryotic defenses against phages can include passive mechanisms such as the modification of cell-surface receptors through random mutagenesis, as well as active mechanisms such as the induction of programmed cell death in the face of a phage invasion (a process known as abortive infection), or the degradation of phage-derived nucleic acids by prokaryotic immune systems. The latter can be further divided into innate immunity, such as that conferred by restriction-modification systems, and adaptive immunity, such as that provided by clusters of regularly-interspaced short palindromic repeat (CRISPR) sequences and CRISPR-associated (cas) genes.

Adaptive immune systems were once thought to be found exclusively in animals; however, the discovery of CRISPR-Cas systems proved this notion to be incorrect. These systems utilize short CRISPR RNAs (crRNAs) and Cas proteins to detect and destroy foreign nucleic acids originating from plasmids, phages and other extrachromosomal genetic elements. The general CRISPR-Cas pathway occurs in three steps (Fig. 2): adaptation, expression, and interference. During adaptation, short segments of invading nucleic acids (~30–40 nucleotides in length) are captured and inserted into the CRISPR locus as “spacers” in between similarly sized repeat sequences in order to record molecular “memories” of the invader. During expression, the repeats and spacers are transcribed into a long precursor crRNA (pre-crRNA). The pre-crRNA is subsequently processed (i.e., chopped) within each repeat to liberate mature crRNAs that each specify a single nucleic acid target. Mature crRNAs combine with Cas proteins to form a Cas complex (also known as an effector complex) which patrols the interior of the cell, searching for a match. During interference, foreign nucleic acids that bear sequences complementary to the crRNA (called “protospacers”) are detected and sliced by the Cas enzymes. CRISPR-Cas systems are relatively widespread in nature: about 40% of bacteria and 90% of archaea have been found to harbor one or more of these systems. They not only protect against phage infection, but also prevent other modes of horizontal gene transfer (HGT) such as conjugation and natural transformation. Since HGT is a major driving force of prokaryotic evolution, these systems profoundly impact the microbial communities within, on, and around us.
However, CRISPR-Cas systems by no means represent an impenetrable barrier, as phages have been shown to evolve a variety of resistance strategies. As an example, anti-CRISPRs consist of a diverse set of phage proteins acquired in response to the selective pressures imposed by their CRISPR-Cas containing hosts. These proteins are the first-identified viral defense mechanisms that provide sustainable protection against the adaptive immunity conferred by CRISPR-Cas systems. This article will focus on the history and biology of CRISPR-Cas systems and anti-CRISPR proteins, two remarkable molecular innovations that have emerged from the perpetual phage-host arms race. It is worthwhile mentioning that over the past decade, significant advances in our basic understanding of CRISPR-Cas pathways have enabled the development of a myriad of cutting-edge biotechnologies that continue to revolutionize genetics research and promise to offer transformative treatments for genetic diseases. Notable pioneers of CRISPR-based technologies include Jennifer A. Doudna of the University of California at Berkeley, Emmanuelle M. Charpentier of the Max Planck Institute, Feng Zhang at the Broad Institute, and George Church at Harvard University. Such technologies include powerful genetic engineering tools that can be used to introduce precise modifications into the genomes of organisms spanning all domains of life, including humans. As a consequence, the birth of the very first CRISPR-engineered human babies was revealed in 2018 through a shocking announcement that has sparked an international controversy and heated ethical debates across the globe, the conclusions of which have yet to be discerned. The topic of CRISPR-based technologies and potential applications in humans lies outside of the scope of this article. However, for more information on these topics, the reader is directed to the relevant articles referenced in the “Further Reading” section.

Fig. 2 The CRISPR-Cas pathway.
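The three steps of the pathway outlined earlier in this section (adaptation, expression, and interference) can be caricatured in a few lines of code. The toy model below treats the CRISPR array as a list of spacers and interference as an exact substring match. The repeat sequence, the class name `ToyCrisprLocus`, its methods, and both example sequences are hypothetical, and features of real systems such as strand complementarity and mismatch tolerance are ignored.

```python
REPEAT = "GTTTTAGAGCTA"  # made-up repeat sequence, for illustration only

class ToyCrisprLocus:
    def __init__(self):
        self.spacers = []  # molecular "memories" of past invaders

    def adapt(self, invader_dna, start, length=30):
        """Adaptation: capture a short segment of the invader as a spacer."""
        spacer = invader_dna[start:start + length]
        if len(spacer) == length and spacer not in self.spacers:
            self.spacers.append(spacer)

    def array(self):
        """The locus as a string: repeat-spacer-repeat-...-repeat."""
        return REPEAT + "".join(s + REPEAT for s in self.spacers)

    def express(self):
        """Expression: cut the long transcript at repeats, one crRNA per spacer.

        Simplification: whole repeats are discarded here.
        """
        pre_crrna = self.array()
        return [piece for piece in pre_crrna.split(REPEAT) if piece]

    def interferes_with(self, target_dna):
        """Interference: does any crRNA match a protospacer in the target?"""
        return any(crrna in target_dna for crrna in self.express())

phage = "ATGACCGTTAAGCTTGGCATCGATCCGTTAGCAATGCCGTAACGGATTACC"
other = "TTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCCCGGGAAATTTCCCGG"

locus = ToyCrisprLocus()
print(locus.interferes_with(phage))  # False: no spacers acquired yet
locus.adapt(phage, start=5)
print(locus.interferes_with(phage))  # True: the stored spacer matches
print(locus.interferes_with(other))  # False: unrelated sequence
```

The sketch captures only the bookkeeping: a cell is protected against an invader once, and only once, a matching spacer has been recorded in its array.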

CRISPR-Cas, A Brief History

At the time of writing (Summer, 2019), a simple search for “CRISPR” in the NCBI (National Center for Biotechnology Information) “PubMed” database (see “Relevant Website” section) returned 13,718 publications, an immense body of literature that reflects the intense interest in the subject, yet stands seemingly at odds with its relatively brief history. The curious clusters of repeats were first observed in Escherichia coli in 1987, in a study led by Atsuo Nakata at Osaka University, Japan. This study sought to characterize iap, a gene thought to encode a proteolytic (protein-cleaving) enzyme. As almost an afterthought, it was noted that iap is encoded adjacent to an unusual genetic structure consisting of several 29-nucleotide direct repeats interrupted by 32-nucleotide unique sequences. In 1989, a follow-up paper by the same group was published in which these repeats were further characterized and identified in other gram-negative bacteria. Around the same time period, nearly half-way across the world, Francisco J. M. Mojica began his PhD dissertation research at the University of Alicante, Spain. While studying the gas vesicles of Haloferax mediterranei, a salt-loving archaeon, Mojica came across regularly-spaced repeats of 30 nucleotides separated by short stretches of unique sequences within the organism’s genome. In the years that ensued, multiple publications independently noted similar structures in diverse bacteria and archaea, including studies by Groenen et al. (1993), Masepohl et al. (1996), Karlin et al. (1998), and Hoe et al. (1999). However, despite this increasing awareness of their prevalence, the biological function of these mysterious repeats remained elusive for over a decade. Undoubtedly, early bioinformatic and computational analyses played pivotal roles in predicting their biological relevance.
In 2002, the term “CRISPR” was coined to describe this special class of repeats, an acronym agreed upon by Mojica (by that time a leader of his own research group) and a team from Utrecht University led by Ruud Jansen. Jansen’s group was the first to conduct extensive in silico analyses of CRISPR loci and flanking regions, and identified the conserved cas genes in their vicinity. However, the most important clues arose when attention shifted from the repeats and cas genes to the short “spacer” sequences in between them. In 2005, Mojica’s group published the first report of a systematic analysis of the spacers; it was discovered that many of them matched sequences found within extrachromosomal genetic elements such as plasmids and phages. These and other observations led to the hypothesis that CRISPR-Cas systems provide immunity against mobile genetic elements matching the spacers through a small RNA-mediated mechanism akin to eukaryotic RNA interference. Within a few months, similar observations were reported in two additional publications from two independent research groups in France (led by Gilles Vergnaud and Alexander Bolotin, respectively). Both reports noted that many spacers are derived from phage sequences or other extrachromosomal elements, and Bolotin’s group also arrived at the hypothesis that CRISPR elements and associated cas genes likely protect cells against foreign nucleic acid invasion. In the years that followed, several seminal studies published in quick succession by a collection of international teams revealed the basics of how CRISPR-Cas systems operate at the molecular level. One such study, published in 2007, was led by Philippe Horvath and Rodolphe Barrangou. They reported the first experimental evidence that spacers and associated cas genes indeed provide immunity against phages.
At the time, the lead researchers were affiliated with Danisco Inc., a food production company (acquired by DuPont in 2011), and the study focused on Streptococcus thermophilus, a key species used in the production of yogurt and cheese. Presumably stemming from an interest in developing more robust, phage-resistant bacterial strains, the team investigated the relationship between the bacterium’s CRISPR-Cas locus and its phages. They demonstrated that phage infection caused the bacteria to acquire new spacer sequences that matched back to regions of the phage genome. Furthermore, by intentionally adding and removing spacer sequences from the bacterium’s genome, the investigators established a positive correlation between the presence of a specific spacer sequence and resistance to the phage that harbors that sequence. One year later, a study on E. coli’s CRISPR-Cas system was published by John van der Oost, Stan J.J. Brouns, and co-workers at Wageningen University in the Netherlands. In this study, the investigators unambiguously demonstrated that repeats and spacers are transcribed as a long pre-crRNA, and the spacers are snipped out to produce mature crRNAs that each contain a single spacer sequence. Mature crRNAs were found to associate with a group of Cas proteins, and both components were required to interfere with phage proliferation.

Table 1 Cas protein composition of the six CRISPR-Cas types

Class | Type | Subtype designations | Adaptation | Expression | Interference | Large subunit
1 | I | A-F, U, Fvᵃ | Cas1, Cas2 | Cas6 | Cas5, Cas7, Cas8, Cas3, SSᵈ | Cas8
1 | III | A-D, Bv | Cas1, Cas2 | Cas6ᶜ | Cas10, Cas5, Cas7, SS, Csm6/CARFᵉ | Cas10
1 | IV | IV, IVv | unknown | Cas6v | Cas5, Cas7, Cas8-like, SS | Cas8-like
2 | II | A-C, Cv | Cas1, Cas2, Cas4 | RNase III | Cas9 | Cas9
2 | V | A-Eᵇ | Cas1, Cas2, Cas4 | Cas12 | Cas12 | Cas12
2 | VI | A, B1, B2, C | Cas1, Cas2 | Cas13 | Cas13 | Cas13

ᵃ v, variant.
ᵇ Additional tentative systems have been proposed.
ᶜ Additional host nucleases are also involved.
ᵈ SS, subtype-specific small subunit.
ᵉ CARF, CRISPR-associated Rossmann fold domain-containing protein.

As continuing insights into CRISPR-Cas mechanisms in diverse organisms appeared in the literature, clear distinctions between them began to emerge. As examples, a study published in 2008 by Luciano A. Marraffini and Erik J. Sontheimer at Northwestern University provided evidence which implied that DNA is the most likely target for CRISPR interference in the skin-dwelling bacterium Staphylococcus epidermidis. However, one year later, a similar CRISPR-Cas system in the archaeon Pyrococcus furiosus was convincingly shown to directly cut protospacer sequences within mRNA in a study led by Michael and Rebecca Terns at the University of Georgia.
Another report published in 2010 by a group at Université Laval in Quebec led by Sylvain Moineau showed that double-stranded plasmid and phage DNA are directly cleaved within protospacer regions by the CRISPR-Cas system in S. thermophilus. Although these initial reports on the targets of CRISPR-Cas immunity seemed to contradict one another, further advances have reconciled these seemingly disparate observations and have led to the current understanding that there exist multiple CRISPR-Cas Types which exhibit remarkable diversity with regard to their genetic architecture, cas gene composition, and mechanisms of action.

CRISPR-Cas Diversity and Classification

The varied CRISPR-Cas systems known today are believed to have evolved from a single common ancestor, which diverged over time due to its dissemination into diverse organisms and the constant selective pressure imposed by the phage-host arms race. In 2006, even before the very first experimental evidence of CRISPR-Cas function became available, Eugene V. Koonin and Kira S. Makarova of the National Institutes of Health, USA, began the process of building a framework for CRISPR-Cas classification. These investigators used computational methods to compare CRISPR loci and cas genes from different systems across available prokaryotic genomes in order to infer their evolutionary relationships. They also integrated their findings with structural and functional data to arrive at the current classification scheme. It is important to note that the current scheme represents an evolving framework that is continually revisited and refined as new CRISPR-Cas systems are identified and more information about them becomes available. According to this scheme (summarized in Table 1), CRISPR-Cas systems are grouped into two broad classes and six distinct Types (I-VI). Class 1 CRISPR-Cas systems include Types I, III, and IV, while Class 2 systems include Types II, V, and VI. Each CRISPR-Cas Type is further subdivided into subtypes (indicated with a letter or alphanumeric designation). One defining feature that separates the classes is the configuration of their effector complexes: Class 1 systems all possess multi-subunit Cas complexes, while Class 2 systems possess single-subunit effectors. Furthermore, the protein composition of these complexes delineates the different CRISPR-Cas Types. All complexes contain a large Cas protein that is specific to each Type, and Class 1 complexes contain an additional small Cas protein subunit that is characteristic of each subtype.
Class 1 systems are the most widespread in nature and have been found to reside in bacterial and archaeal species, while the less common Class 2 systems are found nearly exclusively in bacteria. It now bears mentioning that the early functional insights into the general three-step CRISPR-Cas pathway emerged from studies on a diverse set of Class 1 and Class 2 systems: Type I-E (found in E. coli), Type II-A (found in S. thermophilus), and Types III-A and III-B (found in S. epidermidis and P. furiosus, respectively). Adaptation is considered the most highly conserved step in CRISPR-Cas immunity. Accordingly, Cas1 and Cas2, which are essential for adaptation, are found in the majority of CRISPR-Cas Types. On the other hand, the expression and interference steps of immunity, collectively referred to as CRISPR defense, are carried out by diverse Cas and sometimes even non-Cas proteins and/or additional factors. As an example of this remarkable functional diversity, crRNA processing in E. coli’s Type I-E system is catalysed by the Cas6 nuclease, which snips the pre-crRNA within repeat sequences. In the S. epidermidis Type III-A system, crRNA processing occurs in multiple steps that are carried out by Cas6 and additional host-encoded nucleases. CrRNA processing is even more complicated in Type II systems. As an example, the Type II-A system in Streptococcus pyogenes relies upon the host-encoded nuclease RNase III as well as a second small RNA called a trans-activating crRNA (tracrRNA). The tracrRNA bears partial complementarity with the CRISPR repeats and anneals to the pre-crRNA in order to create the double-stranded RNA substrate that is required for cleavage by RNase III. This remarkable mechanism was first reported in 2011 in a study led by Emmanuelle Charpentier, then at Umeå University in Sweden. The interference step of immunity is similarly diverse across all CRISPR-Cas Types and relies upon distinct sets of Cas and non-Cas proteins that degrade a variety of nucleic acid targets. For example, Type I, II and V systems recognize and cut DNA targets, Type VI systems degrade strictly RNA, and Type III systems can cut both DNA and RNA. To further illustrate their diversity, some CRISPR-Cas Types have evolved to perform functions beyond adaptive immunity, such as the regulation of endogenous gene expression.
Undoubtedly, future explorations into CRISPR-Cas biology will continue to reveal such unexpected mechanisms and contribute to a broader understanding of the impacts of these systems on the host organisms in which they reside.
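
The classification described in this section lends itself to a simple lookup table. The following Python sketch is only an illustration and not part of any established tool: the names `CrisprType`, `TYPES`, `effector_architecture`, and `types_targeting` are hypothetical, and the table records only what is stated above (class membership, effector configuration, and the nucleic acids cut during interference). Type IV is left without targets because its mechanism of action remains unknown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrisprType:
    """One CRISPR-Cas Type, as summarized in the text (illustrative only)."""
    name: str            # Roman-numeral Type designation
    crispr_class: int    # 1 = multi-subunit effector, 2 = single-subunit effector
    targets: frozenset   # nucleic acids cut during interference (empty if unknown)


# Hypothetical lookup table built from the statements in this section.
TYPES = {
    "I":   CrisprType("I", 1, frozenset({"DNA"})),
    "II":  CrisprType("II", 2, frozenset({"DNA"})),
    "III": CrisprType("III", 1, frozenset({"DNA", "RNA"})),
    "IV":  CrisprType("IV", 1, frozenset()),  # mechanism not yet known
    "V":   CrisprType("V", 2, frozenset({"DNA"})),
    "VI":  CrisprType("VI", 2, frozenset({"RNA"})),
}


def effector_architecture(type_name):
    """Return the effector configuration that defines the Type's class."""
    if TYPES[type_name].crispr_class == 1:
        return "multi-subunit"
    return "single-subunit"


def types_targeting(nucleic_acid):
    """List the Types (in table order) reported to cut the given nucleic acid."""
    return [name for name, t in TYPES.items() if nucleic_acid in t.targets]


dna_types = types_targeting("DNA")
rna_types = types_targeting("RNA")
```

Here `dna_types` collects Types I, II, III, and V, while `rna_types` collects Types III and VI, mirroring the observation that only Type III systems cut both kinds of target.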

Mechanisms of CRISPR-Cas Immunity

Given the considerable diversity across the six CRISPR-Cas Types, it is difficult to draw generalizations that would encapsulate all of their mechanistic details. Therefore, the following sections will focus on the adaptation and defense mechanisms of the best-characterized examples to serve as a starting point for a more in-depth exploration of the literature. It is worth mentioning that Types I, II, and III systems have historically been the most thoroughly investigated and are considered the main CRISPR-Cas Types, whereas Types IV, V, and VI are newer additions to the CRISPR-Cas collection, and considerably less is known about their biology.

Adaptation

The first step in the CRISPR-Cas pathway, adaptation generally involves four steps: detection of foreign nucleic acids, selection of a protospacer, processing of the protospacer, and integration into the CRISPR locus as a new spacer. Since the mistaken integration of endogenous chromosomal DNA would lead to self-targeting (i.e., autoimmunity) and cell death, the protein machinery catalysing these events must necessarily exhibit a strong preference for foreign nucleic acids and/or remain under tight regulation by the cell. Although adaptation is considered the most highly conserved step of the CRISPR-Cas pathway, it remains the least well-understood. Much of the information available on adaptation has been derived from studies on Types I-E and II-A CRISPR-Cas systems. E. coli’s Type I-E system is arguably the best-characterized of the six CRISPR-Cas Types, and the first insights into its mechanism of adaptation were reported in 2012 by Udi Qimron’s group at Tel Aviv University. These investigators demonstrated that Cas1 and Cas2, even in the absence of other Cas proteins, could mediate the acquisition of new spacer sequences from a plasmid. The investigators observed that new spacers are added on the promoter-proximal end of the repeat-spacer array and integration requires the presence of specific sequences in the “leader” region (i.e., sequences that reside directly upstream of the array). Importantly, analysis of the acquired spacers and corresponding targeted regions in the plasmid revealed the presence of conserved sequences adjacent to the protospacer (referred to as a protospacer-adjacent motif, or PAM). It is now known that PAM sequences play an essential role in preventing autoimmunity by allowing the CRISPR-Cas system to distinguish between invader-derived DNA and the spacer sequences of the endogenous CRISPR locus.
Subsequent studies on this system by numerous groups, including those led by Severinov and Semenova (2012) and Brouns (2012), revealed that there are at least two possible modes of adaptation: naïve and primed. Naïve adaptation occurs when the system encounters a brand-new phage bearing little or no resemblance to an existing spacer. This mode of adaptation depends entirely on the activities of Cas1 and Cas2, and occurs at a lower frequency. In contrast, primed adaptation occurs when the system encounters a phage that bears partial complementarity to an existing spacer. Primed adaptation is a more efficient process that relies upon the effector complex to guide Cas1 and Cas2 to the partially-matching target and facilitate further acquisition of new spacers in the vicinity. Following these initial findings, roles for host-encoded proteins in protospacer selection and new spacer integration were later described by Rotem Sorek’s group at the Weizmann Institute of Science and Jennifer Doudna’s group at the University of California at Berkeley, respectively, in 2015. In subsequent studies by various groups, mechanisms of adaptation in the Type II-A system came to light. Similarly to Type I-E, adaptation by the Type II-A CRISPR-Cas system was found to rely upon Cas1 and Cas2 in addition to other Cas and non-Cas proteins encoded in the host. Although it is unclear whether this system exhibits primed adaptation, naïve adaptation in Type II-A also displays requirements for sequence motifs in the CRISPR leader region, preferences for adding new spacers to the leading end, and the presence of a PAM.
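
Two of the features of adaptation described above, selection of PAM-adjacent protospacers and insertion of the new spacer at the leader-proximal end of the array, can be sketched in a few lines of Python. This is a toy model under stated assumptions rather than a description of the Cas1-Cas2 machinery: the function names, the PAM sequence `AAG`, the spacer length of 6, and the example sequences are all hypothetical, and the PAM is arbitrarily placed immediately before the protospacer on the strand shown.

```python
def find_protospacers(dna, pam, spacer_len):
    """Return every stretch of `spacer_len` bases that directly follows `pam`.

    Assumption: the PAM lies immediately 5' of the protospacer on the
    strand given; real systems differ in PAM position and sequence.
    """
    hits = []
    start = dna.find(pam)
    while start != -1:
        proto_start = start + len(pam)
        proto = dna[proto_start:proto_start + spacer_len]
        if len(proto) == spacer_len:  # ignore truncated hits at the end
            hits.append(proto)
        start = dna.find(pam, start + 1)
    return hits


def integrate_spacer(array, new_spacer):
    """Return a new array with the spacer added at the leader-proximal end.

    Index 0 stands for the position next to the leader; a spacer that is
    already present is not added again.
    """
    if new_spacer in array:
        return list(array)
    return [new_spacer] + list(array)


# Hypothetical plasmid fragment containing two PAM-adjacent protospacers.
plasmid = "TTAAGCCGTACGATCGTTAAGGGTACC"
candidates = find_protospacers(plasmid, pam="AAG", spacer_len=6)

# Acquire the first candidate into an existing one-spacer array.
updated_array = integrate_spacer(["AAAAAA"], candidates[0])
```

Because the endogenous repeat-spacer array carries no PAM next to its spacers, a search of this kind would not select the array itself, which is the logic by which the PAM helps prevent autoimmunity.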

Fig. 3 Examples of the six CRISPR-Cas types.

Defense

Type I CRISPR-Cas systems, sometimes referred to as CRISPR-Cas3, are the most widely distributed in nature. As mentioned above, the best-characterized example is the Type I-E system found in E. coli (Fig. 3, top panel). CrRNA expression in this system relies upon the endonuclease Cas6 (also known as CasE or Cas6e). This enzyme recognizes and cuts the pre-crRNA within each repeat. Cas6 and the mature crRNA are then joined by four additional Cas protein subunits in varying stoichiometries: one copy of Cas8 (the large Type-specific subunit, also called CasA or Cse1), six copies of Cas7 (also known as CasC), one copy of Cas5 (also called CasD), and two copies of CasB (a small subtype-specific subunit also known as Cse2). The resulting Cas complex is known as Cascade (CRISPR-associated complex for antiviral defense), a term that was coined by John van der Oost, Stan J.J. Brouns, and co-workers in 2008 in the paper that first described this complex. Cascade can detect nucleic acid sequences that are complementary to the crRNA and positioned adjacent to a two- to six-nucleotide PAM sequence. Once such invading DNA sequences are identified, the helicase-nuclease Cas3, a protein separate from the complex, degrades the targeted DNA. An important feature of Type I systems is the requirement for perfect complementarity in the first 6–8 nucleotides adjacent to the PAM in the crRNA-protospacer match, a region known as the “seed”. Notably, the requirement for a PAM and seed renders Type I systems particularly susceptible to passive mechanisms of phage escape from immunity, whereby if a phage randomly acquires a point mutation in the PAM or seed region, it will evade CRISPR detection altogether.

Type II CRISPR-Cas systems, also known as CRISPR-Cas9 (Fig. 3, bottom panel), have historically been the most commonly used for biotechnological applications due to their relative simplicity compared to Class 1 systems.
The Type II-A systems found in Streptococcus pyogenes and S. thermophilus were among the first to be functionally characterized. As mentioned earlier, crRNA processing in these systems relies upon a small tracrRNA, which bears some complementarity to the repeat-derived sequence of the pre-crRNA. Pairing between the two RNAs facilitates crRNA processing by the host-encoded nuclease RNase III. During interference, the Cas9 nuclease bound to both RNAs recognizes and cleaves both strands of foreign double-stranded DNA matching the crRNA sequence. Similarly to Type I, Type II systems rely upon the presence of a PAM to distinguish “self” from “non-self” DNA and require perfect complementarity in a seed region to carry out interference. Accordingly, targeted phages can escape Type II immunity by simply acquiring point mutations in these critical regions.

Type III CRISPR-Cas systems are the second-most prevalent in nature and are very likely the most complex of the six Types. One of the best-characterized examples of these is the Type III-A CRISPR-Cas system found in S. epidermidis, also known as CRISPR-Cas10 (Fig. 3, top panel). In this system, crRNA processing relies upon the activities of Cas6 as well as host-encoded nucleases including PNPase. The mature crRNAs that result are joined by five protein subunits in varying copy numbers: one copy of Cas10 (the large Type-specific subunit, also called Csm1), one copy of Cas5 (also called Csm4), multiple copies of the Cas7 homologs Csm3 and Csm5, and multiple copies of Csm2 (the small subtype-specific subunit). The resulting effector complex is termed “Cas10-Csm”. Interference by this system has at least two requirements: First, the targeted region must be actively transcribed, and second, the crRNA must bear complementarity to the mRNA. Once the Cas10-Csm complex pairs with a targeted mRNA, it performs interference through the activities of at least three different CRISPR-associated nucleases: Cas10 cuts the targeted DNA, Csm3 cuts the targeted RNA, and Csm6, which is not a member of the complex, cuts RNA outside of the protospacer in a nonspecific (i.e., sequence-independent) manner.
During interference, Cas10 also releases cyclic oligoadenylates, which are small second-messenger molecules that bind to the CARF (CRISPR-associated Rossmann fold) domain of Csm6 and further stimulate its nuclease activity. This remarkably sophisticated second-messenger signaling mechanism, first described in 2017 by two independent groups (led by Virginijus Siksnys at Vilnius University and Martin Jinek at the University of Zurich, respectively), explains how Type III systems constrain the activities of Csm6 to the site of the phage infection. Interestingly, additional host-encoded nucleases are also essential for immunity through mechanisms that are still being unraveled. Unlike Type I and II systems, Type III systems do not possess PAM or seed sequences, and therefore exhibit a robust immune mechanism that cannot be easily overcome by point mutations in the phage genome. Notably, this unique feature makes Type III systems particularly suited to facilitate genome editing of lytic phages (see relevant article in “Further Reading”). However, the requirement for active transcription across the targeted region provides a route for lysogenic phages to remain unharmed by the CRISPR-Cas system provided they keep their targeted genes silent while integrating into the host chromosome.

Type IV CRISPR-Cas systems are the least well understood (Fig. 3, top panel). A 2019 study by Lennart Randau and coworkers of the Max Planck Institute for Terrestrial Microbiology showed that in the Type IV CRISPR-Cas system of Aromatoleum aromaticum, crRNA processing is carried out by a unique Cas6 variant (Cas6v, also known as Csf5). Cas6v and the mature crRNAs are joined by three additional subunits in as yet undetermined stoichiometries: Csf1 (the large Type-specific subunit), multiple copies of the Cas7 homolog Csf2, and the Cas5 homolog Csf3.
The resulting ribonucleoprotein complex, termed the Type IV crRNP, has a mechanism of action that remains unknown; however, the investigators speculate that this complex functions similarly to the Type I Cascade complex.

As newer members of the Class 2 group, Types V and VI CRISPR-Cas systems have been the focal point of more recent investigations, particularly Types V-A and VI-A (Fig. 3, bottom panel). Both systems perform crRNA-mediated interference against nucleic acid invaders using single-subunit effector complexes. In Type V-A systems, the large subunit is Cas12a (also known as Cpf1), while in Type VI-A systems, the large subunit is Cas13a (also known as C2c2). Both enzymes have been shown to process pre-crRNAs and cleave their targets using separate nuclease active sites. During interference, the Cas12 effector complex recognizes and cuts double-stranded DNA within the protospacer region, as well as non-target single-stranded DNA in the vicinity. In contrast, the Cas13 effector complex recognizes single-stranded RNA protospacers and subsequently degrades non-target RNA. The latter cleavage of non-target DNA or RNA upon binding a matching protospacer is termed collateral cleavage. This indiscriminate degradation of nucleic acids ensures that all invaders in the vicinity of the protospacer are shredded, and may also induce a state of dormancy in the cell due to the unintended degradation of host-derived nucleic acids. Finally, while Type V CRISPR-Cas systems have been shown to require specific PAM sequences to enable interference, Type VI CRISPR-Cas systems have a looser preference for specific nucleotides adjacent to the protospacer in a protospacer-flanking site (PFS). Thus, the latter sequence requirements may promote the facile accumulation of phage escape mutants via the acquisition of point mutations.
It is worth noting that the collateral cleavage exhibited by Types V and VI CRISPR-Cas systems has enabled the development of powerful CRISPR-based diagnostic tools. Additionally, since Type V CRISPR-Cas systems are even simpler than Type II systems with regard to their mechanisms of action, the former appear to rival the latter in their increasingly widespread use for genome editing applications.
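
The way PAM and seed requirements expose Type I and II systems to escape by point mutation can be illustrated with a short Python sketch. It is a deliberately simplified model and not an account of how any effector complex actually scores a target: the function names, the PAM `GG`, the seed length of 8, the tolerance of two mismatches outside the seed, and the example sequences are all assumptions made for illustration. For simplicity the spacer and protospacer are written in the same sense, so that complementarity is modeled as identity, with position 0 taken as the PAM-proximal end.

```python
def is_targeted(spacer, protospacer, pam, flank,
                seed_len=8, max_distal_mismatches=2):
    """Toy interference rule: correct PAM, perfect seed, tolerant distal region.

    `flank` is the sequence found next to the protospacer in the invader.
    The mismatch tolerance outside the seed is an assumed value.
    """
    if flank != pam:
        return False  # no PAM: the site is not recognized
    if len(spacer) != len(protospacer):
        return False
    if spacer[:seed_len] != protospacer[:seed_len]:
        return False  # any seed mismatch abolishes targeting
    distal_mismatches = sum(
        a != b for a, b in zip(spacer[seed_len:], protospacer[seed_len:])
    )
    return distal_mismatches <= max_distal_mismatches


def mutate(seq, position, base):
    """Return a copy of `seq` with a single-base substitution."""
    return seq[:position] + base + seq[position + 1:]


spacer = "ACGTACGTTTGACCAGGTCA"          # hypothetical 20-nt spacer
wild_type = spacer                        # phage carries a perfect match
seed_mutant = mutate(wild_type, 3, "A")   # substitution inside the seed
distal_mutant = mutate(wild_type, 15, "T")  # substitution outside the seed

results = {
    "wild type": is_targeted(spacer, wild_type, "GG", "GG"),
    "seed mutant": is_targeted(spacer, seed_mutant, "GG", "GG"),
    "distal mutant": is_targeted(spacer, distal_mutant, "GG", "GG"),
    "PAM mutant": is_targeted(spacer, wild_type, "GG", "GA"),
}
```

In this model the seed mutant and the PAM mutant escape, whereas the distal mutant is still targeted. A system without PAM or seed requirements, such as Type III, would correspond to dropping the first and third checks, which is why single point mutations are far less effective against it.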

Anti-CRISPR Proteins

In response to the risk of encountering a CRISPR-containing host, phages have evolved a variety of countermeasures. The most common of these occurs through the passive acquisition of random mutations in the phage genome at or near the protospacer sequence. Most CRISPR-Cas systems are particularly prone to this type of escape due to the presence of a seed region which is highly sensitive to mismatches, or the requirement for specific sequence motifs adjacent to it such as the PAM and PFS. Although Type III CRISPR-Cas systems do not have a seed or PAM requirement, the loss of the targeted gene in the phage constitutes a feasible mechanism of passive escape, provided the gene is dispensable for phage survival. Even though such mutations can be effective in deterring CRISPR-Cas immunity, they offer only a temporary solution since CRISPR-Cas systems can simply acquire new spacers from different parts of the phage genome to restore immunity. However, a more permanent solution to this problem lies in the diverse mechanisms of anti-CRISPR proteins. The majority of known anti-CRISPR proteins are small, ranging from 50 to 150 amino acids in length. They can be found encoded in the genomes of prophages, lytic phages, or bacteria (particularly within horizontally-transferred genomic islands). Since their initial identification in 2013, reports of CRISPR inhibitors have rapidly grown to encompass over 20 distinct families targeting three of the six currently known CRISPR-Cas Types (Types I, II and V). Owing to these inhibitors’ diversity in size, location, and specificity of targeting, various systematic approaches have proved necessary to elucidate each anti-CRISPR family and characterize their respective functions. The sections below highlight the seminal studies which reported the initial discoveries of anti-CRISPR proteins and the approaches that have since been applied to mine bacterial and phage genomes for these diverse CRISPR-Cas inhibitors.

Prophage-Dependent Functional Approach

Anti-CRISPRs were first reported in 2013 by Alan R. Davidson, Joseph Bondy-Denomy, and co-workers at the University of Toronto, Canada. The proteins were discovered in a group of related temperate phages that infect the bacterial pathogen Pseudomonas aeruginosa, which contains a Type I-F CRISPR-Cas system. The authors examined a suite of 44 P. aeruginosa lysogens which are genetically identical with the exception of the presence of a distinct prophage within each of their genomes. Although most prophage genes are typically repressed, the authors predicted that some prophages might encode one or more genes that inhibit CRISPR-Cas function. To test for the presence of such anti-CRISPR proteins, the investigators challenged the lysogens with three CRISPR-sensitive phages that should be targeted by the bacterium’s CRISPR-Cas system. As expected, the majority of the bacterial strains were able to thwart phage infection; however, a few of the prophage-containing strains failed to protect against the phages despite the presence of a functioning Type I-F CRISPR-Cas system. This suggested the lysogens harbored a prophage capable of inhibiting the CRISPR-Cas system. To identify the genetic elements responsible for this anti-CRISPR activity, the investigators compared the genomes of the CRISPR-inactivating prophages with the CRISPR-sensitive phages, and uncovered a set of five novel phage genes at a single genomic locus that were capable of inhibiting the Type I-F CRISPR-Cas system. These proteins, named AcrF1-AcrF5, constituted the very first anti-CRISPRs identified. In a second study published in 2014, the same group identified four additional anti-CRISPRs using the same approach; however, these proteins were active against the Type I-E CRISPR-Cas system, and correspondingly were named AcrE1-AcrE4. Notably, the Type I-E inhibitors were located at the same genomic locus as the Type I-F anti-CRISPRs.
The nine anti-CRISPRs identified to that date were found to exist in varying combinations across other phage genomes, suggesting that phages must acquire diverse CRISPR inhibitors to maintain their host ranges. Moreover, anti-CRISPRs were discovered outside of the prophages in genomic regions of P. aeruginosa that are likely mobile genetic elements, suggesting that while these proteins are subject to horizontal transfer, they are by no means restricted to transfer via phage. The first insights into anti-CRISPR mechanisms were reported shortly thereafter in 2015 by members of the Davidson group. They used biochemical analyses to demonstrate that AcrF1 and AcrF2 bind directly to different protein subunits of the Type I-F effector complex and inhibit its ability to bind to the targeted DNA. In contrast, AcrF3 was unable to associate with the Cas complex and was instead shown to bind and block the function of the Cas3 nuclease-helicase. Follow-up structural analyses published in 2016 by Yongqun Zhu, Yan Zhou, and co-workers at Zhejiang University, China, showed that two copies of AcrF3 bind to each Cas3 subunit and prevent its recruitment to the Cas complex. Now, as more and more anti-CRISPR mechanisms come to light, it is becoming apparent that the majority function in two general manners: by disrupting target binding and/or inhibiting target cleavage.

Guilt-by-Association Bioinformatic Approach

The lack of sequence homology between the first reported anti-CRISPRs initially hindered efforts to identify additional inhibitors through genomic comparisons alone. To bypass this challenge, April Pawluk and colleagues from the Davidson lab in 2016 considered the genomic context of the Type I-E and I-F inhibitors. They found that these anti-CRISPRs were consistently located upstream of a highly conserved anti-CRISPR associated gene, aca1, which encodes a predicted transcriptional regulator. Accordingly, aca1 was used as a genomic landmark in bioinformatic screenings across phage genomes to identify diverse anti-CRISPR proteins lying adjacent to it. This method, termed a guilt-by-association bioinformatic approach, was used to discover two additional anti-CRISPRs acting against Type I-F systems. Furthermore, one of these was the first example of an anti-CRISPR protein capable of inhibiting two distinct CRISPR systems, Types I-E and I-F. In the same study, the team identified a second transcriptional regulator, aca2, adjacent to which three additional Type I-F anti-CRISPRs were identified. Subsequently, anti-CRISPR homologs were discovered across all Proteobacteria housing the I-F CRISPR-Cas system. Some of these proteins inhibited I-F CRISPR systems of different bacterial species despite their loosely related Cas proteins, thus underscoring the potential for anti-CRISPRs to act as broad-range inhibitors. In a subsequent study published in 2016, a group led by Alan Davidson, Karen Maxwell, and Erik J. Sontheimer applied this guilt-by-association approach in conjunction with aca2 to identify the first Type II CRISPR-Cas inhibitors. Originally discovered in Neisseria meningitidis, this family of inhibitors was found to block the function of the bacterium’s Type II-C CRISPR-Cas system, and thus the three inhibitors were named AcrIIC1-AcrIIC3. These findings proved significant for two reasons: First, they demonstrated that anti-CRISPRs are not limited to Type I CRISPR-Cas systems. Second, since Type II CRISPR-Cas systems are renowned for their broad genomic applications, Type II anti-CRISPR proteins have the potential to be exploited as a powerful mechanism to regulate the activities of Type II CRISPR-based technologies. Indeed, all three AcrIIC proteins discovered successfully inhibited CRISPR-Cas9 mediated genome editing within human cell lines, supporting their potential use as tools to reduce spurious cleavage events in genome editing applications. The mechanisms of action for two of the AcrIIC proteins were reported in a study led by Jennifer Doudna’s group: One of these inhibitors was found to disrupt DNA binding by forcing the effector complex into a dimerized state, while the other was shown to bind and prevent Cas9 from cleaving its DNA target.
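
The logic of the guilt-by-association search can be sketched as a scan along an annotated gene list. The Python below is a minimal illustration under stated assumptions and does not reproduce the pipeline used in the studies cited: the `Gene` record, the function name, the gene identifiers, and the toy genome are hypothetical; the 150-amino-acid cutoff is borrowed from the size range quoted earlier for known anti-CRISPRs; and gene order in the list is taken to run upstream to downstream, ignoring strand.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str        # locus identifier (hypothetical)
    length_aa: int   # length of the encoded protein in amino acids
    annotation: str  # functional annotation, or "hypothetical" if unknown


def candidates_upstream_of(genome, landmark="aca1", max_len_aa=150):
    """Return small, unannotated genes lying immediately upstream of a landmark.

    `genome` is an ordered list of Gene records. A gene qualifies when the
    next gene in the list carries the landmark annotation.
    """
    hits = []
    for i, gene in enumerate(genome):
        if gene.annotation != landmark or i == 0:
            continue
        upstream = genome[i - 1]
        if upstream.annotation == "hypothetical" and upstream.length_aa <= max_len_aa:
            hits.append(upstream.name)
    return hits


# Toy phage genome: one small unknown gene sits just upstream of aca1.
toy_genome = [
    Gene("gp01", 420, "capsid"),
    Gene("gp02", 95, "hypothetical"),
    Gene("gp03", 70, "aca1"),
    Gene("gp04", 310, "hypothetical"),
    Gene("gp05", 650, "tail fiber"),
]

candidates = candidates_upstream_of(toy_genome)
```

Any candidate returned by such a scan would still need to be tested experimentally for CRISPR-Cas inhibition; the landmark only narrows the search.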

Self-Targeting Bioinformatic Approach

A distinct bioinformatics-guided method was used to identify additional anti-CRISPR genes acting against the Type II-A CRISPR-Cas system of Listeria monocytogenes in a 2017 study led by Joseph Bondy-Denomy, now at the University of California in San Francisco. This methodology, termed a self-targeting bioinformatic approach, relies upon the prediction that the CRISPR-mediated acquisition of spacers from within a bacterium’s genome would result in chromosomal cleavage by the endogenous CRISPR system and death of the cell. However, the presence of anti-CRISPR proteins within the same genome is expected to prevent such self-targeting. By identifying bacteria with self-targeting spacers, the investigators were able to discover four prophage-encoded Type II-A CRISPR-Cas inhibitors, two of which were found to block the function of the Type II-A CRISPR-Cas system of Streptococcus pyogenes, the very first system exploited for genome editing. The remarkable mechanism of one of these inhibitor proteins, named AcrIIA4, was elucidated in 2017 by three independent groups headed by Dinshaw Patel, Zhiwei Huang, and Jennifer Doudna, respectively. Using structural analyses, these investigators discovered that AcrIIA4 structurally mimics the PAM and binds the PAM-interacting site on Cas9, thereby blocking Cas9’s ability to bind a bona fide target. In 2018, a similar bioinformatic-guided approach was used to identify the very first Type V-A CRISPR-Cas inhibitors – these were reported in two independent studies led by Jennifer Doudna and Joseph Bondy-Denomy, respectively. Altogether, five different inhibitors were identified and experimentally confirmed to function against Type V-A interference. Notably, some of these inhibitors were shown to successfully modulate editing in human cells, again highlighting their utility for use as tools to regulate Type V-A mediated genome editing applications.
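
The first step of the self-targeting approach, flagging genomes whose own CRISPR spacers match their chromosome, can be sketched as follows. This Python example is a simplified illustration and not the published method: the function names, the repeat and spacer sequences, and the toy chromosome are hypothetical, only exact matches are considered, and PAM requirements are ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def self_targeting_spacers(chromosome, spacers, array_span):
    """Return spacers with an exact match elsewhere in the same chromosome.

    `array_span` is the (start, end) interval occupied by the CRISPR array,
    which is masked with "N" so that each spacer does not simply match
    its own copy inside the array.
    """
    start, end = array_span
    outside = chromosome[:start] + "N" * (end - start) + chromosome[end:]
    return [
        s for s in spacers
        if s in outside or reverse_complement(s) in outside
    ]


# Toy genome: the first spacer also occurs outside the array (self-target).
repeat = "GGGAAA"
spacers = ["GACCAGTC", "TTTTCCCC"]
array = repeat + spacers[0] + repeat + spacers[1] + repeat
prefix = "ATAT" + spacers[0] + "CGCG"
chromosome = prefix + array + "ATATAT"
span = (len(prefix), len(prefix) + len(array))

flagged = self_targeting_spacers(chromosome, spacers, span)
```

A genome that survives despite carrying a flagged spacer is then a candidate for harboring an anti-CRISPR gene, for instance within a resident prophage.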

Lytic Phage-Dependent Functional Approach

In 2017 and 2018, Sylvain Moineau and co-workers used a functional screen to identify additional Type II-A inhibitors within lytic phages. This approach involved engineering a Type II-A CRISPR-Cas system in Streptococcus thermophilus to contain spacers targeting an array of lytic phages, and then challenging the strains with each phage – those capable of reproducing within such a bacterial context were presumed to contain one or more anti-CRISPR proteins. Two such phages were identified, and further experiments in which the phages’ genes were methodically tested for anti-CRISPR activity identified AcrIIA5, a broadly acting inhibitor of S. pyogenes and S. thermophilus CRISPR-Cas9 systems. Capitalizing on this successful approach, a second anti-CRISPR, AcrIIA6, was discovered despite its failure to appear concomitantly with any single aca gene in available phage genomes. Furthermore, homologs of AcrIIA5 or AcrIIA6 were identified in 38% of virulent S. thermophilus phages, indicating the prevalence of anti-CRISPRs within virulent phage genomes.

Given the widespread distribution of CRISPR inhibitors discovered within bacterial phages, it logically follows that archaeal viruses would also house anti-CRISPR genes. In fact, since 90% of sequenced archaea maintain CRISPR-Cas systems, it would be expected that these inhibitors might prove even more widespread amongst archaeal viruses. While anti-CRISPRs specific to archaea have yet to be thoroughly studied and characterized, a study led by Xu Ping published in 2018 identified the first example of an anti-CRISPR protein endemic to the lytic rudiviruses SIRV2 and SIRV3. This anti-CRISPR, AcrID1, was found to inhibit the activity of Cas10d, which is necessary for interference mediated by the Type I-D CRISPR-Cas system in Sulfolobus islandicus.
Moreover, it was hypothesized that homologs of the newly identified archaeal anti-CRISPR are likely to be present across a diverse range of archaeal viruses, as has been observed for bacterial anti-CRISPRs.

Concluding Thoughts

The microscopic war that has been raging for billions of years between prokaryotes and their viruses has given rise to an immense pool of genetic diversity, with CRISPR-Cas and anti-CRISPRs representing just two manifestations of the countless defense and counter-defense systems employed on the battlefield. Historically, the detailed investigation of such molecular weaponry has provided a fertile ground for discovery and innovation, and CRISPR-Cas research is no exception. Over the past decade, basic insights into CRISPR-Cas mechanisms have inspired the development of powerful genetic technologies, which, in turn, have propelled basic discoveries at an accelerated pace. A similar positive feedback loop is beginning to emerge for anti-CRISPR proteins, in which the continued discovery of new families and elucidation of their mechanisms are expected to not only deepen our understanding of the evolutionary trajectories of phages and their hosts, but also impact biotechnological and biomedical fields of research. It is anticipated that in the coming years, the collection of distinct CRISPR-Cas systems will continue to expand, and anti-CRISPRs will eventually be discovered for all CRISPR-Cas Types. Beyond this, the presence of CRISPR inhibitors implies the existence of anti-CRISPR inhibitors. Though no such actors have been identified within CRISPR-containing bacterial species to date, it would not come as a surprise to see the Red Queen Hypothesis play out in this manner given such a fertile venue.

Further Reading

Cyranoski, D., Ledford, H., 2018. International outcry over genome-edited baby claim. Nature 563, 607–608.
Doudna, J.A., Charpentier, E., 2014. The new frontier of genome engineering with CRISPR-Cas9. Science 346 (6213), 1258096.
Hatoum-Aslan, A., 2018. Phage genetic engineering using CRISPR–Cas systems. Viruses 10 (335), 1–11.
Hille, F., Richter, H., Wong, S.P., et al., 2018. The biology of CRISPR-Cas: Backward and forward. Cell 172, 1239–1259.
Hsu, P.D., Lander, E.S., Zhang, F., 2014. Development and applications of CRISPR-Cas9 for genome engineering. Cell 157 (6), 1262–1278.
Keen, E.C., 2015. A century of phage research: Bacteriophages and the shaping of modern biology. BioEssays 37 (1), 6–9.
Klompe, S.E., Sternberg, S.H., 2018. Harnessing a billion years of experimentation: The ongoing exploration and exploitation of CRISPR–Cas immune systems. The CRISPR Journal 1 (2), 141–158.
Koonin, E.V., Makarova, K.S., Zhang, F., 2017. Diversity, classification and evolution of CRISPR-Cas systems. Current Opinion in Microbiology 37, 67–78.
Mohanraju, P., Makarova, K.S., Zetsche, B., et al., 2016. Diverse evolutionary roots and mechanistic variations of the CRISPR-Cas systems. Science 353 (6299), aad5147.
Pawluk, A., Davidson, A.R., Maxwell, K.L., 2018. Anti-CRISPR: Discovery, mechanism and function. Nature Reviews Microbiology 16 (1), 12–17.
Reardon, S., 2016. The CRISPR zoo. Nature 531, 160–163.
Stanley, S.Y., Maxwell, K.L., 2018. Phage-encoded anti-CRISPR defenses. Annual Review of Genetics 52, 445–464.
Stern, A., Sorek, R., 2011. The phage-host arms race: Shaping the evolution of microbes. BioEssays 33 (1), 43–51.

Relevant Website

https://www.ncbi.nlm.nih.gov/pubmed/?term=CRISPR Genome engineering using CRISPR-Cas9 system.

Bacteriophage: Therapeutics and Diagnostics Development
Teng-Chieh Yang, Scarsdale, NY, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Antibiotic resistance The ability of microorganisms to grow at high concentrations of an antibiotic.
Antibiotic susceptibility test The test to examine the microbial susceptibility to antibiotics of choice under varying concentrations of antibiotics.
Chemistry, manufacturing, and control The activities for development, manufacturing processes, quality control, and life cycle management of a drug product, to ensure the safety, efficacy, and manufacturing consistency of the product.
CRISPR/Cas CRISPR (clustered regularly interspaced short palindromic repeats) is a family of DNA sequences found in the genomes of prokaryotic organisms, originally derived from DNA fragments of bacteriophages that previously infected the prokaryote. Cas (CRISPR-associated protein) is a family of enzymes that use CRISPR sequences as a guide to recognize and cleave specific strands of DNA that are complementary to the CRISPR sequence. CRISPR and Cas together form the basis of a technology for editing genes within organisms.
Lysogenic cycle A life cycle of bacteriophages where most of the infecting bacteriophages integrate the genome into the host genome, being passively replicated with bacterial genome as the host replicates. Some bacteriophage genomes exist as plasmids with separated replication and segregation systems, synchronizing the plasmid replication with host replication.
Lytic cycle A life cycle of bacteriophages where the infecting bacteriophage takes control of host cellular machinery to produce its own progeny and ultimately kills the host, releasing new bacteriophages to kill more hosts.
Mass spectrometry An analytical technique that measures the mass-to-charge ratio of ions from the analytes.
Molecular diagnostics An array of diagnostic techniques for analyzing the genomic and proteomic biomarkers using molecular biology tools.
Phage display A technology to study protein–protein, protein–peptide, and protein–nucleic acid interactions using bacteriophages as the vehicle to express proteins and connect proteins with the corresponding genetic information.
Phage typing A method to trace the source of outbreaks of bacterial infections by detecting single strains of bacteria with lytic bacteriophages.

Introduction

Antibiotics are among the most important life-saving drugs for humankind. Human exposure to antibiotics dates back to 350–550 Common Era: trace amounts of tetracycline have been found in human skeletal remains from that period, likely due to the consumption of tetracycline-containing food. Traditional Chinese medicine practiced at least 1000 years ago was also found to have antibiotic activities. The modern “antibiotic era” started with the discovery of penicillin by Alexander Fleming in 1928 and, 15 years later, its mass production and introduction to the public for treating bacterial infections. In the following decades, from the 1950s to the 1970s, the discovery and development of novel antibiotics flourished and revolutionized medicine by reducing global morbidity and mortality. Bacterial resistance to antibiotics was reported shortly after their discovery and became prevalent after their mass introduction. Resistance to penicillin became widespread 24 years after its commercialization, whereas resistance to methicillin was identified in the same year as its introduction (Fig. 1). Most of the resistance is attributed to proteins expressed in bacteria, which counteract antibiotics via different mechanisms including efflux pumping and enzymatic cleavage. The misuse and overuse of antibiotics further accelerate resistance by creating a selective pressure that kills susceptible bacterial strains and allows resistant strains to thrive. Resistance became a serious problem after the 1980s, as bacteria developed resistance to new antibiotics more rapidly while the development of new antibiotics slowed dramatically, with only three new classes of antibiotics marketed after 2000 (Fig. 1). Over the last decade, antibiotic-resistant (AR) bacteria have been recognized as a global threat and are projected to cause more annual global deaths than cancer by 2050.
Each year in the US, more than 2.8 million people are infected with AR bacteria, and at least 35,000 people die directly from the infection. The annual nationwide cost to treat hospitalized patients with AR bacterial infections is estimated between $30 and $50 billion (between 2012 and 2015). These infections, if not diagnosed early and treated with effective drugs, can lead to severe diseases such as sepsis, which has a high mortality. The current standard diagnostic method for bacterial infections is to amplify bacterial pathogens by culture (i.e., bacterial culture), followed by species identification with biochemical tests or mass spectrometry analysis. Once the pathogens are identified, a separate Antibiotic Susceptibility Test (AST) is used to select effective treatments, for a total testing time of two to five days from sampling to AST results. Patients with acute infections (e.g., bloodstream infection) cannot afford such a prolonged testing time, so physicians have no choice but to treat these patients empirically with broad-spectrum antibiotics before the test results are available, which accelerates resistance to last-resort drugs. Several molecular diagnostics have emerged for rapid bacterial identification;


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00012-6


Fig. 1 Timeline of antibiotic introduction and the identification of antibiotic-resistant bacteria from 1940 to 2015. Data modified from the Centers for Disease Control and Prevention (https://www.cdc.gov/drugresistance/about.html).

however, all of them still require a preliminary bacterial culture, a process that takes at least 8–12 h to amplify culturable bacteria. It is almost impossible to detect bacteria that are difficult to culture or are unculturable. Therefore, we need to develop rapid and accurate diagnostic tools and combine them with evidence-based treatments to combat AR bacteria. In the past decade, bacteriophages (phages) have gained attention from the scientific community as an alternative therapy for AR bacterial infections. Phages are viruses that infect bacteria and follow one of two life cycles, lytic or lysogenic. In the lytic cycle, the infecting phage takes control of host cellular machinery to produce its own progeny and ultimately lyses (i.e., kills) the host bacterium, releasing new phages to kill more bacteria. In the lysogenic cycle, most infecting phages become prophages by integrating their genome into the host bacterial genome, where it is passively replicated along with the bacterial genome as the host replicates. If a bacterium carrying a prophage is exposed to environmental stress (e.g., UV), the prophage may be excised from the host genome and enter the lytic cycle, a process called induction. Phages hold great potential as a new tool for detecting and treating AR bacterial infections because of two major advantages: specificity and self-replication. However, there are at least two major limitations: narrow host range and resistance of the bacteria to phages. Many excellent reviews describing the medical application of phages have been published in the last 10 years. The goal of this article is to discuss the history of phage development as therapeutic and diagnostic tools and to dissect the major advantages and limitations of phage technology associated with therapeutic and diagnostic applications.

The Development of Phage as Therapeutics for Bacterial Infections

Phages were independently discovered by two microbiologists, Frederick Twort in 1915 and Félix d’Hérelle in 1917, and named “bacteriophage” after the idea of a “bacteria-eater.” Shortly after his discovery of Shigella phages, d’Hérelle recognized the potential of using phages to treat bacterial infections (i.e., phage therapy) and successfully applied lytic phages to treat chickens infected with Salmonella gallinarum. This success led to the first human clinical trial in 1921, in which five patients with bacillary dysentery were treated with Shigella phages and recovered. In 1927, additional clinical trials were conducted in India for treating cholera with phages, which showed an over 50% decrease in mortality compared to the control groups. After the initial success of clinical trials, more scientists began to test phage therapy for other bacterial infections. The outcomes of these tests, however, were mixed and raised criticisms of the study design and the quality of the phage products. Enthusiasm for phage therapy waned after the discovery and mass production of penicillin gained popularity in the 1940s, shifting the pharmaceutical focus to finding and developing novel antibiotics as the treatment for bacterial infections. Research on phages instead moved towards the fundamental understanding of phage biology, with several phages (e.g., λ, T4, and M13) becoming powerful model systems for studying gene regulation, protein engineering, and restriction modification, among others, and transforming the knowledge learned from phages into modern biology disciplines including molecular biology and genetics.
This shift of focus continues to place technologies developed from phages at the center of modern drug development, including the introduction of “phage display” technology in 1985, which created bestselling protein therapeutics, and the more recent discovery of the “CRISPR/Cas system,” which holds great potential to be the next generation of precision therapy through its targeted genome editing capability. Despite the success of phage-derived technologies, phage therapy has received mixed interest around the globe. From the 1940s to the 1990s, Western countries largely abandoned the idea of phage therapy, whereas the former USSR still extensively exploited its capability through the 1970s. Today, phage therapy is still available as a treatment option for AR bacterial infections in the Republic of Georgia. A notable phage therapy center in the Republic of Georgia, the Eliava Institute, has developed several phage products for treating septic, topical, and intestinal bacterial infections. Although these products were subjected to clinical trials and are commercially available in Eastern Europe, they are not approved elsewhere, because the standards of clinical trials and pharmaceutical manufacturing procedures in the former USSR and the Republic of Georgia do not comply with international regulations. With the emergence of multi-drug resistant bacterial strains and the slowed development of new antibiotics, Western countries have a renewed interest in phage therapy.


Fig. 2 Interest in phage therapy as indicated by the number of review articles from 1991 to 2019 (data extracted from PubMed).

Table 1 Overview of randomized and placebo-controlled clinical trials for phage therapy

| Industry sponsor | Indication | Target bacterial species | Phase | Study number | Year | Product | Number of phages in the product | Evaluated patient population | References |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intralytix | Venous leg ulcers | Pseudomonas aeruginosa, Staphylococcus aureus, Escherichia coli | I | WIRB, protocol# 20061649 | 2007–2008 | WPP-201 | 8 | 39 | Rhoads et al. (2009) |
| Biocontrol Limited | Chronic otitis | Pseudomonas aeruginosa | I/II | 2004–001691–39 | 2009 | Biophage-PA | 6 | 24 | Wright et al. (2009) |
| Nestle | Pediatric diarrhea | Escherichia coli | I | NCT00937274 | 2016 | T4-like coliphages | 11 | 115 | Sarker et al. (2016) |
| Pherecydes Pharma SA | Burn wounds | Pseudomonas aeruginosa | I/II | 2014–000714–65 | 2018 | PP1131 | 12 | 25 | Jault et al. (2019) |

This interest has grown gradually, as indicated by the steady increase in published review articles discussing phage therapy in Western scientific journals (Fig. 2). Additionally, four randomized, placebo-controlled Phase I/II clinical trials were conducted to address one of the biggest criticisms of phage therapy: suboptimal design of clinical trials (Table 1). All four trials pointed towards promising safety; however, three failed to demonstrate the superiority of phage treatment compared to controls or standard of care. In contrast, several case studies of compassionate treatments showed positive outcomes. Phages appear to have a niche in treating some bacterial infections, but many questions remain unsolved, from the fundamental mechanisms of action to the manufacturing and regulatory pathway. More high quality data are needed to turn phage therapy into a reality.

The Development of Phage as Diagnostics for Bacterial Infections

Current diagnostic tools for bacterial infections primarily rely on bacterial culture combined with (1) secondary biochemical tests or mass spectrometry analysis to identify the pathogen species or (2) molecular diagnostics to detect species-specific and bacterial virulence genes. Both methods require at least 8–12 h to amplify the pathogens and a separate AST for selecting treatments. An ideal diagnostic tool should detect only viable bacteria, identify species, and provide treatment recommendations from a single test. If test results were available in one to three hours, physicians could initiate evidence-based treatments and minimize the empirical use of broad-spectrum antibiotics. Phages have been used to detect bacteria since the development of phage typing in the 1950s, in which bacterial strains are typed (i.e., classified) by the plaques that a panel of lytic phages creates on the bacterial lawn. This technique is still used in some hospitals to monitor bacterial outbreaks by detecting clusters of virulent strains. A reporter phage was later developed to detect Mycobacterium tuberculosis, a slow-growing bacterial species, by incorporating a luciferase gene into the phage genome. After phage infection, the luciferase is expressed within minutes and its activity is detected by the addition of luciferin, dramatically reducing the diagnostic time from weeks to one day. Importantly, this reporter phage is compatible with AST, allowing the selection of effective antibiotic treatments and the screening of new anti-tuberculosis drugs. With a better understanding of phage genomics and structure, the engineering effort turned to the use of phage capsids to display functional molecules on the surface of phage virions, enabling the detection of phage-infected bacteria either through conjugated fluorophores or through a direct fluorescent signal from the displayed molecules (Table 2). Despite extensive research on


Table 2 Summary of modern diagnostic phages

| Technology | Target bacteria | Diagnostic phage | Signal detection | Limit of detection | Detection time | Compatible with AST | References |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Luciferase | Mycobacterium tuberculosis | TM4 | Luminescence | 500–5000 cells | <24 h | Yes | Jacobs Jr. et al. (1993) |
| Fluorescent protein on phage | E. coli | T4 | Fluorescence | NA | 1 h | NA | Tanji et al. (2004) |
| Biotinylation coupled with quantum dots | E. coli | T7 | Fluorescence | 10 cells/mL | 1 h | NA | Edgar et al. (2006) |
| Fluorescent protein in bacteria | Mycobacterium tuberculosis | TM4 | Fluorescence | <100 cells | <24 h | Yes | Piuri et al. (2009) |
| Luciferase | E. coli O157:H7 | ΦV10 | Luminescence | 5 CFU | 7 h | NA | Zhang et al. (2016) |
| Biotinylation coupled with fluorescence nanodiamonds | E. coli | λ | Fluorescence | NA | NA | NA | Trinh et al. (2018) |

Note: NA, not available.

diagnostic phages, very few products of this type have moved into clinical testing or become commercially available, owing to some practical challenges associated with phages. Below we discuss the major advantages and limitations of phage technology.

Major Advantages of Phage Technology

Specificity

Although phages with a broader host range do exist, a phage species is typically limited to infecting one bacterial species, or even just a few lineages within a species. This specificity is driven by phage adsorption, a process that generally consists of three steps: initial contact, reversible binding, and irreversible attachment to the target bacteria. The last step is mediated by a specific interaction between a surface receptor, which is often specific to the bacterial species, and the receptor binding protein on the phage virion. This allows accurate detection of target bacteria by diagnostic phages and minimal impact on commensal bacteria by therapeutic phages.

Self-Replication

Lytic phages replicate inside the target bacteria, kill the bacterial host, and release dozens to hundreds of progeny, which continue the bacterial killing. This self-replicating nature allows the sustained release of therapeutic phages from the target bacteria, potentially reducing the dosing amount and frequency. Because phages replicate only in live bacteria, infection with diagnostic phages enables the detection of live pathogens.
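The arithmetic behind self-replication can be made concrete with a short sketch. It assumes an idealized setting that the text does not specify: every released phage finds a new host, each infection releases a fixed burst, and susceptible hosts never run out. The function name and the burst size of 100 are illustrative choices, not measured values.

```python
def phage_after_cycles(initial_phages: int, burst_size: int, cycles: int) -> int:
    """Idealized phage count after a number of lytic cycles.

    Assumes unlimited susceptible hosts and that every phage infects
    one bacterium per cycle (an illustrative simplification).
    """
    if initial_phages < 0 or burst_size < 0 or cycles < 0:
        raise ValueError("all arguments must be non-negative")
    return initial_phages * burst_size ** cycles


# One phage with a hypothetical burst size of 100, followed over three cycles.
for cycle in range(4):
    print(cycle, phage_after_cycles(1, 100, cycle))
```

In a real infection the growth stops once susceptible bacteria are depleted, so this geometric series is only an upper bound on amplification.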

Major Limitations of Phage Technology

Narrow Host Range

A phage’s host range is defined as the spectrum of bacteria that it infects. Some phages have a narrow host range and infect only a few strains within a bacterial species, limiting their ability to treat or diagnose infections caused by single or multiple bacterial species. While this limitation has been overcome by using multiple phage species to expand the range of host targets, new challenges emerge for therapeutic applications, as discussed later in this article.
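The idea of combining phages to widen coverage can be sketched as a small set-cover exercise. Everything in the example is hypothetical: the phage and strain names, the host-range table (which in practice would come from spot or plaque assays), and the greedy selection rule, which is a generic heuristic rather than a method described in this article.

```python
from typing import Dict, List, Set


def cocktail_coverage(host_range: Dict[str, Set[str]], cocktail: List[str]) -> Set[str]:
    """Return the strains infected by at least one phage in the cocktail."""
    covered: Set[str] = set()
    for phage in cocktail:
        covered |= host_range[phage]
    return covered


def greedy_cocktail(host_range: Dict[str, Set[str]], targets: Set[str]) -> List[str]:
    """Greedily pick phages until no remaining phage adds a new target strain."""
    remaining = set(targets)
    chosen: List[str] = []
    while remaining and host_range:
        # Pick the phage that infects the most still-uncovered strains.
        best = max(sorted(host_range), key=lambda p: len(host_range[p] & remaining))
        gained = host_range[best] & remaining
        if not gained:
            break  # leftover strains are not infected by any phage in the panel
        chosen.append(best)
        remaining -= gained
    return chosen


# Hypothetical host-range data: phage name -> strains it infects.
HOST_RANGE = {
    "phage_A": {"strain_1", "strain_2"},
    "phage_B": {"strain_2", "strain_3", "strain_4"},
    "phage_C": {"strain_5"},
}
TARGETS = {"strain_1", "strain_2", "strain_3", "strain_4", "strain_5", "strain_6"}

cocktail = greedy_cocktail(HOST_RANGE, TARGETS)
uncovered = TARGETS - cocktail_coverage(HOST_RANGE, cocktail)
print(cocktail, sorted(uncovered))
```

A strain left uncovered, such as the made-up strain_6 here, signals that the phage panel itself must be extended before any cocktail can reach it.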

Phage Resistance

Bacteria and phages co-exist in an evolutionary arms race, in which a bacterial species may already have, or will inevitably develop, resistance against selected phages. Pre-existing phage resistance will increase false negative results in diagnostic applications and decrease efficacy in therapeutic applications, while acquired phage resistance may be difficult to monitor and control in humans treated with therapeutic phages.

Additional Challenges for Phage Therapy

Preclinical Data Translation

Translating results from preclinical models to humans is a significant and common challenge for drug development. Early studies showed that in vitro phage experiments did not always give the same results as in vivo studies. Additionally, the


in vivo efficacy from one study may not be consistent with that from another study or translatable to efficacy in humans. This is largely due to poor study design and biological differences between species. High quality preclinical data from in vitro and representative in vivo models are essential for the approval of phage therapy.

Safety and Efficacy

Phages are generally considered to be safe for humans because no major adverse events were reported in previous clinical trials. In animal models, circulating phages have also been shown to be rapidly removed by the liver and spleen in the absence of target bacteria. Recent studies suggested that, while phages can enter human cells via at least three mechanisms (phagocytosis, transcytosis, and receptor-mediated endocytosis), the internalized phage virions were degraded within one day without affecting the viability of the cells. However, it is not clear how phages interact with human immune cells. Such interaction may trigger immunogenicity with specific responses against therapeutic phages, leading to the development of anti-phage antibodies. The clinical impact of these antibodies ranges from no apparent effect, to reduced efficacy through neutralization and/or elimination of phages from the body, to safety concerns arising from induced allergic responses. To ensure the safety and efficacy of phage therapy, more studies are needed to understand the potential immunogenicity triggered by phages.

Dosing Strategies

To maximize the therapeutic activity of phages, several aspects of dosing should be considered, namely dose determination, dose interval, and route of administration, in order to improve two common pharmacological properties: (1) pharmacokinetics, the fate of phages after being administered to humans, and (2) pharmacodynamics, the impact of administered phages on humans. These properties are more complex for phage therapy than for antibiotics or protein therapeutics because of the population dynamics (arising from the self-replicating nature of phages) and the evolutionary dynamics between phages and bacteria. Several population dynamic models have been developed to predict short-term dynamics (i.e., days) for in vitro phage-bacteria interactions; however, it is still challenging to model these interactions in humans over a clinically relevant time frame (i.e., weeks to months). Additionally, most of the models were developed for systems with a single phage and a single bacterial species; there is limited understanding of systems with multiple phages (i.e., a phage cocktail) and multiple bacterial species. A unique aspect to consider for phage therapy is the dosage form: single phage versus phage cocktail. Phage cocktails were developed to expand the host range and overcome phage resistance from bacterial mutants. While the cocktail sometimes outperforms the single phage treatment, it remains difficult to determine the mechanisms of action and the associated synergy (if any) of the cocktail. In-depth understanding of the population and evolutionary dynamics among cocktail phages and the target bacteria may facilitate a more rational design of phage cocktails and prevent phage resistance. The rise of bacterial phage resistance in human clinical trials has been shown to be alleviated by subsequent treatments with different phages selected for their ability to propagate on the phage-resistant bacteria.
This selection process typically continues until the patient shows no symptoms, making the entire dosing process “personal”, as the next dosing regimen depends on the bacteria that have evolved in a given patient. The major advantage of this process is the use of targeted phages. However, the selection process is time-consuming and labor intensive, with several manufacturing and regulatory challenges, as discussed below.
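The kind of short-term, single-phage, single-species population model mentioned above can be sketched in a few lines. The version below is a deliberately minimal illustration and not a model taken from the cited literature: it assumes logistic bacterial growth, mass-action adsorption, instantaneous lysis with a fixed burst size, and first-order phage decay, and it ignores the latent period, resistant mutants, and host immune clearance. All parameter values and names are hypothetical.

```python
def simulate(bacteria0, phages0, hours, dt=0.001,
             growth_rate=1.0, capacity=1e9,
             adsorption=1e-9, burst_size=100.0, phage_decay=0.1):
    """Forward-Euler integration of a two-variable phage-bacteria model.

    dB/dt = growth_rate * B * (1 - B / capacity) - adsorption * B * P
    dP/dt = burst_size * adsorption * B * P - phage_decay * P
    Units are assumed to be per mL and per hour (illustrative only).
    """
    bacteria, phages = float(bacteria0), float(phages0)
    for _ in range(int(round(hours / dt))):
        infections = adsorption * bacteria * phages
        d_bacteria = growth_rate * bacteria * (1.0 - bacteria / capacity) - infections
        d_phages = burst_size * infections - phage_decay * phages
        # Clamp at zero so a coarse step cannot produce negative densities.
        bacteria = max(bacteria + d_bacteria * dt, 0.0)
        phages = max(phages + d_phages * dt, 0.0)
    return bacteria, phages


# Compare an untreated culture with one dosed at a hypothetical 1e4 phages/mL.
untreated, _ = simulate(1e6, 0.0, hours=24)
treated, phage_titer = simulate(1e6, 1e4, hours=24)
print(f"untreated: {untreated:.3g} cells/mL")
print(f"treated:   {treated:.3g} cells/mL, phage: {phage_titer:.3g} per mL")
```

Because the sketch has no resistant subpopulation, the treated culture simply collapses; adding a second, phage-insensitive bacterial variable is the natural next step for exploring the resistance and cocktail questions raised in the text.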

Chemistry, Manufacturing, and Controls (CMC)

CMC are the activities for the development, the manufacturing processes, the quality control, and the life cycle management of a drug product, to ensure its safety, efficacy, and manufacturing consistency. Several CMC challenges have been identified for phage therapy. (1) High levels of bacterial endotoxin in the drug product and loss of infectivity during shelf-life storage were reported for phages used in recent clinical trials, raising concerns about product quality. (2) Phage therapy has been evaluated either as a personalized treatment process (i.e., personalized phage products over the course of treatment) or as a single off-the-shelf product (i.e., a fixed composition of phages in a single drug product). Off-the-shelf products have a clear CMC pathway that is likely to be similar to that of biologics; however, bacteria will eventually develop phage resistance. Personalized treatments, on the other hand, may be more effective in reducing phage resistance; however, there is no clear general strategy to control the product quality and establish shelf-life in a timely manner for each treatment cycle.

Regulatory Pathway

There is no approved phage therapy in Western countries; therefore, the regulatory pathway is not clear. To engage with phage therapy developers, the US Food and Drug Administration and National Institutes of Health hosted workshops to discuss current drug approval processes and other considerations for phage therapy (e.g., phage therapy workshop, Rockville MD, 2017). Experts from both agencies also participated in public phage conferences to share their perspective on phage therapy (e.g., phage congress, DC, 2019). The message from these agencies suggested that, in general, the regulatory pathway will be similar to that of biologics, and that there is a need for more high quality preclinical and clinical data to demonstrate the safety, efficacy, and manufacturing consistency of therapeutic phages.


Environmental Impact

As the antibiotic industry has shown, mass production (>100,000 tons per year) causes overwhelming pollution in the environment and accelerates antibiotic resistance. Mass production of phages will inevitably create a similar (if not worse) level of environmental impact, from toxic byproducts of manufacturing to evolutionary pressure on natural bacterial populations. If phage therapy becomes widely available for treating bacterial infections, it will be essential to have proper regulation of phage production and surveillance programs dedicated to monitoring the environmental impact, in order to avoid the unintended consequences seen with the antibiotic industry.

Additional Challenges for Phage Diagnostics

Phage Manipulation

To improve sensitivity, modern diagnostic phages were developed by genetic manipulation of the phage genome, which requires careful selection of the target location on the phage genome for editing and the discovery of new functional molecules. A poor engineering strategy will result in damaged phages with little infectivity compared to the wild type phages.

Multiplex Testing

Many bacterial infections are caused by different species; therefore, the ability to test for multiple bacterial species at once (i.e., multiplex testing) is critical for improving diagnostic efficiency and reducing the sample requirement. Current diagnostic phages were designed to target a single bacterial species with a single detection channel on the functional molecules. With few functional molecules (e.g., fluorophores) available for engineering, it is challenging to apply diagnostic phages to multiplex testing.

Intracellular Bacteria

Several bacterial pathogens replicate inside human cells and move between neighboring cells; they are often difficult to identify unless they break out and cause symptoms. Moller-Olsen and co-workers recently showed that K1F phage can enter human epithelial cells and kill intracellular E. coli. It is not clear whether this behavior is universal across different human cells, phages, and bacterial species. More studies under clinically relevant conditions are needed to understand the potential of using diagnostic phages to detect intracellular bacteria.

Sample Interference

Sample interference is a common issue in traditional bacterial culture and modern molecular diagnostics. The interference may come from contaminating bacteria picked up during sample collection or from the sample matrix, and it affects the test sensitivity, specificity, and predictive values. Some interference may be partially mitigated by repeated tests; however, not all patients (e.g., neonates) can provide the additional samples needed for re-testing. Again, future work should address the performance of diagnostic phages under clinically relevant conditions and compare the assay performance to current standard techniques.
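The performance measures named above follow directly from a two-by-two comparison of a test against a reference method such as culture. The sketch below uses the standard definitions; the function name and the counts are made up for illustration and do not come from any study cited in this article.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, PPV and NPV from a 2x2 comparison
    against a reference method (tp/fp/tn/fn are raw counts)."""
    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN rather than raising when a margin of the table is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positives among infected samples
        "specificity": ratio(tn, tn + fp),  # true negatives among uninfected samples
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Hypothetical counts: 100 infected and 1000 uninfected samples.
metrics = diagnostic_metrics(tp=90, fp=50, tn=950, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts, contaminating bacteria that add false positives would lower specificity and the positive predictive value, while matrix effects that suppress the signal would add false negatives and lower sensitivity.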

Market Acceptance

Bacterial culture has been used for several decades as the gold standard for bacterial diagnosis; therefore, customers may be reluctant to switch to other methods. Current diagnostic phages rely on luminescence detection or fluorescence imaging analysis; both are low throughput and labor intensive. To compete with bacterial culture, diagnostic phages should be developed into a high-throughput assay that requires minimal labor and is compatible with current diagnostic workflows in hospitals and clinical testing laboratories.

Conclusions

Interest in using phages to fight AR bacterial infections remains high. As noted above, the emergence of AR bacteria and the shortage of new antibiotics prompt the search for alternatives. Phages are approved as prophylactic agents against bacterial contamination in food products in the US and several other countries; therefore, they have the potential to serve as alternative diagnostics and therapeutics for bacterial infections. Successful phage treatments of patients who were otherwise untreatable with antibiotics have gained attention from the media and ignited passion to pave the regulatory approval path for phage therapy. However, challenges remain at every level of development. For a single phage product, bacterial phage resistance is inevitable. It is not clear how fast bacteria will develop such resistance or what impact phage resistance will have on bacterial virulence. Will


the co-evolutionary arms race between phages and bacteria exacerbate bacterial virulence or reduce bacterial fitness? For phage cocktails, the mechanism of bacterial killing is not clear, as the individual and synergistic efficacy is difficult to dissect. An important question remains: how to rationally design the treatment with a single phage product, a phage cocktail, or a combination of phage(s) with antibiotics? We also have limited knowledge of human immune responses to phages and, similarly, of the interaction between human cells and phages. Will phages transduce virulence genes from a virulent bacterial host to non-virulent bacteria or even human cells? The limited data available make it challenging to define the clinical safety and efficacy profile for phage therapy. While there is a tremendous body of work on the biochemical, structural, and genetic aspects of phages, the data are concentrated on selected model systems. Every phage species is unique, with its own properties, and should be treated as an individual entity in the product. To produce a single phage product or a phage cocktail, one must monitor the quality attributes of every phage species during the manufacturing processes and the shelf-life. How to define the shelf-life and control the quality of phages are critical CMC issues for phage therapy and depend on the dosage form (cocktail versus single phage product) and the treatment process (off-the-shelf versus personalized). Finally, there is an urgent need for high quality studies that evaluate clinical efficacy in a sufficient patient population and the performance of diagnostic phages under clinically relevant conditions; such studies are at present hard to come by. Once these challenges are solved, phages can be considered as alternative tools to fight AR bacterial infections.


Bacteriophage Vaccines
Pan Tao and Venigalla B Rao, The Catholic University of America, Washington, DC, United States
© 2021 Elsevier Ltd. All rights reserved.

Introduction
Vaccines are one of the most effective medical interventions in human history. More than 60 vaccines against 26 pathogens are currently available in the US market, and most of these are whole-pathogen vaccines (CDC data updated on April 13, 2018; for the website link, see the “Relevant Website” section). Historically, vaccines were developed mainly from the whole pathogen, either attenuated (live-attenuated vaccines) or inactivated (heat- or formalin-killed vaccines). However, whole-pathogen vaccines generally pose significant safety risks, including reversion to a pathogenic form, severe reactions in immunocompromised hosts, and adverse effects such as allergic responses. Therefore, the focus in recent years has shifted to the development of recombinant subunit vaccines, which contain only well-defined antigenic molecules of the pathogen and are a much safer alternative to whole-pathogen vaccines.

Virus-like particles (VLPs) represent one of the most promising approaches to designing subunit vaccines. VLPs are nanometer-scale particles with precise dimensions that are assembled from viral structural proteins. They generally lack the viral genome and hence are non-pathogenic. Because they share favorable characteristics with viral pathogens, such as size, geometry, and a highly ordered, repetitive structure, they can simulate a viral infection and thereby stimulate robust responses from the host immune system. Successful VLP vaccines, such as the hepatitis B vaccine and the human papillomavirus vaccine, have been developed and commercialized worldwide. It would be extremely useful to have a “universal” platform that could assemble any antigen, or combination of antigens, and generate VLP vaccines against one or multiple pathogens. Bacteriophages (phages) are highly promising candidates for developing such a platform because of their size, surface structure, stability, safety, biodegradability, and low cost of manufacture. Here, we describe some unique features of phage T4 that make it an effective VLP vaccine platform, both in terms of design and effectiveness.

Architecture of Phage T4
Phage T4 belongs to the Myoviridae family and infects the bacterium E. coli; it is therefore innocuous to humans. Phage T4 consists of three major components: head (capsid), tail, and tail fibers. These are assembled by independent pathways and joined together to form the infectious virion. The application of T4 phage as a VLP platform mainly involves the head, which is an elongated icosahedron, 120 nm long and 86 nm wide (Fig. 1). It is built with 930 copies of the major capsid protein gp23* (48.7 kDa; “*” denotes the cleaved form), which form the hexagonal capsid lattice. Eleven of the twelve vertices are occupied by pentamers of the vertex protein gp24* (46 kDa; 55 copies per capsid). The twelfth vertex is unique: it is where the dodecameric portal protein (61 kDa) resides. A central channel in the dodecamer serves as a portal for entry of DNA during genome encapsidation and for exit of the genome during infection. The portal vertex is also where the packaging motor docks for DNA packaging and where, after packaging is terminated and the motor has dissociated, the neck and tail attach. A unique feature of the T4 head is that it contains two non-essential outer capsid proteins: 870 copies of the small outer capsid protein (Soc, 9.1 kDa) and 155 copies of the highly antigenic outer capsid protein (Hoc, 39 kDa) (Fig. 1). Soc monomers assemble into trimers at the quasi three-fold axes and clamp adjacent gp23* hexameric capsomers, thus reinforcing an already very stable capsid structure and protecting the pressurized capsid against harsh extracellular environments (e.g., pH 11). Hoc is a linear “fiber” containing a string of four domains. It binds at the center of each capsomer as a monomer through its COOH-terminal domain, whereas its NH2-terminus projects ~170 Å away from the capsid. Hoc might facilitate attachment of phage to bacteria.
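The copy numbers quoted above are mutually consistent, which a few lines of arithmetic can confirm. The following is a minimal sketch, not part of the original chapter: the class and variable names are illustrative, the reading that one Hoc monomer corresponds to one gp23* hexamer is an interpretation of the text, and the mass figure covers only the five listed head proteins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapsidProtein:
    name: str
    copies: int       # copies per capsid, as quoted in the text
    mass_kda: float   # monomer mass in kDa, as quoted in the text


T4_HEAD = {
    p.name: p
    for p in (
        CapsidProtein("gp23*", 930, 48.7),
        CapsidProtein("gp24*", 55, 46.0),
        CapsidProtein("portal", 12, 61.0),  # one dodecamer at the unique vertex
        CapsidProtein("Soc", 870, 9.1),
        CapsidProtein("Hoc", 155, 39.0),
    )
}

# Assumption: one Hoc monomer sits at the center of each gp23* hexamer,
# so the Hoc count equals the number of hexameric capsomers.
hexamers = T4_HEAD["Hoc"].copies
gp23_from_hexamers = hexamers * 6

# Eleven vertices carry gp24* pentamers; the twelfth carries the portal.
gp24_from_pentamers = 11 * 5

# Soc and Hoc sites together give the display capacity of one capsid.
display_sites = T4_HEAD["Soc"].copies + T4_HEAD["Hoc"].copies

# Rough mass of the listed head proteins only (DNA, internal proteins,
# neck, and tail are deliberately left out of this estimate).
shell_mass_mda = sum(p.copies * p.mass_kda for p in T4_HEAD.values()) / 1000

print(gp23_from_hexamers, gp24_from_pentamers, display_sites)
print(f"{shell_mass_mda:.1f} MDa")
```

Under these assumptions the hexamer and pentamer counts reproduce the quoted 930 and 55 subunits, and the Soc and Hoc sites sum to 1025 per capsid.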

Phage T4 as a Vaccine Platform
Although Soc and Hoc provide survival advantages to the virus in the natural environment, they are completely dispensable under laboratory conditions. Deletion of these genes has no significant effect on phage assembly, productivity, or infectivity. Moreover, purified Soc or Hoc proteins bind to the Hoc⁻Soc⁻ capsid with high specificity and nanomolar affinity, properties that are not greatly compromised by attachment of an antigen at the NH2- and/or COOH-termini. The NH2- and COOH-termini of Soc and the NH2-terminus of Hoc are well exposed on the capsid structure, allowing efficient display of foreign proteins fused to Soc and Hoc on the capsid surface. Proteins as large as the ~83 kDa anthrax protective antigen (PA) or the 129 kDa β-galactosidase have been displayed efficiently. The COOH-terminal domain of Hoc, however, contains the capsid binding site and is therefore not suitable for antigen display. T4 antigen display can be accomplished either in vivo or in vitro.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20953-9


Fig. 1 Structural model of bacteriophage T4. Three major components of phage T4 – Head, tail, and tail fibers – are indicated. The enlarged capsomer (inset) shows the major capsid protein gp23* (930 copies; * represents the cleaved form), Soc (blue, 870 copies), and Hoc (orange, 155 copies). Yellow subunits at the five-fold vertices correspond to gp24*. The unique portal vertex (gp20) connects the head to the tail.

Fig. 2 Schematic of the phage T4 in vivo VLP display system. A. The Soc- or Hoc-antigen fusion gene is cloned into a donor plasmid, flanked by homologous arms. B. The recombinant T4 phage is generated by homologous recombination between wild-type (WT) T4 and the donor plasmid upon infection of E. coli. C. The fusion proteins are expressed under the control of a T4 promoter and assembled on T4 capsids during the propagation of the recombinant T4 phages.

In Vivo VLP Assembly
For in vivo VLP assembly, the antigen gene is inserted into the T4 genome at the NH2- or COOH-terminus of Soc or the NH2-terminus of Hoc by homologous recombination between the T4 phage genome and a donor plasmid, which contains the antigen gene flanked by two homologous arms of the T4 genome (Fig. 2). The recombinant T4 phage containing the antigen-Soc or antigen-Hoc fusion gene is then selected and amplified. Upon infection of E. coli, the fusion proteins are expressed under the control of a T4 promoter and assembled on the capsid in vivo. The resulting phage is purified and used as a VLP vaccine. The advantage of in vivo assembly is that, once the recombinant phage is constructed, the VLP vaccines can be produced by a simple procedure at the scale needed. However, the copy number of the antigen on the VLP can vary greatly because different Soc or Hoc fusion proteins may be expressed to different extents and the Soc/Hoc binding sites are not necessarily fully occupied. Additionally, it would be difficult to assemble multiple antigens to generate multivalent vaccines.

Fig. 3 Schematic of the phage T4 in vitro VLP display system. The polyhistidine-tagged Soc-fused antigen is purified from E. coli and assembled on purified Hoc⁻Soc⁻ T4 phage by mixing the two to generate the VLPs. The same principle is used for the display of Hoc-fused antigens or targeting molecules.

In Vitro VLP Assembly
For in vitro VLP assembly (Fig. 3), Soc- or Hoc-antigen fusion proteins are expressed in E. coli under the control of a strong promoter such as the phage T7 promoter. The proteins are then purified and displayed on phage by mixing the recombinant protein with purified Hoc⁻Soc⁻ T4 phage. The Soc- or Hoc-fusion proteins can also be expressed using a mammalian expression system for antigens requiring specific post-translational modifications. This approach has several advantages. First, functional characterization of the purified Soc- or Hoc-fusion proteins can be performed prior to display to ensure the production of full-length antigens with native-like function(s). Second, the composition of the in vitro assembly mixture can be adjusted to generate multivalent vaccines against one or more pathogens and to control the copy number (Fig. 3). For example, by adding two different anthrax antigens fused to Soc (LFn-Soc and Soc-PA4) to the reaction mixture, T4 phage nanoparticles simultaneously displaying both antigens could be generated. The copy number of each antigen was proportional to the ratio of antigen molecules to binding sites added to the assembly mixture. Third, in vitro assembly allows building of oligomeric complexes through interaction with a displayed antigen. The tripartite anthrax toxin complexes containing LFn, PA, and EF could be assembled using the Soc-fused LFn molecules anchored on the capsid. Finally, in vitro assembly allows display of antigens that require post-translational modifications such as glycosylation for function, simply by producing the antigens in a mammalian expression system.

The T4 capsid nanoparticle has several features that make it a desirable platform for designing VLP vaccines. The availability of up to 1025 Soc and Hoc binding sites per capsid provides a great deal of flexibility to generate symmetrically arrayed antigens on a nanoparticle at high density. Such ordered and repetitive patterns of display on the phage capsid mimic the patterns exhibited by pathogenic viruses and are considered to be important for activation of the innate immune system. Another useful feature of the T4 VLP is that the interior of the capsid can also be used for DNA vaccine design. This can be accomplished either by filling an empty capsid in vitro, using the DNA packaging motor, with DNAs that can express vaccine antigens upon delivery into human cells, or by inserting DNAs into the phage genome by CRISPR engineering. Furthermore, the T4 nanoparticles can be targeted to antigen-presenting cells such as dendritic cells by displaying specific ligands, such as anti-DEC205 monoclonal antibodies, on the capsid. The NH2-termini of the Hoc fibers, which project 170 Å away from the capsid wall, provide a unique opportunity to display molecules with considerable reach to capture the cell surface receptors present on the targeted cells.
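The statement that copy number tracks the ratio of antigen molecules to binding sites can be illustrated with a toy calculation. This is a minimal sketch under strong assumptions that are not from the chapter: occupancy is taken to rise linearly with the input ratio until the sites are full, all fusion proteins are assumed to bind with equal affinity, and the function name `expected_copies` is hypothetical. Real assembly reactions depend on affinity, antigen size, and steric effects, so measured copy numbers will differ.

```python
from typing import Dict

SOC_SITES = 870  # Soc binding sites per capsid (from the text)
HOC_SITES = 155  # Hoc binding sites per capsid (from the text)


def expected_copies(ratios: Dict[str, float], sites: int) -> Dict[str, float]:
    """Estimate copies per capsid for antigens competing for one site type.

    `ratios` maps an antigen name to the number of antigen molecules added
    per binding site. Hypothetical model: below saturation each antigen
    occupies ratio * sites; above saturation the sites are shared in
    proportion to input (equal affinity assumed for every fusion protein).
    """
    if sites <= 0:
        raise ValueError("sites must be positive")
    if any(r < 0 for r in ratios.values()):
        raise ValueError("ratios must be non-negative")
    total = sum(ratios.values())
    # Scale inputs down only when more antigen is added than sites exist.
    scale = 1.0 if total <= 1.0 else 1.0 / total
    return {name: r * scale * sites for name, r in ratios.items()}


# Two Soc-fused anthrax antigens added at sub-saturating ratios.
mix = expected_copies({"LFn-Soc": 0.25, "Soc-PA4": 0.5}, SOC_SITES)
print(mix)
```

In this idealized picture, changing the input ratios is the only control needed to tune the relative density of each antigen on a multivalent particle.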

Phage T4 VLP Vaccines
The T4 nanoparticle platform has been used to develop VLP vaccines against a number of pathogenic bacteria and viruses, such as Bacillus anthracis, Yersinia pestis, HIV-1, foot-and-mouth disease virus (FMDV), classical swine fever virus (CSFV), and bursal disease virus. Here, we focus on our recent work on the design of biodefense vaccines against B. anthracis and Y. pestis using the T4 in vitro VLP display system.

B. anthracis is the causative agent of anthrax, a deadly disease that leads to rapid death upon exposure. It is listed as a Tier 1 biothreat agent by the United States Centers for Disease Control and Prevention (CDC). Y. pestis is the causative agent of plague, also known as the Black Death, which is one of the deadliest infectious diseases known to mankind. Currently, there are no Food and Drug Administration (FDA)-approved plague vaccines for mass vaccination in humans. Although there is a licensed anthrax vaccine consisting of a filtered crude culture supernatant of B. anthracis strain V770-NP1-R adsorbed to alum adjuvant, its use is limited to military personnel and high-risk health care workers because of its significant reactogenicity in vaccinated individuals. Therefore, a safe and effective subunit vaccine has been a high priority to protect the public against these biothreats.

The anthrax protective antigen (PA) is a primary target for the development of subunit vaccines. To develop an anthrax VLP vaccine, PA fused to the NH2-terminus of Hoc was displayed on the T4 capsid by in vitro assembly to the maximum copy number of ~155 molecules per capsid. When injected intramuscularly into mice with no adjuvant, the anthrax VLP vaccine elicited 6.5-fold higher levels of PA-specific IgG than soluble PA adjuvanted with Alhydrogel. Similarly, PA could be fused to Soc and assembled on the T4 capsid. In a rhesus macaque model, T4 phage displaying both PA-Hoc and PA-Soc, after three intramuscular injections, again with no adjuvant, induced up to ~500 µg/ml of PA-specific IgG and provided 100% protection against aerosol challenge with lethal B. anthracis Ames strain spores (85 LD50). No side effects due to T4 VLP vaccination were observed in either animal model.

Two Y. pestis virulence factors, the capsular protein (Caf1 or F1) and the low calcium response V antigen (LcrV or V), are the main targets for plague vaccine design.
We engineered an F1 mutant that forms a monomer, as opposed to the heterogeneous aggregates formed by native F1, and fused it to V to generate a bivalent F1mutV antigen. To develop a plague VLP vaccine using the T4 platform, F1mutV was fused to the NH2-terminus of Soc and assembled on the T4 capsid by in vitro display at up to about 660 copies per capsid. Without any adjuvant, the plague T4 VLP vaccine, when administered to mice or Brown Norway rats (a natural host for Y. pestis), elicited very high titers of F1V-specific antibodies, higher than those elicited by the soluble antigen adjuvanted with Alhydrogel. Of particular note, the T4 VLP vaccine induced both IgG2a and IgG1 antibodies, whereas the soluble F1mutV vaccine mainly elicited IgG1 antibodies. IgG2a reflects TH1 immune responses, whereas IgG1 reflects TH2 responses. This and additional data indicated that the T4 VLP vaccines induced more balanced TH1 and TH2 immune responses than traditional subunit vaccines. Stimulation of both arms of the immune system, humoral (TH2) and cellular (TH1), is considered to be beneficial for protection, in particular for clearance of intracellular pathogens. Indeed, the T4 VLP vaccine provided 100% protection in both the mouse and Brown Norway rat models against intranasal challenge with the most lethal Y. pestis CO92 strain at doses as high as ~5000 LD50.

Taking this one step further, we combined the anthrax and plague T4 VLP vaccines into one and tested its efficacy in mouse, Brown Norway rat, and New Zealand white rabbit models. With two intramuscular injections at three-week intervals, the robust immunogenicity observed with each single vaccine was recapitulated in the bivalent anthrax-plague vaccine, and no antigen competition was observed. Most importantly, the vaccine conferred 100% protection against dual challenges with 1 LD100 of anthrax lethal toxin and 200–400 LD50 of Y. pestis CO92. Protection was observed whether the challenges were administered in sequence or simultaneously, in both mouse and Brown Norway rat models. The vaccine also provided 100% protection in rabbits against a 200 LD50 challenge with lethal B. anthracis Ames strain spores. Thus, the phage bivalent vaccine could protect against two deadly biothreat diseases, inhalational anthrax and pneumonic plague.

Other Phage Vaccine Platforms
Other phages have also been used as scaffolds for the development of VLP vaccines (Fig. 4). In fact, the abundance and diversity of phages provide many choices for designing VLP vaccines according to specific purposes. For example, the size of a VLP was reported to be an important factor for eliciting immune responses. The size of the T4 VLP described above is about 120 × 86 nm. Most other phage capsids are smaller. The commonly used phages include λ, T7, MS2, and Qβ, which have sizes of 60 nm, 56 nm, 26 nm, and 28 nm, respectively.

The capsid of λ phage is composed of the major capsid protein gpE and the outer capsid protein gpD. gpD, present at 405–420 copies per capsid, stabilizes the capsid against the internal pressure of the packaged phage genome (48.5 kb). It is therefore an essential protein for the WT phage but dispensable for phages with shorter genomes. The COOH-terminus of gpD has been extensively used for display of epitope peptides. Larger antigens can also be displayed, but at reduced copy number. Although both in vitro and in vivo display have been used to assemble foreign peptides fused to gpD, in vivo display is the preferred method because phages or capsids produced in the absence of gpD are less stable for in vitro display.

The capsid of T7 phage is assembled with the major capsid protein gp10. However, a small fraction of gp10 undergoes a −1 frameshift at the COOH-terminus of gp10 (345 amino acids) to produce a slightly longer form (398 amino acids). The former is named gp10A and the latter gp10B. gp10B is not essential for capsid assembly and, therefore, has been used for display of peptide epitopes and antigens as fusions to the COOH-terminus of gp10. Although the copy number is generally limited, as many as 415 copies of antigen peptides of about 50 amino acids have been displayed on T7 phage by in vivo display.
Two small single-stranded RNA phages, MS2 and Qβ, have also been used for VLP vaccine design. The capsid of these phages is composed of 180 copies of the major coat protein (CP), to which a peptide epitope can be fused and which self-assembles both in vivo and in vitro. However, successful capsid assembly requires WT CP; hence the copy number of the epitope-fused CP is limited. Phage Qβ also contains three to five copies of the A1 protein, which carries a large 196-amino acid extension at the COOH-terminus of CP as a result of infrequent read-through at the termination codon. This feature has been used to display peptides through fusion to the COOH-terminus of A1.

Filamentous phages such as M13 and fd have also been used for vaccine development, although these phages have been most powerful for display of random peptide libraries. The 900 nm long phage fd filament is composed of about 2700 copies of the major coat protein pVIII. At one end of the filament are 5 copies each of the minor capsid proteins pIII and pVI, while the other end contains 5 copies each of the minor capsid proteins pVII and pIX. pVIII can only display short peptides of 6–8 amino acids, while the minor capsid proteins can display proteins, although at low copy number (1–5 molecules per filament). Many vaccine candidates have been developed using the different phage VLP platforms described above. Readers are encouraged to consult the review articles listed below for details of these vaccine candidates.

Fig. 4 Properties of various phage display systems.
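The platform properties described above can be collected into a small lookup structure when comparing display options. The following is a minimal sketch, not a reproduction of Fig. 4: the field names and the helper `sites_with_capacity` are illustrative, and `max_copies` records the upper bound quoted in the text for each display protein, which is a site count rather than a guaranteed occupancy.

```python
from typing import List, NamedTuple


class DisplaySite(NamedTuple):
    phage: str
    particle_size_nm: str
    protein: str
    max_copies: int  # upper bound quoted in the text, not guaranteed occupancy
    note: str


DISPLAY_SITES = [
    DisplaySite("T4", "120 x 86", "Soc", 870, "in vivo or in vitro; large proteins"),
    DisplaySite("T4", "120 x 86", "Hoc", 155, "NH2-terminal fusions"),
    DisplaySite("lambda", "60", "gpD", 420, "mostly epitope peptides; in vivo preferred"),
    DisplaySite("T7", "56", "gp10B", 415, "peptides of about 50 residues at the high end"),
    DisplaySite("MS2", "26", "CP", 180, "WT CP also required, so fewer fused copies"),
    DisplaySite("Qbeta", "28", "CP", 180, "WT CP also required, so fewer fused copies"),
    DisplaySite("Qbeta", "28", "A1", 5, "read-through extension of CP"),
    DisplaySite("fd", "900 long", "pVIII", 2700, "peptides of 6-8 residues only"),
    DisplaySite("fd", "900 long", "pIII", 5, "proteins at 1-5 copies per filament"),
]


def sites_with_capacity(min_copies: int) -> List[DisplaySite]:
    """Return display sites whose quoted upper bound meets the threshold."""
    return [s for s in DISPLAY_SITES if s.max_copies >= min_copies]


for site in sites_with_capacity(400):
    print(f"{site.phage} {site.protein}: up to {site.max_copies} copies ({site.note})")
```

Filtering on copy number alone is a simplification; as the text notes, the size of the antigen that each site tolerates is an equally important constraint.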

Conclusion Statement
Vaccine development thus far has been individualized for each disease. Consequently, bringing a vaccine into human use requires enormous time and resources and must overcome substantial testing and manufacturing hurdles. The question remains whether a “universal” VLP platform can be developed to design vaccines against many infectious diseases. Phages in general, and T4 phage in particular, provide the necessary architectures for such a platform. Phages are highly stable, can be manufactured relatively easily and cost-effectively, are considered to be safe, and meet no significant pre-existing immunity in humans. The features of phage T4 (large size, high antigen capacity, robust immunogenicity without adjuvant, ability to co-deliver DNAs and targeting molecules, and ease of engineering using CRISPR) provide distinct advantages for manipulating this VLP platform and designing efficacious vaccines against pathogens, including complex and emerging pathogens such as HIV and influenza virus. The phage platforms could therefore potentially streamline and accelerate the whole vaccine development process in the future.

Acknowledgments
This research has been supported by grants from the National Institute of Allergy and Infectious Diseases (current: AI111538 and AI081726) and in part from the National Science Foundation, to VBR. We thank the many current and past post-doctoral fellows and graduate students who contributed to bacteriophage T4 vaccine research, in particular Jennifer Jiang, Laura Abu-Shilbayeh, Zhihong Zhang, Qin Li, Sathish Shivachandra, Taheri Sathaliyawala, Guofen Guo, and Wadad Alsalmi; and collaborators Drs. Michael G. Rossmann and Andrei Fokine (Purdue University), Ashok Chopra (University of Texas Medical Branch), Stephen H. Leppla (National Institutes of Health), and Carl Alving, Gary Matyas, and Mangala Rao (Walter Reed Army Institute of Research).

Further Reading
Bao, Q., Li, X., Han, G., et al., 2018. Phage-based vaccines. Advanced Drug Delivery Reviews. doi:10.1016/j.addr.2018.12.013, (pii: S0169-409X(18)30317-X).
Black, L.W., Rao, V.B., 2012. Structure, assembly, and DNA packaging of the bacteriophage T4 head. Advances in Virus Research 82, 119–153.
Chen, Z., Sun, L., Zhang, Z., et al., 2017. Cryo-EM structure of the bacteriophage T4 isometric head at 3.3-Å resolution and its relevance to the assembly of icosahedral viruses. Proceedings of the National Academy of Sciences of the United States of America 114 (39), E8184–E8193.
Mohsen, M.O., Zha, L., Cabral-Miranda, G., Bachmann, M.F., 2017. Major findings and recent advances in virus-like particle (VLP)-based vaccines. Seminars in Immunology 34, 123–132.
Rao, V.B., Feiss, M., 2015. Mechanisms of DNA packaging in large double stranded DNA viruses. Annual Review of Virology 2 (1), 351–378.
Rappuoli, R., Pizza, M., Del Giudice, G., De Gregorio, E., 2014. Vaccines, new opportunities for a new society. Proceedings of the National Academy of Sciences of the United States of America 111 (34), 12288–12293.
Ren, Z., Black, L.W., 1998. Phage T4 SOC and HOC display of biologically active, full-length proteins on the viral capsid. Gene 215 (2), 439–444.
Tao, P., Mahalingam, M., Kirtley, M.L., et al., 2013. Mutated and bacteriophage T4 nanoparticle arrayed F1-V immunogens from Yersinia pestis as next generation plague vaccines. PLOS Pathogens 9 (7), e1003495.
Tao, P., Mahalingam, M., Zhu, J., et al., 2018. A bacteriophage T4 nanoparticle-based dual vaccine against anthrax and plague. mBio 9 (5). doi:10.1128/mBio.01926-18.
Tao, P., Wu, X., Tang, W.C., Zhu, J., Rao, V.B., 2017. Engineering of bacteriophage T4 genome using CRISPR-Cas9. ACS Synthetic Biology 6 (10), 1952–1961.
Tao, P., Zhu, J., Mahalingam, M., Batra, H., Rao, V.B., 2018. Bacteriophage T4 nanoparticles for vaccine delivery against infectious diseases. Advanced Drug Delivery Reviews. doi:10.1016/j.addr.2018.06.025, (pii: S0169-409X(18)30164-9).

Relevant Website https://www.cdc.gov/vaccines/vpd/vaccines-list.html List of Vaccines – CDC.

Bacteriophage Diversity
Julianne H Grose, Brigham Young University, Provo, UT, United States
Sherwood R Casjens, University of Utah, Salt Lake City, UT, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Genome mosaicism Phage genomes that have patches of similar sequence and patches of quite different sequence are considered to be “mosaically related”. Such patches are said to be “mosaic sections” and are thought to have arisen by horizontal exchange among divergent phages rather than by differential divergence.
Host range The hosts that can be successfully infected by a given phage.
Metagenome A collection of all the nucleic acid sequences present in an environmental sample.
Phage cluster A set of related phages with nucleotide sequence similarity that spans >50% of their genomes as determined by nucleotide dot plot.
Phageome Collection of sequences identified as bacteriophage-derived in an environmental or metagenomic source.
Plaque assay An assay for phage infection that relies on mixing the bacteria and phage prior to plating, and observing the plaques that arise on the plate.
Prophage A phage genome whose lytic genes are not expressed and which replicates in concert with the host genome, either as a plasmid or integrated into the host chromosome.
Protein cluster A cluster of protein sequences derived from adjacent genes in a sequencing contig in metagenomic studies.
Virion The stable virus particle that is released from infected cells and is capable of binding to and infecting other sensitive cells.
Virome Collection of sequences identified as viral due to the extraction method and derived from an environmental or metagenomic source.

Introduction
The diversity of the functional, physical and genetic aspects of bacteriophages (or phages), the viruses that infect bacteria, seemingly knows no bounds. This is an exciting time of new discovery in this field, and with each new study known phage diversity continues to expand. Phages have been found that infect nearly every bacterial species that has been examined, and since they replicate rapidly and kill host bacteria, they are important players in the ecology of bacteria and thus the overall ecology of Earth. Bacteria, and thus their phage predators, are critically important in such varied processes as global nitrogen and carbon cycling, animal and plant diseases of many types, and the human immune response. Although a few phages have been very successfully studied since the 1940s as model systems for understanding the basic molecular biology of life, their amazingly high abundance was only fully recognized when particles with their unique tailed phage-like morphology were quantified in environmental samples in the 1980s and 1990s. These studies showed that there are 10³¹–10³² tailed phage virions on Earth, which probably corresponds to a global ratio of about ten virions for each bacterial cell. In addition, the advent of facile nucleotide sequencing of bacterial genomes showed that most harbor one or more quiescent phage genomes called prophages. Other phage types beyond the tailed phages may be less common, but their virions are more difficult to recognize in environmental samples and so their abundance is less well understood.

Phages display several general behavioral types:
(1) Lytic phages always replicate to make progeny virions and kill their host upon infection.
(2) Temperate phages can, depending on the conditions at the time of infection, either replicate lytically to make progeny or turn off their lytic, cell-killing genes and replicate as a prophage in synchrony with the host chromosome, either integrated into the host chromosome or as a plasmid.
(3) Filamentous phages continuously replicate progeny, which escape the cell without overtly killing it.
Temperate prophages often express “lysogenic conversion” genes that give the host cell some advantage, and in response to environmental signals many prophages can undergo “induction” to enter the lytic cycle, produce progeny, and kill the host cell.

Like all viruses, phages are extremely diverse, and they exist as eight very different general types that include virions that carry ssDNA, dsDNA, ssRNA or dsRNA chromosomes and that have several different morphologies (Table 1). Virion chromosomes can be linear or circular in different phage types. The genome sizes of known phages range from less than 4 to nearly 500 kbp, and the number of encoded proteins varies commensurately. Most phage virions contain one nucleic acid molecule, but the dsRNA phages contain several separate chromosomes. (See the International Committee on Taxonomy of Viruses [ICTV] web page, linked in the “Relevant Websites” section, for detailed descriptions of the various types.) In addition, the five different folds of the protein that makes up the virion’s protective coat suggest that this critical aspect of phages has been “invented” at least five times during the course of life on Earth, and divergence within these groups indicates that they are all very ancient. The different chromosomal nucleic acids and the >100-fold variation in genome size mean that there is a multitude of varied molecular strategies for phage replication and survival in nature that cannot all be described in this venue.
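Two of the quantitative statements in this Introduction can be checked with simple arithmetic. The sketch below is illustrative only: it reads the virion abundance as 10^31 to 10^32, takes the quoted ratio of about ten virions per bacterial cell at face value, and rounds the genome size limits to 4 and 500 kbp; the variable names are not from the chapter.

```python
# Tailed-phage virion estimates quoted in the text (exact integers).
virions_low, virions_high = 10**31, 10**32
virions_per_cell = 10  # approximate global ratio quoted in the text

# Implied number of bacterial cells on Earth under that ratio.
cells_low = virions_low // virions_per_cell
cells_high = virions_high // virions_per_cell

# Genome size range: "less than 4" to "nearly 500" kbp, rounded here.
genome_min_kbp, genome_max_kbp = 4, 500
fold_range = genome_max_kbp / genome_min_kbp

low_exp = len(str(cells_low)) - 1    # exponent of a power of ten
high_exp = len(str(cells_high)) - 1
print(f"implied bacterial cells: 10^{low_exp} to 10^{high_exp}")
print(f"genome size range: about {fold_range:.0f}-fold")
```

With these rounded inputs the genome size span comes out above 100-fold, in line with the variation cited in the text.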


doi:10.1016/B978-0-12-809633-8.20954-0


Table 1 Eight types of bacteriophages

Group | Well studied members | Virion nucleic acid | Member lifestyles | Virion type
dsDNA
  Caudovirales (tailed phages)
    Myoviridae | T4, P2 | Linear, 28–500 kbp | Lytic or temperate | Icosahedralᵃ head, long contractile tail
    Siphoviridae | λ, SPP1 | Linear, 36–125 kbp | Lytic or temperate | Icosahedralᵃ head, long non-contractile tail
    Podoviridae | T7, P22 | Linearᵇ, 16–90 kbp | Lytic or temperate | Icosahedralᵃ head, short tail
  Corticoviridae | PM2ᶜ | Circular, 10–11 kbp | Lytic (or temperate?)ᵈ | Icosahedral, internal lipid membrane
  Plasmaviridae | AVL2ᶜ | Circular, 12 kbp | Lytic | Pleomorphic, external membrane
  Tectiviridae | PRD1 | Linearᵇ, 15–18 kbp | Lytic or temperate | Icosahedral, internal lipid membrane
ssDNA
  Inoviridae | M13, f1 | Circular, plus strand, 4–9 kb | Lytic or temperate | Filamentous/helical
  Microviridae | øX-174 | Circular, plus strand, 4–6 kb | Lytic (or temperate?)ᵉ | Icosahedral
dsRNA
  Cystoviridae | ø6ᶜ | Linear, 3 molecules, 13 kbp | Lytic | Icosahedral, external membrane envelope
ssRNA
  Leviviridae | MS2, Qβ | Linear, plus strand, 3–5 kb | Lyticᶠ | Icosahedral

ᵃ Some species have elongated pseudo-icosahedral heads.
ᵇ A protein is covalently attached to the 5′-ends in all Tectiviridae and some Podoviridae.
ᶜ Only a small number of isolates have been studied.
ᵈ The studied members of this group appear to be lytic, but PM2-like sequences (possible prophages?) are present in the genomes of some bacteria.
ᵉ Recent reports of Microviridae genomes embedded in bacterial genomes have led to the suggestion that some may be temperate.
ᶠ A poorly understood carrier state has been reported for phage LeviOr01.

In spite of their staggeringly high abundance and their varied virion structures, the true diversity of phages was not appreciated until genome sequencing became easy to perform. Phages MS2 (ssRNA), øX-174 (ssDNA) and λ (dsDNA) were the first virus genomes to be sequenced, at high cost and with great effort, in the late 1970s and early 1980s, but as sequencing costs and the required effort declined over the past several decades the number of phage genome sequences in the public databases has increased dramatically. Indeed, up to 100 phage genomes can now be sequenced in a single Illumina-type run for only about ten US dollars per genome, and there are currently (in 2019) over ten thousand available genome sequences for “authentic” phages (defined as phages that have been propagated in the laboratory, not including prophage sequences in bacterial genomes or phage sequences assembled from metagenomic information). As phage genome sequences were compared to one another it was quickly realized that genetic differences are very much greater than easily observable differences in virion morphology, and that the latter are insufficient to sort phages into informative groups of similar individuals. For example, virions of the Siphoviridae Escherichia coli phage λ (Accession No. JQ086376) and Bacillus subtilis phage SPP1 (Accession No. NC_004166) appear very similar by negative stain electron microscopy, but their genomes have essentially no recognizably similar nucleotide sequence. Thus, the study of the diversity of phages is necessarily the study of the diversity of their genome sequences. There are eight general phage types (Table 1), and where more than a few individual phages are known, there is substantial sequence diversity among individuals, but only in the Caudovirales or tailed phages is this diversity known to extend to very major differences in gene content. The other seven types appear at present to have limited ranges of gene content. For example, of the two other best studied types, the Microviridae have about ten genes, which include largely homologous but not identical virion assembly and replication gene complements and variable lysis and other accessory genes, and the Leviviridae have four genes, which include conserved virion assembly and replication genes and variable lysis genes. Large numbers of authentic phage genome sequences are known only for the tailed phages, so we focus on the tailed phages in the remainder of this review.

Understanding the Nature of Tailed Phage Diversity and Phage Classification

The tailed phages have been historically separated into three types: the Podoviridae, which have short tails; the Siphoviridae, which have long non-contractile tails; and the Myoviridae, which have long contractile tails. These tail differences, which are easily seen by electron microscopy of virions, reflect different strategies for DNA delivery into cells but do not correlate with other details of phage lifestyle. The tailed phage classification method that is most similar to cellular ribosomal RNA based classification relies on the fact that there are (only) three proteins for which homologs are encoded by all tailed phages: the major capsid protein (MCP, which forms the head shell), the portal protein (the channel through which DNA is packaged and ejected) and the large terminase subunit (the DNA packaging motor protein). However, an approach that uses one or a few proteins can be confused by the extensive horizontal exchange of genetic material that has characterized phage history. In addition, phages and thus these proteins, especially MCP and large terminase, are so diverse that they can have little to no easily recognizable sequence similarity, making comparisons of distantly related phages difficult. Thus, comparison of such a small set of genes, although usually informative, can incorrectly identify overall group membership, but more importantly it does not show important differences within groups of related phages


(discussed in more detail below). More informative methods for overall phage classification utilize whole genome or proteome approaches. Among these are nucleotide dot plot analysis, Average Nucleotide Identity (ANI; nucleotide sequence identity between two genomes), and whole proteome similarity determinations (such as a proteomic tree that compares all the encoded proteins, or proteome conservation, which measures the fraction of genes that are homologs). Each of these strategies has weaknesses and strengths. A strength of whole genome nucleotide analysis is that it does not depend on correct annotation of genes or encoded protein function. An advantage of the proteome-based approaches is that proteins often retain recognizable homology after the encoding nucleotide sequences have diverged beyond the point of having recognizable similarity, so more distant protein relationships among phages can be detected and analyzed. Protein comparison methods are also particularly useful in the analysis of metagenomic data, where proteins encoded by genes or gene clusters, rather than whole genomes, are usually compared. A significant disadvantage of whole genome nucleotide or proteome sequence clustering or tree construction is that such approaches average differences and similarities across the genomes, losing important information on the variable relationships within genomes. Whole genome nucleotide dot plots, on the other hand, are capable of showing varied relationships across highly mosaic and rearranged genomes. This operational approach was pioneered by Graham Hatfull and colleagues for easy visualization of relationships within large sets of phages by overall genome similarity in a way that does not disguise differential relationships across the genomes (genome mosaicism, see below). Groups of highly related phages recognized by the method are defined as “clusters” of phages that have dot plot diagonal line similarity that covers ≥50% of their genomes, which generally correlates with >50% ANI and >40% proteome conservation. Such lines indicate regions of genetic similarity and synteny. Thus, phage genomes within a cluster carry homologous genes in the same functional order and so have similar virion structures, transcription patterns and lifestyles. For example, no Enterobacteriales cluster contains both convincingly temperate and convincingly lytic phages; however, some Actinobacteria clusters contain both, and their transcription patterns are very similar. This suggests that at least some types of phages do not easily move between these two major lifestyles; however, a small number of Enterobacteriales lytic and temperate clusters have similar virion structure and assembly genes. The ES18-like and P22-like temperate phage clusters have rather divergent virion structural genes that are nonetheless syntenic with those of the E1-like and IME-EC2-like lytic phage clusters, respectively, indicating that a small number of ancient exchanges of large genome segments has occurred between lytic and temperate phages. We note that analysis of just the MCP has been shown to usually (~99% of the time) reflect the cluster to which a phage belongs, indicating that horizontal MCP exchange between clusters is in fact quite rare. The exceptions to this rule are phages that have acquired an MCP gene from a more distant relative outside of their cluster. The dot plot approach also reveals hierarchical relationships, and we use the term “subclusters” for more highly related groups (long, strong diagonal similarity lines, generally with >80% ANI) within clusters. The higher level “superclusters” include groups of phage clusters that have syntenic genomes that encode homologous but more distantly related proteins without extensive nucleotide sequence similarity (less than 50% of the genomes are similar by dot plot, generally with <50% ANI) (Fig. 1). We use this cluster terminology in this review. A weakness of the dot plot cluster method is that it is difficult to quantitate relationships. Nonetheless, it succeeds in placing nearly all phages into useful and informative groups. However, there are a few known cases where a given phage is about half related to one cluster and half to another within a supercluster. These are apparently rather recently formed hybrids between members of two closely related clusters, and the dot plot method shows this clearly. For example, among the Enterobacteriales phages (see below) about half of the phage FSL_SP-016 genome (Accession No. KC139516) is similar to the BP-4795-like cluster and about half is similar to the ES18-like cluster. Thus, classification by the dot plot method can in rare instances be somewhat ambiguous, but it provides a platform to easily recognize and understand such “hybrid” phages. We note that the two most common terminologies for discussing phage relationships, International Committee on Taxonomy of Viruses (ICTV) defined species and genera and dot plot defined clusters, are not unrelated or mutually exclusive in that they both rely on these whole genome nucleotide/proteome methods. The taxa put forward by the ICTV for phage classification (phylum, class, order, family, genus and species) are useful, but since they typically utilize whole genome averaging methods they do not emphasize the mosaic nature of phage genomes in the way dot plots do (see below). Discussion regarding what comprises a phage ‘species’ and what classification scheme is most useful remains lively.
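The dot plot criterion described above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the procedure used by Gepard, DNA Strider or any published clustering pipeline: the function names and the toy genomes are hypothetical, only the forward strand is compared, and coverage is counted from any matching window rather than from dots that form a continuous diagonal. The default window and threshold (17 identities per 23 bp) are borrowed from the Fig. 2 caption.

```python
from typing import List, Set, Tuple


def dot_plot(seq_a: str, seq_b: str, window: int = 23,
             min_identities: int = 17) -> List[Tuple[int, int]]:
    """Return (i, j) start positions of window pairs that pass the threshold.

    A dot is recorded when the window starting at seq_a[i] and the window
    starting at seq_b[j] share at least `min_identities` identical bases.
    Brute force and quadratic, so suitable only for short toy sequences.
    """
    dots = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            identities = sum(1 for x, y in zip(win_a, win_b) if x == y)
            if identities >= min_identities:
                dots.append((i, j))
    return dots


def covered_fraction(dots: List[Tuple[int, int]], length_a: int,
                     window: int) -> float:
    """Fraction of seq_a positions that lie inside at least one matching window."""
    covered: Set[int] = set()
    for i, _ in dots:
        covered.update(range(i, i + window))
    return len(covered) / length_a


# Hypothetical toy genomes: identical first halves, unrelated second halves.
genome_a = "ACGTTGCAAT" + "G" * 10
genome_b = "ACGTTGCAAT" + "C" * 10

dots = dot_plot(genome_a, genome_b, window=5, min_identities=5)
coverage = covered_fraction(dots, len(genome_a), window=5)
print(dots)
print(f"coverage of genome_a: {coverage:.0%}")
print("meets the >=50% cluster criterion:", coverage >= 0.5)
```

In this toy case the dots fall on a single diagonal over the shared half of the two sequences, which is the pattern the text describes for related, syntenic genomes. Real genomes would need a faster algorithm and a check that the dots form long diagonal lines.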

Strategies for Studying Phage Diversity

Phage diversity can be studied by two opposing tactics: a “top down” strategy, in which total extant phage sequence diversity is analyzed in environmental samples, and a “bottom up” strategy, in which individual phage genomes are compared. The enormous diversity of phages means that to be meaningful the latter must examine a very large number of individual “authentic” phages or be limited to phages that can infect a given bacterial taxon. Again, each approach has strengths and weaknesses. The top down methods depend on metagenomic analysis of environmental samples, which can include mining total metagenomic sequence libraries for sequences that are similar to known phages, or isolation and sequencing (not propagation) of purified virion fractions of such samples. The major advantage of this type of approach is that in theory it can see the entire expanse of phage diversity. However, it also has weaknesses: it is difficult or impossible to recognize completely novel phages; RNA phages are overlooked when only DNA is sequenced; not all virion types purify in the same fractions; some virions may be unstable and hard to purify; whole genome sequences have been difficult to obtain in high numbers; any complete genomes assembled from such samples are likely composites of similar individual phages; it can be difficult to know the bacterial host species of the phage genomes discovered; phage virions can contain DNA from their bacterial hosts; and, finally, interesting findings cannot easily be studied in more depth because of the inability to find and propagate the actual phages that correspond to the sequences of interest. On the other hand, the bottom up strategy of isolating phages on particular hosts and studying them has the major advantage that the hosts are known and phage features can be studied in as much detail as desired. Its disadvantages are as follows: phage types may vary across the bacterial spectrum, so many phages that infect many different hosts must be studied to determine the complete extent of phage diversity; some phages may be difficult to propagate in the laboratory and so may be missed; and, most importantly, one is entirely dependent upon random chance for the discovery of new phage types, so it is very hard to know rigorously when all extant varieties have been found. Even for E. coli and Salmonella enterica, two of the bacterial species for which phages are best understood, new phage clusters or types are still being discovered.

Fig. 1 Hierarchical phage classification by dot plot. Six representative genomes from the Enterobacteriales T1 supercluster and four from the T7 supercluster are compared with Gepard using a window size of 10. Phage names are shown at the left and top of the figure. The T1 supercluster genomes fall into a single cluster since they all have strong or weak, long diagonal similarity lines. They comprise three (previously named) subclusters whose members form strong lines within the subcluster and weak lines between subclusters. The four members of the T7 supercluster form the T7-like and SP6-like clusters with no diagonal lines between them. Each of these clusters contains two members of a single subcluster. The T7- and SP6-like clusters are members of the same (T7) supercluster because their genomes are highly syntenic when the proteins encoded by their genes are analyzed (i.e., they have very similar gene orders, transcription patterns and molecular lifestyles).

Examples of Phage Diversity

Top Down Studies

Human phageome

A myriad of top down metagenomic virome and phageome analyses from such diverse sources as human, termite and sea squirt guts, ocean water and lake bottom sediment have shown that phages are spectacularly diverse and that this diversity has not yet been fully described. The majority of top down phage diversity studies have been human-centric or focused on the marine environment. These two kinds of samples provide evidence that phage diversity varies dramatically in different ecological niches. For example, temperate phages appear to be more prominent in the human gut, while lytic phages appear to be more prominent in the marine environment. The temperate phages of the gut, especially in their prophage state, influence human health through the expression of their lysogenic conversion genes within their bacterial hosts, providing novel functions and capabilities, including defense against the human immune system. Initial human phageome studies suggested a great diversity among individuals, but more recent deeper sequencing and better analysis methods have allowed the identification of bacteriophages that are common among many human individuals, including the crAssphage family. The latter is a novel group of viruses that, at least in some cases, infect Bacteroides hosts and can comprise up to 90% of the human gut virome, making them the most ubiquitous and abundant phages found in human samples to date. These and additional studies support the presence of a stable, diverse and unique personal phageome as well as a population of phages that are highly conserved across most healthy individuals.

Marine phageome

In contrast to the primarily temperate human phageome, the marine phageome is reported to be primarily lytic. Ocean water contains an estimated 10⁸ phages per mL, and many marine virome metagenomic studies have been performed on samples from geographically diverse sources. These studies report a staggering diversity in that 87%–93% of putative sequences in oceanic “virus preparations” lack similarity to known reference sequences. This lack of similarity also hampers assembly of whole genomes, and therefore estimates of diversity are primarily derived from analysis of viral protein clusters. In 2015, the analysis of metagenomic marine data suggested a global virome consisting of less than 3.9 million “protein clusters”, and revealed variation according to depth, distance from shore and season of the year. Genetic flow appears to travel from the ocean surface to the deep in that core (ubiquitous) protein clusters are enriched on the ocean surface, while greater diversity is present in areas where the ocean is not illuminated by sunlight (the aphotic zone). Relatively few whole genome sequences of authentic marine phages have been reported (482 complete sequences as of 2015, of which only 274 have known hosts), and these have been isolated on only a few representative bacterial hosts (22 genera of the 106 estimated in the ocean), making phage-host relationships relatively unstudied. This is rapidly changing, however, with advanced methods for bacterial culture, single-cell sequencing of bacteria and the phages that have infected them, and phage-tagging strategies. Although many of the fully sequenced marine phages represent novel types, some have similarities to the Enterobacteriales T4-like, T7-like and N4-like non-marine phage clusters.

Bottom Up Studies

Two large diversity studies

The bottom up method has been utilized on a large scale in only two of the nearly 200 currently defined bacterial orders, the Gram negative order Enterobacteriales and the Gram positive order Actinomycetales (currently being expanded to include other orders in the Actinobacteria phylum). Analyses of these two large phage groups have largely relied on the dot plot and shared gene approaches to define phage clusters. Over 1150 genomes of authentic phages that infect the Enterobacteriales and about 2700 that infect the Actinobacteria have been sequenced. These Enterobacteriales phages were independently isolated and sequenced by a large number of diverse researchers around the world and deposited in the GenBank database. They currently form 82 clusters, of which 23 are singleton clusters (clusters with only one phage). The Actinobacteria phages were isolated through the Howard Hughes Medical Institute (HHMI) Science Education Alliance Phage Hunters Advancing Genomics and Evolutionary Science (SEA PHAGES) program and are available at NCBI GenBank and on the website listed in the “Relevant Websites” section. They currently form 182 clusters, of which 62 are singleton clusters. The significant number of singleton clusters strongly suggests that the diversity of both these large sets of phages remains quite incompletely described. Although the order Enterobacteriales is a narrower taxonomic unit than the phylum Actinobacteria, the current average number of phages examined for each cluster defined is quite similar, about 14 and 15, respectively. Thus, at this still early stage of the analysis of tailed phages, this aspect of diversity appears to be rather similar in these two very disparate bacterial taxa, and we project that it may well be similar across the entire Bacteria domain. Nonetheless, these bottom up studies currently still suffer from the fact that only a very small fraction of extant phages has been analyzed, so future studies could change current views.
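The per-cluster averages quoted above follow directly from the genome and cluster counts given in the text. The short sketch below simply redoes that arithmetic and adds the fraction of singleton clusters. The counts are the rounded figures from the text ("over 1150" and "about 2700"), so the results are approximate, and the dictionary layout and function name are illustrative only.

```python
# Rounded counts quoted in the text for the two large bottom up studies.
studies = {
    "Enterobacteriales": {"genomes": 1150, "clusters": 82, "singletons": 23},
    "Actinobacteria": {"genomes": 2700, "clusters": 182, "singletons": 62},
}


def summarize(study: dict) -> dict:
    """Average phages per cluster and fraction of clusters holding one phage."""
    return {
        "phages_per_cluster": study["genomes"] / study["clusters"],
        "singleton_fraction": study["singletons"] / study["clusters"],
    }


for name, study in studies.items():
    summary = summarize(study)
    print(f"{name}: {summary['phages_per_cluster']:.1f} phages per cluster, "
          f"{summary['singleton_fraction']:.0%} singleton clusters")
```

On these numbers roughly three in ten clusters in each set contain a single phage, which is the basis for the statement that both sets remain incompletely sampled.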

The Enterobacteriales tailed phage example

As an example of the rapidly expanding nature of the study of phage diversity, the initial 2014 analysis of the Enterobacteriales phages in GenBank included 337 complete bacteriophage genome sequences. These phages were isolated on 18 bacterial host genera and 31 species and comprised 56 separate clusters (32 lytic and 24 temperate). There are currently about 1150 Enterobacteriales phage genomes in GenBank (a 3.4-fold increase in 4.5 years) isolated on 25 host genera and 50 species, and these fall into 82 clusters (50 lytic and 32 temperate). As further indication of the incomplete nature of our current understanding, we note that the bacterial order Enterobacteriales currently contains 55 genera (and at least 30 more have been proposed for membership), so phages that infect fewer than half of this taxon’s genera have been examined. The number of related phage clusters will certainly grow until phages are included that infect all its genera. As described above, members of a given phage cluster have very similar molecular lifestyles, but the various Enterobacteriales tailed phage clusters nonetheless have internal diversity, the extent of which can vary dramatically. For example, one of the least diverse clusters at present is the jumbo phage Ea35-70 Myoviridae cluster, comprised of nine Erwinia phages that form a single subcluster level group of highly related phages (>94% ANI, with genome sizes that range only from 271 to 275 kbp). These correspond to the ICTV designated Agrican357 virus genus. Other clusters are comprised of more diverse phages, such as the SPN3US-like Myoviridae cluster, whose 18 known members infect four different host genera (Escherichia, Salmonella, Cronobacter and Erwinia) and show striking diversity, with some members sharing less than 37% ANI. These SPN3US-like phages are diverse even when isolated from a single host, in that the 12 Erwinia phages form seven different subclusters. Superclusters designate distant phage relatives with similar protein content and gene order but little nucleotide similarity (above). Analysis of the Enterobacteriales phage clusters reveals at least seven supercluster level groups that contain multiple clusters. The most commonly sequenced Enterobacteriales bacteriophages are members of the T7 supercluster, which is comprised of nine clusters that contain 191 phages that infect 18 host genera. No doubt these phages are very successful in the environment; however, their prominence among sequenced phages may reflect ease of propagation in the laboratory rather than ecological domination. The remaining six groups that contain multiple clusters are the lytic superclusters typified by phage N4 (2 clusters, 28 phages, 7 host genera), SETP3 (3 clusters, 88 phages, 8 host genera) and rV5 (2 clusters, 46 phages, 6 host genera) and the temperate superclusters typified by lambda (25 clusters, 169 phages, 15 host genera), P1 (2 clusters, 7 phages, 2 host genera), and P2 (2 clusters, 25 phages, 6 host genera). Thus, superclusters of both lytic and temperate types are internally very diverse, infecting multiple hosts with a wide diversity of phage clusters. A closer look at diversity within the Enterobacteriales T7 supercluster reveals gene synteny with minor variations and conservation across all eight of its clusters for key early genes (host restriction proteins, DNA polymerase and RNA polymerase), middle genes (DNA metabolism functions and lysozyme) and late genes (virion structure and assembly). In general, within each of these clusters member phages share >50% of their proteome, whereas the proteome conservation is often ~30% between these “closely-related” clusters. Phages in the even more diverse lambda supercluster also contain conserved modules of function, including the following genes in the same order (with only a few minor exceptions): terminase and head assembly genes, followed by tail assembly genes, lysogenic conversion genes, integrase and homologous recombination genes, the early gene transcriptional control genes (antiterminator [N], prophage repressor [CI], cro repressor and CII transcriptional activator), DNA replication, “nin region” genes, late gene transcriptional antiterminator and lysis genes. Nearly every lambdoid cluster contains a unique, very different set of head assembly genes; the only exceptions are the P22-/APSE-like and lambda-/ø80-/N15-like groups of clusters, which have similar head assembly genes but have major differences elsewhere in the genome.
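The proteome conservation figures quoted above for the T7 supercluster can be illustrated with a minimal sketch. It assumes that every gene has already been assigned to a homolog family by some upstream protein comparison step, which is not shown, and it takes proteome conservation to mean the fraction of one phage's gene families that are also found in the other. The text does not specify the denominator, so that choice, the function name and the family labels are all assumptions made for illustration.

```python
from typing import Iterable


def proteome_conservation(families_a: Iterable[str],
                          families_b: Iterable[str]) -> float:
    """Fraction of phage A's gene families that also occur in phage B."""
    set_a, set_b = set(families_a), set(families_b)
    if not set_a:
        raise ValueError("phage A has no annotated gene families")
    return len(set_a & set_b) / len(set_a)


# Hypothetical family labels for two phages that share only their
# virion assembly genes.
phage_x = {"terminase_large", "portal", "major_capsid", "tail_fiber", "lysozyme"}
phage_y = {"terminase_large", "portal", "major_capsid",
           "integrase", "repressor", "replicase"}

x_in_y = proteome_conservation(phage_x, phage_y)
y_in_x = proteome_conservation(phage_y, phage_x)
print(f"X families found in Y: {x_in_y:.0%}")
print(f"Y families found in X: {y_in_x:.0%}")
```

Because the measure is directional under this definition, a comparison of genomes with different gene numbers gives two values, and a study would have to state which one (or which combination) it reports.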

Contribution of Prophages to Phage Diversity

Although it has not been studied quantitatively, isolation of phages from the environment may sometimes underestimate temperate phages. For example, nearly all of the Enterobacteriales “model system” temperate phages that have been studied in great depth in the laboratory (e.g., phages P1, P2, P22 and lambda) were isolated after they were released from prophage-carrying bacteria in the laboratory rather than as infectious virions from environmental samples. Perhaps they spend most of their time as prophages and less as virions? This may be less true for the Actinobacteria phages, as many environmental temperate phages have been isolated that infect that host taxon. Bacterial genomes are known that harbor >15 prophages, and in one of the few systematic studies in this area, a 2016 analysis of 3298 Salmonella genomes identified 9371 tailed phage prophages that belonged to the 20 temperate Enterobacteriales phage clusters that had been defined at the time. Prophages are clearly very abundant and diverse. It is very likely that unrecognized prophages that belong to currently undiscovered clusters are also present in bacterial genome sequences, but because of the great diversity of tailed phages no computer search engine currently exists to rigorously discover and classify all of them. Clearly, since bacterial cells are thought to be about 10% as abundant as phage virions in the environment, the number of tailed phage prophage DNAs on Earth is likely within about an order of magnitude of the number of free virions, and they represent a still rather poorly understood source of phage diversity.
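The order-of-magnitude argument in this paragraph can be checked with back-of-envelope arithmetic. The sketch below uses the two figures given in the text (9371 prophages in 3298 Salmonella genomes, and bacterial cells about 10% as abundant as virions). Applying the Salmonella average to bacteria in general is an assumption made here purely for illustration, not a claim made by the source, and the variable names are hypothetical.

```python
salmonella_genomes = 3298
prophages_found = 9371
cells_per_virion = 0.10  # "about 10% as abundant as phage virions"

# Average number of recognized tailed phage prophages per sequenced genome.
prophages_per_genome = prophages_found / salmonella_genomes

# ASSUMPTION: the Salmonella average is representative of bacteria at large.
prophages_per_virion = prophages_per_genome * cells_per_virion

print(f"prophages per genome: {prophages_per_genome:.2f}")
print(f"prophages per free virion: {prophages_per_virion:.2f}")
```

Under that assumption the ratio of prophages to free virions lies between 0.1 and 1, which is consistent with the statement that the two numbers are within about an order of magnitude of each other.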

Phage Diversity and Horizontal Exchange of Genetic Information

The Nature of Horizontally Exchangeable Mosaic Section Alleles

Many reports have emphasized credible examples of past horizontal exchange of genetic information amongst the tailed phages, and although such exchange does not increase the number of phage gene types, it increases the diversity of gene combinations, increases the diversity within the participating clusters, provides opportunity for further gene divergence and spreads phage genetic information across the Bacteria domain. The frequency of such horizontal exchange events between phage types is not known, but it is not high enough to homogenize all such genomes or even disrupt the clear division of phages into clusters. Comparison of phages within any cluster typically shows patchy similarities, with regions of high similarity interspersed with regions that are very different. Such genomes are said to be “mosaically related”, and the different patches are thought to have arisen through horizontal exchange mechanisms. Genome mosaicism was recognized early in the study of E. coli phage lambda and its relatives and has since been shown to be present to varying degrees in most Enterobacteriales phage clusters. In general, temperate phage clusters appear to exhibit more extensive mosaicism than virulent ones, and mosaicism has been studied in most detail in the phage lambda supercluster. All members of each supercluster have syntenic genomes in which genes with parallel functions have the same chromosomal order and similar overall transcriptional gene expression patterns. At present the genomes of about 170 authentic phages (ignoring thousands of related prophage sequences) that belong to the Enterobacteriales-infecting lambda supercluster have been completely sequenced, and these are currently divided into 25 clusters. Comparison within each cluster shows much more overall similarity than comparisons between clusters; however, even phages within a cluster often show substantial mosaicism, and Fig. 2 shows, as an example, the relationship between two typical members of the temperate phage P22-like cluster within the lambda supercluster. This cluster has been studied in detail and contains at least 25 exchangeable mosaic segments or modules within the cluster of 15 virion assembly genes that make up about half of their genomes. Such segments can encode parts of genes, whole genes or clusters of genes.

[Fig. 2 image: dot plot comparing the genomes of phages P22 and Emek, with both axes marked in kbp and the functional gene clusters of P22 labeled above the plot.]

Aligned nucleotide sequences from the bottom panel of the figure (figure label: P22 tailspike gene codon 114):

P22   TGCTAACGTATTGAAGTACGATCCAGATCAATATTCAATAGAAGCTGATAAAAAATTTAAGTATTC
Emek  TGCTAACGTATTGAAGTATGACCCAGATCAGCTTGAATACAGGCTGAGCCAACCAGACGGTTATCT

Fig. 2 Phage genome mosaicism. The typical mosaicism present in the genomes of phages P22 and Emek in the P22-like cluster within the phage lambda supercluster is shown in the dot plot (Accession numbers BK000583 and JQ806763, respectively; plot was produced by DNA Strider with a scan window threshold of 17 identities per 23 bp). The transcription pattern and location of functional gene clusters of P22 are indicated above the plot. Below the plot, the two-domain tailspike proteins are diagrammed with white N-terminal domains indicating their strong similarity and two shades of gray indicating that their C-terminal receptor-binding domains have no recognizable homology. At the bottom, the aligned nucleotide sequences at the mosaic boundary between the N- and C-terminal domains of the two phages’ tailspike genes are shown (translation is from left to right). We note that P22 and Emek infect Salmonella serovars Typhimurium and Haardt, respectively. These two Salmonella serovars have very different surface polysaccharides, and the C-terminal polysaccharide-binding tailspike domains are essentially unrelated in amino acid sequence.
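The abruptness of the mosaic boundary can be examined directly from the two 66-nucleotide sequences printed in Fig. 2. The sketch below is an illustration only. It assumes that the two printed lines are aligned position by position without gaps, which the extracted figure does not confirm, and the function names are hypothetical. It measures nucleotide identity on either side of a candidate split point and reports the split that gives the sharpest contrast.

```python
# Sequences copied from the bottom panel of Fig. 2.
P22 = "TGCTAACGTATTGAAGTACGATCCAGATCAATATTCAATAGAAGCTGATAAAAAATTTAAGTATTC"
EMEK = "TGCTAACGTATTGAAGTATGACCCAGATCAGCTTGAATACAGGCTGAGCCAACCAGACGGTTATCT"


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def leading_identical_run(a: str, b: str) -> int:
    """Number of identical positions before the first mismatch."""
    run = 0
    for x, y in zip(a, b):
        if x != y:
            break
        run += 1
    return run


def best_split(a: str, b: str) -> int:
    """Split index k with the largest drop in identity from a[:k] to a[k:]."""
    best_k, best_drop = 1, float("-inf")
    for k in range(1, len(a)):
        drop = identity(a[:k], b[:k]) - identity(a[k:], b[k:])
        if drop > best_drop:
            best_k, best_drop = k, drop
    return best_k


k = best_split(P22, EMEK)
print("identical leading positions:", leading_identical_run(P22, EMEK))
print("sharpest split after position:", k)
print(f"identity before split: {identity(P22[:k], EMEK[:k]):.0%}")
print(f"identity after split:  {identity(P22[k:], EMEK[k:]):.0%}")
```

Under the gapless assumption, identity after the split falls to roughly the level expected for unrelated sequences of four equally frequent bases (about 25%), which is the kind of abrupt transition the text attributes to mosaic module boundaries.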

Three different types of diversity exist within allelic sets of mosaic modules, and the mosaic section that encodes the P22 tailspike C-terminal receptor-binding domain demonstrates this as follows: (1) some such alleles are divergent but retain recognizable protein sequence similarity (e.g., phage P22 (Accession No. KR296686) and Det7 (Accession No. KP797973) tailspikes); (2) some are ancient homologs that have diverged to the point that recognizable sequence similarity is lost but they retain similar protein folds (e.g., P22 and Sf6 (Accession No. AF547987) tailspikes); and (3) others are unrelated and have different protein folds that serve the same function (e.g., P22 and CUS-3 (Accession No. CP000711) tailspikes). The number of different alleles that are known for the various mosaic modules along the genome is highly variable. There are, for example, three major sequence types of P22-like major capsid protein, but there are many more types of tailspike C-terminal domain, which binds the virion to its polysaccharide receptor and so is the primary determinant of host specificity; each of its >70 known very different sequence types in the P22-like phages is responsible for binding a different polysaccharide. Thus, there can be very extensive diversity even within a phage cluster.

How did Genome Mosaicism Arise? The patches of similarity and difference (mosaic modules) appear to be the result of recombination after rare horizontal exchange, usually between divergent members of a cluster or supercluster. A few short homologous syntenic sequences have been found between mosaic modules (often transcription terminators) that could allow homologous recombination to create new combinations of adjacent mosaic module alleles; however, no such common sequences have been found between most adjacent modules. Fig. 2 shows that module boundaries are often very abrupt. Thus, generation of new combinations of adjacent alleles (i.e., new mosaic section boundaries) is likely usually due to random nonhomologous recombination. The probability of generation of new combinations by random processes is clearly very low, but it does not seem unreasonable since phages are so abundant that random processes, even if they are infrequent, might be able to generate many different nonhomologous recombinants over evolutionary time. The probability of success is increased within a supercluster of phages by the fact that such phages have the same gene order, so a single rare non-homologous recombination event that happens to occur at the same location between two exchangeable DNA (mosaic) sections of two such phages will result in a genome that contains all the modules that are necessary to be a functional phage (i.e., multiple recombination events are not required). If such a rare recombinant happens to have an evolutionary advantage it will survive and thrive. Once such a nonhomologous recombination event has created a new module boundary, new combinations of nonadjacent module alleles can arise rapidly by homologous recombination within a module allele that two phages have in common. These exchangeable modules appear to be evolutionarily shuffled more or less randomly within clusters, suggesting that each unit may have evolved to function at least semi-autonomously. 
This idea is supported by the fact that where information is available, intragenic boundaries coincide with protein domain boundaries, and laboratory construction of new combinations of modules (not previously observed in nature) usually creates a functional phage.
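The point that shared gene order lets a single crossover yield a complete genome can be illustrated with a toy sketch. The module names, allele labels, and the `crossover` and `is_complete` helpers below are hypothetical and chosen only for illustration; real phage genomes are far more complex than an ordered list of five modules.

```python
# Toy model: a phage genome is an ordered list of (module, allele) pairs.
# Module names and allele labels are invented for illustration only.
MODULE_ORDER = ["terminase", "capsid", "tail", "tailspike", "lysis"]

phage_a = [(module, "a") for module in MODULE_ORDER]
phage_b = [(module, "b") for module in MODULE_ORDER]


def crossover(left, right, boundary):
    """Join the first `boundary` modules of `left` to the remainder of `right`."""
    return left[:boundary] + right[boundary:]


def is_complete(genome):
    """True if the genome carries every module exactly once, in the shared order."""
    return [module for module, _ in genome] == MODULE_ORDER


# Same gene order: one crossover at the same boundary in both genomes gives
# a complete genome with a new combination of adjacent alleles.
recombinant = crossover(phage_a, phage_b, 3)

# Different gene order: the same single crossover duplicates some modules
# and loses others, so the product is not a complete genome.
shuffled_b = [(module, "b") for module in reversed(MODULE_ORDER)]
broken = crossover(phage_a, shuffled_b, 3)

print(is_complete(recombinant), [allele for _, allele in recombinant])
print(is_complete(broken), [module for module, _ in broken])
```

In this sketch the syntenic pair needs only one join to produce a full set of modules, whereas the non-syntenic pair would need several coordinated events, which is the reason the text gives for the higher probability of success within a supercluster.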

Genetic Exchange Among Superclusters Rare successful recombination events have also clearly occurred between phages that belong to different superclusters, and most of these involve tail fibers or tailspikes. Two well-studied examples are the tailspikes of phages P22, 9NA (Accession No. KJ802832) and Det7 and the long tail fibers of phages T4 (Accession No. AF158101) and lambda (Accession No. J02459). The first three phages are members of different superclusters in the Podoviridae, Siphoviridae, and Myoviridae, respectively. They have essentially no overt nucleotide sequence similarity except between the C-terminal domains of the tailspikes that bind to the O-antigen polysaccharide on the surface of Salmonella enterica serovar Typhimurium. The second pair of phages above also reside in very different superclusters (a lytic and a temperate cluster) and are members of the Myoviridae and Siphoviridae, respectively. They have similar long tail fiber genes, and the lambda fiber can functionally replace the T4 fiber. It seems reasonable that these islands of similarity in otherwise extremely distantly related phages must have been horizontally transferred by very rare nonhomologous recombination and have given the recipient phage access to a new, naïve host that allowed it to flourish.

Phage-host Relationships and Phage Diversity Phage Cluster-Host Species Relationships With the Enterobacteriales The relationship between phage clusters and their hosts must reflect their evolutionary history. But do phages simply co-evolve with their hosts over long evolutionary periods so that each phage cluster infects a single host taxon (species, genus, etc.), or is jumping to a new taxonomically very different host sufficiently common to disrupt this kind of simple co-evolutionary descent? This aspect of phage diversity is very poorly understood at present. The following paragraphs describe our current knowledge in this arena. When phage cluster membership is compared with the hosts on which the phages were isolated, there is one strong conclusion. Very few phages from outside the Enterobacteriales have been found that clearly fall within Enterobacteriales phage clusters, so phages appear to jump between distantly related hosts quite rarely. We suppose that there are at least two major factors that limit such jumps across large host taxonomic chasms. (1) Virions recognize specific bacterial surface receptor molecules and disparate hosts are unlikely to have very similar surface features, so delivery of virion nucleic acid into very distantly related cells should be an uncommon event. (2) Perhaps more importantly, phages must be evolutionarily tuned to optimally infect their primary hosts


Fig. 3 The N4 supercluster dot plot. Representative N4 supercluster phages are included in the dot plot. Thin red lines separate genomes, thick red lines separate subclusters, and blue lines separate clusters. Clusters and host genera are indicated at the right and thick vertical colored lines there indicate the taxonomic order to which the hosts belong. Note that the orders Enterobacteriales and Pseudomonadales are in the Gamma Proteobacteria class, and Burkholderiales is in the Beta Proteobacteria.

in terms of the many ways in which their encoded proteins interact with host components, for example during the commandeering of host macromolecular synthesis machinery. Such tuning would mean that phages would likely infect distantly related bacteria with suboptimal efficiency and so would not compete well with the native phages there. On the other hand, any given non-singleton Enterobacteriales phage cluster often contains phages that have different hosts. This suggests that either phages are more frequently jumping between closely related hosts, or that co-evolution of phages and hosts during host species separation has not driven the phages into different clusters. For example, Enterobacteriales clusters with more than four member phages often have hosts from multiple genera that are typically but not always within one family. Mycobacterium smegmatis phage Patience (Accession No. JN412589) is a counterexample of an unusual phage that may have switched to a very different host relatively recently. Nonetheless, closely related phages (phages that belong to the same subcluster) often infect the same host. Our 2014 analysis found that 78% of the non-singleton subclusters were populated by phages that infect a single host genus. Thus, there appears to be a strong tendency (but not a requirement) for closely related phages to infect closely related hosts. We emphasize that phage-host relationships must always be viewed with some caution because of inherent complicating issues. The most prominent issue is that the full host range of most phages is not known. Rigorous determination of the full host range of any phage would require plaque assay or other tests for successful infection on an impractically large number of potential hosts. 
Therefore, careful and extensive host range studies are rarely performed due to the technically difficult, labor intensive and time consuming nature of testing a wide variety of bacteria, so nearly all host range studies have been limited to a small number of closely related potential hosts, which may not be indicative of true host range (see below).

Narrow and Wide Host Range Phages Phages have a reputation for having very narrow host ranges, and many phages are indeed very host specific. Some phages, like those whose virions adsorb to bacterial surface polysaccharides, often successfully infect only a small fraction of cells in their host species. For example, phages such as S. enterica phage P22 and E. coli phage CUS-3 only adsorb to hosts whose surface polysaccharide has a repeating -mannose-rhamnose-galactose- unit or is polysialic acid, respectively. A few phage virions have up to four or five different tailspikes that allow them to adsorb to, and thus infect, hosts with several different surface polysaccharides. Salmonella and Escherichia strains are known that have over 200 different surface polysaccharides, so even phages that recognize several such receptors have rather narrow host ranges restricted to a small fraction of these species. At the other end of the scale there are reports of very broad host range phages that can infect quite disparate hosts, but “broad host range” is a poorly defined term that is commonly used to indicate the ability to productively infect many different strains within a species or multiple


relatively closely related species. Nonetheless, a few phages have been reported that are able to infect hosts from two different bacterial orders, Pseudomonadales and Enterobacteriales, within the class Gamma Proteobacteria (e.g., EMCL-117-like phage øFenriz, Accession No. KT254133), two different orders, Chroococcales and Synechococcales, within the phylum Cyanobacteria (T4-like phage Syn9, Accession No. NC_008296), and even two different bacterial classes, Alpha and Beta Proteobacteria, within the phylum Proteobacteria (phage P14, Accession No. KX660669). One study even suggested some phages are capable of infecting species across several bacterial phyla including Proteobacteria, Actinobacteria and Bacteroidetes (phages øFenriz and close relatives). Such reports of successful infection of extremely distantly related hosts by a single phage species are tantalizing but very rare and should be regarded with caution until the details of infection of the disparate hosts have been described. Nonetheless, if true, such phages could be vehicles for spreading diverse phage genes across the bacterial spectrum. Finally, we re-emphasize that phages from a given cluster are often able to infect different bacterial hosts, especially if the hosts are not very distantly related, indicating that particular lifestyles do not necessarily restrict phages to particular hosts within the Enterobacteriales. Fig. 3 shows this for the Enterobacteriales phage N4 supercluster, where members are known that infect seven different Enterobacteriales genera: Escherichia, Shigella, Salmonella, Klebsiella, Enterobacter, Erwinia and Pectobacterium. For example, Escherichia phages EC1-UPM (Accession No. KC206278), IME11 (Accession No. NC_019423) and G7C (Accession No. NC_015933) are very close relatives, while Escherichia phage Pollock (Accession No. KM236242) is quite different. On the other hand, Klebsiella phage Pylas (Accession No. MH899585) is more closely related to phage Pollock than it is to the other Klebsiella phage KP8 (Accession No. MG922974) in this group. Although the relationships of most of the phages with hosts outside the Enterobacteriales have not yet been compared in detail to all the Enterobacteriales phages, a few phages have been identified from other orders in the Gamma Proteobacteria class and even other classes in the Proteobacteria phylum that fall within the Enterobacteriales clusters. For example, Fig. 3 also shows that Pseudomonas phage Inbricus (Pseudomonadales, Gamma Proteobacteria; Accession No. MG018928) and Achromobacter phages øAxp-3 and JWDelta (Burkholderiales order in Beta Proteobacteria; Accession Nos. KT321317 and KF787094, respectively) fall within the N4-like cluster, while other Enterobacteriales phages CB1 and Nepra form a separate CB1-like cluster within the N4 supercluster. Similarly, other “outside” phages fall into the Enterobacteriales T4-like, T7-like and P2-like clusters or superclusters. Thus, similar molecular lifestyles (superclusters) can extend at least as far as across the Proteobacteria, and “T4-like” phages that infect the Cyanobacteria phylum suggest they may extend much further. It remains to be seen if such broad reaches are common or exceptions.

Summary Bacteriophages, like all viruses, are certainly very ancient and appear to have arisen independently multiple times; for example, virions can contain single-stranded RNA or DNA or double-stranded RNA or DNA, and the proteins that form the virion’s protective coats are present in different phage types with at least five unrelated polypeptide folds. The dsDNA tailed phages are extremely abundant in the environment and their diversity has been studied in most detail through comparison of whole genome sequences, so this discussion focused on their diversity. Both top-down and bottom-up methods have shown that the tailed phages are extremely diverse. Multiple types (called clusters) of tailed phages exist that infect essentially all bacterial species that have been examined, and different phages within such a cluster often infect different (but almost always closely related) host species. The number of bacterial species on Earth is not known, but estimates have ranged widely from millions to billions, so the number of tailed phage types or clusters is likely to be similarly enormous. In addition, very substantial diversity exists within each phage cluster. Thus, phages are almost incomprehensibly diverse in general features as well as details, and many of these are yet to be discovered.

Further Reading
Aiewsakun, P., Adriaenssens, E.M., Lavigne, R., Kropinski, A.M., Simmonds, P., 2018. Evaluation of the genomic diversity of viruses infecting bacteria, archaea and eukaryotes using a common bioinformatic platform: Steps towards a unified taxonomy. Journal of General Virology 99, 1331–1343.
Brum, J.R., Sullivan, M.B., 2015. Rising to the challenge: Accelerated pace of discovery transforms phage biology. Nature Reviews Microbiology 13, 147–159.
Casjens, S.R., 2005. Comparative genomics and evolution of the tailed-bacteriophages. Current Opinion in Microbiology 8, 451–458.
Casjens, S.R., Thuman-Commike, P.A., 2011. Evolution of mosaically related tailed bacteriophage genomes seen through the lens of phage P22 virion assembly. Virology 411, 393–415.
Casjens, S.R., Grose, J.H., 2016. Contributions of P2- and P22-like prophages to understanding the enormous diversity and abundance of tailed bacteriophages. Virology 496, 255–276.
Clokie, M.R., Millard, A.D., Letarov, A.V., Heaphy, S., 2011. Phages in nature. Bacteriophage 1, 31–45.
Doore, S.M., Fane, B.A., 2016. The Microviridae: Diversity, assembly, and experimental evolution. Virology 491, 45–55.
Grose, J.H., Casjens, S., 2014. Understanding the enormous diversity of bacteriophages: The tailed phages that infect the bacterial family Enterobacteriaceae. Virology 468–470, 421–443.
Guerin, E., Shkoporov, A., Stockdale, S.R., et al., 2018. Biology and taxonomy of crAss-like bacteriophages, the most abundant virus in the human gut. Cell Host & Microbe 24, 653–664.
Hatfull, G.F., 2018. Mycobacteriophages. Microbiology Spectrum 6. GPP3-0026-2018.
Hayes, S., Mahony, J., Nauta, A., van Sinderen, D., 2017. Metagenomic approaches to assess bacteriophages in various environmental niches. Viruses 9, 127.
Hendrix, R.W., 2002. Bacteriophages: Evolution of the majority. Theoretical Population Biology 61, 471–480.
Lawrence, J.G., Hatfull, G., Hendrix, R., 2002. The imbroglios of viral taxonomy: Genetic exchange and the failings of phenetic approaches. Journal of Bacteriology 184, 4891–4905.
Mahmoudabadi, G., Phillips, R., 2018. A comprehensive and quantitative exploration of thousands of viral genomes. eLife 7, e31955.
Manrique, P., Bolduc, B., Walk, S.T., et al., 2016. Healthy human gut phageome. Proceedings of the National Academy of Sciences of the United States of America 113, 10400–10405.
Pope, W., Mavrich, T.N., Garlena, R.A., et al., 2017. Bacteriophages of Gordonia spp. display a spectrum of diversity and genetic relationships. mBio 8 (4), e01069-17.
Rohwer, F., 2003. Bacteriophage diversity. Cell 113, 141.
Roux, S., Hallam, S.J., Woyke, T., Sullivan, M.B., 2015. Viral dark matter and virus-host interactions resolved from publicly available microbial genomes. eLife 4, e08490.
Sepulveda, B.P., Redgwell, T., Rihtman, B., et al., 2016. Marine phage genomics: The tip of the iceberg. FEMS Microbiology Letters 363, fnw158.
Yutin, N., Backstrom, D., Ettema, T.J.G., Krupovic, M., Koonin, E.V., 2018. Vast diversity of prokaryotic virus genomes encoding double jelly-roll major capsid proteins uncovered by genomic and metagenomic sequence analysis. Virology Journal 15, 67.
Yutin, N., Makarova, K.S., Gussow, A.B., et al., 2018. Discovery of an expansive bacteriophage family that includes the most abundant viruses from the human gut. Nature Microbiology 3, 38–46.

Relevant Websites https://phagesdb.org/ PhagesDB.org. https://talk.ictvonline.org/taxonomy/ Taxonomy.


Genetic Mosaicism in the Tailed Double-Stranded DNA Phages Welkin H Pope, University of Pittsburgh, Pittsburgh, PA, United States © 2021 Elsevier Ltd. All rights reserved.

Glossary
BLAST Basic Local Alignment Search Tool. Algorithm that identifies related nucleotide or protein sequences using local alignments, available at NCBI.
Capsid Protein shell that surrounds the genetic material of a phage. In the Caudovirales, the capsid is twenty-sided (icosahedral) and may be either isometric (all sides are the same length) or prolate (elongated).
Caudovirales In formal phage taxonomy, Caudovirales (“tailed viruses”) is the name of the Order of phages that use double-stranded DNA as genetic material, enclosed in a protein capsid and possessing a protein tail that is used to recognize the host cell and deliver the DNA from the head.
Circularly permuted Describes phage genomes from phages that use the headful packaging strategy; these phages do not have a defined start or end to their genome, but instead package DNA from a long concatemer of the genome sequence. The DNA is cut when the phage head is “full” and includes more than 100% of the genome sequence, resulting in identical sequences at the ends of each piece of DNA within a phage head. Each virion contains the complete set of genes within the genome, yet each individual virion contains a different chromosomal sequence.
Direct repeat Property of some phage genome ends, in that the same DNA sequence is found at both the beginning and end of the piece of DNA within the phage head. Allows for circularization of the phage genome via homologous recombination after injection into the host cell, as well as for linear multimer concatemers.
Endolysin Phage-encoded enzymes that cleave the bacterial cell wall from within, aiding the release of the virions at the culmination of the lytic cycle.
Exclusion Refers to the mechanism by which a cell prevents infection by phages via modification of cell surface proteins, preventing phage attachment and DNA injection.
Genome The complete DNA sequence of an individual phage.
Host A cell that a phage can infect and propagate within.
Host range The different bacterial strains that a specific phage can infect.
Identity Sequences that are identical.
Immunity Refers to the mechanism by which a cell is resistant to phage infection via expression and binding of an immunity repressor to the invading phage DNA.
Immunity repressor DNA binding protein encoded by the phage genome that prevents expression of phage genes involved in the lytic cycle. The immunity repressor is continually synthesized during lysogeny.
Integrase Enzyme that promotes site-specific recombination between the phage genome and the bacterial genome at attachment sites, resulting in a single stable piece of DNA.
Lysogenic The cycle in which some phages (temperate phages) delay the lytic cycle after DNA injection, and instead exist within a host cell as a quiescent piece of DNA (a prophage).
Lysogen-y A cell that contains a prophage is a lysogen, and exists in the state of lysogeny.
Lytic Describes the cycle of phage infection from DNA injection, through phage genome replication, synthesis and assembly of virions, and burst of the host cell. Also describes phages that undergo the lytic cycle (“lytic phages”).
Myoviridae The taxonomic Family name for the contractile tailed phages.
Pham (or phamily) Group of related protein sequences as calculated by the program Phamerator.
Plaque A clearing in a lawn of bacteria due to the presence of an initial single virion that subsequently underwent multiple rounds of the lytic cycle, killing the host cells in its proximity.
Plasmid An additional piece of DNA within a host organism, smaller than the larger genome, that is replicated and passed down to daughter cells during division.
Podoviridae The taxonomic Family name for the short tailed (extensible tailed) phages.
Prophage The repressed genome of a temperate phage found within a host cell.
Recombination, homologous and non-homologous Process by which two pieces of DNA break and swap ends, such that pieces “A–B” and “C–D” become “A–D” and “C–B”. Homologous recombination occurs between identical sequences within the strands and happens more frequently than non-homologous recombination, which requires no sequence identity.
Resistant A cell that is not susceptible to phage infection.
Similarity Sequences that show conservation of related amino acids in corresponding positions, even if the sequence is no longer identical.
Siphoviridae The taxonomic Family name for the noncontractile tailed phages, also called the flexible-tailed phages.
Synteny The conservation of gene order observed within phage genomes.
Tail Protein structure that mediates host recognition and DNA delivery in the tailed phages. Tails are composed of multiple different proteins and are found in a number of morphologies.
Tail fiber The long, thin, extended proteins attached to the tail tip that are the primary mechanism by which a phage binds to its host cell.
Tape measure The protein that makes up the core of the tail in the flexible-tailed phages; the length of the tape measure is directly proportional to the length of the phage tail.
Temperate Describes phages that are capable of delaying the lytic cycle after their DNA is injected into the host cell; instead the phage undergoes the lysogenic cycle, in which the DNA is replicated within the cells (called lysogens) and passed on to daughter cells after cell division. When host cells are stressed, the lytic cycle may be induced.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20961-8

Terminase ATP-driven DNA packaging motor encoded by phage genomes. By convention, the terminase gene is frequently assigned position 1 in the genomes of flexible-tailed phages; in the short-tailed Podoviridae, the terminase is found towards the right end of the genome.
Virion An individual virus or phage particle.

Introduction Bacteriophages (or phages, the viruses of bacteria) are the most numerous biological entities on the planet, with an estimated global population of 10^31 and a turnover time of several days; these numbers are based on measurements taken by the Colwell lab in 2000 and the Suttle lab in 2007. Phages are composed of nucleic acid (dsDNA, ssDNA, dsRNA, or ssRNA) surrounded by a protective proteinaceous shell. The majority (over 90%) of the sequenced phage isolates in the non-redundant “Nucleotide” collection in GenBank as of February 2019 are members of the Caudovirales (also called the tailed phages). These phages use dsDNA as genetic material, enclosed in an icosahedral “head” (or capsid); attached at a single vertex is a complex, multiprotein structure called a “tail”, which mediates recognition of the bacterial host. Tailed phage morphology broadly falls into three groups and is the basis for the taxonomic classification of these phages: the Myoviridae, or contractile-tailed phages; the Siphoviridae, the non-contractile or flexible-tailed phages; and the Podoviridae, the short-tailed phages. Numerous isolates of each morphological type have been recovered from a variety of hosts and habitats; comparatively few of these have had their genomes sequenced and analyzed using tools to predict the location and functions of their genes and to determine how similar they are to each other. Comparisons of the genome sequences in smaller, host-specific sets, as well as across the broad collection as a whole, have illuminated a number of properties of phage genes and genome sequences that are conserved throughout the morphological types, and have allowed scientists to gain insights into global viral genetic space. These properties include “mosaicism”: phage genomes are collections of individual genes with separate evolutionary histories, much like a mosaic is a piece of art composed of many different individual tiles. Independently, the tiles (or genes) are less meaningful, but together the tiles form a single large mosaic picture, just as phage genes collectively encode the instructions for the life cycle of a single phage.

Evidence for Genome Mosaicism Before the Advent of DNA Sequencing The mosaic nature of tailed bacteriophage genomes was first described in the hybridization studies of DNA from Escherichia coli phage lambda and its relatives in 1971. DNA from these phages was melted to single strands, mixed, and allowed to re-anneal and form heteroduplexes; these were examined with transmission electron microscopy, and the lengths and locations of the bound, complementary sections versus unhybridized pieces carefully mapped. It was readily apparent that there were portions within each of the phage genomes that had sufficient identity to form stable duplexes with the others interspersed with regions of non-identity; and that these sequences were not uniformly distributed across the different phage genomes. Some phages shared certain portions of their genomes; other portions of their genomes were shared with other phages that did not contain the first segment. Subsequent genetic studies of the Salmonella phage P22 identified genes and regions of genome organization in common with the E. coli lambdoid phages – surprising investigators who had grouped the P22-like phages with the E. coli T7-like phages based on their shared short-tail morphology. These observations of shared genetics between morphologically dissimilar phages, among others, led Susskind and Botstein (1978) to propose a method of phage genome evolution based on swapping of modules of genes; suggesting that this was accomplished by homologous recombination – breaking of the two DNA strands at identical sequences followed by rejoining to the other phages’ genome, creating a new overall sequence. They proposed this was promoted by the existence of short, identical “linker” sequences flanking related modules; such a mechanism could result in the observed mosaic patterns between genomes. 
In phages in which no linker sequences could be identified, they proposed that periodic loss of linker sequences through mutation could occur, thus preventing their identification in sequence comparisons.

DNA Sequencing Illuminates Properties of Bacteriophage Genomes With the advent of DNA sequencing, phages were an obvious target due to their short genomes and tractability as a model system; indeed, the first complete genome sequence of any entity reported was that of phage phiX174, by Sanger (1977), with the larger phage lambda following in 1983. Subsequent genome sequencing and comparative analyses through the 1990s revealed sequence similarity and shared genome architecture among phages across a variety of hosts, including Gram-negative E. coli and Salmonella phages as well as Gram-positive Mycobacterium and Streptomyces phages. By 1999, sufficient phage genomes had been fully sequenced that Hendrix et al. proposed that all tailed phages were related through common ancestry and had access to a global pool of genes, and that mosaicism, the alternation of identical sequences with flanking unrelated sequences, was a universal property of phage genomes. Interestingly, the mosaicism was not distributed at random through the genomes, but instead mapped most often to gene boundaries, such that intact genes were shared between otherwise unrelated genomes. Nor was


the mosaicism limited to a few mobile elements, such as transposons, found across a large number of genomes, but was identifiable throughout all regions and genes within phage genomes. With the analysis of additional phage genome sequences, genetic mosaicism continued to be observed throughout phages of different hosts and morphologies, and it became evident that homologous recombination between linker sequences facilitating the exchange of genetic modules could not be the primary driver behind genome diversity and evolution, as multiple versions of genes with significant sequence similarity were found across otherwise unrelated phage genomes. Instead, Hendrix et al. (1999) proposed that the driving force behind the mosaicism present in phage genomes must be non-homologous recombination. This argument was based on the sheer magnitude of the numbers involved, with a rationale along these lines: with 10^31 phage virions on the planet and a predicted 10^23 productive infections per second, a certain number of those infections would be co-infections, bringing related or unrelated phage genomes into direct contact with each other inside the same host cell. If a fraction of those co-infections underwent a non-homologous recombination event, there could be on the order of 10^16 such events per second. The majority of these non-homologous recombination events are likely to produce non-viable progeny (if the center of one phage’s capsid gene is swapped for the middle of the other phage’s DNA polymerase gene, both progeny will be unable to produce new phages in the next generation). The recombination events that produce viable progeny, occurring at a rate of approximately 10^9 per second, will therefore be more likely to fall at gene or protein domain boundaries, thus leading to the observed pattern of genetic mosaicism in phage genomes.
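The order-of-magnitude reasoning above can be made explicit with a short calculation. This is only a sketch: the three rates are the figures quoted in the text (10^23 productive infections, 10^16 non-homologous recombination events and 10^9 viable recombinants per second), while the two fractions and the yearly total are values implied by those figures, not measurements reported by the source. The variable names are hypothetical.

```python
from fractions import Fraction

# Rates quoted in the text, per second (orders of magnitude only).
productive_infections = 10**23
nonhomologous_events = 10**16
viable_recombinants = 10**9

# Implied fraction of infections that are co-infections AND undergo a
# non-homologous recombination event (the text does not separate the two).
frac_recombining = Fraction(nonhomologous_events, productive_infections)

# Implied fraction of non-homologous events that yield viable progeny.
frac_viable = Fraction(viable_recombinants, nonhomologous_events)

# Viable recombinants accumulated over one 365-day year at that rate.
seconds_per_year = 60 * 60 * 24 * 365
viable_per_year = viable_recombinants * seconds_per_year

print(f"implied recombining fraction: {float(frac_recombining):.0e}")
print(f"implied viable fraction:      {float(frac_viable):.0e}")
print(f"viable recombinants per year: {viable_per_year:.2e}")
```

Under these figures each step retains only about one event in ten million, yet the starting rate is so large that a very large number of viable recombinants would still accumulate every year.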

Comparative Genomics of Complete Bacteriophage Sequences Shows Diversity and Mosaicism As DNA sequencing costs dropped, more phages were completely sequenced, including groups of Lactococcus phages and E. coli phages. A detailed analysis of the genomes of lambda-like E. coli phages, including HK97 and HK022, by Juhala et al. (2000) demonstrated that the mosaicism present in the genomes was not limited to gene or module boundaries, even though these sites were overrepresented in the reassortment of the genes between genomes. Indeed, recombination events had clearly taken place between phages within coding sequences, although these were not evenly distributed across all genes: more events were identified in tail fiber genes than in others. This suggested that the mechanism behind these patterns was rampant non-homologous recombination between co-infecting phages or prophages that were previously integrated in the host cell. The prevalence of mosaic events in tail fibers could be part of a mechanism to increase the rapid diversification of host range, or could be an artifact of the linear structure of the fibers: long extended proteins like tail fibers could more easily tolerate non-homologous joins than globular proteins, in which similar events would more likely be lethal to the phage. Closer examination of genomes from phages across hosts and virion morphologies demonstrated that even when the nucleotide and protein sequences of the genes were unrelated, the order and location of genes within phage genomes, particularly amongst phages with shared morphologies, were conserved. This property, in which analogous genes appear in the same order within phage genomes, is called “synteny”. The vast diversity of phages was illuminated by an analysis by Pedulla et al. (2003) of a collection of fourteen complete genome sequences from phages infecting the same host, Mycobacterium smegmatis mc²155. 
These phages were all tailed phages, and predominantly flexible-tailed (two were contractile-tailed); many of them were indistinguishable from each other by electron microscopy. However, comparative analysis of the genome sequences revealed that these genomes had less than 50% nucleotide sequence identity. This was a tremendous surprise to the scientific community: phages that appeared identical under the microscope, and were in direct genetic contact with each other through their ability to infect the same host, did not share significant DNA sequence identity (for comparison, humans share about 60% of their genome sequence with bananas). Not only did these phages not share sequence similarity with each other, but the majority of genes within the phage genomes were unlike any other sequences in the sequence databases, such that only 20% of the genes within a phage genome could be assigned a putative function. The genes that could be assigned functions, both during these studies in the early 2000s and up through the writing of this article in 2019, were those that played well-characterized roles in the phage life cycle. During the lytic cycle, after binding to a host cell, the phage nucleic acid is deposited into the cytoplasm, where it undergoes transcription, translation, and replication; the structural proteins are synthesized, the heads and tails self-assemble, DNA is packaged into the head, the head and tail are joined, and the cells are lysed, releasing the new virions into the environment. Temperate phages can undergo the lysogenic cycle, during which they integrate their DNA into the host genome or circularize as a large extrachromosomal plasmid and repress their lytic functions by means of a protein called the immunity repressor; the phage DNA is then replicated by the host during normal cell division and passed on to both daughter cells. 
Lysogens, the cells that contain these phage genomes (called prophages), sometimes gain new abilities via genes expressed from the prophage, including those involved in host defense against subsequent infection by other phages; lysogens can also undergo induction during stress, in which the prophage excises from the host genome and begins the lytic cycle. Many of the proteins involved in these processes are phage-encoded and can be routinely identified in phage genome sequences based on sequence similarity to genes in other phages that have been characterized at the bench. The different branches of the Caudovirales – Siphoviridae, Myoviridae, and Podoviridae – demonstrate conservation of some genome properties with regard to genetic content, as well as variations specific to their morphology; each type is broadly discussed below.

Mosaicism in the Siphoviridae

The flexible, non-contractile tailed phages, the Siphoviridae, have been isolated from all known bacterial phyla and habitats; many of the more recent large-scale comparative genomics studies describe phages of the Siphoviridae. Within the genomes of the Siphoviridae (Fig. 1),

Fig. 1 Three phage genomes – Gordonia phages BaxterFox and Zirinka, and Mycobacterium phage Che9c – are compared with the program Phamerator. The letters and numbers after each phage name indicate its cluster and subcluster designation. The central ladder indicates the length and position of the nucleotide sequence, in kilobases. Genes are represented by rectangles and colored according to protein sequence similarity; genes in white share no protein sequence similarity with any other proteins in the Actino-Draft database. Numbers within the rectangles indicate the gene numbers; numbers above the rectangles indicate the pham number, and the numbers in parentheses indicate the number of genes within that pham across the entire “Actino-Draft” database. The direction of transcription is indicated by the placement of the rectangles above or below the central ladder; placement above indicates left to right. The colors between the genomes, as opposed to the colors of the gene rectangles, indicate nucleotide sequence similarity as calculated with the program BLAST: purple represents the most similar sequences, shading to red for the least similar, and white indicates no significant sequence similarity.

Genetic Mosaicism in the Tailed Double-Stranded DNA Phages 279


synteny is evident in the order of the virion structural genes in particular; by convention, newly sequenced phage genomes are oriented such that the structural genes are on the left side of the genome, and the gene(s) encoding the terminase, the DNA packaging motor, is one of the first genes, if not the first. Following the terminase are genes encoding the head and tail components, specifically the portal protein, the capsid maturation protease, the scaffolding protein, the major capsid protein, the head-to-tail connector proteins, the tail terminator, the major tail subunit, the tail assembly chaperones, the tape measure, the minor tail proteins, and the tail fiber. The number and size of the structural genes and the conservation of their order are pervasive, from the very smallest flexible-tailed phages, with genomes of approximately 15 kb, to the largest, with genomes closer to 100 kb; this conservation is observed in phages of hosts in multiple phyla, in both Gram-positive and Gram-negative bacteria, and from diverse environments. Interestingly, while the order of these genes is predominantly maintained, the larger phage genomes have more gene insertions between the various structural genes; these insertions include parasitic elements like transposons, self-splicing introns, and homing endonucleases, as well as a large number of genes without identifiable functions. Adjacent to the structural genes are the genes predicted to encode proteins involved in cell lysis, including the endolysin(s) and holin; these are followed by a central region of the genome, found only in temperate phages, that contains the integration cassette, including the integrase, the attP site (for “att”achment “p”hage, the DNA sequence that is identical to the integration site in the host), and the immunity repressor, which prevents expression of genes involved in lytic functions.
The integration cassette region in many temperate phages is highly variable, and otherwise closely related phages can display a subset of a wide range of diverse small genes in this region. These genes are expressed from the prophage during lysogeny, and at least some of them aid in defense of the host against super-infecting related or unrelated phages, via restriction, exclusion, toxin/anti-toxin systems, abortive infection systems, and others; the mechanism(s) that contribute to this extensive localized genetic diversity are unclear. The right arm of the flexible-tailed phage genomes is replete with small genes (~30–500 aa) of unknown function, as well as larger genes for which functions can be identified; the latter code for proteins involved in DNA replication, recombination, and nucleotide modification and processing. The smaller genes are characteristic of phage genomes in general; their role in the biology of the phage is unclear, and they are not similar to other sequenced phage genes available in GenBank. These genes of unknown function tend to be smaller than their counterparts to which a function can be assigned, likely due both to the lack of similar sequences of known function in the databases and to the limitations of search algorithms in matching smaller sequences with statistically relevant specificity. Transcriptomic and proteomic studies indicate that these small genes are expressed early during lytic infection and, in at least a few cases, are involved in the defense of the host cell against super-infecting phages, whether the same phage or unrelated ones. A putative role in host defense is further supported by whole-genome essentiality studies: most of these small genes can be deleted from the genome with no effect on the viability of the progeny under laboratory monoculture conditions. This would make sense if these smaller genes are used to protect the host from further infection by other phages.

Mosaicism in the Myoviridae

The larger comparative genomic studies of the Myoviridae have primarily focused on the T4-like phages in Gram-negative hosts such as E. coli and the marine cyanobacteria Synechococcus and Prochlorococcus; indeed, while Myoviridae of the Actinobacteria have been isolated, they do not exhibit the large complicated tails of the T4-like phages, and the two types of phage share no genes in common. The genomes of the T4-like phages are generally between 150 kb and 250 kb; they share some gene content in the form of a set of 20 or so “core” genes, present in all of these phages, with core genes in this case defined by functional analogy and protein sequence similarity. The majority of these shared genes are homologs of the structural genes of T4; T4 virions have a complicated structure and an elaborate, multi-protein contractile tail designed to deliver the phage DNA by injection through the cell membrane and wall. To date, no temperate phages with a T4-like morphology have been isolated; all are lytic phages. As in the Siphoviridae, the overall order of the core genes within the Myoviridae is conserved; indeed, the order from terminase through major capsid and neck proteins is identical to that of the Siphoviridae. The mosaicism of phage genomes is evident in the insertions of smaller genes of unknown function between the structural genes; as in the larger genomes of the Siphoviridae, the overall synteny of the structural genes is maintained in these larger phages.

Mosaicism in the Podoviridae

The taxonomic classification of Podoviridae, or short-tailed phages, comprises at least two large unrelated groups. The short-tailed phages that are similar to Salmonella phage P22 are more closely related to the lambdoid phages of E. coli than they are to other short-tailed phages, such that the melted strands of the genomes can anneal and form heteroduplexes, as demonstrated by Campbell (1994). The larger comparative studies performed on Podoviridae have primarily included the phages related to E. coli phage T7, including phages of the Enterobacteria and phages of the marine cyanobacteria. These phages share little nucleotide or protein sequence similarity, but their genome size of approximately 40 kb, genome architecture, and functional gene content are highly conserved. The synteny of these genomes is evident in the comparative studies, with an RNA polymerase gene near the left end, followed by genes involved in host takeover and in DNA replication, with the structural genes occupying the end of the right arm. Here there is a slight permutation from the gene order in the Siphoviridae, as the T7-like phage structural genes begin with the


portal gene, followed by the head and tail genes, and end with the terminase. The smaller genes of unknown function show the most variety in terms of gene content across genomes, with the larger structural genes showing more conservation. Interestingly, some of these genomes include genes related to the integrases found in temperate phages, but do not include identifiable homologs of the remainder of the machinery required for stable integration.

Clusters and Superclusters: Large Scale Comparative Genome Analysis

The mosaic nature of phage genes and genomes, and the lack of a single gene present in all phages, have made defining the boundaries of their relatedness a difficult task. A number of classification schemes have been used to group related phages together, including morphology and host range. As more phage genomes were sequenced through the 2000s and 2010s, it became increasingly clear that these earliest metrics for classification were not sufficient, Salmonella phage P22 and E. coli phage lambda being only the two earliest examples. The larger-scale comparative analyses of complete genome sequences of phages infecting specific bacterial genera published in the past decade have upheld the earliest observations of genetic mosaicism across the tailed phages. Phages isolated from a multitude of hosts share genome architectures and functional units while exhibiting little or no nucleotide or protein sequence similarity; and many of the predicted protein-encoding genes are small (only 300 bp or so on average), diverse, located in equivalent areas of the genomes, and without identifiable functions. The ubiquitous mosaicism and vast diversity of gene sequences have confounded traditional taxonomy based on vertical descent of genes; however, attempts at grouping phages by shared nucleotide sequences or by shared proteomes (also called shared gene content) have proved useful in identifying groups of related phages that share similar biological traits. Phages with similar gene content are assigned to the same “cluster”; however, these boundaries may still be fuzzy, in that phage A may share 35% of its genes with phage B, and phage B may share 35% of its genes with phage C, and all three would be assigned to the same cluster even if phages A and C do not share 35% of their genes.
The result of this type of cluster building is that, given enough members, there may no longer be any single protein that is present in all cluster members; this is seen in Cluster A of the Actinobacteriophages. These phages are temperate Siphoviridae, with approximately 600 isolates with complete genome sequences listed in the Actinobacteriophage Database (see the “Relevant Websites” section). The most recent comparisons of phage gene content suggest that cluster membership can be defined for phages that have similar genome architecture and share 35% or more of their protein sequences from across the genome, where shared proteins are measured through pham membership as determined by the program Phamerator. An analysis of Enterobacteriophages suggests the use of the term “supercluster” for groups of phages that share genome architecture but few or no protein sequences across the group. The phages of the superclusters show diversity of host range across multiple genera, and provide evidence for the model of gene movement throughout the global population of phages proposed by Jacobs-Sera et al. (2012), in which different phage types exhibit different but overlapping host preferences (the “stepping stone” model). Recombination events between these co-infecting phages allow all phages access, albeit at different rates, to the continuum of diversity within global phage genetic space.
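The chaining behavior described above, in which phages A and C land in the same cluster through phage B without sharing genes directly, can be illustrated with a minimal Python sketch. It assumes that shared gene content is scored as the mean of the two fractions of gene families (phams) each genome shares with the other, and that clusters are built by single linkage at a 35% threshold. The function names and the toy pham sets are hypothetical; this is not Phamerator's implementation.

```python
from itertools import combinations


def shared_gene_content(phams_a, phams_b):
    """Mean fraction of phams that each genome shares with the other (assumed metric)."""
    shared = len(phams_a & phams_b)
    return 0.5 * (shared / len(phams_a) + shared / len(phams_b))


def single_linkage_clusters(genomes, threshold=0.35):
    """Merge genomes into clusters whenever any pair meets the threshold."""
    parent = {name: name for name in genomes}

    def find(name):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(genomes, 2):
        if shared_gene_content(genomes[a], genomes[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in genomes:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy genomes: each phage is a set of pham identifiers.
toy_genomes = {
    "A": set(range(0, 10)),     # phams 0-9
    "B": set(range(6, 16)),     # phams 6-15, shares 4 of 10 with A
    "C": set(range(12, 22)),    # phams 12-21, shares 4 of 10 with B, none with A
    "D": set(range(100, 110)),  # unrelated phage
}

print(single_linkage_clusters(toy_genomes))  # [['A', 'B', 'C'], ['D']]
```

In this toy set no pham is present in all three members of the A/B/C cluster, which mirrors the observation that a large enough cluster may have no universally shared protein.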

Different Phage Clusters Acquire New Genes at Different Rates

Comparative genomic analysis between and within phage clusters demonstrated that the rates of gene exchange and mosaicism were not equivalent between different groups of phages. With more than 3000 sequenced phages isolated on hosts from across the domain Bacteria, patterns in gene exchange and gene flow became evident. By examining the gene content and nucleotide sequence similarity of over 2000 phages in direct pairwise comparisons, Mavrich and Hatfull (2017) demonstrated that different phage clusters exchange genes at different rates and, moreover, that only temperate phages exhibit “high gene flux”, in which parts of their genomes are nearly identical in terms of gene content and nucleotide sequence (frequently, this corresponds to the left arm of the genome containing the structural genes) while the remainder of the genomes share little to no nucleotide sequence similarity or gene content. High gene flux phages are more likely to swap entire genes than to accumulate nucleotide changes within shared genes. This pattern is in direct contrast to that of the phages that exhibit “low gene flux”; in these phages, changes between genomes are spread out evenly across the genome, and nucleotide changes within shared genes occur at about the same rate that entire genes are swapped between genomes. To date, all lytic phages and some groups of temperate phages demonstrate low gene flux. These two different modes account for seemingly irreconcilable observations: the discrete phage populations observed within large lytic marine cyanophages, which are low gene flux phages with many shared genes, for which individual rates of nucleotide substitution can be counted, versus some groups of the temperate Actinobacteriophages within the high gene flux mode, which share structural genes and little else. The mechanism through which these phages mediate high gene flux is unclear.
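The contrast between the two modes can be made concrete with a small Python sketch of a pairwise comparison. It assumes that each pair of genomes is summarized by a nucleotide distance and a gene content dissimilarity, both scaled from 0 to 1, and that a pair is labeled "high" flux when gene content has diverged much faster than nucleotide sequence. The ratio cutoff of 2.0, the function names, and the example values are illustrative assumptions, not parameters from the published analysis.

```python
def gene_content_dissimilarity(phams_a, phams_b):
    """1 minus the mean fraction of phams each genome shares with the other."""
    shared = len(phams_a & phams_b)
    return 1.0 - 0.5 * (shared / len(phams_a) + shared / len(phams_b))


def flux_mode(nucleotide_distance, content_dissimilarity, ratio_cutoff=2.0):
    """Label one pairwise comparison; ratio_cutoff is an arbitrary illustrative value."""
    if nucleotide_distance == 0:
        # Identical shared sequence: any gene content difference implies gene swapping.
        return "high" if content_dissimilarity > 0 else "undetermined"
    ratio = content_dissimilarity / nucleotide_distance
    return "high" if ratio > ratio_cutoff else "low"


# Hypothetical pair 1: near-identical shared sequence, but most genes differ.
print(flux_mode(nucleotide_distance=0.05, content_dissimilarity=0.60))  # high

# Hypothetical pair 2: sequence and gene content diverge at similar rates.
print(flux_mode(nucleotide_distance=0.30, content_dissimilarity=0.35))  # low
```

A single cutoff is a simplification chosen for readability; any real classification would need to be calibrated against the distribution of pairwise comparisons in the data set.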

Mosaicism, Hybrids, and Single Gene Surveys

Today’s sequencing technologies rely on shearing DNA into small, manageable fragments (100–1000 bp, depending on the sequencing technology used) prior to sequencing, followed by post-sequencing computer assembly of the small pieces into a


complete full-length genome. For purified DNA from phage isolates, assembly is trivial; however, for DNA sequences extracted from complex environmental samples (also called the metagenome of the sample), the mosaicism, vast numbers, and genetic diversity of the phage genomes make assembly difficult. Nor is it easy to identify phage sequences recovered from metagenomic samples, as there are comparatively few full-genome phage sequences available for alignments. Some environmental survey studies use amplification of genes such as those encoding phage portal proteins, major capsid proteins, integrases, or DNA polymerases, and these are of some utility in determining the presence and diversity of phages containing those particular genes. While the majority of complete phage genome sequences support the assumption that single-gene survey analyses can represent individual phage types, some exceptions have been discovered and are worth mentioning. First, a number of phage genomes of the Siphoviridae and Podoviridae from diverse hosts, such as Gordonia and the marine cyanobacteria, contain integrases without any other lysogenic components, and efforts at isolating lysogens from infections by these phages have not been successful. In these cases, the “integrase” genes may be merely homologous recombinases that facilitate gene flow between these phage types and the host or co-infecting phages. Second, structural hybrids have recently been identified between the Gordonia and Mycobacterium phages (including the three shown in Fig. 1); the bottom two phages exhibit prolate capsids and the top has an isometric head.
A similar relationship exists between the phages depicted in the figure, which illustrates a number of properties discussed in this article, including mosaicism, synteny, and conservation of genome architecture. These are not the only examples of structural hybrids: the marine phages of Pseudoalteromonas include a Myo/Sipho hybrid, in which the contractile-tailed phage is similar to the smaller, less complex phage Mu rather than the larger, more complicated T4. The rate of formation of hybrids is dependent on the size, type, and locations of functionally equivalent genes. The ratio of overall genome length to the length occupied by essential genes directly impacts the amount of flexibility a phage genome has to acquire new genes without losing viability. As capsid sizes increase, phages have room to acquire new genes, but these new genes may be in the form of insertions between essential genes. The farther apart essential genes are moved from each other on the genome, the less likely it is that a non-homologous recombination event could accommodate the formation of a structural hybrid; thus it is unlikely that a lambda-T4 hybrid could even be recovered. It is also unlikely that a phage with a larger capsid size could downsize once enough new genes have been acquired, as truncation of the genome sequence may then remove essential genes. Additional genome sequencing of a wider variety of phages may reveal more insights into gene flow within the phage population.

Further Reading

Bondy-Denomy, J., Qian, J., Westra, E.R., et al., 2016. Prophages mediate defense against phage infection through diverse mechanisms. ISME Journal 10 (12), 2854–2866.
Brüssow, H., Hendrix, R.W., 2002. Phage genomics: Small is beautiful. Cell 108 (1), 13–16.
Coutinho, F.H., Silveira, C.B., Gregoracci, G.B., et al., 2017. Marine viruses discovered via metagenomics shed light on viral strategies throughout the oceans. Nature Communications 8.
Dekel-Bird, N.P., Avrani, S., Sabehi, G., et al., 2013. Diversity and evolutionary relationships of T7-like podoviruses infecting marine cyanobacteria. Environmental Microbiology 15 (5), 1476–1491.
Deng, L., Ignacio-Espinoza, J.C., Gregory, A.C., et al., 2014. Viral tagging reveals discrete populations in Synechococcus viral genome sequence space. Nature 513 (7517), 242–245.
Dutilh, B., Cassman, N., McNair, K., et al., 2014. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nature Communications 5.
Grose, J.H., Casjens, S.R., 2014. Understanding the enormous diversity of bacteriophages: The tailed phages that infect the bacterial family Enterobacteriaceae. Virology 468–470, 421–443.
Hendrix, R.W., 2002. Bacteriophages: Evolution of the majority. Theoretical Population Biology 61 (4), 471–480.
Hendrix, R.W., Smith, M.C., Burns, R.N., Ford, M.E., Hatfull, G.F., 1999. Evolutionary relationships among diverse bacteriophages and prophages: All the world's a phage. Proceedings of the National Academy of Sciences of the United States of America 96 (5), 2192–2197.
Mavrich, T.N., Hatfull, G.F., 2017. Bacteriophage evolution differs by host, lifestyle and genome. Nature Microbiology 2, 17112.
Pedulla, M.L., Ford, M.E., Houtz, J.M., et al., 2003. Origins of highly mosaic mycobacteriophage genomes. Cell 113 (2), 171–182.
Pope, W.H., Bowman, C.A., Russell, D.A., et al., 2015. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity. eLife 4.
Pope, W.H., Mavrich, T.N., Garlena, R.A., et al., 2017. Bacteriophages of Gordonia spp. display a spectrum of diversity and genetic relationships. mBio 8 (4).
Rohwer, F., Hisakawa, N., Youle, M., Maughan, H., 2014. Life in Our Phage World. Wholon Publication.
Yuan, Y., Gao, M., 2017. Jumbo bacteriophages: An overview. Frontiers in Microbiology 8, 403.

Relevant Websites

https://www.ncbi.nlm.nih.gov/ NCBI – NIH.
Phamerator.org Phamerator.
Phagesdb.org The Actinobacteriophage Database.

Bacteriophages of the Human Microbiome

Pilar Manrique¹, The Ohio State University, Wexner Medical Center, Columbus, OH, United States
Michael Dills¹ and Mark J Young, Montana State University, Bozeman, MT, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Bacteriophage (phage): Virus of bacteria.
Diversity: The number of species, their taxonomies, and their abundance in an ecological community. Generally represented by a diversity index, which is a quantitative measure of both the richness and evenness of a community (e.g., Shannon Diversity Index).
Dysbiosis: Microbial system imbalance within the body. Dysbiosis is the disruption of a microbiota’s commensal and mutualistic capacity. Dysbiosis is not a requisite criterion in health or disease, though it is implicated in the etiology, persistence and progression of many diseases.
Eubiosis: The healthy state of the human microbiome communities.
Fecal microbial transplantation or transplant (FMT): Transplantation of feces from a healthy individual donor to the gut of a recipient individual suffering from a disease in order to restore a healthy microbiota.
Gnotobiotic mice: Germ-free mice (devoid of microbes) that are colonized with exogenous microorganisms in order to study the effect of the microorganism on the host.
Human microbiome: The human body in total as a conglomeration of human cells and microorganisms (the microbiota). The phage component is in turn referred to as the human phageome.
Lysogenic or temperate phage: A phage that has the ability to integrate its DNA into the host chromosome, where it is passively replicated with the bacterial chromosome.
Lytic phage: A phage that, upon injection of its DNA into its susceptible host cell, replicates its genome, produces structural proteins and ultimately lyses the cell to release its progeny virions.
Metagenomics: Study of DNA isolated directly from complex biological samples. DNA is isolated and deep-sequenced, generating sequence reads that are assembled with bioinformatic programs to render an ensemble of DNA sequences from all the organisms present in the microbial community under study. Within this context, viruses can be purified and sequenced separately, rendering a viral metagenome containing sequences from viral entities. In contrast, cellular metagenomes will contain both cellular and viral sequences.
Phage therapy: Use of specific bacteriophage or phage cocktails to treat bacterial infections.
Prophage reservoir: All the prophages encoded in bacterial members of the gut microbial community.
Prophage: A bacteriophage genome integrated in the bacterial host genome.
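The Shannon Diversity Index named in the glossary entry for Diversity combines richness and evenness in a single number, H = −Σ pᵢ ln pᵢ, where pᵢ is the relative abundance of taxon i. The Python sketch below computes it together with Pielou's evenness, J = H / ln S, for S observed taxa. The function names and the abundance counts are hypothetical and serve only to contrast an even community with one dominated by a single phage type.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(counts):
    """Evenness J = H / ln(S), where S is the number of observed taxa (needs S >= 2)."""
    observed = sum(1 for c in counts if c > 0)
    if observed < 2:
        raise ValueError("evenness is undefined for fewer than two observed taxa")
    return shannon_index(counts) / math.log(observed)


# Hypothetical read counts for four phage types in two communities.
even_community = [25, 25, 25, 25]
uneven_community = [97, 1, 1, 1]  # one phage type dominates

for label, counts in [("even", even_community), ("uneven", uneven_community)]:
    print(label, round(shannon_index(counts), 3), round(pielou_evenness(counts), 3))
```

Both communities have the same richness (four taxa), so the lower index of the second community comes entirely from its unevenness.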

Introduction

Humans exist in homeostasis with a considerable diversity of microbial life, ranging from bacteria and their viruses, termed bacteriophages (phages), to eukaryotic viruses, archaea, protozoa, and fungi. Together these organisms comprise the human microbiome. Within this community, bacteria are the most abundant cell type and bacteriophages are the most abundant virus type. The composition of the human microbiome varies between individuals and with body site, diet, geography and health status. Disease states can correlate with deviations in microbiome structure, termed dysbiosis, with respect to that of the healthy state. The causes of dysbiosis are a central focus of microbiome research. Bacteriophages are a key ecological determinant affecting the dynamics and function of bacterial communities. Their role in shaping human bacterial communities, and ultimately in human health and disease, remains a key question. The study of phage-bacteria-human relationships spans the paradigms of health and disease, with phages being agents of both. Individual phages have been extensively studied as secondary agents of human disease, often conferring pathogenic traits to commensal bacteria. In other microbial systems, bacteriophages are known to strongly influence the dynamics and function of their bacterial host communities. It is therefore assumed that microbial viruses are partially responsible for the human microbiome’s capacity to modulate host phenotype and to cause, enhance or ameliorate disease. This assumption is underpinned by evidence that viral fractions of the microbiota alone can generate systemic changes in health. Phages can potentially influence the human host through predation of human pathobionts and through direct interaction with the human immune system, among other modes. The human phageome has been studied at a multi-trophic systems level as well as at the individual virus level.
To this end, a number of tools have been developed to further our understanding of the features and dynamics of bacteriophages within the human microbiome. Quantification of purified phages using fluorescent DNA intercalating dyes and direct visualization by electron microscopy have allowed for the estimation of phage densities and provided information on the virion morphologies of those communities. Advances in ‘omics’ technologies (primarily genomics) and bioinformatic tools have permitted in-depth

¹ The authors contributed equally to this work.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21226-0


Fig. 1 Human microbiome phages across several body sites.

characterization of human phages at the community level. Specifically, next generation sequencing technologies (NGS) have provided a more complete understanding of phage diversity and community structure associated with different human developmental and health states. Mouse models, both gnotobiotic mice which can replicate specific bacteria-phage dynamics or community dynamics, as well as conventional mouse disease models which replicate disease-associated changes in humans, have been used to investigate the overall phageome composition and its influence on host health. The increase of human fecal microbial transplantation (FMT) to treat several gut associated diseases has provided new opportunities to study human gut phage temporal ecology. Ex-vivo and in-vitro models have been employed to decipher the same dynamics as the other systems mentioned as well as mechanisms of spatial localization for certain phage. This combination of research tools has provided invaluable insights into the role of phages in health and disease. This article examines phages in human microbiomes from three different perspectives: the distribution and composition of the human phageome across individuals and body sites, the dynamics of the phage community within specific body sites, and the applications of bacteriophages for human health. Each of these aspects exists at the intersection of multiple dimensions including time, intra- and inter-individual variation, health and disease.

Human Phageome Distribution and Composition

Bacteria and their viruses populate nearly all external features of the body (Fig. 1). These sites range from highly habitable to depauperate. Like the bacteriome (the collection of all bacteria present in a given microbiome), the composition of human-associated phage communities is significantly affected by body site. Micro-habitats of the skin, for instance, are populated by fairly simple resident microbial and viral communities, in which Staphylococcus and Streptococcus phages, such as Staphylococcus capitis phage STB20, are generally the dominant phages. Itinerant microbes pass through these habitats frequently from other environments. In contrast, the vaginal environment hosts a more complex and overall more stable microbial community that is usually dominated by Lactobacillus species and Lactobacillus phages. Gastrointestinal (GI) tract communities are dominated by Streptococcus and Prevotella species, and presumably their viruses, in the oral cavity, and by members of the Bacteroidetes, Firmicutes, Proteobacteria and Actinobacteria phyla and their viruses in the gut. The majority of microbes make their home in the GI tract, which is one of the main immune organs in the human body. The GI tract is therefore the largest interface between microbes and human immune cells, as well as the largest microbe-microbe interface, making it a fruitful area for microbiome research. It is the focus of the majority of studies carried out on the human phageome. Beyond external surface habitats, phages are also found within the body. Active phage particles have been isolated from blood and organs. The process of phage transcytosis, in which phage particles passively move across epithelial layers, likely contributes to their internal presence, although active transport systems may also exist.
Using tissue culture models of gastrointestinal epithelium, it has been estimated that approximately 31 billion bacteriophage particles cross the gut epithelium into the body every day. The


presence of internal active phage particles may provide a cooperative immune function for the human host, although its exact implications for both the human host and microbial dynamics have yet to be determined. Along the length of the GI tract there is significant variation in the abundance of bacteria and viruses. Microbial cell densities are high in the oral cavity (~10⁸/ml of saliva) and decline towards the stomach. Beyond the stomach, cell densities increase progressively through the small intestine and number in excess of 10¹¹ cells per cm³ in the colon. Phages reside in the lumen or in association with mucosal membranes (e.g., gums, gut mucus) in numbers similarly high to those of their bacterial hosts. It has been estimated that between 10⁹ and 10¹¹ virus particles per cm³ are present in the lumen, compared to 10⁸–10¹⁰ virus particles per cm³ in the mucosa. However, accurate measures of phage densities across both radial and longitudinal axes are difficult to determine. The ratio of phages to bacteria varies between 1:1 and 1:10, which is significantly lower than the phage to bacteria ratios observed in other environments, like the ocean, where phages commonly outnumber bacterial hosts. The differences in phage to bacteria ratios observed between different environments may be linked to cell density, with high cell density favoring lower phage to bacteria ratios. Understanding the spatial and temporal dynamics of phage-bacteria associations at a finer scale will provide critical information on the role of phages within the GI ecosystem. Within an individual, tens to hundreds of phage species can be found at any given time in the oral cavity, the small intestine or the colon. The highest phage load is found in the colon, followed by the oral cavity and small intestine; however, the oral cavity hosts the most diverse phage community.
The majority of phage metagenomic surveys have been carried out in the oral cavity and in stool samples, which serve as a proxy to study the colon microbiome. In both environments, the phage community is relatively novel, since >60% of the phage DNA sequences found in these studies are not represented in cultured phage isolates. However, the overall phage diversity is lower than what is observed in other microbial environments. The GI tract phageome is characterized by a lower species richness and a very uneven phage distribution, with a minority of phage types comprising a majority of the total phage particles. Without perturbation, the gut phageome can remain surprisingly consistent within an individual over the course of years. When the phage communities of different individuals are compared, the gut phage community is specific to each individual. In contrast, the oral cavity phageome shows more temporal variation within an individual. Despite the individualized nature of the phageome, some GI tract phages are globally distributed (Fig. 2). For example, the gut crAss-like phages are distributed broadly across the human population. CrAss phage has been found in most individuals, and can make up nearly 90% of the community in some people. How and when these phages are acquired is an open question. Through environmental exposures such as common interactions and cohabitation, individuals are constantly exposed to phages from other people and the environment. Most of these viruses do not colonize at detectable levels, but a small proportion can establish residency. Consequently, gut and oral phage communities tend to be more similar between individuals within the same household and between relatives (e.g., siblings, mother and offspring).
A subset of phages is commonly found in healthy people and less so in diseased individuals (e.g., patients with inflammatory bowel disease, IBD), which contributes to the notion of phages being agents of health in addition to their known roles in acute disease. The dsDNA phages of the Caudovirales order and ssDNA Microviridae phages are the most commonly observed phage types in the gut. Both viral metagenomics and direct observation by electron microscopy indicate the dominance of these two taxonomic groups. However, it is important to note that over 60% of the genomic sequences obtained from human gut phages do not share sequence homology with known viruses and therefore cannot be classified. Of the classifiable phages, Microviridae generally

Fig. 2 Factors affecting composition and global distribution of GI tract bacteriophage. The individual composition of the gut phage community is impacted by a complex array of factors. Overall, healthy individuals harbor a mixture of individual-specific phages and a smaller subset of phages globally distributed and commonly found in healthy individuals.

Fig. 3 The adult human gut phage community in health and disease. Human gut phages found in stool samples are a mixture of lytic phages and activated prophages from the prophage reservoir. During certain diseases, the phage community composition is changed. An increase in prophage activation, activation of a different subset of prophage, or an increase of lytic phage potentially contributes to these changes. A balance between lytic and temperate prophage, and an adequate regulation of the prophage reservoir is important to maintain health. Reprinted from Manrique, P., Dills, M., Young, M.J., 2017. The human gut phage community and its implications for health and disease. Viruses 9.

dominate in healthy gut communities. An increased abundance of Caudovirales phages has been associated with certain diseases. For instance, in individuals suffering from IBD, the classifiable Microviridae to Caudovirales ratio is significantly decreased. Other prevalent classified gut phages include the 936 group of lactococcal phages, N4, I3, Mu, Hp1, P2, and PhiCD119-like viruses. Most human-associated RNA viruses are eukaryotic, with only a limited number of RNA phages detected within human fecal matter. It is important to note that the majority of studies characterizing the phage community are carried out using stool samples due to the difficulty and scarcity of access to intestinal tissue for research. In humans, stool phage content significantly resembles that of the colon, providing a good approximation to study human gut phages. In mice, where access to intestinal tissue is easier, differences in both the phage community composition and dynamics between the colon and the small intestine have been identified. However, information about the fine scale spatial distribution of phages within the human gut ecosystem is scarce and should be addressed when possible. At least 50% of human-associated bacteriophage particles isolated from stool samples can be characterized as prophages derived from lysogenized bacterial strains. The majority of identified prophage sequences in stool samples are found as integrated prophages in Firmicutes, Bacteroidetes and Proteobacteria bacterial species, which comprise the main bacterial phyla found in the gut. Lambda-like and phi80-like coliphages are found in the majority of healthy individuals. Within the phylum Firmicutes, an in-depth characterization of prophages of Faecalibacterium prausnitzii, a species highly associated with human health, has been carried out, showing that the majority (>70%) of these genomes contain at least one prophage. 
Bacteroidales-like temperate phages are highly prevalent in cellular metagenomes of healthy adult individuals, but are nonetheless underrepresented in viral metagenomes from stool samples. Some of these prophages are classified as Microvirus, a genus from the Microviridae viral family. Previously, this viral genus was considered to be strictly lytic. This type of prophage is shared among different individuals and can be used to group healthy individuals based on the types of prophages they harbor (i.e., viral enterotypes). However, most gut-isolated Bacteroides phages, including crAss-like phages, do not have lysogenic capabilities and are thought to be maintained in the gut through persistent infections. The oral cavity and vagina are also rich in temperate phages, with Firmicutes prophages being the most abundant in the oral environment and Lactobacillus phages in the vagina. The regulation of prophage activation is likely a key determinant of microbiome ‘health’ with important consequences for human health (Fig. 3). Approximately 70% of gut bacteria contain at least one prophage in their genome. At any given time, a subset of these prophages is induced, producing viral particles which make up a major part of the active phage community. The totality of microbiome prophages is known collectively as the microbiome prophage reservoir. Using murine models, it has been demonstrated that activated temperate phages account for a large fraction of the viral particles found in fecal samples, creating an active viral pool. It is thought that external changes, such as diet or antibiotic treatment, significantly affect the types of prophages that enter the lytic cycle, causing a shift in the active temperate bacteriophage community. A model known as the “community shuffling model” proposes that changes in the bacterial community associated with certain diseases arise partly from changes in the prophage activation profile. 
For instance, individuals suffering from IBD, in which F. prausnitzii abundance is significantly lower compared to healthy individuals, present a greater abundance of active F. prausnitzii prophages in their stool sample. This, together with changes in the ratio of lytic and temperate phages observed in certain diseases has led to the hypothesis that a balanced prophage activation is necessary to maintain health. A more extensive characterization of human associated temperate phages across different body sites, time, and health state is needed to test this hypothesis and understand their ecological role in the human microbiome. Both phage and bacterial communities fluctuate through time and display certain trends correlating to the stages of human development. Gut phage diversity at birth is extremely low, but increases significantly within days after birth. The composition of infant gut phageomes is influenced by the type of birth (natural vs cesarean) and the newborn food intake (breast fed vs formula), as most of the prophages are transferred from the mother to the child inside bacterial lysogens. The most abundant phages in newborns are those infecting Lactobacillus and Bifidobacterium sp. As the infant ages, prophages become active and predate on their host species contributing to the microbial dynamics that take place within the gut of the newborn. Soon after birth, several factors such as time, diet and phage predation alter the microbial community composition and structure. A shift to Bacteroides and

Fig. 4 Phage microbiome trophic interactions.

Firmicutes phages occurs corresponding with a rise of Bacteroides and Firmicutes hosts within the bacterial community. The proportion of lytic to temperate phages in the gut is altered as well, with lytic phages predominating in early childhood. As both the bacterial and phage communities gain complexity, this balance switches to favor lysogenic phages and their lysogenized hosts. Much of the phageome temporal stability seen in adulthood is hypothesized to be the manifestation of a predominantly temperate phage community, as integrated or episomal prophages (the prophage reservoir) are able to persist in the gut by sheltering within select bacteria. Diet and drugs can exert strong selective forces on the gut phageome (Fig. 4). By modulating the nutrient landscape, dietary elements can promote certain bacterial assembly patterns which in turn alter the composition of the phage community. Antagonistic compounds derived from the diet or consumed as medication can also influence gut phage dynamics through depletion of bacterial hosts and activation of prophages. Changes due to dietary intervention are generally reversed upon return to a conventional diet, however, in some cases, the phage community fails to return to its prior state. During and after antibiotic treatment, bacterial diversity is significantly decreased, whereas viral richness is not significantly affected, in all likelihood due to an increase in antibiotic-induced activated prophages. However, antibiotics do cause a shift in the overall phage community membership. It is hypothesized that this apparent greater resilience in the phage community potentially contributes to the ability of the bacterial community to return to equilibrium after disruptions, including disease. Both acute and systemic diseases often correlate with bacterial and phage dysbiosis in the microbiome (Fig. 3). 
General changes in phage number, phage diversity and the prevalence of certain phage types can be indicative of unhealthy states. For example, phage taxonomies commonly associated with healthy individuals are underrepresented in patients with IBD. IBD patients with different disease types (colitis vs Crohn’s disease) harbor different phage assemblies, generally characterized by an increase in classifiable Caudovirales phages, an increase in strictly lytic phages, a decrease in Clostridiales phages, and an increase in phages associated with pathobiont hosts. The number of detectable phage particles can also change. Mucosa-associated phages are more abundant in patients with IBD or leukemic disease. These features may be attributed to changes in prophage activation profile (community shuffling model) or an increase in lytic bacteriophages. Consistent with this model, a shift from a temperate Lactobacillus phage-enriched environment to a more lytic-enriched one in the vaginal microbiome has been associated with bacterial vaginosis. Phage communities are significantly altered in patients suffering from a variety of diseases, ranging from periodontal diseases to bacterial vaginosis, cystic fibrosis, type 2 diabetes, IBD and malnutrition. These changes result in differences in phage diversity, impacting the bacterial community with potential consequences for human health. Phages can also act as vectors of pathogenicity, like the cholera toxin-encoding CTXφ phage and the Shiga toxin-encoding Stx phages. Though many are transient to humans and exist within their host bacteria causing acute disease, a spectrum of these phages can infiltrate the human microbiome. At the cost of increased predation, many pathobionts are highly permissive to temperate phages which can provide advantageous genetic factors. 
These organisms and their associated phages can exist in a commensal or infectious capacity which dictates their broader distribution across the human population as members of the microbiome.

Dynamics and Implications of the Human Phage Community

Phages engage in a full spectrum of symbiotic interactions with bacteria, from mutualism to parasitism, each contributing to significantly different outcomes in their bacterial host population. Virus predation drives the evolution of both phage and host, likely

through a constant arms-race directional selection, in which selection and propagation of bacterial mutants resistant to phage predation is followed by the emergence of virus mutants that are able to infect the newly evolved strains. Lysogeny offers an alternative host-virus association. Upon integration of a phage genome into the bacterial chromosome, an altered symbiosis is created. The prophage fitness burden can be compensated through prophage-encoded advantages such as protection from superinfection, lysis of competing bacteria, or utilization of advantageous prophage-encoded genes such as antibiotic resistance genes or polymorphic toxins. In complex microbial environments, like the oceans, phages have been shown to have predominantly lysogenic cycles under unfavorable conditions (e.g., low nutrient availability, low host densities). However, increasing host densities (e.g., the gut ecosystem) might also favor the strategy of lysogeny, as proposed in the Piggyback-the-Winner (PbW) model. Some phages may remain in a state between lysogeny and lytic replication known as pseudolysogeny, in which the viral genome remains as an extrachromosomal unit, without being degraded, until it is either actively replicated or integrated into the host genome. Host-phage dynamics have been primarily explored in the intestinal tract and the oral cavity. Within these microbiomes, the human developmental stage and specific tissue site (e.g., saliva compared to sub- and supragingival areas in the oral cavity or small intestine compared to cecum and colon in the GI tract) are some of the key factors dictating host-phage dynamics. In the gut microbiome of newborns, rapid colonization with the mother’s microbiome leads to density-dependent (Lotka–Volterra) host-phage dynamics that fade with time as the community reaches an equilibrium and becomes enriched with lysogenic phages. In adulthood, host-phage dynamics change considerably within different sections of the GI tract. 
For instance, lytic dynamics are more common in the oral cavity than in the gut. Analysis of phage-host abundance and genome evolution in murine models has shown that within the intestinal tract there is reduced directional selection, evidenced by a decrease in the appearance rate of phage-resistant bacterial mutants. This may be due to epigenetic variability of bacterial host populations in response to different chemical micro-environments within the gut, allowing the bacteria to evade phage predation without necessarily modifying their genomes. These observations are our first indications of the mechanisms driving the balance between lysis and lysogeny, and the prevalence and role of pseudolysogeny and their contributions to host-phage dynamics. The effects of bacteriophages in the gut are beginning to be unraveled primarily using a combination of animal models and NGS technology. Animal models in which germ-free mice are colonized with two bacterial strains (one containing a prophage and a second susceptible to that prophage), or with a bacterial consortium representative of the human microbiome, have been used to elucidate the role of GI phages. Within this context, it has been shown that the prophage induction rate is significantly higher in the gut than rates observed in vitro. The ‘cost’ of prophage induction (lysis) in a subset of the bacterial population encoding the prophage may be compensated by the ability of the temperate phage particle (activated prophage) to kill competitor strains. It is also possible to establish a more complex microbial community that is more representative of the gut ecosystem by colonizing either germ-free mice or in vitro chemostat-based bioreactors with human stool samples. This, together with conventional mouse models, has shown that the phage fraction influences the bacterial community structure and that it likely contributes to its resilience after disruptive events such as antibiotic treatment. 
Recently, more specialized models such as ‘gut on a chip’ are becoming important tools to investigate how the human epithelium and mucosal surfaces can impact bacterial-phage interactions with important consequences to the human host. The GI tract provides extensive mucosal surfaces onto which some bacteriophages can adhere in high localized densities, altering site-specific phage-bacteria ratios (Fig. 5). In the gut, phage adherence to mucosal surfaces has been described by the bacteriophage adherence to mucus (BAM) model. Briefly outlined, the BAM model proposes that the binding of certain bacteriophages to mucus glycoproteins through immunoglobulin-like domains found on their capsids alters their diffusive properties, increasing the density of phages in outer mucosal layers where they are more likely to encounter susceptible hosts. This mechanism not only allows for the increased success of the adherent phages, but it also confers potential immunity to bacterial invasion of the gut epithelium. The BAM model may operate in other mucosal surfaces of the body as well, such as gums, where bacteriophages can be found bound to the oral mucous membranes in extremely high concentrations compared to their bacterial counterparts (35:1 ratio). A model that integrates both BAM-based immunity and lysogeny-driven PbW dynamics proposes that bacteriophages can provide a level of protection against bacterial invasion of the epithelium (Fig. 5). In this model, lysogeny, which is prevalent in the lumen and at the surface of the mucosal layer, is thought to confer competitive advantages to commensal bacteria in the highly colonized areas of the gut through symbiotic interaction between the lysogen and the prophage. Simultaneously, lytic phage predation, enhanced in the dense mucus layers, is likely involved in eliminating potential pathogens. Similar dynamics are seen in the oral cavity. 
Regardless of whether phage infection is of lytic or lysogenic nature, phages can alter the metabolism of their host cell in order to promote their successful replication, influencing the overall functional profile of their microbial communities. To better understand phage-bacteria dynamics, it is important to determine the host range of the phageome. Bioinformatic analysis of metagenomic datasets suggests that some members of the human phageome may have a broader host range than previously appreciated. Through mechanisms that increase phage genomic diversity, such as single nucleotide mutations and rearrangements of tail fiber genes, gut microbiome phages can alter their host specificity allowing persistence through time. These results have been validated with mouse models and in culture. At the community level, phages likely stabilize human microbiomes and provide a degree of resistance to external changes and colonization. For instance, the gut microbiota is deeply affected by antibiotic treatment, after which a new, but similar, microbial “steady-state” is restored. Antibiotic pressure leads to an increase of phages that encode antibiotic resistance genes, potentially contributing to the resilience of the bacterial community, which is a hallmark of healthy microbiomes. These observations suggest that phages could contribute to re-establishing a healthy symbiosis altered during changes caused by disease. The bacterial

Fig. 5 Phage-bacteria and phage-human interactions within the intestinal tract.

Fig. 6 Role and applications of human associated bacteriophages.

community established after fecal microbial transplant (FMT) is likely influenced by the original phage community. However, it is important to keep in mind that phages may also augment disease development by enhancing dysbiosis in the microbiome. The use of whole microbial community transplants (e.g., FMT) as an effective treatment of gut-microbiome associated diseases provides an ideal scenario to study both the influence of bacteriophages on the resilience of the gut microbiome and the capability of bacteriophages to modulate changes in the microbial community structure. Based on the effectiveness of fecal microbial transplantation in treating C. difficile infection, this practice has been implemented in the treatment of other diseases. The type of

eukaryotic viruses and bacteriophages hosted by a subject has been linked with treatment success, suggesting that both the viruses already present in the patient and those able to establish after treatment may affect the outcome of the procedure. Donor-patient compatibility is likely an important factor for phage transfer and treatment outcome.
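
The density-dependent (Lotka–Volterra) host-phage dynamics described in this section for the newborn gut can be made concrete with a small numerical sketch. The Python below integrates the classical two-variable Lotka–Volterra predator-prey equations, treating bacteria as prey and lytic phages as predators. It is an illustration only: the function names, parameter values and starting densities are arbitrary assumptions in arbitrary units, not measurements from the gut, and the classical model leaves out lysogeny, resistance and spatial structure.

```python
import math

# Illustrative sketch: classical Lotka-Volterra dynamics with bacteria as prey
# and lytic phages as predators. All parameter values are arbitrary assumptions.


def lotka_volterra(b, p, growth=1.0, adsorption=0.1, conversion=0.075, decay=1.5):
    """Return (dB/dt, dP/dt) for bacterial density b and phage density p."""
    db = growth * b - adsorption * b * p   # bacterial growth minus phage killing
    dp = conversion * b * p - decay * p    # phage production minus particle decay
    return db, dp


def rk4_step(b, p, dt):
    """Advance (b, p) by one fourth-order Runge-Kutta step of size dt."""
    k1b, k1p = lotka_volterra(b, p)
    k2b, k2p = lotka_volterra(b + 0.5 * dt * k1b, p + 0.5 * dt * k1p)
    k3b, k3p = lotka_volterra(b + 0.5 * dt * k2b, p + 0.5 * dt * k2p)
    k4b, k4p = lotka_volterra(b + dt * k3b, p + dt * k3p)
    b_next = b + dt * (k1b + 2 * k2b + 2 * k3b + k4b) / 6
    p_next = p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return b_next, p_next


def simulate(b0=10.0, p0=5.0, dt=0.01, steps=3000):
    """Return the list of (bacteria, phage) densities along the trajectory."""
    trajectory = [(b0, p0)]
    b, p = b0, p0
    for _ in range(steps):
        b, p = rk4_step(b, p, dt)
        trajectory.append((b, p))
    return trajectory


def conserved_quantity(b, p, growth=1.0, adsorption=0.1, conversion=0.075, decay=1.5):
    """Quantity that stays constant along exact Lotka-Volterra orbits."""
    return conversion * b - decay * math.log(b) + adsorption * p - growth * math.log(p)


trajectory = simulate()
bacteria = [b for b, _ in trajectory]
phages = [p for _, p in trajectory]
print(f"bacteria range: {min(bacteria):.1f} to {max(bacteria):.1f}")
print(f"phage range: {min(phages):.1f} to {max(phages):.1f}")
```

In this toy model the two populations cycle indefinitely around a fixed point (here 20 bacteria and 10 phages in arbitrary units). The fading of such oscillations as the community matures, as described above, would require additional terms (for example lysogeny or resource limitation) that this sketch does not attempt to model.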

Clinical Utility of Bacteriophages of the Human Microbiome

Since their discovery in the early 1900s, bacteriophages have been investigated as biotherapeutics to control bacterial infections for diverse applications, ranging from food safety to disease treatment in animals and humans (Fig. 6). It was proposed early on that phage therapy using cocktails of purified phages could eliminate bacterial infections. However, the emergence of commercial antibiotics in the late 1930s and the variability in phage treatment success limited the pursuit of phage therapy in Western medicine. The rise of antibiotic resistant human pathogens has rekindled interest in phage therapy. Phage therapy has been successfully used to eliminate topical infections (e.g., Staphylococcus sp. skin infections) and internal or systemic infections (e.g., gut-derived sepsis due to P. aeruginosa, multi-drug resistant E. coli infection, and C. difficile infection), and has been successfully personalized as intravenous phage therapy for treating Acinetobacter baumannii septicemia. Advancement of phage-based therapeutics will require a better understanding of the ecological forces that shape phage establishment and dynamics within human microbiomes. Elimination of specific pathogens within complex bacterial communities, such as the gut microbiome, is currently a challenge for phage therapy implementation. Important variables include phage host range, host range switching, and the control of the cell lysis process. Ideally, phage therapy could be used to eliminate specific pathogenic bacteria without disruption of the commensal bacterial community, mitigating the potential for secondary infections caused by off-target antibiotic effects. If successful, phage therapy may provide a cheap and effective alternative to antibiotics. From an ecological viewpoint, phages are crucial members of the human microbiota in both health and disease. 
Therefore, it is reasonable to consider that phages can be used not only to treat specific bacterial infections, but to return a dysbiotic bacterial community to eubiosis. Effective treatment of C. difficile infection, an opportunistic pathogen that is associated with a decrease in gut microbiome diversity, can be accomplished by application of fecal filtrates which are greatly enriched in bacteriophages. The use of complex mixtures of bacteriophages isolated from healthy individuals presents a potential therapeutic treatment of more complex human diseases associated with dysbiotic bacterial communities such as IBD, or non-gut related diseases, such as periodontitis or bacterial vaginosis. A long-term goal of phage therapy is to provide safe and effective phage cocktails to control the microbiome bacterial community structure and function by design. Alongside their role as important biological agents that can modify microbial communities, phages may emerge as useful biomarkers of health and disease. Specific phage sequences may be identified that can separate individuals based on their health status. Phage-based biomarkers have been shown to be useful to determine healthy, at risk individuals and subjects more likely to respond to treatments involving modification of the gut microbial community.

Concluding Remarks

It is well established that human-associated microbes are key factors contributing to human health and disease. There is increasing evidence that phages play a direct role in microbiome health and disease. However, our knowledge of bacteriophage communities in human microbiomes remains sparse. Advances in sequencing technologies, phage annotation, animal and in-vitro models, and clinical human trials are significantly contributing to our understanding of phage-bacteria-human interactions. The influence of phage particles inside of the human host also requires further research. Within the next decade, we foresee a significant advance in human phage biology, which will result in applications that improve human health and limit disease.

Further Reading

Abeles, S.R., Pride, D.T., 2014. Molecular bases and role of viruses in the human microbiome. Journal of Molecular Biology 426, 3892–3906.
De Paepe, M., Leclerc, M., Tinsley, C.R., Petit, M.A., 2014. Bacteriophages: An underestimated role in human and animal health? Frontiers in Cellular and Infection Microbiology 4, 39.
De Sordi, L., Lourenco, M., Debarbieux, L., 2019. The battle within: Interactions of bacteriophages and bacteria in the gastrointestinal tract. Cell Host & Microbe 25, 210–218.
Edlund, A., Santiago-Rodriguez, T.M., Boehm, T.K., Pride, D.T., 2015. Bacteriophage and their potential roles in the human oral cavity. Journal of Oral Microbiology 7, 27423.
Hyman, P., Abedon, S., 2012. Bacteriophages in human health and disease. In: Advances in Molecular and Cellular Microbiology series 24. CABI.
Manrique, P., Dills, M., Young, M.J., 2017. The human gut phage community and its implications for health and disease. Viruses 9.
Maura, D., Debarbieux, L., 2012. On the interactions between virulent bacteriophages and bacteria in the gut. Bacteriophage 2, 229–233.
Mills, S., Shanahan, F., Stanton, C., et al., 2013. Movers and shakers: Influence of bacteriophages in shaping the mammalian gut microbiota. Gut Microbes 4, 4–16.
Nguyen, S., Baker, K., Padman, B.S., et al., 2017. Bacteriophage transcytosis provides a mechanism to cross epithelial cell layers. MBio 8.
Ogilvie, L.A., Jones, B.V., 2015. The human gut virome: A multifaceted majority. Frontiers in Microbiology 6, 918.
Reyes, A., Semenkovich, N.P., Whiteson, K., Rohwer, F., Gordon, J.I., 2012. Going viral: Next-generation sequencing applied to phage populations in the human gut. Nature Reviews: Microbiology 10, 607–617.
Shkoporov, A.N., Hill, C., 2019. Bacteriophages of the human gut: The “known unknown” of the microbiome. Cell Host & Microbe 25, 195–209.
Weitz, J.S., 2016. Quantitative Viral Ecology – Dynamics of Viruses and Their Microbial Hosts. Princeton University Press.

Bacteriophage: Red Recombination System and the Development of Recombineering Technologies
Lynn C Thomason, Frederick National Laboratory for Cancer Research, Frederick, MD, United States
Kenan C Murphy, University of Massachusetts Medical School, Worcester, MA, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary

λ Red system: The generalized recombination system of bacteriophage λ, which infects E. coli. The Red system consists of a 5′→3′ dsDNA exonuclease, Exo, and a ssDNA annealing protein, Beta.
Bacteriophage: A virus that infects bacteria.
Escherichia coli (E. coli): A well-studied genetically tractable Gram-negative bacterium.
Exonuclease: An enzyme that digests linear DNA from the ends. Exonucleases are classified according to their degradation properties.
Genetic engineering: In vitro or in vivo manipulation of DNA segments to form new sequences.
Heteroduplex DNA: A double-stranded DNA with the two strands derived from different molecules.
Homeologous DNA: DNAs with partially but not completely identical nucleotide sequences.
Homologous DNA: DNAs with identical or nearly identical nucleotide sequences.
Methyl-directed mismatch repair (MMR): A method that bacteria use for detecting and repairing mistakes created during synthesis of new DNA.
Oligonucleotide: For the purposes of this article, a commercially synthesized piece of single-stranded DNA, generally ~70 nt long.
Prophage: A bacteriophage chromosome integrated into the chromosome of its bacterial host, with most of the viral functions repressed by a phage-encoded repressor protein. An intact prophage can excise from the bacterial chromosome in response to an environmental stimulus and undergo lytic growth to make new viruses.
Rac prophage: A defective E. coli prophage that contains the RecET system. Defective prophages are unable to excise and undergo a lytic cycle, but they still encode functional genes.
RecET: The generalized recombination system of the Rac prophage. The RecET system consists of a 5′→3′ dsDNA exonuclease, RecE, and a single-stranded DNA annealing protein, RecT.
Recombinase: For the purposes of this article, an enzyme that promotes pairing of two complementary single-strand DNAs.
Recombination: The rearrangement of DNA sequence in a new order.
Recombineering: In vivo genetic engineering with bacteriophage recombination functions.
Selection/counterselection: A two-step procedure of genetic manipulation using two adjacent gene cassettes, one of which can be selected for, and the other selected against.
Single-strand annealing: Pairing of two complementary single-strand DNAs.

Introduction

Over the past two decades, a phage-based genetic engineering technology in bacteria has emerged that allows any stretch of genomic DNA in a bacterial chromosome, or endogenous plasmid, to be easily altered in a precise manner. Many different genetic changes can be made, including gene knockouts, replacements, insertions, deletions, and point mutations. This technology, termed recombineering (for in vivo recombination-mediated genetic engineering), relies on the expression of bacteriophage-encoded homologous recombination functions in the absence of all the other phage functions. This article will first review studies of the phage λ Red recombination system and how it operates during a phage infection, followed by details of how two phage homologous recombination systems, Red from phage λ and RecET from the cryptic Rac prophage, have been utilized to create new genetic engineering tools. What makes the phage systems so practical for in vivo engineering is that short DNA homologies (~50 bp) are adequate for targeting, such that recombination substrates can be generated using PCRs of drug-resistant markers, with targeting homologies of these substrates incorporated into the PCR primers. Short DNA oligonucleotides can also serve as substrates for recombineering. Thus, allelic exchange substrates are easily generated for gene targeting of bacterial chromosomes, as well as eukaryotic DNA cloned into cosmids and bacterial artificial chromosomes (BACs). This technology has impacted an array of diverse fields such as the study of bacterial pathogenesis, metabolic engineering, synthetic biology, and mouse knockout technologies.
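
The primer-based targeting strategy described above can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions: the function names, the toy target sequence and the two cassette priming sequences are hypothetical placeholders, not sequences or tools taken from this article. It composes two PCR primers, each carrying 50 nt of homology to the DNA flanking the target region at its 5′ end, followed by a 3′ segment that would prime on a drug-resistance cassette.

```python
# Illustrative sketch: composing PCR primers that carry 50-nt targeting
# homologies. Every sequence and name below is a hypothetical placeholder.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_recombineering_primers(genome, start, end, fwd_priming, rev_priming, arm=50):
    """Build primers for replacing genome[start:end] with a PCR-amplified cassette.

    genome       top strand of the target DNA, written 5'->3'
    start, end   0-based, half-open coordinates of the region to be replaced
    fwd_priming  primer segment that anneals at the start of the cassette
    rev_priming  primer segment that anneals at the end of the cassette
                 (written 5'->3' as it appears in the reverse primer)
    arm          length of each targeting homology in nucleotides
    """
    if start - arm < 0 or end + arm > len(genome) or start > end:
        raise ValueError("target region leaves no room for the homology arms")
    upstream_arm = genome[start - arm:start]   # homology just 5' of the target
    downstream_arm = genome[end:end + arm]     # homology just 3' of the target
    forward = upstream_arm + fwd_priming
    # The reverse primer is read on the bottom strand, so its homology is the
    # reverse complement of the downstream arm.
    reverse = reverse_complement(downstream_arm) + rev_priming
    return forward, reverse


# Toy example with made-up sequences (not a real locus or cassette).
toy_genome = "ACGTTGCAAGCTTAGGCTAACGGATCCTGA" * 10
CASSETTE_FWD = "AAACCCGGGTTTAAACCCGG"
CASSETTE_REV = "TTTGGGCCCAAATTTGGGCC"

forward, reverse = design_recombineering_primers(
    toy_genome, 100, 200, CASSETTE_FWD, CASSETTE_REV
)
print("forward primer:", forward)
print("reverse primer:", reverse)
```

With 20-nt priming segments, each primer in this sketch is 70 nt long, in line with the oligonucleotide length given in the Glossary. A real design would also need to check primer melting temperatures, secondary structure and the uniqueness of the homology arms, none of which is attempted here.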

Bacteriophage

Since Twort and d'Herelle first discovered the existence of bacterial viruses in the early 1900s, these organisms have been under intensive study. Bacterial viruses are commonly known as bacteriophages (bacteria eaters) or phage. By the 1930s, they were employed as therapeutic interventions for diseases such as cholera and dysentery, though the details of the reported successes of

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.07023-0

these treatments were not properly documented. The advent of antibiotics in the 1940s during WWII, together with the lack of interest by medical professionals in the West, led to only limited use of phage therapy as a modern way to combat bacterial infections. Phage therapy was mostly studied and developed at the Eliava Institute in the Republic of Georgia from the 1930s onward, where to this day, phage formulations are still used to treat bacterial infections of the skin and gut, and other infections that have proved refractory to antibiotic treatments. By the 1940s, however, bacteriophages moved to center stage in a different venue, as a major tool that led to the development of modern-day molecular biology. The physicist Max Delbruck founded the phage group with a goal of characterizing these invisible creatures to better understand the basis of life. As the new field of molecular biology grew, the phages and their bacterial hosts served as primary model organisms. Early concepts of molecular biology were established by members of this group using bacteriophage and their hosts, which included the discovery of genetic mutations in bacteria, the structure of DNA and how it replicated, the triplet nature of the genetic code, and the capacity of DNA to repair itself, including mechanisms of general DNA homologous recombination. The phage group studied various bacteriophages, primarily the T-even lytic phages. In 1953, Esther Lederberg reported the isolation of a new phage, λ, as a prophage resident of Escherichia coli that could be induced from the host cell to propagate lytically. λ is thus a temperate phage and maintains two life-styles: it is able to integrate its 48.5 kb double-stranded DNA (dsDNA) chromosome into that of its E. coli host, allowing passive replication of the phage DNA along with that of the bacterium. 
In this prophage state, the viral lytic functions are repressed until the virus is triggered to undergo lytic growth and make more phage particles, a process that results in lysis of the host cell by phage encoded enzymes. During the lytic growth cycle phage chromosomes are both replicating and recombining with each other, in a process to generate long concatemeric DNA species, which are the precursors for packaging phage chromosomes into newly formed phage heads. It was this system that led early investigators to use phage λ as a tool to study DNA homologous recombination, as one could easily infect bacteria with two mutant forms of the phage, allow them to recombine with one another during a single infection cycle, then determine the numbers of wild type progeny that were generated following lysis of the bacterial cell. It offered hope to determine, in a very simple system, the mechanisms of initiation and resolution of DNA homologous recombination intermediates.

Homologous Recombination Systems in E. coli and λ Phage

DNA homologous recombination is an important universal process in all forms of life. In simple terms, it is genetic exchange between identical or near identical DNA sequences. It is required for lateral transfer of genetic material between bacteria, for the genetic variation generated during sexual reproduction, and for the repair of dsDNA breaks that occur either spontaneously or because of the presence of DNA damaging agents. In essence, homologous recombination is the guardian of the genome: mutations in genes encoding recombination functions can lead to the development of cancer (e.g., BRCA1) or to instability of chromosomal structures and numbers. One of the first species to be used in the laboratory to define the genetic requirements of homologous recombination, and subsequently its molecular mechanisms, was E. coli, together with its phage, λ.

E. coli has a homologous recombination system that is important for the incorporation of foreign DNA into the chromosome and for repair of damaged DNA. In 1962, A. John Clark and Ann Margulies isolated mutants of E. coli defective for recombination. These mutations were in a gene that became known as recA. The RecA protein plays a central role in almost all pathways of recombination in E. coli, by binding to single-stranded DNA (ssDNA) and promoting invasion of the RecA-ssDNA filaments into homologous dsDNA, the initiating event in almost all types of both prokaryotic and eukaryotic homologous recombination systems. Soon thereafter, other E. coli genes were discovered that also played roles in homologous recombination. These included the genes recB, recC and recD, whose products form a heterotrimeric protein, RecBCD, which binds to the ends of dsDNA and digests both the 5′ and 3′ strands in a highly processive manner. RecBCD, also known as ExoV, is an E. coli dsDNA exonuclease complex capable of degrading linear intracellular dsDNA at a rate of 1000 bp/s. As this powerful exonuclease translocates along the DNA, it nicks both strands of dsDNA until encountering an 8 bp sequence called Chi (crossover hotspot initiator; 5′ GCTGGTGG 3′). Recognition of the Chi site alters the enzymatic properties of RecBCD such that degradation of the 3′-ended DNA strand is suppressed. Further translocation of RecBCD past the Chi site allows continued production of a long ssDNA with a 3′ end, which serves as a substrate for the RecA protein.

As noted above, it was recognized that λ phage was able to recombine its DNA in E. coli. The question arose whether λ used the host RecA-RecBCD functions to recombine, or whether it possessed its own recombination system. An answer to this question came in 1967 when Brooks and Clark found that if the E. coli recA gene was inactivated, λ could still promote recombinant formation during phage infection. Thus, it was clear that the phage possessed a generalized recombination system that could work independently of the host RecA protein. Soon thereafter, mutants in λ were identified that disrupted this new phage recombination system. This system was called λ Red, named after the recombination-defective phenotype displayed by these mutants. It was shown genetically that these λ mutants formed two complementation groups, α and β. Biochemical studies later determined that the α gene encoded a dsDNA 5′-3′ exonuclease (called Redα or λ Exo), and the β gene encoded a protein that stimulated annealing between two complementary ssDNA molecules (called Redβ or λ Beta protein). A third λ gene, γ, affects recombination indirectly: the encoded protein (λ Gam) was found to inhibit RecBCD by binding to the dsDNA binding site of the enzyme, thus inhibiting RecBCD from binding to dsDNA ends.
The Gam protein prevents RecBCD from interfering with the processing of dsDNA ends by the λ Exo-Beta complex, and thus indirectly stimulates Red-mediated recombination. It is noteworthy that the λ phage recombination system is assisted by the Gam protein, which directly inhibits a major component of the E. coli recombination system (RecBCD). This turns out to be a requirement, as the λ chromosome contains no Chi sites, and thus cannot efficiently utilize the host recombination system. Since it lacks Chi sites, the λ chromosome is susceptible to complete digestion by the dsDNA exonuclease activity of RecBCD unless the Gam protein is present.

Over a number of years, work from various laboratories contributed to the genetic and biochemical characterization of the λ Red system. Scientists in F.W. Stahl’s laboratory at the University of Oregon, in the C. Radding lab at Yale Medical School, and in the laboratories of T. Poteete and K. Murphy at the University of Massachusetts Medical School were instrumental in defining the in vivo activities and biochemical properties of the Red system, as well as the phage P22 recombination system, and distinguishing them from contributions from the hosts. Two key concepts emerged from these studies: (1) In the absence of RecA, the λ Red system required DNA replication of the phage chromosomes to promote high levels of recombination. Much of this evidence came from replication-blocked phage crosses, where λ Red recombination was highly dependent on RecA. This observation highlighted the direct link between the mechanisms of DNA replication and recombination for the Red system, a mechanistic coupling unique to the phage that does not occur in its E. coli host. (2) A key biochemical finding was that the E. coli RecA protein and λ Beta protein promoted different types of reactions with ssDNA. While a RecA-coated ssDNA filament was capable of finding homologous DNA in the context of the DNA duplex (i.e., strand invasion), λ Beta protein bound to and accelerated the annealing of ssDNA to a complementary ssDNA species. The ssDNA substrate, in the case of Beta, was provided in vivo by the action of λ Exo when it initiated DNA degradation of a linear phage chromosome.
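
As an illustration of the Chi sequence quoted above, the following minimal Python sketch scans a DNA sequence for the octamer 5′ GCTGGTGG 3′ on both strands. The function names (reverse_complement, find_chi_sites) and the short test sequence are hypothetical and are introduced here only for illustration. The sketch reports where the motif occurs and makes no claim about how RecBCD responds to any particular site in vivo.

```python
# Minimal sketch: locate Chi octamers on both strands of a DNA sequence.
# All names here are illustrative assumptions, not part of any published tool.

CHI = "GCTGGTGG"  # Chi sequence as quoted in the text, written 5'->3'

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_chi_sites(seq, motif=CHI):
    """Return a sorted list of (position, strand) tuples.

    position is the 0-based index, on the supplied strand, of the leftmost
    base of each match; strand is '+' when the motif reads 5'->3' on the
    supplied strand and '-' when it reads 5'->3' on the complementary strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence containing one site on each strand.
    demo = "AAGCTGGTGGTTCCACCAGCAA"
    print(find_chi_sites(demo))
```

Run against a complete genome sequence supplied by the reader, a scan of this kind is one simple way to check a statement such as the absence of Chi sites from a given chromosome.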

Biological Roles of the λ Red System

Role in DNA Replication

Enquist and Skalka observed that Red-mediated recombination is required for optimal levels of λ DNA replication. Both the rate of replication and the total amount of λ DNA generated are lower in red mutants. This defect is not rescued by RecA function, suggesting that at least partially non-overlapping functions exist for the phage and host generalized recombination systems. The DNA rolling-circle replication forms that predominate during the late stage of the phage replication cycle are likely to be substrates for Red recombination, since they contain dsDNA ends and active replication forks. It was later noted that rolling-circle tails are shorter in a red mutant background, suggesting that two replicating linear molecules may be recombined into a longer one by the Red system. The products of rolling-circle replication intermediates are multimers of λ chromosomes, termed concatemers. These concatemers are the DNA substrates utilized by the phage terminase enzyme for packaging individual chromosomes into protein capsids. Red is important in generating larger concatemers, thus making more packageable λ chromosomes. However, long concatemers can also persist in the population in the absence of Red, as long as the Gam protein is present. In this case, Gam, by binding to RecBCD, protects the dsDNA ends of rolling circles, thus allowing these replication intermediates to form concatemeric DNA. While recombination is not absolutely required for the formation of packageable DNA, it stimulates this process.

Genetic Exchange Among Similar Phages – Generating Genetic Diversity

The lambdoid phages as a group share similar genomic organizations. The recombination, replication, and morphogenesis functions all share similar positions along the length of the chromosome when lambdoid phage genetic maps are compared. Although individual protein factors may differ structurally, and functions from one phage may fail to complement mutations in analogous genes of other lambdoid phages, complete functional modules, which may include several genes, are often readily exchanged between phages. One example of this modularity is the clustering of recombination functions from phage λ and P22 (a Salmonella phage). The phage P22 annealase called Erf (essential recombination function) cannot substitute for λ Beta in E. coli, and the P22 anti-RecBCD function modifies RecBCD’s enzymatic activity and actually increases its affinity for dsDNA ends, while the λ Gam protein inhibits RecBCD from binding to dsDNA ends. Nonetheless, the λ Red system (three genes) and the P22 system (four genes) can efficiently substitute for one another in E. coli and Salmonella hosts. Recombination reactions in these functional segments occur in short regions of homology between the modules, leading to sequence mosaicism in the lambdoid family of bacteriophages. λ Red-like systems are capable of recombining with short homologies very efficiently and thus would be expected to easily promote recombination among these modules during mixed phage infections. Furthermore, the Red system is also able to act on homeologous DNAs, recombining DNA that is only partially homologous (as much as 22% divergent). This ability gives lambdoid phages even greater capabilities to promote recombination within small regions of imperfect homology between functional modules. Such a relaxed requirement for DNA shuffling would give lambdoid phages an evolutionary advantage over other phages (and bacteria) that require high sequence identity for exchanging DNA segments.

DNA Repair

When phage DNA enters the cytoplasm of a bacterium in nature, restriction endonucleases often generate double-strand breaks in the viral chromosomal DNA. The dsDNA breaks inflicted by these restriction systems can be repaired by recombination with another infecting phage. Heterologous prophages already resident in the cell may also participate in genetic recombination with the damaged DNA of the infecting phage, with portions of the broken chromosomes rescued by recombination and incorporated into resident prophages. It was, in fact, the in vivo delivery of a dsDNA break by a restriction enzyme during experimental phage λ crosses in the Stahl laboratory that demonstrated the importance of dsDNA ends for the action of λ Red in vivo.

DNA damage is the natural inducing agent for excision of lambdoid prophages from the bacterial chromosome, as a result of RecA co-protease-dependent cI repressor cleavage and subsequent expression of the phage lytic genes, including the recombination functions. A low level of DNA damage may allow transient expression of phage functions without inducing a full-fledged lytic cycle. Activation of the Red system through this transient prophage induction increases total cellular recombination potential, which may benefit the bacterial host by providing an additional DNA repair pathway. Transient prophage induction would also increase the likelihood that double-strand breaks in the phage DNA will be rescued by other prophages of the same type or by heterologous prophages with limited homology. Under some circumstances a partially induced prophage may undergo DNA replication while still integrated in the E. coli chromosome, giving rise to onion-skin replication, a replication bubble with multiple replication forks. Exo and Beta may act to resolve these kinds of aberrant structures.

Classical Models of λ Recombination

After nearly three decades of research involving the Red system, two general mechanisms of λ homologous recombination have been defined. One mechanism is the RecA-dependent pathway of Red recombination, which occurs when phage chromosomes are prevented from replicating. In delineating this pathway, crosses were performed with λ mutants defective for genes responsible for the initiation of chromosomal replication. As noted above, the loss of RecA drastically reduced recombination in such crosses. By this mechanism, it is thought that the Red proteins Exo and Beta act on dsDNA ends to generate ssDNA, where the bound Beta protein is replaced by RecA, with assistance of the host RecFOR proteins (functions normally associated with loading RecA onto ssDNA coated with E. coli’s single-stranded DNA-binding protein). This intermediate is then shuttled into the host pathway of homologous recombination, involving both branch migration proteins and Holliday junction resolvases of the host (though the phage resolvase Rap could also play a role).

The second major mechanism of Red recombination is the RecA-independent ssDNA annealing pathway. This route is best described in the context of the rolling-circle mode of chromosomal replication (also called the sigma, or σ, mode), as mentioned above. Such rolling-circle intermediates are thought to be initiated by random nicks in a λ chromosome replicating in the circle-to-circle mode (also called the theta, or θ, mode), which consequently generates dsDNA ends that are distributed across all regions of the λ chromosome. Thus, in this scenario, the ends of two rolling-circle intermediates are acted upon by λ Exo and Beta to produce ssDNA coated with Beta protein. Complementary regions of this ssDNA are then annealed together by the action of λ Beta to form a larger heteroduplex recombinant DNA intermediate. After DNA polymerase I fills the remaining ssDNA gaps, the nicks are sealed by DNA ligase. Thus, the recombination event promotes both genetic diversity in a genetically heterogeneous community and the formation of large concatemeric DNAs refractory to exonuclease degradation for more efficient packaging.

Development of Recombineering

By the late 1990s, it was recognized that the λ Red system might be useful to promote genetic engineering of the E. coli chromosome, by virtue of its ability to promote recombination between linear dsDNA substrates and bacterial chromosomes. It was theorized that the Red and Gam functions, if expressed from a plasmid independent of any other phage functions, might create a hyper-recombinogenic environment in E. coli, and perhaps other hosts as well, including pathogenic bacteria. Experiments performed by one of us (KM) showed that this was indeed the case. Plasmid-borne expression of Red and Gam in an E. coli strain resulted in a strain that could be transformed with linear DNA substrates at very high efficiency. Furthermore, the hyper-rec background of these strains facilitated the insertion of a PCR-amplified linear dsDNA antibiotic resistance gene, flanked on each side by 1 kb homologies to the target gene, at efficiencies up to 100-fold higher than in other recombination-proficient strains used at that time.

Around the same time, the Francis Stewart laboratory was working with another phage homologous recombination system, the RecET system of the cryptic Rac prophage. This system, which is analogous to the λ Red system, possesses a 5′-3′ dsDNA exonuclease (RecE) and a single-strand annealing protein (RecT). Although λ Exo and RecE have similar activities, the full length RecE protein is much larger (nearly 900 amino acids), with the exonuclease activity in the C-terminal domain. The Stewart lab demonstrated that RecET could also promote recombination between bacterial genomic DNA and PCR substrates, but this time with target homologies of about 50 bp. This was a key step in the development of recombineering technologies. This relaxed requirement for the size of flanking homologies allows the preparation of dsDNA substrates to be performed by PCR, where a drug marker serves as a template and flanking homologies are simply added to the 5′ ends of the PCR primers. Plasmid constructs containing drug markers with long flanking homologies, typical of the early days of gene replacement in bacteria, were no longer required. This novel gene replacement technology was quickly followed by similar reports from a number of laboratories. It was found that short flanking homologies (30–50 nt) on the linear dsDNA were adequate for recombination with both the λ Red and RecET systems, though the former system showed mildly better efficiencies. Both systems required λ Gam for efficient gene replacement.


Fig. 1 Linear dsDNA recombineering with PCR-generated substrate. (a) Hybrid primers are designed that are ~70 nt long and contain ~50 nt of 5′ homology that flank the intended target sequence. The remaining ~20 nt at the 3′ ends prime the DNA template, in this case an antibiotic resistance gene (drugR marker) indicated in blue. These hybrid primers are used in a PCR reaction to generate a linear dsDNA containing the amplified cassette with homology to the target gene at both ends. (b) The PCR product is used for recombineering (requiring both λ Beta and Exo) and will replace most of the endogenous DNA, indicated in black, between the two target homology sequences. (c) The final recombinant can be verified using two pairs of PCR primers, 1 + 2 and 3 + 4. These primer pairs will amplify the recombinant junctions between the target and the newly inserted DNA and confirm correct insertion. If the inserted DNA is a different size than the original endogenous sequence, primer pair 1 + 4 can be used to confirm that the recombinant strain contains only the modified copy of the gene and that the original sequence is no longer present. Primer pair 1 + 4 can also be used to amplify a PCR product for DNA sequencing.

The phage recombination functions plus Gam, either residing in the bacterial chromosome or encoded by a plasmid, are typically expressed from one of a number of controllable promoters. These promoters are normally inactive but can be up-regulated for gene expression by displacement of a bound repressor protein (e.g., LacI or AraC) from the operator by small molecule inducers (e.g., IPTG or arabinose, respectively), or by inactivation of a heat-labile repressor. The regulated expression of these recombineering functions is important, as constitutive high levels of expression result in poor growth and can be mutagenic. Once induced, the phage recombination functions promote a hyper-recombinogenic environment, where linear DNA species recombine with target chromosomal or episomal sequences at very high frequencies. The DNA substrates can be dsDNA that is either generated by PCR or chemically synthesized, or single-stranded DNA oligonucleotides. To perform recombineering, the cells are grown to moderate density, the recombination functions are induced, the cells are collected by centrifugation and washed, and linear DNA donor substrates are introduced into the bacteria by electroporation. The cells are then suspended in liquid media and grown for a period of time, to allow recombination between the introduced substrate and its target chromosome (or episome), and to promote expression of any antibiotic resistance cassette. The cells are then plated on petri plates under appropriate conditions to recover recombinants; most often an antibiotic selection is applied. Recombineering is efficient, precise, and accurate. The ease and speed of the process has made it the method of choice for genetic engineers to alter bacterial chromosomes, study bacterial pathogenesis, perform metabolic engineering of industrially-important bacteria, or manipulate eukaryotic DNA cloned into plasmids and bacterial artificial chromosomes (BACs).

Recombineering With dsDNA

The most common type of genetic modification constructed with recombineering is a simple gene replacement, where the target gene is replaced by a DNA cassette conferring drug resistance. For this purpose, a PCR is used to generate linear dsDNA where the targeting homologies are incorporated into the 5′ ends of the PCR primers (Fig. 1). The PCR products are made using chimeric (hybrid) primers of about 70 nt in length. The 5′ ends provide ~50 nt of homology to the target DNA and the 3′ ends provide ~20–25 nt of priming sequence to amplify the drugR cassette. Although the template for PCR can theoretically be a drugR marker contained on a plasmid, care must be taken to prevent carryover of the supercoiled template plasmid from the PCR reaction, which leads to false positive recombinants following electroporation. Treating the PCR with the restriction enzyme DpnI can minimize template carryover, as it is active only on methylated DNA and will not digest PCR products. Other appropriate templates for PCRs used in recombineering include drug-resistant markers on conditionally-replicating plasmids, or bacterial colonies or cultures. For the latter, a small portion of a medium-sized colony, or several microliters of an overnight culture, will serve well as a template in a standard PCR to generate recombineering substrates, provided the PCR program contains an initial (95°C) heating cycle to lyse the cells prior to cycling. To this end, the Court laboratory has engineered a useful bacterial strain called T-SACK, which contains most of the drug-resistant markers typically used for recombineering in E. coli. Contained in this strain are the genes tetA (TetR), sacB (sucroseS), bla (AmpR), cat (CamR), and kan (KanR), making this a useful single source for the variety of drug-resistant markers used for recombineering. Use of the T-SACK strain as a PCR template for drugR dsDNA markers is preferable to using a plasmid template, since the possibility of false positives resulting from contamination by residual supercoiled plasmid template is eliminated. The error-prone Taq polymerase is suitable for this PCR, as any errors that inactivate the drugR cassette during amplification will not give rise to colonies on the antibiotic selection plate.

Fig. 2 Two-step selection/counterselection procedure to generate unmarked gene knockouts. (a) A linear dsDNA dual cassette containing two adjacent genes, one encoding drug resistance (indicated in cyan) and the other a counterselectable marker, e.g., sacB (indicated in yellow), is generated using PCR with hybrid primers that provide flanking homologies to the desired target, as described in Fig. 1. (b) In the first recombineering step, the linear dsDNA dual cassette is inserted at the desired target, and selection for drug resistance is applied. (c) In the second recombineering step, a linear dsDNA substrate containing the final desired DNA sequence (in red) with flanking homologies is provided. Application of the counterselection allows recovery of recombinants in which the dual cassette has been replaced with the desired DNA. (d) As described in Fig. 1, the final recombinant should be verified with the appropriate PCR reactions and by DNA sequencing.
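
The hybrid primer layout used for dsDNA recombineering (about 50 nt of target homology at the 5′ end of each primer and about 20 nt of cassette priming sequence at the 3′ end) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a primer design tool: the function names, the argument names, and the repetitive test sequences are hypothetical, all sequences are assumed to be written 5′ to 3′ on the same (top) strand, and no check of melting temperature or secondary structure is attempted.

```python
# Minimal sketch of hybrid (chimeric) primer construction for a gene
# replacement. Names and defaults are illustrative assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_hybrid_primers(upstream, downstream, cassette,
                          homology_len=50, priming_len=20):
    """Build forward and reverse hybrid primers.

    upstream   -- top-strand sequence ending immediately before the region
                  to be replaced
    downstream -- top-strand sequence starting immediately after that region
    cassette   -- top-strand sequence of the drug-resistance cassette
    """
    if homology_len < 1 or priming_len < 1:
        raise ValueError("lengths must be positive")
    upstream, downstream, cassette = (
        s.upper() for s in (upstream, downstream, cassette))
    if min(len(upstream), len(downstream)) < homology_len:
        raise ValueError("flanking sequence shorter than homology_len")
    if len(cassette) < priming_len:
        raise ValueError("cassette shorter than priming_len")

    left_arm = upstream[-homology_len:]    # homology at the left junction
    right_arm = downstream[:homology_len]  # homology at the right junction

    # Forward primer: left homology (5') followed by the start of the cassette.
    forward = left_arm + cassette[:priming_len]
    # Reverse primer: anneals to the top strand, so both parts are
    # reverse-complemented, with the homology again at the 5' end.
    reverse = (reverse_complement(right_arm)
               + reverse_complement(cassette[-priming_len:]))
    return forward, reverse


def predicted_product(upstream, downstream, cassette, homology_len=50):
    """Top strand of the expected PCR product: arm + cassette + arm."""
    return (upstream[-homology_len:] + cassette
            + downstream[:homology_len]).upper()


if __name__ == "__main__":
    # Hypothetical, deliberately artificial sequences.
    up, down, cas = "ACGT" * 20, "TTGCA" * 16, "ATGGCC" * 30
    fwd, rev = design_hybrid_primers(up, down, cas)
    print(len(fwd), fwd)
    print(len(rev), rev)
```

With the default lengths each primer is 70 nt, matching the size quoted in the text; a 25 nt priming region would simply lengthen each primer to 75 nt.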

Markerless Gene Deletions and Counterselection Schemes

Linear dsDNA recombineering can also be used in a two-step selection/counterselection method (Fig. 2), if it is undesirable for an antibiotic resistance gene to remain in the final construct. Several dual cassettes with two genes are commonly used; these cassettes contain a drug marker linked to a second gene whose function can be selected against. The sacB gene from Bacillus subtilis is frequently used as a counterselection marker, since bacteria containing this gene are unable to grow on solid agar medium containing sucrose. The sacB gene has been linked to several different drug markers, to generate selection/counterselection dual cassettes. Other counterselectable genes can also be used, including the wild type rpsL gene in a rpsL31 (strepR) mutant (where loss of the overexpressed WT rpsL gene confers streptomycin resistance) and tolC (resistance to colicin E1). In the first recombineering step, the drug resistance encoded by the dual cassette is selected, and the cells are tested to verify that the counterselectable function is expressed. In the second recombineering step, the counterselectable function is selected against, allowing the dual cassette to be replaced with the desired genetic alteration, which may be either a net deletion/small insertion created with a ssDNA oligo, or any desired dsDNA sequence created with a PCR or synthetically designed substrate. This method is useful to insert non-selectable pieces of DNA such as gene tags (e.g., gfp). Replacement of the dual cassette is confirmed using PCR analysis and final constructs are verified by DNA sequencing.

Recombineering With ssDNA

The simplest recombineering reaction uses linear ssDNA, usually provided as synthetic DNA oligonucleotides (oligos). For oligo-mediated recombineering, only the expression of λ Beta is required; because the substrate is already single-stranded, λ Exo is not needed. Oligos of ~70 nt are optimal in length, with the desired base change(s) in the center of the linear DNA. In wild type E. coli cells, oligos yield recombinants at a frequency of about 1 out of 1000 total cells. Thus, a selection scheme is usually required for single base changes, insertions, or deletions. However, the Court lab found that the frequency of oligo-mediated recombineering can be increased ~100-fold by using strains that are deficient for methyl-directed mismatch repair (MMR), providing further evidence of a mechanistic link between λ Red and host DNA replication machinery. In the absence of MMR, point mutations and changes of a few bases can be made very easily, with 30%–50% of the total viable cells (after outgrowth) containing the oligo-mediated mutation in their chromosomes. The model for this mechanism is straightforward. When the ssDNA oligo anneals to the lagging strand template at the replication fork, mismatches present between the oligo and the template strand are recognized by the MMR system, and are corrected in favor of the wild-type DNA sequence. Thus, MMR mutants expressing λ Beta demonstrate higher rates of oligo-mediated recombineering.

However, because cells lacking MMR accumulate mutations, it is preferable to avoid use of such mutants as recombineering hosts. One method to bypass the MMR system is the use of oligos that generate C-C mismatches. Since these mismatches are not recognized by the MMR system in E. coli, oligos that generate this particular mismatch give high frequency mutagenesis. The Court laboratory also found that when modifying genes, incorporation of base changes at several adjacent wobble codons near the desired nucleotide change allows evasion of an active MMR system and can be done without changing the amino acid sequence of a protein of interest. The ssDNA oligo design for this scheme generally results in a high frequency of recombinant formation, even in MMR+ bacteria. Other measures to avoid MMR involve the co-expression of dominant negative mutations of Dam methylase or the mismatch recognition function MutS along with λ Beta protein during the induction period. In these circumstances, the inhibition of the MMR system is transient, allowing the oligo to avoid correction for just a short recombinogenic window, leaving the MMR system intact for the outgrowth period. When MMR is avoided, high efficiency genetic alterations are possible with oligo recombineering, including single point mutations and small changes of a few base pairs. Large deletions of more than several kb can also be made with ssDNA oligos, but the efficiency of this reaction is much lower and generally requires a selection.
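
Two of the design points above, an oligo of about 70 nt with the change at its center and the special status of C-C mismatches, can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions: the function names and the test sequence are hypothetical, the oligo is written in the same sense as the reference sequence supplied, and the choice of which strand to target is not considered here.

```python
# Minimal sketch: a ~70 nt oligo centered on a single base change, plus the
# mismatch that change creates when the oligo anneals to the complementary
# strand. Names are illustrative assumptions.

_PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}


def design_point_mutation_oligo(reference, position, new_base, oligo_len=70):
    """Return an oligo of oligo_len nt carrying new_base near its center.

    reference -- sequence written 5'->3'
    position  -- 0-based index in reference of the base to change
    """
    reference = reference.upper()
    new_base = new_base.upper()
    if len(new_base) != 1 or new_base not in _PAIR:
        raise ValueError("new_base must be one of A, C, G, T")
    if not 0 <= position < len(reference):
        raise ValueError("position outside reference")
    if reference[position] == new_base:
        raise ValueError("new_base is identical to the reference base")

    start = position - oligo_len // 2
    end = start + oligo_len
    if start < 0 or end > len(reference):
        raise ValueError("not enough flanking sequence for this oligo length")

    window = reference[start:end]
    offset = position - start
    return window[:offset] + new_base + window[offset + 1:]


def mismatch_at_change(reference_base, new_base):
    """Return (oligo base, template base) at the changed position.

    The template is the strand complementary to the reference, so its base
    is the Watson-Crick partner of the original reference base.
    """
    return new_base.upper(), _PAIR[reference_base.upper()]


if __name__ == "__main__":
    ref = "A" * 40 + "G" + "T" * 40        # hypothetical 81 nt reference
    oligo = design_point_mutation_oligo(ref, 40, "C")
    print(len(oligo), oligo)
    print(mismatch_at_change("G", "C"))    # oligo C opposite template C
```

In this representation a G to C change on the oligo strand is the case that places a C opposite a C in the annealed intermediate, which is the mismatch described above as escaping MMR in E. coli.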

In vivo Cloning

Another useful dsDNA recombineering reaction mediated by the Red system results in a segment of bacterial genomic DNA being incorporated into an electroporated plasmid backbone to form an intact replicating plasmid. This procedure, called in vivo cloning, is a method to clone large segments of bacterial chromosomes, or eukaryotic DNA from BAC libraries, without the use of restriction enzymes or PCR. It is especially useful for cloning large chromosomal segments, where classical methods of restriction enzyme digestion and ligation are not feasible as a result of the lack of unique cutting sites. To perform in vivo cloning, a DNA plasmid backbone containing, at a minimum, the origin of DNA replication and a drugR selection marker, is amplified using chimeric primers. The 5′ ends of these primers provide homology to the target DNA to be retrieved, and the 3′ ends prime the plasmid DNA sequence (Fig. 3). Cloning by retrieval is a low efficiency reaction but is easily selected for by plating on antibiotic-containing selection plates. The advantage of in vivo cloning is that segments of the bacterial genome are cloned without subjecting them to a PCR, which might otherwise introduce polymerase errors. Instead, once retrieved from the chromosome and inserted into the vector backbone, the insert replicates using the high-fidelity system of the E. coli replisome (and its proofreading capabilities). Any errors that might persist in the wake of the replisome are caught by the MMR system.

The most common false positives occurring with in vivo cloning arise when the electroporated plasmid recircularizes without the insert, often as a result of recombination between micro-homologies that exist in the terminal ends of the vector. Such false positives can be minimized by careful design of the homologous sequences to the chromosomal target region, in order to avoid such micro-homologies. Control transformations in the absence of the recombination functions should be done to verify that the plasmid backbone does not circularize at high efficiency without incorporating the desired insertion. The Stewart laboratory found that the RecET system is superior to λ Red for one type of in vivo cloning: the joining of multiple linear dsDNAs containing short terminal homologies to form intact circular plasmids. Recombination occurs at the DNA ends and depends on homology rather than on restriction sites, giving flexibility in plasmid design. For reasons that are not understood, the full length RecE protein is required for this linear DNA assembly to occur at high efficiency.

Mechanism of Oligo-Mediated Recombineering Oligo-mediated recombineering requires only a single function, a ssDNA annealing protein, Beta or RecT. For any chromosomal target, there are two alternative sequences for a recombinogenic oligo: one targeting either the leading strand or the lagging strand templates. It was observed that for these two complementary substrates, one oligo gave a B20-fold higher recombination frequency relative to the other. This difference correlates with the direction of DNA replication through the target site, and the higher efficiency oligo is complementary to the lagging strand template. This observation suggested that the recombination is occurring at the DNA replication fork, with oligos annealing to the lagging strand template at single-strand gaps present in the newly synthesized discontinuous lagging strand, as illustrated in Fig. 4. In effect, the oligo becomes a pseudo-Okazaki fragment, with filling-in of the gap behind the 5′ end and extension from the 3′ end performed by DNA polymerase I, so that the oligo is incorporated into the growing new daughter strand. Subsequent work from several laboratories strengthened the evidence that in


Bacteriophage: Red Recombination System and the Development of Recombineering Technologies

Fig. 3 In vivo cloning by retrieval. To retrieve a gene or other DNA sequence from the bacterial chromosome, hybrid primers, shown in black, are used to amplify a plasmid backbone, indicated in gray, containing an origin of DNA replication (ori) and usually an antibiotic resistance gene to allow for plasmid selection. The hybrid primers should be designed with the two homology sequences (indicated in green and red) facing inward, so that when the recombination between the plasmid backbone and the bacterial DNA occurs, the desired DNA sequence will be incorporated onto the plasmid backbone. The DNA to be retrieved, ‘Your Favorite Gene’ (YFG, indicated in blue), is located on the bacterial chromosome or BAC. This reaction can also be used to create an intact circular plasmid from two or more PCR products.

Fig. 4 Model for single-strand DNA annealing at the DNA replication fork promoted by the λ Beta protein. A simplified schematic of a DNA replication fork is presented, with the leading and lagging strand templates in red and blue, respectively, and strand polarities indicated. The newly replicated leading and lagging strands are in black with the direction of DNA synthesis indicated by the arrowheads. Beta (or other single-strand annealing protein) binds a ssDNA oligonucleotide and anneals it to complementary DNA sequence on the lagging strand template in a gap that arises during discontinuous replication of the lagging strand. Beta may also anneal a complementary ssDNA oligo to single-strand DNA on the leading strand template ahead of the newly replicated leading strand, but this recombination is less efficient than a recombination event targeting the lagging strand.

order for recombinants to form, the circular DNA being targeted (either chromosome or plasmid) must be replicating, consistent with the idea that recombination occurs predominately by single-strand annealing, rather than by strand-invasion.
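The strand choice described above can be made mechanical once the replication direction at the target is known. The sketch below assumes a circular chromosome with a single origin and terminus and with the terminus coordinate smaller than the origin coordinate; the coordinates in the example are rough values for E. coli K-12 used only for illustration and should be checked against the annotation of the strain in use. All function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def replichore(position: int, ori: int, ter: int) -> int:
    """Return 1 if the fork passes `position` toward increasing coordinates, else 2.

    Assumes ter < ori, so replichore 2 is the arc between ter and ori.
    """
    if ter <= position < ori:
        return 2
    return 1


def lagging_strand_oligo(top_strand_oligo: str, position: int, ori: int, ter: int) -> str:
    """Return the oligo that is complementary to the lagging strand template.

    On replichore 1 the top strand is the lagging strand template, so the
    oligo must be the reverse complement of the top-strand design. On
    replichore 2 the bottom strand is the lagging strand template, so the
    top-strand design is used unchanged.
    """
    if replichore(position, ori, ter) == 1:
        return reverse_complement(top_strand_oligo)
    return top_strand_oligo


# Approximate, assumed coordinates (bp) for illustration only.
ORI, TER = 3_925_000, 1_590_000
print(lagging_strand_oligo("ATGC", 360_000, ORI, TER))
print(lagging_strand_oligo("ATGC", 2_500_000, ORI, TER))
```

The other oligo of each pair targets the leading strand template and, as noted above, is expected to recombine at a markedly lower frequency.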

Mechanism of dsDNA Recombineering Two laboratories have reported that dsDNA recombination likely proceeds through a single-stranded DNA intermediate. In this model, after a dsDNA substrate is introduced into a bacterial cell expressing λ Red, one strand is entirely removed by phage λ Exo, leaving the complementary ssDNA bound by the recombinase; this strand is then incorporated by single-strand annealing at the cellular target, much like the mechanism described above for oligo-mediated recombineering. Support for this mechanism comes from the fact that, for most recombinants, the sequence information in the homologous regions is derived from only one of the two strands of the dsDNA substrate. This mechanism is likely to operate more often with short dsDNA substrates than with longer ones, since λ Exo is only moderately processive (~3 kb). However, the fact that this pattern is not seen in all recombinants, and is seen less often in recombinants made with long substrates, suggests that another, apparently less-favored mechanism is also at work.

Recombineering in Pathogenic Bacteria Recombineering technology opened up the toolbox of genetic engineering in E. coli. The fact that the λ Red functions could be supplied from an episome suggested early on that the system might be amenable for use in clinically relevant bacteria, simply by transformation of a Red + Gam-producing plasmid into a clinical strain of interest. This proved to be the case, and the plasmid-borne


λ Red system has been used to make gene knockouts in enterohemorrhagic and enteropathogenic E. coli, to construct vaccines in Shigella species, and to manipulate the chromosome of Salmonella enterica. The successful use of λ Red for genetic modification in these bacteria is not too surprising, given the genetic relatedness between these bacterial species and E. coli K12. In more distantly related bacteria such as Pseudomonas species, the λ Red system can work, but high efficiency requires longer regions of targeting homology. Recombineering in non-E. coli species is most successful when recombination systems from phages endogenous to the bacterium in question are found and characterized. This was the approach taken by the Hatfull lab when investigators there identified a RecET-like system in the mycobacterial phage Che9 and demonstrated its use in both dsDNA and oligo-mediated recombineering in Mycobacterium smegmatis and Mycobacterium tuberculosis. Other studies have identified λ Beta or RecT-like annealases in the chromosomes of diverse bacteria and have shown that they are capable of promoting oligo-mediated recombineering in E. coli at a wide range of efficiencies. A recent search for λ Beta-like proteins for use in the soil organism Pseudomonas putida revealed an annealase activity (the Ssr protein) capable of promoting oligo-mediated recombineering, setting the stage for further development of recombineering in this important strain, which is often used in metabolic engineering strategies. Similarly, a Beta-like W3 phage recombinase has recently been developed for use in Shewanella, a genus of metal-reducing bacteria important in bioremediation.

Combining Recombineering With Other Gene Modification Technologies The gene-editing tool CRISPR has revolutionized the genetic toolbox for eukaryotic cells. In its simplest manifestation, RNA guides bring the Cas9 nuclease to the target gene to deliver an endonucleolytic double-strand cut at the desired location. Because of the presence of non-homologous end-joining (NHEJ) functions in eukaryotic cells, the cut is often repaired inefficiently by rejoining the dsDNA break after some nucleotide degradation occurs at the cut site, leading to gene inactivation. Bacteria typically lack NHEJ systems, and as such, cuts are either repaired by homologous recombination pathways, or not. In the latter case, the cut is lethal. It was recognized that CRISPR could be used in bacteria, not for gene inactivation per se, but more importantly to select against non-recombinants following recombineering. The Marraffini lab combined the CRISPR-Cas9 technology of creating targeted in vivo dsDNA breaks with Red-mediated oligo recombineering, using dsDNA cleavage as a counterselection step against unmodified chromosomes in E. coli and Streptococcus pneumoniae. This methodology drives up the apparent frequency of oligo-mediated recombineering, to the point where half of the total viable cells recovered were recombinants. Since this efficiency is so high, recombinants can be identified by DNA sequencing of a few candidates, with no need for selections or screens. These experiments also suggested that Red recombination did not repair the double-strand break created by CRISPR-Cas9, and that Cas9 cleavage was acting solely as a counterselection step against non-recombinants (i.e., parental DNA). Clearly, one requirement in using this combination of technologies is timing; that is, the phage-mediated linear DNA recombination must occur before Cas9 cutting is activated.
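The effect of counterselection on the recovered population can be illustrated with a short calculation. The model below is a simplification introduced here, not taken from the cited work: it assumes that recombinants are fully protected from Cas9 cutting and that a fixed fraction of non-recombinant (parental) cells is killed. The numbers in the example are hypothetical.

```python
def recombinant_fraction(recomb_freq: float, kill_efficiency: float) -> float:
    """Fraction of surviving cells that are recombinants after counterselection.

    recomb_freq: fraction of cells that are recombinant before Cas9 cutting.
    kill_efficiency: fraction of parental cells eliminated by Cas9 cutting.
    Assumes recombinants are not cut at all (a simplifying assumption).
    """
    if not (0.0 <= recomb_freq <= 1.0 and 0.0 <= kill_efficiency <= 1.0):
        raise ValueError("both arguments must lie between 0 and 1")
    surviving_parentals = (1.0 - recomb_freq) * (1.0 - kill_efficiency)
    survivors = recomb_freq + surviving_parentals
    if survivors == 0.0:
        raise ValueError("no cells survive under these parameters")
    return recomb_freq / survivors


# Hypothetical example: 1% recombinants before cutting, 99% of parentals killed.
print(round(recombinant_fraction(0.01, 0.99), 3))  # about 0.503
```

Under these assumed values roughly half of the survivors are recombinants, which shows how a modest recombination frequency can be converted into a population that is practical to screen by sequencing a few colonies.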
In another manifestation of combining two separate technologies into one genetic modification scheme, oligo-mediated recombineering is used to deliver the P1 phage site-specific attachment site loxP to different regions of a bacterial chromosome, separated by 10 kb of DNA or more. Following this step, expression of the site-specific recombinase Cre will promote recombination between the two loxP sites, and depending on the relative orientation of these sites, will promote either inversion of the marked segment or its deletion. This has been a common practice in genome reduction strategies, and to identify essential regions of a bacterial genome. In a more recent example of a similar technology in M. tuberculosis, Che9 RecT-promoted oligo-mediated recombineering is coupled with the mycobacterial Bxb1 phage site-specific integration system. In this system called ORBIT (Oligo Recombineering followed by Bxb1 Integrase Targeting), the ssDNA oligo contains the Bxb1 phage attP site. The attP-containing oligo is co-electroporated with a non-replicating plasmid containing Bxb1 attB, a drugR-marker, and a payload (e.g., a gfp tag). In this system, the introduction of the attP site is followed by integration of the payload plasmid in the same outgrowth period. Thus, gene deletions or fusions can be made by co-electroporating an oligo with a ready-made target-independent plasmid. A major advantage of ORBIT is that no target-specific plasmids or PCR products need to be generated for the construction of gene knockouts or genetic fusions.
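The dependence of the Cre outcome on loxP orientation, described above, can be shown with a toy sequence model. This sketch works on a linear top-strand string containing exactly two sites; the 34-bp loxP sequence is the commonly cited one and is included here as an assumption, and the function names are hypothetical.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # commonly cited loxP site (assumed)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_all(seq: str, site: str):
    """Return the start index of every occurrence of `site` in `seq`."""
    hits, start = [], seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)
    return hits


def cre_recombine(seq: str):
    """Apply one Cre event between exactly two loxP sites.

    Sites in the same orientation give a deletion that leaves one site;
    sites in opposite orientation give an inversion of the flanked segment.
    """
    sites = sorted(
        [(i, "+") for i in find_all(seq, LOXP)]
        + [(i, "-") for i in find_all(seq, reverse_complement(LOXP))]
    )
    if len(sites) != 2:
        raise ValueError("expected exactly two loxP sites")
    (first, orient1), (second, orient2) = sites
    end = second + len(LOXP)
    if orient1 == orient2:
        return "deletion", seq[:first] + seq[second:]
    return "inversion", seq[:first] + reverse_complement(seq[first:end]) + seq[end:]
```

On a circular chromosome the deleted segment would be released as a separate circle carrying the second site; the sketch returns only the chromosome that remains.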

Recombinase-Independent Recombination It has been observed that in some bacteria, electroporation of linear ssDNA substrates gives rise to recombinants that are independent of any known phage recombinase. While this is not strictly “recombineering”, there are similarities between this type of ssDNA recombination and phage-mediated ssDNA recombineering. B. Swingle and others have found that oligo-mediated recombinants can be found at a frequency of 10⁴/10⁸ viable cells when high oligo concentrations are used. The recombination displays a lagging strand bias, and oligos that target the lagging strand and escape mismatch repair give the highest efficiencies. Recombinase-independent oligo-mediated recombination has so far been demonstrated in Pseudomonas syringae, E. coli, S. enterica, Shigella flexneri, Yersinia pseudotuberculosis, Legionella pneumophila and Shewanella oneidensis. Recombinase-independent recombination provides a toe-hold for creating mutations in bacteria that lack recombineering systems, and thus affords a starting point for development of in vivo genetic engineering in these organisms. Although the frequencies of this recombinase-independent oligo incorporation are too low to allow direct screening for recombinants, the level of


Fig. 5 A flow-chart for recombineering. In all cases the desired genetic constructs should first be designed in silico. The appropriate linear DNA substrate and expression system will vary depending on the desired alteration. The outgrowth procedures and methods used to identify the recombinant clone will also differ.

recombination can be dramatically enhanced by using it in combination with CRISPR-Cas9 cutting. The dsDNA break created by Cas9 is used as a counterselection and enriches the recombinant class by eliminating non-recombinants, just as when CRISPR-Cas9 is coupled to canonical recombineering.

Final Thoughts Over the last ~20 years the development of recombineering technology has revolutionized bacterial genetic engineering. Many papers have been published describing the various possible reactions and helping to elucidate the molecular mechanisms involved in this bacteriophage-mediated recombination. A few seminal publications are listed in the following section. For those considering recombineering, a flow chart defining the necessary steps is provided in Fig. 5. While this article has described the present state of recombineering, it is a rapidly evolving field, since phages are the most abundant organisms in nature and new isolates are being characterized daily. Many of these newly identified phages will contain generalized recombination functions analogous to the Red and RecET systems; these functions have the potential to enable gene editing in their specific bacterial hosts.

Further Reading Cairns, J., Stent, G.S., Watson, J. (Eds.), 1966. Phage and the Origins of Molecular Biology, first ed. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory. Costantino, N., Court, D.L., 2003. Enhanced levels of λ Red-mediated recombination in mismatch repair mutants. Proceedings of the National Academy of Sciences of the United States of America 100 (26), 15748–15753. Court, D.L., Sawitzke, J.A., Thomason, L.C., 2002. Genetic engineering using homologous recombination. Annual Review of Genetics 36, 361–388. Ellis, H.M., Yu, D., DiTizio, T., Court, D.L., 2001. High efficiency mutagenesis, repair, and engineering of chromosomal DNA using single-strand oligonucleotides. Proceedings of the National Academy of Sciences of the United States of America 98 (12), 6742–6746. Fu, J., Bian, X., Hu, S., et al., 2012. Full-length RecE enhances linear-linear homologous recombination and facilitates direct cloning for bioprospecting. Nature Biotechnology 30 (5), 440–446. Jiang, W., Bikard, D., Cox, D., Zhang, F., Marraffini, L.A., 2013. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nature Biotechnology 31 (3), 233–239. Kutter, E.M., Kuhl, S.J., Abedon, S.T., 2015. Re-establishing a place for phage therapy in western medicine. Future Microbiology 10 (5), 685–688. Marinelli, L.J., Piuri, M., Hatfull, G.F., 2019. Genetic manipulation of lytic bacteriophages with BRED: Bacteriophage recombineering of electroporated DNA. Methods in Molecular Biology 1898, 69–80. Murphy, K.C., 1998. Use of bacteriophage λ recombination functions to promote gene replacement in Escherichia coli. Journal of Bacteriology 180 (8), 2063–2071. Murphy, K.C., 2016. λ Recombination and recombineering. EcoSal Plus 7 (1), doi:10.1128/ecosalplus.ESP-0011-2015. Murphy, K.C., Nelson, S.J., Nambi, S., et al., 2018. ORBIT: A new paradigm for genetic engineering of mycobacterial chromosomes. mBio 9 (6), doi:10.1128/mBio.01467-18.


Sawitzke, J.A., Thomason, L.C., Costantino, N., et al., 2007. Recombineering: In vivo genetic engineering in E. coli, S. enterica, and beyond. Methods in Enzymology 421, 171–199. Stahl, M.M., Thomason, L., Poteete, A.R., et al., 1997. Annealing vs. invasion in phage λ recombination. Genetics 147 (3), 961–977. Thomason, L.C., Sawitzke, J.A., Li, X., Costantino, N., Court, D.L., 2014. Recombineering: Genetic engineering in bacteria using homologous recombination. Current Protocols in Molecular Biology 106. (1:16.1–1:16.39). Zhang, Y., Buchholz, F., Muyrers, J.P., Stewart, A.F., 1998. A new logic for DNA engineering using recombination in Escherichia coli. Nature Genetics 20 (2), 123–128.

Nanotechnology Application of Bacteriophage DNA Packaging Nanomotors
Tao Weitao, Southwest Baptist University, Bolivar, MO, United States
Lixia Zhou, Zhefeng Li, Long Zhang, and Peixuan Guo, College of Pharmacy, The Ohio State University, Columbus, OH, United States
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
ATP Adenosine triphosphate
qRT-PCR Quantitative reverse transcription polymerase chain reaction

Glossary
ATPase gp16 An enzyme that catalyzes the hydrolysis of ATP; gp16 is the DNA packaging ATPase of the virus phi29.
Bacteriophage A virus that can infect and replicate within bacteria.
Biomotor/nanomotor A nanometer-scale motor that drives the forward movement of molecules.
Connector/nanopore A nanometer-scale hollow channel or pore for the translocation of molecules, also known as the portal.
Procapsid A preformed protein shell in the virus assembly stage of many bacteriophages.

Introduction We are all surrounded by traditional macroscale motors, including both combustion engines and electric motors. One might ask: can we downscale the way we think of motors to the nanoscale range? The possible applications of nanomotors in nanotechnology will be intriguing but different. However, when we look at the basic mechanisms of macroscale and nanoscale motors, they show many striking parallels. Like macroscopic motors, biological nanomotors are built from static modules or parts that form a framework for moving and transporting substrates. The moving parts are not restricted to rotary or revolving motion; some motors perform linear motion. Moreover, both macroscopic motors and nanomotors work by repeating the same cycle of steps over and over. Active motion is a typical and prominent feature of life. Motion distinguishes living organisms from inanimate matter. Motion is action derived from motors. All important life activities rely on biomotors: at the macroscopic level, respiration, heartbeats, walking, speaking, and blinking. At the microscopic level, biomotors drive DNA replication, DNA repair, RNA transcription, homologous recombination, intracellular transport, molecular transport, and viral assembly. Biomotors are powered by ATP. The amount of ATP produced and consumed by one person each day is close to body weight, about 50–80 kg. Biological nanomotors are nanoscale machines that convert a primary energy source to mechanical motion between active and framework components. Motor complexes are vital to biological systems, and they support efficient, directional motility for the transportation of cellular components. Owing to progress in structural biology, single molecule methods, and biotechnology approaches, the understanding of the molecular basis of biological movement has advanced to the point where we can start thinking about possible applications of biomotors in nanotechnology.
The elegant and delicate structures of viral double-stranded (ds)DNA packaging motors have inspired their application in nanotechnology. For example, studies on the motor packaging RNA (pRNA) of bacterial phage phi29 have led to the emergence of the field of RNA nanotechnology for disease treatment (Fig. 1). The discovery of the highly stable pRNA three-way junction (3WJ) motif (Fig. 1) made it possible to construct multivalent RNA nanoparticles with high chemical and thermodynamic stability, which advanced the development of RNA nanotechnology. RNA nanotechnology is the bottom-up self-assembly of nanometer-scale architectures that are composed mainly of RNA (Fig. 1). In RNA nanoparticles, the scaffolds, targeting ligands, and therapeutic miRNAs or siRNAs can all be composed mainly of RNA (Fig. 1). While classical studies on RNA focus on intra-RNA interactions and 2D/3D structure, RNA nanotechnology has more recently extended to inter-RNA interactions and quaternary (4D) structure (Fig. 1). The resulting branched RNA nanoparticles are homogeneous, uniform in size and shape, and can harbor different functionalities while retaining their tertiary folding and independent functionalities both in vitro and in vivo. Another new approach to developing highly potent drugs arises from the study of motor ATPases with high stoichiometry of homologous subunits (Fig. 2). Many ATPase biomotors contain six pRNA copies that can serve as drug targets (Z = 6), and inactivating a single pRNA subunit inhibits the entire biomotor function (K = 1), resulting in high inhibition (Fig. 2(D)). This leads to a novel drug targeting strategy: increasing the number of subunits that are potential targets (raising Z) further improves drug efficiency. This idea is akin to series circuits, such as those in traditional Christmas lights, where if a single bulb stops working, the entire strand fails (Fig. 2(A)). In contrast, traditional drug development is analogous to parallel circuits’


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21520-3


Fig. 1 Emerging field of RNA nanotechnology. (A) Construction and characterization of multivalent pRNA nanoparticles with tunable shape and stoichiometry. (B) Tectonics method to construct RNA nanosquares, polyhedron from tRNA, and nanoprisms from phi29 pRNA. (C) Computational approaches to expedite RNA nanoparticle manufacture and optimization. Adapted with permission from (A) Xu, C., Haque, F., Jasinski, D.L., et al., 2018. Favorable biodistribution, specific targeting and conditional endosomal escape of RNA nanoparticles in cancer therapy. Cancer Letters 414, 57–70. Available at: doi:10.1016/j.canlet.2017.09.043 © 2017 Elsevier B.V. All rights reserved. (B) Jasinski, D., Haque, F., Binzel, D.W., Guo, P., 2018. The advancement of the emerging field of RNA nanotechnology. ACS Nano 11(2), 1142–1164. Available at: doi:10.1021/acsnano.6b05737 Copyright © 2017 American Chemical Society. (C) Shu, Y., Yin, H., Rajabi, M., et al., 2018. RNA-based micelles: A novel platform for paclitaxel loading and delivery. Journal of Controlled Release 276,17–29. Available at: doi:10.1016/j.jconrel.2018.02.014 © 2018 Elsevier B.V. All rights reserved.

challenge to circumvent, as a single bulb’s failure has no effect on the rest of the strand. By addressing multicomponent systems more like series circuits, it may be possible to create ultra-high inhibitory drugs. The phi29 motor is part of a large ATPase family (AAA+) that was used to elucidate this potent drug inhibition mechanism. Drugs that target multisubunit complexes in this manner include rimantadine for influenza, ALLINI for HIV, and bedaquiline for tuberculosis (Fig. 2(E)). Bacillus phage phi29 and Escherichia phage T7 are double-stranded DNA (dsDNA) tailed bacteriophages of the Podoviridae family that share one striking feature. During replication and morphogenesis, they encapsulate and compress their genomic DNA with tremendous velocity into preformed protein shells, called procapsids, driven by hydrolysis of ATP. This energetically unfavorable translocation is accomplished by a viral nanomotor. The viral DNA packaging motor is composed of a few parts. The first is the gearing components. The second is the motor structure frame, such as the force-generating ATPase. The third is the connector complex, an essential dodecameric structure with a central channel (30–60 Å) through which viral DNA is packaged into the capsid and exits during infection. Though the portal proteins from different viruses have little sequence homology and vary widely in molecular weight, portal complexes show significant morphological similarities. In phi29, the portal connector is bound by both the ATPase gp16 and prohead RNA (pRNA). Binding of gp16 to the packaging motor relies on the association of pRNA because the central domain of pRNA is for connector binding, and the 5′/3′ paired helical region is for recruitment of gp16. Moreover, the structure of the phi29 portal protein has been determined at atomic resolution. The connector ring consists of twelve α-helical subunits, with three long helices of each subunit forming the central channel.
The ring is 138 Å across at its wide end and 66 Å at the narrow end. The internal channel is 60 Å at the top and 36 Å at the bottom. The wider end of the connector is located in the prohead and its narrow end partially protrudes out of the capsid. The connector is located at the five-fold vertex of the viral capsid, which leads to a symmetry mismatch between the capsid and the portal connector. Extensive investigation of the group of dsDNA viruses reveals that all DNA packaging motors known so far involve two nonstructural components typical of ATPase. Further scrutiny has found that these two proteins can be classified into two categories according to their roles and sizes. The category with the larger size is for procapsid binding, such as gp16 of phi29, gpA of λ, gp12 of


Fig. 2 Biomotors have high stoichiometry of homologous subunits. Drug inhibition efficiency is correlated with the stoichiometry of the targeted biocomplex through a series-circuit mechanism, using the hexameric ATPase as an example. (A) In the series circuit of Christmas lights, one broken bulb will turn off the whole chain. (B) The hexameric ATPase. (C) One key factor in drug potency is the stoichiometry of the homo-subunit serving as a target. Interactions between the subunits may follow the binomial distribution and Yang Hui’s Triangle. (D) Assay testing potent drug development based on the series circuit model. Plotted is the inhibition of virus assembly by drugged components of DNA, pRNA, gp16 ATPase, and ATP with stoichiometry of 1, 6, 6, and 1000, respectively. (E) Drugs that target ATPase have high stoichiometry. (left) Rimantadine for influenza virus, (middle) ALLINI for HIV, and (right) bedaquiline, the recent FDA-approved drug that targets the ATPase of Mycobacterium tuberculosis, one of the most antimicrobial-resistant bacteria. (E) Adapted with permission from Pi, F., Zhao, Z., Chelikani, V., et al., 2016. Development of potent antiviral drugs inspired by viral hexameric DNA packaging motors with revolving mechanism. Journal of Virology 90(18), 8036–8046. Available at: doi:10.1128/JVI.00508-16. Copyright © 2016, American Society for Microbiology.
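The binomial reasoning behind the series-circuit model in Fig. 2(C) can be written out as a short calculation. The sketch below assumes that each of the Z subunits is drugged independently with the same probability and that a complex is blocked once at least K subunits are drugged; both the independence assumption and the function name are introduced here for illustration only.

```python
from math import comb


def inhibited_fraction(z: int, k: int, drugged: float) -> float:
    """Fraction of complexes blocked under a simple binomial model.

    z: number of identical target subunits per complex (Z).
    k: number of drugged subunits needed to block the complex (K).
    drugged: probability that any one subunit is drugged.
    """
    if not (1 <= k <= z) or not (0.0 <= drugged <= 1.0):
        raise ValueError("need 1 <= k <= z and 0 <= drugged <= 1")
    free = 1.0 - drugged
    # Probability that fewer than k subunits are drugged (complex stays active).
    still_active = sum(comb(z, i) * drugged**i * free ** (z - i) for i in range(k))
    return 1.0 - still_active


# Same per-subunit drugging probability, different stoichiometry.
print(round(inhibited_fraction(1, 1, 0.1), 4))  # 0.1
print(round(inhibited_fraction(6, 1, 0.1), 4))  # 0.4686
```

With K = 1, drugging 10% of subunits blocks about 47% of hexameric complexes but only 10% of single-subunit targets, which is the series-circuit effect. Setting K = Z instead models the parallel-circuit case, in which every subunit must be drugged.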

phi21, gp17 of T4, and gp19 of T3/T7. The second category is the smaller protein for DNA binding and processing. This group includes gp16 of T4, gp18 of T3/T7, gpNu1 of λ, and gp1 of phi21. The DNA-packaging reaction is prohead and DNA dependent. One ATP is used for the packaging of two base pairs of DNA. Emerging information reveals that the mechanism of herpesvirus DNA packaging is very similar to that of phages. General structures of phage dsDNA motors include channels and DNA packaging enzymes (Fig. 3). Tailed bacteriophages assemble the channel as an oligomeric ring portal (connector), which is composed of 12 protein subunits and embedded in the pentagonal portal vertex of the procapsid. The phi29 connector has a truncated cone shape. The connector functions as a nucleation point for the assembly of motor components. Both the DNA-packaging enzyme gp16 and pRNA bind the connector. Binding of gp16 to the packaging motor relies on binding of pRNA because the central domain of pRNA is for connector binding, and the 5′/3′ paired helical region is for recruitment of gp16. The ingenious structures and functions of the phage biomotors have inspired us to develop nanobiotechnology for medical application. As RNA is a building block in nanotechnology, phi29 pRNA can be used to construct nanoparticle vehicles to deliver various therapeutics to target cells for treatment of cancers, viral infections, and some genetic diseases. The phi29 DNA-packaging motor can be engineered and embedded into a cell membrane to generate a pore for molecular sensing and for drug delivery to target cells in precision medicine. In this work, we intend to delineate the revolving mechanism of the biomotors in bacteriophages, with an emphasis on phi29.
With our understanding of the dsDNA translocation mechanism of the phi29 packaging motor, we introduce the application of the packaging motors, especially the motor connectors, in single pore sensing of DNA, RNA, chemicals and proteins. Nanomotors play significant roles in many biological processes, as described above. The phage DNA packaging motors, such as those of phi29, SPP1, T4, and T7, package their DNA into the procapsid during genome replication. Herein, due to space limitations, we will use the phi29 gp10 revolution motor as an example to explain how it packages its genome into the procapsid. According to their mode of movement, nanomotors can be classified into linear, rotation, and revolution categories (please refer to the further reading list for more on the mechanisms of different biomotors). Fig. 4 shows the structure of several revolution nanomotors. Generally speaking, revolution nanomotors have larger channels with left-handed chirality. Fig. 3 shows a simplified illustration of the mechanism by which the revolution motor of phi29 drives unidirectional translocation of dsDNA. Briefly, the ring-shaped ATPase present on pRNA binds ATP, which triggers conformational changes of the ATPase subunit that revolve the dsDNA and move it forward. The phi29 gp10 connector has a large diameter, and the narrowest part of the channel is ~3.6 nm, which allows the translocation of dsDNA into the procapsid. The movement of the dsDNA around the motor is similar to how the


Fig. 3 Structure and function of phi29 DNA packaging motor. (A) Structure of hexameric pRNA and the connector showing a 30° tilt. (B)–(C) dsDNA showing the shift of 30° angle between two adjacent connector subunits. (D) AFM images of hexameric pRNA with 7-nucleotide loops. (E) The mechanisms for the revolution motor. Reprinted with permission from (A)–(C) Schwartz, C., De Donatis, G.M., Zhang, H., Fang, H., Guo, P., 2013. Revolution rather than rotation of AAA+ hexameric phi29 nanomotor for viral dsDNA packaging without coiling. Virology 443(1), 28–39. doi:10.1016/j.virol.2013.04.019. Copyright 2013 Elsevier. (D) Shu, Y., Haque, F., Shu, D., et al., 2013. Fabrication of 14 different RNA nanoparticles for specific tumor targeting without accumulation in normal organs. RNA 19, 766–777. Copyright 2013 RNA Society. (E) Guo, P., Driver, D., Zhao, Z., et al., 2019. Controlling the revolving and rotating motion direction of asymmetric hexameric nanomotor by arginine finger and channel chirality. ACS Nano 13, 6207–6223. Available at: https://pubs.acs.org/doi/10.1021/acsnano.8b08849. Further permissions related to the material excerpted should be directed to the ACS.

Earth revolves around the Sun but without self-rotation of the Earth, which avoids the coiling and tangling that happens in large or lengthy DNAs. From this DNA packaging process, researchers determined that the biomotors drive unidirectional DNA passage through the nanopore into the procapsid. Single pore sensing is derived from the natural phenomenon of viral DNA packaging. Scientists have discovered that analytes can be driven through the extracted nanopore and that each generates a unique electric signal under applied voltage. Proof of concept for nanopore sensing was first demonstrated in 1996 by the inspection of DNA translocation through the α-hemolysin nanopore. In that research, qRT-PCR verified the successful translocation of single-stranded DNA (ssDNA) that was driven electrically through the α-hemolysin nanochannel. To date, nanopore sensing has been developed for the detection of different analytes, as it has many advantages as an analytical tool. First, the detection is label- and amplification-free. Second, it can achieve single-molecule detection with high specificity and sensitivity and offers real-time identification. Third, it requires only small analyte sample volumes (μL range) and low concentrations (nM or pM range) for analysis. In the following paragraphs, we will summarize the third type of motor mechanism, revolution, and the application of these motors in single molecule nanopore sensing.

The Revolving Mechanism of Biomotors in Bacteriophages The Revolution Mechanism Without Rotation Common to the dsDNA Packaging Motors of all the dsDNA Bacteriophages During replication, dsDNA viruses translocate their genomic DNA into procapsids by the same nanomotor mechanism, with ATP as the energy source, in an entropically unfavorable process. The bacteriophage packaging motors of phi29, HK97, SPP1, P22, T4, and T7 share the following common force generation mechanism of revolution without rotation. The 30° left-handed twist of the channel wall causes an anti-parallel arrangement with the right-handed helix of the dsDNA, leading to a one-direction trend (Fig. 3(B) and (C)). The same twist has been found in the motor channels of all dsDNA viruses known so far, including phi29, HK97, SPP1, P22, T4, and T7, whose primary amino acid sequences are not conserved but whose higher-order swivel structures are well conserved and aligned. One-direction flow loops inside the channels of SPP1 and phi29 support the one-way translocation of dsDNA. The electropositive lysine layers present in all the phage channels interact with one strand of the electronegative dsDNA phosphate backbone, causing a relaying contact and transitional pausing during dsDNA translocation. In April of 2020, Xiangxi Wang, Hongrong Liu and Zihe Rao reported a high-resolution structure and mechanism of the ATP-driven DNA packaging motor of a double-stranded DNA herpesvirus. The structure clearly shows that the herpesvirus DNA packaging motor is a hexamer that uses the revolution mechanism instead of rotation (Fig. 4(A)). The elucidation of this structure ends the 20-year fervent debate on whether the stoichiometry of the genome packaging motor of dsDNA viruses is pentameric or hexameric, and on whether the motion process is rotating or revolving. Like the architecture of the phage packaging motors (Fig. 4(A)), the hexameric ring structure of the herpesvirus DNA packaging complex suggests that the motor translocates dsDNA by a sequential revolution mechanism with the aid of trans-acting arginine fingers essential for ATP hydrolysis and DNA translocation. Astonishingly, the human viral motor has structural properties similar to those of the bacteriophage motors, implying that sequential DNA translocation via revolving motion is common in viral motor evolution in both eukaryotic and prokaryotic systems.


Nanotechnology Application of Bacteriophage DNA Packaging Nanomotors

Fig. 4 The structure of the hexameric ATPase motor. (A) The DNA packaging motor of the double-stranded DNA herpesvirus (left) and the elucidation of the sequential revolving mechanism in DNA translocation (right). Adapted in part, under the Creative Commons Attribution 4.0 International license, from Yang et al., Architecture of the herpesvirus genome-packaging complex and implications for DNA translocation. Protein Cell. doi:10.1007/s13238-020-00710-0. Copyright 2020. (B) Channel size differentiates the rotating and revolving mechanisms. Rotating motors all have channel diameters ≤2.0 nm to ensure full contact between the DNA and the channel wall, similar to a nut driving a bolt, while revolving motors have channel diameters ≥3 nm to leave room for the revolving motion. Reprinted in part with permission from: Guo, P., Grainge, I., Zhao, Z., Vieweger, M., 2014. Two classes of nucleic acid translocation motors: Rotation and revolution without rotation. Cell and Bioscience 4, 54. Copyright 2014 Springer Nature. Mancini, E.J., Tuma, R., 2012. Mechanism of RNA packaging motor. Advances in Experimental Medicine and Biology 726, 609–629. Besprozvannaya, M., Burton, B.M., 2014. Do the same traffic rules apply? Directional chromosome segregation by SpoIIIE and FtsK. Molecular Microbiology 4, 599–608.

Revolving Mechanisms Determined by Motor Channel Sizes

For dsDNA to rotate through the center of the motor channel, the channel needs to have a size similar to the diameter of dsDNA, 2 nm. Thus, the channel diameter should be approximately 2 nm for dsDNA or 1 nm for ssDNA. For dsDNA to revolve through the motor, it moves by touching the channel wall rather than proceeding through the center of the channel. Accordingly, the channel diameter should be much larger than that of the dsDNA to leave sufficient room for revolution. Crystal structures and cryo-electron microscopy studies of the connectors of T7, phi29, and other phages demonstrate that the channel diameters of revolution motors are >3.5 nm, whereas those of rotation motors such as helicases and DNA polymerases are <2 nm. Fig. 4(B) shows several rotating motors with smaller channels and revolving motors with larger channels. The revolution of dsDNA along the hexameric ATPase ring has the following parameters: the width of the dsDNA helix is 2 nm, while the diameters of the narrowest regions of the dodecameric portals of the connector channels of phi29, SPP1, T4, T7, HK97, and FtsK are 3–5 nm. Even the herpesvirus motor complex forms a central channel with an internal diameter of 3.9 nm, greater than that of dsDNA, as revealed by cryo-electron microscopy (Fig. 4(A)).
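The channel-size criterion can be written as a small rule of thumb. The sketch below is illustrative only: it assumes the cutoffs given in the Fig. 4(B) legend (rotation at 2.0 nm or below, revolution at 3 nm or above) and uses the narrowest channel diameters quoted elsewhere in this article. The function name and the "indeterminate" label are hypothetical and do not come from the cited studies.

```python
def classify_motor(channel_diameter_nm: float) -> str:
    """Classify a dsDNA motor by its narrowest channel diameter (nm).

    Cutoffs follow the Fig. 4(B) legend; anything between them is
    reported as "indeterminate" (an assumption of this sketch).
    """
    if channel_diameter_nm <= 2.0:
        return "rotation"
    if channel_diameter_nm >= 3.0:
        return "revolution"
    return "indeterminate"


# Narrowest channel diameters (nm) as quoted in this article.
CHANNEL_DIAMETERS_NM = {
    "phi29 connector": 3.6,
    "SPP1 connector": 2.7,
    "T7 connector": 3.9,
    "herpesvirus motor": 3.9,
}

if __name__ == "__main__":
    for name, diameter in CHANNEL_DIAMETERS_NM.items():
        print(f"{name}: {diameter} nm -> {classify_motor(diameter)}")
```

With these inputs, the 2.7 nm constriction quoted for SPP1 falls between the two cutoffs. Channel size alone is therefore not always decisive, and the chirality criterion in the next section gives a second, independent test.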


Revolving Motors Distinguished by Chirality

The anti-parallel arrangement between the phi29 connector subunits and the DNA helix facilitates the revolution of dsDNA in a single direction. All 12 subunits of the phi29 connector portal protein tilt at a 30° left-handed angle relative to the vertical axis of the channel, forming a channel whose configuration is anti-parallel to the right-handed dsDNA helix during packaging. This structural arrangement greatly facilitates the controlled motion and provides evidence that dsDNA revolves, rather than rotates, through the connector channel. The revolution neither produces coiling or torsional force nor touches each of the 12 connector subunits in 12 discrete steps of 30° transitions for each helical pitch. Fig. 5 shows examples of rotating motors with right-handed channels and revolving motors with left-handed channels.

Special Aspects of Revolving Motor Actions

Force generation and energy conversion

Several primary chemo-mechanical coupling mechanisms exist in all biomotors studied so far. By these mechanisms, the cycles of nucleotide binding and hydrolysis are coupled to conformational and entropic rearrangements of the substrate-binding subunits of the motors. In the sequential mechanism, individual ATP binding and hydrolysis events proceed one after another. In the concerted mechanism, all active sites on the motor hydrolyze ATP simultaneously. In the stochastic mechanism, the ATPase sites hydrolyze ATP at random. In all three mechanisms, ATP binding to the disordered subunits of the ATPase stimulates conformational alterations of the ATPase accompanied by an entropy change. This locks the ATP-bound ATPase in a less random configuration than the unbound ATPase. The new conformation enables the ATPase subunit to bind dsDNA and primes ATP hydrolysis. ATP hydrolysis by the ATPase then triggers further entropic and conformational changes. These alterations give the ATPase a low affinity for the dsDNA substrate, so that it passes the dsDNA to the next ATP-bound subunit, which has a high affinity for dsDNA. These repetitive actions cause dsDNA to revolve around the subunits through the interior channel of the motor.

Unidirectional dsDNA translocation

The translocation direction of dsDNA is controlled by five factors in the phi29 motor (Fig. 3). First, the ATPase undergoes a chain of entropy transitions and conformational changes during ATP and dsDNA binding. ATP hydrolysis causes additional changes in the entropy and conformation of the ATPase, producing a state with a low affinity for dsDNA that pushes the dsDNA away so that it revolves inside the channel. Second, the 30° angle of each subunit of the dodecameric connector channel is anti-parallel to the dsDNA, matching the 12 subunits of the connector channel (360° ÷ 12 = 30°), as shown by crystallography (Figs. 3 and 5).
Third, the internal channel loops have a unidirectional flow property and act as a ratchet valve to prevent dsDNA reversal. Fourth, one strand of the dsDNA revolves in a single 5′ to 3′ direction along the connector channel wall. Fifth, an electrostatic force arises from the relaying interaction of the electropositive lysine layers with the electronegative DNA phosphate backbone. Together, these factors ensure that dsDNA is translocated in one direction.

Communication/interaction between motor subunits for sequential action

In the phi29 motor, a sequential action of both the pRNA and the ATPase gp16 has been reported. In particular, the subunits of the ATPase act sequentially and cooperatively. The arginine finger motif plays a role in communication and interaction among the subunits by bridging dimer formation (Fig. 3(D)). Communication between each pair of neighboring subunits is mediated by the finger. Such communication causes an asymmetrical hexameric organization, consistent with the asymmetrical structures in other hexameric ATPase systems. Thus, the arginine finger acts as a bridge between two subunits to form a transient dimeric subunit. Specifically, the arginine finger, located at the interface of two subunits of gp16, extends into the ATP binding pocket of the downstream subunit. The asymmetrical appearance of one dimer and four monomers in high-resolution structural complexes provides evidence for the mechanisms by which the arginine finger promotes inter-subunit interactions and sequential movements of individual subunits. The arginine finger is important for regulating energy transduction and motor function (Fig. 3(D)).

Fig. 5 Different chiralities of rotating and revolving motors. Rotating biomotors exhibit right-handed chirality to drive the right-handed dsDNA, similar to a nut driving a bolt or a screwdriver turning a screw, whereas revolving biomotors exhibit left-handed chirality within the channel. Crystal structure analysis of viral DNA packaging motors reveals that this class of biomotors packages DNA using the revolving mechanism. Reprinted with permission from Guo, P., Grainge, I., Zhao, Z., Vieweger, M., 2014. Two classes of nucleic acid translocation motors: Rotation and revolution without rotation. Cell and Bioscience 4, 54. Copyright 2014 Springer Nature; and De-Donatis, G.M., Zhao, Z., Wang, S., et al., 2014. Finding of widespread viral and bacterial revolution dsDNA translocation motors distinct from rotation motors by channel chirality and size. Cell and Bioscience 4, 30. Copyright 2014 Springer Nature.
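The difference between the sequential, concerted, and stochastic coupling mechanisms described in this section can be illustrated by listing which ATPase subunits fire at each hydrolysis event. The following is a toy sketch, not a kinetic model. The hexamer size matches the text, but the function name, the event representation, and the seeded random generator are assumptions made for illustration.

```python
import random
from typing import List, Tuple


def firing_order(mechanism: str, n_subunits: int = 6,
                 n_events: int = 6, seed: int = 0) -> List[Tuple[int, ...]]:
    """Return, for each hydrolysis event, the indices of the subunits that fire.

    sequential: one subunit per event, in ring order (0, 1, 2, ...).
    concerted:  all subunits fire together at every event.
    stochastic: one randomly chosen subunit per event.
    """
    rng = random.Random(seed)  # seeded only to make the toy output repeatable
    if mechanism == "sequential":
        return [(i % n_subunits,) for i in range(n_events)]
    if mechanism == "concerted":
        return [tuple(range(n_subunits)) for _ in range(n_events)]
    if mechanism == "stochastic":
        return [(rng.randrange(n_subunits),) for _ in range(n_events)]
    raise ValueError(f"unknown mechanism: {mechanism}")


if __name__ == "__main__":
    for name in ("sequential", "concerted", "stochastic"):
        print(name, firing_order(name))
```

Only the sequential scheme yields an ordered hand-off around the ring, in which each subunit passes the dsDNA to its neighbor. This is the behavior the text attributes to the phi29 ATPase gp16.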

The Application of Bacteriophage DNA Packaging Motors in Single Pore Sensing of DNA, RNA, Chemicals, and Proteins

The single pore sensing technique has built up a powerful platform for numerous applications, including the detection of DNAs, peptides, and chemicals. Before discussing the use of nanopores for single-molecule sensing, we present some background on single-molecule sensing. The well-known biological connectors found in bacteriophages include those of phi29, SPP1, T4, P22, and T7. Several well-studied connectors are described below.

Phi29 Connector

The phi29 connector was the first portal protein to have its atomic structure solved. The connector consists of 12 protein subunits (gp10) that form a ring-like truncated cone structure. The molecular weight of each subunit is 36 kDa. The end diameters of the cone structure are 6.6 and 13.8 nm, with the wider and narrower ends termed the C- and N-terminals, respectively. The area of the narrowest part of the inner channel is 10 nm², which corresponds to a diameter of ~3.6 nm. In bacteriophage phi29, the C-terminal is located in the procapsid, and translocation of dsDNA during packaging is unidirectional from the N-terminal to the C-terminal. The clip region of the phi29 connector can bind pRNA to help with DNA packaging. In addition, the negatively charged residues in the channel, aspartate and glutamate, have the potential to keep the DNA in the center of the channel; this central localization is necessary for phi29 DNA packaging. The phi29 nanochannel also has positively charged residues, such as lysine and arginine, which can attract the DNA and hence prevent it from slipping out.
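The conversion from the quoted cross-sectional area of 10 nm² to a diameter of about 3.6 nm can be checked with a few lines of arithmetic. This sketch assumes that the narrowest cross-section is a circle. That is an idealization, and the function name is hypothetical.

```python
import math


def diameter_from_area(area_nm2: float) -> float:
    """Diameter (nm) of a circle with the given area (nm^2): d = 2 * sqrt(A / pi)."""
    return 2.0 * math.sqrt(area_nm2 / math.pi)


if __name__ == "__main__":
    d = diameter_from_area(10.0)  # narrowest phi29 channel area quoted in the text
    print(f"diameter = {d:.2f} nm")
```

For an area of 10 nm² this gives roughly 3.57 nm, consistent with the ~3.6 nm stated above.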

SPP1 Connector

There are four domains in the SPP1 connector: clip, stem, wing, and crown. This connector has 13 protein subunits (gp6) that assemble into a cone structure similar to the phi29 connector, with a total molecular weight of 745 kDa. The overall diameter of the SPP1 nanochannel is ~16.5 nm with a height of ~11 nm. The most constricted part of the tunnel is ~2.7 nm. As in the phi29 motor, the negatively and positively charged residues in the SPP1 channel are also required for DNA packaging.

T7 Connector

The T7 connector is formed by 12 protein subunits, each with a molecular weight of 59 kDa. The channel length is 13.1 nm, with external diameters ranging from 5.9 to 17.3 nm. The most restricted region of the T7 interior channel is 3.9 nm, which is relatively wide compared with the other types of bacteriophage connectors. The lysine residue in the stem domain of the T7 nanochannel very likely interacts with the phosphate backbone to help with DNA translocation.

Apart from the phi29, SPP1, and T7 connectors, many other phage connectors, such as those of T3, T4, and P22, have been studied. Generally, the phage connectors consist of about 11–13 protein subunits, with heights and widths on the order of 10 nm, as in the examples above, and inner diameters within 4 nm. Over the years, the nanopore-based sensing technique has been developed for the detection of various molecules and has shown its potential for the medical diagnosis of diseases. The following sections discuss the mechanisms and provide examples of single-molecule sensing using nanopores. Please note that this article focuses on biological nanopores from bacteriophages. For information on other types of nanopores, please refer to the further reading list.

The Mechanism of Single Pore Sensing

The general mechanism of biological nanopore sensing is based on the resistive pulse technique. Fig. 6(A) shows a schematic diagram of the technique. Briefly, the purified connector is inserted into a lipid bilayer membrane to form the nanopore channel. A voltage is applied across the membrane, which allows ions to pass freely through the channel. When analytes are translocated through the channel, the ion flow is affected. This changes the current and creates fingerprinting signals. Commonly used electronic signatures that help identify different analytes include the current blockage and the dwell time. The current blockage is the percentage of the current that is blocked relative to the open current of the nanopore channel, while the dwell time is the length of time the current blockage lasts.
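The two signatures defined above, current blockage and dwell time, can be extracted from a sampled current trace with simple thresholding. The sketch below is a minimal illustration under stated assumptions: the trace is a list of positive current samples taken at a fixed interval, the open-pore current is known, and a sample counts as blocked when it drops more than 10% below the open current. The function name, the threshold, and the example trace are hypothetical and are not taken from the cited experiments.

```python
from typing import List, Sequence, Tuple


def detect_events(trace_pa: Sequence[float], dt_ms: float, open_pa: float,
                  threshold_frac: float = 0.1) -> List[Tuple[float, float]]:
    """Return (blockage_percent, dwell_ms) for each contiguous blocked run.

    blockage_percent = 100 * (open current - mean blocked current) / open current
    dwell_ms         = number of blocked samples * sampling interval
    """
    events: List[Tuple[float, float]] = []
    run: List[float] = []
    # A trailing open-current sentinel closes an event that reaches the trace end.
    for sample in list(trace_pa) + [open_pa]:
        if sample < open_pa * (1.0 - threshold_frac):
            run.append(sample)
        elif run:
            mean_blocked = sum(run) / len(run)
            blockage = 100.0 * (open_pa - mean_blocked) / open_pa
            events.append((blockage, len(run) * dt_ms))
            run = []
    return events


if __name__ == "__main__":
    # Synthetic trace (pA), sampled every 0.5 ms, open current 100 pA.
    trace = [100, 100, 68, 68, 68, 100, 100, 36, 36, 100]
    for blockage, dwell in detect_events(trace, dt_ms=0.5, open_pa=100.0):
        print(f"blockage {blockage:.1f}%, dwell {dwell:.1f} ms")
```

Real recordings are noisy, so practical analyses usually filter the trace and estimate the open current from the data. Both steps are omitted here to keep the definitions visible.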


Fig. 6 The sensing of dsDNA with the phi29 connector. (A) The schematic diagram of the single pore sensing of a membrane-embedded connector with applied voltage in vitro. (B) The current trace signals of dsDNA. (C) The distribution of 7500 translocation events for various types of DNA from quantitative analysis. (D) The illustration and corresponding current blockage signals for different types of DNAs. Reprinted with permission from: (A) Jing, P., Haque, F., Shu, D., Montemagno, C., Guo, P., 2010. One-way traffic of a viral motor channel for double-stranded DNA translocation. Nano Letters 10 (9), 3620–3627. Further permissions related to the material excerpted should be directed to the ACS. (D) Haque, F., Wang, S., Stites, C., et al., 2015. Single pore translocation of folded, double-stranded, and tetra-stranded DNA through channel of bacteriophage phi29 DNA packaging motor. Biomaterials 53, 744–752.

As analytes are driven by electric force to translocate through or interact with the connector, many factors may affect the current blockage and dwell time. These factors include intrinsic properties of the targets, such as their shape, length, charge, hydrophobicity, or hydrophilicity, as well as the type and modification of the analytes. First, analytes that are smaller than the nanopore may be translocated through the channel to generate electric signatures, and different samples have the potential to generate different signals. Second, analytes that are larger than the diameter of the nanopore cannot pass through the channel because of size limitations. Theoretically, larger analytes or those with longer chains will block more current and have a longer dwell time than smaller analytes. In addition, analytes that interact strongly with the connector can produce longer dwell times than those that interact weakly with the nanopore. Moreover, some analytes can interact with the connector on one side and induce conformational changes in the connector, thus generating distinct electric signals without translocation. Since only specific types of analytes can trigger conformational changes in the connector and cause current blockages with different dwell times, this provides a technique for analyte determination. For both small and large analytes, the type and modification of the connector can greatly affect the electric signals. For instance, a larger nanopore can allow larger analytes to be translocated through the pore, while smaller nanopore channels cannot. Moreover, modification of the nanopore, such as site-directed mutagenesis or the introduction of ligands into the nanopore, can significantly influence the interactions between the analytes and the nanopore, resulting in different electric signals. For example, introducing more hydrophobic groups into the nanopore can increase the dwell time for samples with many hydrophobic groups.
Please refer to the further reading list for additional information on the modification of the connector for nanopore-based sensing techniques. In conclusion, biological nanopore sensing is based on distinct electric signatures generated when analytes pass through or interact with the nanopore. The intrinsic properties of the analytes and the nanopore determine the fingerprinting signals, which allow for the differentiation and identification of unique analytes.

Applications of the Biological Nanopore Channel as a Conduit for Single Pore Sensing

As mentioned in the section on the nanopore detection mechanism above, different analytes can generate distinctive electric signals that can be distinguished from the background. The following sub-sections present several experimental examples to explain the single pore sensing technique further.

The sensing of DNA: DNA carries genetic information and is essential to all life. Recognizing the variations among different types of DNA may help to elucidate their functions, which can benefit the diagnosis of diseases in their early stages and further assist in the treatment of these diseases.


Single pore sensing of DNA relies on the fact that DNAs of different sizes and conformations cause different current blockage events. Haque and coworkers showed that the phi29 gp10 connector can differentiate dsDNA with different conformations. In the experiment, Haque et al. examined the translocation and analyzed the current blockage of folded 5 kbp dsDNA. Fig. 6 shows the current signatures for dsDNA with different conformations, which caused distinct blocking events. Straight dsDNA caused a single-level blockage of ~32%, while the tetra-stranded complex, with two dsDNAs geared into a nanostructure, blocked ~64% of the channel current. The dsDNA and tetra-stranded DNA (tsDNA) also differed in their current blockage events: the dsDNA mainly caused 32% current blockage events, whereas the tsDNA produced both 32% and 64% blockage events. These experiments lay a foundation for future studies on the detection of biomarkers through conformational changes.

The sensing of RNA: Different types of RNAs have been found to serve as biomarkers for disease diagnosis, demonstrating their importance to life. Currently, the sensing of RNAs with bacteriophage nanopores is at an early stage. Work on the detection of single-stranded RNA (ssRNA) with a modified phi29 nanopore has promoted the application of bacteriophage connectors in RNA detection. Geng et al. determined that removal of the internal loop segment of the phi29 channel creates a modified channel with a cross-sectional area about 40% smaller than that of the wild-type phi29 connector. Translocation of ssRNA through the modified phi29 connector could be identified and resulted in ~20% current blockage. Their experiments suggest that nanochannels can be modified to suit different detection purposes. Please refer to the further reading list for more details.
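The DNA result reported above, with blockage events near 32% for dsDNA and near both 32% and 64% for tsDNA, suggests a simple way to label a set of measured blockages. The sketch below is a hedged illustration rather than the analysis used by Haque et al. The two level centers come from the text, while the ±8 percentage-point tolerance, the function names, and the decision rule are assumptions.

```python
from collections import Counter
from typing import Iterable

# Blockage levels (percent) reported in the text for the phi29 connector.
LEVELS = {"single": 32.0, "double": 64.0}
TOLERANCE = 8.0  # assumed half-width of each level, in percentage points


def assign_level(blockage_percent: float) -> str:
    """Assign one event to the nearest reported level, or 'unassigned'."""
    name = min(LEVELS, key=lambda k: abs(LEVELS[k] - blockage_percent))
    if abs(LEVELS[name] - blockage_percent) <= TOLERANCE:
        return name
    return "unassigned"


def guess_conformation(blockages: Iterable[float]) -> str:
    """Crude sample label: both levels present -> tsDNA-like, single only -> dsDNA-like."""
    counts = Counter(assign_level(b) for b in blockages)
    if counts["single"] > 0 and counts["double"] > 0:
        return "tsDNA-like"
    if counts["single"] > 0:
        return "dsDNA-like"
    return "undetermined"


if __name__ == "__main__":
    print(guess_conformation([31.0, 33.5, 30.2]))        # single-level events only
    print(guess_conformation([31.0, 65.0, 62.4, 33.0]))  # both levels present
```

Because the text states that dsDNA "mainly" produces 32% events, a real classifier would need to tolerate occasional deeper events, for example by requiring a minimum fraction of 64% events before calling a sample tsDNA-like.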
The sensing of chemicals: Chemicals are essential to our everyday life. Biological nanopores have also been shown to be capable of detecting chemicals after connector modification. At present, chemical detection is mainly based on the physical blocking of the current when chemicals interact with functional groups in the connector. The detection of chemicals with a modified bacteriophage connector has been reported. With lysine-234 mutated to cysteine, the modified phi29 nanopore can distinguish among ethane, thymine, and benzene carrying thioester moieties. Binding events between the connector and the chemicals cause current blockage events. The unique electric signals are generated by physical blocking when the chemicals react with the cysteine residues on the connector during the translocation process. Analysis of the binding events for ethane, thymine, and benzene showed that the current blockage of the permanent binding events allows discrimination among the three chemicals (Table 1).

The biological nanopore sensing of peptides and proteins: Polypeptides and proteins play significant roles in biological functioning. Even though nanopore-based sensing of proteins or peptides is still a nascent technology, it can already detect proteins or peptides at the single-molecule level. In single pore sensing of polypeptides, short peptides can translocate through the nanopore channel to produce a current blockage. However, because of the relatively small size of the nanopore, larger polypeptides or proteins cannot be translocated. In this situation, a current blockage may occur through nanopore conformational changes caused by specific interactions between the larger proteins and the connector.
Therefore, the dwell time indicates either the time required for one complete translocation event or the interaction time between the nanopore and the analyte. As mentioned earlier, as peptides are driven by electrical force to translocate through or interact with the connector, the intrinsic properties of the polypeptide, such as length and charge, and the type and modification of the connector greatly influence the current blockage and dwell time. Researchers have been able to discriminate experimentally among peptides with different compositions and lengths using the nanopore technique. For example, Fig. 7 shows the differences in the current signatures that discriminate peptides with the same length but different compositions. Ji and coworkers were also able to differentiate among peptides with different numbers of arginine residues using T7 nanopore technology. Fig. 8 compares the current blockage and dwell time to distinguish peptides with the number of arginine residues ranging from 8 to 12. Like other biomolecules, longer peptides generated larger current blockages and longer dwell times. Moreover, peptides with the same length and composition can also be detected using a peptide digestion assay. The main idea is that peptides of various lengths are created after digestion by a specific enzyme. Furthermore, the phi29 connector was inserted into the stable polymer membrane in the Oxford MinION flow cell, which comprises 2048 wells in the portable MinION device for high-throughput peptide sensing. The TAT, R14, MAR, and P27 peptides were found to generate distinct current blockage signals in the MinION. This innovation has greatly advanced the development of peptide sensing and sequencing. Please refer to the further reading list for more details on peptide sensing.

Scientists have also developed methods to detect proteins that are too large to pass through a nanopore by analyzing the fingerprinting signals produced by specific protein-nanopore interactions. For example, Wang and coworkers (please refer to the further reading list for more details) incorporated an Epithelial Cell Adhesion Molecule (EpCAM) peptide into the C-terminal of the phi29 connector channel to detect the EpCAM antibody. Upon the specific interaction between the EpCAM peptide and the antibody, distinct current blockage signals could be distinguished from the background even though many other nonspecific antibodies or serum components were present.

Fig. 7 The identification and differentiation among peptides with the same length but distinctive composition. Reprinted with permission from Ji, Z., Guo, P., 2019. Channel from bacterial virus T7 DNA packaging motor for the differentiation of peptides composed of a mixture of acidic and basic amino acids. Biomaterials 214, 119222. From Elsevier.

Fig. 8 The differentiation of peptides with different numbers of arginine residues (R8, R9, R10, and R12). Reprinted with permission from Ji, Z., Kang, X., Wang, S., Guo, P., 2018. Nano-channel of viral DNA packaging motor as single pore to differentiate peptides with single amino acid difference. Biomaterials 182, 227–233. From Elsevier.

Table 1 Comparison of permanent current blockage events among thioesters

Thioester | Transient binding events (Cys-X) | Permanent binding events (Cys-X + Cys-X) | p value | Ratio of permanent to transient events, (Cys-X + Cys-X)/(Cys-X) | p value
ethane | 16.4 ± 2.0% | 33.5 ± 0.5% (N = 63) | <0.001 (thymine); <0.001 (benzene) | 2.04 ± 0.12 | <0.001 (thymine); <0.065 (benzene)
thymine | 18.9 ± 2.6% | 36.3 ± 1.2% (N = 44) | <0.001 (ethane); <0.001 (benzene) | 1.92 ± 0.14 | <0.001 (ethane); <0.437 (benzene)
benzene | 19.5 ± 6.2% | 38.4 ± 2.0% (N = 66) | <0.001 (ethane); <0.001 (thymine) | 1.96 ± 0.32 | <0.065 (ethane); <0.437 (thymine)

Note: Reprinted with permission from Haque, F., Lunn, J., Fang, H., Smithrud, D., Guo, P., 2012. Real-time sensing and discrimination of single chemicals using the channel of phi29 DNA packaging nanomotor. ACS Nano 6 (4), 3251–3261. Copyright (2012) American Chemical Society.
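The ratio column of Table 1 can be sanity-checked against the mean blockages in the same table. The sketch below simply divides the mean permanent blockage by the mean transient blockage for each thioester and compares the result with the reported ratio. The variable and function names are hypothetical, and the source does not state whether the published ratios were computed from these rounded means.

```python
# Mean blockages (percent) and reported ratios transcribed from Table 1:
# thioester -> (transient Cys-X, permanent Cys-X + Cys-X, reported ratio)
TABLE_1 = {
    "ethane": (16.4, 33.5, 2.04),
    "thymine": (18.9, 36.3, 1.92),
    "benzene": (19.5, 38.4, 1.96),
}


def ratio_of_means(transient_percent: float, permanent_percent: float) -> float:
    """Ratio of the mean permanent blockage to the mean transient blockage."""
    return permanent_percent / transient_percent


if __name__ == "__main__":
    for name, (transient, permanent, reported) in TABLE_1.items():
        ratio = ratio_of_means(transient, permanent)
        print(f"{name}: computed {ratio:.2f}, reported {reported:.2f}")
```

The computed values agree with the reported ratios to within about 0.01. All three are close to 2, as expected if a permanent event corresponds to two bound molecules (Cys-X + Cys-X) and a transient event to one.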

Summary

The phage dsDNA-packaging motors have long been thought to be rotation motors. A third type of biomotor has recently been discovered that translocates dsDNA by a revolution mechanism without rotation; such motors are commonly found in viruses, bacteria, and eukaryotic cells. By analogy, rotation resembles the Earth rotating on its own axis every 24 h, whereas revolution resembles the Earth revolving around the Sun every 365 days. The revolving motion renders a motor free of friction, coiling, and torque, which has helped to clear up many puzzles in studies of viral DNA packaging motors concerning the structure, stoichiometry, and functioning of DNA translocation motors. In recent years, the development and application of biological nanopores have advanced significantly. These biological nanopores can detect different types of molecules, such as DNAs, chemicals, and peptides. The bacteriophage nanopore has proven its ability to discriminate DNAs with a variety of lengths and conformations. These studies point to likely future applications of the bacteriophage conduits in the detection of biomarkers, such as miRNAs, based on the conformational differences between the absence and presence of target analytes. Regarding chemical sensing, specific modifications of the nanopore channel can generate unique signals for individual chemicals. Additionally, the biological nanopores can distinguish among short peptides. The analysis of the peptides produced by enzyme digestion also provides evidence that the nanopore system can differentiate among proteins of similar sizes. This differentiation further indicates the potential to detect specific enzymes with the nanopore system. Overall, based on their ability to provide quantitative analyses of molecules of many kinds, nanopores have great potential for disease diagnosis. The biological nanopores can be used in versatile applications, though several challenges remain.
The types and number of analytes that can be discriminated by single pore sensing are still limited. Detection of different analytes will likely require modification of the nanochannel to give it distinct properties, together with specifically designed probes. For example, the detection of certain chemicals may require specific types and modifications of the nanopore to achieve specific binding between the analytes and the connector. In addition, the detection or sequencing of proteins remains challenging. Further studies are necessary to build a database for peptide discrimination or sequencing. Moreover, to transfer the nanopore sensing technique to clinical trials, more must be learned about how to enhance its sensitivity and specificity in clinical samples or in the presence of many impurities. In sum, single pore sensing is becoming a powerful analytical tool, but it still requires further improvement and the exploration of various connectors.

Acknowledgments

We thank Todd Sukany for copy editing and proofreading of this article.

Further Reading

Bazinet, C., King, J., 1985. The DNA translocating vertex of dsDNA bacteriophage. Annual Review of Microbiology 39, 109–129.
Besprozvannaya, M., Burton, B.M., 2014. Do the same traffic rules apply? Directional chromosome segregation by SpoIIIE and FtsK. Molecular Microbiology 4, 599–608.
Cuervo, A., Carrascosa, J.L., 2012. Viral connectors for DNA encapsulation. Current Opinion in Biotechnology 23 (4), 529–536.
De-Donatis, G.M., Zhao, Z., Wang, S., et al., 2014. Finding of widespread viral and bacterial revolution dsDNA translocation motors distinct from rotation motors by channel chirality and size. Cell and Bioscience 4, 30.
Dedeo, C.L., Cingolani, G., Teschke, C.M., 2019. Portal protein: The orchestrator of capsid assembly for the dsDNA tailed bacteriophages and herpesviruses. Annual Review of Virology 6, 141–160.
Geng, J., Wang, S., Fang, H., Guo, P., 2013. Channel size conversion of phi29 DNA-packaging nanomotor for discrimination of single- and double-stranded nucleic acids. ACS Nano 7, 3315–3323.
Guo, P., Driver, D., Zhao, Z., et al., 2019. Controlling the revolving and rotating motion direction of asymmetric hexameric nanomotor by arginine finger and channel chirality. ACS Nano 13, 6207–6223.
Guo, P., Erickson, S., Anderson, D., 1987b. A small viral RNA is required for in vitro packaging of bacteriophage phi29 DNA. Science 236, 690–694.
Guo, P., Grainge, I., Zhao, Z., Vieweger, M., 2014. Two classes of nucleic acid translocation motors: Rotation and revolution without rotation. Cell and Bioscience 4, 54.
Guo, P., Grimes, S., Anderson, D., 1986. A defined system for in vitro packaging of DNA-gp3 of the Bacillus subtilis bacteriophage phi29. Proceedings of the National Academy of Sciences of the United States of America 83, 3505–3509.
Guo, P., Peterson, C., Anderson, D., 1987c. Prohead and DNA-gp3-dependent ATPase activity of the DNA packaging protein gp16 of bacteriophage phi29. Journal of Molecular Biology 197, 229–236.
Guo, P., Zhang, C., Chen, C., Garver, K., Trottier, M., 1998. Inter-RNA interaction of phage phi29 pRNA to form a hexameric complex for viral DNA transportation. Molecular Cell 2 (1), 149–155.
Guo, P., 2010. The emerging field of RNA nanotechnology. Nature Nanotechnology 5, 833.
Haque, F., Lunn, J., Fang, H., Smithrud, D., Guo, P., 2012. Real-time sensing and discrimination of single chemicals using the channel of phi29 DNA packaging nanomotor. ACS Nano 6 (4), 3251–3261.
Haque, F., Wang, S., Stites, C., et al., 2015. Single pore translocation of folded, double-stranded, and tetra-stranded DNA through channel of bacteriophage phi29 DNA packaging motor. Biomaterials 53, 744–752.
Ji, Z., Kang, X., Wang, S., Guo, P., 2018. Nano-channel of viral DNA packaging motor as single pore to differentiate peptides with single amino acid difference. Biomaterials 182, 227–233.
Ji, Z., Jing, P., Jordan, M., Jayasinghe, L., 2020. Insertion of channel of phi29 DNA packaging motor into polymer membrane for high-throughput sensing. Nanomedicine: NBM 25, 102170.
Jing, P., Haque, F., Shu, D., Montemagno, C., Guo, P., 2010. One-way traffic of a viral motor channel for double-stranded DNA translocation. Nano Letters 10 (9), 3620–3627.
Maluf, N.K., Feiss, M., 2006. Virus DNA translocation: Progress towards a first ascent of mount pretty difficult. Molecular Microbiology 61 (1), 1–4.
Mancini, E.J., Tuma, R., 2012. Mechanism of RNA packaging motor. Advances in Experimental Medicine and Biology 726, 609–629.
Parent, K.N., Schrad, J.R., Cingolani, G., 2018. Breaking symmetry in viral icosahedral capsids as seen through the lenses of X-ray crystallography and cryo-electron microscopy. Viruses 10 (2), 67.
Pi, F., Vieweger, M., Zhao, Z., Wang, S., Guo, P., 2016. Discovery of a new method for potent drug development using power function of stoichiometry of homomeric biocomplexes or biological nanomotors. Expert Opinion on Drug Delivery 13 (1), 23–36.
Schwartz, C., De Donatis, G.M., Zhang, H., Fang, H., Guo, P., 2013. Revolution rather than rotation of AAA+ hexameric phi29 nanomotor for viral dsDNA packaging without coiling. Virology 443 (1), 28–39.
Shu, Y., Haque, F., Shu, D., et al., 2013. Fabrication of 14 different RNA nanoparticles for specific tumor targeting without accumulation in normal organs. RNA 19, 766–777.
Wendell, D., Jing, P., Geng, J., et al., 2009. Translocation of double stranded DNA through membrane adapted phi29 motor protein nanopore. Nature Nanotechnology 4 (11), 765–772.
Yang, Y., Yang, P., Wang, N., et al., 2020. Architecture of the herpesvirus genome-packaging complex and implications for DNA translocation. Protein Cell. doi:10.1007/s13238-020-00710-0.

General Ecology of Bacteriophages
Stephen T Abedon, The Ohio State University, Mansfield, OH, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Burst: Release of phages from a bacterium, usually upon lysis. Burst size is the number of phage virions released per phage-infected bacterium.
Chronic release: The continuous release of phage progeny from infected bacteria in a manner that does not involve bacterial death, infection loss, or host lysis.
Epifluorescent microscopy: Microscopic observation of light absorbed and then reemitted from specimen-bound dyes.
Fecundity: Measure of the number of offspring produced by an individual.
Latent period: The time beginning with phage adsorption to a bacterium and ending with phage release from a bacterium.
Metagenomics: The isolation and sequencing of genomic nucleic acid from whole communities including bacteriophage communities such as those found within a sample volume of seawater.
Plaque: Localized region of phage population growth occurring in association with a turbid, spatially constrained bacterial population (bacterial lawn), as resulting in localized clearing of that turbidity.
Pseudolysogeny: A latent phage infection in which penetration of the phage genome into the bacterial cytoplasm has occurred but subsequent steps towards either productive infection or lysogeny are delayed. Pseudolysogeny is thought to occur primarily when bacteria that are in a significantly starved state are infected by a phage.
Single-step growth: A technique involving adsorption of phages to bacteria that is followed by a determination of the timing of phage-progeny release from the now-infected bacteria, as well as determination of the total number of phages released per bacterium. The word ‘single’ (or ‘one’) is used to indicate that the released phages are prevented from infecting or even adsorbing to subsequent bacteria found within the experimental vessel.
Spatial structure: Environmental impediments to mixing, diffusion, and organism motility such that the present spatial position of an organism serves as a good predictor of future spatial position. In the microbiology laboratory, spatial structure is often imposed upon cultures via the addition of agar to growth media, e.g., such as may be found within a Petri dish.

Introduction Bacteriophages are viruses of bacteria. An enormous number of bacteriophages are found the world over, on the order of 10^30 or more. By way of comparison, 10^30 is approximately the weight of the Sun, in pounds. This article presents an overview of the interactions between bacterial viruses and their environments: that is, an overview of phage ecology. Phage ecology is an exploration of how bacteriophages exist in the wild, and of their impact there.

Phage Existence Within Environments Phage populations have been documented in more or less all environments within which bacteria have been documented. Phage ecological characterization begins with efforts to determine what phages are present within specific environments and in what numbers. With the availability of modern molecular, microscopic, flow-cytometric, and sequencing techniques, it is relatively easy to obtain direct counts or to molecularly characterize uncultured phage virions obtained from environmental samples. A variety of techniques may also be employed to study viral infection/production (aka, proliferation) within environments, though none of these methods are ideal. Estimations of lysogeny in environments are possible but similarly are not ideal.

Overview Determination of phage numbers by viable count, typically meaning plaque count (Fig. 1), once dominated phage environmental endeavors, particularly prior to the 1990s. Direct microscopic counts, however, are generally much higher than viable counts, for example by as much as 10,000-fold. For isolation of viable phages from environmental samples, selective enrichment techniques are often employed. Representative phage numbers, or more generally virus numbers, determined by direct counts range from 1 × 10^4 to 2 × 10^7 per ml of seawater. Assuming virion degradation between sampling and enumeration in the above estimations, these numbers are calculated to be as high as, e.g., 3–100 × 10^6 per ml. Sediment and terrestrial virus counts can be even higher than numbers as found in open water. Especially using epifluorescence microscopy, phage direct counts are accomplished more easily than either phenotypically characterizing environmental phages or demonstrating phage environmental impact on specific bacteria. Various molecular


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20955-2


Fig. 1 Bacteriophage plaques. Shown are two strains of phage T4 as grown on the host bacterium, Escherichia coli. The larger plaques are made by an r mutant, which can display a significantly shorter latent period than the smaller-plaqued wild type. Note the abruptness of the ‘collisions’ between r-type and wild-type plaques, which is suggestive that phage infections likely extend here beyond the visible portion of the wild-type plaques, an interesting laboratory-ecological phenomenon which was first observed in the late 1940s.

approaches – including pulsed-field gel electrophoresis and metagenomic sequencing of phage environmental populations – allow characterization of phage environmental diversity, especially in terms of genotype. While electron microscopy of environmental samples does allow some phenotypic characterization, particularly of virion morphology, none of these approaches are substitutes for phenotypic analysis of individual phage isolates found in at least ‘phage’ pure cultures. In most cases, however, such characterization is not easily accomplished because the majority of presumptive phage hosts are not easily cultured in the laboratory. Phage culturing and microscopy are considered in greater detail below.

Phage Isolation and Host Range Phage presence as well as host range diversity can be assessed by determining phage viable counts. Typically this involves plaquing (Fig. 1). Though rarely employed, it is also possible to dilute phages suspended in liquid such that their isolation and even enumeration, using the most probable number method, can be achieved without plaquing. While directly plating environmental samples can be desirable, phage densities are not always sufficient for either isolation or quantification. Two general approaches exist for increasing phage numbers associated with environmental samples: selective phage enrichment and phage concentration. Selective enrichment involves incubating an environmental sample in liquid culture supplemented with a specific bacterial host. The result is a selective increase in the number of those phages which are able to productively infect that host. Concentration of environmental phages requires more sophisticated techniques and equipment, such as filtration, high-speed centrifugation, or precipitation, but allows quantitative determinations of even low environmental densities of phages. Unfortunately, all handling steps involved in phage isolation are likely to result in enrichment of phages possessing certain properties, such as phages which replicate more effectively under enrichment conditions, which have virions that are better able to resist physical damage, or which produce more-easily noticed plaques. Even when carefully employed, viable counts, as noted, tend to grossly underestimate total phage numbers. Aside from phage losses due to handling, these underestimations can stem from the relatively narrow host ranges of phages that allow only a small portion of a phage community to infect or form plaques on a given bacterial host. 
Also, it is uncertain what fraction of total phage virions are viable under even ideal laboratory conditions nor how much laboratory environments deviate from each phage’s ideal growth environment. Furthermore, also as noted, not all bacteria are culturable; in fact, the vast majority of bacterial species currently are either nonculturable or, at least, difficult to culture using standard laboratory procedures. The resulting limitations on testing phage samples against most hosts necessarily limit our understanding of phage host range diversity within environments.
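The arithmetic behind the two kinds of viable count mentioned above, plaque counts and most probable number (MPN) estimates, can be sketched as follows. This is a minimal illustration rather than a laboratory protocol: the function names and the example numbers are hypothetical, and the MPN estimate assumes a single dilution level with virions distributed among replicate tubes according to a Poisson distribution.

```python
import math


def plaque_titer(plaques: int, plated_volume_ml: float, dilution: float) -> float:
    """Plaque-forming units per ml of undiluted sample.

    `dilution` is the fraction of the original sample remaining in the
    plated material (e.g. 1e-6 for a million-fold dilution).
    """
    return plaques / (plated_volume_ml * dilution)


def mpn_per_ml(negative_tubes: int, total_tubes: int,
               volume_ml: float, dilution: float) -> float:
    """Single-dilution most probable number estimate (phages per ml).

    Assumes Poisson-distributed virions, so the fraction of tubes showing
    no phage growth equals exp(-mean virions per tube).
    """
    if not 0 < negative_tubes <= total_tubes:
        raise ValueError("need at least one negative tube and a valid tube count")
    mean_per_tube = -math.log(negative_tubes / total_tubes)
    return mean_per_tube / (volume_ml * dilution)


if __name__ == "__main__":
    # Hypothetical example: 150 plaques from 0.1 ml of a 10^-6 dilution.
    print(f"{plaque_titer(150, 0.1, 1e-6):.2e} PFU/ml")
    # Hypothetical example: 5 of 10 tubes negative, 1 ml each, 10^-4 dilution.
    print(f"{mpn_per_ml(5, 10, 1.0, 1e-4):.0f} phages/ml")
```

Both estimates count only those phages able to grow on the chosen host under the chosen conditions, which is one reason why they fall below direct counts.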

Microscopic Determination Historically, microscopic determinations have played important roles in the study of phage environmental microbiology, since it was through these observations, especially in the late 1980s and early 1990s, that aquatic environments were found to harbor vast numbers of virus-like particles, especially phages. These observations made it obvious that phage-bacterial interactions could occur within environments with great regularity, suggesting that phages could enormously impact aquatic bacteria. Early visualization of phages involved almost exclusively electron microscopes, since the small size of phages causes them to appear, at best, as pinpricks of light using dark-field optical microscopy. Furthermore, electron microscopes allow scrutiny of phage morphology, which is helpful in differentiating virus-like particles from non-virus particles of similar size, and in obtaining a first approximation of virion in situ diversity. Unfortunately, electron microscopy is fallible because viruses that are not obviously (or exclusively) virus-like in their morphologies are less likely to be enumerated, and thereby may be excluded from viral total


Table 1 Common metrics of phage ecological, and universal, abundance

Measure                                                          Quantity
Approximate number of phage hosts worldwide                      10^30
Typical ratio of phage to bacteria in environments               10:1
Approximate average phage density in sea water                   10^7 ml^-1
Typical range of phage densities in aquatic environments         10^4–10^8 ml^-1
Approximate phage density in soils                               10^8 g^-1
Approximate phage density in sediments                           10^10 ml^-1
Total volume of phage habitats (mostly ocean)                    >10^24 ml
Estimated worldwide number of virus particles (a)                10^31
Estimated total phage mass assuming 10^8 Da/virion               10^15 g
Approximate phage mass in ‘blue-whale units’ (BWUs = 10^8 g)     10^7 BWUs
Cumulative length of phage particles lain end to end             10^8 light years

(a) It is typically assumed that most virus particles observed in the wild are phages.

counts. An additional problem with total counts is that they can determine neither the viability nor the host range of the observed viruses. This includes uncertainty about what fraction of the observed viruses are indeed phages as it is especially difficult to distinguish tailless phages from various groups of eukaryotic viruses. For this reason, researchers who use culture-independent microscopic techniques tend to classify all observations as ‘virus-like particles’ and not necessarily as ‘phages’. Because of virion morphology and the relative prevalence of bacterial hosts in environments, it is assumed that the majority of observed virions in most environments nevertheless are phages. An alternative for determining virion total counts is epifluorescence microscopy, a technique requiring less-sophisticated equipment and sample preparation than electron microscopy. In epifluorescence, virus-containing samples are stained with nucleic acid-binding fluorescent dyes that allow viruses to be viewed using a fluorescence microscope (again as pinpricks of light, though now pinpricks of illuminated viral genomes). Both electron microscopy and epifluorescence microscopy, like viable counts, often require sample concentrating prior to enumeration. In general, electron microscopy allows determination of virion morphologies while epifluorescence allows differentiation of entities containing nucleic acid from those lacking it. As these various forms of microscopy and sample preparation have been refined and developed over the past 30 or so years, more accurate determinations of environmental phage numbers have become possible, and estimates of ~10^6 or more phages per milliliter are typical in pelagic aquatic environments (i.e., within water columns). In marine environments, phage densities appear to be greater in coastal waters (e.g., ~10^7 ml^-1 or more) versus more off-shore waters (e.g., ~10^6 ml^-1 or more). 
Densities may be in ranges closer to ~10^8 g^-1 for terrestrial environments (i.e., soils) as well as productive lakes and even higher still (up to 10^10 g^-1) in sediments, though note that in general study of the ecology of viruses in soils, or sediments, is more difficult than its study in more fluid environments. Although densities tend to decline in deeper sediments, it is speculated that shallow sediments could serve as reservoirs for pelagic viruses, offering long-term, perhaps many decades, phage storage such that sediment disruption may return phages to overlying waters, serving to a degree as equivalents to plant soil ‘seed banks’. Using these methods, it has been estimated that at least 10^30 phage particles, and perhaps as many as 10^31 or more, are present on the planet at any given time, where the lower figure is equal to approximately the volume of all the world’s oceans, in milliliters, times 10^6 (Table 1).
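The order-of-magnitude figures in Table 1 can be cross-checked with simple arithmetic. The sketch below is illustrative only. The densities, virion mass and blue-whale unit come from the text and Table 1, whereas the ocean volume (about 1.3 × 10^24 ml), the virion length (100 nm), the unit conversions and all names are assumptions introduced here.

```python
import math

# Values taken from the text and Table 1
PHAGES_PER_ML_OPEN_WATER = 1e6   # typical pelagic density, per ml
TOTAL_VIRIONS = 1e31             # upper worldwide estimate
VIRION_MASS_DA = 1e8             # assumed mass per virion, in daltons
BLUE_WHALE_UNIT_G = 1e8          # 1 BWU = 10^8 g

# Assumptions introduced for this sketch (not given in the text)
OCEAN_VOLUME_ML = 1.3e24         # approximate volume of the world's oceans
VIRION_LENGTH_M = 100e-9         # assumed ~100 nm per virion
GRAMS_PER_DALTON = 1.66054e-24
METERS_PER_LIGHT_YEAR = 9.4607e15


def order_of_magnitude(x: float) -> int:
    """Return the power of ten of x, e.g. 1.3e30 -> 30."""
    return math.floor(math.log10(x))


def abundance_estimates() -> dict:
    """Recompute the headline abundance figures from the inputs above."""
    total_mass_g = TOTAL_VIRIONS * VIRION_MASS_DA * GRAMS_PER_DALTON
    return {
        "virions_from_ocean_volume": OCEAN_VOLUME_ML * PHAGES_PER_ML_OPEN_WATER,
        "total_mass_g": total_mass_g,
        "blue_whale_units": total_mass_g / BLUE_WHALE_UNIT_G,
        "length_light_years": TOTAL_VIRIONS * VIRION_LENGTH_M / METERS_PER_LIGHT_YEAR,
    }


if __name__ == "__main__":
    for name, value in abundance_estimates().items():
        print(f"{name}: ~10^{order_of_magnitude(value)}")
```

Under these assumptions each result falls within the same power of ten as the corresponding Table 1 entry.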

Phage Ecology The science of ecology may be differentiated into a number of more-or-less distinct subdivisions. These include organismal, population, community, ecosystem, and landscape ecologies. Also included, especially under the heading of organismal ecology, are physiological and evolutionary ecologies. Though important to an understanding of ecology in general, in most cases little effort is made within the phage ecology literature to explicitly distinguish among these various categories, plus there exist numerous overlaps between categories. Indeed, phage biologists including this author often are trained first as microbiologists, or as molecular geneticists, etc., and only later or much less so as ecologists. As a consequence, phage publications relevant to our understanding of phage ecology typically will not describe their contents from a classically ecological perspective. A major goal of this section, therefore, will be to expose readers to phage ecological thinking as organized within a more general ecological framework, presented in overview as follows.

Overview What can be described as a phage organismal ecology considers phage adaptations within environments, with emphasis on phage survival, reproduction, and dissemination. Included under this heading can be phage physiological ecology, which considers the impact of environments on phenotypes. For phages this would be manifest either through modifications in the physiology of bacterial hosts or in terms of environmental impact on virion properties. Also included under the heading of phage organismal


Fig. 2 Lytic bacteriophage life history. Images © James A. Sullivan, used with permission.

ecology is phage evolutionary ecology, which considers the selective benefits of phage adaptations, i.e., how adaptations impact phage evolutionary fitness (survival and growth rates). Phage population ecology considers phage intraspecific interactions, particularly in terms of how those interactions control phage population size, dispersion, and growth rates. Phage community ecology considers phage interactions with other species, especially with bacteria but also with other types of phages, as well as with the biotic components of the greater bacterial environment, such as in terms of the impact of phage-encoded exotoxins on the eukaryotic hosts of bacteria. Phage ecosystem ecology considers phage impact on nutrient cycling and energy flow within ecosystems, which by necessity represents a much more abiotic emphasis than other approaches. Phage landscape ecology, which has not been substantially considered explicitly within the phage literature, would consider the phage impact on interactions between distinct ecosystems, such as between a pond and a surrounding forest, a sewage treatment plant and downstream environments, or an animal (serving as an ecosystem itself) and the extra-animal environment. Note that these various means of looking at phage ecology build upon one another. That is, ecosystem ecology builds upon community ecology which builds upon population ecology which builds upon organismal ecology which, finally, is understood best in light of a strong molecular appreciation of phage biology.

Phage Organismal Ecology Phage organismal ecology may be considered from the perspective of environmental impacts on phage ‘growth’ parameters, particularly as affecting phage reproduction, dissemination, and survival. More specifically, these parameters include phage adsorption rates, infection timing, productive-infection versus lysogeny decisions, burst sizes, and virion decay rates. Since our understanding of phage growth parameters is derived from the laboratory characterization of phages, this section describes this aspect of phage ecology by employing a very much laboratory-centered emphasis. The traditional means of studying most phage growth parameters are single-step growth experiments and phage adsorption determinations. We can fit these growth parameters into a generalized scheme of the phage life cycle, and then consider how variation in the assorted steps of this life cycle can impact phage ecological success. These steps, in order, include (Fig. 2) (1) a diffusion-driven phage-virion extracellular search for bacteria to infect, (2) the phage adsorption process during which a relatively inert phage-virion and adsorption-susceptible bacterium are together converted into a phage-infected bacterium, (3) the phage infection which can vary in its phenotypic expression from restricted (phage dies, bacterium lives) to abortive (both phage and bacterium die) to productive (phage lives but bacterium dies) to pseudolysogenic (or ‘carrier state’; especially, a relatively unstable latent infection) to lysogenic (a somewhat stable latent infection), and (4) some means of phage-virion release into the extracellular environment. The latter, depending on the phage, may or may not involve host-cell and infection death (i.e., via lysis versus chronic/continuous release, respectively). Consideration of how variation in these assorted steps can impact phage ecological success is presented in the following sections. 
For example, we can expect rapid virion adsorption – particularly rapid phage attachment given encounter with a phage-susceptible bacterium – to be favored over less-rapid adsorption unless phenotypes conferring rapid adsorption interfere with phage survival or fecundity. For instance, T4-like phages may enhance their survival within extra-colonic environments by taking on a temporarily adsorption-incompetent but inactivation-resistant state. Another counter to the evolution of more-rapid phage adsorption/attachment may be phage growth within relatively spatially structured environments, if there exist tradeoffs between time spent disseminating versus time spent infecting bacteria. Note that these considerations of phage adsorption rates more generally reflect differences between phage properties ‘as virions’ versus ‘as during bacterial infection’, differences which are explored more fully in the following section. Phage infections in the wild also can be less efficient than those studied under laboratory ‘ideal’ conditions for phage growth (e.g., smaller burst size, longer latent period, slower adsorption). The major bacterial nutrients, carbon and energy, as well as critical minerals, such as phosphorus, can be in short supply within many environments, resulting in such inefficiencies in nature. The ecological study of organismal physiological response to changing environmental conditions is termed physiological ecology, or ecophysiology in short. In addition to developing a better general understanding of phage in situ population ecology,


consideration of phage ecophysiology is relevant to estimating levels of phage-induced bacterial mortality within natural environments, an important aspect of in situ phage–bacterial community ecology.

Bacteria-like versus virion-emphasizing modes of existence The selective advantage associated with various phage infection and release strategies, similar to phage adsorption rates, will be dependent on environmental conditions, particularly in terms of numbers of bacteria, bacterial physiology, and rates of phage decay. In general we can assume that conditions that favor bacterial growth and survival over phage population growth and survival will tend to favor more bacteria-like modes of phage replication/survival over modes that place more emphasis on the production of phage virions; more precisely, we can expect life history biases towards phage existence infecting a bacterium rather than towards greater emphasis on existence as free virions. Thus, low bacterial densities, bacterial physiologies that poorly support significant virion production, or virion-specific environmental antagonists including phage-restricting bacteria will tend to favor extended phage latent periods. These infecting states may include pseudolysogeny, lysogeny, or productive but longer-lasting infections, including infections by phages that release their progeny chronically rather than lytically. Alternatively, we can assume that conditions that favor phage population growth will tend to favor more productive modes of replication, that is, ones that emphasize the virion state over the infected state. This more virion-emphasizing state can include shorter productive phage infections – especially lytic infections that favor earlier virion release over higher infection fecundity – but also more-rapid chronic virion release at the expense of longer-term infection survival. The advantage of displaying a more virion-emphasizing existence may be viewed from two perspectives. The first concerns conditions that allow for rapid phage-population expansion, i.e., relatively high densities of uninfected host bacteria. 
Less intuitively, that expansion comes at the expense of bacterial survival, and therefore any phage which persists within a more bacteria-like (extended latent period) mode of infection may be more susceptible to attack by unrelated phages under conditions which are favorable toward phage population expansion (see, e.g., ‘Killing the winner’, below). Also favoring the virion over the infecting state, a temperate phage can increase its target number and reduce its genomic target size via lysogen induction, i.e., by changing from a single bacterial lysogen to multiple copies of phage virions. Such induction can occur following bacterial lysogen exposure to DNA damaging agents. In considering phage population growth and survival, we thus can envisage phage adaptation as involving compromises between conflicting tendencies toward greater emphasis on existence as an infection versus greater emphasis on existence as a virion.

Phage Population Ecology Considerations of phage population ecology may be differentiated into two categories: (1) the impact of phage adaptations and environmental conditions on phage population growth within well-mixed, fluid environments (such as within broth in the laboratory) and (2) the impact of phage adaptations and environmental conditions on phage population growth within spatially structured environments (such as within soft-agar overlays in the laboratory; Fig. 1). These categories will be considered in turn. Like phage organismal ecology, much of our understanding of phage population ecology derives from laboratory characterization of phages, but also from exercises in mathematical modeling of phage population growth. We can also distinguish phage population considerations into those operating at low phage densities versus those operating at high phage densities. In both cases density refers to that of a single phage population since inter- as opposed to intra-species interactions are considered under the province of phage community rather than phage population ecology. These latter considerations are much simpler to envisage during phage broth-culture growth. The growth of phages in broth can be considered to occur over at least four distinct steps. These are as follows: (1) Phage entrance into an environment. This can be physical, in terms of phage movement from one place to another, physiological in terms of the induction of an already-present lysogen, or genetical in terms of phage host-range mutation such that a previously phage-insusceptible bacterial population suddenly becomes susceptible to sympatric phages. (2) A period of phage-population growth that spans from the point of the initial phage infection of a susceptible bacterium within an environment through to the point of transition where most susceptible bacteria have become phage infected. 
Most bacteria will become phage infected, however, only if sufficient bacteria are present to support phage population growth to relatively high densities. (3) The transition to and then span over which most of the phage-susceptible bacteria within a population have become phage infected. (4) A postinfection, particularly post community-wide bacterial lysis period during which phage numbers greatly outnumber those of bacteria. In many cases, each of these steps is at best a transient phenomenon for a given combination of phage, bacterium, and environment. Once a phage virion has been released from an infected bacterium, it has entered into a period of what can be described as an extracellular search for new bacteria to infect. The duration of this search will be a function of the product of the density of phage susceptible bacteria within a given environment – which in turn can be a function of the range of bacterial types that a given phage may infect – and the phage adsorption rate constant, which will vary as a function of phage, bacterium, and environmental parameters. The likelihood that a given phage will successfully adsorb to and infect some bacterium is additionally a function of a phage’s decay rate, a.k.a., virion inactivation rate, with higher decay rates reducing the likelihood of successful infection by a virion. Decay can be a function of abiotic insults that result in virion capsid or nucleic acid damage, phage adsorption to abiotic materials,


biotic insults including engulfment by eukaryotes as well as inactivation mediated by phage-adsorbed bacteria, or, from a modeling perspective, even emigration out of an environment. To a phage entering a previously phage-free environment, the likelihood that phage population growth will occur therefore is a function of bacterial numbers, rates of phage adsorption per bacterium, and rates of phage decay. If successful adsorption occurs, then the likelihood that phage population growth will continue beyond this first adsorption will be a function of phage burst size, with each additional phage produced per infection potentially possessing an additional, nearly identical likelihood of subsequent successful bacterial adsorption. Extending this point, we can assume that with either higher bacterial numbers, more-effective/rapid virion adsorption per bacterium, lower rates of decay, or higher initial virion numbers, then there will be a greater likelihood that phage population growth will be initiated upon phage entrance into a new environment. Overall the potential for a phage population to grow within a given environment can be described as its existence conditions. Once phage population growth has begun, it seems unlikely that there would exist any intrinsic inhibitions on phage evolution toward faster population growth. Both in terms of intraspecific and inter-specific competition among phages, it is that phage which reaches a bacterium first that should have the greatest potential to ‘claim’ that bacterium for infection. Thus, a phage type that grows its population fastest will have access to more bacteria sooner, and therefore enjoy among its progeny greater overall numbers of bacterial infections within a given environment. In general, we can expect faster overall phage population growth given greater phage burst sizes, shorter phage latent periods, or faster phage adsorption. 
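These existence conditions can be made concrete with a deliberately simple calculation. The sketch below is not a published model from this article: it assumes well-mixed conditions, a constant bacterial density, and first-order adsorption and decay, so that a free virion adsorbs before it decays with probability kN/(kN + d). The function names and example parameter values (including an adsorption rate constant of 2.5 × 10^-9 ml per min) are illustrative assumptions.

```python
def mean_search_time(k: float, bacteria_per_ml: float) -> float:
    """Mean duration (min) of the extracellular search, ignoring decay: 1/(k*N)."""
    return 1.0 / (k * bacteria_per_ml)


def adsorption_probability(k: float, bacteria_per_ml: float, decay: float) -> float:
    """Chance that a virion adsorbs before it decays (competing first-order rates)."""
    adsorption_rate = k * bacteria_per_ml
    return adsorption_rate / (adsorption_rate + decay)


def successful_progeny(burst: float, k: float, bacteria_per_ml: float,
                       decay: float) -> float:
    """Expected number of progeny per infection that go on to adsorb."""
    return burst * adsorption_probability(k, bacteria_per_ml, decay)


def can_grow(burst: float, k: float, bacteria_per_ml: float, decay: float) -> bool:
    """Population growth requires more than one successful progeny per infection."""
    return successful_progeny(burst, k, bacteria_per_ml, decay) > 1.0


if __name__ == "__main__":
    k, decay, burst = 2.5e-9, 1e-3, 100.0   # ml/min, per min, virions (assumed)
    for density in (1e3, 1e6):
        print(f"N = {density:.0e}/ml: search ~{mean_search_time(k, density):.0f} min, "
              f"growth possible: {can_grow(burst, k, density, decay)}")
```

Within this sketch, raising bacterial density, the adsorption rate constant or the burst size, or lowering the decay rate, each moves a phage toward satisfying its existence conditions, in line with the qualitative argument above.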
Phage decay can also affect rates of phage population growth, and this occurs via reductions in effective phage burst sizes, where the qualifier ‘effective’ refers to the number of phages released from a given bacterial cell which go on to successfully infect new bacteria. Things change, within a given environment, as a phage population reaches the limits of its population growth. At this point the selective pressures on phage growth change from ones that are mostly phage-density independent to ones which instead are highly phage-density dependent. These phage-density-dependent limits vary in their relevance, however, depending on bacterial densities along with phage fecundities. At one extreme, bacterial densities are so low, or phage fecundities (effective burst sizes) so small, that phage populations never reach densities of sufficient size to greatly affect bacterial populations. Under these conditions, multiple phage adsorptions to a given bacterium are of low likelihood and competition between phages for individual bacteria is small. Selection here should be biased toward greater virion or infection durability, for example, rather than higher rates of phage population growth. We have this expectation because bacterial rarity presumably would put a premium on phage survival prior to bacterium encounter and also because phage potential to migrate to new bacteria-containing environments could be a function of longer-term phage survival, either as virions or as bacterial infections. At the other extreme, bacterial densities as well as phage fecundities may be sufficiently high that a majority of phage-susceptible bacteria may become phage infected and therefore no longer available for further infection by the same phage type. 
Prior to this point we may expect selection to favor phages that display rapid population growth, even if such rapidity should come at the expense of effective burst size, such as in terms of actual burst size or in terms of virion resistance to decay. Thus, at higher bacterial densities, while phage densities are still relatively low, a phage could display faster population growth by shortening its latent period. Such shortening, however, is most easily accomplished at the expense of the overall duration of the period during which phage progeny intracellularly mature, resulting in a reduced phage burst size. Alternatively, one could envisage a loss of phage genes that otherwise could contribute to long-term virion survival should the presence of those genes come at the expense of effective phage burst size or further latent period shortening. One possible example of such evolution could be a common loss of abilities to display lysogenic cycles among phages contaminating industrial dairy ferments. Adaptation that results in declines in phage survival could be shortsighted within environments in which densities of new bacteria to infect come to rapidly decline. Instead, we can envisage an advantage bestowed upon phage infections during ‘final rounds’ of infection – that is, at the point where most bacteria within a population have become phage infected – that would result from larger effective burst sizes. In this way more or more robust phages are produced to better assure phage survival until access to a new bacterial population occurs. Somewhat equivalently, a phage may accrue advantages by displaying a more bacteria-like lifestyle under conditions where bacterial densities are in decline, such as via temperate-phage reduction to lysogeny upon bacterial adsorption rather than display of a lytic infection (above). 
Many of these same effects take place during phage population growth within spatially structured environments, such as within bacterial lawns immobilized within soft agar, that is, as phage plaques (Fig. 1) or, as we can speculate, in association with bacterial biofilms. A difference, though, is that in addition to temporal differentiation between phage density-independent and phage density-dependent selection, there also can be spatial differences. Thus, phage multiplicity may be higher toward the center of plaques versus toward their periphery. In other words, at a plaque’s leading edge, selection acting upon phage characteristics likely is more phage density independent than toward the center of plaques. Therefore, we can expect selection for faster phage population growth to occur at a plaque’s periphery, whereas, by contrast, selection at a plaque’s center should be biased more toward a greater effective burst size. As noted, it may be possible to extrapolate, to a degree, the characteristics of phage population growth within phage plaques to that in association with at least some bacterial biofilms.

Phage Community Ecology
Phage community ecology includes issues of phage predation on bacteria as well as the establishment of mutualistic-like relationships such as those between prophages and bacteria, forming lysogens. The phage community ecology literature is more closely aligned with the ecological literature, which is to say that a good deal of phage community ecology has been done by individuals trained first as ecologists who have then taken to phages as model systems rather than by individuals trained first by studying more organismal or more molecular aspects of phage biology. As such, phage community ecology has a much more theoretical basis than either phage organismal or population biologies, including the employment of simulations of phage-bacterial interactions, especially within continuous cultures (chemostats). In addition to considerations of phage impact on bacterial populations, including as can be associated with other organisms such as ourselves, phage community ecology can also consider interactions between different phage species or grazing on phages by protists. We can distill three general governing principles of phage community ecology, especially with regard to phage impact on bacteria: (1) The greater a phage population density then, all else held constant, the greater the phage impact on a bacterial population, and the faster that phage populations reach these higher densities, then the sooner their impact. (2) The greater the density of phage-sensitive bacteria, then – potential changes in bacterial physiology aside – the greater the population size that phages can reach and the sooner they will grow to those population sizes. And (3) Bacteria can hide in various ways from phages – numerically, genetically, phenotypically, physiologically, spatially – but doing so can come at a cost in terms of, for example, bacterial growth rates. Real-world applications of these concepts include that of ‘Killing the winner’, where it is expected that populations of same or similar bacterial types, which grow to higher densities, will tend to become more susceptible to phage-mediated population crashes.
These crashes result solely from bacteria having reached these higher densities and thereby being able to support phage population growth to higher densities. By culling especially more successful bacterial populations, phages may serve as mediators of a frequency-dependent selection among bacterial communities that results in greater bacterial diversity than environments might otherwise sustain. A second and related real-world consideration is that bacteria within especially eutrophic, i.e., high-nutrient, environments, where phage virions can be especially abundant, may display physiologies that represent tradeoffs between rates of bacterial growth and phage-free survival. That is, bacteria may be subject to physiological and other burdens that come with deploying protective measures against potentially infecting phages. Protective measures are diverse and include such things as restriction-modification systems along with CRISPR-Cas systems. Additionally, bacteria may take on more-protective lifestyles, potentially such as within extracellular polysaccharides as found in association with biofilms. Bacteria may also display reduced versatility in the course of mutating to phage resistance because of loss (temporarily or permanently; fully or partially) of membrane proteins involved, for example, in nutrient transport, proteins which could otherwise serve as phage receptors. That is, the more proteins or other molecules a bacterium displays on its surface, then the more phage types that bacterium may be susceptible to. Because many of these mechanisms of phage resistance can result in reduced bacterial fitness, we should expect that the strength of selection for phage resistance should be directly proportional to phage density within a given environment. Phages, in turn, can display adaptations which allow them to overcome various bacterium-mediated mechanisms of phage resistance.
Bacteria can also evolve to mitigate the costs associated with displaying phage resistance. A third real-world consequence of phage impact on bacteria can result from phage-mediated transduction of bacterial genetic material. Phages do this by two basic mechanisms: generalized transduction, in which random pieces of bacterial DNA are incorporated into consequently defective phage virions, and specialized transduction. The latter traditionally involves the incorporation of that bacterial genomic DNA which flanks bacterial-chromosome inserted prophages, and occurs as a consequence of imprecise prophage excision upon induction. Serving essentially as an extension of the concept of specialized transduction is phage encoding of ‘morons’, which involves the incorporation via illegitimate recombination of bacterial genes into various locations within temperate phage genomes. This ‘more DNA’ is then passed on to bacteria in the course of establishment of lysogeny by moron-carrying phages. These genes, along with prophages generally, can impact bacterial phenotype including the fitness of lysogens, potentially positively. These latter mechanisms of transduction serve as a means of transferring genetic material from bacterium to bacterium while also transferring genetic material from bacterium to phage.
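The first two governing principles above, together with the idea of a numerical refuge, can be explored with a toy simulation of the kind the theoretical literature employs. The sketch below is only an illustration under strong assumptions: mass-action adsorption, no latent period, logistic bacterial growth in a closed culture (not a chemostat), and simple Euler integration. The function names and all parameter values are hypothetical.

```python
def simulate(carrying_capacity, phage0=100.0, hours=72.0, dt=0.01,
             growth=0.5, adsorption=1e-9, burst=50.0, decay=0.05):
    """Euler integration of an assumed mass-action phage-bacteria model.

    Densities are per ml and rates per hour. Bacteria start at carrying capacity.
    Returns a list of (time, bacteria, phage) tuples.
    """
    bacteria = carrying_capacity
    phage = phage0
    history = [(0.0, bacteria, phage)]
    steps = int(round(hours / dt))
    for step in range(1, steps + 1):
        infections = adsorption * bacteria * phage  # new infections per ml per hour
        d_bacteria = growth * bacteria * (1.0 - bacteria / carrying_capacity) - infections
        d_phage = burst * infections - decay * phage
        bacteria = max(0.0, bacteria + d_bacteria * dt)
        phage = max(0.0, phage + d_phage * dt)
        history.append((step * dt, bacteria, phage))
    return history


def phage_peak(history):
    """Return (time, density) at which phage density is highest."""
    time, _, phage = max(history, key=lambda row: row[2])
    return time, phage


if __name__ == "__main__":
    for capacity in (1e5, 1e7, 1e8):
        run = simulate(capacity)
        time, peak = phage_peak(run)
        print(f"bacteria {capacity:.0e}/ml: phage peak {peak:.2e}/ml at {time:.1f} h")
```

With these illustrative values, denser bacterial populations support a higher and earlier phage peak, whereas below roughly decay / (burst × adsorption) bacteria per ml the phage population cannot replace its own losses and simply declines.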

Phage Ecosystem Ecology
The primary emphasis of phage ecosystem ecology is on the impact of phage infection of prokaryotes on movement of nutrients and energy from prokaryotes to higher trophic levels. That is, phage infection influences the acquisition of bacteria, including cyanobacteria, by bacteria-eating eukaryotes. The real-world significance of this emphasis derives from prokaryotes serving as the base of many, especially aquatic, ecosystems, and the role that these organisms play in carbon sequestration. That is, within aquatic environments most dissolved organic material becomes available to eukaryote grazers only once bacteria have assimilated it. With phage-induced bacterial lysis, however, a significant fraction, as much as 25% of the carbon entering these ecosystems photosynthetically, is converted to a dissolved organic state. This creates a ‘viral loop’ that shunts organic carbon away from what is known as the aquatic ‘microbial loop’, which otherwise moves carbon from prokaryotes to grazers, i.e., especially phagotrophic protists, and on to grazer-consuming animals (Fig. 3). Another way of stating the significance of the phage impact on the survival of aquatic bacteria is that it is approximately equivalent to the impact of protist grazing. The impact of phage infection versus protist grazing, however, varies with habitat, with phages displaying a greater impact especially in more extreme, including low-oxygen, environments.


Fig. 3 Microbial loop as short circuited by virus-, particularly phage-induced lysis. The viral loop consists of the generation of dissolved organic carbon via virus-induced lysis of especially heterotrophic bacteria and cyanobacteria (the latter here included under the heading, Algae), which is followed by assimilation of a fraction of that material into heterotrophic bacteria. The viral loop reduces the efficiencies by which photosynthetically fixed organic carbon is transferred to higher trophic levels via the microbial loop, such as to aquatic animals.



Marine Bacteriophages
Vera Bischoff, Falk Zucker, and Cristina Moraru, Institute for Chemistry and Biology of the Marine Environment, Oldenburg, Germany
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Bathypelagic layer The bathypelagic layer extends from the mesopelagic layer to ~4000 m below the ocean surface, where the abyssopelagic zone begins. Because no light penetrates the bathypelagic layer, it is also known as the dark zone.
Epipelagic layer The epipelagic layer represents the uppermost layer of the ocean, where sunlight easily penetrates and the photosynthetic autotrophs can be found. It extends from 0 m to 150/200 m depth.
Fosmid Fosmids are cloning vectors based on the bacterial F-plasmid. They can hold DNA inserts of up to 40 kb in size.
Mesopelagic layer The mesopelagic layer represents the part of the pelagic zone in between the photic epipelagic and aphotic bathypelagic zones. It is also known as the twilight zone and it extends from ~150/200 m to ~1000 m depth. It begins at the depth where only 1% of the incident light reaches and ends where there is no light.
Nanopore Sequencing Technology Nanopore sequencing technology, as developed by Oxford Nanopore Technologies Inc, produces long reads. The read length usually averages 4 kb, but, with DNA preparations that minimize fragmentation, it can reach even 1 Mb. From here on we will refer to this technology as the nanopore technology. It involves several steps. First, the sequencing library is prepared by binding adapters to DNA fragments, while striving to keep the fragments as long as possible. Then, the library is loaded on a flow cell, which contains more than one thousand protein nanopores set in an electrically resistant polymer membrane. By setting a voltage across the membrane, an ionic current is passed through the nanopore. The protein nanopores can capture single-stranded DNA and pass it through. Each time a base passes through a pore, the electric current is changed and recorded. For each pore, the base sequence of a single DNA fragment passing through it is inferred from the changes in the electrical current.
Read In nucleic acid sequencing, a read is defined as the base pair sequence corresponding to a single DNA fragment.
Sequencing by Synthesis Technology Sequencing by synthesis technology, as developed by Illumina Inc., produces short reads of 200–400 bp, in very high throughput. From here on we will refer to this technology as the Illumina technology. It involves several steps, as follows. First, the sequencing library is prepared by fragmenting the DNA and binding adapters to both ends of the fragments. Then, the library is loaded on a flow cell, where DNA fragments bind randomly to its surface, one fragment at a single location. This is followed by solid-phase amplification, which creates up to 1000 identical copies of each single DNA fragment. All copies from one molecule are found in close proximity, creating a cluster with a diameter of 1 µm or less. In a sequencing flow cell, there are millions of such clusters per square centimeter. Further, in each sequencing cycle fluorescently labeled deoxynucleotide triphosphates (dNTPs) are added. Because the fluorochrome acts as a terminator, only a single base binds to a DNA molecule per sequencing cycle. After the base incorporation, the fluorescent dyes are imaged and then cleaved away from the DNA molecules, to allow the next cycle of sequencing. Because sequence clusters have a defined spatial localization on the flow cell, the sequence of every single read can be inferred from the order of the fluorescent signals corresponding to the four bases (A, T, C, and G) at the respective location.

A Short Introduction in Marine Phages
In the late 1980s it was recognized that, in marine environments, viruses outnumber microbial cells by an order of magnitude. In seawater, virus numbers range between 10^4 and 10^8 per milliliter. In sediments, virus abundance is even higher, up to 10^9 per cubic centimeter. The majority of these viruses infect bacteria, and they are called bacteriophages, or phages. Phages shape the bacterial community structure through viral lysis and influence both host metabolism, through the expression of auxiliary metabolic genes, and host evolution, through horizontal gene transfer. Furthermore, phages are responsible for the viral shunt, in which viral lysis diverts organic matter from being transferred to higher food chain levels through grazing. Through their various roles, phages are major drivers of marine biogeochemical cycles.

Morphological Diversity of Marine Phages
Generally, phage morphological diversity is low. All phages have a capsid, which is a protein shell surrounding the nucleic acid. Phage capsids are either helical or icosahedral in shape (see Figs. 1–3). A variation of the icosahedral shape is the prolate capsid, resembling an elongated icosahedron (see Fig. 1). Helical capsids are known only for ssDNA phages. The icosahedral shape is the most frequently encountered, being present in all dsDNA and RNA phages and in some of the ssDNA phages. In rare instances, icosahedral phages are enveloped by a lipid layer. An additional structure, called the tail, is present in the majority of known dsDNA phages. The tail mediates the adsorption to the host and the DNA transfer. There are three basic tail morphologies: podoviral, myoviral and siphoviral (see Fig. 2). In podoviruses, the tail is short and non-contractile. In myoviruses, the tail is long, rigid and contractile. In siphoviruses, the tail is long, flexible and non-contractile. Tailed phages have capsid proteins with an HK97 fold (named after the eponymous bacteriophage), while non-tailed phages have capsid proteins with a double jelly roll fold (DJR). Because the majority of cultivated marine phages are tailed, and thus belong to Caudovirales, until recently it was believed that tailed phages dominate in the environment as well. However, it was shown by quantitative electron microscopy that non-tailed phages make up between 66% and 85% of environmental phages in global surface oceans. The identity and phylogenetic affiliation of these non-tailed phages are currently unknown. Part of them could be dsDNA phages with a DJR-fold capsid, from the Corticoviridae or “Autolykiviridae” families. Another part could be ssDNA phages from the Microviridae family. And another part most likely represents so far unknown phages. As with their bacterial hosts, morphology alone cannot be used to classify phages or to predict their behavior and ecological role. Behind the few morphological types described, a huge genomic diversity is hidden.

Fig. 1 Phage capsid types.

Fig. 2 Morphological and taxonomic diversity of marine dsDNA phages.

Fig. 3 Morphological and taxonomic diversity of marine ssDNA phages.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20988-6

Genomic Diversity of Marine Phages and How do we Study it The marine phage diversity is large and just being explored. The methods used to study phage diversity fall within two categories: (1) culture-dependent methods, based on phage isolation in pure cultures and (2) culture-independent methods, based on microbial ecology tools, as for example sequencing of marker genes or of phage genomes directly from the environment. The first set of methods allows in-depth characterization of specific phage-host systems. For example, cultured phages can be visualized by electron microscopy, can be (relatively) easily genome sequenced, their interaction with the host can be studied in great detail and their host range can be determined. The second set of methods most often allows access only to phage gene or genome sequences. However, they allow sampling and analysis on a much larger scale than would be possible by using only phage isolates. By now, phages for several marine bacterial groups have been isolated and characterized. For isolation of phage cultures from an environmental sample, the phage fraction of the respective samples has to be separated from the microbial cells first. Then, the phage fraction is mixed with growing cultures of the hosts of interest. The propagation of host-specific phages is monitored by tracking cell lysis, for example by plaque assays or turbidity measurements. If lysis is observed, the new phages are purified by successive plaque assays when the hosts can grow on solid media, or by dilution to extinction when the hosts grow only in liquid media. The most seminal isolation efforts were focused on phages infecting cyanobacteria, also called cyanophages. More than 7000 cyanophages have been obtained from different studies. However, many isolates have a high degree of genomic relatedness, which reduces their diversity to just a few groups. 
In comparison, the number of isolates available from other bacterial groups is smaller, but of comparable diversity. For example, to date 34 phages infecting the Roseobacter group, 40 phages infecting Cellulophaga and 15 phages infecting Pelagibacter have been published. Although having phage cultures is highly advantageous, the inability to cultivate most marine bacteria also means that their phages cannot be cultivated, and the diversity we see from cultured phages is only the tip of the iceberg. Therefore, researchers need other methods to access environmental phage diversity.

Phage Marker Genes
In microbial ecology, marker genes have long been used to analyze the composition and functions of microbial communities. The most used gene in diversity studies is the one encoding the 16S rRNA, because it is universally present in cellular organisms and it is conserved, allowing identification and classification of microorganisms. In contrast, phages do not have a universally present gene. Instead, specific marker genes have been used to assess the diversity of certain phage groups in environmental studies. Genes encoding different proteins were used, for example genes for the major capsid protein or the DNA polymerase. In these approaches, specific primers were used to amplify fragments of the marker genes by Polymerase Chain Reaction (PCR), followed by sequencing and sequence analysis. A recent development, the solid-phase single-molecule PCR polony method, allows quantification of virus populations in nature.

Viral Metagenomics (Viromics)
Much of our knowledge regarding phage diversity comes from metagenomics studies. A metagenome represents a collection of the sequences of all DNA molecules in a sample. When only the viral fraction is targeted, it is named viral metagenome, or virome. To generate a virome, the phage fraction is separated, and the DNA is extracted, fragmented into small pieces and sequenced using one of the high-throughput sequencing technologies, most often Illumina. This approach generates many small sequence fragments, also called reads, which then have to be assembled into the original phage genomes (see Fig. 4). Most of the time, due to the huge sequence diversity, only partial phage genomes can be assembled. These partial or complete phage genomes (also called contigs) are then grouped into populations, roughly representing the equivalent of phage species, and then into viral clusters, representing approximately the equivalent of genera or subfamilies (see Fig. 5). The biggest collection of marine viromes has been generated from three scientific expeditions through the world’s oceans: Malaspina, Tara Oceans and Tara Oceans Polar Circle. A total of 145 samples have been collected, mainly from the epipelagic and mesopelagic layers, but also from the bathypelagic layer. The first analysis of these viromes included only the Malaspina and the Tara Oceans expeditions and only the epipelagic and the mesopelagic layers. The resulting dataset was labeled the Global Ocean Virome (GOV) and contained 1,380,834 viral contigs. These contigs were grouped first into 15,222 viral populations and then into 867 viral clusters. Of the 867 viral clusters, two-thirds did not affiliate with any cultivated virus, underlining the richness of unexplored marine virus diversity. Five of the 867 viral clusters were relatively ubiquitous, being abundant in more than 20 samples. Only three of the clusters were related to known marine phage groups, that is the T4 superfamily, the Cbaphi381 virus, and the T7virus (see more about these groups in the “Cultivated marine phages” section). The predicted hosts for these viral clusters included Actinobacteria, Alphaproteobacteria, Deltaproteobacteria, Gammaproteobacteria, Bacteroidetes, Cyanobacteria, and Deferribacteres. The second Global Ocean Virome (GOV2) dataset comprises all 145 samples, including the ones in GOV, sequenced more deeply. In GOV2 there are 195,728 viral populations with contig sizes larger than 10 kb. Of these, 90% could not be assigned to any known viral family; however, the expectation is that a large portion represents phages. The other 10% were assigned to dsDNA viruses and bacteriophages. Standard viromics has enabled unprecedented access to the diversity of marine phages. However, one of its shortcomings is that microdiversity, defined by the presence in the same sample of similar phage genomes (for example genomes with >95% identity over at least 70% of their genome), hinders assembly of complete genomes from small reads. As a result, at least part of the phage contigs obtained from viromics studies probably originate from phages with low microdiversity in the respective samples, but not necessarily with high abundances. Several approaches that can circumvent this problem are reviewed below.
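Grouping contigs by sequence similarity, as described above, can be sketched as single-linkage grouping over pairwise comparisons. This is a minimal illustration and not the procedure used for GOV or GOV2: the thresholds simply reuse the microdiversity example from the text (>95% identity over at least 70% of the genome), the pairwise values are assumed to come from a separate alignment step, and the function name and contig names are hypothetical.

```python
def group_contigs(contigs, comparisons, min_identity=95.0, min_covered=0.70):
    """Single-linkage grouping of contigs with a small union-find structure.

    comparisons: iterable of (contig_a, contig_b, percent_identity, covered_fraction),
    assumed to be precomputed by an external alignment tool.
    Returns a sorted list of groups, each a sorted list of contig names.
    """
    parent = {contig: contig for contig in contigs}

    def find(contig):
        # Follow parent links to the group representative, compressing the path.
        while parent[contig] != contig:
            parent[contig] = parent[parent[contig]]
            contig = parent[contig]
        return contig

    for a, b, identity, covered in comparisons:
        if identity > min_identity and covered >= min_covered:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for contig in contigs:
        groups.setdefault(find(contig), []).append(contig)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    names = ["contig_1", "contig_2", "contig_3", "contig_4"]
    pairs = [
        ("contig_1", "contig_2", 97.0, 0.90),
        ("contig_2", "contig_3", 96.0, 0.75),
        ("contig_3", "contig_4", 99.0, 0.40),  # high identity, too little covered
    ]
    print(group_contigs(names, pairs))
```

Single linkage is only one possible design choice; it chains contigs together through intermediates, so a stricter scheme may be preferable when groups should remain tight.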

Viral Contigs From Fosmid Libraries A first approach is to clone environmental phage DNA into fosmids, and then to sequence each fosmid using Illumina technology (see Fig. 6). Because one fosmid carries only one phage DNA fragment, this reduces the assembly complexity and, in principle, allows the retrieval of phage contigs from microdiverse phages. On the other hand, large phage genomes cannot be cloned into fosmids, the practical insert size being in the range of 5–48 kb. Despite this limitation, complete phage genomes can be obtained using fosmid libraries, as exemplified by the retrieval of 208 such genomes from the deep chlorophyll maximum (DCM) in the Mediterranean Sea. These phage genomes were grouped in 21 clades of the Caudovirales and were predicted to infect SAR11, SAR116, Cyanobacteria and the low GC Actinobacteria. Fosmid libraries were also used to explore the viral diversity in the 1000 m deep Adriatic Sea and the 3000 m deep Ionian Sea. A total of 28 complete viral genomes, with lengths of 30–41 kb, were obtained. To assess the distribution of these genomes in the water column, raw reads from the Pacific Ocean virome dataset were mapped against each of them. From the 28 genomes, several were found only in the bathypelagic and mesopelagic layers and in high abundances. Five of the phage genomes were predicted to infect the deep ocean SAR11. From these five, uvDeep-CGR2-AD10-C281 was abundant in the mesopelagic and bathypelagic waters and uvDeep-CGR2-KM22-C255 was found both in the deep and surface waters. In addition, 11 contigs represented putative prophages likely infecting Psychrobacter, Flavobacteria, Planctomyces, and Pelagibacter.

Single Virus Genomics
A second approach is single-virus genomics. Individual phage particles are separated by flow cytometry, followed by genome extraction, amplification and then sequencing (see Fig. 7). The resulting genomes are called viral single-amplified genomes (vSAGs). Applying this method to seawater samples from the Mediterranean Sea and the Atlantic Ocean resulted in 44 vSAGs. Most of the vSAGs were tentatively assigned to the Caudovirales, representing potentially 37 new species and 7 new genera. One of the vSAGs, vSAG-37-F6, was later discovered to infect environmental Pelagibacterales and was ranked as the 13th most abundant marine phage. Furthermore, a high abundance of vSAG-37-F6 transcripts was found in coastal temperate waters from the NE Atlantic, suggesting that this phage was actively infecting its host.


Fig. 4 Workflow for preparation of viral metagenomes.

A limitation of single virus genomics becomes visible when considering the high number of viral particles in environmental samples: this approach becomes impractical at the viral community scale.

Fig. 5 Grouping of viral contigs in populations (based on sequence similarity) and clusters (based on shared gene content).

Fig. 6 Preparation of viral contigs from fosmid libraries.

Long Read Viromics
A third and very promising approach to capture phage microdiversity is VirION. It combines Illumina sequencing technology, with reads of ~300 bases, and nanopore technology, with reads of ~4000 bases. The combination is necessary because the long-read technologies have a high error rate, and the high-quality Illumina reads are used to correct the errors. When applied to seawater samples from the Western English Channel, VirION was able to capture phage genomes belonging to abundant, microdiverse populations that were previously missed by short-read data. One of the assembled contigs, H_NODE_1248, was identified as the most abundant and ubiquitous viral genome in the marine environment. H_NODE_1248 belongs to the same viral cluster as 57 other contigs, including vSAG-37-F6.

Fig. 7 Workflow for single-virus genomics.

Metagenome Assembled Viral Genomes (MAVGs)
Cellular metagenomes can be a rich source of phage contigs. It is estimated that between 20% and 60% of bacterial cells in the marine environment are infected by phages at any time. Therefore, bacterial metagenomes also contain sequences of phages actively infecting the cells at the time of sampling (see Fig. 8). The deeper the metagenome is sequenced (that is, the higher the number of reads produced), the higher the number of viral sequences retrieved. This approach was recently applied to samples from the Mediterranean Sea, from various depths and sampling sites. More than 1300 viral contigs were obtained, of which 36 represented complete genomes. Host prediction indicated that some of the viral contigs derive from phages infecting Cyanobacteria, Actinomarina, SAR11, SAR116 or other Alphaproteobacteria. Most of the retrieved phage contigs were found exclusively in the Mediterranean Sea, and some were specific to the bathypelagic realm.

Linking Phage Genomes With Host Identity Obtaining phage genomes using either viromics or single-virus genomics does not preserve the link between the phage and its host. Several computational methods can be used to predict the potential hosts, as for example (1) sequence similarity with known phages, (2) sequence similarity between phage and host genome, (3) similarities between phage genomes and host CRISPR spacers or (4) similarity of phage and host tRNA. However, for the vast majority of environmental phage genomes, the host cannot be predicted using bioinformatics approaches. Outside the realm of bioinformatics, phageFISH and viral tagging allow linking phage and host identity in environmental samples. In phageFISH, microbial cells in a sample are hybridized with fluorescently labeled probes targeting their 16S rRNA, for taxonomic identification. Simultaneously, intracellular phages are hybridized with probes specific for their genes, labeled with different fluorochromes. Hybridized cells and phages are detected by microscopy. Overlapping 16S rRNA and phage fluorescent signals are considered proof that those specific cells are infected by the targeted phage. Another way to link phages and hosts is to sequence genomes from individual environmental cells. These type of genomes are called single amplified genomes (SAGs), because the DNA from one single cell is extracted and then amplified enzymatically to obtain the amount necessary for sequencing. Single cell genomics recovers all DNA molecules in a cell. Therefore, if infecting phages are found in the cell, they will be sequenced together with the genome of its host, enabling thus the linkage between host cell identity and phage identity (see Fig. 9). Even more, the approximate stage of infection can be estimated by comparing the number of reads per genome base, also called coverage, corresponding to the phage and bacterial genome. 
A higher coverage of the phage genome indicates a higher number of phages per cell and therefore a more advanced stage of infection. In SAGs obtained from the Saanich Inlet water column, 69 contigs from phages infecting the gammaproteobacterial SUP05 clade were identified. Phylogenetic analysis grouped the contigs into five new genera within the dsDNA Caudovirales and the ssDNA Microviridae. Several cells were co-infected by dsDNA and ssDNA phages, potentially indicating a cooperative interaction between these phage types. In another study, twenty complete or near-complete phage genomes were discovered in SAGs from seawater samples from the Gulf of Maine and the North Pacific subtropical gyre. The infected hosts belonged to Marinimicrobia, Verrucomicrobia, the Gammaproteobacteria lineages SAR86 and SAR92, Bacteroidetes and Alphaproteobacteria. Gene content analysis indicated that the assembled genomes belonged to tailed dsDNA phages in the order Caudovirales. Retrieval of an algal virus from a Verrucomicrobia cell indicated that nonspecific attachment of virus particles to cells can be a problem when obtaining phage genomes from bacterial SAGs.

Fig. 8 Preparation of viral contigs from cellular metagenomes (MAVGs).
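
The coverage comparison used to estimate the stage of infection in SAGs can be illustrated with a short calculation. The Python sketch below is illustrative only: the function names and the read counts are hypothetical and are not taken from any of the studies mentioned. Because amplification of single-cell DNA is uneven, such a ratio is at best a rough indication.

```python
def mean_coverage(mapped_bases: int, genome_length: int) -> float:
    """Mean coverage: sequenced bases mapped to a genome divided by its length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return mapped_bases / genome_length


def phage_to_host_coverage_ratio(phage_bases, phage_length, host_bases, host_length):
    """Ratio of phage coverage to host coverage within one SAG."""
    host_coverage = mean_coverage(host_bases, host_length)
    if host_coverage == 0:
        raise ValueError("host coverage is zero; the ratio is undefined")
    return mean_coverage(phage_bases, phage_length) / host_coverage


# Hypothetical SAG: a 40 kb phage contig and a 1.5 Mb host genome.
ratio = phage_to_host_coverage_ratio(
    phage_bases=2_000_000, phage_length=40_000,
    host_bases=15_000_000, host_length=1_500_000,
)
print(f"phage/host coverage ratio: {ratio:.1f}")
```

A ratio well above 1 would suggest many phage genome copies per host genome, which is interpreted above as a more advanced stage of infection.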

ssDNA Phages in the Marine Environment

Most viral metagenomic studies have focused on dsDNA phages. In comparison, less is known about the diversity of ssDNA and RNA phages in the marine environment, because their nucleic acids cannot be sequenced by the usual methods. To sequence ssDNA phages, their genomes first have to be converted to dsDNA. This can be achieved using several methods, of which multiple displacement amplification (MDA) is the most widely used. However, MDA is not quantitative, because it preferentially amplifies circular ssDNA phage genomes, to the detriment of dsDNA phages. The most precise quantitative results are obtained with adaptase-linker amplification. Application of this method to marine samples led to the conclusion that ssDNA phages represent no more than 5% of the marine phage community.

Fig. 9 Linking host identity and phage genomes using single-cell genomics.

Cultivated Phages Infecting Main Groups of Marine Bacteria

Phages of Marine Cyanobacteria – Cyanophages

Cyanophages are viruses that infect cyanobacteria. As oxygenic photoautotrophs, cyanobacteria are the main contributors to oceanic primary production. Furthermore, many cyanobacteria are diazotrophs: they can assimilate N2 gas and make it available to other organisms. This is especially important in ocean surface waters, where nitrogen is often a limiting nutrient for primary production. Cyanobacteria can be categorized into two major subgroups: unicellular and filamentous. The most abundant cyanobacteria in the oceans are the unicellular Prochlorococcus and Synechococcus. Because of their small size, they are also called picocyanobacteria. Prochlorococcus dominates in oligotrophic oceans between 40°N and 40°S. The co-occurring genus Synechococcus is usually less abundant than Prochlorococcus, but it also occurs in subpolar regions and in brackish and nutrient-rich environments.

Cyanophages have been studied quite intensively in the last decades. However, in comparison to their hosts, they are still poorly understood, and the available knowledge is severely biased towards certain host species and environments. The majority of the isolates and of the publicly available genomes belong to phages of Prochlorococcus and Synechococcus, also called picocyanophages. Phages infecting filamentous and unicellular cyanobacteria form separate phylogenetic clusters.

Phages of unicellular cyanobacteria

More than 7000 picocyanophages have been isolated, of which at least 1000 have been sequenced in different studies. All isolated picocyanophages are tailed (see Table 1). The T4-like, S-TIM5 and S-CBWM1 picocyanophages have myoviral morphology, the T7-like picocyanophages have podoviral morphology, and the other isolates have siphoviral morphology. Most of the picocyanophages in culture are either T4-like (Fig. 10.1) or T7-like.

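Table 1 Cultivated cyanophages infecting unicellular cyanobacteria

Cyanophages | Genome type | Morphology | Life style | Host | Genome size (kb) | G + C content [%] | tRNAs #
T4-like | dsDNA | myoviral | strictly lytic | Prochlorococcus, Synechococcus | 170–250 | ~33–45 | 0–25
S-TIM5 | dsDNA | myoviral | strictly lytic | Synechococcus | ~161 | ~41 | 10
S-CBWM1 | dsDNA | myoviral | strictly lytic | Synechococcus | ~140 | ~52 | 34
T7-like | dsDNA | podoviral | strictly lytic, some potentially temperate | Prochlorococcus, Synechococcus | 41–48 | ~38–55 |
P-SS2 | dsDNA | siphoviral | strictly lytic, potentially temperate | Prochlorococcus | ~108 | ~52 | n.a.
S-CBS1 | dsDNA | siphoviral | strictly lytic | Synechococcus | ~30 | ~59 | 0
S-CBS2 | dsDNA | siphoviral | strictly lytic | Synechococcus | ~72 | ~55 | 0
S-CBS3 | dsDNA | siphoviral | strictly lytic | Synechococcus | ~33 | ~61 | 0
S-CBS4 | dsDNA | siphoviral | strictly lytic | Synechococcus | ~69 | ~51 | 1
KBS2A | dsDNA | siphoviral | strictly lytic | Synechococcus | ~40 | n.d. | 0
A-HIS1 | dsDNA | siphoviral | strictly lytic | Acaryochloris | ~55 | ~47 | 0
A-HIS2 | dsDNA | siphoviral | strictly lytic | Acaryochloris | ~57 | ~47 | 0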

Most cultivated phages infecting picocyanobacteria are strictly lytic. To date, no temperate picocyanophages have been isolated. However, induction experiments with environmental cyanobacteria suggest the existence of prophages. Furthermore, the Prochlorococcus phage P-SSP7 carries an integrase gene and is therefore potentially able to integrate into the host genome. Recently, a 100 kb putative prophage was predicted in the genome of Synechococcus WH8016. This prophage forms a new picocyanophage clade together with four MAVGs recruited from the Red Sea and Tara Oceans cellular metagenomes. This clade is found throughout the oceans, at low abundance.

Several cyanophages, for example S-CBWM1, S-TIM5 and those infecting Acaryochloris (see Table 1), have mitochondria-like DNA polymerase genes. Phylogenetic analysis indicates a likely common ancestor for the phage and mitochondrial DNA polymerase genes. It is likely, but not proven, that the DNA polymerase gene was transferred from a phage to an alphaproteobacterial mitochondrial progenitor.

Picocyanophages and their hosts are the most studied of the marine phage-host systems. Because several ecological principles derived from their study can be extended to other phage-host systems, they are reviewed separately in this article, in the “Marine Phage Ecology” section.

Phages of filamentous cyanobacteria

Filamentous cyanobacteria are either free-living, like the aggregate-forming Trichodesmium, or associated with diatom hosts, like the heterocyst-forming Richelia and Calothrix. They often form blooms in estuarine and marine environments. The production of cyanotoxins and the oxygen depletion during and after blooms can change the functioning of the whole ecosystem and can be harmful to the environment, human health and the economy. Global warming is expected to increase the frequency and intensity of cyanobacterial blooms. Cyanophages are thought to play an essential role in the termination of cyanobacterial blooms, but also in their maintenance, through community shifts between specific ecotypes. For example, phage lysis of Nodularia leads to the release of nitrogen and stimulates the growth of Synechococcus. A summary of the known phages infecting filamentous cyanobacteria is given in Table 2.

Phages of Marine Alphaproteobacteria

Phages of marine Pelagibacterales

Pelagibacterales, also known as SAR11, are small, vibrio-shaped chemoheterotrophic bacteria that constitute 20%–40% of all planktonic cells in the euphotic ocean and ~20% of cells in the deeper mesopelagic and bathypelagic waters. They are present in the oceans from pole to pole, but their largest numbers are found in stratified, oligotrophic gyres. For a while, it was believed that the success of the SAR11 clade was due to a lack of phage predators. However, this hypothesis was disproved by the isolation of several phages infecting Pelagibacterales, also called pelagiphages (Fig. 10.2). To date, there are only 15 known pelagiphage isolates, of which 14 are podoviruses and one is a myovirus related to T4 phages (see Table 3).

Most pelagipodovirus isolates are placed within the proposed genus HTVC019Pvirus of the Autographivirinae subfamily and are able to propagate by both lytic and lysogenic cycles. Their genomes encode a tyrosine integrase. Most likely, this integrase catalyzes the integration of the pelagiphage into the host genome through site-specific recombination between the bacterial chromosomal attachment site (attB) and the phage attachment site (attP). The attB sites are various tRNA genes present in the SAR11 genome, while the attP site is located between the integrase and the RNA polymerase genes. Integration of HTVC019Pvirus phages has been shown both in SAR11 cultures and in metagenomic samples, suggesting that temperate pelagiphages could be implicated in ecological processes on broad scales.
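
Site-specific integration of this kind relies on a short sequence shared by attP and attB. As a minimal sketch, assuming that this shared core is an exact match, the Python function below reports the longest sequence common to a phage region and a host tRNA gene. The function name and the toy sequences are made up for illustration; they are not pelagiphage or SAR11 sequences.

```python
def longest_shared_core(phage_region: str, host_trna: str) -> str:
    """Return the longest exact substring shared by the two sequences."""
    phage_region = phage_region.upper()
    host_trna = host_trna.upper()
    best = ""
    # previous[j]: length of the common suffix ending at
    # phage_region[i - 2] and host_trna[j - 1] (previous row of the table).
    previous = [0] * (len(host_trna) + 1)
    for i in range(1, len(phage_region) + 1):
        current = [0] * (len(host_trna) + 1)
        for j in range(1, len(host_trna) + 1):
            if phage_region[i - 1] == host_trna[j - 1]:
                current[j] = previous[j - 1] + 1
                if current[j] > len(best):
                    best = phage_region[i - current[j]:i]
        previous = current
    return best


# Toy example: a made-up 12-base core embedded in different flanking sequences.
core = "GCCGGCGGCCGC"
attp_region = "AAAAA" + core + "TTTTT"   # hypothetical phage region
trna_gene = "TTTTT" + core + "AAAAA"     # hypothetical host tRNA gene
print(longest_shared_core(attp_region, trna_gene))
```

In practice, a candidate core found this way would still need to be supported by evidence of integration, such as the observations in cultures and metagenomes mentioned above.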

Fig. 10 (1) Electron micrograph of negative-stained Prochlorococcus myoviruses P-SSM2 and P-SSM4. Scale bars indicate 100 nm. (A) P-SSM2 with non-contracted tail. (B) P-SSM2 with contracted tail. (C) P-SSM4 with contracted tail. (D) P-SSM4 with non-contracted tail. (2) Electron micrograph of isolated pelagiphages. (A) Pelagipodovirus HTVC011P. (B) Pelagipodovirus HTVC019P. (C) Pelagipodovirus HTVC010P. (D) Pelagimyovirus HTVC008M. (E) Host cell of Candidatus P. ubique strain HTCC1062 infected with HTVC011P. (3) Electron micrograph of cobaviruses. (A) Molybdenum stained, cell debris bound Lentibacter virus vB_LenP_ICBM1. (B) Uranyl acetate stained, free Lentibacter virus vB_LenP_ICBM2. Scale bars indicate 50 nm. (4) Electron micrograph of Pseudoalteromonas phenolica infecting phage TW1. Scale bar indicates 50 nm. (5) Electron micrograph of Pseudoalteromonas phage vB_PspS-H40/1. Tungstophosphoric acid stained. Scale bar indicates 40 nm. (6) Electron micrograph of phage f327 infecting Pseudoalteromonas sp. BSi20327. Scale bar indicates 100 nm. (7) Thin-section electron micrograph of autolykiviruses. Scale bar indicates 50 nm. Reproduced from (1) Sullivan, M.B., Coleman, M.L., Weigele, P., Rohwer, F., Chisholm, S.W., 2005. Three Prochlorococcus cyanophage genomes: Signature features and ecological interpretations. PLoS Biology 3 (5), e144. doi:10.1371/journal.pbio.0030144. Used under CC BY (http://creativecommons.org/licenses/by/4.0/). No changes made. (2) Zhao, Y., Temperton, B., Thrash, J.C., et al., 2013. Abundant SAR11 viruses in the ocean. Nature 494 (7437), 357–360. doi:10.1038/nature11921. Copyright (2013, Springer Nature). (3) Bischoff, V., Bunk, B., Meier-Kolthoff, J.P., et al., 2019. Cobaviruses – A new globally distributed phage group infecting Rhodobacteraceae in marine ecosystems. The ISME Journal. doi:10.1038/s41396-019-0362-7. Used under CC BY (http://creativecommons.org/licenses/by/4.0/). (4) Shin, H., Lee, J.-H., Ahn, C.S., Ryu, S., Cho, B.C., 2014. Complete genome sequence of marine bacterium Pseudoalteromonas phenolica bacteriophage TW1. Archives of Virology 159 (1), 159–162. doi:10.1007/s00705-013-1776-6. Copyright (2013, Springer Nature). (5) Kallies, R., Kiesel, B., Schmidt, M., et al., 2017. Complete genome sequence of Pseudoalteromonas phage vB_PspS-H40/1 (formerly H40/1) that infects Pseudoalteromonas sp. strain H40 and is used as biological tracer in hydrological transport studies. Standards in Genomic Sciences 12, 20. doi:10.1186/s40793-017-0235-5. Used under CC BY (http://creativecommons.org/licenses/by/4.0/). No changes made. (6) Yu, Z.-C., Chen, X.-L., Shen, Q.-T., et al., 2015. Filamentous phages prevalent in Pseudoalteromonas spp. confer properties advantageous to host survival in Arctic sea ice. The ISME Journal 9 (4), 871–881. doi:10.1038/ismej.2014.185. Copyright (2014, Springer Nature). (7) Kauffman, K.M., Hussain, F.A., Yang, J., et al., 2018. A major lineage of non-tailed dsDNA viruses as unrecognized killers of marine bacteria. Nature. doi:10.1038/nature25474. Copyright (2018, Springer Nature).

Table 2 Phages of filamentous cyanobacteria

Cyanophages | Genome size (kb) | G + C content | tRNAs #
vB_NpeS-2AV2 | ~139 | ~40 | 1
vB-AphaS-CL131 | ~113 | ~40 | 2
A-1(L) | ~68 | n.d. | n.d.
A-4L | ~42 | ~43 | 0
N1 | ~65 | ~35 | 0
A1 | ~68 | ~37 | 0
PaV-LD | ~95 | ~41 | 0
PP | ~43 | ~46 | 0
Pf-WMP3 | ~43 | ~47 | 0
Pf-WMP4 | ~41 | ~52 | 0
NCTB (a) | ~258 | ~42 | 0

Genome type: dsDNA
Morphology: siphoviral; myoviral; podoviral; n.d.
Life style: strictly lytic
Host: Nodularia; Aphanizomenon flos-aquae; Anabaena; Nostoc; Planktothrix agardhii; Phormidium foveolarum; Trichodesmium

(a) Uncultivated, genome obtained from a decaying bloom of Trichodesmium in the New Caledonian lagoon in the South Pacific Ocean.
Note: n.d. = not determined.

Table 3 Cultivated pelagiphages

Pelagiphages | Genome type | Morphology | Life style | Host genus | Genome size (kb) | G + C content | tRNAs #
HTVC019Pvirus | dsDNA | podoviral | lytic and lysogenic | Pelagibacter | 37–42 | ~32–36 | 0–1
HTVC010P | dsDNA | podoviral | strictly lytic | Pelagibacter | ~35 | ~30 | 0
HTVC008M | dsDNA | myoviral | strictly lytic | Pelagibacter | ~147 | ~34 | 0

Amongst cultivated pelagiphages, HTVC010P is the most abundant in the oceans. In the North and South Atlantic, it was found both in surface waters and in the deep chlorophyll maximum (DCM) layer. HTVC008M was found at only a few stations in the Southern Atlantic, indicating a latitudinal biogeography of the SAR11 phages. A study of the ratio between phage and host transcripts in the coastal waters of the NE Atlantic found a low ratio between cultivated pelagiphages and SAR11. This suggests a low lytic activity of cultivated pelagiphages in coastal waters, which is not surprising considering that SAR11 generally has slow growth rates and that cultivated pelagiphages have long latent periods (16–22 h) and low burst sizes (9–49 phages per infected cell).

SAR116 phages

SAR116 is an important group of heterotrophic bacteria in the surface ocean, with abundances as high as 10%. They have genes for proteorhodopsin-based photoheterotrophy and for dimethylsulfoniopropionate, carbon monoxide and C1 metabolism, indicative of their involvement in marine biogeochemical cycles. To date, only one phage infecting the marine SAR116 has been isolated, named HMO_2011. Its host is "Candidatus Puniceispirillum marinum" strain IMCC13122. Phage HMO_2011 has a podoviral morphology and a ~55 kb dsDNA genome. As is typical of phage genomes, most of the predicted genes encode hypothetical proteins. Amongst the annotated genes, the DNA polymerase and the putative methanesulfonate monooxygenase (MsmA) are notable. The DNA polymerase contains a partial DnaJ domain between the 3′–5′ exonuclease and the DNA polymerase domains. Usually, this domain is found in molecular chaperones. The presence of MsmA hints at an involvement of the HMO_2011 phage in the C1 metabolism of SAR116.

The environmental distribution of the HMO_2011 phage was assessed using the Pacific Ocean Virome dataset. HMO_2011 was found to be distributed mainly in the euphotic zones of both the coastal and the open ocean, and to be much less abundant in the aphotic zones. In the coastal waters of the NE Atlantic, HMO_2011 was highly active, as judged from the ratio between phage and host transcripts. This is in agreement with the shorter latent period (6 h) and larger burst size (500 phages per infected cell) observed in culture.
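
The contrast between the culture parameters of the pelagiphages and of HMO_2011 can be made concrete with a rough calculation. The Python sketch below divides burst size by latent period to obtain an average phage yield per infected cell per hour. This is a deliberately crude simplification introduced here for illustration (it ignores adsorption, host growth and phage decay), and the function name is hypothetical; only the burst sizes and latent periods come from the text.

```python
def progeny_per_hour(burst_size: float, latent_period_h: float) -> float:
    """Phages released per infected cell, averaged over the latent period."""
    if latent_period_h <= 0:
        raise ValueError("latent_period_h must be positive")
    return burst_size / latent_period_h


# Ranges reported in the text for cultivated pelagiphages.
pelagiphage_low = progeny_per_hour(burst_size=9, latent_period_h=22)
pelagiphage_high = progeny_per_hour(burst_size=49, latent_period_h=16)
# Values reported in the text for the SAR116 phage HMO_2011.
hmo_2011 = progeny_per_hour(burst_size=500, latent_period_h=6)

print(f"pelagiphages: {pelagiphage_low:.2f} to {pelagiphage_high:.2f} phages per infected cell per hour")
print(f"HMO_2011: {hmo_2011:.1f} phages per infected cell per hour")
```

Even at the upper end of the pelagiphage range, the average yield per hour is more than an order of magnitude below that of HMO_2011, in line with the difference in activity inferred from the transcript ratios.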

Phages of marine Rhodobacteraceae – Roseophages

Marine Rhodobacteraceae, also known as the Roseobacter group, are another important group of heterotrophic bacteria in the marine environment. They have a high metabolic versatility, being able to perform anoxygenic photosynthesis, metabolize a large variety of organic compounds, degrade dimethylsulfoniopropionate and produce various secondary metabolites. Most often, they are associated with micro- and macroalgae, with which they potentially engage in both mutualistic and pathogenic interactions. Roseophages, that is, phages infecting roseobacters, have been isolated for ten of the more than 70 described genera of marine Rhodobacteraceae. These ten genera are Ruegeria, Sulfitobacter, Roseovarius, Dinoroseobacter, Roseobacter, Lentibacter, Celeribacter, Loktanella, Pelagibaca and Thiobacimonas. Most of the roseophages are tailed, with dsDNA genomes and a podoviral or siphoviral morphology, and only two isolates have an ssDNA genome (see Table 4). There are two main groups of roseopodoviruses, the N4-like and the Cobavirus group, both strictly lytic. The roseosiphophages are more diverse, most of them being classified as

Table 4 Cultivated roseophages

Roseophages | Genome size (kb) | G + C content
N4-like | 73–77 | ~43–51
Cobavirus group | 39–41 | ~46–48
phiCB2047-A, phiCB2047-C | ~41 | ~59
Chi-like | 57–61 | ~55–64
Cbk-like | ~147 | ~56
RDJLphi1 | ~62 | ~58
RDJLphi2 | ~63 | ~57
vB_DshS-R5C | ~78 | ~62
vB_ThpS-P1 | ~39 | ~67
vB_PeaS-P1 | ~38 | ~64
vB_RpoMi-Mini, B_RpoMi-V15 | 4.2 | ~58

Genome type: dsDNA; ssDNA
Morphology: podoviral; siphoviral; icosahedral, non-tailed
Life style: strictly lytic; temperate; strictly lytic, potentially temperate
Host genus: Ruegeria; Sulfitobacter; Roseovarius; Dinoroseobacter; Roseobacter; Lentibacter; Celeribacter; Loktanella; Thiobacimonas; Pelagibaca
tRNAs #: 0–15; 0; 0; 0; 24–26; 0; 0; 0; 0; 0

Cbk-like, Chi-like and Mu-like head group. The phages in the Mu-like group are temperate, and their induction to the lytic cycle has been shown in deep-sea roseobacters. The other roseosiphophages are mainly strictly lytic; only some have genome characteristics, for example integrase genes, that indicate a temperate potential.

Cobaviruses (Fig. 10.3) comprise several cultivated roseophages, as well as environmental phage genomes, and form a clade in the proposed Siovirus genus. Their genomes are split into two arms with opposite transcriptional directions, separated by a rho-independent transcriptional terminator. The presence of direct terminal repeats (DTRs) at the ends of the genomes indicates that cobaviruses replicate using a mechanism similar to that of T7 phages. This involves the formation of long concatemeric DNA molecules as intermediates in replication and packaging; the concatemers form by annealing of the 3′ single-stranded ends that result at the DTRs during replication. Cobaviruses most likely lyse their host cells via the canonical holin-endolysin pathway.

Cobaviruses are found worldwide in the euphotic ocean, mostly in coastal areas, but also in the open ocean. They are more numerous in bays and estuaries, for example in Goseong Bay, the Delaware Estuary, Chesapeake Bay and the Port of Los Angeles. Some cobaviruses were shown to be cosmopolitan, their genomes being found, for example, both in the North Sea and in the Yellow Sea. In coastal environments, cobaviruses seem to be present throughout the year. Both cultivated and environmental cobaviruses encode a cobalamin-dependent ribonucleotide reductase gene, related to similar genes from protist-associated microorganisms. This suggests that the habitat of cobaviruses is protist-associated bacterial hosts, most probably vitamin B12 producers.

Recently, two ssDNA roseophages, distantly related to other Microviridae, have been isolated (see Table 4). At 4.2 kb and 4 predicted ORFs, vB_RpoMi-Mini and B_RpoMi-V15 have the smallest genomes amongst known ssDNA phages. They encode a major capsid protein, a replication initiation protein, a peptidase, and a hypothetical protein. Similar proteins have been found in the genome of Novosphingobium tardaugens NBRC 16725, indicating the presence of a prophage or prophage remnant. The similarity of the peptidase gene to genes from bacterial genomes, including genes located in prophage regions, indicates that ssDNA roseophages participate in horizontal gene transfer.
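
Direct terminal repeats of the kind described above for cobaviruses can be searched for by comparing the two ends of a genome sequence. The Python sketch below is a simplified illustration, assuming an exact repeat in a complete linear genome; the function name, the minimum length of 10 bases and the toy sequence are arbitrary choices made for this example and do not describe how the repeats were identified in the original studies.

```python
def direct_terminal_repeat(genome: str, min_length: int = 10) -> str:
    """Return the longest sequence present at both ends of the genome.

    The repeat must occur in the same orientation at the start and at the
    end of the sequence. An empty string is returned if no repeat of at
    least min_length bases is found.
    """
    genome = genome.upper()
    max_length = len(genome) // 2
    # Try the longest candidate first so that the first hit is the longest.
    for length in range(max_length, min_length - 1, -1):
        if genome[:length] == genome[-length:]:
            return genome[:length]
    return ""


# Toy genome: a made-up 13-base repeat flanking an arbitrary middle part.
repeat = "ACGTTGCAAGGCT"
toy_genome = repeat + "T" * 40 + repeat
print(direct_terminal_repeat(toy_genome))
```

Real genome assemblies can contain sequencing errors or be circularly permuted, so a search of this kind would normally be combined with other evidence, such as read coverage at the genome ends.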

Phages of Marine Gammaproteobacteria

Vibriophages

Vibriophages are viruses that infect bacteria from the Vibrionaceae family. These bacteria represent a genetically and metabolically diverse group of heterotrophic bacteria, ubiquitous in the oceans. They can be free-living, but they are more abundant in sediments or in association with other marine organisms or organic particles. Furthermore, the group includes several pathogens, including Vibrio anguillarum, V. parahaemolyticus, V. harveyi, and V. vulnificus. These pathogens infect more than 50 species of fish, mollusks and crustaceans. They can cause vibriosis, a malignant disease affecting the aquaculture industry globally, and food poisoning in people consuming infected, raw seafood. As a consequence, strictly lytic vibriophages are of special interest because of their potential use in phage therapy, as a treatment for vibriosis.

The known marine vibriophages are highly diverse, with representatives in Caudovirales, Autolykiviridae, and Inoviridae. In the environment, vibriophages are found both in the water column and in association with animals. For example, a diverse and abundant vibriophage community has been found in oysters (Crassostrea virginica) from Delaware Bay, from Ladysmith Harbor in British Columbia and from different estuaries in the Gulf of Mexico. Vibriophages were present year-round in oyster tissue, with increased abundances in summer. This suggests that vibriophages are a common occurrence in oysters and most likely they play a

Table 5 Cultivated phages infecting Pseudoalteromonas

Pseudoalteromonas phage/genus | Genome size (kb) | G + C content | tRNAs #
f327 | ~6 | n.d. | 0
PM2 | ~10 | ~42 | 0
Rio1-like virus | 43–45 | ~45 | 0
pYD6-A | ~77 | ~39 | 2
PSAHM1-like virus | 130 | ~36 | 0
PSAHS1-like virus | 36–39 | ~40 | 0
PSAHS2-like virus | ~38 | 63–64 | 0
vB_PspS-H40/1 | ~45 | ~41 | 0
Pq0 | ~33 | ~40 | 0
H105/1 | ~31 | ~41 | 0
PSAHS6-like virus | ~35 | ~40 | 0
B8b | ~46 | 50 | 0
TW1 | ~40 | ~40 | 0

Genome type: ssDNA; dsDNA
Morphology: filamentous; icosahedral; podoviral; myoviral; siphoviral
Life style: chronic; strictly lytic and temperate; strictly lytic; strictly lytic, possibly temperate
Host genus: Pseudoalteromonas

Note: n.d. = not determined.

role in regulating Vibrio populations. Morphological characterization of vibriophages from oysters showed the presence of both tailed and non-tailed phages.

A few hundred lytic marine vibriophages have been isolated to date, of which 251 are part of the Nahant collection and have been genome sequenced. Most belong to Caudovirales, being either podo-, sipho- or myoviruses. About 20 isolates belong to the newly proposed Autolykiviridae family (see below). Caudoviruses in the Nahant collection have genome sizes ranging between 21 kb and 348 kb. Notably, 32 phages carry putative CRISPR features. CRISPR/Cas systems are usually used by bacteria to protect themselves against phages. However, phages can encode their own CRISPR/Cas systems to evade host immunity. The first such system was shown in a vibriophage infecting Vibrio cholerae. This suggests that the CRISPR features found in marine vibriophages could function in a similar way.

Autolykiviruses are non-tailed phages representing a novel family of the double jelly-roll capsid viruses. Their capsid is about 50 nm in diameter and likely contains an inner lipid layer (see Fig. 10.7). Their linear dsDNA genomes are about 10 kb in length, terminate in inverted repeats and encode ~20 proteins. Autolykiviruses have a broad host range, commonly killing hosts in multiple Vibrio species. This is in contrast to tailed phages, which usually infect a few, closely related species. The isolated autolykiviruses are all strictly lytic. Autolykivirus-like prophages have been predicted bioinformatically in bacterial genomes and in contigs from marine metagenomes. Their in vivo activity has recently been demonstrated in two marine Vibrio species.

Prophages are abundant in marine vibrios and can encode (1) virulence factors, for example Zonula occludens toxin (Zot); (2) bacterial host fitness factors, for example RTX toxins, collagenases, lipases, hemolysins and chondroitin AC lyases; and (3) genes involved in antibiotic and heavy metal resistance. In a recent analysis, 15% of the marine Vibrio isolates contained Inoviridae prophages encoding Zot. Inoviruses are filamentous ssDNA phages with a chronic life cycle. They integrate into the host genome and, upon induction, release phage progeny without lysing the cell. Zot is known from Vibrio cholerae infected by the chronic CTX phage. Zot localizes in the bacterial membrane and increases the intestinal permeability of mammalian hosts. Zot-containing prophages were found in isolates from coastal marine waters (e.g., V. campbellii), deep hydrothermal vents (e.g., V. antiquarius and V. diabolicus) and deep subseafloor sediments (e.g., V. diazotrophicus). This suggests that harmless environmental bacteria can acquire virulence traits from pathogenic donors and act as potential biological reservoirs of these genes in the environment.

Pseudoalteromonas phages

Pseudoalteromonas spp. are ubiquitous heterotrophic bacteria in the marine environment. Most often they are associated with particles, where they constitute 20%–60% of the microbial communities and are involved in carbon export in the oceans. There are 12 cultivated phage groups infecting Pseudoalteromonas. Most of them are dsDNA phages, belonging either to Caudovirales (Fig. 10.4 and 10.5) or to Corticoviridae, and one is an ssDNA phage (see Table 5).

Phage f327 is the first filamentous phage known to infect a Pseudoalteromonas strain from Arctic sea ice (Fig. 10.6). In culture, it decreases the growth rate, cell density, and tolerance to H2O2 and NaCl of its host, but it enhances the host's motility and chemotaxis. The latter might confer a survival advantage during the polar night, when nutrients are scarce.

Pseudoalteromonas virus PM2 is the first isolated and the most studied corticovirus. It infects the marine Pseudoalteromonas espejiana BAL-31 via the lytic cycle. The capsid is ~56 nm in diameter and contains an inner lipid bilayer. The genome is a circular, highly supercoiled dsDNA molecule. It is only 10 kb in size and encodes 21 proteins. Genome replication takes place via a rolling-circle mechanism. Through bioinformatic analysis, PM2-like prophages have been predicted in the genomes of cultivated marine Proteobacteria, as well as on contigs from marine metagenomes.

Table 6 Cultivated phages infecting Bacteroidetes

Phage/genus | Genome size (kb) | G + C content | tRNAs #
phi38:1, Cba401likevirus | ~73 | 38 | 16
Cba183likevirus | ~73 | 33 | 1
Cba142likevirus | ~100 | 30 | 0
Cba41likevirus | ~146 | 33 | 24
CbaSMlikevirus | ~54 | 33 | 0
Cba391likevirus | ~29 | 31 | 0
Cba461likevirus | ~35 | 38 | 0
Cba181likevirus | ~39 | 37 | 0
Cba101likevirus | ~57 | 31 | 3
Cba131likevirus | ~78 | 30 | 0
Cba184likevirus | ~7 | 34 | 0
Cba482likevirus | ~12 | 29 | 0
11b | ~36 | ~31 | 0
P12024S | ~36 | ~36 | 0
P12024L | ~36 | ~36 | 0
P2559S | ~45 | ~42 | 0

Genome type: dsDNA; ssDNA
Morphology: podoviral; myoviral; siphoviral; icosahedral, non-tailed
Life style: strictly lytic; strictly lytic, possibly temperate
Host genus: Cellulophaga baltica; Flavobacterium; Persicivirga; Croceibacter atlanticus

Phages of Marine Bacteroidetes

One important group of marine heterotrophs belongs to the Bacteroidetes phylum. Marine Bacteroidetes increase in abundance during algal blooms. They are responsible for the degradation of biopolymers, for example polysaccharides, and are thus involved in recycling the organic matter derived from blooms. Most phage isolates for Bacteroidetes (see Table 6) infect the Cellulophaga genus, which is present in coastal environments and most often associated with macroalgae. To date, 40 isolates grouped in 12 phage genera are known to lytically infect different strains of Cellulophaga baltica. Ten of the genera represent tailed dsDNA phages, with podo-, sipho- and myovirus morphology.

The phi38:1 podoviral isolate belongs to one of the most abundant viral clusters in the sunlit ocean, as discovered by viromics studies. The presence of a high number of tRNAs in its genome, correlated with a broader host range compared with isolates with fewer tRNAs, could explain in part the success of this phage and its relatives in the environment. The tRNAs likely allow the phage to make better use of the host resources. Although phi38:1 has a broad host range, it does not infect all C. baltica strains with the same efficiency. In the original isolation host, phi38:1 infects 64% of the host cells, has a latent period of 70 min and a burst size of 8 phages per cell. In the alternative host, the infection efficiency of phi38:1 is lower: only 20% of the host cells are infected, the latent period is ~5 h and the burst size is 40 phages per cell. Transcriptomics experiments showed that phi38:1 has similar expression patterns in the two hosts. On the other hand, the transcriptional responses of the isolation and alternative hosts are significantly different. Upon phage infection, the alternative host overexpresses DNA degradation genes and underexpresses translation genes, resulting in delayed phage replication and protein translation. These findings have implications for phage-infection ecology in nature. Because the microdiversity of both hosts and phages is expected to be high in the environment, it is likely that variable infection efficiencies are quite common.

Single-stranded DNA phages infecting C. baltica have been classified in two genera. The isolates from the Cba184likevirus genus are strictly lytic, have a genome size of ~6.5 kb and a capsid of ~30 nm, and are thought to represent a new subfamily of the Microviridae. The Cba482likevirus genus comprises strictly lytic phages with 11.5 kb genomes, the largest known for ssDNA phages. Furthermore, the capsid has a diameter of ~72 nm, about twice that of other Microviridae representatives. Consequently, Cba482likevirus is thought to represent a third lineage of ssDNA phages, besides Microviridae and Inoviridae. Sequence similarity indicates that Zunongwangia profunda (Bacteroidetes) harbors a Cba482likevirus prophage in its genome. This suggests that phages in the Cba482likevirus genus are capable of integrating into host genomes and becoming temperate.
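
The infection parameters reported above for phi38:1 in its two hosts can be combined in a simple worked calculation. The Python sketch below is only a back-of-the-envelope illustration: it considers a single infection cycle, ignores adsorption kinetics, host growth and phage decay, and uses hypothetical function names. Only the percentages, burst sizes and latent periods are taken from the text, with the latent period in the alternative host approximated as 300 min.

```python
def progeny_per_100_host_cells(percent_infected: int, burst_size: int) -> int:
    """Phages released in one infection cycle per 100 host cells exposed."""
    return percent_infected * burst_size


def progeny_per_hour(progeny: float, latent_period_min: float) -> float:
    """Average release rate over one latent period."""
    return progeny / (latent_period_min / 60)


# Values reported in the text for phi38:1.
original = progeny_per_100_host_cells(percent_infected=64, burst_size=8)
alternative = progeny_per_100_host_cells(percent_infected=20, burst_size=40)

original_rate = progeny_per_hour(original, latent_period_min=70)
alternative_rate = progeny_per_hour(alternative, latent_period_min=300)  # ~5 h

print(f"original host: {original} phages per cycle, {original_rate:.0f} per hour")
print(f"alternative host: {alternative} phages per cycle, {alternative_rate:.0f} per hour")
```

With these figures, the alternative host yields more phages per infection cycle but fewer per hour, because of the much longer latent period.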

Marine Phage Ecology

Phage Micro- and Macro-diversity in the Marine Environment

An ongoing discussion in virology concerns the existence of viral species. The biological species concept defines species as groups of interbreeding individuals that remain isolated from other similar groups. Viruses do not reproduce sexually. However, they undergo homologous gene exchange during simultaneous infection of the same host by multiple viruses. If the gene exchange

Fig. 11 Schematic representation of viral micro and macrodiversity.

would be confined within groups of similar viruses, then these groups would form discreet viral populations and could be defined as species. One of the first indications that marine phages form discreet and stable genomic populations came from work on roseophages. Following the isolation of the SIO1 roseophage, a very similar phage was isolated 12 years later, indicating that phages are stable in time. Later on, several studies on cyanophages confirmed the existence of sequence discreet clusters. For example, in coastal waters of Southern New England, different clusters of cyanophages were stably maintained over a time span of 15 years. Furthermore, they exhibited a clear temporal and spatial pattern of abundance, suggesting that they occupy slightly different niches and that they represent viral ecotypes. From the different studies (e.g., Brum et al., 2015; Roux et al., 2016, 2018), an empiric threshold for defining viral populations was established. This threshold was set at 95% nucleotide identity over at least 70% of the genome. Recently, the existence of discreet viral populations in the marine environment and their delineation by the above threshold was validated using the massive amount of data from the GOV2 dataset. In an environmental sample, phages can be organized into different ecological levels (see Fig. 11). The lowest level is the individual phage. All individuals belonging to the same species form a population. The genetic diversity within the population, that is the number and proportion of different individual phages, is defined as microdiversity. The highest level of organization is the community, which is represented by all the phage populations in a particular sample. The number and the abundances of these populations are defined as macrodiversity. Usually, a viral community is formed from multiple viral populations, each with its own degree of microdiversity. 
Diversity analysis showed that oceanic viral communities are grouped within meta-communities, or ecological zones. In terms of diversity, this means that phage communities within one zone are more similar to each other than to phage communities from other zones. There are only five viral ecological zones within the world’s oceans: (1) Arctic, (2) Antarctic, (3) bathypelagic (>2000 m depth), (4) temperate and tropical epipelagic (0–150 m depth), and (5) mesopelagic (150–1000 m depth). These zones are similar to those observed for marine bacterial communities, indicating that the main driver of phage diversity is bacterial community composition. The highest macrodiversity was found in the epipelagic and Arctic zones, and the lowest in the Antarctic zone. The highest microdiversity was found in the mesopelagic and epipelagic zones, and the lowest in the Arctic. Furthermore, the epipelagic and Arctic zones contained the highest numbers of zone-specific populations, which indicates high levels of endemism and suggests that these regions are cradles of viral biodiversity.

Marine Phages as Factors Driving Bacterial Mortality and Diversity

Until the late 1980s, when it was discovered that marine phages are highly abundant, grazing by protists was believed to be the main mortality factor for marine bacteria. It is now recognized that viral lysis contributes significantly to bacterial mortality, shaping microbial community composition and diversity. It is estimated that between 10 and 40% of bacterial cells are lysed by viruses.

Fig. 12 Viral shunt in the marine environment. Reprinted with permission from Weitz, J.S., Wilhelm, S.W., 2012. Ocean viruses and their effects on microbial communities and biogeochemical cycles. F1000 Biology Reports 4, 17. Used under CC BY (http://creativecommons.org/licenses/by/4.0/). No changes made.

Phage infections in the oceans are believed to follow a “kill the winner” model. In this model, phage infections are stimulated by increasing host cell densities and, in turn, lead to a crash of the host population. At the same time, the coevolution of bacterial hosts and phages is a veritable arms race, in which the hosts find ways to evade phage attack and the phages evolve to counter the new defenses, so that neither the predator nor the prey is driven to extinction. This arms race results in a high microdiversity of both host and phage. This was shown in chemostat experiments in which a single bacterial strain and its phage were incubated for prolonged periods. Both heterotrophic hosts, e.g., Cellulophaga baltica MM#3, and photosynthetic hosts, e.g., Synechococcus sp. WH7803, were subjected to such experiments (Middelboe et al., 2009; Marston et al., 2012). In both cases, samples collected at different time points showed the evolution of several host and phage strains, with different degrees of resistance and virulence, respectively. The mechanisms used by marine bacteria to escape phage predation are multiple and still largely uncharacterized. In Prochlorococcus, resistance to phages is conferred by mutations impairing the attachment of phages to the cell surface. These mutations are located mainly in non-conserved, horizontally transferred genes from a single hypervariable genomic island, genes that are probably involved in the synthesis of phage receptors. Although these mutations protect Prochlorococcus from certain phages, they also decrease its growth rate and can even allow more rapid infection by other phages. Phage-resistant strains continue to evolve until they reach an improved growth rate and a narrower resistance range. This mechanism could explain why in nature many Prochlorococcus cells have high growth rates and are resistant to phages.
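One common way to make the kill-the-winner idea concrete is a Lotka-Volterra-type model with one host and one lytic phage. The sketch below is an illustration under stated assumptions, not a model taken from this chapter: the equations, the parameter names (r, K, a, b, m) and all numerical values are hypothetical. It shows the characteristic result that the steady-state host density is set by the phage parameters rather than by the host growth rate, so a faster-growing "winner" mainly supports more phage.

```python
def derivatives(H, P, r, K, a, b, m):
    """Rates of change for host density H and phage density P.

    Assumed model (illustrative only):
      dH/dt = r*H*(1 - H/K) - a*H*P    logistic growth minus lysis
      dP/dt = b*a*H*P - m*P            burst of b phages per lysis, decay m
    """
    dH = r * H * (1 - H / K) - a * H * P
    dP = b * a * H * P - m * P
    return dH, dP


def equilibrium(r, K, a, b, m):
    """Coexistence steady state of the assumed model."""
    H_star = m / (b * a)                   # does not depend on growth rate r
    P_star = r * (1 - H_star / K) / a      # increases with growth rate r
    return H_star, P_star


# Hypothetical parameter values; only r differs between the two hosts.
fast = equilibrium(r=1.0, K=1e6, a=1e-7, b=50, m=0.5)
slow = equilibrium(r=0.2, K=1e6, a=1e-7, b=50, m=0.5)
print("fast grower: host %.3g, phage %.3g" % fast)
print("slow grower: host %.3g, phage %.3g" % slow)
```

In this toy model both hosts settle at the same density, while the fast grower carries roughly five times more phage, which is the sense in which phages are said to "kill the winner".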
Bacterial lysis releases both dissolved organic matter (DOM) and particulate organic matter (POM) into the environment. The released matter, especially the DOM, is re-utilized by heterotrophic microorganisms instead of being transferred to upper trophic levels via grazing (see Fig. 12). This phenomenon is termed the “viral shunt”. Part of the released POM will likely aggregate, because the cell debris resulting from lysis is sticky, and will be exported to the deep ocean. Therefore, it has been hypothesized that phages participate in the biological pump by stimulating particle formation. Recently, it has been shown that Synechococcus and their phages are strongly associated with carbon export at 150 m in the subtropical, nutrient-depleted, oligotrophic ocean.

Diel Rhythms of Phage Infections in the Marine Environment

Most organisms have a circadian rhythm, in which physiological processes are regulated in a ~24 h cycle. These diel oscillations are observed both in pure cultures in the laboratory and in the marine environment. In unicellular cyanobacteria, the circadian clock directs oscillations in gene transcription levels, with the majority of transcripts peaking at dawn or dusk. For example, in Prochlorococcus, the photosystem I and II genes and the Calvin cycle genes are maximally transcribed at dawn, most likely in preparation for harvesting light energy throughout the day. Because unicellular cyanobacteria are major primary producers in the oceans, their diel rhythms most likely influence the diel rhythms of the co-occurring heterotrophic bacteria and of their phages.

In laboratory settings, cyanophage infections are influenced by light at different steps in the infection cycle. Some cyanophages show light-dependent adsorption to their host cells, with a higher fraction of phages adsorbing in the light than in the dark. Phage replication and burst size are positively correlated with light intensity, indicating that cyanophages replicate using resources generated by the photosynthetic metabolism of their host cells.

In the marine environment, the influence of diel cycles on hosts and phages was studied in the North Pacific Subtropical Gyre, a habitat representative of oligotrophic oceans. During an eight-day experiment, metagenomics and quantitative metatranscriptomics were used to monitor the temporal abundances and transcriptional activities of the most abundant dsDNA viruses. A small fraction of the assembled viral scaffolds showed a diel periodicity in their transcript abundance, with a peak in the afternoon to early evening. Most likely other, less abundant viruses were also subject to diel rhythms, but the methods used were not sensitive enough to detect their transcripts. Amongst the diel transcripts, those for structural viral genes were highly abundant, indicating that virus production followed a diurnal lytic cycle. The majority of viral scaffolds showing diel transcriptional activity were attributed to cyanophages, most likely infecting Prochlorococcus. The replication and cell division of Prochlorococcus also followed a diel periodicity, both processes peaking from late afternoon to dusk. This indicates a connection between the diel cycle of Prochlorococcus and that of its phages.

Three possible scenarios can be envisioned to explain the peak in cyanophage transcripts in the afternoon. In the first, cyanophages infect their hosts throughout the 24-h cycle, but with different burst sizes. Because the host has more metabolic resources in the afternoon, those phages whose infection cycles overlap with the afternoon period have the highest burst sizes. In the second scenario, cyanophages released in the afternoon burst infect their hosts during the night, but do not advance in their infection cycle until the afternoon increase in host photosynthetic metabolism and DNA replication. In the third scenario, cyanophage adsorption is enhanced by light, leading to a higher infection rate during the day and thus to an afternoon burst peak.

Cyanophages were not the only viruses to exhibit diel variations in their transcript levels. Amongst the viral scaffolds with diel periodicity, one most likely represented a Pelagibacter phage. This indicates that phages of heterotrophic bacteria, and not only those of photosynthetic bacteria, can follow a diel cycle.
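A generic way to test a time series for the kind of diel periodicity described above is to fit a single 24-h cosine and read off its amplitude and peak time. The sketch below is an assumption for illustration: the text does not state which statistical method the study used, the function name is hypothetical, and the closed-form fit is only valid for evenly spaced samples that cover a whole number of days.

```python
import math


def fit_diel_cosine(times_h, values, period_h=24.0):
    """Fit y = mean + amplitude * cos(2*pi*(t - peak)/period).

    Assumes evenly spaced samples spanning whole periods, so the
    sine and cosine terms are orthogonal and can be projected out.
    """
    n = len(values)
    omega = 2 * math.pi / period_h
    mean = sum(values) / n
    a = 2.0 / n * sum(v * math.cos(omega * t) for t, v in zip(times_h, values))
    b = 2.0 / n * sum(v * math.sin(omega * t) for t, v in zip(times_h, values))
    amplitude = math.hypot(a, b)
    peak_h = (math.atan2(b, a) / omega) % period_h
    return mean, amplitude, peak_h


# Synthetic transcript abundance: eight days, one sample every 4 h,
# constructed to peak at 16:00 (made-up values, not data from the study).
times = [4 * i for i in range(48)]
values = [10 + 3 * math.cos(2 * math.pi * (t - 16) / 24) for t in times]
mean, amplitude, peak_h = fit_diel_cosine(times, values)
print(round(mean, 3), round(amplitude, 3), round(peak_h, 3))
```

For real, noisy data the amplitude would have to be compared against a null expectation (for example by permuting time points) before a scaffold is called diel.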

Auxiliary Metabolic Genes in Marine Phages

With the increased availability of genome sequencing technologies, an increasing number of phage genomes have been sequenced. As expected, most of the genes found in phage genomes are essential for the production of new progeny, for example those involved in nucleic acid replication, capsid and tail assembly, and cell lysis. Surprisingly, however, it was discovered in the early 2000s that cyanophages carry host genes involved in photosynthesis. Since then, more than 200 different genes have been found that are not directly involved in phage replication, but rather are part of host metabolic pathways. These genes have been named auxiliary metabolic genes (AMGs). Not included in AMGs are genes involved in DNA packaging, nucleotide transport and metabolism, protein metabolism and assembly, and DNA synthesis, replication, and repair. AMGs are divided into two classes: (1) class I includes AMGs present in the KEGG metabolic pathways and (2) class II includes proteins assigned only a general metabolic function or marginal metabolism (e.g., transport function) and thus not present in the KEGG metabolic pathways.

The first photosynthetic AMGs discovered were psbA and psbD, encoding two major components of photosystem II (PSII). They were found in the genome of the myocyanophage S-PM2, which infects Synechococcus. Since then, several other photosynthetic AMGs have been detected. Environmental surveys show that they are common in marine picocyanophages and that they were likely acquired by horizontal gene transfer from infected hosts. Phage PSII genes are actively transcribed in Prochlorococcus infected with the podovirus P-SSP7. They are likely functional in photosynthesis and increase phage fitness. Recently, genes for both photosystem I and II were discovered in the myovirus P-TIM68, which infects Prochlorococcus. These genes are expressed during phage infection and the encoded proteins are incorporated into host membranes. Not only is the host photosynthetic capacity maintained during phage infection, but the cyclic electron flow around photosystem I is enhanced. This suggests an increased production of ATP in infected cells, probably required by the phage replication machinery.

With the increase in the number of cultivated and environmental phage genomes, more and more AMGs have been found. Many of them are potentially involved in different biogeochemical cycles in the oceans. For example, dsrC, a gene belonging to the dissimilatory sulfite reductase operon, is usually found in sulfate/sulfite-reducing bacteria in anoxic environments and in sulfur-oxidizing bacteria in both oxic and anoxic environments. Recently, a dsrC gene was found on several viral contigs from the Global Ocean Virome (GOV). Another gene from the sulfur cycle, soxYZ, was also found on phage contigs from the GOV dataset. All contigs with dsrC and soxYZ genes belonged to the T4-like viral cluster, one of the most abundant and ubiquitous clusters in the oceans. These findings suggest that marine phages manipulate sulfur cycling. Another example of an AMG is the P-II gene, which is involved in the regulation of nitrogen metabolism in bacteria. It was found on several phage contigs in the GOV dataset, suggesting phage involvement in marine nitrogen cycling.
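The class I and class II distinction for AMGs given above amounts to a small decision rule. The sketch below is a hypothetical illustration: the function name, the category strings and the boolean KEGG flag are assumptions, and a real annotation pipeline would obtain both inputs from database searches rather than from hand-written labels.

```python
# Functional categories that the text excludes from the AMG definition.
EXCLUDED_FUNCTIONS = {
    "DNA packaging",
    "nucleotide transport and metabolism",
    "protein metabolism and assembly",
    "DNA synthesis",
    "DNA replication",
    "DNA repair",
}


def classify_amg(functional_category, in_kegg_metabolic_pathway):
    """Return 'class I', 'class II', or None if the gene is not counted as an AMG.

    functional_category: label assigned by some upstream annotation (assumed).
    in_kegg_metabolic_pathway: True if the gene maps to a KEGG metabolic pathway.
    """
    if functional_category in EXCLUDED_FUNCTIONS:
        return None
    return "class I" if in_kegg_metabolic_pathway else "class II"


# Illustrative inputs only; the labels and flags are supplied by hand here.
print(classify_amg("photosynthesis", True))    # class I
print(classify_amg("transport", False))        # class II
print(classify_amg("DNA repair", True))        # None
```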

Transfer RNAs (tRNAs) in Phages and Their Role in Phage-Host Interactions

Transfer RNAs are present in the genomes of many phages, including cyanophages. These RNAs recognize specific codons in the mRNA and deliver the corresponding amino acid for incorporation into the growing protein chain. Due to the degeneracy of the genetic code, most of the 20 amino acids are encoded by multiple codons. In a particular bacterial host, depending on its genome G + C content, certain codons can be more frequent than others. As a result, the cellular tRNA pool can be biased toward the most frequent codons. Generally, phages infect hosts with similar codon usage, because otherwise protein synthesis would be inefficient. To bypass differences in codon usage, phages can encode tRNAs in their own genomes. Some cyanophages use this strategy to broaden their host range. For example, cyanomyophages with a higher number of tRNAs in their genomes are able to infect both low G + C content hosts (Prochlorococcus) and high G + C content hosts (Synechococcus). On the other hand, cyanomyophages with no tRNAs or a low number of tRNAs in their genomes are able to infect only Prochlorococcus. The number of tRNAs in phage genomes is quite variable (see Table 1). The cultivated phage with the highest number of tRNA genes is the cyanophage S-CBWM1. It has 34 tRNA genes in its genome and potentially uses these tRNAs to increase the translation efficiency of several unique phage genes that have no homologs in sequence databases.
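The two quantities this argument rests on, genome G + C content and codon usage, are straightforward to compute from sequence. The following is a minimal sketch with hypothetical function names and made-up toy sequences; it only counts codons in a single in-frame coding sequence and ignores ambiguous bases.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_usage(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Toy sequences, for illustration only.
print(gc_content("ATGC"))            # 0.5
print(codon_usage("GCTGCTGCC"))      # GCT used twice, GCC once (both alanine)
```

Comparing such codon frequency tables between a phage and a candidate host is one simple way to see where phage-encoded tRNAs might compensate for codons that are rare in the host.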

Conclusions

Phages are most likely present in all marine environments. They regulate the metabolism, structure, and composition of bacterial communities, and thereby affect marine biogeochemical cycles on a global scale. Significant advances in sequencing technologies have allowed the characterization of marine phages on an unprecedented scale. Even so, much remains to be learned. For example, the host is unknown for most environmental phage contigs. In contrast to the sunlit ocean, the phage diversity of the deep ocean and of sediments is poorly known. Similarly, knowledge of marine ssDNA and RNA phages is sparse compared with that of dsDNA phages. Phage-host interactions have been studied for only a handful of phage-host systems. Also, little is known about lysogeny, how it shapes microbial communities, and which factors are responsible for the lytic/lysogenic switch. To answer all these questions, a combination of cultivation-dependent and cultivation-independent methods is necessary.

References

Brum, J.R., Ignacio-Espinoza, J.C., Roux, S., et al., 2015. Patterns and ecological drivers of ocean viral communities. Science 348 (6237), 1261498. doi:10.1126/science.1261498.
Marston, M.F., Pierciey, F.J., Shepard, A., et al., 2012. Rapid diversification of coevolving marine Synechococcus and a virus. Proceedings of the National Academy of Sciences of the United States of America 109 (12), 4544–4549. doi:10.1073/pnas.1120310109.
Middelboe, M., Holmfeldt, K., Riemann, L., Nybroe, O., Haaber, J., 2009. Bacteriophages drive strain diversification in a marine Flavobacterium: Implications for phage resistance and physiological properties. Environmental Microbiology 11 (8), 1971–1982. doi:10.1111/j.1462-2920.2009.01920.x.
Roux, S., Brum, J.R., Dutilh, B.E., et al., 2016. Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses. Nature 537 (7622), 689–693. doi:10.1038/nature19366.

Further Reading

Allers, E., Moraru, C., Duhaime, M.B., et al., 2013. Single-cell and population level viral infection dynamics revealed by phageFISH, a method to visualize intracellular and free viruses. Environmental Microbiology 15 (8), 2306–2318. doi:10.1111/1462-2920.12100.
Aylward, F.O., Boeuf, D., Mende, D.R., et al., 2017. Diel cycling and long-term persistence of viruses in the ocean’s euphotic zone. Proceedings of the National Academy of Sciences of the United States of America 114 (43), 11446–11451. doi:10.1073/pnas.1714821114.
Bischoff, V., Bunk, B., Meier-Kolthoff, J.P., et al., 2019. Cobaviruses – A new globally distributed phage group infecting Rhodobacteraceae in marine ecosystems. The ISME Journal 13, 1404–1421. doi:10.1038/s41396-019-0362-7.
Coutinho, F.H., Silveira, C.B., Gregoracci, G.B., et al., 2017. Marine viruses discovered via metagenomics shed light on viral strategies throughout the oceans. Nature Communications 8, 15955. doi:10.1038/ncomms15955.
Duhaime, M.B., Solonenko, N., Roux, S., et al., 2017. Comparative omics and trait analyses of marine Pseudoalteromonas phages advance the phage OTU concept. Frontiers in Microbiology 8, 1241. doi:10.3389/fmicb.2017.01241.
Enav, H., Kirzner, S., Lindell, D., Mandel-Gutfreund, Y., Béjà, O., 2018. Adaptation to sub-optimal hosts is a driver of viral diversification in the ocean. Nature Communications 9 (1), 4698. doi:10.1038/s41467-018-07164-3.
Gregory, A.C., Solonenko, S.A., Ignacio-Espinoza, J.C., et al., 2016. Genomic differentiation among wild cyanophages despite widespread horizontal gene transfer. BMC Genomics 17 (1), 930. doi:10.1186/s12864-016-3286-x.
Gregory, A.C., Zayed, A.A., Conceição-Neto, N., et al., 2019. Marine DNA viral macro- and microdiversity from pole to pole. Cell 177, 1109–1123. doi:10.1016/j.cell.2019.03.040.
Holmfeldt, K., Solonenko, N., Howard-Varona, C., et al., 2016. Large-scale maps of variable infection efficiencies in aquatic Bacteroidetes phage-host model systems. Environmental Microbiology 18 (11), 3949–3961. doi:10.1111/1462-2920.13392.
Holmfeldt, K., Solonenko, N., Shah, M., et al., 2013. Twelve previously unknown phage genera are ubiquitous in global oceans. Proceedings of the National Academy of Sciences of the United States of America 110 (31), 12798–12803. doi:10.1073/pnas.1305956110.
Hurwitz, B.L., Brum, J.R., Sullivan, M.B., 2015. Depth-stratified functional and taxonomic niche specialization in the ‘core’ and ‘flexible’ Pacific Ocean Virome. The ISME Journal 9 (2), 472–484. doi:10.1038/ismej.2014.143.
Marston, M.F., Martiny, J.B.H., 2016. Genomic diversification of marine cyanophages into stable ecotypes. Environmental Microbiology 18 (11), 4240–4253. doi:10.1111/1462-2920.13556.
Martinez-Hernandez, F., Fornas, O., Lluesma Gomez, M., et al., 2017. Single-virus genomics reveals hidden cosmopolitan and abundant viruses. Nature Communications 8, 15892. doi:10.1038/ncomms15892.
Mizuno, C.M., Ghai, R., Saghaï, A., López-García, P., Rodriguez-Valera, F., 2016. Genomes of abundant and widespread viruses from the deep ocean. mBio 7 (4). doi:10.1128/mBio.00805-16.
Thompson, L.R., Zeng, Q., Kelly, L., et al., 2011. Phage auxiliary metabolic genes and the redirection of cyanobacterial host carbon metabolism. Proceedings of the National Academy of Sciences of the United States of America 108 (39), E757–E764. doi:10.1073/pnas.1102164108.
Weitz, J.S., Wilhelm, S.W., 2012. Ocean viruses and their effects on microbial communities and biogeochemical cycles. F1000 Biology Reports 4, 17. doi:10.3410/B4-17.

Ecology of Phages in Extreme Environments

Tatiana A Demina, Molecular and Integrative Biosciences Research Program, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
Nina S Atanasova, Finnish Meteorological Institute, Helsinki, Finland and University of Helsinki, Helsinki, Finland

© 2021 Elsevier Ltd. All rights reserved.

Nomenclature

CRISPR Clustered regularly interspaced short palindromic repeats
Cryo-EM Cryogenic electron microscopy
ds Double-stranded
MCP Major capsid protein
NaCl Sodium chloride
ss Single-stranded
T number Triangulation number
T4-like Virus morphology resembling bacterial myovirus T4
T7-like Virus morphology resembling bacterial podovirus T7
UV Ultraviolet
VLP Virus-like particle
VPR Virus-to-prokaryote ratio

Glossary

Caudovirales Taxonomical order of viruses, which contains viruses with icosahedral heads, different types of tails, and linear dsDNA genomes.
Cryoconite hole Water-filled melt-hole on the surface of a glacier.
Cryogenic electron microscopy Electron microscopy technique where samples are rapidly frozen and then observed at cryogenic temperatures.
Cryopeg Water brine within permafrost.
Cryosphere The areas of the Earth where water is in solid state.
Deep-sea hydrothermal vent Geothermal vents at the seafloor of volcanically active areas.
Extremophile Organism that inhabits an extreme environment, where one or more physico-chemical parameters reach their extreme values.
Halophile Organism that requires high salt concentrations for optimal growth.
Hyperarid Extremely dry, with an aridity index below 0.05.
Metagenomics, metatranscriptomics, and metaproteomics Studies of pools of genomes, mRNA, and proteins, respectively, derived directly from environmental samples.
Myovirus Virus with an icosahedral head and a contractile tail.
Permafrost Ground that has remained frozen for at least two years.
Phage Virus infecting bacteria.
Podovirus Virus with an icosahedral head and a short non-contractile tail.
Polyextreme environment Environment with two or more physico-chemical parameters reaching their extreme values.
Prophage Phage genome integrated into the host cell DNA or existing as a plasmid in the host cell cytoplasm.
Siphovirus Virus with an icosahedral head and a long non-contractile tail.
Triangulation number Square of the distance between two adjacent five-fold vertices of an icosahedral viral capsid.
Virome Specific pool of nucleic acids that constitute a viral community within an ecosystem.
Virophage Small dsDNA virus whose replication is dependent on the co-infecting giant virus.
Virus-like particle In microscopy: particle that morphologically resembles a virus.
Vitrification Rapid cooling process that suppresses the formation of ice crystals.

Encyclopedia of Virology, 4th Edition, Volume 4. doi:10.1016/B978-0-12-809633-8.20964-3

Introduction

The availability of liquid water within a cell can be considered one of the prerequisites for life, enabling the proper functioning of essential cellular processes, such as enzymatic reactions and nucleic acid metabolism. These requirements are easily met in an environment with a temperature of 20–30 °C, a pressure of 1 atm, and available water. However, in the Earth’s extreme environments, physico-chemical parameters like temperature, pressure, salinity, ultraviolet (UV) radiation, and pH reach extreme values. The word extreme is derived from the Latin extrēmus, which means uttermost, utmost, or last. Indeed, the conditions for life in extreme environments can probably best be described as “at the limit”. The majority of organisms cannot tolerate such conditions, so extreme environments may have low species diversity, and some large taxonomic groups of organisms can be totally absent. Nonetheless, several extreme environments on the Earth, such as polar sea ice, hypersaline environments, volcanic springs, and deep-sea hydrothermal vents, are actually flourishing with life. These environments are inhabited by extremophiles, organisms that not only tolerate, but also require the surrounding extreme conditions for survival. Most extremophiles are microorganisms. Archaea often dominate in various extreme environments, but bacteria and eukaryotes can also be abundant. Viruses of extremophiles outnumber their archaeal, bacterial, or eukaryotic host cells ten- to a hundred-fold and are often the only predators in various ecosystems of extreme environments. Due to the low species diversity of their host organisms, viruses of extremophiles are important evolutionary players in shaping the dynamics and structure of microbial communities.

Even today, extreme environments remain largely unexplored territories, and the overall number of extremophilic phage isolates is low compared to non-extremophilic ones. Genome sequences of the described phages of extremophiles, as well as metagenomes obtained from extreme environments, typically yield a low number of hits to common databases. Detailed molecular studies of virus-host interactions in extreme environments are scarce, and much more research is needed to better understand these unique phage-host systems. Moreover, some extreme environments, such as volcanic hot springs, are suggested to resemble the conditions on the early Earth and could thus provide valuable information about the origin and evolution of life. In addition, studies on extremophiles and their viruses are relevant to astrobiology, i.e., to the question of how microorganisms could survive in extra-terrestrial conditions. Research on viruses and their extremophilic hosts can also add insights into the biodiversity of our planet, providing knowledge needed for predicting the effects of climate change.

This article concentrates on extremophilic bacterial viruses, phages, but archaeal viruses are also mentioned when relevant. The available data on extremophilic phages reviewed here have been obtained by different methods, each of which has its own limitations. For example, by electron microscopy, virus-like particles (VLPs), cellular membrane vesicles, or staining artefacts may be visualized together with bona fide viruses. For simplicity, the different environments reviewed here have been categorized as, e.g., thermal, cold, or hypersaline. In fact, several of the introduced environments are polyextreme, meaning that they are subjected to two or more extreme parameters. As in all other studied environments, phages with icosahedral heads, different types of tails, and linear double-stranded (ds) DNA genomes dominate in most extreme environments, while still very little is known about extremophilic enveloped viruses or viruses with RNA genomes. We expect that future research on extremophilic phages will reveal many novel structures and functions, helping us to understand the life around us.

Phages in Hypersaline Environments

Hypersaline environments are characterized by high salt concentrations. In such environments, the dominant salt is usually NaCl and the overall proportion of ions corresponds to that in seawater, but some hypersaline waters, e.g., the Dead Sea, may have ions in other proportions. Examples of hypersaline environments are solar salterns, brine springs, hypersaline lakes and ponds, deep-sea brine pools, halite deposits, salt mines, saline soils, and salty food products. Halophilic, i.e., “salt-loving”, organisms flourish in hypersaline environments. Although some marine organisms may be slightly halophilic, by halophiles we refer here to organisms that are found at a salinity higher than that of seawater. Halophiles may be slight, moderate, or extreme, requiring 1%–3% (0.2–0.5 M), 3%–15% (0.5–2.5 M), or 15%–30% (2.5–5.2 M) NaCl, respectively, for optimal growth. Halophiles are known within all three domains of life, but archaea and bacteria are especially abundant in hypersaline environments. High numbers of VLPs (up to 10⁹ VLPs ml⁻¹) of various morphologies, including spindle-shaped, spherical, tailed, and filamentous, as well as some exceptional shapes, such as stars and hooks, have been reported for hypersaline environments. For example, high VLP densities (10⁸–10⁹ VLPs ml⁻¹) and a high abundance of tailed icosahedral VLPs, typical for the phages belonging to the order Caudovirales, have been found in the alkaline (pH ~10), hypersaline Mono Lake, USA (Fig. 1). In those samples, sipho- and myovirus-like particles were the most abundant among tailed VLPs, but podoviruses were also present. However, the VLPs observed in hypersaline environments, especially those of unusual forms, may represent mostly archaeal viruses, given that archaea usually dominate at high salt concentrations.
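The percentage and molar ranges quoted above for slight, moderate, and extreme halophiles can be related by a short calculation. The sketch below rests on assumptions not stated in the text: percentages are read as weight per volume (grams of NaCl per 100 ml), the ranges are treated as half-open so that the shared boundary values fall in one class only, and the function names are hypothetical. With a molar mass of 58.44 g/mol, the conversion agrees with the molar values in the text to within about 0.1 M.

```python
NACL_MOLAR_MASS = 58.44  # g/mol


def percent_to_molar(pct_w_v):
    """Convert % (w/v) NaCl, i.e. g per 100 ml, to mol/l."""
    grams_per_litre = pct_w_v * 10.0
    return grams_per_litre / NACL_MOLAR_MASS


def halophile_category(optimal_nacl_pct):
    """Classify by the NaCl optimum using the ranges quoted in the text."""
    if 1 <= optimal_nacl_pct < 3:
        return "slight"
    if 3 <= optimal_nacl_pct < 15:
        return "moderate"
    if 15 <= optimal_nacl_pct <= 30:
        return "extreme"
    return None  # outside the ranges given in the text


for pct in (1, 3, 15, 30):
    print(pct, "% NaCl is about", round(percent_to_molar(pct), 2), "M")
print(halophile_category(20))
```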
Most of the phages described so far display tailed icosahedral particles, but in hypersaline environments, even tailed icosahedral particles may represent archaeal tailed viruses.

Metagenomic, metatranscriptomic, and metaproteomic studies have revealed that viral communities in hypersaline environments are dynamic, specific, and diverse. Metagenomics of sediment samples from the Great Salt Lake (15% total salinity, pH 7.4), USA, revealed that bacteria dominated the prokaryotic community, with Proteobacteria, Firmicutes, Bacteroidetes, Actinobacteria, Chloroflexi, Cyanobacteria, and Planctomycetes as the dominant phyla. In the viral fraction of the metagenome, sequences related to members of the order Caudovirales dominated, with lambda-like, T4-like, and Cellulophaga-associated phages being particularly abundant. Sequences of tailed phages have also been found in other hypersaline environments, e.g., deep-sea anoxic brines of the Red Sea or the cold hypersaline Deep Lake in Antarctica. The presence of prophage sequences, viral functional genes, and virus defense systems, such as clustered regularly interspaced short palindromic repeats (CRISPRs), in bacterial genomes retrieved from high salinity environments suggests active virus-mediated gene transfer and dynamic virus-host interactions.

At least 31 halophilic phages have been isolated to date (Table 1). These phages infect hosts of various genera and have been isolated from distant hypersaline environments all over the world or induced from halophilic bacterial strains. The first eight viruses infecting Salinibacter ruber, an extremely halophilic bacterium commonly found in hypersaline environments, were isolated only recently, although previous metavirome and metatranscriptome studies of the Santa Pola solar saltern, Spain, had already suggested the presence of active phages infecting S. ruber.
Fig. 1 Transmission electron micrographs of viruses from depths of 2 m (A–C), 16.5 m (D–F), and 35 m (G–I) in Mono Lake. Reproduced from Brum, J.R., Steward, G.F., 2010. Morphological characterization of viruses in the stratified water column of alkaline, hypersaline Mono Lake. Microbial Ecology 60 (3), 636–643. doi:10.1007/s00248-010-9688-4.

A few phages have been isolated from soda lakes, which are characterized by high sodium carbonate concentrations and high pH values. Both virulent and temperate halophilic phage isolates are known, although for many of them, life cycles have not been studied in sufficient detail. Most of the isolated halophages display tailed icosahedral particles, typical for phages classified into the order Caudovirales. As for other morphotypes, one tailless icosahedral virus, SSIP-1, which infects Salisaeta sp., was isolated from Sedom Ponds, Israel, and one spherical virus, JMT-1, which infects Chromohalobacter, was isolated from Yuncheng Saline Lake, China. For many isolates, there are no available data on genome organization; the others have dsDNA genomes of variable length in either linear or circular conformation (Table 1). High-resolution image reconstructions based on cryogenic electron microscopy (cryo-EM) revealed that the icosahedral capsid of CW02, a podovirus infecting Salinivibrio, is decorated with turret-like structures at the icosahedral vertices and that its major capsid protein (MCP) has a conserved HK97-fold. The so-called canonical HK97-fold, which was originally observed in phage HK97, is found in tailed phages, eukaryotic herpesviruses, and HSTV-1, an archaeal tailed virus. The conservation of certain protein folds and overall virion architectural principles across various viral groups has led to the idea that all viruses may in principle be classified into a limited number of structure-based lineages, including the HK97-like lineage. The halophilic tailless icosahedral phage SSIP-1 (Fig. 2) displays structural features typical for viruses grouped into the PRD1-adenovirus-like structural lineage. This lineage comprises tailless icosahedral dsDNA viruses with an inner membrane (except for the membraneless adenoviruses), including both prokaryotic and eukaryotic viruses, as well as virophages. Viruses belonging to the PRD1-adenovirus-like structural lineage are either extremophilic or

Ecology of Phages in Extreme Environments

Table 1 Examples of phages isolated from high salinity environments

Virus name | Host | Isolation site | Particle morphology | Genome (type and size)
F9-11 | Halomonas halophila | Induced from H. halophila F9-11, Alicante, Spain | Siphovirus | Nd(a)
G3 | Pseudomonas sp. G3 | Salt pond, Canada | Myovirus | Nd
HM5 | Isolate 131 from La Mala saltern | La Mala saltern, Granada, Spain | Siphovirus | Nd
HM15 | Isolate 121 from La Mala saltern | La Mala saltern, Granada, Spain | Siphovirus | Nd
phi7116 | Tetragenococcus halophilus | Fermenting soy sauce | Myovirus | Nd
phiD-86 | Tetragenococcus halophilus | Fermenting soy sauce | Siphovirus | Nd
F5-4 | Halomonas halophila | Induced from H. halophila, Alicante, Spain | Siphovirus | Nd
F12-9 | Halomonas halophila | Induced from H. halophila, Alicante, Spain | Siphovirus | Nd
UTAK | Vibrio sp. B1 | Solar saltern, Alicante, Spain | Myovirus | dsDNA, ~80 kbp
phiMono1 | Idiomarina sp. | Mono Lake, USA | Nd | Nd
phigspB | Halomonas salina | Saline soil, Great Salt Plains, USA | Myovirus | dsDNA, ~49 kbp
phigspC | Halomonas salina | Saline soil, Great Salt Plains, USA | Myovirus | dsDNA, ~340 kbp
SCTP-1 | Salicola sp. PV3 | Solar saltern, Margherita di Savoia, Italy | Siphovirus | Nd
SCTP-2 | Salicola sp. PV4 | Solar saltern, Margherita di Savoia, Italy | Myovirus | Nd
Ø1M3-16 | Vibrio metschnikovii | Lake Magadi, Kenya | Siphovirus | dsDNA, <30 kbp
SCTP-3 | Salicola sp. s3-1 | Solar saltern, Margherita di Savoia, Italy | Myovirus | Nd
CW02 | Salinivibrio SA50 | Great Salt Lake, USA | Podovirus | Linear dsDNA, 49,390 bp
SSIP-1 | Salisaeta sp. SP9-1 | Salt water, Sedom Ponds, Israel | Tailless icosahedral with inner lipid membrane | Circular dsDNA, ~43,788 bp
QHHSV-1 | Halomonas ventosae | Salt mine, Qiaohou, China | Siphovirus | dsDNA, nd
Shbh1 | Bacillus sp. MGK1/ERV9 | Lake Shala, Ethiopia | Myovirus | Linear dsDNA, 138,081 bp
Shpa | Paracoccus sp. HS3 | Lake Shala, Ethiopia | Siphovirus | Linear dsDNA, 38,261 bp
Mgbh1 | Bacillus sp. MGK1 | Lake Magadi, Kenya | Siphovirus | Linear dsDNA, 58,951 bp
JMT-1 | Chromohalobacter sp. LY7-3 | Yuncheng Saline Lake, China | Spherical | Linear dsDNA, ~23 kbp
M8CC-19 | Salinibacter ruber M8 | Bras del Port saltern, Alicante, Spain | Siphovirus | Linear dsDNA, 53,812 bp
M31CC-1 | Salinibacter ruber M31 | Bras del Port saltern, Alicante, Spain | Siphovirus | Linear dsDNA, 53,808 bp
M31CR41-2 | Salinibacter ruber M31 | Bras del Port saltern, Alicante, Spain | Siphovirus | Linear dsDNA, 51,326 bp
M31CR41-3 | Salinibacter ruber M31 | Bras del Port saltern, Alicante, Spain | Siphovirus | Linear dsDNA, 50,128 bp
M8CR30-2 | Salinibacter ruber M8 | Bras del Port saltern, Alicante, Spain | Siphovirus | Linear dsDNA, 35,009 bp
M8CR30-4 | Salinibacter ruber M8 | Bras del Port saltern, Alicante, Spain | Siphovirus | Linear dsDNA, 35,434 bp
M8CRM-1 | Salinibacter ruber M8 | Bras del Port saltern, Alicante, Spain | Siphovirus | Linear dsDNA, 53,197 bp
M1EM-1 | Salinibacter ruber M1 | Campos saltern, Mallorca, Spain | Siphovirus | Linear dsDNA, 35,883 bp

(a) Nd, not determined.

moderate, e.g., phages PRD1, PM2, P23-77 (see below), archaeal viruses STIV, SH1, HHIV-2, HCIV-1, algal virus PBCV-1, human adenovirus, and the virophage Sputnik.
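The dominance of tailed morphotypes among the halophilic isolates can be checked by tallying the particle morphology column of Table 1. The sketch below is only an illustration: the grouping of isolates by morphotype is transcribed from Table 1 as reconstructed from the flattened layout, and the names `MORPHOTYPES` and `TAILED` are hypothetical helpers, not part of any published dataset.

```python
# Isolates of Table 1 grouped by reported particle morphology.
# Assumption: row assignments follow the column order of the printed table.
MORPHOTYPES = {
    "Siphovirus": [
        "F9-11", "HM5", "HM15", "phiD-86", "F5-4", "F12-9", "SCTP-1",
        "Ø1M3-16", "QHHSV-1", "Shpa", "Mgbh1", "M8CC-19", "M31CC-1",
        "M31CR41-2", "M31CR41-3", "M8CR30-2", "M8CR30-4", "M8CRM-1", "M1EM-1",
    ],
    "Myovirus": [
        "G3", "phi7116", "UTAK", "phigspB", "phigspC", "SCTP-2", "SCTP-3", "Shbh1",
    ],
    "Podovirus": ["CW02"],
    "Tailless icosahedral": ["SSIP-1"],
    "Spherical": ["JMT-1"],
    "Nd": ["phiMono1"],  # morphology not determined
}

# Morphotypes that correspond to tailed (Caudovirales-like) particles.
TAILED = {"Siphovirus", "Myovirus", "Podovirus"}

counts = {morph: len(names) for morph, names in MORPHOTYPES.items()}
total = sum(counts.values())
tailed = sum(n for morph, n in counts.items() if morph in TAILED)

for morph, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{morph:22s} {n:2d}")
print(f"{tailed} of {total} isolates are tailed")  # prints: 28 of 31 isolates are tailed
```

Because Table 1 lists examples rather than a complete census, the tally describes the table and should not be read as an estimate of morphotype frequencies in hypersaline habitats.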

Phages in Thermal Environments

Thermophilic, i.e., “heat-loving”, organisms inhabit hot environments, where the temperature ranges from ~40 to ~80 °C, while hyperthermophiles grow optimally at temperatures above 80 °C. Such hot environments include terrestrial volcanic sites, non-polar deserts, hydrothermal springs, submarine hydrothermal systems, and the oceanic basement subsurface. In addition, thermophilic microorganisms can be found in various artificial environments, such as dairy production plants, water heaters, and compost piles. Hyperthermophilic archaea tend to dominate in the hottest environments (above 70 °C), while thermophilic bacteria are often dominant at milder temperatures (~40–70 °C). Viruses are abundant and active players in microbial communities inhabiting thermal terrestrial and marine sites. Moreover, in thermal environments where the temperature exceeds the upper limit for eukaryotic life (~60 °C), prokaryotic viruses are the only predators, controlling host abundance and affecting organic and inorganic nutrient cycling.

Terrestrial Hot Springs

Terrestrial thermal springs are sites where geothermally heated groundwater is released at the land surface. These sites are characterized by temperatures from ~35 °C up to the boiling point, a pH range of 1–9, and varied ionic composition. Thermal springs are found all over the planet, but some regions, such as Yellowstone National Park in the USA, are particularly rich in hot springs and geysers.


Quite different VLP densities have been reported for hot springs, ranging roughly from 10⁴ to 10⁷ VLPs ml⁻¹. Virus-to-prokaryote ratios (VPRs) detected in hot springs are typically lower than those in moderate environments. Electron microscopic examinations of VLPs in Yellowstone hot springs revealed a variety of morphotypes, including rod-shaped, lemon-shaped, and filamentous particles typical for viruses of hyperthermophilic archaea, novel morphotypes (zipper-like, pleomorphic, and other complex particles), and tailed particles. The latter were observed only in neutral or mildly alkaline springs and may represent bacteriophages or archaeal viruses. A complete genome sequence of the OS3173 phage, which presumably infects Aquificales, was retrieved from a metavirome from the alkaline Octopus hot spring (pH ~8.5), Yellowstone. Metagenomic studies of acidic springs in Iceland, Italy, and Yellowstone National Park (76–90 °C, pH 1.8–5.5) showed an overall predominance of archaeal viruses over bacteriophages. However, the fraction of phages might have been underestimated because of a bias in reference databases, which contain significantly more sequences of hyperthermophilic archaeal viruses than of thermophilic bacteriophages. Sequences related to members of the order Caudovirales were abundant in samples obtained from a hot spring in Grensdalur, Iceland (85 °C, pH 5), and a partial genome of a presumed Hydrogenobaculum phage, HP1, was reconstructed from the same sample.

Cyanophages have been reported to dominate in some hot springs. Microscopy analysis of Brandvlei hot spring (60 °C, pH 5.7), South Africa, showed the presence of both archaea-specific VLPs and tailed particles. Lambda-like siphoviruses dominated, while podoviruses, as well as regular-sized and jumbo myoviruses, were also present. Metaviromic analysis demonstrated the prevalence of Caudovirales-related sequences and revealed a large fraction of genes typical for cyanophages. Partial genomes of two phages, cyanophage BHS3 and the putative Gemmata phage BHS4, were also obtained from the Brandvlei hot spring metavirome. The predominance of tailed VLPs was also shown in cyanobacteria-dominated phototrophic mats in Porcelana hot spring (46–70 °C, pH ~7), Chile (Fig. 3). Metagenomic analysis of the same samples confirmed the dominance of phages belonging to the order Caudovirales and allowed a full genome reconstruction of a T7-like thermophilic cyanophage (podovirus), TC-CHP58, which presumably infects Mastigocladus sp. However, no thermophilic cyanophages have yet been isolated.

The most numerous thermophage isolates have been obtained on Thermus strains (Table 2). In a large sampling of alkaline hot springs in Iceland, New Zealand, Russia (Kamchatka), and the USA, arranged by the Promega corporation, 115 phages were isolated on seven Thermus strains. About 45% of the obtained isolates were tailed phages (myo- and siphoviruses). In addition, tailless icosahedral and filamentous phages were present. In this sampling, two Thermus thermophilus siphoviruses, P23-45 and P74-26, with exceptionally long tails (~800 nm) were isolated from Kamchatka hot springs, Russia. Later, TSP4, a similar Thermus siphovirus with a 785-nm long tail, was isolated from Tengchong hot spring, China, suggesting a worldwide distribution of this unusual siphovirus type. In addition, another Thermus thermophilus siphovirus, G20c, isolated from Kamchatka hot springs, shows high sequence similarity to P23-45 and P74-26.

Thermus phages P23-77 and IN93 display similarities to the tailless icosahedral inner-membrane-containing dsDNA viruses SH1, PH1, HHIV-2, HCIV-1, and SNJ1, which infect halophilic archaea. Consequently, these phages and archaeal viruses have been classified into one viral family, Sphaerolipoviridae. In addition, two other Thermus phages, P23-72 and P23-65H, have been proposed to belong to the same family. Cryo-EM-based image reconstructions (Fig. 4) revealed that the P23-77 virion consists of an icosahedral protein shell and an internal lipid membrane, which encloses the dsDNA genome. At the five-fold symmetry positions, the P23-77 virion is decorated with spike complexes, which most probably serve for host recognition. The capsid lattice arrangement (pseudo-hexameric with T = 28 symmetry) seems to be conserved within the family Sphaerolipoviridae, while the spike complexes display various forms. The overall virion structural organization, as well as the MCP fold, places P23-77, together with other sphaerolipoviruses, within the PRD1-adenovirus structural lineage.
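The triangulation numbers quoted for these capsids (T = 28 for P23-77 and T = 49 for SSIP-1 in Fig. 2) follow from the lattice relationship T = h² + hk + k² given in the Fig. 2 caption. The following minimal Python sketch evaluates that relationship and lists the lattice points that produce a given T. The capsomer count 10T + 2 is the standard geometric result for an ideal icosahedral lattice and is an assumption added here, not a value reported in the text; all function names are illustrative.

```python
from __future__ import annotations

from math import isqrt


def triangulation_number(h: int, k: int) -> int:
    """Return T = h^2 + hk + k^2 for the lattice point (h, k)."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def lattice_points(t: int) -> list[tuple[int, int]]:
    """List all (h, k) with h >= k >= 0 that give triangulation number t."""
    return [
        (h, k)
        for h in range(1, isqrt(t) + 1)
        for k in range(0, h + 1)
        if triangulation_number(h, k) == t
    ]


def capsomer_count(t: int) -> int:
    """Capsomers in an ideal lattice: 12 vertex positions plus 10(T - 1) hexameric positions."""
    return 10 * t + 2


for name, t in [("P23-77", 28), ("SSIP-1", 49)]:
    print(name, "T =", t, "lattice points:", lattice_points(t), "capsomers:", capsomer_count(t))
```

For T = 28 the only solution is (h, k) = (4, 2). For T = 49 both (7, 0) and (5, 3) satisfy the relationship, so the triangulation number alone does not fix the lattice; the Fig. 2 caption specifies h = 7, k = 0 for SSIP-1.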

Fig. 2 Icosahedral reconstruction of the SSIP-1 virion. (A) Electron micrograph of SSIP-1 virions vitrified in 9% (wt/vol) salt water (SW) buffer (see composition in Table S2 in the original publication). Three spikes are indicated with black arrowheads. Scale bar, 50 nm. (B) Central slice through an icosahedral reconstruction. Inset shows a radially averaged density profile. DNA (D), membrane (M), capsid (C), and spikes (S) are indicated. Twofold, threefold, and fivefold axes of icosahedral symmetry are indicated by an ellipse, a triangle, and a pentagon, respectively. Three concentric layers of DNA are indicated with asterisks. The lipid bilayer is interrupted by transmembrane densities at the threefold axes of symmetry (triangle). (C) Radially colored isosurface representation of the reconstruction with an arbitrary handedness, rendered at 2σ above the mean density. Color bar shows radial coloring. Inset shows a model lattice exemplifying the T = 49 icosahedral triangulation. The geometrical arrangement of the capsomers is given by the relationship T = h² + hk + k², where h and k define the lattice point. Here h = 7, k = 0. The two frontmost fivefold vertices are in red. (D–F) Six times magnified close-ups of the reconstruction taken along the (D) twofold, (E) threefold, and (F) fivefold axes of symmetry. Reproduced from Aalto, A.P., Bitto, D., Ravantti, J.J., et al., 2012. Snapshot of virus evolution in hypersaline environments from the characterization of a membrane-containing Salisaeta icosahedral phage 1. Proceedings of the National Academy of Sciences of the United States of America 109 (18), 7079–7084. doi:10.1073/pnas.1120174109.

Fig. 3 Transmission electron micrographs of VLPs obtained from the interstitial fluid of phototrophic microbial mats growing between 62 °C and 42 °C in Porcelana hot spring. Scale bar: 100 nm. (A–G) Caudovirus-like particles belonging to the Myoviridae, Podoviridae, and Siphoviridae families. (H–K) Filamentous and rod-shaped VLPs that could be associated with the Lipothrixviridae and Clavaviridae families. Reproduced from Guajardo-Leiva, S., Pedrós-Alió, C., Salgado, O., Pinto, F., Díez, B., 2018. Active crossfire between cyanobacteria and cyanophages in phototrophic mat communities within hot springs. Frontiers in Microbiology 9, 2039. doi:10.3389/fmicb.2018.02039.

Deep-Sea Hydrothermal Vents

The seafloor was considered a desert-like environment, where life could barely be found, until the discovery of hydrothermal vent systems in the late 1970s. Since then, most studies on life in the deep ocean have concentrated on underwater thermal systems as the hotspots of life, surrounded by dark, cold, and nutrient-deficient ocean waters. Deep-sea hydrothermal vents are unique extreme environments with huge gradients of temperature, pH, and dissolved chemical compounds, not to mention high hydrostatic pressure. Vent fluids may reach temperatures as high as 400 °C at the top of the chimney, but the temperature of the fluids drops sharply when they mix with the surrounding cold seawater (0–4 °C). Due to the mixing of extremely hot and cold waters, a steep gradient of pH and fluid chemistry is formed within meters of the fluid emission site. As a result, a variety of microenvironments is formed within hydrothermal vent systems, hosting very diverse forms of life. Chemosynthesizing archaea and bacteria serve as primary producers in these communities. Viruses are ubiquitous in both deep and shallow underwater vents, except that no VLPs have been found in the interior parts of chimney structures. Virus particles are typically more abundant in vent systems than in the surrounding seawater: 10⁴–10⁵ VLPs ml⁻¹ have been observed in deep-sea waters and 10⁵–10⁷ VLPs ml⁻¹ in hydrothermal vent systems. In deep-sea hydrothermal environments, the VPR is typically not higher than that in seawater (≤10). Sequence and microscopy analyses have revealed the presence of tailed bacteriophages in various hydrothermal vents and even hundreds of meters below the seabed surface. VLPs of various shapes were observed in the brackish-marine sediments of the Baltic Sea (Fig. 5), including tailless and tailed icosahedral particles and VLPs inside cells. The analysis of the g23 gene, which encodes the MCP of T4-like myoviruses, revealed diverse groups of T4-like phages, including cyanophages, in the Baltic Sea subsurface sediments.

Sampling of hydrothermal vents is highly difficult, not only because of technical and logistical constraints, but also because of the very dynamic nature of these systems. Cells and viruses found in plume fluid samples may in fact originate from the surrounding seawater, surface chimney sediments, or subsurface seabed, as water masses are constantly in movement. This may partially explain why the available literature contains conflicting reports regarding the specificity of viral community composition when vent and background seawater metagenomes are compared. Metagenomes obtained from hydrothermal vents display high enrichment with proviral sequences and mobile genetic elements in cells and with auxiliary metabolic genes in viruses, suggesting that lysogeny is a common virus life strategy in these environments. Genomes of phages presumably infecting sulfur-oxidizing bacteria, which are ubiquitous marine chemolithoautotrophs, were retrieved from metagenomic datasets sampled from hydrothermal vent plumes in the Pacific Ocean. These dsDNA viral genomes contained auxiliary metabolic genes involved in elemental sulfur oxidation, suggesting a role in supplementing the metabolism of sulfur-oxidizing bacteria and thus affecting sulfur cycling in the deep sea.

To the best of our knowledge, only eight phages infecting bacteria from deep-sea hydrothermal vents have been obtained to date (Table 2). All of them have been either induced or spontaneously isolated from host cultures, i.e., no phages have yet been isolated directly from deep-sea hydrothermal vent samples. The known isolates are all tailed phages obtained from strains belonging to the genera Bacillus, Geobacillus, Nitratiruptor, and Marinitoga isolated from Pacific or Atlantic deep-sea hydrothermal systems. Interestingly, the virions of MPV1, a siphovirus infecting Marinitoga piezophila, carry a host plasmid in addition to the viral DNA, highlighting horizontal gene transfer in deep-sea thermal systems.

Hot Deserts

Hot hyperarid deserts represent complex environments, where organisms have to adapt to multiple extreme conditions, including broad temperature variations, water limitation, intensive UV radiation, and nutrient starvation. In spite of these multiple extremes, microbial cells and viruses inhabit the soils of hot deserts, although viral abundances are low. Tailed virus particles were observed in soil samples from the Namib, Mojave, and Sahara deserts. Hot desert metaviromes have also been shown to contain phage sequences, e.g., phages associated with bacteria belonging to the following genera: Bacillus, Rhizobium, Geobacillus, Actinoplanes, Mycobacterium, Streptomyces, and Myxococcus. The treatment of desert soil samples with mitomycin C significantly enhanced the recovery of viral particles from the samples, suggesting that lysogeny may be a prevalent lifestyle of viruses in hot desert soils.

Table 2 Examples of phages isolated from thermal environments

Virus name | Host | Isolation site | Particle morphology | Genome (type and size)(a)
Isolates from terrestrial hot springs
phiYS40 | Thermus thermophilus | Hot spring, Japan | Myovirus | Linear dsDNA, 152,372 bp
phiNS11 | Bacillus acidocaldarius | Beppu hot springs, Japan | Tailless icosahedral | dsDNA, nd
IN93 | Thermus thermophilus | Induced from Thermus aquaticus TZ2, hot spring, Japan | Tailless icosahedral | Circular dsDNA, 19,604 bp
P23-45 | Thermus thermophilus | Alkaline hot spring, Dolina Geyser, Kamchatka, Russia | Siphovirus with a long tail (823 nm) | Linear dsDNA, 84,201 bp
P74-26 | Thermus thermophilus | Alkaline hot spring, Uzon, Kamchatka, Russia | Siphovirus with a long tail (823 nm) | Linear dsDNA, 83,319 bp
P23-77 | Thermus thermophilus | Alkaline hot spring, New Zealand | Tailless icosahedral | Circular dsDNA, 17,036 bp
P23-72 | Thermus thermophilus | Alkaline hot spring, New Zealand | Tailless icosahedral | dsDNA, nd
P23-65H | Thermus thermophilus | Alkaline hot spring, New Zealand | Tailless icosahedral | dsDNA, nd
PH75 | Thermus thermophilus | Thermal spring | Filamentous | ssDNA, ~6.5 kbp
GBSV1 | Geobacillus sp. 6K51 | Isolated from Geobacillus sp. 6K51 culture, hot spring, Xiamen, China | Myovirus | Linear dsDNA, 34,683 bp
TSP4 | Thermus TC4 | Tengchong hot spring (65 °C, pH 7.0), China | Siphovirus with a long tail (785 nm) | dsDNA, ~80 kbp
BV1 | Geobacillus sp. 6k512 | Isolated from Geobacillus sp. 6k512 culture, hot spring, Xiamen, China | Myovirus | Circular dsDNA, 35,055 bp
MMP17 | Meiothermus TG17 | Hot spring (pH 7.3, 63 °C), Eryuan, Yunnan, China | Myovirus | dsDNA, 33.5–39.5 kb
phiTMA | Thermus thermophilus | Atagawa hot spring, Japan | Myovirus | Linear dsDNA, 151,483 bp
RM378 | Rhodothermus marinus | Slightly saline geothermal site, Hveravik, Iceland | Myovirus | Linear dsDNA, 129,908 bp
phiOH3 | Thermus thermophilus | Obama hot spring (75 °C), Nagasaki, Japan | Filamentous | Circular ssDNA, 5,688 bp
G20c | Thermus thermophilus | Hot spring (~65 °C, pH 7.5), Geyser Valley, Kamchatka, Russia | Nd (probably siphovirus)(b) | Linear dsDNA, 81,291 bp
Isolates from deep-sea hydrothermal systems
BVW1 | Bacillus sp. w13 | Isolated from Bacillus sp. w13, Pacific hydrothermal fields | Tailed icosahedral | dsDNA, ~18 kbp
GVE1 | Geobacillus sp. E26323 | Isolated from Geobacillus sp. E26323, Pacific hydrothermal fields | Siphovirus | dsDNA, ~41 kbp
GVE2 | Geobacillus sp. E263 | Isolated from Geobacillus sp. E263, deep-sea hydrothermal field, East Pacific | Nd (probably siphovirus)(b) | Linear dsDNA, 40,863 bp
D6E | Geobacillus sp. E263 | Isolated from Geobacillus sp. E263, deep-sea hydrothermal field, East Pacific | Myovirus | Circular dsDNA, 49,335 bp
NrS-1 | Nitratiruptor sp. SB155-2 | Induced from Nitratiruptor sp. SB155-2, Iheya North hydrothermal field, Japan | Siphovirus | Linear dsDNA, 37,159 bp
MPV1 | Marinitoga piezophila | Induced from M. piezophila KA3, hydrothermal vent chimney (2630 m depth), East Pacific Rise | Siphovirus | dsDNA, 43,715 bp
MCV-1 | Marinitoga camini | Induced from M. camini 97, Lucky Strike hydrothermal vent (1700 m depth), Mid-Atlantic Ridge | Siphovirus | Circular dsDNA, 53,412 bp
MCV-2 | Marinitoga camini | Induced from M. camini 55, black smoker chimney, Menez Gwen site (840–870 m depth), Mid-Atlantic Ridge | Siphovirus | Circular dsDNA, 50,307 bp

(a) Nd, not determined.
(b) Based on genome sequence analyses.

Fig. 4 (A) Organization of P23-77. Cryo-electron micrograph taken at 2.7 μm underfocus showing P23-77 viral particles (white arrow), with a diameter of 78 nm. Thin spikes, which might be used in the infection of the host, are visible on some particles (black arrow). The viral membrane inside the protein capsid is visible in empty particles (inset, underfocus 2.2 μm). Bar, 100 nm. (B) A 0.28 nm thick central section through the virion. Symmetry axes are indicated with a black ellipse (2-fold), triangle (3-fold), and pentagon (5-fold). Proteins connecting the viral capsid and the underlying lipid bilayer at the five-fold vertices are visible. D (DNA), M (membrane), and C (capsid shell). Bar, 20 nm. Protein is black in A and B. (C) Radial density profiles of the icosahedral reconstruction of the intact virion (solid line) and the empty particle (dotted line). For calculation of the radial profiles, both the full and the empty particle reconstructions were calculated to 3.0 nm resolution. Reproduced from Jaatinen, S.T., Happonen, L.J., Laurinmäki, P., Butcher, S.J., Bamford, D.H., 2008. Biochemical and structural characterisation of membrane-containing icosahedral dsDNA bacteriophages infecting thermophilic Thermus thermophilus. Virology 379 (1), 10–19. doi:10.1016/j.virol.2008.06.023.

Fig. 5 Transmission electron micrographs showing the morphologies of virus-like particles and infected cells in the deep sediments of the Baltic Sea. (a) Examples of virus-like particles observed. Scale bars: 100 nm. (b) Infective viruses (arrows) in the visibly infected cells. Samples were recovered from deep sediments down to 70 mbsf in Hole M59C. Reproduced from Cai, L., Jørgensen, B.B., Suttle, C.A., et al., 2019. Active and diverse viruses persist in the deep sub-seafloor sediments over thousands of years. The ISME Journal 13 (7), 1857–1864. doi:10.1038/s41396-019-0397-9.

Phages in Polar and Other Cold Environments

About 10% of the Earth's land surface is permanently covered by ice and snow, extending up to roughly 30% during winter in the Northern Hemisphere. Furthermore, deep ocean waters, which account for 90% of the total ocean volume, are cold (0–3 °C), and up to ~5% of the area of the global ocean is covered by sea ice. Thus, the biosphere provides a great variety of cold biotopes, where psychrophilic or psychrotrophic organisms thrive. Psychrophiles, i.e., “cold-loving” organisms, grow and reproduce optimally at temperatures ranging roughly from 0 to 15 °C (max. ~20 °C) and are often also capable of surviving at sub-zero temperatures. Psychrotrophic organisms are cold-tolerant: they may survive at the above-mentioned temperatures, but require temperatures above 15 °C for optimal growth (max. ~30 °C). In many habitats, psychrophiles must also tolerate other extreme factors, such as high pressure in cold deep-sea waters or high salinity in some Arctic and Antarctic lakes.

Diverse and active microbial communities inhabit the cryosphere. Because of the absence or low abundance of grazers, these communities are characterized by high prokaryotic mortality due to viral infections. In low-temperature ecosystems, viruses have a particularly important role in driving microbial evolution and, indirectly, the cycling of carbon. Although the viral component of the cryosphere remains largely unstudied, it is already evident that viruses are abundant in various cold environments, such as polar freshwater and saline lakes, glaciers, sea ice, polar soils, snow, and permafrost.

Table 3 Examples of phages isolated from cold environments

Virus name | Host | Isolation site | Particle morphology(a) | Genome (type and size)
1a | Shewanella sp. 1A | Arctic sea ice | Myovirus | Nd
21c | Colwellia sp. 21C | Arctic sea ice | Siphovirus | Nd
11b | Flavobacterium sp. 11B | Arctic melt pond samples | Siphovirus | Circular dsDNA, 36,012 bp
MYSP03 | Flavobacterium sp. MYB03 | Mingyong Glacier, China | Tailed icosahedral | dsDNA, ~66 kbp
9A | Colwellia psychrerythraea 34H | Arctic nepheloid layer seawater | Siphovirus | Linear dsDNA, 104,936 bp
MYSP08 | Flavobacterium sp. | Mingyong Glacier, China | Tailed icosahedral | Nd
MYTP08 | Flavobacterium sp. | Mingyong Glacier, China | Tailless icosahedral (tectivirus-like) | Nd
1/49 | Shewanella sp. 49 | Arctic sea ice | Myovirus | Nd
1/32 | Flavobacterium sp. 32 | Arctic sea ice | Siphovirus | Circular dsDNA, 42,252 bp
1/44 | Shewanella sp. 44 | Arctic sea ice | Siphovirus | Circular dsDNA, 49,640 bp
3/49 | Shewanella sp. 49 | Arctic sea ice | Siphovirus | Circular dsDNA, 40,161 bp
1/4 | Shewanella sp. 4 | Arctic sea ice | Myovirus | Linear dsDNA, 133,824 bp
1/40 | Shewanella sp. 40 | Arctic sea ice | Myovirus | Linear dsDNA, 139,004 bp
1/41 | Shewanella sp. 41 | Arctic sea ice | Myovirus | Circular dsDNA, 43,510 bp
VMY22 | Bacillus cereus MYB41-22 | Mingyong Glacier, China | Podovirus | dsDNA, 18–20 kb
f327 | Pseudoalteromonas sp. BSi20327 | Pseudoalteromonas sp. culture from Arctic sea ice | Filamentous | ssDNA, 6.1 kbp
VNPH | Aeromonas sobria NPH-1 | Napahai plateau wetland, China | Myovirus | dsDNA, 110–120 kb
MuztagBP1 | Pseudomonas sp. | Karakul lake, China | Myovirus | Nd
MuztagBP2 | Pseudomonas sp. | Karakul lake, China | Siphovirus | Nd
MYBP2A-15 | Pseudomonas sp. | Mingyong Glacier, China | Myovirus | Nd
MYSP06 | Janthinobacterium sp. MYB06 | Mingyong Glacier, China | Siphovirus | dsDNA, 65–70 kb
VSW-3 | Pseudomonas fluorescens SW-3 | Napahai plateau wetland, China | Podovirus | dsDNA, 40,556 bp
PANV1 | Paraglaciecola sp. | Antarctic sea ice | Myovirus | Nd
PANV2 | Paraglaciecola sp. | Antarctic sea ice | Siphovirus | Nd
OANV1 | Octadecabacter sp. | Antarctic sea ice | Siphovirus | Nd
OANV2 | Octadecabacter sp. | Antarctic sea ice | Podovirus | Nd

(a) Nd, not determined.

Polar Oceans

Metagenomic analysis of viromes from four different oceanic regions, including the Arctic Ocean, has revealed that global and regional marine viral diversity is high and varies with latitude, and that the global ocean contains both highly widespread and endemic marine viruses. Analyses of surface water dsDNA viromes sampled in the Antarctic Prydz Bay revealed a high abundance of tailed phages, including Flavobacterium phage 11b (Table 3). A substantial number of prophage sequences and viral genes, especially phage terminase genes, have been detected within bacterial genomes from both Arctic and Antarctic Ocean waters, indicating active horizontal gene transfer in these cold polar waters.

Glaciers

Glaciers, especially supraglacial (top) biotopes, host a variety of microorganisms, including bacteria and their viruses. Because cryoconite, i.e., dark rock particles and organic sediments on the surface of glaciers, absorbs solar heat, the underlying ice melts, forming cryoconite holes filled with water (Fig. 6). These holes are the hotspots of microbial life in glaciers. A positive correlation between bacterial and viral abundances, high numbers of virus-like particles (10⁶–10⁷ VLPs ml⁻¹), and high virus infection and production rates, but low virus burst sizes, have been reported for cryoconite holes. Molecular studies targeting the T4 MCP gene, g23, on glacier surfaces in Svalbard, Arctic, revealed a diverse community of T4-like phages, including both novel and cosmopolitan phages. Analyses of metaviromes from cryoconite hole ecosystems of Svalbard glaciers and the Greenland Ice Sheet showed a high diversity of bacteriophages, suggesting novel virus groups and unusual virus life strategies. At least eight cold-active tailed bacteriophages have been isolated from glacial melt water (Table 3): Flavobacterium phages MYSP03, MYSP08, and MYTP08, Janthinobacterium phage MYSP06, Bacillus cereus phage VMY22, and Pseudomonas phages MYBP2A-15, MuztagBP1, and MuztagBP2.

Fig. 6 Cryoconite holes. (a) Schematic diagram of cryoconite hole formation. Cryoconite holes at the surface of glaciers are formed because organic and inorganic particles, which are darker than the surrounding white icy surface, absorb the solar radiation better than the ice. The heated debris therefore melts into the ice forming holes, which provide nutrients and liquid water for microbial activity. (b) Example of open cryoconite holes in the Arctic. (c) Example of entombed cryoconite hole in Antarctica. Photo courtesy Liz Bagshaw. Reproduced from Anesio, A.M., Bellas, C.M., 2011. Are low temperature habitats hot spots of microbial evolution driven by viruses? Trends in Microbiology 19 (2), 52–57. doi:10.1016/j.tim.2010.11.002.

Polar Lakes

Culture-independent studies have revealed high virus loads in Antarctic freshwater and saline lakes. VLP concentrations appear to be high in polar saline lakes compared to temperate marine waters, but lower in polar freshwater bodies than in temperate freshwater systems. Microscopic examinations of Antarctic freshwater and saline lakes demonstrated various virus morphologies, including tailed and tailless icosahedral particles. Metagenomic data have also shown a high relative abundance of phages belonging to the order Caudovirales. Based on the available viromes, Arctic and Antarctic freshwater viral communities are similar at a general taxonomic level, but diverse and specific at a fine level. A large portion of sequences from polar metagenomes has no similarity to data stored in common repositories, highlighting the need for more comprehensive sampling of cold environments.

Sea Ice

In sea ice, microbial communities thrive in brine channels and pockets, which are formed within the ice as seawater freezes. Algae, bacteria, and their viruses dominate in these unique microenvironments. It has been shown that bacterial growth and viral production in sea ice brines are possible even at −12 °C. Bacteriophage densities range roughly from 10⁵ to 10⁸ VLPs ml⁻¹ in sea ice. Higher concentrations of bacteria and viruses have been reported for sea ice than for seawater, pointing to active virus production within the ice and/or active enrichment with viruses from seawater. The analysis of genome sequences of sea ice bacteria indicated that virus-mediated gene transfer is abundant in sea ice. To date, the majority of sea ice virus-host systems have been isolated from the Arctic (Fig. 7), but recently the first four virus-host systems have also been obtained from Antarctic sea ice (Table 3).

Permafrost

Viruses have been observed in Siberian, Antarctic, and Arctic permafrost. The hotspots of life in frozen soils are cryopegs, highly saline pockets or brines within permafrost layers. A study of cryopegs in permafrost near Barrow, Alaska, showed greater VLP concentrations in cryopegs (up to 10⁷ VLPs ml⁻¹) than in the associated ice wedge (up to 10⁵ VLPs ml⁻¹). Microscopic images of filtered Barrow cryopeg brine revealed VLPs resembling siphoviruses, and virome analysis showed that phages belonging to the order Caudovirales dominated the taxonomic composition among sequences having hits to known viruses. Unclassified members of the order Caudovirales were also prevalent in the taxonomically identifiable part of the metavirome obtained from peatland soils along a permafrost thaw gradient in Stordalen, Sweden.

Fig. 7 Transmission electron micrographs of the negatively stained samples of ‘1× purified’ ice phages. (a) Phage 1/4, (b) phage 1/32, (c) phage 1/41, (d) phage 1/40 isolated from Baltic Sea ice. Scale bars 100 nm. Note that only sections A–D from the original figure are used here. Reproduced from Luhtanen, A.-M., Eronen-Rasimus, E., Kaartokallio, H., 2014. Isolation and characterization of phage–host systems from the Baltic Sea ice. Extremophiles 18 (1), 121–130. doi:10.1007/s00792-013-0604-y.

Other Cold Environments

Metagenomic studies have revealed the presence of bacteriophages in polar soils and snow. Public snow metagenomes contain a portion of viral sequences, albeit small, and prophages have been induced from Paenibacillus isolates obtained from top snow in the Antarctic. Studies of cold hyperarid desert soils in Antarctica showed high VLP densities (~10⁸ VLPs g⁻¹ of dry soil), which are substantially higher than in hot hyperarid desert soils, and very high VPRs (up to 8200). Sequence analysis demonstrated that members of the Caudovirales were dominant among identified viruses, with Mycobacterium phages being particularly abundant. Interestingly, temperate bacteriophages (siphoviruses SpaA1 and BceA1) with an unusual arrangement of genes were retrieved from Antarctic soils. About 20 cold-active lytic tailed bacteriophages have been isolated from the Napahai wetland located in the middle of the Hengduan Mountains, China. Some dozens of phages were isolated on Flavobacterium psychrophilum, a psychrophilic pathogen of fish. In addition, many psychrotrophic or psychrophilic phage isolates have been obtained from refrigerated food sources, e.g., meat, fish, and milk products. Water droplets in clouds may also harbor viruses, as these unique microenvironments remain liquid at temperatures close to −40 °C and allow microbial cells to survive and metabolize, avoiding vitrification at subzero temperatures. However, very little is known about phages residing in the atmosphere (see the section below).
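The virus-to-prokaryote ratio (VPR) used throughout this article is the VLP abundance divided by the prokaryotic cell abundance on the same unit basis. As a rough worked example, the Python sketch below inverts that ratio for the Antarctic soil figures quoted above. It assumes, purely for illustration, that the density of ~10⁸ VLPs per gram and the maximum VPR of 8200 describe the same sample, which the text does not state; the function names are hypothetical.

```python
def virus_to_prokaryote_ratio(vlps: float, cells: float) -> float:
    """VPR = VLP abundance / prokaryotic cell abundance (same unit basis)."""
    if cells <= 0:
        raise ValueError("cell abundance must be positive")
    return vlps / cells


def implied_cell_abundance(vlps: float, vpr: float) -> float:
    """Cell abundance implied by a VLP abundance and a VPR."""
    if vpr <= 0:
        raise ValueError("VPR must be positive")
    return vlps / vpr


# Values quoted for Antarctic hyperarid desert soils (per gram of dry soil).
vlps_per_gram = 1e8   # ~10^8 VLPs per gram
max_vpr = 8200        # highest reported VPR

# Illustrative assumption: both numbers refer to the same sample.
cells_per_gram = implied_cell_abundance(vlps_per_gram, max_vpr)
print(f"Implied prokaryote abundance: {cells_per_gram:.2e} cells per gram")
# prints: Implied prokaryote abundance: 1.22e+04 cells per gram
```

Under that assumption, a VPR of 8200 corresponds to only about 10⁴ prokaryotic cells per gram of soil, which illustrates that very high VPRs can reflect low cell counts as much as high virus counts.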

Atmosphere

Atmospheric phages and their airborne host microbes originate from various terrestrial or aquatic sources, such as sea spray, desert dust, plant litter, snow, or sometimes, human or animal sources. Oceans are the major producers of primary biological aerosol particles, which are also called biological aerosols, referring to airborne viruses, bacteria, archaea, fungi, pollen, spores, or fragments of different microorganisms. Marine biological aerosols are released from the sea surface microlayer to the atmosphere by the bubble-bursting process and transported further by winds. Various studies have shown that bacteria and phages are enriched in the sea surface microlayer, indicating that a large fraction of atmospheric viruses are likely phages. In the airborne state, viruses are considered to be associated with small organic aerosol particles (<1 μm) and, due to their relatively small size, atmospheric residence times of viruses are longer than those of bacteria and other cellular organisms. Terrestrial dust is another frequent source of bioaerosols and it has been found to contain various VLPs with icosahedral heads and different types of tails. Saharan dust is known to cross the Atlantic Ocean in a few days’ time, indicating that phages attached to small dust particles can indeed travel long distances in air. Terrestrial dust contains a size distribution of inorganic and organic particles, with the smallest ones being able to penetrate lung alveoli. This suggests that, in addition to various environmental sites, phages can also be transported to the lungs and from there to the blood circulation of humans and animals. Diverse phages have also been observed in different types of soils and plant litter, from which they can be aerosolized by winds. Because environmental phages are major drivers of the carbon cycles by causing cellular lysis, they can also influence climate change. The changing climate, on the other hand, can affect, for instance, phage life cycles. Atmospheric dispersal of phages and other viruses has been considered as one of the most plausible explanations for the major question of why genetically similar viruses are found in geographically distant environments, in spite of the enormous number of phages in the oceans (up to 10³¹).
Because phages influence horizontal gene transfer, atmospheric phages can transport genes to host bacteria even in distant environments, affecting the global gene pool. Extremophilic archaeal viruses were found to be transported by air between three geographically distant hot springs of Yellowstone National Park, suggesting that airborne viral dispersion contributes to local population diversity more than mutation rate does. It is likely that the same phenomenon occurs globally for phages and other viruses as well. The atmospheric significance of phages and their host bacteria is especially crucial in polar environments, where aerosols of terrestrial origin are less prevalent. Atmospheric microorganisms can function as cloud condensation nuclei and ice nuclei, affecting cloud dynamics. Bacteria, such as those belonging to the genus Pseudomonas, are known to contain ice nucleation active membrane proteins. On some rare occasions, phages have also been shown to be able to nucleate ice, but to date very little is known about this property. Environments like the oceans are known to contain numerous phages that are ecologically important, but to our knowledge, specific phage isolates originating from outdoor atmospheric samples are yet to be discovered. In addition, very little is known about the survival and infectivity of phages in the atmosphere. In the airborne state, phages are subjected to several environmental stressors, such as intense UV radiation, desiccation, rain, ice formation, and low temperature. Phages can be protected from these stressors, for instance, by remaining in the prophage state within the host cell. Attachment to solid particles, on the other hand, can protect phages from damage caused by UV radiation. Flow cytometric studies of air samples have demonstrated that atmospheric samples contain around 10–100 times more viruses than bacteria and approximately 70% of these viruses are attached to organic aggregates. 
Deposition rates of approximately 10⁹–7 × 10⁹ m⁻² per day were reported for viruses and 10⁷–8 × 10⁷ m⁻² per day for bacteria, with the highest virus deposition rates originating from marine bioaerosols. It is presumable that a large percentage of these viruses are phages.
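The deposition ranges quoted above can be turned into a virus-to-bacteria ratio with a few lines of arithmetic. The following is only a minimal sketch: the function name and the choice to pair lower bound with lower bound (and upper with upper) are assumptions made for illustration, not part of the cited study.

```python
# Daily deposition ranges quoted in the text (particles per m^2 per day).
VIRUS_RANGE = (1e9, 7e9)
BACTERIA_RANGE = (1e7, 8e7)


def deposition_ratio(virus_rate: float, bacteria_rate: float) -> float:
    """Return the number of deposited viruses per deposited bacterium."""
    if bacteria_rate <= 0:
        raise ValueError("bacteria_rate must be positive")
    return virus_rate / bacteria_rate


# ASSUMPTION: bounds are paired like-with-like; the source does not say
# which virus rate was measured together with which bacterial rate.
paired_low = deposition_ratio(VIRUS_RANGE[0], BACTERIA_RANGE[0])
paired_high = deposition_ratio(VIRUS_RANGE[1], BACTERIA_RANGE[1])

# Widest possible spread if the bounds are combined crosswise instead.
spread_min = deposition_ratio(VIRUS_RANGE[0], BACTERIA_RANGE[1])
spread_max = deposition_ratio(VIRUS_RANGE[1], BACTERIA_RANGE[0])

print(f"paired bounds: {paired_low:.1f} to {paired_high:.1f} viruses per bacterium")
print(f"crosswise bounds: {spread_min:.1f} to {spread_max:.1f} viruses per bacterium")
```

With like-for-like pairing the ratio is on the order of 100, which is broadly in line with the 10–100-fold excess of viruses over bacteria reported for air samples.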

Conclusions

Although characterized by very challenging conditions, extreme environments are in fact inhabited by a large number of extremophiles, which are mostly microorganisms. Viruses, and bacteriophages in particular, are abundant and active in extreme environments and have globally important ecological roles. Virus-like particle densities differ in various extreme niches, being e.g., 10⁴–10⁵ VLPs ml⁻¹ in the deep sea and up to 10⁹ VLPs ml⁻¹ in hypersaline waters. Tailed VLPs typical for bacteriophages, as well as other more unusual particle morphologies, have been observed in extreme environments. Some of these VLPs may represent archaeal viruses. Metagenomic studies have revealed Caudovirales-related sequences in numerous extreme sites. Atmospheric movement of phages has been suggested to provide some explanations for the observed genetic similarity between phages originating from geographically distant extreme environments. Sampling of certain environments, e.g., hydrothermal vents, seabed subsurface or the atmosphere, is still technically very challenging, while other sites, e.g., hypersaline lakes and salterns, represent more easily accessible locations. Consequently, the number of isolates from different types of extreme environments varies. Known extremophilic phage isolates display the following morphologies: tailless or tailed icosahedral (myo-, sipho-, or podoviruses), spherical, and filamentous. Their genomes are linear or circular ds- or ssDNA molecules, many of which are not yet sequenced. Detailed molecular studies of the obtained isolates are still rare, but the available data add insights into the understanding of evolutionary relations between different viral groups. The research on extremophilic phages widens our view on global viral diversity and helps to determine certain boundaries of the virosphere.


ARCHAEAL VIRUSES

Diversity of Hyperthermophilic Archaeal Viruses
David Prangishvili, Institut Pasteur, Paris, France and Ivane Javakhishvili Tbilisi State University, Tbilisi, Georgia
Mart Krupovic, Archaeal Virology Unit, Institut Pasteur, Paris, France
Diana P Baquero, Archaeal Virology Unit, Institut Pasteur, Paris, France and Sorbonne University, Paris, France
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature

Å Angstrom
ABV Acidianus bottle-shaped virus
ACV Aeropyrum coil-shaped virus
AFV1 Acidianus filamentous virus 1
APBV1 Aeropyrum pernix bacilliform virus 1
APOV1 Aeropyrum pernix ovoid virus 1
ATV Acidianus two-tailed virus
bp Base pair
cryo-EM Cryo-electron microscopy
dsDNA Double-stranded DNA
GDGT Glycerol dibiphytanyl glycerol tetraether
ITRs Inverted terminal repeats
kb Kilobase
kDa Kilodalton
MCP Major capsid protein
nm Nanometer
NMR Nuclear magnetic resonance
nt Nucleotide
ORF Open reading frame
PCNA Proliferating cell nuclear antigen
PFV1 Pyrobaculum filamentous virus 1
PSV Pyrobaculum spherical virus
SEV1 Sulfolobus ellipsoid virus 1
SIRV1 Sulfolobus islandicus rod-shaped virus 1
SIRV2 Sulfolobus islandicus rod-shaped virus 2
SNDV Sulfolobus neozealandicus droplet-shaped virus
SPV1 Sulfolobus polyhedral virus 1
ssDNA Single-stranded DNA
ssRNA Single-stranded RNA
SSV1 Sulfolobus spindle-shaped virus 1
STIV Sulfolobus turreted icosahedral virus
TSPV1 Thermoproteus spherical piliferous virus 1

Glossary

Acidophilic Thriving under highly acidic conditions.
A-form One of the three major forms of double-stranded DNA, with a 23 Å helical diameter and 11 bp per helix turn.
Capsid Protein shell that encloses the genetic material of the virus.
Convergent evolution Independent evolution of similar features in species of different lineages.
Core genes One or more genes strongly conserved at the nucleotide sequence level among a related group of genomes.
Homologous recombination Recombination between two identical or similar DNA sequences.
Hyperthermophilic Requiring extremely high temperatures for optimal growth.
Inverted terminal repeats Short, related or identical sequences located in reverse orientation at the ends of the viral genome.
Jelly-roll fold A protein fold composed of eight β-strands arranged in two antiparallel four-stranded β-sheets.
Pleomorphic viruses Viruses with asymmetric or variable virion morphology.
Protein glycosylation Post-translational modification where a carbohydrate molecule is covalently bound to a predetermined region of a protein to form a glycoprotein.
Proviruses Viral genomes integrated into the host chromosome.
Structural genomics Description of the three-dimensional structure of a protein encoded by a given genome.
Synteny The shared ordering of genomic segments along a chromosome.
Thermophilic Requiring high temperatures for optimal growth.
Viral envelope Lipid layer present in many types of viruses that protects their genetic material.
Virion Infectious mature virus particle.

Introduction

One of the most surprising results of the studies on viral diversity on our planet is the observation of an astounding number of different viral morphologies in the habitats where large-scale biodiversity is least expected – in geothermally heated environments where temperatures exceed 80 °C and pH values are often below 3. Many virus morphologies observed here have never been detected in environments with less extreme conditions. Besides common filamentous and spherical virions, this diversity includes particles resembling bottles, droplets, coils and spindles, which can be tailless, tailed or two-tailed. Viruses with all these morphologies have been isolated from hot terrestrial springs of Europe, Asia, and North America. The hosts for this collection of viruses are archaea from the phylum Crenarchaeota – members of the genera Acidianus, Aeropyrum, Metallosphaera, Pyrobaculum, Stygiolobus, Sulfolobus and Thermoproteus. All these hosts grow optimally at temperatures above 80 °C and thus are referred to as hyperthermophiles. Viral infection occurs most efficiently at the optimal temperatures of host growth, and thus the viruses are also considered to be hyperthermophiles.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00083-7


Table 1 Viruses of Crenarchaeota

Family name | Virion shape | Example of species | Envelope | Genome type and size (bp) | Accession number

Viruses with particular shapes
Ampullaviridae | Bottle-shaped, with fibers at the blunt end | Acidianus bottle-shaped virus (ABV) | External | Linear dsDNA, 23,900 | EF432053
Bicaudaviridae | Spindle-shaped with two appendages | Acidianus two-tailed virus (ATV) | None | Circular dsDNA, 62,730 | AJ888457
Clavaviridae | Bacilliform | Aeropyrum pernix bacilliform virus 1 (APBV1) | None | Circular dsDNA, 5278 | AB537968
Fuselloviridae | Spindle-shaped with fibers at one end | Sulfolobus spindle-shaped virus 1 (SSV1) | External | Circular dsDNA, 15,465 | X07234
Guttaviridae | Droplet/ovoid-shaped | Aeropyrum pernix ovoid virus 1 (APOV1) | External | Circular dsDNA, 13,769 | NC_028256
Spiraviridae | Coils with two identical terminal appendages | Aeropyrum coil-shaped virus (ACV) | None | Circular ssDNA, 24,893 | HE681887

Spherical viruses
Globuloviridae | Spherical | Pyrobaculum spherical virus (PSV) | External | Linear dsDNA, 28,337 | AJ635162
Ovaliviridae | Ellipsoid | Sulfolobus ellipsoid virus 1 (SEV1) | External | Circular dsDNA, 23,219 | MF144115
Portogloboviridae | Icosahedron | Sulfolobus polyhedral virus 1 (SPV1) | Internal | Circular dsDNA, 20,222 | KY927925
Turriviridae | Icosahedron with 12 turrets | Sulfolobus turreted icosahedral virus (STIV) | Internal | Circular dsDNA, 17,663 | AY569307

Filamentous viruses
Lipothrixviridae | Flexible filament with terminal structures | Acidianus filamentous virus 1 (AFV1) | External | Linear dsDNA, 21,080 | AJ567472
Rudiviridae | Rigid rod, with terminal fibers | Sulfolobus islandicus rod-shaped virus 2 (SIRV2) | None | Linear dsDNA, 35,450 | AJ344259
Tristromaviridae | Filamentous, with terminal fibers | Pyrobaculum filamentous virus 1 (PFV1) | External | Linear dsDNA, 17,714 | KU307456

Hyperthermophilic viruses are extremely thermostable in the aggressive environmental conditions of their natural habitats, as well as under laboratory conditions; e.g., thermal inactivation of some of them requires autoclaving at 121 °C for at least 40 min. Based on their diverse morphological and genomic properties, the isolated and characterized viruses of Crenarchaeota are currently classified into 13 families (Table 1), one order (Ligamenvirales) and one proposed class (Tokiviricetes). Members of all families except Spiraviridae have double-stranded (ds) DNA genomes, circular or linear (Table 1), whereas members of the Spiraviridae have a circular single-stranded (ss) DNA genome.

Morphology and Structure

The families of crenarchaeal viruses have virions with diverse characteristic morphologies. The bottle-shaped, tailed spindle-shaped, tailless spindle-shaped, bacilliform, coil-shaped, and droplet-shaped morphologies of members of the Ampullaviridae, Bicaudaviridae, Fuselloviridae, Clavaviridae, Spiraviridae, and Guttaviridae, respectively, are unprecedented among viruses of Bacteria and Eukarya, and represent Archaea-specific virion morphotypes. Also specific for Archaea are dsDNA viruses with filamentous virions, including members of the Lipothrixviridae, Rudiviridae and Tristromaviridae, as well as viruses with spherical virions carrying a helical nucleoprotein core – members of Globuloviridae, Ovaliviridae and Portogloboviridae. The only family of crenarchaeal viruses with structural resemblance to viruses from other domains of life is Turriviridae. Turriviruses have non-enveloped icosahedral virions, with an internal lipid layer enclosing a circular dsDNA genome. The major and minor capsid proteins have the double and single jelly-roll fold, respectively, and are related to those found in a wide range of bacterial and eukaryotic viruses. In other studied cases, capsid proteins of crenarchaeal viruses have unique structural folds. Structures of virions from the families Clavaviridae, Lipothrixviridae, Portogloboviridae, Rudiviridae, and Tristromaviridae were reconstructed at near-atomic resolution using cryo-electron microscopy (cryo-EM). The results shed light on the mechanisms of virion morphogenesis and provide information on the molecular basis of high thermostability of the corresponding virions. Remarkably, in the reconstructed virions of members of the four latter families the packed dsDNA could be observed and was found to be in A-form: the phosphate-phosphate distance along the DNA backbone is about 5.9 Å, as opposed to 7 Å for common B-form DNA.
The common occurrence of A-DNA in hyperthermophilic viruses suggests that it may be the prevalent storage form of DNA in most extreme environments. Most known viruses of Crenarchaeota are covered with envelopes. In the cases that have been studied, the viruses form envelopes from lipids selectively acquired from the pool of host lipids.
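As a rough illustration of how much shorter a duplex becomes in A-form, the sketch below estimates the contour length of the three filamentous-virus genomes listed in Table 1 in both conformations. The axial rise per base pair (about 2.6 Å for A-form and 3.4 Å for B-form) is a textbook approximation assumed here, not a value given in this article, and the function name is purely illustrative. The calculation ignores the superhelical winding of the DNA inside the virion.

```python
# Rise per base pair along the helix axis, in angstroms.
# ASSUMPTION: textbook approximations, not values taken from the article.
RISE_PER_BP = {"A": 2.6, "B": 3.4}

# Genome sizes (bp) of the three filamentous viruses listed in Table 1.
GENOMES_BP = {"AFV1": 21_080, "SIRV2": 35_450, "PFV1": 17_714}


def contour_length_um(n_bp: int, form: str) -> float:
    """Contour length of a linear dsDNA of n_bp base pairs, in micrometres."""
    if form not in RISE_PER_BP:
        raise ValueError(f"unknown DNA form: {form!r}")
    return n_bp * RISE_PER_BP[form] * 1e-4  # 1 angstrom = 1e-4 micrometre


for name, n_bp in GENOMES_BP.items():
    a_len = contour_length_um(n_bp, "A")
    b_len = contour_length_um(n_bp, "B")
    print(f"{name}: A-form {a_len:.2f} um, B-form {b_len:.2f} um, "
          f"A/B ratio {a_len / b_len:.2f}")
```

Under these assumptions the A-form duplex is roughly three quarters of the length of the same sequence in B-form, whatever the genome size.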


Fig. 1 Electron micrographs of archaeal viruses with particular morphologies. (a) ABV, Acidianus bottle-shaped virus; (b) ATV, Acidianus two-tailed virus, the arrow indicates virion tails which underwent extracellular development; (c) APBV1, Aeropyrum pernix bacilliform virus 1; (d) SSV1, Sulfolobus spindle-shaped virus 1; (e) SNDV, Sulfolobus neozealandicus droplet-shaped virus; and (f) ACV, Aeropyrum coil-shaped virus. Negative stain with uranyl acetate, except for ACV, which is in vitreous ice. Bars, 100 nm. Image modified from Prangishvili, D., Bamford, D.H., Forterre, P., et al., 2017. The enigmatic archaeal virosphere. Nature reviews Microbiology 15, 724–739.

Viruses with particular morphologies

Family Ampullaviridae (from Latin ampulla for “bottle”)

The enveloped virion of Acidianus bottle-shaped virus (ABV), the only isolated member of the family Ampullaviridae, resembles a champagne bottle in shape (Fig. 1(a)). It has an overall length of 230 nm and a width varying from 75 nm at the broad end to 4 nm at the pointed end. The broad end of the virion is decorated with 20 thin rigid filaments, which appear to be inserted into a disc and interconnected at their bases. The cone-shaped inner core is formed by a toroidally supercoiled nucleoprotein filament 7 nm in width. It is presently unclear whether the pointed end or the filaments on the broader end are involved in adsorption to the cell surface and channelling of viral DNA into host cells. The virion contains six major proteins in the size range from 15 to 80 kDa.

Family Bicaudaviridae (from Latin bi, “two”, and cauda for “tail”)

The virions of Acidianus two-tailed virus (ATV), the only currently classified member of the family Bicaudaviridae, are released from host cells as spindle-shaped particles with overall dimensions of approximately 120 × 300 nm (Fig. 1(b)). Upon further incubation at temperatures above 75 °C, appendages protrude from both pointed ends of the virion, and the lemon-shaped virion body shrinks to approximately 85 × 150 nm. The tails are heterogeneous in length, reaching 400 nm; the maximum length of the virion including tails reaches about 1000 nm. The tails have a tube-like structure and terminate with a narrow channel, which is 2 nm in width, and a terminal anchor-like structure formed by two furled filaments, each with a width of 4 nm. The virion contains at least eleven proteins with molecular masses in the range of 12–90 kDa. Extracellular morphological development of the ATV virion takes place specifically at temperatures above 75 °C, close to that of the natural habitat, and does not require the presence of host cells, an exogenous energy source or specific co-factors. However, the mechanism of development of the ATV tails remains unknown.

Family Clavaviridae (from Latin clava for “club”, “stick”)

Aeropyrum pernix bacilliform virus 1 (APBV1), the only known member of the family Clavaviridae, has non-enveloped, bacilliform, rigid virions with dimensions of about 143 × 16 nm (Fig. 1(c)). The terminal cap structures, one of which is pointed and the other rounded, are most likely involved in DNA packaging and recognition of the host cell. The virions carry multiple copies of a single major capsid protein (MCP) of about 10 kDa and three minor capsid proteins which have molecular masses in the range of 9.5–21.5 kDa. The MCP consists of two α-helices linked with a β-hairpin, a structural fold not seen in the capsid proteins of other known viruses. The structure of the APBV1 virion was determined by cryo-electron microscopy at near-atomic resolution. The structure reveals how the MCPs pack together to form a tubular structure: each MCP molecule makes extensive hydrophobic contacts with six neighbouring subunits, forming a very tight hydrophobic interface which apparently contributes to virion stability at extremely high temperatures. The inner surface of the tubular structure is positively charged, allowing efficient interactions with the circular dsDNA genome and its packaging as a left-handed superhelix. The structure suggested an assembly model in which the dsDNA genome and the capsid assemble in a concerted fashion. According to this model, virion assembly starts with the binding of three specific sites of the circular dsDNA to one of the cap structures; this forms three loops, which gradually intertwine, directed by protein assembly into the tubular structure; after all DNA has been covered by protein, the open end of the virion is sealed by another cap structure.

Family Fuselloviridae (from Latin fusello for “little spindle”)

Virions of this family are enveloped and many have the shape of a lemon or spindle, with a bunch of thin filaments attached to one of the two pointed ends (Fig. 1(d)). Other family members are more pleomorphic and elongated, with three relatively thick filaments at one pointed end. The terminal filaments are most likely involved in adsorption to the host cell surface. Virions have dimensions of approximately 60 × 100 nm. The envelope of Sulfolobus spindle-shaped virus 1 (SSV1) contains lipids and proteins VP1 and VP3. The circular dsDNA of SSV1 is positively supercoiled when isolated from virions. Examination of the SSV1 virion using cryo-electron microscopy (cryo-EM) has provided only limited information on virion architecture, but the study of SSV1 virion assembly and egress using electron tomography was more informative. Virion assembly and egress were found to be concomitant and to occur at the cellular cytoplasmic membrane via a process highly reminiscent of the budding of enveloped viruses that infect eukaryotes, including human immunodeficiency virus, influenza virus, and Ebola virus.

Family Guttaviridae (from Latin gutta for “droplet”)

The guttaviruses have slightly pleomorphic enveloped virions. Virions of Sulfolobus neozealandicus droplet-shaped virus (SNDV) resemble elongated droplets with varying dimensions, 110–185 nm in length and 70–95 nm in width (Fig. 1(e)). The pointed end of the virion is covered by a beard of thick filaments. Virions of another member of the family, Aeropyrum pernix ovoid virus 1 (APOV1), are ovoid without detectable filamentous attachments.

Family Spiraviridae (from Latin spira for “coil”)

The virions of Aeropyrum coil-shaped virus (ACV), the sole member of the family, are hollow, non-enveloped cylindrical particles, measuring about 230 × 20 nm. A short appendage of about 20 nm protrudes from each end of the cylindrical virion (Fig. 1(f)). Exceptionally, the viral genome is a single-stranded, positive-sense DNA molecule. The virion carries two MCPs with molecular masses of about 23 and 18.5 kDa, and a few minor virion proteins. The observation of partially degraded virions made it possible to propose the following model of virion morphogenesis: the circular ssDNA is covered by multiple copies of the MCP and the two halves of the circular nucleoprotein filament intertwine to form a rope-like structure, which is further condensed into a helix of higher order.

Spherical viruses

Family Globuloviridae (from Latin globulus for “small ball”)

The virions of globuloviruses are enveloped, spherical particles, 70–100 nm in diameter (Fig. 2(a)). On the surface of the virion are multiple spherical protrusions about 15 nm in diameter. The viral envelope contains host-derived lipids and encases a tightly packed superhelical nucleoprotein consisting of linear dsDNA and multiple copies of a 33 kDa MCP. The virions also carry minor capsid proteins. Virions of Thermoproteus spherical piliferous virus 1 (TSPV1) are decorated with numerous highly unusual filaments, which can extend hundreds of nanometers from the virion.

Family Ovaliviridae (from Latin ovalis for “oval”)

The sole known member of the Ovaliviridae family, Sulfolobus ellipsoid virus 1 (SEV1), has virions of ellipsoidal shape which measure about 115 × 80 nm. The virions are enveloped by a lipid-containing membrane. The membrane encases a circular nucleoprotein filament, consisting of circular dsDNA and DNA-binding MCPs, which is wrapped multiple times around the central axis of the virion (Fig. 2(b)).

Family Portogloboviridae (from Latin porto for “to bear”, and globus for “ball”)

The virions of Sulfolobus polyhedral virus 1 (SPV1) are icosahedral, about 90 nm in diameter (Fig. 2(c)). The cryo-EM reconstruction of the virion revealed structural details at near-atomic resolution. The icosahedral capsid is formed by two types of MCPs, both of which carry variants of the single jelly-roll fold and are arranged in an atypical manner. The protein capsid encloses an internal lipid-containing membrane, which, in turn, encloses a spherical core formed by a circular nucleoprotein filament consisting of dsDNA and DNA-binding MCP. In the circular nucleoprotein, the dsDNA is complexed by a dimeric DNA-binding MCP and is in A-form. The condensation of the circular nucleoprotein into a spherical core is suggested to occur in two steps: folding of the circular nucleoprotein into a raft, which is then spooled into concentric shells.

Family Turriviridae (from Latin turris for “tower”)

The overall morphology of non-enveloped icosahedral virions of the Turriviridae is highly similar to that of bacterial viruses from the families Tectiviridae and Corticoviridae: the virions are icosahedral with an inner lipid layer and pack naked dsDNA with the help of virus-encoded ATP-dependent molecular machinery (Fig. 2(d)). Moreover, members of all three families share the structure of MCPs, which exhibit the double jelly-roll fold. Cryo-EM reconstruction of Sulfolobus turreted icosahedral virus (STIV) revealed unique structural features, including the elaborate turret-like structures at the fivefold vertices and the unusual pseudo-T = 31 icosahedral lattice on which the virion is built.
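The triangulation number of an icosahedral lattice follows the Caspar-Klug relation T = h² + hk + k², where h and k are non-negative integer steps between neighbouring fivefold vertices. The short sketch below, with illustrative function names that are not taken from the article, checks which (h, k) pair gives T = 31 and how many capsomer positions (10T + 2) such a lattice has. Both formulas are standard results of quasi-equivalence theory rather than statements from this chapter.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    return h * h + h * k + k * k


def lattice_indices(t: int) -> list[tuple[int, int]]:
    """All (h, k) with h >= k >= 0 whose triangulation number equals t."""
    return [
        (h, k)
        for h in range(t + 1)
        for k in range(h + 1)
        if triangulation_number(h, k) == t
    ]


def capsomer_positions(t: int) -> int:
    """Number of capsomer positions on a T lattice: 12 fivefold + 10(T - 1) sixfold."""
    return 10 * t + 2


print(lattice_indices(31))     # [(5, 1)]
print(capsomer_positions(31))  # 312
```

The lattice is conventionally called "pseudo" because the sixfold positions are filled by capsomers built from double jelly-roll proteins rather than by six separate subunits.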

Filamentous viruses

Proposed class Tokiviricetes:
order Ligamenvirales: family Rudiviridae (from the Latin rudis for “small rod”)
order Ligamenvirales: family Lipothrixviridae (from the Greek lipos for “fat” and thrix for “hair”)
family Tristromaviridae (from the Greek tria for “three” and stroma for “layer”)

The virions of members of the three families of the proposed class Tokiviricetes are filamentous, about 23 nm in width and 400–2000 nm in length, depending on the size of the linear dsDNA genomes and parameters of the nucleoprotein helix (Fig. 3). At both ends, the virions carry identical terminal structures. In the case of Acidianus filamentous virus 1 (AFV1), a member of the Lipothrixviridae, the terminal structure resembles claws and folds upon interaction with the appendages of the host cells (Fig. 3(a)). The virions of Sulfolobus islandicus rod-shaped virus 2 (SIRV2), the type member of the Rudiviridae family, are decorated with three thin terminal fibers, whereas virions of Pyrobaculum filamentous virus 1 (PFV1), a member of the Tristromaviridae, possess bundles of filaments that can reach a length of 80 nm (Fig. 3(b) and (c)). The virion structures of members of the three families have been reconstructed by cryo-EM at near-atomic resolution and revealed a similar, previously unknown form of virion organization. In all three families the core of the filamentous virion is a tube-like nucleoprotein formed by condensation of linear dsDNA by dimers of the MCPs – homodimers in the case of Rudiviridae and heterodimers in the case of the other two families (Fig. 4(a)). As a result of binding to DNA, the MCPs form a helix-turn-helix structure and tightly

Fig. 2 Electron micrographs of archaeal viruses with spherical morphologies. (a) PSV1, Pyrobaculum spherical virus 1, the arrows indicate spherical protrusions; (b) SEV1, Sulfolobus ellipsoid virus 1; (c) SPV1, Sulfolobus polyhedral virus 1; and (d) STIV, Sulfolobus turreted icosahedral virus. Negative stain with uranyl acetate. Bars, 100 nm. Image modified from Prangishvili, D., Bamford, D.H., Forterre, P., et al., 2017. The enigmatic archaeal virosphere. Nature reviews Microbiology 15, 724–739. SEV1 image is courtesy of L. Huang.

Fig. 3 Electron micrographs of archaeal viruses with filamentous morphologies. (a) AFV1, Acidianus filamentous virus 1, the inset displays the terminal structure; (b) SIRV2, Sulfolobus islandicus rod-shaped virus 2; and (c) PFV1, Pyrobaculum filamentous virus 1. Negative stain with uranyl acetate, except for SIRV2, which is in vitreous ice. Bars, 100 nm. Image modified from Prangishvili, D., Bamford, D.H., Forterre, P., et al., 2017. The enigmatic archaeal virosphere. Nature reviews Microbiology 15, 724–739.


Fig. 4 Virion organization of the filamentous viruses AFV1, SIRV2 and PFV2. (a) Comparison of the MCP dimer (asymmetric unit) of AFV1 (left), SIRV2 (centre) and PFV2 (right). The MCP1 of AFV1, SIRV2 and PFV2 are colored in orange, gold and yellow, respectively. The MCP2 of AFV1, SIRV2 and PFV2 are colored in blue, cyan and light blue, respectively. The N-terminal helices of MCP1 in AFV1, SIRV2 and PFV2 are marked with black arrows; (b) wrapping of A-form DNA in AFV1, SIRV2 and PFV2. One MCP dimer is colored as in (a), other four dimers are colored in gray. Proteins are shown in ribbon representation (top) and as surfaces (bottom). Image reproduced from Wang, F., Baquero, D.P., Su, Z., et al., 2020. Structure of a filamentous virus uncovers familial ties within the archaeal virosphere. Virus evolution 6, veaa023.

wrap around dsDNA, in a manner not observed in known viruses, and transforms it into A-form (Fig. 4(b)). Such arrangement keeps DNA inaccessible for solutes and ensures virion stability in highly aggressive environmental conditions. The MCPs have a molecular mass of about 14.5 kDa. Rudiviridae encode a single MCP, whereas each of the two other families encodes two paralogous MCPs. Remarkably, although the MCPs from the three virus families do not share significant sequence similarity, their structures are highly similar – they all carry unique four-helix bundle fold, not seen in the capsid proteins of other known viruses. The structural similarity of the MCPs suggests common origin of the MCPs and provides arguments to postulate evolutionary relationship between the three virus families which employ these MCPs. Thus, it was proposed to unify the families Rudiviridae, Lipothrixviridae, and Tristromaviridae into the new virus class Tokiviricetes. The former two families reveal certain similarity also at the genomic level, suggesting closer relationship. Thus, Rudiviridae and Lipothrixviridae were unified also in a taxon of a lower rank, the order Ligamenvirales. The major difference in virion architecture of the three families concerns coating of the nucleoprotein core. In the Rudiviridae, it is non-enveloped, in Lipothrixviridae, covered with lipid envelope, and in Tristromaviridae, covered by a protein matrix which mediates contact with the outer lipid envelope. The matrix layer of the Tristromaviridae is formed by an 18 kDa capsid protein, VP3. As in other crenarchaeal viruses, Lipothrixviridae and Tristromaviridae recruit the lipids for formation of the envelope from the pool of host lipids. The thickness of the lipid envelope in the reconstructed virions of the lipothrixviruses is about 20–25 Å , only half of the thickness of the cellular membrane. 
The host membrane, as in all Crenarchaeota, is a 40 Å monolayer of glycerol dibiphytanyl glycerol tetraether (GDGT) lipids with different numbers of cyclopentane rings. Studies on the composition and structure of the virion envelope showed that the virus selectively acquires the more flexible GDGT lipids lacking cyclopentane rings, which can be bent into a U-shaped, ‘horseshoe’ conformation, forming the 20 Å thick virion envelope. Apparently, a membrane with lipids in such a conformation, which had never previously been observed in the living world, has advantages for survival in the aggressive conditions of the natural environment of the hyperthermophiles.

General comments

Summarizing the results of studies on the morphology and morphogenesis of viruses of hyperthermophilic Crenarchaeota, the packaging of viral genomes in the form of a nucleocapsid appears to be the major structural characteristic of crenarchaeal viruses. Based on the available information, the assembly of the virion core apparently occurs through concerted coating of the DNA with MCPs and spontaneous self-assembly of such nucleocapsids into minimum free energy structures of different shapes. The self-assembly pathway of virion morphogenesis is common for RNA viruses but not for dsDNA viruses, where capsid assembly and genome packaging are generally separated. dsDNA viruses of Bacteria, herpesviruses of Eukarya, and crenarchaeal Turriviridae pack naked dsDNA into pre-assembled empty capsids. This process requires significant energy expenditure, mainly because of the mutual repulsion of the negatively charged phosphate backbone of the DNA, and is facilitated by elaborate virus-encoded molecular machinery. Considering the complexity of such a process, it is surprising that the pathways of virion morphogenesis that are common for crenarchaeal viruses are so rare in the world of dsDNA viruses. In most crenarchaeal virus families, the nucleoprotein core is covered by lipid-containing envelopes. Moreover, the core of the Portogloboviridae is encased by an icosahedral protein shell. The mechanisms of coating of the nucleoprotein cores remain unknown. For Fuselloviridae, the final step of virion assembly was shown to occur in the course of budding from the host cell, in a process resembling the morphogenesis of enveloped viruses of eukaryotes. For other crenarchaeal viruses, the coating of virions with a lipid envelope apparently takes place intracellularly.

Genomes

The genomes of crenarchaeal viruses are generally small (Table 1), with the largest ones (60–70 kb) found in members of the family Bicaudaviridae. The 5,278 bp-long genome of the clavavirus APBV1 is among the smallest known dsDNA genomes. The other record holder is the coil-shaped spiravirus ACV: its ssDNA genome of 24.9 kb is the largest among known ssDNA genomes. The genomes can be either circular or linear (Table 1). Viruses with linear genomes often carry inverted terminal repeats (ITRs) of different lengths and exploit different strategies to protect their genome termini, such as covalently closed hairpins or covalently attached terminal proteins. Along with the extraordinary diversity of morphotypes, another outstanding feature of archaeal viruses is the uniqueness of their genome content, with a very low proportion of genes with recognizable homologs in public databases. Indeed, family-specific comparison of viral proteomes against the sequence databases revealed that ~85% of the crenarchaeal virus proteins do not have identifiable homologs when an E-value threshold of <1e-5 is used. Functional annotation of crenarchaeal virus proteins shows that very few of these proteins are homologous to any sequences in the public databases, be it proteins of other viruses or those of cellular organisms. Consequently, archaeal virus genomes remain a rich source of unknown genes, many of which could be responsible for unique mechanisms of virus-host interactions or possess unexpected properties. Structural genomics studies have been performed to promote the functional characterization of Fuselloviridae, Bicaudaviridae, Rudiviridae, Lipothrixviridae, Globuloviridae and Turriviridae proteins. Protein structures were determined by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy. For some viral proteins, the structures displayed unique folds with no homologs in databases, hindering the assignment of putative functions. 
In a few cases, the structural information suggested a function which was verified biochemically. This was the case, e.g., for a novel type of nuclease encoded by the lipothrixvirus AFV1 and a replication initiator protein encoded by the rudivirus SIRV2. Notably, none of the viruses encodes a recognizable DNA-dependent RNA polymerase, and only members of two families, Ampullaviridae and Ovaliviridae, encode recognizable DNA-dependent DNA polymerases. However, crenarchaeal viruses commonly encode multiple transcription factors which may recruit and redirect the host transcription machinery to preferential expression of viral genes. Comparative genomic analyses have revealed that hyperthermophilic viruses from different families share just a small group of common genes, suggesting independent origins of the distinct groups of archaeal viruses. Ten virus families each have only one or two members with a sequenced genome. Exceptions are Fuselloviridae and the two families of the order Ligamenvirales, which have a higher number of members with sequenced genomes. Comparison of these genomes facilitated functional predictions and contributed to understanding the evolutionary history of the corresponding virus families, as briefly detailed below.

Family Fuselloviridae

The genomes of all members of Fuselloviridae are highly similar at the nucleotide sequence level and display overall gene synteny. A high frequency of homologous recombination has been reported between family members. The majority of predicted ORFs of the fuselloviruses cannot be assigned a function based on homology with sequences in public databases. Comparison of genome sequences has revealed a set of 12 non-contiguous ‘core’ genes shared by all family members, suggesting a common evolutionary history despite differences in the geographical context. The ‘core’ proteome includes structural proteins, a predicted DnaA-like AAA+ ATPase, transcriptional regulators and a tyrosine superfamily integrase, which is involved in the integration of the viral genome into the host tRNA gene. Accordingly, the infected cell cultures contain both the covalently closed circular form of the viral genome and the integrated provirus. Unlike in the case of temperate bacteriophages, the integrase gene of the fuselloviruses is partitioned into two fragments upon integration of the viral genome. Notably, studies on the prototypical fusellovirus SSV1 have shown that the SSV1 integrase is not essential for the viral cycle. Moreover, nearly half of the predicted SSV1 ORFs were shown to tolerate insertions or deletions without affecting viral infectivity.

Order Ligamenvirales

All members of the order have linear dsDNA genomes. Using the genome of the rudivirus SIRV1 as an example, it was shown that the two strands of the linear dsDNA genome are covalently linked at both ends, forming a continuous polynucleotide chain and producing terminal hairpin structures. For other ligamenviruses, such detailed analysis of the terminal regions has not been performed. Rudiviruses carry terminal inverted repeats which differ in size and sequence. The longest, at 2 kb, is found in the rudivirus SIRV1. In both rudiviruses and lipothrixviruses, active genome remodelling, involving both deletions and horizontal acquisition of new genes, has been documented. The comparison of the genomes of different members of the same family and different strains of the same species revealed multiple examples of genomic rearrangements caused by insertion/deletion of sequences that do not disrupt the ORFs. In Rudiviridae and Lipothrixviridae such sequences were shown to comprise 12 bp or multiples thereof. It was suggested that the latter might constitute genetic elements mobilized by archaeal intron splicing. Members of the Rudiviridae share significant similarity in genome sequence and synteny, and carry a set of conserved core genes, most of which are localized in the middle region of the linear genome. By contrast, the genomes of Lipothrixviridae display considerable variation in gene order and content, suggesting a longer evolutionary history. Based on sequence similarity, the genomes of Lipothrixviridae group into four clusters of related species which form the four genera in the family. Examples of intergenomic recombination have been observed between members of different genera. Properties of about half of the proteins encoded by the rudiviruses SIRV1 and SIRV2 have been predicted from analysis of their sequences, structures and biochemical characteristics. Such a proportion of recognized gene functions is among the highest for crenarchaeal viruses. The proteins with predicted functions include transcription regulators with ribbon-helix-helix and helix-turn-helix motifs, glycosyltransferases, acetyltransferase, Holliday junction resolvase, methyltransferase, dUTPase, ssDNA-binding protein, ssDNA annealing ATPase, Cas4-like exonuclease, as well as a protein specifically interacting with the host proliferating cell nuclear antigen (PCNA) and presumably recruiting it for the replication of the viral genome. Moreover, three proteins of the virus SIRV2 have been identified as anti-CRISPR proteins, which inhibit type I and type III CRISPR-Cas immunity systems of the Sulfolobus host. Detailed studies on the life cycle of the rudivirus SIRV2 enabled identification of the viral protein involved in the unique mechanism of virion egress. 
The function of this protein could not be predicted based on its sequence and was verified only as a result of detailed studies on the virus life cycle, supporting the possibility that many genes of archaeal viruses are implicated in specific aspects of virus-host interactions.
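The observation in this section that insertions or deletions of 12 bp (or multiples thereof) do not disrupt ORFs is consistent with simple codon arithmetic: 12 is a multiple of the 3-nt codon length, so the downstream reading frame is preserved. The following minimal sketch illustrates this with a hypothetical toy ORF and made-up inserted fragments; it does not model the proposed intron-splicing mobilization itself.

```python
from typing import List


def preserves_reading_frame(indel_length: int) -> bool:
    """An indel keeps the downstream frame only if its length is a multiple of 3."""
    return indel_length % 3 == 0


def codons(seq: str) -> List[str]:
    """Split a sequence into complete codons, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def insert_fragment(seq: str, pos: int, fragment: str) -> str:
    """Insert `fragment` into `seq` before position `pos` (0-based)."""
    return seq[:pos] + fragment + seq[pos:]


# Hypothetical toy ORF (Met-Ala-Lys-Gly-stop) and hypothetical inserts.
orf = "ATGGCTAAAGGTTAA"
in_frame = insert_fragment(orf, 6, "ACGTACGTACGT")  # 12 nt: frame preserved
shifted = insert_fragment(orf, 6, "ACGTACGTAC")     # 10 nt: frame shifted

downstream_original = codons(orf)[-3:]
downstream_in_frame = codons(in_frame)[-3:]
downstream_shifted = codons(shifted)[-3:]
```

After the 12-nt insertion the three downstream codons, including the stop codon, are unchanged, whereas the 10-nt insertion alters all of them. Frame preservation alone does not guarantee an intact ORF (an in-frame insert could still carry a stop codon), so this is a necessary rather than sufficient condition.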

Evolutionary relationships

The unique morphological and genomic features of crenarchaeal DNA viruses raise important questions regarding their origins. The analysis of the evolutionary relationships between all dsDNA viruses using the bipartite network approach, which traces connections between viral genomes through shared gene families, revealed that crenarchaeal viruses are largely disconnected from the global dsDNA virosphere. The families of crenarchaeal viruses are themselves largely disconnected from each other and share just a small group of common genes, suggesting that most families of crenarchaeal viruses have evolved independently of one another. How and when did archaea-specific viruses originate? Why do they only infect archaea? There are at least two non-mutually exclusive explanations. Some of the archaea-specific virus groups could have emerged during the early stages of cellular evolution and been retained in the Archaea but lost in the domains Bacteria and Eukarya. Other archaeal virus groups could have evolved concomitantly with the Archaea or, even more recently, within specific archaeal lineages. The observed limited gene sharing between different groups of archaea-specific viruses seems to make the latter possibility particularly plausible.
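The idea behind the bipartite network approach described above, linking genomes only through the gene families they share, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the genome names and gene family labels are hypothetical, and grouping genomes by connected components is a stand-in for the more elaborate module detection used in published network analyses.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set


def genome_components(genome_to_families: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group genomes that are connected through shared gene families.

    Genomes and gene families are the two node types of a bipartite graph;
    two genomes fall in the same component if a chain of shared families
    links them.
    """
    family_to_genomes: Dict[str, Set[str]] = defaultdict(set)
    for genome, families in genome_to_families.items():
        for family in families:
            family_to_genomes[family].add(genome)

    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in genome_to_families:
        if start in seen:
            continue
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:  # breadth-first search across genome-family-genome links
            genome = queue.popleft()
            for family in genome_to_families[genome]:
                for neighbour in family_to_genomes[family]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        component.add(neighbour)
                        queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical toy data: two phages share a gene family, two archaeal viruses share none.
toy_network = {
    "phage_A": {"terminase", "portal"},
    "phage_B": {"terminase", "tail_fiber"},
    "archaeal_virus_X": {"family_x1", "family_x2"},
    "archaeal_virus_Y": {"family_y1"},
}

components = genome_components(toy_network)
```

In this toy network the two phages form one component, while each archaeal virus remains isolated, mirroring the qualitative picture of groups that are disconnected because they share no gene families.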

Further Reading

Arnold, H.P., Ziese, U., Zillig, W., 2000. SNDV, a novel virus of the extremely thermophilic and acidophilic archaeon Sulfolobus. Virology 272, 409–416.
Bettstetter, M., Peng, X., Garrett, R.A., Prangishvili, D., 2003. AFV1, a novel virus infecting hyperthermophilic archaea of the genus Acidianus. Virology 315, 68–79.
Contursi, P., Fusco, S., Cannio, R., She, Q., 2014. Molecular biology of fuselloviruses and their satellites. Extremophiles: Life Under Extreme Conditions 18, 473–489.
DiMaio, F., Yu, X., Rensen, E., et al., 2015. Virology. A virus that infects a hyperthermophile encapsidates A-form DNA. Science 348, 914–917.
Häring, M., Peng, X., Brügger, K., et al., 2004. Morphology and genome organization of the virus PSV of the hyperthermophilic archaeal genera Pyrobaculum and Thermoproteus: A novel virus family, the Globuloviridae. Virology 323, 233–242.
Häring, M., Rachel, R., Peng, X., Garrett, R.A., Prangishvili, D., 2005. Viral diversity in hot springs of Pozzuoli, Italy, and characterization of a unique archaeal virus, Acidianus bottle-shaped virus, from a new family, the Ampullaviridae. Journal of Virology 79, 9904–9911.
Iranzo, J., Koonin, E.V., Prangishvili, D., Krupovic, M., 2016. Bipartite network analysis of the archaeal virosphere: Evolutionary connections between viruses and capsidless mobile elements. Journal of Virology 90, 11043–11055.
Kasson, P., DiMaio, F., Yu, X., et al., 2017. Model for a novel membrane envelope in a filamentous hyperthermophilic virus. eLife 6.
Krupovic, M., Cvirkaite-Krupovic, V., Iranzo, J., Prangishvili, D., Koonin, E.V., 2018. Viruses of archaea: Structural, functional, environmental and evolutionary genomics. Virus Research 244, 181–193.
Liu, Y., Ishino, S., Ishino, Y., et al., 2017. A novel type of polyhedral viruses infecting hyperthermophilic archaea. Journal of Virology 91 (13).
Mochizuki, T., Krupovic, M., Pehau-Arnaudet, G., et al., 2012. Archaeal virus with exceptional virion architecture and the largest single-stranded DNA genome. Proceedings of the National Academy of Sciences of the United States of America 109, 13386–13391.
Mochizuki, T., Sako, Y., Prangishvili, D., 2011. Provirus induction in hyperthermophilic archaea: Characterization of Aeropyrum pernix spindle-shaped virus 1 and Aeropyrum pernix ovoid virus 1. Journal of Bacteriology 193, 5412–5419.
Mochizuki, T., Yoshida, T., Tanaka, R., et al., 2010. Diversity of viruses of the hyperthermophilic archaeal genus Aeropyrum, and isolation of the Aeropyrum pernix bacilliform virus 1, APBV1, the first representative of the family Clavaviridae. Virology 402, 347–354.
Prangishvili, D., Bamford, D.H., Forterre, P., et al., 2017. The enigmatic archaeal virosphere. Nature Reviews Microbiology 15, 724–739.
Prangishvili, D., Koonin, E.V., Krupovic, M., 2013. Genomics and biology of Rudiviruses, a model for the study of virus-host interactions in Archaea. Biochemical Society Transactions 41, 443–450.
Prangishvili, D., Krupovic, M., 2012. A new proposed taxon for double-stranded DNA viruses, the order “Ligamenvirales”. Archives of Virology 157, 791–795.
Prangishvili, D., Vestergaard, G., Häring, M., et al., 2006. Structural and genomic properties of the hyperthermophilic archaeal virus ATV with an extracellular stage of the reproductive cycle. Journal of Molecular Biology 359, 1203–1216.
Quax, T.E., Krupovic, M., Lucas, S., Forterre, P., Prangishvili, D., 2010. The Sulfolobus rod-shaped virus 2 encodes a prominent structural component of the unique virion release system in Archaea. Virology 404, 1–4.
Quemin, E.R., Chlanda, P., Sachse, M., et al., 2016. Eukaryotic-like virus budding in archaea. mBio 7.
Rensen, E.I., Mochizuki, T., Quemin, E., et al., 2016. A virus of hyperthermophilic archaea with a unique architecture among DNA viruses. Proceedings of the National Academy of Sciences of the United States of America 113, 2478–2483.
Wang, F., Baquero, D.P., Su, Z., et al., 2020. Structure of a filamentous virus uncovers familial ties within the archaeal virosphere. Virus Evolution 6, veaa023.
Wang, H., Guo, Z., Feng, H., et al., 2018. Novel Sulfolobus virus with an exceptional capsid architecture. Journal of Virology 92.
Wang, F., Liu, Y., Su, Z., et al., 2019. A packing for A-form DNA in an icosahedral virus. Proceedings of the National Academy of Sciences of the United States of America 116, 22591–22597.

Euryarchaeal Viruses

Tatiana A Demina and Hanna M Oksanen, Molecular and Integrative Biosciences Research Program, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland

© 2021 Elsevier Ltd. All rights reserved.

Nomenclature

Å Ångström, 0.1 nm
bp Base pair
CRISPR Clustered Regularly-Interspaced Short Palindromic Repeats
ds Double-stranded
EM Electron microscopy
ICTV International Committee on Taxonomy of Viruses
kbp Kilo base pairs
knt Kilo nucleotides
MCP Major capsid protein
nt Nucleotide
ORF Open reading frame
ss Single-stranded
T Triangulation number

Glossary

Alkaliphilic Having a requirement for an environment with high pH.
Circular permutation A change in the sequence of the linear DNA termini that does not alter the relative sequence (e.g., circular permutation of ABCDEFGH could generate BCDEFGHA, CDEFGHAB, etc).
Concatamer Two or more DNA molecules that are linked together to form a long, linear DNA molecule.
Halophilic Having a requirement for an environment with a high salt concentration.
Headful packaging The mechanism of packaging viral DNA based on the size of the virus head, rather than the length of the viral genome.
Integrase An enzyme which can integrate viral DNA into the genome of its host cell.
Lysogen A host cell that has been infected by a virus that remains dormant, despite the presence of viral DNA.
Methanogenic Having the ability to produce methane.
Prophage A virus that is dormant within the host cell either integrated into the host chromosome or replicating free like a plasmid.
Protein-primed DNA replication Duplication of a linear DNA genome being started via the interaction of DNA polymerase with a protein covalently linked to the DNA termini.
Temperate virus A virus that is able to infect a host, but remains dormant within the host cell.
Thermophilic Having a requirement for an environment with high temperature.
Virion Infectious mature virus particle.
Virulent virus A virus that is able to infect a host, replicate, and subsequently leave the host cell by lysing the host cell.

Introduction

Microorganisms belonging to the phylum Euryarchaeota inhabit diverse environments: halophilic euryarchaea dominate in hypersaline environments such as solar salterns and salt lakes, methanogenic euryarchaea are found in intestines, anoxic sediments, and sludge digesters, while thermophilic euryarchaea thrive in thermal environments, e.g., hot springs and deep-sea hydrothermal vents. Euryarchaea are also abundant and active in oceanic surface waters. The first archaeal virus, Hs1 of the halophilic euryarchaeon Halobacterium, was found in 1974 and was originally described as a bacteriophage, before it was recognized that Archaea and Bacteria form two different domains of life. During the last 10 years, the number of described euryarchaeal viruses has increased tremendously due to extensive virus surveys, especially those using samples from hypersaline environments. Today, more than 110 euryarchaeal virus isolates or virus-like particles (VLPs) are described. Euryarchaeal viruses are extremely diverse, having various virion morphologies, genome types, sequences, life cycles, and host ranges. Among the five different morphologies known for euryarchaeal viruses (Fig. 1), tailed icosahedral viruses form the most numerous group, with all three tail types found in tailed dsDNA bacteriophages of Caudovirales. Other known morphologies are tailless icosahedral viruses with an internal membrane, pleomorphic, spindle-shaped, and spherical viruses. The smallest euryarchaeal virus is the pleomorphic virus HRPV-1 with a diameter of 40 nm, while the largest capsid diameter is found in the tailed icosahedral virus HVTV-1 (~95 nm). So far, all known archaeal viruses have genomes of either linear or circular double-stranded (ds) or single-stranded (ss) DNA molecules. The euryarchaeal pleomorphic virus HRPV-1 was the first described archaeal virus with a ssDNA genome, but most of the euryarchaeal viruses known today have a linear dsDNA genome. 
The genome sizes range from 7,048 nucleotides (nt) for Halorubrum pleomorphic virus HRPV-1 to 143,855 base pairs (bp) for Halogranum tailed virus HGTV-1. For ~50 euryarchaeal virus isolates, the whole genome sequence is available (Tables 1–3). Currently, euryarchaeal viruses are classified by the International Committee on Taxonomy of Viruses (ICTV) into the families Sphaerolipoviridae, Pleolipoviridae, and the newly proposed family Halspiviridae, but there is also a significant number of unclassified ones. The high number of proviruses found in the chromosomal

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20989-8

Fig. 1 Euryarchaeal virus morphotypes. (A–C) Icosahedral tailed virus morphotypes. (A) Myovirus morphotype, icosahedral head and long contractile tail. (B) Siphovirus morphotype, icosahedral head and long non-contractile tail. (C) Podovirus morphotype, icosahedral head and short non-contractile tail. (D) Internal membrane-containing icosahedral virus morphotype. (E) Pleomorphic virus morphotype without any protein capsid structure. (F) Spindle-shaped morphotype. (G) Internal membrane-containing spherical virus morphotype. Protein, blue; Lipid, yellow. Not drawn to scale.

Table 1 Icosahedral tailed dsDNA euryarchaeal virus isolates with a complete genome sequence available

Virus | Isolation host | Isolation source | Morphology, size and T number (a) | Genome topology, length (bp) and Acc. No. (b)
phiH | Halobacterium salinarum | Halobacterium salinarum culture | M 64/170 | L (C as a provirus) 58,072 MK002701
ChaoS9 | Halobacterium salinarum | Halobacterium salinarum culture | M 61/128 | L CP 55,145 MK310226
HF2 | Halobacterium saccharovorum Ch2 | Saltern, Australia | M 58/94 | L 77,670 AF222060
HF1 | Haloferax sp. Aa2.2 | Saltern, Australia | M 68/90 | L 75,898 AY190604
HFTV1 | Haloferax sp. LR2–5 | Lake Retba, Senegal | S 50/60 | L CP 38,059 MG550112
HCTV-1 | Haloarcula californiae | Saltern, Margherita di Savoia, Italy | S 70/80 | L 103,257 KC292029
HCTV-2 | Haloarcula californiae | Saltern, Samut Sakhon, Thailand | S | L CP 54,291 KC292028
HCTV-5 | Haloarcula californiae | Saltern, Samut Sakhon, Thailand | S | L 102,105 KC292027
HHTV-1 | Haloarcula hispanica | Saltern, Margherita di Savoia, Italy | S 55/110 | L CP 49,107 KC292025
HHTV-2 | Haloarcula hispanica | Saltern, Samut Sakhon, Thailand | S | L CP 52,643 KC292024
HSTV-1 | Haloarcula sinaiiensis | Saltern, Margherita di Savoia, Italy | P 62 T = 7 | L CP 32,189 KC117378
HVTV-1 | Haloarcula vallismortis | Saltern, Samut Sakhon, Thailand | S 95.5/73 T = 13 | L 102,319 KC117377
HGTV-1 | Halogranum sp. SS5–1 | Saltern, Samut Sakhon, Thailand | M | L CP 143,855 KC292026
HSTV-2 | Halorubrum sodomense | Saltern, Eilat, Israel | M 73.8/101 T = 7 | L 68,527 KC117376.1
HRTV-4 | Halorubrum sp. s5a-3 | Saltern, Margherita di Savoia, Italy | S | L CP 35,722 KC292023
HRTV-5 | Halorubrum sp. s5a-3 | Saltern, Margherita di Savoia, Italy | M | L 76,134 KC292022
HRTV-7 | Halorubrum sp. B2–2 | Saltern, Margherita di Savoia, Italy | M | L 69,048 KC292021
HRTV-8 | Halorubrum sp. B2–2 | Saltern, Samut Sakhon, Thailand | M | L 74,519 KC292020
BJ1 | Halorubrum sp. BJ1 B11 | Lake Bagaejinnor, Mongolia, China | S 56/71 | L 42,271 AM419438
phiCh1 | Natrialba magadii | Natrialba magadii culture | M 70/130 | L (C as a provirus) 58,498 AF440695 MK450543
Drs3 | Methanobacterium formicicum Khl10 | Anaerobic sludge digester | S 60/230 | L 37,129 MH674343
psiM1 | Methanothermobacter marburgensis | Anaerobic sludge digester | S 55/210 | L 26,803 AF065411, AF065412
psiM2 | Methanothermobacter marburgensis | Anaerobic sludge digester (deletion mutant of psiM1) | S 55/210 | L 26,111 AF065411

(a) Morphotypes M, S, and P correspond to myovirus (long contractile tail), siphovirus (long non-contractile tail), and podovirus (short non-contractile tail) morphologies. Head diameter and tail length are given for myoviruses and siphoviruses, and head diameter for podoviruses (nm). Head diameters of HSTV-2, HVTV-1, and HSTV-1 are based on cryo-EM (vertex to vertex).
(b) L, linear; C, circular; CP, circularly permutated.

Table 2 Icosahedral internal membrane-containing euryarchaeal dsDNA virus isolates with a complete genome sequence available

Virus | Isolation host | Isolation source | Virion size, T number (a) | Genome topology, length (bp) and Acc. No. (b)
SH1 | Haloarcula hispanica | Lake Serpentine, Australia | 80.0, T = 28 dextro | L 30,889 AY950802
PH1 | Haloarcula hispanica | Pink Lake, Australia | 50, nd | L 28,072 KC252997
HHIV-2 | Haloarcula hispanica | Saltern, Margherita di Savoia, Italy | 80.0, T = 28 dextro | L 30,578 JN968479
HCIV-1 | Haloarcula californiae | Saltern, Samut Sakhon, Thailand | 80.0, T = 28 dextro | L 31,314 KT809302
SNJ1 | Natrinema sp. J7–1 | Salt mine, Hubei, China (induced from Natrinema sp. J7–1 and grown on Natrinema sp. J7–2) | 70–75 | C 16,341 AY048850.1

(a) Diameters of the capsids are given (nm). SH1, HHIV-2, and HCIV-1 diameters are based on cryo-EM.
(b) L, linear; C, circular.
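The triangulation numbers listed in Tables 1 and 2 (T = 7, 13, and 28) follow the standard Caspar–Klug relation T = h² + hk + k², where h and k are non-negative integer lattice indices. The sketch below, with function names chosen for illustration only, computes T and recovers the index pair for each value reported in the tables. The handedness label (dextro or laevo) that applies when h and k are both non-zero and unequal cannot be derived from T alone and is not modelled here.

```python
from typing import List, Tuple


def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def lattice_indices(t: int, limit: int = 10) -> List[Tuple[int, int]]:
    """Return all (h, k) pairs with h >= k >= 0 and h <= limit that give T == t."""
    return [
        (h, k)
        for h in range(limit + 1)
        for k in range(h + 1)
        if (h, k) != (0, 0) and triangulation_number(h, k) == t
    ]


# T numbers reported in Tables 1 and 2.
reported = {t: lattice_indices(t) for t in (7, 13, 28)}
```

For the reported values this yields (2, 1) for T = 7, (3, 1) for T = 13, and (4, 2) for T = 28. All three are skewed lattices (h ≠ k, k ≠ 0), which is why a handedness such as "dextro" is specified for the T = 28 capsids.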

Table 3 Pleomorphic, spindle-shaped, and spherical euryarchaeal virus isolates with a complete genome sequence available

Virus | Isolation host | Isolation source | Morphology and size (a) | Genome topology, length (bp or nt) and Acc. No. (b)
HRPV-1 | Halorubrum sp. PV6 | Saltern, Trapani, Italy | P 41 | ssDNA C 7048 FJ685651
HRPV-2 | Halorubrum sp. SS5–4 | Saltern, Samut Sakhon, Thailand | P 54 | ssDNA C 10,656 JN882264
HRPV-3 | Halorubrum sp. SP3–3 | Salt water, Sedom Ponds, Israel | P 67 | dsDNA C 8770 JN882265
HRPV-6 | Halorubrum sp. SS7–4 | Saltern, Samut Sakhon, Thailand | P 49 | ssDNA C 8549 JN882266
HRPV9 | Halorubrum sp. SS5–4 | Culture supernatant of Halorubrum sp. B2–2 on Halorubrum sp. SS5–4 culture | P 57 | dsDNA C 16,159 KY965934
HRPV10 | Halorubrum sp. LR2–17 | Lake Retba, Senegal | P 55 | dsDNA C 9296 MG550111
HRPV11 | Halorubrum sp. LR2–12 | Lake Retba, Senegal | P 55 | dsDNA C 9368 MG550113
HRPV12 | Halorubrum sp. LR1–23 | Lake Retba, Senegal | P 55 | dsDNA C 9944 MG550110
HHPV-1 | Haloarcula hispanica | Saltern, Margherita di Savoia, Italy | P 52 | dsDNA C 8082 GU321093
HHPV-2 | Haloarcula hispanica | Saltern, Hulu Island, Liaoning, China | P 50 | ssDNA C 8176 KF056323
HHPV3 | Haloarcula hispanica | Saltern, Samut Sakhon, Thailand | P 50 | dsDNA C 11,648 KX344510
HHPV4 | Haloarcula hispanica | Culture supernatant of Haloferax sp. s5a–1 on Har. hispanica culture | P 60 | dsDNA C 15,010 KY264020
His2 | Haloarcula hispanica | Pink Lakes, Victoria, Australia | P 71 | dsDNA L 16,067 AF191797
HGPV-1 | Halogeometricum sp. CG–9 | Saltern, Cabo de Gata, Spain | P 56 | dsDNA C 9694 JN882267
SNJ2 | Natrinema sp. J7–1 | Natrinema sp. J7–1 culture | P 70–80 | dsDNA C 16,992 AJVG01000023 (WGS contig04, 19,792–36,797)
His1 | Haloarcula hispanica | Saltern, Avalon, Victoria, Australia | L 92 × 40 | dsDNA L 14,462 AF191796
TPV1 | Thermococcus prieurii | Deep-sea hydrothermal vents, East Pacific Rise | L 140 × 80 | dsDNA C 21,592 JQ010983
PAV1 | Pyrococcus abyssi | Pyrococcus abyssi culture | L 120 × 80 | dsDNA C 18,098 EF071488
MetSV | Methanosarcina mazei Gö1 | Anaerobic sewage sludge | S 56.6 | dsDNA L 10,567 MF186604

(a) Morphotypes P, L, and S correspond to pleomorphic, lemon-shaped (spindle-shaped), and spherical morphologies. Diameter (for P and S) or length and width (for L) are given (nm). Nd, not determined.
(b) L, linear; C, circular.

sequences of euryarchaea and metagenomic datasets indicate that the viruses known today represent only a minuscule fraction of the true euryarchaeal viral diversity. Here, we review advances in understanding viruses infecting euryarchaea by summarizing the data from culture-independent studies and describing known virus isolates.

Icosahedral Tailed Viruses

To date, about 90 euryarchaeal icosahedral tailed viruses have been isolated; the majority infect halophilic archaea of the class Halobacteria, and the rest infect methanogenic archaea belonging to the family Methanobacteriaceae (Table 1). Archaeal tailed viruses morphologically resemble bacteriophages of the order Caudovirales, with an icosahedral head and either a long contractile (myovirus) or non-contractile (siphovirus) tail or a short non-contractile tail (podovirus) (Fig. 1(A)–(C)). Of these three morphotypes, myovirus isolates are the most numerous, while only one archaeal podovirus, HSTV-1 of Haloarcula sinaiiensis, is known. All currently known euryarchaeal icosahedral tailed viruses have dsDNA genomes (Table 1), which are reminiscent of the genomes of tailed bacteriophages in the order Caudovirales. Tailed archaeal viruses resemble tailed bacteriophages in the modular structure and mosaicism of their genomes, although many ORFs have no assigned functions. The gene for the large terminase subunit is one of the most conserved genes in tailed dsDNA phages, and it is annotated in all euryarchaeal tailed virus genomes described to date. Most of these archaeal viruses have not been assigned to any taxon, except for some of the oldest isolates, e.g., phiH, Hs1, and HF2. However, the obvious similarities with tailed dsDNA phages suggest that all archaeal tailed viruses could be members of the order Caudovirales, albeit with their own family-level taxa. Notably, no tailed viruses have yet been isolated for crenarchaeal hosts, although microscopic and metagenomic studies of high-temperature environments, where crenarchaea typically dominate, have revealed the presence of tailed VLPs and sequences related to phages of Caudovirales.

Icosahedral Tailed Viruses of Halophilic Euryarchaea

Since the discovery of the first archaeal virus Hs1 in 1974, other tailed halophilic archaeal viruses have been isolated: Ja1 (1975), B10 (1982), phiH (1982), Hh-1 and Hh-3 (1982), S45 (1984), phiN (1988), S5100 (1990), HF1 and HF2 (1993), phiCh1 (1997), S41, S50.2, and S4100 (1998). Of these viruses, only a few have been characterized at the molecular level, and some of the early isolates have since been lost. In the 2000s, a few more tailed haloarchaeal viruses were reported: BJ1 (2007), HHTV-1, HRTV-1, and HCTV-1 (2009). Two large virus surveys of hypersaline environmental samples published in 2012 and 2015 yielded close to 60 new tailed archaeal virus isolates infecting halophilic strains of the genera Haloarcula or Halorubrum and one Halogranum tailed virus. The most recently described isolates are ChaoS9 and HFTV1 (2019), of which HFTV1 represents the first virus infecting a Haloferax strain. In addition, several putative proviruses related to tailed viruses have been detected in the genomes of halophilic archaea, showing their wide distribution. One of the earliest euryarchaeal virus isolates is the temperate myovirus phiH, which was obtained from a spontaneously lysed culture of Halobacterium salinarum strain R1. It was actively studied until the late 1990s, when the work on this virus isolate stopped due to its genomic instability. The phiH virion displays a typical myovirus morphotype and requires high salt concentrations (3 M KCl or NaCl) for stability. The phiH genome is packaged into viral particles as linear dsDNA, but is a circular plasmid-like molecule in infected cells. Viral DNA is terminally redundant and partially circularly permuted, being packaged by a headful mechanism. Recently, the genome of phiH1, a predominant variant of phiH, has been fully sequenced and the annotation has been refined (Table 1). 
During replication rounds, a mixture of genetically rearranged phiH variants is produced due to the activity of insertion sequences. A so-called L-segment, flanked by insertion sequence(s), may undergo inversions or circularize into an autonomous 12-kbp plasmid phiHL, which provides partial immunity from phiH infection. phiH1 has been widely used in molecular studies of archaeal gene expression and regulation, resulting in, e.g., the development of transfection methods in archaea. Another temperate myovirus, phiCh1, has been isolated from a spontaneously lysed culture of haloalkaliphilic Natrialba magadii (Table 1). Despite having a very different host, phiCh1 is closely related to phiH. Peculiarly, in addition to viral DNA, phiCh1 virions contain several host-derived RNA species of 70–800 nt in length. Unlike phiH, phiCh1 integrates into the host’s chromosome. The replicative form is circularly permuted and terminally redundant, implying a headful mechanism of packaging. The phiCh1 genome of 58,498 bp is methylated and arranged into three transcriptional units. The phiCh1 and phiH1 genomes display similar gene synteny and share 63% nucleotide identity. Similarly to phiH1, phiCh1 demonstrates genomic rearrangements. The rearrangements caused by the lambda-like integrase Int1 result in an inversion in the region of two genes encoding putative tail fiber proteins, which affects host recognition specificity. Genomic rearrangements have also been suggested in the haloarchaeal myoviruses S41 and S50.2 of Hbt. salinarum and the siphovirus BJ1 of Halorubrum. A myovirus related to phiH and phiCh1, ChaoS9, has recently been isolated after a spontaneous lysis of a Hbt. salinarum culture. The left-end region (first 25 kbp) of the ChaoS9 genome (Table 1) shares similarities in sequence and gene synteny with phiCh1 and phiH, while the right end (from 25 to 55 kbp) is significantly more divergent. The viruses share overall nucleotide identity of over 74%. 
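The link between headful packaging, terminal redundancy, and circular permutation noted above for phiH and phiCh1 can be made concrete with a toy model. The sketch below is an idealized illustration under stated assumptions: the genome is the eight-letter string used in the Glossary, the concatemer is treated as an endless repeat of that genome, and each head is assumed to take up a fixed length slightly longer than one genome. The function name and parameters are hypothetical.

```python
from typing import List


def headful_package(genome: str, headful_length: int, n_virions: int) -> List[str]:
    """Cut successive 'headfuls' from an idealized concatemer of `genome`.

    Each virion receives `headful_length` letters, starting where the
    previous headful ended. Because a headful is longer than one genome,
    every packaged molecule is terminally redundant, and successive
    molecules start at different positions (circular permutation).
    """
    if headful_length <= len(genome):
        raise ValueError("a headful must exceed one genome length in this model")
    packaged = []
    start = 0
    for _ in range(n_virions):
        molecule = "".join(
            genome[(start + i) % len(genome)] for i in range(headful_length)
        )
        packaged.append(molecule)
        start = (start + headful_length) % len(genome)
    return packaged


# Toy genome from the Glossary example; a headful holds 10 letters (125% of the genome).
virions = headful_package("ABCDEFGH", headful_length=10, n_virions=3)
redundancy = 10 - len("ABCDEFGH")  # letters repeated at both ends of each molecule
```

In this toy run each packaged molecule begins and ends with the same two letters (terminal redundancy), and the three molecules start at A, C, and E respectively (circular permutation). A short packaging series from a fixed start site would sample only some of the possible permutations, which is one way to picture a "partially" permuted population.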
The left-end region of ChaoS9 carries tail fiber genes, which are similar to those in phiCh1 and phiH, and genes for other structural proteins, while the right-end region encodes proteins involved in DNA replication and lysogeny. The ChaoS9 major capsid protein (MCP) displays 36% amino acid identity to that of siphovirus HHTV-1 of Haloarcula hispanica. Putative head and assembly proteins are unrelated in ChaoS9, phiH, and phiCh1. The similarities in the tail morphogenesis module and the differences in the head morphogenesis module suggest that large recombination events may have shaped the genomes of these viruses. This kind of genomic plasticity is also typical of tailed bacteriophages of the order Caudovirales.

Another pair of closely related haloarchaeal tailed viruses is HF1 and HF2, both isolated from Cheetham Saltworks in Victoria, Australia. These myoviruses are virulent, with particles of similar size and genomes of similar length (76–77 kbp). Both viruses may enter unstable carrier states, in which virions are produced continuously, but the infection eventually switches to complete lysis. However, HF1 and HF2 have distinct host ranges with no common hosts. The HF1 and HF2 genomes are almost identical in their first 48 kbp, while the right-end regions are 87% identical. Such a genomic structure may be the result of a recent recombination event with some other HF-related virus. The divergent right-end part of the genomes probably contains late genes encoding viral structural proteins, which may explain the different host ranges of these viruses. Both genomes are mosaic and encode their own DNA polymerases.

In the first half of the 2010s, global sampling of hypersaline environments resulted in the isolation of dozens of new tailed haloarchaeal viruses. Tailed haloarchaeal viruses have been shown to infect host strains originating from geographically very distant sites or from the same location sampled over a few years.
Some of these viruses can also infect strains belonging to different genera. Some myoviruses isolated from the Samut Sakhon solar saltern, Thailand, have shown notably wide host ranges, while some Halorubrum strains from the same location were highly susceptible to virus infections.

A few tailed euryarchaeal viruses have been structurally characterized using cryo-electron microscopy (EM) and 3D image reconstruction. The most detailed structure (~9 Å resolution) has been determined for HSTV-1, which is so far the only known archaeal podovirus (Table 1). Haloarcula sinaiiensis virus HSTV-1 is virulent, and its infectivity depends on salinity. At low salinity,

HSTV-1 is reversibly non-infectious, regaining its infectivity when transferred back to high salinity. This characteristic is very advantageous in environments where salinity fluctuates, for example because of changing weather conditions. The majority of HSTV-1 ORFs have no assigned functions, while other ORFs are similar to those involved in DNA replication and nucleotide metabolism or virion assembly in tailed bacteriophages. The order of the genes encoding structural proteins is also the same as that observed in their bacteriophage counterparts. HSTV-1 particles display an icosahedral head with a diameter of 560 Å measured facet to facet and 624 Å vertex to vertex (Fig. 2(A) and (B)). The short podovirus-like tail with base plate tail fibers functions in host recognition. The capsid is decorated with 20 large (at least 138 Å long and 74 Å wide), cone-shaped towers, which are located at positions of threefold symmetry (Fig. 2(B)). The capsid shell is 20 Å thick and is arranged into a T = 7 laevo lattice. The HSTV-1 MCP adopts the bacteriophage HK97 fold (Fig. 2(C)), suggesting evolutionary links between archaeal and bacterial tailed viruses and eukaryotic herpesviruses, all of which have the HK97 fold in their MCPs. The structural similarity between these viral MCPs, which exists without any sequence similarity, unites these diverse viruses into one of the structure-based viral lineages (HK97-like viruses). In addition, homology modeling indicates that the phiCh1 MCP could also adopt the HK97 fold. Most probably, all archaeal tailed viruses use the same conserved HK97 fold to build their capsids.

The other structurally characterized tailed viruses are siphovirus HVTV-1 of Haloarcula vallismortis and myovirus HSTV-2 of Halorubrum sodomense. As with podovirus HSTV-1, the infectivity of HVTV-1 and HSTV-2 is reversibly lost in low-salt conditions. Both viruses are virulent with quite long intracellular phases: cells are lysed after 10–12 h of infection.
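The triangulation number quoted for the HSTV-1 capsid (T = 7) follows the standard quasi-equivalence relation T = h² + hk + k², where h and k are non-negative integer steps on the hexagonal lattice. The sketch below, with illustrative function names that do not come from the source, computes T and the idealized subunit and capsomer counts for a closed icosahedral shell. It ignores the fact that in tailed viruses one vertex is usually occupied by the portal rather than an MCP pentamer, so the numbers are upper bounds for a real head.

```python
def triangulation_number(h: int, k: int) -> int:
    """T = h^2 + h*k + k^2 for lattice steps h and k."""
    return h * h + h * k + k * k


def lattice_indices(t: int) -> list[tuple[int, int]]:
    """All (h, k) with h >= k >= 0 that give triangulation number t."""
    pairs = []
    h = 1
    while h * h <= t:
        for k in range(h + 1):
            if triangulation_number(h, k) == t:
                pairs.append((h, k))
        h += 1
    return pairs


def capsid_counts(t: int) -> dict[str, int]:
    """Idealized counts for a complete icosahedral shell (no portal vertex)."""
    return {
        "subunits": 60 * t,        # 60 asymmetric units, T subunits each
        "pentamers": 12,           # one per fivefold vertex
        "hexamers": 10 * (t - 1),  # remaining subunits grouped in sixes
    }


print(lattice_indices(7))  # lattice steps giving T = 7
print(capsid_counts(7))    # idealized T = 7 shell
```

For T = 7 the only solution with h ≥ k is (2, 1). Swapping h and k gives the mirror-image lattice, which is why skew lattices such as T = 7 are described as laevo or dextro.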
The HVTV-1 virion consists of an icosahedral capsid enclosing the linear genome of 101,734 bp and a long, flexible tail typical of siphoviruses (Table 1). Cryo-EM revealed that the icosahedral capsid of HVTV-1 is arranged into a T = 13 lattice and decorated with fivefold symmetric spikes at the vertices and trimeric decorative structures at the center of each MCP hexamer. The HSTV-2 genome (Table 1) shares ~70% and ~60% overall nucleotide identity with the HF1 and HF2 genomes, respectively. The HSTV-2 capsid (Table 1) is arranged into a T = 7 lattice, being smaller than

Fig. 2 Structure of the haloarchaeal podovirus HSTV-1 head based on cryo-EM and icosahedral reconstruction, in which the tail is not visible. (A) Central section of the icosahedral reconstruction of HSTV-1 viewed along an icosahedral twofold symmetry axis (ellipse). Threefold (triangle) and fivefold (pentagon) symmetry axes are indicated. Bar is 10 nm. (B) 3D isosurface representation of HSTV-1. (C) Bacteriophage HK97 MCP fold (Protein Data Bank code 1OHG). (D) Averaged density showing the MCP fold of HSTV-1 calculated from the six segmented subunits from one HSTV-1 MCP hexamer. HK97 MCP fold (red ribbon) and HSTV-1 MCP fold (blue ribbon) fitted into the EM density in gray (see Panel B). Reproduced from Atanasova, N.S., Bamford, D.H., Oksanen, H.M., 2015. Haloarchaeal virus morphotypes. Biochimie 118, 333–343 with permission from Elsevier.

Fig. 3 Genomic comparisons of the 17 haloarchaeal tailed viruses with completely sequenced genomes. All viruses with complete genome sequences can be found in Table 1. (A) Dotplot alignment of the genomes. (B) Circular visualization of the homologous proteins shared between the selected virus representatives from each of the delineated groups and singletons (myoviruses in blue; siphoviruses in pink; podoviruses in purple). The outermost circle represents the genome maps with the coordinates (kbp) followed by the ORFs (on positive strand in green; on negative strand in red). Putative homologs (sharing over 30% identity) are connected by a gray line. Reproduced from Senčilo, A., Roine, E., 2014. A glimpse of the genomic diversity of haloarchaeal tailed viruses. Frontiers in Microbiology 5, 84.

that of HVTV-1. Minor trimeric proteins between the MCPs at the local threefold positions stabilize the capsid, allowing enlargement of the HSTV-2 capsid to fit a larger genome, a strategy not previously observed in virus capsids.

To date, 20 of the isolated haloarchaeal tailed viruses have completely sequenced genomes (Table 1). In addition, the GenBank database contains a record of the CGphi46 virus genome sequence. According to the description available in GenBank, CGphi46 infects Halorubrum sp. and was isolated from Cargill Saltern, USA. The genome is a linear dsDNA of 39,784 bp, displaying similarities to tailed haloarchaeal virus sequences, such as that of Halorubrum virus BJ1. Generally, the genomes of haloarchaeal tailed viruses are dsDNA molecules ranging in length from roughly 32 kbp (HSTV-1) to 144 kbp (HGTV-1), with a GC content typically similar to that of their hosts. Some conserved bacteriophage genes, such as the gene for the large terminase subunit, are widely found in haloarchaeal tailed virus genomes, while the majority of predicted ORFs have no assigned functions. Based on whole genome sequences, a few groups of similar viruses can be distinguished: (1) HF1, HF2, HRTV-5, and HRTV-8, (2) HRTV-7 and HSTV-2, and (3) HCTV-1, HCTV-5, and HVTV-1, while the other viruses are so far singletons (Fig. 3). Clearly, many more sequences are needed to gain further insight into the genetic diversity and taxonomy of tailed archaeal viruses and to resolve their relatedness to other viral groups.
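The nucleotide identities quoted in this section are whole-genome or regional summaries, and mosaic genomes such as those of HF1 and HF2 show very different values in different regions. A minimal sketch of how identity can be profiled along a pairwise alignment is given below. It assumes the two sequences have already been aligned to equal length by some external tool, skips gapped columns (one of several possible conventions), and uses made-up toy sequences; none of the names or values come from the source.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical columns between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def windowed_identity(a: str, b: str, window: int, step: int) -> list[tuple[int, float]]:
    """Identity in consecutive windows, as (window start, percent identity)."""
    return [
        (start, percent_identity(a[start:start + window], b[start:start + window]))
        for start in range(0, len(a) - window + 1, step)
    ]


# Toy alignment: an identical left half and a diverged right half.
seq_a = "ACGTACGTAC" + "GGGGGGGGGG"
seq_b = "ACGTACGTAC" + "GGGTGGGAGG"
print(percent_identity(seq_a, seq_b))
print(windowed_identity(seq_a, seq_b, window=10, step=10))
```

In the toy case the overall identity is 90%, but the window profile separates a fully conserved region from an 80% identical one, which is the kind of pattern that points to recombination between related genomes.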

Tailed Viruses of Methanogenic Euryarchaea

Since the 1980s, several viruses infecting methanogenic archaea have been reported: PG (1984), PMS1 (1986), psiM1 and psiM2 (1989), phiF1 and phiF3 (1993), and Drs3 (2018). In addition, tailed VLPs have been found associated with Methanococcus voltae PS. Icosahedral VLPs with long contractile or non-contractile tails (myoviruses and siphoviruses) have been found in high abundance in acetate-fed anaerobic digester sludge reactors dominated by Methanosaeta. High numbers of VLPs, including tailed ones, have also been reported in other anaerobic digesters inhabited by methanogens. Moreover, several proviruses have been identified in the genomes of methanogenic archaea belonging to the orders Methanobacteriales, Methanococcales, and Methanosarcinales.

Siphovirus psiM1 was isolated from an anaerobic sludge digester (55–60°C) on Methanothermobacter marburgensis (Table 1). The circularly permuted and terminally redundant linear genome of psiM1 is packaged from a concatemeric precursor via a headful mechanism. psiM2 is a deletion derivative of the wild-type psiM1 that lacks a 0.7-kbp segment, which makes it more stable than the wild type. Both variants are virulent and encode pseudomurein endoisopeptidase, an enzyme that cleaves host pseudomurein. However, a putative integrase gene is also found in psiM1 and psiM2. It has been observed that ~15% of psiM1 particles carry multimers of the cryptic 4.5-kbp host plasmid pME2001, suggesting that psiM1 can mediate generalized transduction. Indeed, the virus has been shown to transduce some host chromosomal markers. A defective prophage related to psiM1 and

psiM2, psiM100, can be found in the chromosome of Methanothermobacter wolfeii. Upon hydrogen limitation, Methanothermobacter wolfeii cultures lyse spontaneously without producing VLPs, but the autolysate contains the pseudomurein endoisopeptidase encoded by psiM100. The psiM100 genome is ~28.8 kbp and is 70.8% identical to that of psiM2, except for a region of 2.8 kbp, which is apparently an insertion from a virus unrelated to psiM1 or psiM2.

Two virulent viruses, phiF1 and phiF3, infecting Methanobacterium thermoformicicum have been isolated from an anaerobic sludge reactor. phiF1 has an icosahedral head of ~70 nm in diameter and a non-flexible tail of ~160 nm in length, while phiF3 has an icosahedral head of ~55 nm in diameter and a flexible, 230-nm-long tail. The two viruses differ significantly in their host ranges, genome organization, and size. Of the tested Methanobacterium strains, phiF1 infected M. thermoautotrophicum strain ΔH and several strains of M. thermoformicicum, while phiF3 was able to infect only M. thermoformicicum strain FF3. phiF1 has a linear dsDNA genome of ~85 kbp, while the phiF3 genome is a circular or terminally redundant linear dsDNA molecule of ~36 kbp. No similarity between their sequences was found in hybridization experiments. No complete genome sequence is available for these viruses. The other siphovirus infecting methanogenic euryarchaea, Drs3 (Table 1), is only distantly related to psiM2 or psiM100, and the majority of its ORFs have no predicted functions.

Icosahedral Internal Membrane-Containing Viruses

Currently, five icosahedral internal membrane-containing viruses infecting euryarchaea have been described, SH1 being the first virus isolate of this group (found in 2005; Table 2). Their genomes are either linear or circular dsDNA molecules. These viruses are classified into the family Sphaerolipoviridae together with the phages P23-77 and IN93, which infect thermophilic Thermus bacteria. The genus Alphasphaerolipovirus includes the Haloarcula hispanica viruses SH1, PH1, and HHIV-2 and the Haloarcula californiae virus HCIV-1. Natrinema virus SNJ1 is the sole member of the genus Betasphaerolipovirus, and the bacteriophages have their own genus, Gammasphaerolipovirus.

Archaeal sphaerolipoviruses have been isolated from different hypersaline environments: Australian salt lakes and solar salterns in Italy and Thailand (Table 2). Their host cells are either Haloarcula or Natrinema species. SH1, PH1, HHIV-2, and HCIV-1 are virulent, and their progeny is released by host cell lysis. SNJ1 is temperate and was initially found in the lysogenic strain Natrinema sp. J7-1, which originates from a salt mine in Hubei province, China. Strain J7-1 contains the plasmid pHH205, whose sequence is identical to that of SNJ1; this plasmid is the proviral form of SNJ1 and replicates freely in the cell. SNJ1 employs a rolling-circle mechanism of genome replication and harbors a gene encoding a replication initiation protein. SNJ1 can be induced by mitomycin C and plated on Natrinema sp. strain J7-2 to obtain plaques. Natrinema sp. J7-1 is also the source of another halophilic virus, SNJ2, which is a pleomorphic virus (Table 3).

Archaeal sphaerolipovirus virions are icosahedral, ~80 nm in diameter, and contain more than 12 different protein types, which form the capsid or the receptor-recognizing spike complexes, or associate with the membrane. Underneath the icosahedral capsid, the protein-rich membrane bilayer encloses the genome.
This virion morphology is also found among crenarchaeal viruses (e.g., turriviruses STIV and STIV-2) and bacterial viruses (e.g., tectivirus PRD1 and corticovirus PM2). The recently obtained cryo-EM-based whole-virion structures of HCIV-1, HHIV-2, and SH1 at 3.7–3.8 Å resolution revealed that the capsids are arranged in a pseudo T = 28 dextro lattice composed of two MCP species (Fig. 4). MCP VP7 is smaller and composed of a single β-barrel, whereas the large MCP VP4 is composed of two β-barrels. One β-barrel forms the lattice, and the other sits on top of the lattice-forming β-barrel, forming the tower structure on the capsomer surface (Fig. 4(B)). The MCPs are conserved among the archaeal sphaerolipoviruses: the amino acid similarities of the large MCP VP4 and the small MCP VP7 are 90%–97% and 86%–99%, respectively. MCPs VP4 and VP7 form pseudo-hexameric capsomers with either two or three towers, which are built of VP7–VP4 heterodimers and monomeric VP7. Capsid assembly is coordinated by the conserved penton protein complexes at the fivefold vertices and by two specific membrane-associated complexes, GPS-II and GPS-III. The capsid of the bacterial sphaerolipovirus P23-77 has the same T = 28 organization, but the sequences of its MCPs are divergent from those of the archaeal counterparts, although they have similar single β-barrel folds. The multi-protein spike complexes are either dimeric horn-shaped structures (SH1, HCIV-1, and PH1) or pentameric propeller-like complexes (HHIV-2) plugged into the fivefold vertices, where they serve as host recognition devices.

The viral membranes are composed of virus-encoded proteins and host-derived lipids, which are selectively acquired from the host membranes during virion assembly. The membrane is closely associated with the capsid, being connected to it at least by the penton and GPS complexes.
The major lipid species in HCIV-1, HHIV-2, and SH1 virions are phosphatidylglycerol, phosphatidylglycerophosphate methyl ester, and phosphatidylglycerosulfate. Similarly, the fatty acids of SNJ1 are of host origin, but their composition differs somewhat from that in the viruses of Haloarcula.

The ~30 kbp genomes of HCIV-1, HHIV-2, SH1, and PH1 are linear and have inverted terminal repeats and terminal proteins attached to the ends of the molecules, suggesting a protein-primed mechanism of genome replication. However, no canonical DNA polymerase gene, such as that in the genome of bacteriophage phi29, is found in the viral or host genomes, indicating that a different mode of replication might be employed. The genomes of HCIV-1, HHIV-2, SH1, and PH1 share over 56% nucleotide identity. SH1 and PH1 are the most closely related viruses, with ~76% identity. The four genomes are co-linear, and the organization of their ~50 ORFs is conserved (Fig. 5). The circular SNJ1 genome sequence shares little similarity with those of the other archaeal sphaerolipoviruses at the nucleotide level. However, the arrangement of the viral core genes encoding the putative genome packaging ATPase and the small and large MCPs is


Fig. 4 Cryo-EM structures of HCIV-1 and HHIV-2 virions and their MCPs. (A) Cryo-EM density maps of HCIV-1 (left) and HHIV-2 (right) (vertex complexes have been omitted). HCIV-1 is rendered to display the capsid shell (left-half) and the particle interior (right-half). Color-coding by distance from the center (legend below); genome, red; inner and outer membrane leaflets, yellow and yellow-lime. Icosahedral asymmetric unit (IAU) and capsomer organization are shown on HCIV-1 surface (white-transparent hexagons) and at the center. Capsomers are numbered Nos. 1–5; the three-tower capsomers (No. 1, light-yellow; No. 2, cyan; No. 3, pink) and two-tower capsomers (No. 4, light-green; No. 5, light-magenta). Two MCPs VP4 and VP7 are visualized by blue and light-gray circles. The MCP subunits are labeled (A-X and a). Black short lines joining the circles identify the VP7–VP4 heterodimers. Black pentagons represent the positions of the penton complex plugging the vertices. Facet of the virion and the icosahedral symmetry axes (two-, three-, and fivefold) are marked on HHIV-2 by a white triangle and numbers (2, 3, and 5), respectively. (B) Single β-barrel MCPs VP7 (left) and VP4 (right) of HCIV-1. VP4 is composed of two vertical single β-barrels, one standing on top of the other, while VP7 consists of a vertical single β-barrel. The four-stranded sheets BIDG and CHEF are labeled in VP7. A loop in VP7 (residues 149–154) is marked by a dashed circle. Region of the VP4 atomic model is shown as a stereoview (black rectangles) and fitted into the corresponding 3.7 Å resolution density map. Reproduced from Santos-Pérez, I., Charro, D., Gil-Carton, D., et al., 2019. Structural basis for assembly of vertical single β-barrel viruses. Nature Communications 10, 1184.

conserved. These three genes are the most conserved among HCIV-1, HHIV-2, SH1, and PH1 and can also be found in bacteriophages belonging to the same family. The least conserved genes encode the receptor recognition complexes. The HHIV-2 spike complex genes differ from those found at the same position in the genomes of HCIV-1, SH1, and PH1 (Fig. 5), as do the structures of their vertex complexes. The putative proviruses HalaCibP1, HalaPauP1, and HaloLacP1, identified in the chromosomes of Haladaptus cibarius, Haladaptus paucihalophilus, and Halobiforma lacisalsi, resemble HCIV-1, HHIV-2, SH1, and PH1. They carry signatures of putative integrase or transposase genes and tRNA genes flanking the sphaerolipovirus-like provirus sequences, indicating that some of these viruses could be temperate alphasphaerolipoviruses. In addition, two euryarchaeal


Fig. 5 Genome comparison of icosahedral membrane-containing euryarchaeal dsDNA virus isolates belonging to the family Sphaerolipoviridae. Gray arrows, ORFs and genes; other colored arrows, homologous genes coding for structural virion proteins (VPs) indicated above; light blue arrow heads, inverted terminal repeats (ITR). The nomenclature for the virion proteins (VPs) is the same in all four viruses except for the vertex complex proteins, which are marked with asterisks: proteins VP16 and VP17 in HHIV-2 and proteins VP3 and VP6 in the other viruses. Genes encoding putative vertex complex proteins are bordered with black lines. Pairwise amino acid similarities (%) between the major membrane proteins (VP12), major capsid proteins (MCPs), or putative packaging ATPases are shown in between the genomes. Reproduced from Demina, T.A., Pietilä, M.K., Svirskaite, J., et al., 2017. HCIV-1 and other tailless icosahedral internal membrane-containing viruses of the family Sphaerolipoviridae. Viruses 9, 32.

proviruses TKV4 and MVV (putatively forming icosahedral particles) can be found integrated into the chromosomes of Thermococcus kodakaraensis KOD1 and Methanococcus voltae A3.
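The inverted terminal repeats described above for the linear alphasphaerolipovirus genomes have a simple sequence signature: the left end of the genome reads the same as the reverse complement of the right end. The sketch below measures the length of an exact inverted terminal repeat in a linear sequence. The function names and the toy sequence are illustrative assumptions, and real analyses would normally tolerate mismatches, which this version does not.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def inverted_terminal_repeat_length(genome: str) -> int:
    """Length of the longest exact ITR at the ends of a linear genome."""
    genome = genome.upper()
    rc = reverse_complement(genome)
    length = 0
    # Compare the left end with the reverse complement of the right end,
    # looking no further than half the genome.
    for left_base, right_base in zip(genome[: len(genome) // 2], rc):
        if left_base != right_base:
            break
        length += 1
    return length


# Toy genome: 6-nt repeat "ACGGTA", a spacer, then its reverse complement.
toy = "ACGGTA" + "CCCCC" + "TACCGT"
print(inverted_terminal_repeat_length(toy))
```

A sequence with no such symmetry, for example "AAAACCCC", gives a repeat length of zero.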

Pleomorphic, Spindle-Shaped, and Spherical Viruses

Pleomorphic, spindle-shaped, and spherical viruses of euryarchaea have non-lytic life cycles. Of these groups, halophilic pleomorphic viruses are the most numerous, with 18 virus isolates described to date (Table 3). A few spindle-shaped (i.e., lemon-shaped) viruses have been isolated on halophilic or thermophilic euryarchaea, whereas the only example of a spherical euryarchaeal virus is MetSV, which infects the anaerobic methane producer Methanosarcina (Table 3).

Pleomorphic Viruses

Since the discovery of the first archaeal pleomorphic virus, HRPV-1, in 2009, altogether 18 pleomorphic viruses infecting halophilic euryarchaea have been described. Fifteen of those are members of the family Pleolipoviridae, which is the first virus family containing viruses with either ssDNA or dsDNA genomes (Table 3). In addition, three virus isolates are tentative species of the family. Here, we briefly summarize the current knowledge on the life cycles, virions, and genomic sequences of pleolipoviruses. For more information, please see the further reading “Vesicle-Like Archaeal Viruses”.

The majority of pleolipoviruses have been isolated on halophilic archaea belonging to the genera Halorubrum and Haloarcula of the class Halobacteria. Pleolipoviruses establish non-lytic infections, presumably using budding as an exit mechanism. Virus particles are produced continuously, and host growth is typically retarded. SNJ2 is a temperate virus, and some other viruses also harbor integrase-encoding genes in their genomes. Pleolipoviruses lack a rigid protein capsid; instead, their virions are flexible membrane vesicles (Fig. 6). Virions have only 2–4 structural protein types, of which the spike protein induces the fusion of viral and host membranes during virus entry. Pleolipoviruses have either ssDNA or dsDNA circular genomes, except for His2, whose genome is a linear dsDNA molecule (Table 3). All pleolipovirus genomes carry a conserved block of collinear genes. Based on whole genome information and gene content, the pleolipoviruses are classified into the genera Alphapleolipovirus, Betapleolipovirus, and Gammapleolipovirus. In addition, proviral sequences related to pleolipoviruses are frequently found in the genomes of halophilic archaea, indicating that these viruses are quite common in hypersaline environments.

Spindle-Shaped Viruses

The spindle-shaped morphology has so far been found only among archaeal viruses. There are several crenarchaeal spindle-shaped viruses belonging to the family Fuselloviridae, but His1, which infects Haloarcula hispanica, is the only halophilic euryarchaeal spindle-shaped virus. His1 is taxonomically assigned to the genus Salterprovirus in the newly proposed family Halspiviridae. The linear dsDNA genome of His1 encodes a putative type-B DNA polymerase for protein-primed replication. The spindle-shaped His1 virion with a short tail is elastic and very stable in different conditions (Fig. 6). When inactivated (e.g., by treatment with non-ionic detergent or boiling), the spindle-shaped particle releases its genome and transforms into an empty tube. A similar mechanism might be used when the viral genome is injected upon host infection. His1 particles vary in size and are composed of only one MCP type and a few minor structural protein species. The His1 MCP exists in two forms, one of which is modified with lipids. No lipid bilayer has been detected in the His1 virion. Virion elasticity might be a consequence of the lipid modification of the MCP. The His1 life cycle is non-lytic, and infection leads to continuous virus production.


Fig. 6 Spindle-shaped virus His1 of the family Halspiviridae. (A) A slice through a tomogram of His1 viruses. (B–E) Structure of His1 virion is shown (B and C top and side views of the symmetry-free map; D and E top and side views of the sixfold symmetrized map). (F) The central slice of the refined sixfold average map showing the two cavities and a plug density. (G1–G3) Slices normal to the long axis of the virion at the indicated positions in F. Reproduced from Hong, C., Pietilä, M.K., Fu, C.J., et al., 2015. Lemon-shaped haloarchaeal virus His1 with uniform tail but variable capsid structure. Proceedings of the National Academy of Sciences of the United States of America 112, 2449–2454.

TPV1 is the only described spindle-shaped virus isolate of a hyperthermophilic euryarchaeon (Table 3). Its circular dsDNA genome is present freely and in high copy numbers in the host cell, Thermococcus prieurii, from which the virus is produced continuously in a non-lytic fashion. The VLP PAV1 is also spindle-shaped with a short tail and was isolated from the hyperthermophilic euryarchaeon Pyrococcus abyssi, from which it is continuously released (Table 3). However, its infectivity has not been demonstrated. The circular PAV1 genome also exists as an episome in the Pyrococcus abyssi cytoplasm. Half of the PAV1 genome is composed of genes that have homologs in pTN2-like plasmids of Thermococcus strains. PAV1 most probably persists in its host in a stable carrier state. In addition, a spindle-shaped VLP, A3-VLP, has been observed in the methanogenic euryarchaeon Methanococcus voltae. The major capsid proteins of the spindle-shaped viruses with a short tail, namely His1 (type virus of Halspiviridae), PAV1, TPV1, A3-VLP, and SSV1 (type virus of Fuselloviridae), are homologous based on their sequence similarity, suggesting a close evolutionary relationship.

Spherical Virus MetSV

The only euryarchaeal virus with spherical morphology is MetSV, which was isolated from anaerobic sewage sludge on Methanosarcina mazei Gö1 (Fig. 1(G), Table 3). The spherical particles are ~57 nm in diameter and have a blackberry-like envelope. Although smaller and lacking helical nucleoparticles, MetSV particles generally resemble those of PSV, the previously reported spherical virus infecting thermophilic crenarchaea of the genera Pyrobaculum or Thermoproteus. Analysis of viral and host lipids has revealed that virus particles contain an internal membrane composed of lipids obtained from the host. The presence of direct terminal repeats at the ends of the linear genome and of a gene encoding a type B DNA polymerase suggests that MetSV is replicated via a protein-primed mechanism. The majority of MetSV ORFs could not be assigned putative functions. MetSV is virulent, and its host range includes a few Methanosarcina mazei strains and M. barkeri strain DSM 1311 growing as single cells. However, sarcina-like aggregates are resistant to MetSV infection. MetSV-derived spacers have been identified (with

some mismatches) in Clustered Regularly-Interspaced Short Palindromic Repeats (CRISPR) loci in several Methanosarcina sp. genomes, but not in the type strain M. mazei DSM 3647.
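Matching CRISPR spacers to a viral genome "with some mismatches" amounts to an approximate string search on both DNA strands. The sketch below is a brute-force illustration of that idea, not the method used in the studies cited here. The function names, the mismatch threshold, and the toy sequences are assumptions made for the example, and positions reported for the minus strand refer to coordinates on the reverse-complemented genome.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_protospacers(spacer: str, genome: str, max_mismatches: int = 2):
    """Return (strand, position, mismatches) for every approximate match."""
    spacer, genome = spacer.upper(), genome.upper()
    n = len(spacer)
    hits = []
    for strand, target in (("+", genome), ("-", reverse_complement(genome))):
        for i in range(len(target) - n + 1):
            d = hamming(spacer, target[i:i + n])
            if d <= max_mismatches:
                hits.append((strand, i, d))
    return hits


# Toy viral genome containing the protospacer GATTACAGGC.
toy_genome = "TTTT" + "GATTACAGGC" + "TTTT"
print(find_protospacers("GATTACTGGC", toy_genome, max_mismatches=1))
print(find_protospacers("GCCTGTAATC", toy_genome, max_mismatches=0))
```

The first query differs from the protospacer at one position and is found on the plus strand; the second is the exact reverse complement and is found on the minus strand. For genome-scale data an indexed aligner would replace the quadratic scan used here.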

Culture-Independent Studies

Microscopic examinations of hypersaline environments have revealed high VLP concentrations, typically up to 10⁹ VLP/mL, and varying VLP/cell ratios (1–100). A variety of VLP shapes has been observed in hypersaline environments, with spindle-shaped VLPs being the most abundant, especially in the saltiest waters. Other observed morphotypes include spherical, filamentous, and tailed icosahedral particles, as well as some more unusual shapes. For example, star-shaped VLPs have been detected in the Dead Sea, and many unusual particle morphologies have been found in hypersaline Lake Retba, e.g., hairpin-shaped and bacilliform particles, branched filaments, and chains of small globules. It should be noted, however, that bacterial and archaeal viruses cannot be distinguished by microscopy and that, for example, pleomorphic VLPs can be confused with the membrane vesicles commonly produced by cells.

The square archaeon Haloquadratum walsbyi is very abundant in hypersaline environments, but no viruses have yet been isolated for it. However, EM examination of environmental samples has revealed icosahedral tailed and spindle-shaped VLPs associated with square cells of Haloquadratum sp. In addition, pulsed-field gel electrophoresis of environmental DNA from the Santa Pola solar salterns, Alicante, Spain, followed by construction and sequencing of fosmid and plasmid clones, allowed the reconstruction of a near-complete genome sequence of EHP-1 (environmental halophage 1), which presumably infects Hqt. walsbyi. Among the other 42 partial viral genome sequences obtained from the same salterns, half could be tentatively assigned to Hqt. walsbyi, Nanohaloarchaea, or Hrr. lacusprofundi, indicating potential virus-host pairs. In addition, single-cell genomics and microarrays have been used to identify virus-host pairs in environmental samples from the Santa Pola salterns, with NHV-1 (nanohaloarchaeal virus 1) and its nanohaloarchaeal host as the case study.
Metagenomic and metaproteomic analyses of Deep Lake in Antarctica revealed that the lake is dominated by haloarchaea. The viral MCP sequences identified matched several haloarchaeal virus isolates: HCTV-1, HVTV-1, HCTV-2, HHTV-1, and CGphi46. The matches to HCTV-1-like prohead protease and HRTV-7-like scaffold protein sequences indicate active virus life cycles in the sampled community. Analysis of metagenomes from Antarctic salt lakes has also revealed matches between archaeal CRISPR spacers and the tailed halovirus isolates ChaoS9, phiH1, and phiCh1.

Virus-host interactions in hypersaline environments remain largely unresolved. Changes in salt concentration, which are inevitable in natural environments, have been shown to affect the adsorption rates, infectivity, and life cycles of haloarchaeal viruses. Virus-host systems in hypersaline environments seem to be diverse and dynamic. Generally, common features are found in metaviromes from distant hypersaline environments, but finer-level spatial and temporal analyses often suggest conflicting trends in viral dynamics. When compared to sequence databases, metagenomic datasets from hypersaline environments usually yield a very limited number of hits. Virus defense systems such as CRISPR-Cas, commonly found in archaea, suggest a constant active interplay between viruses and their hosts.

Euryarchaeal viruses have been detected by metagenomics even deep in the ocean basement. A complete genome sequence (JdFR1000234; 55,906 bp) of a putative archaeal virus was obtained from geothermally heated fluid samples collected from the ocean basement (117–292 m depth) of the Juan de Fuca Ridge, North America. It encodes a putative viral MCP, portal protein, and terminase, displaying similarities to archaeal icosahedral viruses with a long contractile tail.
In addition, the complete viral genome of ANM-1, which presumably infects an anaerobic methane-oxidizing archaeon, was assembled from subsurface sediments collected from a methane seep (820 m depth) in Santa Monica Basin. ANM-1-related metagenomic sequences found in other methane seeps suggest that these viruses are widely distributed in such environments. Interestingly, the ANM-1 sequence contains diversity-generating retroelements, which, together with similar elements found in the genomes of subterranean nanoarchaea, represent the first examples of this diversification mechanism in archaeal systems. A variety of VLPs resembling archaeal viruses have been found in deep anoxic sediments of Lake Pavin, France. Pleomorphic ellipsoid VLPs have been observed associated with filamentous cells, presumably belonging to methanogenic archaea of the genus Methanosaeta.

Recently, tens of genomes of viruses putatively infecting unculturable marine Group II euryarchaea have been assembled from environmental viromes (the Red Sea, Tara Oceans, and Osaka Bay viromes). These viruses, called magroviruses, possess dsDNA genomes of up to ~100 kbp, displaying similarities to euryarchaeal tailed viruses. Interestingly, magrovirus genomes appear to encode an extensive set of proteins involved in DNA replication, i.e., a nearly complete replication apparatus of the archaeal type. Marine euryarchaea and magroviruses seem to be globally widespread and abundant, although no such virus-host systems have yet been isolated.

Conclusions

The diversity of euryarchaeal viruses is still undersampled compared to that of viruses infecting bacteria or eukaryotes, limiting our understanding of their evolution and environmental impacts. The existing knowledge suggests that euryarchaeal viruses represent a very diverse group of viruses with various morphologies and life strategies. Detailed molecular studies have revealed unexpected evolutionary links between archaeal, bacterial, and eukaryotic viruses, while environmental studies have shown that euryarchaea and their viruses are abundant and ecologically relevant in various environments, both moderate and extreme.


Possible evolutionary relations have also been suggested between membrane-enclosed pleomorphic viruses and membrane vesicles, which are commonly produced by cells. Much more research is needed to understand the relations within and beyond the group of euryarchaeal viruses, their interactions with host cells, and the extent of their ecological roles.

Further Reading

Atanasova, N.S., Bamford, D.H., Oksanen, H.M., 2015. Haloarchaeal virus morphotypes. Biochimie 118, 333–343.
Atanasova, N.S., Bamford, D.H., Oksanen, H.M., 2016. Virus-host interplay in high salt environments. Environmental Microbiology Reports 8, 431–444.
Atanasova, N.S., Demina, T.A., Buivydas, A., Bamford, D.H., Oksanen, H.M., 2015. Archaeal viruses multiply: Temporal screening in a solar saltern. Viruses 7, 1902–1926.
Atanasova, N.S., Roine, E., Oren, A., Bamford, D.H., Oksanen, H.M., 2012. Global networks of specific virus-host interactions in hypersaline environments. Environmental Microbiology 14, 426–440.
Bamford, D.H., Pietilä, M.K., Roine, E., et al., 2017. ICTV virus taxonomy profile: Pleolipoviridae. Journal of General Virology 98, 2916–2917. (ICTV Report Consortium).
De Colibus, L., Roine, E., Walter, T.S., et al., 2019. Assembly of complex viruses exemplified by a halophilic euryarchaeal virus. Nature Communications 10, 1456.
Demina, T.A., Pietilä, M., Svirskaite, J., et al., 2017. HCIV-1 and other tailless icosahedral internal membrane-containing viruses of the family Sphaerolipoviridae. Viruses 9, 32.
El Omari, K., Li, S., Kotecha, A., et al., 2019. The structure of a prokaryotic viral envelope protein expands the landscape of membrane fusion proteins. Nature Communications 10, 846.
Hong, C., Pietilä, M.K., Fu, C.J., et al., 2015. Lemon-shaped haloarchaeal virus His1 with uniform tail but variable capsid structure. Proceedings of the National Academy of Sciences of the United States of America 112, 2449–2454.
Pietilä, M.K., Atanasova, N.S., Oksanen, H.M., Bamford, D.H., 2013. Modified coat protein forms the flexible spindle-shaped virion of haloarchaeal virus His1. Environmental Microbiology 15, 1674–1686.
Pietilä, M.K., Laurinmäki, P., Russell, D.A., et al., 2013. Structure of the archaeal head-tailed virus HSTV-1 completes the HK97 fold story. Proceedings of the National Academy of Sciences of the United States of America 110, 10604–10609.
Pietilä, M.K., Roine, E., Senčilo, A., Bamford, D.H., Oksanen, H.M., 2016. Pleolipoviridae, a newly proposed family comprising archaeal pleomorphic viruses with single-stranded or double-stranded DNA genomes. Archives of Virology 161, 249–256.
Prangishvili, D., Bamford, D.H., Forterre, P., et al., 2017. The enigmatic archaeal virosphere. Nature Reviews Microbiology 15, 724–739.
Senčilo, A., Jacobs-Sera, D., Russell, D.A., et al., 2013. Snapshot of haloarchaeal tailed virus genomes. RNA Biology 10, 803–816.
Senčilo, A., Roine, E., 2014. A glimpse of the genomic diversity of haloarchaeal tailed viruses. Frontiers in Microbiology 5, 84.
Weidenbach, K., Nickel, L., Neve, H., et al., 2017. Methanosarcina spherical virus, a novel archaeal lytic virus targeting Methanosarcina strains. Journal of Virology 91, e00955-17.

Chapter "Vesicle-Like Archaeal Viruses".

Relevant Websites

https://talk.ictvonline.org/ International Committee on Taxonomy of Viruses (ICTV).
https://talk.ictvonline.org/ictv-reports/ictv_online_report/ssdna-dsdna-viruses/w/pleolipoviridae The Family Pleolipoviridae.

Vesicle-Like Archaeal Viruses

Elina Roine, University of Helsinki, Helsinki, Finland
Nina S Atanasova, Finnish Meteorological Institute, Helsinki, Finland and University of Helsinki, Helsinki, Finland

© 2021 Elsevier Ltd. All rights reserved.

Nomenclature

Å: Angstrom
CM: Cytoplasmic membrane
HAPV-2: Haloarcula pleomorphic virus 2
HGPV-1: Halogeometricum pleomorphic virus 1
HHPV-1: Haloarcula hispanica pleomorphic virus 1
HHPV-2: Haloarcula hispanica pleomorphic virus 2
HHPV3: Haloarcula hispanica pleomorphic virus 3
HHPV4: Haloarcula hispanica pleomorphic virus 4
His2: His2 virus
HRPV: Haloarchaeal pleomorphic virus
HRPV-1: Halorubrum pleomorphic virus 1
HRPV-2: Halorubrum pleomorphic virus 2
HRPV-3: Halorubrum pleomorphic virus 3
HRPV-6: Halorubrum pleomorphic virus 6
HRPV-7: Halorubrum pleomorphic virus 7
HRPV-8: Halorubrum pleomorphic virus 8
HRPV-9: Halorubrum pleomorphic virus 9
ICTV: International Committee on Taxonomy of Viruses
kb: Kilobase
kDa: Kilodalton
nt: Nucleotide
NTPase: Nucleoside-triphosphatase
ORF: Open reading frame
pHK2: Haloarchaeal plasmid pHK2
RCR: Rolling circle replication
SNJ1: Natrinema virus 1
SNJ2: Natrinema virus 2
TM: Transmembrane domain
tRNAArg: Transfer RNA specific for arginine
tRNAMet: Transfer RNA specific for methionine
VP: Virus protein
wHTH: Winged helix-turn-helix

Glossary

Alphapleolipovirus: Pleolipoviruses with circular ssDNA genomes encoding putative rolling circle replication initiation proteins.
Archaeal virus: Virus infecting organisms in the third domain of life, the Archaea.
Betapleolipovirus: Pleolipoviruses with circular dsDNA genomes with short single-stranded discontinuities.
Gammapleolipovirus: Pleolipoviruses with linear dsDNA genomes with inverted terminal repeat sequences and a gene for a putative type B DNA polymerase.
Glycosylation: Addition of carbohydrate(s), i.e., a glycan, to another molecule. In this article, glycosylation is limited to proteins as target molecules.
Haloarchaea: Extremophilic archaea that require NaCl concentrations above 1.5 M for survival.
Integrase: An enzyme that catalyzes the integration of viral genomic DNA into the genome of the host cell.
N-Glycosylation: A mode of glycosylation in which an asparagine (Asn/N) of a protein is modified by the addition of a glycan.
PhiH repressor: Repressor protein whose gene was originally identified in the archaeal virus PhiH of Halobacterium salinarum, with a function to repress the production of an early lytic transcript. PhiH-like repressor genes have been identified in the genomes of those pleolipoviruses that contain a putative integrase gene.
Pleolipovirus: Archaeal virus with a vesicle-like virion architecture.
Receptor: A molecule, usually exposed on the surface of the host cell, that the virus uses to recognize the host cell.
S-Layer protein (archaeal): Archaeal cell wall glycoproteins that can self-assemble into a paracrystalline surface layer. S-layer proteins are candidate pleolipovirus receptors.
Spike: Structural virion surface protein that usually protrudes from the surface of the viral particle and serves as the receptor recognition protein. In pleolipoviruses it also serves as a membrane fusion protein.
VP3-Like protein: Pleolipovirus virion membrane-associated protein that resides, for the most part, inside the virion. It is a homolog of the HRPV-1 VP3 protein.
VP4-Like protein: Pleolipovirus virion spike protein that serves as the receptor recognition protein and fusion protein. It is a homolog of the HRPV-1 VP4 protein.

Introduction

Haloarchaeal pleomorphic viruses (HRPVs) belong to the recently established family Pleolipoviridae. The first isolate and now the typical member of the family, Halorubrum pleomorphic virus 1 (HRPV-1), was reported in 2009. It was also the first archaeal virus found to contain a single-stranded DNA (ssDNA) genome. Since then, new representatives have been reported, adding up to almost 20 different isolates that, after tailed viruses, represent the second largest group of viruses infecting haloarchaea (Table 1). Apart from representing the first archaeal viruses containing ssDNA genomes, the family Pleolipoviridae is also the first family officially classified by the International Committee on Taxonomy of Viruses (ICTV) that contains viral species harboring either single-stranded or double-stranded DNA (dsDNA) genomes. Despite the increasing number of new isolates and their genomic descriptions, we still have very little information about the detailed mechanisms of the different steps in the viral life cycle. An exception is the recent report on the atomic structure of the HRPV spike protein, a membrane fusion protein, and its low-resolution structure on the virion (see Structural Proteins).

Since viruses are obligate parasites, information on their host organisms is also crucial for understanding the interaction. Receptors are among the particularly important host factors for viral infection. According to the current understanding, the cell wall of most haloarchaeal strains consists of the cytoplasmic membrane (CM) covered by S-layer proteins that form a paracrystalline layer on the cell surface, with a pseudo-periplasmic space between the CM and the surface of the S-layer. Not only can S-layer proteins be highly variable, which makes them the most probable receptor candidates, but they are also glycosylated. The haloarchaeal S-layer proteins are N-glycosylated, and as the types of glycosylation in different haloarchaeal hosts can be highly diverse, the glycan structures are also good candidates as receptors or as parts of them.

Table 1. Reported haloarchaeal pleomorphic viruses

Virus   Genome type (c/l) [a]     Genome size (nt/bp)  Capsid diameter (nm)  Host
------  ------------------------  -------------------  --------------------  --------------------------
HRPV-1  ssDNA, c                  7048                 41                    Halorubrum sp. PV6
HRPV-2  ssDNA, c                  10,656               54                    Halorubrum sp. SS5–4
HRPV-3  discontinuous dsDNA, c    8770                 67                    Halorubrum sp. SP3–3
HRPV-6  ssDNA, c                  8549                 49                    Halorubrum sp. SS7–4
HRPV9   discontinuous dsDNA, c    16,159               ~57                   Halorubrum sp. SS7–4
His2    dsDNA, l                  16,067               71                    Haloarcula hispanica
SNJ2    dsDNA, c                  16,992               ~70–80                Natrinema sp. J7–1
HHPV-1  dsDNA, c                  8082                 52                    Haloarcula hispanica
HHPV-2  dsDNA, c                  8176                 ~50                   Haloarcula hispanica
HHPV3   discontinuous dsDNA, c    11,648               ~50                   Haloarcula hispanica
HHPV4   discontinuous dsDNA, c    15,010               ~60                   Haloarcula hispanica
HGPV-1  discontinuous dsDNA, c    9694                 56                    Halogeometricum sp. CG–9
HRPV-7  ND [b]                    ND                   ~43                   Halorubrum sp. SS5–4
HRPV-8  ND                        ND                   ~51                   Halorubrum sp. SP3–3
HAPV-2  ND                        ND                   ~40                   Haloarcula sp. SS13–14
HRPV10  discontinuous dsDNA, c    9296                 ~55                   Halorubrum sp. LR2–17
HRPV11  discontinuous dsDNA, c    9368                 ~55                   Halorubrum sp. LR2–12
HRPV12  discontinuous dsDNA, c    9944                 ~55                   Halorubrum sp. LR1–23

[a] c, circular; l, linear conformation.
[b] ND, not determined.

Encyclopedia of Virology, 4th Edition, Volume 4
doi:10.1016/B978-0-12-809633-8.20983-7

An Overview of the Virion Structure and Viral Life Cycle

The basic structure of the HRPV virions is a simple membrane vesicle enclosing the naked genome. A typical virion contains two major structural proteins, both of which are associated with the viral membrane. The larger structural protein, VP4 or a VP4-like protein (approx. 50–60 kDa), forms spike structures on the surface of the virion, whereas the smaller protein, VP3 or a VP3-like protein (approx. 9–15 kDa), localizes mostly to the inner surface of the viral membrane (Fig. 1) without significant interactions with the genome. The host range of the known HRPVs seems to be very narrow: few, if any, susceptible strains have been identified in addition to the original isolation host.

The life cycle of the pleomorphic viruses starts with the recognition of the host cell receptor by the spike proteins and the release of the viral genome into the host cell cytoplasm in a membrane fusion event. Genome replication and transcription take place in the cytoplasm, most probably using at least two different mechanisms (see Family Pleolipoviridae: Related Viruses With Different Genome Types), and the structural proteins together with the replicated genomes are translocated to the cytoplasmic membrane for assembly into virions. Virions exit by budding, acquiring their membrane from the host CM in the process. Thus, the lipid composition of the viral membrane is approximately the same as that of the host CM. Infected cells produce new progeny viruses for several hours after the first new virions are detected. The infection cycle of most of the characterized viruses is persistent, but the detailed dynamics of different HRPV life cycles can vary substantially. For instance, HRPV-1 infects only approximately 10% of the host population at the start of the infection, whereas HRPV-6 is able to infect at least 80% of the cells within the first 30 min. The life cycles of different HRPVs are relatively fast, lasting approximately 2–5 h. Production of viruses by the cells retards the growth of the culture and in some instances leads to an inability to form colonies on rich culture media. The first reported temperate pleolipovirus, SNJ2, is able to integrate into the tRNAMet gene of its host, Natrinema sp. J7–1. Interestingly, SNJ2 seems to have some type of synergistic relationship with an icosahedral virus, SNJ1, as the presence of SNJ1 increases the titer of SNJ2 during co-infection.


Fig. 1 Schematic representation of pleolipovirus virion structure. Spike proteins are indicated with purple color, internal membrane proteins with turquoise, lipid bilayer with orange and DNA with black and DNA-binding terminal proteins in light green.

Genomic Characteristics

Gene Content and Conserved Genes

The genomes of the first reported HRPVs, Halorubrum pleomorphic virus 1 (HRPV-1) and Haloarcula hispanica pleomorphic virus 1 (HHPV-1), were circular molecules containing fewer than 10 predicted open reading frames (ORFs). The identity of the genes encoding the two major structural proteins, VP3 (14 kDa) and VP4 (53 kDa), was verified using N-terminal sequencing. In addition, thorough mass spectrometry analysis of HRPV-1 particles revealed one peptide belonging to the product of gene 8, which encodes a putative nucleoside-triphosphatase (NTPase) that is possibly crucial to particle formation. Comparative genomics, using the two HRPVs mentioned above as well as the previously reported haloarchaeal plasmid pHK2, revealed a conserved cluster comprising the genes encoding the two major structural proteins as well as ORFs 2, 5 and 6. In addition, all these genetic elements contained an ORF encoding a putative rolling circle replication (RCR) protein that was not necessarily conserved in sequence, but was conserved in predicted function. As new isolates of pleomorphic viruses were reported, namely Halorubrum pleomorphic viruses 2 and 6 (HRPV-2 and HRPV-6) and especially Halorubrum pleomorphic virus 3 (HRPV-3) and Halogeometricum pleomorphic virus 1 (HGPV-1), the number of conserved genes and predicted conserved ORFs was reduced to five. Thus, in the HRPV-1 genome, the conserved gene cluster starts from gene 3, the most conserved in sequence, and ends after the putative NTPase encoded by gene 8 (Fig. 2(a)). The conserved genes of HRPV-3 and HGPV-1 show lower sequence similarity to those of the other viruses, and these two genomes do not contain a homolog of the RCR gene, but instead a conserved gene with a predicted winged helix-turn-helix (wHTH) domain. The gene content outside the conserved gene region also varies from genome to genome.

His2 was the seventh virus originally included in the proposal for a new family of viruses, the Pleolipoviridae. It was isolated by the laboratory of Mike Dyall-Smith from an Australian salt lake on a lawn of Haloarcula hispanica. Unlike all the other genomes characterized to date, the genome of His2 was shown to be a linear dsDNA molecule with yet another suggested replication mechanism, the so-called protein-primed replication, with proteins covalently attached to the 5′ ends of the genome. The genome of His2 is the most divergent from the other pleomorphic virus genomes in that it contains two clearly different parts, one of which contains the gene region conserved in the pleomorphic viruses. The conserved gene region of the His2 genome most closely resembles those of HRPV-3 and HGPV-1.

Family Pleolipoviridae: Related Viruses With Different Genome Types

The initial proposal of the new family Pleolipoviridae included eight viral isolates: HRPV-1 (the typical member), HRPV-2, HRPV-3, HRPV-6, HHPV-1, HHPV-2, HGPV-1 and His2 (Fig. 2(a)). As outlined above, they all contain the conserved genomic region encoding the homologs of HRPV-1 VP3 and VP4, the two major structural proteins of the virion. In the other viral isolates, these proteins can be encoded by genes with a different number, which depends on the genomic content. For example, the larger major structural protein of HRPV-2 is encoded by gene 5. In that case, the protein is VP5 and it is a VP4-like protein.


Fig. 2 Genomic comparison of pleolipovirus genomes. (a) Different representative viruses of the Pleolipoviridae family. (b) Betapleolipoviruses. Reproduced from Bamford, D.H., Pietilä, M.K., Roine, E., et al., 2017. ICTV virus taxonomy profile: Pleolipoviridae. Journal of General Virology 98, 2916–2917. doi:10.1099/jgv.0.000972 (under the licence CC BY 4.0, https://creativecommons.org/licenses/by/4.0/). Atanasova, N.S., Demina, T.A., Krishnam Rajan Shanthi, S.N.V., Oksanen, H.M., and Bamford, D.H., 2018b. Extremely halophilic pleomorphic archaeal virus HRPV9 extends the diversity of pleolipoviruses with integrases. Research in Microbiology 169, 500–504. doi:10.1016/j.resmic.2018.04.004 (under the licence CC BY, https://creativecommons.org/licenses/by/4.0/).


Downstream from the genes encoding the two major structural proteins, there are two conserved open reading frames encoding proteins with as yet unassigned functions, as well as the putative NTPase. The assignment of the viral isolates to the different genera, i.e., Alpha-, Beta- and Gammapleolipovirus, is based on the gene content outside the conserved gene region. The HHPV-1, HHPV-2, HRPV-1, HRPV-2 and HRPV-6 isolates, which carry the ORF encoding the putative RCR protein, are included in the Alphapleolipovirus genus. HRPV-3 and HGPV-1 are assigned to the Betapleolipovirus genus because of the presence of the conserved hypothetical gene containing the wHTH domain. As mentioned above, the genome of His2 differs in its other gene content and is assigned to the genus Gammapleolipovirus. It is notable that, despite the relatedness of these viral isolates, their genome types can differ. In general, Alphapleolipovirus members have ssDNA genomes, whereas Betapleolipovirus isolates have dsDNA genomes. An intriguing feature of the genomes belonging to the latter genus is that the otherwise dsDNA genomes contain short regions of ssDNA. In HRPV-3, where these discontinuities were reported for the first time, the ssDNA regions were found to be associated with a specific pentanucleotide motif, GCCCA (5′–3′), which precedes the ssDNA region on the same strand. Later on, similar discontinuous genomes were reported for the pleolipoviruses HHPV3, HHPV4 and HRPV9. Although the biological meaning of these short stretches is unknown, it has been hypothesized that, upon infection, they have a role in attracting components of the host DNA repair system for active replication. Thus, despite the lack of information on the detailed mechanisms of genome replication, it is evident that these mechanisms differ between isolates belonging to the different genera.
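The genus assignment and the motif association described above can be illustrated with a minimal Python sketch. It is not a published tool: the marker labels ("RCR", "wHTH", "polB"), the function names and the example sequence are hypothetical, and the "polB" marker for Gammapleolipovirus is an assumption taken from the glossary definition (a gene for a putative type B DNA polymerase). The motif search only lists where GCCCA occurs on either strand; it does not predict which occurrences are followed by a single-stranded region.

```python
from typing import List, Set, Tuple

MOTIF = "GCCCA"  # pentanucleotide motif reported for HRPV-3, written 5'->3'


def assign_genus(marker_genes: Set[str]) -> str:
    """Assign a genus from hypothetical marker-gene labels.

    'RCR'  : ORF for the putative rolling circle replication protein
    'wHTH' : conserved hypothetical gene with a winged helix-turn-helix domain
    'polB' : putative type B DNA polymerase gene (assumed marker)
    """
    if "RCR" in marker_genes:
        return "Alphapleolipovirus"
    if "wHTH" in marker_genes:
        return "Betapleolipovirus"
    if "polB" in marker_genes:
        return "Gammapleolipovirus"
    return "unassigned"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq: str, motif: str = MOTIF) -> List[Tuple[int, str]]:
    """List (0-based start on the given sequence, strand) for every motif hit.

    A '-' hit means the motif lies on the complementary strand, so its
    reverse complement is what matches the given sequence.
    """
    seq = seq.upper()
    hits: List[Tuple[int, str]] = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    print(assign_genus({"VP3", "VP4", "RCR"}))
    print(find_motif("AAGCCCATTTGGGCAA"))  # made-up sequence, not viral data
```

Checking the RCR marker first mirrors the order in which the genera are introduced in the text; a real classification would rest on homology searches rather than on gene labels.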

New Isolates and Haloarchaeal Genomic Regions

Analyses of the HRPV genomes showed that homologs of the Alphapleolipovirus genomes were present in the previously reported, partially sequenced plasmid pHK2 of Haloferax sp. Sequencing of the entire 10.8 kb pHK2 plasmid showed that this virus-related genetic element, in addition to the conserved genomic region similar to the pleolipovirus genome, also contains an open reading frame encoding a putative integrase as well as a putative homolog of the phiH repressor gene. A homologous 12.5 kb genomic region in the Haloferax volcanii genome, spanning from a tRNAArg gene close to HVO_1434 all the way to HVO_1422, was also identified, suggesting that the smaller viral genomes originated from larger genomes of temperate viruses with a typical propensity to integrate into genomic tRNA genes. In fact, haloarchaeal genomes seem to contain a wealth of integrated proviruses or proviral remnants related to pleolipoviruses, and especially to the viruses belonging to the Betapleolipovirus genus. Indeed, in 2014, a pleolipovirus genome of the betapleolipovirus type, SNJ2, was for the first time reported to contain both the integrase and the phiH repressor encoding genes. The HHPV4 and HRPV9 genomes have also recently been shown to harbor both the integrase and the phiH repressor genes (Fig. 2(b)), and homologs of the SNJ2-type integrases have been identified in numerous mobile genetic elements found in haloarchaeal genomes.

Stability of Pleolipovirus Infectivity

Salinity

Haloarchaeal organisms thrive in hypersaline conditions with, for instance, saturated NaCl concentrations (approximately 5.5 M). Thus, the most important environmental condition affecting the infectivity of pleolipoviruses is the high NaCl concentration. Most pleolipoviruses lose infectivity at NaCl concentrations below 1 M. Different pleolipoviruses vary slightly in their NaCl requirements. HRPV-1 is stable in a narrow range of 1–3 M NaCl, whereas the betapleolipoviruses HRPV-3, HRPV-9, and HGPV-1 were found to be stable in NaCl concentrations from 0 M to above 5 M (saturation) during a 24 h incubation. HHPV3 and HHPV4 require at least 2.5 M NaCl. A peculiar and apparently pH-dependent infectivity at changing NaCl concentrations was observed for the pleolipoviruses HHPV3 and HHPV4. HHPV3 was able to maintain its infectivity at pH values of 4.5–9 when the NaCl concentration was 3.5 M. However, when NaCl was lowered to 1.5 M, the virus lost all infectivity at the optimum pH of 7.5. At pH 9, on the other hand, infectivity remained high even at 1.5 M NaCl. A similar pH dependence was observed for HHPV4, which is genetically almost identical to HHPV3. The loss of infectivity was irreversible for both viruses. When HHPV-1 adsorption was studied at different salinities, maximum adsorption was detected at a NaCl molarity as high as 3.5 M. In addition to NaCl, the pleolipoviruses HHPV3 and HHPV4 require at least 4 mM CaCl2 for stability, and small amounts of CaCl2 are routinely used for the other viruses as well.
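As a compact summary of the NaCl ranges quoted above, the following Python sketch stores them in a lookup table and checks a given concentration against them. It is only an illustration: the table and function names are hypothetical, an upper bound of None means that the text gives no upper limit (or reports stability up to saturation), and the pH and CaCl2 dependencies described for HHPV3 and HHPV4 are deliberately ignored.

```python
from typing import Dict, Optional, Tuple

# (lower bound, upper bound) in M NaCl, as quoted in the text.
# None as upper bound: no upper limit stated, or stable up to saturation.
REPORTED_NACL_RANGE: Dict[str, Tuple[float, Optional[float]]] = {
    "HRPV-1": (1.0, 3.0),
    "HRPV-3": (0.0, None),
    "HRPV-9": (0.0, None),
    "HGPV-1": (0.0, None),
    "HHPV3": (2.5, None),
    "HHPV4": (2.5, None),
}


def within_reported_range(virus: str, nacl_molar: float) -> Optional[bool]:
    """Return True/False for listed viruses, or None if the virus is not listed."""
    if virus not in REPORTED_NACL_RANGE:
        return None
    low, high = REPORTED_NACL_RANGE[virus]
    if nacl_molar < low:
        return False
    return high is None or nacl_molar <= high


if __name__ == "__main__":
    for name in ("HRPV-1", "HRPV-3", "HHPV3"):
        print(name, within_reported_range(name, 1.5))
```

Returning None for unlisted viruses keeps "not reported" distinct from "reported as unstable", which matters because the text gives ranges for only some isolates.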

Temperature and Other Significant Factors

Virus stock preparations of all pleolipoviruses are stable for months when stored at 4 °C. The natural environments of halophilic archaea are mostly located in tropical or subtropical areas and exposed to intense sunlight. Thus, it is logical that their viruses require temperatures above 30 °C for optimal infection. The most commonly used temperature for studying pleolipoviruses is 37 °C. Incubation of the virions at temperatures higher than 40 °C abolishes infection. Recently, it was shown that in the case of the spike proteins of HRPV-6, heat treatment at 55 °C induces a drastic conformational change leading to an extended conformation of the spike proteins (see also: Structural Proteins).


In general, pleolipovirus infectivity is lost in detergents and organic solvents as well as in the negative stains used in transmission electron microscopy. Some viruses, such as HRPV-6, HRPV-8, HRPV9, and HRPV10, stand out as being more resistant to chloroform (in the test conditions used) than other pleolipoviruses. The reasons for this chloroform stability of a lipid membrane-containing particle are not known. The effect of negative stains on pleolipovirus infectivity has been studied to determine whether the virion morphology observed in ordinary negatively stained samples represents that of infective virions. Most negative stains abolish infectivity, but no clear correlation with virion morphology has been found. Thus, ordinary negative-stain transmission electron microscopy cannot be used to determine the morphology of an infectious pleolipovirus particle.

Structural Proteins

As mentioned above, the basic pleolipovirus virion contains two major structural proteins, both of which are associated with the membrane envelope of the virion. In addition, traces of the putative NTPase encoded by the conserved ORF can be found. However, some of the isolates belonging to the Beta- and Gammapleolipovirus genera seem to contain additional structural proteins. For instance, in His2 virions there are two protein species making up the spikes as well as two protein species serving the role of the smaller membrane protein. Quantitative biochemical dissociation studies of the smaller membrane protein have shown that it is membrane associated and mostly exposed to the lumen of the virion. However, it does not seem to interact strongly with the genome.

The Spike Protein

The spike protein has been studied more intensively due to its role in host recognition and membrane fusion. All the spike proteins contain an N-terminal signal peptide for either the Sec or the predicted Tat secretion pathway. They also contain a C-terminal transmembrane domain (TM) that anchors the spike protein to the viral membrane. This has been shown experimentally for HRPV-1 and HRPV-6 virions. The primary amino acid sequence of the different spike proteins can be highly variable, which is common for viral proteins involved in host recognition. Thus, in terms of sequence, the spike protein cannot be considered one of the conserved proteins of the virus. Nevertheless, its status as one of the conserved proteins is warranted because in all viruses it seems to serve the same function as a spike protein, and the position of its gene in the genomic context is always the same.

Different types of modifications bring additional variation to the spike protein properties. The HRPV-1 spike protein is N-glycosylated with a pentasaccharide that comprises glucose, glucuronic acid, mannose, sulphated glucuronic acid and a 5-N-formyl-legionaminic acid residue as the terminal sugar. There is also evidence suggesting that this glycan structure is involved in the recognition of the host receptor. This post-translational modification must take place in the pseudo-periplasmic space, as described for the S-layer proteins of haloarchaea. None of the spike proteins from other pleolipovirus species have been shown to be glycan modified. However, glycans may also play such a role in specific receptor recognition for other pleolipoviruses, in that case provided by glycan modifications of the host S-layer. As there are no genes predicted to be involved in N-glycosylation pathways in the pleolipoviral genomes, the glycosylation is dependent on the host glycosylation machinery. This was shown by heterologous expression of the HRPV-1 VP4 in Hfx. volcanii, in which case the N-glycan structure was the same as that described for the Hfx. volcanii S-layer protein N-glycan. In addition to the reported glycan modification, one of the two spike proteins of His2 virus was predicted to contain lipid modifications, since the protein was positively stained with Sudan Black. The possible biological function of this modification is not known.


Fig. 3 Structure of the HRPV-6 spike and the spike protein VP5. (a) Cartoon representation of the spike protein colored from the N-terminal (blue) to the C-terminal (red). (b) Density of the spike on virion surface as solved by cryo-electron tomography and subtomogram averaging with HRPV-6 VP5 crystallographic structure fitted into the density. Reproduced from El Omari, K., Li, S., Kotecha, A., et al., 2019. The structure of a prokaryotic viral envelope protein expands the landscape of membrane fusion proteins. Nature Communications 10, 846, doi:10.1038/s41467-019-08728-7, under the licence CC BY 4.0, http://creativecommons.org/licenses/by/4.0/.


The atomic structures of the HRPV-2 and HRPV-6 spike proteins (VP4-like protein VP5) have been determined at 2.5 and 2.7 Å resolution, respectively. The two proteins, approximately 65% identical in their primary amino acid sequence, show highly similar structures. The overall protein structure is a previously unreported V-shaped fold dividing the protein into two major domains that are connected by a single residue (Fig. 3). The lowest similarity, both in amino acid sequence and in structure, was found in a domain that, when fitted to a 16 Å resolution cryo-EM structure of the spike on the HRPV-6 virion, pointed outwards from the spike structure. Thus, this domain was suggested to be involved in host recognition. In addition to host recognition, these spike proteins were also shown to be involved in fusion of the viral membrane with the host cell membrane. The highly alpha-helical secondary structure of the N-terminal domain was predicted to be involved in this process. It was hypothesized that upon the interaction of the putative host recognition domain with the host S-layer protein, the N-terminal alpha-helical structure opens, with concomitant exposure of the fusion peptide at the very N-terminus of the protein to the host membrane.

Further Reading

Atanasova, N.S., Heiniö, C.H., Demina, T.A., et al., 2018a. The unexplored diversity of pleolipoviruses: The surprising case of two viruses with identical major structural modules. Genes 9, 131.
Atanasova, N.S., Demina, T.A., Krishnam Rajan Shanthi, S.N.V., Oksanen, H.M., Bamford, D.H., 2018b. Extremely halophilic pleomorphic archaeal virus HRPV9 extends the diversity of pleolipoviruses with integrases. Research in Microbiology 169, 500–504.
Bamford, D.H., Pietilä, M.K., Roine, E., et al., 2017. ICTV virus taxonomy profile: Pleolipoviridae. Journal of General Virology 98, 2916–2917.
Bath, C., Cukalac, T., Porter, K., Dyall-Smith, M.L., 2006. His1 and His2 are distantly related, spindle-shaped haloviruses belonging to the novel virus group, Salterprovirus. Virology 350, 228–239.
Demina, T.A., Atanasova, N.S., Pietilä, M.K., Oksanen, H.M., Bamford, D.H., 2016. Vesicle-like virion of Haloarcula hispanica pleomorphic virus 3 preserves high infectivity in saturated salt. Virology 499, 40–51.
El Omari, K., Li, S., Kotecha, A., et al., 2019. The structure of a prokaryotic viral envelope protein expands the landscape of membrane fusion proteins. Nature Communications 10, 846.
Jarrell, K.F., Ding, Y., Meyer, B.H., et al., 2014. N-linked glycosylation in Archaea: A structural, functional, and genetic analysis. Microbiology and Molecular Biology Reviews 78, 304–341.
Kandiba, L., Aitio, O., Helin, J., et al., 2012. Diversity in prokaryotic glycosylation: An archaeal-derived N-linked glycan contains legionaminic acid. Molecular Microbiology 84, 578–593.
Liu, Y., Wang, J., Liu, Y., et al., 2015. Identification and characterization of SNJ2, the first temperate pleolipovirus integrating into the genome of the SNJ1-lysogenic archaeal strain. Molecular Microbiology 98, 1002–1020.
Mizuno, C.M., Prajapati, B., Lucas-Staat, S., et al., 2019. Novel haloarchaeal viruses from Lake Retba infecting Haloferax and Halorubrum species. Environmental Microbiology 21, 2129–2147. https://doi.org/10.1111/1462-2920.14604.
Pietilä, M.K., Atanasova, N.S., Manole, V., et al., 2012. Virion architecture unifies globally distributed Pleolipoviruses infecting halophilic archaea. Journal of Virology 86, 5067–5079.
Pietilä, M.K., Roine, E., Paulin, L., Kalkkinen, N., Bamford, D.H., 2009. An ssDNA virus infecting archaea: A new lineage of viruses with a membrane envelope. Molecular Microbiology 72, 307–319.
Senčilo, A., Paulin, L., Kellner, S., Helm, M., Roine, E., 2012. Related haloarchaeal pleomorphic viruses contain different genome types. Nucleic Acids Research 40, 5523–5534.
Wang, J., Liu, Y., Liu, Y., et al., 2018. A novel family of tyrosine integrases encoded by the temperate pleolipovirus SNJ2. Nucleic Acids Research 46, 2521–2536.

Relevant Website https://talk.ictvonline.org/ictv-reports/ictv_online_report/ssdna-dsdna-viruses/w/pleolipoviridae Pleolipoviridae.

Virus–Host Interactions in Archaea
Diana P Baquero, Archaeal Virology Unit, Institut Pasteur, Paris, France and Sorbonne University, Paris, France
David Prangishvili, Institut Pasteur, Paris, France and Ivane Javakhishvili Tbilisi State University, Tbilisi, Georgia
Mart Krupovic, Archaeal Virology Unit, Institut Pasteur, Paris, France
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
Acr Anti-CRISPR
cOAs Cyclic oligoadenylates
ESCRT Endosomal sorting complex required for transport
MCP Major capsid protein
ORF Open reading frame
S-layer Surface layer
T4P Type 4 pilus
TA Toxin-antitoxin

Glossary
Adsorption rate Rate at which extracellular virions become attached to the host.
A-form One of the three major forms of double-stranded DNA, with a 23 Å helical diameter and 11 bp per helix turn.
Capsid Protein shell that encloses the genetic material of the virus.
Episomal element Extrachromosomal (non-integrated) form of a mobile genetic element inside the host cell.
Holliday junction resolvase A specialized structure-selective endonuclease that cleaves four-way DNA intermediates formed during DNA replication.
Homologous recombination Recombination between two identical or highly similar DNA sequences.
Hyperhalophiles Organisms thriving in the presence of extremely high salt concentrations.
Hyperthermophiles Organisms requiring extremely high (80°C or above) temperatures for optimal growth.
Integrase Enzyme that catalyzes the integration of a viral genome into the genome of the host cell.
Inverted terminal repeats Short, related or identical sequences located in reverse orientation at the ends of the viral genome.
Jelly-roll fold A structural protein fold composed of eight β-strands arranged in two antiparallel four-stranded β-sheets.
Lysogenic life cycle One of the two alternative life cycles of a virus, whereby the virus genome integrates into the host chromosome as a provirus or remains as a dormant episomal element.
Lytic life cycle One of the two alternative life cycles of a virus, whereby the virus replicates inside the host and lyses the infected cell, releasing the progeny viruses to infect other cells.
Methanogenic Producing methane as a metabolic byproduct in anoxic conditions.
Orthologous genes Homologous genes that are related by vertical descent from a common ancestor and encode proteins with the same functions in different species.
Phase variation system Reversible process that involves the ON/OFF switching of protein expression to adapt to rapidly changing environments, generating phenotypic variation within the population.
Protein glycosylation Post-translational modification where a carbohydrate molecule is covalently bound to a predetermined region of a protein to form a glycoprotein.
Provirus Viral genome integrated into the host chromosome.
Pseudomurein endoisopeptidase An enzyme that cleaves pseudomurein cell-wall sacculi of methanogens.
Rolling-circle replication Model of unidirectional DNA replication that can rapidly synthesize multiple copies of circular ssDNA molecules.
S-layer A paracrystalline protein surface layer that is present in nearly all archaea described so far.
Strand-coupled genome replication Model of DNA replication that couples leading-strand and lagging-strand synthesis.
Strand-displacement Displacement of a downstream DNA strand encountered during DNA replication.
Type 4 pili Surface-exposed filaments involved in many functions of the cell, including locomotion, adhesion, microcolony formation and protein secretion.
Virion Infectious mature virus particle.

Introduction

Viruses infecting archaea constitute a distinctive part of the virosphere and exhibit diverse virion morphologies, many of which have never been observed among viruses infecting bacteria or eukaryotes. The diversity of archaeal viruses is also reflected in their genome content, with ~75% of the genes lacking detectable homologs in sequence databases. Based on their diverse virion morphologies and genomic properties, the characterized archaeal viruses are currently classified into 20 families. Although knowledge of virus-host interactions in archaea remains highly fragmented, the increasing number of genetic tools developed for archaea and their viruses, together with advances in structural and functional genomics, has yielded valuable information on certain aspects of virus-host interactions. Some of these mechanisms are shared with bacterial and/or eukaryotic viruses, whereas others are unique to archaeal viruses.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00063-1


The virus infection cycle can be subdivided into several stages. Infection starts with the recognition of, and binding to, specific receptors on the host cell surface, which leads to the delivery of the viral genetic material into the cell interior. Following entry, many viruses hijack the host replication, transcription, and translation machineries to produce multiple copies of the virus progeny. For some enveloped viruses, morphogenesis and egress are concomitant, whereas in the case of non-enveloped viruses virion assembly typically precedes release. Thus far, two different egress strategies have been elucidated for archaeal viruses: a budding mechanism similar to that of some eukaryotic enveloped viruses, and a unique lytic mechanism employed by certain crenarchaeal viruses, which involves the formation of pyramidal structures on the host cell surface. Similar to bacteriophages and certain eukaryotic viruses, some archaeal viruses can undergo a lysogenic life cycle, whereby the virus genome integrates into the host chromosome as a provirus or remains as an episomal element until the host cell is exposed to certain stimuli or stress conditions that induce viral replication. Archaea and their viruses have evolved diverse defense and counterdefense mechanisms, among which the CRISPR-Cas system and anti-CRISPR (Acr) proteins have been the most studied. In this article, we review the current state of knowledge on virus-host interactions in Archaea. Note that most studies have focused on just a handful of model virus-host systems, including Sulfolobus islandicus rod-shaped virus 2 (SIRV2; family Rudiviridae), Sulfolobus spindle-shaped virus 1 (SSV1; family Fuselloviridae), Sulfolobus turreted icosahedral virus (STIV; family Turriviridae) and Sulfolobus tengchongensis spindle-shaped virus 2 (STSV2; family Bicaudaviridae), all infecting hyperthermophilic archaea of the genus Sulfolobus (phylum Crenarchaeota).

Virus Entry

The infection cycle begins with the recognition of a suitable host through specific interactions between a receptor-binding protein exposed on the virion and a receptor located on the surface of the host cell. Once the viral protein successfully binds to the host receptor, the virus particle typically undergoes a conformational change that results in the delivery of the genetic material into the cytoplasm of the host cell. The well-studied viruses infecting bacteria are known to target different cell surface structures, including pili, flagella, peptidoglycan, lipopolysaccharide and integral membrane proteins. Viral attachment and entry have been poorly studied in archaea, and only recent efforts have provided the first insights into the entry mechanisms and possible receptors of hyperthermophilic and hyperhalophilic archaeal viruses.

Interaction With Cellular Appendages

Similar to bacteria, archaea possess distinct cell surface appendages, including archaeal flagella (archaella) implicated in cell motility and diverse pili involved in intercellular communication and adhesion to various organic and inorganic substrates. Electron microscopy observations showed that many filamentous viruses bind to pili. For instance, termini of the filamentous virions of members of the Lipothrixviridae family are decorated with diverse structures, which resemble claws (AFV1), brushes (AFV2) or mops (SIFV), and have been proposed to play a role in the viral attachment to the host cell, specifically to pilus-like appendages.

The interaction of SIRV2, the prototype virus of the family Rudiviridae, with cellular appendages has been studied in more detail. Both ends of the rod-shaped SIRV2 virions contain three terminal fibers which specifically interact with the pili abundantly present on the surface of Sulfolobus islandicus LAL14/1 cells. Notably, when the pili were detached from the cells, the virus interacted nearly exclusively with the tips of the pili, whereas in the context of metabolizing cells, the virions were observed both at the tips and on the sides of the pili, suggesting that virions move along the pili toward the cell surface. However, the energy source for this movement remains unclear. Once at the cell surface, the SIRV2 virions appear to disassemble, presumably as a consequence of the delivery of the viral DNA into the host cytoplasm. Similarly, rudivirus SSRV1 and tristromavirus PFV2 have also been shown to interact with the type IV pili (T4P) of Saccharolobus solfataricus and Pyrobaculum arsenaticum, respectively (Fig. 1(A)–(D)). Deletion of the orthologous pilin genes, pilA1 and pilA2, in S. islandicus M.16.4 resulted in loss of T4P and resistance to rudivirus SIRV8, a close relative of SIRV2, which could no longer adsorb to the host cell.
These observations further reinforce the critical role of pili during the early stages of archaeal virus infection. A near-atomic structure of the SIRV2 receptor showed that it is a T4P. Remarkably, the pilus was extremely stable and resisted various harsh treatments, including digestion by trypsin and pepsin as well as boiling in sodium dodecyl sulfate and 5 M guanidinium-HCl. The resilience of the pilus was attributed to extensive surface glycosylation on serine and threonine residues, which are anomalously abundant in the Sulfolobus T4P compared to the bacterial T4P and other Sulfolobus proteins. Notably, the cryo-EM structures of the T4P of the acidophilic hyperthermophile Saccharolobus solfataricus and the neutrophilic hyperthermophile Pyrobaculum arsenaticum (Fig. 1(E)–(H)) revealed that the extensive glycosylation previously observed in the Sulfolobus islandicus pilus is a response to acidic environments rather than extreme temperatures: at even higher temperatures but in neutral environments, much less glycosylation is observed in Pyrobaculum pili than in Sulfolobus and Saccharolobus pili.

Attachment to pili-like filaments was also confirmed for Sulfolobus turreted icosahedral virus (STIV), the type member of the Turriviridae family, which specifically recognizes unidentified pili-like filaments of its host, Saccharolobus solfataricus. The icosahedral STIV virions are decorated with turret-like protrusions at each of the fivefold vertices. A three-dimensional reconstruction of the STIV-pilus interaction using cryo-electron tomography showed that the turrets physically interact with the S. solfataricus pilus. Furthermore, tomographic reconstruction and sub-tomogram averaging unequivocally showed that pilus recognition occurs at the cleft between the second and third jelly-roll domains of the pentameric turret protein C381. Notably, structurally similar jelly-roll domains have been implicated in receptor recognition in highly diverse viruses, including P22-like head-tailed phages (Caudovirales), tectiviruses and adenoviruses. Despite the molecular insights into receptor binding, it remains unclear, as in the case of SIRV2, how (and whether) the virus moves along the filaments and how the circular virus genome is delivered into the host cytoplasm. Notably, adsorption of both SIRV2 and STIV to the pili was shown to occur exclusively at physiologically relevant, high temperatures.

Fig. 1 Virus binding to archaeal type 4 pili. Representative cryo-electron (A) and negative staining (B) micrographs of the P. arsenaticum pili interacting with the hyperthermophilic virus Pyrobaculum filamentous virus 2 (PFV2). Scale bar, 20 nm in A and 200 nm in B. Orange arrowhead points to the P. arsenaticum pilus; blue arrowhead points to the PFV2 virion; green arrow points to the regions of the pilus-virion interaction. Representative cryo-electron (C) and negative staining (D) micrographs of the S. solfataricus pili interacting with the filamentous hyperthermophilic virus Saccharolobus solfataricus rod-shaped virus 1 (SSRV1). Scale bar, 20 nm in C and 200 nm in D. Orange arrowhead points to the S. solfataricus pilus; blue arrowhead points to the SSRV1 virion; green arrow points to the regions of the pilus-virion interaction. Cryo-EM reconstruction of the P. arsenaticum pilus at 3.8 Å resolution (E) and the S. solfataricus pilus at 3.4 Å resolution (F). Thin slices parallel to the helical axis of the pilus are shown, colored by the helical radius. (G, H) Side view and top view of the P. arsenaticum and S. solfataricus pilus atomic models, built into the cryo-EM maps shown in E, F. The model is colored by chain. Image modified from Wang, F., Baquero, D.P., Su, Z., et al., 2020. The structures of two archaeal type IV pili illuminate evolutionary relationships. Nature Communications 11, 3424.

Interaction With Cell-Surface

Most archaea are surrounded by a thin proteinaceous surface layer (S-layer), consisting of glycosylated proteins anchored in the cell membrane. The S-layer is proposed to maintain an osmotic balance across the envelope, protect the cell from harsh environmental conditions and contribute to the cell shape. Notably, except for certain methanogens, archaea lack an equivalent of the bacterial peptidoglycan layer. The relative simplicity of the archaeal cell envelope likely underlies the mechanisms of virus entry and egress, and might have played a key role in shaping the virus diversity associated with archaea.


Spindle-shaped viruses of the Fuselloviridae family are often attached to membrane-derived vesicles or cellular fragments, suggesting that viral receptors are exposed on the host cell surface, with the S-layer itself being a prime suspect. The spindle-shaped virion of the model fusellovirus SSV1 carries short terminal fibers at one of its two pointed ends, likely composed of protein VP4, which are postulated to mediate host recognition. Despite the observations suggesting possible interactions between SSV1 terminal fibers and the host cell surface, two recent studies on the Sulfolobus S-layer produced somewhat contradictory results regarding the role of the S-layer in fusellovirus infection. On the one hand, an essential role of the S-layer in SSV1 infection appears to be supported by the finding that Saccharolobus solfataricus cells in which slaB, encoding the membrane-anchoring S-layer protein, was downregulated by a CRISPR-based silencing technology are less susceptible to SSV1 infection. On the other hand, cells in which both genes encoding the S-layer proteins SlaA and SlaB were deleted remained susceptible to infection with SSV9, a close relative of SSV1, suggesting that the S-layer is not essential for either adsorption or infection by SSV9. These discrepant results call for additional studies focusing on the entry mechanism of fuselloviruses.

The entry process has also been explored for viruses infecting halophilic archaea. Insights have been obtained into the DNA ejection process in the halophilic spindle-shaped virus His1, the type member of the Halspiviridae family. Similar to fuselloviruses, the His1 virion carries at one of its pointed ends a putative receptor-binding module consisting of a central hub and six tail spikes. In vitro analysis demonstrated that His1 DNA ejection is unidirectional, occurs at a rate comparable to that of bacteriophage λ and is dependent on external osmotic pressure.
Notably, in vitro, the His1 DNA ejection was only partial, suggesting that cellular factors are required for completion of the nucleic acid transfer. Interestingly, upon DNA ejection, the lemon-shaped virions transform into empty tubes, indicating that capsid proteins are capable of undergoing substantial quaternary structural changes.

Members of the family Pleolipoviridae have pleomorphic virions, which resemble membrane vesicles decorated with protruding spikes that in all likelihood participate in host attachment and membrane fusion processes. It has been suggested that upon binding to the receptor, the spike protein VP5 of Halorubrum pleomorphic virus 6 (HRPV-6) undergoes a conformational change and drives fusion of the viral membrane with the host cytoplasmic membrane. Structural analysis of the HRPV-6 virion by cryo-electron tomography and crystallography revealed that HRPV-6 VP5 has a unique V-shaped fold that is unrelated to the previously reported class I–III viral fusion proteins.

Host recognition by archaeal head-tailed viruses appears to be mediated by tail fiber proteins, resembling the initial interactions of the bacteriophages of the order Caudovirales. Interestingly, the head-tailed dsDNA virus φCh1 infecting the haloalkaliphilic archaeon Natrialba magadii encodes a phase variation system. The system consists of an invertible region including a site-specific recombinase of the tyrosine recombinase superfamily interspersed between convergently oriented ORFs 34 and 36, which encode viral tail fiber proteins. Recombination in this region leads to an exchange of the gene fragments encoding the carboxy-termini of the tail fiber proteins, thereby creating a set of heterogeneous gp34 and gp36 proteins with distinct C-termini. Notably, only ORF34 is expressed during the virus infection, indicating that fiber proteins of φCh1 are produced exclusively from this ORF.
Binding assays showed that only one type of the heterogeneous tail fibers encoded by ORF34 interacts with N. magadii, suggesting that other tail fiber variants enable adsorption to other host strains. Galactose moieties present on the cell surface were implicated in host recognition by φCh1. Thus, as reported for bacterial viruses, the generation of two types of tail fiber proteins with distinct binding features might expand the φCh1 host range to quickly respond to habitat changes.

Kinetics of Virus Entry

Adsorption kinetics vary widely among archaeal viruses. Generally, hyperthermophilic archaeal viruses tend to display rapid adsorption rates, whereas halophilic viruses are notoriously slow. For instance, it has been shown that SIRV2 adsorbs very rapidly, with ~80% of the virions bound to the host cell within the first 30 s of infection. Similarly, ~50% of spindle-shaped SMV1 virions were attached to the cells within 1 min post-infection (p.i.). By contrast, only 30% of the salterprovirus His1 and siphovirus HHTV-1 virions adsorb to their host in 3 h. Similarly, the adsorption of the tailless icosahedral haloarchaeal virus HCIV-1 is rather efficient but slow, with 80% of the particles adsorbed to cells at 4–5 h p.i. While the high adsorption rates of hyperthermophilic viruses are thought to minimize the time they are exposed to harsh extracellular conditions, including acidic pH and high temperatures, the low adsorption rates of haloarchaeal viruses are hypothesized to reflect an evolutionary adaptation to the long generation time of the hosts, whereby rapid adsorption might deplete the host population. Alternatively, it has been proposed that the changing salinity conditions of natural hypersaline environments may have favored the slow-adsorbing viruses that bind efficiently only under a specific range of salt concentrations.
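To put the adsorption figures quoted above on a common scale, the following minimal sketch converts each "fraction adsorbed by time t" into an apparent first-order rate constant. The first-order model f(t) = 1 − exp(−kt) is an assumption made here for illustration only; the text does not state which kinetic model, cell densities or multiplicities of infection were used, so k lumps all of these together and should not be read as a measured adsorption constant. The function names and the choice of 4.5 h as the midpoint of the 4–5 h range for HCIV-1 are likewise hypothetical.

```python
import math

def apparent_rate_constant(fraction_adsorbed: float, minutes: float) -> float:
    """Apparent first-order constant k (per minute), ASSUMING f(t) = 1 - exp(-k * t)."""
    if not 0.0 < fraction_adsorbed < 1.0:
        raise ValueError("fraction_adsorbed must lie strictly between 0 and 1")
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return -math.log(1.0 - fraction_adsorbed) / minutes

def half_adsorption_time(k: float) -> float:
    """Time (minutes) at which half of the virions are adsorbed under the same model."""
    return math.log(2.0) / k

# (fraction adsorbed, elapsed minutes) as quoted in the text.
OBSERVATIONS = {
    "SIRV2": (0.80, 0.5),     # ~80% within 30 s
    "SMV1": (0.50, 1.0),      # ~50% within 1 min
    "His1": (0.30, 180.0),    # 30% in 3 h
    "HCIV-1": (0.80, 270.0),  # 80% at 4-5 h; 4.5 h midpoint is an assumption
}

def summarize(observations):
    """Map each virus name to its apparent rate constant."""
    return {name: apparent_rate_constant(f, t) for name, (f, t) in observations.items()}

if __name__ == "__main__":
    for name, k in summarize(OBSERVATIONS).items():
        print(f"{name}: k = {k:.4g} per min, half-adsorption time = {half_adsorption_time(k):.3g} min")
```

Under these assumptions the hyperthermophilic viruses (SIRV2, SMV1) come out with apparent constants orders of magnitude larger than the haloarchaeal viruses (His1, HCIV-1), which matches the qualitative contrast drawn in the text.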

Genome Replication

Very few studies have experimentally investigated the mechanisms of archaeal virus genome replication. In most cases, the mode of genome replication has been inferred from recognizable virus-encoded replication-associated genes, including rolling-circle replication initiation endonucleases (RCRE), DNA polymerases, replicative helicases and other components of the host replisome. Members of the families Ampullaviridae, Thaspiviridae, Halspiviridae, and Ovaliviridae as well as members of the genus Gammapleolipovirus (Pleolipoviridae) have linear dsDNA genomes and encode protein-primed family B DNA polymerases. By contrast, certain head-tailed haloarchaeal viruses encode RNA-primed DNA polymerases closely related to the corresponding proteins of their hosts. Generally, there is a correlation between the viral genome size and the completeness of the viral DNA replication machinery. Thus, viruses with small to medium-sized genomes (5–50 kb) commonly encode only essential components of the replication machinery, with the rest of the components being recruited from the host, whereas viruses with large genomes (>100 kb) appear to depend minimally on the host replisome, encoding nearly complete DNA replication machineries including DNA polymerases, proliferating cell nuclear antigen (PCNA), primases, replicative helicases, and homologs of the archaeal Orc1/Cdc6 replication initiators. Global analysis of dsDNA viral genomes demonstrated that replicative helicases are the most common replication proteins (75% of the genomes) in the dsDNA virus world. Indeed, some archaeal viruses with medium-sized genomes encode replicative minichromosome maintenance (MCM) helicases. Interestingly, phylogenetic analysis indicates that the mcm genes found in haloarchaeal, methanosarcinal, and methanococcal (pro)viruses were acquired from their respective hosts many times independently.

The RCRE are encoded by members of the Pleolipoviridae (genus Alphapleolipovirus) and Sphaerolipoviridae (genus Betasphaerolipovirus). However, experimental evidence for rolling-circle replication, showing the presence of ssDNA replicative intermediates, has been obtained only in the case of sphaerolipovirus SNJ1. It has been shown that the SNJ1 RepA protein, an RCRE of the HUH superfamily, is indispensable for the genome replication of SNJ1. The closest homologs of SNJ1 RepA are encoded by plasmids of halophilic archaea, while more divergent homologs (13%–19% identity) are also identified in euryarchaea from the orders Methanosarcinales and Thermoplasmatales, as well as in ammonia-oxidizing archaea of the phylum Thaumarchaeota, underpinning a wide distribution of the SNJ1 RepA-like proteins across mobile genetic elements from diverse archaeal lineages.
Interestingly, SNJ1 RepA also shares similarity with bacterial transposases of the IS91 family, which transpose via a rolling-circle-like mechanism, highlighting evolutionary connections between viruses, plasmids and transposons. A divergent RCRE has also been identified in rod-shaped viruses of the Rudiviridae family. The protein has been structurally characterized and studied biochemically. In vitro, the Rep protein displays the expected nicking activity. However, its role in viral genome replication remains unclear, because it does not seem to be expressed during the virus life cycle, at least under laboratory conditions.

Although genome replication has been most extensively studied for rudiviruses, the actual mechanism remains enigmatic. Members of the Rudiviridae, such as SIRV2, encode several proteins proposed to be involved in DNA replication, recombination and repair. Those proteins include the above-mentioned Rep, an ssDNA-binding protein with a unique fold, an ssDNA-annealing ATPase, a Cas4-like ssDNA nuclease, a dUTPase, and a Holliday junction resolvase. Furthermore, yeast two-hybrid analysis showed that five SIRV2 proteins interact with the host DNA sliding clamp PCNA, known as a “molecular toolbelt” which interacts with multiple components of the host replisome. Presumably, SIRV2 recruits the host replication machinery for the assembly of the replisome on the viral DNA template. An immunofluorescence study of SIRV2-infected cells demonstrated that viral DNA synthesis is confined to a focus near the periphery of the cell. The study also provided evidence that the viral ssDNA-binding protein (gp17) as well as host PCNA and DNA polymerase I (Dpo1) are recruited to the site of viral DNA synthesis, confirming their essential role in the viral genome replication. Given that SIRV2 does not encode an identifiable DNA polymerase, the host polymerase Dpo1 found at the viral DNA synthesis site is likely to be involved in the replication of the SIRV2 genome.
An exceedingly complex model of SIRV2 replication has been proposed, whereby the virus employs a combination of strand-displacement, rolling-circle and strand-coupled replication mechanisms, which yields highly branched intermediates of more than 1200 kb (~34 viral genome units) with unusual ‘brush-like’ structures. However, it remains unclear how these three replication mechanisms are coordinated and whether all of them are essential for SIRV2 genome replication. In addition, the particular roles of the host and viral proteins involved in the orchestration of viral DNA replication remain obscure.

It should be noted that most archaea-specific viruses do not encode identifiable DNA replication proteins, implying that these viruses either depend on the host DNA replication machinery or employ novel uncharacterized mechanisms for DNA replication. For instance, the genome replication of the lipothrixvirus AFV1 has been suggested to start by the formation of a D-loop and progress by the strand-displacement replication mechanism, whereas termination relies on recombination events through the formation of terminal-loop-like structures. However, the genes involved in this unique mechanism of replication are currently unknown. Similarly, the conspicuous absence of recognizable genes encoding type B DNA polymerases in viral genomes with linear dsDNA and terminal proteins, as in the case of the haloarchaeal sphaerolipoviruses PH1, SH1, and HHIV-2, suggests novel mechanisms of genome replication. A model resembling that of the Streptomyces linear plasmids has been proposed, whereby the cellular polymerase is employed for the viral genome replication and the terminal proteins are involved in the end patching of the DNA after replication.

Genome Integration

Many archaeal viruses are temperate and can undergo a lysogenic pathway in which they coexist with the host cell as proviruses. In most cases, the temperate viruses integrate their genomes into the host chromosome by the activity of viral site-specific integrases. However, sometimes proviruses can exist as circular extrachromosomal plasmids. For instance, the genomes of the euryarchaeal head-tailed virus φCh1 (family Myoviridae), pleomorphic virus SNJ2 (family Pleolipoviridae), and the crenarchaeal spindle-shaped SSV1 (family Fuselloviridae) integrate into the chromosome of Natrialba magadii, Natrinema sp. J7-1 and Sulfolobus shibatae cells, respectively, whereas the genomes of the sphaerolipovirus SNJ1 and myovirus φH are stably maintained in a non-integrated circular form in Natrinema sp. J7-1 and Halobacterium halobium cells, respectively. Notably, the reactivation of the provirus typically occurs as a response to stressful conditions, such as DNA damage (e.g., by UV light or mitomycin C), temperature shock and a shift from aerobic to anaerobic conditions. Depending on the virion release mechanism, provirus induction can lead either to cell lysis, as for the head-tailed archaeal viruses and SNJ1, or to growth retardation, as observed for the fusellovirus SSV1 and pleolipovirus SNJ2.


Site-specific integrases of archaeal viruses, all belonging to the tyrosine recombinase superfamily, have been experimentally studied for two groups of viruses: fuselloviruses SSV1 and SSV2, and pleolipovirus SNJ2. The integrase of SNJ2 catalyzes homologous recombination between the viral attachment site located next to the integrase gene and the homologous site located within one of the tRNA genes on the host chromosome, leading to merger of the two genomes. Interestingly, integrases of fuselloviruses are unique in that the attachment site lies within the integrase gene itself and homologous recombination leads to disruption of the integrase gene into two fragments which flank the integrated provirus. Nevertheless, phylogenetic analysis has shown that SSV-like integrases have evolved from the more typical SNJ2-like integrases.

Transcription

Genomic analysis of archaeal viruses has shown a lack of genes encoding identifiable RNA polymerases as well as the presence of archaeal promoter elements, indicating a strong dependence on the host transcriptional machinery. However, the existence of putative transcription regulators in viral genomes suggests a virus-driven modulation that downregulates host transcription and redirects the host transcriptional machinery towards the expression of specific viral genes in an efficient and temporally controlled manner. Bioinformatic, biochemical, structural and transcriptomic efforts have provided valuable insights into the role of viral transcription regulators as well as into the modulation of virus and host gene expression during the course of infection.

Transcription Regulators

A number of archaeal virus ORFs have been predicted to contain DNA-binding motifs typical of cellular transcription factors, with the majority displaying ribbon-helix-helix (RHH), winged helix-turn-helix (wHTH), or zinc (Zn) finger motifs. Notably, RHH and wHTH transcription factor sequences encoded by archaea and their viruses are bacterial-like, whereas archaeal Zn-finger domains share a common origin with those of eukaryotes. In recent years, several virus transcription factors have been experimentally characterized and their structures have been solved. For instance, the AFV1p06 protein from the lipothrixvirus AFV1 is the first experimentally characterized archaeal Zn-finger protein. The protein preferentially binds to GC-rich DNA regions and displays a classical Zn-finger motif, albeit with an atypical substitution of one of the residues responsible for chelation of the Zn ion.

Most of the functionally characterized transcription regulators encoded by archaeal viruses are repressors, which often also regulate their own expression. For example, the transcription regulator SvtR encoded by the rod-shaped virus SIRV1 contains an RHH fold and forms a homodimer that resembles the bacterial repressors CopG, NikR, and MetJ. Functional characterization of SvtR showed that its high-affinity binding site corresponds to the promoter region of the gp30 gene, which encodes a structural protein involved in the assembly of terminal fibers. Hence, it has been suggested that the primary function of SvtR is to prevent the premature expression of the structural protein encoded by gp30. Furthermore, SvtR binds to the promoter of its own gene (gp08), acting as a repressor and downregulating its own expression. The transcription regulator AvtR encoded by the filamentous virus AFV6 is highly conserved in the Betalipothrixvirus genus and consists of two RHH motifs connected by a linker.
AvtR represses the expression of its own gene (gp29), but is able to both activate and repress the expression of the gp30 gene in a concentration-dependent manner. Although for both genes the AvtR binding sites are distant from the TATA boxes, DNase I footprinting assays showed that AvtR protects a region of approximately 100 nucleotides between the divergently oriented gp29 and gp30 promoters. This finding suggests that AvtR regulation depends on protein oligomerization on the DNA template. It has been proposed that AvtR binds to the initial binding site with high affinity and that its oligomerization along the viral DNA induces cooperative binding to degenerate secondary sites with weaker affinity.

The transcriptional regulator F55 encoded by the fusellovirus SSV1 is one of the more extensively studied transcription factors encoded by archaeal viruses. Upon exposure to UV light, F55 plays a crucial role in the transition from the carrier to the induced state of SSV1. F55 is a dimer with an RHH DNA-binding motif. F55 recognizes tandem repeat sequences located within the promoters of the immediate early-induced transcripts T5, T6, and Tind as well as in its own promoter. Notably, binding of F55 to the target sequences leads to repression of gene expression, whereas its dissociation activates transcription upon exposure of the SSV1-infected cells to UV light. A recent study using a variant of the electrophoretic mobility shift assay (EMSA) coupled to mass spectrometry revealed that the host RadA recombinase is associated with F55 when bound to the specific promoter sequences, forming a RadA-F55-dsDNA complex. The RadA recombinase belongs to the RecA/RadA/Rad51 protein superfamily and promotes DNA repair and recombination in hyperthermophilic archaea. Therefore, it has been proposed that RadA is a molecular sensor of host DNA damage for SSV1, analogous to the role of the bacterial RecA protein in the life cycle of phage lambda.
According to the proposed model, the exposure of infected cells to UV light causes massive DNA degradation that leads to accumulation of ssDNA regions. The RadA protein, which is coupled to F55 on the viral dsDNA genome, is recruited to the ssDNA regions and is progressively released from the RadA-F55-dsDNA complex, causing dissociation of F55 from the viral target sequences, thereby leading to activation of transcription of the immediate early transcripts. Notably, however, in vitro addition of ssDNA to the stable RadA-F55-dsDNA complex did not result in the release of the transcriptional block, prompting further studies to understand the specific role of RadA in the activation of SSV1 transcription.
The regulation of lysogeny has also been investigated for the haloarchaeal icosahedral virus SNJ1. The product of SNJ1 ORF4 has recently been suggested to control the lysis-lysogeny switch. The expression level of ORF4 in the host cell appears to play a crucial role in the repression of genes responsible for the lytic pathway. Consistently, in the absence of ORF4, SNJ1 produces clear plaques, whereas in its presence, the plaques are turbid. In addition, ORF4 has been shown to play a key role in conferring immunity to the host cell against subsequent infections by SNJ1 (homotypic superinfection immunity) by repressing the genome replication of the superinfecting virus. ORF4 is conserved in other SNJ1-like proviruses, suggesting that the mechanisms behind the lysis-lysogeny switch and superinfection immunity are conserved in SNJ1-like proviruses.
Bicaudavirus ATV encodes an atypical transcriptional regulator, ORF145, which has apparently evolved from the major capsid protein, with which it shares significant sequence similarity. Similar to the major capsid protein, ORF145 is abundantly present in ATV virions. The protein binds to the host RNA polymerase (RNAP) with nanomolar affinity and inactivates it in a reversible manner via an allosteric mechanism. ORF145, renamed RNAP inhibitory protein (RIP), binds apically to the DNA-binding channel of the RNAP and locks the RNAP clamp domain, which typically switches between open and closed conformations during the transcription cycle, in a fixed position. The high-affinity complex formed between RIP and the host RNAP inhibits the formation of transcription pre-initiation complexes and represses abortive and productive initiation as well as transcription elongation. Interestingly, RIP efficiently hinders the transcription directed from both host and viral promoters, suggesting a global inhibitory activity. Although the biological significance of a global transcriptional shutdown is unclear, it has been proposed that global repression prevents the host defense response. Notably, the presence of RIP in the ATV virions suggests that host transcription might be repressed during the very early stages of the viral infection, most likely to prevent the activation of the host type III-B CRISPR system.

Transcriptional Control

Whole-genome transcriptomic analyses using DNA microarray and RNAseq technologies have provided novel insights into virus-host interactions in crenarchaea. For example, SSV1-infected cells exhibit a tight chronological regulation of viral gene expression upon UV irradiation, highly reminiscent of the strategy used by many bacterial and eukaryotic viruses. Thus, SSV1 temporal control leads to three clearly distinguishable sets of genes: immediate early, early and late genes. Transcriptomic analysis of the closely related virus SSV2 also shows a temporal regulation of gene expression, which occurs in a distributive fashion, with the expressed genes not being adjacently located. In contrast to SSV1 and SSV2, the lytic viruses SIRV2 and STIV display little temporal regulation of viral gene transcription, with SIRV2 starting transcription at multiple sites in the genome. Moreover, an RNAseq analysis of the two-tailed virus STSV2 shows that transcription of the majority of viral genes starts shortly after infection and increases throughout the infection cycle. Studies on host gene expression during virus infection showed that the host response differs among archaeal viruses. While a small proportion of host genes was differentially expressed during the induction of SSV1 lysogens, the expression of several host genes encoding DNA replication, repair and transcription proteins was increased upon infection with SSV2. Likewise, the transcription of more than one third of the Sulfolobus genes was differentially regulated as a consequence of SIRV2 infection. Notably, transcriptomic analyses of SIRV2 and STSV2 infections show a strong upregulation of antiviral defense genes, including those for CRISPR-Cas and toxin-antitoxin systems. Consistently, infected Sulfolobus islandicus cells actively undergo CRISPR spacer acquisition from STSV2. 
These findings indicate that the host response is virus-dependent, since some viruses trigger a massive host response, whereas others, such as SSV1, go nearly unnoticed by the host. Interestingly, the same set of host genes can be up- and down-regulated by two different viruses, which suggests that distinct groups of host functions are required for the propagation of different archaeal viruses. For instance, the crenarchaeal cell division (cdv) operon, encoding homologs of the eukaryotic endosomal sorting complex required for transport (ESCRT) machinery, is downregulated upon SIRV2 and STSV2 infection but upregulated during STIV infection. Analysis of host gene expression during virus infection also revealed significant upregulation of the host genes implicated in replication and DNA repair. Infection with the fuselloviruses SSV1 and SSV2 leads to upregulation of genes encoding reverse gyrases, two subunits of topoisomerase VI and the replication initiation protein Orc1/Cdc6, whereas the expression of the genes encoding the MCM helicase, PCNA and Dpo1 was boosted only in the case of SSV2. Similarly, reverse gyrase and Orc1/Cdc6-like genes were upregulated during STIV infection, whereas genes implicated in energy production and metabolism were downregulated. STSV2 infection leads to increased transcription of Holliday junction resolvase, DNA topoisomerase I and several DNA repair proteins, although the expression of the reverse gyrase, unlike for other crenarchaeal viruses, was downregulated. Differential regulation of genes encoding proteins involved in transcription suggests that viral transcription also relies on the host cell machinery. For instance, infection with SSV2, SIRV2 and STIV resulted in the upregulation of transcription initiation factor IIB, different subunits of the DNA-directed RNA polymerase and several transcriptional regulators.

Virion Egress

The last stage of the viral infection cycle corresponds to the egress of the new virions from the host cell. So far, release mechanisms have been studied in detail only for a handful of archaeal viruses. Generally, there are two types of archaeal viruses: those that disrupt the host cell upon virion egress (i.e., lytic viruses) and those that do not (i.e., non-lytic viruses). Similar to bacteriophages, some archaeal viruses can undergo a lysogenic life cycle in which the expression of most viral genes is suppressed and the virus is able to persist in the cell either as a provirus integrated into the host chromosome or as an episomal (extrachromosomal) element. In addition, some viruses can establish a carrier state, whereby the virus is stably maintained in a fraction of the cellular population without causing cell lysis, with the remaining cells being transiently resistant to the viral infection.


Cell Membrane Disruption

Lytic archaeal viruses have evolved several unrelated mechanisms for disruption of the cell envelope. The most extensively characterized mechanism involves formation of large pyramidal portals, dubbed virus-associated pyramids (VAP) (Fig. 2). VAPs develop on the surface of the infected cells, protrude through the S-layer (Fig. 2(A)–(D)) and open outward like flower petals (Fig. 2(E)–(H)), generating apertures through which the mature virions exit from the cell. This mechanism is used by viruses from at least three unrelated families, namely, Rudiviridae, Turriviridae and Ovaliviridae. VAPs of the rudivirus SIRV2 and turrivirus STIV exhibit seven-fold symmetry, while those of the ovalivirus SEV1 are six-sided. The VAPs consist of multiple copies of a single 10 kDa viral protein containing a transmembrane domain which promotes its insertion into the cellular membrane. Interestingly, heterologous expression of the SIRV2 pyramid protein P98 in archaeal (Sulfolobus acidocaldarius), bacterial (Escherichia coli) or eukaryotic (Saccharomyces cerevisiae) cells resulted in correct insertion into the membrane and formation of VAPs. Nevertheless, the signal triggering the opening of the pyramid structures appears to be archaea-specific since pyramids expressed in bacteria and eukaryotes were never observed in the open conformation. In addition, pyramidal structures have also been observed on the surface of an unidentified crenarchaeon, probably of the order Thermoproteales, infected with an unknown filamentous virus as well as on the surface of Pyrobaculum oguniense cells, suggesting that the VAP-based strategy is common among crenarchaeal viruses. Head-tailed viruses infecting euryarchaea also lyse their host cells, but the underlying mechanism remains unknown. 
Bacterial relatives of archaeal head-tailed viruses lyse the cells using the holin–endolysin system, in which holin is a small membrane protein that forms lesions in the membrane, whereas endolysin is a peptidoglycan-digesting enzyme. Phages infecting gram-negative bacteria often encode additional lysis proteins responsible for disintegration of the outer membrane. Thus far, homologs of the components constituting the holin–endolysin systems have not been identified in archaeal viruses. However, several viruses infecting methanogenic archaea, including siphovirus ψM1 infecting Methanothermobacter marburgensis, encode pseudomurein endoisopeptidases that degrade the pseudomurein layer, which is functionally equivalent, but not evolutionarily related, to the bacterial peptidoglycan. Activity assays confirmed the cell wall-degrading activity of the pseudomurein endoisopeptidases, which cleave the ε-isopeptide bond between alanine and lysine in the peptide chain of the pseudomurein. However, it remains unclear how these enzymes cross the cell membrane to reach the cell wall, since no genes encoding potential holins have been identified in the genomes of methanogenic viruses. Furthermore, pseudomurein is not universally present in euryarchaea or even methanogens. How head-tailed viruses infecting other archaea, such as halophiles, are released from the host remains unclear. A lytic life cycle has also been demonstrated for filamentous enveloped viruses of the Tristromaviridae family, which infect hyperthermophilic Pyrobaculum species (phylum Crenarchaeota). Electron microscopy analysis has shown that at the late stages of infection, the cells are packed with virions and their envelope is slashed by long, straight cuts not observed for other archaeal virus-host systems. However, the mechanism underlying this mode of lysis has not been investigated.

Viral Release Without Membrane Disruption

Many archaeal viruses are released from the host without causing cell lysis. This type of virion egress has been most extensively studied in fuselloviruses, in particular SSV1. The assembly of SSV1 virions is concomitant with the egress and occurs by a mechanism that resembles the budding of enveloped eukaryotic viruses (Fig. 3). Dual-axis electron tomography analysis has shown that viral nucleoprotein complexes are extruded through the host cytoplasmic membrane in the form of tubular intermediate structures which share a continuous envelope with the host membrane (Fig. 3(A)). Subsequently, the SSV1 virions attached to the cell membrane undergo maturation to the characteristic spindle-shaped morphology. Formation of constricted ring-like structures (Fig. 3(B)) at the trailing end of the virion bud precedes the separation of the SSV1 virion from the cell membrane (Fig. 3(C)). The ring-like structures observed during the final step of SSV1 budding resemble the budding necks observed prior to the ESCRT machinery-mediated membrane scission during egress of some eukaryotic viruses, including human immunodeficiency virus and Ebola virus. The ESCRT system drives key membrane-remodeling processes in eukaryotes, such as cytokinesis, multivesicular body biogenesis and viral budding. Notably, proteins homologous to eukaryotic ESCRT components are conserved in several members of the Sulfolobales and play a central role in cell division, suggesting that, as for eukaryotic viruses, SSV1 budding may rely on a cellular membrane-remodeling machinery. Similarly, pleolipoviruses have been suggested to possess a nonlytic life cycle and to be released through a budding mechanism. Notably, however, unlike crenarchaea, halophilic archaea do not encode the ESCRT machinery.

Antiviral Defense and Viral Counterdefense Mechanisms

In most environments, viruses and their hosts are engaged in a constant evolutionary arms race. In this context, a broad range of host defense strategies as well as viral counterdefense mechanisms have been described in bacteria, whereas in archaea such mechanisms remain poorly understood, with the exception of the CRISPR-Cas system.


Fig. 2 Virus-associated pyramids (VAPs) in closed and open conformation. Tomographic slice (A, C, E, and G) and segmented, surface-rendered volumes (B, D, F, and H) of VAPs in the membrane of SIRV2-infected S. islandicus cells. VAPs are either closed (A–D) or open (E–H). The S-layer is purple, the cell membrane is blue, and the VAP is yellow. Scale bars, 200 nm. Image reproduced from Daum, B., Quax, T.E., Sachse, M., et al., 2014. Self-assembly of the general membrane-remodeling protein PVAP into sevenfold virus-associated pyramids. Proceedings of the National Academy of Sciences of the United States of America 111, 3829–3834.


Fig. 3 Different stages of SSV1 budding. Slices through tomograms (left) and volume segmentations (right) showing concomitant assembly and release of SSV1 virions. (A) SSV1 virions are attached to the host cell surface, with their envelope continuous with the cell membrane; (B) Presence of a constricted budding neck at the trailing end of the SSV1 virion bud; (C) SSV1 virion separated from the cell membrane. Red, putative nucleoprotein; blue, lipid membrane (M); green, S-layer (SL). Scale bars, 50 nm. Image reproduced from Quemin, E.R., Chlanda, P., Sachse, M., et al., 2016. Eukaryotic-like virus budding in archaea. mBio 7, e01439–16.

Antiviral Defense Mechanisms

Mechanistically, the defense systems of archaea and bacteria can be classified into three main groups: (1) variation of virus receptors, (2) innate and adaptive immunity and (3) dormancy and programmed cell death. The variation of virus receptors includes programmed changes, such as phase variation and physical masking of the receptor, which hamper the binding of the virus to the host cell. Defense mechanisms that rely on immunity involve the recognition and inactivation of invader genetic material either by nonspecific innate immunity, such as restriction-modification modules and Argonaute-based innate immunity, or by highly specific adaptive immunity, represented by CRISPR-Cas systems. Finally, strategies based on induction of dormancy or programmed cell death upon viral infection include toxin-antitoxin (TA) systems, in which the infection disrupts the balance of the host toxin-antitoxin complex, leading to retardation of cell growth or to cell death. During the past few years, the study of antiviral defense mechanisms in archaea has focused on CRISPR-Cas systems, whereas hints about the role of toxin-antitoxin systems in virus-host interactions have been provided by transcriptomic analyses of infected cells.

CRISPR-Cas systems

The CRISPR-Cas systems represent a prokaryotic adaptive immunity mechanism present in about 40% of bacteria and almost all archaea. This system protects cells against invasion by mobile genetic elements, such as viruses and plasmids. CRISPR-Cas immunity involves three main stages: (1) the adaptation stage, during which new virus/plasmid-derived sequences (spacers) are integrated into the CRISPR array adjacent to its leader sequence, (2) the processing stage, during which the CRISPR array is transcribed and processed into separate CRISPR RNAs (crRNAs) containing the spacer sequence with 5′ and 3′ tags from the flanking repeats of the CRISPR array, and (3) the interference stage, whereby the crRNA binds to the assembled CRISPR effector complex (interference module) to recognize and degrade DNA and/or RNA molecules containing the protospacer sequence. Based on gene synteny and the composition of the effector complexes, CRISPR-Cas systems have been divided into two classes, each subdivided into three types and several subtypes. CRISPR-Cas systems are very abundant among archaea and often several different types of CRISPR systems are encoded on the same archaeal genome. It has been observed that SIRV2 infection leads to massive activation of the CRISPR-Cas systems present in Sulfolobus islandicus LAL14/1. The genome of S. islandicus LAL14/1 contains six CRISPR-cas loci encoding complexes of three subtypes: I-A, I-D and III-B. The SIRV2 infection led to a sharp increase in the transcription levels of all CRISPR-cas loci, except for one incomplete type III-B CRISPR-Cas module lacking the CRISPR array. Moreover, the expression of CRISPR arrays was activated immediately after SIRV2 infection and steadily increased during the course of the infection, with the highest levels of cas expression being reached at 1 hpi. 
Notably, the expression levels of different CRISPR-Cas loci varied upon SIRV2 infection, suggesting that they play specialized roles during viral infection. The ability of SIRV2 to propagate in S. islandicus LAL14/1 despite the presence of active CRISPR-Cas systems carrying spacers matching the viral genome suggested the presence of viral anti-CRISPR (Acr) proteins which could counteract the host immune response. Similarly, a transcriptomic analysis of STSV2-infected cells revealed differential expression of the CRISPR-Cas systems upon infection. S. islandicus REY15A, the STSV2 host, encodes one type I-A and two type III-B CRISPR systems. Upon STSV2 infection, the type I-A module was strongly upregulated, whereas the expression level of one type III-B complex was downregulated through the course of STSV2 infection and the second type III-B complex was weakly upregulated. The interaction between the tailed spindle-shaped virus SMV1 and the CRISPR-Cas systems of S. islandicus REY15A was studied in more detail by constructing strains carrying plasmid-borne mini-CRISPR arrays targeting the SMV1 genome. The CRISPR response against SMV1 was type-specific, with the III-B CRISPR complex tightly inhibiting viral replication and proliferation during the infection, whereas the I-A CRISPR complex gradually lost control of viral proliferation, allowing viral replication and release. The absence of SMV1 escape mutations conferring tolerance to the I-A CRISPR system suggests that the virus has evolved a mechanism, probably an Acr protein, specific against the type I-A CRISPR system. Rudivirus SIRV3, a close relative of SIRV2, undergoes a host-dependent carrier state infection in Sulfolobus islandicus REY15A whereby the virus is maintained in a small fraction of the population over several days without apparent chromosomal DNA degradation, disturbance of cell growth or induction of a detectable CRISPR-Cas response. 
Notably, coinfection with the bicaudavirus SMV1 did not affect the SIRV3 carrier state and led to the coexistence of both viruses in the culture over 12 days, despite the induction of CRISPR spacer acquisition from SIRV3 DNA and the increased transcription of the subtype I-A CRISPR-Cas module. A plausible explanation for the maintenance of both viruses in the cell cultures is that the host CRISPR-Cas systems are inhibited by Acr proteins encoded by SIRV3 and SMV1, which most likely target different types of CRISPR-Cas complexes encoded by the host. Transcriptomic analysis of S. solfataricus P2 cells infected with the fusellovirus SSV2 showed strong upregulation of the six CRISPR loci. By contrast, no such effect was observed when S. solfataricus P2 was infected with SSV1, a close relative of SSV2. Surprisingly, co-infection with both viruses caused the silencing of the host CRISPR response. The major difference between SSV1 and SSV2 lies in the UV-inducible operon present in SSV1 but not in SSV2. Therefore, it has been speculated that the SSV1 UV-inducible operon encodes transcription factors that may silence the CRISPR-Cas response in the SSV1-infected strain. Interestingly, CRISPR-mediated defense systems appear to be employed not only by cells to defend against mobile genetic elements, but also by viruses for interviral conflicts. It was discovered that some archaeal viruses carry mini-CRISPR arrays (with 1–2 repeat-spacer units), which are preceded by promoter-containing leader sequences and genetic determinants required for insertion of new spacers. Remarkably, most of these virus-borne spacers target closely related viruses present in the same population. For instance, SPV1 and SPV2, two closely related viruses of the family Portogloboviridae isolated from a Japanese hot spring, were found to possess mini-CRISPR arrays with spacers reciprocally targeting each other. 
The existence of related viruses carrying mini-CRISPR arrays against each other in the same population suggests that viruses have adapted the host defense system for interviral conflicts and use it as a mechanism of heterotypic superinfection exclusion, in which a cell infected by one virus becomes resistant to another closely-related virus.
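The three stages of CRISPR-Cas immunity described above (adaptation, processing and interference) can be illustrated with a deliberately simplified sketch. The repeat sequence, spacer length and tag lengths below are hypothetical values chosen only for illustration, and interference is reduced to an exact, single-strand match of the spacer in the target, ignoring PAM recognition, Cas proteins and RNA targeting.

```python
from dataclasses import dataclass, field

# Hypothetical repeat sequence; real CRISPR repeats differ between loci.
REPEAT = "GTTGCAAT"


@dataclass
class CrisprArray:
    """Toy CRISPR array: an ordered list of spacers separated by repeats."""
    spacers: list = field(default_factory=list)

    def adapt(self, invader: str, start: int, length: int = 8) -> str:
        """Adaptation: copy a protospacer from the invader into the array."""
        spacer = invader[start:start + length]
        # New spacers are inserted at the leader-proximal end of the array.
        self.spacers.insert(0, spacer)
        return spacer

    def transcript(self) -> str:
        """Precursor transcript: repeat-spacer units plus a terminal repeat."""
        return "".join(REPEAT + s for s in self.spacers) + REPEAT

    def process(self, tag5: int = 4, tag3: int = 4) -> list:
        """Processing: one crRNA per spacer, flanked by repeat-derived tags."""
        return [REPEAT[-tag5:] + s + REPEAT[:tag3] for s in self.spacers]

    def interferes(self, target: str) -> bool:
        """Interference: the target is recognized if it carries a protospacer."""
        return any(s in target for s in self.spacers)
```

In this sketch, calling `adapt` on a viral sequence and then `interferes` on the same sequence returns `True`, whereas an unrelated sequence is ignored; a virus-borne mini-CRISPR array targeting a related virus would behave in the same way under these assumptions.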

Toxin-antitoxin system

The toxin-antitoxin (TA) modules consist of a toxin protein capable of inhibiting cell growth and an antitoxin that neutralizes the action of the toxin. TA modules can be encoded by both cellular organisms and mobile genetic elements, most commonly plasmids but also viruses. The toxin and its cognate antitoxin form a stable complex, preventing the toxin from exerting its toxic effect in normally growing cells. Whereas the toxin is a stable protein, often a nuclease, the antitoxin is typically labile, with rapid turnover, and has to be constantly synthesized. Under stressful circumstances, when the production of antitoxin is perturbed, the toxin is unleashed, leading to cell dormancy or death. Given that virus infection can lead to an imbalance in the toxin-antitoxin equilibrium, TA clusters have been proposed to represent a mechanism of abortive infection, in which an infected cell commits “altruistic suicide” to protect the population. Although the role of TA systems in virus-host interactions remains poorly understood in archaea, it has been suggested that csa5 represents a toxin gene in the type I-A CRISPR system of Saccharolobus solfataricus P2, because Csa5 is toxic in the CRISPR-deficient S. solfataricus strain. Infection with rudivirus SIRV2 leads to induction of Csa5 expression and formation of Csa5 oligomers in Sulfolobus cells, suggesting that Csa5 may be involved in programmed cell death in response to virus infections. Future work is required to gain insights into the mechanism of Csa5 toxicity and its biological relevance. Transcriptomic analyses have revealed an upregulation of the type II TA systems upon viral infection. The type II TA systems are widespread among members of the order Sulfolobales and consist of an antitoxin protein that directly binds and inhibits the effect of a toxin protein. S. islandicus LAL14/1 encodes 16 operons of the family VapBC (virulence associated proteins B and C) and 6 operons of the family HEPN-NT (higher eukaryote and prokaryote nucleotide binding-nucleotidyltransferases), whereby the toxins are proposed to perform RNA cleavage. Notably, the expression of 11 out of 16 VapBC and 3 out of 6 HEPN-NT loci increased after SIRV2 infection at different time points. 
In most cases, the expression level of both genes coding the toxin and antitoxin was upregulated to similar extents. Likewise, bicaudavirus STSV2 infection caused a strong upregulation of the host TA gene pairs, including vapBC. Even though the role of TA operons remains unclear, the increase in TA gene expression after infection strongly suggests a function in host defense response.
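The logic of a TA module, in which a stable toxin is held in check by a labile antitoxin that must be constantly resynthesized, can be sketched with a minimal discrete-time model. All rates below are hypothetical and unitless, the toxin is assumed to be neutralized one-to-one by the antitoxin, and protection of complexed antitoxin from degradation is ignored; the sketch is meant only to show why a block in synthesis releases free toxin.

```python
def simulate_ta(steps, shutoff_step, toxin_synth=1.0, antitoxin_synth=20.0,
                toxin_decay=0.05, antitoxin_decay=0.5):
    """Return the amount of free (unneutralized) toxin at each time step.

    Synthesis of both proteins stops at `shutoff_step`, mimicking a stress
    such as viral infection that perturbs host gene expression.
    """
    # Start from the steady state of unperturbed growth.
    toxin = toxin_synth / toxin_decay
    antitoxin = antitoxin_synth / antitoxin_decay
    free_toxin = []
    for t in range(steps):
        producing = t < shutoff_step
        toxin += (toxin_synth if producing else 0.0) - toxin_decay * toxin
        antitoxin += ((antitoxin_synth if producing else 0.0)
                      - antitoxin_decay * antitoxin)
        # Toxin in excess of antitoxin is free to act on the cell.
        free_toxin.append(max(0.0, toxin - antitoxin))
    return free_toxin
```

With these assumed parameters, no free toxin is present while synthesis continues, and free toxin appears shortly after the shutoff because the antitoxin pool decays much faster than the toxin pool.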

Counterdefense Mechanisms

Viruses have evolved strategies to evade targeting by the host CRISPR-Cas adaptive immunity. The simplest mechanism involves mutations in the protospacer adjacent motif (PAM), a short sequence necessary for correct CRISPR targeting. A more sophisticated mechanism relies on dedicated anti-CRISPR proteins (Acrs), which inhibit CRISPR-Cas by a diversity of mechanisms, including binding to specific subunits of the effector complexes. Although Acrs have been studied in many bacteriophages, only three archaeal Acr proteins have been characterized, despite the high abundance of CRISPR systems in archaeal genomes. All three archaeal Acrs were discovered in rudiviruses, although some of them display a much broader distribution. The AcrID1 protein inhibits the subtype I-D CRISPR-Cas system by binding directly to the Cas10d subunit of the I-D CRISPR-Cas effector complex. The AcrID1-Cas10d interaction blocks the interference stage of the I-D CRISPR-Cas response, in which the CRISPR RNAs (crRNAs) and Cas proteins form an effector complex that recognizes the viral sequence complementary to that of the crRNA spacer and cleaves it. Notably, AcrID1 is a conserved dimeric αβ-sandwich protein widely distributed in viruses infecting hyperthermophilic crenarchaeota of the order Sulfolobales, including rudiviruses, lipothrixviruses, fuselloviruses, and monocaudaviruses, with about 50 homologs. The two other Acr proteins disarm the type III CRISPR systems. Type III CRISPR-Cas systems are highly complex and consist of four subtypes (III-A–III-D), of which subtypes III-A (Csm) and III-B (Cmr) have been the most studied. In both cases, the type III effector complex binds to the viral protospacer, causing the activation of the Cas10 protein as well as the synthesis of cyclic oligoadenylates (cOAs) from ATP. The presence of cOAs, in turn, triggers the activation of the Csm6 RNase in the III-A system and Csx1 in the III-B system to cleave the viral mRNA. 
It has recently been demonstrated that SIRV2 gp48, which is conserved in several members of Rudiviridae and Lipothrixviridae, is an Acr protein, designated AcrIIIB1, that exclusively inhibits the subtype III-B CRISPR-Cas system of S. islandicus LAL14/1. AcrIIIB1 was demonstrated to bind to two distinct effector complexes of the subtype III-B system, Cmr-α and Cmr-γ, suggesting that AcrIIIB1 inhibits the subtype III-B response by interfering with the Csx1 RNase activation process. The third archaeal Acr belongs to the DUF1874 protein family, which was named the AcrIII-1 family because, unlike all other known Acrs of bacteria or archaea, it is not specific for a particular subtype but blocks all subtypes of type III. AcrIII-1 is an enzyme with ring nuclease activity, rapidly degrading the cyclic tetra-adenylate (cA4) second messenger into a linear di-adenylate (ApA>P) with a cyclic 2′,3′-phosphate, thereby preventing the activation of the type III-associated RNase and, therefore, blocking the host type III CRISPR defense system. Given that the target of AcrIII-1 is a signaling molecule (cA4) with a constant structure, rather than a specific CRISPR effector protein, this Acr is likely to be able to inhibit any type III CRISPR subtype that uses cA4 as part of its activation. Consistently, the AcrIII-1 family is widely distributed in viruses infecting both archaea and bacteria as well as in plasmids and proviruses.
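Why degrading the cA4 messenger is sufficient to block type III immunity can be illustrated with a simple steady-state calculation. The sketch below assumes, purely for illustration, that cA4 is synthesized at a constant rate while the effector complex is bound to its target, that it is removed by first-order processes, and that the RNase is activated only above a threshold concentration; the function names, rates and threshold are hypothetical and are not measured values.

```python
def ca4_steady_state(synthesis_rate, background_decay, ring_nuclease_rate=0.0):
    """Steady-state cA4 level for dC/dt = synthesis - (decay terms) * C."""
    return synthesis_rate / (background_decay + ring_nuclease_rate)


def rnase_active(ca4_level, threshold=1.0):
    """Assume the cOA-dependent RNase is active only above a cA4 threshold."""
    return ca4_level >= threshold


# Without the viral ring nuclease, cA4 accumulates and the RNase is activated.
no_acr = ca4_steady_state(synthesis_rate=10.0, background_decay=2.0)
# With a fast viral ring nuclease, cA4 stays below the activation threshold.
with_acr = ca4_steady_state(synthesis_rate=10.0, background_decay=2.0,
                            ring_nuclease_rate=98.0)
```

Because the calculation depends only on the messenger and not on the identity of the effector complex, the same reasoning applies to any type III subtype that signals through cA4, in line with the broad specificity described above.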

Conclusions

Since the first archaeal virus was isolated in 1977, a considerable effort has been made to understand the virion architectures, genome contents and modes of interaction of archaeal viruses with their hosts. However, despite the substantial progress, a comprehensive understanding of the molecular mechanisms that underlie these interactions is still lacking. The continued exploration of archaeal virus diversity and the development of new model systems, coupled with advances in molecular biology, genetics and microscopy tools, are expected to shed further light on the mechanisms of virus-host interactions in archaea.


Further Reading

Athukoralage, J.S., McMahon, S.A., Zhang, C., et al., 2020. An anti-CRISPR viral ring nuclease subverts type III CRISPR immunity. Nature 577, 572–575.
Daum, B., Quax, T.E., Sachse, M., et al., 2014. Self-assembly of the general membrane-remodeling protein PVAP into sevenfold virus-associated pyramids. Proceedings of the National Academy of Sciences of the United States of America 111, 3829–3834.
Fusco, S., She, Q., Fiorentino, G., Bartolucci, S., Contursi, P., 2015. Unravelling the role of the F55 regulator in the transition from lysogeny to UV induction of Sulfolobus spindle-shaped virus 1. Journal of Virology 89, 6453–6461.
Hanhijärvi, K.J., Žiedaite, G., Pietilä, M.K., Hæggström, E., Bamford, D.H., 2013. DNA ejection from an archaeal virus – A single-molecule approach. Biophysical Journal 104, 2264–2272.
Hartman, R., Eilers, B.J., Bollschweiler, D., et al., 2019. The molecular mechanism of cellular attachment for an archaeal virus. Structure 27, 1634–1646. (e1633).
He, F., Bhoobalan-Chitty, Y., Van, L.B., et al., 2018. Anti-CRISPR proteins encoded by archaeal lytic viruses inhibit subtype I-D immunity. Nature Microbiology 3, 461–469.
Krupovic, M., Cvirkaite-Krupovic, V., Iranzo, J., Prangishvili, D., Koonin, E.V., 2018. Viruses of archaea: Structural, functional, environmental and evolutionary genomics. Virus Research 244, 181–193.
Luk, A.W., Williams, T.J., Erdmann, S., Papke, R.T., Cavicchioli, R., 2014. Viruses of haloarchaea. Life 4, 681–715.
Makarova, K.S., Wolf, Y.I., Iranzo, J., et al., 2020. Evolutionary classification of CRISPR-Cas systems: A burst of class 2 and derived variants. Nature Reviews Microbiology 18, 67–83.
Martínez-Alvarez, L., Deng, L., Peng, X., 2017. Formation of a viral replication focus in Sulfolobus cells infected by the rudivirus Sulfolobus islandicus rod-shaped virus 2. Journal of Virology 91, e00486-17.
Medvedeva, S., Liu, Y., Koonin, E.V., et al., 2019. Virus-borne mini-CRISPR arrays are involved in interviral conflicts. Nature Communications 10, 5204.
Okutan, E., Deng, L., Mirlashari, S., et al., 2013. Novel insights into gene regulation of the rudivirus SIRV2 infecting Sulfolobus cells. RNA Biology 10, 875–885.
Ortmann, A.C., Brumfield, S.K., Walther, J., et al., 2008. Transcriptome analysis of infection of the archaeon Sulfolobus solfataricus with Sulfolobus turreted icosahedral virus. Journal of Virology 82, 4874–4883.
Prangishvili, D., Bamford, D.H., Forterre, P., et al., 2017. The enigmatic archaeal virosphere. Nature Reviews Microbiology 15, 724–739.
Quax, T.E., Voet, M., Sismeiro, O., et al., 2013. Massive activation of archaeal defense genes during viral infection. Journal of Virology 87, 8419–8428.
Quemin, E.R., Chlanda, P., Sachse, M., et al., 2016. Eukaryotic-like virus budding in archaea. mBio 7, e01439-16.
Quemin, E.R., Lucas, S., Daum, B., et al., 2013. First insights into the entry process of hyperthermophilic archaeal viruses. Journal of Virology 87, 13379–13385.
Sheppard, C., Werner, F., 2017. Structure and mechanisms of viral transcription factors in archaea. Extremophiles 21, 829–838.
Wang, F., Cvirkaite-Krupovic, V., Kreutzberger, M.A.B., et al., 2019. An extensively glycosylated archaeal pilus survives extreme conditions. Nature Microbiology 4, 1401–1410.
Wang, F., Baquero, D.P., Su, Z., et al., 2020. The structures of two archaeal type IV pili illuminate evolutionary relationships. Nature Communications 11, 3424.

Antiviral Defense Mechanisms in Archaea Qunxin She, Shandong University, Qingdao, China © 2021 Elsevier Ltd. All rights reserved.

Glossary
Anti-CRISPR (Acr) genes Viral genes that code for proteins that interact with effectors of CRISPR-Cas systems to inhibit their antiviral immunity.
cOA signal transduction pathway Cyclic oligoadenylates (cOAs) synthesized by type III CRISPR-Cas systems function as a second messenger that binds to RNases of the Csm6/Csx1 family and activates them for general cellular RNA degradation, leading to cell dormancy or cell death.
CRISPR-Cas systems The prokaryotic adaptive immune system that mediates small RNA-guided destruction of target nucleic acids.
PAM-dependent DNA interference One of the CRISPR immunity mechanisms, in which CRISPR-Cas systems specifically recognize a protospacer-adjacent motif to distinguish native from foreign DNA and target only foreign DNA for destruction.
Transcription-dependent CRISPR interference Another unique CRISPR immunity mechanism, in which CRISPR-Cas systems rely on target transcription to mediate DNA interference.

Introduction Archaea and bacteria coexist with mobile genetic elements (MGEs) such as viruses and plasmids in various environments. To maintain their ecological fitness, these microbes benefit from some interactions with viruses on the one hand and, on the other, have developed antiviral mechanisms to prevent the growth inhibition and cell death caused by harmful viruses. These mechanisms include restriction and modification (R–M) systems, abortive infection, toxin–antitoxin (TA) systems, prokaryotic Argonaute (pAgo), and the more recently identified DNA phosphorothioate modification and virus replication inhibition systems, all of which provide innate antiviral defense. About 12 years ago, antiviral immunity based on the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated) system was discovered; the system is present in about 90% of archaea and 40% of bacteria. To date, CRISPR-Cas represents the only adaptive immune system known in prokaryotes. Archaea and bacteria often thrive in the same environments, and they exchange genetic material via horizontal gene transfer. It is therefore not surprising that organisms of these two prokaryotic domains share many strategies of antiviral defense. Many bacterial antiviral systems have been studied in great detail, and this has facilitated the investigation of the corresponding archaeal systems. In particular, important progress has been made in the study of archaeal CRISPR immunity, due largely to the boom in CRISPR biology and CRISPR technology research in the past 12 years. Here, I mainly summarize the current understanding of archaeal CRISPR-Cas systems.

CRISPR-Cas: The Prokaryotic Adaptive Immune System There are two genetic entities in CRISPR-Cas: CRISPR arrays carry repeats that are interrupted by unique short DNA sequences (spacers) derived from segments of viruses and other mobile genetic elements (protospacers), whereas cas gene cassettes code for structural proteins and enzymes. For antiviral defense, the immune system starts by searching for protospacers on invading viruses and incorporating them into CRISPR arrays as spacers during the first encounter. Upon a subsequent encounter, small RNAs generated from the acquired spacers guide the immune system to specifically recognize the corresponding protospacers and target the viruses for destruction. Genes coding for Cas proteins are grouped together for each system, forming cas gene modules. Although distantly related Cas proteins can share limited or no sequence similarity, those of the same functional category have adopted very similar structures. At present, CRISPR-Cas systems are categorized into two broad classes and six main types based on their gene synteny, the sequence similarity of their Cas proteins, and their mechanisms of interference. Class 1 CRISPR-Cas systems, which comprise types I, III, and IV, employ multisubunit effector complexes to mediate CRISPR interference, while those of class 2, which comprise types II, V, and VI, rely on a single multidomain protein to perform the same function. It has been demonstrated that type I, II, and V systems target DNA for destruction, whereas those of type VI show RNA interference (RNAi). Furthermore, type III CRISPR systems exhibit multiple CRISPR interference activities, including dual DNA and RNA interference and activation of a signal transduction pathway in which cyclic oligoadenylates (cOAs) act as the second messenger to activate Cas-accessory RNases, leading to indiscriminate cellular RNA degradation. Our current understanding of the molecular mechanisms of archaeal CRISPR-Cas systems is summarized below.
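The classification summarized above can be written down as a small lookup table, which makes it easy to query which types belong to a class or target a given nucleic acid. The following is only a sketch: the names CrisprType, CRISPR_TYPES, types_in_class, and types_targeting are illustrative and do not come from any library, and the target of type IV systems is left empty because it is not stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrisprType:
    name: str
    crispr_class: int  # 1 = multisubunit effector complex, 2 = single multidomain effector
    targets: tuple     # nucleic acids targeted according to the text; empty if not stated


# Assignments follow the summary in the text above.
CRISPR_TYPES = {
    "I": CrisprType("I", 1, ("DNA",)),
    "II": CrisprType("II", 2, ("DNA",)),
    "III": CrisprType("III", 1, ("DNA", "RNA")),  # also signals via cOA
    "IV": CrisprType("IV", 1, ()),                # target not stated in the text
    "V": CrisprType("V", 2, ("DNA",)),
    "VI": CrisprType("VI", 2, ("RNA",)),
}


def types_in_class(crispr_class):
    """Names of all types belonging to the given class."""
    return sorted(t.name for t in CRISPR_TYPES.values() if t.crispr_class == crispr_class)


def types_targeting(nucleic_acid):
    """Names of all types reported to target the given nucleic acid ('DNA' or 'RNA')."""
    return sorted(t.name for t in CRISPR_TYPES.values() if nucleic_acid in t.targets)


print(types_in_class(1))
print(types_targeting("RNA"))
```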

Encyclopedia of Virology, 4th Edition, Volume 4 doi:10.1016/B978-0-12-809633-8.20984-9

Spacer Acquisition and its Regulation in Archaea Spacer acquisition is the first step toward the generation of adaptive CRISPR immunity in prokaryotes. The process involves two core proteins, Cas1 and Cas2, that form a hexameric integrase. The enzyme complex contains four subunits of Cas1 and two subunits of Cas2. It interacts with the first repeat of a CRISPR array on the one hand and binds a protospacer, a DNA segment derived from invading nucleic acids, on the other. The integration reaction then adds the DNA segment into the CRISPR array at the end of the first repeat, forming the new first spacer. For this reason, the order of spacers in CRISPR arrays provides, on the host chromosome, a chronological record of previous infections by viruses and other genetic elements. Archaea and bacteria often contain additional genes in the same locus as the cas1 gene. For example, in the genomes of Sulfolobus species, cas1 and cas2 are clustered with three other genes, csa3a, csa1, and cas4, forming the adaptation Cas (aCas) module. In this module, csa3a is the first gene and codes for a transcription factor. Based on the gene organization, Csa3a may function as a regulator that controls the expression of all aCas genes in the operon. Indeed, Csa3a activates not only the expression of aCas genes but also that of an array of DNA repair genes. Since bacterial DNA repair proteins have been shown to be involved in protospacer selection, these archaeal DNA repair proteins are implicated in the same process. Csa3-like transcription factors are interesting since they carry a ligand-binding domain called the CRISPR-associated Rossmann fold (CARF). Other CARF-containing CRISPR-accessory proteins include the Csm6 and Csx1 RNases associated with type III CRISPR-Cas systems. Type III systems catalyze the synthesis of the cOA second messenger that interacts with the CARF domain of Csm6/Csx1 RNases (see below). However, the ligand that binds to the Sulfolobus islandicus Csa3 CARF domain remains to be identified. The proteins encoded by the remaining two genes (csa1 and cas4) belong to the so-called Cas4 superfamily. Cas4 proteins interact with Cas1 and specify the upstream protospacer adjacent motif (PAM).
Together they define the spacer length and recognize a downstream motif to ensure proper generation of prespacers as well as functional spacer integration. They are essential for efficient spacer acquisition in archaea. Strikingly, cas4 genes are not only widespread on archaeal and bacterial chromosomes; viruses and plasmids also code for such proteins, although these are only distantly related to those present in aCas modules. Possibly, the viral Cas4 proteins also interact with the CRISPR acquisition machinery to downregulate spacer acquisition or to produce nonfunctional spacers, thereby disarming CRISPR immunity. If so, the virus-encoded Cas4 proteins can be classified as anti-CRISPR proteins acting at the stage of spacer acquisition.
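The polarity of spacer integration described in this section, in which every new spacer becomes the first spacer so that the array reads as a chronological record, can be illustrated with a minimal sketch. The array is modeled as a plain list of spacer labels with the leader-proximal spacer at index 0; the function names and labels are hypothetical and serve only as illustration.

```python
def acquire_spacer(array, protospacer):
    """Return a new array with the protospacer added as the leader-proximal (first) spacer."""
    return [protospacer] + list(array)


def infection_order(array):
    """Recorded encounters from oldest to newest (leader-distal spacers are the oldest)."""
    return list(reversed(array))


# Three successive encounters with hypothetical invaders.
array = []
for invader in ["virusA", "plasmidB", "virusC"]:
    array = acquire_spacer(array, invader)

print(array)                   # leader-proximal spacer first
print(infection_order(array))  # chronological order of encounters
```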

Expression of CRISPR Loci and crRNA Biogenesis All CRISPR loci show polarity as they provide a fossil record of MGE infection on the host chromosome. At the proximal end, there is a leader sequence that specifies the site for incorporation of new spacers, as described above. This leader region also contains a promoter that drives the expression of the entire CRISPR array, producing precursor crRNAs. CRISPR loci can carry hundreds of spacers, of which the leader-proximal ones are generally expressed at a high level. There are exceptions, however, since crRNAs generated from some leader-distal spacers can also be very abundant. There are two possible reasons for this differential expression. First, spacers can carry promoter-like sequences (internal promoters) that increase the expression level of downstream spacers; second, DNA-binding proteins, such as the Sulfolobus solfataricus CRISPR-repeat binding protein, may interact with CRISPR loci and modulate their expression. Since CRISPR loci are generally expressed at a high level in archaea, the significance of this modulation remains elusive. After transcription, the precursor crRNAs are processed by a specialized enzyme of the Cas6 endonuclease family, producing small RNAs that each consist of a single repeat-spacer unit. Although some bacterial CRISPR-Cas systems also employ other Cas proteins for crRNA processing, all archaea studied so far utilize only Cas6 for crRNA biogenesis. In fact, many archaea contain several CRISPR-Cas systems and more than one cas6 gene. For example, S. solfataricus has six CRISPR loci with two families of repeats, four cas6 genes, and three different types of effector complex. The encoded processing enzymes show optimal activity on distinct repeat sequences, which fits with specific crRNA processing for the different types of antiviral immunity. Nevertheless, S. islandicus REY15A has only one cas6 gene although this archaeon contains three active CRISPR systems. Thus, this single Cas6 is responsible for producing the crRNAs used by all three immune systems. Furthermore, since the sizes of mature crRNAs differ among these antiviral systems, the crRNAs generated by Cas6 are further processed, and the mechanism of crRNA maturation can be type-specific.
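Processing of a precursor crRNA into single repeat-spacer units can be sketched as string slicing. The sketch below assumes that Cas6 cuts at a fixed position inside every repeat, so that each product carries the 3′ part of the upstream repeat as its 5′-handle, followed by one spacer and the remainder of the downstream repeat. The sequences, the 8-nt handle length, and the function names are illustrative assumptions rather than values taken from this article, and type-specific trimming of the 3′ end is not modeled.

```python
def build_pre_crrna(repeat, spacers):
    """Precursor transcript of the form repeat-spacer-repeat-...-spacer-repeat."""
    parts = [repeat]
    for spacer in spacers:
        parts += [spacer, repeat]
    return "".join(parts)


def cas6_process(pre_crrna, repeat, handle_len):
    """Cut inside every repeat, handle_len nt before its 3' end.

    Returns the internal products, each holding exactly one spacer.
    Assumes the repeat sequence does not also occur within spacers.
    """
    cut_sites = []
    start = pre_crrna.find(repeat)
    while start != -1:
        cut_sites.append(start + len(repeat) - handle_len)
        start = pre_crrna.find(repeat, start + len(repeat))
    return [pre_crrna[a:b] for a, b in zip(cut_sites, cut_sites[1:])]


# Hypothetical repeat and spacers, for illustration only.
REPEAT = "GAUUAAUCCCAAAAGGAAUUGAAAG"
SPACERS = ["ACGUACGUAC", "UUGGCCAAUU"]

for unit in cas6_process(build_pre_crrna(REPEAT, SPACERS), REPEAT, handle_len=8):
    print(unit)
```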

Archaeal Type I CRISPR-Cas Systems Bioinformatics analysis has revealed eight subtypes of type I CRISPR-Cas systems (I-A through I-G, I-U), among which I-A, I-B, I-C, I-D, I-E, and I-G are found in archaea. Systems of three type I subtypes have been characterized in archaea, including the I-A systems of S. solfataricus, S. islandicus, Aeropyrum pernix, and Pyrococcus furiosus, and the I-B and I-G systems of Haloferax volcanii and P. furiosus, respectively (Fig. 1). These archaeal systems code for four structural Cas proteins: Cas7 is a backbone subunit that is present in multiple copies and binds to crRNA; Cas5 interacts with the 5′-handle of crRNAs; Cas11 is a small subunit interacting with Cas7 and crRNA; and Cas8, the largest subunit, interacts with Cas5 and Cas7 and functions in PAM recognition. They also code for their own Cas6 to generate mature crRNAs and for Cas3 to mediate DNA interference. Cas3 is the type-specific Cas protein of type I systems. For most type I systems, the enzyme harbors a helicase and a nuclease domain, but in I-A systems the two domains are split into two proteins, denoted Cas3′ and Cas3″, that contain the helicase domain and the nuclease domain, respectively. Cas proteins of type I systems are known to form CRISPR-associated complexes for antiviral defense (CASCADE) to mediate antiviral immunity. Two basic types of CASCADE complexes are known for type I systems. Those represented by the Escherichia coli I-E system are not associated with Cas3, and the enzyme is recruited to the CASCADE upon recognition of a PAM, before DNA silencing. In I-A subtype systems, Cas3′ and Cas3″ are integral parts of CASCADE, as shown for the A. pernix and P. furiosus I-A systems. The other studied archaeal type I systems are the H. volcanii I-B system and the P. furiosus I-G complex, and they resemble the I-E effector of E. coli. To date, the structure of an archaeal type I CASCADE remains to be determined. Nevertheless, the biological function of archaeal type I CRISPR-Cas systems has been demonstrated using the interference plasmid assay (also called the invader plasmid assay). In this approach, an interference plasmid is constructed that carries a protospacer matching one of the spacers in the chromosomal CRISPR arrays. The plasmid also carries a PAM sequence, which is absent from the CRISPR loci in the host genome. Both interference plasmids and reference plasmids were introduced into archaeal cells by transformation, and the former showed a 100- to 1000-fold lower transformation efficiency than the latter. Further, all transformants on plates were found to have lost the I-A CRISPR immunity, due to spontaneous deletion of the targeting spacer, plasmid rearrangement, or mutation of an I-A cas gene. They are called escape mutants. Moreover, since the PAM is absent from the spacer on the chromosomal CRISPR array, self-targeting by CASCADE is avoided. The mechanism of the S. islandicus I-A DNA interference is illustrated in Fig. 2.

Fig. 1 Gene organization of three type I subtypes of CRISPR-Cas systems. I-A: S. islandicus I-A module; I-B: H. volcanii I-B system; I-G: P. furiosus I-G module. For I-A and I-G systems, their backbone subunits (Cas7) are also called Csa2 and Cst2; large subunits (Cas8) as Csa4 and Cst1, respectively. Small subunit (Cas11) is only present in I-A, i.e., Csa5.
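The logic behind the interference plasmid assay, and behind the avoidance of self-targeting, can be expressed as a simple check: a DNA is targeted only if it contains a protospacer matching a spacer and that protospacer is immediately preceded by a PAM. The sketch below is a deliberate simplification. It examines a single strand, requires a perfect spacer match, and uses the CCN motif given for the S. islandicus I-A system in the legend of Fig. 2; all sequences and function names are hypothetical.

```python
def pam_matches(seq, pam):
    """Compare a DNA stretch with a PAM pattern in which 'N' stands for any base."""
    return len(seq) == len(pam) and all(p == "N" or p == s for p, s in zip(pam, seq))


def is_targeted(dna, spacers, pam="CCN"):
    """True if any spacer matches a protospacer in dna that directly follows a PAM."""
    for spacer in spacers:
        pos = dna.find(spacer)
        while pos != -1:
            if pos >= len(pam) and pam_matches(dna[pos - len(pam):pos], pam):
                return True
            pos = dna.find(spacer, pos + 1)
    return False


# Hypothetical sequences, for illustration only.
SPACER = "ATGCGTACGTTAGC"
interference_plasmid = "TTTCCA" + SPACER + "GGG"      # PAM followed by the protospacer
reference_plasmid = "TTTCCAGGGTTTAAACCC"              # no protospacer
chromosomal_array = "GATTGAAAG" + SPACER + "GATTAAT"  # spacer flanked by repeats, no PAM

for name, dna in [("interference", interference_plasmid),
                  ("reference", reference_plasmid),
                  ("CRISPR array", chromosomal_array)]:
    print(name, is_targeted(dna, [SPACER]))
```

In this toy model only the interference plasmid is targeted; the chromosomal array escapes because the repeat sequence preceding the spacer does not provide a PAM.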

Archaeal Type III CRISPR-Cas Systems Four subtypes of type III CRISPR-Cas systems are known: III-A, III-B, III-C, and III-D. III-A and III-D are also called Csm systems, whereas III-B and III-C are called Cmr systems. While III-A systems dominate in bacteria, III-B systems are primarily found in archaea. To date only a few archaeal type III systems have been characterized; these include the Cmr systems of P. furiosus, S. solfataricus, and S. islandicus, and the Csm systems of Thermococcus onnurineus (III-A) and S. solfataricus (III-D) (Fig. 3). The first type III effector to be characterized was the P. furiosus Cmr ribonucleoprotein complex, which specifically cleaves target RNA. Since then, a number of III-B effectors have been characterized. These effector complexes contain six different subunits (Cmr1 through Cmr6), among which Cmr1, Cmr4, and Cmr6 are backbone subunits possessing the Cas7 fold, Cmr3 has the Cas5 structure, and Cmr2 and Cmr5 are the large and small subunits, respectively. Each system forms effector complexes of two distinct sizes: the larger one has the subunit stoichiometry Cmr1₁2₂3₂4₄5₃6₁ and carries a crRNA of 46 or 45 nt, while the smaller one has three Cmr4 and two Cmr5 subunits and a shorter crRNA (40 or 39 nt). The two complexes share the same general structure: Cmr4 and Cmr5 form the core backbone, in which the Cmr4 subunits are packed in a head-to-tail fashion. The backbone is capped with Cmr2 and Cmr3 at the head and with Cmr6 and Cmr1 at the tail. The crRNA passes through the entire effector complex. Furthermore, Cmr2 and Cmr3 form a heterodimer that binds specifically to the 5′-handle of the crRNA, whereas the tail subunits are loosely associated with the core helix. This arrangement in the ribonucleoprotein (binary effector) complex gives the tail the flexibility to mediate target RNA capture, yielding the ternary effector complex that is active in multiple interference activities: target RNA cleavage, RNA-activated DNA cleavage, and synthesis of the cOA second messenger. Target RNA cleavage by archaeal type III-B effector complexes shows a 6-nt periodicity, as in other type III systems. The structures resolved so far make it clear that the active sites of RNA cleavage are located on Cmr4. Once target RNAs are cleaved and released from the effector complex, the immune system returns to the binary state, which is inactive in DNA interference. Transcription-dependent DNA interference by the Cmr effector relies on two activities, RNA-activated DNA cleavage and synthesis of cOAs, both of which are carried out by Cmr2, the large subunit belonging to the Cas10 family. The mechanism can be summarized as follows: (1) transcription of a target DNA produces a cognate target RNA (CTR) that binds to the effector in the process of target RNA capture; (2) the interaction between the crRNA and its CTR (which shows mismatches between the 3′-anti-tag of the target RNA and the 5′-handle of the crRNA) induces conformational changes in the effector that activate the Cmr2 HD motif for DNA cleavage; (3) at the same time, Cmr2 is activated for the cyclase activity that produces cOA, which functions as a second messenger to activate Csx1/Csm6 family proteins for cellular RNA destruction, leading to cell dormancy or cell death; and (4) target RNA cleavage and release restore the inactive form of the Cmr effector complex (Fig. 4).
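The 6-nt periodicity of target RNA cleavage can be illustrated with a short sketch. It assumes, purely for illustration, one cut per Cmr4 subunit (the text places the active sites on Cmr4, with four copies in the larger complex and three in the smaller one) and a hypothetical position for the first cut; neither the offset nor the function names come from this article.

```python
def cleavage_sites(n_cmr4, first_site, period=6):
    """Cut positions along the target RNA, one per Cmr4 subunit, spaced by the period."""
    return [first_site + period * i for i in range(n_cmr4)]


def cleave(target_rna, sites):
    """Split the target RNA at the given positions (sites outside the RNA are ignored)."""
    bounds = [0] + [s for s in sites if 0 < s < len(target_rna)] + [len(target_rna)]
    return [target_rna[a:b] for a, b in zip(bounds, bounds[1:])]


# A hypothetical 30-nt target cut by a complex carrying four Cmr4 subunits.
target = "ACGUACGUACGUACGUACGUACGUACGUAC"
sites = cleavage_sites(n_cmr4=4, first_site=5)  # first_site is an assumed offset
print(sites)
print([len(fragment) for fragment in cleave(target, sites)])
```

Under these assumptions the internal fragments are all 6 nt long, and a complex with one Cmr4 subunit fewer makes one cut fewer.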


Fig. 2 Schematic of antiviral defense by the S. islandicus I-A system. Gene organization of the I-A module (type I-A) and the adaptation cas module (aCas) as well as two CRISPR loci (with 115 and 93 repeats, respectively) in S. islandicus are shown at the top. Transcription of CRISPR loci yields precursor CRISPR RNA (pre-crRNA) that is processed by the Cas6 endonuclease to yield short crRNAs with a single spacer (crRNA). Expression of I-A cas genes gives I-A Cas proteins, including the Cas3′ helicase and the Cas3″ nuclease. I-A Cas proteins and crRNA form a ribonucleoprotein (type I-A RNP) complex. In the interference stage, the effector scans ds-DNA for a target site; once a protospacer is found, the large subunit (Cas8) recognizes the CCN protospacer adjacent motif and authenticates the immune response. The Cas3′ helicase unwinds the double-stranded DNA and the Cas3″ nuclease cuts the target DNA sequence, resulting in target DNA silencing.

Fig. 3 Gene organization of type III subtypes of CRISPR-Cas systems. Four subtypes (III-A, -B, -C, -D) and one variant, III-Bv, are identified by analysis of gene synteny and sequence similarity of Cas proteins. Representatives of all subtypes except III-C have been characterized. III-B and III-C are also called Cmr systems and III-A and III-D, Csm systems. Annotations of individual subunits are given and their types of Cas proteins are also indicated: 10-Cas10, 7-Cas7, 5-Cas5, and S-small subunit. Cas10 is the type-specific protein and the most conserved Cas protein in this type of CRISPR immunity.


Fig. 4 Schematic of antiviral defense by the S. islandicus III-B Cmr-a system. Gene organization of the adaptation cas module (aCas) and the III-B Cmr-a module as well as two CRISPR loci (with 115 and 93 repeats, respectively) in S. islandicus are shown at the top. Transcription of CRISPR loci yields precursor CRISPR RNA (pre-crRNA) that is processed by the Cas6 endonuclease to yield short crRNAs with a single spacer (crRNA). Expression of III-B cmr genes gives six Cmr proteins. Cmr proteins and crRNA form a ribonucleoprotein (III-B RNP) complex. Transcription of the target gene produces mRNAs containing the cognate target RNA. Cmr1 mediates target RNA capture at the seed sequence region, and base pairing between the crRNA and the target RNA induces major conformational changes that yield a ternary Cmr-a complex. Mismatches between the 5′-handle of the crRNA and the 3′-anti-tag of the target RNA authenticate the CRISPR immunity, including RNA-activated DNA cleavage and synthesis of the cOA second messenger. cOA binds to the CARF domain of Csx1 and activates the RNase of its HEPN domain, which indiscriminately cleaves viral and cellular RNAs in the cOA signal transduction pathway.


To date, all known type III systems possess both the DNA cleavage and the cOA synthesis activities, even though each activity is theoretically sufficient to mediate type III antiviral immunity. The current hypothesis is that the cOA pathway is required for silencing a large amount of invading nucleic acids, while the DNase is essential for the final clearance of MGEs. Among the three Cmr activities, the third, target RNA cleavage, mainly functions in the spatiotemporal control of the indiscriminate DNase and of cOA synthesis. This mechanism ensures that the indiscriminate destruction of cellular DNA and RNA by type III immunity is active only during a very brief period. Prompt target RNA cleavage and release of the cleavage products readily inactivate the Cmr effector. This provides an important mechanism for recovering from cell dormancy, or for avoiding the cell death that would otherwise be induced by an uncontrolled type III immune response. CRISPR arrays are also transcribed on the antisense strand, producing countertranscripts that are fully complementary to crRNAs, including the 5′-handle; these are called noncognate target RNAs. Because of this full complementarity, they cannot trigger DNA cleavage or cOA synthesis, which provides the mechanism of self-immunity avoidance. Effective control of type III CRISPR-Cas immunity also requires inactivation of the cOA second messenger. S. solfataricus codes for ring nucleases that specifically degrade cOAs. Like Csx1 and Csm6, ring nucleases are CARF proteins, but they do not possess any conventional nuclease activity. Rather, they have a different CARF pocket that not only binds cOA but also cleaves the messenger molecule in a metal-independent manner. In a Csx1 RNA cleavage assay, the ring nuclease effectively deactivates the Csx1 nuclease by competitively removing cA4 (cyclic tetra-adenylate) from the reaction.
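The distinction between cognate and noncognate target RNAs can be made concrete with a small sketch. A target RNA is treated as cognate when it contains the sequence complementary to the spacer but its 3′-anti-tag is not fully complementary to the 5′-handle of the crRNA, and as noncognate when the anti-tag pairs perfectly with the handle, as in antisense transcripts of the CRISPR array. This is a simplified illustration with hypothetical sequences and function names: perfect spacer pairing is required, and any mismatch in the anti-tag is taken to be sufficient for activation.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def classify_target(crrna_handle, crrna_spacer, target_rna):
    """Classify a target RNA as 'no target', 'cognate' or 'noncognate'."""
    protospacer = reverse_complement(crrna_spacer)
    pos = target_rna.find(protospacer)
    if pos == -1:
        return "no target"
    # The 3'-anti-tag is the stretch directly 3' of the protospacer on the target RNA.
    end = pos + len(protospacer)
    anti_tag = target_rna[end:end + len(crrna_handle)]
    if anti_tag == reverse_complement(crrna_handle):
        return "noncognate"  # full complementarity: no DNA cleavage, no cOA synthesis
    return "cognate"         # mismatched anti-tag: DNase and cOA synthesis are activated


# Hypothetical crRNA: 5'-handle followed by the spacer.
HANDLE = "AUUGAAAG"
SPACER = "ACGUACGUACGGAUCC"

viral_mrna = "GG" + reverse_complement(SPACER) + "CAUGGACU" + "AA"
countertranscript = reverse_complement(HANDLE + SPACER)  # antisense CRISPR transcript

print(classify_target(HANDLE, SPACER, viral_mrna))
print(classify_target(HANDLE, SPACER, countertranscript))
```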

Novel Archaeal CRISPR-Cas Systems More recently, searches for CRISPR-Cas systems in metagenomic databases have revealed that archaea also code for class 2 CRISPR-Cas systems. These include a type II system identified in nanoarchaea, which carries cas1, cas2, and cas4 for spacer acquisition together with a cas9-like gene for interference and a hypervariable CRISPR array, and a group of 24 different cas14 variant genes, falling into three subgroups, identified in uncultivated archaea. The latter are a unique group of archaeal organisms that form symbioses with other organisms and are characterized by small cell size and small genome size. Their type-specific proteins, named Cas14, have 400–700 amino acids and are therefore much smaller than those of bacterial class 2 systems (950–1400 aa). Trans-activating CRISPR RNAs (tracrRNAs) were also identified from metagenomic transcriptome data, which made it possible to reconstitute Cas14 effector complexes in E. coli and study their antiviral immunity. Interestingly, Cas14 is capable of binding to ssDNA and mediating targeted ssDNA cleavage. So far, this novel type of CRISPR immunity has been found only in archaea.

Anti-CRISPR Proteins of Archaeal Viruses Viruses code for proteins that counteract CRISPR immunity, allowing them to proliferate in their hosts. The corresponding anti-CRISPR (acr) genes are widely present in bacterial and archaeal viruses and mediate resistance to the immunity of CRISPR-Cas systems of types I, II, and V. In archaea, anti-CRISPR activity has been investigated for Sulfolobus viruses. The Sulfolobus virus SMV1 presumably codes for Acr-IA proteins, which remain to be identified, since it is resistant to type I-A immunity in S. islandicus REY15A. The first archaeal acr gene was identified in SIRV2 and SIRV3, two lytic rudiviruses. These viruses can infect Sulfolobus islandicus LAL14/1 despite the presence of functional CRISPR-Cas systems of subtypes I-A, I-D, and III-B and of spacers complementary to their genomes. Further studies revealed that the SIRV3 gp02 gene encodes a protein (Acr-ID1) that interacts with the I-D Cas10 and inhibits I-D immunity. Interestingly, Acr-ID1 has as many as 50 homologs in archaeal viral genomes. Possibly, archaeal viruses code for even more Acr-IA and Acr-IIIB proteins, since these CRISPR-Cas systems are more prevalent in this prokaryotic domain.

Archaeal Innate Antiviral Systems As in bacteria, innate immune systems in archaea include abortive infection, R–M systems, pAgo, and TA systems identified by comparative genomic studies, but most of these innate antiviral systems remain to be investigated. Nevertheless, a few representatives have been characterized, including the Sulfolobus acidocaldarius SuaI R–M system, the Pyrococcus furiosus Argonaute (PfAgo) and TA system, and the Haloterrigena jeotgali DNA phosphorothioate modification and virus replication inhibition system. Here I briefly introduce the last three systems, since the SuaI R–M system functions like the well-known bacterial ones. In eukaryotes, Agos function as key enzymes in the RNAi pathway, and they target RNA transcripts using short 5′-phosphorylated RNAs as guides. Their prokaryotic homologs, pAgos, are present in about 32% of archaea and 9% of bacteria. PfAgo shows DNA-guided DNA cleavage activity, and the system reduces the plasmid transformation rate in P. furiosus by 30%–50%. Nevertheless, how archaeal Ago proteins function in antiviral defense remains to be investigated. Archaea code for an arsenal of TA systems; for example, Sulfolobus islandicus REY15A and LAL14/1 code for 21 or 22 TA systems, many of which are homologous to one another. This high homology has strongly hindered their functional characterization, since the activity of a given system can only be revealed when all other homologous systems are inactivated in the same archaeal host. Nevertheless, some distantly related P. furiosus TA systems function as selection markers in Pyrococcus yayanosii. This represents the first genetic demonstration of archaeal TA systems that function in inducing programmed cell death. Archaeal TA systems therefore function under the same principle as their bacterial homologs.


A completely novel archaeal defense system was identified in Haloterrigena jeotgali. It consists of genes homologous to the bacterial dndCDEA system, which mediates phosphorothioate modification, and the PbeABCD system, which halts virus propagation by inhibiting DNA replication. This unique combination allows the haloarchaeon to inhibit virus infection, and the defense system is conserved in several archaea.

Future Perspectives Archaea and bacteria have devoted a large portion of their coding capacity to an arsenal of antiviral defense systems, including R–M systems, abortive infection, TA systems, archaeal Ago, DNA phosphorothioate modification, and virus replication inhibition systems, as well as the prokaryotic adaptive immunity encoded by CRISPR-Cas systems. These antiviral mechanisms constitute a multilayered virus defense network, allowing archaea to defend themselves effectively against invasion by mobile genetic elements. Furthermore, many archaea carry multiple CRISPR-Cas systems that utilize very different immunity mechanisms to protect their hosts against invading nucleic acids. This raises important questions such as: How do these antiviral systems cooperate in the immune defense? Why are they all required in the arms race between microbes and their genetic elements? As it is now well established that genetic elements play important roles in horizontal gene transfer and evolution as well as in the maintenance of biodiversity on Earth, these antiviral systems may contribute to fine-tuning the arms race between archaea and their viruses, which in turn yields optimal horizontal gene transfer and increases the ecological fitness of archaeal organisms during evolution.

Further Readings Athukoralage, J.S., Rouillon, C., Graham, S., Gruschow, S., White, M.F., 2018. Ring nucleases deactivate type III CRISPR ribonucleases by degrading cyclic oligoadenylate. Nature 562, 277–280. Deng, L., Garrett, R.A., Shah, S.A., Peng, X., She, Q., 2013. A novel interference mechanism by a type IIIB CRISPR-Cmr module in Sulfolobus. Molecular Microbiology 87, 1088–1099. Elmore, J., Deighan, T., Westpheling, J., Terns, R.M., Terns, M.P., 2015. DNA targeting by the type I-G and type I-A CRISPR-Cas systems of Pyrococcus furiosus. Nucleic Acids Research 43, 10353–10363. Garrett, R.A., Shah, S.A., Erdmann, S., et al., 2015. CRISPR-Cas adaptive immune systems of the sulfolobales: Unravelling their complexity and diversity. Life 5, 783–817. Gudbergsdottir, S., Deng, L., Chen, Z., et al., 2011. Dynamic properties of the Sulfolobus CRISPR/Cas and CRISPR/Cmr systems when challenged with vector-borne viral and plasmid genes and protospacers. Molecular Microbiology 79, 35–49. He, F., Bhoobalan-Chitty, Y., Van, L.B., et al., 2018. Anti-CRISPR proteins encoded by archaeal lytic viruses inhibit subtype I-D immunity. Nature Microbiology 3, 461–469. Koonin, E.V., Makarova, K.S., Zhang, F., 2017. Diversity, classification and evolution of CRISPR-Cas systems. Current Opinion in Microbiology 37, 67–78. Liu, T., Liu, Z., Ye, Q., et al., 2017. Coupling transcriptional activation of CRISPR-Cas system and DNA repair genes by Csa3a in Sulfolobus islandicus. Nucleic Acids Research 45, 8978–8992. Maier, L.K., Stachler, A.E., Brendel, J., et al., 2019. The nuts and bolts of the Haloferax CRISPR-Cas system I-B. RNA Biology 16, 469–480. Majumdar, S., Zhao, P., Pfister, N.T., et al., 2015. Three CRISPR-Cas immune effector complexes coexist in Pyrococcus furiosus. RNA 21, 1147–1158. Terns, R.M., Terns, M.P., 2013. The RNA- and DNA-targeting CRISPR-Cas immune systems of Pyrococcus furiosus. Biochemical Society Transactions 41, 1416–1421. 
Zhang, J., Graham, S., Tello, A., Liu, H., White, M.F., 2016. Multiple nucleic acid cleavage modes in divergent type III CRISPR systems. Nucleic Acids Research 44, 1789–1799. Zhang, Y., Lin, J., Feng, M., She, Q., 2018. Molecular mechanisms of III-B CRISPR-Cas systems in archaea. Emerging Topics in Life Sciences 2, 483–491.

Discovery of Archaeal Viruses in Hot Spring Environments Using Viral Metagenomics Jennifer Wirth, Montana State University, Bozeman, MT, United States Jacob H Munson-McGee, Montana State University, Bozeman, MT, United States and Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, United States Mark J Young, Montana State University, Bozeman, MT, United States © 2021 Elsevier Ltd. All rights reserved.

Introduction High-temperature environments have intrigued life scientists ever since the first hyperthermophiles were discovered in the hot springs of Yellowstone National Park, USA (YNP) in 1966. Since then the microbial communities of hot springs in YNP, as well as other high-temperature environments from around the world, have been extensively studied, resulting in the discovery of archaeal and bacterial thermophilic species (organisms with a growth optimum of 60–80 °C) and hyperthermophilic species (organisms with a growth optimum of >80 °C). Surprisingly, no hyperthermophilic eukaryotic species have been identified. While high-temperature environments can support both bacterial and archaeal microbial communities, high-temperature acidic environments (pH < 4) tend to be dominated by archaeal species. Viruses are ubiquitous with cellular life. In any environment where cells are replicating, one is likely to find viruses associated with those cells. Not surprisingly, archaeal viruses are expected to be present in archaeal-dominated high-temperature environments. However, in part due to the limited number of archaeal species that have been cultured, we know of only a limited number of archaeal viruses from culture-based studies. In order to overcome this limitation, virologists have increasingly turned to viral metagenomics, the deep sequencing of viral communities directly from environmental samples, to assess viral diversity and to assemble viral genomes. These studies have greatly expanded our knowledge of archaeal viruses in extreme environments, especially with regard to their diversity and unique adaptations for replication in high-temperature environments. As of June 2019, 93 complete archaeal virus genomes had been deposited at NCBI, and many more are expected in the coming years. Regardless of the geographic location, it is the underlying geochemistry of the hot spring that determines the microbial and virus community structure.
That is because these are chemolithotroph-dominated microbial communities, which derive their energy from the oxidation of reduced inorganic compounds such as ammonia, ferrous iron, hydrogen sulfide, or hydrogen. Beyond temperature, sulfate levels and pH are major parameters that influence the microbial and virus community composition. As sulfide oxidation decreases pH, the resulting acidity may leach certain trace elements (Fe, Al, Co, Pb, Cd, V, and others) from the rock and sediment through which geothermal water passes to reach the surface, or from recirculating acid-sulfate water. Likely due to the energy constraints on microbial life, high-temperature low-pH hot springs typically support low cell densities (typically <10^6 cells/ml) and low free virus particle concentrations (typically <10^5 particles/ml). Cellular and viral metagenomic studies of these environments have revealed relatively simple microbial communities, usually consisting of <10 archaeal species, and viral communities made up of ~100 different viral types. An additional benefit of these systems is that viruses are the only microbial predators in these hot springs, so the role of viruses can be examined directly. These simplified cellular and viral communities provide a tractable natural system in which host-virus interactions can be examined in detail.

Archaeal viruses remain enigmatic biological entities. They are the least characterized group of viruses when compared with viruses infecting Bacteria and Eukarya. Although only a limited number of archaeal viruses have been characterized, they display a surprising diversity of virion morphologies and viral gene content. This is especially true for archaeal viruses present in hot springs. Archaeal viruses have been isolated from only two of the 18 proposed or recognized archaeal phyla.
All known archaeal viruses from hot springs infect members of the Crenarchaeota, while archaeal viruses from high-salt environments infect Euryarchaeota. This bias towards these two phyla is likely only a function of the environments in which archaeal viruses have been most intensively searched for. It is likely that all phyla of Archaea support viruses. More recently, metagenomic studies have identified viruses predicted to infect four additional phyla (Candidatus Bathyarchaeota, Nanoarchaeota, Candidatus Nanohaloarchaeota and Thaumarchaeota), but none of these viruses have been isolated to date.

Below we discuss the advantages and challenges of using metagenomics to identify and characterize archaeal viruses from high-temperature environments. First, we review the practical considerations for collecting and processing viral samples for metagenomic analysis from hot spring environments. Next, we discuss the important variables to address in next generation sequencing (NGS) and sequence analysis that are unique to archaeal viral metagenomic studies. Finally, we review the major findings of applying metagenomic and other culture-independent tools to understanding archaeal virus community structure and function in hot springs (Fig. 1).

Fig. 1 Work flow of combined culture-dependent and culture-independent approaches to achieve virus characterization, including host identification, from hot spring environments.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20985-0

Sample Collection and Processing

Acidic terrestrial hot springs offer both unique advantages and challenges for investigators. The harsh chemical and physical environment selects for a small number of specific cell types, and thus their viruses. This selection can be leveraged to address complex ecological questions in a naturally simplified community. However, these same conditions present a number of difficulties, ranging from personnel safety to technical challenges such as low host and virus densities and the need to prevent acid hydrolysis of biomolecules.

Sampling terrestrial thermal fields requires preplanning and must be carried out by trained, experienced personnel with appropriate safety gear. It is essential to plan in advance the specific purpose for which the sample will later be used. For example, the sampling strategy for culturing new hosts and viruses is different from the sampling strategy required for the isolation of environmental nucleic acids. Sample site selection is also critical and depends on the purpose of the sampling. Many thermal features are ephemeral, so if long-term temporal sampling over a time frame of years is required, appropriate site selection is critical. Furthermore, most thermal fields have a varied and fluctuating set of geochemical conditions. As a practical matter, field measurements of temperature, pH and conductivity can be valuable in choosing an appropriate hot spring to sample.

Hot springs, especially acidic hot springs, can be dangerous places to sample because of the often turbulent, boiling acidic waters and the demineralization of the soils present in thermal fields. Walking in thermal areas is often akin to traveling on glaciers, except that in thermal areas the crevasses contain boiling acid and steam. Sampling tools, including but not limited to extendable sampling poles, protective clothing, eyewear, and boots, are important for safe sampling.

In planning a field sampling trip, one should consider what to sample and how to treat the samples until they are processed in a laboratory. Hot spring sediments often provide a rich source of cells, while hot spring waters typically provide a better source of virus particles.
Most acidic hot springs create environments in which DNA that is not protected within a cell or a virus particle is rapidly hydrolyzed; it is therefore important to maintain cell and virus integrity during sampling. Furthermore, the low cellular and viral density present in most hot springs necessitates sampling large volumes, typically 1–100 L. If perfectly preserving a time-critical snapshot is necessary, then a small-volume sample can be flash-frozen in liquid nitrogen. For large samples, transporting the samples at ambient temperature is ideal, especially if samples can be further processed within 24 h.

Isolation of cellular and viral fractions from hot spring samples is typically a multistep process. Clay particles are often present in samples and are removed by very low-speed centrifugation (<500 × g), which keeps the cells and viruses in solution. Cells and intracellular viruses are separated from free virus particles by size filtration through a 0.45 µm and/or 0.22 µm filter. The collection of cells on filters provides a convenient method to store and extract cellular DNA and RNA for subsequent NGS applications. Viruses present in the filter flow-through can be concentrated by a variety of methods, including tangential flow filtration (TFF), FeCl3 flocculation, polyethylene glycol (PEG) precipitation, ultracentrifugation and size-selection spin column filtration. TFF, FeCl3 flocculation, and PEG precipitation lend themselves easily to larger-volume samples, while centrifugation-based methods are typically applicable to smaller volumes. After separation from cells, viral fractions can be further purified using CsCl or Cs2SO4 buoyant density gradients, sucrose gradients, or size exclusion chromatography. Each step of any purification protocol has its own limitations, biases, and sample losses, which the investigator must be aware of and plan for accordingly.

Next Generation Sequencing of Samples

The availability of inexpensive, efficient NGS has greatly advanced archaeal virus discovery. Preparing nucleic acid samples from environmental virus and cellular samples for NGS requires care. It is essential that nucleic acid extraction methods minimize extraction bias while maintaining nucleic acid integrity. The minimum amount of nucleic acid required for NGS depends on the estimated viral complexity and on the DNA library construction protocol being used. Library preparation for Illumina Nano MiSeq platforms requires only nanograms of input DNA, while single-molecule long-read technologies such as PacBio and Nanopore require high-quality long DNA fragments at a higher starting concentration.

It is also important to estimate the sequencing read depth required to adequately characterize the viral community, which depends on the estimated complexity of that community. A general guideline is to aim for 50–100× base coverage of every type of viral genome present. To determine the number of reads to generate, a researcher needs to take into account the size of the viral genome (generally 30–50 kb for high-temperature archaeal viruses), the expected number of viruses in the community (typically 50–200 virus types), the length of the reads (150–300 bp for most NGS libraries), and the desired coverage of the genome (50–100× base coverage). Hot springs with simple archaeal and viral communities can generally be well characterized with 5 million Illumina reads, while a more complex community will require more reads to fully sequence the viral community.
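The read-depth guideline above can be turned into a rough back-of-the-envelope estimate. The following minimal Python sketch multiplies genome size, number of virus types and target coverage, then divides by read length. The function name `reads_needed` is hypothetical, and the calculation assumes (beyond what the text states) that all virus types are equally abundant and that every read is viral and usable; in a real community, rare viruses and non-viral reads would push the requirement higher.

```python
def reads_needed(genome_size_bp, n_virus_types, read_length_bp, target_coverage):
    """Estimate reads needed for `target_coverage`-fold coverage of every genome.

    Simplifying assumptions (not from the source text): all virus types are
    equally abundant and every read is viral and usable.
    """
    total_bases = genome_size_bp * n_virus_types * target_coverage
    # Integer ceiling division: a partial read still has to be sequenced.
    return -(-total_bases // read_length_bp)


if __name__ == "__main__":
    # Lower and upper ends of the ranges quoted in the text.
    simple = reads_needed(30_000, 50, 300, 50)
    complex_ = reads_needed(50_000, 200, 150, 100)
    print(f"simple community:  {simple:,} reads")
    print(f"complex community: {complex_:,} reads")
```

Under these idealized assumptions the two extremes of the quoted ranges come to 250,000 reads and roughly 6.7 million reads, which brackets the figure of 5 million Illumina reads given for simple communities.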

Post Sequence Analysis

A variety of post-sequencing pipelines are available, and new ones are constantly being developed, for analysis of viral community structure and assembly of viral genomes. The overall approach requires quality assessment of the primary sequence data, sequence assembly into contigs, and post-assembly analysis. Multiple assemblers have been designed for NGS data; MetaSPAdes and IDBA-UD are two of them. MetaSPAdes is user friendly, has extensive documentation, and has been shown to perform well on data from a wide range of environments. MetaSPAdes can also create assemblies using both long and short reads, which generally results in higher-quality assemblies and allows closely related viruses to be resolved. Using previously sequenced genomes as scaffolds can facilitate the assembly of genomes and the differentiation of closely related viruses from the same or different environments. The quality of the assemblies is evaluated either by the assembler itself or by a stand-alone program such as QUAST. Assembled circular genomes provide confidence in the completeness of viral genomes.

As metagenomic technologies improve and metagenomic assembled viral genomes (MAVs) become more common, assessing the quality and completeness of these sequences has become more important. The Minimum Information for Uncultivated Viral Genomes (MIUViG) standard has been introduced to standardize the type and quality of data required to publish new MAVs (adapted from similar standards for metagenome-assembled microbial genomes, MAGs). Its checklist of mandatory metadata includes statistics about assembly quality, number of contigs, predicted genome type and structure, tools used for assembly and analysis, and source dataset type.
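The remark that circular assemblies give confidence in genome completeness can be illustrated with one common heuristic: a contig assembled from a circular (or terminally redundant) genome often ends with the same bases with which it begins. The Python sketch below looks for such a terminal overlap. It is an illustration only, not the procedure of any specific assembler or quality-control tool; the function names and the `min_overlap` default are hypothetical choices, and only exact overlaps are considered.

```python
def terminal_overlap(contig, min_overlap=20):
    """Length of the longest prefix of `contig` that is repeated exactly at
    its 3' end, or 0 if no overlap of at least `min_overlap` bases exists."""
    # An overlap longer than half the contig is not considered here.
    for length in range(len(contig) // 2, min_overlap - 1, -1):
        if contig[:length] == contig[-length:]:
            return length
    return 0


def trim_circular(contig, min_overlap=20):
    """Return (sequence, is_circular); the redundant end is removed when a
    terminal overlap suggests the contig wraps around on itself."""
    overlap = terminal_overlap(contig, min_overlap)
    if overlap:
        return contig[:-overlap], True
    return contig, False


if __name__ == "__main__":
    genome = "ACGTTGCATGGA"
    contig = genome + genome[:5]          # assembly that wrapped around by 5 bases
    print(trim_circular(contig, min_overlap=5))
    print(trim_circular(genome, min_overlap=5))
```

A contig flagged in this way is only a candidate complete genome; terminal repeats can also arise from linear genomes with repeated ends, so the result is best treated as supporting evidence alongside coverage and gene content.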
One of the most challenging and exciting aspects of archaeal viral metagenomic studies is gene annotation, since a majority of the viral sequences have little to no similarity to sequences in the reference databases. Traditional BLASTn or BLASTx analysis typically results in only 10%–20% of genes having a significant match to known genes. More sensitive analyses, such as Hidden Markov Model (HMM) searches and HHpred, have demonstrated the ability to identify more viral genes but can be computationally expensive. A recent HMM-based analysis of genes in almost 3000 prokaryotic viruses created ~10,000 profiles of conserved prokaryotic virus orthologous groups (pVOGs), along with the function of the original genes when known. This is an excellent resource for identifying known and unknown viral genes.

Network analysis, based on either viral DNA sequences or their encoded proteins, is a relatively new approach to analyzing viral metagenomes. These methods identify homologous sequences, which are then grouped into viral "clusters" that represent closely related viral genotypes. Relatively small contigs retrieved from conventional assembly pipelines can thus be grouped into biologically meaningful clusters. This approach overcomes some of the challenges inherent in complex environmental samples, where sample heterogeneity can limit assembly platforms that were designed for high-quality deep sequencing of relatively homogeneous samples. Protein clustering is another approach to reducing the complexity of metagenomic datasets and identifying proteins that are shared among viruses and across hot springs. However, protein clustering relies on the identification of viral genes, which can be challenging to automate computationally for archaeal viruses because they are genetically so different from other viruses and microbes. As a result, these steps frequently require labor-intensive manual curation of predicted genes, which greatly complicates the analysis and increases the time it takes.
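The idea of grouping contigs into viral clusters through a similarity network can be sketched in a few lines. The example below is a deliberately simple stand-in for the published network methods, not a reproduction of them: contigs are nodes, an edge joins two contigs whose pairwise similarity score meets a threshold, and each connected component is reported as one cluster. The function name, the score scale (0 to 1) and the threshold of 0.5 are all hypothetical; in practice the scores would come from sequence or protein comparisons.

```python
def cluster_contigs(contigs, similarities, threshold=0.5):
    """Group contigs into connected components of a similarity network.

    `similarities` maps (contig_a, contig_b) pairs to a score; pairs scoring
    below `threshold` are treated as unconnected.
    """
    parent = {c: c for c in contigs}

    def find(x):
        # Follow parent links to the cluster representative (union-find).
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), score in similarities.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for c in contigs:
        clusters.setdefault(find(c), []).append(c)
    # Largest clusters first; ties broken alphabetically for stable output.
    return sorted((sorted(m) for m in clusters.values()),
                  key=lambda m: (-len(m), m))


if __name__ == "__main__":
    contigs = ["c1", "c2", "c3", "c4", "c5"]
    scores = {("c1", "c2"): 0.9, ("c2", "c3"): 0.7,
              ("c3", "c4"): 0.2, ("c4", "c5"): 0.8}
    print(cluster_contigs(contigs, scores))
```

Connected components correspond to single-linkage clustering, which tends to chain loosely related sequences together; dedicated network tools typically use more selective clustering algorithms for that reason.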
One final step in characterizing archaeal viruses is the taxonomic classification of newly discovered viruses. Traditionally this has relied on the establishment of a virus-infected host culture and a description of the viral genome and particle morphology. Owing to their morphological and genetic diversity, the ~100 known archaeal viruses comprise 11 recognized viral families (more than all 3000+ bacteriophages represent). However, with the rise of MIUViG, culture-independent classification is becoming increasingly popular.


A recently described tool, vContact2, takes all predicted proteins in a viral genome and creates protein clusters that can be used to classify viral genomes at the genus, family, and order levels. Further advancement and streamlining of viral characterization pipelines will greatly assist in the classification of new viral types.

A difficult but important question for any novel uncultivated virus is which cell type is its host. A number of in silico techniques can be combined to address this question with confidence. In archaeal genomes, the prevalence of CRISPR/Cas systems can greatly aid host identification. Mapping CRISPR spacer sequences to a metagenomic assembled viral genome gives high confidence in a host-virus relationship. K-mer analysis and BLAST comparison of viral and cellular proteins can lend further support to host identification. Candidate host–virus pairs can then be validated by dual-label FISH probing of cells and viruses. This technique can also provide information about the number of infected cells in situ and how it changes over time. qPCR primers can also be developed to follow viral and host population dynamics over time.

As obligate cellular parasites, viruses and their nucleic acids are frequently found associated with cells, and it is worth examining cellular metagenomes for viral sequences. Most cellular genomes that have been examined to date have proviral sequences or prophages integrated into their genome. In addition, many cells are infected with actively replicating viruses, and sequencing of cell-associated DNA will identify these viruses as well. Programs such as VirSorter and the archaeal-virus-specific MArVD identify likely viral sequences in cellular metagenomes through viral hallmark genes, genome characteristics and other markers of viruses. Analysis of cellular metagenomes from YNP hot springs has revealed that 15%–20% of reads are likely of viral origin.
While de novo identification of reads as viral is very difficult, comparison to a viral database generated from purified viral particles, either from the same hot spring or from one with similar geochemical characteristics and microbial members, greatly facilitates this process. After the viral database has been created and curated, reads can be recruited onto the viral contigs to identify which viruses are associated with which cell type. This links viruses to cells and provides information about the host range of particular viruses and their relative abundance. Single cell genomics (SCG) can now also achieve this with much improved accuracy, as intracellular viruses will be sequenced along with the host genome.

The difficulty of culturing certain cell types does not preclude further characterization of the virus using culture-independent approaches. The culture-independent methods outlined above that identify the host can aid in isolating the host and virus in culture. This is particularly true if the host is related to other cultured isolates. Virion structural proteins can often be identified from metagenomic sequence data. However, the inability to do so is no longer an insurmountable hurdle to structural studies. A purified virus fraction can be subjected to SDS-PAGE and MS proteomic analysis to identify the predominant viral proteins in the fraction, which are typically the major capsid proteins. These structural proteins can then be mapped back to the predicted open reading frames in the viral metagenome to identify coat protein genes, and analyzed further to give insights into viral morphology. The identified viral structural protein genes can be cloned and expressed in a heterologous protein expression system. Expressed proteins can be used to generate antibodies for use in ELISA or FISH assays. Purified proteins can also be used to solve the high-resolution structure using X-ray crystallography.
Cryo-electron microscopy and image reconstruction of virus purified directly from the environment can be used to generate a near-atomic-resolution structure of the entire virion, providing valuable insights into virion architecture and function.

Combining viral metagenomics with other "omics"-based approaches has proved to be a powerful way to link viruses to their hosts and to ask important questions about viral replication cycles. Paired viral and cellular metagenomes are allowing researchers to compare the viral populations within cells with the released extracellular viral populations. Environmental viral metagenomic studies are being combined with environmental RNA transcriptomic studies to address the question of which viruses are undergoing active replication. The identification of viral gene transcripts not only confirms that a virus is active in an environment, it can also provide insight into which viral genes are the most abundantly transcribed in the system. The more abundant viral transcripts likely encode viral proteins that are needed in multiple copies to form virions (e.g., structural proteins), while less abundant viral transcripts may indicate regulatory genes that help to control the viral replication cycle. This begins to address one of the major problems with viral metagenomes, namely that it has not been possible to elucidate the function of most genes. Viral metabolomics aids in the identification of lipids that are part of the envelope in many archaeal viruses, and of other components of the viral particle.

The ability to synthesize full-length MAV DNA in vitro provides a pathway to establishing culture-based host-virus systems. Synthetic biology now allows us to express MAV DNA in suspected host cells. If a candidate host can be identified and is tractable to culturing, then a synthetic clone can potentially be transformed into that host and replicated, or launched from a plasmid. This may or may not lead to a productive infection, but can still provide insights into the virus life cycle.
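Of the in silico host-identification steps discussed in this section, the mapping of CRISPR spacers onto assembled viral genomes is the most direct to illustrate. The Python sketch below reports a candidate host-virus link whenever a host spacer occurs in a viral contig on either strand. It is a simplified illustration under stated assumptions: the function and variable names are hypothetical, spacers are assumed to have been extracted from host CRISPR arrays beforehand, and only exact matches are counted, whereas real analyses usually rely on alignment tools and tolerate a small number of mismatches.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def match_spacers(spacers_by_host, viral_contigs):
    """Return {virus: set of hosts} for exact spacer hits on either strand.

    `spacers_by_host` maps a host name to its CRISPR spacer sequences;
    `viral_contigs` maps a virus (contig) name to its sequence.
    """
    links = {}
    for host, spacers in spacers_by_host.items():
        for spacer in spacers:
            spacer_rc = reverse_complement(spacer)
            for virus, sequence in viral_contigs.items():
                if spacer in sequence or spacer_rc in sequence:
                    links.setdefault(virus, set()).add(host)
    return links


if __name__ == "__main__":
    viral_contigs = {
        "virusA": "TTGACCGTAGGCTAACGTTAGC",
        "virusB": "GGGGCCCCAAAATTTT",
    }
    spacers_by_host = {
        "host1": ["CCGTAGGCTAAC"],   # forward-strand match to virusA
        "host2": ["TTTTGGGG"],       # reverse-strand match to virusB
        "host3": ["ACACACACAC"],     # no match
    }
    print(match_spacers(spacers_by_host, viral_contigs))
```

In this toy example virusA is linked to host1 and virusB to host2, while host3 has no spacer hit. As the text notes, such links are best treated as candidates to be supported by k-mer or protein comparisons and validated experimentally.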

What Have we Learned?

The initial viral metagenomic studies focused on the identification of novel viruses and helped to define viral community structure. This era of "stamp collecting" revolutionized the study of archaeal viruses by providing insight into archaeal virus diversity and the genetic diversity of the thermophilic archaeal virosphere. For example, viral metagenomic analysis of acidic hot springs from YNP revealed that archaeal viruses dominate, that most of the assembled viral genomes belong to uncultured viruses, and that many are completely new viruses, unrelated to other known archaeal, bacterial or eukaryotic viruses. One acidic YNP hot spring was found to contain 110 virus types, of which 103 were previously unknown. Viral genomes assembled through viral metagenomics provide a roadmap and tools to track, purify and characterize these unknown viruses directly from environmental samples. For example, a recently described virus, Acidianus tailed spindle virus (ATSV), was originally isolated, its virion structurally characterized and its host identified, all through culture-independent methods. This characterization guided subsequent successful efforts to culture the virus and its host (Fig. 2).

Fig. 2 Acidianus tailed spindle virus was purified and characterized primarily through culture independent methods.

Archaeal MAVs have proven to be genetically quite diverse. Nearly 90% of the genes present have no significant similarity to other genes in the NCBI nr database, and structural proteins were identified in only ~20% of viral types. Because of this high percentage of uncharacterized proteins, techniques other than database homology are needed to assign function to many MAV proteins. Two of the most common techniques for doing so are structural determination and genetic analysis. X-ray crystallography was used to solve the structure and determine the function of the archaeal viral protein A197 in Sulfolobus turreted icosahedral virus 1 (STIV1). While direct DNA sequence annotation of this protein provided little insight into its function, X-ray-based structural analysis was able to identify it as a glycosyltransferase. Combined genetic and biochemical analysis has provided insight into the function of many archaeal viral proteins. This includes the discovery of a novel lysis mechanism encoded by STIV1 gene C92, which was determined to produce seven-sided, pyramid-shaped structures on the surface of the cell prior to lysis. Genetic studies have also shown that fewer than half of the proteins encoded by Sulfolobus spindle-shaped virus 1 (SSV-1) are necessary for productive infection in culture.

Despite the continued discovery of novel archaeal viral genomes, the field is rapidly moving past the cataloging of archaeal viral genomes towards understanding MAVs in a broader biological context. Longitudinal sampling of YNP hot springs by viral metagenomics, combined with targeted viral qPCR and vFISH analysis, has provided insights into archaeal virus-host associations.
Like all predators, viruses have been shown to fluctuate in response to the abundance of their hosts and to changes in their geochemical environment. Temporal and longitudinal sampling allows MAVs to be compared across time and among hot springs with different microbial communities and geochemical conditions. While the membership of the major archaeal virus community can remain remarkably stable over a time frame of years, the relative abundance of a given viral type may vary considerably.

Phylogenetic analysis of closely related viruses from the same hot spring over time can provide insight into viral evolution. Multiple STIV MAVs from numerous YNP hot springs were recently compared, revealing three distinct homologs of the C381 structural protein. The STIV1 and STIV2 C381-like structural proteins formed two distinct phylogenetic clusters, while a third C381-like protein formed a third cluster. This suggests that a third type of STIV is present in YNP hot springs. A closer examination of the C381 clades reveals a geographic split in where each clade is predominantly found. The first clade was found primarily in hot springs in the YNP Nymph Lake thermal region, while the second clade was more abundant in the Crater Hills region. This local adaptation is likely due in part to the different microbial communities present in the hot springs of these thermal areas.

Analysis of full-length and near-full-length MAVs from geographically isolated (i.e., YNP, USA and Kamchatka, Russia) and geochemically distinct hot springs has identified different lifestyle strategies in different populations and different environments. For example, the lysogenic SSV strains found in the Mutnovsky geyser basin (Kamchatka, Russia) were closely related to each other but distinct and genetically distant from the YNP SSV strains.
Comparison across sites and over time has also enabled the identification of core genes that are conserved across viral lineages through time and space, as well as of the flexible genome that appears to provide adaptation to local conditions. Combining this analysis with host genome CRISPR spacer content has identified different host population immune strategies and viral countermeasures as well. For example, while all SSVs have spacer matches targeting their core genes, SSVs with fewer variable genes have fewer spacer matches to these accessory genes, while longer SSVs have many spacer matches to these accessory genes. This suggests a mechanism by which SSVs may avoid multiple host CRISPR spacer matches, and thus evade host immunity.

Combining environmental single cell genomics with population-based viral metagenomics is a powerful way to link MAVs to their cellular hosts. A recent study took advantage of a multiyear sequencing effort on the viral community in a single YNP hot spring and leveraged that knowledge to identify viruses in 250 single cell genomes. Despite the low genome coverage of these single cells, viral sequences were identified in over 60% of the cells, and sequences from multiple viruses were detected in most of the cells. This indicates that virus-cell associations are common and leads to the speculation that most cells may be infected by viruses most of the time. Not only did this study identify high levels of virus-cell association in the hot spring, it also identified viral hosts and host ranges for numerous viral types, and identified which viruses were actively infecting cells. While other single cell genome studies have identified viral sequences within cells, they have lacked the thorough knowledge of the viral community that gave this study its increased sensitivity.

Looking Forward

Culture-independent approaches to investigating viruses directly in the environment are a valuable complement to traditional culture-based viral studies. This is especially true for studies of archaeal viruses from hot spring environments, where suitable culturing systems are lacking. Viral metagenomics is rapidly progressing beyond being only a tool to describe archaeal viral diversity to being a valued roadmap for addressing a wider range of questions. Among these questions are: Which viruses are infecting which cells, and what is the host range of these viruses? How many cells are infected, with which viruses, and by how many viruses at any given moment? Addressing these questions should provide insights into the bigger question of the role of viruses in driving microbial ecology and evolution in natural systems.

There is still a role for viral metagenomics in understanding archaeal viral diversity, especially in unexplored natural environments or in environments that contain archaeal phyla for which no viruses have yet been discovered. However, there is an important need to leverage viral metagenomic datasets to enhance efforts to culture previously uncultured archaea and their viruses. Culture-based techniques will likely remain the gold standard for investigating the details of viral processes for the foreseeable future. However, viral metagenomic surveys are rapidly becoming the initial experiments that allow researchers to refine their hypotheses and investigate biological processes in more depth.

As new sequencing technologies advance, their impact on archaeal virus discovery and characterization will expand further. The MinION sequencing platform is already allowing researchers to perform sequencing in the field.
Concurrent with the improvement of current sequencing technologies and the development of new ones is an ongoing need to generate bioinformatic tool kits and pipelines to process the large amounts of data that these technologies will generate. A major challenge is the development of rapid methodologies to annotate archaeal viral genes. This will likely require a combination of computational, genetic, biochemical and structural tools. The use of viral FISH and FACS tools directly on environmental samples will allow rapid quantification and isolation of infected cells, while the application of cryo-electron microscopy techniques directly to environmental samples will provide new insights into archaeal virus structure, viral attachment and entry, replication, and release from cells.

It is likely that the examination of host-virus associations in their natural environments will expand further in the years to come. It is clear that many aspects of host-virus associations are evident only in the context of their natural environments and are lost when individual viruses and hosts are studied in the isolation of pure culture systems. This is particularly true when examining the ecological consequences of host-virus associations in terms of virus-virus, host-virus, and host-host competition. The interacting roles of host defense systems and the mechanisms that overcome host resistance are often evident only within the natural context of the environment. The field of environmental virology will continue to expand, embracing a much broader role for viruses in the environment, from pathogens to commensal agents.

Single cell genomics and single cell transcriptomics will further our understanding of the biochemistry and cell biology of archaeal virus infection. Currently, single cell genomics is capable of identifying the presence of viruses in cells, but the details of viral replication that it reveals are usually limited.
Prokaryotic single cell transcriptomics is still in its infancy, but further advances in the ability to sequence all of the RNA within a cell will prove crucial to identifying replicating viruses in natural environments. At the same time, these techniques will provide unprecedented insight into how cells defend themselves against viral infection.

The field of high-temperature archaeal virology has made tremendous strides over the past two decades in characterizing the morphological and genetic diversity of archaeal viruses, identifying virus-host pairs, surveying a wide range of thermal environments, and developing culture-independent techniques for viral characterization. However, the more we learn, the more we realize how much there is still left to discover. Advances in archaeal virology over the next decade will likely require the isolation and in-depth characterization of many of these viruses and their hosts. A few of the areas that are currently the focus of active research are structural analysis, the temporal fluctuations of viruses, and the role archaeal viruses play in shaping microbial communities in natural environments.

Further Reading Bolduc, B., Wirth, J.F., Mazurie, A., Young, M.J., 2015. Viral assemblage composition in Yellowstone acidic hot springs assessed by network analysis. The ISME Journal 9 (10), 2162–2177. doi:10.1038/ismej.2015.28. Hartman, R., Munson-McGee, J.H., Young, M., Lawrence, C.M., 2019. Survey of high-resolution archaeal virus structures. Current Opinion in Virology 36, 74–83. Hochstein, R., Amenabar, M.J., Munson-McGee, J.H., Boyd, E.S., Young, M.J., 2016. Acidianus tailed spindle virus: A new archaeal large tailed spindle virus discovered by culture-independent methods. Journal of Virology 90 (7), 3458–3468. doi:10.1128/JVI.03098-15. Inskeep, W.P., Jay, Z.J., Tringe, S.G., Herrgård, M.J., Rusch, D.B., 2013. The YNP metagenome project: Environmental parameters responsible for microbial distribution in the Yellowstone geothermal ecosystem. Frontiers in Microbiology 4, 67. doi:10.3389/fmicb.2013.00067.



Jang, H.B., Bolduc, B., Zablocki, O., et al., 2019. Taxonomic assignment of uncultivated prokaryotic virus genomes is enabled by gene-sharing networks. Nature Biotechnology 37, 632–639. Liu, Y., Brandt, D., Ishino, S., et al., 2019. New archaeal viruses discovered by metagenomic analysis of viral communities in enrichment cultures. Environmental Microbiology 21, 2002–2014. Munson-McGee, J.H., Peng, S., Dewerff, S., et al., 2018. A virus or more in (nearly) every cell: Ubiquitous networks of virus–host interactions in extreme environments. The ISME Journal. 1–9. doi:10.1038/s41396-018-0071-7. Roux, S., Tournayre, J., Mahul, A., Debroas, D., Enault, F., 2014. Metavir 2: New tools for viral metagenome comparison and assembled virome analysis. BMC Bioinformatics 15, 76. doi:10.1186/1471-2105-15-76. Vik, D.R., Roux, S., Brum, J.R., et al., 2017. Putative archaeal viruses from the mesopelagic ocean. PeerJ 5, e3428. doi:10.7717/peerj.3428.

Metagenomes of Archaeal Viruses in Hypersaline Environments
Fernando Santos, María D Ramos-Barbero, and Josefa Antón, University of Alicante, Alicante, Spain
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
GC Guanine + cytosine content in the DNA
TEM Transmission electron microscopy
VLP Virus-like particle

Glossary Fosmid Cloning vector that allows the cloning of large fragments (between 30 and 40 kb) and thus can be used for cloning complete viral genomes directly purified from environmental samples.

NGS (next generation sequencing) Massively parallel sequencing technologies that allow the sequencing of a mixture of DNA molecules without the need to separate them first by cloning.

Introduction An environment is considered hypersaline when its salt concentration is more than twice that of seawater. Systems of this type include solar salterns, salt lakes, soda lakes, saline soils and deep-sea brines, among others. From an anthropocentric point of view, hypersaline systems are extreme environments, although this view has recently been challenged on thermodynamic grounds. The organisms thriving in hypersaline systems are known as halophiles or extreme halophiles, depending on their salt requirements. These organisms are adapted to live in the presence of salt and need it for their normal functioning. At very high salt concentrations, the microbiota of hypersaline systems is normally dominated by Archaea, although other organisms are also present. These Archaea include extremely halophilic members of the Euryarchaea and the recently described and still uncultured Nanohaloarchaea. Within the first group we find the high GC extremely halophilic archaea, well represented in culture collections (e.g., Halorubrum, Halobacterium, Haloferax), in addition to the low GC square archaeon Haloquadratum walsbyi, which is widespread in many environments worldwide. This archaeon, although culturable, is very difficult to grow on plates. One remarkable characteristic of hypersaline systems is the high number of VLPs (virus-like particles) that they harbor. In fact, brines are the water systems with the highest concentration of viruses reported to date (up to 10^10 VLPs/ml) and, frequently, with one of the highest virus-to-cell ratios (up to 300, whereas the normal values in aquatic systems are around 10 or below). Furthermore, TEM studies show that hypersaline environments harbor a wide diversity of VLP morphologies. Among these, lemon-shaped (or spindle) viruses are considered to infect exclusively archaea, although so far only one has been isolated.
Indeed, as discussed elsewhere in this Encyclopedia, archaeal viruses are underrepresented in culture collections, and this is also the case for viruses infecting extremely halophilic archaea. If a microbial community is dominated by Archaea, then the viral assemblage is very likely dominated by archaeal viruses. Thus, the analysis of viral metagenomes from hypersaline environments dominated by Archaea can be very useful for obtaining information about archaeal viruses. This can be especially valuable for learning about viruses infecting the Nanohaloarchaea or Haloquadratum, for which there are no cultured virus representatives since, to date, the only isolated haloarchaeal viruses infect members of the high GC Euryarchaea. Furthermore, hypersaline systems are good models for studying viral assemblages since, at very high salt concentrations, viruses are likely the main factor controlling the prokaryotic communities.

How can Archaeal Viruses be Studied Using Viral Metagenomes? In addition to the general characterization of the viral metagenome (discussed elsewhere in this Encyclopedia), a key point in the analysis is to determine which viral genomes from the viral metagenomes correspond to viruses infecting Archaea. This means assigning the virus genomes (or genomic fragments) reconstructed from the viral metagenome to their archaeal hosts. This assignment can be accomplished using different approaches. The safest one, albeit the least applicable given the limitations of databases, is to compare viral genomes reconstructed from metagenomes with genomes of cultured viruses (or with viral genomes reconstructed in silico in previous works). This is particularly limiting for the Archaea since, as mentioned above, all the archaeal haloviruses cultured so far infect the high GC members of the hyperhalophilic Euryarchaea, which are only a part of the archaeal community. This obviously imposes a bias on metagenomics-based viral identification. In any case, even for this group of viruses the identification may not be straightforward. An example of how tools that are very successful at analyzing viruses from many environments fail to recognize viruses from extremely halophilic Archaea has recently been provided by the isolation and characterization of viruses from the hypersaline Lake Retba infecting Haloferax and Halorubrum (both high GC Euryarchaea). Only one out of the four viral genomes described was


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21227-2


recognized as a virus by the powerful tool VirSorter, which is frequently used to identify viral genomes retrieved from metagenomic datasets. This, as pointed out by the authors of the study, calls for an improvement of virus hallmark datasets. Another approach to assigning viral genomes from a metagenomic library to their hosts is based on the fact that, with time, viruses tend to ameliorate their nucleotide composition towards that of the host. This is reflected in the fact that the GC difference between phages and their microbial hosts is normally around 4%, and that phages and hosts have similar oligonucleotide frequencies and codon usage, although there are considerable deviations from this trend. In any case, in the absence of other information, GC content and oligonucleotide frequency analysis can be used to assign viruses to their hosts in hypersaline systems. In fact, these systems are well suited to this approach since the GC profiles of the community are frequently bimodal, with a lower peak at around 45%–50% corresponding to the low GC Nanohaloarchaea and Haloquadratum, and a higher peak at around 60%–65%. Accordingly, viral assemblages also display this bimodal profile. Finally, in order to assign viruses to hosts, we can also explore the imprint that viral infection leaves in prokaryotic genomes, more specifically in the CRISPR/Cas defense system, CRISPR standing for “clustered regularly interspaced short palindromic repeats”. When exposed to invaders (such as viruses), cells harboring a functional CRISPR/Cas system may become specifically immunized by acquiring new spacers (i.e., sequences intervening between the repeats) that derive from the viral DNA sequences. Thus, knowing the spacers present in a given prokaryotic genome, one can look for the corresponding protospacers in the genomes of the infecting viruses.
As discussed below, this approach has been used in hypersaline environments to “fish” for viruses infecting different extremely halophilic archaea, such as Haloquadratum walsbyi, high GC Euryarchaea or members of the Nanohaloarchaea.
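The spacer-to-protospacer search described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the studies cited in this article: the function names, the toy sequences and the mismatch tolerance are assumptions made for illustration, and real analyses typically rely on dedicated alignment tools run against complete spacer collections and viral contigs.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(a: str, b: str) -> int:
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_protospacers(
    spacers: Dict[str, str],
    viral_contigs: Dict[str, str],
    max_mismatches: int = 1,  # hypothetical tolerance, not a published cutoff
) -> List[Tuple[str, str, int, str, int]]:
    """Scan both strands of each viral contig for each host CRISPR spacer.

    Returns (spacer_id, contig_id, start, strand, mismatches) tuples, where
    start is the 0-based position of the match on the contig as given.
    """
    hits = []
    for spacer_id, spacer in spacers.items():
        forward = spacer.upper()
        if not forward:
            continue  # an empty spacer would match everywhere
        for strand, query in (("+", forward), ("-", reverse_complement(forward))):
            width = len(query)
            for contig_id, contig in viral_contigs.items():
                contig = contig.upper()
                for start in range(len(contig) - width + 1):
                    window = contig[start:start + width]
                    mismatches = count_mismatches(query, window)
                    if mismatches <= max_mismatches:
                        hits.append((spacer_id, contig_id, start, strand, mismatches))
    return hits
```

A contig that collects hits from the spacers of a single host genome becomes a candidate virus of that host. As with any in silico assignment, such hits are hypotheses to be confirmed experimentally.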

What Viral Metagenomes are Available From Hypersaline Environments and What are Their General Characteristics? Fig. 1 shows the hypersaline environments for which viral assemblages have been characterized through metagenomics. There are basically two types of viral metagenomes in the figure that are obtained either by NGS, by directly sequencing environmental DNA, or by cloning the DNA in vectors (plasmids or fosmids) prior to sequencing. In both cases, viral DNA has to be extracted from the extracellular virus fraction in a way that avoids (or at least minimizes) cellular DNA contamination. The use of fosmids, although

Fig. 1 Publications on metagenome analysis of viral assemblages from hypersaline environments.


not devoid of technical difficulties, allows the cloning of complete virus genomes directly from the environment, thus providing “natural” contigs which may considerably simplify the downstream analysis. Furthermore, these cloned viral genomes are real genotypes present in the natural sample, and not artifacts derived from the assembly of metagenomic short reads coming from different viral genomes. However, the approach most frequently used at present is the direct sequencing of viral DNA. There is also another metagenomic strategy for analyzing virus genomes, which consists of sequencing DNA extracted from the complete community, including cells and the viruses attached to or replicating inside them. In this type of metagenome, viral sequences can only be retrieved when they have some similarity to previously described viruses. However, the approach can still yield valuable results such as, in the case of the Atacama salt crust (see Fig. 1), the identification of a virus tentatively infecting members of the Nanohaloarchaea. Although there is not complete agreement among all the works published so far, comparisons of data from Senegal, Spain, the United States and Australia indicate that halophilic viral metagenomes share some common traits worldwide. Thus, in the same way that there is a “marine-ness” in marine viruses, there is a kind of “hypersaline-ness” in viral metagenomes from hypersaline systems, with genes that are never found in other systems. Furthermore, some related lineages have been found to have a worldwide distribution. However, there is also a considerable proportion of specifically local genotypes, which seem to be more important in peculiar environments such as the chaotropic Salar de Uyuni, in Bolivia, which harbors viral communities very different from those previously known.
In any case, salinity is a very strong structuring factor for viral assemblages, as previously found also for their microbial counterparts. In addition, as a rule, the higher the salinity, the higher the contribution of archaeal viruses to the total community. Furthermore, the hypersaline systems analyzed vary over time, which can complicate the drawing of general conclusions. Finally, as frequently found in other viral metagenomes, haloarchaeal metaviromes harbor a considerably high level of diversity, as indicated by the low number of genes with matches in databases. However, as in most viruses, haloviral genomes also carry what are known as “accessory metabolic genes”, which have been acquired from previous hosts. In this regard, the detection of rhodopsin-coding genes in some uncultured haloviruses is especially worth mentioning. The question remains open, however, whether these light-driven ion pumps play any role in the infection process. Very frequently, genomes from metagenomes of hypersaline environments correspond to viruses from the Caudovirales (head-and-tail viruses), which can be identified based on the presence of hallmark genes such as the terminases (which are very well represented in the available databases). However, these morphologies are not dominant in all hypersaline environments, which frequently harbor a high number of lemon-shaped viruses. These are more difficult to identify based only on genomic traits given the low number of cultured representatives available for comparison. However, even with these limitations, viral genomes from metagenomes have been assigned to the lemon-shaped salterproviruses in samples from hypersaline systems in Namibia, Senegal, and Australia. Although they have to be taken with caution given the limitations of the approach, several studies indicate that most haloarchaeal viruses very likely do not undergo lysogenic cycles in nature, given the low proportion of integrases detected in the analyzed metaviromes.
However, to what extent haloarchaeal viruses undergo chronic infection, which is typical of some archaeal viruses, is still an open question that metagenomics has not been able to answer. Viral metagenomics has also been used to address the dynamics of viral communities, indicating that although they are stable at the time scale of days, they differ significantly at longer time scales. In the following sections we will focus on the two groups of haloarchaea for which there are no cultured viruses and for which all our knowledge thus comes from metagenomic studies.

Viruses of Haloquadratum walsbyi So far, we have not been able to isolate viruses infecting the square archaeon Haloquadratum spp., although the study of plasmid and fosmid libraries has provided the first insights into this group of viruses, as described below. In addition, a metatranscriptomic study carried out with samples from a solar saltern in Santa Pola, Spain, indicated that this part of the viral assemblage was not especially active under natural conditions but increased its activity when the community was subjected to stress. The “environmental halophage 1”, retrieved in 2007 from a fosmid library constructed with viral DNA from the CR30 crystallizer of the “Bras del Port” salterns (Santa Pola, Spain), was the first nearly complete haloviral genome that was tentatively associated with Haloquadratum based on its low GC content (51.4%, slightly higher than the GC content of the square archaeon, around 48%). Although this nomenclature was kept for all the viral genomes cloned in fosmid libraries, the name is not correct, since the term “phage” may not be appropriate for archaeal viruses. Soon afterwards, a second fosmid library and a plasmid library were obtained. Assembly of the plasmid library produced “uncultured viral contigs” that were grouped in different clusters together with the genomes of some extremely halophilic prokaryotes, according to their GC contents and their dinucleotide frequencies. Cluster 4, including 25% of the contigs and associated with the genome of Hqr. walsbyi, was interpreted to contain viruses infecting low GC haloarchaea (represented only by Haloquadratum at that time since nanohaloarchaea had not yet been described, but with a fraction of sequences which could, indeed, correspond to nanohaloarchaeal viruses). A third fosmid library from the same crystallizer was published in 2012, reporting a group of 14 viral genomes (also named “environmental halophages”, eHPs) which were associated with Haloquadratum based on genetic features.
These eHPs were also reported as “very abundant” in hypersaline systems by recruitment of metagenomic reads; indeed, eHPs can recruit up to 20% of the total nucleotides from a CR30 metavirome.
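The recruitment figure quoted above is the share of a metavirome's nucleotides that align to a set of reference genomes. The arithmetic can be sketched as follows, assuming the alignments have already been computed by a read mapper or similarity search. The record layout, the function name and the 95% identity threshold are hypothetical choices for illustration and are not taken from the original study.

```python
from typing import Dict, Iterable, NamedTuple


class Alignment(NamedTuple):
    read_id: str
    aligned_length: int      # nucleotides of the read aligned to a reference genome
    percent_identity: float  # identity of the alignment, in percent


def recruited_fraction(
    alignments: Iterable[Alignment],
    total_nucleotides: int,
    min_identity: float = 95.0,  # hypothetical threshold, not a published cutoff
) -> float:
    """Fraction of all metavirome nucleotides recruited by the reference genomes.

    Each read is counted once, using its longest alignment that passes the
    identity threshold, so reads hitting several references are not inflated.
    """
    if total_nucleotides <= 0:
        raise ValueError("total_nucleotides must be positive")
    best_per_read: Dict[str, int] = {}
    for aln in alignments:
        if aln.percent_identity < min_identity:
            continue
        previous = best_per_read.get(aln.read_id, 0)
        best_per_read[aln.read_id] = max(previous, aln.aligned_length)
    return sum(best_per_read.values()) / total_nucleotides
```

Under these assumptions, a returned value of 0.2 would correspond to the 20% recruitment mentioned in the text.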


Fig. 2 Some representatives of the EHqrV group I. Six fosmid-derived sequences are represented by five “environmental halophages” (eHPs) and sequence 3A2, from a 2011 fosmid library (unpublished results). Contigs 001, 004 and 005 derive from the assembly of a previous viral metagenomic library cloned in plasmids. The blue color intensity is associated to BLASTN identity percentages. Reproduced from Garcia-Heredia, I., Martin-Cuadrado, A.B., Mojica, F.J., et al., 2012. Reconstructing viral genomes from the environment using fosmid clones: The case of haloviruses. PLoS One 7, e33802.

Many of these sequences, obtained from crystallizer CR30 between May 2007 and November 2014, might be joined in a new group of “Environmental Haloquadratum Viruses”, or EHqrV (Fig. 2). EHqrV most likely comprises a (micro)diverse population of Haloquadratum viruses sensu stricto, given the high degree of genetic relatedness among them and to their host: a low average GC content, a similar codon usage, very close di/tetra-nucleotide frequencies, and up to 30 proto-spacers that match the three CRISPR systems of Hqr. walsbyi C23T. Also, EHqrV sequences are characterized by a remarkable degree of synteny and share some conserved genes, such as those encoding virion structural proteins and the terminase large subunit (TerL), the genetic signature that places EHqrV within the order Caudovirales.
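Two of the compositional signals listed above, GC content and tetranucleotide frequencies, can be compared between a viral contig and candidate host genomes with a few lines of Python. The sketch below is a simplified illustration rather than the method of the cited works: the function names are hypothetical, the Euclidean distance is just one possible dissimilarity measure, and published analyses usually add refinements (for example strand symmetrization or normalization against expected frequencies) that are omitted here.

```python
from collections import Counter
from itertools import product
from math import sqrt
from typing import Dict, List, Tuple

# All 256 possible tetranucleotides, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq: str) -> float:
    """GC content in percent, ignoring characters other than A, C, G and T."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def tetranucleotide_profile(seq: str) -> List[float]:
    """Relative frequency of each overlapping tetranucleotide in the sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)  # skips windows with ambiguous bases
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


def profile_distance(p: List[float], q: List[float]) -> float:
    """Euclidean distance between two frequency profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rank_candidate_hosts(
    contig: str, host_genomes: Dict[str, str]
) -> List[Tuple[str, float, float]]:
    """Rank hosts by tetranucleotide distance to the contig (closest first).

    Each row is (host_name, absolute GC difference in percent, distance).
    """
    contig_gc = gc_content(contig)
    contig_profile = tetranucleotide_profile(contig)
    rows = []
    for name, genome in host_genomes.items():
        gc_difference = abs(contig_gc - gc_content(genome))
        distance = profile_distance(contig_profile, tetranucleotide_profile(genome))
        rows.append((name, gc_difference, distance))
    return sorted(rows, key=lambda row: row[2])
```

In a bimodal community such as the one described for crystallizers, a low GC contig would be expected to rank Haloquadratum or nanohaloarchaeal genomes ahead of the high GC haloarchaea, although compositional similarity alone cannot distinguish between hosts of similar composition.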

Viruses of the Nanohaloarchaea Until 2012, all extremely halophilic archaea were included within the phylum Euryarchaeota. However, that year a new group of unusually small, extremely halophilic Archaea was discovered in Australian solar salterns. This group turned out to be very widespread and rather abundant in hypersaline systems worldwide. It was later named Nanohaloarchaeota and included within the archaeal superphylum DPANN, formed by phyla belonging to the so-called “microbial dark matter”, since no cultured representatives are available so far. Soon after, metagenomic analyses allowed the tentative assignment of new haloviruses to this newly discovered host. The analysis of the fosmid libraries described above also allowed the description of a group of viruses tentatively assigned to the Nanohaloarchaea. This group would also belong to the Caudovirales, given the presence of the hallmark terminase gene, unless a more surprising explanation is waiting to be discovered. In addition to the in silico assignment of viruses to their nanohaloarchaeal host discussed above for the Atacama samples, tetranucleotide frequency and codon usage, together with CRISPR spacer analysis and comparison with the above-mentioned fosmid libraries, allowed the detection of assembled nanohaloarchaeal virus genomes in salterns of Australia and Senegal as well. Although extremely powerful, virus-host assignment based solely on metagenomics remains hypothetical until experimentally proven. A combination of viral metagenomics, single-cell genomics and microarrays has allowed the unambiguous characterization of a virus infecting a nanohaloarchaeal host. This virus (NHV-1) turned out to be closely related to one of the nanohaloarchaeal viruses retrieved from the fosmid libraries.
NHV-1 harbored a terminase (in good agreement with previous results from tentative nanohaloviruses) as well as a putative arsenical resistance repressor-like gene, which is very widespread in cellular and viral metagenomes from hypersaline environments worldwide.

Concluding Remarks Metagenomics has been an extremely powerful tool for unveiling the diversity of the haloarchaeal virosphere, although some questions remain open. We anticipate that the use of long-read sequencing technologies and new tools for viral metaproteomics will help to answer them.


Further Reading
Emerson, J.B., Thomas, B.C., Andrade, K., Heidelberg, K.B., Banfield, J.F., 2013. New approaches indicate constant viral diversity despite shifts in assemblage structure in an Australian hypersaline lake. Applied and Environmental Microbiology 79, 6755–6764.
Garcia-Heredia, I., Martin-Cuadrado, A.B., Mojica, F.J., et al., 2012. Reconstructing viral genomes from the environment using fosmid clones: The case of haloviruses. PLoS One 7, e33802.
Prangishvili, D., Bamford, D.H., Forterre, P., et al., 2017. The enigmatic archaeal virosphere. Nature Reviews Microbiology 15, 724–739.
Roux, S., Enault, F., Ravet, V., et al., 2016. Analysis of metagenomic data reveals common features of halophilic viral communities across continents. Environmental Microbiology 18, 889–903.
Santos, F., Yarza, P., Parro, V., et al., 2012. Culture-independent approaches for studying viruses from hypersaline environments. Applied and Environmental Microbiology 76, 1635–1643.

Extreme Environments as a Model System to Study How Virus–Host Interactions Evolve Along the Symbiosis Continuum
Samantha J DeWerff and Rachel J Whitaker, University of Illinois at Urbana-Champaign, Urbana, IL, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Antagonism Symbiosis in which one partner benefits.
Chronic virus A virus that buds from the host cell and does not kill its host in horizontal transmission. Chronic viruses may also persist in host cells in integrated or episomal (not integrated) forms for vertical transmission.
CRISPR-Cas immunity Adaptive immune system found in bacteria and archaea that allows for sequence-specific targeting of foreign DNA elements to prevent infection.
Horizontal transmission Movement of a viral particle from an infected host through lysis or budding that leads to adsorption of the viral particle in a new, uninfected host.
Lytic virus A virus that lyses (kills) its host as it is transmitted to the next generation.
Metapopulation Spatially separated populations connected by migration.
Mutualism Symbiosis in which both partners benefit.
Symbiosis A close interaction between two biological entities.
Temperate virus A virus that can either integrate into the host chromosome as a latent lysogen for vertical transmission or kill its host through lysis for horizontal transmission.
Vertical transmission Movement of a virus from a mother cell to a daughter cell through genome replication.
Viral fitness A measure of the number of viral progeny that are able to successfully infect host cells.

Most studies of microbial viruses in ecology and evolution focus on antagonistic lytic viruses whose transmission requires cell lysis and death. However, not all viruses are lytic. Latent proviruses are found in the genomes of all cellular forms of life, and persistent chronic viruses are common in all three domains. Chronic and temperate lifestyles include parts of the life cycle in which the virus is associated with its host. These vertically transmitted genetic units rapidly confer new evolutionary traits on their hosts and change their eco-evolutionary interactions with other organisms. Such viral symbioses result in emergent properties arising from the merging of genetic units in a process called infection genomics that links ecological and evolutionary processes and timelines. Here we expand the view of viruses to include properties across the continuum of symbiosis from antagonism to mutualism.

Aspects of Viral Fitness To expand our understanding of archaeal viruses and their symbiosis, we first consider the virus point of view. As with all evolving genetic units, evolution will select for increased viral fitness, the number of successful progeny transmitted to the next generation (Maynard-Smith, 1989; Hartl, 1988). Viruses are transmitted to the next generation either vertically, by replicating with their host, or horizontally, by successfully establishing an infection in a new, uninfected host. Selection will act on traits (antagonistic or mutualistic) that increase viral fitness (Bull et al., 1991). Most well-known archaeal viruses are lytic and kill their host when they are horizontally transmitted to the next generation. Well-studied examples in archaea include the viruses Sulfolobus islandicus rod-shaped virus (SIRV), Thermoproteus tenax virus 1 (TTV1), and most head-tail morphology viruses that have been shown to infect methanogens and halophiles (Prangishvili et al., 2006, 2017; Luk et al., 2014). In conditions where horizontal transmission results in higher viral fitness, selection may act to increase fitness, for example, by increasing the number of particles produced at lysis, decreasing the time to lysis, or increasing the infectivity of the particles. It is likely that increased fitness of the virus decreases host fitness. This is defined as antagonism on the symbiosis continuum (Bull et al., 1991; Bull, 1994; Cressler et al., 2016). Evolutionary theory predicts that horizontal fitness increases when susceptible hosts in which viruses can establish new infections are abundant. Therefore, a naive population in which resistance, immunity and infection are low, and the abundance of susceptible hosts is high, should favor viral antagonism for lytic viruses.
In contrast, when susceptible hosts are rare, fitness is not increased by horizontal transmission (there are no new hosts) and vertical transmission increases viral fitness instead (Turner et al., 1998). In this case, horizontal transmission through lysis would be selected against and viruses would evolve to become less virulent (favoring, for example, delayed lysis) (Kerr et al., 2006). Just as the ecological abundance of susceptible hosts impacts the evolution of virus-host interactions, these interactions also impact the ecology of host organisms in mixed communities. For example, lytic viruses impact the abundance of their specific hosts, disrupting the stability of complex microbial communities. Lytic viruses are often modeled as having Lotka-Volterra predator-prey dynamics, causing periodic oscillations between high and low abundance of the host and virus. This impacts competition and interactions between microbial populations and the other members of the community of viruses and hosts in which they exist in nature (Maslov and Sneppen, 2017). Selection for host resistance to lytic viruses can promote diversification among hosts evolving as resistance or resource specialists in an eco-evolutionary dynamic known as kill the winner (Thingstad, 2000; Weitz et al., 2005).
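The Lotka-Volterra dynamics mentioned above can be made concrete with a small numerical sketch. It assumes the classical two-equation form, dH/dt = rH - aHV for the host and dV/dt = βaHV - mV for the virus, integrated with a fourth-order Runge-Kutta scheme. The parameter names and values (growth rate r, adsorption rate a, conversion factor β, viral decay rate m) are illustrative assumptions and are not estimates for any archaeal system.

```python
from math import log
from typing import List, Tuple


def derivatives(h: float, v: float, r: float, a: float, beta: float, m: float):
    """Classical Lotka-Volterra rates for host (h) and lytic virus (v)."""
    dh = r * h - a * h * v          # host growth minus losses to lysis
    dv = beta * a * h * v - m * v   # virus production minus particle decay
    return dh, dv


def simulate(
    h0: float, v0: float,
    r: float = 1.0, a: float = 0.1, beta: float = 0.5, m: float = 0.5,  # illustrative
    dt: float = 0.01, steps: int = 5000,
) -> List[Tuple[float, float, float]]:
    """Integrate the model with fixed-step RK4; returns (time, host, virus) rows."""
    h, v = h0, v0
    trajectory = [(0.0, h, v)]
    for i in range(1, steps + 1):
        k1 = derivatives(h, v, r, a, beta, m)
        k2 = derivatives(h + 0.5 * dt * k1[0], v + 0.5 * dt * k1[1], r, a, beta, m)
        k3 = derivatives(h + 0.5 * dt * k2[0], v + 0.5 * dt * k2[1], r, a, beta, m)
        k4 = derivatives(h + dt * k3[0], v + dt * k3[1], r, a, beta, m)
        h += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        v += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        trajectory.append((i * dt, h, v))
    return trajectory


def conserved_quantity(h: float, v: float, r: float, a: float, beta: float, m: float) -> float:
    """Quantity that stays constant along exact Lotka-Volterra orbits.

    Useful as a sanity check on the numerical integration.
    """
    return beta * a * h - m * log(h) + a * v - r * log(v)
```

With these illustrative values, the coexistence equilibrium lies at H* = m/(βa) = 10 and V* = r/a = 10, and any other positive starting point gives closed cycles around it, which is the periodic alternation of high and low abundance described in the text. The model ignores resistance, immunity and spatial structure, all of which the following sections treat as important.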


doi:10.1016/B978-0-12-814515-9.00096-5


In contrast to lytic viruses, some viruses of archaea are temperate, with a life cycle that has two distinct modes of transmission. In the lytic mode, viruses infect the host, rapidly replicate, and kill the host through cell lysis to be transmitted horizontally; however, there is also a lysogenic or latent phase in which the virus is integrated into the host chromosome and passed vertically from mother to daughter cell through host replication (Stewart and Levin, 1984). Sulfolobus turreted icosahedral virus (STIV) exhibits this lifestyle when infecting its host Sulfolobus acidocaldarius. While it has been shown to cause a lytic infection, natural S. acidocaldarius isolates from Yellowstone National Park have also been found with this virus integrated in the genome (Fu and Johnson, 2012; Brumfield et al., 2009; Anderson et al., 2017). In the halophile Natrinema sp. J7–1, the sphaerolipoviruses SNJ1 and SNJ2 have also shown a temperate lifestyle; however, instead of integrating into the host chromosome, the viral genome remains as an extrachromosomal plasmid (Liu et al., 2015; Wang et al., 2016). For temperate viruses, conditions which favor horizontal transmission, such as a high density of susceptible hosts, would theoretically favor the lytic cycle. As with purely lytic viruses, selection would favor traits that, while increasing the fitness of the infecting temperate virus, would decrease the fitness of the host. However, if there were a low density of susceptible hosts, then the lysogenic cycle may be favored. When vertical transmission is favored, the fitness of the virus is aligned with the fitness of the host: if the host is successful, then the virus is as well. Therefore, traits that are more benevolent or beneficial to the host are selected for and shift the interaction toward mutualism, where both partners benefit from the interaction (Bull et al., 1991).
While temperate viruses show either horizontal or vertical transmission in the lytic-lysogenic cycle, chronic viruses have mixed-mode transmission, exhibiting both vertical and horizontal transmission at the same time (Lipsitch et al., 1996; Weitz et al., 2019). These viruses produce new viral progeny through non-lytic mechanisms that allow for the continued growth of the infected host. Examples of chronically infecting viruses in archaea include Sulfolobus spindle-shaped viruses (SSVs) and the pleomorphic viruses infecting haloarchaea (Luk et al., 2014; Quemin et al., 2016; DeWerff et al., 2020). In chronic infection there must be a balance between the two modes of transmission. In conditions that favor horizontal transmission, selection for higher fitness traits may decrease the efficacy of the vertical transmission component. The opposite may also occur: where vertical transmission is favored, selection for higher fitness may work against efficient horizontal transmission. While lytic viruses can cause instability within microbial populations, temperate and chronic viruses can shape the ecology of their microbial host by providing fitness advantages in the context of their community. The long-term association between virus and host means that viral fitness, the ability to succeed to a new generation, is aligned with host fitness, the ability of the host to grow and divide (Bull et al., 1991). When virus and host fitness are aligned, viral traits that improve host fitness can also increase the fitness of the virus, resulting in a viral mutualism. One example is accessory metabolic genes that give the host a growth advantage over the uninfected population (Mann et al., 2003; Lindell et al., 2004; Anantharaman et al., 2014; Obeng et al., 2016).
Another possibility is an interference strategy in which viral infection gives a direct competitive advantage because an infected cell is antagonistic toward an uninfected cell, for example, through use of a toxin (Boynton, 2019; DeWerff, 2020; Gama et al., 2013). Finally, some non-lytic viruses confer protection on their hosts against lytic virus infection through superinfection exclusion (Bondy-Denomy et al., 2016; Labrie et al., 2010; Quemin et al., 2013). Any of these contributions that align virus and host fitness are considered mutualisms and are predicted to have an outsized effect that may stabilize microbial populations against the predations of lytic viruses. As previously discussed, low abundance of susceptible hosts is predicted to favor vertical transmission; however, this is not the only ecological factor that would favor this transmission mode. Spatially structured populations, even when composed of high densities of microbes, may nonetheless have a limited effective population size that is accessible to a virus (Berngruber et al., 2015). Paired with low host migration rates, this can create conditions in which there is little availability of susceptible hosts at a local scale even if the broader population is susceptible. With chronic viruses, vertical transmission would theoretically be favored in highly structured areas, but production of new viral particles is still important to allow for possible viral migration in the population, which may explain why horizontal transmission is still observed in similar natural populations. Below we explore how environments and specific molecular mechanisms shape the interactions between microbes and viruses found along this symbiosis continuum. To do this, we review what has been learned from the unique host and viral ecology of extreme environments, focusing on the model system of acidic hot springs and how this might impact their interactions (Fig. 1).

Fig. 1 In this review we explore how the virus, host, and environment together influence the broader ecology of the acidic hot spring system. TEM images courtesy of Elizabeth Rowland.

Extreme Environments as a Model System to Study How Virus–Host Interactions Evolve Along the Symbiosis Continuum

Fig. 2 Metapopulations of Sulfolobus hosts found in Yellowstone National Park. Each hot spring makes up an island population of locally adapted hosts (and viruses). Migration and gene flow can occur, though at different scales based on distance. When extinction occurs, recolonization can occur from within the population, as in the case of a bottleneck event, or through competition from a new clonal type.

Environment Impact on Host-Virus Interaction

Extreme environments pose unique and highly structured landscapes for microbial life. One such environment is acidic hot springs, like those found in Yellowstone National Park, USA or Kamchatka, Russia, where the model crenarchaeon Sulfolobus islandicus can be found. Sulfolobus islandicus has been studied as a model not only for population structure but for its viruses as well (Anderson et al., 2017; Whitaker et al., 2003; Reno et al., 2009; Bautista et al., 2017; Munson-McGee et al., 2018a). Sulfolobus species contain several adaptations that support growth in this environment: the enzyme reverse gyrase to maintain DNA structure at high temperatures, the proteinaceous surface layer that helps to maintain cell structure and resist osmotic stress, and an active proton pump that transports protons out of the cell to maintain intracellular pH (Zhang et al., 2019; Forterre et al., 1996; Gleißner et al., 1997). The adaptations that allow for growth in extreme environments also mean that there is low survival of Sulfolobus species outside of this environment: at low temperatures, DNA degradation limits the success of migration between hot springs (Hjort and Bernander, 1999). This low migration has led to metapopulations (subpopulations connected by limited migration) not only on a large spatial scale, such as between continents, but on a local scale as well, such as between hot springs (Fig. 2) (Anderson et al., 2017; Whitaker et al., 2003; Campbell et al., 2017). Limited migration between populations can increase the overall genetic diversity and provide a potential for local adaptation between viruses and their archaeal hosts.
As discussed before, this type of spatial structure would be hypothesized to favor either vertical transmission in temperate and chronic viruses or less virulent traits in lytic viruses, leading to local adaptation between the virus and the host; however, experimental studies have not always supported this. When there is migration of the virus, even at low rates, it can provide enough genetic diversity to maintain the selection of antagonistic traits (Morgan et al., 2005). Like some other extreme environments, acidic hot springs are also dynamic, with sporadic fluctuations in temperature and pH. In Yellowstone National Park, hot springs are subject to seasonal periods of snowmelt and rainfall in addition to geothermal excursions that rapidly change temperature and pH (Campbell et al., 2017). This influx of cooler, pH-neutral water in small hot springs can cause a flush or dilution of the host population. This instability can cause a significant selective sweep or bottleneck of the population that removes host diversity. In the hosts Sulfolobus islandicus and Sulfolobus acidocaldarius, population structure was studied through multi-locus sequence typing or whole genome sequencing, respectively, in hot springs from around Yellowstone. In populations from smaller hot springs, there was very limited diversity, with hosts belonging to one or two clonal types (Anderson et al., 2017; Campbell et al., 2017). Clonal types were highly associated with a single spring in time and place but were observed to change over time, suggesting an extinction-recolonization dynamic in some but not all springs. It is still not known whether the signatures of extinction are due to viral or environmental causes. Either way, this type of extinction-recolonization dynamic can counteract selection and local adaptation and influence virus-host coevolution. Finally, in acidic hot springs (as in other aquatic environments) the density of hosts is relatively low (Inskeep et al., 2010).
Low host density may select for higher vertical transmission or increased stability of the virion particle against prolonged environmental exposure. However, local areas of high density, such as in biofilms, are often observed in the environment. Biofilms are created when microbes attach to a surface and can be made of one or more microbial members. Microbes in a biofilm form three-dimensional communities of relatively high density (Nadell et al., 2016). While this can be a condition that allows a virus to have access to a susceptible population, there are often extracellular structures that can limit the spread of a virus within the biofilm itself (Simmons et al., 2018; Chan and Abedon, 2015). While it has been shown that Sulfolobus species can form biofilms, it is unknown at this time whether in nature they are found within biofilm structures or in the aquatic phase (Henche et al., 2012a,b).

Viruses of Acidic Hot Springs

Viral metagenomic analysis has revealed a wealth of novel viral diversity in geothermal acidic hot springs. Deep sequencing of multiple time points from a single hot spring in Yellowstone resulted in 110 different viral clusters, of which only 7 clusters have representatives in a cultured system (Bolduc et al., 2015). Within this large diversity of viruses, it would be expected that there is also novel biology that has yet to be discovered. In the few characterized extremophilic viruses there have already been many unexpected discoveries. Some of these are adaptations to the environmental stressors placed on viral particles, such as Sulfolobus islandicus rod-shaped virus (SIRV), which contains A-form DNA in the viral particle (DiMaio et al., 2015). A-form DNA is most commonly found within bacterial spores and helps to protect DNA from degradation. However, because viral particles still have low stability in this environment, it is imperative for survival that viruses continue to infect a host. Acidianus two-tailed virus (ATV) leaves an infected cell through lysis as a lemon-shaped particle. Once it is in the extracellular environment it goes through a maturation process that produces two long tails from the particle (Prangishvili, 2013; Prangishvili et al., 2006). It is hypothesized that the extension of these tails helps to increase the chances of contact between the viral particle and the receptor of a new host. In the host Sulfolobus islandicus, we will focus on just two viruses whose virus-host interactions can be found at the two ends of the symbiosis continuum. On the antagonism side of the symbiosis continuum are the Sulfolobus islandicus rod-shaped viruses (SIRVs). SIRVs are long, stiff, rod-shaped viruses with short tail fibers at one end (Prangishvili, 2013). These fibers attach to the pili of host cells and then travel down them to infect the host (Quemin et al., 2013).
SIRVs are lytic viruses and lyse the cell through the creation of proteinaceous pyramids in the cell membrane that open to release progeny viruses (Prangishvili and Quax, 2011). On the mutualism side we discuss the chronic Sulfolobus spindle-shaped viruses (SSVs). These viruses are made up of lemon-shaped particles with tail fibers that often interact to form rosette-like structures of multiple viruses. The best characterized example, SSV1, was initially thought to follow a temperate lifecycle in which the virus can integrate into the genome in a lysogenic state and then, under UV induction, create new viral particles (Schleper et al., 1992). However, further study into the egress mechanism showed that SSVs are not lytic but instead exit the cell through budding (Fig. 3) (Quemin et al., 2016). The continued growth of the host during viral production, although with a fitness cost, suggests instead that these viruses are better characterized as chronic viruses (DeWerff et al., 2020). Like the host population, both SIRVs and SSVs are also structured as metapopulations. In initial studies using a single marker for viral presence it appeared that viruses are highly mobile (Snyder et al., 2007); however, full genome sequences showed that in fact both are structured, like the host, as metapopulations in Yellowstone National Park (Bautista et al., 2017; Pauly et al., 2019).

Fig. 3 Budding of the virus Sulfolobus spindle-shaped virus 9 from a chronically infected Sulfolobus islandicus strain. Chronically infected cells were visualized by transmission electron microscopy with negative staining. The ability of the virus to exit the cell through budding without killing the host allows for mixed-mode transmission with both vertical and horizontal transmission occurring at once. TEM images courtesy of Elizabeth Rowland.

Fig. 4 Model of CRISPR-Cas immunity found in Sulfolobus islandicus. On initial infection a spacer can be acquired and integrated into the leader end of the repeat-spacer array. This array is then transcribed and processed into CRISPR RNAs (crRNA) that will interact with the Cascade complex. Upon reinfection with a virus carrying the same protospacer, the crRNA-Cascade complex will bind to the viral genome and degrade it to prevent infection. Figure adapted from Maria A. Bautista.

Complexity of Host-Virus Interactions

Archaea have many host defenses against viral infection, including surface resistance, restriction modification and CRISPR-Cas immunity (Koonin et al., 2017; Rowland et al., 2019). The molecular mechanism of defense shapes the co-evolutionary dynamics of viruses and hosts. Surface modifications that commonly evolve in the lab, such as deletion of the genes encoding surface appendages, are not seen in natural populations, suggesting this type of resistance is rare in nature because appendages are essential to survival (Rowland et al., 2019). Instead, studies of natural populations in thermoacidophilic archaea suggest CRISPR-Cas immunity is the dominant mechanism of defense (Held et al., 2010, 2013). Clustered regularly interspaced short palindromic repeats (CRISPRs) are repeat-spacer arrays within the host genome to which spacers can be added that match a sequence, called a protospacer, in a viral or other foreign DNA element (Fig. 4) (Barrangou et al., 2007; Marraffini and Sontheimer, 2008; Amitai and Sorek, 2016). The protospacers that confer immunity are limited only by a short protospacer adjacent motif (PAM), meaning that in any viral genome there are thousands of possible loci that can be targeted by CRISPR-Cas immunity. Repeat-spacer arrays are transcribed and processed to form a complex with CRISPR-associated (Cas) proteins. If the spacer sequence in the protein-RNA complex matches a viral genome on subsequent infection, the Cas machinery will degrade the viral genome to confer immunity. Viruses evade targeting through escape mutations in their protospacer loci or PAM (Amitai and Sorek, 2016). Theory predicts that the unique biology of the CRISPR-Cas defense system promotes diversification of CRISPR-Cas immunity because every host infection can result in a different spacer addition to the CRISPR repeat-spacer array, resulting in distributed immunity (Fig. 5) (Childs et al., 2012, 2014; van Houte et al., 2016).
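The acquisition and interference steps described above can be sketched as a small data structure. This is a minimal illustration and not a model of the Sulfolobus system: the class name `CrisprArray`, the toy two-base PAM, the sequences, and the rule of exact matching on a single strand with the PAM immediately upstream are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CrisprArray:
    """Toy repeat-spacer array; index 0 is the leader end (newest spacer)."""
    spacers: List[str] = field(default_factory=list)

    def acquire(self, protospacer: str) -> None:
        # New spacers are integrated at the leader end of the array.
        self.spacers.insert(0, protospacer)

    def is_immune_to(self, viral_genome: str, pam: str) -> bool:
        # Immune if any spacer matches a genome site directly preceded by the PAM
        # (simplifying assumption: exact match, one strand only).
        return any(pam + spacer in viral_genome for spacer in self.spacers)

    def infection_history(self) -> List[str]:
        # Oldest acquisition first, read from the leader-distal end.
        return list(reversed(self.spacers))


PAM = "CC"  # hypothetical motif chosen for illustration
virus = "ATCCGATTACAGGTTCCTGCATGCAAT"  # made-up genome with two PAM-flanked sites

host = CrisprArray()
host.acquire("GATTACAGG")  # spacer from a first infection
host.acquire("TGCATGCAA")  # spacer from a later infection

immune = host.is_immune_to(virus, PAM)

# An escape mutant carrying a point mutation in each targeted protospacer.
escape_mutant = virus.replace("GATTACAGG", "GATTAGAGG").replace("TGCATGCAA", "TGCATCCAA")
immune_to_mutant = host.is_immune_to(escape_mutant, PAM)

print(immune, immune_to_mutant, host.infection_history())
```

Because new spacers enter at the leader end, reading the array from the far end recovers the order of past infections, which is the property that lets spacer arrays serve as a record of infection history.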
Fig. 5 Model of distributed immunity in host populations against a virus with 4 potential protospacers. If a host population only contains spacers to one of them then it has low distributed immunity. In this scenario the virus would only need an escape mutation in the yellow protospacer to allow infection of the whole population. If there is a diversity of CRISPR spacer targets in the population then it will have high distributed immunity. In the figure above, if the virus evolved an escape mutation in any one of the protospacers it would only be able to infect one member of the host population.

High distributed immunity effectively creates a population in which there are few susceptible hosts for a virus, since each single escape mutation gives a virus access only to a very small subset of the population. The nested structure of CRISPR-Cas immunity leads to alternating regimes of stable host populations and epidemic viral outbreaks (Pilosof et al., 2020). Anti-CRISPR (Acr) proteins are viral proteins that block key components of CRISPR function or regulation. The first identified anti-CRISPR in archaea was found in SIRV (He et al., 2018). This protein was found to interact with the CRISPR Cascade complex, and this interaction led to the inability of the complex to cleave viral DNA targets (Peng et al., 2020). Recently another Acr protein was found encoded in SIRV that does not directly target components of the CRISPR-Cas system but instead is a nuclease that cleaves a signaling molecule responsible for upregulating the CRISPR-Cas system during active infection (Athukoralage et al., 2020). As stated previously, theory suggests that high distributed immunity in a population limits the susceptible population size and favors vertical transmission or leads to viral extinction (Chabas et al., 2018; Childs et al., 2014). The presence of Acr proteins changes the interaction between viruses and their hosts by disabling key elements of CRISPR-Cas immunity, leading to a much larger susceptible population than spacer sequences would predict. This would ultimately favor horizontal transmission and traits associated with higher virulence. Because the CRISPR array retains small segments of viral DNA in the order in which infections occurred, it allows the history of infection to be tracked over both space and time (Held et al., 2010, 2013; Held and Whitaker, 2009). Signatures of previous virus infection recorded in the CRISPR repeat-spacer sequences contain matches to SIRV and SSV but suggest there are many other types of undiscovered viruses that can infect Sulfolobus species (Anderson et al., 2017; Held et al., 2010; Shah et al., 2009). CRISPR arrays in both Yellowstone and Kamchatkan populations of Sulfolobus have been shown to be highly diversified (Pauly et al., 2019; Held et al., 2010). In addition, analysis of the content of CRISPR spacer arrays from Sulfolobus islandicus suggests that there is a level of recombination between CRISPR arrays as well as shared spacers that were acquired independently (Held et al., 2010). The degree of CRISPR-Cas spacer targeting of different viruses reflects the interactions between viruses and their hosts in natural populations. Sulfolobus islandicus strains isolated from Kamchatka, Russia heavily target Sulfolobus spindle-shaped viruses (SSVs) but lack targeting of Sulfolobus islandicus rod-shaped viruses (SIRVs), whereas host populations from Yellowstone National Park, USA heavily target SIRVs and have only low immunity to SSVs (Pauly et al., 2019). This matches what has been seen with viral isolation from these locations: while SSVs are quite commonly isolated from Kamchatka hot springs, to date no SIRV has been isolated there.
In Yellowstone National Park hot springs, by contrast, SIRVs are quite common, while SSVs, though present, are not the primary viruses isolated from these locations.
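The contrast drawn in Fig. 5 between low and high distributed immunity can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions: the function name `susceptible_fraction`, the four-host populations, and the protospacer labels other than yellow are hypothetical, and Acr proteins are ignored, so a host counts as susceptible only when none of its spacers matches an unmutated protospacer.

```python
from typing import List, Set


def susceptible_fraction(
    host_spacers: List[Set[str]],
    escaped: Set[str],
    protospacers: Set[str],
) -> float:
    """Fraction of hosts with no spacer matching an unmutated protospacer."""
    intact = protospacers - escaped  # sites the virus has not mutated
    susceptible = sum(1 for spacers in host_spacers if not (spacers & intact))
    return susceptible / len(host_spacers)


# A virus with four potential protospacers (labels are illustrative).
PROTOSPACERS = {"yellow", "blue", "green", "red"}

# Low distributed immunity: every host targets the same protospacer.
low_diversity = [{"yellow"}, {"yellow"}, {"yellow"}, {"yellow"}]

# High distributed immunity: each host targets a different protospacer.
high_diversity = [{"yellow"}, {"blue"}, {"green"}, {"red"}]

# The virus acquires a single escape mutation in the yellow protospacer.
low_after_escape = susceptible_fraction(low_diversity, {"yellow"}, PROTOSPACERS)
high_after_escape = susceptible_fraction(high_diversity, {"yellow"}, PROTOSPACERS)

print(low_after_escape, high_after_escape)
```

In this toy setting one escape mutation exposes the entire low-diversity population but only one host in four of the high-diversity population, which is the sense in which distributed immunity shrinks the pool of susceptible hosts available to any single viral genotype.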

How do Ecology and Evolution Shape Archaeal Viruses?

Putting the pieces together, can we predict how viruses will evolve along the symbiosis continuum in natural populations of archaea in extreme environments? In the context of a metapopulation the landscape of susceptible hosts may be highly varied. One hot spring may contain a population with highly distributed CRISPR-Cas immunity, while another one nearby that has recently experienced a bottleneck or extinction may lack diversified spacers that can target the viral population. If viral migration occurs between hot springs or a host bottleneck occurs within a hot spring, there could be a viral boom period with a large susceptible population or a bust period in which no susceptible hosts are present, effectively causing viral extinction unless escape mutations are present. It is within these two different scenarios that we will explore how selection will move the virus-host interactions along the symbiosis continuum. During a viral boom period there is a large susceptible population that can be infected by the virus. When finding a new host is no longer the selective pressure, the traits that are favored are ones that increase the chance of horizontal transmission. This can be accomplished by increasing the number or infectivity of viral progeny or by decreasing the time needed for a complete infection cycle. Viruses that have a decreased infection time will have a greater chance of infecting and spreading within the susceptible population than their counterparts with longer infection cycles. However, the traits that are selected for are tied to an increase in virulence for lytic viruses (as described above). This increase in virulence is harmful to the host population and shifts the interaction toward the extremes of antagonism.
Selecting for the quick use of susceptible hosts within the population for faster spread can lead to a tragedy of the commons in which the shared resource, in this case the susceptible hosts, is depleted by the viral population, which can ultimately lead to a bust period in which there are few or no susceptible hosts (Kerr et al., 2006). There is highly distributed immunity among populations of Sulfolobus islandicus against the virus SIRV in this environment; therefore it is perhaps not surprising that there is not one, but two types of anti-CRISPRs present in SIRVs, as these genes would be highly selected for to increase the fitness of the virus (Peng et al., 2020). In a viral bust period in which there are few susceptible hosts, selection will favor viruses that are more prudent with their hosts. For lytic viruses, this would select for lifestyle characteristics that increase the duration of infection within the host, allowing for production of either a higher quantity or a higher quality of new viral progeny. However, this bust period would also strongly favor viruses that are not strictly lytic but instead follow a dual- or mixed-mode transmission, such as with temperate or chronic infection. During a bust period, viruses that are already vertically associated with their hosts would be protected from environmental degradation of the genome or virion particle. This would select for mutualistic traits, as discussed earlier, that would give a benefit to the infected host (Bull et al., 1991). This has been shown recently with SSV9 infection of the archaeon Sulfolobus islandicus. Kamchatka populations exhibit high distributed CRISPR-Cas immunity to SSVs, which limits the size of the susceptible population (Pauly et al., 2019). Though chronic infection comes at a cost in cell growth, DeWerff et al. showed that infected cells produce a toxin that kills uninfected and immune cells. This killing of competitors gives a benefit not only to the infected host cell, through toxin immunity, but to the chronic virus as well (DeWerff et al., 2020). Because the toxin acts in a manner similar to plasmid addiction systems, the virus can ensure its continued vertical transmission, and it selects for successful horizontal transmission by allowing only infected cells to survive. Given the structure of these metapopulations, a competitive ability would give a substantial benefit to the host. In its established hot spring, the ability to remove uninfected cells would suggest that the chronically infected cell becomes the dominant member. If migration to a new hot spring were to occur, the chronically infected host would be able to invade an established uninfected population (even one with distributed CRISPR-Cas immunity) by killing established local hosts.

Fig. 6 Virus-host interactions fall along a symbiosis continuum. When a virus decreases host fitness it moves towards antagonism in which one partner benefits at the cost of the other. If both virus and host gain benefits from the interaction, then it is shifted toward mutualism.

How do Virus Interactions Impact Ecology and Evolution?

So far, we have considered how both viruses, the lytic SIRV and chronic SSV, interact individually with their host and shape the ecology of the system. With SIRVs the ability to prevent CRISPR-Cas function allows for a larger susceptible population despite the high targeted distributed immunity that can be seen in these populations. This allows for selection of traits that are associated with high virulence in the host and shifts the interaction toward antagonism (Fig. 6). On the opposite end of the spectrum, SSVs can establish chronic infection that allows for both vertical and horizontal transmission of the virus. The chronically infected host and virus benefit from this interaction through the competitive advantage that is conferred by production of a virally encoded toxin that kills uninfected cells. This is in line with the selection of benevolent traits predicted to be associated with vertical transmission and a shift of the symbiotic relationship toward mutualism. However, in Yellowstone National Park, both SSVs and SIRVs can be found simultaneously in hot springs. How does the presence of both change the eco-evolutionary dynamic of each individually? It is necessary to go beyond thinking about one-virus, one-host interactions and consider those that arise when multiple types of viruses are present. Infection of a chronically infected host with a lytic virus would be detrimental not only to the host but to the chronic virus infecting this host cell as well. The metapopulation nature of acidic hot springs and the extinction-recolonization dynamic suggest that competition between host strains to invade new habitats and quickly establish as the dominant member is a strong driver of selection for viruses that confer competitive advantages and remain with their hosts.
Studies of natural isolates from these populations, as well as single cell genomics, have shown that while most cells are infected by at least one virus in nature, chronic infection with SSV does not dominate each subpopulation within Yellowstone National Park (Pauly et al., 2019; Munson-McGee et al., 2018b). Taken together, the interactions discussed here must be more complicated, and further research into novel archaeal virus mechanisms is needed to fully understand their eco-evolutionary impact. Through the lens of extreme environments, the expanding view of viruses to include positive mutualistic impacts on their hosts provides dramatic insight into ecology and evolution in the microbial world, especially on contemporary time-scales. In addition, study of viral symbiosis is uncovering the diversity of molecular interactions within cells that contribute to their physiology. In this way, the dual role of viruses as individuals and as genetic elements shows us that integrating molecular mechanisms with their eco-evolutionary impacts will play an increasingly important role in biology going forward.

Acknowledgments

We thank Elizabeth Rowland for help and inspiration on the initial drafts of this manuscript and for TEM images. Work discussed here was funded by NSF DEB #1656869.

References

Amitai, G., Sorek, R., 2016. CRISPR–Cas adaptation: Insights into the mechanism of action. Nature Reviews Microbiology 14, 67–76.
Anantharaman, K., Duhaime, M.B., Breier, J.A., et al., 2014. Sulfur oxidation genes in diverse deep-sea viruses. Science 344, 757–760.
Anderson, R.E., Kouris, A., Seward, C.H., Campbell, K.M., Whitaker, R.J., 2017. Structured populations of Sulfolobus acidocaldarius with susceptibility to mobile genetic elements. Genome Biology and Evolution 9 (6), 1699–1710.
Athukoralage, J.S., McMahon, S.A., Zhang, C., et al., 2020. An anti-CRISPR viral ring nuclease subverts type III CRISPR immunity. Nature 577, 572–575.
Barrangou, R., Fremaux, C., Deveau, H., et al., 2007. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712.
Bautista, M.A., Black, J.A., Youngblut, N.D., Whitaker, R.J., 2017. Differentiation and structure in Sulfolobus islandicus rod-shaped virus populations. Viruses 9.
Berngruber, T.W., Lion, S., Gandon, S., 2015. Spatial structure, transmission modes and the evolution of viral exploitation strategies. PLOS Pathogens 11, e1004810.
Bolduc, B., Wirth, J.F., Mazurie, A., Young, M.J., 2015. Viral assemblage composition in Yellowstone acidic hot springs assessed by network analysis. The ISME Journal 9, 2162–2177.
Bondy-Denomy, J., Qian, J., Westra, E.R., et al., 2016. Prophages mediate defense against phage infection through diverse mechanisms. The ISME Journal 10, 2854–2866.
Boynton, P.J., 2019. The ecology of killer yeasts: Interference competition in natural habitats. Yeast 36, 473–485.
Brumfield, S.K., Ortmann, A.C., Ruigrok, V., et al., 2009. Particle assembly and ultrastructural features associated with replication of the lytic archaeal virus Sulfolobus turreted icosahedral virus. Journal of Virology 83, 5964–5970.
Bull, J.J., Molineux, I.J., Rice, W.R., 1991. Selection of benevolence in a host-parasite system. Evolution 45, 875–882.
Bull, J.J., 1994. Virulence. Evolution 48, 1423–1437.
Campbell, K.M., Kouris, A., England, W., et al., 2017. Sulfolobus islandicus meta-populations in Yellowstone National Park hot springs. Environmental Microbiology 19 (6), 2334–2347.
Chabas, H., Lion, S., Nicot, A., et al., 2018. Evolutionary emergence of infectious diseases in heterogeneous host populations. PLOS Biology 16, e2006738.
Chan, B.K., Abedon, S.T., 2015. Bacteriophages and their enzymes in biofilm control. Current Pharmaceutical Design 21, 85–99.
Childs, L.M., Held, N.L., Young, M.J., Whitaker, R.J., Weitz, J.S., 2012. Multiscale model of CRISPR-induced coevolutionary dynamics: Diversification at the interface of Lamarck and Darwin. Evolution 66, 2015–2029.
Childs, L.M., England, W.E., Young, M.J., Weitz, J.S., Whitaker, R.J., 2014. CRISPR-induced distributed immunity in microbial populations. PLOS One 9, e101710.
Cressler, C.E., McLeod, D.V., Rozins, C., Hoogen, J.V.D., Day, T., 2016. The adaptive evolution of virulence: A review of theoretical predictions and empirical tests. Parasitology 143, 915–930.
DeWerff, S.J., Bautista, M.A., Pauly, M., Zhang, C., Whitaker, R.J., 2020. Killer archaea: Virus-mediated antagonism to CRISPR-immune populations results in emergent virus-host mutualism. mBio 11 (2), e00404-20.
DiMaio, F., Yu, X., Rensen, E., et al., 2015. A virus that infects a hyperthermophile encapsidates A-form DNA. Science 348, 914–917.
Forterre, P., Bergerat, A., Lopez-Garcia, P., 1996. The unique DNA topology and DNA topoisomerases of hyperthermophilic archaea. FEMS Microbiology Reviews 18, 237–248.
Fu, C., Johnson, J.E., 2012. Structure and cell biology of archaeal virus STIV. Current Opinion in Virology 2, 122–127.
Gama, J.A., Reis, A.M., Domingues, I., et al., 2013. Temperate bacterial viruses as double-edged swords in bacterial warfare. PLOS One 8.
Gleißner, M., Kaiser, U., Antonopoulos, E., Schäfer, G., 1997. The archaeal SoxABCD complex is a proton pump in Sulfolobus acidocaldarius. Journal of Biological Chemistry 272, 8417–8426.
Hartl, D.L., 1988. A Primer of Population Genetics. Sunderland, MA: Sinauer Associates, Inc.
He, F., Bhoobalan-Chitty, Y., Van, L.B., et al., 2018. Anti-CRISPR proteins encoded by archaeal lytic viruses inhibit subtype I-D immunity. Nature Microbiology 3, 461–469.
Held, N.L., Herrera, A., Cadillo-Quiroz, H., Whitaker, R.J., 2010. CRISPR associated diversity within a population of Sulfolobus islandicus. PLOS One 5, e12988.
Held, N.L., Herrera, A., Whitaker, R.J., 2013. Reassortment of CRISPR repeat-spacer loci in Sulfolobus islandicus. Environmental Microbiology 15, 3065–3076.
Held, N.L., Whitaker, R.J., 2009. Viral biogeography revealed by signatures in Sulfolobus islandicus genomes. Environmental Microbiology 11, 457–466.
Henche, A.-L., Ghosh, A., Yu, X., et al., 2012b. Structure and function of the adhesive type IV pilus of Sulfolobus acidocaldarius. Environmental Microbiology 14, 3188–3202.
Henche, A.-L., Koerdt, A., Ghosh, A., Albers, S.-V., 2012a. Influence of cell surface structures on crenarchaeal biofilm formation using a thermostable green fluorescent protein. Environmental Microbiology 14, 779–793.
Hjort, K., Bernander, R., 1999. Changes in cell size and DNA content in Sulfolobus cultures during dilution and temperature shift experiments. Journal of Bacteriology 181, 5669–5675.
Inskeep, W.P., Rusch, D.B., Jay, Z.J., et al., 2010. Metagenomes from high-temperature chemotrophic systems reveal geochemical controls on microbial community structure and function. PLOS One 5, e9773.
Kerr, B., Neuhauser, C., Bohannan, B.J.M., Dean, A.M., 2006. Local migration promotes competitive restraint in a host–pathogen “tragedy of the commons”. Nature 442 (7098), 75–78.
Koonin, E.V., Makarova, K.S., Wolf, Y.I., 2017. Evolutionary genomics of defense systems in archaea and bacteria. Annual Review of Microbiology 71, 233–261.
Labrie, S.J., Samson, J.E., Moineau, S., 2010. Bacteriophage resistance mechanisms. Nature Reviews Microbiology 8, 317–327.
Lindell, D., Sullivan, M.B., Johnson, Z.I., et al., 2004. Transfer of photosynthesis genes to and from Prochlorococcus viruses. Proceedings of the National Academy of Sciences of the United States of America 101, 11013–11018.
Lipsitch, M., Siller, S., Nowak, M.A., 1996. The evolution of virulence in pathogens with vertical and horizontal transmission. Evolution 50, 1729–1741.
Liu, Y., Wang, J., Liu, Y., et al., 2015. Identification and characterization of SNJ2, the first temperate pleolipovirus integrating into the genome of the SNJ1-lysogenic archaeal strain. Molecular Microbiology 98, 1002–1020.
Luk, A.W.S., Williams, T.J., Erdmann, S., Papke, R.T., Cavicchioli, R., 2014. Viruses of Haloarchaea. Life 4, 681–715.
Mann, N.H., Cook, A., Millard, A., Bailey, S., Clokie, M., 2003. Bacterial photosynthesis genes in a virus. Nature 424 (6950), 741.
Marraffini, L.A., Sontheimer, E.J., 2008. CRISPR interference limits horizontal gene transfer in Staphylococci by targeting DNA. Science 322, 1843–1845.
Maslov, S., Sneppen, K., 2017. Population cycles and species diversity in dynamic Kill-the-Winner model of microbial ecosystems. Scientific Reports 7, 39642.
Maynard-Smith, J., 1989. Evolutionary genetics. Oxford: Oxford University Press.
Morgan, A.D., Gandon, S., Buckling, A., 2005. The effect of migration on local adaptation in a coevolving host–parasite system. Nature 437 (7056), 253–256.
Munson-McGee, J.H., Peng, S., DeWerff, S., et al., 2018a. A virus or more in (nearly) every cell: Ubiquitous networks of virus-host interactions in extreme environments. The ISME Journal 12, 1706–1714.
Munson-McGee, J.H., Snyder, J.C., Young, M.J., 2018b. Archaeal viruses from high-temperature environments. Genes 9.
Nadell, C.D., Drescher, K., Foster, K.R., 2016. Spatial structure, cooperation and competition in biofilms. Nature Reviews Microbiology 14, 589–600.
Obeng, N., Pratama, A.A., van Elsas, J.D., 2016. The significance of mutualistic phages for bacterial ecology and evolution. Trends in Microbiology 24, 440–449.
Pauly, M.D., Bautista, M.A., Black, J.A., Whitaker, R.J., 2019. Diversified local CRISPR-Cas immunity to viruses of Sulfolobus islandicus. Philosophical Transactions of the Royal Society B: Biological Sciences 374, 20180093.
Peng, X., Mayo-Muñoz, D., Bhoobalan-Chitty, Y., Martínez-Álvarez, L., 2020. Anti-CRISPR proteins in archaea. Trends in Microbiology 28, 913–921.
Pilosof, S., Alcalá-Corona, S.A., Wang, T., et al., 2020. The network structure and eco-evolutionary dynamics of CRISPR-induced immune diversification. Nature Ecology & Evolution 4, 1650–1660.
Prangishvili, D., Bamford, D.H., Forterre, P., et al., 2017. The enigmatic archaeal virosphere. Nature Reviews Microbiology 15, 724–739.
Prangishvili, D., Forterre, P., Garrett, R.A., 2006. Viruses of the archaea: A unifying view. Nature Reviews Microbiology 4, 837–848.
Prangishvili, D., 2013. The wonderful world of archaeal viruses. Annual Review of Microbiology 67, 565–585.
Prangishvili, D., Quax, T.E., 2011. Exceptional virion release mechanism: One more surprise from archaeal viruses. Current Opinion in Microbiology 14, 315–320.
Prangishvili, D., Vestergaard, G., Häring, M., et al., 2006. Structural and genomic properties of the hyperthermophilic archaeal virus ATV with an extracellular stage of the reproductive cycle. Journal of Molecular Biology 359, 1203–1216.
Quemin, E.R.J., Chlanda, P., Sachse, M., et al., 2016. Eukaryotic-like virus budding in archaea. mBio 7, e01439-16.
Quemin, E.R.J., Lucas, S., Daum, B., et al., 2013. First insights into the entry process of hyperthermophilic archaeal viruses. Journal of Virology 87, 13379–13385.
Reno, M.L., Held, N.L., Fields, C.J., Burke, P.V., Whitaker, R.J., 2009. Biogeography of the Sulfolobus islandicus pan-genome. Proceedings of the National Academy of Sciences of the United States of America 106, 8605–8610. Rowland, E.F., Bautista, M.A., Zhang, C., Whitaker, R.J., 2019. Surfaceresistance to SSVs and SIRVs in pilin deletions of Sulfolobus islandicus. Molecular Microbiology 113 (4). Schleper, C., Kubo, K., Zillig, W., 1992. The particle SSV1 from the extremely thermophilic archaeon Sulfolobus is a virus: demonstration of infectivity and of transfection with viral DNA. Proceedings of the National Academy of Sciences of the United States of America 89, 7645. Shah, S.A., Hansen, N.R., Garrett, R.A., 2009. Distribution of CRISPR spacer matches in viruses and plasmids of crenarchaeal acidothermophiles and implications for their inhibitory mechanism. Biochemical Society Transactions 37, 23–28. Simmons, M., Drescher, K., Nadell, C.D., Bucci, V., 2018. Phage mobility is a core determinant of phage–bacteria coexistence in biofilms. 2. The ISME Journal 12, 531–543. Snyder, J.C., Wiedenheft, B., Lavin, M., et al., 2007. Virus Movement Maintains Local Virus Population Diversity. Proceedings of the National Academy of Sciences 104 (48), 19102–19107. https://doi.org/10.1073/pnas.0709445104. Stewart, F.M., Levin, B.R., 1984. The population biology of bacterial viruses: Why be temperate. Theoretical Population Biology 26, 93–117. Thingstad, T.F., 2000. Elements of a theory for the mechanisms controlling abundance, diversity, and biogeochemical role of lytic bacterial viruses in aquatic systems. Limnology and Oceanography 45, 1320–1328. Turner, P.E., Cooper, V.S., Lenski, R.E., 1998. Tradeoff between horizontal and vertical modes of transmission in bacterial plasmids. Evolution 52, 315–329. van Houte, S., Ekroth, A.K.E., Broniewski, J.M., et al., 2016. The diversity-generating benefits of a prokaryotic adaptive immune system. Nature 532, 385–388. 
Wang, Y., Sima, L., Lv, J., et al., 2016. Identification, characterization, and application of the replicon region of the halophilic temperate sphaerolipovirus SNJ1. Journal of Bacteriology 198, 13. Weitz, J.S., Hartman, H., Levin, S.A., 2005. Coevolutionary arms races between bacteria and bacteriophage. PNAS 102, 9535–9540. Weitz, J.S., Li, G., Gulbudak, H., Cortez, M.H., Whitaker, R.J., 2019. Viral invasion fitness across a continuum from lysis to latency. Virus Evolution 5 (1), vez006. Whitaker, R.J., Grogan, D.W., Taylor, J.W., 2003. Geographic barriers isolate endemic populations of hyperthermophilic archaea. Science 301, 976. Zhang, C., Wipfler, R.L., Li, Y., et al., 2019. Cell structure changes in the hyperthermophilic crenarchaeon Sulfolobus islandicus lacking the S-Layer. mBio 10.

Further Reading Bao, X., Roossinck, M.J., 2013. A life history view of mutualistic viral symbioses: Quantity or quality for cooperation? Current Opinion in Microbiology 16, 514–518. Ewald, P.W., 1987. Transmission modes and evolution of the parasitism-mutualism continuuma. Annals of the New York Academy of Sciences 503, 295–306. Roossinck, M.J., 2011. The good viruses: Viral mutualistic symbioses. Nature Reviews Microbiology 9, 99–108.

FUNGAL VIRUSES

An Introduction to Fungal Viruses☆

Nobuhiro Suzuki, Institute of Plant Stress and Resources (IPSR), Okayama University, Kurashiki, Japan

© 2021 Elsevier Ltd. All rights reserved.

This is an update of S.A. Ghabrial, N. Suzuki, Fungal Viruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy, Marc H.V. Van Regenmortel, Elsevier Ltd, 2008, doi:10.1016/B978-012374410-4.00563-X.

Glossary

Hyphal anastomosis: The union of one hypha with another, resulting in cytoplasmic exchange.
Hypovirulence: Attenuated fungal virulence mediated by virus infection, mitochondrial defects, or mutations in the fungal genome.
Mycoviruses: Viruses that infect and multiply in fungi.
Vegetative incompatibility: A genetically controlled self/non-self recognition system in fungi that determines the ability to undergo hyphal anastomosis. It is regarded as an antiviral defense that operates at the population level.
Virocontrol: A form of biological control that uses viruses; here, the protection of crops against phytopathogenic fungi by means of viruses that infect those fungi.

Introduction

Relative to plant virology, animal virology, or bacterial virology, fungal virology (or mycovirology) is new. In 1948, an economically important disease of the cultivated edible mushroom Agaricus bisporus (a basidiomycete), characterized by malformed fruiting bodies and serious yield loss, was first reported in a mushroom house owned by the La France Brothers of Pennsylvania. The disease was called “La France disease”, and similar diseases were reported shortly thereafter from Europe, Japan, and Australia. Different designations, such as “X-disease”, “watery stripe”, “brown disease”, and “die-back”, were given to what is basically the same disease as La France disease. The significance of the 1948 report lies in the fact that it led to the discovery of fungal viruses (mycoviruses). In 1962, Hollings noted the presence of at least three types of virus particles in diseased mushroom sporophores. This was the first report of virus particles in association with a fungus and is regarded as the dawn of mycovirology. The subsequent discovery that dsRNA of mycoviral origin was responsible for the interferon-inducing activities of culture filtrates of several Penicillium species (ascomycetes) greatly stimulated the search for mycoviruses.

Fungi, like other living organisms, can be infected by a number of viruses, and mycoviruses are found in all the major groups of fungi, such as the Ascomycota and Basidiomycota, as well as in fungal-like protists such as the Oomycota (see below). Although mycoviruses are widely prevalent, only those infecting a limited number of fungal host species have been studied, e.g., the yeast Saccharomyces cerevisiae, edible mushrooms, and phytopathogenic fungi. Given the predicted vast number of fungal species (approximately 120,000 accepted species and many more unknown species), it is expected that a far greater number of unrecognized mycoviruses occur in nature.
Support for this hypothesis comes from recent extensive searches of field fungal isolates that showed relatively high frequencies of virus infection: approximately 65%, 20%, and 2%–28% of isolates of Helicobasidium mompa (a basidiomycete that causes violet root rot), Rosellinia necatrix (an ascomycete called the white root rot fungus), and Cryphonectria parasitica (an ascomycete called the chestnut blight fungus), respectively, were found to be infected. Another line of support comes from recent next-generation sequencing and metaviromic analyses of environmental and plant-associated samples, which have uncovered numerous viruses that are most likely of fungal origin.

Biological Properties

Host Range

The natural host range of mycoviruses is likely to be restricted to the same or closely related vegetative compatibility groups that allow lateral transmission. The development of inoculation or introduction methods allows testing of experimental host ranges. Until recently, there were no known experimental host ranges for fungal viruses because of a lack of suitable infectivity assays. Experimental host ranges for some mycoviruses, however, were recently demonstrated and shown to extend to different vegetative compatibility groups, different species, different genera, and even different classes. For example, mycoreovirus 1 (MyRV1, an encapsidated dsRNA virus), the prototype of the genus Mycoreovirus in the family Reoviridae, can replicate and induce phenotypic alterations in different vegetative compatibility groups of C. parasitica similar to those exhibited by the original virus-containing strain. Furthermore, Cryphonectria hypovirus 1 (CHV1, a capsidless (+)ssRNA virus), the type member of the family Hypoviridae, can replicate in and confer hypovirulence onto members of several other ascomycete genera, e.g., Endothia gyrosa and Valsa ceratosperma, fungi associated with diseases of trees such as eucalyptus or apples and pears, in addition to its natural host, C. parasitica. Cryphonectria parasitica mitovirus 1 (CpMV1, genus Mitovirus), a mitochondrially replicating capsidless (+)ssRNA virus, can replicate in Valsa spp., but not in fungi more distantly related to its original host, C. parasitica. Of particular note is the recent discovery of cross-kingdom infections by a few fungal viruses between fungi and plants or fungi and insects (see below).

☆ In memory of Professor Said A. Ghabrial.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00045-X

Symptom Expression

Although the majority of viruses that infect phytopathogenic fungi have been reported to be avirulent, the phenotypic consequences of mycovirus infection can vary from symptomless to severely debilitating, and from hypovirulence to hypervirulence. Mycoviruses that attenuate the virulence of plant pathogenic fungi provide excellent model systems for basic studies on the development of novel biological control measures and for dissecting the mechanisms underlying fungal pathogenesis. In general, infections due to mycoviruses are both symptomless and persistent. Latency benefits the survival of the host, and persistence helps the virus in the absence of extracellular modes of transmission. To ensure their retention, some mycoviruses have evolved to confer a selective advantage on their hosts (e.g., the “killer” phenotypes in yeasts and smuts that are associated with totivirus infection). Because of their ability to secrete killer toxins, yeast killer strains have been utilized by the brewing industry to provide protection against contamination with adventitious sensitive strains. The genes encoding the smut killer toxins have been used to develop novel transgenic approaches to control corn smut.

Macroscopic symptoms caused by fungal viruses are a consequence of alterations in complex physiological processes that involve interactions between host and virus factors. Several virally encoded proteins have been identified as symptom determinants, including the papain-like protease p29 of some hypoviruses, which are phylogenetically related to potyviruses and viruses belonging to the picorna-like superfamily. This protein acts to repress host pigmentation and conidiation regardless of whether it is expressed from host chromosomes or the homologous virus genome. A protein encoded by one of the betachrysovirus (family Chrysoviridae) dsRNA genomic segments acts as a symptom determinant in phytopathogenic host fungi.
Another interesting example involves the proteins encoded by the totivirus Helminthosporium victoriae 190S virus (HvV190S, the prototype of the genus Victorivirus), which infects the filamentous ascomycete Helminthosporium victoriae, the causal agent of Victoria blight of oats. Co-expression of the HvV190S capsid and RdRP proteins results in empty capsid production and phenotypic changes similar to those induced in virus-infected fungal isolates, suggesting that viral replication is not required for symptom development.

Host factors involved in symptom expression are not well studied, except for the host genes involved in the killer phenomenon in yeast infected with the totivirus Saccharomyces cerevisiae virus L-A (ScV-L-A) and its associated satellite dsRNAs, as exemplified by the super killer genes, and those involved in symptom expression by CHV1 in C. parasitica. A genetic screen for host factors led to the identification of a Mg2+ transporter as being necessary for normal symptom induction by CHV1 in C. parasitica. The putative mRNA binding protein termed virus response 1 (vr1) and the hexagonal peroxisome protein (Hex1) of the filamentous ascomycete Fusarium graminearum, the causal agent of fusarium head blight, are associated with symptom expression induced by a betachrysovirus (Fusarium graminearum virus China 9, FgV-ch9) and a fusarivirus (Fusarium graminearum virus 1, FgV1, a capsidless (+)ssRNA virus related to hypoviruses), respectively.

Transcriptome analysis has been performed for only a limited number of virus/natural host combinations. In the C. parasitica/mycovirus pathosystem, the SAGA (Spt–Ada–Gcn5 acetyltransferase) complex, a general transcriptional co-activator, and DCL2 (a Dicer), a key enzyme of antiviral RNA silencing, mediate the upregulation of many host genes. Some of the virus-responsive C. parasitica genes are associated with CHV1 symptom expression or its mitigation. Of the 2200 C. parasitica genes, 13.4% are either up- or downregulated upon infection with the prototype strain (a severe strain) of CHV1, while only 7.5% are altered in their transcription levels by infection with a mild strain of CHV1. Half of the C. parasitica genes responsive to infection with the mild strain are also altered in transcription by the severe strain, which generally causes transcriptional changes of greater magnitude.
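As a rough illustration of what these percentages mean in gene counts, the arithmetic can be sketched as follows. The only inputs are the three figures quoted above (2200 genes, 13.4%, and 7.5%) together with the statement that half of the mild-strain-responsive genes are shared; the variable and function names are illustrative assumptions and do not come from the original study.

```python
# Figures quoted in the text; all names below are illustrative only.
GENES_SURVEYED = 2200      # C. parasitica genes examined
SEVERE_PER_MILLE = 134     # 13.4% altered by the severe (prototype) CHV1 strain
MILD_PER_MILLE = 75        # 7.5% altered by the mild CHV1 strain


def genes_altered(total, per_mille):
    """Convert a per-mille share of `total` into a gene count (may be fractional)."""
    return total * per_mille / 1000


severe = genes_altered(GENES_SURVEYED, SEVERE_PER_MILLE)
mild = genes_altered(GENES_SURVEYED, MILD_PER_MILLE)
shared = mild / 2  # "half" of the mild-strain-responsive genes

if __name__ == "__main__":
    print(f"severe strain: about {severe:.1f} genes")
    print(f"mild strain:   about {mild:.1f} genes")
    print(f"shared:        about {shared:.1f} genes")
```

Under these assumptions the severe strain alters roughly 295 genes and the mild strain 165, of which about 80 are common to both. These are back-of-the-envelope values derived from rounded percentages, not counts reported in the text.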

Transmission

Mycoviruses generally lack an extracellular phase in their life cycles. They are transmitted intracellularly during cell division, sporogenesis, and cell fusion. Lateral (horizontal) transmission usually occurs only between individuals of the same fungal species that belong to the same or closely related vegetative compatibility groups. Vegetative compatibility is genetically controlled. Mycoviruses may be eliminated during sexual spore formation. Although the totiviruses and narnaviruses (capsidless (+)ssRNA viruses belonging to the family Narnaviridae) that infect budding yeasts are effectively transmitted via ascospores, the mycoviruses infecting ascomycetous filamentous fungi are in effect eliminated during ascospore formation. Whereas the (+)ssRNA and dsRNA viruses infecting mushrooms are transmitted efficiently via basidiospores, virus-containing strains of the basidiomycete H. mompa are cured during sexual sporulation. Therefore, whether a mycovirus is transmitted through sexual spores depends on the host/virus combination involved. Mycovirus transmission through asexual spores occurs frequently, but its rate varies greatly depending on the combination of viral and host fungal strains. It was recently revealed that CHV1 p29 (the papain-like protease) might play a role in virus transmission, since it enhances the transmission of the homologous virus (CHV1) as well as a heterologous virus (the mycoreovirus MyRV1) through conidia in C. parasitica.

Though rare in nature, interspecies transmission has been reported between members of the same genus, including Cryphonectria, Sclerotinia, and Ophiostoma, that share the same habitats. Interspecies transmission is also implicated between distantly related fungi, e.g., between R. necatrix and Entoleuca spp. It remains unknown whether the interspecies barrier is overcome by physical contact, by vectors, or by infecting viruses.


Of note is the recent discovery of extracellular routes for mycovirus entry. There is experimental evidence both for the transmission of an encapsidated ssDNA virus, Sclerotinia sclerotiorum hypovirulence associated DNA virus 1 (SsHDV1, family Genomoviridae), by a mycophagous fly between colonies of Sclerotinia sclerotiorum (an ascomycetous plant pathogen with a broad host range), and for the acquisition and replication of a plant (+)ssRNA virus by Rhizoctonia solani (a basidiomycetous plant pathogen with a wide host range). See the section “Cross-Kingdom Infection by Viruses and Viroids” in this article.

Mixed Infections

Mixed infections with two or more related or unrelated viruses and the accumulation of defective dsRNA and/or satellite dsRNA molecules are common features of mycovirus infections. Examples include the totiviruses infecting Sphaeropsis sapinea, an important opportunistic fungal pathogen of pine trees (Sphaeropsis sapinea RNA virus 1 and 2, genus Victorivirus), and S. cerevisiae (ScV-L-A and ScV-LB/C), and the Penicillium stoloniferum gammapartitiviruses (Penicillium stoloniferum virus S and F). There are a number of known examples of mixed infections with plant or animal viruses in which one virus either interferes with or enhances the replication of the other. As a consequence, a reduction or increase in symptom severity may arise. Recent extensive searches for fungal viruses confirmed relatively high mixed infection rates in some phytopathogenic fungi such as C. parasitica, R. necatrix, Heterobasidion spp. (root rot basidiomycetes), S. sclerotiorum, and H. mompa.

There are at least three types of recognizable virus/virus interactions: synergistic, mutualistic, and antagonistic interactions, as well as genome alterations. Recent studies show synergistic interactions between a hypovirus (CHV1) and a mycoreovirus (MyRV1) through a hypovirally encoded protein. In this case, one-way synergism is observed, in that only the hypovirus infection trans-activates the replication of the mycoreovirus, which has a replication strategy distinct from that of hypoviruses. A similar one-way synergistic interaction is detected between Cryphonectria hypovirus 4 (CHV4) and mycoreovirus 2 (MyRV2), in which CHV4 enhances stable infection and replication of MyRV2 in C. parasitica. These two examples of synergism are mediated by the papain-like cysteine proteases p29 and p24, encoded by CHV1 and CHV4, respectively, both of which also act as suppressors of antiviral RNA silencing. Genome rearrangement occurs in C. parasitica co-infected with MyRV1 and CHV1 as a result of RNA silencing suppression by CHV1 p29. Synergistic interactions are also found between a megabirnavirus, Rosellinia necatrix megabirnavirus 2 (RnMBV2, an encapsidated dsRNA virus, the prototype of the genus Megabirnavirus in the family Megabirnaviridae), and a partitivirus, Rosellinia necatrix partitivirus 1 (RnPV1), in which RnMBV2 confers hypovirulence with the aid of RnPV1 and elevates the accumulation of RnPV1. For antagonistic and mutualistic interactions, see the “Future Perspectives” section below.

Fungal Virus Taxonomy

A list of the virus families and genera into which mycoviruses are classified, or have been officially proposed to the International Committee on Taxonomy of Viruses (ICTV), is included in Table 1. Mycoviruses have diverse genomes, including ones made of linear dsRNA, linear positive-sense (+)ssRNA, linear negative-sense (−)ssRNA, and circular ssDNA. Mycoviruses with dsDNA genomes are not in the list, but might yet be found, since dsDNA viruses of water molds, now classified as protists rather than fungi, have been reported. Mycoviruses for which three-dimensional (3D) structures of the virions have been reported are further described in “Structural features”. Members of some mycovirus families, e.g., Narnaviridae and Hypoviridae, infect only fungi, while members of other families, e.g., Metaviridae, Pseudoviridae, Reoviridae, Totiviridae, Chrysoviridae, Endornaviridae, Alphaflexiviridae, Botourmiaviridae, and Partitiviridae, infect fungi, protozoa, plants, or animals. Many mycoviruses have RNA genomes, either dsRNA or (+)ssRNA. Viruses with (−)ssRNA genomes (monopartite mymonaviruses and multipartite phenuiviruses) or circular ssDNA genomes have recently been reported. Since the article in the previous edition of this series was published, an extremely large number of viruses have been identified. See the “Relevant Website” section for an updated list of all mycoviruses. Note that there are still many unassigned mycoviruses awaiting official taxonomic classification. Table 1 lists these mycoviruses in the “Unassigned” sections.

dsRNA Mycoviruses

Mycoviruses with dsRNA genomes represent a major portion of the fungal-infecting viruses reported so far. With the exception of the mycoreoviruses (family Reoviridae), which have spherical double-shelled particles 80 nm in diameter, typical dsRNA mycoviruses have isometric particles ranging from 25 to 50 nm in diameter. In addition, for polymycoviruses (family Polymycoviridae), which have multipartite dsRNA genomes, two viral forms have been proposed: a filamentous encapsidated form and a capsidless RNA-protein complex. These dsRNA viruses are classified, based on the number of genome segments and phylogenetic relationships, into one genus, Botybirnavirus, and eight families: Totiviridae, Partitiviridae, Curvulaviridae, Amalgaviridae, Chrysoviridae, Megabirnaviridae, Quadriviridae, and Polymycoviridae (Table 1). Viruses in the family Totiviridae have non-segmented dsRNA genomes coding for a CP and an RdRP or a CP-RdRP fusion protein. At present, five genera have been placed in this family: Totivirus, Victorivirus, Giardiavirus, Leishmaniavirus, and Trichomonasvirus.

Table 1 List of viral families and genera into which fungal viruses are classified (a, b)

Order/Family/Genus/Species | Virus abbreviation | No. of segments | Accession number

Single stranded DNA viruses
Geplafuvirales
  Genomoviridae
    Gemycircularvirus
      Sclerotinia gemycircularvirus 1 | SsHDV1 | 1 | GQ365709
    Gemytripvirus*
      Gemytripvirus fugra1* | FgGMTV1 | 3 | MK430076; MK430077; MK430078

Reverse Transcribing RNA viruses
Ortervirales
  Pseudoviridae
    Pseudovirus
      Saccharomyces cerevisiae Ty1 virus | SceTy1V | 1 | M18706
    Hemivirus
      Saccharomyces cerevisiae Ty5 virus | SceTy5V | 1 | U19263
  Metaviridae
    Metavirus
      Saccharomyces cerevisiae Ty3 virus | SceTy3V | 1 | M34549

Double-stranded RNA viruses
Reovirales
  Reoviridae
    Mycoreovirus
      Mycoreovirus 1 | MyRV1 | 11 | AY277888; AY277889; AY277890; AB179636; AB179637; AB179638; AB179639; AB179640; AB179641; AB179642; AB179643
Ghabrivirales
  Totiviridae
    Totivirus
      Saccharomyces cerevisiae virus L-A | ScV-L-A | 1 | J04692; X13426
    Victorivirus
      Helminthosporium victoriae 190SV | Hv190SV | 1 | U41345
  Chrysoviridae
    Alphachrysovirus
      Penicillium chrysogenum virus | PcV | 4 | AF296439; AF296440; AF296441; AF296442
    Betachrysovirus
      Botryosphaeria dothidea chrysovirus | BdCV1 | 4 | KF688736; KF688737; KF688738; KF688739
  Megabirnaviridae
    Megabirnavirus
      Rosellinia necatrix megabirnavirus 1 | RnMBV1 | 2 | AB512282; AB512283
  Quadriviridae
    Quadrivirus
      Rosellinia necatrix quadrivirus 1 | RnQV1 | 4 | AB620061; AB620062; AB620063; AB620064
Durnavirales
  Partitiviridae
    Alphapartitivirus
      Rosellinia necatrix partitivirus 2 | RnPV2 | 2 | AB569997; AB569998
    Betapartitivirus
      Atkinsonella hypoxylon virus | AhV | 2 | L39125; L39126; L39127 (satellite)
    Gammapartitivirus
      Penicillium stoloniferum virus S | PsV-S | 2 | AY156521; AY156522
    Unassigned species
      Agaricus bisporus virus 4 | AbV4 | – | No entry in GenBank
      Gaeumannomyces graminis virus 0196A | GgV-0196A | – | No entry in GenBank
      Gaeumannomyces graminis virus T1A | GgV-T1A | – | No entry in GenBank
  Amalgaviridae
    Zybavirus
      Zygosaccharomyces bailii virus Z | ZbV-Z | 1 | KU200450
  Curvulaviridae*
    Orthocurvulavirus*
      Curvularia orthocurvulavirus 1* | CThTV | 2 | EF120984; EF120985
Order unassigned
  Polymycoviridae
    Polymycovirus
      Aspergillus fumigatus polymycovirus 1 | AfuTmV1 | 4 | HG975302; HG975303; HG975304; HG975305
  Family unassigned
    Botybirnavirus
      Botrytis porri botybirnavirus 1 | BpBBV1 | 2 | JF716350; JF716351
Unassigned
      Fusarium virguiliforme dsRNA mycovirus 1 | FvV1 | 1 | JN671444
      yado-nushi virus | YnV1 | 1 | LC061478
      Rosellinia necatrix megatotivirus 1 | RnMTV1 | 1 | LC333746
      Alternaria alternata virus 1 | AaV1 | 4 | AB368492; AB438027; AB438028; AB438029
      Ustilaginoidea virens RNA virus M | UvRV-M | 1? | KJ101567

Positive-sense RNA viruses
Wolframvirales
  Narnaviridae
    Narnavirus
      Saccharomyces 20S narnavirus | ScNV-20S | 1 | M63893
Cryppavirales
  Mitoviridae
    Mitovirus
      Cryphonectria mitovirus 1 | CpMV1 | 1 | L31849
Ourlivirales
  Botourmiaviridae
    Botoulivirus
      Botrytis botoulivirus | BOLV | 1 | LN827955
    Magoulivirus
      Magnaporthe magoulivirus 1 | MOLV1 | 1 | LT593139
    Scleroulivirus
      Sclerotinia scleroulivirus 1 | SsOLV1 | 1 | KP900928
    Penoulivirus*
      Phaeoacremonium penoulivirus* | PmOLV1 | 1 | MK584843
    Rhizoulivirus*
      Rhizoctonia rhizoulivirus* | RsOLV1 | 1 | KP900922
Sobelivirales
  Barnaviridae
    Barnavirus
      Mushroom bacilliform virus | MBV | 1 | MBU07551
Durnavirales
  Hypoviridae
    Hypovirus
      Cryphonectria hypovirus 1 | CHV-1 | 1 | M57938
Martellivirales
  Endornaviridae
    Alphaendornavirus
      Phytophthora alphaendornavirus 1 | PEV1 | 1 | AJ877914
    Betaendornavirus
      Sclerotinia sclerotiorum betaendornavirus 1 | SsEV1 | | KJ123645
Tymovirales
  Alphaflexiviridae
    Botrexvirus
      Botrytis virus X | BotVX | 1 | AY055762
    Sclerodarnavirus
      Sclerotinia sclerotiorum debilitation-associated RNA virus | SsDRV | 1 | AY147260
  Gammaflexiviridae
    Mycoflexivirus
      Botrytis virus F | BotVF | 1 | AF238884
  Deltaflexiviridae
    Deltaflexivirus
      Sclerotinia sclerotiorum deltaflexivirus 1 | SsDFV1 | 1 | KT581451
Unassigned
      Diaporthe RNA virus | DRV | 1 | AF142094
      Fusarium graminearum virus DK21 | FgV/DK21 | 1 | AY533037
      Oyster mushroom spherical virus | OMSV | 1 | AY182001
      Sclerophthora macrospora virus A | SmV-A | 3 | AB083060; AB083061; AB083061
      Sclerophthora macrospora virus B | SmV-B | 1 | AB012756
      Sclerotinia sclerotiorum RNA virus L | SsRV-L | 1 | EU779934
      Fusarium graminearum virus 1 | FgV1 | 1 | AY533037
      yado-kari virus 1 | YkV1 | 1 | LC006253
      Fusarium boothii large flexivirus 1 | FbLFV1 | 1 | LC425115
      hadaka virus 1 | HadV1 | 11 | LC519840; LC519841; LC519842; LC519843; LC519844; LC519845; LC519846; LC519847; LC519848; LC51984; LC519850

Negative-sense RNA viruses
Mononegavirales
  Mymonaviridae
    Sclerotimonavirus
      Sclerotinia sclerotimonavirus | SsNSRV-1 | 1 | KJ186782
    Botrytimonavirus*
      Botrytimonavirus botrytidis* | BcNSRV-7 | 1 | MT157413
    Lentimonavirus*
      Lentinula lentmonavirus* | LeNSRV-1 | 1 | LC466007
    Phyllomonavirus*
      Phyllomonavirus phyllospherae* | SLaNSRV-4 | 1 | KT598229
    Auricularimonavirus*
      Auricularimonavirus auriculariae* | AhNSRV-1 | 1 | MT259204
    Penicillimonavirus*
      Penicillimonavirus alphapenicillii* | PdNSRV-1 | 1 | MK584858
    Plasmopamonavirus*
      Plasmopamonavirus plasmoparae* | PvLAMV8 | 1 | MN557004
Bunyavirales
  Phenuiviridae
    Lentinuvirus
      Lentinula lentinuvirus | LeNSRV-2 | 2 | LC466008; LC466009
    Entovirus
      Entoleuca entovirus | EnPLV-1 | 2 | MF375882; MK140653
Unassigned
      Fusarium poae negative-stranded RNA virus 1 | FpNSRV-1 | 1? | LC150618
      Botrytis cinerea negative-stranded RNA virus 1 | BcNSRV-1 | 1? | LN827956

a The type species of the specified genus is underlined.
b Proposed taxa for new creation or re-classification (awaiting EC consideration and ICTV 2020 ratification) are asterisked.

Viruses in the genera Totivirus and Victorivirus infect fungi, whereas those belonging to the other three genera (Giardiavirus, Leishmaniavirus, and Trichomonasvirus) infect parasitic protozoa. Plant-infecting dsRNA viruses related to totiviruses (potential members of the genus Totivirus) were recently found. At least two distinct RdRP expression strategies have been reported for totiviruses: (1) those that express their RdRP as a fusion protein (CP-RdRP or Gag-Pol) by ribosomal frameshifting, such as the yeast ScV-L-A and the viruses that infect parasitic protozoa, and (2) those that synthesize RdRP as a separate non-fused protein by an internal initiation mechanism (e.g., a coupled termination–reinitiation mechanism), as demonstrated for HvV190S (a victorivirus) and others that infect filamentous fungi. Phylogenetic analysis of CP or RdRP sequences of totiviruses reflects these differences, and separate phylogenetic clusters can be generated. HvV190S and other victoriviruses that infect filamentous fungi are closer to each other than to viruses infecting yeast, smut fungi, and protozoa. The fact that independent alignments of CP and RdRP sequences give similar phylogenetic relationships supports the conclusion that totiviruses infecting filamentous fungi should reside in a genus of their own. HvV190S and other known totiviruses that infect filamentous fungi have been classified within the genus Victorivirus (Table 1).

The genomes of partitiviruses, curvulaviruses, and megabirnaviruses consist of two dsRNA segments, while chrysoviruses, quadriviruses, and polymycoviruses have three to seven, four, and four to seven dsRNA segments, respectively.

The unclassified dsRNA mycovirus Agaricus bisporus virus 1 (AbV1), also designated La France isometric virus, causes a serious disease (named La France disease) of the cultivated mushroom Agaricus bisporus (white button mushroom). AbV1 is of special interest because of its historical and economic importance. The AbV1 virions, isolated from diseased fruit bodies and mycelia, are isometric, 36 nm in diameter, and co-purify with nine dsRNA segments (referred to as disease-associated dsRNAs). The dsRNA segments range in size from 3.6 kbp to 0.78 kbp, and three of them are believed to be satellites. It is not clear at present whether the nine dsRNA segments are encapsidated individually, in various combinations, or all together in single particles. Based on the size of the particles, the cesium sulfate gradient profile, and the results of dsRNA and protein analyses of the gradient fractions, it is highly unlikely that all dsRNAs are packaged together in single particles. More realistically, AbV1 represents a multiparticle system in which the various particle classes have similar densities. Interestingly, phylogenetic analysis of the conserved motifs of the AbV1 RdRP, encoded by dsRNA segment 1, and those of other dsRNA mycoviruses showed that AbV1 is closely related to the multipartite chrysoviruses. BLAST analyses also supported this notion. In addition to AbV1, there is an increasing number of unclassified dsRNA mycoviruses. Table 1 lists only some of the well-characterized ones.
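The segment counts quoted above for the dsRNA mycovirus families can be collected into a small lookup, sketched here purely as an illustration. The ranges are the ones stated in this article; the dictionary and function names are hypothetical, and the lookup is not a classification tool, since family assignment also depends on phylogenetic relationships.

```python
# Genome segment counts for dsRNA mycovirus families, as stated in this article.
# Each value is an inclusive (min, max) range of dsRNA segments.
# SEGMENT_RANGES and candidate_families are illustrative names only.
SEGMENT_RANGES = {
    "Totiviridae": (1, 1),       # non-segmented genome
    "Partitiviridae": (2, 2),
    "Curvulaviridae": (2, 2),
    "Megabirnaviridae": (2, 2),
    "Chrysoviridae": (3, 7),
    "Quadriviridae": (4, 4),
    "Polymycoviridae": (4, 7),
}


def candidate_families(n_segments):
    """Return, sorted by name, the families whose stated range includes n_segments."""
    return sorted(
        family
        for family, (low, high) in SEGMENT_RANGES.items()
        if low <= n_segments <= high
    )


if __name__ == "__main__":
    for n in (1, 2, 4, 9):
        print(n, candidate_families(n))
```

A four-segment genome is compatible with three of these families, which is why segment number is used together with phylogeny rather than on its own. The nine dsRNA segments that co-purify with AbV1 match none of the ranges, in keeping with its unclassified status.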

Single-Stranded RNA Viruses

Single-stranded (ss) RNA mycoviruses fall into two groups: those with linear positive-sense (+)ssRNA genomes (currently classified into nine families: Alphaflexiviridae, Barnaviridae, Deltaflexiviridae, Endornaviridae, Gammaflexiviridae, Hypoviridae, Narnaviridae, Mitoviridae, and Botourmiaviridae) and those with linear negative-sense (−)ssRNA genomes (families Mymonaviridae and Phenuiviridae) (Table 1). Among them are mycoviruses with apparent ssRNA genomes that do not code for capsid proteins and exist more or less predominantly as dsRNA “replicative” forms in their hosts. Because of the lack of true virions, these viruses were easier to isolate and study as their replicative dsRNA forms, and some were grouped with dsRNA viruses (e.g., family Hypoviridae). However, there is ample evidence that many of these mycoviruses replicate and express their genomes like (+)ssRNA viruses and that their RdRP and helicase genes fall within the lineages of (+)ssRNA viruses.

The simplest of these viruses include members of the families Narnaviridae and Mitoviridae, whose RNA genomes code only for an RdRP and which exist as RNA/RdRP ribonucleoprotein (RNP) complexes. The corresponding dsRNA replicative forms can be isolated from infected tissues, usually in smaller molar amounts than the genomic ssRNA. Phylogenetic analysis of the RdRPs of members of the family Narnaviridae, along with those of other mycoviruses and related taxa, indicates a distant relationship between members of the family Narnaviridae and bacteriophages such as Qβ and MS2, which belong to the family Leviviridae. Viruses in the family Botourmiaviridae are phylogenetically more closely related to narnaviruses than to mitoviruses.
The family Botourmiaviridae accommodates four genera, Ourmiavirus, Botoulivirus, Magoulivirus, and Scleroulivirus, together with two newly proposed genera, “Penoulivirus” and “Rhizoulivirus.” Viruses belonging to the first genus infect plants, are associated with host diseases, and have tri-partite (+)ssRNA genomes, whereas viruses in the other genera infect filamentous fungi and have mono-partite (+)ssRNA genomes encoding only RdRPs. These fungal viruses are capsidless and are likely replicated in the cytoplasm.

Lack of true virions (a capsidless nature) is characteristic of two other groups of classified (+)ssRNA viruses, those belonging to the family Hypoviridae and the family Endornaviridae. Viruses in the family Hypoviridae are phylogenetically related to the (+)ssRNA viruses in the family Potyviridae (picorna-like virus supergroup). Comparisons of the conserved RdRP, helicase, and protease motifs of hypoviruses with those of members of the family Potyviridae suggest that viruses in the genus Bymovirus, which are plant viruses transmitted by the soil-borne plasmodiophorid Polymyxa graminis, are the closest relatives of hypoviruses.

Fungal endornaviruses belong to the genus Betaendornavirus. Exceptionally, a few non-plant viruses in the genus Alphaendornavirus, such as Phytophthora endornavirus 1 and Rhizoctonia cerealis endornavirus 1, infect fungi and oomycetes. Most other alphaendornaviruses infect plants. Although Phytophthora species and other members of the family Pythiaceae (Oomycetes) have many biological properties in common with fungi, they are currently classified, based on sequence similarities, in a protist group known as the Stramenopiles. Endornaviruses are believed to have evolved from an alpha-like virus that has lost its capsid gene. This is consistent with the recent finding that the RdRPs of PEV1 and other endornaviruses cluster with those of families and genera in the alpha-like virus superfamily of (+)ssRNA viruses.
The mushroom bacilliform virus (MBV; genus Barnavirus, family Barnaviridae) is the only mycovirus known to have bacilliform virions. MBV has a (+)ssRNA genome that contains seven ORFs, three of which encode a putative chymotrypsin-like serine protease, a putative RdRP and a CP. The polypeptides encoded by the remaining four ORFs have no homology to known proteins. Amino acid sequence comparisons of the putative protease and RdRP suggest that MBV is evolutionarily related to sobemoviruses and poleroviruses, both of which have smaller monopartite (+)ssRNA genomes and spherical particles. Although double infections of cultivated mushrooms with MBV and ABV1 occur commonly, the role of MBV in the ensuing dieback disease of cultivated mushroom remains unknown. Botrytis virus F (BVF) has flexuous rod-shaped particles comparable in size and morphology to those of (+)ssRNA plant flexiviruses such as potato virus X (a potexvirus, family Alphaflexiviridae). The amino acid sequences of the conserved helicase and RdRP regions and of the CP are most similar to those of potex-like viruses. The main difference between BVF and these plant viruses is the lack of a movement protein (MP) gene(s), which encodes a key protein(s) in cell-to-cell movement of a plant virus. Despite

similar particle morphology and amino acid sequence similarities in both the replicase and the CP, the mycovirus BVF is distinct enough to belong to a new family. The ICTV has accordingly approved a dedicated genus and family, Mycoflexivirus and Gammaflexiviridae, created to accommodate BVF (Table 1). The genome of Sclerotinia sclerotiorum debilitation-associated RNA virus (SsDRV) contains a single ORF encoding a protein with significant sequence similarity to the replicases of the “alphavirus-like” supergroup of (+)ssRNA viruses. The putative SsDRV-encoded replicase protein contains the conserved methyl transferase, helicase and RdRP domains characteristic of the replicases of plant alphaflexiviruses (family Alphaflexiviridae) and BVF. Although phylogenetic analysis of the conserved RdRP motifs verified that SsDRV is closely related to BVF and to the allexiviruses in the family Alphaflexiviridae, SsDRV is distinct enough from these viruses, mainly because it lacks CP and MP genes, to justify the creation of another genus, Sclerodarnavirus, in the family Alphaflexiviridae. Another mycovirus member of the genus Sclerodarnavirus is Botrytis virus X (BVX) (Table 1). A recent taxonomic reorganization of related viruses includes the creation of the genus Deltaflexivirus, family Deltaflexiviridae, which accommodates three mycoviruses, including Sclerotinia sclerotiorum deltaflexivirus 1 and Fusarium graminearum deltaflexivirus 1. A negative-sense (−)ssRNA virus termed Sclerotinia sclerotiorum negative-stranded RNA virus 1 (SsNSRV1) was isolated in 2014. SsNSRV1 is now classified in the species Sclerotinia sclerotimonavirus, genus Sclerotimonavirus (family Mymonaviridae, order Mononegavirales). SsNSRV1 forms filamentous nucleocapsid structures of 22 nm × 200–2000 nm and appears to employ a gene expression strategy with a sequential (“stop-start”) transcription mechanism similar to that of other mononegaviruses.
The family accommodates the genus Sclerotimonavirus and several newly proposed or renamed genera: “Botrytimonavirus,” “Lentimonavirus,” “Phyllomonavirus,” “Auricularimonavirus,” “Penicillimonavirus,” and “Plasmopamonavirus.” Multipartite (−)ssRNA viruses have also been reported from the edible shiitake mushroom (Lentinula edodes negative-strand RNA virus 2) and from an Entoleuca sp. infecting avocado (Entoleuca bunyavirus 1). Both viruses have a smaller segment with a putative ambisense coding strategy.

Unassigned ssRNA Viruses

It is worth noting that a relatively large number of mycoviruses remain unassigned, including some well-characterized ones such as Diaporthe RNA virus (DRV) and Sclerophthora macrospora viruses A and B (SmV-A and SmV-B) (Table 1). DRV is another naked RNA mycovirus that is associated with hypovirulence of its fungal host. It has two large ORFs in the same reading frame, which are most likely translated by readthrough of a UAG stop codon in the central part of the genome. The longest possible translation product has a predicted molecular mass of about 125 kDa and shows significant homology to the non-structural proteins of carmoviruses of the (+)ssRNA virus family Tombusviridae. Interestingly, transcripts derived from full-length cDNA clones were infectious when inoculated into spheroplasts, and the transfected isolates exhibited phenotypic traits similar to those of the naturally infected isolate. SmV-A, found in Sclerophthora macrospora, the pathogenic fungus responsible for downy mildew of gramineous plants, is a small icosahedral virus containing three segments of (+)ssRNA (RNAs 1, 2, and 3). RNA 1 has two ORFs, one of which contains the RdRP motifs, and RNA 2 has a single ORF encoding the CP, while RNA 3 has no ORFs, suggesting that it is a satellite-like RNA. Whereas the deduced amino acid sequence of the RdRP shows some similarity to the RdRPs of members of the family Nodaviridae, the amino acid sequence of the viral CP shows similarity to those of members of the family Tombusviridae. The capsid of SmV-A is composed of two protein species, named CP 1 and CP 2, both encoded by the same ORF in RNA 2. CP 2 is apparently derived from CP 1 via proteolytic cleavage at the N-terminus. The genome organization of SmV-A is distinct from those of other known fungal RNA viruses and suggests that SmV-A should be classified into a novel genus of (+)ssRNA viruses. SmV-B, which is also found in S.
macrospora, has small icosahedral, monopartite virions containing a single (+)ssRNA species as the genome. The viral genome has two large ORFs; ORF1 encodes a putative polyprotein containing the motifs of chymotrypsin-related serine protease, and ORF2 encodes a CP. The genome arrangement of SmV-B is similar to those of viruses belonging to the genera Sobemovirus, Barnavirus, and Polerovirus. The putative domains for the serine protease, VPg (the viral protein linked to the genome), RdRP, and the CP are located in this order from the 5′ terminus to the 3′ terminus. SmV-B is distinctive, however, in that its genome has only two ORFs, whereas the others have at least four. The genome organization of the barnavirus MBV, on the other hand, resembles that of poleroviruses. These results suggest that SmV-B, like SmV-A, should also be classified into a new genus of (+)ssRNA viruses. Several unclassified (−)ssRNA viruses distantly related to ophioviruses, which are filamentous plant viruses within the family Aspiviridae known to be transmitted by the soil-inhabiting fungus Olpidium virulentus, have been reported. Examples include Fusarium poae negative-stranded RNA virus 1 and Rhizoctonia solani negative-stranded RNA virus 1. Unclassified bunyaviruses (order Bunyavirales), such as Macrophomina phaseolina negative-stranded RNA virus 1 and Botrytis cinerea negative-stranded RNA virus 1, have also been reported. The mycoviruses belonging to the families Pseudoviridae, Metaviridae, Reoviridae (genus Mycoreovirus), Totiviridae, Partitiviridae, Chrysoviridae, Megabirnaviridae, Quadriviridae, Hypoviridae, Mitoviridae, Narnaviridae, Barnaviridae and Mymonaviridae, and the members of possible novel families such as botybirnaviruses (genus Botybirnavirus), alternaviruses, fusariviruses and phlegiviruses are discussed in more detail in separate articles in this work.

Replication and Gene Expression Strategy

Replication cycles of mycoviruses are not well studied except for a few cases including the prototype totivirus ScV-L-A. For dsRNA mycoviruses including members of the Totiviridae, Partitiviridae, Reoviridae, and Chrysoviridae, virus particles or subviral particles,

containing RdRP, are believed to play pivotal roles in RNA transcription and replication. Replication of capsidless (naked) (+)ssRNA mycoviruses represented by members of the family Hypoviridae may occur in infection-specific, lipid-membranous vesicles presumed to contain viral RdRP and RNA helicase. These vesicles are able to synthesize in vitro both plus and minus RNA at a ratio of 1:8. The narnaviral genomic RNA, encoding only a single protein (RdRP), is associated with RdRP, forming an RNP structure rather than being encapsidated. These RNP complexes seem to play a key role in viral replication. Mycoviruses, like many RNA viruses of plants and animals, employ non-canonical translational strategies for expressing their genomes. These include −1 frameshifting (totiviruses, trichomonasviruses and giardiaviruses in the Totiviridae), −2 frameshifting (trichomonasviruses in the Totiviridae), +1 frameshifting (Pseudoviridae, Metaviridae), presumed +1 frameshifting or ribosomal hopping (leishmaniaviruses in the Totiviridae), termination-coupled initiation (stop/restart strategies; most victoriviruses in the Totiviridae, Hypoviridae), and IRES (internal ribosome entry site)-mediated initiation (Totiviridae, Hypoviridae, Chrysoviridae). In addition, proteases such as a cis-acting papain-like cysteine protease (Hypoviridae) and 2A-like self-processing peptides (yadokariviruses, probably giardiaviruses in the Totiviridae and some others) are also known. Furthermore, readthrough of a termination codon is proposed for translation of the 3′-proximal ORF of DRV. A non-canonical mechanism may also be required for efficient translation of the mRNA of narnaviruses, which lack poly(A) tails. The (CAA)n repeats found in the 5′-untranslated region (UTR) of chrysoviruses are implicated in translation augmentation, as observed for the 5′-UTR sequence of a plant alpha-like virus (tobacco mosaic virus, TMV; family Virgaviridae).
This possibility needs to be tested, given the recent detection of IRES activities in the UTRs of chrysovirus genomic segments. Translation of the viral genes that are regulated by these mechanisms is considered critical for virus viability.
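As a rough illustration of the −1 frameshifting listed above, the following Python sketch translates a toy sequence with and without a one-nucleotide backward slip of the reading frame. The sequence, the slip position and the function name are hypothetical and chosen only for illustration; real frameshift sites typically depend on slippery sequences and downstream RNA structures, none of which is modeled here. The sequence is written in cDNA notation (T instead of U).

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, shift_at=None, shift=0):
    """Translate seq from position 0 until a stop codon or the end.

    If shift_at is given, the reading position is moved by `shift`
    nucleotides (e.g. -1) once, when translation reaches that position.
    """
    protein = []
    pos = 0
    while pos + 3 <= len(seq):
        if shift_at is not None and pos == shift_at:
            pos += shift        # slip the reading frame once
            shift_at = None
            continue
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":           # stop codon ends translation
            break
        protein.append(aa)
        pos += 3
    return "".join(protein)


# Hypothetical toy sequence: frame 0 reads ATG AAA TAA (stop).
toy = "ATGAAATAAGGTGAC"
print(translate(toy))                        # MK
print(translate(toy, shift_at=6, shift=-1))  # MKIR
```

In this toy case the in-frame stop codon is bypassed after the slip, so a longer fusion product is made, which is the general outcome that frameshifting achieves for overlapping ORFs.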

Recent Technical Advances in Fungal Virology

Fungal virology has been hampered by many constraints on the manipulation of fungal viruses. Many, if not all, plant and animal viruses can be inoculated into individuals or tissue cultures of their plant and animal hosts, and some assays with those hosts allow quantitative detection of biologically active viruses. Mycoviruses, in contrast, can rarely be inoculated into host fungi because of experimental limitations, which often makes the etiology of mycoviruses difficult to establish. Fungal cells have rigid cell walls that are usually difficult to digest for the preparation of cell wall-free protoplasts ready for transformation or transfection. Even when protoplasts can be made, they cannot be maintained in the way cultured animal cells can. Furthermore, fungal hosts usually have self/non-self recognition systems operating at inter- and intraspecies levels. Intraspecies barriers are based on genetically governed vegetative incompatibility/compatibility. This is often regarded as one of the host defense barriers that inhibit virus transfer between individuals. A few methods are available to overcome these barriers. The prototypic hypovirus CHV1 was the first mycovirus for which a reverse genetics system was established. Infection with different CHV1 strains can be launched either from cDNA integrated into host chromosomes or from viral RNA synthesized in vitro from the cDNA template. It is noteworthy that via bombardment of mycelia, not protoplasts, the infectious CHV1 cDNA clone can be integrated into chromosomes of fungi other than the natural host C. parasitica. cDNA-based transfection systems are now available for other RNA viruses such as DRV (a tombus-like virus), Saccharomyces cerevisiae 20S narnavirus (ScNV-20S), Saccharomyces cerevisiae 23S narnavirus (ScNV-23S), Sclerotinia sclerotiorum ourmia-like virus 4, Sclerotinia sclerotiorum hypovirus 2, and Yado-kari virus 1 (a calici-like virus).
Mycoviruses in general lack infectivity as purified virions. However, all three members of the genus Mycoreovirus, MyRV1, MyRV2 and mycoreovirus 3 (MyRV3, from an R. necatrix strain), were found to be infectious as purified particles when applied to fungal protoplasts. It is of interest in this regard that treatment of purified virions with trypsin or chymotrypsin was not required for infectivity. Protoplast fusion provides an alternative approach for introducing mycoviruses into vegetatively incompatible fungal strains that are incapable of hyphal anastomosis. Intra- and interspecies virus transfer via protoplast fusion has been reported in Aspergillus spp., Fusarium spp. and Cryphonectria spp. This method is particularly useful for mycoviruses for which infectious particles or cDNA-derived RNA are unavailable. Another recent finding is that monokaryotic strains can serve as intermediate mycovirus transmitters between different mycelial incompatibility groups within the same species of R. necatrix. To complete Koch’s postulates, virus curing is as important as virus inoculation, because the virus must be back-inoculated into an isogenic, virus-free strain with the same genetic background as the original virus-infected strain. Virus-free isolates may be obtained from germlings of asexual spores if virus transmission through spores is less than 100%. Alternatively, virus-free strains may be isolated by hyphal tip culturing, as in the case of H. mompa and R. necatrix. This technique is applicable to fungi infected with a virus that is transmitted to 100% of the asexual spores or to fungi that produce few or no spores. Protoplasting is also useful for virus curing. These curing methods can be employed in the presence of antiviral drugs such as ribavirin.

Future Perspectives

Cross-Kingdom Infection by Viruses and Viroids

The ssDNA mycovirus SsHADV1 was the first fungal virus shown to be replicated in and transmitted by a mycophagous insect, Lycoriella ingenua. In nature, this fly likely serves as a vector for lateral transfer of the virus between fungi and also retains the virus across generations through vertical (transovarial) transmission. Of note is that this DNA virus in a purified or crude

preparation can enter host mycelia extracellularly on synthetic media. This feature contributes to field-level virocontrol of rapeseed rot caused by S. sclerotiorum. Other examples of cross-kingdom infections involving mycoviruses include infection of a phytopathogenic basidiomycete, Rhizoctonia solani, by one of the best-studied plant viruses, cucumber mosaic virus (CMV, family Bromoviridae). The virus is apparently maintained stably in fungal colonies and is transmitted between plants. Thus, not only aphids (Hemiptera; transmission in a nonpersistent manner) but also fungi can act as vectors of the virus. Furthermore, a two-way facilitative interaction between a plant and a fungal alpha-like virus (TMV and CHV1) has been demonstrated, which could promote cross-kingdom virus infections in both directions. Viroids, which are widespread plant pathogens with minimal, non-protein-coding circular RNA genomes and had not been detected in organisms of other kingdoms, provide another example. Hop stunt viroid (HSVd, Pospiviroidae), iresine 1 viroid (Pospiviroidae) and avocado sunblotch viroid (Avsunviroidae), which replicate in the nucleus, nucleus and chloroplast of their plant hosts, respectively, can establish stable infection in a few phytopathogenic fungi, including C. parasitica, Valsa mali, and F. graminearum. Interestingly, HSVd induces severe growth defects in V. mali, but not in the other two fungi. These examples of cross-kingdom infection open another direction for research. Antiviral RNA silencing is the primary defense mechanism in fungi, invertebrates (such as insects, mites and nematodes) and plants, although the key players and mechanisms involved differ among these host organisms. How a single virus counters the defenses of such different hosts is an interesting problem to address. Movement proteins are essential for plant viruses to establish a systemic infection in plant hosts, but not for the systemic infection of fungal hosts by fungal viruses.
Genome alterations that may occur during long-term maintenance exclusively in hosts of a single kingdom are worth monitoring. There are a few genera that accommodate both fungal viruses and viruses infecting members of other kingdoms such as plants and insects. It will be of great interest to elaborate on their experimental host ranges.

Yeast as a (Model) Host to Study Viral Replication

S. cerevisiae has provided an excellent system for investigating virus assembly and replication of the dsRNA totivirus (ScV-L-A) and the ssRNA narnaviruses (ScNV-20S and ScNV-23S) that infect yeast. With the robust yeast genetics, a number of host factors involved in totivirus replication were identified, many of which are related to translation events (see article on the yeast ScV-L-A for details). Furthermore, yeast provides an “artificial” viral host model system for exploring host genes affecting viral replication on a genome-wide basis. Genetic screens of a collection of 4500–4800 single-gene deletion yeast strains (Yeast Knockout strain collection) have been successfully conducted to identify host factors involved in replication and recombination of two different plant (+)ssRNA viruses, the tripartite brome mosaic virus (family Bromoviridae) and the monopartite tomato bushy stunt virus (TBSV, family Tombusviridae). Each screen led to the identification of approximately 100 host genes (approximately 1.8% of all yeast genes) that affect virus replication. Interestingly, the replication of the two viruses in yeast is affected by different sets of genes. A similar approach was employed to identify genes affecting the recombination of TBSV. This type of use of the yeast system can be extended to an animal (+)ssRNA virus and a vertebrate (−)ssRNA virus. A recent screen of bacterial effectors with known targets for antiviral effects in yeast revealed pro- and antiviral host factors for TBSV replication. It may be surprising that yeast has yet to be used for viruses infecting filamentous fungi. Yeast should be able to serve as a model host for mycoviruses other than those that naturally infect it.

Virus Neo-Lifestyles

Recent mycovirus hunting has revealed unique virus lifestyles. The first such lifestyle (called a “neo-lifestyle”) is exhibited by a (+)ssRNA mycovirus termed yadokari virus 1 (YkV1), which is hosted by a dsRNA mycovirus called yado-nushi virus 1 (YnV1). YkV1 encodes RdRP but depends on YnV1 for its encapsidation. The replication and transcription of YkV1 are hypothesized to occur in the YnV1 capsid encasing YkV1 RNA (heterocapsid), so that YkV1 behaves like a dsRNA virus. Intriguingly, YkV1 enhances the accumulation of its partner YnV1. YnV1, as a full-fledged dsRNA virus, can complete its replication cycle. There appear to be YkV1-like viruses showing partnership with dsRNA viruses in other filamentous fungi. Another virus neo-lifestyle was discovered in Aspergillus fumigatus (one of the most ubiquitous saprophytic ascomycetes) infected by a tetra-segmented RNA virus, Aspergillus fumigatus polymycovirus 1 (AfuPmV1, family Polymycoviridae). The AfuPmV1 RdRP has the presumed catalytic amino acid sequence GDNQ, similar to those of (−)ssRNA viruses (most mononegaviruses), but is phylogenetically closer to (+)ssRNA viruses such as caliciviruses than to (−)ssRNA viruses. AfuPmV1 is assumed to be associated in a colloidal form with a virally encoded proline-alanine-serine rich protein (PASrp) rather than being encased by capsid protein. More surprisingly, AfuPmV1 is infectious as purified naked dsRNA as well as a dsRNA/PASrp complex, which appears to prevent viral dsRNA from being exposed to RNase. A virus closely related to AfuPmV1, termed Hadaka virus 1 (HadV1), has recently been discovered. However, HadV1 has an 11-segmented (+)ssRNA genome and lacks a PASrp-encoding segment. Unlike polymycoviruses, HadV1 appears to be accessible to nuclease in infected mycelial homogenates.

Host Defense Against Mycoviruses and Their Counter-Defense

Host defense responses against viruses have not been explored intensively. RNA silencing is regarded as one of the host defense strategies of eukaryotes against molecular parasites including viruses, and has been shown to operate as antiviral defense in a

number of fungi including important models and phytopathogenic fungi. As in other eukaryotes, deficiencies in antiviral RNA silencing in fungal hosts result in more severe symptom induction and enhanced mycovirus replication. There are genetic elements involved in RNA silencing that are widely conserved from fungi to vertebrates. The functional roles of these factors in RNA silencing as an antiviral reaction appear to differ among fungi. Two genes, one Dicer gene, dicer-like 2 (dcl2), and one Argonaute gene, argonaute-like 2 (agl2), are required for antiviral responses in C. parasitica, while more players are involved in other fungi such as F. graminearum and S. sclerotiorum, where at least two Dicer genes (dcl1 and dcl2) and Argonaute genes (ago1 and/or ago2) are involved in the defense redundantly. The key RNA silencing genes of C. parasitica, agl2 and dcl2, are induced more than 10-fold on infection by some mycoviruses. This transcriptional upregulation is mediated by the SAGA complex, a general transcriptional co-activator, and requires DCL2, which thus plays a positive feedback role. DCL2 is necessary for the SAGA-mediated upregulation of not only agl2 and dcl2 but also many other fungal host genes. Some of the induced host genes contribute to symptom mitigation. The elevated state of antiviral RNA silencing established by a pre-infecting virus can block horizontal transmission via hyphal fusion and replication of a second virus that is susceptible to RNA silencing. The protein p29 encoded by hypovirus CHV1 has been shown to be a suppressor of transgene-targeting RNA silencing that functions in both plant and fungal cells. In fungal cells, CHV1 p29 suppresses the above-mentioned transcriptional upregulation of the key antiviral silencing genes (dcl2 and agl2). Unraveling the mechanism by which CHV1 p29, or other mycovirus-encoded RNA silencing suppressors, may block the RNA silencing pathway will be a very interesting challenge.
Other layers of host defense include vegetative incompatibility, which impairs lateral virus transmission between colonies of a single species. This non-self allorecognition is genetically governed, in the case of C. parasitica by six diallelic loci (vic), and therefore serves as a barrier to virocontrol. It is worth noting that super donor strains with deletions of multiple vic genes allow efficient mycovirus transmission at the field level. Non-self responses include programmed cell death, in which reactive oxygen species play a pivotal role. Some mycoviruses are able to suppress vegetative incompatibility reactions in fungi.
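The arithmetic behind this barrier can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that each of the six vic loci has two alleles and that two strains are compatible only when they carry the same allele at every locus; the allele labels and function names are hypothetical, and real populations do not carry all genotypes at equal frequency.

```python
from fractions import Fraction
from itertools import product


def vic_genotypes(n_loci=6, alleles=(1, 2)):
    """Enumerate every multilocus genotype for n_loci diallelic loci."""
    return list(product(alleles, repeat=n_loci))


def compatible(a, b):
    """Assumed rule: compatible only if alleles match at every locus."""
    return all(x == y for x, y in zip(a, b))


def compatible_fraction(genotypes):
    """Fraction of ordered strain pairs that are compatible,
    assuming all genotypes are equally frequent."""
    pairs = [(a, b) for a in genotypes for b in genotypes]
    hits = sum(compatible(a, b) for a, b in pairs)
    return Fraction(hits, len(pairs))


genotypes = vic_genotypes()
print(len(genotypes))                  # 64
print(compatible_fraction(genotypes))  # 1/64
```

Under these simplifying assumptions, six diallelic loci give 2^6 = 64 possible genotypes, and only one random pairing in 64 would be fully compatible, which gives a sense of why strains with multiple vic genes deleted are attractive as virus donors.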

Role of Mycoviruses in Plant-Fungal Mutualistic Associations

The question of whether mycoviruses are involved in the mutualistic interactions between endophytic fungi and their host plants is of considerable interest because of the attractive beneficial features of these associations and because of the common occurrence of mycoviruses in all major groups of fungi. This question was recently addressed in an intriguing report that presented evidence for a dsRNA mycovirus being involved in the mutualistic interaction between a fungal endophyte (Curvularia protuberata, an ascomycete) and a tropical panic grass (Dichanthelium lanuginosum). This association allows both organisms to grow at high soil temperatures. The virus in question, which was designated Curvularia thermal tolerance virus (CThTV, the proposed family “Curvulaviridae”), has an unusual genome organization with an unknown genome expression strategy. CThTV apparently has a bipartite dsRNA genome (RNA1 and RNA2), but there is no evidence that these RNAs are packaged in the 27-nm isometric particles isolated from the fungal host. Although many questions pertinent to the fungal endophyte, CThTV, and the veracity of the evidence for viral etiology remain unanswered, this report will undoubtedly stimulate the search for mycoviruses in other mutualistic fungal endophytes. In this regard, it is noteworthy that the well-characterized mutualistic ascomycete endophyte, Epichloë festucae, was found to harbor a totivirus, but no phenotypes were associated with virus infection. S. sclerotiorum is a destructive ascomycetous pathogen of many dicotyledonous crops. However, the single-stranded DNA virus SsHADV1, known to induce hypovirulence in the host fungus S. sclerotiorum, causes a switch of the lifestyle of the fungal phytopathogen from a pathogenic to an endophytic status. Interestingly, treatments of rapeseed by SsHADV1-infected S.
sclerotiorum promote plant growth compared to untreated plants, and have protective effects against not only virulent, virus-free strains of S. sclerotiorum but also its closely related species, Botrytis cinerea. These features of SsHADV1, together with its horizontal transmissibility to vegetatively incompatible fungal strains, make the virus a potentially robust virocontrol agent (see below). Surprisingly, S. sclerotiorum strain DT-8 infected by SsHADV1 grows endophytically in wheat (a monocotyledonous crop), previously believed to be a non-host of the fungus, and increases its grain yield. Furthermore, the fungal strain mitigates the symptoms of Fusarium head blight (FHB) caused by Fusarium graminearum under both laboratory and field conditions, likely through enhanced defense responses. Similar positive effects can be observed in other monocotyledonous cereal crops, such as rice and barley, against different ascomycetous pathogenic fungi such as Magnaporthe oryzae. The specific contribution of SsHADV1 to this phenomenon is obscure, because these effects appear to be exerted whether or not S. sclerotiorum is infected by SsHADV1.

Mycoviruses as Biocontrol Agents and as Tools for Fundamental Studies

The hypovirulence phenotype in the chestnut blight fungus (C. parasitica) is an excellent and well-documented example of a mycovirus-induced phenotype that is currently being exploited for biological control. The debilitating disease of the Victoria blight fungus (H. victoriae) and the disease phenotype of the Dutch elm disease fungus (Ophiostoma novo-ulmi, an ascomycete) are examples of pathogenic effects of dsRNA mycoviruses. The ssDNA virus SsHADV1 has also attracted great attention because of its potential as a virocontrol (biocontrol) agent and its abilities to enter host fungal cells and to be replicated in and transmitted by the mycophagous fly. An understanding of the molecular basis of disease in these fungal-mycovirus systems would provide excellent opportunities for the development of novel biocontrol strategies against plant pathogenic fungi. Mycoviruses also continue to serve as versatile tools to study

the virulence of host fungi, as recognized in studies with the hypovirus/Cryphonectria system, in which substantial advances in our understanding of the molecular basis of hypovirulence have been made. It should be noted that some mycoviruses enhance virulence (hypervirulence), sporulation, and mycelial growth. For example, virus-induced hypervirulence was observed in the Nectria radicicola mycovirus system. The host fungus causes root rot in ginseng (Panax quinquefolius). Another example is the virus (a chrysovirus)-enhanced production of the host-specific toxin (AK-toxin) of Alternaria alternata (an ascomycete) that causes black spot in Japanese pear.


Relevant Websites

https://talk.ictvonline.org/files/master-species-lists/m/msl Master Species Lists.

Cross-Kingdom Virus Infection

Liying Sun, Northwest A&F University, Yangling, China
Hideki Kondo, Okayama University, Kurashiki, Japan
Ida Bagus Andika, Qingdao Agricultural University, Qingdao, China

© 2021 Elsevier Ltd. All rights reserved.

Introduction

Viruses, like other obligate parasites, depend on host organisms for their multiplication and survival. As the hosts do not live forever, the continued existence of viruses in the population is maintained by transmission between individual organisms. The extent to which viruses spread is largely influenced by their host range, transmission pathways, and environmental conditions. The compatibility of virus-host cell interactions, host defenses, and the ability of the viruses to counter or evade defense responses are critical in determining viral host range. Viruses can infect new hosts through genetic mutations that enhance their compatibility with host cellular components or enable them to evade host defenses. Furthermore, the mode of virus transmission is also a major factor controlling virus spread. As such, the mobility of biological vectors, human activities, and changes in environmental conditions may increase the possibility of contact between viruses and a potential novel host. Viruses are known to spread between species; however, virus transfer across kingdoms was previously thought to be less common, and naturally occurring cross-kingdom infection by a virus species is rarely documented. There are several classic examples of virus groups that infect hosts belonging to different taxonomical kingdoms, but they are mostly limited to some plant virus groups that replicate in their insect vectors. Nevertheless, certain viruses appear to have an inherent ability to infect different domains of life, as demonstrated by several reports of viruses artificially introduced to new hosts belonging to different biological kingdoms in the laboratory, particularly insect, plant, and fungal viruses. Moreover, several closely related viruses have been found to separately infect organisms of different kingdoms, such as some viruses infecting plants and fungi. This suggests horizontal virus transfer across kingdoms at some time in the past.
Thus cross-kingdom viral infections represent an event that has happened before and might happen again. The distinct biological characteristics of different kingdoms create natural barriers for virus horizontal transfer. Even so, agricultural practices and other human activities have caused drastic changes to the biological and ecological systems of the world. Such a situation might create conditions that conceivably could facilitate cross-kingdom viral infection in the environment. Such a scenario raises concerns for the possible emergence of new animal and plant diseases in the future. This article summarizes cross-kingdom virus infection of plant, insect, and fungal viruses. The topics cover experimental, ecological and evolutionary aspects of virology.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21320-4

Viruses That Replicate in Both Plants and Insects

There are some plant virus groups that are known to be transmitted in a persistent propagative manner by their arthropod vectors, and thus they replicate both in plant hosts and insect vectors (Fig. 1). They are predominantly negative-sense single-stranded RNA viruses ((−)ssRNA viruses; plant rhabdoviruses, tospoviruses, and tenuiviruses) and segmented double-stranded RNA viruses (dsRNA viruses; plant reoviruses). There is also a small group of non-segmented positive-sense single-stranded RNA viruses ((+)ssRNA viruses; marafiviruses). Some studies showed that infection by certain plant viruses altered the performance of their insect vectors, such as longevity, fecundity, and feeding behavior. Classical plant rhabdoviruses (family Rhabdoviridae, order Mononegavirales, unsegmented (−)ssRNA viruses) belonging to the genera Cytorhabdovirus (type species Lettuce necrotic yellows cytorhabdovirus) and Nucleorhabdovirus (type species Potato yellow dwarf nucleorhabdovirus) infect diverse plants including cereals, maize and several vegetables. They are similarly transmitted by aphids (family Aphididae, order Hemiptera) such as the cabbage aphid (Brevicoryne brassicae) and the sowthistle aphid (Hyperomyzus lactucae), leafhoppers (family Cicadellidae, Hemiptera) such as the green rice leafhopper (Nephotettix spp.) and the black-faced leafhopper (Graminella nigrifrons), and planthoppers (family Delphacidae, Hemiptera) such as the small brown planthopper (Laodelphax striatellus) and the corn planthopper (Peregrinus maidis). Members of the newly established genus Dichorhavirus (type species Orchid fleck dichorhavirus, two-segmented (−)ssRNA viruses) are transmitted by false spider mites (Brevipalpus spp., Tenuipalpidae) and are also likely to replicate in their vectors.

Genus Orthotospovirus (formerly Tospovirus, type species Tomato spotted wilt orthotospovirus, segmented (−)ssRNA viruses) in the family Tospoviridae (order Bunyavirales) contains several members that infect a wide range of plants. They are exclusively transmitted by thrips (family Thripidae, order Thysanoptera) such as the western flower thrips (Frankliniella occidentalis), the onion thrips (Thrips tabaci), and the melon thrips (T. palmi). The plant hosts of the members of genus Tenuivirus (type species Rice stripe tenuivirus, segmented (−)ssRNA viruses) in the family Phenuiviridae (order Bunyavirales) are limited to the family Poaceae such as rice and maize, and these viruses are propagatively transmitted by planthoppers such as L. striatellus, P. maidis, and the brown planthopper, Nilaparvata lugens. Plant-infecting reoviruses (segmented dsRNA viruses) are classified into three genera, Phytoreovirus (type species Wound tumor virus), Fijivirus (type species Fiji disease virus), and Oryzavirus (type species Rice ragged stunt virus) in the family Reoviridae, and they mainly infect the plant family Poaceae. Fijiviruses and oryzaviruses (subfamily Spinareovirinae) are propagatively transmitted by planthoppers such as L. striatellus, N. lugens, and the white-backed planthopper (Sogatella furcifera). Phytoreoviruses (subfamily Sedoreovirinae, lacking the large surface projection of virus particles) are propagatively transmitted by leafhoppers, such as Nephotettix spp., Agallia spp., and Recilia dorsalis. Members of the genus Marafivirus (type species Maize rayado fino virus, plant alpha-like viruses) in the family Tymoviridae (order Tymovirales) infect diverse host plants including maize, citrus and oat. The corn leafhopper (Dalbulus maidis) is known as an efficient vector for maize rayado fino virus. Interestingly, a recent report showed that tobacco ringspot virus (family Secoviridae, order Picornavirales), a plant picorna-like virus with a two-segmented (+)ssRNA genome, could replicate in honeybees, Apis mellifera (family Apidae, Hymenoptera), and spread throughout the entire body of the insect.

Orthotospoviruses and classical plant rhabdoviruses have enveloped spherical (~80–110 nm in diameter) and bacilliform (~300–350 nm long and ~75–100 nm in diameter) virions, respectively. Viral membranes are likely required for the entry of viruses into insect cells but not into plant cells. Viral membrane glycoproteins of plant nucleo- and cytorhabdoviruses (G protein) and orthotospoviruses (GN and GC proteins) are embedded in the outer-membrane envelope and they are believed to be involved in virus entry, interacting with receptor proteins on the insect cell surface. Similarly, P2 and VP9 encoded by phytoreoviruses and oryzaviruses, respectively, are components of the outer capsid protein that bulge from the surface of the double-shelled virion (~60–100 nm in diameter) and are involved in infection of insects. In addition, these plant- and insect-infecting viruses encode cell-to-cell movement proteins that are specifically required for spread of viruses through host plants (Fig. 1). Notably, an insect small RNA virus, flock house virus (FHV, family Nodaviridae, a small non-enveloped spherical virion with a two-segmented (+)ssRNA genome), is able to replicate in mechanically inoculated leaves of several plant species; however, systemic spread of FHV throughout the plant requires expression of a plant virus-encoded cell-to-cell movement protein. Therefore, evidence suggests that these propagatively insect-transmitted viruses are characterized by biological properties required for successful infection in both plant and insect hosts.

Fig. 1 Virus genera with members that are known to replicate in both plants and insect vectors. The virion structures of each virus genus are illustrated. Viral proteins required for cell-to-cell movement in the plant or vector transmission/infection to the insect vector are presented below the arrow. *Forms tubules facilitating viral spread among insect vector cells. **Possible glycoprotein whose biological function is unknown.

Closely Related Viruses Independently Infect Plants and Fungi

There has been a surge of research interest in characterizing and studying fungal viruses, particularly those infecting plant pathogenic fungi. A growing number of newly identified fungal viruses contribute greatly to our understanding of virus diversity and evolution. For example, a number of fungal viruses were found to have close similarities with plant viruses, suggesting horizontal transfer of these viruses between plants and fungi in the not so distant past. One of the most striking observations is that some plant and fungal partitiviruses (family Partitiviridae, picorna-like viruses with a segmented dsRNA genome and non-enveloped spherical virion, 25–43 nm in diameter) show close sequence similarities. Phylogenetic analysis shows that plant and fungal partitiviruses are placed together in some clades of alpha- and betapartitiviruses, in addition to clades containing either plant or fungal partitiviruses (Fig. 2(A)). Therefore, some partitiviruses from different kingdoms, plants and fungi, are more similar to each other than to those from within a single kingdom. Additionally, similarity has been observed between endornaviruses (family Endornaviridae, capsid-less alpha-like (+)ssRNA viruses) infecting plants and those infecting fungi or oomycetes (including Phytophthora spp.). The members of plant partiti- and endornaviruses can establish persistent infections in host cells but lack a typical cell-to-cell movement protein. It is proposed that viruses that cause persistent infections of plants, such as partitiviruses, could have originated from fungal viruses that infect plant-associated fungi, including endophytes.

Another finding is that a fungal alpha-like virus, Botrytis virus X (genus Botrexvirus, family Alphaflexiviridae, Tymovirales, a filamentous virion with non-segmented (+)ssRNA genome), isolated from the gray mold fungus Botrytis cinerea (family Sclerotiniaceae), a necrotrophic plant pathogenic ascomycete, shows close sequence similarity to allexiviruses (genus Allexivirus, Alphaflexiviridae, plant alpha-like viruses with a filamentous virion) that encode a gene block for a cell-to-cell movement protein and mainly infect the plant genus Allium (garlic and onion; Fig. 2(B)). As B. cinerea commonly infects horticultural plants including Allium spp., it is speculated that B. cinerea might have acquired a plant allexivirus or a progenitor during an infection.

Artificially Established Viral Cross Infections Between Plants and Fungi

Pioneering work by Paul Ahlquist established a replication system for a plant alpha-like virus, brome mosaic virus (BMV, family Bromoviridae, a tripartite (+)ssRNA genome), in the yeast Saccharomyces cerevisiae. This has since been achieved for other plant viruses, including tomato bushy stunt (TBSV), Cymbidium ringspot (CymRV), and carnation Italian ringspot viruses (CIRV) (family Tombusviridae, with a non-segmented (+)ssRNA genome; Table 1). Replication of these viruses takes place on intracellular membranes derived from organelles in both plant and yeast cells; BMV uses endoplasmic reticulum (ER) membranes, TBSV and CymRV use peroxisomal membranes, and CIRV uses mitochondrial membranes. In addition, in the absence of the peroxisomal membrane, TBSV and CymRV can utilize ER membranes for their replication in yeast. Notably, some animal (+)ssRNA viruses (FHV and Nodamura virus, a nodavirus) and DNA viruses (human and bovine papillomaviruses, family Papillomaviridae, and adeno-associated virus, family Parvoviridae, circular dsDNA and ssDNA genome, respectively) were also shown to replicate in yeast cells. More recently, some reports show replication of two plant alpha-like viruses, tobacco mosaic virus (TMV, family Virgaviridae, non-segmented (+)ssRNA genome) and cucumber mosaic virus (CMV, Bromoviridae, tripartite (+)ssRNA genome; see also below), in plant pathogenic filamentous fungi (Table 1). However, it was also shown that CMV was not able to stably infect two other plant pathogenic filamentous fungi, Cryphonectria parasitica (family Cryphonectriaceae, Diaporthales) and Fusarium graminearum (family Nectriaceae), causal agents of chestnut blight and fusarium head blight diseases, respectively. This demonstrated that the compatibility of CMV infection is specific to the fungal host.

TMV virions (non-enveloped rod-shaped particles, 300 nm long and 18 nm in diameter) enter the fungal hyphae and/or conidia during culture by an unknown mechanism, and virus replication does not alter the pathogenicity and growth of the host fungus. Conversely, two fungal viruses, Penicillium aurantiogriseum partiti-like virus 1 and Penicillium aurantiogriseum totivirus 1, from a marine plant endophyte (Penicillium aurantiogriseum, family Trichocomaceae) can replicate in plant cells (Table 1). These reports demonstrate the compatibility between certain plant viruses and fungal cells, as well as between certain fungal viruses and plant cells.

Fig. 2 Phylogenetic relationships of selected plant- or fungus-infecting viruses belonging to families Partitiviridae (A) and Tymoviridae (B). Maximum likelihood phylogenetic trees were constructed using PhyML 3.0 based on the multiple amino acid sequence alignment of the replicase protein or its candidate sequences. The LG + I + G + F model was selected as the best-fit model for the tree construction. Clades for beta-, gamma-, deltapartitiviruses, betaflexi- and tymoviruses are shown by collapsed branches with triangles.

Table 1 The list of viruses used for artificial cross-kingdom infections between plants and fungi

Virus | Original host | New host | Method of virus introduction
Brome mosaic virus (genus Bromovirus) | Plant (Bromus inermis and other grasses) | Saccharomyces cerevisiae | Cell/spheroplast transformation
Tomato bushy stunt virus (genus Tombusvirus) | Plant (mainly vegetable crops and ornamental plants) | S. cerevisiae | Cell/spheroplast transformation
Carnation Italian ringspot virus (genus Tombusvirus) | Plant (some ornamental plants and fruit trees) | S. cerevisiae | Cell/spheroplast transformation
Cymbidium ringspot virus (genus Tombusvirus) | Plant (Cymbidium and clover) | S. cerevisiae | Cell/spheroplast transformation
Mung bean yellow mosaic India virus (genus Begomovirus, family Geminiviridae) | Plant (legume) | S. cerevisiae | Cell/spheroplast transformation
Tobacco mosaic virus (genus Tobamovirus) | Plant (a very wide host range) | Colletotrichum acutatum, C. clavatum, C. theobromicola | Virus particles added to liquid medium
Cucumber mosaic virus (genus Cucumovirus) | Plant (a very wide host range) | Rhizoctonia solani, Valsa mali | Spheroplast transformation
Penicillium aurantiogriseum totivirus 1 (unclassified toti-like virus) | Penicillium aurantiogriseum | Nicotiana benthamiana, N. benthamiana expressing a silencing suppressor (HC-Pro), N. tabacum (BY2 culture cells) | Protoplast transformation
Penicillium aurantiogriseum partiti-like virus (unclassified partiti-like virus) | Penicillium aurantiogriseum | N. benthamiana expressing a silencing suppressor (HC-Pro), N. tabacum (BY2 culture cells) | Protoplast transformation

Evidence of the Natural Transmission of a Plant Virus to a Fungus

The first report of the natural infection of a fungus with a plant virus was the discovery of CMV (plant alpha-like virus) infecting a field strain of the plant pathogenic basidiomycete fungus, Rhizoctonia solani (family Ceratobasidiaceae). During an attempt to identify fungal viruses infecting several strains of R. solani collected from potato fields, one fungal strain was found to be coinfected with CMV, in addition to other fungal viruses. Further analysis confirmed that CMV infection was stable in this fungal strain and the virus was horizontally transmitted through hyphal anastomosis, but not vertically through basidiospores. In addition, CMV was also shown to replicate in an ascomycete fungus, the apple canker fungus (Valsa mali, family Valsaceae, Diaporthales). Therefore, many phytopathogenic fungi may support stable infection of this virus. Under laboratory conditions, it was demonstrated that R. solani could acquire CMV from virus-infected potato and Nicotiana benthamiana, as well as transmit the virus to uninfected plants. This supported the notion that CMV was transferred from plant to R. solani during infection in the natural environment. Interestingly, CMV infection has no effect on R. solani growth and morphology in artificial culture media, but infection of plants by the R. solani strain carrying CMV induces more severe symptoms than the virus-free strain. Thus, acquiring CMV seems to make R. solani more aggressive, which has implications for disease control. It is still not clear how CMV transfers between plants and fungi during infection. As a necrotrophic pathogen, R. solani decomposes the host cells for uptake of nutrients. This possibly allows the transfer of relatively big molecules, such as CMV particles (~28 nm in diameter), from the plant cell into fungal cells. Because CMV is infectious in the form of RNA, it is also possible that CMV is transmitted in the form of naked RNA or ribonucleoprotein.

A Fungal DNA Virus That Replicates in an Insect and Uses It as a Vector

Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1 (SsHADV-1), the first DNA virus discovered in fungi, was isolated from a hypovirulent strain of Sclerotinia sclerotiorum (family Sclerotiniaceae), an ascomycete fungal plant pathogen known as the white mold fungus. SsHADV-1 belongs to the genus Gemycircularvirus (family Genomoviridae, a circular ssDNA genome and non-enveloped spherical particles about 20 nm in diameter) whose members are found in various environments, as well as in plant- and animal-associated samples. It was observed that the mushroom fly Lycoriella ingenua (family Sciaridae, order Diptera), an insect that feeds on fungi, could acquire SsHADV-1 when the larvae were fed on virus-infected fungal colonies. SsHADV-1 replicates in L. ingenua cells and the virus is transmitted vertically to L. ingenua offspring. Under laboratory conditions, L. ingenua could transmit SsHADV-1 to virus-free S. sclerotiorum and SsHADV-1 was detected in adult L. ingenua collected from the field, suggesting that SsHADV-1 could use L. ingenua as a vector.


Fig. 3 A diagram showing known examples of natural transmission of viruses among animal, plant and fungal kingdoms. CMV, cucumber mosaic virus; SsHADV-1, Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1.

Given that SsHADV-1 was also detected in two dragonfly species (Erythemis simplicicollis and Pantala hymenaea, family Libellulidae) and one damselfly species (Ischnura ramburii, family Coenagrionidae; order Odonata), and SsHADV-1-related viruses were found in other insects, it is possible that other insects in addition to L. ingenua are natural hosts of SsHADV-1. Many insects feed on fungi or are infected with fungi; therefore, horizontal transfer of viruses between insect and fungus may not be specific, but could occur frequently in nature.

The Contribution of Cross-Kingdom Viral Infection to Virus Evolution

The current understanding of RNA virus lineages points to pervasive horizontal virus transfer between diverse hosts as critical points in RNA virus evolution. Instances where RNA virus groups infect diverse hosts, including organisms of different taxonomic kingdoms, include the ourmiaviruses, plant-infecting tripartite (+)ssRNA viruses (genus Ourmiavirus, family Botourmiaviridae, non-enveloped bacilliform virions with discrete lengths from 30 to 62 nm), which are related to narnaviruses (genus Narnavirus, family Narnaviridae, capsid-less (+)ssRNA viruses) infecting fungi (budding yeast and filamentous ascomycetes) and an oomycete (Phytophthora infestans, family Peronosporaceae), and to ourmia-like viruses recently discovered in invertebrates and fungi. Thus, ourmiaviruses seem to have evolved from horizontal transfer of narnaviruses or ourmia-like viruses to plants together with the acquisition of genes responsible for viral cell-to-cell movement and capsid formation. Secoviridae (non-segmented or two-segmented (+)ssRNA viruses with non-enveloped spherical particles 25–30 nm in diameter), a large virus family limited to plant hosts, belongs to the clade of picorna-like viruses (non-segmented (+)ssRNA genome) infecting a variety of invertebrates, mainly arthropods, suggesting that ancestral secoviruses evolved in their host plants after horizontal transfer from arthropods. Furthermore, the genome segmentation of their ancestor may have occurred simultaneously with this event, except for the members of genera Waikavirus and Sequivirus. Likewise, plant rhabdoviruses (family Rhabdoviridae, (−)ssRNA viruses) are most likely descended from insect rhabdoviruses because they tend to group according to their insect vectors rather than their host plants. Interestingly, the segmented-genome type of rhabdoviruses (dichorhaviruses), which is similar to the segmented genome feature of many plant viruses, is only known to exist in plants, suggesting evolution of rhabdoviruses in their host plants after horizontal transfer. Recently established viral phylogenies using expansive virus metagenomic data showed unprecedented clustering of plant and invertebrate virus groups, further implying pervasive horizontal virus transfers from invertebrates to plants in the distant past.

Concluding Remarks

Increasing evidence supports the view that some viruses can infect hosts from different kingdoms, such as insects, plants and fungi, and that this may be rather common in nature. Viral phylogenies suggest that virus transmission across kingdom barriers has occurred in the past. In addition, artificial virus inoculation methods have demonstrated that certain viruses are able to replicate in both plants and fungi. Most importantly, examples of natural virus transmissions between insects and plants or fungi, as well as between plants and fungi, have been documented (Fig. 3). Further experimental and ecological studies on cross-kingdom virus infections would deepen our understanding of the origin, host range, dissemination and reservoir of viruses in the environment.


Further Reading

Andika, I.B., Wei, S., Cao, C., et al., 2017. Phytopathogenic fungus hosts a plant virus: a naturally occurring cross-kingdom viral infection. Proceedings of the National Academy of Sciences of the United States of America 114, 12267–12272.
Balique, F., Lecoq, H., Raoult, D., Colson, P., 2015. Can plant viruses cross the kingdom border and be pathogenic to humans? Viruses 7, 2074–2098.
Dasgupta, R., Garcia, B.H., Goodman, R.M., 2001. Systemic spread of an RNA insect virus in plants expressing plant viral movement protein genes. Proceedings of the National Academy of Sciences of the United States of America 98, 4910–4915.
Dolja, V.V., Koonin, E.V., 2018. Metagenomics reshapes the concepts of RNA virus evolution by revealing extensive horizontal virus transfer. Virus Research 244, 36–52.
Hogenhout, S.A., Ammar, E.D., Whitfield, A.E., Redinbaugh, M.G., 2008. Insect vector interactions with persistently transmitted viruses. Annual Review of Phytopathology 46, 327–359.
Liu, S., Xie, J., Cheng, J., et al., 2016. Fungal DNA virus infects a mycophagous insect and utilizes it as a transmission vector. Proceedings of the National Academy of Sciences of the United States of America 113, 12803–12808.
Mascia, T., Nigro, F., Abdallah, A., et al., 2014. Gene silencing and gene expression in phytopathogenic fungi using a plant virus vector. Proceedings of the National Academy of Sciences of the United States of America 111, 4291–4296.
Nagy, P.D., 2008. Yeast as a model host to explore plant virus-host interactions. Annual Review of Phytopathology 46, 217–242.
Nerva, L., Varese, G.C., Falk, B.W., Turina, M., 2017. Mycoviruses of an endophytic fungus can replicate in plant cells: Evolutionary implications. Scientific Reports 7, 1908.
Roossinck, M.J., 2019. Evolutionary and ecological links between plant and fungal viruses. New Phytologist 221, 86–92. doi:10.1111/nph.15364.
Shi, M., Lin, X.D., Tian, J.H., et al., 2016. Redefining the invertebrate RNA virosphere. Nature 540, 539–543.
Zhao, R.Y., 2017. Yeast for virus research. Microbial Cell 4, 311–330.

Diversity of Mycoviruses in Aspergilli
Ioly Kotta-Loizou, Imperial College London, London, United Kingdom
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Anamorph The asexual reproductive stage of a fungus.
Anastomosis Fusion between branches of hyphae.
Ascocarp Fruiting body of a fungus belonging to the phylum Ascomycota, containing asci filled with ascospores.
Ascospore Sexual spore of a fungus belonging to the phylum Ascomycota, produced inside a sac called an ascus.
Conidiospore Asexual spore of a fungus.
Heterokaryon A multinucleate fungal cell that contains genetically different nuclei.
Heterothallic fungus Fungus not possessing, within a single individual, the resources to reproduce sexually but requiring two distinct individuals of opposite mating types.
Homothallic fungus Fungus possessing, within a single organism, the resources to reproduce sexually.
Teleomorph The sexual reproductive stage of a fungus.

Introduction

Aspergillus is a genus of filamentous fungi belonging to the order Eurotiales, class Eurotiomycetes, phylum Ascomycota. The genus Aspergillus includes the anamorphs (asexual stage) of more than 250 species, although some of them are known to also have a teleomorph (sexual stage) belonging to a different genus. For example, Emericella nidulans is the teleomorph of A. nidulans. Anamorphic fungi reproduce by producing conidiospores (asexual spores), while teleomorphs belonging to the phylum Ascomycota develop ascocarps, fruiting bodies that produce ascospores (sexual spores). Aspergilli have a worldwide distribution, are ubiquitous and are able to occupy numerous niches in terms of temperature, osmotic stress, and nutrient availability, with most of them being saprophytic. Individual Aspergillus species are medically, ecologically, and economically important, and Aspergillus is one of the best-studied fungi in the laboratory; for example, A. nidulans was used as a model organism to study fungal genetics and the parasexual cycle in the 1950s and it was also one of the first fungi to have its genome fully sequenced, in 2005. To date, twenty taxa, comprising nineteen families (Alphaflexiviridae, Amalgaviridae, Barnaviridae, Botourmiaviridae, Chrysoviridae, Deltaflexiviridae, Endornaviridae, Gammaflexiviridae, Genomoviridae, Hypoviridae, Megabirnaviridae, Metaviridae, Mymonaviridae, Narnaviridae, Partitiviridae, Pseudoviridae, Reoviridae, Totiviridae, Quadriviridae) and one genus that does not belong to a family (Botybirnavirus), are officially recognized by the International Committee on Taxonomy of Viruses (ICTV) to accommodate, exclusively or not, mycoviruses. With the exception of Genomoviridae, mycoviruses have linear double-stranded (ds) or single-stranded (ss) RNA genomes. Out of these twenty recognized taxa, only members of the families Chrysoviridae, Partitiviridae, Narnaviridae, and Totiviridae have been discovered in Aspergilli to date.
Other mycoviruses identified that could not be accommodated to the already existing families were often provisionally assigned to proposed families, e.g., Polymycoviridae. In general, RNA mycoviruses do not have any known extracellular phase in their replication cycle; they can be transmitted vertically, from parent to offspring, via conidiospores and ascospores, and horizontally, from one fungal isolate to another, via anastomosis/hyphal fusion. Mycovirus infections are persistent and, once infected, it is very difficult for fungal hosts to eliminate mycoviruses. The majority of mycovirus infections are also asymptomatic, latent, or cryptic.

doi:10.1016/B978-0-12-809633-8.21321-6

Prevalence

In Aspergilli, the presence of mycoviruses was initially reported in 1970, when spherical virus-like particles (VLPs) containing dsRNA were discovered in A. foetidus and A. niger. Notably, the mycoviruses infecting A. foetidus strain IMI 41871, known as the A. foetidus mycovirus complex, are among the few fully described at the molecular level, although their characterization was completed only recently, 45 years after their initial discovery. Since then, investigations have revealed the presence of dsRNA elements, often accompanied by VLPs, in Aspergillus sections Nigri, Circumdati, Flavi, and Fumigati. In the cases where large populations of fungi were examined, the percentage of infected isolates ranged from less than 10% to almost 50%, often dependent on the geographic origin of isolates from the same species. The number of dsRNA elements present in each isolate varied from 1 up to 9, suggesting that in at least some of the cases more than one mycovirus was present. The dsRNA elements ranged from less than 0.5 kbp to 10 kbp in size; however, in most instances, the dsRNA elements were not characterized at the molecular level and therefore no conclusions can be drawn regarding the evolutionary relationships among similarly sized segments. Finally, the diameter of the VLPs ranged from 23 to 40 nm.

In Aspergillus section Nigri or ‘black Aspergilli’, apart from A. foetidus and A. niger, the presence of mycoviruses has been reported in A. awamori, A. carbonarius, A. heteromorphus, A. japonicus, and A. tubingensis. The prevalence of mycoviruses appears to be particularly high in Indonesian isolates. In Aspergillus section Flavi, A. flavus, A. leporis, A. nomius, A. parasiticus and A. tamarii are known to harbor mycoviruses. In Aspergillus section Circumdati, mycoviruses have been found in A. ochraceus and A. petrakii; Aspergillus ochraceus virus (AoV) from strain FA0611 has been fully sequenced. Finally, mycoviruses have been reported in both clinical and environmental isolates of A. fumigatus. Seven of them, Aspergillus fumigatus chrysovirus (AfuCV), Aspergillus fumigatus partitivirus 1 (AfuPV-1), Aspergillus fumigatus partitivirus 2 (AfuPV-2), Aspergillus fumigatus narnavirus 1 (AfuNV-1), Aspergillus fumigatus narnavirus 2 (AfuNV-2), Aspergillus fumigatus mitovirus 1 (AfuMV-1), and Aspergillus fumigatus tetramycovirus 1 (AfuTmV-1), have been characterized at the molecular level.

Despite the relatively high prevalence of mycoviruses in a number of anamorphic Aspergilli, their reported presence in the teleomorphs of the genus Aspergillus is much rarer. No dsRNA elements were discovered in a large E. nidulans (teleomorph of A. nidulans) panel derived from worldwide locations. Nevertheless, sporadic mycovirus infections have been reported for Neosartorya hiratsukae, N. quadricincta (teleomorphs of Aspergillus section Fumigati) and Petromyces alliaceus (teleomorph of A. alliaceus, Aspergillus section Flavi). Additionally, it should be noted that the quantity and even the presence of mycoviruses at detectable levels often depend on the composition of the nutrients in the growth medium and the developmental stage of the fungus. Regarding the latter, there have been a range of reports: in some cases virus replication correlates with the abundance of nutrients in the exponential phase of fungal growth, while in others the virus titer is higher at the more advanced stages of the life cycle of the fungus. These results should not necessarily be considered conflicting, since they were derived from different fungal species most likely harboring distinct mycoviruses.

Transmission Over the years, Aspergilli have been used as a model system for investigating all aspects of mycovirus transmission, including vertical transmission via conidiospores and ascospores, horizontal transmission via anastomosis (hyphal fusion), the role of heterokaryon incompatibility between fungal isolates as a barrier to mycovirus spread, and interspecies transmission. During the formation of conidiospores from the tip of a differentiated hypha called the conidiophore, mycoviruses present in the cytoplasm can be trapped within the conidial cell wall. Subsequently, the conidiospores are released from the conidiophore, disperse through the air, and eventually give rise to a new fungal colony that is infected with the mycovirus. All relevant studies have shown that vertical transmission of mycoviruses via conidiospores is highly efficient in Aspergilli. This ensures maximal dispersal of mycoviruses, which have no extracellular phase in their replication cycle and depend on their host for movement, as well as their survival under harsh environmental conditions, when cost-efficient conidiation is the reproductive method of choice. In contrast, vertical transmission of mycoviruses via ascospores is far less efficient. For instance, dsRNA elements introduced into E. nidulans (teleomorph of A. nidulans) were present exclusively in ascospores produced by self-fertilization and not in those produced by crossing with a compatible partner. This observation explains the paucity of mycoviruses in E. nidulans and was also confirmed for N. hiratsukae. The wider implication is that mycoviruses would not be transmitted via ascospores by Aspergillus teleomorphs that are exclusively heterothallic (unable to self-fertilize) rather than homothallic. Horizontal transmission of mycoviruses from one fungal strain to another requires their physical proximity and the fusion of their hyphae, a process called anastomosis.
Following anastomosis, the cytoplasm, nuclei, and other organelles of the two fungal strains, together with any mycoviruses present, co-exist in a joint mycelium called a heterokaryon. When the heterokaryon breaks down into homokaryons, either by forming distinct uninucleate sectors or by conidiation, mycoviruses may be inherited by a previously virus-free strain. In Aspergilli, the co-existence of multiple different mycoviruses in the same host is possible; however, already infected isolates may be able to accommodate additional dsRNA elements, potentially due to limiting competition with the established mycoviruses. Notably, not all fungal strains can fuse their hyphae and form heterokarya, and this heterokaryon (or vegetative) incompatibility may restrict mycovirus transmission. The (in)compatibility of fungal strains depends on a number of nuclear genes whose specific alleles determine whether anastomosis is allowed. Nevertheless, no correlation has been noted between the presence of dsRNAs and the isolates’ mitochondrial RFLP haplotype, specific conserved surface protein types, or mating type (MAT1–1 or MAT1–2). In the case of A. niger, mycovirus transfer was possible between isogenic or vegetatively compatible strains following generation and fusion of protoplasts in the laboratory. Mechanical disturbance of mycelia, leading to the release of cytoplasm, was necessary to allow mycovirus transfer between vegetatively incompatible strains. In contrast, heterokaryon incompatibility did not prevent mycovirus transfer between A. nidulans strains. The potential for interspecies transmission of dsRNA elements has also been investigated in the laboratory, and it has been successfully demonstrated between A. niger (as mycovirus donor) and A. ficuum, A. oryzae, A. tubingensis, A. nidulans, and, in addition to the aforementioned hyphomycetes, the yeast Saccharomyces cerevisiae (as mycovirus recipients).

Genomes To date, only a few mycoviruses from Aspergilli have had their genomes fully sequenced and annotated. These include the A. foetidus mycovirus complex, comprising a member of the family Totiviridae and two as yet unclassified viruses; members of the families Chrysoviridae, Partitiviridae, and Narnaviridae; and a member of the provisionally designated family Polymycoviridae (Fig. 1, Table 1). Families Totiviridae, Chrysoviridae, and Partitiviridae accommodate mycoviruses with dsRNA genomes within isometric virions composed of viral proteins. More specifically, Chrysoviridae and Partitiviridae have four and two genomic segments, respectively, each one

encapsidated separately, while Totiviridae have non-segmented genomes. In contrast, family Narnaviridae accommodate unencapsidated mycoviruses with ssRNA genomes.

Fig. 1 Schematic representation of the genomic organization of all fully sequenced Aspergillus mycoviruses. For each genomic segment, the ORF(s) (colored boxes) are flanked by 5′- and 3′-UTRs (black boxes) and the function of the encoded protein(s) is indicated.

Table 1 Sequence properties of known Aspergillus mycoviruses

Virus name                               Abbreviation  Original host          Segment (bp)     ORF size (nt; aa; kDa)  5′-UTR (bp)  3′-UTR (bp)
Aspergillus foetidus fast virus          AfV-FRN       A. foetidus IMI 41871  dsRNA 1 (3571)   3375; 1124; 127         51           145
                                                                              dsRNA 2 (2734)   2406; 801; 87           48           280
                                                                              dsRNA 3 (2418)   2181; 726; 79           50           187
                                                                              dsRNA 4 (1961)   1743; 580; 65           50           147
Aspergillus foetidus slow virus 1        AfV-S1        A. foetidus IMI 41871  dsRNA 1 (5194)   4745; 741/839; 78/92    373          76
Aspergillus foetidus slow virus 2        AfV-S2        A. foetidus IMI 41871  dsRNA 1 (3634)   2889; 962; 110          509          236
Aspergillus ochraceus partitivirus       AoPV          A. ochraceus FA0611    dsRNA 1 (1754)   1620; 539; 62           66           68
                                                                              dsRNA 2 (1555)   1303; 433; 47           100          153
                                                                              dsRNA 3 (1220)   882; 293; 34            180          158
Aspergillus fumigatus partitivirus 1     AfuPV-1       A. fumigatus 88        dsRNA 1 (1779)   1629; 542; 63           65           85
                                                                              dsRNA 2 (1623)   1329; 442; 48           104          190
Aspergillus fumigatus partitivirus 2     AfuPV-2       A. fumigatus V145–13   dsRNA 1 (1822)   1722; 573; 66           34           66
                                                                              dsRNA 2 (1638)   1452; 483; 55           95           91
Aspergillus fumigatus chrysovirus        AfuCV         A. fumigatus A-56      dsRNA 1 (3560)   3345; 1114; 129         128          87
                                                                              dsRNA 2 (3159)   2862; 953; 107          167          130
                                                                              dsRNA 3 (3006)   2681; 891; 99           169          156
                                                                              dsRNA 4 (2863)   2544; 847; 95           154          165
Aspergillus fumigatus narnavirus 1       AfuNV-1       A. fumigatus V145–43   dsRNA 1 (2007)   1857; 618; 70           78           72
Aspergillus fumigatus narnavirus 2       AfuNV-2       A. fumigatus V145–80   dsRNA 1 (1994)   2031; 676; 71           26           87
Aspergillus fumigatus mitovirus 1        AfuMV-1       A. fumigatus V145–13   dsRNA 1 (2500)   1764; 587; 69           300          436
Aspergillus fumigatus tetramycovirus 1   AfuTmV-1      A. fumigatus Af293     dsRNA 1 (2403)   2292; 763; 84           35           76
                                                                              dsRNA 2 (2233)   2091; 696; 76           70           72
                                                                              dsRNA 3 (1970)   1845; 614; 67           51           74
                                                                              dsRNA 4 (1131)   840; 279; 29            86           205
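As a quick sanity check on Table 1, each segment length should equal the sum of its 5′-UTR, ORF, and 3′-UTR lengths. The Python sketch below tests this for a handful of rows transcribed from the table. The tuple layout and the function name are illustrative choices, not anything defined in the source, and a flagged row may simply reflect a transcription issue or sequence features not counted in these columns.

```python
# Rows transcribed from Table 1: (abbreviation, segment number,
# segment length, 5'-UTR length, ORF length, 3'-UTR length), all in nt/bp.
ROWS = [
    ("AfuPV-1", 1, 1779, 65, 1629, 85),
    ("AfuPV-1", 2, 1623, 104, 1329, 190),
    ("AfuTmV-1", 1, 2403, 35, 2292, 76),
    ("AfuTmV-1", 2, 2233, 70, 2091, 72),
    ("AfuTmV-1", 3, 1970, 51, 1845, 74),
    ("AfuTmV-1", 4, 1131, 86, 840, 205),
    ("AfV-FRN", 4, 1961, 50, 1743, 147),  # as transcribed; parts do not add up
]


def inconsistent_rows(rows):
    """Return (abbreviation, segment, difference) for rows whose parts do not add up.

    The difference is segment length minus (5'-UTR + ORF + 3'-UTR).
    """
    flagged = []
    for abbr, seg, total, utr5, orf, utr3 in rows:
        diff = total - (utr5 + orf + utr3)
        if diff != 0:
            flagged.append((abbr, seg, diff))
    return flagged


if __name__ == "__main__":
    for abbr, seg, diff in inconsistent_rows(ROWS):
        print(f"{abbr} dsRNA {seg}: segment length differs from UTRs + ORF by {diff:+d} bp")
```

A check of this kind only flags rows for a closer look at the underlying sequence records; it says nothing about which of the tabulated values is the one at fault.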

The Aspergillus foetidus Mycovirus Complex The A. foetidus mycovirus complex, i.e., the seven dsRNA elements found in A. foetidus strain IMI 41871, was initially reported in the 1970s and used as a model system to study virus replication and transcription, although the actual sequence of these elements was determined only in the 2010s. Its presence is widespread among Aspergilli, since it has been noted not only in different isolates of A. foetidus but also in A. niger. The seven dsRNA elements are accommodated in two different classes of VLPs, designated ‘fast’ and ‘slow’ based on their electrophoretic mobility. The ‘fast’ and ‘slow’ VLPs are both isometric and similar in diameter, approximately 33–37 nm; however, they are serologically unrelated, and antibodies specific for one class of VLPs do not cross-react with the other, suggesting that each class of VLPs is constructed from a distinct capsid protein. The ‘fast’ VLPs accommodate an as yet unclassified mycovirus with four dsRNA elements as its genome, each encapsidated separately. Each of the four dsRNAs has a single open reading frame (ORF) flanked by conserved 5′- and 3′-untranslated regions (UTRs) and a 3′ poly(A) tail. The two larger RNAs encode the RNA-dependent RNA polymerase (RdRP) and the capsid protein (CP), respectively, while the other two encode proteins of unknown function. Similar mycoviruses have been described not only in Aspergilli but in other ascomycetes as well, such as Alternaria alternata and Fusarium poae.
The ‘slow’ VLPs accommodate the remaining three dsRNA elements: a member of the genus Victorivirus, family Totiviridae, which has a linear non-segmented genome and encodes an RdRP and a CP; an unclassified mycovirus with a single open reading frame encoding an RdRP, closely related to viruses found in other ascomycetes, such as Fusarium poae, Penicillium aurantiogriseum, and Rosellinia necatrix, and in basidiomycetes, such as Rhizoctonia solani; and a satellite RNA, less than 0.5 kbp in size with no protein coding capacity. All three dsRNA elements depend on the victorivirus, which produces the capsid protein that forms the


‘slow’ VLPs, for their encapsidation. The victorivirus and the unclassified virus each encode their own RdRP for replicating their own genomes, and the RdRP of the unclassified virus is also responsible for the replication of the satellite RNA.

Partitiviruses Three viruses belonging to the family Partitiviridae, genus Gammapartitivirus, have been found in Aspergilli: Aspergillus ochraceus virus (AoV) was isolated from A. ochraceus, while Aspergillus fumigatus partitivirus-1 (AfuPV-1) and Aspergillus fumigatus partitivirus-2 (AfuPV-2) were discovered in A. fumigatus. Notably, the two partitiviruses from A. fumigatus are not closely related. Each virus has two dsRNA segments as its genome, each containing one ORF flanked by conserved 5′- and 3′-UTRs. As is typical for partitiviruses, the larger dsRNA encodes the RdRP and the smaller the CP. The Aspergillus ochraceus virus in particular has an additional component, which has been fully sequenced and is known to contain an ORF encoding a protein of unknown function. Since its UTRs are very similar to those of the other two partitivirus components, it is most likely replicated by the viral RdRP.

Chrysoviruses A virus belonging to the genus Alphachrysovirus, family Chrysoviridae, named Aspergillus fumigatus chrysovirus (AfuCV), has been reported in A. fumigatus. The AfuCV genome consists of four dsRNAs, each containing an ORF flanked by conserved 5′- and 3′-UTRs. The two largest dsRNAs encode the viral RdRP and CP, respectively, while the other two dsRNAs encode proteins whose function is not fully understood. More specifically, the smallest dsRNA encodes a putative cysteine protease (alphachryso-P4), while the third dsRNA encodes a protein (alphachryso-P3) that shares significant sequence similarity with the N-terminus of the RdRP. The alphachryso-P3 contains a ‘phytoreo S7 domain’, initially described in members of the genus Phytoreovirus, family Reoviridae, which is considered to interact with the RdRP and play a role in viral RNA binding and packaging. Furthermore, HHpred, a tool for remote protein homology detection, showed that the AfuCV alphachryso-P3 contains a P-loop NTPase domain near its N-terminus, which has homologous sequences in the RdRP. Members of the P-loop NTPase domain superfamily are characterized by conserved nucleotide phosphate-binding motifs: the Walker A motif (GxxxxGK[S/T]) and the Walker B motif (hhhh[D/E]), where h is a hydrophobic residue.
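The Walker A and Walker B consensus patterns quoted above translate directly into regular expressions. The following Python sketch shows one way to scan a protein sequence for them; it is a minimal illustration, not the HHpred analysis described in the text. The set of residues treated as hydrophobic, the function name, and the toy sequence are all assumptions made here for illustration, and the sequence is invented rather than taken from AfuCV.

```python
import re

# Walker A as written in the text: G, any four residues, G, K, then S or T.
WALKER_A = re.compile(r"G.{4}GK[ST]")

# Walker B as written in the text: four hydrophobic residues, then D or E.
# Assumption: which residues count as "hydrophobic" is a convention;
# this set is one common choice and is not specified in the source.
HYDROPHOBIC = "AVILMFWC"
WALKER_B = re.compile(rf"[{HYDROPHOBIC}]{{4}}[DE]")

# Hypothetical toy sequence, invented for illustration only.
TOY_PROTEIN = "MSTRGPAGSGKSTLAKRQNPDEVILVDEAHRT"


def find_motifs(pattern, protein):
    """Return (0-based start, matched text) for every non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    print("Walker A:", find_motifs(WALKER_A, TOY_PROTEIN))
    print("Walker B:", find_motifs(WALKER_B, TOY_PROTEIN))
```

Short patterns like these match by chance fairly often, so a regular-expression hit is at best a prompt for profile-based or structural confirmation.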

Narnaviruses and Mitoviruses Three members of the family Narnaviridae have been sequenced from A. fumigatus, two belonging to the genus Narnavirus and one to the genus Mitovirus. These viruses encode only one protein, the RdRP responsible for replication. Narnaviruses are considered the simplest and smallest viruses, and the two found in A. fumigatus are not closely related to each other. Unlike the mitovirus, which is found in mitochondria, narnaviruses are cytoplasmic. These are the only ssRNA mycoviruses reported in Aspergillus to date, although it is believed that the unclassified virus in the ‘slow’ A. foetidus mycovirus complex also has an ssRNA genome.

Polymycoviruses The prototype member of the proposed family Polymycoviridae, designated Aspergillus fumigatus tetramycovirus-1 (AfuTmV-1), was discovered in A. fumigatus. AfuTmV-1 has four dsRNA segments as its genome, like members of the families Chrysoviridae and Quadriviridae and the unclassified virus in the ‘fast’ A. foetidus mycovirus complex; however, the AfuTmV-1 segments are smaller than those of these viruses, ranging from 1 kbp to 2.5 kbp in length. AfuTmV-1 does not have a capsid and can be isolated from the fungus as long chains of dsRNA associated with a protein. Additionally, it was the first viral entity shown to be infectious as naked dsRNA, following introduction of its purified genome into A. fumigatus protoplasts. This was a paradigm-shifting discovery: unlike the genomes of positive-sense ssRNA viruses, which can replicate successfully following their introduction into host cells, dsRNA viruses were previously considered to be infectious only as whole virions. The largest dsRNA encodes the viral RdRP, which is related to the RNA picorna-superfamily and has two unusual characteristics that make it unique: firstly, it is evolutionarily closer to members of the families Astroviridae and Caliciviridae, which have positive-sense, single-stranded RNA genomes and infect birds and mammals; secondly, its GDD motif, responsible for the catalytic activity of the enzyme, has been replaced with GDNQ, characteristic of negative-stranded RNA viruses such as those in the order Mononegavirales, which includes important human pathogens such as the measles, rabies, and Ebola viruses. The second largest dsRNA encodes a protein of unknown function containing a putative endoplasmic reticulum (ER) retaining signal peptide and a short transmembrane alpha-helix, together with a zinc finger-like motif similar to those that bind nucleic acids.
It is hypothesized that it acts as a scaffold protein, recruiting the replication complex in the ER and facilitating the interaction among the viral RNA and the viral proteins. Both RNA and DNA viruses, such as hepatitis C and vaccinia, are known to utilize and rearrange the ER membrane for their replication and assembly and it is conceivable that polymycoviruses are also associated with this host organelle. The third


largest RNA encodes a methyltransferase responsible for capping the positive RNA strands of the virus; a conserved catalytic motif and a FAD/NAD(P)-binding Rossmann fold domain have been identified in the protein sequence. The latter binds a coenzyme, which acts as an acceptor or a donor for the hydrogen anion lost or gained during the redox reaction catalysed by the methyltransferase. The presence of the capping structure leads to the recognition of the viral RNAs, which act as mRNAs, by the host’s ribosomes and to the translation of the viral proteins. Notably, this is the first report of a capping enzyme and the RdRP being encoded on distinct dsRNA elements. Finally, the smallest dsRNA encodes a protein hypothesized to coat the viral genome, which is non-conventionally encapsidated. This protein is enriched in the residues proline, alanine, and serine and has long intrinsically disordered regions, which may facilitate its interactions with RNA and proteins. The aforementioned four dsRNAs and the proteins they encode are common to all known polymycoviruses; however, polymycoviruses may have up to four additional genomic segments, encoding small non-homologous proteins of unknown function. It is feasible that these proteins originally participated in the formation of a capsid, since a polymycovirus from the plant-pathogenic fungus Colletotrichum camelliae has eight dsRNA segments and is encapsidated in filamentous virions. Loss of one or more of these genomic segments would not affect the ability of polymycoviruses to replicate, but would coincide with the shedding of the capsid.
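The statement that the genome-coating protein is enriched in proline, alanine, and serine can be made concrete by computing the fraction of a sequence made up of those residues. The short Python sketch below does that; the function name and the toy sequence are hypothetical and introduced only for illustration, and no threshold for what counts as "enriched" is implied by the source.

```python
from collections import Counter

# Hypothetical toy sequence, invented for illustration; it is not the
# sequence of any polymycovirus protein.
TOY_PAS_RICH = "MPAPSSAPKAPSAAPSPRSA"


def residue_fraction(protein, residues="PAS"):
    """Fraction of positions in `protein` occupied by any residue in `residues`."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return sum(counts[r] for r in set(residues.upper())) / len(seq)


if __name__ == "__main__":
    frac = residue_fraction(TOY_PAS_RICH)
    print(f"P/A/S fraction: {frac:.2f}")
```

In practice such a value would be compared against the composition of a reference proteome, since the expected background frequency of these three residues differs between organisms.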

Phenotypes A. fumigatus is the major causative agent of aspergillosis and related respiratory diseases, affecting immunocompromised patients undergoing transplantation or chemotherapy and those vulnerable due to another infection, such as tuberculosis, or a genetic disorder, such as cystic fibrosis. Other Aspergillus species are significant food contaminants, may have industrial uses, and produce toxins such as aflatoxins. A number of case studies have revealed that mycoviruses may have discernible effects on their fungal host, altering its phenotype. These comparisons typically involve virus-free and virus-infected isogenic lines, i.e., the same fungal isolate with and without the virus, in order to ensure that any observed phenomena are due to the presence of the virus and not to potential differences in the genomic background of the host. The most easily detected phenotypes are alterations in pigmentation of the fungus, which may be accompanied by differences in growth, conidiation, pathogenicity, and/or mycotoxin production. For instance, virus infection in an A. niger isolate was shown to lead to a reduced growth rate and reduced production of conidia. Virus-mediated aflatoxin production is a very interesting phenotype associated with isolates of A. flavus. Aflatoxins are carcinogenic secondary metabolites produced by A. flavus and A. parasiticus, which are opportunistic plant pathogens, and can be found on a range of agricultural crops including tree nuts, corn, peanuts, grains, and soybeans, mostly in warm and humid environments. This contamination of food resources with aflatoxins, to which the consumer can be exposed either directly or indirectly via meat and dairy products, is a serious medical and economic concern, particularly in developing countries. The production of aflatoxins is downregulated in the presence of uncharacterised mycoviruses in A. flavus strains NRRL 5565 and NRRL 5940.
Nevertheless, this correlation is not universal, since mycoviruses do not appear to affect aflatoxin levels in other A. flavus isolates or the production of the mutagenic patulin by A. clavatus. No attempts were made to further characterize the mycoviruses from the NRRL 5565 and NRRL 5940 strains or to take advantage of them in order to control aflatoxin production in the field. The most extensive phenotypic studies have been performed in A. fumigatus isolates: the effect of different mycoviruses on fungal growth has been investigated both in solid and in liquid culture, and their effect on virulence has been assessed in mice and in larvae of the greater wax moth Galleria mellonella, both widely used model systems for pathogenicity studies. The A. fumigatus isolate harboring the chrysovirus produces darker green pigmentation than the virus-free isogenic line, while the isolate with the partitivirus has lighter pigmentation. Chrysovirus and partitivirus infections both lead to the formation of aconidial sectors and significantly reduced growth in solid and in liquid culture, but no effects on fungal pathogenicity were discernible. In contrast, infection of A. fumigatus with polymycoviruses leads to hypervirulence, or increased pathogenicity, apparently via two different mechanisms: in isolate A78, harboring a polymycovirus whose sequence has not been characterised, hypervirulence is at least partially a result of increased growth; in isolate Af293, harboring AfuTmV-1, hypervirulence is not concomitant with enhanced growth and depends on other unidentified parameters associated with pathogenicity. For instance, AfuTmV-1 may affect the expression of genes involved in the thermotolerance of the fungus, the constitution and polysaccharide content of the fungal cell wall, melanin pigment production, resistance to oxidative stress, and other factors that influence the interplay between the fungus and the host’s immune system.
The association, if any, between mycovirus infection in A. fumigatus and clinicopathological parameters, patient survival, and azole resistance is currently unknown. The next major challenge in the investigations of Aspergillus mycoviruses is understanding the molecular mechanisms and signaling pathways mediating the aforementioned phenotypes. Due to the dsRNA nature of their genomes or their replicative forms, mycoviruses act as both triggers and targets of RNA silencing, a process leading to the degradation of viral RNA as part of the anti-viral defense of the fungal host. This phenomenon has been studied in A. nidulans and A. fumigatus, demonstrating that non-conventionally encapsidated viruses, such as polymycoviruses, and encapsidated viruses, such as chrysoviruses and partitiviruses, can be silenced, while some mycoviruses may in turn suppress RNA silencing in their fungal host. Mycovirus infection may also result in differential expression of small regulatory RNAs, suggesting a putative mechanism underpinning the observed mycovirus effects on fungal hosts.



Evolution of Mycoviruses Mahtab Peyambari, Vaskar Thapa, and Marilyn J Roossinck, Pennsylvania State University, State College, PA, United States © 2021 Elsevier Ltd. All rights reserved.

Glossary
Alpha helix (α-helix) A common protein secondary structure that is formed when the polypeptide chains twist into a spiral.
Bottleneck An evolutionary event that drastically reduces the size of a population. It can be caused by various factors but here different cell types are the contributing factor. It lowers genetic variability in a population.
Convergent evolution The process whereby organisms not closely related, independently evolve similar traits as a result of having to adapt to similar environments or ecological niches.
Evolutionary lineage A temporal series of organisms, populations, cells, or genes connected by a continuous line of descent from the ancestor to descendant. Lineages are subsets of the evolutionary tree of life. Lineages are often determined by the techniques of molecular systematics.
Jelly roll A protein fold or supersecondary structure composed of eight beta strands arranged in two four-stranded sheets. The jelly roll is the most prevalent fold among viral capsid proteins.
Modular evolution The concept that viruses contain sets of genes or modules that have a common ancestor but may be found in different arrangements in viral genomes.
Polyphyletic A group of organisms that do not share an immediate common ancestor.
Purifying selection The selective removal of mutations or alleles that are not beneficial to an organism.
RNA silencing A sequence-specific degradation of RNA, usually triggered by double-stranded RNA. RNA silencing provides an antiviral defense response at the cellular level, and is an adaptive immune system used by plants, fungi and a few other organisms.
Vegetative Compatibility The ability of hyphae to fuse (anastomosis). Vegetative compatibility is controlled by one to several nuclear genes that limit completion of hyphal anastomosis between fungal colonies to those that belong to the same vegetative compatibility group (usually abbreviated to v-c group).

Introduction Mycoviruses are commonly found in many taxa of fungi. These viruses lack an extracellular phase but move between hyphal cells during cell division, and are naturally transmitted horizontally between members of vegetatively compatible groups through anastomosis (fusion of hyphae). Some mycoviruses are able to facilitate anastomosis between vegetatively incompatible fungal strains by overcoming non-self recognition, a common phenomenon in fungi that enables a fungus to distinguish itself from another fungus. Non-self recognition is activated when two vegetatively incompatible hyphae come into contact, resulting in compartmentalization followed by programmed cell death, which interrupts the fusion between the hyphae. In many systems vertical transmission of mycoviruses occurs efficiently through fungal asexual spores. Effective horizontal and vertical transmission is responsible for the widespread dispersal of mycoviruses. In 2017, the International Committee on Taxonomy of Viruses (ICTV) reported twelve classified and seven unclassified families of mycoviruses. A majority of mycoviruses have double-stranded RNA (dsRNA) genomes, but this could be partially due to bias in the analysis, which often involves isolation of dsRNA as a starting point. Compared with most single-stranded (ss) RNA virus families, the extremely broad host range of the dsRNA viruses found in fungi appears rather unusual. The three largest families of dsRNA viruses, Partitiviridae, Totiviridae, and Reoviridae, show broad host ranges, infecting two or three kingdoms of eukaryotic hosts, implying ancient origins. In general, RNA viruses have higher evolutionary rates because the RNA-dependent RNA polymerases (RdRps) that replicate their genomes are more error-prone than DNA-dependent DNA polymerases. The estimated rate of nucleotide substitutions per site per year for animal RNA viruses falls within one order of magnitude of 1 × 10⁻³. No parallel studies on mycovirus evolutionary rates have been reported.
However, unlike most animal RNA viruses, mycoviruses have an exclusively intracellular lifestyle that predominantly limits them to infection of a single fungal host, or to vegetatively compatible populations. Moreover, mycoviruses experience bottlenecks within their hosts. In many fungi, such as members of the genus Aspergillus, mycoviruses are transmitted vertically via asexual spores but not through sexual spores. Mutation events in mycoviruses with dsRNA genomes are further constrained by their stamping-machine mode of replication. In stamping-machine replication, one strand of the genome is used to produce multiple copies, but each copy is then copied only once to produce the genomic RNA. In most ssRNA viruses each strand can be amplified during replication, resulting in a logarithmic increase in progeny strands and allowing a higher probability of mutation events. It is likely that such a lifestyle results in a lower tolerance for mutations, or a higher pressure of purifying selection, and hence a lower evolutionary rate over time. Studies of persistent plant viruses, which share a similar lifestyle with mycoviruses and belong to families common to both fungal and plant viruses, show similarly slow rates of evolution. For example, an endornavirus that infects the ancestor of domesticated rice and a related virus that infects cultivated rice have diverged by about 24% over the 10,000 years since rice was first cultivated. Another example is a chrysovirus isolated from


doi:10.1016/B978-0-12-809633-8.21322-8


ancient maize cobs from about 1000 years ago that can still be found in modern maize. The virus sequences have diverged by only about 3%. Because mycoviruses lack an extracellular phase and depend on the host for transmission, they are generally thought to have co-evolved with their hosts. Their symptomless and persistent infections and, in some cases, positive interactions with hosts support this view. However, phylogenetic studies show that mycoviruses are diverse despite sharing the same lifestyle and do not seem to have evolved from a single ancestral virus.
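The rice and maize examples above can be turned into rough per-site rates for comparison with the figure of about 1 × 10⁻³ substitutions per site per year quoted for animal RNA viruses. The Python sketch below is a back-of-the-envelope calculation, not a method from the source. It assumes that divergence accumulates linearly and ignores multiple substitutions at the same site. It also treats the two rice endornaviruses as lineages that separated 10,000 years ago, so their divergence is shared between two branches, and the maize comparison as an ancient and a modern sample along a single lineage. The function name and the `branches` parameter are illustrative.

```python
def rate_per_site_per_year(divergence, years, branches=1):
    """Naive substitution rate: fractional divergence / (branches * years).

    Assumes linear accumulation of differences; `branches` is 2 when two
    contemporary lineages are compared and 1 for an ancestor-descendant pair.
    """
    if years <= 0 or branches <= 0:
        raise ValueError("years and branches must be positive")
    return divergence / (branches * years)


# Values taken from the text; the branch counts are assumptions stated above.
RICE_ENDORNAVIRUS = rate_per_site_per_year(0.24, 10_000, branches=2)
MAIZE_CHRYSOVIRUS = rate_per_site_per_year(0.03, 1_000, branches=1)
ANIMAL_RNA_VIRUS = 1e-3  # order-of-magnitude figure quoted in the text

if __name__ == "__main__":
    for name, rate in [("rice endornavirus", RICE_ENDORNAVIRUS),
                       ("maize chrysovirus", MAIZE_CHRYSOVIRUS)]:
        fold = ANIMAL_RNA_VIRUS / rate
        print(f"{name}: {rate:.1e} per site per year, "
              f"about {fold:.0f}-fold below the animal RNA virus figure")
```

Under these assumptions both estimates fall in the 10⁻⁵ range, one to two orders of magnitude below the animal RNA virus figure, which is consistent with the slow evolution described in the text. Changing the assumed divergence times or branch counts shifts the numbers but not the qualitative picture.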

Evolutionary History and Drivers of Evolution A wide range of evolutionary forces contributes to the genetic variability of mycoviruses and their adaptability to a changing environment. A comprehensive evolutionary study across the mycoviruses is lacking; however, findings from some mycovirus taxa (e.g., families) and individual species provide a picture of the evolutionary forces shaping the extant population. The family Totiviridae contains mycoviruses with undivided dsRNA genomes and is one of the more studied families of mycoviruses. Some members of this family also infect protozoans. Comparisons of predicted RdRp amino acid sequences show significant similarity, with eight conserved motifs, among totiviruses that infect yeast, smut, filamentous fungi, and protozoans. However, the similarity in coat proteins (CPs) among members is not as pronounced as with the RdRps, a common observation among viruses. Two evolutionary lineages are distinct among totiviruses infecting fungi. One of the lineages, now grouped under the genus Victorivirus, differs from the other (grouped under the genus Totivirus) in its RdRp expression strategy. In Victorivirus the RdRp open reading frame is separate from that of the CP, whereas these reading frames are fused in members of the genus Totivirus and are expressed by a frame-shifting mechanism. The ancestral relationships of the totiviruses infecting fungi are uncertain. However, the genus Victorivirus is phylogenetically closer to the totiviruses infecting protozoans than to the totiviruses of yeast and smut fungi, indicating that the fungal viruses in the family Totiviridae are probably polyphyletic. Partitiviruses infecting fungi are another widely studied group of viruses. The viruses in the family Partitiviridae infect fungi, plants, and oomycetes and possess two dsRNA genome segments.
The members of Partitiviridae infecting fungi form three distinct lineages, only one of which has exclusively fungal hosts (filamentous ascomycetes; Gammapartitivirus), while the other two have both fungal and plant hosts (Alphapartitivirus and Betapartitivirus). The lineages of partitiviruses with both fungal and plant hosts suggest a horizontal transfer of members between kingdoms. An analysis of the protein domains of endornaviruses infecting fungi indicates diverse origins, suggesting a modular evolutionary history. The endornaviruses infecting fungi have naked ssRNA genomes that encode polyproteins with various functions. The helicase domains of Gremmeniella abietina type B RNA virus XL1 and Chalara elegans endornavirus 1 (CeEV1) are derived from two different protein superfamilies. Similarly, Helicobasidium mompa endornavirus 1 and CeEV1 have glycosyltransferase domains originating from different protein families. The occurrence of domains with similar functions but different origins in different fungal endornaviruses indicates a modular, convergent evolutionary strategy. In fact, the occurrence of domains of similar function but highly divergent primary sequence in many mycoviruses is sometimes interpreted as ancient divergence, but can equally be understood as convergent evolution. The highly conserved RdRp domain in endornaviruses is related to the RdRp of closteroviruses, which are plant ssRNA viruses. A phylogenetic analysis of hypovirus RdRps and the gain or loss of other genes support the hypothesis that these naked (+) ssRNA fungal viruses are derived from the potyviruses, another group of plant (+) ssRNA viruses. The mitoviruses, found in the mitochondria of fungi, may have originated from bacteriophages in the Leviviridae family, the only (+) ssRNA family in prokaryotes. It seems likely that the mitoviruses evolved from a bacterial virus that lost its coat protein and other genes.
The related narnaviruses infecting yeasts are found in the cytoplasm, and could have originated from a mitovirus that escaped to the cytosol. The origin of dsRNA viruses is unclear, but one possibility is that a (+) ssRNA virus that encapsidated a replicative form (dsRNA) together with the RdRp was the ancestor of all dsRNA viruses. Mitoviruses, hypoviruses, and endornaviruses, which have an ssRNA-like RdRp but are found as unencapsidated dsRNA replicative forms, could be evolutionary intermediates in this process. The major evolutionary forces affecting more recent evolution of the mycoviruses are mutation and selective sweeps, which were discussed above with respect to the different lineages in many groups of mycoviruses. Besides mutation, recombination and horizontal gene transfer (HGT) have been reported often in the literature. Recombination is an evolutionary force that is common in some ssRNA viruses. Recombination occurs when an RdRp switches its template from one molecule to another during virus replication. Since dsRNA viruses replicate exclusively within the virus particle, opportunities for recombination between templates are probably very rare. However, a low frequency of recombination events is reported in Ustilago maydis virus H1, a member of the genus Totivirus. In hypoviruses such as Cryphonectria hypovirus 1 (CHV1), which replicate as ssRNA viruses, homologous recombination events are noted frequently, and some of the recombinant lineages appear to be responsible for the spread of the virus in Europe. Recent studies suggest that HGT between diverse mycoviruses is an important force contributing to evolution. Homologs of the S7 protein domain, a DNA-binding protein common in phytoreoviruses in the family Reoviridae, are reported from many mycoviruses that belong to the families Endornaviridae, Totiviridae, and Chrysoviridae, and from unclassified monopartite dsRNA fungal viruses.
The domain is associated with the RdRp in many chrysoviruses but found with CPs and polyproteins in totiviruses and endornaviruses, respectively. Phylogenetic analysis of the S7 domains from various mycoviruses supports HGT across species. HGT is also reported in Sclerotinia sclerotiorum reovirus 1 (SsReV1), a member of the family Reoviridae. Virus protein 6 and virus protein 7 of SsReV1 have sequence similarities with the dsRNA binding motif (dsRBM) and reovirus sigma C protein (Reo_sC), respectively. Reoviruses can infect mammals, protozoans, insects, plants, and fungi. Homologs of the dsRBM and Reo_sC are found in many reoviruses and in diverse virus lineages outside the family Reoviridae. The phylogeny based on dsRBM and Reo_sC sequences supports HGT.

Effects of Host Divergence in the Evolution

The complex relationship between viruses and their hosts can result from evolutionary events such as adaptation to the hosts. The question is how much of the diversification of mycoviruses is linked to the diversification of their fungal hosts. In one study of mycovirus-host codivergence, results indicated codivergence as a dominant mode of virus diversification in Partitiviridae and Totiviridae, but not in other mycovirus families. However, the lack of codivergence in other families might be the result of insufficient sampling. As viruses employ host translation systems, their proteins are translated with the same genetic code as their host organism. A few organisms use modified genetic codes rather than the universal genetic code. If a virus that uses the universal genetic code infects a host that uses an alternative genetic code, the virus would incur high fitness costs. These differences in genetic codes could inhibit viral jumps between different hosts. Some authors have hypothesized that alternative genetic codes could serve as an antiviral strategy against virus transfers from other hosts. The effects of genetic code shifts on virus-host co-evolution remain poorly understood. Mitoviruses (family Narnaviridae) have the simplest genomes among mycoviruses. They are capsid-less ssRNA viruses that encode only an RdRp. While all other known mycoviruses reproduce in the fungal cytoplasm, mitoviruses are unique in reproducing in the mitochondria. Fungal mitochondria often use an alternative genetic code, in which the universal stop codon UGA encodes tryptophan (Trp). Mitoviral RdRps have internal UGA codons, and because these RdRps are active in mitochondria, UGA must also encode Trp in mitoviruses. Recently, mitovirus-like sequences have been found in plant mitochondria and also integrated into the nuclear genomes of many vascular plants. The latter are not transcriptionally active. The plant mitoviruses do not use UGA as a codon for Trp, so they have adapted to this host change. Although UGA (Trp) is a common codon in most fungal mitochondria, in some fungi it is a rare mitochondrial codon, and in the mitoviruses of these hosts UGA is mostly absent. Thus, fungal mitoviruses are adapted to their host species in using UGA as a Trp codon, and the extent of this use changes with the host species. Recent investigation of plant transcriptomes revealed the presence of many complete genomes of plant mitoviruses, providing strong evidence for genuine plant mitoviruses. In a study on Scheffersomyces segobiensis, a fungus with an alternative genetic code in which CUG codes for serine instead of leucine, a dsRNA virus of the genus Totivirus was found to have lost all but one of the CUG codons from functional positions of its genome, presumably as an adaptation to the modified genetic code of the host. Without this change in codon usage, protein folding would be affected by the replacement of a hydrophobic residue with a polar residue. Based on phylogenetic analysis, Scheffersomyces segobiensis virus L evolved from exogenous totiviruses with a standard genetic code. Apparently, genetic code shifts are not serious barriers against virus jumps between different hosts.
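The effect of the two code changes described above can be illustrated with a short sketch. It builds the standard codon table and two simplified variants that differ from it only in the reassignment named in the text (UGA to Trp for fungal mitochondria, CUG to Ser for hosts such as Scheffersomyces segobiensis). The names STANDARD, FUNGAL_MITO, CUG_SER and translate, and the two short RNA strings, are illustrative assumptions and do not come from any study cited here.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in the conventional U, C, A, G codon order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Simplified variants (assumption: only the reassignment named in the text is modeled).
FUNGAL_MITO = {**STANDARD, "UGA": "W"}   # UGA read as tryptophan instead of stop
CUG_SER = {**STANDARD, "CUG": "S"}       # CUG read as serine instead of leucine


def translate(rna, code):
    """Translate an RNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = code[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy messages, not real viral sequences.
with_internal_uga = "AUGUGACUGGCUUAA"
with_cug = "AUGCUGGCUUAA"

print(translate(with_internal_uga, STANDARD))     # M    (truncated at UGA)
print(translate(with_internal_uga, FUNGAL_MITO))  # MWLA (UGA read through as Trp)
print(translate(with_cug, STANDARD))              # MLA
print(translate(with_cug, CUG_SER))               # MSA  (Leu replaced by Ser)
```

In this toy case an internal UGA truncates the product under the standard code, which is consistent with plant mitoviruses lacking UGA (Trp) codons, and every retained CUG codon changes a leucine to a serine in a CUG-Ser host.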

Coinfection and Evolution

Coinfection by multiple mycoviruses is common in nature. Coinfecting mycoviruses impose pressure on each other, resulting in neutral, synergistic or antagonistic interactions. In many cases, mycoviruses transfer between vegetatively incompatible fungal strains by overcoming non-self recognition. However, some mycoviruses can suppress non-self recognition and facilitate the exchange of mycoviruses among diverse vegetatively incompatible groups. This creates opportunities for interactions among different mycoviruses coinfecting the same host, with evolutionary consequences. One example is the interaction between Mycoreovirus 1 (MyRV1) from the genus Mycoreovirus and CHV1 from the genus Hypovirus in the chestnut blight fungus, Cryphonectria parasitica. MyRV1 possesses 11 dsRNA segments (S1-S11). The multifunctional protein p29, encoded by CHV1, induces reproducible intragenic rearrangements of the S6 and S10 genomic segments of MyRV1, including a duplication in S6 and an internal deletion in S10. Further, p29 copurifies with MyRV1 genomic RNA and binds in vitro to the VP9 protein of MyRV1, which is involved in replication. Fungi use RNA silencing as an adaptive immune system to counteract RNA viruses. RNA silencing may also contribute to viral RNA recombination. In coinfection, RNA silencing may lead to the production of chimeric viral RNAs through intermolecular recombination. Such possibilities have evolutionary consequences and deserve further investigation.

Structural Evolution of Mycoviruses

The origin of the RdRp of RNA viruses is unclear. The protein sequences of RdRps, as hallmark genes in RNA viruses, are divergent, but contain well-conserved domains. The high level of structural similarity based on atomic structure among RdRps of all groups of RNA viruses [dsRNA, (+) ssRNA, and (−) ssRNA] also could imply a common origin for all RdRps. However, it is not possible to be certain whether these critical molecules have diverged or converged for a common function, and if they have diverged, it is difficult to be certain which extant form of RdRp is closest to the ancestral state. The virion architecture is also a crucial element in understanding virus origins. In spite of tremendous genetic variation, there are a limited number of capsid protein folds, and only a small subset of these have the potential to result in a virus particle.


However, the origin of viral capsids is a poorly understood aspect of virus evolution. Unlike the RdRps, the capsid proteins have no conserved domains, and primary protein sequence analysis, even in viruses in the same family, indicates a polyphyletic origin of these genes. This type of chimeric origin is a common evolutionary scenario for encapsidated mycoviruses. Recent analyses of virion structure have revealed unexpected similarities in the folds of capsid proteins among these different viruses, suggesting an evolutionary connection among viruses that infect hosts residing in different kingdoms of life. Two different evolutionary trajectories could be inferred from such observations: divergent or convergent evolution. In the divergent scenario, there is a common ancestor for each separate lineage that existed before their host organisms diverged, but these proteins have diverged beyond any recognition at the sequence level. In the other scenario, the structural similarities result from convergent evolution to solve the problem of making a stable and viable capsid. A comparative analysis to investigate the diversity and potential origins of viral capsids resulted in 18 "structure-based viral lineages", into which 76.3% of viral taxa with defined folds of the major capsid proteins could be categorized. One of these lineages is the dsRNA bluetongue virus (BTV)-like viruses. BTV is a member of the Reoviridae. The icosahedral mycoviruses in the Totiviridae, Partitiviridae, Megabirnaviridae, Chrysoviridae, and Quadriviridae families have unique 120-subunit T = 1 jelly-roll capsids that have also been described as the internal shell in bacteriophages of the family Cystoviridae and members of Reoviridae, and as viral capsids in picobirnaviruses, which infect higher eukaryotes. They share a structural signature with the lineage of the dsRNA BTV-like viruses. Reoviruses form a complex capsid consisting of two icosahedral shells similar to that of cystoviruses, indicating a possible evolutionary relationship between these families. The icosahedral capsid of dsRNA viruses remains intact during transcription, extruding the ssRNAs that act as mRNAs and as pregenomic RNAs, probably to avoid triggering host defense mechanisms. To date, four T = 1 capsids of mycoviruses have been resolved at the atomic level: Saccharomyces cerevisiae L-A virus (Totiviridae), Penicillium chrysogenum virus (Chrysoviridae), Penicillium stoloniferum virus F (Partitiviridae), and Rosellinia necatrix quadrivirus 1 (Quadriviridae). The amino acid sequences of these four capsid proteins are quite different; however, they all have the same fold, predominantly α-helical, which is also a hallmark of the BTV lineage. According to phylogenetic analyses of the RdRps of partitiviruses, they share a common ancestor and the RdRp has a monophyletic origin in this group of viruses; however, despite the striking similarity of their structures, the coat protein genes are often unrelated and are clearly polyphyletic. So, in this case, the evolution of the RdRp alone does not reflect the evolution of partitiviruses, and it is likely that partitiviruses gained coat protein genes at different times, from various sources. All partitiviruses with resolved atomic structures, despite their high coat protein sequence divergence, share distinctive features, including surface arches that are unique among dsRNA viruses infecting fungi and not seen in other mycovirus families. The surface arches in partitiviruses, which have thin capsid shells, may have a role in stabilization of the capsid; however, their main function is still unknown. Among dsRNA viruses, partitiviruses and picobirnaviruses have minimalist genomes consisting of two dsRNA segments, and share capsid structure and distinctive features. However, picobirnaviruses, as vertebrate viruses, are grouped in a different family due to a variety of differences from partitiviruses, including host range. Given the structural similarity, two different evolutionary trajectories are suggested for Partitiviridae and Picobirnaviridae capsids: either they diverged from a common ancestor that existed before their host organisms diverged, or they converged from different ancestors as a solution to the problem of a thin, unstable capsid protein. Clearly, the evolution of mycoviruses depends not only on the evolution of the RdRp but also on that of other genes gained or lost over time. There is still much to be learned about mycovirus evolution, largely because this group of viruses is very understudied, and most of the work done has centered on viruses of fungi that are pathogenic in plants or animals.

Further Reading

Liu, H., Fu, Y., Xie, J., et al., 2012. Evolutionary genomics of mycovirus-related dsRNA viruses reveals cross-family horizontal gene transfer and evolution of diverse viral lineages. BMC Evolutionary Biology 12 (1), 91–105.
Koonin, E.V., Dolja, V.V., 2014. Virus world as an evolutionary network of viruses and capsidless selfish elements. Microbiology and Molecular Biology Reviews 78 (2), 278–303.
Krupovic, M., Koonin, E.V., 2017. Multiple origins of viral capsid proteins from cellular ancestors. Proceedings of the National Academy of Sciences of the United States of America 114 (12), E2401–E2410.
Luque, D., Mata, C.P., Suzuki, N., Ghabrial, S.A., Castón, J.R., 2018. Capsid structure of dsRNA fungal viruses. Viruses 10 (9).
Roossinck, M.J., Sabanadzovic, S., Okada, R., Valverde, R.A., 2011. The remarkable evolutionary history of endornaviruses. Journal of General Virology 92 (11), 2674–2678.
Roossinck, M.J., 2019. Evolutionary and ecological links between plant and fungal viruses. New Phytologist 221 (1), 86–92.
Safari, M., Roossinck, M.J., 2014. How does the genome structure and lifestyle of a virus affect its population variation? Current Opinion in Virology 9, 39–44.
Shackelton, L.A., Holmes, E.C., 2008. The role of alternative genetic codes in viral evolution and emergence. Journal of Theoretical Biology 254 (1), 128–134.
Taylor, D.J., Ballinger, M.J., Bowman, S.M., Bruenn, J.A., 2013. Virus-host co-evolution under a modified nuclear genetic code. PeerJ 1, e50.
Wolf, Y.I., Kazlauskas, D., Iranzo, J., et al., 2018. Origins and evolution of the global RNA virome. mBio 9 (6).

Relevant Website https://talk.ictvonline.org International Committee on Taxonomy of Viruses (ICTV).

Mixed Infections of Mycoviruses in Phytopathogenic Fungus Sclerotinia sclerotiorum
Jiatao Xie and Daohong Jiang, Huazhong Agricultural University, Wuhan, China
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Hypovirulence Reduced virulence of phytopathogenic fungi upon mycovirus infection.
Mycovirus Virus that can replicate and proliferate in fungal cells.
Virocontrol Biological control with hypovirulence-associated mycovirus.

Introduction

All living organisms, including humans, can be infected by innumerable viruses. As an important component of the virosphere, mycoviruses (or fungal viruses) infect and replicate in fungi, and may play important roles in shaping fungal ecosystems in nature. Hypovirulence is a phenomenon in which phytopathogenic fungi infected by one or more mycoviruses show reduced virulence, or lose virulence, on their hosts. Hypovirulence-associated mycoviruses isolated from nature have the potential to prevent and control fungal diseases of crops, and exploring them opens up a new strategy for the management of plant diseases. Hypovirulence-inducing Cryphonectria hypovirus 1 (CHV1) has been used successfully to control chestnut blight disease in Europe; this classic example of virocontrol of a fungal disease has encouraged more phytopathologists to search for mycovirus resources. Mycoviruses have been discovered in all major fungal groups during more than 50 years of research, and mixed infections with multiple related or unrelated mycoviruses have frequently been confirmed in a single fungal isolate. For example, Ophiostoma novo-ulmi Log1/3–8d2 (Ld), which has a diseased phenotype, was reported to be infected with twelve distinct mitoviruses. Two unrelated RNA viruses, yado-nushi virus 1 and yado-kari virus 1, co-infect a single isolate of the phytopathogenic fungus Rosellinia necatrix and potentially interact in vivo. High-throughput sequencing has supplied further insight into the complexity of mixed infections in fungi. A hypovirulent strain, DC17, of Rhizoctonia solani (AG2–2-IV) harbors 17 different mycoviruses that represent different virus lineages. Sixteen mycoviruses occur in a single strain of Fusarium poae. These reports reveal that mixed infections in fungi are fairly common and more complex than previously thought.

Sclerotinia sclerotiorum (Lib.) de Bary is a ubiquitous necrotrophic pathogen with a remarkably broad host range, encompassing over 400 species, including the economically important crops canola (oilseed rape), soybean, sunflower, and vegetables. In the later stage of the disease, S. sclerotiorum usually produces sclerotia, resting structures for overwintering and oversummering that can survive for several years in soil. Under favorable conditions (such as moist soil and a temperature of 10–20°C), sclerotia germinate to produce apothecia (carpogenic germination) or directly produce mycelia (myceliogenic germination), and the fungus then damages plant tissue via a sophisticated pathogenicity mechanism. The disease caused by S. sclerotiorum is responsible for crop losses of several hundred million US dollars worldwide each year. To control the disease, fungicides have been used widely and continuously. However, long-term dependence on chemical pesticides leads to pollution of the agricultural environment and ecological damage, which threatens food safety and human health. Therefore, environment-friendly biological control (such as virocontrol) of crop diseases is an alternative strategy for the sustainable development of agriculture.

The first description of hypovirulence and dsRNA elements in S. sclerotiorum was published in the Canadian Journal of Plant Pathology in 1992. So far, diverse mycoviruses with RNA or DNA genomes have been discovered in S. sclerotiorum (Table 1), including a DNA mycovirus, Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1 (SsHADV1), and the first identified negative-sense RNA mycovirus, Sclerotinia sclerotiorum negative-sense RNA virus 1 (SsNSRV-1). Using a high-throughput sequencing-based metatranscriptomic approach, 84 Australian S. sclerotiorum isolates were found to host 57 mycoviruses assigned to 10 distinct viral lineages, and 28 mycoviruses were detected in American S. sclerotiorum strains, suggesting that mycovirus infections are quite common in S. sclerotiorum populations. Herein, we summarize the genomes of the mycoviruses identified and molecularly characterized from the twenty-three reported S. sclerotiorum strains (Table 1). Some of these mycoviruses confer hypovirulence and are potential resources for biological control of Sclerotinia disease. For example, SsHADV1, the first example of extracellular transmission, has the potential to be developed into a virocontrol agent (or viral fungicide), and has been confirmed to control rapeseed rot disease effectively under field conditions. In addition, SsHADV1 can convert the lifestyle of S. sclerotiorum from pathogenic fungus to beneficial endophyte, which enhances rapeseed resistance and yield. Notably, more than 80% (19/23) of these strains are infected by multiple related and/or distinct mycoviruses (Table 1). In this article, we focus on the characteristics of mixed mycovirus infections and their impact on individual strains of S. sclerotiorum, and also discuss potential interactions among the viruses and between the viruses and S. sclerotiorum.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00051-5


Table 1 Mycoviruses infecting strains of Sclerotinia sclerotiorum

Strain | Mycoviruses^a | Type of genome | Phenotype
Ep-1PN | SsDRV, SsRV-L | +ssRNA | Hypovirulence
DT-8 | SsHADV1 | ssDNA | Hypovirulence
Sunf-M | SsPV-S, SsNSRV-1 | dsRNA, +ssRNA | Virulence
SZ-150 | SsHV1, SsBV3, SsMTV1 | dsRNA, +ssRNA | Hypovirulence
KL-1 | SsMV1, SsMV2 | +ssRNA | Hypovirulence
16235 | SsMV2, SsMV3, SsMV4, SsMV5, SsMV6, SsMV7 | +ssRNA | Hypovirulence
WF-1 | SsPV1 | dsRNA | Hypovirulence
11691 | RcEV1, SsEV1, SsMV2, SsMV4, SsMV5, SsMV6, SsMV7 | +ssRNA | Virulence
14563 | SsMV2, SsMV4, SsMV5, SsMV6, SsMV7, SsNSRV-1 | −ssRNA, +ssRNA | Virulence
Lu471 | SsEV1, SsMV2, SsMV3, SsMV4, SsMV5, SsMV6, SsMV7, SsNSRV-1 | −ssRNA, +ssRNA | Virulence
5472 | SsHV2, SsEV1, SsMV2, SsMV3, SsMV4, SsMV5, SsMV6, SsMV7, SsNSRV-1 | −ssRNA, +ssRNA | Hypovirulence
AH98 | SsNSRV-1, SsHV1 | −ssRNA, +ssRNA | Hypovirulence
SX247 | SsHV2, SsDRV | +ssRNA | Hypovirulence
HC025 | SsMV1 | +ssRNA | Hypovirulence
SCH941 | SsReV1, SsBV1, SsDFV1, SsDFV3, SsYkV1 | dsRNA, +ssRNA | Hypovirulence
JMTJ14 | SsFV1 | +ssRNA | Hypovirulence
SX466 | SsMBV1, SsMV2, SsPV2 | dsRNA, +ssRNA | Hypovirulence
AX19 | SsDFV1, six unknown dsRNA elements | | Hypovirulence
AH16 | SsMV2, SsBV2 | dsRNA, +ssRNA | Hypovirulence
288 | SsDFV2, BpRV1 | dsRNA, +ssRNA | Hypovirulence
277 | HuSRV1, SsMTV1 | +ssRNA | Hypovirulence
328 | SsHV2, SsEV1 | +ssRNA | Hypovirulence
SX10 | SsMyRV4 | dsRNA | Hypovirulence

^a Abbreviated mycovirus names: SsDRV, Sclerotinia debilitation-associated RNA virus; SsRV-L, Sclerotinia sclerotiorum RNA virus L; SsHADV1, Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1; SsPV, Sclerotinia sclerotiorum partitivirus; SsNSRV-1, Sclerotinia sclerotiorum negative-sense RNA virus 1; SsHV, Sclerotinia sclerotiorum hypovirus; SsBV, Sclerotinia sclerotiorum botybirnavirus; SsMTV1, Sclerotinia sclerotiorum mycotymovirus 1; SsMV, Sclerotinia sclerotiorum mitovirus; RcEV1, Rhizoctonia cerealis endornavirus 1; SsReV1, Sclerotinia sclerotiorum reovirus 1; SsYkV1, Sclerotinia sclerotiorum yadokarivirus 1; SsFV1, Sclerotinia sclerotiorum fusarivirus 1; SsMBV1, Sclerotinia sclerotiorum megabirnavirus 1; SsDFV, Sclerotinia sclerotiorum deltaflexivirus; BpRV1, Botrytis porri RNA virus 1; HuSRV1, Hubei sclerotinia RNA virus 1; SsMyRV4, Sclerotinia sclerotiorum mycoreovirus 4.

Strain Ep-1PN Harbors Two Positive-Sense Single-Stranded RNA Mycoviruses

Strain Ep-1PN was originally isolated from a sclerotium collected from a diseased eggplant (Solanum melongena) and had low virulence on rapeseed plants. dsRNA extraction suggested that Ep-1PN contained at least three extrachromosomal dsRNA segments (L, M, and S) of 7.4 kbp, 6.4 kbp, and 1.0 kbp, respectively. Subsequently, the L segment was confirmed to be the genome of Sclerotinia sclerotiorum RNA virus L (SsRV-L), the M segment was shown to be the replicative form of the genome of Sclerotinia sclerotiorum debilitation-associated RNA virus (SsDRV), and the S segment was a defective RNA derived from SsDRV. SsDRV was the first mycovirus of S. sclerotiorum to be molecularly characterized from its full-length genome. The SsDRV genome contains 5419 nucleotides (nt), excluding the poly(A) tail, and has a single open reading frame (ORF). This ORF encodes a polyprotein with three conserved domains: methyltransferase, helicase and RNA-dependent RNA polymerase (RdRp). In S. sclerotiorum strain SX247 (see the section on strain SX247), another strain, SsDRV/SX247, was characterized; it shares high sequence identity (81%) with SsDRV from strain Ep-1PN. Based on phylogenetic analysis, the replicase of SsDRV, the exemplar virus of the species Sclerotinia sclerotiorum debilitation-associated RNA virus of the genus Sclerodarnavirus, is closely related to the replicases of potex-like plant viruses (potexviruses) and Botrytis virus F (BVF) within the family Alphaflexiviridae. Interestingly, SsDRV has a single gene and lacks a coat protein gene, which clearly distinguishes it from other members of Alphaflexiviridae; these usually contain five or six genes, one of which encodes a coat protein. The genome of SsRV-L is 6043 nt in length, with a poly(A) tail. Similar to SsDRV, SsRV-L has only one ORF, which encodes a polyprotein (viral replicase). There is significant sequence similarity between the replicases of SsRV-L and human hepatitis E virus (HEV), which was the first discovery of a mycovirus related to a human virus. Phylogenetic analysis showed that SsRV-L is closely related to members of the rubi-like viruses, Closterovirus, Benyvirus, Tobamovirus and Omegatetravirus. Six mycoviruses closely related to SsRV-L have recently been detected in two phytopathogenic fungi, Rhizoctonia solani (three mycoviruses) and Sclerotium rolfsii (three mycoviruses), suggesting that SsRV-L and its related mycoviruses are common in fungi. SsDRV is associated with the hypovirulence phenotype of strain Ep-1PN, while SsRV-L has a limited impact on the biological features of S. sclerotiorum. Thus, the S. sclerotiorum-SsDRV interaction system helps us to explore the possible mechanisms underlying fungal pathogenicity. A total of 150 genes of S. sclerotiorum are down-regulated upon mycovirus infection. One of those genes, Sclerotinia sclerotiorum integrin-like gene (SSITL), encodes a secreted protein and is related to the virulence of S. sclerotiorum. This protein was confirmed to be secreted into plant cells, and plays a significant role in suppressing resistance mediated by the jasmonic acid/ethylene (JA/ET) signaling pathway via direct interaction with a chloroplast protein at the early stage of infection.


Strain SX247 is Co-Infected With Two +ssRNA Mycoviruses

Strain SX247 has abnormal colony morphology on PDA medium and is incapable of causing lesions on rapeseed. Two dsRNA segments, the replicative forms of Sclerotinia sclerotiorum hypovirus 2 (SsHV2/SX247) and SsDRV/SX247 (see strain Ep-1PN), were detected and characterized in strain SX247. The full-length cDNA of SsHV2/SX247 is 15219 nt, excluding the poly(A) tail, and contains a large ORF that encodes a putative polyprotein with three conserved domains: papain-like protease, RdRp, and viral RNA helicase. Phylogenetic analysis revealed that SsHV2/SX247 forms an independent phylogenetic branch with a hypovirus infecting Sclerotium rolfsii. Two other strains of SsHV2, SsHV2/5472 and SsHV2L, were molecularly characterized from New Zealand isolate 5472 and American isolate 328 (co-infected with a strain of Sclerotinia sclerotiorum endornavirus 1), respectively. Compared with SsHV2/SX247 and SsHV2/5472, which are essentially identical in genomic organization, SsHV2L lacks a fragment of 1.2 kb near the 5′ terminus and has an insertion of 524 nt related to Valsa ceratosperma hypovirus 1 (NC_017099), suggesting that SsHV2L is a recombinant of hypoviruses. In addition, an infectious clone of SsHV2L was successfully established to explore the interactions with S. sclerotiorum at the molecular level. All three strains of SsHV2 were directly or indirectly confirmed to be responsible for hypovirulence of S. sclerotiorum, either by transfection with transcripts or by horizontal transmission.

Strain AH98 is Mix-Infected by Two Unrelated Single-Stranded RNA Mycoviruses

Two mycoviruses, Sclerotinia sclerotiorum hypovirus 1 (SsHV1) and Sclerotinia sclerotiorum negative-sense RNA virus 1 (SsNSRV-1), were identified in the hypovirulent strain AH98. SsHV1 and its biological features are described under strain SZ-150 (see below). The full-length genome of SsNSRV-1, the first (−)ssRNA virus found infecting fungi, is 10002 nt (KJ186782) and contains six non-overlapping genes (ORFs I-VI) arranged linearly in the genome. ORF V encodes a large protein (L protein) of 1934 amino acids that contains the conserved mononegaviral RdRp domain and is closely related to the L proteins of mononegaviruses, while the other ORFs encode putative proteins of unknown function. The gene-junction sequence (A/U)(U/A/C)UAUU(U/A)AA(U/G)AAAACUUAGG(A/U)(G/U), a widespread and characteristic feature of mononegaviruses, is found at each ORF junction of SsNSRV-1. The sequence "AAAACUUAGG" serves as the transcriptional stop signal of the upstream ORF, while "UAUUUAAUAAAACU" is the transcriptional start signal of the downstream ORF. All six ORFs can be transcribed independently. In addition, ORF V and ORF VI are co-transcribed. The nucleocapsids of SsNSRV-1 are 22 nm in diameter and 200–2000 nm in length. SsNSRV-1 has defective RNAs with different internal deletions of ORF I and ORF II, and with deletions of the 5′ termini. Although SsNSRV-1 is a member of Mononegavirales, it differs from previously reported members in genome organization; therefore, a new family, Mymonaviridae, and a new genus, Sclerotimonavirus, were established to accommodate SsNSRV-1 and its related viruses. Moreover, SsNSRV-1 is widely distributed in the natural environment and has been detected in S. sclerotiorum strains collected from China, Australia, and the USA. Transfection with SsNSRV-1 virions demonstrated that SsNSRV-1 is the key factor for hypovirulence in strain AH98.
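As an illustration of how a degenerate gene-junction consensus of this kind can be searched for in a sequence, the following minimal sketch converts the consensus quoted above into a regular expression and scans an RNA string with it. The helper names (consensus_to_regex, find_junctions) and the toy RNA are hypothetical; the toy sequence is constructed to contain one junction and is not taken from the SsNSRV-1 genome.

```python
import re

# Consensus as quoted in the text; "(A/U)" means "A or U" at that position.
GENE_JUNCTION = "(A/U)(U/A/C)UAUU(U/A)AA(U/G)AAAACUUAGG(A/U)(G/U)"


def consensus_to_regex(consensus):
    """Rewrite each '(X/Y)' alternative as a regex character class '[XY]'."""
    body = re.sub(
        r"\(([ACGU/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )
    return re.compile(body)


def find_junctions(rna, consensus=GENE_JUNCTION):
    """Return (start position, matched sequence) for every junction found."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group()) for m in pattern.finditer(rna)]


# Hypothetical toy RNA: two flanking bases on each side of one junction.
TOY_RNA = "GG" + "AUUAUUUAAUAAAACUUAGGAG" + "CC"

print(consensus_to_regex(GENE_JUNCTION).pattern)
# [AU][UAC]UAUU[UA]AA[UG]AAAACUUAGG[AU][GU]
print(find_junctions(TOY_RNA))
# [(2, 'AUUAUUUAAUAAAACUUAGGAG')]
```

The matched toy junction contains both the stop signal "AAAACUUAGG" and the start signal "UAUUUAAUAAAACU" named in the text, which overlap within the 22-nt consensus.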

The Hypovirulent Strain SZ-150 Contains Three Mycoviruses and a Satellite RNA

Strain SZ-150 was collected in Suizhou County, China, and its virulence on rapeseed plants is severely reduced by mycovirus infection. On the basis of conventional dsRNA extraction, SZ-150 was originally considered to harbor two mycovirus-related elements, SsHV1 and a co-replicating satellite RNA (SatH). The genome of SsHV1 is 10,398 bp, excluding the poly(A) tail. SsHV1 contains a single ORF encoding a polyprotein with three conserved domains: glycosyltransferase, RdRp, and viral helicase. SsHV1 is phylogenetically related to hypoviruses, and is the first hypovirus identified in a fungus other than Cryphonectria parasitica. SsHV1 was also detected in two other S. sclerotiorum strains, AH98 (co-infected with SsNSRV-1) and AX19, as well as in strains from Australia. SatH consists of 3643 nt and has an ORF encoding a putative protein (p70) of unknown biological function. SatH replication is entirely dependent on the replication of SsHV1, which acts as a classic helper virus, and the genome of SatH is larger than those of previously reported viral satellite RNAs. Neither SsHV1 without SatH nor infectious cDNA of SatH without SsHV1 results in hypovirulence of S. sclerotiorum, suggesting that SsHV1 and SatH are co-determinants of hypovirulence in strain SZ-150. Further research suggested that the p70 protein encoded by SatH may act as an RNA silencing suppressor that boosts expression of SsHV1. It is possible that p70 suppresses the host antiviral response, resulting in increased replication of SsHV1 and also guaranteeing the survival of SatH in S. sclerotiorum. In addition to SsHV1 and SatH, SZ-150 was confirmed by deep sequencing of small RNAs to harbor two other mycoviruses, Sclerotinia sclerotiorum botybirnavirus 3 (SsBV3/SZ-150) and Sclerotinia sclerotiorum mycotymovirus 1 (SsMTV1/SZ-150). SsBV3/SZ-150 was assumed to be a strain of Botrytis porri botybirnavirus 1 (BpBV1), because its whole genome shares more than 95% sequence identity with BpBV1 at the nucleotide and amino acid levels. SsMTV1/SZ-150 was predicted to contain a large ORF that encodes a putative replication-associated polyprotein. SsMTV1/SZ-150 is related, albeit distantly, to members of the family Tymoviridae. Both SsMTV1/SZ-150 and SsBV3/SZ-150 make a limited contribution to the hypovirulence of strain SZ-150.


Mixed Infections of Mycoviruses in Phytopathogenic Fungus Sclerotinia sclerotiorum

Four Mycoviruses Co-Infect a Hypovirulent Strain SCH941

Two novel dsRNA viruses were identified in strain SCH941, a hypovirulent strain of S. sclerotiorum. One is a bipartite dsRNA virus named Sclerotinia sclerotiorum botybirnavirus 1 (SsBV1), and the other is a novel reovirus named Sclerotinia sclerotiorum reovirus 1 (SsReV1). SsBV1 comprises two dsRNA segments (dsRNA1 and dsRNA2), and its genome is 12,422 bp in total. A satellite-like RNA (SatlRNA) of SsBV1, 1647 bp in length, was also found. SsBV1 has spherical virions 38 nm in diameter. SsBV1 clusters phylogenetically with BpBV1, a bipartite virus infecting Botrytis porri, and belongs to the genus Botybirnavirus. Compared with the virus-free strain of S. sclerotiorum, infection by SsBV1 alone has no significant impact on the biological features of S. sclerotiorum, but co-infection by SsBV1 and SatlRNA slightly reduces its virulence. Besides SsBV1, botybirnaviruses have been isolated from S. sclerotiorum strain AH16 (co-infected by SsBV2 and a mitovirus) and strain SZ-150 (SsBV3), as well as from Australian strains. SsBV2 is responsible for hypovirulence, whereas SsBV3 infection is asymptomatic in S. sclerotiorum (Ran et al., 2016; Wang et al., 2019). SsReV1, which has spherical virions approximately 65 nm in diameter, comprises 11 dsRNA segments (dsRNA1 to dsRNA11), and its whole genome is 28,055 bp in length. The conserved 5′-terminal sequence 5′-GAGWUKK-3′ (W = U or A; K = U or G) and 3′-terminal sequence 5′-UGCAGUC-3′ are found in each dsRNA segment. Each dsRNA segment contains a single ORF, except dsRNA6 and dsRNA11, which each contain two ORFs. The dsRNA1-encoded protein VP1 was predicted to be the RdRp of SsReV1, responsible for virus replication. The dsRNA2-encoded protein VP2 is likely to be a viral methyltransferase, and the dsRNA8-encoded VP8 is likely to be a viral kinase and helicase. The biological functions of the other SsReV1 proteins are still unknown. 
Interestingly, two conserved domains, a double-stranded RNA binding motif (dsRBM, pfam00035) and a reovirus sigma C capsid protein domain (Reo_sC, pfam04582), were identified in the genome of SsReV1. Phylogenetic analysis and multiple alignment indicate that reoviruses have indeed undergone HGT events with other virus lineages, and even with cellular organisms, on a large scale. SsReV1 is more closely related to mammalian coltiviruses than to fungal mycoreoviruses. However, the dsRNA profile of the SsReV1 genome is distinct from those of members of the genera Coltivirus and Mycoreovirus, suggesting that SsReV1 represents a novel genus in the family Reoviridae. Reoviruses have been reported in C. parasitica (MyRV1 and MyRV2), R. necatrix (RnMyRV3), and S. sclerotiorum (SsMyRV4). Infection by those mycoreoviruses debilitates the virulence of their hosts, but an S. sclerotiorum strain infected by SsReV1 alone shows no symptoms. Strain SCH941, co-infected by SsReV1 and SsBV1, failed to cause typical lesions on rapeseed plants, whereas S. sclerotiorum strains infected by SsReV1 alone or by SsBV1 alone remained strongly virulent on the host. To exclude the possibility that strain SCH941 was infected by other viruses overlooked by dsRNA extraction, a metatranscriptomic analysis of strain SCH941 was conducted. It revealed three additional mycoviruses coinfecting strain SCH941: Sclerotinia sclerotiorum yadokarivirus 1 (SsYkV1), Sclerotinia sclerotiorum deltaflexivirus 1 (SsDFV1), and SsDFV3 (unpublished data). SCH941 is therefore coinfected with five mycoviruses, two dsRNA mycoviruses and three +ssRNA mycoviruses. Notably, SsBV1 and SsYkV1 always co-exist in a single isolate, and SsYkV1 could not be separated from SsBV1, suggesting a potential interplay between them. We recently confirmed that neither SsDFV1 nor SsDFV3 is associated with hypovirulence, but that the hypovirulent phenotypes of S. sclerotiorum strains are co-determined by SsBV1, SsYkV1, and SsReV1. 
Although it remains unresolved whether these three mycoviruses interact synergistically in S. sclerotiorum, we found that the accumulation of SsReV1 is significantly increased in S. sclerotiorum strains upon co-infection with SsBV1 and SsYkV1 (unpublished data). Strain SCH941 and its associated mycoviruses provide an excellent system for exploring the quadripartite fungus-virus-virus-plant interaction.
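The conserved terminal sequences reported for the SsReV1 segments are written with nucleotide ambiguity codes (W = U or A; K = U or G). As an illustration only, the following minimal sketch shows how such consensus motifs could be checked against the plus-strand sequence of a segment. The function names and the toy sequence are hypothetical and are not taken from the studies cited here.

```python
import re

# Ambiguity codes used in the motifs quoted in the text
# (W = A or U; K = G or U); plain bases map to themselves.
IUPAC_RNA = {"A": "A", "C": "C", "G": "G", "U": "U", "W": "[AU]", "K": "[GU]"}

FIVE_PRIME_MOTIF = "GAGWUKK"   # 5'-terminal consensus quoted in the text
THREE_PRIME_MOTIF = "UGCAGUC"  # 3'-terminal consensus quoted in the text


def motif_to_regex(motif: str) -> str:
    """Translate an ambiguity-code motif into a regular-expression fragment."""
    return "".join(IUPAC_RNA[base] for base in motif)


def has_conserved_termini(segment: str) -> bool:
    """Return True if the plus strand starts and ends with the consensus motifs."""
    rna = segment.upper().replace("T", "U")  # accept cDNA-style input
    starts = re.match(motif_to_regex(FIVE_PRIME_MOTIF), rna) is not None
    ends = re.search(motif_to_regex(THREE_PRIME_MOTIF) + "$", rna) is not None
    return starts and ends


# Hypothetical toy sequence, not a real SsReV1 segment.
toy_segment = "GAGAUGU" + "ACGUACGUACGU" + "UGCAGUC"
print(has_conserved_termini(toy_segment))
```

A check of this kind is only a convenience for screening assembled segments; terminal sequences are normally confirmed experimentally.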

Megabirnavirus Infects the Hypovirulent Strain SX466 With a Mitovirus and a Partitivirus

Strain SX466 harbors two dsRNA mycoviruses and an ssRNA mycovirus: Sclerotinia sclerotiorum megabirnavirus 1 (SsMBV1), SsMV2/SX466, and Sclerotinia sclerotiorum partitivirus 2 (SsPV2/SX466) (Wang et al., 2015). SsMBV1 forms rigid spherical particles approximately 45 nm in diameter and contains two dsRNA segments (dsRNA1 and dsRNA2), each of which possesses two ORFs (ORF1 and ORF2 on dsRNA1, ORF3 and ORF4 on dsRNA2). Interestingly, a papain-like protease domain in a dsRNA2-encoded protein appears to have been acquired from single-stranded RNA viruses via horizontal gene transfer. dsRNA2/SsMBV1 is dispensable for SsMBV1 replication and packaging, but it enhances the transcript accumulation of dsRNA1/SsMBV1 and the stability of SsMBV1 particles. A similar result was found in other megabirnaviruses infecting R. necatrix. Unlike Rosellinia necatrix megabirnavirus 1 (RnMBV1), SsMBV1 has only a slight impact on the fundamental biological characteristics of its host, regardless of the presence or absence of dsRNA2/SsMBV1. Whether SsMV2/SX466, SsPV2/SX466, or a combination of the three mycoviruses is responsible for hypovirulence in strain SX466 remains to be explored.

Strain Sunf-M is Coinfected With Two Unrelated dsRNA Mycoviruses

Although strain Sunf-M, isolated from a diseased sunflower plant (Helianthus annuus), is strongly virulent on its hosts, two mycoviruses, Sclerotinia sclerotiorum nonsegmented virus L (SsNsV-L) and Sclerotinia sclerotiorum partitivirus S (SsPV-S), and a dsRNA element were detected in it. SsNsV-L is 9124 nucleotides in length without a poly(A) tail. SsNsV-L has two large ORFs (ORF1 and ORF2), and ORF2 is predicted to encode the RdRp. Genome comparison and evolutionary analysis suggested that SsNsV-L represents a novel viral evolutionary lineage. Two other strains of SsNsV-L (above 90% identity with SsNsV-L/Sunf) were detected in two strains of Botrytis spp. SsPV-S in Sunf-M is a typical partitivirus based on RdRp analysis, but its coat protein has the highest amino acid sequence similarity to IAA-leucine-resistant protein 2 (ILR2) of Arabidopsis, and its similarity to the CPs of other partitiviruses is considerably lower. Systematic analysis further revealed that HGT may have occurred widely between dsRNA viruses and eukaryotic nuclear genomes but had previously gone undetected. A similar result was obtained from the analysis of Rosellinia necatrix partitivirus 2.

Mitoviruses Widely Exist in and Co-Infect S. sclerotiorum

Mitoviruses are usually considered to be located and to replicate in the mitochondria of fungi. Mitovirus genomes contain a single ORF, as revealed by invoking fungal mitochondrial codon usage (UGA coding for tryptophan). Mitoviruses encode an RdRp but lack a protein capsid. According to the NCBI database, thirty-four Sclerotinia sclerotiorum mitoviruses (SsMV) have been provisionally named, from SsMV1 (accession number JQ013377) to SsMV34 (accession number MF444264), indicating that mitoviruses are more prevalent and diverse in S. sclerotiorum than any other mycoviruses (Fig. 1). Moreover, multiple infection of a single isolate by mitoviruses is very common in fungi, including S. sclerotiorum. Two mitoviruses, SsMV1 and SsMV2, were found to co-infect the hypovirulent strain KL-1, and three mitoviruses, SsMV5, SsMV6, and SsMV7, co-infected strain Lu471. In the following section, we summarize only mitoviruses from hypovirulent strains of S. sclerotiorum and exclude mitoviruses detected via deep sequencing. Strain KL-1 was originally isolated from diseased lettuce in the USA and is co-infected with two mitoviruses, SsMV1 and SsMV2. The full-length cDNAs of SsMV1/KL-1 and SsMV2/KL-1 are each approximately 2500 nt long and, as is usual for mitoviruses, have a low GC content. As in previously reported mitoviruses, the UTRs of SsMV1/KL-1 and SsMV2/KL-1 can form stem-loop structures. SsMV1/KL-1 and SsMV2/KL-1 are considered to be the causal agents of hypovirulence in strain KL-1. Another strain of SsMV1 (SsMV1/HC025) was characterized in the Chinese strain HC025, isolated from diseased soybean. SsMV1/HC025 shared the highest sequence identity (74%) with SsMV1/KL-1. The UTRs of SsMV1/HC025 show inverted complementarity and could form a potentially stable panhandle structure, which is lacking in SsMV1/KL-1. SsMV1/HC025 infection exerted obvious effects on host biological properties, including abnormal colony morphology, lower virulence, and abnormal mitochondrial morphology. SsMV2/16235 was detected in the hypovirulent strain 16235, isolated from an infected Petroselinum crispum (wild parsley) stem in New Zealand. The RdRp of SsMV2/16235 shares 91.6% amino acid sequence identity with that of SsMV2/KL-1, and the virus is therefore considered an isolate of SsMV2. The dsRNA profile suggested that at least three dsRNA segments are present in the strain infected by SsMV2/NZ1: one is the replicative form of SsMV2/16235, and the other two are those of SsMV3/NZ1 and SsMV4/NZ1. Thus, S. sclerotiorum isolate 16235 is co-infected by a minimum of three different mitoviruses. The same research group detected mycoviruses in three further S. sclerotiorum strains, 14563, Lu471, and 11691, via a dsRNA extraction method. They initially found that strain 14563 was co-infected by three mitoviruses (SsMV2/14563, SsMV5/14563, and SsMV6/14563), that strain Lu471 harbors three mitoviruses (SsMV5/Lu471, SsMV6/Lu471, and SsMV7/Lu471), and that strain 11691 contains SsMV5/11691 and an endornavirus (SsEV1). Shortly afterward, Illumina sequencing revealed more viruses in the same four fungal strains (16235, 14563, Lu471, and 11691): six mitoviruses in strain 16235; five mitoviruses and SsNSRV-1 in strain 14563; six mitoviruses, SsEV1, and SsNSRV-1 in strain Lu471; and five mitoviruses and two endornaviruses in strain 11691. These detection patterns are more complex than those obtained previously with the dsRNA extraction method.

Fig. 1 Maximum likelihood (ML) phylogenetic tree based on multiple alignments of the full RdRp region of mitoviruses. Bootstrap values (%) obtained with 1000 replicates are indicated on the branches, and only values above 50 are shown. Branch lengths correspond to genetic distances; the scale bar (0.5) at the left corresponds to the genetic distance. A group of eight narnaviruses was used as an out-group. Mitoviruses infecting S. sclerotiorum are indicated in blue typeface. GenBank accession numbers follow in brackets.
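The observation that a mitovirus genome shows a single ORF only when fungal mitochondrial codon usage is applied (UGA coding for tryptophan) can be illustrated with a minimal sketch. It assumes that, for this purpose, the only relevant difference from the standard genetic code is the reassignment of UGA from a stop codon to tryptophan. The function name and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}

# Assumption for this sketch: the mitochondrial code differs only in UGA = Trp (W).
MITO_CODE = dict(STANDARD_CODE, UGA="W")


def translate(rna: str, code: dict) -> str:
    """Translate from the first base in frame until the first stop codon."""
    rna = rna.upper().replace("T", "U")
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = code[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence with an in-frame UGA codon.
toy_rna = "AUGUGAGCUUAA"
print(translate(toy_rna, STANDARD_CODE))  # M   (UGA read as stop)
print(translate(toy_rna, MITO_CODE))      # MWA (UGA read as tryptophan)
```

With the standard code the toy reading frame is interrupted at UGA, whereas with the mitochondrial assignment it continues to the next stop codon, which is the pattern described above for mitovirus RdRp ORFs.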

Three Strains of S. sclerotiorum Harbor Multiple Unassigned Mycoviruses or dsRNA Elements

Strain AX19 contains at least seven dsRNA segments (dsRNA1 to dsRNA7), and the known mycoviruses infecting strain AX19 include a hypovirus (dsRNA2), an ourmiavirus (dsRNA5), a mitovirus (dsRNA6), and an unassigned virus, Sclerotinia sclerotiorum deltaflexivirus 1 (SsDFV1, dsRNA3). No information was obtained for the other dsRNA segments. The complete genome of SsDFV1 was predicted to contain one larger ORF and three smaller ORFs. Although the protein encoded by the larger ORF shares sequence similarity with members of the order Tymovirales, SsDFV1 is markedly distant from all of these members in genomic organization and phylogenetic position. This led the ICTV to create the family Deltaflexiviridae to accommodate SsDFV1. A second deltaflexivirus (SsDFV2) was isolated from the hypovirulent strain 288, which was coinfected by a botybirnavirus (BpRV1). SsDFV2 confers hypovirulence on its host and can overcome the barrier that host vegetative incompatibility poses to horizontal transmission, but the impact of SsDFV1 on S. sclerotiorum still needs to be explored. A new +ssRNA mycovirus, designated Hubei sclerotinia RNA virus 1 (HuSRV1), was discovered in strain 277 of S. sclerotiorum. HuSRV1 is 4492 nt long and lacks a poly(A) tail. The genome of HuSRV1 is smaller than those of most mycoviruses except mitoviruses, but four ORFs were predicted, encoding four putative proteins (a protease, an RdRp, a coat protein, and a hypothetical protein of unknown function). HuSRV1 is placed on a phylogenetic branch distinct from known viruses. In addition, strain 277 is also infected by SsMTV1, which was characterized in strain SZ-150, and its hypovirulent phenotype was shown to be caused by HuSRV1 infection.

The Possible Reasons for the Co-Infections of Mycoviruses in S. sclerotiorum

A complex vegetative incompatibility system exists in S. sclerotiorum populations. It is usually considered that the transmission of mycoviruses is limited by the vegetative incompatibility of host fungi. However, the majority of reported S. sclerotiorum strains are infected with multiple mycoviruses (Table 1). High-throughput sequencing can accurately detect mycoviruses that are missed by traditional dsRNA extraction. The hypovirulent strain HC051 was confirmed to be co-infected by thirteen mycoviruses, including a DNA mycovirus, +ssRNA mycoviruses, -ssRNA mycoviruses, and dsRNA mycoviruses (unpublished data), suggesting that co-infections in S. sclerotiorum are more complex than previously thought. These phenomena also suggest that mycoviruses are likely to overcome the transmission barrier imposed by vegetative incompatibility in nature via unknown mechanisms. Recently, Thapa and Roossinck (2019) summarized the determinants of mycovirus co-infection and the ways in which the limitation of vegetative incompatibility can be broken in filamentous fungi. For S. sclerotiorum, we list here potential reasons for the co-infections of mycoviruses. (1) Some mycoviruses (such as SsHADV1, SsPV1, and SsDFV2) are strongly infectious. SsHADV1 virions can directly infect hyphae of S. sclerotiorum and are easily transmitted via hyphal contact. SsHADV1 is prevalent among S. sclerotiorum strains isolated directly from fields and co-infects single strains together with other, unrelated mycoviruses. (2) Some mycoviruses (such as SsMyRV4) are capable of promoting horizontal transmission of heterologous mycoviruses between vegetatively incompatible strains. (3) Some mycoviruses act as RNA silencing suppressors to counter the antiviral immune system of the host fungus. S. sclerotiorum can activate an antiviral immune response (such as RNAi) upon mycovirus infection. In turn, mycoviruses have evolved mechanisms to inhibit this host immune response. The satellite RNA of SsHV1 can inhibit the expression of RNAi-related genes, thereby increasing the replication of its helper virus. (4) Mycoviruses are capable of escaping the barrier of the vegetative incompatibility system while the fungi hosting them infect plants. For example, the horizontal transmission rate of CHV1 among strains of different vegetative compatibility types is significantly higher when the fungal strains infect chestnut than when they grow on artificial medium. A similar phenomenon was found for mycoviral transmission in S. sclerotiorum (unpublished data), suggesting that the mechanism of horizontal transmission under natural conditions differs from that under laboratory conditions and remains to be further explored.


References

Thapa, V., Roossinck, M.J., 2019. Determinants of coinfection in the mycoviruses. Frontiers in Cellular and Infection Microbiology 9, 169.
Wang, Q., Cheng, S., Xiao, X., et al., 2019. Discovery of two mycoviruses by high-throughput sequencing and assembly of mycovirus-derived small silencing RNAs from a hypovirulent strain of Sclerotinia sclerotiorum. Frontiers in Microbiology 10, 1415.

Further Reading

Bolton, M.D., Thomma, B.P., Nelson, B.D., 2006. Sclerotinia sclerotiorum (Lib.) de Bary: Biology and molecular traits of a cosmopolitan pathogen. Molecular Plant Pathology 7 (1), 1–16.
Hillman, B.I., Cai, G., 2013. The family Narnaviridae: Simplest of RNA viruses. Advances in Virus Research 86, 149–176.
Jiang, D., Fu, Y., Li, G., et al., 2013. Viruses of the plant pathogenic fungus Sclerotinia sclerotiorum. Advances in Virus Research 86, 215–248.
Khalifa, M.E., Varsani, A., Ganley, A.R.D., et al., 2016. Comparison of Illumina de novo assembled and Sanger sequenced viral genomes: A case study for RNA viruses recovered from the plant pathogenic fungus Sclerotinia sclerotiorum. Virus Research 219, 51–57.
Liu, L., Xie, J., Cheng, J., et al., 2014. Fungal negative-stranded RNA virus that is related to bornaviruses and nyaviruses. Proceedings of the National Academy of Sciences of the United States of America 111, 12205–12210.
Marzano, S.Y.L., Nelson, B.D., Ajayi-Oyetunde, O., et al., 2016. Identification of diverse mycoviruses through metatranscriptomics characterization of the viromes of five major fungal plant pathogens. Journal of Virology 90, 6846–6863.
Mu, F., Xie, J., Cheng, S., et al., 2018. Virome characterization of a collection of S. sclerotiorum from Australia. Frontiers in Microbiology 8, 2540.
Nuss, D.L., 2005. Hypovirulence: Mycoviruses at the fungal-plant interface. Nature Reviews Microbiology 3, 632–642.
Wu, S., Cheng, J., Fu, Y., et al., 2017. Virus-mediated suppression of host non-self recognition facilitates horizontal transmission of heterologous viruses. PLoS Pathogens 13 (3), e1006234.
Xie, J., Jiang, D., 2014. New insights into mycoviruses and exploration for the biological control of crop fungal diseases. Annual Review of Phytopathology 52, 45–68.
Yu, X., Li, B., Fu, Y., et al., 2010. A geminivirus-related DNA mycovirus that confers hypovirulence to a plant pathogenic fungus. Proceedings of the National Academy of Sciences of the United States of America 107, 8387–8392.
Zhang, H., Xie, J., Fu, Y., et al., 2020. A 2-kb mycovirus converts a pathogenic fungus into a beneficial endophyte for Brassica protection and yield enhancement. Molecular Plant 13 (10), 1420–1433.
Zhang, R., Hisano, S., Tani, A., et al., 2016. A capsidless ssRNA virus hosted by an unrelated dsRNA virus. Nature Microbiology 1, 15001.

Mycovirus-Mediated Biological Control

Daniel Rigling, Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Birmensdorf, Switzerland
Cécile Robin, INRAE – French National Research Institute for Agriculture, Food and Environment, UMR BIOGECO, Cestas, France
Simone Prospero, Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Birmensdorf, Switzerland

© 2021 Elsevier Ltd. All rights reserved.

Glossary

Biological control: A method to control pests and pathogens by using natural enemies.
Horizontal mycovirus transmission: Transmission of viruses from infected to non-infected fungal strains via hyphal anastomosis.
Hyphal anastomosis: Fusion of fungal hyphae. The formation of hyphal anastomoses allows cytoplasmic exchange and horizontal mycovirus transmission between different fungal strains.
Hypovirulence: Reduced ability of a pathogen to cause disease. The term hypovirulence is commonly used in the context of mycovirus-mediated attenuation of fungal virulence.
Transfection: Artificial introduction of viral particles or viral transcripts into fungal cells.
Vegetative incompatibility: A genetic system that controls the ability of two fungal strains to form hyphal anastomoses and, consequently, the horizontal transmission of mycoviruses between them.
Vertical mycovirus transmission: Transmission of viruses to sexual and asexual spores (progeny) produced by virus-infected fungal strains.

Introduction

Fungi represent a major group of pathogens that are able to infect plants and cause diseases in agricultural and forestry production systems. Management of plant diseases mainly depends on resistance breeding and chemical control with fungicides. However, fungal pathogens can overcome plant resistance and can develop tolerance to fungicides. In addition, chemical control can have undesirable side effects on the environment. Biological control offers an alternative disease management tool. Biological disease control utilizes natural enemies or parasites that attack and suppress pathogens. Viruses that infect fungi (so-called mycoviruses or fungal viruses) have the potential to act as biological control agents against plant diseases. Mycoviruses have been found in all major groups of plant pathogenic fungi. However, in many cases they live inside the fungal host without causing it harm. Fungal pathogens can also cause infections in humans and animals, including insects. Mycoviruses have been detected in several of these fungal pathogens. While some are associated with enhanced fungal virulence, others can induce hypovirulence or debilitation in their hosts. This article, however, focuses only on mycoviruses that occur in plant pathogenic fungi and does not discuss those present in other fungal pathogens.

Disease Cycle of Plant Pathogenic Fungi

Plant pathogenic fungi can have complex life cycles. The disease process typically begins with an initial infection by fungal spores that germinate on, and then penetrate into, the host plant. This is followed by parasitic growth inside the host tissue. Depending on the type of tissue attacked and the severity of the infection, parts of the plant (e.g., for leaf pathogens) or the entire plant (e.g., for root or stem pathogens) are killed by the pathogen. Finally, the fungal pathogen produces spores on the infected plant tissue. These spores are then dispersed to new hosts by wind, rain, or insect vectors. Many fungal pathogens can reproduce both sexually and asexually. Sexual reproduction takes place after the mating of two individuals, which upon meiosis produce genetically recombinant sexual spores. In contrast, asexual spores are produced by single individuals and are all genetically identical. Asexual fungal spores can have two functions: they can cause new infections, but they can also act as spermatia during sexual reproduction. Two main lifestyles are distinguished in plant pathogenic fungi. Biotrophic pathogens derive nutrients from living plant cells, and their hosts are therefore not killed rapidly. In contrast, necrotrophic pathogens derive nutrients from dead cells that have been killed by the pathogen. Necrotrophic pathogens usually attack weak or damaged plants and can also survive as saprophytes on dead plant tissues.

Basic Concepts for Mycovirus-Mediated Biological Control

For a mycovirus to act as a biological control agent, four basic criteria must be fulfilled: (1) It must reduce the virulence of the pathogen (i.e., cause hypovirulence) by suppressing at least one phase in the disease cycle of the fungal pathogen. Either the infection or the reproduction process of the pathogen can be targeted. Suppressing either process, alone or in combination, will disrupt the disease epidemic and reduce the severity of the disease.


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21516-1


(2) The mycovirus must be transmissible into the fungal pathogen that is causing the disease. Most mycoviruses do not have an extracellular phase, meaning that they can only be transmitted upon hyphal anastomosis (fusion) and cytoplasmic exchange between fungal individuals. Extracellular transmission of virus particles can also occur, but this seems to be an exception. (3) It must be able to overcome the cellular defense mechanisms that fungi deploy against virus infections. Fungi use RNA silencing as an antiviral defense pathway: viral RNA within the fungal host is recognized and broken down into small RNA fragments. Some mycoviruses can overcome this defense response by suppressing the RNA silencing pathway. (4) Finally, the mycovirus must be able to replicate and spread through the fungal mycelium. Most fungi produce networks of filamentous hyphae, with actively growing hyphae at the edge of the thallus. Fungal reproductive structures (i.e., fruiting bodies) develop in the older parts of the mycelium. Spread of the virus into actively growing hyphae is important to interfere with mycelial growth and the infection process, as well as to allow for horizontal virus transmission, while spread of the mycovirus into the reproductive structures is required for vertical virus transmission into spores.

Further desired characteristics include the spread and persistence of the biocontrol mycovirus in the pathogen population. Since most mycoviruses do not have an extracellular phase, they are fully dependent on their host for their survival and spread. This means that the mycovirus can only spread together with its fungal host, which typically disseminates by spores. Consequently, the spread of a mycovirus depends on vertical virus transmission into fungal spores.

General Procedure for Developing a Mycovirus-Mediated Biological Control System

Virus Detection and Characterization

The detection of mycoviruses is the first step towards the development of a biological control system. Isolates of known pathogenic species that exhibit slow or debilitated growth in vitro and/or reduced virulence in planta are particularly interesting for virus screening. For necrotrophic pathogens, mycoviruses that cause hypovirulence are more likely to be detected among fungal isolates recovered from the saprophytic stage of the fungus. The classical approach for the detection of mycoviruses involves the extraction of double-stranded (ds) RNA from the fungal isolates. This approach exploits the fact that most mycoviruses are RNA viruses that either have a dsRNA genome or, in the case of single-stranded RNA viruses, a dsRNA replicative form. Further steps include the synthesis of complementary DNA (cDNA) using reverse transcriptase, followed either by direct sequencing or by cloning of the cDNA fragments into appropriate vectors and sequencing. The alignment of overlapping cDNA clones then allows the assembly of the viral genome. This approach was first used in the 1990s by Donald Nuss and co-workers to determine the complete nucleotide sequence of the Cryphonectria hypovirus 1 (CHV1) genome, one of the best-characterized and best-known biocontrol mycoviruses (Table 1). In recent years, many novel mycoviruses have been detected by using next-generation sequencing, also known as ‘high-throughput’ or ‘deep sequencing’. This detection approach has co-opted RNA-Seq (RNA sequencing) tools that were originally developed for whole transcriptome analysis. The procedure starts with the extraction of total RNA from a fungal isolate, followed by the synthesis and sequencing of the total cDNA. The resulting sequences are quality-controlled using bioinformatic tools, and putative viral sequences are then identified with BLAST (Basic Local Alignment Search Tool) searches against viral sequence collections. In the most successful searches, a match can be found with conserved motifs in viral genes encoding the RNA-dependent RNA polymerase. These RNA polymerases are responsible for the replication of the viral genomes. The viral RNA polymerase gene is also the first choice for phylogenetic analysis to reveal the taxonomic classification of the detected mycovirus. Once the nucleotide sequence of a mycovirus has been determined, specific PCR assays can be designed to screen fungal culture collections for the presence of that particular mycovirus. The availability of a large culture collection is important because different fungal isolates can host different virus strains, which often vary in their phenotypic effects on their hosts. Thus, the potential to explore mycovirus diversity could be crucial for finding suitable biological control agents.
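The BLAST-based identification step described above can be sketched as a simple filter over search results. This is a minimal illustration, assuming that the search was run with the standard twelve-column tabular output of BLAST+ (`-outfmt 6`). The e-value cutoff of 1e-5, the function name, and all contig and reference identifiers are hypothetical choices made for the example, not values taken from the studies discussed here.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_viral_hits(tabular_text: str, max_evalue: float = 1e-5) -> dict:
    """Return the best-scoring subject per contig among hits passing the cutoff."""
    best = {}  # contig id -> (subject id, bit score)
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=BLAST_COLUMNS, delimiter="\t"
    )
    for row in reader:
        evalue = float(row["evalue"])
        bitscore = float(row["bitscore"])
        if evalue > max_evalue:
            continue  # too weak to count as a putative viral match
        current = best.get(row["qseqid"])
        if current is None or bitscore > current[1]:
            best[row["qseqid"]] = (row["sseqid"], bitscore)
    return {contig: subject for contig, (subject, _) in best.items()}


# Hypothetical example rows; identifiers and scores are invented.
sample_rows = [
    "contig_1\tref_RdRp_A\t62.5\t400\t150\t3\t10\t1209\t5\t404\t2e-80\t310",
    "contig_1\tref_RdRp_B\t48.0\t380\t190\t6\t40\t1179\t20\t399\t1e-40\t160",
    "contig_2\tref_CP_C\t30.1\t90\t60\t2\t5\t274\t12\t101\t0.5\t28.1",
]
sample_text = "\n".join(sample_rows)
print(best_viral_hits(sample_text))
```

In practice, contigs retained by such a filter would still need manual inspection, for example to confirm conserved RdRp motifs, before a sequence is treated as a new mycovirus.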

Testing for Phenotypic Effects of the Mycovirus

As outlined earlier, it is essential that a mycovirus has a negative effect on its fungal host to be considered as a biocontrol agent. To confirm a negative mycoviral effect, pairs of isogenic fungal strains that differ only in the presence or absence of the particular mycovirus are required. Such pairs can be obtained by: (1) horizontal virus transmission from a virus-infected to a virus-free strain by co-culturing and hyphal fusion (anastomosis) or protoplast fusion; (2) elimination of the mycovirus from an infected strain by using antiviral chemicals (e.g., ribavirin), thermal treatments, or single spore isolations; (3) in vitro transfection of protoplasts of a virus-free strain with virus particles or infectious viral transcripts derived from a cDNA clone; or (4) transfection of protoplasts of a virus-free strain with an infectious cDNA clone. The phenotypes tested should include those related to growth and sporulation on artificial agar medium, as well as in planta. Poor or debilitated growth of virus-infected strains on agar medium is, in most cases, a reliable indicator of reduced growth (virulence) in planta. The virus effect can be quantified as the difference between the performance (e.g., growth rate, lesion area) of the virus-infected strain and that of the corresponding isogenic virus-free strain. This difference should be expressed as a proportion (%) of the performance of the virus-free strain. A negative virus effect therefore results in a negative proportion value.
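The quantification described above amounts to a relative difference, 100 × (infected − virus-free) / virus-free. A minimal sketch follows; the function name and the growth-rate values are hypothetical and serve only to show the sign convention.

```python
def virus_effect_percent(infected: float, virus_free: float) -> float:
    """Virus effect as a percentage of the virus-free strain's performance.

    Negative values mean that the virus-infected strain performs worse
    (e.g., slower growth or smaller lesions) than its isogenic virus-free partner.
    """
    if virus_free <= 0:
        raise ValueError("virus-free performance must be positive")
    return 100.0 * (infected - virus_free) / virus_free


# Hypothetical growth rates in mm/day for an isogenic pair.
print(virus_effect_percent(infected=4.0, virus_free=10.0))  # -60.0
```

In this invented example, the virus-infected strain grows 60% more slowly than the virus-free strain, so the virus effect is reported as -60%.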

Altered culture morphology

MyRV1 Mycoreovirus 1 (Reoviridae) CpMV1 Cryphonectria parasitica mitovirus 1 (Narnaviridae)

OnuMV Ophiostoma novo-ulmi mitoviruses (Narnaviridae)

Botrytis cinerea Necrotrophic pathogen causing

BcMV1 Botrytis cinerea mitovirus 1 (Narnaviridae)

SsMV1 Sclerotinia sclerotiorum mitovirus 1 (Narnaviridae) SsHV2 Sclerotinia sclerotiorum hypovirus 2 (Hypoviridae)

Sclerotinia sclerotiorum SsHADV1 Necrotrophic pathogen causing Sclerotinia sclerotiorum hypovirulencewhite mold or stem rot on many associated DNA virus 1 (Genomoviridae) plant species SsPV1 (Sclerotinia sclerotiorum partitivirus 1 (Partitiviridae)

Ophiostoma novo-ulmi Wilt pathogen causing Dutch elm disease

Horizontal transmission via hyphal Laboratory experiments with detached leaves contacts, vertical transmission to asexual conidia

Slow mycelial growth, abnormal mycelial sectors, decreased formation of infection cushions

Laboratory experiments with detached leaves

Not known

Reduced mycelial growth

Horizontal transmission via hyphal Preventive application, growth chamber contacts, extracellular transmission of experiments and field trials virus particles Reduced mycelial growth, reduced Horizontal transmission via hyphal Laboratory experiments sclerotia production contacts, transmission into sclerotia, interspecific transmission to other Sclerotinia species. Reduced mycelial growth, Horizontal transmission via hyphal contacts Laboratory experiments with detached leaves mitochondrial malformation

Abnormal culture morphology, reduced sclerotia size

Debilitated growth, reduced conidial viability, mitochondrial malfunction

Inoculations studies with elm trees

Slow growth in culture

CHV3 Cryphonectria hypovirus 3 (Hypoviridae)

Horizontal transmission by hyphal contacts, Vertical transmission into conidia

Horizontal transmission via hyphal contacts, scarcely transmitted into asexual spores, no transmission in sexual spores Horizontal transmission via hyphal Therapeutic treatments of chestnut blight contacts, scarcely transmitted into cankers in the US asexual spores Infectious as virus particles in transfection Laboratory experiments assays Horizontal transmission via hyphal Laboratory experiments using excised contacts, vertical transmission into chestnut wood and bark conidia, maternal inheritance in sexual crosses

Moderately reduced fungal sporulation and pigmentation

CHV2 Cryphonectria hypovirus 2 (Hypoviridae)

Mildly reduced mycelial growth

Main method of control of chestnut blight in European orchards and forests: therapeutic treatments and natural dissemination of hypovirulence, field experiments in North America

Laboratory experiments

Horizontal transmission via hyphal contacts, vertical transmission into asexual spores, no transmission into sexual spores

Reduced sporulation and pigmentation, female sterility

CHV1 Cryphonectria hypovirus 1 (Hypoviridae)

Cryphonectria parasitica Necrotrophic bark pathogen causing chestnut blight on Castanea species

Table 1 Examples of proven and potential mycovirus-mediated biological control of plant diseases

Fungal pathogen | Mycovirus (Family) | Phenotypic effects on fungal host beside hypovirulence | Transmission properties | Biocontrol experience

470 Mycovirus-Mediated Biological Control

Table 1 (continued)

| Fungal pathogen | Mycovirus (Family) | Phenotypic effects on fungal host beside hypovirulence | Transmission properties | Biocontrol experience |
| --- | --- | --- | --- | --- |
| (continued from previous page) ... white mold or stem rot on many plant species | BcRV1 Botrytis cinerea RNA virus 1 (eventually new family) | Reduced mycelial growth, sectored colony margin | Horizontal transmission via hyphal contacts, vertical transmission to macroconidia | Laboratory experiments with detached leaves |
| | BcHV1 Botrytis cinerea hypovirus 1 (Hypoviridae) | Decreased formation of infection cushions | Horizontal transmission via hyphal contacts | Laboratory experiments with detached leaves |
| | BcPV2 Botrytis cinerea partitivirus 2 (Partitiviridae) | Changed culture pigmentation, reduced production of conidia and sclerotia | Horizontal transmission via hyphal contacts | Laboratory experiments with detached leaves |
| | Bc378V1 Botrytis cinerea CCg378 virus 1 (Partitiviridae) | Reduction of sporulation and laccase activity | Not known | Laboratory experiments with detached leaves |
| Rosellinia necatrix: Necrotrophic pathogen causing root rot in many woody plant species | RnMBV1 Rosellinia necatrix megabirnavirus 1 (Megabirnaviridae) | Reduced mycelial growth | Infectious as virus particles in transfection assays | Greenhouse experiments with apple seedlings |
| | MyRV3 Mycoreovirus 3 (Reoviridae) | Reduced mycelial growth | Infectious as virus particles in transfection assays | Greenhouse experiments with apple seedlings |
| Helminthosporium victoriae: Necrotrophic pathogen causing Victoria blight of oats | HvV190S Helminthosporium victoriae virus 190S (Totiviridae) | Reduced growth, production of abundant white aerial mycelium, excessive sectoring | Not known, transfection of C. parasitica possible and resulting in a hypovirulent phenotype | Laboratory experiments with apples and branches |
| Fusarium graminearum: Necrotrophic pathogen causing Fusarium head blight of wheat, barley and other small-grain cereals | FgV1 Fusarium graminearum virus 1 (Fusariviridae) | Reduced mycelial growth rate, increased pigmentation, inhibition of mycotoxin production | Horizontal transmission via hyphal contact, vertical transmission to conidia | Laboratory experiments |
| | FgV-ch9 Fusarium graminearum mycovirus China 9 (Chrysoviridae) | Reduced mycelial growth rate and sporulation, abnormal colony morphology, disorganized cytoplasm | Vertical transmission to conidia | Laboratory and greenhouse experiments |
| | FgHV2 Fusarium graminearum hypovirus 2 (Hypoviridae) | Reduced mycelial growth rate, conidiation and mycotoxin production | Not known | Laboratory experiments with wheat spikes |


Transmission Properties

To exploit a mycovirus for biocontrol, the virus must be capable of infecting the target pathogen. Since the vast majority of mycoviruses have no extracellular phase, virus infection only occurs upon hyphal fusion between infected and non-infected fungal individuals. Horizontal transmission can be tested on agar medium by co-culturing (pairing) virus-infected and virus-free strains. In some instances, transmission can be observed directly, because the recipient strain adopts the cultural characteristics of the virus donor strain. For example, CHV1 suppresses pigmentation in the chestnut blight fungus, Cryphonectria parasitica. Infected isolates exhibit a white culture morphology, whereas virus-free isolates are orange pigmented. Horizontal transmission of CHV1 can therefore be followed by observing the change in pigmentation in the virus-free isolate. However, culture morphology is not always a reliable indicator of virus infection, so successful transmission must be confirmed by a virus-specific PCR assay or dsRNA extraction. Failure of horizontal mycovirus transmission is often due to vegetative incompatibility (see Glossary) between individuals of the same fungal species. This non-self-recognition system restricts the formation of hyphal anastomoses and consequently the horizontal transmission of mycoviruses.

To spread and persist in a natural environment, the mycovirus needs to be dispersed through the spores of its fungal host. Confirmation of vertical virus transmission into spores requires single-spore cultures derived from a virus-infected isolate. The cultures are then tested for the presence of the mycovirus using the same methods as in horizontal transmission tests. For most mycoviruses, vertical transmission into asexual spores (conidia) is relatively easy to assess because conidia are often formed on agar medium. 
Transmission into sexual spores is more difficult to assess. In heterothallic fungal species, mating between two sexually compatible fungal isolates is needed for the production of sexual fruiting bodies. For vertical transmission studies, at least one of the two mating partners should be mycovirus-infected. In homothallic species, self-fertilization occurs in a single isolate, which produces sexual fruiting bodies without a mating partner. Sexual fruiting bodies may be produced in the laboratory under controlled conditions using defined fungal strains. Alternatively, for highly prevalent mycoviruses, fruiting bodies or sexual spores can be collected in the field and single-spore cultures analyzed in the laboratory.
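
As a rough illustration of how such transmission tests can be summarized, the following Python sketch computes the proportion of virus-positive cultures together with a Wilson score confidence interval. The function name `transmission_rate`, the choice of interval, and the example counts are assumptions made for illustration and are not taken from the studies discussed here.

```python
import math


def transmission_rate(positive, tested, z=1.96):
    """Return (rate, lower, upper): observed rate and Wilson score interval."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    if not 0 <= positive <= tested:
        raise ValueError("positive must lie between 0 and tested")
    p = positive / tested
    denom = 1 + z**2 / tested
    centre = (p + z**2 / (2 * tested)) / denom
    half = z * math.sqrt(p * (1 - p) / tested + z**2 / (4 * tested**2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical data: 18 of 60 single-conidial cultures test virus-positive.
rate, low, high = transmission_rate(18, 60)
print(f"vertical transmission: {rate:.0%} (95% CI {low:.0%} to {high:.0%})")
```

The same calculation applies to pairing assays, with the number of converted recipient cultures as `positive` and the number of pairings as `tested`.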

Biocontrol Testing in the Field

Once a mycovirus has been shown to cause transmissible hypovirulence under laboratory and greenhouse conditions, further testing of its biocontrol potential should be conducted in the field. For most applications, a mycovirus-infected fungal strain is used. The mycovirus acts as the biological control agent and the fungal strain is a necessary carrier of the mycovirus. Strain-specific identification of both fungal carrier and mycovirus is important for tracing these organisms after field applications and determining their efficiency and persistence. Traceability markers are based on intra-specific genetic variations and include microsatellite markers for fungal strains and single nucleotide polymorphisms (SNPs) for mycovirus strains. Field trials should also demonstrate the efficacy of the biological control, i.e., a reduction in disease severity and an increase in plant productivity.
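
One way to picture strain-specific tracing is as a comparison of marker profiles. The minimal Python sketch below checks whether a re-isolated fungal strain and its virus match the released carrier strain and mycovirus. All marker names, allele values, and the helper `matches_reference` are hypothetical placeholders, not published markers.

```python
def matches_reference(reference, sample, max_mismatches=0):
    """Compare two marker profiles (marker name -> allele).

    Returns (is_match, mismatching_markers), judged over the reference markers.
    """
    missing = [m for m in reference if m not in sample]
    if missing:
        raise ValueError(f"sample lacks markers: {missing}")
    mismatches = [m for m in reference if sample[m] != reference[m]]
    return len(mismatches) <= max_mismatches, mismatches


# Hypothetical released carrier strain: microsatellite locus -> allele size (bp).
released_fungus = {"ssr_1": 212, "ssr_2": 148, "ssr_3": 305}
# Hypothetical released virus: SNP position -> nucleotide.
released_virus = {"snp_1021": "A", "snp_4870": "G", "snp_9033": "T"}

# Hypothetical re-isolate recovered after the field release.
isolate_fungus = {"ssr_1": 212, "ssr_2": 152, "ssr_3": 305}
isolate_virus = {"snp_1021": "A", "snp_4870": "G", "snp_9033": "T"}

fungus_ok, fungus_diff = matches_reference(released_fungus, isolate_fungus)
virus_ok, virus_diff = matches_reference(released_virus, isolate_virus)
print("carrier strain recovered:", fungus_ok, fungus_diff)
print("released virus recovered:", virus_ok, virus_diff)
```

In this made-up case the virus profile matches while the fungal profile differs at one locus, which is the pattern one would expect if the released virus had moved into a resident fungal strain.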

Biosafety Issues

Any release of a fungal carrier strain and a mycovirus into the environment requires consideration of biosafety issues. Critical areas of concern include human health (worker and consumer risk) and risk for non-target organisms (e.g., bees, beneficial fungi). With respect to the fungal carrier, the potential production of toxins or secondary metabolites needs to be considered. Such compounds may not only harm non-target organisms in the field, but could also affect groundwater quality.

Molecular Approaches to Improve the Use of Mycoviruses as Biological Control Agents

Studies in the Cryphonectria-hypovirus system have outlined strategies to improve mycovirus-mediated biocontrol using molecular approaches. Spread of the hypoviruses in this system is limited by the fungal vegetative incompatibility system and by exclusion of the virus during sexual reproduction of the fungus. To overcome these limitations, C. parasitica strains have been genetically engineered by integrating an infectious cDNA copy of the virus into the fungal genome. In sexual crosses with these strains, half of the progeny (regardless of their vc types) will receive the infectious cDNA copy and become virus-infected. Another promising approach involves gene disruptions at multiple vic loci in C. parasitica resulting in super donor strains. These strains exhibit enhanced ability to transmit the virus into C. parasitica populations with high vc type diversity. The potential to extend the natural host range of mycoviruses was also demonstrated with CHV1, which was introduced as synthetic transcripts into related fungal species.

Selected Examples for Mycovirus-Mediated Biological Control

Mycoviruses that induce hypovirulence may have the potential to be used as biological control agents for several plant pathogenic fungi (Table 1). Hypovirulence in the chestnut blight fungus represents the best-known biocontrol system and will be covered in this section along with other well-known examples.


Hypovirulence in Cryphonectria parasitica

The “exclusive transmissible hypovirulence”

Chestnut blight, caused by the ascomycete fungus Cryphonectria parasitica (Murr.) Barr., is probably the best-known example of a tree disease successfully controlled via mycovirus-mediated hypovirulence. The main hosts of this pathogen are species in the genus Castanea (family Fagaceae), on which the fungus causes perennial bark lesions (so-called ‘cankers’) on stems and branches, eventually leading to wilting of the plant parts distal to the infection point. The fungus originates from Eastern Asia (China, Japan, and Korea) and was accidentally introduced in the 20th century to North America and Europe. On the new continents, C. parasitica encountered two susceptible chestnut species: Castanea dentata (Marsh.) Borkh. (American chestnut) and C. sativa Mill. (European chestnut). In North America, the chestnut blight epidemic resulted in the ecological extinction of the formerly forest-forming C. dentata within its native distribution of 3.6 million hectares. In Europe, after an initial virulent phase of the epidemic that caused high tree mortality, chestnut trees showing atypical superficial, non-lethal bark cankers were observed in Italy and in France. The C. parasitica strains isolated from these cankers presented a different phenotype from those recovered from virulent cankers. Specifically, instead of showing orange pigmentation and strong sporulation on potato dextrose agar (PDA), they were white and did not sporulate. The atypical strains were reduced in virulence when inoculated on chestnut trees. Further analyses revealed that this phenomenon, which was then called hypovirulence, was associated with the presence of dsRNA in the C. parasitica mycelium. This dsRNA represents the replicative form of a mycovirus in the family Hypoviridae, namely Cryphonectria hypovirus 1 (CHV1). Hypoviruses are positive-stranded RNA viruses, located in the cytoplasm of the fungal host. 
Aside from CHV1, the genus Hypovirus includes three other well-characterized species: Cryphonectria hypovirus 2, Cryphonectria hypovirus 3, and Cryphonectria hypovirus 4. The exemplar strains of these species, Cryphonectria hypovirus 2 (CHV2), CHV3 and CHV4, are phylogenetically related to CHV1, but have different effects on the fungal host. CHV2 and CHV3 also induce a hypovirulent phenotype in C. parasitica, whereas the presence of CHV4 does not cause hypovirulence in C. parasitica. In North America, CHV3 and CHV4 are the most common hypoviruses, but CHV2 also occurs. In Asia, both CHV1 and CHV2 are present. In Europe, CHV1 is the only hypovirus reported to be present. CHV1 inhibits C. parasitica’s sexual reproduction, strongly reduces asexual sporulation, and attenuates parasitic growth of its fungal host. Hypovirulent C. parasitica strains are not able to colonize and kill the cambium of C. sativa. As a result, the plant part distal to the infection point survives. Vertical transmission of CHV1 is restricted to asexual spores (conidia) and occurs at variable frequencies. In contrast, sexual spores (ascospores) only spread the virulent form of the pathogen. Horizontal CHV1 transmission to other fungal mycelia via hyphal anastomosis also occurs and is controlled by a vegetative compatibility (vc) system (Fig. 1), involving at least six unlinked, di-allelic vegetative incompatibility (vic) loci. Transmission success is highest between vegetatively compatible C. parasitica strains, meaning strains that have the same alleles at all vic loci. Transmission is severely reduced if strains are heteroallelic (i.e., have different alleles) at five of the six vic loci. The sixth locus, vic4, does not seem to affect CHV1 transmission. The other five vic loci restrict CHV1 transmission in a variable way. In most cases, restriction is asymmetric depending on which vic allele is present in the CHV1 donor or recipient strain. 
Field investigations showed that vc barriers seem to be less restrictive in the field than estimated from in vitro studies. Allelic combinations at the six vic loci define a total of 64 vic genotypes, which correspond to the 64 vc types (EU-1 to EU-64) genetically characterized so far. However, the presence of vc types not compatible with these 64 described types suggests that vegetative compatibility in C. parasitica is most likely regulated by more than six vic loci or that additional alleles exist at the known vic loci.
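
The arithmetic behind the 64 vc types can be made explicit: six di-allelic loci give 2^6 = 64 allele combinations. The Python sketch below enumerates them and lists the loci at which a donor and a recipient differ. Apart from vic4, the locus labels are placeholders, alleles are simply coded as 1 and 2, and the asymmetric, allele-dependent restriction of transmission is deliberately not modeled.

```python
from itertools import product

# Six di-allelic vic loci. Only vic4 is named in the text; the other labels
# are placeholders chosen for this sketch.
LOCI = ("locus_a", "locus_b", "locus_c", "vic4", "locus_d", "locus_e")
# Loci at which a difference does not seem to affect CHV1 transmission.
NON_RESTRICTING = {"vic4"}


def all_genotypes():
    """Enumerate every allele combination (alleles coded as 1 or 2)."""
    return list(product((1, 2), repeat=len(LOCI)))


def heteroallelic_loci(donor, recipient):
    """Return the loci at which donor and recipient carry different alleles."""
    return [locus for locus, a, b in zip(LOCI, donor, recipient) if a != b]


def restricting_loci(donor, recipient):
    """Heteroallelic loci expected to restrict CHV1 transmission."""
    return [locus for locus in heteroallelic_loci(donor, recipient)
            if locus not in NON_RESTRICTING]


genotypes = all_genotypes()
print(len(genotypes))  # number of vic genotypes from six di-allelic loci

donor = (1, 1, 1, 1, 1, 1)
recipient = (1, 2, 1, 2, 1, 1)
print(heteroallelic_loci(donor, recipient))
print(restricting_loci(donor, recipient))
```

Two strains are vegetatively compatible in this sketch only when `heteroallelic_loci` returns an empty list; a pair that differs solely at vic4 is incompatible but has no restricting loci.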

Fig. 1 Horizontal transmission of CHV1. The hypovirus is transmitted between vegetatively compatible (left), but not between vegetatively incompatible (right) strains of Cryphonectria parasitica. CHV1-infected strains exhibit a white culture morphology, which enables the visual assessment of hypovirus transmission. Photos: Phytopathology, WSL.


Several genetically distinct subtypes of CHV1 have been identified in Europe, whose geographic distribution reflects the invasion history of C. parasitica. Moreover, several recombination events seem to have contributed to the evolution of CHV1 in Europe. The Italian subtype (subtype I) is the most widespread in central and south-eastern Europe (Italy, Switzerland, south-eastern France, Bosnia-Herzegovina, Croatia, Slovenia, Macedonia, Greece and Turkey). This subtype is represented by the two well-characterized strains CHV1/Euro7 and CHV1/EP721. The prototypic hypovirus CHV1 (CHV1/EP713) belongs to subtype F1, which has been found in France and Spain together with subtypes F2 and D/E. In the Eurasian country of Georgia, a unique subtype (G) occurs that displays a recombinant pattern between subtypes F2 and D/E. The different subtypes vary in their virulence toward the infected C. parasitica strains. Specifically, subtype I has a mild effect on fungal virulence and sporulation and is commonly associated with a high natural prevalence of hypovirulence. In contrast, subtypes F1 and F2 have a severe effect on their fungal host and almost completely inhibit its parasitic growth and sporulation.

Artificial application of hypovirulence

As early as 1965, the French mycologist Jean Grente proposed the field application of hypovirulent strains of C. parasitica for biocontrol of chestnut blight. In the 1970s, two hypotheses were tested in south-eastern France: that the “exclusive transmissible hypovirulence” can be transmitted in the field through inoculation of cankers, and that the natural spread of hypovirulence observed in Italy in the 1950s can be accelerated by human intervention. To this end, virulent cankers were inoculated with hypovirulent C. parasitica strains (Fig. 2). Inoculated cankers healed following treatment and hypovirulent strains were isolated from cankers on surrounding trees. Following these first promising results, the French Ministry of Agriculture promoted biological control to help mitigate chestnut blight in France. By 1976, all French orchards had benefited from artificial treatments. Before any field application, the vc types present in the designated orchard or stand had to be determined. The necessary canker sampling was organized by INRA (French National Institute for Agricultural Research) in orchards and adjacent coppices. INRA also provided technical facilities to grow and distribute hypovirulent strains. The subsequent biocontrol treatment consisted of stripping the edges of the cankers, making holes with a cork borer in healthy bark tissue around the margins, and filling the holes with a mixture containing one or more hypovirulent strains (in case of different vc types). Mixtures were adapted to each production area. Since 1998, private companies have marketed the hypovirulent strains in France. INRA established a know-how license with these companies to control the quality of the product and to ensure that the hypovirulent strains sold are those best suited to the French C. parasitica populations. 
After 20 years of biological control, in 1995, a new research program on chestnut blight was initiated at INRA in Bordeaux. This program, carried out in close collaboration with several organizations, made it possible to redefine the mixtures of hypovirulent strains to be used, to develop a new application method for the product, and to characterize the different viral strains and their transmission properties. Chestnut blight is currently well controlled in southern France. Although its incidence in orchards and coppices is still high, chestnut growers no longer consider it a major problem, except for young grafted trees. In Greece, artificial hypovirulence has also been intensively applied for the biological control of chestnut blight. In 1995, an initial project was started to determine the diversity and distribution of C. parasitica vc types. Following this (1998–2000), a pilot biocontrol project was carried out in the chestnut coppice stands of Mt. Athos. Roughly 200,000 virulent cankers were treated with a hypovirulent strain of vc type EU-12. Promising results from this pilot project led to two nationwide treatment campaigns (2007–2009 and 2014–2016). It is estimated that roughly 170,000 cankers were treated with a CHV1 strain (subtype I), which was naturally present at Mt. Athos. The hypovirulent inoculum was prepared on a commercial scale at the Forest Research Institute in Vassilika. In counties where more than one vc type was found, a mixture of inoculum of the corresponding vc types was applied, as done in France. To help forest managers organize the artificial introduction of hypovirulence in areas without naturally occurring hypovirulence, an Integrated Biological Control Plan was developed. Evaluations after the treatment campaigns showed that the treated cankers had healed and the released CHV1 strains had started to naturally infect, and heal, non-inoculated cankers. 
The type of forest management strongly influenced the spread of the hypovirus, which was faster in coppice forests than in orchards. Thanks to the successful treatments, Greek forest managers and chestnut growers are currently much less concerned about chestnut blight than 20 years ago and chestnut cultivation has regained importance.

Fig. 2 Biological control of chestnut blight. Mycelium of a hypovirus-infected C. parasitica strain is filled into holes at the margin of the infection. Photo: Phytopathology, WSL.

Artificial canker treatments with hypovirulent C. parasitica strains have also been conducted for more than 20 years in Northern Switzerland. In contrast to Southern Switzerland, no natural hypovirulence has appeared north of the Alps. At roughly 20 sites, more than 20,000 cankers were treated with a local C. parasitica isolate of the dominant vc type that had been converted with a CHV1-infected isolate from Southern Switzerland. Following the method adopted in France, holes were made around the virulent cankers and filled with the hypovirulent inoculum. One to two years after treatment, canker morphology was assessed and C. parasitica was re-isolated from both treated and untreated cankers. Results revealed the presence of the released CHV1 strain in most treated cankers, which had consequently healed. This confirmed the effectiveness of therapeutic canker treatment in Switzerland. However, there were considerable differences among sites in the spread of the hypovirus to untreated cankers. Although the factors determining these differences are still unclear, it seems that the incidence of canker treatment may play an important role. Specifically, a threshold number of cankers must be treated before CHV1 begins to spread into the local C. parasitica population.

In North America, natural hypovirulence is associated with the hypovirus CHV3. This has occurred in a few chestnut stands in Michigan and Ontario, which are outside the natural range of the American chestnut. Artificial application of hypovirulence for biological control of chestnut blight in forests and plantations started as early as the mid-1970s in West Virginia, Connecticut, Virginia and Wisconsin. However, results were much less positive than in Europe. 
Although therapeutic canker treatment has been mostly successful, hypovirulence has largely failed at the population level, meaning that the released hypoviruses did not spread in the local C. parasitica populations. Two of the most significant factors responsible for this failure are thought to be the high susceptibility of American chestnut to C. parasitica and a local vc type diversity that is much higher than in Europe. Molecular approaches have recently been used to enhance the establishment and dissemination of the hypovirus. These include the production of transgenic C. parasitica strains carrying an infectious cDNA copy of CHV1 integrated into the fungal genome, as well as the production of super hypovirus donor strains, which are able to transmit the hypovirus to recipient strains that are heteroallelic at one to five vic loci. In a recent field trial in the US, the use of the super donor strains resulted in enhanced hypovirus transmission into treated chestnut blight cankers. This result highlights the potential of these strains to serve as universal hypovirus carriers for biological control of C. parasitica, particularly for populations with high vc type diversity. Therapeutic treatment of individual C. parasitica cankers on C. sativa and C. dentata has been mostly successful and provides an efficient control method for these perennial infections. In contrast, mycovirus-mediated hypovirulence has frequently shown less success at the population level, especially in North America. Natural hypovirulence is still a recent phenomenon in Europe and its success is strongly dependent on characteristics of the three actors involved in this pathosystem, i.e., virus, fungus, and tree. The outcome of the interaction is also affected by the environment. Therefore, predicting the sustainability of this natural biological control system is particularly difficult. Mycoviruses other than hypoviruses have also been detected in C. parasitica. 
Among these viruses, two mycoreovirus strains and one mitovirus strain can also induce a hypovirulent phenotype. C. parasitica isolates infected by either of the two mycoreovirus strains (MyRV1 or MyRV2) exhibit a strong hypovirulent phenotype. Both mycoreoviruses are infectious as particles in transfection assays, which allows the virus to be transferred to fungal isolates of different vc types. The mitovirus strain, CpMV1/NB631, has a mild effect on its host. CpMV1 is transmitted at high frequency into asexual spores and, as expected for a mitochondrial mycovirus, is maternally inherited.

Hypovirulence in Ophiostoma novo-ulmi

Ophiostoma novo-ulmi Brasier is the causal agent of Dutch elm disease, which is responsible for widespread mortality of elm (Ulmus sp.) trees in North America and Europe. The pathogen is spread by elm bark beetles, which transmit fungal spores into feeding wounds on elm twigs. Mycoviruses in O. novo-ulmi were originally described as D-factors, inducing severe debilitation of the fungal host. Later on, several D-factors were identified as mitoviruses and assigned to O. novo-ulmi mitoviruses 1–7 based on phylogenetic analysis. The mitoviruses are transmitted through asexual conidia but not through sexual ascospores. Horizontal virus transmission can occur via hyphal anastomosis between vegetatively compatible strains. The prevalence of mitovirus-infected isolates was found to be high in clonal populations at the epidemic disease front, where transmission is not restricted by vegetative incompatibility. In post-epidemic populations, the prevalence of viruses decreased as the diversity of vc types increased. Virus-induced hypovirulence in O. novo-ulmi is mainly associated with poor mycelial growth and reduced viability of conidia. This means that many more conidia are required for virus-infected isolates to infect elm trees. A proposed approach for biological control is to use elm bark beetles as carriers of virus-infected fungal conidia, thereby introducing the virus into local pathogen populations.
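
The link between vc type diversity and restricted virus spread can be illustrated with a simple calculation: the chance that two randomly encountered isolates belong to the same vc type, and can therefore exchange virus freely, is the sum of the squared vc type frequencies. The Python sketch below is a simplification that assumes random encounters and ignores any reduced but non-zero transmission between incompatible strains; the function name and the example counts are hypothetical.

```python
def same_vc_type_probability(counts):
    """Probability that two randomly drawn isolates share a vc type.

    counts maps each vc type to the number of isolates observed.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("at least one isolate is required")
    return sum((n / total) ** 2 for n in counts.values())


# Hypothetical clonal population at the epidemic front: a single vc type.
clonal_front = {"vc_A": 40}
# Hypothetical post-epidemic population: four equally common vc types.
post_epidemic = {"vc_A": 10, "vc_B": 10, "vc_C": 10, "vc_D": 10}

print(same_vc_type_probability(clonal_front))
print(same_vc_type_probability(post_epidemic))
```

With these made-up counts the probability drops from 1.0 in the clonal population to 0.25 in the population with four equally frequent vc types.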

Hypovirulence in Sclerotinia sclerotiorum

The ascomycete fungus Sclerotinia sclerotiorum (Lib.) de Bary is a necrotrophic plant pathogen that causes white mold or stem rot on a number of different host plants. These include important agricultural crop species such as bean, rapeseed, soybean, and lettuce. The global pathogen occurs in temperate and subtemperate regions. The pathogen is homothallic and can form sexual fruiting bodies (apothecia) through self-fertilization. The sexual spores (ascospores) are spread by wind and serve as the primary source of new infections. The pathogen can also reproduce asexually by the production of sclerotia, which serve as survival structures and dissemination propagules.


Several mycoviruses that can cause hypovirulence in S. sclerotiorum have been described (Table 1). The most promising biological control agent is SsHADV-1 (Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1). This virus is among the few DNA mycoviruses that have been described. To date, it is also the only example where purified virus particles can be used to infect a fungal host. Virus particles that were directly applied to leaves were found to suppress infection by S. sclerotiorum. In field trials with rapeseed, the preventive application of hyphal fragments containing SsHADV-1 particles significantly reduced Sclerotinia stem rot and increased crop yield. S. sclerotiorum also harbors a mycovirus in the family Hypoviridae that induces hypovirulence, namely SsHV2 (Sclerotinia sclerotiorum hypovirus 2). Fungal strains infected with this mycovirus exhibit reduced mycelial growth, reduced sporulation, and reduced production of sclerotia. In laboratory experiments with detached leaves, a significant variation in virulence among SsHV2-infected strains was observed. Several of these fungal strains showed a strong reduction in virulence, which could make them candidate agents for biological control of S. sclerotiorum. Sclerotinia sclerotiorum partitivirus 1 (SsPV1) is another mycovirus that confers hypovirulence in S. sclerotiorum and related species. SsPV1-induced phenotypic traits in its fungal host include abnormal colony morphology and severe growth reduction. Horizontal transmission of SsPV1 to other fungal strains has been shown to result in transmission of hypovirulence and associated traits. A strain of a mitochondrial mycovirus, Sclerotinia sclerotiorum mitovirus 1 (SsMV1/HC025), was also found to induce hypovirulence in S. sclerotiorum. An infection by this mitovirus has severe effects on the fungus, including reduced mycelial growth, reduced sclerotia production, and mitochondrial malformation. 
On detached leaves of rapeseed and soybean plants, SsMV1/HC025-infected fungal strains exhibit a strong hypovirulent phenotype with almost no lesion development. Transmission of SsMV1/HC025 via hyphal contacts converts recipient virulent strains to the hypovirulent phenotype.

Hypovirulence in Botrytis cinerea

Botrytis cinerea Pers. is a cosmopolitan necrotrophic pathogen infecting more than 200 plant species in temperate and subtropical climates, some of which are of high economic importance (e.g., grapes, strawberries, solanaceous vegetables). Symptoms vary across plant organs and tissues but include leaf blight, blossom blight, and post-harvest fruit rots. In spite of the risk of developing fungicide resistance, the pathogen is still mainly controlled using chemical fungicides. In 2001, Botrytis-specific fungicides (so-called “botryticides”) represented 10% of the world fungicide market. Although numerous single- or double-stranded mycoviruses have been reported in B. cinerea and some have already been sequenced, not all of them are associated with hypovirulence. The mycoviruses that appear to attenuate the virulence of the infected fungal strains include: Botrytis cinerea mitovirus 1 (BcMV1, previously named BcDRV), Botrytis cinerea RNA virus 1 (BcRV1), Botrytis cinerea hypovirus 1 (BcHV1), Botrytis cinerea partitivirus 2 (BcPV2), and Botrytis cinerea CCg378 virus 1 (Bc378V1). BcMV1 was found in a hypovirulent B. cinerea isolate from oilseed rape in China and appears to debilitate mitochondria in hyphal cells. Laboratory experiments showed that BcMV1 can be transmitted both vertically, to asexual conidia, and horizontally, to other fungal strains, although horizontal transmission might be regulated by vegetative incompatibility. BcRV1 was isolated from a Chinese hypovirulent isolate of B. cinerea from Berberis sp. This mycovirus was found to be capable of vertical transmission via macroconidia and horizontal transmission through hyphal anastomosis. The accumulation level of BcRV1 greatly impacts its attenuation effect on the virulence of B. cinerea. In contrast to the two previous mycoviruses, infection by BcHV1 reduces fungal virulence without affecting mycelial growth. 
The attenuated virulence seems to arise from a significant alteration of the expression of genes associated with the formation of infection cushions, which play a central role during the infection process. BcHV1 can be transmitted horizontally through hyphal anastomosis. BcPV2, a partitivirus in the genus Alphapartitivirus, was isolated from an atypically pink strain of B. cinerea. Fungal strains infected with this virus showed normal vegetative growth but reduced production of conidia and sclerotia. Importantly, they also showed attenuated virulence on several crops, including apples, tomatoes, and potatoes. Following hyphal contact, BcPV2 was successfully transmitted to several virulent B. cinerea strains. This resulted in an attenuation of both their pathogenicity and the production of conidia and sclerotia. Bc378V1, a member of the family Partitiviridae, was found in a wild-type B. cinerea strain that was co-infected by another mycovirus that was not characterized further. Laccase activity and sporulation rates were lower in strains co-infected with both mycoviruses relative to the virus-free strain and strains only infected with Bc378V1. Hypovirulence is therefore most likely conferred by the presence of both mycoviruses. Although several mycoviruses reduce the virulence of infected B. cinerea strains, none has yet shown real potential as a biocontrol agent. The two main limiting factors are the low competitive ability of mycovirus-infected strains relative to mycovirus-free strains and the limited horizontal transmissibility of the mycoviruses due to vegetative incompatibility.

Hypovirulence in Rosellinia necatrix

Rosellinia necatrix is an ascomycete fungus that causes root rot on many woody plant species, including grapevine and many fruit trees. It is a necrotrophic pathogen that also occurs as a saprophyte in the soil. The pathogen is difficult to control due to its soilborne lifestyle. Different mycoviruses have been detected in R. necatrix, but these often cause symptomless infections. Two mycoviruses were found to confer hypovirulence: Rosellinia necatrix megabirnavirus 1 (RnMBV1) and Mycoreovirus 3 (MyRV3). Virus particles of both mycoviruses can be used to transfect virus-free R. necatrix strains, which subsequently become hypovirulent.


Hypovirulence in Helminthosporium victoriae

The plant pathogenic fungus Helminthosporium victoriae (F. Meehan & H.C. Murphy) is the causal agent of Victoria blight of oats (Avena sativa). At the end of the 1940s, this fungal pathogen caused significant yield losses in most oat-growing regions of the USA. Discovery of a hypovirulence-associated mycovirus occurred in a similar way to that for chestnut blight. In the 1950s, fields of “Victorgrain” oats were observed in Louisiana that, despite infection by H. victoriae, showed only limited yield losses. Isolates recovered from blighted plants presented a stunted and highly sectored growth. This phenotype was transmitted to ‘normally’ growing colonies via hyphal anastomosis. The phenotype transmission suggested that the reduced virulence observed in the field was due to a transmissible disease of the fungus. It was later shown that diseased H. victoriae isolates harbor two viruses, the victorivirus Helminthosporium victoriae virus 190S (HvV190S) and the chrysovirus Helminthosporium victoriae virus 145S (HvV145S). Recent studies have provided strong evidence that HvV190S alone is responsible for the hypovirulence in its fungal host. Laboratory experiments have also revealed that infection with HvV190S induces a hypovirulent phenotype in the chestnut blight fungus Cryphonectria parasitica. This might open the door to an artificial introduction of HvV190S-caused hypovirulence in other plant pathogenic fungi.

Hypovirulence in Fusarium graminearum

Fusarium graminearum Schwabe is the primary causal agent of Fusarium head blight, a severe and worldwide disease of wheat, barley and other small-grain cereals. Quite a few mycoviruses have been reported in this pathogen, but only three of them (FgV1, FgV-ch9, FgHV2) are associated with hypovirulence. Fusarium graminearum virus 1 (FgV1) (proposed family Fusariviridae) causes hypovirulence, reduces the rate of mycelial growth, increases pigmentation, and inhibits mycotoxin production. It can be transmitted horizontally via hyphal anastomosis and vertically to asexual conidia. Fusarium graminearum mycovirus-China 9 (FgV-ch9) was detected in F. graminearum strains recovered from cereals in China. FgV-ch9 can be vertically transmitted to asexual conidia. Based on genomic structure and phylogenetic studies, this virus belongs to a novel genus in the family Chrysoviridae. At high and medium FgV-ch9 concentrations, infected F. graminearum strains show reduced mycelial growth and sporulation (conidia and perithecia), abnormal colony morphology, disorganized cytoplasm, and attenuated virulence on wheat and maize. At low virus concentrations the fungus shows no symptoms of infection. Finally, the third virus, Fusarium graminearum hypovirus 2 (FgHV2), belongs to the genus Alphahypovirus and negatively affects mycelial growth rate, conidiation, and mycotoxin production of the infected F. graminearum strains.

Future Perspectives

The field of fungal virology has made significant progress in our understanding of the biology of mycovirus-fungus interactions. This research has mainly been driven by the potential of mycoviruses to be used for biological control of fungal pathogens. In recent years a large number of novel mycoviruses have been described in various plant pathogenic fungi. Several of these mycoviruses confer hypovirulence and may have potential as biological control agents for plant pathogens. Thus far, the mycovirus-mediated biological control of chestnut blight is the most successful example of its kind when considering practical applications. Importantly, thanks to the recent promising results in mycovirus research, the prospects are bright for further successful applications of mycoviruses for biological control of plant diseases.

Further Reading

Boland, G.J., 2003. Fungal viruses, hypovirulence, and biological control of Sclerotinia species. Canadian Journal of Plant Pathology 26, 6–18.
Dawe, A., Nuss, D.L., 2013. Hypovirus molecular biology: From Koch’s postulates to host self-recognition genes that restrict virus transmission. Advances in Virus Research 86, 109–147.
Ghabrial, S.A., Castón, J.R., Jiang, D., Nibert, M.L., Suzuki, N., 2015. 50-plus years of fungal viruses. Virology 479–480, 356–368.
Hillman, B.I., Suzuki, N., 2004. Viruses of the chestnut blight fungus, Cryphonectria parasitica. Advances in Virus Research 63, 423–472.
Milgroom, M.G., Cortesi, P., 2004. Biological control of chestnut blight with hypovirulence: A critical analysis. Annual Review of Phytopathology 42, 311–338.
Muñoz-Adalia, E.J., Fernández, M.M., Diez, J.J., 2016. The use of mycoviruses in the control of forest diseases. Biocontrol Science and Technology 26, 577–604.
Nuss, D.L., 2005. Hypovirulence: Mycoviruses at the fungal–plant interface. Nature Reviews Microbiology 3, 632–642.
Pearson, M.N., Beever, R.E., Boine, B., Arthur, K., 2009. Mycoviruses of filamentous fungi and their relevance to plant pathology. Molecular Plant Pathology 10, 115–128.
Rigling, D., Prospero, S., 2018. Cryphonectria parasitica, the causal agent of chestnut blight: Invasion history, population biology and disease control. Molecular Plant Pathology 19, 7–20.
Suzuki, N., 2017. Frontiers in fungal virology. Journal of General Plant Pathology 83, 419–423.
van de Sande, W.W., Lo-Ten-Foe, J.R., van Belkum, A., et al., 2010. Mycoviruses: Future therapeutic agents of invasive fungal infections in humans? European Journal of Clinical Microbiology & Infectious Diseases 29, 755–763.
Xie, J., Jiang, D., 2014. New insights into mycoviruses and exploration for the biological control of crop fungal diseases. Annual Review of Phytopathology 52, 45–68.

Mycoviruses With Filamentous Particles

Michael N Pearson, The University of Auckland, Auckland, New Zealand

© 2021 Elsevier Ltd. All rights reserved.

Glossary

Triple gene block A set of three genes in overlapping reading frames that function together in facilitating cell-to-cell movement of plant viruses.

Plasmodesmata Microscopic channels connecting the cytoplasm of adjacent cells in plants.

Introduction

Mycoviruses have been reported from at least 19 virus families, including viruses with double-stranded (ds) RNA, positive-sense single-stranded (ss) RNA [ssRNA(+)], negative-sense ssRNA [ssRNA(−)], and RNA reverse-transcribing genomes, plus at least one ssDNA virus. The majority of these viruses are either unencapsidated or have isometric, encapsidated particles. To date only five mycovirus species with filamentous encapsidated particles have been described (Table 1). Two of these have ssRNA(+) genomes and belong to the order Tymovirales: Botrytis virus F (BotV-F), family Gammaflexiviridae, and Botrytis virus X (BotV-X), family Alphaflexiviridae. Two others have ssRNA(−) genomes and belong to the order Mononegavirales: Sclerotinia sclerotiorum negative-stranded RNA virus 1 (SsNSRV-1), species Sclerotinia sclerotimonavirus, family Mymonaviridae, and Fusarium graminearum negative-stranded RNA virus 1 (FgNSRV-1). The fifth is a dsRNA virus, Colletotrichum camelliae filamentous virus 1 (CcFV-1), which is phylogenetically related to Aspergillus fumigatus tetramycovirus-1 and Beauveria bassiana polymycovirus-1. The small number of filamentous mycoviruses reported may simply reflect their rarity compared to other types of mycoviruses, but historically it may also reflect the methods most often used to detect mycoviruses. For many years the most commonly used method to screen for the presence of mycoviruses was to extract high molecular weight (MW) dsRNAs, indicative of the presence of either dsRNA viruses or the replicative form of ssRNA viruses. Although this simple technique discovered many RNA mycoviruses, it does not detect DNA viruses and also fails to detect some ssRNA viruses, as is well documented for some ssRNA plant viruses, such as luteoviruses and some potyviruses. Amongst the mycoviruses, BotV-X and BotV-F are examples of viruses not detected by dsRNA analysis.
Although these viruses were discovered during screening of Botrytis cinerea isolates for viral dsRNAs, they did not yield sufficient quantities of dsRNA for detection by gel electrophoresis and were only found during virus purification by differential centrifugation, from a B. cinerea isolate that had been chosen as a negative control. More recently, deep sequencing of total RNA extracts from fungi has detected virus sequences that phylogenetic analysis places in families that include filamentous, encapsidated viruses. For example, the genome of Macrophomina phaseolina tobamo-like virus 1 (MpTLV1) contains four ORFs encoding a methyltransferase and helicase, an RdRp, a putative movement protein, and a coat protein, with a gene order and genome organization typical of members of the genus Tobamovirus. Although filamentous particles have not been observed, the existence of a coat protein gene suggests that encapsidated particles are produced. In other cases genome analysis may show the absence of a coat protein gene, indicating that encapsidated particles are not produced. For example, Sclerotinia sclerotiorum debilitation associated RNA virus (SsDRV) is classified as a member of the family Alphaflexiviridae, a predominantly plant virus family, although it lacks a CP gene and consequently does not form encapsidated particles.

Table 1 Summary of characterized flexuous mycoviruses (September 2018)

| Virus name | Genome | Order | Family | Genus | Genome size (kb) | No. of genome segments | Particle size (nm) | Particle type |
|---|---|---|---|---|---|---|---|---|
| Botrytis virus X | ssRNA(+) | Tymovirales | Alphaflexiviridae | Botrexvirus | 7 | 1 | 13 × 720 | filamentous nucleoprotein |
| Botrytis virus F | ssRNA(+) | Tymovirales | Gammaflexiviridae | Mycoflexivirus | 7 | 1 | 13 × 720 | filamentous nucleoprotein |
| Sclerotinia sclerotimonavirus [a] | ssRNA(−) | Mononegavirales | Mymonaviridae | Sclerotimonavirus | 10.0 | 1 | 22 × 200–2000 | filamentous enveloped nucleoprotein |
| Fusarium graminearum negative-stranded RNA virus 1 | ssRNA(−) | Mononegavirales | Unassigned | Unassigned | 9.1 | 1 | 35–50 × ~1200 | filamentous enveloped nucleoprotein |
| Colletotrichum camelliae filamentous virus 1 (CcFV-1) | dsRNA | Unassigned | Unassigned | Unassigned | 12.3 | 8 | 12–18 × 1000–4427 | filamentous nucleoprotein |

[a] Syn. = Sclerotinia sclerotiorum negative-stranded RNA virus 1 (SsNSRV-1).


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21325-3


Fig. 1 Genome organization of Botrytis virus F showing the relative positions of the ORFs and their expression products. Mtr, methyltransferase; Hel, helicase; RdRp, RNA-dependent RNA polymerase; CP, capsid protein. The arrow marks the “leaky” opal stop codon in ORF1. Copyright © 2017, International Committee on Taxonomy of Viruses [ICTV].

Botrytis Virus F (BotV-F)

Family – Gammaflexiviridae
Genus – Mycoflexivirus

Genome Structure

BotV-F was the first flexuous mycovirus to be characterized and also the first fully sequenced virus from B. cinerea. The genome consists of a single molecule of linear ssRNA(+), 6827 nt in length, excluding the 3′-poly(A) tail. The genomic RNA consists of two major ORFs separated by a 93 nt untranslated region (UTR), a 5′-UTR of 63 nt and a 3′-UTR of 71 nt, followed by a poly(A) tail (Fig. 1). ORF1 contains conserved methyltransferase and helicase motifs, terminating with an opal stop codon (UGA) and yielding a protein of 153 kDa. Readthrough of this codon is expected to extend the protein to 212 kDa, which also includes an RNA-dependent RNA polymerase (RdRp) domain. ORF2 encodes the only structural protein, the 32 kDa coat protein. One obvious difference between the genome structure of BotV-F and those of related plant viruses is the apparent lack of a movement protein, which many plant viruses encode to mediate the passage of infectious viral material from cell to cell by interaction with the plant host plasmodesmata. However, fungal viruses such as BotV-F presumably do not require movement proteins, as the septal pore, which separates individual compartments of the mycelium, is 150–250 nm and therefore unlikely to be a barrier to the movement of mycovirus particles.
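The genome figures quoted above can be cross-checked with simple arithmetic: subtracting the three untranslated regions from the genome length gives the space available for the two ORFs, which can then be compared with a rough estimate derived from the protein masses. The following is only an illustrative sketch. The variable and function names are hypothetical, and the average residue mass of 110 Da is a general rule of thumb assumed here, not a value taken from the BotV-F literature.

```python
# Figures quoted in the text for BotV-F (nt, excluding the poly(A) tail).
GENOME_NT = 6827
UTR_5_NT = 63
INTERGENIC_NT = 93
UTR_3_NT = 71

# Assumption: about 110 Da average mass per amino acid residue (rule of thumb).
AVG_RESIDUE_DA = 110.0


def coding_nt(genome_nt, *noncoding_nt):
    """Nucleotides left for ORFs after removing the non-coding regions."""
    return genome_nt - sum(noncoding_nt)


def approx_nt_for_protein(mass_kda):
    """Approximate coding length (nt) for a protein of the given mass."""
    residues = mass_kda * 1000.0 / AVG_RESIDUE_DA
    return round(residues * 3)


# Space available for ORF1 (with readthrough) plus ORF2.
orf_total = coding_nt(GENOME_NT, UTR_5_NT, INTERGENIC_NT, UTR_3_NT)

# Rough estimate from the 212 kDa readthrough product and the 32 kDa coat protein.
estimate = approx_nt_for_protein(212) + approx_nt_for_protein(32)

print(orf_total)  # 6600
print(estimate)   # approximate; depends on the assumed residue mass
```

Under these assumptions the two values agree to within about 1%, which is as close as a mass-based estimate can be expected to get.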

Biological Properties

BotV-F was originally detected in B. cinerea isolated from strawberry in New Zealand as a co-infection with Botrytis virus X (BotV-X) (genus Botrexvirus, family Alphaflexiviridae). BotV-F was subsequently detected in B. cinerea isolates from grapevine in New Zealand, cucumber in Israel, lettuce in the UK, and tomato and grapevine in France. Surveys of B. cinerea isolates from strawberry plants (Fragaria sp.) from the Auckland region and isolates from grapevine (Vitis sp.) across New Zealand detected BotV-F in 8/59 (13.6%) of isolates. In addition, 16% (4/25) of B. cinerea isolates from various host plant species around the world also tested positive for BotV-F. Phylogenetic analysis of thirteen BotV-F isolates, based on ~300 bp sequences spanning the intergenic region between ORFs 1 and 2, placed the isolates in three distinct clades. However, the three clades showed no significant correlation with the geographic origin of the isolates, suggesting that BotV-F has a long association with B. cinerea and has been distributed with its host fungus, most probably in B. cinerea infections of plants. B. cinerea transfected in vitro with purified BotV-F particles showed changes in hyphal morphology, reduced linear growth in culture, reduced virulence in a Phaseolus vulgaris detached leaf assay and significant changes in amino acid, carbohydrate, and fatty acid metabolism.
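The survey percentages quoted above follow directly from the isolate counts. The short sketch below reproduces them. The function name is hypothetical, and pooling the two collections is an assumption made purely for descriptive purposes, since the New Zealand and worldwide isolates were sampled differently.

```python
def percent(positive, tested):
    """Percentage of positive isolates, rounded to one decimal place."""
    return round(100.0 * positive / tested, 1)


# Counts quoted in the text.
nz_positive, nz_tested = 8, 59        # New Zealand strawberry and grapevine isolates
world_positive, world_tested = 4, 25  # isolates from various hosts worldwide

print(percent(nz_positive, nz_tested))        # 13.6
print(percent(world_positive, world_tested))  # 16.0

# Assumption: the pooled figure is only a descriptive summary of both collections.
print(percent(nz_positive + world_positive, nz_tested + world_tested))
```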

Virion Morphology

The virions of BotV-F are flexuous nucleoprotein filaments with a modal length of approximately 720 nm and a diameter of approximately 13 nm (Fig. 2). The virus particles can be partially purified by chloroform treatment of homogenised fungal mycelium and differential centrifugation.

Phylogenetic Relationships

When first characterized in 2001, sequence analysis of BotV-F indicated that it was most similar to plant viruses in the (then) family Flexiviridae, although it did not fit well into any of the existing genera. The complete replicase protein showed its highest amino acid identity (22%–23%) to those of the allexivirus, garlic virus X (GVX), the tymovirus, turnip yellow mosaic virus (TYMV), the potexvirus, strawberry mild yellow edge-associated virus (SMYEV), and the vitivirus, grapevine virus B (GVB). In the methyltransferase region, BotV-F showed highest amino acid identity (34%) to the tymovirus, TYMV, and the marafivirus, oat blue dwarf virus (OBDV), while in the helicase region it was closer to GVX and SMYEV (28%) and in the RdRp to the allexivirus, garlic virus A (GarV-A) (40%), the potexvirus, SMYEV (39%), and the capillovirus, cherry virus A (CVA) (39%). The amino acid sequence of the putative coat protein of BotV-F shows highest homology to the conserved central core regions of viruses belonging to the genera Capillovirus, Trichovirus, and Vitivirus, but is unusual in having a long C-terminal region compared with the coat proteins of plant viruses. Another unusual feature of BotV-F, compared to flexuous filamentous ssRNA(+) plant viruses, is the putative read-through opal codon in the replicase, which is more typical of read-through codons found in the bipartite genomes of ssRNA straight rod-shaped viruses such as the tobraviruses and furoviruses. In the 9th Report of the ICTV (2011) BotV-F was assigned as the type member of a newly created genus Mycoflexivirus within a newly created family Gammaflexiviridae, order Tymovirales.

Fig. 2 Electron micrograph of flexuous rod-shaped particles in a partly purified preparation of Botrytis cinerea isolate RH106–10, negatively stained with 2% potassium phosphotungstate, pH 4. Bar = 200 nm. From Howitt, R.L.J., Beever, R.E., Pearson, M.N., and Forster, R.L.S., 2001. Genome characterization of Botrytis virus F, a flexuous rod-shaped mycovirus resembling plant ‘potex-like’ viruses. Journal of General Virology 82, 67–78.

Fig. 3 Genome organization of Botrytis virus X showing the relative positions of the ORFs and their expression products. Mtr, methyltransferase; Hel, helicase; RdRp, RNA-dependent RNA polymerase; CP, capsid protein. Copyright © 2017, International Committee on Taxonomy of Viruses [ICTV].

Botrytis Virus X (BotV-X)

Family – Alphaflexiviridae
Genus – Botrexvirus

Genome Structure

The BotV-X genome consists of a single molecule of linear ssRNA(+), 6966 nt in length, excluding the 3′-poly(A) tail. The genome comprises five putative ORFs, a 5′-UTR of 95 nt, untranslated regions of 17 nt separating ORFs 1 and 2 and 6 nt separating ORFs 2 and 3, and a 3′-UTR of 149 nt, followed by a poly(A) tail (Fig. 3). The genome has a 5′ cap immediately followed by the nucleotide sequence GAAAAC, typically found in members of the plant virus genus Potexvirus. ORF1 encodes the 158 kDa polymerase. ORF2, which is in the +1 frame relative to ORF1, encodes a 30 kDa putative protein (p30), but the nucleotide and amino acid sequences of p30 do not reveal any significant homology with other known sequences. ORF3, which is in the same frame as ORF2, encodes the 44 kDa coat protein and terminates at an amber (UAG) codon. ORF4 and ORF5 both encode putative proteins with an expected molecular mass of ~14 kDa. ORF4 overlaps ORF3 and is in the same frame as ORF1, while ORF5 is in the same frame as ORF2 and ORF3. Neither of these proteins shows any significant homology with known viral proteins.

Biological Properties

BotV-X was first discovered infecting an isolate of the plant pathogenic fungus B. cinerea from a strawberry plant in New Zealand, which was also infected with BotV-F. BotV-X was subsequently detected in other New Zealand B. cinerea isolates from grapevine and tomato.


The transmission rate of BotV-X through asexual conidia is >95%, and under experimental conditions transmission has also been shown through sexually produced ascospores from both the male parent (34% transmission) and the female parent (53% transmission). Since B. cinerea is rarely observed to produce ascospores in nature, they presumably play a minor role in virus dispersal, but sexual reproduction may provide a mechanism for the virus to move into new vegetatively incompatible groups of the fungus. Immunofluorescent light microscopy of sections through BotV-X infected fungal tissue and fixed mycelia grown on glass slides, using antibodies raised against BotV-X coat protein expressed in Escherichia coli, showed (presumed) virus aggregates in hyphal tips, closely associated with fungal cell membranes and walls, and adjacent to septal pores. Although it is generally assumed that mycoviruses move passively through the cytoplasm, it is not known exactly how flexuous viruses of filamentous fungi are transported from cell to cell. The occurrence of virus aggregates next to septa suggests that the septal pores of B. cinerea may restrict viral movement, although the presence of viral aggregates on both sides of the septum suggests that movement through septa does occur. In an apple assay, naturally BotV-X-infected isolates grew slightly slower (i.e., were less virulent) than virus-free isolates. However, B. cinerea isolates transfected in vitro with purified BotV-X particles showed no significant difference in virulence compared to the isogenic virus-free line, in either bean leaf or apple fruit assays. In culture on malt extract agar plates some naturally infected B. cinerea isolates showed increased growth rates compared to BotV-X-free isolates.

Virion Morphology

The virions of BotV-X are flexuous nucleoprotein filaments with a modal length of ~720 nm and a diameter of ~13 nm (Fig. 2). The virus particles can be partially purified by chloroform treatment of homogenised fungal mycelium and differential centrifugation.

Phylogenetic Relationships

ORF1 encodes the 158 kDa replicase protein (p158), the amino acid sequence of which shows highest identity to replicases from the allexivirus garlic virus A [GarV-A] (49%) and the potexvirus cymbidium mosaic virus [CymMV] (36%). Amino acid alignments with the three conserved internal domains (methyltransferase, helicase, and RdRp) of GarV-A show 51%, 55%, and 73% identity, respectively, while identity to the corresponding regions of CymMV ranges from 45% to 51%. Phylogenetic analysis of the RdRp region clusters BotV-X amongst the Allexivirus and Potexvirus genera but closest to the allexiviruses. The first 211 amino acids of the p44 protein (ORF3) do not show significant homology with known proteins, but the C-terminal 189 amino acids (corresponding to a 20 kDa protein) show significant identity to the coat proteins of CymMV (35%), the allexivirus, garlic virus X [GarV-X] (34%) and the carlavirus, garlic latent virus [GarLV] (34%). Alignment of the conserved residues in the coat protein, associated with the putative salt bridge and corresponding hydrophobic core in flexuous rod-shaped ssRNA plant viruses, also showed highest amino acid identity to CymMV (65%). Phylogenetic analysis of the putative coat protein gene clusters BotV-X with the genus Allexivirus. Although BotV-X has a different genome structure from both allexiviruses and potexviruses, it has a number of properties in common with the viruses of these two genera. While it displays high replicase sequence similarity to the allexiviruses, other features, such as the size of the replicase gene, the overall size of the genome, and the conserved 5′-terminal nucleotide sequence, are more typical of potexviruses. In the 9th Report of the ICTV (2011) BotV-X was assigned as the type member of a newly created genus Botrexvirus within a newly created family Alphaflexiviridae.

Sclerotinia Sclerotimonavirus

Family – Mymonaviridae
Genus – Sclerotimonavirus

Genome Structure

The genome of the sclerotimonavirus Sclerotinia sclerotiorum negative-stranded RNA virus 1 (SsNSRV-1) is a 10,002 nt ssRNA(−) that lacks a poly(A) tail at the 3′ terminus and has an uncapped 5′ terminus (Fig. 4). It consists of six non-overlapping ORFs (I–VI), linearly arranged and separated by conserved gene-junction sequences commonly found in mononegaviruses. ORF V encodes the largest protein, which contains a conserved mononegaviral RdRp domain. ORF II appears to encode the two proteins (p43 and p41) that have been isolated from purified nucleocapsids of the virus. In mononegaviruses ORF IV typically encodes glycoprotein G; however, it is unlikely that ORF IV of SsNSRV-1 encodes a G protein as it is only 186 nt long and lacks a putative signal peptide.

Biological Properties

SsNSRV-1 (species Sclerotinia sclerotimonavirus) confers hypovirulence on Sclerotinia sclerotiorum. Symptoms include slow growth on potato dextrose agar (PDA), loss of the ability to produce sclerotia, loss of pathogenicity on rapeseed and zigzag growth of hyphal tips in culture (possibly due to loss of growth polarity). SsNSRV-1 infected Sclerotinia sclerotiorum has been isolated from both soybean and rapeseed in China and is widely distributed, having been detected in five provinces (Anhui, Heilongjiang, Hubei, Jiangxi, and Shaanxi) representing a range of climatic regions.


Fig. 4 Virion morphology and genome organization and expression map of Sclerotinia sclerotiorum negative-stranded RNA virus 1 (SsNSRV-1). (a) Morphology and structure of SsNSRV-1 virions and nucleoprotein–RNA complexes (RNPs). Particles and RNPs were purified from mycelia of strain AH98 and Ep-1PNA367-PT2 and negatively stained with 2% PTA (w/v, pH 7.4) (scale bars are shown at the bottom). (A) Filamentous, possibly enveloped virions and RNPs. (B) Purified tight or loose coils of RNPs. (C) Rings that make up the coils and NP monomers. (b) Genomic organization of SsNSRV-1 showing all six ORFs. (c) Deduced transcription map of SsNSRV-1 based on 5′- and 3′-RACE. From Fig. 1 of 2015.008agM.A.v3.Mymonaviridae.pdf (https://talk.ictvonline.org/files/ictv_official_taxonomy_updates_since_the_8th_report/m/animal-dsrna-and-ssrna–viruses/5831). Modified from Liu, L., Xie, J., Cheng, J., et al., 2014. Fungal negative-stranded RNA virus that is related to bornaviruses and nyaviruses. Proceedings of the National Academy of Sciences of the United States of America 111, 12205–12210.

Virion Morphology

The virions of SsNSRV-1, purified from mycelia and negatively stained with 2% PTA, are filamentous, 25–50 nm in diameter and 200–2000 nm in length, and appear to be enveloped (Fig. 4). The nucleocapsids are long, flexible, helical structures, which, when tightly coiled, have a diameter of 20–22 nm and possibly contain two nucleoproteins of ~43 kDa (p43) and ~41 kDa (p41), both translated from ORF II. The nucleocapsids of SsNSRV-1 (as observed by TEM) consist of tightly stacked turns and have a variable appearance, including both relatively rigid rods and loosely coiled helices, similar to the RNA–nucleoprotein complexes (RNPs) reported from other mononegaviruses. Electron micrographs of ultrathin fungal hyphal sections show tubular structures, similar to the purified virions, consisting of nucleocapsids enclosed within an outer membrane. Mononegaviruses typically have enveloped virions, but the apparently stable filamentous shape of SsNSRV-1 is unusual as most mononegaviruses have pleomorphic virions.

Phylogenetic Relationships

SsNSRV-1 shares a number of general characteristics with mononegaviruses, although there are some differences. Mononegaviruses typically have nonsegmented ssRNA(−) genomes, 8.9–19 kb in size, and virions of variable morphologies that are enveloped with a phospholipid membrane originating from the host. Phylogenetic analysis of RdRp amino acid sequences clusters SsNSRV-1 with viruses of the families Nyamiviridae and Bornaviridae (order Mononegavirales), although SsNSRV-1 does not conform to the typical five-gene pattern (N-P-M-G-L) of mononegaviruses as it has six ORFs, and the NPs are encoded by ORF II instead of ORF I. The 2018 ICTV report classifies SsNSRV-1 in a newly created genus Sclerotimonavirus, family Mymonaviridae, order Mononegavirales.

Fusarium Graminearum Negative-Stranded RNA Virus 1

Family – Mymonaviridae (?)
Genus – Unclassified

Genome Structure

Fusarium graminearum negative-stranded RNA virus 1 (FgNSRV-1) has an ssRNA(−) genome 9072 nt in length, consisting of five ORFs (ORF I–V) of 774, 1167, 238, 5893 and 552 nt, respectively, and lacks a poly(A) tail at the 3′ terminus. The five ORFs putatively encode five proteins (P I to P V) with predicted molecular masses of 29.1, 42.8, 6.2, 221.6, and 21.1 kDa, respectively. The 3′-untranslated region (UTR) and 5′-UTR are 126 nt and 220 nt long, respectively, and have perfect complementarity for the first six residues (3′-UCCUGC–GCAGGA-5′), a feature common among mononegaviruses. The P II and P IV proteins show significant amino acid sequence similarity to the nucleoprotein N and large protein L of SsNSRV-1, with P IV containing mononegaviral-like L polymerase domains. However, as in SsNSRV-1, the genomic arrangement of FgNSRV-1 does not follow the basic five-gene pattern (N-P-M-G-L) of typical mononegaviruses: the N protein is encoded by ORF II rather than by ORF I, as in most other known mononegaviruses.
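The terminal complementarity described above can be verified by checking that the two ends pair base-for-base when aligned in antiparallel orientation. The sketch below is a minimal illustration with hypothetical function names. It assumes that the notation 3′-UCCUGC–GCAGGA-5′ is written 3′ to 5′ throughout, so that the first six letters are the 3′-terminal residues and the last six are the 5′-terminal residues.

```python
# Watson-Crick partners for RNA bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def termini_complementary(five_prime_end, three_prime_end):
    """Both arguments are written 5'->3'.

    Returns True if the two ends can pair base-for-base in
    antiparallel orientation.
    """
    return reverse_complement(five_prime_end) == three_prime_end


# Assumption: the quoted termini are written 3'->5', so each is reversed
# here to obtain the conventional 5'->3' orientation.
three_prime_end = "UCCUGC"[::-1]  # CGUCCU
five_prime_end = "GCAGGA"[::-1]   # AGGACG

print(termini_complementary(five_prime_end, three_prime_end))  # True
```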

Biological Properties

FgNSRV-1 was originally isolated from the plant pathogenic fungus Fusarium graminearum. It has no discernible impact on host morphology, mycelial growth, biomass, conidia production, or virulence.

Particle Morphology

The virions of FgNSRV-1 (purified from mycelia) are filamentous, 35–50 nm in diameter and ~1200 nm in length. Short rod-like particles and loose nucleocapsid-like helical structures, similar to the condensed nucleocapsid structures of SsNSRV-1, have also been observed. Given the similarities between FgNSRV-1 and SsNSRV-1, it is possible that the FgNSRV-1 virions are also enveloped by a membrane.

Phylogenetic Relationships

FgNSRV-1 shows high sequence similarity (93.9%–98.8% nucleotide; 97.3%–100% amino acid) to soybean leaf-associated negative-stranded RNA virus 1 (SLaNSRV-1), a virus of unknown morphology identified in a soybean leaf metatranscriptomic study. The putative nucleoprotein (P II) and RNA polymerase (P IV) also show significant homology to those of mononegaviruses, and phylogenetic analysis of entire polymerase sequences indicates that FgNSRV-1 is related to the ssRNA(−) mycovirus SsNSRV-1, suggesting that FgNSRV-1 can be classified in the family Mymonaviridae.

Colletotrichum Camelliae Filamentous Virus 1

Family – Unclassified
Genus – Unclassified

Genome Structure

The genome of Colletotrichum camelliae filamentous virus 1 (CcFV-1) consists of eight separate dsRNAs, of 2444, 2253, 2012, 1299, 1122, 1085, 1053, and 990 bp. DsRNAs 1, 2, 3, 4, 6, and 7 each consist of a single ORF, while dsRNAs 5 and 8 each contain two ORFs (one >200 aa, the other <50 aa). DsRNA 1 shares 47% amino acid identity with the RNA-dependent RNA polymerase (RdRp) of Aspergillus fumigatus tetramycovirus-1 (AfuTmV-1) and low similarities to the dsRNA1 of Alternaria tenuissima virus (ATV), Cladosporium cladosporioides virus 1 (CcV-1), and Botryosphaeria dothidea RNA virus 1 (BdRV1). The remaining RNAs show no detectable similarity with any known viral RNA sequences. The predicted protein from ORF 1 clusters CcFV-1 with AfuTmV-1, BdRV1, Beauveria bassiana polymycovirus-1 (BbPmV-1), and Beauveria bassiana polymycovirus-2 (BbPmV-2), separate from other dsRNA viruses and closer to ssRNA(+) viruses belonging to the Caliciviridae. It has been suggested that CcFV-1 may be an intermediate link between dsRNA and ssRNA(+) viruses. The protein coded by dsRNA 3 of CcFV-1 is highly similar to the methyltransferase of various bacteria, but the functions of the remaining putative proteins are unknown.
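As a simple consistency check, the eight segment lengths listed above can be summed and compared with the total genome size of 12.3 kb given for CcFV-1 in Table 1. The sketch below uses hypothetical variable names and only the figures quoted in the text.

```python
# Lengths (bp) of CcFV-1 dsRNAs 1-8 as quoted in the text.
SEGMENTS_BP = [2444, 2253, 2012, 1299, 1122, 1085, 1053, 990]

total_bp = sum(SEGMENTS_BP)
total_kb = round(total_bp / 1000, 1)

print(len(SEGMENTS_BP))  # 8 segments
print(total_bp)          # 12258
print(total_kb)          # 12.3
```

The sum agrees with both the genome size and the segment count listed for CcFV-1 in Table 1.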

Biological Properties

CcFV-1 is a dsRNA virus that was first reported from an isolate of the plant pathogenic fungus Colletotrichum camelliae, from tea (Camellia sinensis (L.)) in China. CcFV-1 infected C. camelliae exhibited different morphology and pigmentation from virus-free isolates and showed reduced growth rates (5.4 versus 6.3–6.5 mm/day) and reduced virulence, with lesion lengths of 1.0 mm compared to 6.1–14.9 mm for virus-free isolates. The CcFV-1 infected isolate also showed cytological abnormalities in ultra-thin hyphal sections observed by TEM, including the formation of large membranous vesicles that contained virus-like particles with a width similar to those of CcFV-1.

Virion Morphology

CcFV-1 infected cultures contain non-enveloped flexuous filamentous particles with widths ranging from 11.9 to 17.7 nm and lengths from 1001 to 4427 nm. Analysis of particles from different fractions of ultracentrifuged sucrose gradients indicates that dsRNAs 1–8 are separately encapsidated. DsRNA 4 codes for the capsid protein, as demonstrated by immunosorbent electron microscopy, Western blotting and indirect enzyme-linked immunosorbent assay, using antisera produced to purified P4 protein.

Phylogenetic Relationships

Colletotrichum camelliae filamentous virus 1 is currently an unclassified virus species. Although the RdRp of CcFV-1 shows some degree of identity to the RdRps of AfuTmV-1, BdRV1, BbPmV-1, and BbPmV-2, the first three of these viruses are capsidless with genomes consisting of 4–5 dsRNAs, while BbPmV-2 has 7 dsRNAs and is possibly capsidless or may form bacilliform virus-like particles. In contrast, CcFV-1 forms encapsidated filamentous particles and has eight dsRNAs. In addition, the RdRp of CcFV-1 appears to be in an intermediate position between those of ssRNA(+) viruses of the extended picorna-like superfamily, such as caliciviruses, astroviruses, and picornaviruses, and those of dsRNA viruses. It has been proposed that CcFV-1 may have originated from a ssRNA(+) viral ancestor belonging to clades three or six of the picorna-like superfamily, although the CP has no detectable sequence similarity with those of known filamentous ssRNA(+) plant viruses.

Relationships Between Filamentous Viruses From Fungi and Plants Mycoviruses have been reported from at least 19 virus families and one genus, eleven of which (Barnaviridae, Chrysoviridae, Deltaflexiviridae, Gammaflexiviridae, Genomoviridae, Hypoviridae, Megabirnaviridae, Mymonaviridae, Narnaviridae, Quadriviridae, and Botybirnavirus) currently include only fungal viruses. The other nine families include viruses from a range of hosts: Alphaflexiviridae (fungi and plants), Amalgaviridae (fungi and plants), Botourmiaviridae (fungi and plants), Endornaviridae (fungi and plants), Metaviridae (fungi, invertebrates, plants and vertebrates), Partitiviridae (fungi, plants, protozoa), Pseudoviridae (fungi, algae, invertebrates, plants, vertebrates), Reoviridae (fungi, invertebrates, plants and vertebrates), and Totiviridae (fungi and protozoa), although in the majority of cases the fungal viruses are classified in distinct genera. Isometric encapsidated mycoviruses are relatively common and are found in at least six of these families and the one genus, but to date (2019) only five encapsidated filamentous mycoviruses have been fully sequenced and described. Although these five filamentous mycoviruses are taxonomically diverse, they have one important factor in common: they are all from plant pathogenic fungi. BotV-F and BotV-X are ssRNA(+) viruses and belong to the order Tymovirales, which contains predominantly plant viruses. SsNSRV-1, belonging to the species Sclerotinia sclerotimonavirus, and FgNSRV-1, as yet unassigned to a species, have ssRNA(-) genomes and belong to the order Mononegavirales, which includes viruses from a wide range of hosts, including vertebrates, invertebrates and plants. However, the filamentous virions of SsNSRV-1 and FgNSRV-1 are somewhat different from those of other members of the Mononegavirales, where the virions are typically spherical, bacilliform or pleomorphic, although the nucleocapsids may have a filamentous form.
The fifth virus is a dsRNA virus CcFV-1, which is phylogenetically related to the unclassified viruses AfuTmV-1 and BbPmV-1. However, the particles of CcFV-1 have a more defined filamentous structure than the other two viruses, which do not encode a conventional CP and are described as being non-conventionally encapsidated, since they encode a proline/alanine/serine (PAS)-rich protein which most likely coats the viral dsRNA.

Interestingly, despite the apparent scarcity of flexuous filamentous mycoviruses, the first two filamentous mycoviruses to be characterized, BotV-X and BotV-F, were found in the same B. cinerea isolate. While they differ in genome structure, the genomes are of a similar size, and virus purification from the co-infected B. cinerea isolate yielded only one size of flexuous filamentous particle, indicating that the two viruses have similar-sized particles. BotV-X and BotV-F show only low to moderate sequence identity to each other in the replicase and coat protein genes, and there is no significant similarity between other putative proteins encoded by the two genomes. However, the replicase genes of both viruses show greatest similarity to plant viruses belonging to the order Tymovirales, with a highest amino acid identity of 73% between the RdRp of BotV-X and the plant virus GarV-A. While this is remarkably high for interviral homology between plant and fungal viruses, there are many examples of homology between plant and fungal viruses from a range of viral families including the Endornaviridae, Totiviridae, Chrysoviridae, Amalgaviridae, and Partitiviridae. Despite some sequence similarities, the genome structures of BotV-F and BotV-X are distinct from those of the plant virus families of the Tymovirales, one obvious difference being the absence of a gene encoding a movement protein in BotV-X and BotV-F. In plant viruses of the Tymovirales, movement proteins, which interact with plant host plasmodesmata to allow passage of infectious viral material from cell to cell, are present either as a single polypeptide (e.g., Capillovirus, Vitivirus) or as a triple gene block (e.g., Allexivirus, Potexvirus). The lack of a movement protein in BotV-X and BotV-F is consistent with the lack of plasmodesmata or similar restrictions to movement in their fungal host.
The remarkably high sequence similarity between the replicase of BotV-X and that of the plant allexivirus, GarV-A, may indicate a relatively recent divergence from a common ancestor. Since B. cinerea has an extremely wide host range, including alliums, a possible scenario is that historically B. cinerea acquired a plant virus, such as GarV-A, that was infecting the same host, and this virus has subsequently lost unnecessary parts of the genome, such as the movement protein gene. SsNSRV-1 and FgNSRV-1 are also found in plant pathogenic fungi and are phylogenetically related to mononegaviruses that collectively infect a range of hosts, including plants. As with BotV-X and BotV-F, there are differences in genome organization between FgNSRV-1 and SsNSRV-1 and related viruses from other hosts. There are also differences in particle morphology: the virions of SsNSRV-1 and FgNSRV-1 are filamentous, whereas the virions of other members of the Mononegavirales are typically spherical, bacilliform or pleomorphic, although the nucleocapsids may take on a filamentous form. As with BotV-X and BotV-F, the pathogenic association between the fungal host and plants potentially provides an opportunity for the fungus to acquire a plant virus that becomes modified over time. Although most plant viruses do not have lipoprotein envelopes, some, such as members of the ssRNA(-) families Bunyaviridae and Rhabdoviridae (order Mononegavirales), have lipid-based envelopes, which are important for specific interaction with their insect vectors. While there is currently no evidence for insect vectors of mycoviruses, the envelopes of SsNSRV-1 and FgNSRV-1 virions would be consistent with these viruses being derived from insect-transmitted plant viruses. Shared lineages between plant and fungal viruses are common, with both plant and fungal viruses belonging to at least twelve ICTV-recognized families, with phylogenetic evidence of common ancestry.
In addition, it has been shown that some fungal viruses are able to replicate in plant cells and vice versa. For example, the two mycoviruses Penicillium aurantiogriseum totivirus-1 and Penicillium aurantiogriseum partitivirus-1 are able to replicate in protoplasts of the plant species Nicotiana benthamiana and Nicotiana tabacum, while the plant virus cucumber mosaic virus is able to form stable infections in cultures of the fungus Rhizoctonia solani. It seems less likely that mycoviruses would be acquired by and form an ongoing infection in plants, because they lack the movement proteins required for systemic movement in plants. However, the possibility should not be totally discounted, as it has been proposed that plant ourmiaviruses, which have a tripartite ssRNA(+) genome, have arisen from a narnavirus (monopartite genome) that acquired additional RNAs that encode a movement protein. Although the dsRNA virus CcFV-1 was found in the plant pathogen Colletotrichum camelliae, its RNA-dependent RNA polymerase (RdRp) shares its highest amino acid identity (47%) with the RdRp of AfuTmV-1, a virus from the human fungal pathogen Aspergillus fumigatus. However, it also shares low identity with the dsRNA-1 of ATV, CcV-1, and BdRV1 from the plant pathogenic fungi Alternaria tenuissima, Cladosporium cladosporioides and Botryosphaeria dothidea, respectively. This raises the possibility that CcFV-1 may be derived from a plant virus, although none of the characterized dsRNA plant viruses have long filamentous particles. The increasing use of deep sequencing is detecting additional, often unencapsidated, mycoviruses with some sequence identity to known filamentous viruses.
For example, members of the family Deltaflexiviridae (order Tymovirales), Fusarium graminearum deltaflexivirus 1 (FgDFV1), Sclerotinia sclerotiorum deltaflexivirus 1 (SsDFV1), and soybean leaf-associated mycoflexivirus 1 (SlaMyfV1), are distantly related to other viruses of the order Tymovirales, although none of these viruses produce coat proteins. However, in other cases it is unknown whether presumed virus sequences belong to viruses that produce encapsidated filamentous particles, and further investigation, including virus purification and electron microscopy, is required to determine the nature of the virions.

Further Reading
Arthur, K., Pearson, M.N., 2014. Geographic distribution and sequence diversity of the mycovirus Botrytis virus F. Mycological Progress 13, 1249–1253.
Boine, B., Kingston, R.L., Pearson, M.N., 2012. Recombinant expression of the coat protein of Botrytis virus X and development of an immunofluorescence detection method to study its intracellular distribution in Botrytis cinerea. Journal of General Virology 93, 2502–2511.
Chen, X., He, H., Yang, X., et al., 2016. The complete genome sequence of a novel Fusarium graminearum RNA virus in a new proposed family within the order Tymovirales. Archives of Virology 161, 2899–2903.
Howitt, R.L.J., Beever, R.E., Pearson, M.N., Forster, R.L.S., 2001. Genome characterization of Botrytis virus F, a flexuous rod-shaped mycovirus resembling plant ‘potex-like’ viruses. Journal of General Virology 82, 67–78.

Howitt, R.L.J., Beever, R.E., Pearson, M.N., Forster, R.L.S., 2006. Genome characterization of a flexuous rod-shaped mycovirus, Botrytis virus X, reveals high amino acid identity to genes from plant ‘potex-like’ viruses. Archives of Virology 151, 563–579.
Jia, H., Dong, K., Zhou, L., et al., 2017. A dsRNA virus with filamentous viral particles. Nature Communications 8, 168.
Liu, L., Xie, J., Cheng, J., et al., 2014. Fungal negative-stranded RNA virus that is related to bornaviruses and nyaviruses. Proceedings of the National Academy of Sciences of the United States of America 111, 12205–12210.
Marzano, S.-Y.L., Nelson, B.D., Ajayi-Oyetunde, O., et al., 2016. Identification of diverse mycoviruses through metatranscriptomics characterization of the viromes of five major fungal plant pathogens. Journal of Virology 90, 6846–6863.
Mu, F., Xie, J., Cheng, S., et al., 2018. Virome characterization of a collection of Sclerotinia sclerotiorum from Australia. Frontiers in Microbiology 8, 2540.
Pearson, M.N., Beever, R.E., Boine, B., Arthur, K., 2009. Mycoviruses of filamentous fungi and their relevance to plant pathology. Molecular Plant Pathology 10, 115–128.
Pearson, M.N., Beever, R.E., Boine, B., Tan, C., 2009. Can mycoviruses be used for the biocontrol of the plant pathogenic fungus Botrytis cinerea? In: Elad, Y., Freeman, S. (Eds.), Biological Control of Fungal and Bacterial Plant Pathogens 43. IOBC/WPRS Bulletin, pp. 7–10. (ISBN 978-92-9067-217-3).
Roossinck, M.J., 2019. Evolutionary and ecological links between plant and fungal viruses. New Phytologist (Tansley Review) 221, 86–92.
Wang, L., He, H., Wang, S., et al., 2018. Evidence for a novel negative-stranded RNA mycovirus isolated from the plant pathogenic fungus Fusarium graminearum. Virology 518, 232–240.

Prions of Yeast and Fungi Reed B Wickner, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, United States Herman K Edskes, National Institutes of Health, Bethesda, MD, United States Published by Elsevier Ltd. This is an update of R.B. Wickner, H. Edskes, T. Nakayashiki et al., Prions of Yeast and Fungi, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00407-6.

Glossary Nonchromosomal gene A gene that segregates 4+:0- in meiosis and can be transferred by cytoplasmic mixing, in contrast to chromosomal genes that segregate 2+:2- in meiosis, and are not transferred by cytoplasmic mixing.

Nonsense suppressor tRNA A mutant transfer RNA that recognizes a translational stop codon and inserts an amino acid thus allowing the peptide chain to continue.

Introduction and History The word prion, meaning “infectious protein” without need for a nucleic acid, was coined to explain the properties of the agent producing the mammalian transmissible spongiform encephalopathies (TSEs). The yeast and fungal prions were identified by their unique genetic properties, which were unexpected for any nucleic acid replicon but specifically predicted for an infectious protein. [PSI] was described by Brian Cox in 1965 as a nonchromosomal genetic element of Saccharomyces cerevisiae that increased the efficiency of a weak nonsense suppressor transfer RNA (tRNA). [URE3] was described by Francois Lacroute as a nonchromosomal gene that relieved nitrogen catabolite repression, allowing expression of genes needed for utilizing poor nitrogen sources even when a good nitrogen source was available. The [Het-s] prion was described in 1952 by Rizet as a nonchromosomal gene needed for heterokaryon incompatibility in Podospora anserina. In 1994, we showed that [URE3] and [PSI+] were prions based on their genetic properties (see below). The [PIN] prion of yeast was discovered in 1997 by Derkatch and Liebman in their studies of de novo generation of the [PSI] prion. There is now an array of prions known in yeast (Table 1).

Genetic Signature of a Prion Viruses of yeast and fungi generally do not exit one cell and enter another, but spread by cell–cell fusion, as in mating or heterokaryon formation. Infectious proteins (prions) should likewise be nonchromosomal genetic elements. To distinguish prions from replicating nucleic acids, three genetic criteria were proposed (Fig. 1): (1) If a prion can be cured, it can reappear in the cured strain at some low frequency. (2) Overproduction of the protein capable of being a prion should increase the frequency of the prion arising de novo. (3) If the prion produces a phenotype by the simple inactivation of the protein, then this phenotype should resemble the phenotype of mutation of the gene encoding the protein, which gene must be needed for prion propagation. All three criteria were satisfied by [PSI] and [URE3], strongly indicating, perhaps proving, that they were prions. The [Het-s] prion of P. anserina and the [PIN] prion of Saccharomyces cerevisiae were likewise proved by application of the same genetic criteria, but because their prion form produces the phenotype, rather than the absence of the normal form, the prion protein gene is needed for propagation of the prion, but the phenotype of the prion is not the same as mutants in the prion protein gene.
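The three genetic criteria can be read as a simple decision rule. The following is a minimal sketch of that reading, not a published procedure: the class name, field names, and result strings are hypothetical labels introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ElementObservations:
    """Hypothetical record of genetic observations on a nonchromosomal element."""
    reappears_after_curing: bool         # criterion 1: reversible curability
    overproduction_induces: bool         # criterion 2: overproduction raises de novo frequency
    phenotype_matches_gene_mutant: bool  # criterion 3: phenotype resembles mutation of the gene
    gene_required_for_propagation: bool  # the protein's gene is needed to propagate the element


def interpret(obs: ElementObservations) -> str:
    """Apply the three criteria as a decision rule (illustrative only)."""
    if not obs.gene_required_for_propagation:
        return "not supported as a prion of this protein"
    if not (obs.reappears_after_curing and obs.overproduction_induces):
        return "not supported as a prion of this protein"
    if obs.phenotype_matches_gene_mutant:
        # Pattern described for [URE3] and [PSI]: the prion form is inactive.
        return "prion; prion form inactivates the protein"
    # Pattern described for [Het-s] and [PIN]: the prion form produces the phenotype.
    return "prion; prion form itself produces the phenotype"
```

Under this sketch, an element meeting all three criteria falls into the first prion category, while one meeting only the first two (with the gene still required for propagation) falls into the second.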

Self-Propagating Amyloid as the Basis for Most Yeast Prions The finding that Sup35p, Ure2p, and HET-s were protease resistant and aggregated in prion-containing cells, and that these proteins (and particularly their prion domains (Fig. 2)) would form amyloid in vitro indicated that a self-propagating amyloid (Fig. 3) was the basis of these prions. This was confirmed by the finding that the corresponding prions were transmitted by introduction of amyloid formed in vitro from the recombinant proteins. In some cases it was shown that the soluble form or nonspecific aggregates of the protein were ineffective. This argues that the respective amyloids are not by-products or a dead-end stage of these prions, but are themselves the infectious material. All infectious Ure2p amyloids are larger than about 40-mer size. Amyloid formed in vitro is capable, for at least [PSI] and [URE3], of transmitting any of several prion variants. This implies that the amyloids can have any of several structures. Strong variants of [PSI+] (those with a more pronounced phenotype) are based on shorter, more fragile amyloid filaments with a shorter β-sheet region than are weak [PSI+] variants.

Encyclopedia of Virology, 4th Edition, Volume 4
doi:10.1016/B978-0-12-809633-8.21282-X

Table 1 Prions of Saccharomyces cerevisiae and Podospora anserina

Prion | Prion protein | Prion phenotype | Normal protein function
[URE3] | Ure2p | Derepressed genes for utilizing poor nitrogen sources in presence of a good nitrogen source; slow growth | Repression of genes for utilizing poor nitrogen sources in presence of a good nitrogen source
[PSI+] | Sup35p | Readthrough of termination codons; slow growth; death (depends on prion variant) | Translation termination
[PIN+] | Rnq1p | Rare generation (by cross-seeding) of [PSI+] or [URE3] | None known
[OCT+] | Cyc8p | Slow growth; impaired mating and sporulation | Transcription repressor subunit
[SWI+] | Swi1p | Poor growth on raffinose, galactose or glycerol | Chromatin remodeling subunit
[MOT+] | Mot3p | Inappropriate derepression of anaerobic genes; colony polymorphisms | Transcription regulator
[MOD+] | Mod5p | Partial azole-resistance; slow growth | tRNA isopentenyltransferase
[BETA] | Prb1p | Active protease B (non-amyloid prion) | Active protease B (this is a functional prion)
[Het-s] | HET-s | Heterokaryon incompatibility | Heterokaryon incompatibility (this is a functional prion)

Note: All but [BETA] are amyloid-based prions. [Het-s] is a prion of the filamentous fungus Podospora anserina, and the others are prions of S. cerevisiae.

Fig. 1 Genetic signature of a prion. Among nonchromosomal genetic elements, those with the three properties shown here must not be nucleic acid replicons and are almost certainly prions. However, only prions for which the prion form of the protein is inactive (such as [URE3] or [PSI]) will have the third property. The [Het-s] and [PIN+] prions are active forms of the HET-s protein and Rnq1 protein, respectively, and have the first and second properties but not the third.

Fig. 2 Prion domains. The prion domains of Ure2p, Sup35p, and HET-s are largely unstructured in the native form and in β-sheet in the prion form.

Fig. 3 Infectious amyloids of yeast prion proteins have an in-register parallel folded β-sheet architecture. It is suggested that the amyloids of different prion variants differ in the locations of the folds of the sheets (i.e., the turns of the strands). The in-register architecture is maintained by the favorable interactions among identical side-chains. The same interactions force the new monomer joining the end of the filaments to align in-register and thus have the turns in their strands at the same location as molecules already in the filament. This constitutes conformational templating.

Shuffling Prion Domains and Amyloid Structure The prion domains (Fig. 2) of Ure2p and Sup35p are quite rich in Asn and Gln residues, and nearly the entire sequence of Rnq1p, the basis of the [PIN] prion, is rich in these amino acids. However, many Q/N-rich proteins are not capable of being prions. Thus, it was assumed that specific sequences in the known prion domains were important for prion formation. The Sup35 prion domain has octapeptide repeats much like those in PrP (the mammalian prion protein), and deletion or duplication of these showed substantial effects on prion generation. In addition, single amino acid changes in the prion domain of Sup35p blocked prion propagation. To critically test whether the Ure2p prion domain had sequences essential for prion development, the entire Q/N-rich region (residues 1–89) was randomly shuffled (without changing the amino acid content) and each of five shuffled sequences was inserted into the chromosome in place of the normal prion domain. Surprisingly, each of these five shuffled sequences could support prion generation and propagation, although one was rather unstable. Each protein with the shuffled sequence could also form amyloid in vitro. This showed that it is the amino acid content of the Ure2p prion domain that determines prion formation, and that sequence plays only a minor role. Similarly, five shuffled versions of the Sup35p prion domain were each inserted in place of the normal sequence. Again, all five shuffled versions allowed formation and propagation of a [PSI]-like prion. It is likely that the effects of deletion or duplication of the octapeptide repeats on prion formation or propagation were due to changes in the length or composition of the prion domain. It appears that repeats per se are not important for prion generation or propagation.
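The shuffling operation itself is easy to state computationally: the residues are randomly permuted, so the order changes while the amino acid content stays fixed. The sketch below only illustrates that idea. The toy sequence is a made-up Q/N-rich string, not the real Ure2p or Sup35p prion domain, and the function name and seeds are assumptions for this example.

```python
import random
from collections import Counter


def shuffle_preserving_composition(sequence: str, seed: int) -> str:
    """Return a random permutation of the residues in `sequence`."""
    rng = random.Random(seed)   # seeded so the example is reproducible
    residues = list(sequence)
    rng.shuffle(residues)       # in-place random permutation
    return "".join(residues)


# Hypothetical Q/N-rich toy sequence, NOT a real prion domain.
toy_domain = "MNNQQNGSNNQSNQNRQNSGNNQQGNSNQ"

# Five shuffled versions, mirroring the number used in the experiments described above.
shuffled = [shuffle_preserving_composition(toy_domain, seed) for seed in range(5)]

for variant in shuffled:
    # The residue order differs, but the amino acid content is unchanged.
    assert Counter(variant) == Counter(toy_domain)
```

Because a permutation cannot alter residue counts, any property that depends only on composition is preserved by construction, which is the logic of the experiment.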

Shuffleable Prion Domains Suggest Parallel In-Register β-Sheet Structure Amyloids are β-sheet structures, but there are at least three kinds of β-sheets. Antiparallel β-sheets have adjacent peptide chains oriented in opposite directions: N→C next to C→N. This results in pairing of largely nonidentical residues. A β-helix also involves pairing of largely nonidentical residues, although they are within the same peptide chain. A parallel β-sheet pairs identical residues if it is in-register, but in principle, one could have an out-of-register parallel β-sheet, in which case nonidentical residues would be bonded to each other. Prion propagation (and amyloid propagation in general) is a very sequence-specific process. For example, a single amino acid change (at residue 138) can block propagation of scrapie in tissue culture. Humans are polymorphic at PrP residue 129 with roughly equal numbers of alleles encoding M and V. Either M/M or V/V individuals can get Creutzfeldt–Jakob disease (a human prion disease), but M/V heterozygotes rarely do. Similarly, a single amino acid change in the prion domain of Sup35p can block propagation of [PSI] from the normal sequence, but the mutant Sup35p can itself become a prion nonetheless. Thus, if a prion amyloid has an antiparallel, parallel out-of-register, or β-helix structure, there must be some form of complementarity between bonded residues. Shuffling such a sequence would be expected to destroy the complementarity. In contrast, shuffling the sequence of a parallel in-register β-sheet would still leave identical residues paired. This suggested that prion domains that can be shuffled without destroying their prion-forming ability are forming parallel in-register β-sheets.
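The pairing argument can be made concrete with a deliberately crude toy model in which two identical chains are laid side by side and position i of one chain is paired with a position of the other. This is only a sketch: the function names, the six-residue toy sequences, and the one-residue shift are assumptions for illustration, the β-helix case (pairing within one chain) is not modeled, and real β-sheet geometry is more complex than index arithmetic.

```python
def paired_residues(seq: str, architecture: str, shift: int = 1) -> list[tuple[str, str]]:
    """List residue pairs between two adjacent identical chains (toy model)."""
    n = len(seq)
    if architecture == "parallel_in_register":
        # Position i faces position i of the neighboring chain.
        return [(seq[i], seq[i]) for i in range(n)]
    if architecture == "parallel_out_of_register":
        # Same direction, but the neighbor is offset by `shift` residues.
        return [(seq[i], seq[i + shift]) for i in range(n - shift)]
    if architecture == "antiparallel":
        # The neighbor runs in the opposite direction.
        return [(seq[i], seq[n - 1 - i]) for i in range(n)]
    raise ValueError(f"unknown architecture: {architecture}")


def fraction_identical(pairs: list[tuple[str, str]]) -> float:
    """Fraction of pairs in which both residues are the same amino acid."""
    return sum(a == b for a, b in pairs) / len(pairs)


original = "QNGSYR"   # hypothetical toy sequence with six distinct residues
shuffled = "RYQSNG"   # one permutation of the same residues

for seq in (original, shuffled):
    # In-register parallel pairing is identical-to-identical for any ordering.
    assert fraction_identical(paired_residues(seq, "parallel_in_register")) == 1.0
```

In this toy model the other two arrangements pair only nonidentical residues for these sequences, so any complementarity they relied on would depend on the specific order and would be lost on shuffling.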

Infectious Prion Amyloids Have In-Register Parallel Folded β-Sheet Architecture Solid-state NMR experiments, combined with mass per filament length determinations, have shown that the prion domains of Sup35p, Ure2p and Rnq1p each form folded in-register parallel β-sheet structures, like the architecture of pathologic amyloids such as those of human Aβ, amylin/IAPP, Tau, and alpha-synuclein. In contrast, the functional prion [Het-s] is composed of amyloid of the HET-s protein having a β-helix structure (Fig. 3).

Prion Variants and the Species Barrier A single protein or peptide can be the basis of a wide variety of prion “strains” or “variants”, based on differing conformations of the protein in the amyloid, each of which is self-propagating. The manifestation of prion variant differences may be strength of the prion phenotype, stability of prion propagation, sensitivity to the overproduction or deficiency of various chaperones or non-chaperone proteins, or sensitivity to the various endogenous anti-prion systems (see below). Although prions may be transmissible between species (in mammals or yeast), the efficiency of transmission is generally lower between species than within a species, an effect due to sequence differences between the prion proteins of the donor and recipient species. Barriers are also seen within a single species (“intraspecies barriers”) as a result of sequence polymorphisms, probably selected in the wild to reduce the probability of being infected with a pathologic prion. The ease with which any of these transmission barriers is overcome is sharply dependent on the prion variant, with some completely unable to be transmitted and others transmitted without apparent barrier. It is proposed that these barriers reflect the range of conformers that a given molecule can assume. If the donor conformer is such that the recipient can assume that structure, then there will be little barrier; if the recipient cannot assume it, the barrier will be high. The genetics of prion variants makes it clear that some variants are “dominant” to others when introduced together, while other variants are compatible and segregate relative to each other during mitotic growth. While a variant may appear to be stable with respect to one property, it may be unstable with respect to another. Inaccurate templating may cause some prion “mutations”, while severely detrimental effects of a [URE3] or [PSI+] prion on the cells may lead to selection during growth of very rare milder variants or loss of the prion.

Prion Variant Information Templating Mechanism The folded in-register parallel β-sheet architecture of the infectious yeast prion amyloids determines the structure except for the extent of the β-sheets and the location of the folds in the sheet. The structure is kept in-register by favorable interactions among identical amino acid side chains that can only happen if register is maintained (Fig. 3). This naturally suggests a mechanism by which protein conformation can be templated. In order that their amino acid side chains have these favorable interactions, the unstructured prion domains of the monomers about to join the end of the filament must have their turns at the same locations as those of the molecules already in the filament. This directs the conformation of the new molecule to be essentially that of the rest of the molecules in the filament. Each prion variant is suggested to have the folds of the sheet (= turns of the molecule) in places specific for that variant. This is the only mechanism that has yet been proposed for prion variant propagation/conformational templating.

Chaperones and Prions Chaperones of the Hsp40, Hsp70, and Hsp104 groups, as well as Hsp90 and its co-chaperones, have been found to be clearly involved in prion propagation (Table 2). Millimolar concentrations of guanidine cure each of the amyloid-based prions by specific inhibition of Hsp104. At least one function of these chaperones is to break large amyloid filaments into smaller ones which can then be distributed at cell division. Overexpressed Hsp104 has a second activity, curing the [PSI+] prion and, less efficiently, [URE3]. The mechanism of curing is controversial but may involve asymmetric segregation of prion filaments, possibly related to Hsp104's role in collecting damaged proteins and retaining them in the mother cell. This second Hsp104 activity is lost in mutants in (or deletion of) the N-terminal domain (e.g., T160M), but such mutants retain their prion propagation promoting activity and their ability to protect cells from heat shock. Other proteins affecting prion propagation include the nucleotide exchange factors for Hsp70s, namely Fes1p and Sse1p, the cytoskeleton assembly factor Sla1p, and the co-chaperone Sgt2p.

Prion Generation, and [PIN]: A Prion That Gives Rise to Prions One of the lines of evidence that showed [PSI] was a prion of Sup35p was that overproduction of Sup35p increased the frequency with which [PSI] arises de novo. However, it was found that in some strains, overproduction of Sup35p did not yield detectable emergence of [PSI]-carrying clones. Another nonchromosomal genetic element, named [PIN] for [PSI]-inducibility, was found necessary. [PIN] is a self-propagating amyloid form of the Rnq1 (rich in Asn (N) and Gln (Q)) protein, and it promotes de novo generation of [URE3] as well as [PSI].

Table 2 Interactions of prions with chaperones and other proteins

Mechanism | Components | Yeast prions affected
Filament breakage: removal of a single prion protein molecule from middle of the filament generates new filaments | Hsp104, Hsp70 (Ssa1–4), Hsp40 (Sis1) | All amyloid-based prions
Filament collection ~ asymmetric segregation | Hsp104 overproduction curing [with Hsp90s and Hsp90 co-chaperones] | [PSI+], ([URE3])
Filament collection ~ asymmetric segregation | Btn2 [with Hsp42] | [URE3], (some [PSI+])
?Blocking filament ends | Upf1,2,3 | [PSI+]
?Competition for monomers | Upf1,2,3 | [PSI+]
?Occupation of unstructured (prion) domains | ?? | ??any amyloid-based prion protein
Solubilizing monomers: a consequence of the “Filament breakage” mechanism above | Hsp104, Hsp70 (Ssa1–4), Hsp40 (Sis1) | All amyloid-based prions
?Blockage of access to filaments for breakage | Excess Hsp104 | [PSI+]
?Sequestration of Sis1p | Cur1p overproduction | [URE3]

Biological Roles of Prions: A Help or a Hindrance? In an attempt to discern whether yeast prions are an advantage or disadvantage to their host organism, growth comparisons of isogenic [PSI+] and [psi-] strains have been carried out under a variety of conditions. To what extent the various growth conditions tested represent the normal yeast habitat seems unknowable, although [psi-] was an advantage under far more conditions than was [PSI+]. An alternative approach was to compare the frequency with which [PSI+] or [URE3] was found in wild strains to those of several “selfish” RNA and DNA viruses and replicons known in S. cerevisiae. In any organism, an infectious element (such as a virus) may be widely distributed in spite of causing disease in its host, because the infection process overcomes and outraces the loss of infected individuals from the negative effects of the infecting element. Certainly an advantageous infectious element will quickly become widespread, as selection and infection operate in the same direction. In fact, while the mildly deleterious RNA and DNA viruses and plasmids of yeast are easily found in wild strains, neither [URE3] nor [PSI+] was found in any of the 70 wild strains examined. This indicates that even the mildest variants of [URE3] and [PSI+] produce disease in their hosts, and a rather more severe disease than the mild nucleic acid replicons. However, more detrimental variants of each prion, either dramatically slowing growth or even killing the host, are more common than the mild variants usually used for laboratory studies. Moreover, the recent discovery of an array of cellular anti-prion systems (see below) indicates that the cells themselves do not regard these prions as beneficial. The [Het-s] prion of Podospora appears to carry out the normal fungal function of heterokaryon incompatibility, thought to be a protection against the sometimes debilitating fungal viruses. Indeed, as one would expect for a prion with a function for the cell, over 90% of wild Podospora isolates carry [Het-s], consistent with a beneficial effect. Unlike [PSI+] and [URE3], [PIN+] is found in wild strains at a frequency similar to that of the parasitic RNA and DNA viruses and plasmids. This suggests that [PIN+] is at least not as severe a pathogen as are [URE3] and [PSI+].
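To give a sense of what finding no [URE3] or [PSI+] in 70 wild strains implies numerically, the short calculation below asks how high the true frequency could be while still making zero detections in 70 strains unsurprising. It is a sketch under assumptions that are not part of the original analysis: the strains are treated as independent random samples, a simple binomial model is used, and the 95% confidence level is an arbitrary conventional choice.

```python
def upper_bound_zero_observed(n_samples: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on a frequency when 0 of n samples are positive.

    Solves (1 - p) ** n_samples = 1 - confidence for p (binomial model).
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_samples)


# 0 of 70 wild strains carried the prion (count taken from the text above).
bound = upper_bound_zero_observed(70)
print(f"95% upper bound on frequency: {bound:.3f}")
```

Under these assumptions the bound is roughly 4%, so the survey is compatible with the prions being absent or rare in the wild, in contrast to elements that are described as easily found.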

Anti-Prion Systems The pathological character of yeast prions suggested that there may be cellular systems that cure prions as they arise, much as DNA repair systems eliminate most mutations as they arise. Indeed, it was found that over 90% of [URE3] variants arising (at increased frequency) in a btn2 cur1 strain are cured by restoring just the normal amount of these proteins. Btn2p cures [URE3] by collecting Ure2p amyloid filaments at one locus in the cell, increasing the likelihood that one of the daughter cells will lack filaments and thus be cured. Likewise, the frequency of [PSI+] arising spontaneously is elevated over 10-fold in a mutant (T160M) that lacks the Hsp104-overproduction-curing activity toward [PSI+], and most of those variants are cured by just restoring the normal Hsp104 without overproduction. Mutants in the UPF genes, responsible for nonsense-mediated decay, show a greater than 10-fold elevated frequency of [PSI+] generation, and, again, most of the variants arising are cured by restoring normal levels of expression of these proteins. Sup35p forms a complex with Upf1p, Upf2p and Upf3p, and it is suggested that this complex either competes with the filaments for free monomers, or blocks the ends of filaments, preventing their growth. The ribosome-associated Hsp70 family chaperones Ssb1p and Ssb2p keep down the frequency of [PSI+] arising de novo, but in this case restoring the normal amounts of these proteins does not cure the variants arising. It is inferred that the Ssbs, known to be involved in promoting proper folding of nascent proteins, prevent the generation of [PSI+]. The Hsp40 family member Sis1p is essential for propagation of [PSI+] and [URE3], but also protects cells from the potential lethality of [PSI+] without curing the prion. Similarly, the Lug1/Ylr352w protein prevents the near-lethality of [URE3], apparently via a novel function of Ure2p (not nitrogen regulation) that is essential in the absence of Lug1/Ylr352w.
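Why collecting filaments at one site should favor curing can be illustrated with a simple probability sketch. The model is an assumption made for illustration and is not taken from the studies described: each aggregate is taken to segregate independently, entering the daughter cell with a fixed probability, and the numbers of aggregates and the probability used below are hypothetical.

```python
def prob_daughter_without_filaments(n_aggregates: int, p_to_daughter: float) -> float:
    """Probability that a daughter cell inherits none of n independent aggregates."""
    if n_aggregates < 0:
        raise ValueError("n_aggregates must be non-negative")
    if not 0.0 <= p_to_daughter <= 1.0:
        raise ValueError("p_to_daughter must be between 0 and 1")
    # Each aggregate independently stays out of the daughter with probability 1 - p.
    return (1.0 - p_to_daughter) ** n_aggregates


# Hypothetical numbers: 20 dispersed filaments versus one collected aggregate.
dispersed = prob_daughter_without_filaments(20, 0.4)
collected = prob_daughter_without_filaments(1, 0.4)
print(f"dispersed: {dispersed:.2e}, collected: {collected:.2f}")
```

In this toy model, gathering many filaments into a single unit raises the chance of a filament-free daughter from a negligible value to a substantial one, which is the qualitative effect attributed to Btn2p above.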


Inositol Polyphosphates and Prion Propagation In a screen for anti-prion systems, it was found that nearly all variants of the [PSI+] prion require one of several inositol poly/pyrophosphates for their propagation. Inositol hexakisphosphate (IP6), 5-diphospho-IP5 and 5-diphospho-IP4 are each able to support [PSI+] propagation. In the absence of the 5-diphospho-IPs, the 1-diphospho-IP5 has an inhibitory effect on [PSI+] propagation. Siw14p, a 5-pyrophosphatase active on 5-diphospho IPs, prevents propagation of about half of [PSI+] variants because they need elevated levels of that compound. The mediators of these effects are, as yet, unknown.

Enzyme as Prion While most of the known prions involve amyloids, the word prion (infectious protein) is more general, requiring only that transmission be by protein alone. If an enzyme is made as an inactive precursor that needs the active form of the same enzyme for its activation, then such a protein can be a prion. The vacuolar protease B (Prb1p) of S. cerevisiae can be such a prion in a mutant lacking protease A, which normally activates its precursor. Cells initially carrying only the inactive precursor remain so unless the active enzyme is introduced. Once a cell has some active enzyme, the autoactivation process can continue indefinitely. It is likely that other examples of this type of phenomenon will be found among the many protein kinases, methylases, acetylases, and other protein-modifying enzymes that are known.
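The self-perpetuating behavior described above can be illustrated with a toy kinetic model. This is a minimal sketch, not a model of Prb1p itself: it assumes a precursor made at a constant rate, activation that requires already-active enzyme, and loss of both forms by dilution. The function name, rate constants, and step size are all hypothetical choices for illustration.

```python
def simulate_autoactivation(active0, precursor0=10.0, synthesis=1.0,
                            k_act=1.0, dilution=0.1, dt=0.01, t_end=200.0):
    """Euler integration of a toy self-activating enzyme model.

    precursor' = synthesis - k_act*active*precursor - dilution*precursor
    active'    = k_act*active*precursor - dilution*active
    All parameter values are illustrative assumptions.
    """
    precursor, active = precursor0, active0
    for _ in range(int(round(t_end / dt))):
        activation = k_act * active * precursor  # needs active enzyme
        d_prec = synthesis - activation - dilution * precursor
        d_act = activation - dilution * active
        precursor += dt * d_prec
        active += dt * d_act
    return precursor, active


if __name__ == "__main__":
    # No active enzyme at the start: the cell stays in the inactive state.
    print(simulate_autoactivation(active0=0.0))
    # A small seed of active enzyme: the active state takes over and persists.
    print(simulate_autoactivation(active0=0.01))
```

In this sketch the two starting conditions end in different stable states even though the cells are otherwise identical, which is the sense in which such an enzyme behaves as a protein-only heritable element.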

Conclusions The advent of yeast prions has propelled the prion field forward, giving the first proof that an infectious protein can exist, the first structure/architecture of a prion amyloid, and a wide array of information on cellular components favoring and antagonizing prion propagation. With the findings that the common human amyloidoses are actually prions, the insights into general prion phenomena provided by the yeast and fungal work will continue to push this field forward.

Acknowledgment This research was supported [in part] by the Intramural Research Program of the NIH, The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK).

See also: Prions of Vertebrates

Further Reading
Aigle, M., Lacroute, F., 1975. Genetical aspects of [URE3], a non-Mendelian, cytoplasmically inherited mutation in yeast. Molecular and General Genetics 136, 327–335.
Cascarina, S.M., Paul, K.R., Machihara, S., Ross, E.D., 2018. Sequence features governing aggregation or degradation of prion-like proteins. PLOS Genetics 14, e1007517. doi:10.1371/journal.pgen.1007517.
Chernoff, Y.O., Lindquist, S.L., Ono, B.-I., Inge-Vechtomov, S.G., Liebman, S.W., 1995. Role of the chaperone protein Hsp104 in propagation of the yeast prion-like factor [psi+]. Science 268, 880–884.
Coustou, V., Deleu, C., Saupe, S., Begueret, J., 1997. The protein product of the het-s heterokaryon incompatibility gene of the fungus Podospora anserina behaves as a prion analog. Proceedings of the National Academy of Sciences of the United States of America 94, 9773–9778.
Cox, B.S., 1965. PSI, A cytoplasmic suppressor of super-suppressor in yeast. Heredity 20, 505–521.
Derkatch, I.L., Bradley, M.E., Hong, J.Y., Liebman, S.W., 2001. Prions affect the appearance of other prions: The story of [PIN]. Cell 106, 171–182.
Jung, G., Jones, G., Masison, D.C., 2002. Amino acid residue 184 of yeast Hsp104 chaperone is critical for prion-curing by guanidine, prion propagation, and thermotolerance. Proceedings of the National Academy of Sciences of the United States of America 99, 9936–9941.
King, C.Y., Diaz-Avalos, R., 2004. Protein-only transmission of three yeast prion strains. Nature 428, 319–323.
Liebman, S.W., Chernoff, Y.O., 2012. Prions in yeast. Genetics 191, 1041–1072.
Maddelein, M.L., Dos Reis, S., Duvezin-Caubet, S., Coulary-Salin, B., Saupe, S.J., 2002. Amyloid aggregates of the HET-s prion protein are infectious. Proceedings of the National Academy of Sciences of the United States of America 99, 7402–7407.
Ness, F., Cox, B., Wonwigkam, J., Naeimi, W.R., Tuite, M.F., 2017. Over-expression of the molecular chaperone Hsp104 in Saccharomyces cerevisiae results in the malpartition of [PSI+] propagons. Molecular Microbiology 104, 125–143.
Paushkin, S.V., Kushnirov, V.V., Smirnov, V.N., Ter-Avanesyan, M.D., 1997. In vitro propagation of the prion-like state of yeast Sup35 protein. Science 277, 381–383.
Reidy, M., Sharma, R., Shastry, S., et al., 2014. Hsp40s specify functions of Hsp104 and Hsp90 protein chaperone machines. PLOS Genetics 10, e1004720.
Tanaka, M., Chien, P., Naber, N., Cooke, R., Weissman, J.S., 2004. Conformational variations in an infectious protein determine prion strain differences. Nature 428, 323–328.
Tycko, R., Wickner, R.B., 2013. Molecular structures of amyloid and prion fibrils: Consensus vs. controversy. Accounts of Chemical Research 46, 1487–1496.
Wickner, R.B., 1994. [URE3] as an altered URE2 protein: Evidence for a prion analog in S. cerevisiae. Science 264, 566–569.
Wickner, R.B., Bezsonov, E.E., Son, M., et al., 2018. Anti-prion systems in yeast and inositol polyphosphates. Biochemistry 57, 1285–1292. doi:10.1021/acs.biochem.1027b01285.
Wickner, R.B., Shewmaker, F., Bateman, D.A., et al., 2015. Yeast prions: Structure, biology and prion-handling systems. Microbiology and Molecular Biology Reviews 79, 1–17.

Single-Stranded DNA Mycoviruses Daohong Jiang, Huazhong Agricultural University, Wuhan, China © 2021 Elsevier Ltd. All rights reserved.

Glossary
Biological control A method of controlling pests using beneficial organisms; here it specifically means using mycoviruses to control fungal diseases.
DNA mycovirus, also called fungal DNA virus A DNA virus that replicates in fungi.
Genomoviridae A group of small circular single-stranded DNA viruses that are phylogenetically related to plant geminiviruses, but lack movement protein genes.
Hypovirulence Reduced virulence of a fungal pathogen caused by infection with a mycovirus. The hypovirulence and hypovirus of the chestnut blight/Cryphonectria parasitica system is a classic example for studying mycoviruses and the biological control of plant fungal diseases.
Sclerotinia sclerotiorum A worldwide-distributed ascomycetous fungal pathogen that can attack more than 400 species and subspecies of plants. Important diseases caused by it include stem rot of rapeseed (Brassica napus).
ssDNA mycovirus Single-stranded DNA mycovirus.
Vegetative incompatibility A non-self-recognition system that is determined by vic genes in fungi, and is a mechanism by which fungi prevent infection by molecular parasites, such as mycoviruses. It is a key hindrance to successful biological control of fungal diseases using mycoviruses.

Introduction DNA viruses are viruses that have DNA genomes and replicate using a DNA-dependent DNA polymerase. They can be grouped into two classes, double-stranded (ds) DNA viruses and single-stranded (ss) DNA viruses. DNA viruses are very common in both prokaryotic microorganisms and eukaryotic organisms, including humans, animals, and plants. The most feared DNA virus is variola virus, which causes smallpox. Three families of DNA viruses have been established in plants, namely Caulimoviridae, Geminiviridae, and Nanoviridae, whereas DNA viruses have very rarely been reported in fungi. Previously, a DNA virus named Rhizidiomyces virus was discovered in the filamentous microorganism Rhizidiomyces sp., which used to be grouped in the kingdom Fungi; however, Rhizidiomyces is actually a fungus-like water mold that belongs to the kingdom Chromista. In the 1980s, a DNA virus with large virions was observed in a strain of Rhizoctonia solani in China, but this study was not followed up. Whether DNA viruses occur in fungi therefore remained unknown for a considerable period of time. In 2010, a small circular single-stranded DNA virus was discovered and reported in the cosmopolitan necrotrophic pathogenic fungus Sclerotinia sclerotiorum. The discovery of DNA mycoviruses has significantly enriched the understanding of mycoviruses, broadened the research field, and provided new research ideas and methods for the use of mycoviruses to control fungal diseases.

The Host of the DNA Mycovirus Sclerotinia sclerotiorum is an ascomycetous plant pathogenic fungus that was first described in 1837 by Libert, M.A. and named Peziza sclerotiorum; the name was changed to Sclerotinia libertiana by Fuckel, L. in 1870, and finally to the current name by de Bary, A. in 1884. S. sclerotiorum is a typical necrotrophic plant pathogen; it attacks more than 400 species and subspecies of plants, most of which are dicotyledons, including many economically important crops, such as cruciferous crops, leguminous crops, and compositae vegetable crops. The diseases caused by S. sclerotiorum are often called white mold on many vegetable crops or stem rot on rapeseed (Brassica napus). S. sclerotiorum infects plants via either ascospores or infectious hyphae, produces sclerotia on plant residues at the late stage of infection and uses them to undergo dormancy in the field. When contacting hosts, sclerotia germinate carpogenically to release ascospores into the air and infect aerial parts of plants, or germinate myceliogenically to produce infectious hyphae that infect the basal stem and old leaves in contact with the ground. The sclerotia, apothecia, asci, and ascospores, and the induced symptoms on rapeseed, are presented in Fig. 1. Stem rot is the major disease of rapeseed worldwide, and no resistant cultivar is available. Control of stem rot depends mainly on fungicides, which are often not effective. Hence, a number of field-collected isolates of S. sclerotiorum were screened for hypovirulence-associated mycoviruses to control stem rot, and Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1 (SsHADV-1) was discovered.

Discovery of DNA Mycoviruses Sclerotia collected from rapeseed fields in China were screened for hypovirulence-associated mycoviruses, and one abnormal strain, DT-8, was isolated from a sclerotium sampled at Datong Lake District, Yiyang City, Hunan Province, China. Strain DT-8 grows slowly on potato dextrose agar (PDA) medium and forms an abnormal colony morphology rich in aerial hyphae on PDA

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21346-0


Fig. 1 Sclerotinia sclerotiorum and the induced stem rot of rapeseed (Brassica napus). A, Sclerotia (arrows) produced on residue of dead rapeseed; B, Apothecia germinated from sclerotia; C–D, Asci and ascospores; E, Symptom on stem (arrows); F, Symptom on basal stem (arrows); G, Premature ripening and lodging in a heavily infected rapeseed field; only the green plants were not killed.

plate (Fig. 2(A) and (D)). The sclerotia produced by strain DT-8 are usually smaller than those produced by normal strains of S. sclerotiorum and are distributed irregularly on the colony (Fig. 2(C)). Only very few sclerotia can produce apothecia and ascospores, and ascospore-derived offspring are fully recovered and do not differ from normal strains; occasionally, subcultures derived from hyphal tips of strain DT-8, such as strain DT-8VF, also show the normal phenotype of S. sclerotiorum. Strain DT-8 almost completely loses virulence on both rapeseed and Arabidopsis thaliana (Fig. 2(B)). When DNA was extracted from strain DT-8 using the CTAB method, two or more additional small DNA bands, besides the genomic DNA band, could be observed under UV light after separation by agarose gel electrophoresis (Fig. 3(A)), whereas cured cultures did not have these DNA bands, suggesting that the additional DNA was possibly associated with the abnormal characteristics of strain DT-8. The additional DNA was recovered from the agarose gel for sequencing analysis, and the results showed that it was the genome of a small single-stranded circular DNA virus. Isometric particles with a diameter of 22 nm (Fig. 3(C)) were successfully extracted and purified from mycelia of strain DT-8 by ultracentrifugation, and viral DNA could be extracted from the virions (Fig. 3(E)). These results directly confirmed that strain DT-8 was infected by a DNA virus. In dual culture, hypovirulence and the associated traits were co-transmitted with the viral DNA through hyphal contact (hyphal anastomosis). Furthermore, purified virions could successfully transfect protoplasts of a normal strain of S. sclerotiorum and confer hypovirulence on this host. Hence, the DNA virus is associated with hypovirulence and was named Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1 (SsHADV-1).

The Genome and Proteins of SsHADV-1 The genome of SsHADV-1 is a single-stranded circular DNA with a full length of 2166 nt (Fig. 3(B) and (F)). The genome has only two large open reading frames (ORFs), encoding a coat protein (CP) of 312 amino acid residues (aa) and a putative replication initiation protein (Rep) of 324 aa. The CP gene is located on the sense strand of the DNA genome, while the Rep gene is located on the antisense strand (Fig. 3(F)). The two genes are separated by two non-coding regions: the large intergenic region (LIR) of 133 nt and the small intergenic region (SIR) of 119 nt. The LIR has the nonanucleotide TAATATT ↓ AT at the apex of a stem-loop structure; this is a conserved site in circular ssDNA viruses that Rep recognizes and cleaves (↓) to initiate viral DNA replication. Furthermore, there is a bidirectional promoter in the LIR for transcription of the CP gene and the Rep gene. The coat protein was confirmed by sequencing analysis of protein samples extracted from purified virions,


Fig. 2 Abnormal phenotype of strain DT-8 of Sclerotinia sclerotiorum. A, Abnormal colony morphology, developed on PDA medium plate at 20°C for 15 days; B, Hypovirulence, could not induce typical lesion on leaf of Arabidopsis thaliana, inoculated plants were maintained at 20°C for 4 days post inoculation; C, Small sclerotia produced on a PDA medium (20°C for 30 days); D, Slow growth, bars represent standard deviation from the mean (n = 8), the small letters on top of the bars indicate whether the differences are statistically significant (P < 0.05). The hypovirulent strain Ep-1PN which is infected by two RNA viruses, its virus-free sexual progeny Ep-1PNA367 and virus-free isogenic strain DT-8VF derived by hyphal tipping culture of strain DT-8 were used as controls. Reproduced from Yu, X., Li, B., Jiang, D., et al., 2010. A geminivirus-related DNA mycovirus that confers hypovirulence to a plant pathogenic fungus. Proceedings of the National Academy of Sciences of the United States of America 107, 8387–8392.

while Rep protein was confirmed by alignment analysis with related viruses. Rep of SsHADV-1 shows high similarity to the Rep proteins of geminiviruses that infect numerous plants. Two conserved domains in the Rep proteins of geminiviruses, namely Rep catalytic domain (Gemini_AL1) (pfam00799) and Rep central domain (Gemini_AL1_M) (pfam08283) are also identified on the Rep of SsHADV-1. Seven conserved motifs which are shared with viruses in the genus Mastrevirus in the family Geminiviridae are also identified, namely IRD, RCR-I, RCR-II, RCR-III, RBR, NTP Binding-A, and NTP binding-B (Fig. 4(B)).
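The genome figures quoted in this section (2166 nt in total, a 312-aa CP, a 324-aa Rep, a 133-nt LIR and a 119-nt SIR) can be cross-checked with simple arithmetic. The sketch below assumes that the two ORFs do not overlap, that neither is spliced, and that each ORF length includes one stop codon; these are assumptions made for the check, and the function names are hypothetical.

```python
CODON_NT = 3


def orf_length_nt(protein_aa: int) -> int:
    """Length of an unspliced ORF in nt, counting one stop codon (assumed)."""
    return CODON_NT * protein_aa + CODON_NT


def genome_budget(cp_aa=312, rep_aa=324, lir_nt=133, sir_nt=119):
    """Return per-segment lengths and their sum for a two-ORF circular genome."""
    segments = {
        "CP ORF": orf_length_nt(cp_aa),
        "Rep ORF": orf_length_nt(rep_aa),
        "LIR": lir_nt,
        "SIR": sir_nt,
    }
    return segments, sum(segments.values())


if __name__ == "__main__":
    segments, total = genome_budget()
    for name, length in segments.items():
        print(f"{name}: {length} nt")
    print(f"total: {total} nt (reported genome length: 2166 nt)")
```

Under these assumptions the four segments account for the full reported genome length, which is consistent with the description of two ORFs separated only by the LIR and SIR.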

Host Range of SsHADV-1 SsHADV-1 can infect strains of S. sclerotiorum by hyphal anastomosis or simply through contact between colonies of virus-infected and non-infected strains. Transfection experiments have shown that SsHADV-1 can be introduced into, and replicate in, other species of the genus Sclerotinia, such as S. minor and S. nivalis. However, SsHADV-1 could not replicate in Botrytis cinerea, which belongs to the same family as Sclerotinia spp. Nor could SsHADV-1 replicate in Coniothyrium minitans, a mycoparasite of Sclerotinia spp., or in Magnaporthe oryzae or Saccharomyces cerevisiae, which are phylogenetically distant from its natural hosts. As far as we know, there is no report that SsHADV-1 infects other fungi in nature. Hence, SsHADV-1 has a narrow fungal host range.

Extracellular Entry of SsHADV-1 Many mycoviruses have no significant effects on their hosts and often lack an extracellular phase; hence they are also considered to be dispensable genetic elements in fungi. Virions of mycoviruses are generally not able to infect their hosts directly. Unlike most known mycoviruses, SsHADV-1 can infect S. sclerotiorum through intact hyphae. When a colony of a virus-free strain of S. sclerotiorum contacts SsHADV-1 virions spread on a PDA plate, the virus enters the fungal cells, successfully infects the colony, and confers hypovirulence on the virus-free strain (Fig. 5(A)–(C)). The virions can also infect S. sclerotiorum on plants, when sprayed


Fig. 3 Genomic structure and virions of Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1 (SsHADV-1). A, Extrachromosomal DNA elements in strain DT-8. DNA samples were extracted with the CTAB method from mycelia of strain DT-8 and separated on a 1.0% agarose gel. Lane Marker 1, λ-Hind III-digested DNA marker; Lane Marker 2, DL2000 DNA ladder marker (TaKaRa). B, Southern blot analysis of total DNA extracted from mycelia. The forms of viral DNA are indicated as “OC dsDNA” for open circular double-stranded (ds) DNA, “SC dsDNA” for supercoiled dsDNA, and “circular” for circular single-stranded (ss) DNA. A 379-bp DNA fragment of the Rep gene was PCR amplified, labeled with α-32P dCTP, and used as a probe. C, Isometric virions (about 22 nm) purified from mycelia of strain DT-8, observed under transmission electron microscopy. Virions were negatively stained with 1% uranyl acetate before observation. D, The major protein (coat protein) of SsHADV-1 extracted from virions and separated by SDS/PAGE. Virions were extracted from the mycelial mass of strain DT-8 and purified by ultracentrifugation. The protein was stained with Coomassie blue before observation. E, Viral genomic DNA extracted from virions and separated on a 1% agarose gel. F, Genome structure of SsHADV-1. Two large open reading frames (ORFs), one coding for the coat protein (CP) and one for the putative replication initiation protein (Rep), are presented as thick arrows. The positions of the stem-loop structure conserved among small circular single-stranded DNA viruses, the large intergenic region (LIR) and the small intergenic region (SIR) are displayed; the detailed DNA sequence of the stem-loop structure is shown on the right. The LIR is also shown in an expanded form to indicate the elements of the bidirectional promoter. Reproduced from Yu, X., Li, B., Jiang, D., et al., 2010. A geminivirus-related DNA mycovirus that confers hypovirulence to a plant pathogenic fungus. Proceedings of the National Academy of Sciences of the United States of America 107, 8387–8392.

on the aerial parts of plants (Fig. 5(D)–(E)); however, virions are not stable on plants and are inactivated shortly after being sprayed. These properties suggest that SsHADV-1 has potential for the biological control of plant diseases.

Mutualistic Interaction Between SsHADV-1 and Mushroom Sciarid Fly The mushroom sciarid fly Lycoriella ingenua (Diptera: Sciaridae) is a common pest in mushroom houses (Fig. 6(A)). A notable observation is that the SsHADV-1-infected strain DT-8 consistently and specifically attracts sciarid flies to lay eggs on its colony, and the larva feeds on the colony and


Fig. 4 Conserved domains and motifs on putative Rep of SsHADV-1. A, The locations of two conserved domains, Gemini_AL1 and Gemini_AL1_M, which are the geminivirus Rep catalytic domain (accession no. pfam00799) and the geminivirus Rep protein central domain (pfam08283); the E-values are 1.77e-12 and 5.01e-08, respectively. This schematic is automatically generated by the Blastp program on NCBI when using Rep of SsHADV-1 as a query sequence. B, Alignment analysis of amino acid sequences among SsHADV-1 and selected viruses from the genus Mastrevirus in the family Geminiviridae by using CLUSTALX (2.0). Identical and similar amino acid residues are indicated with asterisks and colons. Conserved motifs I to VII (IRD, RCR-I, RCR-II, RCR-III, RBR, NTP Binding-A, and NTP binding-B) are shaded in light gray. “TbYDV” means tobacco yellow dwarf virus (NP_620726), “CpCDV” means chickpea chlorotic dwarf virus (YP_002014712) and “BeYDV” means bean yellow dwarf virus (NP_612221); numbers in brackets are the positions of amino acid residues that are not listed. Reproduced from Yu, X., Li, B., Jiang, D., et al., 2010. A geminivirus-related DNA mycovirus that confers hypovirulence to a plant pathogenic fungus. Proceedings of the National Academy of Sciences of the United States of America 107, 8387–8392.

eats out all hyphae (Fig. 6(B)). After feeding on a colony of the virus-infected strain, the insect develops from larva to pupa, and the adult then emerges from the pupa; after mating, the female adult lays eggs back on the fungal colonies. Viral DNA of SsHADV-1 could be detected by PCR amplification in the larvae, pupae, adults (female and male), and eggs of flies that had fed on virus-infected colonies (Fig. 6(C)–(E)), suggesting either that the fly carries hyphae attached to its surface or in its gastrointestinal system, or that SsHADV-1 enters insect cells. The insects fed on a virus-infected colony were further examined by immunofluorescence with an antibody against the coat protein, and a strong fluorescence signal was observed in the head of larvae and pupae, in the digestive tract and oviduct of adults, and in eggs (Fig. 6(F)–(I)). These observations suggest that SsHADV-1 enters the tissues and cells of the sciarid fly and replicates in this insect. Virions of SsHADV-1 were used to challenge cells of Spodoptera frugiperda (Sf9), and SsHADV-1 was examined by passage experiments, RT-PCR amplification, flow cytometry analysis, Northern blot analysis, and immunofluorescence observation under a confocal microscope (Fig. 6(K) and (L)). These assays demonstrated that SsHADV-1 could replicate in Sf9 cells and in the sciarid fly. To understand why the virus-infected colony attracts sciarid flies, the volatile compounds released by both virus-infected and virus-free colonies were collected and analyzed by GC–MS. The results showed that virus infection suppresses the biosynthesis of 1-octen-3-ol and 3-octanone, and a Y-shaped tube test revealed that high concentrations of these two compounds are likely to be repellent to adult sciarid flies. This result suggests that SsHADV-1 modifies the secondary metabolites of S. sclerotiorum. In addition, the fecundity of female adults is significantly increased when larvae are fed on the virus-infected colony.

Transmission of SsHADV-1 Mycoviruses are often transmitted vertically through host reproduction, mostly by asexual reproduction and occasionally by both asexual and sexual reproduction, and are also transmitted horizontally by hyphal anastomosis between virus-infected and virus-free strains. Horizontal transmission of mycoviruses is usually suppressed by the fungal vegetative incompatibility system; hence a virus cannot be transmitted efficiently between two vegetatively incompatible strains. SsHADV-1 cannot be transmitted vertically, since S. sclerotiorum does not form any asexual spores, sclerotia do not germinate easily to form apothecia, and ascospore offspring are virus-free. Unlike other mycoviruses, SsHADV-1 can be transmitted between two vegetatively incompatible strains (Fig. 7). The possible mechanism is that virions of SsHADV-1 can infect hyphae of S. sclerotiorum directly via an extracellular route. While growing, hyphae of the virus-infected strain frequently break and tear, so that virions are easily released. When hyphae of a virus-free strain contact


Fig. 5 SsHADV-1 directly infects intact hyphae of Sclerotinia sclerotiorum and protects plants against S. sclerotiorum. A–C, Purified virions infect hyphae on PDA medium. A, Sectoring observed in a colony of the virus-free strain Ep-1PNA367 at 2 dpi. Strain Ep-1PNA367 grew on a PDA plate for 1 day and then virions were spread on the PDA medium around the margin of the young colony at a distance of about 1–2 cm. PBS buffer was used instead of virion suspension as a control; no sectoring was observed in the control colony. The photographs were taken at 4 dpi; B, Abnormal colony morphology of SsHADV-1-infected strain Ep-1PNA367; colony morphologies of strain Ep-1PNA367 and strain DT-8 are also presented. All cultures were incubated on PDA medium for 7 d at 20°C; C, Viral DNA could be extracted from infected strain Ep-1PNA367 and observed by agarose gel electrophoresis. Lane M: λ-Hind III digested DNA marker. Lanes 1–3, DNA samples of strains Ep-1PNA367, virus-infected Ep-1PNA367, and DT-8, respectively. D–E, Spraying virions of SsHADV-1 protects plants against infection by S. sclerotiorum. D, Virion suspension prevented S. sclerotiorum from killing plants of A. thaliana. Virion suspensions at different concentrations were spread on one leaf of each plant, and then mycelial agar disks from the margin of an active colony of strain Ep-1PNA367 were placed on the treated leaves. The inoculated plants were maintained in an incubator with 100% relative humidity (RH) at 20°C, and photographs were taken at 6 dpi. E, Therapeutic activity of virions against attack by S. sclerotiorum. Crude virion suspension inhibited the expansion of S. sclerotiorum on rapeseed plants and cured the lesions. S. sclerotiorum was pre-inoculated on plant leaves for 2 days for lesion development, and then virion suspension, the chemical fungicide Carbendazim (100 ppm) or PBS buffer was sprayed on the plants. Three leaves were inoculated for each plant. Plants were maintained in an incubator with 100% RH at 20°C, and photographs were taken at 8 and 12 dpi. Reproduced from Yu, X., Li, B., Fu, Y., et al., 2013. Extracellular transmission of a DNA mycovirus and its use as a natural fungicide. Proceedings of the National Academy of Sciences of the United States of America 110, 1452–1457.

virions, infection occurs. Although transmission vectors of RNA mycoviruses have not been discovered yet, SsHADV-1 can replicate in the sciarid fly and enter its eggs, suggesting that the sciarid fly can possibly transmit SsHADV-1. Viruliferous larvae, pupae, adults, and eggs (which acquired the virus by feeding on a virus-infected colony), as well as larvae and pupae injected with purified virions, could transmit the virus to virus-free fungal strains. The transmission also occurs on rapeseed plants (Fig. 8), suggesting that SsHADV-1 and the sciarid fly have a mutualistic lifestyle in the field.

Distribution of SsHADV-1 It is most likely that SsHADV-1 is widely distributed. Firstly, SsHADV-1 has been observed in strains of S. sclerotiorum isolated from many places in China; secondly, it has also been detected in Australian strains of S. sclerotiorum; thirdly, viral DNA of SsHADV-1 has been detected in river sewage in New Zealand. Additionally, viral DNA of SsHADV-1 has been detected in dragonflies (Pantala hymenaea). Different viral strains in the same species as SsHADV-1 (qcw23688.1) were also found in haddock in the United States, with a Rep identity of 78% (253/324) and a similarity of 83% (272/324). These findings indicate that the distribution of SsHADV-1 is worldwide, covering at least Asia, North America, and Oceania.
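The identity and similarity percentages quoted above follow directly from the residue counts. The short sketch below recomputes them; the function name is hypothetical. Note that 272/324 is about 83.95%, so the quoted 83% corresponds to truncating the value rather than rounding it, presumably because that is how the alignment report printed it.

```python
import math


def percent(matches: int, aligned_length: int) -> float:
    """Percentage of matching positions over the aligned length."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * matches / aligned_length


if __name__ == "__main__":
    identity = percent(253, 324)    # identical residues / aligned residues
    similarity = percent(272, 324)  # identical + similar residues
    for label, value in (("identity", identity), ("similarity", similarity)):
        print(f"{label}: {value:.2f}% "
              f"(truncated {math.floor(value)}%, rounded {round(value)}%)")
```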


Fig. 6 SsHADV-1 replicates in insect. A, Eggs, larva, pupa, and adults of a mushroom sciarid fly Lycoriella ingenua (Diptera: Sciaridae); B, The larva of sciarid fly feeding on colony of S. sclerotiorum, red arrowheads show the larva, and white arrowheads show galleries generated by feeding; C, PCR detection of virus in/on sciarid fly after rearing on the colony of strain DT-8 with virus-specific primer pairs; D, Determination of SsHADV-1 with Southern blot analysis in sciarid fly. Lane M, DNA markers; Lane 1, DNA from strain DT-8 (positive control); Lanes 2–5, DNA samples extracted from eggs, larvae, pupae and adults which were fed on the colony of strain DT-8, respectively; Lane 6, DNA from larvae reared on strain DT-8VF (negative control). A DIG-labeled probe which was made using a DNA fragment (814 bp) derived from genomic DNA was used to probe the Southern blots. E, Determination of virus retention time in the body of larvae by using PCR amplification. Lane M, DNA marker; Lane 1, DNA sample from strain DT-8; Lanes 2–5, DNA samples from larvae starved for 1, 3, 5, and 7 d, respectively; Lane 6, DNA sample from larvae reared on strain DT-8VF. F–J, Immunofluorescence detection of SsHADV-1 in larva, pupa, midgut and ovarian duct of female adults, and eggs of L. ingenua. Insect developed either on the colony of strain DT-8 (a) or on the colony of strain DT-8VF (b) (negative control) was used for detection. F, Larvae, longitudinal section of larvae (head parts); G, Pupae, longitudinal section of pupae (head parts); H, Midgut of female adults. Photographs shown at the upper right were magnified from a part of midgut (indicated by red boxes); I, Ovarian duct of female adults; J, Eggs dissected from female adults. Bars at the bottom right are 200 μm for F and G, 100 μm for H, I, and J, 50 μm for the upper right in H. Samples were stained with secondary antibody FITC conjugated to virions-specific monoclonal antibody prepared with viral CP protein. The immunofluorescence reaction was observed under a confocal microscope (fv1000mp, Olympus). K and L, SsHADV-1 replicates in Sf9 cells. K, Immunofluorescence examination of SsHADV-1 in Sf9 cells; cells were inoculated with SsHADV-1 virions and observed at 72 hpi, bars = 20 μm. L, Northern blot analysis for expression of viral Rep and CP gene in Sf9 cells. DIG-labeled DNA probes of 853 bp and 901 bp corresponding to viral DNA for Rep and CP gene, respectively, were used. Reproduced from Yu, X., Li, B., Jiang, D., et al., 2010. A geminivirus-related DNA mycovirus that confers hypovirulence to a plant pathogenic fungus. Proceedings of the National Academy of Sciences of the United States of America 107, 8387–8392.

The Taxonomy of SsHADV-1 SsHADV-1 is phylogenetically closely related to geminiviruses, which infect numerous plants: the Rep protein of SsHADV-1 shows high similarity to those of viruses in the family Geminiviridae. However, the coat protein gene is different from those of viruses in the family Geminiviridae. Hence, SsHADV-1 has some evolutionary relationship with geminiviruses, but is also significantly different from them. In 2012, a novel genus Gemycircularvirus, a name derived from Gemini-like myco-infecting circular virus, was established with SsHADV-1


Fig. 7 SsHADV-1 transmits between two vegetative incompatible strains of S. sclerotiorum. A and B, Incompatible reaction (cell death zone, red arrow indicated) between a hygromycin-resistant strain Ep-1PNA367R (a) and strain DT-8VF (b) or its isogenic strain DT-8 (c). For A, colonies were dual cultured on a PDA plate for one week. Photograph was shot from the reverse side of the plate. For B, strain DT-8 was inoculated 2 days ahead of inoculation of strain Ep-1PNA367R on a PDA plate, the photograph was taken at 7 days post inoculation of strain Ep-1PNA367R. C, Both strain Ep-1PNA367R and SsHADV-1-infected strain Ep-1PNA367R (d) derived from dual culture in B could grow on hygromycin-amended PDA medium (50 μg/mL), while strains DT-8 and DT-8VF could not. Both strain DT-8 and SsHADV-1-infected strain Ep-1PNA367R were developed in plate for 4 d, while strain Ep-1PNA367R and strain DT-8VF for one day. D–G, SsHADV-1-infected strain Ep-1PNA367R conferred hypovirulence and its associated trait to strain Ep-1PNA367R by dual cultivation. D, Colony morphology of strain Ep-1PNA367R changed after contacting with the SsHADV-1-infected strain Ep-1PNA367R, which is different from strain Ep-1PNA367R (E), but similar to virus-infected one (F); Colonies in the dual culture were developed for 10 days for virus-infected strain, and for 7 days for strain Ep-1PNA367R, colonies for E and F were developed on PDA medium for 7 d. G, SsHADV-1-infected strain Ep-1PNA367R lost virulence on Arabidopsis (d). H, Viral DNA was extracted in newly infected strain Ep-1PNA367R. Lane M, λ-Hind III digested DNA Marker; Lane 1, newly infected strain Ep-1PNA367; Lane 2, Ep-1PNA367R; Lane 3, strain DT-8. Reproduced from Liu, S., Xie, J., Cheng, J., et al., 2016. DNA mycovirus infects a mycophagous insect and utilizes it as a transmission vector. Proceedings of the National Academy of Sciences of the United States of America 113, 12803–12808.

as the exemplar virus. In 2016, a new family, Genomoviridae (sigil: Ge- for geminivirus-like, nomo- for no movement protein), was established, with SsHADV-1 as the representative virus of the type species, Sclerotinia gemycircularvirus 1, in the genus Gemycircularvirus. More than 43 sister species in Gemycircularvirus have been reported worldwide from different niches. These gemycircularviruses are associated with plants, animal tissues (from humans and other mammals, birds, insects, sea fish, reptiles, etc.), feces of mammals and birds, sewage, sediments, and waste water. However, only SsHADV-1 is well studied; the others are known only as viral sequences, and their exact hosts are not known. Based on the complete amino acid sequence of Rep of SsHADV-1 and other selected viruses, a phylogenetic


Fig. 8 Transmission of SsHADV-1 by the mushroom sciarid fly on rapeseed plants. A, Rapeseed plants were inoculated with strains DT-8 (left) and 1980 (right), with a distance of 30 cm between the two basins. The basin was sealed and kept at 100% RH, and virus-free larvae were then placed on small lesions induced by strain DT-8. When the right-side plants had been killed by strain 1980, subcultures recovered from diseased plant residues were subjected to virus detection. B, PCR-positive subcultures (AT1, AT7, and AT10) were confirmed by Southern blot analysis with the same probe used in Fig. 6(D). Viral DNA for Southern blot analysis was amplified by rolling circle amplification (RCA); DNA samples extracted from strains DT-8 and 1980 were used as controls. C-F, Colony morphology, growth rate, sclerotial production and virulence of the virus-infected subcultures AT1, AT7, and AT10 were significantly different from those of virus-free strain 1980. Reproduced from Liu, S., Xie, J., Cheng, J., et al., 2016. DNA mycovirus infects a mycophagous insect and utilizes it as a transmission vector. Proceedings of the National Academy of Sciences of the United States of America 113, 12803–12808.

tree was constructed using the Maximum Likelihood method with the MEGA 7.0 program. The evolutionary tree shows that viruses in Gemycircularvirus and its related genera in the family Genomoviridae are extremely diverse (Fig. 9).

Explore SsHADV-1 to Control Fungal Disease The use of hypovirulence-associated mycoviruses to control fungal disease dates back to the early 1970s. At that time, a transmissible hypovirulent strain was isolated from self-cured cankers caused by the chestnut blight fungus (Cryphonectria parasitica), and this hypovirulent strain was successfully used to control chestnut blight in European countries. It later became evident that the transmissible hypovirulence was caused by infection with a mycovirus (Cryphonectria hypovirus 1). This successful example prompted a large number of phytopathologists to screen for similar hypovirulence-associated mycoviruses in different pathogenic fungi, in the expectation that they could be applied in different fungal disease systems. Unfortunately, these studies have not achieved significant success; even for the hypovirus, the efficiency of chestnut blight control in the United States is not satisfactory. The consensus is that efficient transmission of mycoviruses is limited by the host vegetative incompatibility system (also called the non-self-recognition system). Mycoviruses spread efficiently only among host individuals in the same vegetative compatibility group and cannot move between different vegetative compatibility groups. Fungi usually have complicated vegetative incompatibility systems, which makes the use of mycoviruses to control diseases almost impossible. A group of scholars have nevertheless made continuous efforts to overcome this difficulty, which will make it possible to control chestnut blight with hypovirus in the near future. Unlike most known viruses, SsHADV-1 has unique characteristics. First, vegetative incompatibility does not prevent the efficient spread of SsHADV-1 in populations of S. sclerotiorum; second, SsHADV-1 can use insects as transmission vectors, which may help it spread in the field in a short time; third, virions of SsHADV-1 can directly infect hyphae of S. sclerotiorum, which


Fig. 9 Phylogenetic analysis of SsHADV-1 and related species in the family Genomoviridae, based on complete Rep amino acid sequences and the Maximum Likelihood method. The evolutionary analyses were conducted with the MEGA7.0 program (Kumar, S., Stecher, G., and Tamura, K., 2016). The virus name (or, for unnamed viruses, the niche of origin) and accession number are listed on the right, corresponding to the position of the virus on the tree. SsHADV-1 and its close relatives are highlighted in gray; viruses in other identified genera of the family Genomoviridae are highlighted in blue; and viruses in the family Geminiviridae are highlighted in red.


makes it possible to use virus particles or processed virus particles to control Sclerotinia disease. A field experiment demonstrated that spraying a hyphal fragment suspension of SsHADV-1-infected strain DT-8 on rapeseed plants at the blossom stage reduced disease severity and enhanced rapeseed yield by up to 16.4%. SsHADV-1 and strain DT-8 were introduced to the United States in 2017 by Prof. Weidong Chen at Washington State University to investigate their potential for biological control of white mold of leguminous crops caused by S. sclerotiorum.

Further Reading
Liu, S., Xie, J., Cheng, J., et al., 2016. DNA mycovirus infects a mycophagous insect and utilizes it as a transmission vector. Proceedings of the National Academy of Sciences of the United States of America 113, 12803–12808.
Xie, J., Jiang, D., 2014. New insights into mycoviruses and exploration for the biological control of crop fungal diseases. Annual Review of Phytopathology 52, 45–68.
Xu, L., Li, G., Jiang, D., Chen, W., 2018. Sclerotinia sclerotiorum: An evaluation of virulence theories. Annual Review of Phytopathology 56, 311–338.
Yu, X., Li, B., Fu, Y., et al., 2013. Extracellular transmission of a DNA mycovirus and its use as a natural fungicide. Proceedings of the National Academy of Sciences of the United States of America 110, 1452–1457.
Yu, X., Li, B., Jiang, D., et al., 2010. A geminivirus-related DNA mycovirus that confers hypovirulence to a plant pathogenic fungus. Proceedings of the National Academy of Sciences of the United States of America 107, 8387–8392.
Zhang, D., Nuss, D., 2016. Engineering super mycovirus donor strains of chestnut blight fungus by systematic disruption of multilocus vic genes. Proceedings of the National Academy of Sciences of the United States of America 113, 2062–2067.

Structure of Double-Stranded RNA Mycoviruses☆
José R Castón, National Center for Biotechnology, Spanish National Research Council, Madrid, Spain
Nobuhiro Suzuki, Institute of Plant Stress and Resources (IPSR), Okayama University, Kurashiki, Japan
Said A Ghabrial†, Department of Plant Pathology, University of Kentucky, Lexington, KY, United States
© 2021 Elsevier Ltd. All rights reserved.

Introduction Double-stranded RNA (dsRNA) viruses infect practically all organisms, from bacteria to unicellular and/or multicellular simple eukaryotes (fungi, protozoa), through to plants and animals. Although dsRNA viruses are a rather diverse group, they share general architectural principles and numerous functional features. The complexity of the capsid ranges from a single shell to multilayered, concentric capsids. Whereas the outer shell has a protective role and is involved in cell entry, the innermost capsid (or inner core) is devoted to the organization of the viral genome and viral polymerase. This inner capsid consists of 120 protein subunits arranged in a T=1 icosahedral shell, i.e., a capsid protein (CP) dimer is the asymmetric unit. The T=1 capsid is also referred to as a “T=2 layer” – an exception to the quasi-equivalence theory introduced by Caspar and Klug. The T=1 capsids of dsRNA viruses are critical for genome replication (minus-strand synthesis) and transcription (plus-strand synthesis), with the viral RNA-dependent RNA polymerase(s) (RdRp) frequently packaged as an integral component of the capsid. T=1 capsids also function as molecular sieves, allowing the exit of positive single-stranded RNA transcripts for protein synthesis, and the entrance of nucleotides for intra-capsid RNA synthesis. The pores are presumably small enough to exclude potentially degradative enzymes. In addition, the T=1 capsid remains structurally undisturbed throughout the viral cycle, isolating dsRNA molecules and any replicative intermediates, thus preventing the triggering of dsRNA sensor-mediated antiviral host defense mechanisms such as RNA silencing, interferon synthesis and apoptosis.
The totiviruses Saccharomyces cerevisiae virus L-A (ScV-L-A) and Ustilago maydis virus H1 (UmV-H1), which infect the yeast Saccharomyces cerevisiae and the smut fungus Ustilago maydis, respectively, were the first unambiguously described viruses with a T=1 capsid formed by 12 decamers (rather than 12 pentamers). The conservation of this stoichiometry and architecture is probably related to the stringent requirements of capsid RNA metabolism-associated activity (the capsid organizes the packaged genome and the replicative complex[es]). T=1 120-subunit capsids have been described for members of the Reo- and Picobirnaviridae, which mostly infect higher eukaryotic organisms. They have also been described for bacteriophages of the family Cystoviridae that infect the prokaryote Pseudomonas syringae. These capsids are also present in members of the Toti-, Partiti-, Megabirna-, Chryso-, and Quadriviridae families, which largely infect unicellular and/or multicellular simple eukaryotes such as fungi and protozoa, but also some plants. In contrast with the smooth outer surface of the T=1 capsid of reoviruses, the 120-subunit capsids of fungal dsRNA viruses have a corrugated outer surface with protuberances rising above the continuous protein shell. Notably, the average thickness of a 120-subunit T=1 capsid is 15–30 Å in mammalian dsRNA viruses, but those of mycoviruses are thicker. Unlike their bacteria- and higher eukaryote-infecting counterparts, most mycoviruses are transmitted by cytoplasmic interchange; they never leave the host, and indeed have no strategy for entering host cells. Recent studies of fungal (and protozoan) dsRNA viruses have identified functional and structural features dissimilar to those recorded for reoviruses, as well as evolutionary relationships among T=1 capsid structural proteins.
Mycovirus particles accumulate in the fungal cytoplasm; the close relationship between the fungal dsRNA virus and its host probably placed many constraints on the virus that it overcame by increasing CP complexity. The T=1 capsid proteins of four mycoviruses have been resolved at the atomic level: Gag of the yeast virus ScV-L-A (family Totiviridae), CP of Penicillium chrysogenum virus (PcV, family Chrysoviridae), CP of Penicillium stoloniferum virus F (PsV-F, family Partitiviridae), and the heterodimer P2-P4 of Rosellinia necatrix quadrivirus 1 (RnQV1, family Quadriviridae).
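
The subunit arithmetic behind these capsids can be made concrete with a short sketch. The Python below is illustrative only: the function names are hypothetical, and it assumes the standard Caspar and Klug relation T = h² + hk + k² (h and k non-negative integers), which the text refers to but does not write out. Under that assumption, 2 is not an allowed triangulation number, which is consistent with describing the 120-subunit shell as T=1 with a CP dimer as the asymmetric unit; 60 dimers and 12 decamers both account for 120 subunits.

```python
from itertools import product


def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2 (assumed relation)."""
    return h * h + h * k + k * k


def allowed_t_numbers(limit: int) -> list:
    """All T values up to `limit` reachable from non-negative integers h, k."""
    values = {
        triangulation_number(h, k)
        for h, k in product(range(limit + 1), repeat=2)
        if (h, k) != (0, 0)
    }
    return sorted(t for t in values if t <= limit)


def subunit_count(t: int, copies_per_asymmetric_unit: int = 1) -> int:
    """An icosahedral lattice has 60*T positions; each may hold several copies."""
    return 60 * t * copies_per_asymmetric_unit


if __name__ == "__main__":
    print("Allowed T numbers up to 13:", allowed_t_numbers(13))
    print("T=1 shell with a CP dimer per asymmetric unit:",
          subunit_count(1, copies_per_asymmetric_unit=2), "subunits")
    print("12 decamers:", 12 * 10, "subunits")
```

The `copies_per_asymmetric_unit` parameter is a convenience introduced here so that the same function covers a conventional T=1 shell (one copy) and the dimer-based shell described in the text (two copies).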

Structure of dsRNA Virus Capsids Totiviruses Saccharomyces cerevisiae virus L-A is the type species of the genus Totivirus (family Totiviridae). The ScV-L-A genome is a 4.6 kbp, single-segment dsRNA molecule that encodes a major CP (termed Gag; 680 residues, 76 kDa) and the viral polymerase (Pol; 868 residues, 94 kDa) as a Gag-Pol fusion protein generated by −1 ribosomal frameshifting (Pol is bound covalently to the inside of the particle wall). The structure of ScV-L-A was first examined by three-dimensional cryo-electron microscopy (3D cryo-EM) and later by X-ray crystallography at a resolution of 3.4 Å. Scanning transmission electron microscopy (STEM) was used to determine the virus stoichiometry. The rough, icosahedral, ~40 nm diameter T=1 lattice of ScV-L-A has 120 copies of Gag, of which one or two are fused to the Pol moiety (Fig. 1(a) and (b)). The structural unit is an asymmetric Gag dimer. Each Gag monomer can adopt two related conformations, termed A and B, which have notable structural differences and reside in different bonding environments (i.e., make non-equivalent contacts) within

☆ This work is dedicated to the memory of our friend and colleague, Said A. Ghabrial, who passed away in November 2018.
† Deceased.


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21275-2


Fig. 1 Structure of the ScV-L-A T=1 capsid and X-ray-based structure of the CP. (a) Cryo-EM image of ScV-L-A. Bar, 500 Å. (b) Cryo-EM T=1 capsid of ScV-L-A viewed along a twofold axis of icosahedral symmetry, showing the Gag subunits A (blue) and B (yellow). (c) Atomic model of a Gag dimer (top view; PDB accession number 1m1c). Icosahedral symmetry five- (pentagon), three- (triangle) and two-fold (oval) axes are indicated in black.

the capsid (Fig. 1(c)). These subunits are arranged in two sets of five: five A subunits directly surround the icosahedral five-fold axis, leaving an 18 Å diameter channel for the entry of nucleotide triphosphates and the exit of viral mRNA, and five B subunits are intercalated between the A subunits, forming a decamer. Twelve such (A:B)5 decamers then constitute the complete capsid. This quaternary organization is similar to the 120-subunit T=1 inner core of reoviruses, in which adjacent A and B subunits within each decamer are oriented approximately in parallel, suggesting an asymmetric A:B dimer as a possible intermediate for capsid assembly. Gag functions as an enzyme and has a major role in the sophisticated interaction between ScV-L-A and the host cell. The Gag segment Gln139-Ser182 (in which His154 is the active site) contributes to the rough outer surface of the capsid, mediates decapping of cellular mRNAs and transfers the 7-methyl-GMP (m7GMP) cap from the 5′ end of the cellular mRNA to the 5′ end of the viral RNA. L-A thereby counters a host exoribonuclease that targets uncapped RNAs (such as viral mRNA), allowing the latter to compete with host mRNA for use of the translation machinery. Helminthosporium victoriae virus 190S (HvV190S), the prototype of the genus Victorivirus in the family Totiviridae, infects the filamentous fungus H. victoriae, and has a similar capsid organization to that of ScV-L-A. The smooth HvV190S capsid is composed of 120 CP monomers, with RdRp incorporated as a separate, non-fused protein. The RdRp is either non-covalently associated with the underside of the capsid (as in reoviruses), or free in the capsid interior, or non-covalently bound to the genome. 3D structures have also been reported for two protozoal members of this family, Trichomonas vaginalis virus 1 and Giardia lamblia virus, and show strong similarities to the fungal virus structures.

Chrysoviruses Chrysoviruses have isometric virions and a multipartite genome. Penicillium chrysogenum virus (PcV) is the prototype virus of the Chrysoviridae, a family of largely symptomless mycoviruses with a genome typically consisting of four monocistronic genome segments (genome size 2.4–3.6 kbp). Each segment is encapsidated separately in a similar particle, i.e., chrysoviruses are multisegmented and multiparticulate viruses. dsRNA-1 (3.6 kbp) encodes the RdRp (1117 residues, 128.5 kDa; 1 or 2 copies per virion), dsRNA-2 (3.2 kbp) encodes the CP (982 residues, 109 kDa), and dsRNA-3 and -4 (3 and 2.9 kbp, respectively) code for virion-associated proteins of unknown function (912 residues, 101 kDa and 847 residues, 95 kDa, respectively). The 3D structures of the capsids of two chrysoviruses have been determined by cryo-EM analysis, that of PcV at atomic resolution, and that of Cryphonectria nitschkei chrysovirus 1 (CnCV1) at subnanometer resolution. Stoichiometric estimates by analytical ultracentrifugation indicated that PcV and CnCV1 virions are exceptions to the most widespread tendency among dsRNA viruses – a T=1 core with 60 equivalent dimers – since they have an authentic T=1 capsid formed by 60 copies of a single monomer. The capsid diameter is 400 Å and the protein shell is 48 Å thick (Fig. 2(a) and (b)). Similar to ScV-L-A, the outer capsid surface of PcV is relatively


Fig. 2 Cryo-EM-based structure of the PcV T=1 capsid protein. (a) Cryo-EM image of PcV. Bar, 500 Å. (b) T=1 capsid of PcV viewed along a twofold axis of icosahedral symmetry, showing the N-terminal domain A (1–498, blue), the linker segment (499–515, red), and the C-terminal domain B (516–982, yellow). (c) Top view of the atomic model of the PcV CP (3j3i; 982 residues). Symbols indicate icosahedral symmetry axes.

uneven with 12 outwardly protruding pentons (each containing five copies of the CP); this contrasts with the T=1 capsid of reoviruses, in which the CP has a plate-like structure and serves as a template to prime the assembly of the surrounding T=13 capsid. The 982-residue CP of PcV is formed by duplication of an α-helical domain; this is indicative of gene duplication despite negligible sequence similarity between the two roughly parallel α-helical domains (Fig. 2(c)). The N-terminal A domain (residues 1–498; Fig. 2(c), blue) and the C-terminal B domain (residues 516–982; Fig. 2(c), yellow) are connected by a 16-residue linker (Ala499-Ile515; Fig. 2(c), red) accessible from the capsid outer surface. These domains are arranged in two sets of five; five A domains directly surround the icosahedral fivefold axis, and five B domains are intercalated between them, forming a pseudodecamer (Fig. 3(a)). This organization is clearly reminiscent of the 120-subunit T=1 lattice of totivirus capsids, in which the two asymmetrical dimer components are arranged approximately in parallel. The structural details of the PcV capsid reinforce the idea that a T=1 layer with a dimer as the asymmetric unit provides an optimal framework for managing dsRNA metabolism. Superimposition of the PcV A and B α-helical domains (Fig. 3(b) and (c)) identifies a single “hotspot” located on the capsid outer surface where variations are introduced by insertion of 50–100 residue polypeptides (Fig. 3(c), orange triangle). A preferential insertion site would allow the acquisition of new functions while preserving basic CP folding. It is plausible that, in addition to its structural role, chrysovirus CP might also have enzymatic activity, like ScV-L-A CP.

Quadriviruses Rosellinia necatrix quadrivirus 1 (RnQV1) is the exemplar of the type species of the genus Quadrivirus in the family Quadriviridae. The filamentous ascomycete Rosellinia necatrix, a pathogen of many plants, can be infected by dsRNA viruses belonging to at least six families. RnQV1 is associated with latent infections and has a multipartite genome consisting of four monocistronic dsRNA segments (as in chrysoviruses) with genome sizes ranging from 3.7 to 4.9 kbp. dsRNA-1 (4.9 kbp) codes for a protein of unknown function (1602 residues), dsRNA-2 (4.3 kbp) encodes the P2 CP (1356 residues), dsRNA-3 (4 kbp) codes for RdRp (1117 residues), and dsRNA-4 (3.7 kbp) codes for the P4 CP (1061 residues). RnQV1 strains W1075 and W1118, isolated from different locations in Japan, have been analyzed by 3D cryo-EM and analytical ultracentrifugation. Their P2 and P4 proteins co-assemble into isometric virus particles ~45 nm in diameter that each package either one or two of the four genome segments (Fig. 4(a)). Whereas most dsRNA virus capsids are based on dimers of a single protein, RnQV1 has a single-shelled T=1 capsid formed by 60 P2 and P4 protein heterodimers (Fig. 4(b)). P2 and P4 of RnQV1 strain W1118 remain nearly intact, but in strain W1075 both proteins are cleaved into discrete polypeptides, apparently without altering capsid structural integrity. The atomic structure of the RnQV1 W1118 capsid at 3.7 Å resolution shows that P2–P4


Fig. 3 PcV capsid protein is a structural duplication. (a) Atomic model of the PcV capsid viewed along a twofold axis (color code as in Fig. 2(c)). (b) Superimposed A and B domains (white segments indicate non-superimposed regions for both domains). (c) Sequence alignment of domains A (blue) and B (yellow) resulting from Dali structural alignment. α-helices (rectangles) and β-strands (arrows) are rainbow-colored from blue (N terminus) to red (C terminus) for each domain. Triangles represent non-aligned segments (sizes indicated): the orange triangle indicates the single “hotspot” on the outer capsid surface. Strictly conserved residues are on a red background and partially conserved residues are in a red rectangle. Adapted from Luque, D., Gómez-Blanco, J., Garriga, D., et al., 2014. Cryo-EM near-atomic structure of a dsRNA fungal virus shows ancient structural motifs preserved in the dsRNA viral lineage. Proceedings of the National Academy of Sciences of the United States of America 111 (21), 7641–7646.

Fig. 4 RnQV1 T=1 capsid cryo-EM-based structure. (a) Cryo-EM image of empty RnQV1 particles. Bar, 500 Å. (b) T=1 capsid of RnQV1 viewed along a twofold axis of icosahedral symmetry, showing P2 (blue) and P4 (yellow). (c) Top view of the atomic models of P2 (blue) and P4 (yellow) (PDB accession number 5nd1).

heterodimers are organized into a quaternary structure similar to that of the homodimers of chrysoviruses and totiviruses (Fig. 4(c)). Although the RnQV1 capsid, like that of PcV, is an exception to the rule that all dsRNA viruses have a T=1 capsid with a CP homodimer as the asymmetric unit, it follows the architectural principle that a 120-subunit capsid is a conserved assembly that supports dsRNA replication and organization.
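
The way three different asymmetric units converge on the same 120-fold shell can be tallied in a few lines. The sketch below is illustrative only: the class and field names are hypothetical, and the entries simply restate the stoichiometries given in the text (a Gag A:B homodimer for ScV-L-A, a single two-domain CP for PcV, and a P2-P4 heterodimer for RnQV1), each repeated over the 60 asymmetric units of a T=1 lattice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsymmetricUnit:
    """One of the 60 repeating units of a T=1 icosahedral capsid."""
    virus: str
    chains: tuple          # polypeptide chains in one asymmetric unit
    folds_per_chain: int   # conserved alpha-helical folds carried by each chain

    def chains_per_capsid(self, units: int = 60) -> int:
        return units * len(self.chains)

    def folds_per_capsid(self, units: int = 60) -> int:
        return self.chains_per_capsid(units) * self.folds_per_chain


# Stoichiometries as described in the text.
CAPSIDS = [
    AsymmetricUnit("ScV-L-A (Totiviridae)", ("Gag A", "Gag B"), 1),
    AsymmetricUnit("PcV (Chrysoviridae)", ("CP",), 2),
    AsymmetricUnit("RnQV1 (Quadriviridae)", ("P2", "P4"), 1),
]

if __name__ == "__main__":
    for unit in CAPSIDS:
        print(f"{unit.virus}: {unit.chains_per_capsid()} chains, "
              f"{unit.folds_per_capsid()} folds per capsid")
```

Counting folds rather than polypeptide chains is the design choice that makes the chrysovirus case comparable: PcV has only 60 chains, but each carries two copies of the conserved domain.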


Fig. 5 PsV-F and RnMBV1 T=1 capsids. (a) X-ray-based structure of the T=1 capsid of PsV-F viewed along a twofold axis of icosahedral symmetry, showing the CP subunits A (blue) and B (yellow). Bar, 100 Å. (b) Top view of the atomic model of a PsV-F CP dimer (3es5). Symbols indicate icosahedral symmetry axes; red oval indicates a local twofold symmetry axis. (c) Side view of a PsV-F CP dimer. The protruding arch (P) and shell (S) domains are indicated. (d) Cryo-EM T=1 capsid of RnMBV1 viewed along a twofold axis of icosahedral symmetry, showing the rough boundaries of subunits A (blue) and B (yellow). Bar, 100 Å.

Despite their low sequence similarity, the superimposition of P2 and P4 revealed them to have a common α-helical domain. As described for the PcV capsid protein, P2 and P4 appear to have also acquired new functions through the insertion of complex domains at preferential insertion sites on the capsid outer surface. These domains are probably related to enzyme activity. The P2 insertion has a fold similar to that of gelsolin and profilin, two actin-binding proteins with a function in cytoskeleton metabolism, whereas the P4 insertion suggests a protease activity involved in cleavage of the P2 383-residue C-terminal region (absent in the mature viral particle). This P2 C-terminal segment might represent an external scaffolding domain.

Partitiviruses Members of the family Partitiviridae have bisegmented, 3.1–4.8 kbp-long genomes. Each segment is encapsidated separately in a similar virus particle. dsRNA1 encodes RdRp (1 copy per particle), while dsRNA2 encodes the CP. The partitiviruses that infect fungi are grouped into three genera: Alpha-, Beta-, and Gamma-partitivirus. Alpha- and beta-partitiviruses infect plants or filamentous fungi, whereas gamma-partitiviruses infect only the latter. In general, partitivirus infections are largely symptomless. Four fungal partitivirus structures have been resolved by 3D cryo-EM, including those of the gamma-partitiviruses Penicillium stoloniferum virus S (PsV-S) and Penicillium stoloniferum virus F (PsV-F) (by X-ray crystallography at 3.3 Å resolution), and of the beta-partitiviruses Fusarium poae virus 1 (FpV1) and Sclerotinia sclerotiorum partitivirus 1 (SsPV1). The single-layered, 120-subunit capsids of these viruses are 35–42 nm in diameter and distinctive in having “arch-like” surface features that protrude above the continuous capsid shell (Fig. 5(a)–(c)). These T=1 capsids have a different quaternary organization, their CP dimer having almost perfect local twofold symmetry (Fig. 5(b)). The quasi-symmetric CP dimer is stabilized by domain swapping within the shell region of the A and B subunits, as well as by intradimeric interactions between equivalent protruding arch domains on the particle surface (Fig. 5(c)). A similar organization has been found in a picobirnavirus, a bisegmented dsRNA virus that infects humans and other vertebrates. Based on their capsid organization, partiti- and picobirnaviruses are proposed to be assembled from dimers of CP dimers (i.e., tetramers) as intermediates. In contrast, the proposed assembly pathway for the 120-subunit capsids of Toti- and Reoviridae members is based on pentamers of CP dimers (i.e., decamers).

Megabirnaviruses Rosellinia necatrix megabirnavirus 1 (RnMBV1) is the exemplar of the type species of the genus Megabirnavirus in the family Megabirnaviridae. The RnMBV1 genome has two dsRNA segments, separately encapsidated in isometric particles of ~50 nm


diameter. dsRNA1 (8.9 kbp) encompasses two partially overlapping ORFs; ORF1 encodes the CP (135 kDa) and ORF2 codes for the RdRp, which is expressed as a fusion protein with CP (250 kDa). dsRNA2 (7.2 kbp) has two non-overlapping ORFs that encode proteins of unknown function. RnMBV1 severely reduces both the mycelial growth of Rosellinia necatrix and its virulence to plant hosts, and thus has strong potential for virocontrol of white root rot. The RnMBV1 structure has been analyzed by 3D cryo-electron microscopy (Fig. 5(d)). Similar to ScV-L-A, the outer capsid surface of RnMBV1 is rough, with 120 large spherical protrusions on the virus surface (~50 Å wide). The role of these protruding domains is unknown.

Evolutionary Relationships Based on Structural Comparisons Structural comparisons of CPs are used to establish relatedness when sequence conservation is limited, and have detected relationships among viruses that infect organisms that, in evolutionary terms, are widely separated. Icosahedral viruses are grouped into four lineages: the dsDNA viruses with an upright double β-barrel CP (the prototypes are phage PRD1 and adenoviruses), the head-tailed phages and herpesviruses that share the Hong Kong 97 (HK97)-like CP fold (also termed the Johnson fold), the picornavirus-like superfamily with a single β-barrel as the CP fold, and the dsRNA or bluetongue virus (BTV)-like viruses. The PRD1- and HK97-like lineages include archaea-, bacteria-, and eukaryote-infecting viruses, suggesting that their last common ancestral hosts were infected by the progenitors of the current viral lineages before the host organisms diverged. The similarity of the A and B α-helical domains of PcV CP, which have many well-matching secondary structural elements, indicates a common fold in both domains (Fig. 3). It is noteworthy that domain duplication has recently been detected in the capsid protein of another group of dsRNA fungal viruses, the botybirnaviruses. Gene duplication (or joined folds) has been a recurrent evolutionary event in other viral lineages. The conserved ~350 residue-long PcV fold is also preserved in the Gag of ScV-L-A (Fig. 6). This basic α-helical domain shares many secondary structural elements with ScV-L-A Gag, in particular those regions involved in interactions at the icosahedral symmetry axes. The preserved fold in Gag has three peptide insertion sites facing the outer capsid surface, one of which colocalizes with the single-insertion hotspots of the PcV CP domains. This colocalization suggests that these preferential insertion sites are ancient, and provide a means for the acquisition of new functions without altering the structural and functional motifs of the dsRNA virus CP.
P2 and P4 of RnQV1 also have a common fold some 300 residues long, with two preferential insertion sites on the outer surface. Both coincide with the ScV-L-A Gag insertion sites, and one with the single-insertion site of the PcV A and B α-helical CP domains. Notably, the conserved folds of PcV and ScV-L-A CP are similar to the common fold of P2 and P4, indicating that this fold may have evolved from a common ancestral domain of the dsRNA virus lineage (Fig. 6). Duplication of an ancestral gene for a CP with the BTV-like fold might have resulted in two separate (as in quadriviruses) or covalently joined folds (as in chrysoviruses). This event could direct assembly of a T=1 capsid with 120 subunits or domains with a dimer as the asymmetric unit, a necessary arrangement for dsRNA replication/transcription. Separate and joined folds are found in the CP of other virus families such as picornaviruses and comoviruses, respectively. Once the 120-subunit capsid was well established, later divergent evolutionary events would have introduced additional changes in each copy, or even the complete removal of one of them, giving a CP that assembles as a dimer of unfused identical monomers. Alternatively, the ancestral CP could have initially acquired dimer assembly ability, followed by gene duplication.

RdRp and dsRNA Organization Within Mycovirus Capsids Reovirus T=1 cores have 10–12 RdRp complexes per virion, around which the dsRNA is densely coiled. RdRp complexes are noncovalently anchored to the capsid inner surface near the icosahedral five-fold axes, as presumably they are in mycoreoviruses. In addition to RdRp molecules, reovirus replicase complexes include a few minor core proteins with ATPase and/or RNA binding activities. For members of the Toti-, Chryso-, Partiti-, and Quadriviridae families, the RdRp molecules are incorporated at 1 or 2 copies per particle, and show more variability than in reoviruses. For chryso-, partiti- and quadriviruses, the RdRp is expressed as a physically separate protein from a discrete genome segment, and is incorporated into virions via non-covalent interactions with the capsid and/or genome. The same is true for victoriviruses such as HvV190S, except that the RdRp is expressed from the single genome segment of those viruses. For totiviruses such as ScV-L-A, in contrast, the RdRp is expressed as a C-terminal fusion product with the CP. As a result, in ScV-L-A, the 1 or 2 RdRp domains per virion are covalently tethered to the capsid via the fused CP domain, which occupies 1 or 2 subunit positions in the capsid. The mycovirus T=1 capsid wall is perforated by many channels, but none is large enough to pass an A-form, 23-Å-diameter genomic dsRNA. Thus the capsid functions as a molecular sieve. Whereas the largest pores (15–20 Å diameter and usually located near the fivefold axis) would allow the passage of nascent mRNA into the host cytoplasm, the smallest holes (5–10 Å in diameter and usually located at the three-fold axis) could be used for nucleotide substrate or pyrophosphate byproduct diffusion. In nontranscribing T=1 capsids, the pores are very narrow, but the N- or C-termini or the side chains of residues that face the channel wall might adopt alternative conformations to allow the exit of viral transcripts.
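
The molecular-sieve argument reduces to a size comparison, sketched below in Python. This is a deliberately crude illustration, not a structural model: it treats each pore as a rigid opening, uses only the diameters quoted above (23 Å for A-form dsRNA, 15–20 Å and 5–10 Å for the two pore classes), and the function and constant names are hypothetical.

```python
# Diameter of A-form genomic dsRNA in angstroms, as quoted in the text.
GENOMIC_DSRNA_DIAMETER = 23.0

# Pore diameter ranges (narrowest, widest) in angstroms, as quoted in the text.
PORES = {
    "five-fold axis": (15.0, 20.0),
    "three-fold axis": (5.0, 10.0),
}


def can_pass(pore_range, molecule_diameter):
    """True if the molecule is narrower than the widest opening of the pore."""
    _, widest = pore_range
    return molecule_diameter < widest


def retains_genome(pores, genome_diameter=GENOMIC_DSRNA_DIAMETER):
    """True if no pore is wide enough to let the genomic dsRNA through."""
    return not any(can_pass(r, genome_diameter) for r in pores.values())


if __name__ == "__main__":
    for location, (narrow, wide) in PORES.items():
        passes = can_pass((narrow, wide), GENOMIC_DSRNA_DIAMETER)
        verdict = "passes" if passes else "is retained"
        print(f"{location}: {narrow:.0f}-{wide:.0f} angstrom pore, dsRNA {verdict}")
    print("Genome retained by capsid:", retains_genome(PORES))
```

A rigid-opening comparison ignores the conformational changes at the channel wall mentioned in the text, so it should be read only as a check that the quoted numbers are mutually consistent.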


Fig. 6 Structural homology of mycovirus T=1 CP. The PcV CP A domain (PcV-A, left, center) was structurally aligned with ScV-L-A Gag (right, center), and P2 (top, center) and P4 (bottom, center) with PcV-A and ScV-L-A Gag. Center rainbow-colored structures indicate conserved secondary structure elements within the dsRNA viruses. PcV-A is aligned with ScV-L-A Gag (blue and pink, center). P2 is aligned with PcV-A (blue and light blue, top left) and with ScV-L-A Gag (blue and pink, top right). P4 is aligned with PcV-A (yellow and blue, bottom left) and with ScV-L-A Gag (yellow and pink). Total numbers of secondary structural elements with close relative spatial locations are indicated. Reproduced from Luque, D., Mata, C.P., Suzuki, N., Ghabrial, S.A., Castón, J.R., 2018. Capsid structure of dsRNA fungal viruses. Viruses 10 (9), 481.

With the exception of totiviruses (which have a single genomic segment), many fungal dsRNA viruses, including chryso-, partiti-, and quadriviruses, have multisegmented dsRNA genomes. In addition, the multisegmented viruses appear to be multiparticulate, i.e., segments are encapsidated separately. Fungal dsRNA viruses have spacious capsids compared to the inner cores of complex eukaryotic dsRNA viruses (Table 1). Whereas reoviruses have 9–12 dsRNA genomic segments packed into liquid crystalline arrays at high density (~40 bp/100 nm³, a spacing between dsRNA strands of 25–30 Å), fungal virus capsids (including ScV-L-A, PcV, PsV-F and RnQV1) contain a single, loosely packed dsRNA molecule (~20 bp/100 nm³, an interstrand spacing of ~40–45 Å). In reoviruses, individual genome segments must be transported through the active sites of the RdRp complexes at the five-fold axes, and template motion could be a limiting factor. ScV-L-A is a simplified version of these viruses, with a single-segment genome. The looser packing of the dsRNA would probably improve template motion in the more spacious transcriptionally and replicatively active particles, minimizing electrostatic repulsion between dsRNA strands. Most mycovirus T=1 capsids are negatively charged on their inner surface, a feature common to many such capsids of dsRNA viruses. This might facilitate the movement of template and/or product RNA molecules by repulsion, maintaining the RNA layer at ~25 Å from the capsid surface (Fig. 7(a), (c), and (d)). The PcV capsid is an exception. It has positively charged regions on the inner


Table 1 Genome packaging densities in fungal dsRNA viruses

Virus family | dsRNA: no. of segments | dsRNA: size (kbp) | dsRNA: MWᵃ (MDa) | Capsid: CP (residues) | Capsid: φᵇ/rᶜ (nm) | dsRNA density (bp/100 nm³)ᵈ
Reoviridae, MRV | 10 | ~23.5 | 16 | 1275 | ~60/24.5 | 38
Totiviridae, ScV-L-A | 1 | ~4.6 | 3.1 | 680 | ~43/17 | 22
Partitiviridae, PsV-S | 1 (2)ᵉ | ~1.7 (3.3) | 1.2 (2.2) | 420 | ~35/12 | 23
Chrysoviridae, PcV | 1 (4)ᵉ | ~3.2 (12.6) | 2.2 (8.6) | 982 | ~40/16 | 19
Megabirnaviridae, RnMBV1 | 1 (2)ᵉ | ~8.1 (16.2) | 5.5 (11) | 1240 | ~52/19 | 28
Quadriviridae, RnQV1 | 1–2 (4) | ~4.3 (17.1) | 2.9 (11.7) | 1356 + 1061 | ~47/16 | 25 (50)ᶠ

ᵃ MW were calculated assuming a mass of 682 Da/bp. HSV dsDNA is assumed to have a B-form.
ᵇ Outer diameter.
ᶜ Inner radius.
ᵈ Densities when volume of a perfect sphere is assumed and any other internal components are ignored.
ᵉ For PsV-S, PcV and RnMBV1 dsRNA, the genome is formed by two, four or two dsRNA molecules, respectively, but a mean value was calculated for one dsRNA molecule/particle in each column.
ᶠ 25 if there is one dsRNA molecule/particle; 50 if there are two dsRNA molecules/particle.
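The mass and density columns of Table 1 follow from two stated assumptions: 682 Da per base pair, and a perfectly spherical cavity with the listed inner radius. The following Python sketch recomputes them from the size and radius columns; the function and variable names are illustrative choices, not part of the original work.

```python
import math

DA_PER_BP = 682.0  # mass per base pair assumed in the Table 1 footnote

# (virus, size in bp of the dsRNA packaged in one particle, inner radius in nm),
# copied from Table 1; for MRV all 10 segments share one particle
CAPSIDS = [
    ("MRV", 23_500, 24.5),
    ("ScV-L-A", 4_600, 17.0),
    ("PsV-S", 1_700, 12.0),
    ("PcV", 3_200, 16.0),
    ("RnMBV1", 8_100, 19.0),
    ("RnQV1", 4_300, 16.0),
]


def genome_mass_mda(size_bp):
    """Mass of the packaged dsRNA in MDa."""
    return size_bp * DA_PER_BP / 1e6


def packing_density(size_bp, inner_radius_nm):
    """Base pairs per 100 nm^3, treating the capsid interior as a perfect
    sphere and ignoring any other internal components."""
    volume_nm3 = 4.0 / 3.0 * math.pi * inner_radius_nm ** 3
    return 100.0 * size_bp / volume_nm3


for name, size_bp, radius_nm in CAPSIDS:
    mass = genome_mass_mda(size_bp)
    density = packing_density(size_bp, radius_nm)
    print(f"{name:8s} {mass:5.1f} MDa  {density:5.1f} bp/100 nm^3")
```

Rounded to the nearest integer, the computed densities reproduce the tabulated values, and doubling the RnQV1 genome size to represent two dsRNA molecules per particle gives the bracketed value of about 50.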

Fig. 7 dsRNA virus T=1 capsid inner surfaces with electrostatic potentials. (a)–(d) ScV-L-A (a), PcV (b), PsV-F (c), and RnQV1 (d) capsid inner surfaces viewed along a twofold axis of icosahedral symmetry. The inner surface charge representations of these capsids show the distribution of negative (red) and positive (blue) charges. Note the numerous electropositive areas in PcV. dsRNA and packaged proteins (such as RNA polymerases) were removed computationally.

surface (Fig. 7(b)), and shows numerous interactions with the underlying genome, which is ordered in the outermost RNA layer. As a result, there is almost no space between the latter layer and the inner capsid surface. These contacts have been defined at the atomic level in PcV and PsV-F virions. The lower density of the central region and the associated slight increase in dsRNA mobility might be necessary for maximum RdRp activity in the context of a non-fused RdRp complex. Comparative analysis of dsRNA packing densities in dsRNA virions has revealed two major tendencies among T=1 capsids of dsRNA viruses: (1) those with 9–12 dsRNA segments densely packaged within the same particle and containing 9–12 RdRp complexes, as seen in reoviruses, and (2) those with a single genomic dsRNA segment with less internal order and one or two copies of the RdRp complex per particle, as seen in mycoviruses.


Concluding Remarks and Future Perspectives Structural studies of a limited number of fungal viruses have revealed them to hold to the basic concepts of dsRNA viruses, but also to have unexpected features that have contributed to a better understanding of their structure, function, and evolution. dsRNA mycovirus capsids, exemplified by ScV-L-A, PcV, PsV-F, and RnQV1, show structural variations of the same framework optimized for RNA metabolism; they possess 60 asymmetric or symmetric dimers of a single protein (as in ScV-L-A or PsV-F, respectively), dimers of similar domains (as in PcV), or dimers of two different proteins (as in RnQV1). Since mycoviruses are commonly confined to their hosts, their capsids incorporate polypeptides and domains on their outer surfaces for the acquisition of new functions without altering the structure and function of the CP. Such acquisitions would eventually lead to optimal virus-host interactions. Despite the recent advances made in understanding the structure of dsRNA mycoviruses, much about fungal viruses remains unknown. Future structural studies should focus on the asymmetric substructures and components of their capsids, such as their RdRp (isolated or packaged inside virions), and on their packaged dsRNA genome.

See also: An Introduction to Fungal Viruses. Chrysoviruses (Chrysoviridae) - General Features and Chrysovirus-Related Viruses. Fungal Partitiviruses (Partitiviridae). Megabirnaviruses (Megabirnaviridae). Plant and Protozoal Partitiviruses (Partitiviridae). Portal Vertex. Quadriviruses (Quadriviridae)

Further Reading
Abrescia, N.G., Bamford, D.H., Grimes, J.M., Stuart, D.I., 2012. Structure unifies the viral universe. Annual Review of Biochemistry 81, 795–822.
Dunn, S.E., Li, H., Cardone, G., et al., 2013. Three-dimensional structure of victorivirus HvV190S suggests coat proteins in most totiviruses share a conserved core. PLoS Pathogens 9, e1003225.
Ghabrial, S.A., Castón, J.R., Jiang, D., Nibert, M.L., Suzuki, N., 2015. 50-plus years of fungal viruses. Virology 479–480, 356–368.
Luque, D., Gómez-Blanco, J., Garriga, D., et al., 2014. Cryo-EM near-atomic structure of a dsRNA fungal virus shows ancient structural motifs preserved in the dsRNA viral lineage. Proceedings of the National Academy of Sciences of the United States of America 111 (21), 7641–7646.
Luque, D., Mata, C.P., Nobuhiro, S., Ghabrial, S.A., Castón, J.R., 2018. Capsid structure of dsRNA fungal viruses. Viruses 10, 481.
Mata, C.P., Luque, D., Gómez-Blanco, J., et al., 2017. Acquisition of functions on the outer capsid surface during evolution of double-stranded RNA fungal viruses. PLoS Pathogens 13, e1006755.
Mertens, P., 2004. The dsRNA viruses. Virus Research 101, 3–13.
Miyazaki, N., Salaipeth, L., Kanematsu, S., Iwasaki, K., Suzuki, N., 2015. Megabirnavirus structure reveals a putative 120-subunit capsid formed by asymmetrical dimers with distinctive large protrusions. Journal of General Virology 96, 2435–2441.
Naitow, H., Tang, J., Canady, M., Wickner, R.B., Johnson, J.E., 2002. L-A virus at 3.4 Å resolution reveals particle architecture and mRNA decapping mechanism. Nature Structural Biology 9, 725–728.
Pan, J., Dong, L., Lin, L., et al., 2009. Atomic structure reveals the unique capsid organization of a dsRNA virus. Proceedings of the National Academy of Sciences 106, 4225–4230.
Patton, J.T., 2008. Segmented double-stranded RNA viruses. Structure and Molecular Biology. Norfolk: Caister Academic Press.
Sato, Y., Castón, J.R., Suzuki, N., 2018. The biological attributes, genome architecture and packaging of diverse multi-component fungal viruses. Current Opinion in Virology 33, 55–65.

Ustilago maydis Viruses and Their Killer Toxins Alexis Williams and Thomas J Smith, The University of Texas Medical Branch, Galveston, TX, United States © 2021 Elsevier Ltd. All rights reserved. This is an update of J. Bruenn, Ustilago Maydis Viruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00519-7.

The Totiviruses The family Totiviridae comprises dsRNA viruses that are classified in five genera: Totivirus, Victorivirus, Giardiavirus, Leishmaniavirus, and Trichomonasvirus. Members of the genera Giardiavirus, Leishmaniavirus, and Trichomonasvirus infect protozoa, whereas totiviruses and victoriviruses infect fungi. Only giardiaviruses and trichomonasviruses make capsids that are released from the host to infect other cells. The other members of the family persistently infect their hosts and never leave the cytoplasm. The word ‘toti’ is Latin for ‘whole’ and reflects the fact that the genome is undivided. The positive strand of the genome has two overlapping open reading frames: the coat protein (gag) followed by the polymerase (pol). In a fraction of the translation products, gag and pol are fused together via a translational frameshift. Even for those genera that do not leave the host, the capsid acts to sequester the dsRNA genome away from the antiviral cellular machinery and forms a ‘mini-nucleus’ for genome replication. Early cryo-EM image reconstructions of two representatives of the genus Totivirus, Ustilago maydis virus H1 (UmV-H1) and Saccharomyces cerevisiae virus L-A (ScV-L-A), showed that the capsids have an unusual T = 1 icosahedral symmetry where two identical copies of the gag protein are in each icosahedral asymmetric unit. While T = 2 is not an acceptable icosahedral triangulation number, these capsids have been called ‘T = 2’ to reflect this unusual capsid architecture. These reconstructions were subsequently followed by a crystallographic structure of ScV-L-A, shown in Fig. 1. Structures of several members of the family Totiviridae have now been determined and are reviewed in the recent paper describing the structure of the giardiavirus, Giardia lamblia virus strain Portland I (GLV-I). In spite of low sequence similarity, all members of the family Totiviridae share this unusual ‘T = 2’ symmetry.
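The statement that T = 2 is not an acceptable triangulation number can be checked against the standard Caspar-Klug rule, T = h² + hk + k² for non-negative integers h and k, with 60T subunit positions per shell. The short Python sketch below enumerates the allowed values; the helper names are illustrative, and the second function simply multiplies the 60 asymmetric units by the number of protein copies in each, which is how a T = 1 shell with two gag copies per asymmetric unit reaches 120 subunits.

```python
def triangulation_numbers(limit):
    """Return the sorted Caspar-Klug numbers T = h^2 + h*k + k^2 up to limit."""
    found = set()
    for h in range(limit + 1):
        for k in range(limit + 1):
            t = h * h + h * k + k * k
            if 1 <= t <= limit:
                found.add(t)
    return sorted(found)


def subunits_per_capsid(t_number, copies_per_position=1):
    """Subunits in a shell: 60 asymmetric units x T x copies per position."""
    return 60 * t_number * copies_per_position


allowed = triangulation_numbers(10)
print(allowed)                    # [1, 3, 4, 7, 9]
print(2 in allowed)               # False: T = 2 is not allowed
print(subunits_per_capsid(1, 2))  # 120 gag copies in a 'T = 2' capsid
```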
The production of the gag-pol fusion protein plays a critical role in ScV-L-A and UmV-H1 viral replication. In ScV-L-A, there are approximately two copies of the fusion protein per virion; the capsid protein portion forms part of the capsid, and the RNA-dependent RNA polymerase lies inside the capsid, where genome replication occurs. For the toxin-encoding species, additional dsRNA segments are separately encapsidated. These are essentially satellite dsRNAs and are wholly dependent upon the main portion of the viral genome for their replication and coat protein.

The Killer Phenomena Several strains of Ustilago maydis, a causal agent of corn smut disease, exhibit a ‘killer’ phenotype that is due to persistent infection by U. maydis viruses. These viruses produce potent killer proteins that are secreted by the host. This is a rare example of virus/host symbiosis in that these viruses are dependent upon host survival and, to that end, produce antifungal proteins that kill competing, uninfected strains of U. maydis. Three killer strains of U. maydis have been characterized to date: P1, P4 and P6, that secrete KP1, KP4 and KP6 toxins, respectively. Correspondingly, there are three groups of resistant cells where resistance is determined by three independent recessive nuclear genes: p1r, p4r and p6r. The identities of these resistance factors are unknown. These factors are specific for each antifungal protein. Therefore, there are no single resistance alleles that confer simultaneous resistance to all three toxins. All three immunity genes are recessive to their sensitive alleles. Immunity factors are not conferred by the viral genome itself. Resistance to these antifungal proteins was found in ~20% of haploid strains in the natural population. The incidence of strains expressing a toxin is ~1% of the total population and is close to the expected frequency if immunity is a recessive, nuclear encoded factor. This implies that the viruses and their hosts have undergone a co-evolutionary process where the viruses incorporated antifungal proteins, perhaps from other fungi in the distant past, and immunity factors in the host have been selected for that protect it from the effects of the antifungal proteins.

KP4 The KP4 toxin is different from the other U. maydis killer proteins. It is a single polypeptide of 105 amino acids and, unlike KP1 and KP6, is not processed by Kex2p protease. There is no significant sequence similarity between KP4, KP6, and other known toxins. While most of the yeast toxins are acidic and the KP6 and KP1 toxins have neutral pIs, KP4 is extremely basic with a pI > 9.0. KP4 has a unique, compact α/β sandwich structure held together by five disulfide bonds (Fig. 2). α/β proteins are very common and the majority function by interacting with other proteins. KP4 has a single split βαβ motif with a total of seven β-strands

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20943-6


Fig. 1 Structure of the ScV-L-A capsid. Subunits A and B are shown here in red and blue, respectively. Ribbon diagrams of one copy each of A and B are also shown in this ‘T = 2’ particle. This unusual capsid organization was also found in UmV and other members of the family Totiviridae. The 2-, 3- and 5-fold axes of symmetry are indicated.

Fig. 2 Atomic structure of KP4. This ribbon diagram is colored red to blue as the chain extends from the N to the C terminus. The five disulfide bonds are represented by ball and stick models. Also shown is lysine 42 (K42), which, when mutated to a glutamine or chemically modified, abrogates KP4 activity.

(β1–β7) and three α-helices (α1–α3). One of the most unusual features of the protein is that the two major helices, α2 and α3, have a left-handed βαβ cross-over conformation. The possible biological function of these unusual arrangements is unclear. It may be to create a more hydrophobic ‘cup-like’ surface on the opposite side of the β-sheet that might be important for protein-protein interactions.

Effects of KP4 on U. maydis Cells The dogma had been that KP4 affects U. maydis cells by forming pores in the cell membrane like colicin and some of the small anti-microbial peptides found in animals and yeast. However, from its structure this seemed unlikely. KP4 is only 105 residues in length but contains 5 disulfide bonds and is highly charged. From this, it seemed unlikely that it would form channels on its own or, because of the disulfide bonds, be able to undergo the kinds of conformational changes required to form pores in the membrane. The structure of KP4 is moderately similar to that of scorpion alpha toxins, which are known to target voltage gated sodium channels and inhibit growth in susceptible cells. Therefore, it was proposed KP4 inhibits growth of susceptible fungi through targeting a membrane surface ion channel. To test for this, cell growth was measured in the presence of KP4 with increasing concentrations of K+, Na+, Mg2+, and Ca2+. Ca2+ was very effective at abrogating KP4 inhibition, but, with the exception of a slight effect by K+, none of the other metals had any effect even at very high concentrations. This suggested that KP4 might be blocking Ca2+ channels. If true, then this further suggests that the block is via reversible protein-protein interactions with the Ca2+ channels. To test this hypothesis, sensitive P2 cells were treated for more than 24 h with high concentrations of KP4 and then extensively washed with minimal media. When treated in this way, the cells did recover with about a 6-h delay. Ca2+ added to the


wash was very effective at removing all KP4 inhibition. This suggests that Ca2+ abrogates KP4 effects by directly blocking toxin binding. In scorpion toxin AaHII, K58 is a highly reactive lysine near the C-terminus whose chemical modification inactivates the toxin. To test whether KP4 and scorpion toxins share this critical feature, KP4 was mutated and chemically modified. There is only one lysine in KP4, K42 (Fig. 2). As with scorpion toxin, when this lysine was chemically modified with acetic anhydride there was a total loss of KP4 activity. This result was confirmed by site-directed mutagenesis: replacing K42 with a glutamine also abolished KP4 activity. Finally, as a direct proof that KP4 blocks metal ion import, we demonstrated that KP4 inhibited uptake of 45Ca2+ in U. maydis cells. Subsequently, experiments were performed with animal cells, where electrophysiology is more tractable without the cell wall and the various types of Ca2+ channels are well characterized.

KP4 Blocks L-type Voltage Gated Ca2+ Channels The most convincing evidence that KP4 acts on Ca2+ channels was that KP4 specifically blocks voltage gated animal channels. KP4 blocked Cav1.2, but had no effect on Cav2.1 or Cav2.3 channels. KP4 specifically blocks L-type Ca2+ channels with a weak voltage dependence to the block. In a manner akin to the calciseptine peptide block of Cav1.2, KP4 does not affect the voltage-dependence of activation but slightly shifts the voltage-dependence of inactivation of the channel. KP4 activity against both mammalian and fungal cells was blocked by chemical modification of the single lysine residue – akin to what was found with scorpion toxin. Further, as was observed in fungi, the inhibition of the animal Cav1.2 channel was abrogated by the addition of Ca2+. Interestingly, we suggested at the time that calcium channel protein (Cch1) was the likely target for KP4 since Cch1 showed homology to the mammalian voltage gated L-type Ca2+ channels.

Effect of KP4 on Plants Since KP4 affects fungal and mammalian calcium influx, it was important to ascertain what effect it might have on plants. Unlike their mammalian counterparts, plant calcium channels are less selective between divalent and monovalent ions. A plant homologue of L-type mammalian channels is the rca/VDCC2 channel. This channel responds to changes in membrane potential over a range similar to that of L-type calcium channels, is sensitive to inhibition by dihydropyridine (DHP) derivatives, and has a calcium ion conductance similar to that of its mammalian counterparts. To measure possible effects of KP4 on plant calcium channels, a non-intrusive and real-time method was chosen to monitor the effects of these antifungal proteins on tip-growing root hair cells. A transgenic line of Arabidopsis thaliana expressing an enhanced yellow fluorescent protein/Rab GTPase fusion protein, EYFP-RabA4b, was used to monitor calcium gradients. RabA4b is involved in vesicular transport and is localized to the tips of growing root hairs but disappears in mature root hair cells that have stopped expanding. As in fungal hyphae, root hair tip growth is associated with an apex-high cytosolic free calcium gradient generated by a local influx of calcium at the tip. This calcium gradient was shown to be crucial for RabA4b localization, since disruption of the calcium gradient causes a concomitant abrogation of the RabA4b gradient. The addition of KP4 to growing root hairs causes rapid dissipation of the EYFP-RabA4b gradient at the apical tip and a concomitant cessation of root hair elongation. In all cases, the EYFP-RabA4b gradient was reestablished and root hair extension resumed after KP4 was washed away. The speed of the KP4 effect, and of its reversal on washing, is consistent with KP4 blocking a calcium influx channel and suggests that it targets the outer membrane, where it can rapidly bind and act. Consistent with these results, growth of whole roots was also inhibited by the exogenous application of KP4, and this effect was also blocked by the addition of exogenous calcium. Therefore, KP4 is targeting some portion of calcium channels that is conserved among three kingdoms.

Possible Application of KP4: Fungal Resistance in Plants Maize smut is a global disease responsible for extensive agricultural losses. In spite of the apparent inhibitory effect of adding KP4 exogenously to Arabidopsis root hairs, we created several lines of transgenic maize to see whether constitutive expression of KP4 would confer resistance to U. maydis infection. To target KP4 to the extracellular space of transgenic maize, a monocot codon-optimized KP4 gene containing the secretory signal peptide sequence of a plant defensin, MsDef1, was placed under the control of the constitutive maize Ubi1 promoter. KP4 was expressed to high levels in planta, as shown by ELISA and killing assays, without any apparent impact on plant development. The fact that we did not see any phenotypic effect of KP4 on root development could be due to lower sensitivity of transgenic maize roots to KP4, possible adaptation of the plants to KP4 expression, or a local concentration of KP4 around the roots that is significantly lower than in the experiments described above. Since transgenic maize lines secreted bioactive KP4 protein, we next determined the ability of this protein to protect transgenic lines from corn smut disease by comparing them with control lines. Seven-day-old seedlings were inoculated with a mixture of the wild-type U. maydis KP4-sensitive strains. Both strains generated galls in


nontransgenic maize plants. When the transgenic plants were challenged with U. maydis by stem or ear inoculation, they exhibited robust resistance in a dose-dependent manner. These results demonstrate that this family of naturally expressed antifungal proteins holds promise in identifying new targets for antifungal agents and a novel means to protect crops against fungal infections. While KP4 does affect fungi other than U. maydis (e.g., Fusarium graminearum and Aspergillus fumigatus (Smith and Shah, unpublished results)), there has to be strong activity against pathogens of commercial importance. Future studies on the structure/function of KP4 may improve activity against these more commercially important pathogens.

Evolutionary Origin of KP4 Recently, a large number of KP4-like genes/proteins were found in plants and fungi other than U. maydis, including the secretome of Fusarium graminearum. From sequence analysis, it appears that KP4 likely evolved in a progenitor fungus, possibly a Sordariomycete, and then moved by horizontal gene transfer to other organisms including U. maydis. However, it was not clear what function these KP4-like proteins might serve in these pathogenic fungi. Recent studies demonstrated that three KP4-like genes in F. graminearum are up-regulated as the fungal infection progresses in wheat. However, it is not clear whether this is causation or just correlation. Interestingly, calcium signaling plays a major role in a wide range of physiological responses in the plant upon fungal infection, including cytoskeletal and reactive oxygen species responses. It is tempting to speculate that perhaps the original fungus expressing KP4 used it to dampen plant cellular defenses against infection. Consistent with this is the fact that the effects of KP4 on U. maydis are abrogated by exogenously adding cAMP, an important second messenger for calcium signaling, to the media. It may be that, unlike the channel-forming toxins, KP4 evolved to disrupt cytoplasmic signaling pathways regulated by cytoplasmic calcium gradients.

KP6 KP6 is translated as a single polypeptide chain and processed by Kex2p during export through the Golgi apparatus to form α and β subunits by the elimination of a 31-residue linker region. Relatively little is known about how KP6 affects U. maydis cells. Previous studies have eliminated possible chitinase or protease activity, and it is clear that it acts in a manner wholly different from that of KP4. While 100 mM CaCl2 completely blocks KP4 inhibition and as little as 20 mM CaCl2 is sufficient to cause significant abrogation, none of the metals had any significant effect on the KP6 killing of U. maydis (Smith, unpublished data). The only published data on the mode of action of KP6 suggested that it might have a strong effect on membrane integrity or osmotic regulation. KP6 needs both subunits for activity. Both subunits can be expressed separately in immune U. maydis cells and become active when mixed together. Sensitive U. maydis cells are able to produce KP6α but not KP6β. This suggests that KP6β is not toxic when added to the extracellular media but is toxic when expressed intracellularly. Sensitive U. maydis cells can be treated with KP6α alone without any evidence of antifungal effects. However, the sensitive cells are killed when they are treated with KP6α alone, washed, and then treated with KP6β. In contrast, if sensitive cells are treated with KP6β, washed, and then treated with KP6α, there is no antifungal activity. Together, this suggests that KP6α may bring KP6β into the cell for killing.

The Atomic Structure of KP6 KP6α forms a single domain structure that has an overall ellipsoid shape and also belongs to the α/β-sandwich family (Fig. 3). The tertiary structure consists of a four-stranded antiparallel β-sheet, a pair of antiparallel α-helices, a short strand along one edge of the sheet, and a short N-terminal helix. Although the fold is reminiscent of toxins of similar size, the topology of KP6α is distinctly different in that the α/β-sandwich motif has two right-handed βαβ split crossovers. Interestingly, the two KP6 subunits are nearly identical (Fig. 3). The RMSD of the 3D alignment using Chimera was 1.5 Å using 74 of the residues in each of the subunits. Within each subunit, the N and C termini are in close proximity. However, in the α-β heterodimer, the C terminus of KP6α is on the opposite side of the N-terminus of KP6β (Fig. 3). From modeling studies of what the protoxin structure might look like, it seems likely that the α-β heterodimer undergoes a major structural transition, perhaps activation, during Kex2p processing and export. This N-terminal region is where the majority of the differences lie between KP6α and KP6β (Fig. 3). In KP6α, there is an amphipathic α-helix at the N-terminus whereas in KP6β, the N-terminus is shorter by five residues and is entirely a coil structure with the first three residues disordered. In KP6β, the β3-α3 loop is much longer than that in KP6α and lies in the general region of this N-terminal α-helix in KP6α. The only other major difference is that the α2-β2 loop is longer and partially disordered in KP6β. The α-helix at the N-terminus of the α-subunit is rather unusual in that a disulfide bond holds it in a position that presents a very hydrophobic surface to the aqueous environment rather than burying it within the protein (arrows in Fig. 3). Interestingly, in the structure of KP6α alone, this hydrophobic patch forms trimeric interactions in the crystal lattice. 
Clearly there is strong pressure to bury this helix, away from the aqueous environment. It seems likely that this hydrophobic region plays an important role in targeting the fungal cell, perhaps the cell membrane or a membrane associated protein.
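For readers unfamiliar with the RMSD figure quoted for the α and β subunits, the quantity is the square root of the mean squared distance between matched atoms after superposition. The Python sketch below shows only that final arithmetic and assumes the two coordinate sets are already superposed and paired; it is not the Chimera alignment procedure, and the coordinates are made-up illustrative values rather than KP6 data.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired, pre-superposed
    lists of (x, y, z) coordinates given in angstroms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical matched positions: every pair is displaced by 1.5 angstroms.
subunit_one = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
subunit_two = [(0.0, 0.0, 1.5), (1.5, 0.0, 1.5), (3.0, 0.0, 1.5)]

print(rmsd(subunit_one, subunit_two))  # 1.5
```

In a real comparison the paired atoms would typically be the Cα atoms of the aligned residues (74 per subunit in the text), and the superposition step would be carried out by the alignment software before this calculation.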


Fig. 3 Atomic structure of KP6. Top: Shown here are the atomic structures of the α and β chains of KP6. The colors of the models change from blue to red as the chains extend from the N to the C termini. The disulfide bonds are represented as ball and stick models. Note the hydrophobic, exposed, phenylalanine residues at the N-terminal helix of the α chain. Bottom: Stereo diagram showing the high degree of structural homology between the α and β chains of KP6. The α chain is colored as a rainbow as in the top figure while the β chain is mauve colored. Note that the core structures of the two subunits are nearly identical and most of the differences are found in the N-termini.

KP1 Relatively little is known about the structure and mode of action of KP1. KP1 is secreted by the P1 U. maydis strain, is 12.9 kDa in size, is not glycosylated, and has a basic pI of 8.0. Similar to KP6, KP1 is produced as a pre-protoxin and then processed by Kex2p cleavage to produce two polypeptides, KP1α and KP1β. However, unlike KP6, which requires both polypeptides for killing, KP1 only requires the β subunit. Like KP4 and both KP6 polypeptides, KP1 contains cysteine residues (six in total), all of which are probably connected via disulfide bonds. To date, the structure, the cellular receptor, the mechanism of killing, and the ED50 of KP1 remain unknown.

Comparison of the Killer Proteins As discussed above, while KP4 and KP6 are both expressed by U. maydis viruses, they share no homology with regard to structure or function. For comparison, Fig. 4 shows schematic diagrams for KP1, KP4, KP6, and the yeast toxin K1. From the structures of KP4 and KP6, the disulfide bond pattern is represented by grey lines, while the locations of cysteine residues are represented by yellow balls in the KP1 and K1 sequences. At the bottom are the results of transmembrane helical predictions based on the various sequences. The general organization and sizes of the KP6, KP1, and K1 pre-protoxins appear comparable (Fig. 4). However, there is no significant sequence similarity among any of them. Using a transmembrane prediction algorithm, TMHMM2, a very interesting pattern appears among the toxins. As expected, all four toxins are predicted to have transmembrane helices in the N-terminal export signal. For KP1, KP4, and KP6, this is the only part of the protein predicted to have a transmembrane helix. In contrast, there are predicted to be two additional transmembrane helices in the α subunit of K1. This is consistent with the results suggesting that the α subunit of K1 affects the membrane integrity of the target cell. While not shown in the figure, those helices are also predicted to be in the S. cerevisiae toxins K2 and K28. This suggests that the S. cerevisiae toxins interact with the target in a manner homologous to each other, while the U. maydis toxins are quite different among themselves as well as from the S. cerevisiae toxins. The other significant difference among the toxins is in the disulfide bond patterns and locations of cysteine residues. KP4 is the only single-subunit toxin and has five disulfide bonds in its 105 residues. Similarly, both subunits of KP6 are heavily stabilized by disulfide bonds, with four and three disulfide bonds in the α and β subunits, respectively. 
This is important to deduce mode of action since these disulfide bonds make it impossible for the proteins to undergo large conformational changes that could form pores in the membrane like colicin. Therefore, in solution, these are stable hydrophilic proteins that seem more likely to interact


Fig. 4 Similarities and differences among the fungal toxins. In these scaled diagrams of the pre-protoxins, the various regions of the proteins are represented by the various colors. The mauve and orange arrows highlight the signal peptidase and Kex2p protease cleavage sites, respectively. The grey lines in the KP4 and KP6 figures represent the disulfide bonds observed in the crystal structure. For the other diagrams, the locations of cysteine residues are highlighted in yellow. The bottom figure shows the results of the transmembrane helix prediction algorithm, TMHMM2. Note that KP1, KP4, and KP6 only have predicted transmembrane helices in the signal peptide while K1 has two additional predicted helices in the α subunit.

with cell membrane proteins than form channels in the hydrophobic membrane. In the absence of a crystal structure, the disulfide bonding pattern of KP1 is less clear, but there are sufficient cysteine residues to make 2–3 disulfide bonds in the two subunits. K1 has three cysteine residues in each subunit, with at least one from each forming an inter-subunit disulfide link. Interestingly, the cysteine residues in the α subunit are clustered between the two predicted transmembrane helices. Therefore, the S. cerevisiae toxins appear to be more flexible by virtue of fewer, if any, intra-subunit disulfide bonds and more likely to interact with the membrane via the transmembrane helices.
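The idea behind the transmembrane helix predictions discussed above can be illustrated with a much simpler method than TMHMM2, which is a hidden Markov model. The Python sketch below instead uses the classic Kyte-Doolittle hydropathy scale with a 19-residue sliding window and the conventional cut-off of 1.6 for a candidate membrane-spanning segment. It is a deliberately swapped-in, simplified technique, and the two sequences are hypothetical examples, not the toxin sequences.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def has_candidate_tm_segment(sequence, window=19, threshold=1.6):
    """True if any window reaches the hydropathy threshold."""
    return any(avg >= threshold for avg in window_hydropathy(sequence, window))


# Hypothetical sequences for illustration only
signal_like = "MKT" + "LLVALLAVFLALIAGVLAL" + "KRDE"  # long hydrophobic stretch
soluble_like = "MKDEERKSTGNDQEKRSTDPGKEEDRKNSTQ"      # charged and polar

print(has_candidate_tm_segment(signal_like))   # True
print(has_candidate_tm_segment(soluble_like))  # False
```

A window-average approach of this kind flags only long hydrophobic stretches, such as an N-terminal export signal; it does not model helix topology or loop lengths as TMHMM2 does.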


Vegetative Incompatibility in Filamentous Fungi
Songsong Wu, Daohong Jiang, and Jiatao Xie, Huazhong Agricultural University, Wuhan, China
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Anastomosis Fusion of somatic hyphae followed by exchange of cytoplasmic contents.
Horizontal transmission The transmission of a virus, parasite, or other pathogen from one individual to another within the same generation.
Innate immunity Physical and chemical barriers, including cells, cytokines, and antiviral proteins, that inhibit infection with little specificity and without generation or adaptation of a protective memory.
NLRs A class of cytosolic nucleotide oligomerization domain (NOD)-like receptors that recognize non-self signals in plants, animals, and fungi.
Vegetative incompatibility A self/non-self recognition system in filamentous fungi that determines the outcome of hyphal fusion between fungal individuals.

Introduction

Self and non-self recognition exists widely among organisms and includes the immune systems of vertebrates, self-incompatibility during reproduction in plants, and mating and vegetative incompatibility (VI) in fungi. Non-self recognition determines the outcome of somatic fusion between different tissues and causes conflicts between individuals with different genetic backgrounds. In filamentous fungi, a colony with an interconnected hyphal network is formed through hyphal tip extension and branching. Different fungal isolates can undergo hyphal fusion to form a heterokaryon, and this process is essential for cytoplasmic communication and sexual reproduction in fungi. If two fungal individuals differ at one or more het/vic loci, the fused cells commonly show vacuolization and a programmed cell death (PCD) reaction. Therefore, hyphal contact between two incompatible fungal individuals usually results in a macroscopic phenotype with black-pigmented bands (dead cells) at the interface between the two contacting isolates, while isolates with the same genetic background integrate well and form one colony.

Vegetative compatibility groups (VCGs) are defined as groups of compatible, genetically related isolates within a species that carry the same alleles at their het/vic loci. In some fungi, such as Rhizoctonia solani, isolates from different VCGs differ in colony phenotype and pathogenicity. Therefore, VCG identification is a useful tool for analyzing fungal population structure and evaluating the evolution of invasive pathogenic fungi, such as Hymenoscyphus fraxineus and Ophiostoma novo-ulmi. In Aspergillus flavus, toxigenic and atoxigenic isolates belong to different VCGs, and atoxigenic VCGs can be used as biocontrol agents to reduce aflatoxin content by increasing the incidence of atoxigenic isolates in the natural population. The application of atoxigenic VCGs thus provides an important biocontrol strategy against plant pathogens.

The ability of intracellular parasites to be transmitted among fungal individuals under natural conditions is an essential factor in the spread of infectious diseases. Mycoviruses, as a group of intracellular parasites, are considered to be limited to intracellular transmission, with the exception of Sclerotinia sclerotiorum hypovirulence-associated DNA virus 1 (SsHADV-1), which has an extracellular phase. Mycoviruses can spread effectively within a fungal colony or between different strains that belong to the same VCG. In black Aspergilli, Cryphonectria parasitica, and Sclerotinia sclerotiorum, somatic incompatibility can efficiently block virus transfer among vegetatively incompatible individuals. Therefore, VI can effectively reduce the risk of horizontal transmission of mycoviruses and their related genotypes among fungal populations.

Heterokaryosis is believed to be controlled by het genes in fungi. So far, het genes have been cloned in several model species, which include eleven het loci in Neurospora crassa, nine het loci in Podospora anserina, eight het loci in Aspergillus nidulans, and six het loci in C. parasitica. Furthermore, the interaction between horizontal transmission of Cryphonectria hypovirus 1 (CHV1) and the vic systems in C. parasitica has been thoroughly studied. Horizontal transmission of the virulence-attenuating CHV1 is known to be restricted by VI systems. In C. parasitica, two super mycovirus donor strains (SD328/SD82) were generated by disrupting four of the six vic genes, and a mixture of these two strains can deliver CHV1 to different VCGs both in the laboratory and under natural conditions. In S. sclerotiorum, Sclerotinia sclerotiorum mycoreovirus 4 (SsMYRV4) can help heterologous viruses to be transmitted among different VCGs by inhibiting VI-mediated PCD in donor strains. This feature of SsMYRV4 also creates a bridge strain for virus spread between different VCGs under natural conditions.

These results demonstrate that the biocontrol capability of hypovirulence-associated mycoviruses can be enhanced by modifying the vegetative compatibility type of fungal populations. In this article, we focus on the process of hyphal fusion between individuals from different VCGs. We analyze the genetic and molecular control of VI reactions. We also examine the ubiquitous immune system of NLRs to analyze the role of HET proteins in PCD and mycovirus transmission upon allorecognition in filamentous fungi. These analyses will provide clues for enhancing horizontal transmission of mycoviruses by blocking or weakening VI reactions in fungi.


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00065-5


Fig. 1 Cell fusions between S. sclerotiorum isolates. (A) Cell fusion between vegetatively compatible and vegetatively incompatible isolates of S. sclerotiorum. Strains were cultured on PDA medium supplemented with 75 mL/L of McCormick’s red food coloring for 48 h. Strains Ep-1PNA367 and RL6 belong to the same VCG, while strains RL19, 1980, RL26, and RL29 are from different VCGs. A clear line separates individual strains that are vegetatively incompatible. (B) Light microscope observation of cell fusion between vegetatively compatible isolates (left panel) and vegetatively incompatible isolates (right panel). (C) SEM observation of cell fusion (left panel: vegetatively compatible; right panel: incompatible). Strains were cultured on PDA plates covered with cellophane membrane. Once the hyphal tips of the two strains had contacted each other, the fused cells were observed under the microscope.

Characteristics of the Vegetative Incompatibility Reaction

Microscopic and Macroscopic Analyses of Hyphal Fusion

In many organisms, cell fusion plays an essential role in developmental processes, such as the fusion of sexual cells and the formation of organs based on multinucleate cells. In filamentous fungi, during an early stage of hyphal anastomosis, cell wall-synthesizing enzymes and digestive enzymes in secretory vesicles are delivered to the point of contact. After the contacting cells exchange cellular contents, new septa usually form near the fusion site. Hyphal anastomosis increases nutrient and water utilization in the fungal colony. Cell fusion between different genetic backgrounds often results in the formation of a heterokaryon. The heterokaryon grows well if the fused cells come from genetically compatible individuals, whereas fused cells from incompatible individuals show vacuolation and PCD, a reaction termed VI.

In some fungal species, VI can easily be observed by the naked eye when incompatible pairs of strains grow on agar medium. A black-pigmented band usually appears at the interface of the two contacting isolates (Fig. 1(A), yellow arrow), while there is no clear barrage between fused vegetatively compatible isolates (Fig. 1(A), red arrow). This vegetative compatibility test of pair-cultured isolates on agar has been used to analyze VCGs and natural populations in many fungi, including S. sclerotiorum, P. anserina, and O. novo-ulmi. The black-pigmented band between two contacting isolates is formed by degeneration of the fused cells and neighboring cells. These cells are marked by vacuolation, with vacuoles increasing in size, followed by cytoplasmic granulation. Some contacting cells from different VCGs shrink (Fig. 1(B), yellow arrow). Scanning electron microscopy (SEM) of hyphal fusion between incompatible strains showed cytoplasmic shrinkage in the fused cells (Fig. 1(C), yellow arrow), while fused cells from compatible strains grew well (Fig. 1(C), red arrow). Transmission electron microscopy (TEM) of fused cells from compatible strain pairs revealed cell wall fusion with normal vacuoles, and new septa formed at the sites of anastomosis. In incompatible fused cells, the plasma membranes of the heterokaryon were disconnected, and the contents of the cytoplasm and vacuoles disappeared. Incompatible fusion leads to PCD without hyphal anastomosis. Therefore, the morphological changes of vacuoles and cytoplasm in the fused cells act as an indicator of VI.

Identification of VCGs

VCG identification is important for evaluating the genotypic and phenotypic diversity of fungal populations. Multiple techniques have been developed to identify VCGs in filamentous fungi.


Barrage assay

The barrage assay is a convenient method because it does not require additional specific genetic markers and can be performed directly by pair-culturing isolates. Potato dextrose agar (PDA) supplemented with McCormick’s red food coloring has been applied for VCG identification in S. sclerotiorum (Fig. 1(A)). PDA medium supplemented with the pH indicator dye bromocresol green has been used for VCG tests in C. parasitica. Supplementing the medium with such dyes can make the barrage more apparent to the naked eye.

Microscopy

Microscopic analysis of fused cells and neighboring cells has been applied to identify VCGs in some fungal species. Typical features, including shrinkage of the plasma membrane and vacuolization of the cytoplasm, serve as markers of VI. Mycelial plugs from different VCGs are placed on PDA plates covered with cellophane membrane. Pairs of plugs from the same VCG act as controls. After the hyphae of the two strains have been in contact for 3–6 h, the microstructure of the fused cells can be observed under the microscope (Fig. 1(B)).

Visualization of labeled proteins during co-culture

When isolates from the same VCG contact each other, they can exchange cytoplasmic contents during anastomosis. Contacting cells from different VCGs undergo PCD, which prevents cytoplasmic communication. Therefore, if the two tested isolates are labeled with two different fluorescent proteins, the transmission of cytosolic fluorescent proteins during anastomosis can be observed: two fluorescent signals in the contacting cells indicate that the two strains belong to the same VCG, whereas only one fluorescent signal in the cells at the region of contact indicates that the two strains belong to different VCGs and failed to fuse.
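The decision rule in this assay can be written down as a minimal sketch. It assumes that each cell observed at the contact zone has already been scored for which labels it contains; the label names "GFP" and "RFP" and the function name are hypothetical placeholders, not part of any published protocol.

```python
def classify_contact_zone(cells, label_a="GFP", label_b="RFP"):
    """Interpret fluorescence scored in cells at the contact zone.

    cells: iterable of sets, one per observed cell, each holding the
    fluorescent labels detected in that cell.
    Returns "same VCG" if any cell carries both labels (cytoplasm was
    exchanged), otherwise "different VCGs".
    """
    cells = [set(c) for c in cells]
    if not cells:
        raise ValueError("no cells observed at the contact zone")
    if any({label_a, label_b} <= c for c in cells):
        return "same VCG"
    return "different VCGs"
```

In practice the call would depend on thresholds for signal detection and on how many cells are scored, which this sketch leaves to the observer.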

Auxotrophic complementation

Auxotrophic complementation can be used to test for VCGs. Hyphae from two isolates expressing different selective markers are allowed to intertwine at the interface region. If the pair of strains is compatible, the fused cells from the interface region are able to grow by auxotrophic complementation.

Detecting the PCD of fused cells

In filamentous fungi, the VI reaction usually causes PCD of the fused cells and neighboring cells. The occurrence of cell death in the contact area can be tested by staining dead cells, for example with Evans blue. DNA fragmentation, a feature of early PCD, can be detected using TUNEL (terminal deoxynucleotidyl transferase dUTP nick-end labeling) assays for VCG identification. These methods for PCD detection are widely used among fungal pathogens.

The Genetics of Vegetative Incompatibility

The outcome of cell fusion is determined by self/non-self recognition genes. If two contacting isolates have the same recognition genes, the hyphae in the interaction zone fuse successfully and form a coherent colony. However, when the two isolates differ at one or more self-recognition loci, the fused hyphae in the interaction zone usually undergo PCD, manifested as VI. Therefore, the failure of hyphal fusion between VCGs often reflects genetic differences at the het or vic loci.

In N. crassa and P. anserina, allelic and non-allelic VI systems have been established. In allelic VI systems, co-expression of incompatible alleles of the same locus leads to a VI reaction, for instance het-s/het-S in P. anserina. In non-allelic VI systems, interactions among unlinked incompatible alleles of different genes result in VI, such as het-r and het-v-mediated VI in P. anserina. Furthermore, HET proteins, which share a conserved HET domain of approximately 150 amino acids, usually contribute to the VI reaction and are widespread in fungal genomes. In P. anserina, proteins with a HET domain act as mediators of PCD during the VI process. In N. crassa, 73 putative HET proteins have been identified by comparative population genomic analysis, but the roles of these HET genes in VI remain to be investigated. Using the same approach, 44 proteins with a HET domain were predicted in the genome of S. sclerotiorum.

To date, the VI systems of three model fungi (N. crassa, P. anserina, and C. parasitica) have been well studied. In N. crassa, the mating-type locus (mat A-1 and mat a-1) controls mating and sexual development; strains with null mutations in mat A-1 are both heterokaryon compatible and sterile with strains of mating type a. During vegetative growth, however, hyphal fusion between strains of opposite mating type leads to growth inhibition and PCD, which is mediated by the tolerant (tol) locus. The tol gene encodes a protein with a HET domain and leucine-rich repeat regions (LRRs). Mutant tol acts as a specific suppressor of heterokaryon incompatibility. Furthermore, interactions between het genes (such as un-24/het-6 and het-c/pin-c) have been confirmed to control VI.

In P. anserina, eight VI systems have been characterized; the function of the allelic locus het-s and the non-allelic het loci (het-c, het-e, and het-r) depends on genes encoding HET domain proteins, which have been confirmed to function in triggering PCD. Typical het genes encode a HET domain at the N-terminus, a NACHT domain in the central region, and a WD40 repeat domain at the C-terminus. The NACHT domain contains LRRs and a pyrin domain (PYD). In plants and animals, NACHT-containing proteins recognize pathogen-associated molecular patterns, stimulate immune responses, and regulate apoptosis. Proteins containing WD40 domains are involved in many cellular processes, including apoptosis, transcriptional regulation, and immune responses. The NACHT-WD40 domain is functionally important in HET-E-triggered PCD reactions in fungi. Furthermore, the het-s allele encodes the prion-forming protein HET-s, which acts as an activation trigger of a cell death execution protein.

In C. parasitica, six unlinked vic loci have been identified and characterized. Among these VI systems, the function of vic1, vic6, and vic7 depends on proteins containing HET domains, and the protein encoded by vic4 has a NACHT-WD40 architecture. Predicted proteins with HET domains are usually not conserved among fungi, so it is difficult to define het genes through orthologous relationships. Interestingly, heterologous expression of het-c2 from N. crassa can also trigger a VI reaction in Aspergillus niger. This result indicates that a het-c homolog can function as a het gene in a different fungus, although this remains to be explored in other species. The un-24 gene triggers VI-mediated PCD in N. crassa and its related species, but it does not function as a het gene in other species. Therefore, proteins containing a HET domain have diverse functions, and only some of these proteins are involved in VI. Exploring the function of these proteins in VI will enhance our understanding of the polymorphism of HET proteins.
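The genetic rule described in this section, that two isolates are compatible only when they carry the same alleles at every het/vic locus, can be sketched as follows. The isolate names, the numeric allele coding, and the function names are hypothetical, and the sketch assumes every isolate has been typed at the same set of loci. The count of possible VCGs additionally assumes a fixed number of alleles per locus (two by default), which the text above does not state.

```python
from collections import defaultdict


def differing_loci(a, b):
    """Return the sorted loci at which two genotypes carry different alleles."""
    return sorted(locus for locus in a if a[locus] != b[locus])


def are_compatible(a, b):
    """Two isolates are vegetatively compatible if no locus differs."""
    return not differing_loci(a, b)


def group_into_vcgs(isolates):
    """Group isolates (name -> genotype dict) into VCGs by identical genotype."""
    groups = defaultdict(list)
    for name, genotype in isolates.items():
        groups[tuple(sorted(genotype.items()))].append(name)
    return sorted(sorted(members) for members in groups.values())


def max_vcg_count(n_loci, alleles_per_locus=2):
    """Upper bound on the number of VCGs for unlinked loci (assumed allele count)."""
    return alleles_per_locus ** n_loci
```

Under the two-allele assumption, the six unlinked vic loci of C. parasitica would allow up to `max_vcg_count(6)`, that is 64, distinct VCGs, which illustrates how a handful of loci can fragment a population into many incompatible groups.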

The Signaling Pathways of Vegetative Incompatibility

The VI process involves many signaling pathways, which mediate recognition of a non-self signal, signal transmission, and triggering of PCD. Although the VI systems appear to be complex and differ among fungi, the cellular reactions to VI signals might be similar. When two incompatible hyphae contact each other, shrinkage of the plasma membrane and vacuolization are frequently observed. Interestingly, vegetatively incompatible individuals of Verticillium albo-atrum and Verticillium dahliae can form heterokaryons via protoplast fusion but not via hyphal fusion. The formation of interspecific and intraspecific heterokaryons can be improved by removing the cell wall. Therefore, the outcome of VI is controlled by nuclear genes and is also associated with components affecting the cell wall or cell wall formation. The cell wall, plasma membrane, and vacuoles could thus play important roles in VI reactions.

Heterotrimeric G proteins play an essential role in signal-transduction pathways in eukaryotes, including pheromone response and mating, pathogenesis, and non-self recognition. G proteins consist of three subunits: Gα, Gβ, and Gγ. In P. anserina, one α subunit of G proteins (MOD-D) regulates the process of VI. Increasing the cAMP level suppresses the defects in vegetative growth caused by the mod-D1 mutation but has no influence on the suppressor effect of the mod-D1 mutation during the VI reaction; thus, the function of MOD-D in VI reactions is independent of the cAMP pathway. In S. sclerotiorum, three homologs of Gα, one of Gβ, and one of Gγ have been identified. With the exception of one Gα subunit gene, the expression levels of the other four G protein subunit genes were significantly up-regulated during hyphal contact between incompatible individuals, whereas their expression did not change during anastomosis. Therefore, G proteins, as multifunctional signaling proteins, are also involved in the transduction of VI signals in fungi.

In all eukaryotes, calcium (Ca2+), a ubiquitous second messenger, is involved in multiple biological processes. Ca2+ levels play an essential role during germling and hyphal fusion. Increasing intracellular Ca2+ levels commonly triggers fusion of secretory vesicles with the plasma membrane. Furthermore, activation of MAP kinases usually occurs at the cell membrane. In N. crassa, the membrane protein HAM-11 is involved in regulation downstream of the MAK-2 pathway, and the Δham-11 mutant fails to undergo self-fusion. HAM-11 activates the MAPK signaling complex to achieve cell fusion. Another protein, the WD40 repeat-containing MAP kinase scaffold HAM-5, physically interacts with MEK-2 and MAK-2. HAM-5 oscillates with MAK-2 during hyphal fusion and controls the disassembly of the MAK-2 MAPK complex, and the Δham-5 mutant fails to undergo self-fusion. Therefore, HAM-5 is important for the activation of the MAK-2 cascade during the VI reaction. The MAK-2 pathway is also essential for transmitting the VI-mediated PCD signal downstream during fusion of incompatible fungi (Fig. 2).

In N. crassa, HET-C is a transmembrane protein, and the initial self/non-self recognition between two strains occurs at the cell membrane. Another protein, HET-E, is also associated with endomembranes and the cytoplasm. In P. anserina, coexistence of the [HET-s] prion and HET-S (a soluble protein) in one cell leads to the loss of membrane integrity and cell death. Thus, alteration of cell membrane stability by HET proteins is an initial response after hyphal fusion that finally leads to VI-associated cell death. In Rosellinia necatrix, cytological analysis shows that VI-mediated PCD is a vacuole-mediated process. The integrity of vacuoles might play an important role in the switch between compatible and incompatible reactions.

In P. anserina, among the idi genes (induced during incompatibility), idi-1 is associated with the response to nutrient starvation and with the VI process. IDI-1 is a cell wall protein localized at the septum, and rapamycin treatment can induce overexpression of idi-1. This treatment also produces hallmarks of VI, including increased septation, vacuolization, and autophagy. Thus, IDI-1 could be involved in the TOR kinase pathway during the VI reaction. idi-7 is an ortholog of Atg8 in Saccharomyces cerevisiae, which is involved in autophagy. Another idi gene, pspA, encodes a subtilisin-like serine protease and has an ortholog in S. cerevisiae that encodes a vacuolar protease involved in autophagy. These observations suggest that the VI reaction may be similar to autophagy and that this process is related to the transport of signal proteins from the cell membrane to the vacuole.

Most of the molecularly characterized proteins involved in VI possess a HET domain. In S. sclerotiorum, most HET genes are significantly up-regulated during VI, while these candidate het genes are down-regulated when VI reactions are suppressed. In N. crassa, a transcription factor encoded by the vib-1 gene regulates het gene expression. Deletion or mutation of vib-1 suppresses the VI reaction mediated by genetic differences at het loci. Thus, proteins containing a HET domain are important for downstream functions during VI reactions. According to transcriptional analyses in P. anserina and N. crassa, the top 100 up-regulated genes during the VI-mediated PCD reaction are involved in cell signaling pathways or autophagy, which indicates that some common factors have the same function during VI reactions in both species. In N. crassa, many conserved eukaryotic factors are involved in the regulation of germling fusion, such as MAP kinase modules, NADPH oxidases (NOX), and calcium-regulated factors. These signaling pathways have been confirmed to play important roles in hyphal fusion in fungi (Fig. 2).


Fig. 2 A potential model of the VI signaling network mediating hyphal fusion. The two fungal cells have different vic genes (vic-a and vic-b). The vic-a cell secretes specificity signal molecules and secondary metabolites. The vic-b cell has several kinds of VI signal receptors on the cell wall and cell membrane (such as GPCRs and NLRs). Once these receptors recognize the VI signal, PCD is finally triggered via the MAK-2 kinase pathway, the ROS pathway, and the TOR kinase pathway.

HET Protein Involved in NLR-Mediated Innate Immunity in Fungi

Innate immunity is the first line of defense against invasion by non-self pathogens. In plants and animals, NLRs play essential roles in innate immunity. The process includes pathogen recognition, non-self signal transduction, triggering of rapid cell death, and overproduction of antimicrobial compounds to block the pathogens. NLR proteins have a tripartite domain organization consisting of an N-terminal downstream-acting domain, a central nucleotide-binding domain (NBD), and a C-terminal leucine-rich repeat (LRR)-containing domain. Fungal NLRs have structures and pathogen-detection functions similar to those of plant and animal NLRs. As a self/non-self recognition system in fungi, VI is controlled by the het (or vic) genes and acts as a defense mechanism against mycoviruses by preventing hyphal fusion between different VCGs.

Based on currently available fungal genomes, 5616 candidate NLRs were identified from 198 fungi (approximately 30 NLRs per genome). In contrast to the NLRs of plants and animals, fungal NLRs show highly diverse domain organizations. Fungal NLRs include at least 12 N-terminal effector domains (such as HET, HeLo, Patatin, and Peptidase S8), 3 NB domains (NB-ARC, NACHT, and AAA), and 3 C-terminal domains (ANK, WD, and TPR). Some fungal NLRs have N-terminal domains with enzymatic activities, such as phospholipase (Patatin) and peptidase activities. These structures indicate that fungal NLRs have numerous functions in addition to their signaling function.

As mentioned above, fungal proteins containing HET domains function in VI-mediated PCD. In P. anserina, 10 of 16 characterized vic proteins contain a HET domain, and overexpression of a HET domain can initiate a VI-like PCD. The vic2 and vic4 loci in C. parasitica encode typical NLRs. In S. sclerotiorum, among the 44 predicted proteins containing a HET domain, two have an NB-ARC domain, two contain a WD domain, and one has a NACHT domain. In N. crassa, patatin-like phospholipase-1 (PLP-1), which has a typical NLR architecture, was identified, and its N-terminal phospholipase activity and central NBD are essential for triggering allorecognition and PCD in P. anserina and N. crassa. These observations indicate that the functions of fungal NLR-like proteins, including some HET proteins, are similar to those of NLRs from animals and plants and contribute to non-self recognition.

In plants and animals, NLR-mediated innate immunity is triggered by infection with various pathogens. Fungal NLR-mediated VI also functions in bacteria–fungi and virus–fungi interactions, in response to microbe-associated molecular patterns. Transcriptomic analyses of NLR-controlled VI revealed the activation of processes including cell wall modification, secondary metabolite production, and autophagy; this is very similar to the responses induced by bacteria in plants or animals. The vic2 system in C. parasitica has been confirmed to reduce horizontal virus transmission by triggering VI-mediated PCD. In Arabidopsis, the patatin domain can enhance resistance to cucumber mosaic virus by triggering cell death, and the patatin domain encoded by vic2 has a similar function in initiating cell death in C. parasitica. Thus, VI in fungi leads to rapid cell death and increased antimicrobial activity, and NLR-controlled VI can reduce horizontal virus transmission during hyphal fusion of incompatible individuals.

Mycovirus Transmission

A mycovirus was first reported in cultivated mushrooms in 1962. With the application of high-throughput sequencing, more and more mycoviruses of different types have been found in all four phyla of fungi. For most mycoviruses, transmission is limited to intracellular (cell-to-cell) routes, although some exceptions that break the restrictions of VI have been reported. SsHADV-1 has an extracellular phase and can infect sister species of Sclerotinia directly as virus particles. Furthermore, some vectors that facilitate mycovirus transmission have been identified. An insect, Lycoriella ingenua, can act as a transmission vector that facilitates the spread of SsHADV-1. A plant virus, cucumber mosaic virus (CMV), can naturally infect the fungus Rhizoctonia solani, and R. solani isolates can also acquire CMV from plants during infection, which suggests that CMV can be transmitted among R. solani isolates with plants as vectors.

Most mycoviruses have neither a transmission vector nor an extracellular phase and are limited to intracellular life cycles, with lateral transmission among fungal populations via hyphal fusion and vertical transmission via sexual or asexual spores. In general, mycoviruses are spread through asexual spores at high frequency. In basidiomycetous fungi, mycoviruses frequently enter basidiospores. However, in ascomycetous fungi, most mycoviruses cannot enter ascospores during sexual reproduction, with the exception of some mitoviruses and ourmia-like viruses, which may be related to mitochondrial inheritance. Horizontal transmission of mycoviruses between fungal isolates is mainly achieved through anastomosis. Therefore, hyphal anastomosis is essential for maintaining the proportion of mycoviruses in natural fungal populations.

Vegetatively compatible fusion within and between individuals occurs at high frequency, which promotes hyphal network formation and facilitates mycovirus transmission within a colony. However, most hyphal fusions between different VCGs of the same species trigger cell death. VI, as NLR-mediated innate immunity in fungi, can effectively block mycovirus transmission among fungal populations. The success of CHV1 in the biocontrol of chestnut blight in Europe encouraged similar virocontrol of other fungal pathogens with mycoviruses. The same efforts to control chestnut blight in North America failed because of the diversity of VCGs among C. parasitica isolates. In black Aspergilli, horizontal transmission of mycoviruses occurs only within the same VCG under laboratory conditions. Therefore, horizontal transmission of mycoviruses is usually considered to be limited to strains belonging to the same VCG.

Under balancing selection, diverse organisms maintain polymorphisms in genes involved in innate immunity and VI systems. In C. parasitica, VI-related genes are under selection because they restrict the transmission of debilitation-associated viruses among fungal populations. This suggests that VCG polymorphism is a way to prevent the transmission of viral disease. The influence of VI systems on virus transmission has been well studied in C. parasitica. In C. parasitica strains isolated from a natural population and carrying different vic genes, the ability of CHV1 to be transmitted between isolates is negatively correlated with the number of vic alleles that differ between them. When donor and recipient strains differed at one vic allele, the transmission frequency of CHV1 was 0.5, while the transmission frequency decreased to 0.13 or lower when VCGs differed at two or more vic loci. In C. parasitica, there is also a significant negative correlation between PCD and the frequency of CHV1 transmission. In S. sclerotiorum, an RNA virus (SsDRV) cannot be transmitted to strains that belong to different VCGs. Thus, VI is a main barrier to horizontal mycovirus transmission between different VCGs of S. sclerotiorum, and reducing the frequency of VI-mediated PCD can promote horizontal virus transmission. In the CHV1–C. parasitica system, five of the six vic loci (all except vic4) have been confirmed to restrict horizontal mycovirus transmission among VCGs.

Mycoviruses can be Horizontally Transmitted Among VCGs

As VI is the main factor restricting horizontal transmission of mycoviruses, the efficiency of mycovirus transmission can be increased by modifying VI systems. Based on the function of these five vic genes in CHV1 transmission, a super mycovirus donor strain carrying disruptions of four vic genes (vic1a-2, vic3b-1, pix6-2, and vic7a-2) has been engineered. A mixture of two such donor strains, carrying vic2-1 or vic2-2, enables CHV1 to be transmitted easily under both laboratory and natural conditions. Therefore, it could serve as an enhanced hypovirus vector by breaching the restriction imposed by VI. Similarly modified VI systems in other fungi could be applied to facilitate horizontal mycovirus transmission. Current research has revealed that some mycoviruses can escape the restriction of VI systems under natural or laboratory conditions. In S. sclerotiorum, Sclerotinia sclerotiorum deltaflexivirus 2 (SsDFV2) and Sclerotinia sclerotiorum partitivirus 1 (SsPV1) can be transmitted between different VCGs without being limited by the VI system, and SsPV1 can even be transmitted to other Sclerotinia species through hyphal contact. Two partitiviruses (HetRV3-ec1 and HetRV4-pa1) have been confirmed to be readily transmitted between different VCGs.

Potential Approaches to Enhance Mycovirus Transmission Between VCGs

VCGs show high diversity in many filamentous fungi, yet mycoviruses are still very common across different VCGs, suggesting that viruses can block the VI system or have potential transmission bridges under natural conditions. In R. necatrix, Rosellinia necatrix fusarivirus 1 (RnFV1; the RNA element N10 carried in fungal strain NW10) was horizontally transmitted between two incompatible strains after they had been co-inoculated on an apple tree for three years. It is unclear whether changes in any environmental factor helped RnFV1 to break through the VI system in R. necatrix during infection of the host plant. Thus, unlike under simple laboratory conditions, mycovirus transmission can overcome the barriers of VI systems under certain natural conditions. It has also been reported that treatment with 1.5 mM zinc chloride facilitated mycovirus transmission by increasing hyphal anastomosis and attenuating PCD in R. necatrix and some other fungi. In N. crassa, a cell death-activated zinc cluster transcription factor (CZT-1) acts as a regulator of cell death. Using UPLC-HRMS, the production of some secondary metabolites, such as a farnesyl-S-oxide analog resembling mating pheromones, was found to shift significantly in C. parasitica during vic3 incompatibility. Adding activated charcoal to the culture medium can also inhibit VI reactions and promote mycovirus transmission in R. necatrix. Some proteins located at the cell wall (such as IDI1) may be affected by the addition of activated charcoal or Ca2+/Zn2+. Therefore, mycovirus transmission can be modified by eliminating specific secretions of the strain or by suppressing VI reactions (Fig. 3). As high-throughput sequencing has provided more viral genome information, an increasing number of mycoviruses have been shown to be phylogenetically close to plant viruses, suggesting that some mycoviruses could be transmitted in a cross-kingdom manner.

Fig. 3 A potential model of horizontal mycovirus transmission that breaks through the barrier of VI systems. Strain A, strain B, and the bridge strain belong to different VCGs. When strain A and strain B contact each other, the contacting cells undergo strong PCD reactions (red dotted arrow). Treatment with zinc compounds attenuates vegetative incompatibility and can enhance the efficiency of mycovirus transmission to strains with different genetic backgrounds. A bridge strain can also improve mycovirus transmission between different VCGs; this bridge strain can be a strain with modified allorecognition (disruption of vic genes) or a strain infected by a virus (such as SsMYRV4) that suppresses host VI reactions. Mycoviruses can also be transmitted via different kinds of vectors, such as plants, mites, and insects.

R. solani isolates can acquire CMV from, and transmit it to, host plants, so CMV can be transmitted among different VCGs of R. solani in planta. Some mycoviruses are very similar to plant viruses in the order Tymovirales; these mycoviruses may replicate in both plants and fungi and may shuttle between them when pathogenic fungi infect plant tissues, so these groups of mycoviruses may bypass the barrier of VI systems via plant vectors. Conversely, a mycovirus may transiently infect a host plant, bacterium, or protozoan and then be transmitted back to fungi (Fig. 3). In the root rot fungi Heterobasidion spp., the transmission efficiency of the mycovirus HetPV15-pa1 among different VCGs increased from 0% to 50% when the donor strain was pre-infected with HetPV13-an1. Thus, co-infection with some viruses or bacteria could help the transmission of other mycoviruses among VCGs by blocking VI-mediated PCD. SsMYRV4 can suppress VI-mediated PCD and thereby overcome the barriers of VI systems in S. sclerotiorum. VI-mediated PCD is controlled by the transduction of VI signals, including recognition of strain-specific molecules by G-protein-coupled receptors (GPCRs) and NLRs, signal transduction via the MAK-2 kinase, and induction of PCD via ROS signaling (Fig. 2). Thus, suppressing the expression of key genes in these pathways would effectively block VI-mediated PCD and enable horizontal mycovirus transmission among VCGs (Fig. 3).


Viral Diseases of Agaricus bisporus, the Button Mushroom
Kerry S Burton, Leamington Spa, United Kingdom
Greg Deakin, NIAB-EMR, East Malling, United Kingdom
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Casing The upper layer (55 mm) of the bilayer system for A. bisporus cultivation, formulated by mixing peat (75%–85%) and lime/chalk (15%–25%) followed by large amounts of water.
Fruitbody Multicellular fleshy structure bearing spores and used as a food; also known as mushroom, sporophore, sporocarp and toadstool.
Hyphal anastomosis The union of two hyphae allowing cytoplasmic exchange.
Mushroom compost The nutritional substrate for cultivated A. bisporus made by solid-state fermentation of cereal straw, animal manure, gypsum and water over 14–20 days and then positioned as the lower layer (200 mm) of the bilayer cultivation system.
ORF Open reading frame.
ORFan ORF with no similarity to other described ORFs and no known function.
Phase 3 compost Mushroom compost colonised by A. bisporus mycelium for 17 days at 25°C.
Vegetative incompatibility A genetically controlled system which prevents two fungal strains from undergoing anastomosis.
VLP Virus-like particle.

Introduction

Two disease-causing viruses of the cultivated mushroom, Agaricus bisporus, have been characterized. For both diseases, the viruses have been identified as part of multiple virus infections, i.e., in the presence of other, apparently asymptomatic viruses. Four million tonnes of A. bisporus fruitbodies are cultivated annually throughout the world (value of ~US $4.7bn) and traded internationally, enabling viral mixing and movement and the accumulation of high viral loads. A. bisporus is cultivated in a bilayer system with a 55 mm non-nutritious casing layer above a 200 mm layer of composted cereal straw colonised by A. bisporus mycelium. This cultivation system is intensive, enabling up to nine crops per year to be grown in the same room, with cleaning and sterilization between crops. In the early 1950s a new disorder of the cultivated mushroom was described and named La France disease after the initial observations at the La France farm in Pennsylvania, USA. The disease has since been reported on other U.S. farms and around the world. The symptoms involve slow mycelial and fruitbody growth, malformed fruitbodies and major crop loss (up to 100%), often observed as barren zones on the mushroom growing bed. La France disease is strongly associated with a 36 nm isometric virus-like particle (VLP) encapsidating nine double-stranded RNA (dsRNA) elements. Other VLPs have been isolated from diseased tissue: the bacilliform particle (19 nm × 50 nm) of the barnavirus mushroom bacilliform virus (MBV), and 25 nm, 29 nm and 50 nm isometric particles; however, their role in disease has not been established. The nine dsRNA elements of the 36 nm particle were collectively named La France isometric virus (LIV), and later given the more systematic name Agaricus bisporus virus 1 (AbV1).
In the late 1990s a new disease syndrome was described in the UK with symptoms superficially similar to those of La France disease, i.e., yield loss (seen as bare patches with no mushrooms growing) and distorted fruitbodies, but without the characteristic presence of VLPs. Furthermore, this new disease had an additional symptom, a uniform brown discoloration of the mushroom caps. The new syndrome was named Mushroom Virus X disease (MVX disease), and it is associated with multiple viral infections involving up to 30 viral RNAs. Large-scale sequencing of the RNAs associated with MVX-infected fruitbodies identified 18 unique viruses and 8 ORFans. Of these, only a single virus, Agaricus bisporus virus 16 (AbV16), and one uncategorized non-host RNA molecule (ORFan8) have been correlated with any symptom, namely cap browning. Brown Cap Mushroom Disease (BCMD) is now viewed as a viral disease distinct from MVX. Since 2000, the prevalence of the bare patches and distortions caused by MVX has declined, while the brown cap symptoms have persisted and have been reported throughout Europe, North America and Asia. BCMD is economically important because brown discoloration reduces the quality, and therefore the value, of white mushrooms, and because the discoloration can occur after harvest, causing friction between mushroom producers and buyers. AbV1 and AbV16 are often found in co-infections with MBV. Whether any synergistic or antagonistic relationship exists between these viruses, or any of the many other viruses infecting Agaricus bisporus, remains unknown. It is of note that the levels of MBV can increase more than ten-fold in a multiple infection with AbV1 compared with isolates with undetectable levels of AbV1. However, on its own MBV is not capable of producing any of the symptoms of either disease.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21515-X

Fig. 1 Genome organization of AbV1. All segments are arranged 5′–3′ and the longest ORFs are shown as yellow annotations. Blue annotations indicate the presence of the shared motif (GGMAACGGCTAGTTKGCC). The NCBI accession is shown above each segment. Dashed lines indicate that the segment has not been sequenced; the sizes of these segments are estimates based on gel electrophoresis.

La France Disease – Agaricus Bisporus Virus 1 (AbV1)

Genome Organization

The AbV1 genome comprises six abundant, always present dsRNAs (L1, L2, L3, L4, L5 and M2) and three less abundant dsRNAs (M1, S1 and S2), with a size range of 0.78–3.6 kbp (Fig. 1). If detectable, the less abundant dsRNAs are never found in isolation. Four of the abundant dsRNAs (L1, L3, L5 and M2) and M1 have been sequenced. L1 (3.4 kbp) has a single open reading frame (ORF) which codes for an RNA-dependent RNA polymerase (RdRp) domain-containing protein of c. 122 kDa; however, the 5′ UTR of ~200 bp has not been fully sequenced, so the ORF may be slightly longer. L3 (2.8 kbp) contains a single ORF which codes for an 87 kDa protein and has a 29-nucleotide poly(A) tail on the coding strand. L1 and L3 have significant amino-acid similarity to the RdRps and structural proteins, respectively, of members of the Chrysoviridae. L5, M1 and M2 code for proteins of 82 kDa, 42 kDa and 38 kDa, respectively, none of which has homology to any known protein. The polyprotein encoded by L5 is known to be cytoplasmically located and not associated with the AbV1 virions. L3, L5 and M2 share an 18-base motif (GGMAACGGCTAGTTKGCC) in their 5′ UTRs. The less abundant segment M1 lacks this motif, and it is unknown whether the L1 5′ UTR contains it because of the incomplete sequencing.
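The shared motif is written with IUPAC ambiguity codes (M = A or C, K = G or T), so it describes a small set of concrete sequences, not a single one. The following is a minimal sketch of how such a degenerate motif could be searched for in a sequence. The function names and the toy sequence are hypothetical illustrations and are not taken from the AbV1 accessions.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

# 18-base shared motif quoted in the text for AbV1 segments L3, L5 and M2.
ABV1_MOTIF = "GGMAACGGCTAGTTKGCC"


def motif_to_regex(motif):
    """Translate an IUPAC motif into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def find_motif(sequence, motif=ABV1_MOTIF):
    """Return 0-based start positions of non-overlapping motif matches.

    RNA input is accepted by converting U to T before matching.
    """
    seq = sequence.upper().replace("U", "T")
    return [m.start() for m in motif_to_regex(motif).finditer(seq)]


# Hypothetical toy sequence (not a real AbV1 UTR): 4 flanking bases,
# one concrete instance of the motif (M = C, K = G), 4 flanking bases.
toy_utr = "ATAT" + "GGCAACGGCTAGTTGGCC" + "TTTT"
print(find_motif(toy_utr))  # [4]
```

The same helper works for the 25-base AbV16 motif, because the lookup table covers all standard ambiguity codes.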

Virion Structure and Composition

AbV1 virions are isometric particles with estimated diameters of 34–36 nm. Three polypeptides are associated with the virions: two abundant molecules of molecular mass 90 kDa and 120 kDa, and a less abundant molecule with an estimated mass of either 115 kDa or 129 kDa. The less abundant polypeptide is the RdRp coded by L1. The two abundant polypeptides are presumably structural constituents of the capsid. The 90 kDa protein is coded by the L3 dsRNA, while the 120 kDa molecule is probably coded by L2, as this is the only unsequenced dsRNA of sufficient length to produce a protein of this size. It is not known whether single virions encapsidate all nine dsRNA elements (or whether this is physically possible), or whether, as in other members of the family Chrysoviridae, the genomic segments are encapsidated separately.

Taxonomy and Classification

Phylogenetic analysis of the RdRp coding region of the L1 segment indicates that Agaricus bisporus virus 1 is closely related to members of the family Chrysoviridae, which have dsRNA genomes, and is positioned within the genus Betachrysovirus (Fig. 2). It is of note that, unlike most of the classified fungal alphachrysoviruses, which are asymptomatic in their hosts, members of the genus Betachrysovirus, such as Magnaporthe oryzae chrysovirus 1 and Alternaria alternata chrysovirus 1, typically produce symptoms of retarded growth.

Fig. 2 Phylogenetic relationship of the Chrysoviridae and related species. Alignment of the predicted amino acid sequences of the given accessions was performed with MAFFT v7. Uninformative regions of the alignments were removed using Gblocks v0.91. Phylogenetic trees were constructed using PhyML 3 with SMS for best model selection. The numbers refer to the percentage branch support from 1000 bootstrap repetitions. Branches with less than 45% bootstrap support have been collapsed.

Biological Properties (Virus Host Relationships)

As well as La France disease, the disease has been variously known as X-disease, die-back, watery stipe and brown disease, reflecting the wide range of associated symptoms. The major characteristic symptom is localized bare patches on mushroom growing beds (die-back), caused by failure of the mycelium to colonize the casing and thus failure of mushrooms to emerge. Around the periphery of these die-back zones, the fruitbodies often fail to grow properly and show a characteristic “drumstick” phenotype, in which stipes are elongated or swollen and the caps are small and misshapen. Other symptoms include brown discoloration of internal tissue (as streaks), bent stipes and a tendency for premature maturation. The severity of the disease varies from a very few abnormal mushrooms to complete loss of the crop. The range of symptoms seems to be affected by the host genotype, by cultural practices, and by the timing of the initial infection, with more devastating symptoms when infection occurs earlier in the mushroom production cycle. The presence of the AbV1 dsRNAs in a mushroom fruitbody is not in itself sufficient to cause the disease phenotype, though symptomatic fruitbodies from an infected crop always contain the six major dsRNAs.

Epidemiology

AbV1 is spread between mushroom crops via anastomosis, predominantly when infected spores germinate on the mushroom bed, or directly via infected mycelium. It has been noted that infected mushrooms tend to mature early and release a huge number of spores. As with the overwhelming majority of fungal viruses, there are no known external routes of infection or natural vectors.

Diagnostics and Identification

The 36-nm particles of AbV1 can be detected by electron microscopy. Gel separation of the dsRNAs has also been used successfully to detect the virus. The most sensitive techniques are PCR-based, using primers designed against the known RNA sequences of the virus.

Fig. 3 Genome organization of AbV16 and ORFan8. All segments are arranged 5′–3′ and the longest ORFs are shown as yellow annotations. Blue annotations show the presence and location of the shared motif (STTCAGSGTBBVWSHAGCRGTAAWT). An indication of segment variability is given as the number of single nucleotide polymorphisms per kilobase of sequence (snp/kb). The NCBI accession is shown above each segment.

Fig. 4 Phylogenetic relationship of the members of the proposed family Ambsetviridae compared to selected members of the alphavirus-like supergroup. Alignment of RdRp domains in the predicted amino acid sequences of members of the alphavirus-like supergroup was performed with MAFFT v7. Uninformative regions of the alignments were removed using Gblocks v0.91. Phylogenetic trees were constructed using neighbor joining. The numbers refer to the percentage branch support from 100 bootstrap repetitions. Branches with less than 50% bootstrap support have been collapsed.

Fig. 5 Typical symptom expression of Brown Cap Mushroom Disease (a) single brown in an experimental bed, (b) single brown in commercial production, (c) group of browns amongst white mushrooms, (d) off-white and brown mushrooms.

Control

There is no natural immunity to AbV1 in any cultivated A. bisporus strain. There is a degree of vegetative incompatibility between strains: browns, creams and whites do not readily anastomose, so rotating strains at the mushroom farm can provide some protection. However, successful control of La France disease has been achieved by substantial improvements in hygiene at mushroom farms: steam treatment of the growing rooms and equipment, and the use of air filtration to exclude infected mushroom spores.

Brown Cap Mushroom Disease – Agaricus Bisporus Virus 16 (AbV16)

Genome and Virion Structure

AbV16 is a non-particulate, multipartite virus with four segments ranging in size from 558 to 1949 bases (Fig. 3). The ORFs of AbV16 RNA1 and AbV16 RNA2 encode an RdRp and a viral methyltransferase, respectively. The functions of AbV16 RNAs 3 and 4 are unknown. Each of the AbV16 RNAs contains a 25-base motif in the 3′ UTR. ORFan8 does not contain the 3′ motif, but it is included as part of the causative agent of BCMD because its levels correlate with the degree of cap browning and with the levels of the AbV16 components. ORFan8 and the AbV16 segments are polyadenylated and are considered to represent the genome of a positive-sense, single-stranded RNA virus on the basis of their close relationship to other single-stranded RNA viruses. No virion structure has been identified for this virus.

Classification

AbV16 has been placed within the alphavirus-like supergroup by phylogenetic analysis of the Pfam RdRp_2 domain of AbV16 RNA1. It is proposed to be a member of a new viral family, the Ambsetviridae, which includes homologs of RNAs not previously classified as viruses but identified in Transcriptome Shotgun Assembly and Expressed Sequence Tag databases (Fig. 4). A recently described virus identified in Rosellinia necatrix is a probable member of this family. The organisms identified in these searches suggest that the Ambsetviridae is a family of fungal and plant viruses. Two of these organisms have homologs of three components of AbV16, and a further seven have homologs of two components. No homologs of the viral segment AbV16 RNA4 or of ORFan8 have been found.

Viral Expression and Disease Development

AbV16 and ORFan8 are present in infected mushrooms at two distinct levels. Low-level infection results in slightly discolored mushroom caps, detectable only by precision colorimeters. In visibly brown mushrooms, however, the RNA levels are 1000–10,000 times higher; intermediate levels have not been detected. The factors determining this transition from persistent to acute infection, i.e., from low to high abundance, are not fully understood. There appears to be a stochastic element, as brown mushrooms can be observed apparently randomly distributed throughout a bed of white mushrooms (Fig. 5). It is known that the transition occurs early in fruitbody development, so it may be associated with host cell division. It is possible that the transition also involves the “quasi-species” concept, in which the generation of a wide range of sequence variants overcomes host defences, as AbV16 RNAs 1 and 2 show high sequence variability.

Epidemiology

Commercial inoculum for mushroom culture is produced under high standards of hygiene, testing and documentation of axenic cultures of A. bisporus. Infection is likely to occur at mushroom farms or at “phase 3 compost” producers, from infected sources (cultivated or wild A. bisporus) and in association with inadequate hygiene procedures. The intra-mycelial movement of AbV16 within compost culture has been shown to be slow, at 40 mm/day (compared with 120 mm/day and >200 mm/day for two other A. bisporus viruses, AbV12 and AbV7, respectively). Therefore, a point infection of AbV16 on a mushroom growing bed will have limited scope to spread during the short growing period. Widespread outbreaks of BCMD are likely to have resulted from infection during phase 3 compost production and/or compost mixing, transport and distribution. A further interesting feature of AbV16 is that its levels in mycelium in compost or laboratory culture can decline dramatically over time to become undetectable. The reasons for this decline are unknown; however, AbV16 has been found only as part of a multiple viral infection, and it is possible that inter-viral competition associated with specific viral combinations plays a role.
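The consequence of the slower movement rate can be illustrated with simple arithmetic. The sketch below compares AbV16 and AbV12 using the rates quoted above, and uses the 200 mm compost layer depth from the Introduction as a reference distance. It assumes a constant linear rate, which is a simplification made here for illustration; the function names are hypothetical.

```python
# Rates of intra-mycelial movement in compost culture quoted in the text (mm/day).
RATE_MM_PER_DAY = {"AbV16": 40.0, "AbV12": 120.0}

# Depth of the compost layer in the bilayer cultivation system (from the text).
COMPOST_DEPTH_MM = 200.0


def spread_distance_mm(virus, days):
    """Linear distance covered after `days`, assuming a constant rate."""
    return RATE_MM_PER_DAY[virus] * days


def days_to_cover_mm(virus, distance_mm):
    """Days needed to cover `distance_mm`, assuming a constant rate."""
    return distance_mm / RATE_MM_PER_DAY[virus]


for virus in RATE_MM_PER_DAY:
    days = days_to_cover_mm(virus, COMPOST_DEPTH_MM)
    print(f"{virus}: {days:.1f} days to cover {COMPOST_DEPTH_MM:.0f} mm")
```

Under these assumptions AbV16 needs three times as long as AbV12 to cover any given distance, which is consistent with the limited enlargement expected from a point infection.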

Diagnosis

High-level infection with AbV16 and ORFan8 can be detected visually as brown, discolored mushroom fruitbodies, and low-level infection can be detected with a sensitive colorimeter (ΔE > 15 or L/b < 12). AbV16 and ORFan8 can be detected qualitatively by agarose gel separation or PCR. The increased sensitivity of quantitative PCR enables detection of early, low-level infection in mycelium in compost/casing and in fruitbodies.


Viral Killer Toxins
Manfred J Schmitt and Björn Becker, Saarland University, Saarbrücken, Germany
© 2021 Elsevier Ltd. All rights reserved.
This is an update of M.J. Schmitt, Viral Killer Toxins, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00577-X.

Nomenclature

IRE Internal replication enhancer
TRE Terminal recognition element
ERAD Endoplasmic reticulum (ER)-associated protein degradation
NLS Nuclear localization sequence
bp Base pairs
ds Double-stranded
RNAi RNA interference
VBS Viral binding site

Glossary
ER-associated degradation (ERAD) A cellular quality control mechanism ensuring efficient removal of misfolded and/or unassembled proteins from the endoplasmic reticulum (ER) lumen and their subsequent elimination by the cytoplasmic ubiquitin–proteasome system.
H/KDEL receptors Seven-transmembrane proteins primarily required for the retention of soluble ER resident proteins in the endoplasmic reticulum in yeast and mammalian cells.
Heterokaryon Coexistence of two or more genetically different nuclei in a common cytoplasm.
Importins A family of proteins that transport macromolecules into the eukaryotic nucleus.
Killer virus system A killer virus system consists of a helper totivirus and a satellite double-stranded (ds) RNA encoding the unprocessed precursor of a secreted protein toxin that also gives functional immunity.
Preprotoxin Toxin precursor in the cytoplasm of a killer yeast which is post-translationally imported into the secretory pathway for toxin processing, maturation, and secretion.
Satellite dsRNA An encapsidated dsRNA that is dependent on a helper virus for encapsidation and replication.
Spheroplast A yeast cell whose cell wall has been enzymatically removed.

Introduction

Although double-stranded (ds) RNA viruses had previously been identified in filamentous fungi, dsRNA viruses in yeast were identified as determinants of the killer phenomenon in Saccharomyces cerevisiae. The killer phenomenon is now known to be associated with cytoplasmically inherited totiviruses (members of the genus Totivirus in the family Totiviridae), which are frequently found in various yeast genera. Among these, the killers of S. cerevisiae, Zygosaccharomyces bailii, Hanseniaspora uvarum, Torulaspora delbrueckii and Ustilago maydis (the latter being the cause of corn smut) are the best characterized. Interestingly, the absence of RNA interference (RNAi) seems to be a prerequisite for the existence of dsRNA viruses, as killer systems have so far been found only in RNAi-deficient yeast species, whereas RNAi-proficient yeasts did not develop any killer systems during evolution. Characteristic of all killers is the secretion of a protein toxin that is lethal to sensitive strains of different species and genera without direct cell-to-cell contact. Cell killing is usually achieved in a receptor-mediated process that requires initial toxin binding to components of the outer yeast cell surface (such as β-1,6-D-glucans, α-1,3-mannoproteins, or chitin) and subsequent toxin transfer to a secondary plasma membrane receptor. Depending on the toxin, lethality can ultimately be caused by plasma membrane damage, G1- or S-phase cell-cycle arrest, apoptosis, and/or rapid inhibition of DNA synthesis. In the yeasts S. cerevisiae, Z. bailii, H. uvarum and T. delbrueckii, as well as in the maize smut fungus U. maydis, the killer phenotype is cytoplasmically inherited and caused by infection with dsRNA viruses. Since the majority of mycoviruses are noninfectious and symptomless in the corresponding host, they are often classified as cryptic viruses or virus-like particles (VLPs).
All known fungal viruses spread vertically by cell–cell mating and/or heterokaryon formation. In S. cerevisiae, diploids formed by mating of a killer with a sensitive strain are likewise killers, as are all haploid progeny of subsequent meiosis. In contrast, virus-free strains are usually sensitive nonkillers, while those containing Saccharomyces cerevisiae virus L-A (ScV-L-A) and a toxin-encoding M-dsRNA are killers (see below). Sensitive strains survive mating with killers, and cytoplasmic mixing of the multiple M-dsRNA copies during zygosis accounts for the inheritance pattern during meiosis. Extracellular spread of virions is generally hampered by the rigid yeast and fungal cell wall, and fungal viruses have adopted a strategy of transmission via mating and hyphal fusion (which occurs frequently in nature), making an extracellular route of spread dispensable. While some of these viruses are associated with adverse phenotypic effects on the fungus (such as La France disease in Agaricus bisporus, plaque formation in Penicillium, and hypovirulence in Cryphonectria), dsRNA viruses and their associated satellite dsRNAs are responsible


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21280-6

Viral Killer Toxins

Table 1 dsRNA-encoded killer toxin systems of various yeast species

Totivirus/satellite dsRNA | Virus host | Function of virus | dsRNA (kb) | Encoded protein(s)
ScV-L-A | Saccharomyces cerevisiae | Helper virus | LA (4.6) | Gag, major capsid protein; Gag-Pol, RDRPᵃ
ScV-M1 | Saccharomyces cerevisiae | Satellite dsRNA | M1 (1.6) | K1 preprotoxin
ScV-M2 | Saccharomyces cerevisiae | Satellite dsRNA | M2 (1.5) | K2 preprotoxin
ScV-M28 | Saccharomyces cerevisiae | Satellite dsRNA | M28 (1.8) | K28 preprotoxin
ScV-Mlus | Saccharomyces cerevisiae | Satellite dsRNA | Mlus (2.1–2.3) | Klus preprotoxin
UmV-H | Ustilago maydis | Helper virus | H (6.1) | Gag, major capsid protein; Gag-Pol, RDRPᵃ
UmV-P1 | Ustilago maydis | Satellite dsRNA | MP1 (1.4) | KP1 preprotoxin
UmV-P4 | Ustilago maydis | Satellite dsRNA | MP4 (1.0) | KP4 toxin
UmV-P6 | Ustilago maydis | Satellite dsRNA | MP6 (1.2) | KP6 preprotoxin
HuV-L | Hanseniaspora uvarum | Helper virus | LA (4.6) | Gag, major capsid protein; Gag-Pol, RDRPᵃ
HuV-M | Hanseniaspora uvarum | Satellite dsRNA | MHu (1.0) | KT470 toxin precursor
ZbV-L | Zygosaccharomyces bailii | Helper virus | LZb (4.6) | Gag, major capsid protein; Gag-Pol, RDRPᵃ
ZbV-M | Zygosaccharomyces bailii | Satellite dsRNA | MZb (2.1) | Prepro-zygocin
TdV-LAbarr | Torulaspora delbrueckii | Helper virus | LA (4.6) | Gag, major capsid protein; Gag-Pol, RDRPᵃ
TdV-Mbarr-1 | Torulaspora delbrueckii | Satellite dsRNA | Mbarr-1 (1.7) | Kbarr-1 preprotoxin

ᵃ RDRP, RNA-dependent RNA polymerase.
Adapted from Schmitt, M.J., Breinig, F., 2002. The viral killer system in yeast: From molecular biology to application. FEMS Microbiology Reviews 26, 257–276, with permission from Blackwell Publishing.

for a killer phenotype which is based on the secretion of a polypeptide toxin (killer toxin) that is lethal to a variety of sensitive yeast and fungal strains. With the exception of toxin-secreting strains of Z. bailii, killer toxin production is usually associated with specific immunity, protecting the corresponding killer yeast against its own viral toxin.
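The helper/satellite relationships listed in Table 1 can be held in a small lookup structure. The following is a minimal sketch in Python, not part of the original article: the function names are hypothetical, and pairing each satellite with the helper virus listed for the same host is an assumption based on the layout of the table.

```python
# Rows transcribed from Table 1: (virus, host, role).
TABLE_1 = [
    ("ScV-L-A", "Saccharomyces cerevisiae", "helper"),
    ("ScV-M1", "Saccharomyces cerevisiae", "satellite"),
    ("ScV-M2", "Saccharomyces cerevisiae", "satellite"),
    ("ScV-M28", "Saccharomyces cerevisiae", "satellite"),
    ("ScV-Mlus", "Saccharomyces cerevisiae", "satellite"),
    ("UmV-H", "Ustilago maydis", "helper"),
    ("UmV-P1", "Ustilago maydis", "satellite"),
    ("UmV-P4", "Ustilago maydis", "satellite"),
    ("UmV-P6", "Ustilago maydis", "satellite"),
    ("HuV-L", "Hanseniaspora uvarum", "helper"),
    ("HuV-M", "Hanseniaspora uvarum", "satellite"),
    ("ZbV-L", "Zygosaccharomyces bailii", "helper"),
    ("ZbV-M", "Zygosaccharomyces bailii", "satellite"),
    ("TdV-LAbarr", "Torulaspora delbrueckii", "helper"),
    ("TdV-Mbarr-1", "Torulaspora delbrueckii", "satellite"),
]


def helper_for(satellite):
    """Return the helper virus listed for the same host as `satellite`.

    ASSUMPTION: helper and satellite are paired by shared host.
    """
    host = next(h for v, h, role in TABLE_1 if v == satellite and role == "satellite")
    return next(v for v, h, role in TABLE_1 if h == host and role == "helper")


def satellites_of(helper):
    """Return all satellite dsRNAs listed for the host of `helper`."""
    host = next(h for v, h, role in TABLE_1 if v == helper and role == "helper")
    return [v for v, h, role in TABLE_1 if h == host and role == "satellite"]


if __name__ == "__main__":
    print(helper_for("ZbV-M"))
    print(satellites_of("ScV-L-A"))
```

Calling `helper_for` or `satellites_of` with a name that is not in the table raises `StopIteration`; a fuller version would report such input explicitly.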

DsRNA Viruses and Killer Phenotype Expression in S. cerevisiae

On the basis of killing profiles and the lack of cross-immunity, four major killer types (K1, K2, K28, Klus) have so far been identified in S. cerevisiae. Each of them produces a specific killer toxin and a self-protecting immunity component. Killer phenotype expression correlates with the presence of two types of dsRNA stably persisting in the cytoplasm of the infected host: the genomic dsRNA of the helper virus, ScV-L-A, and one of four toxin-coding satellite dsRNAs (ScV-M1, ScV-M2, ScV-M28, or ScV-Mlus). The ScV-L-A and M-dsRNAs are separately encapsidated into capsids encoded by ScV-L-A dsRNA and are present in high copy number in the yeast cell cytoplasm. In vivo, ScV-L-A neither confers a phenotype nor leads to host cell lysis or slower cell growth. While the killer phenotype can be transmitted to sensitive yeast cell spheroplasts (harboring ScV-L-A) by transfection with purified ScV-M (either during mating or cotransformation with a dsDNA plasmid), extracellular transmission occurs rarely, if at all, in nature. The survival strategy adopted by these dsRNA viruses appears to be a balanced host interaction resulting in stable maintenance, little if any growth disadvantage, and vertical transmission. Mechanisms for exiting and entering the host cell through its tough and rigid cell wall are rendered unnecessary by relatively efficient horizontal transmission during the frequent zygosis events in yeast. Acquisition of a toxin-encoding M satellite dsRNA provides positive selection for both this dsRNA and ScV-L-A, since virus-free segregants are killed.
As summarized in Table 1, the linear dsRNA genome of ScV-L-A contains two open reading frames (ORFs) on its plus-strand RNA: ORF1 encodes the major capsid protein Gag, which is necessary for encapsidation and viral particle structure, and ORF2 encodes the RNA-dependent RNA polymerase Pol, which is expressed in vivo as a Gag–Pol fusion protein by a −1 ribosomal frameshift event. In contrast to ScV-L-A, each M-dsRNA genome contains a single ORF coding for a preprotoxin (pptox), the unprocessed precursor of the mature and secreted toxin that also confers functional immunity. Since each toxin-coding ScV-M dsRNA depends on the coexistence of ScV-L-A for stable maintenance and replication, the killer viruses resemble classical satellites of ScV-L-A. Although the presence of all four M-dsRNAs with different killer specificities in a single cell is excluded at the replicative level of the M genomes, this limitation can be bypassed by introducing cDNA copies of other pptox genes into a natural killer strain. In this way, the artificial construction of a stable triple-killer strain that produces three different virus toxins (K1, K2, and K28) and simultaneously shows multiple toxin immunity is feasible. As is typical for many dsRNA viruses, the replication cycles of ScV-L-A and M-dsRNAs depend on the presence of specific packaging signals at or near the 3′-end of the viral single-stranded RNA (ssRNA) transcript. These RNA regions function as viral binding sites (VBSs) and consist of a stem–loop structure whose stem is interrupted by an unpaired protruding A residue (Fig. 1). For replication of ScV-L-A, two sequence elements are essential: an internal replication enhancer (IRE), which is essentially indistinguishable from the VBS, and a small stem–loop structure 5 bp from the 3′-terminus (3′-TRE). While ScV-L-A contains just a single VBS element, the toxin-coding transcripts of M1, M28, and MZb (encoding the Z. bailii viral toxin zygocin) each have two such VBS domains. Interestingly, no VBS elements are present in the dsRNA sequences of Mlus and Mbarr-1, suggesting that alternative cis-active elements exist in the 3′-RNA region and act as signals for replication, replication enhancement, and packaging. In vivo, these VBS elements are


Fig. 1 Comparison of the 5′- and 3′-ssRNA sequences in ScV-L-A and in the toxin-coding transcripts of its satellites ScV-M1, ScV-M28, ScV-Mlus, TdV-Mbarr-1, and ZbV-M. Potential cis-active sequences within the 3′-termini as well as poly-A-rich regions are indicated (VBS, viral binding site; IRE, internal replication enhancer; TRE, 3′-terminal recognition element). Lengths of the (+)ssRNAs are shown in base pairs (bp).

cis-active sequences that are recognized by Gag–Pol and subsequently packaged into new viral particles. The replicase reaction on the (+)ssRNA template takes place in viro (i.e., within the viral capsid) and requires a correct 3′-end sequence and structure. Within the intact and mature virion, conservative transcription of the plus-strand from the dsRNA template requires recognition of its very 5′-terminal sequence by Gag–Pol. In M28 (as in other dsRNAs), the plus-strand initiates with 5′-GAAAA(A); since there is little additional conserved sequence immediately downstream, this terminal recognition element (5′-TRE) may be all that is necessary for transcription initiation (Fig. 1).
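The −1 ribosomal frameshift that yields the Gag–Pol fusion protein, described earlier in this section, can be illustrated with a toy example. The sketch below is written in Python and is not taken from the original article. The RNA string and the slip position are hypothetical and do not represent ScV-L-A sequence; the code only shows how stepping back one nucleotide changes the codons that are read downstream.

```python
def codons(rna, start=0):
    """Split an RNA string into complete codons, beginning at index `start`."""
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]


def read_with_minus_one_shift(rna, slip_after):
    """Read `slip_after` nucleotides in frame 0, then step back one
    nucleotide (a -1 shift) and read the remainder in the new frame."""
    if slip_after % 3 != 0 or not 0 < slip_after <= len(rna):
        raise ValueError("slip_after must be a positive multiple of 3 within the RNA")
    upstream = codons(rna[:slip_after])
    downstream = codons(rna, start=slip_after - 1)
    return upstream, downstream


# Hypothetical toy transcript (NOT an ScV-L-A sequence).
TOY_RNA = "AUGGCAGGGUUUACGCCAUGA"

if __name__ == "__main__":
    print("frame 0 only:  ", codons(TOY_RNA))
    up, down = read_with_minus_one_shift(TOY_RNA, slip_after=12)
    print("before shift:  ", up)
    print("after -1 shift:", down)
```

In the toy transcript, reading in frame 0 ends at the stop codon UGA, whereas the shifted frame reads through that position. This mirrors, in a highly simplified way, how a frameshift lets translation continue from one ORF into an overlapping second ORF.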

Viral Replication Cycle

Yeast ScV-L-A virions are noninfectious icosahedral particles 39 nm in diameter that show certain similarities to core particles of mammalian orthoreoviruses and rotaviruses. Each ScV-L-A particle contains a single copy of the 4.6 kb L-A dsRNA genome, which is encapsidated by 60 asymmetric dimers of the 76 kDa coat protein Gag and two copies of the 171 kDa Gag–Pol fusion protein. During conservative replication, the single-stranded plus-strand RNA (L-A(+)ssRNA) is transcribed within the viral particle (in viro) and subsequently extruded into the cytoplasm, where it serves as (1) messenger RNA for translation into the viral proteins Gag and Gag–Pol and (2) RNA template that is packaged into new viral particles. Once coat assembly is completed, Gag–Pol functions as replicase, synthesizes a minus-strand, and generates the dsRNA genome of the mature virus. The replication cycle of the toxin-coding M satellite dsRNA resembles that of L-A, except that each M virion can accept two copies of the smaller M-dsRNA genome before the ssRNA transcript is extruded into the cytoplasm; by analogy to certain DNA bacteriophages, this phenomenon has been named ‘headful packaging’.
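The particle composition and the ‘headful’ idea can be checked with simple arithmetic. The sketch below, in Python, is an illustration rather than part of the original article. It uses the figures quoted above and the satellite sizes from Table 1, and it rests on two assumptions: the two Gag–Pol copies are counted in addition to the 120 Gag subunits, as the sentence reads, and the capsid is assumed to hold at most one L-A genome equivalent of dsRNA.

```python
# Values quoted in the text and in Table 1.
GAG_KDA = 76            # coat protein Gag
GAG_POL_KDA = 171       # Gag-Pol fusion protein
GAG_DIMERS = 60         # asymmetric Gag dimers per particle
GAG_POL_COPIES = 2
L_A_GENOME_BP = 4600    # 4.6 kb

# Satellite sizes in bp (upper bound used for Mlus, listed as 2.1-2.3 kb).
M_SATELLITE_BP = {"M1": 1600, "M2": 1500, "M28": 1800, "Mlus": 2300}


def capsid_protein_mass_kda():
    """Total protein mass per particle.

    ASSUMPTION: Gag-Pol copies are counted on top of 120 Gag subunits.
    """
    return GAG_DIMERS * 2 * GAG_KDA + GAG_POL_COPIES * GAG_POL_KDA


def two_copies_fit(m_bp, capacity_bp=L_A_GENOME_BP):
    """ASSUMPTION: capsid capacity equals one L-A genome (4.6 kb)."""
    return 2 * m_bp <= capacity_bp


if __name__ == "__main__":
    print("capsid protein mass (kDa):", capsid_protein_mass_kda())
    for name, size in M_SATELLITE_BP.items():
        print(name, "two copies fit:", two_copies_fit(size))
```

Under these assumptions the protein shell amounts to roughly 9.5 MDa, and two copies of every S. cerevisiae M-dsRNA listed in Table 1 stay within one L-A genome equivalent.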

Viral Preprotoxin Processing and Toxin Maturation

In totivirus-infected killer yeasts, the toxin-encoding M(+)ssRNA transcript is translated on free cytosolic ribosomes into a pptox precursor, which is post-translationally imported into the secretory pathway for further processing, maturation, and toxin secretion. Interestingly, intracellular pptox processing in killer strains of either S. cerevisiae (K1, K2, K28), Z. bailii (zygocin), or U. maydis


Fig. 2 Analogy of toxin precursor processing in totivirus-infected killers of S. cerevisiae, Z. bailii, T. delbrueckii, and U. maydis. Schematic outline of the known and predicted precursor proteins (preprotoxins) encoding either a heterodimeric α/β toxin (K1, K28, Klus, and Kbarr-1) or a monomeric virus toxin (K2, zygocin, KP4). Signal peptidase (SP) cleavage sites as well as Kex2p endopeptidase and Kex1p carboxypeptidase processing sites are indicated. Potential N-glycosylation sites are indicated with black circles (aa, amino acids).

(KP4) has been demonstrated to be mechanistically conserved, resulting in the secretion of a biologically active monomeric or α/β heterodimeric virus toxin. Based on hydrophobicity plots and in silico sequence analysis, a preprotoxin structure is also predicted for the Klus and Kbarr-1 killer toxins; however, further experimental data are required to determine the exact nature and function of the postulated subunits (Fig. 2). For K28 killer strains containing ScV-M28, pptox processing has been intensively studied and is well understood. Mature K28 toxin is a heterodimer whose α- and β-subunits are covalently linked by a single disulfide bond. Since the unprocessed toxin precursor resembles a secretory protein, it contains a hydrophobic signal sequence for pptox import into the endoplasmic reticulum (ER) lumen, followed by the toxin subunits α (10.5 kDa) and β (11.0 kDa), which are separated from each other by a potentially N-glycosylated γ-sequence (Fig. 2). During passage through the yeast secretory pathway, the K28 toxin precursor is enzymatically processed to the biologically active heterodimer in a way that is highly homologous to prohormone (e.g., preproinsulin) conversion in mammalian cells. In a late Golgi compartment, the N-glycosylated γ-sequence is removed by the furin-like endopeptidase Kex2p, the C-terminus of β is trimmed by carboxypeptidase Kex1p, and the biologically active protein is secreted as a 21 kDa α/β heterodimer whose β-C-terminus carries a four-amino-acid epitope (HDEL) that represents a classical ER retention signal. Since


Fig. 3 Preprotoxin processing and toxin secretion in an ScV-M28-infected killer yeast. After in vivo translation of the preprotoxin-coding killer virus transcript, the toxin precursor is post-translationally imported into the endoplasmic reticulum (ER) through the Sec61 complex. Signal peptidase (SP) cleavage in the ER lumen removes the N-terminal signal sequence (pre-region), and protoxin folding is mediated by lumenal ER chaperones. The intervening γ-sequence is N-glycosylated and a single disulfide bond between α and β is generated. In a late Golgi compartment, Kex2p endopeptidase cleaves the pro-region and removes the γ-sequence, and carboxypeptidase Kex1p trims the C-termini of both subunits, leading to the secretion of mature α/β toxin whose β-C-terminal HDEL motif is uncovered and thus accessible for interaction with the HDEL receptor of the target cell. SV, secretory vesicle; CW, cell wall; PM, plasma membrane. Adapted from Schmitt, M.J., Breinig, F., 2002. The viral killer system in yeast: From molecular biology to application. FEMS Microbiology Reviews 26, 257–276, with permission from Blackwell Publishing.

this signal is initially masked by a terminal arginine residue (HDELR), ER retention of the toxin precursor is prevented until the protoxin enters a late Golgi compartment where Kex1p cleavage uncovers the HDEL signal of the toxin (Fig. 3).
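The masking and unmasking of the ER retention signal can be pictured as a simple string operation on the C-terminus of the β-subunit. The following Python sketch is illustrative only and is not part of the original article. The function names are hypothetical, only the C-terminal residues HDELR are taken from the text, and the removal of any C-terminal basic residue (Arg or Lys) is an assumption about Kex1p-like trimming that goes slightly beyond the single arginine mentioned here.

```python
# ASSUMPTION: Kex1p-like trimming removes C-terminal basic residues (R or K).
BASIC_RESIDUES = {"R", "K"}


def kex1p_trim(peptide):
    """Remove C-terminal basic residues from a one-letter peptide string."""
    while peptide and peptide[-1] in BASIC_RESIDUES:
        peptide = peptide[:-1]
    return peptide


def has_exposed_hdel(peptide):
    """True if the peptide ends in the ER retention motif HDEL."""
    return peptide.endswith("HDEL")


if __name__ == "__main__":
    precursor_tail = "HDELR"   # masked signal, as described in the text
    mature_tail = kex1p_trim(precursor_tail)
    print(precursor_tail, "exposed HDEL:", has_exposed_hdel(precursor_tail))
    print(mature_tail, "exposed HDEL:", has_exposed_hdel(mature_tail))
```

The precursor tail HDELR does not end in HDEL and would therefore not be recognized as a retention signal, whereas the trimmed tail does.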

Endocytosis and Intracellular Transport of the K28 Virus Toxin

In contrast to most virally encoded yeast killer toxins, which do not enter their host but rather kill a sensitive cell by disrupting plasma membrane function, K28 is taken up by receptor-mediated endocytosis and subsequently targeted to the secretory pathway. After initial binding to α-1,3- and/or α-1,2-mannotriose side-chains of a 185 kDa cell wall mannoprotein, K28 interacts with its secondary toxin receptor, the HDEL receptor Erd2p, at the plasma membrane level (Fig. 4). Essential for this interaction is a short amino acid motif at the carboxy-terminus of the K28 β-subunit (HDEL), which normally functions as an ER targeting/retention signal for ER-resident chaperones such as Kar2p/BiP and Pdi1p. Once the K28/Erd2p complex is endocytosed, K28 is retrogradely transported from endosomes via the Golgi to the ER, from where the toxin retrotranslocates into the cytosol to finally reach its target(s) in the nucleus. Owing to the high-affinity binding between the toxin’s HDEL motif and the cellular HDEL receptor within the low-pH environment of the endosomal and Golgi compartments, the receptor/toxin complexes likely remain associated until they reach the neutral pH inside the ER. However, further research is needed to fully understand how the toxin/receptor complexes are recognized in vivo by the endocytotic machinery and how internalization and retrograde transport through the different cellular compartments are regulated. In fact, endocytotic uptake and retrograde transport is a common strategy used by certain prototypes of bacterial toxins such as Pseudomonas exotoxin A, Escherichia coli heat-labile toxin (HLT), and Shiga toxin. All these protein toxins are members of the family of microbial A/B toxins, which are usually internalized by receptor-mediated endocytosis, followed by reverse secretion via the Golgi and ER.
Interestingly, many of these toxins contain putative ER retention signals at their C-termini, and H/KDEL-dependent mechanisms have therefore been postulated to be of major importance for toxin entry into mammalian cells. In this respect, a major difference between the yeast K28 virus toxin and bacterial A/B toxins is that K28 itself is produced and secreted by a eukaryotic (yeast) cell; the C-terminal ER-targeting signal in the toxin precursor is therefore initially masked by a terminal arginine residue, which ensures successful pptox passage through the early secretory pathway. Once the toxin has reached a late Golgi compartment, Kex1p cleavage removes the β-C-terminal arginine residue and thereby uncovers the ER-targeting signal of the virus toxin. To ensure proper access of the ER-targeting signal to the H/KDEL receptor of the sensitive target cell, many A/B toxins (including the yeast K28 virus toxin) contain a unique disulfide bond at or near the C-terminus. Consequently,


Fig. 4 Schematic model of the endocytotic uptake and retrograde transport of K28 into the ER. In a first step, the β-subunit of the mature K28 α/β heterodimer binds in an energy-independent process to α-1,3- and/or α-1,2-mannotriose side-chains of the primary mannoprotein receptor (R1) in the yeast cell wall (CW). In a second step, K28 crosses the cell wall and subsequently interacts with the yeast HDEL receptor Erd2p (R2) at the plasma membrane level. The C-terminal HDEL motif of the toxin’s β-subunit is crucial for the toxin/receptor interaction. Finally, each receptor/toxin complex is internalized by clathrin-mediated endocytosis and retrogradely transported from endosomes through the Golgi apparatus into the ER, where K28 is released from the HDEL receptor and subsequently retrotranslocated into the yeast cytosol. K28 binding to and/or release from the HDEL receptor Erd2p critically depends on the pH of the respective subcellular compartment: receptor/toxin complexes are stably associated in a more acidic environment (pH < 6.8), which is typical for endosomes and Golgi, whereas the complexes dissociate at the neutral pH of the ER lumen (pH > 6.8).

mutant toxin variants with altered inter- and/or intra-subunit disulfide bonding are nontoxic in vivo because they are unable to reach their intracellular target. Thus, disulfide bond formation in microbial and viral A/B toxins is of major importance to ensure that the toxins are competent to interact with the H/KDEL receptor of the particular target cell.
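The pH dependence of the K28/Erd2p interaction along the retrograde route can be expressed as a simple threshold rule. The Python sketch below is an illustration and not part of the original article. Only the threshold of pH 6.8 comes from the legend of Fig. 4; the pH values assigned to each compartment are hypothetical placeholders, and the text does not say what happens at exactly pH 6.8, so the sketch arbitrarily treats that value as dissociating.

```python
PH_THRESHOLD = 6.8  # value quoted in the legend of Fig. 4

# HYPOTHETICAL pH values for illustration; only the threshold is from the text.
COMPARTMENT_PH = {"endosome": 6.0, "Golgi": 6.4, "ER lumen": 7.2}


def complex_is_associated(ph):
    """Receptor/toxin complexes are stable below the threshold pH."""
    return ph < PH_THRESHOLD


def release_compartment(route):
    """Return the first compartment on `route` where the complex dissociates,
    or None if it stays associated throughout."""
    for compartment in route:
        if not complex_is_associated(COMPARTMENT_PH[compartment]):
            return compartment
    return None


if __name__ == "__main__":
    route = ["endosome", "Golgi", "ER lumen"]
    print("toxin released in:", release_compartment(route))
```

With these placeholder values the complex stays intact in endosomes and Golgi and dissociates in the ER lumen, which matches the sequence of events described for retrograde transport.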

ER Exit and Nuclear Entry of the K28 Virus Toxin

During host cell penetration, K28 retrotranslocates from the ER into the cytosol and dissociates into its subunits. The β-subunit is subsequently polyubiquitinated and proteasomally degraded, while the cytotoxic α-subunit enters the nucleus and causes cell death (Fig. 5). ER exit of the α/β heterodimeric toxin is presumably mediated by the Sec61 complex, termed the translocon, which functions as the major transport channel in the ER membrane of yeast and higher eukaryotes. In yeast, each translocon is a core heterotrimeric complex consisting of the transmembrane protein Sec61p and the two smaller subunits Sbh1p and Sss1p. Besides being the major channel for co- and post-translational protein import into the ER, Sec61p is also involved in the export and removal of malfolded and/or misassembled proteins from the secretory pathway to initiate their proteasomal degradation in the cytosol. In addition to its central function in protein quality control in the ER, Sec61 is also responsible for ER retrotranslocation of certain plant, microbial, and viral A/B toxins such as ricin, cholera toxin, Pseudomonas exotoxin A, and the yeast K28 virus toxin. In contrast to microbial and plant A/B toxins, however, retrotranslocation of K28 from the ER lumen is independent of ubiquitination and proteasome activity, and classical components normally involved in ER-associated protein degradation (ERAD) are not required for ER exit of this virus toxin. In K28-intoxicated cells, toxin translocation competence in the ER strongly depends on the activity of lumenal Hsp70 chaperones (such as Kar2p/BiP and Pdi1p) and Hsp40 cochaperones (such as Scj1p and Jem1p), and additionally requires proper maintenance of calcium homeostasis in the ER (Fig. 5). So far it is not known which cellular component within or near the ER membrane is responsible for toxin exit, but K28 might be a fruitful tool to identify a novel pathway for protein transport across the ER membrane.

K28 Affects DNA Synthesis, Cell-Cycle Progression, and Induces Apoptosis

Although the cytotoxic α-component of K28 (10.5 kDa) can enter the nucleus by passive diffusion, extension of α by a classical nuclear localization sequence (NLS) significantly enhances its in vivo toxicity due to faster and more efficient nuclear import mediated


Fig. 5 Retrotranslocation of the K28 virus toxin from the ER and its lethal effect in the nucleus. After endocytotic uptake and retrograde transport via Golgi and ER, the toxin is gated through the Sec61p export channel with the help of lumenal ER chaperones such as Pdi1p, Kar2p, Jem1p, and Scj1p. Pdi1p plays a major role in the toxin retrotranslocation step. The “disulfide rearrangement model” schematically illustrates how the reactivity of the sulfhydryl residues in the β-subunit changes during K28 intoxication. Under acidic pH conditions (pH 4.7), K28 is structurally stable as a disulfide-bonded α/β heterodimer. At the neutral pH of the ER lumen, a free sulfhydryl (SH) residue in the β-subunit deprotonates, which leads to the formation of a reactive thiolate (S⁻). Nucleophilic attack by the reactive thiolate and dissociation of the heterodimer are prevented by the activity of Pdi1p, enabling K28 to exit the ER as an α/β heterodimer. In the pH-neutral and Pdi1p-free environment of the cytosol, the nucleophilic attack induces a disulfide bond rearrangement and the subsequent release of the monomeric α-subunit from the heterodimer. Thereafter, β is polyubiquitinated via Ubc4p (E2) and Rsp5p (E3) and proteasomally degraded, while α enters the nucleus and causes cell death. Within the nucleus, the α-toxin interacts with essential host proteins involved in eukaryotic cell-cycle control and causes cell death through G1/S cell-cycle arrest and inhibition of DNA synthesis. Cellular components of the ER quality control system ERAD (such as Cue1p, Ubc7p, Der3p/Hrd1p, Ubc6p, and Der1p) are not involved in ER-to-cytosol export of the K28 virus toxin (ERAD components are shown in gray). Reproduced and adapted from Leis, S., Spindler, J., Reiter, J., Breinig, F., Schmitt, M.J., 2005. Saccharomyces cerevisiae K28 toxin – A secreted virus toxin of the A/B family of protein toxins. In: Schmitt, M.J., Schaffrath, R. (Eds.), Topics in Current Genetics 11: Microbial Protein Toxins. Berlin, Heidelberg: Springer, pp. 111–132, with kind permission of Springer Science and Business Media.

by α/β importins of the host. Within the nucleus, α interacts with host proteins that have essential functions in eukaryotic cell-cycle control and the initiation of DNA synthesis. Because the virus toxin targets evolutionarily highly conserved proteins with basic and central functions, toxin resistance mechanisms based on mutations in essential chromosomal host genes hardly occur in vivo, indicating that the toxin has developed a very efficient strategy to penetrate and kill its target cell. Most interestingly, while high toxin concentrations (10 pmol or higher) cause necrotic cell killing via cell-cycle arrest and inhibition of DNA synthesis, treatment with low doses of viral killer toxins (<1 pmol) results in an apoptotic host-cell response triggered by the accumulation of reactive oxygen species (ROS) (Fig. 6). Since the toxin concentration is usually low in the natural environment of a killer yeast, toxin-induced apoptosis is probably an important prerequisite for efficient cell killing. Furthermore, since apoptosis is also important in the pathogenesis of virus infections in mammals, it is not surprising that toxin-encoding yeast killer viruses (as has been shown for ScV-M1 and ScV-M2) can also induce


Fig. 6 Receptor-mediated toxicity of the viral killer toxins K1, zygocin, and K28. Killing of a sensitive yeast is envisaged as a two-step process involving initial toxin binding to receptors within the cell wall (R1) and the cytoplasmic membrane (R2). After interaction with the plasma membrane, ionophoric toxins such as K1 and zygocin disrupt cytoplasmic membrane function, while K28 enters the cell by endocytosis and diffuses into the nucleus to cause cell death (note that the cell surface receptors R1 and R2 are different in all three toxins; see also table inset). At high toxin doses (>10 pmol), sensitive cells arrest in the cell cycle with pre-replicated DNA (1n; left panel), while cells treated with low concentrations of K28 (<1 pmol) respond with apoptosis, as shown by typical apoptotic markers such as chromosomal DNA fragmentation (TUNEL-positive cells), accumulation of reactive oxygen species (ROS), and phosphatidylserine exposure at the external surface of the plasma membrane detected by annexin-V staining (right panel).

a programmed suicide pathway in noninfected yeast. Although viral killer toxins were shown to be primarily responsible for this phenomenon, yeast killer viruses are not solely responsible for triggering a cell death pathway in yeast.
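The dose dependence of the cellular response described in this section can be summarized as a small decision rule. The Python sketch below is illustrative and not part of the original article. The function name is hypothetical; the thresholds (10 pmol or higher, and below 1 pmol) are those given in the text, and the outcome for doses between 1 and 10 pmol is not stated in the text, so the sketch reports it as unspecified rather than guessing.

```python
def k28_response(dose_pmol):
    """Classify the response of a sensitive cell to a K28 dose (in pmol),
    using only the thresholds stated in the text."""
    if dose_pmol <= 0:
        raise ValueError("dose must be positive")
    if dose_pmol >= 10:
        return "necrotic killing (cell-cycle arrest, inhibition of DNA synthesis)"
    if dose_pmol < 1:
        return "apoptosis (triggered by ROS accumulation)"
    # The text gives no outcome for this range.
    return "intermediate dose: response not specified in the text"


if __name__ == "__main__":
    for dose in (0.5, 5, 10, 50):
        print(dose, "pmol ->", k28_response(dose))
```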

Lethality of Membrane-Damaging Viral Killer Toxins

Yeast viral killer toxins kill sensitive cells in a receptor-mediated fashion by interacting with receptors at the level of the cell wall and the cytoplasmic membrane (Fig. 6). The initial step involves rapid toxin binding to a primary receptor, R1, which is localized within the mannoprotein or β-1,6-glucan fraction of the cell wall. In the second step, the toxin translocates to the plasma membrane and interacts with a secondary receptor, R2 (Fig. 6). To date, only the membrane receptors for K1 and K28 have been identified. Kre1p, an O-glycosylated cell surface protein that is initially GPI-anchored to the plasma membrane and is involved in β-1,6-glucan biosynthesis and K1 cell wall receptor assembly, was identified as the plasma membrane receptor of killer toxin K1. In the case of K28, the HDEL receptor Erd2p has been demonstrated to be responsible for both efficient K28 binding to and endocytotic uptake from the yeast plasma membrane. Once bound to the plasma membrane, ionophoric virus toxins (such as K1 and zygocin) disrupt cytoplasmic membrane function by forming cation-selective ion channels, while K28 enters the cell and acts in the nucleus: DNA synthesis is rapidly inhibited and cells arrest at the G1/S boundary of the cell cycle (Fig. 6). Ion channel formation in yeast membranes induced by the K1 virus toxin was initially reported, on the basis of patch-clamp experiments, to result from direct toxin action. However, this observation is inconsistent with the complete resistance seen in immune yeast cell spheroplasts, and so far receptor-independent channels have been observed neither in yeast membranes nor in Xenopus laevis oocytes. Similar to K1 and K2, zygocin is a membrane-damaging virus toxin that is produced and secreted


by ZbV-M-infected killer strains of the osmotolerant spoilage yeast Z. bailii. Zygocin itself is a monomeric nonglycosylated protein toxin with an unusually broad killing spectrum, being equally active against phytopathogenic and human pathogenic yeasts, including Candida albicans, C. glabrata, C. tropicalis, and Sporothrix schenckii. Since even filamentous fungi such as Fusarium oxysporum and Colletotrichum graminicola are effectively killed by the toxin, zygocin represents a virus toxin with significant antimycotic potential. Similar to K1 and K2 but significantly more efficiently, zygocin disrupts plasma membrane integrity and causes rapid cell killing. Its ionophoric mode of action is supported by in silico sequence analysis, which identified a stretch of potential α-helical conformation that forms an amphipathic structure characteristic of membrane-disturbing antimicrobial peptides such as alamethicin, melittin, and dermaseptin. In addition, this feature is accompanied by a transmembrane helix at the C-terminus of zygocin, which is predicted to favor a membrane-permeabilizing potential, not by activating native ion channels but rather by establishing pores itself after toxin oligomerization. It is therefore assumed that the hydrophobic part of zygocin’s amphipathic α-helix is responsible for toxin binding to the target cell. The postulated model of zygocin action resembles that of human α-defensins. By analogy to alamethicin, the toxicity of zygocin is probably mediated by incorporation of its transmembrane helix into the plasma membrane, a process driven solely by the natural transmembrane potential of the energized yeast and fungal plasma membrane. Thus, zygocin’s mode of action mirrors the lethal mechanism of antimicrobial peptides that are produced by virtually all higher eukaryotes. Mechanisms of resistance against antimicrobial peptides are rare and often limited to changes in the composition of the cytoplasmic membrane. In major contrast to mammalian cells, the outer leaflet of yeast and fungal membranes is enriched in negatively charged lipids. Owing to the cationic net charge of antimicrobial peptides (including zygocin), their affinity for these lipids facilitates toxin adsorption to the target membrane. Consistent with this, deletion of chromosomal genes whose products affect plasma membrane lipid composition (such as PDR16 and PDR17) causes a dramatic decrease in zygocin sensitivity, because toxin binding to the plasma membrane is largely prevented in the genetic background of a yeast pdr16/17 mutant. In contrast to K1, zygocin does not require a specific membrane receptor for its in vivo toxicity, as its physicochemical properties allow efficient plasma membrane interaction independent of any membrane receptor or docking protein. Although the more recently discovered killer toxins Klus and Kbarr-1 likewise show an interestingly broad killing spectrum similar to that of zygocin (i.e., killing yeasts such as S. cerevisiae, Hanseniaspora sp., Kluyveromyces lactis, Schizosaccharomyces pombe, Candida albicans, C. tropicalis, C. dubliniensis, C. kefir, C. glabrata, C. parapsilosis, C. krusei, Yarrowia lipolytica, and Hansenula mrakii), their precise mode and site of action are completely unknown and need further investigation.

Self-Protection in Killer Virus-Infected Yeast – Toxin Immunity

For many decades it was unknown how a killer virus-infected yeast protects itself against its own secreted toxin. In killer yeast, functional immunity is essential for survival, since the toxins often target and inhibit central eukaryotic cell functions. This is in major contrast to bacterial toxins such as cholera toxin and Shiga toxins, which selectively kill eukaryotes, thus making immunity dispensable in a prokaryotic host. So far, only the mechanism of protective immunity against the K28 virus toxin has been elucidated. It has been shown that K28 killer cells take up external toxin (produced either by themselves or by other K28 killers) and translocate it back to the cytosol, where the reinternalized α/β toxin rapidly forms a complex with pptox precursor that has not yet been imported into the ER. Within this complex, the K28 heterodimer is selectively ubiquitinated and proteasomally degraded, while the pptox moiety of the complex is in part released, either to be imported into the ER (to give active virus toxin) or to complex a newly internalized K28 heterodimer. In this process, the amount of cytosolic ubiquitin is critical for immunity: overexpression of mutant ubiquitin (blocked in polyubiquitin chain formation) results in a significant decrease in toxin secretion and a suicidal phenotype based on nonfunctional immunity. Conversely, decreasing cytosolic ubiquitin causes an increase in toxin secretion, while immunity is not impaired, as sufficient pptox is available for K28 complex formation. This simple and highly efficient mechanism ensures that a toxin-producing killer yeast is fully protected against the lethal action of its own toxin. In contrast to K28 immunity, the precise mechanism(s) of self-protection against the ionophoric virus toxins K1, K2, and zygocin remain largely unknown.

Alternaviruses (Unassigned)

Hiromitsu Moriyama, Tokyo University of Agriculture and Technology, Tokyo, Japan
Nanako Aoki, Kuko Fuke, Kana Takeshita Urayama, Naoki Takeshita, and Chien-Fu Wu, Tokyo University of Agriculture and Technology, Fuchu, Japan
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Mycoviruses Viruses that infect and propagate in fungi.
Polyadenylation Addition of a poly(A) tail to the 3′ UTR of mRNA or viral plus-stranded RNA. In eukaryotes, polyadenylation is part of the process that produces mature mRNA for translation.
Rapid amplification of cDNA ends (RACE) A technique that provides the sequence of an RNA transcript up to its 5′ end (5′ RACE-PCR) or 3′ end (3′ RACE-PCR).
RNA-dependent RNA polymerase (RdRp) Enzyme that catalyzes the replication of RNA from an RNA template.
Viruses with multipartite genomes Viruses whose essential genome is divided among several genomic segments (segmented genome) that are separately encapsidated in identical capsids.

Introduction

Fungi are infected by viruses known as mycoviruses. Recently, numerous mycoviruses have been reported to cause epigenetic phenomena. Historically, mycoviruses were found in diseased Agaricus bisporus mushrooms or in Aspergillus foetidus, a human pathogenic fungus, over 50 years ago. Mycoviruses are primarily classified into four groups based on their genomic structure: single-stranded DNA (ssDNA), positive single-stranded RNA (+ssRNA), negative single-stranded RNA (−ssRNA), and double-stranded RNA (dsRNA). In general, mycoviruses with dsRNA genomes are classified into six families based on the amino acid sequence of the RNA-dependent RNA polymerase (RdRp), the genomic structure, and the virion structure. These families are: the Totiviridae (nonsegmented genome), the Partitiviridae (2 segments), the Megabirnaviridae (2 segments), the Chrysoviridae (3–5 segments), the Quadriviridae (4 segments), and the Reoviridae (9–12 segments) (Ghabrial et al., 2015). Reovirus genomes consist of 10–12 segments of dsRNA; each segment has an m7GpppNm cap structure at the 5′-end and lacks a 3′ poly(A) tail. Partitivirus genomes generally consist of 2 segments of dsRNA; each segment has an interrupted poly(A) tail at the 3′-end of the coding strand. Alternaviruses have 3–4 dsRNA segments; each segment has an intact, continuous poly(A) tail at the 3′-end of the coding strand, a molecular property unique among segmented dsRNA viruses. Although the number of reports on mycoviruses has been increasing in recent years, there are only five reports describing alternaviruses.

Here, we introduce characteristic properties of alternaviruses, primarily focusing on Alternaria alternata virus 1 (AaV1) and Aspergillus foetidus dsRNA mycovirus (AfV-F). These viruses are similar in size to dsRNA viruses in the families Totiviridae, Partitiviridae, and Chrysoviridae, although the genomic structures of the viruses in these three families are completely different. Based on our findings for AaV1, AfV-F, and three other alternaviruses, we propose a new family, “Alternaviridae”, to accommodate them.

Genome Organization

The virions of AaV1 and AfV-F contain four unrelated, linear, separately encapsidated, monocistronic dsRNA segments (1.4–3.6 kbp in size; Table 1). The largest segment, dsRNA1, encodes the RdRp. dsRNA2 and dsRNA3 might encode the major capsid protein (CP). dsRNA4 encodes a protein of unknown function. The genomic structures of AaV1, the type species of the proposed family Alternaviridae, and of AfV-F are schematically represented in Fig. 1. It is noteworthy that the coding strands of all four dsRNAs have 3′-poly(A) tails ranging from 36 to 50 nt in length, and that stretches of poly(U) at the 5′ terminus of the non-coding strand were detected in all four dsRNAs by 5′-RACE. In AaV1, both the 5′- and 3′-terminal sequences of all four dsRNAs are significantly conserved. In AfV-F, too, the 5′- and 3′-terminal sequences are highly conserved among the four RNA segments. The conserved nucleotide sequences, however, differ between AaV1 and AfV-F (Fig. 2). These terminal sequences may be involved in the replication cycles of these dsRNAs, each depending on the RdRp of its own virus. There is no evidence that these four dsRNAs are transcripts of host genomic DNA, as shown by Southern hybridization experiments carried out on Alternaria alternata strain EGS 35–193 with cDNA probes derived from the four dsRNAs of AaV1.

Virion Properties

The buoyant densities of AaV1 virions are in the range of 1.35–1.40 g/cm³ in CsCl. After treatment with chloroform–butanol followed by PEG precipitation, virus particles were purified by CsCl density equilibrium centrifugation. AaV1 virions are isometric, about 33 nm in diameter, and similar in size to those of dsRNA viruses in the families Totiviridae, Partitiviridae, and Chrysoviridae.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00031-X

Table 1 Members of the alternaviruses

Host                   Virus                                 Acronym  dsRNA segment no. (length in bp, encoded protein: size in kDa)  GenBank accession no.
Alternaria alternata   Alternaria alternata virus 1          AaV1     1 (3613: RdRp, 129)                                             AB368492
                                                                      2 (2795: P2, 90)                                                AB438027
                                                                      3 (2576: P3, 82)                                                AB438028
                                                                      4 (1420: P4, 40)                                                AB438029
Aspergillus foetidus   Aspergillus foetidus mycovirus        AfV-F    1 (3588: RdRp, 127)                                             NC_020103.1
                                                                      2 (2770: P2, 87)                                                HE588145
                                                                      3 (2466: P3, 79)                                                HE588146
                                                                      4 (2005: P4, 65)                                                HE647818
Aspergillus niger      Aspergillus mycovirus 341             AsV341   1 (3588: RdRp, 127)                                             ABX79997
Fusarium poae          Fusarium poae alternavirus 1          FpAV1    1 (3559: RdRp, 126)                                             NC_030883.1
Fusarium graminearum   Fusarium graminearum alternavirus 1   FgAV1    1 (3524: RdRp, 126)                                             NC_036596.1

Fig. 1 Genome organization of two alternaviruses, Alternaria alternata virus 1 (AaV1) and Aspergillus foetidus dsRNA mycovirus (AfV-F). Each genome consists of four monocistronic dsRNA segments. The ORF positions are as follows: RdRp ORFs (nt 49–3498 on AaV1 dsRNA1, nt 53–3426 on AfV-F dsRNA1), P2 ORFs (nt 53–2587 on AaV1 dsRNA2, nt 49–2454 on AfV-F dsRNA2), P3 ORFs (nt 52–2331 on AaV1 dsRNA3, nt 51–2231 on AfV-F dsRNA3), and P4 ORFs (nt 51–1232 on AaV1 dsRNA4, nt 51–1793 on AfV-F dsRNA4).
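As a rough consistency check, the AaV1 ORF coordinates in the Fig. 1 caption can be converted to residue counts and approximate protein masses and compared with the sizes listed in Table 1. The sketch below is illustrative only. It assumes that each coordinate range is 1-based, inclusive, and contains the stop codon, and it uses a rule-of-thumb average residue mass of 110 Da; neither assumption comes from the chapter, and the function name is hypothetical.

```python
# Rough check of AaV1 ORF coordinates (Fig. 1) against reported sizes (Table 1).
# ASSUMPTION: ~110 Da average mass per amino acid residue (rule of thumb).
AVG_RESIDUE_MASS_DA = 110.0

# name: (start nt, end nt, reported size in kDa); coordinates assumed
# to be 1-based, inclusive, and to include the stop codon.
AAV1_ORFS = {
    "RdRp": (49, 3498, 129),
    "P2": (53, 2587, 90),
    "P3": (52, 2331, 82),
    "P4": (51, 1232, 40),
}


def orf_summary(start, end):
    """Return (length_nt, residues, approx_mass_kda) for one ORF."""
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    residues = length_nt // 3 - 1  # the stop codon encodes no residue
    return length_nt, residues, residues * AVG_RESIDUE_MASS_DA / 1000.0


for name, (start, end, reported_kda) in AAV1_ORFS.items():
    length_nt, residues, mass_kda = orf_summary(start, end)
    print(f"{name}: {length_nt} nt, {residues} aa, "
          f"~{mass_kda:.0f} kDa (reported {reported_kda} kDa)")
```

Under these assumptions the estimates fall within a few kDa of the reported values, which is as close as a fixed per-residue mass can be expected to get.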

Generally, each virion contains only one of the four genomic dsRNA segments. The four dsRNA segments are packaged separately in individual virus particles with different buoyant densities: the dsRNA1 segment is mainly detected in the fraction of buoyant density 1.40 g/cm³, the dsRNA2 and dsRNA3 segments are detected in the fractions of buoyant density 1.38–1.39 g/cm³, and the dsRNA4 segment is found, although barely detectable, in the fraction of buoyant density 1.35 g/cm³. Thus, the fraction in which a dsRNA is found depends on the size of the segment packaged in the particles. After purification by CsCl density equilibrium centrifugation, the dsRNA-containing fractions were examined by electron microscopy. Isometric virus particles with a diameter of about 33 nm were routinely observed (Fig. 3(a)). A protein of approximately 97 kDa was detected as the major band and might represent the coat proteins encoded by dsRNA2 and dsRNA3 (Fig. 3(b)).

Biological Effects of Alternaviruses on Their Hosts

The AaV1-infected Alternaria alternata strain EGS 35–193 (ATCC) showed an abnormal phenotype, including reduced mycelial growth, aerial mycelial collapse, and unregulated pigmentation. To investigate whether the high concentration of AaV1 dsRNAs is responsible for this impaired growth, we attempted to eliminate or reduce the dsRNAs in the host strain EGS 35–193. By exposing the strain to cycloheximide during hyphal tip isolation, we obtained an isolate, E118, in which the amount of dsRNA was reduced to about one tenth of that in the parent strain. Isolate E118 recovered normal mycelial growth and pigmentation, suggesting that the high titer of AaV1 caused the attenuated mycelial growth of strain EGS 35–193. After 24 h of growth on PDA medium, regular-shaped mycelia were observed in the low-titer isolate E118, whereas irregular and poor mycelia were observed in the original high-titer strain EGS 35–193. In addition, abnormally enlarged vesicles appeared in mycelial cells of EGS 35–193, and bursting of these enlarged vesicles was observed around the hyphal cells. Only a few small vesicles were observed in cells of the low-titer isolate E118. These results indicate that the reduced copy number of the dsRNAs in the restored isolates might be responsible for the return to normal morphology.

Fig. 2 Comparison of the 5′ and 3′ UTRs of the four dsRNA segments of AaV1 and AfV-F. Multiple alignments of the nucleotide sequences of the 5′ UTR (a) and the 3′ UTR (b) were obtained using CLUSTAL XII (with some manual adjustments). Asterisks signify identical bases at the indicated position (shaded), and colons specify that three out of four bases are identical at the indicated positions.

Fig. 3 (a) Negative contrast electron micrograph of particles of Alternaria alternata virus 1. (b) Purified AaV1 proteins stained with Coomassie Brilliant Blue.

Proteins Encoded by Alternaviruses

The largest dsRNA segment (dsRNA1) of the alternaviruses sequenced so far contains a single large ORF coding for the RdRp. A multiple alignment of the amino acid sequence of the putative RdRp encoded by AaV1 dsRNA1 with the RdRps of AfV-F, AsV341, FpAV1, and FgAV1 is shown in Fig. 4. Eight conserved motifs characteristic of RdRps of dsRNA viruses of simple eukaryotes were found in the dsRNA1 ORFs of the five alternaviruses. The putative RdRps of AfV-F and AsV341 are very closely related (identity: 99.6%), as are those of FpAV1 and FgAV1 (identity: 98%). The AaV1 RdRp shows lower sequence identity with that of AfV-F (36%) and that of FpAV1 (34%), while the identity between the RdRps of AfV-F and FpAV1 is 47%. Although the putative RdRps of the five alternaviruses appear to be closely related to each other, they are only distantly related to those of members of other dsRNA virus families, including the Totiviridae, Chrysoviridae, and Partitiviridae; even in the most conserved GDD motif, the glycine (G) residue is replaced by an alanine (A) residue in the five alternaviruses (Fig. 4). Phylogenetic analyses of the putative RdRp sequences show that the five alternaviruses form a distinct clade apart from the three families Totiviridae, Chrysoviridae, and Partitiviridae (Fig. 5). Indeed, the genome structures of the dsRNA viruses in the three families are completely different. Alternaviruses have four dsRNA segments of 1.5–3.6 kbp, similar in size to those of viruses in the family Chrysoviridae, which possess four dsRNA segments of 2.4–3.6 kbp. In contrast, viruses in the Totiviridae and Partitiviridae have one or two dsRNA segments, respectively. Thus, the alternaviruses appear to be evolutionarily related to, but not members of, the family Chrysoviridae.

Fig. 4 Multiple alignment of RdRp amino acid sequences from AaV1, AfV-F, AsV341, FpAV1, and FgAV1. The alignment was performed using Clustal X ver. 2.0 and trimmed to include the 8 conserved core regions of RdRp (indicated by solid lines), using Seaview ver. 4.0. Asterisks signify identical residues (shaded) at the indicated positions; colons signify highly conserved amino acid residues within a column; numbers in parentheses correspond to the number of amino acid residues separating the motifs.
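Pairwise identity figures such as those quoted for the alternavirus RdRps are derived from aligned sequences. The following minimal sketch shows one common way to compute percent identity, counting only alignment columns in which neither sequence has a gap. This convention is an assumption: the chapter does not state how its identities were calculated, other tools use different denominators, and the two fragments below are hypothetical toy strings rather than real alternavirus sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy fragments (already aligned, '-' marks a gap).
seq_x = "MKT-LADDVG"
seq_y = "MRTALADD-G"
print(f"{percent_identity(seq_x, seq_y):.1f}% identity")
```

Dividing by the full alignment length, or by the length of the shorter sequence, would give somewhat different percentages, so values from different sources are only comparable when the convention is known.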

3′ Poly(A) Structure of Alternaviruses

It is noteworthy that the coding strands of the genomic dsRNAs of alternaviruses carry 3′ poly(A) tails, which are estimated to range from 33 to 50 nt in length. This raised the possibility that a single-stranded (ss) form of the viral RNAs is present in larger amounts than the double-stranded (ds) form. To determine the relative abundance of AaV1 ssRNA and dsRNA, northern blot analyses were performed with three types of probe: one specific for positive-strand RNA, one specific for negative-strand RNA, and a dsDNA probe that detects both strands. If positive-strand RNA were present in much greater quantities than dsRNA, the signals detected by the positive-strand-specific probe should have been stronger than those detected by the negative-strand-specific probe or by the DNA probe that detects both strands. However, there was no significant difference in the intensities of the signals detected by the three probes, indicating that the single-stranded form is scarce and that most of the viral RNA is present as genomic dsRNA rather than as the replicative form of an ssRNA genome (Table 2).
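The tail lengths quoted at the start of this section are simply the lengths of the uninterrupted A run at the 3′ end of the coding strand. As a minimal illustration, assuming a sequence given 5′ to 3′ as a plain string, the helper below measures that run. The function name and the toy sequence are hypothetical and are not taken from any alternavirus accession.

```python
import re


def poly_a_tail_length(rna):
    """Length of the uninterrupted run of A at the 3' end of a sequence."""
    match = re.search(r"A+$", rna.upper())
    return len(match.group(0)) if match else 0


# Hypothetical toy sequence: an arbitrary body followed by a 36-nt tail.
toy_coding_strand = "GGCUAGCUUAGC" + "A" * 36
print(poly_a_tail_length(toy_coding_strand))
```

Because only the terminal run is counted, an interrupted tail of the kind described for partitiviruses would be reported as the length of its last uninterrupted stretch.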

Evolutionary Relationships Among Chrysoviruses

In recent years, the discovery of many mycoviruses has been reported. Historically, totiviruses and partitiviruses have been found widely, not only in fungi such as ascomycetes, yeasts, and basidiomycete mushrooms such as Agaricus, but also in other simple eukaryotic organisms such as protozoa and oomycetes. Chrysoviruses, discovered in Penicillium chrysogenum and carrying four double-stranded RNAs, were classified by Ghabrial as members of the family Chrysoviridae, independent of the Partitiviridae, in EoV 3rd edition, 2008. Viruses of these families and of the proposed family Alternaviridae can co-infect a single host cell. To date, mycoviruses have been discovered in hundreds of simple eukaryotes in more than 20 families. However, there are only five reports of alternaviruses, from hosts in three saprophytic genera: Alternaria, Fusarium, and Aspergillus. A characteristic of the alternaviruses is the intact poly(A) structure at the 3′ ends of their dsRNA genome segments, the first such finding among segmented dsRNA viruses. In AaV1, the 3′ poly(A) structure has also been shown to exist in the dsRNA genome extracted from purified virus particles. Recent studies by the authors have also provided data supporting the presence of a cap structure at the 5′ ends of the dsRNA genomes.

Fig. 5 A phylogenetic analysis of the RdRp sequences of alternaviruses and selected mycoviruses in the families Totiviridae, Chrysoviridae, and Partitiviridae. An unrooted phylogenetic tree based on the neighbor-joining method was constructed using the program MEGA 10. The numbers at nodes represent bootstrap values as percentages estimated from 100 replicates. The following viruses in the family Totiviridae were included in the phylogenetic analysis (abbreviations in parentheses): Helminthosporium victoriae virus 190S (HvV190S); Sphaeropsis sapinea RNA virus 1 (SsRV1); Sphaeropsis sapinea RNA virus 2 (SsRV2); Giardia lamblia virus L1 (GaRV-L1); Leishmania RNA virus 1-1 (LRV1-1); Leishmania RNA virus 2-1 (LRV2-1); Saccharomyces cerevisiae virus L-A (ScV-L-A); Saccharomyces cerevisiae virus L-BC (ScV-L-BC). The following viruses in the family Chrysoviridae were included (abbreviations in parentheses): Agaricus bisporus virus 1 (AbV1); Amasya cherry disease associated chrysovirus (ACD-CV); Helminthosporium victoriae virus 145S (HvV145S); Penicillium chrysogenum virus (PcV); Fusarium oxysporum chrysovirus 1 (FoCV1). The following viruses in the proposed family Alternaviridae were included (abbreviations in parentheses): Alternaria alternata virus 1 (AaV1); Aspergillus foetidus dsRNA mycovirus (AfV-F); Aspergillus mycovirus 341 (AsV341); Fusarium poae alternavirus 1 (FpAV1); Fusarium graminearum alternavirus 1 (FgAV1). The following viruses in the family Partitiviridae were included (abbreviations in parentheses): Atkinsonella hypoxylon partitivirus (AhPV); Rhizoctonia solani virus 717 (RsV717); Helicobasidium mompa partitivirus V1-1 (HmPV-V1-1); Fusarium poae virus 1 (FUPO-1).

Table 2 Comparison between AaV1 and dsRNA viruses in three mycovirus families

Virus family (virus name)   Number of dsRNA genome segments   Segment size (kbp)   Virus particle     3′ poly(A)
Alternavirus                4                                 1.5–3.6              isometric, 33 nm   33–50 nt
Totivirus                   1                                 4.6–6.7              isometric, 40 nm   not found
Partitivirus                2                                 1.4–2.2              isometric, 33 nm   20–30 nt
Chrysovirus                 4                                 2.4–3.6              isometric, 35 nm   not found
Megabirnavirus              2                                 6.5–7.0              isometric, 50 nm   not found
Reovirus                    10–12                             1.1–3.5              isometric, 60 nm   not found

5′ cap (five entries, in source order): not found, not found, not found, not found, found

Reference

Ghabrial, S.A., Castón, J.R., Jiang, D., Nibert, M.L., Suzuki, N., 2015. 50-plus years of fungal viruses. Virology 479–480, 356–368.

Barnaviruses (Barnaviridae)

Peter A Revill, The Peter Doherty Institute of Infection and Immunity, Royal Melbourne Hospital, Melbourne, VIC, Australia
© 2021 Published by Elsevier Ltd.

Glossary

Casing A layer of peat moss placed on top of the growing beds to encourage sporophore formation.

Introduction

Mushroom bacilliform barnavirus is the type species and still the only member of the Barnaviridae. In 1948, a disease of the cultivated mushroom Agaricus bisporus was discovered on a property in Pennsylvania, USA; it had a major impact on the mushroom industry and on fungal pathology in general. The disease was characterized by poorly colonized mycelium and misshapen fruiting bodies (sporophores), which either had long thin stipes and small globular caps, producing a drumstick-like appearance, or were thickened, with a barrel-like appearance. The poor colonization of the compost and casing by infected mycelium often produced characteristic bare patches on the growing beds and reduced yields. The disease was named La France disease, and a virus was implicated as a possible cause in 1960, after it was shown that the disease could be transmitted to healthy cultures by hyphal anastomosis. In 1962, three different virus-like particles were identified: two were spherical (25 nm and 29 nm), and the third was a 19 nm × 50 nm elongated or bacilliform particle with rounded ends. Subsequently, a 34–36 nm spherical particle with a dsRNA genome (La France infectious virus, LFIV) was identified as the causal agent of La France disease. The bacilliform virus particle was of particular interest because almost all mycoviruses identified to that point had a spherical or isometric morphology. Bacilliform virus-like particles of 19 nm × 48 nm were subsequently identified in the ascomycete Microsphaera mougeotti, and 17 nm × 35 nm bacilliform particles were also observed in the deuteromycete Verticillium fungicola, itself a pathogen of A. bisporus. However, no relationship between mushroom bacilliform virus (MBV) and these bacilliform particles has been established. Originally named mushroom virus 3 (MV3), the virus was subsequently renamed mushroom bacilliform virus (MBV).

The viral genome was identified as single-stranded (ss), positive-sense RNA, and the virus was classified by the International Committee on Taxonomy of Viruses (ICTV) as the exemplar virus of the species Mushroom bacilliform barnavirus (genus Barnavirus, family Barnaviridae). The family name is derived from bacilliform RNA virus. MBV remains the only barnavirus characterized to date; however, next-generation sequencing and metagenomics studies have recently identified another virus with similarity to MBV. Analysis of RNA sequences isolated from the plant pathogenic fungus Rhizoctonia solani identified a near-complete viral sequence that grouped closely with MBV on a phylogenetic tree, had a similar genome arrangement (albeit lacking an ORF1, suggesting that the sequence was incomplete), and shared amino acid sequence identity with MBV, particularly in ORF3, which encodes the viral RdRp (47% identity). Termed Rhizoctonia solani barnavirus 1 (RsBarV1; KP900904), the incomplete sequence is yet to be ratified as a barnavirus by the ICTV. Subsequent examination of the public Transcriptome Shotgun Assembly (TSA) GenBank databases using BLAST has identified another near-complete virus sequence with a high degree of similarity to both RsBarV1 and MBV, named Colobanthus quitensis associated barnavirus 1 (CqABV1), detected in the Antarctic pearlwort Colobanthus quitensis. If ratified, this would be the first barnavirus identified from a plant species; however, it is also possible that the sequence was derived from an associated unidentified fungus rather than from C. quitensis itself. Virus characterization studies, including identification of a bacilliform virus particle, are required to determine whether these newly identified sequences are true barnavirus genomes.

MBV Virion Properties

The MBV virion Mr is 7.1 × 10⁶, with a buoyant density in Cs₂SO₄ of 1.32 g/cm³. Virions are stable between pH 6 and 8 and at an ionic strength of 0.01–0.1 M phosphate.

MBV Virion Structure and Composition

MBV has a bacilliform or bullet-shaped morphology, with particles generally 19 nm × 50 nm in size (Fig. 1). Virions contain a single major CP of 21.9 kDa, and there are approximately 240 CP molecules in each capsid. Virions encapsidate a single linear molecule of positive-sense ssRNA, 4.0 kb in size. The complete 4009-nt sequence is available (Acc No U07551). The RNA has a 5′-linked viral protein, or viral genome-linked protein (VPg), and lacks a poly(A) tail. RNA constitutes about 20% of virion weight.
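The composition figures above can be cross-checked with simple arithmetic: 240 copies of a 21.9 kDa CP plus one 4009-nt RNA should give a particle mass of the same order as the reported virion Mr, with RNA making up roughly one fifth of it. The sketch below does this under one stated assumption that is not from the chapter, namely an average mass of about 330 Da per ribonucleotide; the VPg is ignored and the function name is hypothetical.

```python
CP_MASS_DA = 21_900        # major capsid protein, 21.9 kDa
CP_COPIES = 240            # approximate CP molecules per capsid
GENOME_NT = 4009           # genome length in nucleotides
AVG_NT_MASS_DA = 330.0     # ASSUMPTION: average mass per ribonucleotide


def virion_mass_estimate():
    """Return (protein_da, rna_da, total_da, rna_fraction); VPg ignored."""
    protein_da = CP_MASS_DA * CP_COPIES
    rna_da = GENOME_NT * AVG_NT_MASS_DA
    total_da = protein_da + rna_da
    return protein_da, rna_da, total_da, rna_da / total_da


protein_da, rna_da, total_da, rna_fraction = virion_mass_estimate()
print(f"protein: {protein_da / 1e6:.2f} MDa")
print(f"RNA:     {rna_da / 1e6:.2f} MDa")
print(f"total:   {total_da / 1e6:.2f} MDa")
print(f"RNA fraction: {rna_fraction:.1%}")
```

With these inputs the RNA fraction comes out close to the stated 20%, and the total is somewhat below the reported Mr of 7.1 × 10⁶, which is unsurprising for an estimate built on approximate copy numbers and an assumed nucleotide mass.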

doi:10.1016/B978-0-12-809633-8.21511-2

Fig. 1 Electron micrograph of mushroom bacilliform virus particles. The bar represents 100 nm.

Fig. 2 The mushroom bacilliform virus (MBV) genome arrangement.

MBV Genome Organization and Expression

The RNA genome (4009 nt) contains four major and three minor ORFs and has 5′- and 3′-UTRs of 60 nt and 250 nt, respectively (Fig. 2). ORFs 1–4 encode polypeptides of 20, 73, 47, and 22 kDa, respectively. The deduced amino acid sequence of ORF2 contains three conserved chymotrypsin-related serine protease sequence motifs. BLAST searches of the deduced ORF2 amino acid sequence show similarity to serine proteases encoded by plant sobemoviruses. ORF2 also encodes the VPg. ORF3 contains the GX3TX3NXnGDD amino acid sequence motif shared by the putative RNA-dependent RNA polymerases (RdRps) of positive-sense ssRNA viruses and has similarity to the RdRps of sobemoviruses, enamoviruses and poleroviruses. ORF4 encodes the capsid protein (CP). ORFs 5–7 encode 8, 6.5, and 6 kDa polypeptides, respectively. The polypeptides potentially encoded by ORFs 1, 5, 6, and 7 show no significant similarity to known polypeptides. The negative strand of MBV contains seven small ORFs of unknown significance. These potentially encode polypeptides ranging from Mr 65K to Mr 105K. The MBV genome arrangement and transcription/translation strategies are strikingly similar to those of a number of plant viruses, particularly poleroviruses and sobemoviruses. MBV probably also uses similar strategies to express its gene products, including leaky ribosomal scanning for expression of ORF2, ribosomal frameshifting for expression of the RdRp, and subgenomic RNA (sgRNA) for expression of the CP. Of these, only subgenomic RNA has been confirmed in vivo. In a cell-free system, genomic-length RNA directs the synthesis of major 21-kDa and 77-kDa polypeptides and several minor polypeptides of 18–60 kDa. The full-length genomic RNA and an sgRNA (0.9 kb) encoding ORF4 (CP) are found in infected cells. Virions accumulate singly or as aggregates in the cytoplasm. However, the MBV life cycle has yet to be determined.
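The GX3TX3NXnGDD motif mentioned for ORF3 reads as: G, any three residues, T, any three residues, N, a variable-length spacer, then GDD. One way to search a protein sequence for this pattern is a regular expression, as sketched below. The cap on the spacer length (`max_gap`) is an arbitrary assumption introduced here to keep matches local, the function name is hypothetical, and the test string is a made-up toy sequence, not the MBV ORF3 product.

```python
import re


def find_rdrp_motif(protein, max_gap=60):
    """Find the first G-X3-T-X3-N-Xn-GDD match.

    Returns (start_index, matched_text) or None. max_gap bounds the
    variable Xn spacer; its default is an arbitrary assumption.
    """
    pattern = re.compile(r"G.{3}T.{3}N.{0," + str(max_gap) + r"}?GDD")
    match = pattern.search(protein.upper())
    return (match.start(), match.group(0)) if match else None


# Hypothetical toy sequence containing one motif instance.
toy_protein = "MSLGAKRTPLVNSSAYQGDDLE"
print(find_rdrp_motif(toy_protein))
```

A regular expression of this kind only flags candidate positions; it says nothing about whether the surrounding sequence folds as a polymerase, which is why the chapter also relies on overall similarity to sobemovirus, enamovirus and polerovirus RdRps.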

MBV Evolutionary Relationships

As discussed earlier, the deduced MBV RNA-dependent RNA polymerase (RdRp) sequence shows most similarity to the RdRp of the putative barnavirus RsBarV1, with the two grouping closely together on a phylogenetic tree (Fig. 3). In addition, the deduced MBV RdRp amino acid sequence shares striking similarity with those of some plant viruses, particularly sobemoviruses, poleroviruses, and enamoviruses (Fig. 3). This, together with the similarity of the MBV and sobemovirus/polerovirus genome arrangements, suggests that MBV may have shared a common ancestor with these plant virus groups in the distant past.

MBV Transmission and Host Range

MBV is transmitted horizontally through infected mycelium, and it is yet to be determined whether the virus can be transmitted in spores. There is no known insect vector. Although viruses morphologically similar to MBV have been identified in the field agaric, A. campestris, it is unknown whether these particles are related to MBV. Consequently, MBV remains the only barnavirus identified to date.

Fig. 3 Neighbor-joining tree of the mushroom bacilliform virus (MBV) RdRp compared with the RdRps of a number of plant viruses and the putative barnavirus RsBarV1, generated using MEGA X v10.0.4. Sequences were aligned using Clustal W as implemented in Geneious v11.1.5 (http://www.geneious.com) with the default BLOSUM cost matrix (1000 bootstrap replicates), and the tree was constructed with Treeview. Although rooted by outgroup (BYDV-PAV, barley yellow dwarf virus), this is an unrooted tree. CYDV-RPV = cereal yellow dwarf virus-RPV; SBMV = southern bean mosaic virus; RYMV = rice yellow mottle virus; PEMV = pea enation mosaic virus; BWYV = beet western yellows virus; PLRV = potato leafroll virus; RsBarV1 = Rhizoctonia solani barnavirus 1. The accession numbers of the sequences used in the analysis were PLRV (D00530), CYDV-RPV (NC004751), BWYV (NC004756), PEMV (NC003629), RYMV (NC001575), SBMV (DQ875594), MBV (U07557), BYDV-PAV (EF043235), RsBarV1 (KP900904).

Botybirnaviruses (Botybirnavirus)

Mingde Wu, Guoqing Li, Daohong Jiang, and Jiatao Xie, Huazhong Agricultural University, Wuhan, China

© 2021 Elsevier Ltd. All rights reserved.

Nomenclature

Boty Originates from the Latin name of the host fungus “Botrytis”

birna Refers to the bipartite dsRNA genome

Glossary

Anastomosis The fusion between two hyphae, leading to cytoplasmic exchange.

Conidia Asexual, non-motile spores of fungi; the major asexual reproductive structure of most ascomycetes.

Hypovirulence Reduced virulence of a fungal pathogen, sometimes caused by viral infection.

Sclerotia Hardened, granulated structures formed by a compact mass of fungal mycelium, able to survive under some extreme environmental conditions.

Vegetative incompatibility A cell-death reaction that occurs between two genotypically distinct fungal species or isolates and mostly results in failed transmission of viruses via anastomosis.

Introduction

The first botybirnavirus, Botrytis porri botybirnavirus 1 (BpBV1), was identified in the phytopathogenic fungus Botrytis porri as a causal agent of hypovirulence. BpBV1-infected B. porri strains showed dramatically impaired mycelial growth and virulence, and formed numerous mycelial sectors at the colony margin (Fig. 1). Two dsRNA segments (the genome of BpBV1) of approximately 6.2 and 5.8 kbp, together with spherical virions, were detected in the virus-infected isolate GarlicBc-72. BpBV1 is efficiently transmitted via conidia and can also be transferred horizontally to other B. porri isolates through anastomosis, although horizontal transmission seems to be limited by vegetative incompatibility. Soon after this first discovery, botybirnaviruses were reported in Sclerotinia sclerotiorum, a close relative of Botrytis, where infection was likewise found to be closely associated with hypovirulence. More recently, further candidate members of the genus Botybirnavirus have been detected in other fungi of the Ascomycota, such as species of the genera Alternaria and Bipolaris. With the development of sequencing technology, botybirnavirus-like sequences have also been identified in metatranscriptomic analyses, for example in a metatranscriptomic survey of soybean phyllosphere phytobiomes. Botybirnaviruses may therefore be widely distributed among different fungal groups.

Virion Properties

Electron microscopy of botybirnavirus (BpBV1) virions negatively stained with aqueous uranyl acetate showed that they are spherical, approximately 35–40 nm in diameter, and probably non-enveloped (Table 1, Fig. 2). At least two structural proteins (SPs) have been identified for BpBV1 virions, one encoded by each of the two dsRNA segments. As with viruses in the families Chrysoviridae and Partitiviridae, the two dsRNAs of BpBV1 are thought to be encapsidated in separate virus particles, because a single particle might not be spacious enough to package both dsRNAs simultaneously. Moreover, the variable and unequal molar ratios of the two dsRNA segments in purified virus particles of Sclerotinia sclerotiorum botybirnavirus 1 (SsBV1) also support the idea that the two segments of botybirnaviruses are packaged separately.

Genome Organization and Replication

The genomes of botybirnaviruses comprise two dsRNAs. The larger dsRNA of BpBV1 (dsRNA-1, GenBank accession no. JF716350) is 6215 bp in length, while the smaller dsRNA (dsRNA-2, GenBank accession no. JF716351) is 5879 bp. The two segments show high sequence identity (~95%) at the 5′ termini (500 bp), including the 5′-untranslated regions (UTRs) and part of the coding region (Fig. 3). Each dsRNA segment of BpBV1 possesses one large open reading frame (ORF), designated ORF I (on dsRNA-1) and ORF II (on dsRNA-2) (Fig. 3). The proteins encoded by the 5′-proximal coding regions of ORF I and ORF II have been shown to be the SPs by peptide mass fingerprinting (PMF) analysis. The protein encoded by the 3′-proximal coding region of ORF I encompasses the RdRp_4 conserved domain sequence (pfam02123 in the conserved domain database of NCBI) and shows

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21513-6

Fig. 1 Colony morphology of the Botrytis porri botybirnavirus 1 (BpBV1)-transfected isolate 38T (left) and the BpBV1-free strain GarlicBc-38 (right) on potato dextrose agar (20°C, 15 days).

Table 1 Virion properties of different botybirnaviruses

Virus name                                 Shape      Diameter (nm)  Genome size (bp)  GenBank Acc. No.
Botrytis porri botybirnavirus 1            spherical  35             6215, 5879        JF716350, JF716351
Sclerotinia sclerotiorum botybirnavirus 1  spherical  38             6457, 5965        KP774592, KP774593
Sclerotinia sclerotiorum botybirnavirus 2  spherical  40             6159, 5872        KT962972, KT962973
Alternaria botybirnavirus 1                spherical  40             6188, 5903        KX784490, KX784491
Bipolaris maydis botybirnavirus 1          NDᵃ        ND             6435, 5987        MF034087, MF034086

ᵃ ND = Not Done.

Fig. 2 Transmission electron microscopy image (negative staining) of Botrytis porri botybirnavirus 1 virions from strain GarlicBc-72 of B. porri (bar = 50 nm).

sequence similarity with the RdRps encoded by viruses in the families Totiviridae, Chrysoviridae, Quadriviridae, and Megabirnaviridae. In contrast, the proteins encoded by the 5′-proximal region of ORF I and by the entire ORF II lack significant sequence similarity to the proteins of any other known virus group. SsBV1, a member of a tentative species of the genus Botybirnavirus, has a third, dispensable satellite-like dsRNA (SatlRNA) element of ~1.7 kbp (GenBank accession no. KP774594). The expression strategy of botybirnaviruses and the functions of the remaining regions of the two deduced polyproteins remain unclear. The amino acid sequences of the two SPs also show significant similarity to each other. Interestingly, phylogenetic trees based on either of the two SPs have similar topologies. This suggests that the ancestral botybirnavirus might have undergone a single duplication event of the SP coding domain.
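The ORF coordinates quoted in the Fig. 3 caption can be checked against the stated polyprotein lengths with a few lines of arithmetic. The sketch below is only an illustration: the helper name `orf_protein_length` is hypothetical, and it assumes (the source does not say so) that the quoted end coordinate includes the stop codon.

```python
def orf_protein_length(start_nt: int, end_nt: int) -> int:
    """Amino acids encoded by an ORF given 1-based, inclusive coordinates.

    Assumption (not stated in the source): the end coordinate includes
    the stop codon, so one codon is subtracted from the total.
    """
    orf_len = end_nt - start_nt + 1
    if orf_len % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_len // 3 - 1


# Coordinates as quoted in the Fig. 3 caption for BpBV1.
BPBV1_ORFS = {"ORF I": (404, 6112), "ORF II": (405, 5771)}

protein_lengths = {
    name: orf_protein_length(start, end)
    for name, (start, end) in BPBV1_ORFS.items()
}

for name, aa in protein_lengths.items():
    print(f"{name}: {aa} aa")
```

Under that assumption the two ORFs come out at 1902 and 1788 aa, matching the lengths given in the caption.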

Taxonomy and Similarity With Other Viruses

Botybirnaviruses are phylogenetically more closely related to viruses of the families Totiviridae and Quadriviridae than to viruses of the families Chrysoviridae and Megabirnaviridae, based on RdRp_4 conserved domain sequence similarity and phylogenetic analysis (Fig. 4). The family name, Botybirnaviridae, was initially proposed to the International Committee on Taxonomy of Viruses

Fig. 3 Genome organization of Botrytis porri botybirnavirus 1 (BpBV1). The genome comprises two dsRNA segments, dsRNA-1 and dsRNA-2. dsRNA-1 is 6215 bp long and contains one large ORF (nt positions 404–6112) encoding a polyprotein of 1902 aa. dsRNA-2 is 5879 bp long and also contains one large ORF (nt positions 405–5771), which encodes a polypeptide of 1788 aa. p70 and p85/80 are the structural proteins of BpBV1 inferred from peptide fingerprinting analyses. The highly conserved region (~500 bp) covering the entire 5′-UTR and part of the coding region is highlighted in red on the BpBV1 genome.

(ICTV) to accommodate the genus Botybirnavirus including four species (Table 1). The viruses belonging to the proposed family can be clearly separated from other viruses with bipartite genomes based on genome size (~12 kbp, which is smaller than that of Megabirnaviridae (~17 kbp) but larger than those of Partitiviridae (~3–4.8 kbp), Picobirnaviridae (~4 kbp) and Birnaviridae (~6 kbp)), genome organization, and nucleotide and amino acid sequence similarity of the encoded RdRp and CP sequences. The current taxonomic status of botybirnaviruses, as approved by the ICTV, is that Botrytis porri botybirnavirus 1 is the only species in the genus Botybirnavirus, which is as yet unassigned to a family.
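The genome-size criterion can be illustrated by summing the two segment lengths listed in Table 1 for each botybirnavirus. The following is a minimal sketch, not part of the original analysis; the variable names are illustrative, and the two reference sizes are simply the approximate values quoted in the text above.

```python
# Segment sizes in bp (dsRNA-1, dsRNA-2) as listed in Table 1;
# abbreviations follow the Fig. 4 caption.
SEGMENTS_BP = {
    "BpBV1": (6215, 5879),
    "SsBV1": (6457, 5965),
    "SsBV2": (6159, 5872),
    "ABV1": (6188, 5903),
    "BmBV1": (6435, 5987),
}

# Approximate genome sizes (kbp) quoted in the text for comparison.
MEGABIRNAVIRIDAE_KBP = 17.0
BIRNAVIRIDAE_KBP = 6.0

total_kbp = {virus: sum(sizes) / 1000 for virus, sizes in SEGMENTS_BP.items()}

for virus, kbp in total_kbp.items():
    in_window = BIRNAVIRIDAE_KBP < kbp < MEGABIRNAVIRIDAE_KBP
    print(f"{virus}: {kbp:.2f} kbp, between the reference sizes: {in_window}")
```

All five totals fall between 12.0 and 12.5 kbp, consistent with the ~12 kbp figure used to separate the group from other bipartite dsRNA viruses.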

Transmission and Distribution

There are no known natural vectors for the transmission of botybirnaviruses. Under laboratory conditions, BpBV1 is transmitted like most mycoviruses: vertically through sporulation, or horizontally by hyphal anastomosis between somatically compatible fungal strains. Vertical transmission of BpBV1 via sporulation is highly efficient in B. porri strain GarlicBc-72: BpBV1 was detected in 34 of 35 single-conidium (SC) isolates of strain GarlicBc-72, a vertical transmission rate of 97.1%. In contrast, horizontal transmission of BpBV1 was less efficient. In laboratory tests it could be transmitted from strain GarlicBc-72 only to SC35, one of the recovered SC isolates, and failed to be transmitted to two other strains of B. porri. This might be due to vegetative incompatibility among different B. porri strains. As isolate SC35 is isogenic to strain GarlicBc-72, the two should be vegetatively compatible, allowing successful viral transmission between them. Similar unsuccessful horizontal transmission was also observed for Sclerotinia sclerotiorum botybirnavirus 2 (SsBV2), and may likewise be caused by vegetative incompatibility between the donor and recipient S. sclerotiorum strains. Although laboratory tests showed that horizontal transmission of BpBV1 was limited, the same botybirnavirus, BpBV1, was surprisingly detected in both B. squamosa and S. sclerotiorum in addition to B. porri. Moreover, recent results from high-throughput sequencing also indicate that BpBV1 may be present in populations of B. cinerea. This suggests that interspecific transmission of BpBV1 may occur under field conditions without being limited by vegetative incompatibility among different fungal species, although the underlying mechanism is still unclear. Botybirnaviruses probably have a wide host range, mainly infecting ascomycetes including B. porri, B. squamosa, S. sclerotiorum, B. maydis, and Alternaria. Besides being detected in populations of B. porri and B. squamosa in China, botybirnaviruses were also found by high-throughput sequencing in soybean leaf samples from the United States, and BpBV1 was found in the S. sclerotiorum population of Australia, indicating that BpBV1 and other botybirnaviruses may be distributed worldwide.
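The vertical transmission rate quoted above is simply the fraction of single-conidium isolates in which BpBV1 was detected. As a small worked example (the function name `transmission_rate` is hypothetical and used only for illustration):

```python
def transmission_rate(infected: int, tested: int) -> float:
    """Percentage of tested isolates in which the virus was detected."""
    if tested <= 0:
        raise ValueError("at least one isolate must be tested")
    if not 0 <= infected <= tested:
        raise ValueError("infected must lie between 0 and tested")
    return 100.0 * infected / tested


# 34 of 35 single-conidium isolates of strain GarlicBc-72 carried BpBV1.
rate = transmission_rate(34, 35)
print(f"Vertical transmission rate: {rate:.1f}%")
```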

Biology

BpBV1 was reported to confer hypovirulence on the phytopathogenic fungus B. porri, the causal agent of garlic clove rot, garlic leaf blight, and leek leaf rot. Compared with BpBV1-free isolates of B. porri, the BpBV1-infected isolate grew slowly on potato dextrose agar, formed numerous mycelial sectors at the colony margin, and caused smaller lesions on leaves of garlic (Allium sativum). In addition, abundant vacuole-like membranous structures and membranous vacuoles/vesicles were observed in the cytoplasm of the virus-infected isolate. Three lines of experimental evidence support the conclusion that BpBV1 is responsible for the reduced mycelial growth and hypovirulence in B. porri. The first comes from the vertical transmission experiment. All BpBV1-infected SC isolates of strain GarlicBc-72 were hypovirulent and showed reduced mycelial growth. In contrast, the BpBV1-free SC isolate SC35 became virulent on garlic leaves and grew normally. Secondly, after BpBV1 was transmitted to the virus-cured strain SC35 through hyphal contact, three derivative isolates of strain SC35 carrying BpBV1 showed reduced mycelial

Fig. 4 Phylogenetic analysis of Botrytis porri botybirnavirus 1 and 21 other selected RNA viruses presented in a neighbor-joining (NJ) tree inferred from the RdRp sequences. The NJ tree was constructed by using the CLUSTAL_W program in the MEGA 6.0 software. The number labeled at each node indicates the bootstrap percentage (N = 1000). Abbreviated virus names and GenBank accession numbers for viral RdRps: SsRV1, Sphaeropsis sapinea RNA virus 1 (NC_001963); HvV190S, Helminthosporium victoriae virus 190S (U41345); LRV-1-1, Leishmania RNA virus 1-1 (M92355); LRV-2-1, Leishmania RNA virus 2-1 (U32108); TVV1-1, Trichomonas vaginalis virus 1-1 (U08999); TVV2-1, Trichomonas vaginalis virus 2-1 (AF127178); UmVH1, Ustilago maydis virus H1 (NC_003823); ScV-L-A, Saccharomyces cerevisiae virus L-A (J04692); ScV-L-BC, Saccharomyces cerevisiae virus L-BC (U01060); RnQV1-W1075, Rosellinia necatrix quadrivirus 1-W1075 (AB620063); RnQV1-W1118, Rosellinia necatrix quadrivirus 1-W1118 (AB744679); BpBV1, Botrytis porri botybirnavirus 1 (JF716350); SsBV1, Sclerotinia sclerotiorum botybirnavirus 1 (KP774592); SsBV2, Sclerotinia sclerotiorum botybirnavirus 2 (KT962972); ABV1, Alternaria botybirnavirus 1 (KX784491); BmBV1, Bipolaris maydis botybirnavirus 1 (MF034087); PCV, Penicillium chrysogenum virus (AF296339); HvV145S, Helminthosporium victoriae virus 145S (AF297176); RnMBV1-W779, Rosellinia necatrix megabirnavirus 1-W779 (AB512282); SsMBV1, Sclerotinia sclerotiorum megabirnavirus 1 (KP686398); AHV, Atkinsonella hypoxylon virus (L39125); WcCV1, White clover cryptic virus 1 (AY705784).

growth and impaired virulence on garlic leaves. The transfection experiment provided the third line of evidence: purified BpBV1 particles were introduced into protoplasts of the BpBV1-free virulent B. porri strain GarlicBc-38. The derivative isolate 38T also displayed debilitation symptoms (Fig. 1), including reduced mycelial growth and hypovirulence on garlic leaves. Interestingly, although BpBV1 caused severe debilitation of B. porri, its effects on B. squamosa seem to be mild: only slightly reduced mycelial growth and virulence on garlic leaves were observed in the BpBV1-infected B. squamosa strain. Whether this difference is caused by minor sequence differences between the BpBV1 strains from B. porri and B. squamosa, or by the different genetic backgrounds of the two hosts, remains to be investigated. Similarly, infection of S. sclerotiorum by botybirnaviruses can also reduce its mycelial growth and virulence. The hypovirulent S. sclerotiorum strain AH16 was infected by SsBV2 together with a mitovirus (Sclerotinia sclerotiorum mitovirus 4, SsMV4). To test whether SsBV2 and/or SsMV4 is responsible for the hypovirulence of strain AH16, attempts were made to transmit both viruses from strain AH16 to strain Ep-1PNA367R via hyphal contact, and to eliminate them from strain AH16 through protoplast regeneration. However, the transmission failed owing to vegetative incompatibility between the donor and recipient strains, and protoplast regeneration was also unable to cure strain AH16 of the viruses. Finally, SsBV2 virions were successfully introduced into strain Ep-1PNA367R. dsRNA extraction and RT-PCR detection revealed that the transfectant Ep-1PNA367RVT carried SsBV2 without SsMV4. Compared with the SsBV2-free strain Ep-1PNA367R, strain AH16 grew more slowly and showed reduced virulence on detached soybean leaves. In addition, strain Ep-1PNA367R formed sclerotia at

7 days post inoculation (dpi), whereas strain AH16 formed fewer and smaller sclerotia at 15 dpi. Notably, strain Ep-1PNA367RVT showed more severe debilitation than strain AH16, including slower mycelial growth, more impaired virulence, and no sclerotial production. In addition, SsBV2 could be transmitted from strain Ep-1PNA367RVT to the SsBV2-free strain Ep-1PNA367, and the newly SsBV2-infected strain exhibited biological traits similar to those of Ep-1PNA367RVT. Based on these observations, SsBV2 confers hypovirulence on S. sclerotiorum. In contrast, infection by SsBV1 alone has no significant impact on the culture morphology or virulence of S. sclerotiorum, whereas infection by SsBV1 together with its SatlRNA leads to slightly reduced virulence and a slower mycelial growth rate. Nevertheless, the culture morphology and sclerotial formation of strains carrying both SsBV1 and SatlRNA were comparable to those of strains infected by SsBV1 alone or free of both. Besides Botrytis spp. and S. sclerotiorum, botybirnaviruses have also been reported in the fungi B. maydis and Alternaria. Although the genomes of those two botybirnaviruses have been fully sequenced, their impacts on their fungal hosts are still unknown.

Further Reading

Liu, L., Wang, Q., Cheng, J., et al., 2015. Molecular characterization of a bipartite double-stranded RNA virus and its satellite-like RNA co-infecting the phytopathogenic fungus Sclerotinia sclerotiorum. Frontiers in Microbiology 6, 406.
Marzano, S.Y.L., Domier, L.L., 2016. Novel mycoviruses discovered from metatranscriptomics survey of soybean phyllosphere phytobiomes. Virus Research 213, 332–342.
Mu, F., Xie, J.T., Cheng, S.F., et al., 2018. Virome characterization of a collection of Sclerotinia sclerotiorum from Australia. Frontiers in Microbiology 8, 2540.
Ran, H., Liu, L., Li, B., et al., 2016. Coinfection of a hypovirulent isolate of Sclerotinia sclerotiorum with a new botybirnavirus and a strain of a mitovirus. Virology Journal 13, 92.
Wu, M.D., Jin, F.Y., Zhang, J., et al., 2012. Characterization of a novel bipartite double-stranded RNA mycovirus conferring hypovirulence in the phytopathogenic fungus Botrytis porri. Journal of Virology 86, 6605–6619.
Xiang, J., Fu, M., Hong, N., et al., 2017. Characterization of a novel botybirnavirus isolated from a phytopathogenic Alternaria fungus. Archives of Virology 162, 3907–3911.

Chrysoviruses (Chrysoviridae) - General Features and Chrysovirus-Related Viruses☆

Ioly Kotta-Loizou, Imperial College London, London, United Kingdom
Robert HA Coutts, University of Hertfordshire, Hatfield, United Kingdom
José R Castón, National Center for Biotechnology, Spanish National Research Council, Madrid, Spain
Hiromitsu Moriyama, Tokyo University of Agriculture and Technology, Tokyo, Japan
Said A Ghabrial†, Department of Plant Pathology, University of Kentucky, Lexington, KY, United States

© 2021 Elsevier Ltd. All rights reserved.

Glossary

Anastomosis The fusion between branches of hyphae.

Conidium Asexual spore of a fungus.

Heterokaryon A multinucleate fungal cell that contains genetically different nuclei.

Introduction

Chrysoviruses were initially discovered in the late 1960s and early 1970s, when isometric virus particles were noted in many industrial strains of the ascomycete Penicillium chrysogenum used for penicillin production. Virus infection was considered to be responsible for the instability of some of these strains, generating significant interest in the study of Penicillium viruses. Penicillium chrysogenum virus (PcV), the exemplar virus of the genus Alphachrysovirus, was one of the first mycoviruses subjected to extensive biochemical, biophysical and structural studies. PcV and the related viruses Penicillium brevi-compactum virus (PbV) and Penicillium cyaneo-fulvum virus (Pc-fV) have similar isometric particles, 35–40 nm in diameter, and are serologically related. These virus properties were undisputed in the initial studies; however, there was confusion as to whether these viruses contain three or four double-stranded (ds) RNA segments. Because none of these three Penicillium viruses had been characterized at the molecular level, they were originally grouped under the genus Chrysovirus and provisionally placed in the family Partitiviridae, on the assumption that their genomes were bipartite, with dsRNA 1 encoding the RNA-dependent RNA polymerase (RdRp) and dsRNA 2 encoding the major capsid protein (CP). Additional dsRNAs found in these isolates were designated dsRNA 3 and 4 and were presumed to represent defective or satellite dsRNAs, as described for some partitiviruses. Determination of the complete nucleotide sequence and genome organization of each of the four monocistronic dsRNA segments associated with PcV virions, together with those of another chrysovirus, Helminthosporium victoriae virus 145S (HvV145S), in the early 2000s led to a reconsideration of the classification of the genus Chrysovirus.
Based on the consistent and simultaneous presence of four dsRNA segments, the existence of extended regions of highly conserved terminal sequences at both ends of the four segments, together with sequence comparisons and phylogenetic analysis, it became clear that PcV and related viruses should not be classified with the family Partitiviridae. These observations led to the creation of a new family, the Chrysoviridae, to accommodate isometric dsRNA mycoviruses with multipartite genomes. The name chryso means gold in Greek and is derived from the specific epithet of P. chrysogenum, the fungal host of the prototype strain of the type species (PcV), which often produces a golden yellow pigment. More recently, the discovery and characterization of a large number of chrysovirus-related viruses resulted in a reorganization of the family Chrysoviridae: the original genus Chrysovirus was renamed Alphachrysovirus and a new genus, Betachrysovirus, was created with Botryosphaeria dothidea chrysovirus 1 as the type species. The two genera accommodate viruses that possess three to seven genomic segments and are not only present in ascomycetes and basidiomycetes but also associated with plants and insects.

Virion Properties

Virions of members of the family Chrysoviridae are isometric, proteinaceous in nature with no envelope, and 35–40 nm in diameter. The buoyant densities of the virions are in the range of 1.34–1.39 g cm⁻³ and their sedimentation coefficients (in Svedberg units) are in the range of 145S to 150S. Generally, each virion contains only one genomic dsRNA segment; however, purified preparations of PcV and Pc-fV can contain minor, distinctly sedimenting components that include empty particles and replication intermediates. Chrysovirus particles possess virion-associated RdRp activity, which catalyzes the synthesis of single-stranded (ss) RNA copies of the (+) strand of each of the genomic dsRNA molecules. In vitro transcription occurs by a conservative mechanism, whereby the released ssRNA represents the newly synthesized (+) strand.

☆ This work is dedicated to the memory of our friend and colleague Said Ghabrial, who sequenced the first chrysovirus in 2000 and passed away in November 2018.

† Deceased.

doi:10.1016/B978-0-12-809633-8.21319-8

Virion structure and composition

The three-dimensional structure and protein stoichiometry of several chrysoviruses have been analyzed by three-dimensional cryo-electron microscopy (3D cryo-EM) combined with analytical ultracentrifugation. The 3D cryo-EM structures of two chrysoviruses, PcV and Cryphonectria nitschkei chrysovirus 1 (CnCV1), were determined at sub-nanometer and near-atomic resolutions, respectively. Virions are isometric, non-enveloped, and about 40 nm in diameter, with a ~50 Å thick protein shell. The capsids of PcV and CnCV1 comprise 60 copies of a 982- and 902-amino acid (aa) polypeptide, respectively, arranged on an authentic T=1 icosahedral lattice. The most prominent features are 12 outwardly protruding pentons, each containing five copies of the CP (Fig. 1(a)). The uneven outer surface is similar to that of the capsid of Saccharomyces cerevisiae virus L-A (ScV-L-A; a totivirus with an undivided genome), and contrasts with the smooth outer surface of reoviruses, where the CP has a plate-like structure. PcV and CnCV1 capsids constitute partial exceptions to the most widespread arrangement among dsRNA viruses, the T=1 core with 60 equivalent dimers. The PcV CP is formed by a repeated, predominantly α-helical domain, indicative of ancestral gene duplication. These two domains are arranged in two sets of five: five A domains directly surround the icosahedral fivefold axis and five B domains are intercalated between them, forming a pseudodecamer. This organization with two similar motifs generates an architecture resembling that of the 120-subunit T=1 lattice of totivirus, reovirus, and cystovirus capsids, in which the two asymmetrical dimer components are arranged in a near-parallel fashion. A 120-subunit (or 120-domain) T=1 layer, which remains structurally undisturbed throughout the viral cycle, is therefore a ubiquitous architecture for management of dsRNA metabolism.
The near-atomic cryo-EM structure of the PcV virion at ~4 Å resolution shows that the complete 982-aa CP is built of 33 α-helices and 20 β-strands (Fig. 1(b)). The CP has two domains or halves with a very similar fold, an N-terminal domain (residues 1–498) connected by a 16-residue linker to a C-terminal domain (516–982) (Fig. 1(c)). Both PcV CP domains have a long (>30 Å) α-helix tangential to the capsid surface (Fig. 1(b); arrows). Despite the lack of sequence similarity between the two halves, the CP is an almost perfect structural duplication of a single domain in which many α-helices and β-strands show good matching (Fig. 2(a)). Superimposition of secondary structure elements shows, in addition to the N- and C-terminal arms, a single “hot spot” on the outer capsid surface into which structural and functional variations can be introduced by insertion of 50–100 residue segments (Fig. 2(b)). A preferential insertion site would allow the acquisition of new functions (e.g., new enzymatic activities) while preserving basic capsid protein folding. Notably, the basic PcV fold is well preserved among dsRNA viruses, and provides information regarding the progenitor fold of the dsRNA virus lineage and its evolutionary mechanisms. Structural comparisons of either of the PcV domains and the ScV-L-A or Rosellinia necatrix quadrivirus 1 (RnQV1) CPs highlight the same conserved PcV motif and the hot spot for insertions, in addition to two additional insertion zones that face the outer capsid surface (Fig. 2(b) and (c)). This co-localization suggests that these preferential insertion sites are ancient. Structural comparison of PcV and other dsRNA viruses in the family Reoviridae shows ancestral structural motifs or subdomains that have acted as a skeleton.
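The stoichiometry described above can be tallied in a few lines. This is a back-of-the-envelope sketch rather than a structural calculation: the counts come from the text, the PcV CP mass of 109 kDa is the value listed in Table 1, and the resulting shell mass is only a derived estimate.

```python
PENTONS = 12         # outwardly protruding pentons per capsid (text)
CP_PER_PENTON = 5    # CP copies in each penton (text)
DOMAINS_PER_CP = 2   # N- and C-terminal halves of the PcV CP (text)
PCV_CP_KDA = 109     # PcV CP mass as listed in Table 1

cp_copies = PENTONS * CP_PER_PENTON            # T=1 lattice
domain_count = cp_copies * DOMAINS_PER_CP      # pseudo-120-subunit layer
shell_mass_mda = cp_copies * PCV_CP_KDA / 1000 # derived estimate only

print(f"CP copies per capsid: {cp_copies}")
print(f"Structural domains per capsid: {domain_count}")
print(f"Estimated protein shell mass: {shell_mass_mda:.2f} MDa")
```

The 60 CP copies follow directly from 12 pentons of five subunits, and the duplicated fold gives the 120 domains that make the PcV shell resemble the 120-subunit T=1 layer of other dsRNA viruses.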

Fig. 1 Three-dimensional cryo-EM reconstruction of Penicillium chrysogenum virus (PcV) virions at ~4 Å resolution. (a) Radially color-coded surface-shaded virion T=1 capsid showing 12 outwardly protruding pentamers (orange). Bar = 100 Å. (b) Rainbow-colored ribbon diagram of the PcV capsid protein; arrows indicate the longest α-helices. (c) Atomic model of the PcV capsid highlighting the capsid protein with the N-terminal domain (yellow), the linker segment (red), and the C-terminal domain (yellow).

Fig. 2 Structural duplication of the Penicillium chrysogenum virus (PcV) capsid protein (CP) and structural homology of the PcV CP fold with the Saccharomyces cerevisiae virus L-A CP. (a) Superimposed N-terminal (blue) and C-terminal (yellow) domains of the PcV CP (non-superimposed regions are white). (b) Sequence alignment of N- and C-terminal domains resulting from the structural alignment. α-helices (rectangles) and β-strands (arrows) are rainbow-colored from blue (N terminus) to red (C terminus) for each domain. Triangles represent non-aligned segments (sizes indicated). (c) Structural alignment of the PcV CP N-terminal domain (blue) with L-A capsid protein (red).

Whereas dsRNA in the interior of reovirus cores is very compact (with an average genome density of 40 bp/100 nm³ and a spacing among dsRNA strands of 25–30 Å), PcV and CnCV1 have spacious capsids. Considering the volume available in the capsid interior and an average genome size of 3200 bp (each segment is separately encapsidated in a similar particle), the packed dsRNA would have an interstrand spacing of 40 Å, which closely matches the relatively low density in most fungal virus capsids. The looser packing of the dsRNA would probably improve template motion in the transcriptionally and replicatively active particle that contains 1–2 copies of the RdRp. Although most mycovirus T=1 capsids are negatively charged on their inner surface, the PcV CP inner surface has highly positively charged triskelion-shaped areas that maintain the encapsidated genome in close contact with the inner capsid surface. This arrangement of dsRNA might further facilitate genome mobility within the capsid, and/or have a structural role in capsid stability. The capsid is highly porous; whereas the ~11 Å diameter pores at the fivefold axis would allow the exit of viral transcripts but not of dsRNA, the ~5 Å diameter pores at the threefold axis would allow nucleotide diffusion.
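A rough estimate of the packing density makes the contrast with reovirus cores concrete. The sketch below is an approximation under stated assumptions, not the authors' calculation: it treats the capsid interior as a sphere whose radius is the outer radius (about 20 nm, from the ~40 nm diameter) minus the ~50 Å shell thickness, and it ignores the protruding pentons and the uneven inner surface.

```python
import math

OUTER_DIAMETER_NM = 40.0    # approximate virion diameter (text)
SHELL_THICKNESS_NM = 5.0    # ~50 Å protein shell (text)
SEGMENT_BP = 3200           # average dsRNA segment size per particle (text)
REOVIRUS_BP_PER_100NM3 = 40 # reovirus core density quoted in the text

# Assumption: the interior is a sphere of radius (outer radius - shell).
inner_radius_nm = OUTER_DIAMETER_NM / 2 - SHELL_THICKNESS_NM
interior_volume_nm3 = 4.0 / 3.0 * math.pi * inner_radius_nm ** 3

density_bp_per_100nm3 = SEGMENT_BP / interior_volume_nm3 * 100

print(f"Interior volume: {interior_volume_nm3:.0f} nm^3")
print(f"Estimated density: {density_bp_per_100nm3:.1f} bp/100 nm^3")
print(f"Fraction of reovirus density: "
      f"{density_bp_per_100nm3 / REOVIRUS_BP_PER_100NM3:.2f}")
```

Under these simplifications the estimate comes out at a little over 20 bp/100 nm³, roughly half the reovirus value, in line with the looser packing described in the text.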

Genome Organization

The virions of members of the family Chrysoviridae contain three to seven unrelated linear, separately encapsidated, monocistronic dsRNA segments 0.8–3.7 kbp in size (Table 1); the genomic structures of PcV, the prototype of the genus Alphachrysovirus, and

Table 1 Members of the Chrysoviridae family

Genus / virus name & abbreviation
  dsRNA & protein                                             5′-UTR   3′-UTR   Accession number

Genus Alphachrysovirus

Amasya cherry disease associated chrysovirus (ACDACV)
  dsRNA 1 (3399 bp); RdRP (124 kDa)                           86 nt    49 nt    AJ781166
  dsRNA 2 (3128 bp); CP (112 kDa)                             95 nt    48 nt    AJ781165
  dsRNA 3 (2833 bp); alphachryso-P4 (98 kDa)                  94 nt    84 nt    AJ781164
  dsRNA 4 (2498 bp); alphachryso-P3 (77 kDa)                  105 nt   359 nt   AJ781163
Anthurium mosaic-associated virus (AMAV)
  dsRNA 1 (3550 bp); RdRP (126 kDa)                           179 nt   71 nt    FJ899675
  dsRNA 2 (3448 bp); CP (113 kDa)                             293 nt   161 nt   FJ899676
  dsRNA 3 (3244 bp); alphachryso-P4 (97 kDa)                  393 nt   259 nt   FJ899677
Aspergillus fumigatus chrysovirus (AfuCV)
  dsRNA 1 (3560 bp); RdRP (129 kDa)                           128 nt   87 nt    FN178512
  dsRNA 2 (3159 bp); CP (109 kDa)                             167 nt   130 nt   FN178513
  dsRNA 3 (3006 bp); alphachryso-P3 (100 kDa)                 169 nt   161 nt   FN178514
  dsRNA 4 (2863 bp); alphachryso-P4 (95 kDa)                  154 nt   165 nt   FN178515
Brassica campestris chrysovirus 1 (BcCV1)
  dsRNA 1 (3639 bp); RdRP (131 kDa)                           101 nt   97 nt    KP782031
  dsRNA 2 (3567 bp); CP (126 kDa)                             135 nt   81 nt    KP782030
  dsRNA 3 (3337 bp); alphachryso-P4 (113 kDa)                 132 nt   139 nt   KP782029
Colletotrichum gloeosporioides chrysovirus 1 (CgCV1)
  dsRNA 1 (3397 bp); RdRP (126 kDa)                           71 nt    59 nt    KT581957
  dsRNA 2 (2869 bp); CP (102 kDa)                             71 nt    95 nt    KT581958
  dsRNA 3 (2630 bp); alphachryso-P4 (92 kDa)                  66 nt    92 nt    KT581959
Cryphonectria nitschkei chrysovirus 1 (CnCV1)
  dsRNA 1 (partial; 2978 bpᵃ); RdRP (110 kDa)                 82 nt    7?       GQ290649
  dsRNA 2 (partial; 2980 bpᵃ); CP (99 kDa)                    242 nt   17?      GQ290645
  dsRNA 3 (partial; 2552 bpᵃ); alphachryso-P4 (91 kDa)        ?        77       HM013825
  dsRNA 4 (partial; 2960 bpᵃ); alphachryso-P3 (81 kDa)        698 nt   30?      HM013826
Fusarium oxysporum chrysovirus 1 (FoCV1)
  dsRNA 1 (partial; 2574 bpᵃ); RdRP                           ?        ?        EF152346
  dsRNA 2 (partial; 648 bpᵃ); CP                              ?        ?        EF152347
  dsRNA 3 (partial, non-functional; 994 bpᵃ); alphachryso-P4  ?        ?        EF152348
Helminthosporium victoriae virus 145S (HvV145S)
  dsRNA 1 (3612 bp); RdRP (125 kDa)                           207 nt   144 nt   AF297176
  dsRNA 2 (3134 bp); CP (100 kDa)                             293 nt   153 nt   AF297177
  dsRNA 3 (2972 bp); alphachryso-P4 (93 kDa)                  302 nt   150 nt   AF297178
  dsRNA 4 (2763 bp); alphachryso-P3 (81 kDa)                  412 nt   209 nt   AF297179
Isaria javanica chrysovirus 1 (IjCV1)
  dsRNA 1 (3593 bp); RdRP (129 kDa)                           174 nt   65 nt    KX898416
  dsRNA 2 (3175 bp); CP (109 kDa)                             176 nt   74 nt    KX898417
  dsRNA 3 (3165 bp); alphachryso-P3 (108 kDa)                 147 nt   84 nt    KX898418
  dsRNA 4 (2874 bp); alphachryso-P4 (92 kDa)                  212 nt   130 nt   KX898419
Macrophomina phaseolina chrysovirus 1 (MpCV1)
  dsRNA 1 (3712 bp); RdRP (129 kDa)                           256 nt   138 nt   KP900886
  dsRNA 2 (3462 bp); CP (111 kDa)                             374 nt   112 nt   KP900887
  dsRNA 3 (2985 bp); alphachryso-P4 (94 kDa)                  278 nt   175 nt   KP900889
  dsRNA 4 (2927 bp); alphachryso-P3 (100 kDa)                 127 nt   139 nt   KP900888
Penicillium chrysogenum virus (PcV)
  dsRNA 1 (3562 bp); RdRP (129 kDa)                           144 nt   64 nt    AF296439
  dsRNA 2 (3200 bp); CP (109 kDa)                             157 nt   94 nt    AF296440
  dsRNA 3 (2976 bp); alphachryso-P3 (101 kDa)                 161 nt   83 nt    AF296441
  dsRNA 4 (2902 bp); alphachryso-P4 (95 kDa)                  162 nt   196 nt   AF296442
Persea americana chrysovirus (PaCV)
  dsRNA 1 (3421 bp); RdRP (126 kDa)                           97 nt    42 nt    KJ418374
  dsRNA 2 (3335 bp); CP (122 kDa)                             95 nt    99 nt    KJ418375
  dsRNA 3 (2857 bp); alphachryso-P4 (92 kDa)                  108 nt   135 nt   KJ418376
Raphanus sativus chrysovirus 1 (RsCV1)
  dsRNA 1 (3638 bp); RdRP (131 kDa)                           106 nt   115 nt   JQ045335
  dsRNA 2 (3517 bp); CP (124 kDa)                             128 nt   83 nt    JQ045336
  dsRNA 3 (3299 bp); alphachryso-P4 (110 kDa)                 134 nt   177 nt   JQ045337
Shuangao chryso-like virus (SCLV)
  dsRNA 1 (3461 bp); RdRP (127 kDa)                           45 nt    59 nt    MF176340
  dsRNA 2 (3140 bp); (108 kDa)                                200 nt   57 nt    MF176342
  dsRNA 3 (3080 bp); (98 kDa)                                 153 nt   218 nt   MF176341
  dsRNA 4 (3059 bp); (106 kDa)                                150 nt   74 nt    MF176343
Verticillium dahliae chrysovirus 1 (VdCV1)
  dsRNA 1 (3594 bp); RdRP (127 kDa)                           91 nt    176 nt   HM004067
  dsRNA 2 (3313 bp); CP (113 kDa)                             204 nt   58 nt    HM004068
  dsRNA 3 (2983 bp); alphachryso-P3 (84 kDa)                  635 nt   53 nt    HM004069
  dsRNA 4 (2932 bp); alphachryso-P4 (90 kDa)                  264 nt   196 nt   HM004070

Chrysoviruses (Chrysoviridae) - General Features and Chrysovirus-Related Viruses

Table 1

561

Continued

Genus

virus name & abbreviation

Betachrysovirus

Alternaria alternata chrysovirus 1 (AaCV1)

dsRNA & protein

dsRNA 1 (3647 bp); RdRP (124 kDa) dsRNA 2 (2857 bp); CP (82 kDa) dsRNA 3 (2785 bp); betachryso-P4 (83 kDa) dsRNA 4 (2772 bp); betachryso-P3 (84 kDa) dsRNA 5 (836 bp); 13 kDa Botryosphaeria dothidea chrysovirus 1 dsRNA 1 (3654 bp); RdRP (126 kDa) dsRNA 2 (2773 bp); CP (80 kDa) (BdCV1) dsRNA 3 (2597 bp); betachryso-P3 (82 kDa) dsRNA 4 (2574 bp); betachryso-P4 (77 kDa) Colletotrichum fructicola chrysovirus 1 dsRNA 1 (3620 bp); RdRP (126 kDa) dsRNA 2 (2801 bp); CP (89 kDa) (CfCV1) dsRNA 3 (2687 bp); betachryso-P3 (76 kDa) dsRNA 4 (2437 bp); betachryso-P4 (72 kDa) dsRNA 5 (1750 bp); 54 kDa dsRNA 6 (1536 bp); 33 kDa dsRNA 7 (1211 bp); 10 kDa Fusarium graminearum dsRNA dsRNA 1 (3580 bp); RdRP (128 kDa) mycovirus 2 dsRNA 2 (3000 bp); betachryso-P4 (95 kDa) (FgV2) dsRNA 3 (2982 bp); CP (94 kDa) dsRNA 4 (2748 bp); betachryso-P3 (91 kDa) dsRNA 5 (2414 bp); 80 kDa Fusarium oxysporum f. sp. dianthi dsRNA 1 (3555 bp); RdRP (129 kDa) mycovirus dsRNA 2 (2809 bp); betachryso-P4 (95 kDa) (FodV) dsRNA 3 (2794 bp); CP (92 kDa) dsRNA 4 (2646 bp); betachryso-P3 (92 kDa) Magnaporthe oryzae chrysovirus 1-A dsRNA 1 (3554 bp); RdRP (125 kDa) dsRNA 2 (3250 bp); betachryso-P3 (99 kDa) (MoCV1-A) dsRNA 3 (3074 bp); betachryso-P4 (84 kDa) dsRNA 4 (3043 bp); CP (85 kDa) dsRNA 5 (2879 bp); 63 kDa Penicillium janczewskii chrysovirus 1 dsRNA 1 (3698 bp); RdRP (127 kDa) dsRNA 2 (2899 bp); CP (84 kDa) (PjCV1) dsRNA 3 (2942 bp); betachryso-P3 (90 kDa) dsRNA 4 (2506 bp); betachryso-P4 (74 kDa) Penicillium janczewskii chrysovirus 2 dsRNA 1 (3540 bp); RdRP (124 kDa) dsRNA 2 (2699 bp); CP (84 kDa) (PjCV2) dsRNA 3 (2535 bp); betachryso-P3 (80 kDa) dsRNA 4 (partial; 2155 bpa); betachryso-P4 (70 kDa)

50 -UTR

30 -UTR

accession number

199 nt 389 nt 366 nt 334 nt 102 nt 230 nt 272 nt 293 nt 292 nt 154 nt 233 nt 273 nt 340 nt 61 nt 67 nt 252 nt 82 nt

94 nt 152 nt 94 nt 107 nt 386 nt 73 nt 254 mt 63 nt 128 nt 67 nt 90 nt 290 nt 109 nt 168 nt 560 nt 689 nt 84 nt

LC350277 LC350278 LC350279 LC350280 LC350281 KF688736 KF688737 KF688738 KF688739 MG425969 MG425970 MG425971 MG425972 MG425973 MG425974 MG425975 HQ343295

93 nt 105 nt 78 nt 97 nt 82 nt

279 nt 306 nt 162 nt 184 nt 53 nt

HQ343296 HQ343297 HQ343298 HQ343299 KP876629

84 nt 97 nt 97 nt 118 nt 343 nt 144 nt 161 nt 178 nt 168 nt 186 nt 223 nt 179 nt 94 nt 194 nt 164 nt 190 nt

88 nt 138 nt 56 nt 52 nt 102 nt 530 nt 443 nt 865 nt 149 nt 400 nt 253 nt 263 nt 86 nt 153 nt 100 nt ?

KP876630 KP876631 KP876632 AB560761 AB560762 AB560763 AB560764 AB700631 KT601115 KT601116 KT601117 KT601118 KT950836 KT950837 KT950838 KT950839

a

length of dsRNA is underestimated because only partial sequence is available.

Botryosphaeria dothidea chrysovirus 1 (BdCV1), the prototype of genus Betachrysovirus, each comprising four dsRNA segments, are schematically represented in Fig. 3. For the genus Alphachrysovirus, the largest segment, dsRNA 1, encodes the RdRp and the second largest segment, dsRNA 2, encodes the major CP. The earlier conflicting reports on whether PcV contains three or four segments were resolved when cDNA cloning and sequencing of the viral dsRNAs were completed, revealing that dsRNAs 3 and 4 differ in size by only 74 bp (Table 1) and co-migrate when separated by agarose gel electrophoresis. Sequence analysis and in vitro coupled transcription-translation assays showed that each of the four dsRNAs is monocistronic: each contains a single major open reading frame (ORF) and is translated into a single major product of the size predicted from its deduced amino acid sequence. These results clearly established that PcV virions contain four distinct dsRNA segments. Unlike those of PcV, HvV145S dsRNAs 3 and 4 are clearly resolved from each other when purified dsRNA preparations are subjected to agarose gel electrophoresis and, as shown in Table 1, differ in size by 209 bp. Numbers 1–4 were assigned to the PcV dsRNAs in order of decreasing size. Following the same criterion, the dsRNAs associated with all members of the genus Alphachrysovirus were assigned the numbers 1–4. Sequence comparisons, however, indicated that dsRNA 3 of HvV145S, Amasya cherry disease associated chrysovirus (ACDACV) and Macrophomina phaseolina chrysovirus 1 (MpCV1) is in fact the counterpart of PcV dsRNA 4 rather than dsRNA 3. Likewise, dsRNA 4 of these three chrysoviruses is the counterpart of PcV dsRNA 3. Since PcV was the first chrysovirus to be characterized at the molecular level, and to avoid confusion, the protein designations P3 and P4 as used for PcV were adopted and are referred to as alphachryso-P3 and alphachryso-P4. Therefore, whereas the alphachryso-P3 protein is the gene product of PcV dsRNA 3, it is the gene product of, for instance, HvV145S dsRNA 4. Additionally, some members of the genus Alphachrysovirus, including Brassica campestris chrysovirus 1 (BcCV1), Colletotrichum gloeosporioides chrysovirus 1 (CgCV1), Fusarium oxysporum chrysovirus 1 (FoCV1), Persea americana chrysovirus (PaCV), and Raphanus sativus chrysovirus 1 (RsCV1), appear to have only three dsRNA segments, lacking the dsRNA encoding the alphachryso-P3 protein.

Fig. 3 Schematic representation of the genomic organization of Penicillium chrysogenum virus (PcV), exemplar virus for the type species of genus Alphachrysovirus, and Botryosphaeria dothidea chrysovirus 1 (BdCV1), exemplar virus for the type species of genus Betachrysovirus. Each genome consists of four dsRNAs, each containing one open reading frame (ORF; colored boxes) flanked by 5′- and 3′-untranslated regions (UTRs; black double lines). The light colored box in dsRNA 1 represents the RdRP_4 motif; the dark colored box in dsRNA 1 represents an independent P-loop NTPase domain predicted by HHpred; the gray colored thick lines represent a region of homology between the proteins encoded by dsRNAs 1 and 3.

Regarding members of the genus Betachrysovirus, the RdRp is encoded by the largest segment, dsRNA 1, while the major CP is usually encoded by dsRNA 2, but it may also be encoded by dsRNA 3, in the case of Fusarium graminearum dsRNA mycovirus 2 (FgV2) and Fusarium oxysporum f. sp. dianthi mycovirus 1 (FodV), or by dsRNA 4, in the case of Magnaporthe oryzae chrysovirus 1-A (MoCV1-A). It should be noted that betachrysovirus CPs share no detectable homology with alphachrysovirus CPs. As for the alphachrysoviruses, the betachrysovirus proteins encoded by dsRNAs 3 and 4 of the exemplar virus BdCV1 were designated betachryso-P3 and betachryso-P4, respectively, and are not necessarily encoded by dsRNAs 3 and 4 of all known betachrysoviruses. For instance, MoCV1-A dsRNA 2 encodes betachryso-P3, while dsRNA 3 encodes betachryso-P4. Unlike alphachrysoviruses, no betachrysoviruses with three genomic segments have been reported; however, betachrysoviruses with five genomic segments, such as MoCV1-A, FgV2 and Alternaria alternata chrysovirus 1 (AaCV1), and one betachrysovirus with seven genomic segments, Colletotrichum fructicola chrysovirus 1 (CfCV1), have been described.
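The distinction between numbering segments by size and naming proteins by homology can be illustrated with a minimal Python sketch. The lengths and protein assignments are the PcV and HvV145S entries of Table 1; the function names are hypothetical and serve only this illustration.

```python
# Segment lengths (bp) and encoded proteins for two alphachrysoviruses, taken from Table 1.
SEGMENTS = {
    "PcV": [(3562, "RdRP"), (3200, "CP"),
            (2976, "alphachryso-P3"), (2902, "alphachryso-P4")],
    "HvV145S": [(3612, "RdRP"), (3134, "CP"),
                (2972, "alphachryso-P4"), (2763, "alphachryso-P3")],
}


def number_by_size(segments):
    """Number segments 1..n by decreasing length and return {number: protein}."""
    ordered = sorted(segments, key=lambda seg: seg[0], reverse=True)
    return {number: protein for number, (_, protein) in enumerate(ordered, start=1)}


def segment_encoding(virus, protein):
    """Return the dsRNA number that encodes `protein` in `virus` (hypothetical helper)."""
    numbering = number_by_size(SEGMENTS[virus])
    return next(number for number, name in numbering.items() if name == protein)


for virus in SEGMENTS:
    number = segment_encoding(virus, "alphachryso-P3")
    print(f"{virus}: alphachryso-P3 is encoded by dsRNA {number}")
```

With these data the size-based number of the alphachryso-P3 segment is 3 for PcV and 4 for HvV145S, which is why the protein names, not the segment numbers, identify homologs.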
Although the proteins alphachryso-P3 and alphachryso-P4, encoded by PcV dsRNA 3 and dsRNA 4, respectively, are of unknown function, protein database searches reveal that the PcV alphachryso-P3 sequence shares a ‘phytoreo S7 domain’ with a family consisting of several phytoreovirus P7 proteins known to be viral core proteins with nucleic acid binding activities (see below). The PcV alphachryso-P4 (and comparable proteins of other alphachrysoviruses) contains the motifs that form the conserved core of the ovarian tumor gene-like superfamily of predicted cysteine proteases. In contrast, little is known about the possible functions of the betachryso-P3 and betachryso-P4 proteins, which share no homology with the alphachryso-P3 and alphachryso-P4 proteins or with the betachrysovirus proteins encoded by dsRNAs 5–7.

The majority of the 5′- and 3′-untranslated regions (UTRs) of chrysovirus dsRNAs are relatively long, up to 698 nt and 865 nt, respectively (Table 1). The longest 5′-UTR belongs to CnCV1 dsRNA 4, while the longest 3′-UTR belongs to MoCV1-A dsRNA 5. The only 5′-UTR shorter than 50 nt belongs to dsRNA 1 of the insect-associated Shuangao chryso-like virus (SCLV). Both PcV (alphachrysovirus) and BdCV1 (betachrysovirus) UTRs have strictly conserved 5′- and 3′-termini (Fig. 4) and there appears to be some sequence similarity between them, for instance the poly(A) sequence at the 5′-terminus or the ‘GUGU’ sequence at the 3′-terminus. In addition, a 40–75 nt region with high sequence identity is present in the 5′-UTR of all four PcV dsRNAs. A second region of strong sequence similarity is present immediately downstream, which consists of a stretch of 30–50 nt containing a reiteration of the sequence ‘CAA’. The (CAA)n repeats are found to a lesser extent in the 5′-UTRs of BdCV1 and are similar to the enhancer elements present at the 5′-UTRs of tobamoviruses.
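A run of (CAA)n repeats such as the one described above can be located with a short regular-expression scan. The following is a minimal Python sketch; the sequence is an invented toy string, not a real chrysovirus UTR, and the function name and the `min_repeats` threshold are assumptions made for illustration.

```python
import re


def caa_runs(utr, min_repeats=2):
    """Return (start, repeat_count) for each run of tandem CAA triplets.

    `start` is a 0-based position in the sequence; runs shorter than
    `min_repeats` consecutive triplets are ignored.
    """
    pattern = re.compile(r"(?:CAA){%d,}" % min_repeats)
    return [(m.start(), len(m.group()) // 3) for m in pattern.finditer(utr.upper())]


# Invented toy 5'-UTR fragment (RNA alphabet), for illustration only.
toy_utr = "GAUAAAAACAACAACAAGGCAACAAUC"
print(caa_runs(toy_utr))
```

On the toy string the scan reports one run of three repeats starting at position 8 and one run of two repeats starting at position 19.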


Fig. 4 Comparison of the 5′- and 3′-untranslated region (UTR) termini of all dsRNAs of the alphachrysovirus Penicillium chrysogenum virus (PcV) and the betachrysovirus Botryosphaeria dothidea chrysovirus 1 (BdCV1). Asterisks and colons signify identical and highly conserved nucleotides at the indicated positions, respectively. Identical nucleotides in the same position for both viruses are shaded.

Fig. 5 Comparison of the eight conserved motifs of all chrysovirus RdRps presented in Table 1. Numbers in parentheses correspond to amino acid residues present between the motifs. Asterisks and colons respectively signify identical and highly conserved amino acid residues at the positions indicated.

Genome Expression and Replication

Chrysovirus RdRps

The largest dsRNA segment (dsRNA 1) of all chrysoviruses so far sequenced contains a single large ORF encoding the RdRp. The molecular mass of chrysovirus RdRps ranges from 110 to 131 kDa (Table 1). These values are consistent with those estimated by SDS-PAGE of the in vitro translation products of full-length transcripts derived from both PcV and HvV145S dsRNA 1 cDNAs. Examination of the deduced amino acid sequence of the RdRp ORF reveals the presence of the eight conserved motifs characteristic of RdRps of dsRNA viruses present in simple eukaryotes (Fig. 5). Additionally, there is an independent P-loop NTPase domain at the N-terminus of alphachrysovirus and betachrysovirus RdRps, as shown by HHpred, a remote protein homology detection tool. In the case of alphachrysoviruses this domain overlaps with a region homologous to alphachryso-P3. Members of the P-loop NTPase domain superfamily are characterized by conserved nucleotide phosphate-binding motifs, also referred to as the Walker A motif (GxxxxGK[S/T]) and the Walker B motif (hhhh[D/E], where h is a hydrophobic residue).
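The two Walker motif patterns quoted above translate directly into regular expressions. The Python sketch below scans a protein sequence for candidate matches; the set of residues treated as hydrophobic is an assumption (the text does not define it), the test sequence is invented, and the function name is hypothetical. Because the Walker B pattern is very permissive, hits are candidates only and say nothing about function on their own.

```python
import re

# Walker A as written in the text: G, any four residues, G, K, then S or T.
WALKER_A = re.compile(r"G.{4}GK[ST]")

# Assumption: residues counted as hydrophobic ("h"); other definitions exist.
HYDROPHOBIC = "AVILMFWY"
# Walker B as written in the text: four hydrophobic residues followed by D or E.
WALKER_B = re.compile(f"[{HYDROPHOBIC}]{{4}}[DE]")


def find_walker_motifs(protein):
    """Return (motif_name, 0-based start, matched residues) for each candidate hit."""
    seq = protein.upper()
    hits = []
    for name, pattern in (("Walker A", WALKER_A), ("Walker B", WALKER_B)):
        hits.extend((name, m.start(), m.group()) for m in pattern.finditer(seq))
    return hits


# Invented toy sequence, for illustration only.
toy_protein = "MSTGAPGSGKTTLAKRVLIVDEAHQ"
for hit in find_walker_motifs(toy_protein):
    print(hit)
```

On the toy sequence the scan reports a Walker A candidate (GAPGSGKT) at position 3 and a Walker B candidate (VLIVD) at position 16.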

Chrysovirus CPs

The second largest dsRNA segment (dsRNA 2) of the alphachrysoviruses sequenced to date contains a single large ORF encoding the CP, whereas the CP of the betachrysoviruses sequenced to date may also be encoded by dsRNA 3 or 4. The molecular mass of chrysovirus CPs ranges from 80 to 126 kDa (Table 1). The predicted size of PcV CP (109 kDa) is similar to that estimated by SDS-PAGE of purified PcV virions as well as that determined for the in vitro translation product of a full-length transcript of dsRNA 2 cDNA. Direct evidence that dsRNA 2 of both the alphachrysovirus PcV and the betachrysovirus BdCV1 encodes CP was provided by amino acid sequencing of a tryptic peptide derived from a gradient-purified PcV capsid.
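The predicted CP size can be cross-checked against the segment and UTR lengths in Table 1 with simple arithmetic. The Python sketch below does this for PcV dsRNA 2 (3200 bp, 5′-UTR 157 nt, 3′-UTR 94 nt). It rests on two assumptions that are not from the source: that the stop codon is counted as part of the coding region, and that the average residue mass is about 110 Da, a common rule of thumb. The function names are hypothetical.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110


def orf_length_aa(segment_bp, utr5_nt, utr3_nt):
    """Number of residues encoded between the UTRs, excluding the stop codon."""
    coding_nt = segment_bp - utr5_nt - utr3_nt
    if coding_nt % 3:
        raise ValueError("coding region is not a whole number of codons")
    return coding_nt // 3 - 1  # the last codon is assumed to be the stop codon


def approx_mass_kda(n_residues):
    """Rough molecular mass estimate in kDa from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000


# PcV dsRNA 2 values from Table 1.
cp_residues = orf_length_aa(3200, 157, 94)
print(cp_residues, round(approx_mass_kda(cp_residues)))
```

Under these assumptions the ORF encodes 982 residues, or roughly 108 kDa, close to the 109 kDa listed for PcV CP in Table 1; the small difference reflects the approximate residue mass.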

Alphachryso-P3 Shares a ‘Phytoreo S7 Domain’ With Core Proteins of Phytoreoviruses

In PcV, Aspergillus fumigatus chrysovirus (AfuCV), Isaria javanica chrysovirus 1 (IjCV1), and Verticillium dahliae chrysovirus 1 (VdCV1), dsRNA 3 encodes the alphachryso-P3 protein, whereas in HvV145S, ACDACV, and MpCV1 the corresponding alphachryso-P3 is encoded by dsRNA 4. Although the function of alphachryso-P3 is not known, sequence analysis and database searches offer some clues. ProDom database searches reveal that alphachryso-P3 sequences share a ‘phytoreo S7 domain’ with a family consisting of several phytoreovirus P7 proteins known to be viral core proteins with nucleic acid binding activities. The consensus for the three chrysoviruses is [X(V/I)V(M/L)P(A/M)G(C/H)GK(T/S)T-(L/I)]. Phytoreovirus P7 proteins bind to their corresponding P1 (transcriptase/replicase) proteins, which bind to the genomic dsRNAs. It is of interest, in this regard, that the N-terminal regions of all alphachryso-P3s (encompassing the amino acids within positions 1–500) share significant sequence similarity with comparable N-terminal regions of the putative RdRps encoded by chrysovirus dsRNA 1. The regions in the dsRNA 1-encoded proteins with high similarity to alphachryso-P3 occur upstream of the eight highly conserved motifs characteristic of RdRps of dsRNA viruses of simple eukaryotes. The significance of this sequence similarity to the function of alphachryso-P3 is not known for certain, but one may speculate that the N-terminal region of these proteins might play a role in viral RNA binding and packaging. In contrast, betachryso-P3 shares no detectable homology with alphachryso-P3, contains no phytoreo S7 domain and has no similarity with the betachrysovirus RdRps.

Chryso-P4 is a Putative Protease and Virion Associated as a Minor Protein

Present evidence, based on amino acid sequencing of a tryptic peptide derived from gradient-purified PcV virions, strongly supports the conclusion that PcV alphachryso-P4 is a virion-associated minor protein. Additionally, alphachryso-P4 encoded by some chrysoviruses contains the motif PGDGXCXXHX. This motif (I), along with motifs II (with a conserved K), III, and IV (with a conserved H), forms the conserved core of the ovarian tumor gene-like superfamily of predicted cysteine proteases. Multiple alignments showed that motifs I–IV are also present in other viruses including Agaricus bisporus virus 1, a tentative member of the family Chrysoviridae. Whether the RNAs of these viruses indeed code for the predicted proteases remains to be investigated.

Replication of Chrysoviruses

There is very limited information on how chrysoviruses replicate their dsRNAs. The virion-associated RdRp catalyzes in vitro end-to-end transcription of each dsRNA to produce mRNA by a conservative mechanism. Purified virions containing both ssRNA and dsRNA have been isolated from Penicillium spp. infected with PcV or Pc-fV and may represent replication intermediates.

Taxonomy and Phylogenetic Analysis

The Penicillium chrysoviruses PcV, Pc-fV, and PbV are serologically related and have similar biochemical and biophysical properties. Although molecular data are available only for PcV, the three viruses could be considered strains of the same virus for all practical purposes. The fact that these closely related viruses occur in different fungal species suggested that transmission by means other than anastomosis may occur naturally, since heterokaryon formation between different fungal species is unlikely. Horizontal transmission of fungal viruses in nature, however, has yet to be demonstrated and, in the case of viruses of Penicillium spp., may not need to occur, since the viruses replicate in parallel with their hosts and are carried intracellularly during vegetative growth of the host. Furthermore, the viruses are efficiently disseminated by vertical transmission via the conidia of Penicillium spp. It seems feasible, however, that virus infection arose early in the phylogeny of P. brevi-compactum, P. chrysogenum, and P. cyaneo-fulvum, before they diverged, and that the resident virus remained associated with them during their subsequent evolution. BLAST searches with chrysovirus RdRp amino acid sequences showed significantly high sequence similarity with the RdRps of several members of the family Totiviridae. Interestingly, no significant hits were evident with any of the viruses in the family Partitiviridae, further validating the removal of chrysoviruses from the family Partitiviridae and their placement in the newly created family Chrysoviridae.
The conclusion that chrysovirus RdRps are more closely related to those of totiviruses than to those of partitiviruses is supported by published phylogenetic analyses of RdRp conserved motifs and flanking sequences of chrysoviruses and viruses in the families Totiviridae and Partitiviridae, and by comparisons of the conserved motifs of chrysovirus RdRps with those of totiviruses and partitiviruses. For instance, the database Pfam, which includes annotations and multiple sequence alignments generated using hidden Markov models, classifies chrysovirus and totivirus RdRps in the protein family RdRP_4 (PF02123), while partitivirus RdRps belong to RdRP_1 (PF00680).

Fig. 6 Maximum likelihood phylogenetic tree created based on the RdRP sequences of chrysoviruses. The sequences were aligned with MUSCLE as implemented by MEGA 6, all positions with less than 30% site coverage were eliminated and the LG + G + I + F substitution model was used. At the end of the branches, blue circles indicate that the virus infects fungi; green circles indicate that the virus infects or is associated with plants; red circles indicate that the virus infects or is associated with insects. Saccharomyces cerevisiae virus L-A (ScV-L-A), a member of the genus Totivirus, family Totiviridae, was used as the outgroup.

The recent discoveries of new viruses related to chrysoviruses have led to an expansion and reorganization of the family Chrysoviridae. The family now has two genera, Alphachrysovirus and Betachrysovirus, accommodating seventeen and eight species, respectively, based on the formation of two distinct clades as evidenced by phylogenetic analysis (Fig. 6). The analysis was based on the sequence of the chrysovirus RdRp, which is the only protein with detectable homology among all viruses belonging to the two genera. The rest of the viral proteins, including the CP, are conserved within but not among genera. The one exception is SCLV, an insect-associated virus belonging to the genus Alphachrysovirus, whose proteins, the RdRp notwithstanding, are non-homologous to those of the other members of genus Alphachrysovirus or Betachrysovirus. An interesting feature of chrysoviruses is the variability in the number of their genomic segments, which ranges from three to seven. More specifically, the known members of the genus Alphachrysovirus can have three or four segments as their genome. Six members of the genus Alphachrysovirus have three genomic segments and most of them infect or are associated with plants. Therefore, alphachryso-P3 is not necessary for virus replication, at least under certain conditions that have not yet been characterized. Additionally, since FoCV1 alphachryso-P4 appears to be non-functional due to the presence of nonsense mutations in the open reading frame, it is feasible that the replication cycle of chrysoviruses can be completed in the presence of only two proteins, the RdRp and the CP. The members of the genus Betachrysovirus, a very recent addition to the family Chrysoviridae, are not as well characterized. They typically have four or five genomic segments and, in the case of CfCV1, up to seven segments. The RdRp, the CP, betachryso-P3 and betachryso-P4 are conserved among all betachrysoviruses, while the rest, if present, are non-homologous among members of the genus.
The RdRp of the betachrysoviruses is most closely related to that of their sister taxon Alphachrysovirus, but the proteins encoded by the other segments, including the CP, demonstrate no clear homology with those of the alphachrysoviruses or other proteins available in the databases.
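The Fig. 6 caption states that alignment positions with less than 30% site coverage were eliminated before tree building. The Python sketch below shows one way such a filter can work, assuming that coverage means the fraction of sequences carrying a residue (anything other than a gap or missing-data symbol) at a column. It is an illustration of the idea, not a reproduction of the MEGA implementation; the function name, the gap symbols and the toy alignment are all assumptions.

```python
def filter_columns(alignment, min_coverage=0.30, gap_chars="-?"):
    """Drop columns in which fewer than `min_coverage` of sequences carry a residue.

    `alignment` maps sequence names to aligned strings of equal length.
    """
    names = list(alignment)
    rows = [alignment[name] for name in names]
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        col for col in range(length)
        if sum(row[col] not in gap_chars for row in rows) / len(rows) >= min_coverage
    ]
    return {name: "".join(row[col] for col in keep) for name, row in zip(names, rows)}


# Invented toy alignment of four sequences, for illustration only.
toy_alignment = {
    "virus1": "MK-LA",
    "virus2": "M--L-",
    "virus3": "MR-L-",
    "virus4": "M-QL-",
}
print(filter_columns(toy_alignment))
```

In the toy alignment the third and fifth columns each carry a residue in only one of four sequences (25% coverage) and are removed, while the second column (50% coverage) is kept.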

Biology and Effects of Chrysoviruses on Fungal Hosts

Mycoviruses may reduce or increase the virulence of their fungal hosts and as such are considered to have potential as biological control agents. Cryphonectria hypovirus 1 (CHV1), a positive-sense single-stranded RNA mycovirus belonging to the family Hypoviridae, is the best example, attenuating the virulence of its host fungus Cryphonectria parasitica. Viruses belonging to the family Chrysoviridae have also been reported to alter the pathogenicity of their host fungi. Within the genus Alphachrysovirus, AfuCV reduces growth and pathogenicity of its human pathogenic host. Additionally, five mycoviruses that belong to the genus Betachrysovirus have been reported to alter the pathogenicity of their host fungi against plants, including MoCV1-A, AaCV1, BdCV1, Fusarium graminearum mycovirus-China 9 (FgV-Ch-9) and Aspergillus thermomutatus chrysovirus 1 (AthCV1). For instance, MoCV1-A infection generally confers hypovirulence to the fungus but is also a driving force to generate different pathogenic races of the fungus. In another example, AaCV1 exhibits two contrasting effects: it impairs the growth of the host fungus while rendering the fungus hypervirulent to the plant, with increased production of a host-specific toxin.

MoCV1-A and MoCV1-B are the first viruses reported to cause hypovirulence traits in the rice blast pathogen Magnaporthe oryzae, including impaired growth of host cells, altered colony morphology, and reduced pigmentation. The host fungus infected with MoCV1-B showed a remarkable albino phenotype on potato dextrose agar (PDA) medium, indicating that melanin biosynthesis appears to be suppressed by MoCV1-B infection. The cell wall of a MoCV1-B-infected strain was loose and enlarged, and staining with calcofluor-white showed that the cell wall was also damaged by MoCV1-B infection. No conidial formation was observed in M. oryzae infected with MoCV1-B on PDA. Pathogenic races of M. oryzae are determined by a gene-for-gene system, where an avirulence gene in the pathogen induces disease resistance in a rice variety with a corresponding resistance (R) gene. To examine whether MoCV1-A infection affects pathogenic races of M. oryzae, a virus-free and a MoCV1-A-infected M. oryzae strain were inoculated onto different rice varieties. Inoculation of the R gene-free rice variety showed that MoCV1-A infection resulted in reduced fungal virulence. However, when spray or leaf-sheath inoculation methods were used to inoculate monogenic rice lines carrying different R genes, MoCV1-A-infected and MoCV1-A-free M.
oryzae strains caused different lesion types (resistant to susceptible or vice versa) on individual rice varieties. These data suggest that MoCV1-A infection can alter the pathogenicity of the host M. oryzae from avirulence to virulence, or from virulence to avirulence, depending on the rice variety. Recently, it has been found that infection of the Japanese pear pathotype fungus, Alternaria alternata, with AaCV1 simultaneously impaired growth of the host fungus and increased levels of the host-specific AK-toxin. This is another example of a mycovirus infection causing changes in pathogenic races of the host fungus, because A. alternata can infect some specific varieties of Japanese pear (Pyrus pyrifolia Nakai) cultivar Nijisseiki, but not others. It is likely that the enhancement of host fungal pathogenicity in some varieties, without any mutation in the avirulence genes, is one of the strategies used by mycoviruses to survive in the agricultural ecosystem, where humans cultivate many plant varieties with different R genes to reduce damage caused by plant disease. Mycoviruses may increase survival rates by retaining the diversity of avirulence genes in their host fungi, which are important for fungal adaptation to plants.

Since only limited sequence information is currently available for betachrysovirus dsRNA genomes, it is difficult to assign potential roles in plant pathogenicity to individual dsRNA-encoded products. Recently, attempts to dissect the roles of the five MoCV1-A dsRNA-encoded proteins in pathogenicity using overexpressed proteins have been reported. The putative protein encoded by MoCV1-A dsRNA 1 contains eight conserved motifs characteristic of RdRps and is assumed to function in this role. However, based on BLAST searches, it was not possible to predict the functions of the other four putative proteins. 
Therefore, shuttle vectors were constructed by ligating each of the remaining four MoCV1-A ORF sequences predicted from the genomic sequences of dsRNAs 2–5 downstream of a low expression promoter (ADH1) or a high expression promoter (TDH3) for expression in yeast cells. The influence of each protein on yeast cell growth was investigated in liquid cultures. Optical density, viable cell number, glucose concentration, and pH were measured together with observations of cell morphology and immunological analyses. Abnormalities in cell morphology, including the appearance of enlarged vacuoles and vesicles, were observed when the CP (encoded by dsRNA 4; Table 1) was overexpressed in yeast cells. A series of cultivation tests revealed that CP expression also caused a decrease in the rate of cell proliferation and a decrease in cell life span. Overexpression of the CP in the human pathogenic budding yeast Cryptococcus neoformans also caused a decrease in the growth rate and an increase in the emergence and enlargement of vacuoles. Additionally, the formation of capsules, which are involved in the pathogenicity of C. neoformans, was also reduced in the CP-expressing cells, suggesting a reduction in pathogenic potential. Expression of a CP-GFP fusion protein in S. cerevisiae resulted in the formation of abnormal cell aggregates. In MoCV1-A, CP expression also resulted in significant inhibition of growth at high temperatures (35 °C and 37 °C) as compared to cells grown at the optimal temperature (30 °C), plus reduced expression of stress-response genes and increased expression of translation-related genes. The MoCV1-A CP showed significant similarity to related proteins from other viruses in genus Betachrysovirus. Multiple alignments of the CP-related protein sequences showed that their central regions (aa 210–591 in MoCV1-A CP) are relatively conserved. 
Indeed, yeast transformants expressing the conserved central region of the MoCV1-A CP (aa 325–575) showed impaired growth phenotypes similar to those observed in yeast expressing the full-length MoCV1-A CP. The yeast heterologous expression system revealed that the CPs of MoCV1-A and AaCV1 (in this case encoded by dsRNA 2) were responsible for growth inhibition of the yeasts S. cerevisiae and C. neoformans, suggesting that the CP, in addition to its structural role, may have been preserved in betachrysoviruses as a protein implicated in hypovirulence. This hypothesis is supported by a recent report showing that the structural protein of another betachrysovirus, FgV-Ch-9 (closely related to FgV2), acts as a symptom determinant.

Acknowledgment

This work is dedicated to the memory of our friend and colleague Said Ghabrial, who sequenced the first chrysovirus in 2000 and passed away in November 2018.


Further Reading

Bhatti, M.F., Jamal, A., Petrou, M.A., et al., 2011. The effects of dsRNA mycoviruses on growth and murine virulence of Aspergillus fumigatus. Fungal Genetics and Biology 48, 1071–1075.
Ejmal, M.A., Holland, D.J., MacDiarmid, R.M., Pearson, M.N., 2018. The effect of Aspergillus thermomutatus chrysovirus 1 on the biology of three Aspergillus species. Viruses 10, E539.
Gómez-Blanco, J., Luque, D., Gonzalez, J.M., et al., 2012. Cryphonectria nitschkei virus 1 structure shows that the capsid protein of chrysoviruses is a duplicated helix-rich fold conserved in fungal double-stranded RNA viruses. Journal of Virology 86, 8314–8318.
Luque, D., Gómez-Blanco, J., Garriga, D., et al., 2014. Cryo-EM near-atomic structure of a dsRNA fungal virus shows ancient structural motifs preserved in the dsRNA viral lineage. Proceedings of the National Academy of Sciences of the United States of America 111, 7641–7646.
Luque, D., González, J.M., Garriga, D., et al., 2010. The T = 1 capsid protein of Penicillium chrysogenum virus is formed by a repeated helix-rich core indicative of gene duplication. Journal of Virology 84, 7256–7266.
Moriyama, H., Urayama, S.I., Higashiura, T., Le, T.M., Komatsu, K., 2018. Chrysoviruses in Magnaporthe oryzae. Viruses 10, E697.
Okada, R., Ichinose, S., Takeshita, K., et al., 2018. Molecular characterization of a novel mycovirus in Alternaria alternata manifesting two-sided effects: Down-regulation of host growth and up-regulation of host plant pathogenicity. Virology 519, 23–32.
Urayama, S., Kimura, Y., Katoh, Y., et al., 2016. Suppressive effects of mycoviral proteins encoded by Magnaporthe oryzae chrysovirus 1 strain A on conidial germination of the rice blast fungus. Virus Research 223, 10–19.
Wang, L., Jiang, J., Wang, Y., et al., 2014. Hypovirulence of the phytopathogenic fungus Botryosphaeria dothidea: Association with a coinfecting chrysovirus and a partitivirus. Journal of Virology 88, 7517–7527.
Zhai, L., Zhang, M., Hong, N., et al., 2018. Identification and characterization of a novel hepta-segmented dsRNA virus from the phytopathogenic fungus Colletotrichum fructicola. Frontiers in Microbiology 9, 754.

Fungal Partitiviruses (Partitiviridae)
Eeva J Vainio, Natural Resources Institute Finland (Luke), Helsinki, Finland
© 2021 Elsevier Ltd. All rights reserved.
This is an update of S. Tavantzis, Partitiviruses of Fungi, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00405-2.

Glossary

Ascomycete/basidiomycete Ascomycetes (phylum Ascomycota) are fungi that produce sexual spores inside sac-like structures called asci, whereas basidiomycetes (phylum Basidiomycota) produce sexual spores externally on specialized cells called basidia.
Capsid Protective shell composed of multiple copies of protein subunits that encapsidate the viral genome.
Heterokaryotic/homokaryotic Nuclear condition of hyphal cells among many basidiomycetes. The germination of a single basidiospore results in the formation of a homokaryotic (i.e., multinucleate but effectively haploid) primary mycelium. A compatible mating between two homokaryotic mycelia produces a heterokaryotic (i.e., multinucleate but effectively diploid) secondary mycelium with two different nuclear haplotypes. Two different heterokaryons are somatically incompatible unless they are very closely related. In ascomycetes, the vegetative mycelium is usually haploid and uninucleate.
Hyphal anastomosis Fusion of hyphal cells allowing cytoplasmic exchange between contacting mycelia.
Hypovirulence Reduced ability to cause disease. Often associated with the presence of mycoviruses in plant pathogenic fungi.
Icosahedral symmetry Arrangement of the capsid protein subunits of isometric (spherical) viruses in the symmetry of an icosahedron consisting of 60 identical objects or asymmetric units. In the so-called ‘T = 2’ symmetry, 120 identical protein monomers form 60 asymmetric dimers.
Somatic incompatibility Process of allorecognition in fungi that keeps different fungal genotypes separated. Somatic (vegetative) incompatibility leads to programmed cell death of heterokaryotic cells formed after anastomosis between hyphae of genetically incompatible strains.

Introduction

Viruses infecting fungi (mycoviruses) were first discovered by M. Hollings in diseased strains of the commercially cultivated mushroom Agaricus bisporus in the 1960s. Later during the same decade, the first members of the current virus family Partitiviridae were found in isolates of the ascomycetous mold fungus Penicillium spp. Mycoviruses are now known to be very common among fungi. They are found in diverse fungal phyla including ascomycetes, basidiomycetes, chytridiomycetes and zygomycetes with various lifestyles ranging from symbionts to pathogens. The following classified virus families include viruses infecting fungi: Alphaflexiviridae, Barnaviridae, Botourmiaviridae, Deltaflexiviridae, Endornaviridae, Gammaflexiviridae, Hypoviridae and Narnaviridae with positive-sense single-stranded (ss) RNA genomes; Mymonaviridae with negative-sense ssRNA genomes; Amalgaviridae, Chrysoviridae, Megabirnaviridae, Partitiviridae, Reoviridae, Totiviridae and Quadriviridae with double-stranded (ds) RNA genomes; and Genomoviridae with ssDNA genomes. In addition, the classified genus Botybirnavirus includes dsRNA viruses found so far only in fungi. Virus families Metaviridae and Pseudoviridae contain retrotransposable elements found in ascomycetous yeast fungi, plants, insects and nematodes.

Viruses belonging to the family Partitiviridae are classified into five different genera with characteristic hosts for members of each genus: either plants or fungi for genera Alphapartitivirus and Betapartitivirus; fungi for genus Gammapartitivirus; plants for genus Deltapartitivirus; and protozoa for genus Cryspovirus. The name “partitivirus” originates from the Latin word “partitius”, which means “divided” and refers to the genome structure of these viruses, which is bipartite and consists of two linear monocistronic dsRNAs. The term “cryptovirus” (from Greek “crypto,” meaning “hidden, covered, or secret”) was formerly used in genus names of plant-infecting Partitiviridae members, but since many of the fungal partitiviruses can also be considered cryptic (symptomless), this term is no longer used to differentiate between plant and fungal partitiviruses. Plant and protozoal partitiviruses are discussed in detail elsewhere in this encyclopedia.

Classification

This article focuses on the genera Alphapartitivirus, Betapartitivirus, and Gammapartitivirus, which include viruses that infect filamentous fungi. In addition to characteristic host ranges within each genus, partitivirus genera are demarcated based on genome segment and protein lengths within a typical size range for each genus, separate phylogenetic grouping of RdRP sequences from each genus, and level of amino acid sequence identity (less than 24% in pairwise comparisons of viruses from different genera). There are currently 28 fungal partitiviruses that have been assigned to a genus: ten alphapartitiviruses, ten betapartitiviruses, and eight gammapartitiviruses. In addition, the genomes of several more partitiviruses have been completely sequenced and described in scientific publications. The classified species of the family Partitiviridae found in fungi are listed in Table 1, and tentative new Partitiviridae members with complete published genome sequences are shown in Table 2. Sequence information is lacking for the classified partitiviruses Agaricus bisporus virus 4 and Gaeumannomyces graminis viruses 019/6-A and T1-A, and they remain unassigned to a genus.

Encyclopedia of Virology, 4th Edition, Volume 4, doi:10.1016/B978-0-12-809633-8.21327-7

Table 1 List of fungal viruses in the family Partitiviridae classified by the ICTV. The genera Alpha-, Beta- and Gammapartitivirus are indicated using the corresponding Greek letters. The two essential genome segments are named dsRNA1 and dsRNA2. Columns dsRNA1/RdRP to dsRNA4 give GenBank accession numbers; "-" marks a segment with no accession listed.

Virus species                                        Genus  dsRNA1/RdRP  dsRNA2/CP  dsRNA3    dsRNA4
Cherry chlorotic rusty spot associated partitivirus  α      AJ781168     AJ781167   AM749118  AM749120
Chondrostereum purpureum cryptic virus 1             α      AM999771     AM999772   -         -
Flammulina velutipes browning virus                  α      AB465308     AB465309   -         -
Helicobasidium mompa partitivirus V70                α      AB025903     -          -         -
Heterobasidion partitivirus 1                        α      HQ541323     HQ541324   -         -
Heterobasidion partitivirus 3                        α      FJ816271     FJ816272   -         -
Heterobasidion partitivirus 12                       α      KF963175     KF963176   -         -
Heterobasidion partitivirus 13                       α      KF963177     KF963178   -         -
Heterobasidion partitivirus 15                       α      KF963186     KF963187   -         -
Rosellinia necatrix partitivirus 2                   α      AB569997     AB569998   -         -
Atkinsonella hypoxylon virus                         β      L39125       L39126     L39127    -
Ceratocystis resinifera virus 1                      β      AY603052     AY603051   -         -
Fusarium poae virus 1                                β      AF047013     AF015924   -         -
Heterobasidion partitivirus 2                        β      HM565953     HM565954   -         -
Heterobasidion partitivirus 7                        β      JN606091     JN606090   -         -
Heterobasidion partitivirus 8                        β      JX625227     JX625228   -         -
Heterobasidion partitivirus P                        β      AF473549     -          -         -
Pleurotus ostreatus virus 1                          β      AY533038     AY533036   -         -
Rhizoctonia solani virus 717                         β      AF133290     AF133291   -         -
Rosellinia necatrix virus 1                          β      AB113347     AB113348   -         -
Aspergillus ochraceous virus                         γ      EU118277     EU118278   EU118279  -
Discula destructiva virus 1                          γ      AF316992     AF316993   AF316994  AF316995
Discula destructiva virus 2                          γ      AY033436     AY033437   -         -
Fusarium solani virus 1                              γ      D55668       D55669     -         -
Gremmeniella abietina RNA virus MS1                  γ      AY089993     AY089994   AY089995  -
Ophiostoma partitivirus 1                            γ      AM087202     AM087203   -         -
Penicillium stoloniferum virus F                     γ      AY738336     AY738337   AY738338  -
Penicillium stoloniferum virus S                     γ      AY156521     AY156522   -         -

Virion Structure

Partitivirus particles are isometric, non-enveloped, and 25–43 nm in diameter (Fig. 1). The capsid consists of 120 copies of a single capsid protein (CP) arranged as 60 dimers with T = 1 icosahedral symmetry. Dimeric “arch-like” surface protrusions are frequently observed on partitivirus capsids. Virion buoyant densities in CsCl gradients range from 1.34 to 1.44 g cm⁻³ for particles with nucleic acid. The structures of four fungal partitiviruses have been determined by electron cryomicroscopy and 3D reconstruction (Fig. 2). These include the gammapartitiviruses Penicillium stoloniferum virus F and S (PsV-F and PsV-S), and the betapartitiviruses Fusarium poae virus 1 (FpV1) and Sclerotinia sclerotiorum partitivirus 1 (SsPV1). The structure of PsV-F has also been resolved by X-ray crystallography.

Partitiviruses possess two essential genome segments that are individually encapsidated in separate particles. Virion-associated RNA polymerase activity is present, and one or two molecules of RNA-dependent RNA polymerase (RdRP) are packaged inside each particle. Purified virion preparations contain mature virions and a heterogeneous population of particles thought to be replicative intermediates that may contain only one ssRNA molecule (the genomic positive strand), or both ssRNA and dsRNA, or two molecules of dsRNA.

Genome

The two essential genome segments of partitiviruses are separately encapsidated, and each linear segment contains one large ORF on the positive-strand RNA molecule. One segment/ORF (dsRNA1) encodes the RdRP, whereas the other segment/ORF (dsRNA2) encodes the capsid protein (CP) (Fig. 3). The CP-encoding genome segment is usually smaller than the polymerase-encoding segment. The total size of partitivirus genomes is typically 3.6–3.9 kbp for members of genus Alphapartitivirus, 4.3–4.8 kbp for members of genus Betapartitivirus, and 3.1–3.4 kbp for members of genus Gammapartitivirus. The genome segments are individually 1.4–2.4 kbp in size. Table 3 shows the size ranges of individual genome segments in classified alpha-, beta- and gammapartitiviruses, as well as the predicted sizes of RdRP and capsid proteins.

The first 7–20 nucleotides in the 5′ non-translated region of the coding RNA are usually highly conserved between the two genome segments of single viruses and may form stem-loop structures. Such conserved terminal sequences are believed to be involved in polymerase recognition for replication and/or RNA packaging. Members of genera Alphapartitivirus and Betapartitivirus typically possess poly(A) tracts, which may be interrupted by other nucleotides, near the coding-strand 3′ terminus of one or both genome segments, whereas gammapartitiviruses usually lack terminal poly(A) tracts.

Besides the two essential genome segments, some partitiviruses have additional (satellite or defective) dsRNA elements. Apparent RNA satellites encoding small (237–303 aa) proteins, homologous to each other but not to either CP or RdRP, are present in several classified and tentative gammapartitiviruses, including Aspergillus ochraceous virus, Discula destructiva virus 1, Gremmeniella abietina RNA virus MS1, and Ustilaginoidea virens partitivirus 1 (Tables 1 and 2). RNA satellites that appear not to encode any proteins are found associated with studied isolates of the alphapartitivirus Cherry chlorotic rusty spot associated partitivirus, the betapartitivirus Atkinsonella hypoxylon virus, and the gammapartitiviruses Discula destructiva virus 1 and Penicillium stoloniferum virus F (Tables 1 and 2). The third genome segment of the alphapartitivirus Rosellinia necatrix partitivirus 2 (Table 1) is a truncated version of the RdRP-encoding segment that may function as a defective-interfering RNA.

Table 2 List of tentative fungal partitiviruses with complete genome sequences. Putative affiliation with the classified genera Alpha-, Beta- and Gammapartitivirus is indicated using the corresponding Greek letters. The two essential genome segments are named dsRNA1 and dsRNA2. Columns dsRNA1/RdRP to dsRNA4 give GenBank accession numbers; "-" marks a segment with no accession listed.

Virus isolate                                              Genus  dsRNA1/RdRP  dsRNA2/CP  dsRNA3    dsRNA4
Rhizoctonia solani partitivirus 2                          α      KF372436     KF372437   -         -
Rosellinia necatrix partitivirus 7                         α      LC076694     LC076695   -         -
Heterobasidion partitivirus 20                             α      MG566085     MG566086   -         -
Rhizoctonia solani partitivirus 3                          α      KX914900     KX914901   -         -
Rhizoctonia solani partitivirus 4                          α      KX914902     KX914903   -         -
Botrytis cinerea partitivirus 2                            α      MG011707     MG011708   -         -
Rhizoctonia oryzae-sativae partitivirus 1                  α      MK015642     MK015643   -         -
Hypomyces chrysospermus partitivirus 1                     α      MK652147     MK652146   -         -
Sclerotinia sclerotiorum partitivirus 1 strain SsPV1-WF-1  β      JX297511     JX297510   -         -
Ustilaginoidea virens partitivirus 3 strain HP-30          β      KF680478     KF680479   -         -
Rosellinia necatrix partitivirus 6 strain W113             β      LC010952‡    LC010953   -         -
Fusarium poae partitivirus 2                               β      LC150608     LC150609   -         -
Lentinula edodes partitivirus 1                            β      KX354971     KX354972   -         -
Rosellinia necatrix partitivirus 8                         β      LC314788     LC314789   -         -
Grifola frondosa partitivirus 1                            β      LC425601     LC425602   -         -
Ustilaginoidea virens partitivirus 1                       γ      KC503898     KC503899   KC503900  KC503901
Aspergillus fumigatus partitivirus 1 isolate 88            γ      FN376847     FN398100   -         -
Colletotrichum acutatum partitivirus 1 isolate CaRV1       γ      KC572132     KC572133   -         -
Ustilaginoidea virens partitivirus 2 isolate Uv0901        γ      KF361014     KF361015   -         -
Verticillium dahliae partitivirus 1 strain Vd08284         γ      KC422244     KC422243   -         -
Penicillium aurantiogriseum partitivirus 1                 γ      KT601103     KT601104   -         -
Pythium nunn partitivirus 1                                γ      LC371062     LC371063   -         -
Nigrospora oryzae partitivirus 1                           γ      MF742398     MF742399   -         -
Talaromyces marneffei partitivirus 1                       γ      KM235311     KM235304   -         -
Pseudogymnoascus destructans partitivirus pa               γ      KY207543     KY207544   -         -
Beauveria bassiana partitivirus 1                          γ      LN896303     LN896304   -         -
Beauveria bassiana partitivirus 2                          γ      LN896305     LN896306   -         -
Magnaporthe oryzae partitivirus 1                          γ      KX119172     KX119173   -         -
Botryosphaeria dothidea partitivirus 1                     ?      KF688740     KF688741   KF688742  -
Alternaria alternata partitivirus 1                        ?      KY352402     KY352403   -         -
Aspergillus fumigatus partitivirus 2                       ?      MH192991     MH192992   -         -

Life Cycle

The model for partitivirus replication has been inferred from in vitro studies of the virion-associated RdRP of the gammapartitivirus Penicillium stoloniferum virus S, and from the isolation of particles that represent various stages in the replication cycle from naturally infected mycelium (Fig. 4). The virion-associated RdRP mediates positive-strand RNA synthesis within the virus particle using the negative strand of the genomic dsRNA as a template by a semi-conservative mechanism. The newly synthesized positive-strand RNA is retained inside the particle as part of the dsRNA, while the parental genomic positive strand (encoding RdRP or CP) is released from the particle and used in translation or packaged into a new partitivirus particle. The monocistronic positive-sense transcripts of both genomic dsRNAs are translated by cellular ribosomes, and virions accumulate in the cytoplasm. The replicase activity of the RdRP is believed to catalyze the synthesis of negative-strand RNA on the positive-strand template within the assembling virion, thereby reconstituting a genomic dsRNA segment.

Fig. 1 Electron micrographs of Penicillium stoloniferum virus S. Samples were negatively stained in 2% uranyl acetate (a) or prepared unstained and vitrified (b). Micrographs were recorded on a CCD detector in an FEI Polara transmission electron microscope operated at 200 keV with samples at (a) room or (b) liquid nitrogen temperatures. In (a) bacteriophage P22 (five largest particles) was mixed with PsV-S to serve as a calibration reference. Heavy metal stain surrounds virions (e.g., black arrow) and contrasts their surfaces against the background carbon support film. Stain penetrates into the interior of ‘empty’ capsids (e.g., white arrow), resulting in particle images in which only a thin, annular shell of stain-excluding material (capsid) is seen. In (b) the unstained PsV-S sample was vitrified in liquid ethane. Here particles appear dark (higher density) against a lighter background of surrounding water (lower density). The inset shows a three-times enlarged view of an individual particle, in which several knobby surface features are clearly visible. Reproduced from Ghabrial, S.A., Ochoa, W.F., Baker, T.S., Nibert, M.L., 2008. Partitiviruses: General features. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. Oxford: Elsevier, pp. 68–75, with permission from Elsevier.

Fig. 2 Three-dimensional (3D) structure of PsV-S. (a) Shaded, surface representation of the PsV-S 3D reconstruction viewed along an icosahedral twofold axis. The 3D map is color-coded to emphasize the radial extent of different features (yellows and greens highlight features closest to the particle center, and oranges and reds those farthest from the center). A total of 60 prominent protrusions extend radially outward from the capsid surface. Each protrusion exhibits an approximate dyad symmetry, which is consistent with the expectation that the partitivirus capsid consists of 120 capsid protein monomers, organized as 60 asymmetric dimers in a so-called ‘T = 2’ lattice. (b) Density projection image of a central, planar section through the PsV-S 3D reconstruction (from the region marked by the dashed box in (a)). Darker shades of gray correspond to higher electron densities in the map section and lighter shades represent low-density features such as water outside as well as inside the particles. The capsid shell appears darkest because it contains a closely packed, highly ordered (icosahedral) arrangement of capsid subunits. The genomic dsRNA on the inside appears at lower density, in part because the RNA is not as densely packed and in part because the RNA adopts a less ordered arrangement. The protrusions seen in (a) appear as large ‘bumps’ in the central section view that decorate the outside of a contiguous, ~2 nm thick shell. Arrows point to faint density features that appear to form contacts between the inner surface of the protein capsid and the underlying RNA. These contacts occur close to the fivefold axes of the icosahedral shell. Reproduced from Ghabrial, S.A., Ochoa, W.F., Baker, T.S., Nibert, M.L., 2008. Partitiviruses: General features. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. Oxford: Elsevier, pp. 68–75, with permission from Elsevier.

Fig. 3 Atkinsonella hypoxylon virus (AhV), an isolate of the type species of the genus Betapartitivirus, has a bipartite genome consisting of dsRNA1 and dsRNA2. dsRNA1 contains the RdRP ORF (nt positions 40–2038) and dsRNA2 contains the CP ORF (nt positions 72–2030). The RdRP and CP ORFs are represented by rectangular boxes. Reproduced from Ghabrial, S.A., Buck, K.W., Hillman, B.I., Milne, R.G., 2005. Partitiviridae. In: Fauquet, C.M., Mayo, M.A., Maniloff, J., Desselberger, U., Ball, L.A. (Eds.), Virus Taxonomy: Eighth Report of the International Committee on Taxonomy of Viruses, San Diego, CA: Elsevier Academic Press, pp. 581–590, with permission from Elsevier.

Table 3 Size ranges of individual genome segments in classified members of genera Alpha-, Beta- and Gammapartitivirus, and the predicted sizes of their RdRP and capsid proteins

Genus              dsRNA1 (bp)  dsRNA2 (bp)  RdRP (aa)  RdRP Mr (kDa)  CP (aa)  CP Mr (kDa)
Alphapartitivirus  1873–2027    1708–1866    581–621    68–73          463–521  51–57
Betapartitivirus   2180–2444    2135–2354    663–746    77–87          636–686  71–77
Gammapartitivirus  1645–1787    1445–1611    519–539    60–62          413–443  44–47
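The segment size ranges in Table 3 can be read as a simple lookup. The following Python sketch is illustrative only: the function name and the example lengths are hypothetical, and a size match is merely consistent with a genus, since demarcation also relies on RdRP phylogeny, sequence identity and host range.

```python
# Segment size ranges (bp) transcribed from Table 3; they describe
# classified members only and are not formal demarcation limits.
TABLE3_RANGES_BP = {
    "Alphapartitivirus": {"dsRNA1": (1873, 2027), "dsRNA2": (1708, 1866)},
    "Betapartitivirus": {"dsRNA1": (2180, 2444), "dsRNA2": (2135, 2354)},
    "Gammapartitivirus": {"dsRNA1": (1645, 1787), "dsRNA2": (1445, 1611)},
}


def size_compatible_genera(dsrna1_bp, dsrna2_bp):
    """Return the genera whose Table 3 ranges contain both segment lengths."""
    hits = []
    for genus, ranges in TABLE3_RANGES_BP.items():
        lo1, hi1 = ranges["dsRNA1"]
        lo2, hi2 = ranges["dsRNA2"]
        if lo1 <= dsrna1_bp <= hi1 and lo2 <= dsrna2_bp <= hi2:
            hits.append(genus)
    return hits


# Hypothetical segment lengths, chosen only to exercise the lookup.
print(size_compatible_genera(1754, 1567))
```

An empty result means only that the lengths fall outside the ranges observed so far among classified members.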

[Fig. 4 diagram labels: V1, V2, Transcriptase, Replicase, Transcript released, Translation, RdRP, CP, Transcript packaged, Assembly]

Fig. 4 Model for the replication strategy of Penicillium stoloniferum virus S. The open circles represent capsid protein (CP) subunits and the closed circles represent RNA-dependent RNA polymerase (RdRP) subunits. Solid lines represent parental RNA strands whereas wavy lines represent newly synthesized progeny RNA strands. Virion preparations include mature virions with either dsRNA1 or dsRNA2, but also a small proportion of particles that contain only one ssRNA molecule corresponding to the genomic positive strand of the respective dsRNAs, and a heterogeneous population of particles more dense than the mature virions. These heavy particles are believed to represent different stages in the replication cycle, including particles containing the individual genomic dsRNAs with ssRNA tails of varying lengths, particles with one molecule of dsRNA and one molecule of its ssRNA transcript, and particles containing two molecules of dsRNA. Reproduced from Ghabrial, S.A., Hillman, B.I., 1999. Partitiviruses-fungal (Partitiviridae). In: Granoff, A., Webster, R.G. (Eds.), Encyclopedia of Virology, second ed. San Diego: Academic Press, pp. 1477–1151, with permission from Elsevier.

Epidemiology

Fungal partitiviruses are transmitted intracellularly during cell division, hyphal anastomosis (i.e., cell fusion between fungal strains), and sporogenesis. Vertical virus transmission may occur by sexual spores (basidiospores or rarely ascospores) derived from virus-infected fruiting bodies as well as by vegetative spores (conidia). In ascomycetous fungi, such as Penicillium stoloniferum, Gremmeniella abietina, and Ustilaginoidea virens, transmission of partitiviruses through asexual spores is highly efficient, and 90%–100% of single conidia receive these viruses. Only some basidiomycetous fungi produce asexual spores, and the proportion of virus-infected conidia seems to vary even among isolates of the same host species. In different isolates of the basidiomycete Heterobasidion parviporum, two different unclassified alphapartitiviruses were found to be transmitted to 3%–55% of germinated conidia. The basidiospores of this fungus also carry dsRNA elements consistent with partitiviruses, which have been detected in 10%–84% of basidiospores derived from virus-infected fruiting bodies. In many ascomycetes (e.g., Gaeumannomyces graminis, Rosellinia necatrix, and Gremmeniella abietina), partitiviruses are usually eliminated during ascospore formation.

There are no known natural vectors for partitiviruses. Therefore, the horizontal transmission of fungal partitiviruses is regulated by host systems governing hyphal fusion, i.e., somatic (vegetative) incompatibility, mating type incompatibility and intersterility. Inoculation experiments under field conditions have revealed that the unclassified alphapartitivirus Heterobasidion partitivirus 4 (HetPV4) (GenBank HQ541325; JX271788–9) is transmitted between conspecific host isolates by two different means: (1) mating between an introduced heterokaryotic virus donor strain and a resident homokaryotic Heterobasidion strain, and (2) mycelial contact between two somatically incompatible heterokaryons. In Rosellinia necatrix, the transmission of partitiviruses between somatically incompatible isolates of the host fungus can be enhanced by the addition of zinc ions, which promotes hyphal fusion and retards the cell death reaction during an incompatible anastomosis. On the other hand, horizontal transmission of Rosellinia necatrix partitivirus 1 (RnPV1) is hindered by the presence of a mycoreovirus of Rosellinia necatrix, MyRV3, in the recipient fungal colony. Therefore, preexisting viral infections may affect the transmissibility of partitiviruses between host isolates.

Partitiviruses are occasionally transmitted across species boundaries in both laboratory and field conditions. Dual fungal cultures have been used for the transmission of Heterobasidion partitiviruses between congeneric host species. Microscopic examination has revealed occasional anastomosing cells between two intersterile Heterobasidion species, H. ecrustosum and H. abietinum, which resulted in subsequent death of the cells involved, but allowed for between-species transmission of the alphapartitivirus HetPV3 (Table 1). Partitiviruses have also been introduced to more distantly related host species by transfecting fungal protoplasts with purified virions. This has been accomplished in Cryphonectria parasitica using partitiviruses of Rosellinia necatrix (RnPV1 and RnPV2; Tables 1 and 2), and in Botrytis cinerea with SsPV1. In addition, protoplast transfection has been used to transmit viruses between conspecific strains in B. cinerea and Aspergillus fumigatus.

Sequence analysis has revealed nearly identical alphapartitivirus strains (HetPV11-au1 and HetPV11-pa1; GenBank HQ541328–9 and MG948857–8) in two intersterile species of Heterobasidion (H. australe and H. parviporum) within the same region in Bhutan, suggesting natural transmission of HetPV11 between these fungal species. Conspecific betapartitiviruses have also been described in the ascomycetous blue-stain fungi Ceratocystis resinifera (Table 1) and C. polonica (GenBank AY247204–5). Similar partitiviruses sharing over 90% RdRP aa sequence identity (the threshold for species demarcation in the family Partitiviridae) are sometimes found in divergent host species. For example, gammapartitiviruses found in the ascomycetes Colletotrichum truncatum (a plant pathogen) and Sodiomyces alkalinus (an alkalophile) seem to be conspecific (GenBank KR074421 and KY484539–40), and alphapartitiviruses infecting the basidiomycete fungi Heterobasidion parviporum (order Russulales) and Megacollybia platyphylla (order Agaricales) are likewise closely related (GenBank JN606085; KT733076).

The genera Alphapartitivirus and Betapartitivirus include viruses isolated from both fungi and plants. Therefore, a capacity for occasional successful transmission of these viruses between fungal and plant hosts appears likely. Some of the viruses in these genera are found in plant pathogenic fungi, which suggests a potential route for horizontal transmission between fungi and plants. Notably, highly similar partitiviruses have been detected in sugar beet (Beta vulgaris) and a basidiomycete fungus infecting its leaves, Helicobasidium purpureum. The partial polymerase sequence of Helicobasidium purpureum partitivirus (GenBank AY949837) is 99% identical to that reported for Beet cryptic virus 1, suggesting that the virus may have been transmitted between the associated organisms. This view is further supported by an in vitro transfection study showing that Penicillium aurantiogriseum partitivirus 1 (GenBank KT601103–4) is capable of replicating in the Nicotiana tabacum BY2 cell line.
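The two identity thresholds quoted in this article (over 90% RdRP aa identity for species demarcation, and less than 24% in pairwise comparisons of viruses from different genera) can be applied to a pairwise alignment as in the following minimal Python sketch. The function names and the gap-handling convention (columns with a gap in either sequence are skipped) are assumptions made for illustration; published identity values depend on the alignment method used, and the sequences shown are toy strings rather than real RdRP sequences.

```python
SPECIES_THRESHOLD = 90.0   # % RdRP aa identity, species demarcation (from the text)
INTERGENUS_CEILING = 24.0  # % aa identity between genera (from the text)


def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumed convention: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def interpret_rdrp_identity(identity):
    """Map an RdRP identity percentage onto the thresholds given in the text."""
    if identity > SPECIES_THRESHOLD:
        return "possibly conspecific"
    if identity < INTERGENUS_CEILING:
        return "consistent with different genera"
    return "intermediate"


# Toy aligned fragments (hypothetical, not real sequences).
identity = percent_identity("MKTALV", "MRTALV")
print(round(identity, 1), interpret_rdrp_identity(identity))
```

Identity alone does not settle classification; the intermediate band in particular needs the phylogenetic and size criteria described under Classification.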

Pathogenesis

Partitiviruses are found in all parts of an infected mycelium and can accumulate to very high concentrations (at least 1 mg of virions per gram of mycelial tissue for Penicillium stoloniferum virus F). The transmission of the betapartitivirus RnPV1 within and between R. necatrix colonies has been visualized by a colony-print immunoassay, which revealed that the virus was distributed throughout host colonies. However, growing hyphal tips may remain virus-free for a period of time, so single hyphal tip isolation can be used to cure fungal isolates of partitiviruses. In nature, clonally spreading mycelia of long-living root rot pathogens (such as Heterobasidion spp. and Helicobasidium mompa) typically consist of virus-infected and virus-free hyphal sections.

Mixed infections of fungal hosts with two or more different partitiviruses have been observed in several species. The gammapartitiviruses PsV-S and PsV-F were originally found in a single isolate of Penicillium stoloniferum. Stable partitivirus co-infections are also found in strains of Heterobasidion spp. that may host two different alphapartitiviruses (HetPV16 and HetPV20; GenBank KY859977 and MG566085–6) or alpha- and betapartitiviruses (HetPV9 and HetPV7; GenBank JN606085 and JN606090–1). Co-infections by alpha- and betapartitiviruses also occur in Helicobasidium mompa (GenBank AB110979 and AB110980), and mixed infections by two different alphapartitiviruses are found in Rhizoctonia solani (GenBank KX914900–3). The most striking example of mixed partitivirus infections has been described in single isolates of mycorrhizal Ceratobasidium fungi that may host at least five and up to ten different alpha- and betapartitiviruses (GenBank KU291902–22). These fungi occur as root symbionts of Pterostylis sanguinea orchids in Australia.

Mixed infections of fungal isolates with partitiviruses and unrelated viruses are also relatively common. The Rosellinia necatrix strain hosting RnPV1 is naturally co-infected with Rosellinia necatrix megabirnavirus 2 (RnMBV2), and a single strain of the conifer pathogen Gremmeniella abietina has been shown to harbor three unrelated viruses (the gammapartitivirus Gremmeniella abietina RNA virus MS2, a mitovirus and a totivirus). Similarly, partitiviruses infecting strains of Heterobasidion spp. may co-occur with mitoviruses or isolates of the unclassified species Heterobasidion RNA virus 6, and Botryosphaeria dothidea partitivirus 1 (GenBank KF688740–1) occurs in a mixed infection with a chrysovirus. A mixed infection by a betapartitivirus (GenBank KY296404–5), a gammapartitivirus (GenBank KY484539–40), and a fusarivirus was described in an isolate of Sodiomyces alkalinus.

In general, fungal partitiviruses seem to be associated with symptomless infections of their hosts. However, some of them have deleterious effects on host cells. The unclassified alphapartitivirus Rhizoctonia solani partitivirus 2 (Table 2) causes hypovirulence in Rhizoctonia solani, a soil-borne basidiomycete fungus. Similarly, the unclassified betapartitivirus SsPV1 (Table 2) has been shown to reduce the virulence of its natural host as well as that of Botrytis cinerea after protoplast transfection. Moreover, Aspergillus fumigatus partitivirus 1 (Table 2) seems to mediate reduced growth rate, conidiation and pigmentation in Aspergillus fumigatus, and the alphapartitivirus HetPV13-an1 (Table 1) causes drastic growth reduction in several strains of Heterobasidion annosum and H. parviporum. In contrast, the gammapartitivirus Talaromyces marneffei partitivirus 1 (Table 2) seems to enhance the virulence of its native host, which causes opportunistic infections in mammals, including immunosuppressed humans. In some cases, host symptoms may only develop during viral co-infection: a mixed infection by RnPV1 and RnMBV2 leads to host hypovirulence, while each virus alone causes an asymptomatic infection.

Taxonomic and Phylogenetic Considerations Phylogenetic analyses of the predicted amino acid sequences of partitivirus RdRPs show that the classified fungal partitiviruses are included in three distinct clusters corresponding to the genera Alphapartitivirus, Betapartitivirus, and Gammapartitivirus (Fig. 5). However, certain unclassified partitiviruses seem to form a lineage distinct from established partitivirus genera. Based on sequence identity scores and phylogenetic analysis, this virus clade includes Botryosphaeria dothidea partitivirus 1, Alternaria alternata partitivirus 1 and Aspergillus fumigatus partitivirus 2 (Table 2), which are only distantly related to members of genus Gammapartitivirus (Fig. 5). The discovery of a gammapartitivirus in the oomycete Pythium nunn in 2018 (GenBank accession LC371062 and LC371063; Fig. 5) expands the host range of partitiviruses to the phylum Stramenopila, which is evolutionarily distinct from fungi and includes organisms with flagellated zoospores ranging from plant pathogenic oomycetes to macroscopic brown algae. It should also be noted that virus sequences resembling known partitiviruses have been discovered in various insect species with high throughput sequencing. Whether these viruses are hosted by the insects or associated microbes remains to be investigated. The putative RdRP of the unclassified Penicillium aurantiogriseum partiti-like virus (GenBank accession KT601105) seems to be related to these insect-derived partiti-like sequences, but its taxonomical status remains to be determined, and it seems to form a lineage separate from the classified partitivirus genera. The evolutionary rate of partitivirus CPs is considerably higher than that of the RdRPs, and pairwise amino acid sequence identities between members of different partitivirus genera are typically less than 10%. Therefore, phylogenetic analyses based on the CP sequences show weaker statistical support than analysis of RdRP sequences. 
Dendrograms separately constructed from either RdRP or CP sequences seem to reveal essentially the same grouping, but are not entirely congruent. In some cases, the RdRP and CP sequences of a single virus strain show slightly different phylogenetic grouping, suggesting that segment reassortment may occasionally occur among congeneric virus strains in the family Partitiviridae. The Heterobasidion betapartitivirus HetPV8 and alphapartitivirus HetPV20 (Tables 1 and 2) are examples of such cases. Based on genome organization and sequence similarity, members of the family Partitiviridae are related to a separate and currently unclassified group of fungal dsRNA viruses with two small genome segments (each 1.7–2.4 kbp). These include the mutualistic Curvularia thermal tolerance virus, at least five other viruses with published genome sequences (Table 4), and several isolates with unpublished and/or incomplete genome sequences from the fungal genera Cryphonectria, Gremmeniella, Lactarius, Myriodontium, Rhizoctonia, Sclerotium, and Trichoderma. The larger genome segment of these viruses has a single ORF encoding an


Fig. 5 Unrooted Maximum Likelihood dendrogram constructed based on the complete deduced amino acid sequences of RdRPs of approved members of the family Partitiviridae and selected unclassified partitiviruses. The sequences were aligned using MAFFT v7.388 in Geneious R10 (Biomatters Ltd.) and the evolutionary history was inferred using MEGA 7. The percentages of replicate trees in which the associated taxa clustered together in the bootstrap test (100 replicates) are shown next to the branches. The evolutionary distances were computed using the Le Gascuel 2008 model (LG + G + I) and are in units of the number of amino acid substitutions per site. Virus names and GenBank accession numbers corresponding to RdRP nucleotide sequences are given.


Table 4 Exemplar isolates of unclassified viruses forming a separate phylogenetic clade related to members of virus family Partitiviridae, including the mutualistic Curvularia thermal tolerance virus

| Virus isolate | GenBank accession, dsRNA1 | GenBank accession, dsRNA2 |
|---|---|---|
| Curvularia thermal tolerance virus | EF120984 | EF120985 |
| Penicillium aurantiogriseum bipartite virus 1 | KT601101 | KT601102 |
| Fusarium graminearum dsRNA mycovirus 4 | GQ140627 | GQ140628 |
| Rhizoctonia solani dsRNA virus 1 | JX976612 | JX976613 |
| Sclerotium hydrophilum virus 1 | KU886558 | KU886559 |
| Heterobasidion RNA virus 6 | HQ189459 | MK468678 |

RdRP of 592–692 aa, whereas the smaller segment has 1–2 ORFs of unknown function. Members of a recently described group of viruses provisionally designated as “Unirnaviruses” also share moderate levels of RdRP homology with partitiviruses, but their genomes are nonsegmented. Viruses with quadripartite genomes currently classified in the family Chrysoviridae were formerly included in the family Partitiviridae, but a separate family was formed in 2002 to accommodate these viruses, as described elsewhere in this encyclopedia. Members of the family Megabirnaviridae and the genus Botybirnavirus have bisegmented genomes and infect fungi, but their genomes are substantially larger than those of partitiviruses (8.9 kbp and 7.2 kbp for the exemplar strain RnMBV1; 6.2 kbp and 5.9 kbp for Botrytis porri RNA virus 1, the only classified member of the genus Botybirnavirus). The larger genome segment of megabirnaviruses and botybirnaviruses encodes a CP and RdRP. These viruses are discussed in detail elsewhere in this encyclopedia. Finally, members of the family Partitiviridae have properties similar to those of members of the family Picobirnaviridae, e.g., the genomes are bisegmented, and the capsids are small (<45 nm in diameter) with 120 subunits organized as 60 dimers in “T = 2” symmetry. The picobirnaviruses, however, are phylogenetically distinct and probably have other basic differences, including co-packaging of both genome segments and a capacity for extracellular transmission. Picobirnaviruses are found in vertebrates rather than in plants, fungi and protozoa.

Further Reading

Bao, X., Roossinck, M.J., 2013. Multiplexed interactions: Viruses of endophytic fungi. Advances in Virus Research 86, 37–58.
Buck, K.W., Kempson-Jones, G.F., 1973. Biophysical properties of Penicillium stoloniferum virus S. Journal of General Virology 18, 223–235.
Ghabrial, S.A., Ochoa, W., Baker, T.B., Nibert, M., 2008. Partitiviruses: General features. In: Mahy, B.W.J., Van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. Oxford: Elsevier, pp. 68–75.
Kondo, H., Kanematsu, S., Suzuki, N., 2013. Viruses of the white root rot fungus, Rosellinia necatrix. Advances in Virus Research 86, 177–214.
Marquez, L.M., Redman, R.S., Rodriguez, R.J., Roossinck, M.J., 2007. A virus in a fungus in a plant: Three-way symbiosis required for thermal tolerance. Science 315, 513–515.
Nerva, L., Silvestri, A., Ciuffo, M., et al., 2017. Transmission of Penicillium aurantiogriseum partiti-like virus 1 to a new fungal host (Cryphonectria parasitica) confers higher resistance to salinity and reveals adaptive genomic changes. Environmental Microbiology 19, 4480–4492.
Nibert, M.L., Ghabrial, S.A., Maiss, E., et al., 2014. Taxonomic reorganization of family Partitiviridae and other recent progress in partitivirus research. Virus Research 188, 128–141.
Nibert, M.L., Tang, J., Xie, J., et al., 2013. 3D structures of fungal partitiviruses. Advances in Virus Research 86, 59–85.
Oh, C.S., Hillman, B.I., 1995. Genome organization of a partitivirus from the filamentous ascomycete Atkinsonella hypoxylon. Journal of General Virology 76, 1461–1470.
Sasaki, A., Kanematsu, S., Onoue, M., Oyama, Y., Yoshida, K., 2006. Infection of Rosellinia necatrix with purified viral particles of a member of Partitiviridae (RnPV1-W8). Archives of Virology 151, 697–707.
Shi, M., Lin, X.D., Tian, J.H., et al., 2016. Redefining the invertebrate RNA virosphere. Nature 540, 539–543.
Tavantzis, S., 2008. Partitiviruses of fungi. In: Mahy, B.W.J., Van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. Oxford: Elsevier, pp. 63–68.
Vainio, E.J., Hantula, J., 2016. Taxonomy, biogeography and importance of Heterobasidion viruses. Virus Research 219, 2–10.
Vainio, E.J., Chiba, S., Ghabrial, S.A., et al., 2018. ICTV virus taxonomy profile: Partitiviridae. Journal of General Virology 99, 17–18.
Xiao, X., Cheng, J., Tang, J., et al., 2014. A novel partitivirus that confers hypovirulence on plant pathogenic fungi. Journal of Virology 88, 10120–10133.

Relevant Websites

https://talk.ictvonline.org/ictv-reports/ictv_online_report/dsrna-viruses/w/partitiviridae Partitiviridae - dsRNA Viruses ICTV.
https://talk.ictvonline.org/files/ictv_official_taxonomy_updates_since_the_8th_report/m/fungal-official/4815/ Proposal - ICTV.

Fusariviruses (Unassigned)

Sotaro Chiba, Nagoya University, Nagoya, Japan
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Hypovirulence The reduced pathogenicity of fungal pathogens, in terms of their abilities to infect, grow, colonize, and destroy the host organism; it is generally associated with mycovirus infection.

Sub-genomic RNA An RNA segment shorter than the full-length genome of an RNA virus that expresses protein encoded by downstream ORFs.

Introduction

Mycoviruses (fungal viruses) are increasingly reported with the aid of improved sequencing technology, which has uncovered their highly diverse nature. Consequently, in the past decade, several new fungal viral taxa, such as the families Megabirnaviridae, Quadriviridae, and Mymonaviridae, have been established. Many reported viruses have still not been assigned to existing viral taxa, even though a long time has passed since the initial discovery and characterization of some of them. One good example is Fusarium graminearum virus 1 strain DK21 (FgV1-DK21), which was identified in Korea in 2002 and thoroughly characterized in 2007. This virus is still taxonomically unplaced and was recorded as an unassigned or unclassified virus in the 9th and 10th editions of the International Committee on Taxonomy of Viruses (ICTV) report, published in 2011 and 2018–2019. The related article in the current edition of the ICTV report can also be accessed online (see the “Relevant Website” section). Since 2014, viruses that are evolutionarily associated with FgV1-DK21 have been continuously reported, although their genomic structures are not identical to that of FgV1-DK21. The need to group these viruses has therefore become obvious. With this in mind, Japanese researchers proposed the provisional new genus Fusarivirus, in the new family Fusariviridae, in 2014. The proposed genus/family name was derived from the host name of FgV1-DK21, Fusarium graminearum (reclassified as F. boothii), and the terms have since become widely used to classify potential members of the group. However, as of 2019, the creation of this new taxon is still incomplete. This article briefly summarizes the known information on the fusariviruses reported so far, with a view to establishing the genus Fusarivirus under the family Fusariviridae. The future direction of research on fusariviruses is also discussed.

Taxonomy and Classification

Potential members of the provisional genus Fusarivirus are listed in Table 1. From this point onwards, viruses that are members of this genus are described as fusariviruses. The representative species in the provisional genus is Fusarium graminearum virus 1, which was first discovered in 2002. A phylogenetic analysis based on the amino acid sequences of the RNA-dependent RNA polymerase (RdRp) domain in replication-associated proteins (motifs 1–8) resolved two major clades: one comprising group I, which accommodates FgV1-DK21 and typical fusariviruses and contains two clearly separated subgroups, and another comprising groups II and III, which forms independently of group I (groups II and III are differentiated by their genomic structures) (Fig. 1). These three groups could potentially be established as independent genera. Rhizoctonia solani fusarivirus 3 (RsFV3), which has a unique monocistronic genome structure, is phylogenetically distant from groups I, II, and III and is thus categorized separately into group IV. Likewise, Agaricus bisporus virus 10 and 11 (AbV10 and AbV11) are more distantly related to these four groups and are categorized as group I/II-like viruses in this article owing to the underlying similarity in their genomic structures and sizes. Currently, there are no criteria for fusarivirus species and genus demarcation. Amino acid identities of fusariviral replication-associated proteins are not greater than 60%, except for Alternaria alternata fusarivirus 1 (AaFV1) and Alternaria brassicicola fusarivirus 1 (AbFV1), which share identities as high as 86%. Therefore, a species demarcation criterion might be an amino acid identity of ≤60% for the replication-associated protein, which carries the helicase (Hel) and RdRp domains.
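The tentative identity threshold discussed above can be illustrated with a short sketch. It assumes two replication-associated protein sequences that have already been aligned to equal length (gaps written as "-"). The function names, the toy sequences, and the choice to ignore gap-containing columns are illustrative assumptions, not a published fusarivirus criterion.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are skipped
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def exceeds_species_threshold(identity: float, threshold: float = 60.0) -> bool:
    """True if the pair is more similar than the tentative <=60% demarcation value."""
    return identity > threshold


# Toy aligned fragments (hypothetical, not real viral sequences)
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(identity, exceeds_species_threshold(identity))
```

Under this reading, a pair such as AaFV1 and AbFV1 (up to 86% identity) would exceed the threshold, whereas most other fusarivirus pairs (at most 60%) would not.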

Virion Property

None of the currently reported fusariviruses has been shown to form virions. Fusariviruses are considered not to form rigid particles, but to be so-called naked or capsid-less viruses. This is reasonable because hypoviruses, which are closely related to fusariviruses from an evolutionary standpoint, do not form virions. Cryphonectria hypovirus 1 (CHV1), the representative virus in the family Hypoviridae, exists in fungal host cells as a complex of viral replicase and dsRNA (considered a replicative intermediate), which is enclosed in small double-layered lipid vesicles. Such a subcellular structure associated with

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21276-4


Table 1 List of members in the provisional genus Fusarivirus

| Group | Virus name | Abbreviation | Genome size (nts) | ORF | 5′-UTR (nts) | 3′-UTR (nts) | Accession |
|---|---|---|---|---|---|---|---|
| 1-a | Fusarium graminearum virus 1 | FgV1 | 6621(a,b) | 4 | 53 | 46(c) | AY533037 |
| 1-a | Penicillium roqueforti ssRNA mycovirus 1 | PrRV1 | 6002(d) | 2 | 31 | 55 | KJ817266 |
| 1-a | Pleospora typhicola fusarivirus 1 | PtFV1 | 6733(e) | 2 | 29 | 408 | KT601107 |
| 1-a | Rosellinia necatrix fusarivirus 1 | RnFV1 | 6286(a) | 2 | 17 | 92(f) | AB915829 |
| 1-a | Sodiomyces alkalinus fusarivirus 1 | SaFV1 | 6239(a,b) | 2 | 5 | 57(c) | KT983618 |
| 1-a | Neurospora discreta fusarivirus 1 | NdFV1 | 6625(a) | 2 | 4 | 455 | MK279503 |
| 1-a | Fusarium poae fusarivirus 1 | FpFV1 | 6379(d) | 2 | 74 | 296 | LC150611 |
| 1-b | Rutstroemia firma fusarivirus 1 | RfFV1 | 6628(a) | 2 | 88 | 134 | MK279504 |
| 1-b | Macrophomia phaseolina single-stranded RNA virus 1 | MpRV1 | 6356(e) | 2 | 200 | 101 | KP900890 |
| 1-b | Neofusicoccum luteum fusarivirus 1 | NlFV1 | 6244(a,b) | 2 | 46 | 35 | KY906213 |
| 1-b | Penicillium aurantiogriseum fusarivirus 1 | PaFV1 | 6193(d) | 2 | 46 | 70 | KT601099 |
| 1-b | Zymoseptoria tritici fusarivirus 1 | ZtFV1 | 5969(d) | 2 | 36 | 42 | MK279506 |
| 1-b | Aspergillus ellipticus fusarivirus 1 | AeFV1 | 6253(d) | 2 | 34 | 49 | MK279500 |
| 1-b | Gaeumannomyces tritici fusarivirus 1 | GtFV1 | 6332(e) | 2 | 48 | 115 | MK279501 |
| 1/2-like | Agaricus bisporus virus 10 | AbV10 | 7033(b,d,g) | 2 | 98 | 108 | KY357495 |
| 1/2-like | Agaricus bisporus virus 11 | AbV11 | 6981(b,d,g) | 2 | 44 | 163 | KY357496 |
| 2 | Nigrospora oryzae fusarivirus 1 | NoFV1 | 7004(a) | 3 | 109 | 244 | KU980909 |
| 2 | Sclerotinia homoeocarpa fusarivirus 1 | ShFV1 | 7171(d) | 2 | 114 | 41 | MK279505 |
| 2 | Sclerotinia sclerotiorum fusarivirus 1 | SsFV1 | 7754(a,b) | 4 | 83 | 416 | KP842791 |
| 2 | Morchella importuna fusarivirus 1 | MiFV1 | 7823(a) | 3 | 20 | 164 | MK279502 |
| 2 | Alternaria alternata fusarivirus 1 | AaFV1 | 6647(d) | 3 | 83 | 219 | LC209862 |
| 2 | Alternaria brassicicola fusarivirus 1 | AbFV1 | 6639(a,b) | 3 | 84 | 219 | KT581960 |
| 2 | Botrytis cinerea fusarivirus 1 | BcFV1 | 8411(a,b) | 2 | 510 | 364 | MG554633 |
| 3 | Rhizoctonia solani fusarivirus 1 | RsFV1 | 10,776(d) | 4 | 161 | 235 | MK558257 |
| 3 | Rhizoctonia solani fusarivirus 2 | RsFV2 | 10,710(e) | 4 | 111 | 152 | MK558256 |
| 4 | Rhizoctonia solani fusarivirus 3 | RsFV3 | 5959(e) | 1 | 474 | 97 | MK558258 |
| | Gremmeniella fusarivirus 1 | GFV1 | 2228(d,h) | | | | LR031261 |
| | Rosellinia necatrix fusarivirus 2 | RnFV2 | 3685; 407; 1287(d,h) | | | | LC333742–4 |

(a) 3′-poly(A) sequence is present and excluded from nucleotide counting.
(b) RACE analysis is performed.
(c) Subgenomic RNA production is expected.
(d) No poly(A) sequence is detected and the actual form of the 3′-end is unknown.
(e) Integrated poly(A)-like sequence at the 3′-terminal is included.
(f) Subgenomic RNA production is denied.
(g) 5′-end is determined but 3′-end is not.
(h) Only partial sequences are available.

fusariviral infection has only been reported, as virus-like particles (VLPs), for FgV1-DK21. The purified VLPs were aggregates of spherical vesicular structures of irregular shape and size, unlike rigid particles. These VLPs were shown to associate strongly with viral dsRNAs; thus, the structures may represent the replication site of FgV1-DK21. Similar characteristics have not been demonstrated for other fusariviruses.

Genome Organization

Although the type of nucleic acid of fusarivirus genomes has not yet been experimentally determined, it is expected to be positive-sense single-stranded RNA [(+)ssRNA], as is the case for hypoviruses and narnaviruses. The form of the 5′ end is unknown, but the 3′ end is polyadenylated, at least for those complete genomes determined by RACE analyses (Table 1). The RNA genome is 5959–10,776 nucleotides (nts) long, excluding the 3′-poly(A), and the untranslated regions (UTRs) range from 5 to 510 nts for the 5′ UTR and from 35 to 455 nts for the 3′ UTR, excluding poly(A) but including integrated poly(A)-like sequences (Table 1). The replication-associated protein is generally encoded by the largest ORF (ORF1) in the 5′-proximal portion, which is followed by one to three small ORFs downstream (Fig. 2). Hel and RdRp domains are present in the replication-associated proteins. This typical genome structure is found in fusarivirus groups I and II. However, viruses in group III have a different structure: a larger genome with four coding genes, in which the first two ORFs and the last ORF encode proteins with no similarity to any proteins in current public databases, whereas the replication-associated protein is encoded by the third ORF (Fig. 2). An additional Hel domain is detectable in the first ORF of viral genomes in group III. Finally, the virus in group IV is monocistronic and encodes only the replication-associated protein. Viruses from groups III and IV have been found only in the basidiomycetous fungus Rhizoctonia solani, and they have expanded the known diversity of fusariviruses.
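The group-specific genome layouts described above can be summarized as a small lookup structure. This is only a sketch of the text's description: the class name, field names, and the helper function are hypothetical, and the ORF ranges simply restate "ORF1 plus one to three downstream ORFs" for groups I and II, four ORFs with the replicase in the third for group III, and a single ORF for group IV.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupLayout:
    min_orfs: int        # fewest ORFs described for the group
    max_orfs: int        # most ORFs described for the group
    replicase_orf: int   # 1-based position of the replication-associated ORF


# Layouts as described in the text (illustrative encoding, not an official scheme)
LAYOUTS = {
    "I": GroupLayout(min_orfs=2, max_orfs=4, replicase_orf=1),
    "II": GroupLayout(min_orfs=2, max_orfs=4, replicase_orf=1),
    "III": GroupLayout(min_orfs=4, max_orfs=4, replicase_orf=3),
    "IV": GroupLayout(min_orfs=1, max_orfs=1, replicase_orf=1),
}


def is_consistent(group: str, n_orfs: int) -> bool:
    """Check whether an ORF count fits the layout described for a group."""
    layout = LAYOUTS[group]
    return layout.min_orfs <= n_orfs <= layout.max_orfs


print(is_consistent("I", 4))   # FgV1-DK21 has four ORFs
print(is_consistent("IV", 2))  # group IV is described as monocistronic
```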


Fig. 1 Maximum-likelihood (ML) phylogenetic tree of fusariviruses. The ML tree was constructed from a multiple amino acid sequence alignment of the RdRp domain (motifs 1–8) of fusariviral and hypoviral replication-associated proteins. Branches supported by aLRT values over 0.9 are denoted by dots at the nodes. Both phylogenetic relationships and genomic structure were taken into account in grouping the fusariviruses.

ORF1 and the second-largest ORF (ORF 2, 3, or 4) are common to the classic fusariviruses in groups I and II. The second-largest proteins generally exhibit low sequence similarity to each other, even between proteins within the same group. Notably, a structural maintenance of chromosomes (SMC)-like motif or an Spc7 kinetochore protein-like motif was detected in the second-largest proteins of fusariviruses in group I, but generally not in those of groups II and I/II-like. This suggests that the viruses in group I share a common ancestor.

Gene Expression

The gene expression strategy for the 5′-proximal ORF of fusariviruses is largely unknown, but there is some information for the downstream ORFs. ORFs 2, 3, and 4 of FgV1-DK21 are expected to be expressed from two sub-genomic RNAs (Fig. 2). A sub-genomic RNA for ORF2 of Sodiomyces alkalinus fusarivirus 1 (SaFV1) is suspected; however, the fragment detected in this instance was indistinct, so it needs verification. In contrast, synthesis of a sub-genomic RNA for ORF2 of Rosellinia necatrix fusarivirus 1 (RnFV1) was ruled out experimentally, suggesting that gene expression strategies are poorly conserved among fusariviruses.

Virus Transmissions

FgV1-DK21, BcFV1, RnFV1, and Sclerotinia sclerotiorum fusarivirus 1 (SsFV1) are capable of transmission via hyphal anastomosis between virus-donor and virus-recipient strains (horizontal transmission); this has not been confirmed for other fusariviruses. Similarly, transmission through asexual and sexual spores (vertical transmission) has been observed for FgV1-DK21 and SaFV1. A natural transmission vector for fusariviruses has not yet been discovered.


Fig. 2 Genome composition of representative fusariviruses. Genomic structures of representative fusariviruses are schematically shown. Non-coding regions are indicated by lines and coding regions are shown by rectangles. RdRp and Hel domains are illustrated in the protein coding regions.

Virus–Host Interactions

Among all reported fusariviruses, only FgV1-DK21 has been associated with morphological and physiological changes in its host, such as impaired mycelial growth, increased pigmentation, reduced mycotoxin production, and attenuated virulence of its natural host, F. boothii. FgV1-DK21 also conferred hypovirulence on its heterologous hosts F. asiaticum, F. graminearum, F. oxysporum, and C. parasitica. RnFV1, AbFV1, SsFV1, and probably BcFV1 have no impact on the biological properties of their hosts. The effects of infection by other fusariviruses on host fungal properties are unknown. FgV1-DK21 is targeted by host antiviral RNA silencing (RNA interference). It suppresses this defense machinery by repressing key RNA silencing genes such as FgDICER2 and FgAGO1 through the function of the ORF2 protein (pORF2). This pORF2 function is similar to the mode of action of p29, an RNA silencing suppressor encoded by CHV1. Moreover, FgV1-DK21 is known to interact with host cellular proteins. For example, the hexagonal peroxisome protein (Hex1) facilitates the replication of FgV1-DK21 via a direct interaction with viral RNA. Hex1 is a component of Woronin bodies and plays a crucial role in the asexual reproduction and virulence of the host fungus. No virus–host interactions have been characterized for fusariviruses other than FgV1-DK21.

Summary and Future Prospective

The behavior of the representative fusarivirus, FgV1-DK21, has been thoroughly investigated, and some aspects, particularly those related to viral propagation and pathogenicity, have been clarified. However, the basic characteristics of fusariviruses as a genus or a family are still unknown. These include the functions of the genes encoded by fusariviruses, the gene expression strategies commonly used by fusariviruses, the form of the 5′ end of fusariviral genomes, and the stable form of fusariviruses in host cells. Infectious cDNA clones of fusariviruses will be a powerful tool for understanding these characteristics. Furthermore, since unique fusariviruses such as RsFV1 (group III), RsFV3 (group IV), and AbV10 (group I/II-like) have been reported, more information on the viruses most closely related to them is required for a clearer understanding of the evolutionary history and proper taxonomic arrangement of fusariviruses.


Further Reading

Chu, Y.M., Jeon, J.J., Yea, S.J., et al., 2002. Double-stranded RNA mycovirus from Fusarium graminearum. Applied and Environmental Microbiology 68 (5), 2529–2534.
Deakin, G., Dobbs, E., Bennett, J.M., et al., 2017. Multiple viral infections in Agaricus bisporus – Characterisation of 18 unique RNA viruses and 8 ORFans identified by deep sequencing. Scientific Reports 7 (1), 2469.
Gilbert, K.B., Holcomb, E.E., Allscheid, R.L., Carrington, J.C., 2019. Hiding in plain sight: New virus genomes discovered via a systematic analysis of fungal public transcriptomes. PLoS One 14 (7).
Hrabáková, L., Grum-Grzhimaylo, A.A., Koloniuk, I., et al., 2017. The alkalophilic fungus Sodiomyces alkalinus hosts beta- and gammapartitiviruses together with a new fusarivirus. PLoS One 12 (11), e0187799.
Kwon, S.J., Lim, W.S., Park, S.H., Park, M.R., Kim, K.H., 2007. Molecular characterization of a dsRNA mycovirus, Fusarium graminearum virus-DK21, which is phylogenetically related to hypoviruses but has a genome organization and gene expression strategy resembling those of plant potex-like viruses. Molecules and Cells 23 (3), 304.
Lee, K.M., Yu, J., Son, M., Lee, Y.W., Kim, K.H., 2011. Transmission of Fusarium boothii mycovirus via protoplast fusion causes hypovirulence in other phytopathogenic fungi. PLoS One 6 (6), e21629.
Picarelli, M.A.S., Forgia, M., Rivas, E.B., et al., 2019. Extreme diversity of mycoviruses present in isolates of Rhizoctonia solani AG2-2 LP from Zoysia japonica from Brazil. Frontiers in Cellular and Infection Microbiology 9, 244.
Son, M., Lee, K.M., Yu, J., et al., 2013. The HEX1 gene of Fusarium graminearum is required for fungal asexual reproduction and pathogenesis and for efficient viral RNA accumulation of Fusarium graminearum virus 1. Journal of Virology 87 (18), 10356–10367.
Son, M., Choi, H., Kim, K.H., 2016. Specific binding of Fusarium graminearum Hex1 protein to untranslated regions of the genomic RNA of Fusarium graminearum virus 1 correlates with increased accumulation of both strands of viral RNA. Virology 489, 202–211.
Son, M., Lee, Y., Kim, K.H., 2016. The transcription cofactor Swi6 of the Fusarium graminearum is involved in fusarium graminearum virus 1 infection-induced phenotypic alterations. Plant Pathology Journal 32 (4), 281.
Yu, J., Park, J.Y., Heo, J.I., Kim, K.H., 2019. The ORF2 protein of Fusarium graminearum virus 1 suppresses the transcription of FgDICER2 and FgAGO1 to limit host antiviral defences. Molecular Plant Pathology. 230–243.
Zhang, R., Liu, S., Chiba, S., et al., 2014. A novel single-stranded RNA virus isolated from a phytopathogenic filamentous fungus, Rosellinia necatrix, with similarity to hypo-like viruses. Frontiers in Microbiology 5, 360.

Relevant Website

https://talk.ictvonline.org/ictv-reports/ictv_online_report/ Virus Taxonomy: The Classification and Nomenclature of Viruses.

Giardiavirus (Totiviridae)

Juliana Gabriela Silva de Lima, João Paulo Matos Santos Lima, and Daniel Carlos Ferreira Lanza, Federal University of Rio Grande do Norte, Natal, Brazil
© 2021 Elsevier Ltd. All rights reserved.
This is a reproduction of Juliana Gabriela Silva de Lima, João Paulo Matos Santos Lima, Daniel Carlos Ferreira Lanza, 2017, Giardiaviruses, In Reference Module in Life Sciences, Elsevier Inc, doi:10.1016/B978-0-12-809633-8.11003-9.

History

In 1986, a ~6.3 kbp linear double-stranded (ds) RNA molecule was observed in nucleic acid extracts of Giardia lamblia Portland I trophozoites obtained by Dr. D. G. Lindmark of Cleveland State University. Further analysis showed this dsRNA to be the genome of a small isometric virus that infects this protozoan. The virus was named Giardia lamblia virus (GLV), after the host it infects (Wang and Wang, 1986). Giardia lamblia, also known as Giardia intestinalis or Giardia duodenalis, is an anaerobic parasitic flagellate that inhabits the gastrointestinal tract of humans as well as many other mammals. When infecting humans, the parasitic protozoan causes giardiasis, an acute diarrhea that often progresses to a chronic carrier stage in adults and to severe malnutrition in children. In the chronic form, the parasites adhere to the human gut wall and impair the host's digestion and absorption of nutrients. In addition to GLV, there is the Giardia canis virus (GCV), a ~6.3 kbp dsRNA virus that infects Giardia canis, a protozoan that affects dogs and other canids. In 2001, GCV was isolated from different strains of G. canis and identified by Jianjun et al. Unlike G. lamblia infection in humans, most Giardia infections in dogs are asymptomatic. However, some infected dogs may suffer from acute or chronic diarrhea and from weight loss or poor weight gain in spite of having a normal appetite and, less commonly, from vomiting and lethargy (Inger et al., 2007). GLV and GCV probably correspond to the same species, as their nucleotide sequences share more than 94% identity, in addition to infecting the same Giardia species.

Taxonomy, Classification and Evolution

Currently, according to the International Committee on Taxonomy of Viruses (ICTV), GLV belongs to the genus Giardiavirus of the family Totiviridae. Members of the Totiviridae are characterized by a non-segmented dsRNA genome and virions with simple structures. Two other genera of dsRNA viruses that also infect protozoa, Leishmaniavirus and Trichomonasvirus, are included in this family, in addition to Totivirus and Victorivirus, which comprise viruses that infect fungi (King et al., 2012; ICTV, 2015). Some putative Totiviridae viruses are known to infect plants such as maize, the herb Panax notoginseng and the red alga Delisea pulchra (Chen et al., 2016; Guo et al., 2016; Lachnit et al., 2016). Others were found in fish species such as salmon and golden shiner baitfish (Wiik-Nielsen et al., 2012; Mor and Phelps, 2012). GLV is similar in many ways to Saccharomyces cerevisiae virus L (ScV), one of the best-known viruses of the Totiviridae family, namely: (1) single molecules of genomic dsRNA of similar sizes; (2) single major capsid polypeptides of similar sizes; (3) virion-associated RdRp with similar amino acid sequence motifs deduced from the genomic sequence; (4) ability to synthesize viral messages from intact virions in vitro; and (5) use of ribosomal frameshifting for the synthesis of viral polymerase. However, the two viruses do not cross-infect and the two dsRNAs do not cross-hybridize in Northern blots (Wang and Wang, 1991). Sequence alignments of GLV cDNAs with those from ScV-L, Leishmania RNA virus (LRV) and Trichomonas vaginalis virus (TVV) indicate that, despite the similar organization of the viral genomes, GLV shares very little overall sequence identity with any of these viruses whose genomic sequences have been completely determined. GLV is therefore not closely related evolutionarily to any of these viruses. GCV has not yet been given a completely defined classification. Phylogenetic analysis of RdRp gene sequences demonstrates that GCV clusters with GLV and piscine myocarditis virus (PMCV), a virus that affects fish myocardial cells, forming the GLV-like clade. Within this same family, the group most closely related to the GLV-like clade is the IMNV-like clade, composed of viruses characterized by affecting arthropods (insects and crustaceans) (Oliveira et al., 2014). The IMNV-like clade was recently proposed as a new genus called Artivirus, based on its distinct genomic organization in relation to other members of the Totiviridae family (Dantas et al., 2015). The striking similarity in genome organization and sequence identity, as well as the closer phylogenetic proximity of their hosts, may indicate that GCV and GLV share a common evolutionary origin; this can be further elucidated with more detailed genetic studies. The phylogenetic relationships among representative members of the Totiviridae family based on RdRp amino acid sequences are shown in Fig. 1.

Host Range and Geographic Distribution

Purified GLV readily infects many virus-free isolates of G. lamblia trophozoites, but not any of the other parasitic protozoa tested, including Tritrichomonas foetus and Trichomonas vaginalis (Sepp et al., 1994). The virus has also been shown not to infect two transformed human


Fig. 1 Phylogeny of Totiviridae family members calculated from an alignment of RdRp amino acid sequences using Bayesian inference (Oliveira et al., 2014). Numbers at branch nodes indicate posterior probabilities, and colors represent the genera according to the ICTV. The artivirus group has not been formally recognized as a genus of the family Totiviridae to date.

intestinal cell lines (Wang et al., 1988). It is therefore believed that GLV has a rather narrow host range; it probably infects only G. lamblia in nature. On the other hand, the cellular host of this virus (G. lamblia) parasitizes many mammals other than humans. Viruses that are identical in shape and in the size of their genomic dsRNA, and that share dsRNA sequence homology with one another, have been detected in many G. lamblia strains and isolates obtained from humans, guinea pigs, cats, beavers, llamas, and sheep. The human isolates were collected from Belgium, Poland, England, Israel, Ecuador, Puerto Rico, and various states in the USA (Miller et al., 1988). GCV is known to affect G. canis trophozoites. Some studies have reported G. canis isolates affecting dogs in Italy, Spain, USA, Australia, China, Canada, Germany and Sardinia, and wild animals such as coyotes from California and African painted dogs (reviewed in Thompson and Ash, 2016). Giardia is common in domestic dogs throughout the world, and is often the most common enteric parasite. Since Giardia species are found in almost all parts of the world, affecting organisms in developing as well as developed countries, it is expected that GCV and GLV follow their respective hosts and are also distributed worldwide.

Physical and Biochemical Characteristics

The 3D structures obtained by transmission electron cryomicroscopy (cryo-TEM) show that GLV has an icosahedral capsid of the so-called “T = 2” type, composed of 120 polypeptide subunits, the capsid proteins (CP), arranged on a “T = 1” lattice (Fig. 2(A)). Most of the exterior capsid surface forms a largely flat, raised plateau centered on each of the icosahedral 5-fold axes. Each plateau is composed of 10 CP subunits. Analysis of equatorial sections of the virus (Fig. 2(B)) suggests that the CP subunits consist predominantly of α-helical secondary structure, with each subunit having a maximum length of approximately 100 Å and very similar tertiary structures (Janssen et al., 2015). The capsid has a single layer with a maximum thickness of approximately 60 Å (median ~40 Å) and an outer diameter of approximately 485 Å, which makes the GLV capsid the largest among all totiviruses studied to date, including that of IMNV (Janssen et al., 2015).


Fig. 2 Three-dimensional structure of GLV virions. (A) Space-filling stereo view of the viral particle based on an icosahedral 2-fold axis (gray point). Colors correspond to radius, from dark green (outermost regions) to red (innermost regions). (B) Central thick section through the virion. Black and white correspond to the highest and lowest densities in the density map, respectively (Janssen et al., 2015; EMDB - EMD-5948).

There are no studies on the physical characteristics of the GCV capsid yet, but given its high genomic sequence similarity to GLV, it can be assumed that GCV has a capsid of similar structure. The GLV genome is believed to be homogeneously distributed within the capsid, except for some points of higher concentration near the shell. Thus, the genome distribution of GLV differs from that of other totiviruses (Janssen et al., 2015). As the GLV dsRNA can be readily radiolabeled with [32P]pCp and RNA ligase, the molecule must have a free 3′-OH group at one or both of its 3′ ends. The exact structure at the 5′ end of the dsRNA has not yet been elucidated. It is known that the dsRNA molecule cannot be phosphorylated by T4 polynucleotide kinase, and that the denatured dsRNA molecule can be circularized with T4 RNA ligase. Therefore, the 5′ terminus of the GLV genomic dsRNA is probably phosphorylated. Additionally, no covalently linked protein, such as that found in poliovirus, is present at this end (Garlapati and Wang, 2005; Furfine et al., 1989; Wang and Wang, 1986). Both GLV and GCV encode two proteins: a polypeptide of approximately 100 kDa (P100), which gives rise to the CP subunits, and the viral RNA-dependent RNA polymerase (RdRp), a 190 kDa polypeptide (Wang et al., 1993). Purified GLV virions are good antigens in mice. Polyclonal antibodies raised against whole virions react with P100 in Western blots and can effectively block viral infection of G. lamblia trophozoites in in vitro culture. The same antiserum also reacts with a minor component of the virion, the 190 kDa viral RdRp. Indeed, GLV-encoded RdRp activity has been detected in infected cell extracts as well as in purified GLV fractions. Antisera raised against synthetic dsRNA cross-react only with the viral dsRNA and do not protect against viral infection of G. lamblia trophozoites in vitro. Giardiaviruses do not contain any lipids, nor is the P100 polypeptide glycosylated (Wang et al., 1988).

Giardiavirus (Totiviridae)

Table 1 Structural properties of some small dsRNA viruses

Virus     Shape      Diameter (nm)  dsRNA (kbp)  Capsid protein (kDa)
GLV       Isometric  48.5           6.3          98
GCV       Isometric  48.5a          6.3          98
TVV       Isometric  35             4.6–5.9      74–82
LRV       Isometric  40             5.3          82
ESV       Isometric  35             6.2          86.5
ScV-L     Isometric  44             4.6          76
UmV-P1    Isometric  41–43          6.1          73
IMNV      Isometric  45             8.2          99
HvV190S   Isometric  36             5.2          81

Density (g/ml) and dsRNA (µm) values, given for only some of the viruses (row assignment unclear in the source): 1.368, 1.50, 1.468, 1.50, 1.368, 1.418, 1.63, 1.31.

a Expected.
Abbreviations: GLV, Giardia lamblia virus; GCV, Giardia canis virus; TVV, Trichomonas vaginalis virus; LRV, Leishmania RNA virus; ESV, Eimeria stiedae virus; ScV-L, Saccharomyces cerevisiae virus L; UmV-P1, Ustilago maydis virus; IMNV, Infectious myonecrosis virus; HvV190S, Helminthosporium victoriae virus 190S.

Experiments in which Trichomonas vaginalis virus 1 (TVV1) and GLV virions were exposed to different temperatures and pH values showed that GLV virions are more thermoresistant than TVV1 virions. GLV also mediates transcription over a wide pH range, with maximum efficiency at pH values close to 8.0, whereas TVV1 virions mediate transcription over a narrower pH range with maximum efficiency near pH 6.5. These characteristics correspond to the different intracellular conditions encountered in their respective hosts, since T. vaginalis trophozoites grow best under more acidic conditions (pH < 6.5) than G. lamblia trophozoites (pH > 7) (Janssen et al., 2015). The structural properties of GLV and GCV are compared with those of other totiviruses in Table 1.

Organization and Molecular Biology of GLV and GCV dsRNA Genomes

The 6277 bp GLV and 6276 bp GCV dsRNA genomes contain only two large open reading frames (ORFs), both on the same RNA strand. The two genomes share more than 94% nucleotide identity, with most single-nucleotide variations located in the ORF 1 region. The first ORFs of GLV and GCV (nts 367–3027 and 367–3030, respectively) encode the precursor polypeptide of the capsid protein. Thirty-two amino acid residues are apparently removed from the N-terminus of this precursor by a cellular cysteine protease before the processed capsid protein is assembled into the GLV virion (Yu et al., 1995; Chen et al., 2007; Liu et al., 2008). The second ORF (nts 2805–5978 and 2808–5981 for GLV and GCV, respectively) is encoded in the –1 frame relative to ORF 1, and the two ORFs overlap by 220 bp. RdRp motifs have been found in ORF 2 of many different totiviruses (Oliveira et al., 2014; Dantas et al., 2015). It is now known that the GLV RNA polymerase (190 kDa) is synthesized as a fusion protein of ORFs 1 and 2 at a level that is 2–5% of P100. Apparently, ribosomes carrying the nascent polypeptide chain stall when they encounter the pseudoknot structure within the 220 nt overlap of the two ORFs. At a frequency of 2–5%, the ribosomes slip back one nucleotide and proceed to translate ORF 2 as the C-terminus of a fusion protein. The ability of this GLV overlap fragment to induce a –1 ribosomal frameshift has been demonstrated in a reporter system in yeast (Wang et al., 1993). Schemes of the GLV and GCV genomes are shown in Fig. 3. The two ORFs are flanked at the 5′ and 3′ ends by untranslated regions (UTRs). In GLV, the 5′ and 3′ UTRs have 367 and 294 nts, respectively, and in GCV the corresponding regions have 367 and 298 nts (Wang et al., 1993; Dantas et al., 2015).
These two regions contain sequence elements that are critical for initiation of transcription and replication of GLV RNA. For example, deletion of a single nucleotide from the 5′ terminus of the (+) strand GLV RNA totally abolishes transcription of GLV mRNA. Similarly, deletion or alteration of sequences in these two regions drastically reduces the level of progeny viral RNA (Garlapati et al., 2001; Garlapati and Wang, 2004, 2005). GLV has been successfully used as a vector for the introduction of foreign genes into G. lamblia. When a portion of the GLV genome is replaced with the firefly luciferase gene in a cDNA construct downstream from a T7 promoter, chimeric RNA can be synthesized in vitro using T7 polymerase. Giardia cells that are infected with wild-type GLV and electroporated with the chimeric RNA show luciferase activity a millionfold above background. The chimeric RNA is replicated as dsRNA and packaged into recombinant virions that are shed into the culture supernatant. These recombinant viruses, together with wild-type GLV, can in turn infect naive Giardia trophozoites to produce luciferase activity (Garlapati et al., 2001). The foreign genes tested in this system include those encoding neomycin phosphotransferase, hygromycin phosphotransferase, and green fluorescent protein, all of which can be delivered and expressed at high levels (Yu et al., 1996). Inclusion of the nts 368–631 fragment of the P100 coding region as the N-terminus of a fusion protein with the foreign gene product has been shown to enhance the translation level 5000-fold (Yee and Nash, 1995). This region may therefore contain elements that promote interaction between GLV mRNA and the ribosomes of G. lamblia. Giardiavirus mRNA is neither capped nor polyadenylated.
Translation is initiated at an internal ribosome entry site (IRES) in the mRNA consisting of a 253 nt segment in the 5′ untranslated region (5′ UTR) and a 264 nt stretch in the ORF immediately downstream from the 5′ UTR. The initiation codon is localized at the center of an unstructured 31 nt segment flanked by complex secondary structures in the 5′ UTR and the ORF of the IRES. The precise position of the initiation codon is critical for translation initiation. It also suggests a direct recruitment of the 40S small ribosomal subunit to the initiation codon without ribosomal scanning, a process that is known to be absent from the translation machinery in Giardia (Garlapati and Wang, 2005).

Fig. 3 Organization of the Giardia lamblia virus and Giardia canis virus dsRNA genomes.
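As a rough consistency check on the ORF coordinates quoted above, the following sketch computes the ORF lengths, the reading frame of ORF 2 relative to ORF 1, and an approximate mass for the ORF 1 product. The dictionary layout, the function names, and the rule-of-thumb value of about 110 Da per amino acid residue are illustrative assumptions, not values taken from the cited studies. Codon counts include any stop codon covered by the quoted coordinates, so the masses are only approximate.

```python
# Coordinates (1-based, inclusive) are the ones quoted in the text.
# AVG_RESIDUE_KDA is an assumed rule of thumb (~110 Da per residue).
AVG_RESIDUE_KDA = 0.110

GENOMES = {
    "GLV": {"orf1": (367, 3027), "orf2": (2805, 5978)},
    "GCV": {"orf1": (367, 3030), "orf2": (2808, 5981)},
}


def orf_length(orf):
    """Length in nucleotides of an ORF given as (start, end), inclusive."""
    start, end = orf
    return end - start + 1


def relative_frame(orf_a, orf_b):
    """Reading frame of orf_b relative to orf_a, reported as -1, 0 or +1."""
    shift = (orf_b[0] - orf_a[0]) % 3
    return {0: 0, 1: 1, 2: -1}[shift]


def summarize(name):
    """Codon counts, relative frame and approximate ORF 1 mass for one genome."""
    genome = GENOMES[name]
    orf1_codons = orf_length(genome["orf1"]) // 3
    orf2_codons = orf_length(genome["orf2"]) // 3
    return {
        "orf1_codons": orf1_codons,
        "orf2_codons": orf2_codons,
        "orf2_frame": relative_frame(genome["orf1"], genome["orf2"]),
        "orf1_kda": round(orf1_codons * AVG_RESIDUE_KDA, 1),
    }


if __name__ == "__main__":
    for virus in GENOMES:
        print(virus, summarize(virus))
```

Under these assumptions, ORF 2 falls in the –1 frame relative to ORF 1 in both genomes, and the ORF 1 product comes out near 98 kDa, in line with the P100 designation.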

Infection and Replication

GLV infects susceptible isolates of G. lamblia very efficiently. It has been estimated that demonstrable infection can be achieved at a multiplicity of infection as low as 10 GLV particles per cell (Miller et al., 1988). The virus is not observed in Giardia cysts, and is not known to occur during the transition between cyst and trophozoite. The exact mode of viral entry into the cell has not yet been elucidated. During the course of infection, the viruses are observed to enter susceptible parasite cells via endocytosis, with the peripheral lysosomal vacuoles serving as ports for virus translocation to the cytoplasm (Tai et al., 1993). Sepp et al. (1994) suggest that this internalization is mediated by endocytosis receptors. However, a number of virus-free G. lamblia strains have been found to be resistant to GLV infection. When trophozoites of these strains are electroporated with the single-stranded GLV RNA (ssRNA), they become infected by GLV and can support and complete the replication cycle, producing progeny virions that are fully infective to sensitive strains (Furfine and Wang, 1990). The receptors are therefore expected to be present on the surface of cells susceptible to the virus, where they bind and internalize it, and to be absent from resistant protozoa. GLV is the only known protozoan virus that can be efficiently transmitted by extracellular pathways, indicating that the virions have the protein machinery needed to mediate entry efficiently. When purified GLV was used to infect a virus-free G. lamblia WB strain, in situ hybridization showed that dsRNA replication was first detected in the cytoplasm. Toward the late stage of infection, viral dsRNA was found to spread into the twin nuclei of the infected flagellate. Transmission electron microscopic examination of thin sections of infected cells at stationary phase also reveals paracrystals of virus-like particles in the nuclei (Fig. 4).
It has been estimated that an infected cell may harbor as many as 10⁵ GLV particles without lysis. Meanwhile, mature and infectious GLV particles begin to appear in the supernatant of Giardia culture medium 42 h after infection (Wang and Wang, 1986; Tai et al., 1991). It is not known whether a specific cellular process is involved in the release of GLV into the culture supernatant, although cell lysis has not been observed as a consequence of viral infection. The viruses are excreted into the culture medium and can infect virus-free trophozoites of the same isolate and of many other Giardia isolates. The growth rate of newly infected G. lamblia cells decreases with increasing multiplicity of infection. As the ratio of infecting virus to cells increases, the percentage of non-adhering (non-dividing) trophozoites also increases. However, in established infected cell lines or at moderate multiplicity of infection (<1000), the infected cell assumes a normal appearance and maintains the same growth rate as its uninfected counterpart. Furthermore, GLV infection persists indefinitely throughout repeated subcultures (Wang and Wang, 1986; Miller et al., 1988). Janssen et al. (2015) suggest that GLV can transcribe its genome inside the icosahedral capsid right after entry into the G. lamblia cytoplasm. The genome is transcribed by the RdRp domain of the CP/RdRp subunit(s) while still inside the “T = 2” icosahedral capsid, and the plus-strand transcripts are then released from the particle, where they are translated into the GLV proteins and packaged into nascent progeny virions. This strategy is thought to be common to all dsRNA viruses, possibly except for birnaviruses (Luque et al., 2009).
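The fusion-protein level of 2–5% of P100 reported for GLV invites a back-of-the-envelope estimate of how many CP/RdRp subunits a single particle might carry. The sketch below is only that: it assumes a “T = 2” shell of 120 capsid-protein subunits (60 asymmetric dimers), treats the 2–5% figure as the per-subunit chance of being a fusion protein, and treats subunits as independent draws. None of these modelling choices, nor the function names, come from the cited studies.

```python
# Assumption: a "T = 2" totivirus-like capsid contains 120 capsid-protein
# subunits (60 asymmetric dimers). This number is not stated in the text.
SUBUNITS_PER_CAPSID = 120


def expected_fusion_copies(fraction, subunits=SUBUNITS_PER_CAPSID):
    """Expected number of CP/RdRp fusion subunits per particle, assuming
    incorporation simply mirrors the synthesis ratio."""
    return subunits * fraction


def prob_no_fusion(fraction, subunits=SUBUNITS_PER_CAPSID):
    """Probability that a particle carries no fusion subunit at all,
    treating each subunit as an independent draw (binomial model)."""
    return (1.0 - fraction) ** subunits


if __name__ == "__main__":
    for fraction in (0.02, 0.05):
        copies = expected_fusion_copies(fraction)
        empty = prob_no_fusion(fraction)
        print(f"fraction={fraction:.2f}  expected copies={copies:.1f}  "
              f"P(no RdRp)={empty:.4f}")
```

Under these simplifying assumptions the estimate is roughly two to six fusion subunits per particle. Actual incorporation could differ if assembly selects for or against the fusion protein.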


Fig. 4 Thin section of a giardiavirus-infected cell. C, cytoplasm; N, nuclear envelope; F, flagella; D, ventral disk; V, aggregates of virus particles.

In addition to the GLV 6.3 kbp genomic dsRNA, a single-stranded RNA (ssRNA) of identical length that is homologous to only one of the two strands of GLV dsRNA is also found in GLV-infected cell extracts (Furfine et al., 1989). Studies of the infection time course showed that, in contrast to GLV dsRNA, which increases steadily from 23 to 141 h after infection, ssRNA becomes detectable at about the same time, peaks at 42–50 h after infection, then gradually declines. Electroporation of gel-purified ssRNA into uninfected WB cells resulted in the recovery of dsRNA from the cell extract and of infectious GLV particles from the culture supernatant, demonstrating that the ssRNA is the viral messenger RNA as well as the full-length replicative form of the viral genome (Furfine and Wang, 1990). Nucleotide sequence analysis of GLV cDNA clones also verifies that it is the ssRNA strand that encodes the large ORFs. No subgenomic viral RNA is detected inside the infected cell. The RNA products synthesized in vitro by the virion-associated RdRp are homologous to ssRNA and complementary to the negative strand. Transcription of the viral message must therefore proceed conservatively, using the dsRNA as its template.

Studies performed by Cao et al. (2009) indicate that GCV may be released by budding. TEM micrographs of G. canis infected with GCV show gradual bud formation during virus discharge. Alternatively, they suggest that the virus may also be released by lysis of infected trophozoites, because of the apparent loss of cell membranes when the trophozoites were crowded.

Since the successful cloning of the full-length cDNA from the dsRNA template of GLV (Wang et al., 1993), modified giardiavirus RNA has been used as a genetic tool for the study of possible intervention and pathogenesis (Liu et al., 2005). Similarly, GCV has been used to construct a stable transfection system for trophozoites of G. canis (Chen et al., 2007; Liu et al., 2008). Liu et al. (2008) established a GCV-based system with stable expression of green fluorescent protein in G. canis, while Chen et al. (2007) cloned a hammerhead ribozyme flanked by various lengths of antisense Krr1 RNA into a viral vector derived from the GCV genome. In the same study, the RNA transcripts showed high cleavage activity on G. canis Krr1 mRNA in vitro, causing a decrease in its levels and leading to trophozoite deformation.

The extent to which a person is affected by giardiasis varies widely. The severity of the infection depends on many variables derived not only from the parasite, but also from the human host. Attempts to correlate the presence or absence of GLV with the severity of giardiasis have been inconclusive, and we still do not know what role, if any, GLV might play in the delicate balance of the host–parasite interaction.

References

Cao, L., Gong, P., Li, J., et al., 2009. Giardia canis: Ultrastructural analysis of G. canis trophozoites transfected with full length G. canis virus cDNA transcripts. Exp. Parasitol. 123, 212–217.
Chen, S., Cao, L., Huang, Q., Qian, Y., Zhou, X., 2016. The complete genome sequence of a novel maize-associated totivirus. Arch. Virol. 161, 487–490.
Chen, L.F., Li, J.H., Zhang, X.C., et al., 2007. Inhibition of Krr1 gene expression in Giardia canis by a virus mediated hammerhead ribozyme. Vet. Parasitol. 143, 14–20.
Dantas, M.D.A., Chavante, S.F., Teixeira, D.I.A., Lima, J.P.M., Lanza, D.C., 2015. Analysis of new isolates reveals new genome organization and a hypervariable region in infectious myonecrosis virus (IMNV). Virus Res. 203, 66–71.
Furfine, E.S., Wang, C.C., 1990. Transfection of the Giardia lamblia double-stranded RNA virus into Giardia lamblia by electroporation of a single-stranded RNA copy of the viral genome. Mol. Cell. Biol. 10 (7), 3659–3662.
Furfine, E.S., White, T.C., Wang, A.L., Wang, C.C., 1989. A single-stranded RNA copy of the Giardia lamblia virus double-stranded RNA genome is present in the infected Giardia lamblia. Nucleic Acids Res. 17 (18), 7453–7467.
Garlapati, S., Chou, J., Wang, C.C., 2001. Specific secondary structures in the capsid-coding region of giardiavirus transcript are required for its translation in Giardia lamblia. J. Mol. Biol. 308, 623–638.
Garlapati, S., Wang, C.C., 2004. Identification of a novel internal ribosome entry site in Giardiavirus that extends to both sides of the initiation codon. J. Biol. Chem. 279, 3389–3397.
Garlapati, S., Wang, C.C., 2005. Structural elements in the 5′-untranslated region of giardiavirus transcript essential for internal ribosome entry site-mediated translation initiation. Eukaryotic Cell 4, 742–754.
Guo, L., Yang, X., Wu, W., et al., 2016. Identification and molecular characterization of Panax notoginseng virus A, which may represent an undescribed novel species of the genus totivirus, family Totiviridae. Arch. Virol. 161, 731–734.
ICTV - International Committee on Taxonomy of Viruses, 2015. Virus Taxonomy: Release. Available at: http://www.ictvonline.org/virusTaxonomy.asp (accessed 30.10.16).
Inger, S.H., Bjørn, K.G., Lucy, J.R., 2007. A longitudinal study on the occurrence of Cryptosporidium and Giardia in dogs during their first year of life. Acta Vet. Scand. 49, 22.
Janssen, M.E.W., Takagi, Y., Parent, K.N., et al., 2015. Three-dimensional structure of a protozoal double-stranded RNA virus that infects the enteric pathogen Giardia lamblia. J. Virol. 89, 1182–1194.
King, A.M., Adams, M.J., Lefkowitz, E.J., Carstens, E.B., 2012. Virus Taxonomy: Ninth Report of the International Committee on Taxonomy of Viruses. Amsterdam: Elsevier.
Lachnit, T., Thomas, T., Steinberg, P., 2016. Expanding our understanding of the seaweed holobiont: RNA viruses of the red alga Delisea pulchra. Front. Microbiol. 6, 1489.
Liu, C., Li, J., Zhang, X., Liu, Q., et al., 2008. Stable expression of green fluorescent protein mediated by GCV in Giardia canis. Parasitol. Int. 57, 320–324.
Liu, Q., Zhang, X., Li, J., et al., 2005. Giardia lamblia: Stable expression of green fluorescent protein mediated by giardiavirus. Exp. Parasitol. 109 (3), 181–187.
Luque, D., Saugar, I., Rejas, M.T., et al., 2009. Infectious bursal disease virus: Ribonucleoprotein complexes of a double-stranded RNA virus. J. Mol. Biol. 386, 891–901.
Miller, R.L., Wang, A.L., Wang, C.C., 1988. Identification of Giardia lamblia isolates susceptible and resistant to infection by the double-stranded RNA virus. Exp. Parasitol. 66 (1), 118–123.
Mor, S.K., Phelps, N.B., 2012. Molecular detection of a novel totivirus from golden shiner (Notemigonus crysoleucas) baitfish in the USA. Arch. Virol. 161, 2227–2234.
Oliveira, R.A., Almeida, R.V., Dantas, M.D., et al., 2014. In silico single strand melting curve: A new approach to identify nucleic acid polymorphisms in Totiviridae. BMC Bioinformatics 15, 243.
Sepp, T., Wang, A.L., Wang, C.C., 1994. Giardiavirus-resistant Giardia lamblia lacks a virus receptor on the cell membrane surface. J. Virol. 68, 1426–1431.
Tai, J.H., Ong, S.J., Chang, S.C., Su, H.M., 1993. Giardiavirus enters Giardia lamblia WB trophozoite via endocytosis. Exp. Parasitol. 76, 65–74.
Tai, J.H., Wang, A.L., Ong, S.J., et al., 1991. The course of giardiavirus infection in the Giardia lamblia trophozoites. Exp. Parasitol. 73 (4), 413–423.
Thompson, R.C.A., Ash, A., 2016. Molecular epidemiology of Giardia and Cryptosporidium infections. Infect. Genet. Evol. 40, 315–323.
Wang, A.L., Miller, R.L., Wang, C.C., 1988. Antibodies to the Giardia lamblia double-stranded RNA virus major protein can block the viral infection. Mol. Biochem. Parasitol. 30 (3), 225–232.
Wang, A.L., Wang, C.C., 1986. Discovery of a specific double-stranded RNA virus in Giardia lamblia. Mol. Biochem. Parasitol. 21 (3), 269–276.
Wang, A.L., Wang, C.C., 1991. Viruses of the protozoa. Annu. Rev. Microbiol. 45, 251–263.
Wang, A.L., Yang, H.M., Shen, K.A., Wang, C.C., 1993. Giardiavirus double-stranded RNA genome encodes a capsid polypeptide and a gag-pol-like fusion protein by a translation frameshift. Proc. Natl. Acad. Sci. 90, 8595–8599.
Wiik-Nielsen, J., Alarcón, M., Fineid, B., Rode, M., Haugland, Ø., 2012. Genetic variation in Norwegian piscine myocarditis virus in Atlantic salmon, Salmo salar L. J. Fish Dis. 36, 129–139.
Yee, J., Nash, T.E., 1995. Transient transfection and expression of firefly luciferase in Giardia lamblia. Proc. Natl. Acad. Sci. 92, 5615–5619.
Yu, D.C., Wang, A.L., Wang, C.C., 1996. Stable coexpression of a drug-resistance gene and a heterologous gene in an ancient parasitic protozoan Giardia lamblia. Mol. Biochem. Parasitol. 83, 81–91.
Yu, D.C., Wang, A.L., Wu, C.H., Wang, C.C., 1995. Virus-mediated expression of firefly luciferase in the parasitic protozoan Giardia lamblia. Mol. Cell. Biol. 15, 4867–4872.

Hypoviruses (Hypoviridae)
Dong-Xiu Zhang and Donald L Nuss, University of Maryland, Rockville, MD, United States
© 2021 Elsevier Ltd. All rights reserved.
This is an update of D.L. Nuss, Hypoviruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00406-4.

Glossary
Anastomosis The fusion of fungal hyphae resulting in exchange of cytoplasmic material and hypovirus transmission.
Hypovirulence Virus-mediated attenuation of fungal virulence.
Vegetative incompatibility A system controlled by at least six genetic loci that determines the ability of two fungal strains to undergo anastomosis.

Introduction

The discovery of hypoviruses, a group of RNA viruses that reduce the virulence (hypovirulence, see article on Hypovirulence; volume 3, this encyclopedia) of the chestnut blight fungus Cryphonectria parasitica, stimulated intensive research into the potential of using fungal viruses for biological control of fungal diseases. Documented examples of virus-mediated hypovirulence have been reported for fungal diseases of plants that range from trees to turfgrass and involve mycoviruses which include representatives from the Totiviridae, Chrysoviridae, Reoviridae, Narnaviridae and Hypoviridae (the hypoviruses). However, the hypoviruses remain the most thoroughly studied of the hypovirulence-associated viruses, primarily due to several significant advances in hypovirus molecular biology.

In 1977, Peter Day and co-workers at the Connecticut Agricultural Experiment Station reported that hypovirulent C. parasitica strains harbor double-stranded (ds) RNAs, providing the first indication of the nature of the cytoplasmic elements responsible for the hypovirulence phenotype. Subsequent surveys of dsRNAs associated with different North American and European hypovirulent strains revealed considerable variations in concentration, number and size of dsRNA components. By the late 1980s, it was clear that a detailed molecular analysis of the dsRNAs associated with a single hypovirulent strain was required to bring some measure of order to the mounting confusion generated by such surveys. This resulted in the cloning and complete sequence determination of the prototypic hypovirus, now designated CHV1-EP713, in 1991. This milestone was followed in 1992 by the construction of an infectious full-length (12,712 bp) cDNA clone of CHV1-EP713 RNA by Choi and Nuss. This development furnished direct evidence that hypoviruses are indeed the causative agents responsible for transmissible hypovirulence and provided the means for facile manipulation of the hypovirus genome.
Current interest in hypoviruses extends past biological control potential to their utility as unique experimental tools for probing fundamental processes underlying fungal pathogenesis and mycovirus-fungal host interactions.

Taxonomy and Genetic Organization

Hypoviruses are classified within the family Hypoviridae, consisting of the genus Hypovirus and four species with the exemplar strains designated as Cryphonectria parasitica hypovirus 1–4 (CHV1–CHV4). Hypovirus taxonomy is not based on virus structure, since this group of viruses does not encode a coat protein, but on genome organization, sequence similarity and symptom expression. Hypovirus genetic information is found predominantly as dsRNA associated with membrane vesicles ranging in diameter from 50 to 80 nm. As observed for fungal viruses generally, hypoviruses exhibit no extracellular phase in their life cycle. Infections cannot be initiated by inoculation with an infected cell extract or enriched fractions. Instead, these viruses are transmitted by cytoplasmic mixing as a result of fusion (anastomosis) between vegetatively compatible strains or, to a variable degree, in asexual spores (conidia). The hypoviruses were assigned the designations CHV1–CHV4 in the order in which their genome sequences were completed. The primary nucleotide sequence for the coding strand of the prototypic member of the genus Hypovirus, CHV1-EP713, specifies two large open reading frames (ORFs) designated ORF A and ORF B (Fig. 1). A second closely related member, CHV1-Euro7, has the same organization and shares approximately 90% identity at the nucleotide level with CHV1-EP713. The type member of Cryphonectria parasitica hypovirus 2, CHV2-NB58, shares only about 60% nucleotide sequence identity with CHV1-EP713 and lacks a portion of ORF A that, in CHV1, encodes a functional cis-acting cysteine protease. Sequenced members of Cryphonectria parasitica hypovirus 3 and 4, CHV3-GH2 and CHV4-SR2, respectively, are several kb shorter than CHV1-EP713 and CHV2-NB58 (approximately 9.2–9.8 kb versus 12.5–12.7 kb), contain a single large ORF rather than two ORFs, and are both more distantly related phylogenetically to CHV1-EP713 than CHV1-EP713 is to CHV2-NB58.
Unassigned hypovirus-related viruses have been reported to infect filamentous fungi other than C. parasitica, including Sclerotinia sclerotiorum, Valsa ceratosperma, Phomopsis longicolla, Fusarium spp., Macrophomina phaseolina and Agaricus bisporus.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20940-0


Fig. 1 Genetic organization of sequenced hypovirus genomes representing the four species that comprise the genus Hypovirus, family Hypoviridae. Amino acid identity levels for coding regions of the two sequenced members of the species Cryphonectria hypovirus 1, CHV1-EP713 and CHV1-Euro7, are indicated between representations of the two viral genomes. Protein coding regions homologous to the CHV1-EP713-encoded p29, p40, p48, polymerase and helicase are color coded. The magenta regions represent a short conserved cysteine-rich domain. Note that the genomes of CHV3-GH2 and CHV4-SR2 contain a single ORF. Modified from Dawe, A.L., Nuss, D.L., 2001. Hypoviruses and chestnut blight: Exploiting viruses to understand and modulate fungal pathogenesis. Annual Review of Genetics 35, pp. 1–29, with permission from Annual Reviews.

Fig. 2 Expression strategy for prototypic hypovirus CHV1-EP713. The CHV1-EP713 coding strand consists of 12,712 nucleotides excluding a poly(A) tail, and contains two major coding domains designated ORF A and ORF B. Details are discussed in the text. Modified from Dawe, A.L., Nuss, D.L., 2001. Hypoviruses and chestnut blight: Exploiting viruses to understand and modulate fungal pathogenesis. Annual Review of Genetics 35, pp. 1–29, with permission from Annual Reviews.

Hypovirus Gene Expression Strategy

Although hypovirus genetic information is readily recovered from infected cultures as linear dsRNA, the absence of a discrete virus particle and an extracellular infection phase presents some difficulties in precisely defining the hypovirus genome. Synthetic copies of the coding strand are infectious by electroporation into fungal spheroplasts, and phylogenetic analyses suggest a common ancestry with the positive-strand RNA plant potyviruses. Thus, one could consider hypoviruses as having a positive-strand RNA genome and the dsRNA as representing accumulated replicative form RNA. Irrespective of this complication, direct analysis of hypovirus dsRNAs, cDNA cloning studies and in vitro translational analyses have provided the following view of genetic organization and expression strategies for the prototypic hypovirus CHV1-EP713, shown in Fig. 2. One strand contains a 3′ poly(A) tail while the complementary strand contains a 5′ poly(U) tract. All of the CHV1-EP713 coding information appears to reside within two contiguous ORFs on the 12,712 nt long polyadenylated strand. Translation of ORF A is facilitated by an internal ribosome entry site (IRES) located at the end of the 5′ non-coding region and extending into the beginning of the coding domain. ORF A encodes two polypeptides, p29 and p40, that are released from a polyprotein precursor, p69, by an autoproteolytic event between Gly-248 and Gly-249, mediated by a papain-like protease catalytic domain located within p29. ORF B has the capacity to encode a polyprotein of 3165 amino acids and contains unmistakable RNA-dependent RNA polymerase and helicase motifs. Proteolytic processing of only a portion of the ORF B polyprotein has been elucidated, in the form of the autoproteolytic release of a 48 kDa protein, p48, from the N-terminus. This cleavage event occurs between Gly-418 and Ala-419 and is catalyzed by essential residues Cys-341 and His-388 within p48. The junction between ORF A and ORF B is defined by the sequence 5′-UAAUG-3′, in which the UAA portion clearly serves as the termination codon for ORF A and the AUG portion is thought to serve as the initiation codon for ORF B. The mechanism by which ribosomes transition through the junction involves coupled termination/reinitiation. This unusual pentanucleotide sequence is found at the ORF A/ORF B junction for all confirmed strains of Cryphonectria hypovirus 1. There is clearly a need for additional fine-detailed mapping of the processing cascades for hypovirus-encoded polyproteins.

The genome of CHV2-NB58, the type strain of Cryphonectria hypovirus 2, also consists of a two-ORF configuration with a UAAUG junction (Fig. 2). However, ORF A lacks the papain-like catalytic and cleavage sites of p29 and directs the translation of a 50 kDa protein product. ORF B of CHV2-NB58 does contain a p48 homologue, p52. The N-terminal portion of the single ORF of type strain CHV3-GH2 (Cryphonectria hypovirus 3) contains a protease, p32, with similarity to p29. A putative protease domain has been identified at the N-terminal portion of the type strain CHV4-SR2 of Cryphonectria hypovirus 4, but protease cleavage has not been demonstrated.

In most hypovirus-infected C. parasitica isolates, the full-length viral dsRNA is accompanied by a constellation of shorter dsRNA species. These ancillary dsRNAs appear to be generated by internal deletion events, are replicated only in the presence of the full-length viral RNA and are not associated with any function or phenotypic effect.
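The cleavage positions and the UAAUG junction described above lend themselves to a small arithmetic sketch. It locates the stop and start codons within the pentanucleotide, derives the frame of ORF B relative to ORF A, and converts the reported cleavage sites into approximate product sizes. The helper names and the figure of about 110 Da per residue are illustrative assumptions, and the ORF A length of 622 codons is taken as the sum of the 24 and 598 codons quoted in the chapter's discussion of hypovirus expression vectors.

```python
# Assumed rule of thumb: ~110 Da per amino acid residue.
AVG_RESIDUE_KDA = 0.110

JUNCTION = "UAAUG"          # ORF A / ORF B junction quoted in the text
ORF_A_CODONS = 24 + 598     # p69 precursor; codon counts quoted in the chapter
ORF_B_RESIDUES = 3165       # ORF B coding capacity quoted in the text


def junction_frame_shift(junction=JUNCTION):
    """Frame of the AUG relative to the UAA stop codon, as -1, 0 or +1."""
    stop_at = junction.find("UAA")
    start_at = junction.find("AUG")
    shift = (start_at - stop_at) % 3
    return {0: 0, 1: 1, 2: -1}[shift]


def cleave(total_residues, last_n_terminal_residue):
    """Sizes (in residues) of the N- and C-terminal cleavage products."""
    return last_n_terminal_residue, total_residues - last_n_terminal_residue


def kda(residues):
    """Approximate mass in kDa under the assumed average residue mass."""
    return round(residues * AVG_RESIDUE_KDA, 1)


p29_len, p40_len = cleave(ORF_A_CODONS, 248)      # Gly-248 / Gly-249
p48_len, orf_b_rest = cleave(ORF_B_RESIDUES, 418)  # Gly-418 / Ala-419

if __name__ == "__main__":
    print("ORF B frame relative to ORF A:", junction_frame_shift())
    print("p29:", p29_len, "aa,", kda(p29_len), "kDa")
    print("p40:", p40_len, "aa,", kda(p40_len), "kDa")
    print("p48:", p48_len, "aa,", kda(p48_len), "kDa")
    print("ORF B remainder:", orf_b_rest, "aa")
```

With these assumptions the AUG begins at the third nucleotide of the UAA codon, which places ORF B in the –1 frame relative to ORF A, and the estimated masses (about 27, 41 and 46 kDa) fall close to the nominal p29, p40 and p48 designations.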

Hypovirus Functional Domains

Hypovirus-encoded symptom determinants and important expression and replication elements have been mapped (Fig. 3) by a combination of approaches that include (a) the construction of recombinant chimeras from hypoviruses that differ in their influence on host phenotype, (b) mutagenesis of a hypovirus infectious cDNA clone and (c) cellular expression of viral coding domains independent of virus infection. By analogy with plant viruses, CHV1-EP713 and CHV1-Euro7 can be viewed as severe and mild hypovirus isolates, respectively. Although these two CHV1 isolates cause quite different phenotypic changes in their fungal host, they share a high level of sequence similarity that has allowed the construction of viable chimeric viruses to begin mapping the determinants responsible for the differences in phenotypic changes. Differences in host colony morphologies were found to map to a region extending from a position just downstream of the p48 coding domain (map position 3575) to map position 9879, with clear indications of multiple discrete determinants. More specifically, the region extending from position 3575 to 5310 was able to confer a CHV1-EP713-like colony morphology when inserted into a CHV1-Euro7 genetic background. The CHV1-EP713 p48 coding region was found to be a dominant determinant contributing to suppression of asexual spore formation on the canker face. In a separate study, p48 was shown to have the unusual property of being required for the initiation but not the maintenance of viral RNA replication. The chimeric hypoviruses also proved to be very useful reagents when coupled with a pathway-specific promoter/reporter system to map viral determinants responsible for altering host G-protein/cAMP-mediated signaling. A common undesired side effect of hypovirus-mediated virulence attenuation is a significantly reduced ability of the fungal host to colonize and produce spores on the corresponding plant host.
This reduces the ability of hypovirulent fungal strains to persist and spread through the ecosystem. Thus, from a practical perspective, a better understanding of the nature of viral symptom determinants and their relative effects on specific regulatory pathways and expression of gene clusters provides the means for a more rational approach for engineering hypoviruses that exhibit a desired balance between virulence attenuation and ecological fitness. Additional insights into the functional role of viral coding regions were indirectly provided during efforts to develop hypoviruses as gene expression vectors. The nucleotide sequence corresponding to the first 24 codons of p29 was found to be required for viral replication, while the remaining 598 codons of ORF A, including all of the p40 coding region, was found to be dispensable. The 24 N-terminal codons were subsequently shown to function as an essential element of an internal ribosome entry

Fig. 3 Map of CHV1-EP713 symptom determinants and essential/dispensable replication elements as described in the text. Modified from Dawe, A.L., Nuss, D.L., 2001. Hypoviruses and chestnut blight: Exploiting viruses to understand and modulate fungal pathogenesis. Annual Review of Genetics 35, pp. 1–29, with permission from Annual Reviews.


site (IRES) involved in ORF A translational initiation. Substantial alterations were also tolerated in the pentanucleotide UAAUG that contains the ORF A termination codon and the overlapping ORF B initiation codon. For example, replication competence was maintained following either a frame-shift mutation that caused a two-codon extension of ORF A or a modification that produced a single-ORF genomic organization. Further characterization of p40 revealed an accessory role in viral RNA amplification, with a functional domain extending from Thr(288) to Asn(313), and a role for a p40 coding domain just upstream of the UAAUG pentamer in coupled translational termination/reinitiation. Expression of the CHV1-EP713-encoded papain-like protease, p29, in the absence of virus infection was shown to cause a subset of the phenotypic changes exhibited by CHV1-EP713-infected strains, e.g., a white phenotype (reduction in orange pigmentation), reduced asexual sporulation and a slight reduction in the production of fungal laccase activity. By deleting all but the first 24 N-terminal codons of p29 in the context of the CHV1-EP713 infectious cDNA clone (mutant virus Δp29), it was also possible to show that the p29 protein is dispensable for viral replication and to demonstrate a near restoration of orange pigment production and a moderate increase in conidiation levels relative to wild-type CHV1-EP713-infected fungal colonies. Deletion of p29 had no effect on virus-mediated virulence attenuation. A gain-of-function analysis involving progressive repair of the Δp29 mutant was also devised to map the p29 symptom determinant domain to a region extending from Phe-25 to Gln-73. When expressed from a chromosomally integrated cDNA copy, p29 elevated RNA accumulation and vertical transmission (through conidia) of the Δp29 mutant virus to levels observed for wild-type CHV1-EP713.
Additional mutational studies indicated a linkage between p29-mediated changes in host phenotype in the absence of virus infection and p29-mediated in trans enhancement of viral RNA accumulation and transmission. The multifunctional nature of p29 was extended to include suppression of RNA silencing both in the natural fungal host and in a heterologous plant system. This activity was predicted based on similarities between p29 and the well-characterized potyvirus-encoded suppressor of RNA silencing HC-Pro, and it provided the first circumstantial evidence that RNA silencing in fungi may serve as an antiviral defense mechanism, a well-established function in plants.

Anti-Hypovirus Defense Mechanisms Direct evidence for RNA silencing as a defense against hypovirus infection was obtained by systematic disruption of the C. parasitica-encoded Dicer and Argonaute genes, core RNA silencing pathway components. Dicer nucleases recognize and process viral double-stranded and structured RNAs into small RNAs 21–24 nt long, termed virus-derived small (vs) RNAs. With the aid of an Argonaute family protein, the vsRNAs are incorporated into an effector complex termed the RNA-induced silencing complex (RISC). One strand of the vsRNA is then degraded and the remaining strand guides the effector complex to the cognate viral RNA, which is then cleaved by the Argonaute-associated RNase H-like activity. Disruption of the two Dicer-like genes and the four Argonaute-like genes found in C. parasitica resulted in no observable phenotypic changes. However, infection of the mutant strains with hypoviruses or an unrelated mycoreovirus, MYRV1-Cp9B21, resulted in increased viral RNA levels and severe virus-induced symptoms only in the Δdcl2 and Δagl2 mutants. The dependence on a single Dicer and a single Argonaute for antiviral defense in C. parasitica contrasts with the dependence on multiple Dicer and Argonaute genes for antiviral defense in plants. Dicer dcl2 transcript levels were shown to increase 10–12-fold, in an agl2-dependent manner, following infection by Cryphonectria hypovirus 1 (CHV1) and to exhibit super-induction (over 30-fold) after infection by CHV1-Δp69, which lacks the p29 viral suppressor of RNA silencing. A genetic screen subsequently identified the conserved transcriptional activator SAGA (Spt-Ada-Gcn5 acetyltransferase) as a regulator of this up-regulation. Mutational analysis demonstrated that the SAGA-associated histone acetyltransferase (HAT) activity, but not the histone deubiquitinase (DUB) activity, was the major influence on the SAGA-mediated induction of dcl2 transcription following virus infection.
Early efforts to characterize hypovirus RNA at the molecular level revealed the accumulation of internally deleted defective interfering (DI) RNAs generated from full-length viral RNA by a combination of deletion and recombination events. Recombination-mediated deletion of non-viral nucleotide sequences also limited efforts to use recombinant hypoviruses to express foreign genes. Interestingly, DI RNAs failed to accumulate and recombinant hypovirus expression vectors were shown to be stable in the Δdcl2 and Δagl2 mutants. The combined results were the first to link an RNA silencing pathway to viral RNA recombination and are of potentially broad significance, with possible applications for stable production of biologicals by mycovirus-based vectors. While the RNA silencing pathway serves as an antiviral defense at the cellular level, the C. parasitica vegetative incompatibility (vic) non-self recognition system provides antiviral defense at the population level. In the absence of an extracellular phase in their replication cycle, mycoviruses rely on anastomosis (fusion of hyphae) as a major avenue for horizontal transmission. The fungal vic system restricts mycovirus transmission through incompatible reactions triggered when genetically distinct, vic-incompatible individuals of the same species interact. This reaction results in localized programmed cell death along the zone of contact, limiting transmission of viruses and other cytoplasmic elements. The C. parasitica vic system is controlled by at least six di-allelic loci, designated vic1, vic2, vic3, vic4, vic6 and vic7, that were recently identified and characterized at the molecular level. Gene disruption analysis formally confirmed that five of the six loci contribute to restriction of virus transmission. An allelic difference at vic4 does not restrict virus transmission.
Systematic disruption of four of the five virus-restricting vic gene alleles was reported to engineer a super hypovirus donor (SD) strain formulation that was able to transmit hypovirus under laboratory conditions to uninfected strains that were heteroallelic at one to five of the virus-restricting vic loci. These results predict that the SD formulation could circumvent vic-imposed restrictions to virus


transmission by serving as an effective vector to introduce hypovirus into natural field strains with widely diverse vic genotypic combinations of the six defined diallelic vic loci. This property should find application in efforts to integrate biological control with increased disease resistance derived from backcross breeding programs or transgenes for effective woodland restoration of the American chestnut.
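The combinatorics behind the vic barrier can be illustrated with a short sketch. It assumes, purely for illustration, that the six di-allelic loci named above assort independently, which gives 2^6 = 64 possible vic genotypes. The snippet enumerates them and counts how many recipients differ from a hypothetical donor at zero to five of the virus-restricting loci. The function names and the choice of donor genotype are illustrative assumptions, not part of the cited studies.

```python
from collections import Counter
from itertools import product

# Loci named in the text; an allelic difference at vic4 does not
# restrict virus transmission, so it is excluded from RESTRICTING.
VIC_LOCI = ("vic1", "vic2", "vic3", "vic4", "vic6", "vic7")
RESTRICTING = tuple(locus for locus in VIC_LOCI if locus != "vic4")


def all_genotypes():
    """Enumerate every combination of allele 1 or 2 at the six loci."""
    return [dict(zip(VIC_LOCI, alleles))
            for alleles in product((1, 2), repeat=len(VIC_LOCI))]


def restricting_differences(donor, recipient):
    """Count virus-restricting loci at which two strains carry different alleles."""
    return sum(donor[locus] != recipient[locus] for locus in RESTRICTING)


genotypes = all_genotypes()
donor = genotypes[0]  # hypothetical donor: allele 1 at every locus
spectrum = Counter(restricting_differences(donor, g) for g in genotypes)

print(len(genotypes))            # number of possible vic genotypes
print(sorted(spectrum.items()))  # recipients per number of heteroallelic restricting loci
```

Only two of the 64 genotypes match a given donor at all five restricting loci, which is one way to see why a donor that tolerates heteroallelism at several loci would widen the range of potential recipients.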

Further Reading
Dawe, A.L., Nuss, D.L., 2013. Hypovirus molecular biology: From Koch’s postulates to host self-recognition genes that restrict virus transmission. Advances in Virus Research 86, 109.
Hillman, B.I., Suzuki, N., 2004. Viruses of the chestnut blight fungus, Cryphonectria parasitica. Advances in Virus Research 63, 423.
Nuss, D.L., 2005. Hypovirulence: Mycoviruses at the fungal-plant interface. Nature Reviews Microbiology 3, 632.
Suzuki, N., Ghabrial, S.A., Kim, K., et al., 2018. ICTV virus taxonomy profile: Hypoviridae. Journal of General Virology 99, 615.
Zhang, D.X., Nuss, D.L., 2016. Engineering super mycovirus donor strains of chestnut blight fungus by systematic disruption of multilocus vic genes. Proceedings of the National Academy of Sciences of the United States of America 113, 2062.
Zhang, D.X., Spiering, M.J., Dawe, A.L., Nuss, D.L., 2014. Vegetative incompatibility loci with dedicated roles in allorecognition restrict mycovirus transmission in chestnut blight fungus. Genetics 197, 701.

Megabirnaviruses (Megabirnaviridae) Yukiyo Sato, Okayama University, Kurashiki, Japan Nobuhiro Suzuki, Institute of Plant Stress and Resources (IPSR), Okayama University, Kurashiki, Japan © 2021 Elsevier Ltd. All rights reserved.

Glossary
Ascomycetous fungi Fungi phylogenetically classified into the division Ascomycota, which is characterized by the formation of spores in asci during the fungal life cycle.
Hypovirulence Mycovirus-mediated attenuation of the virulence of phytopathogenic fungi to plants.
Mycoviruses Viruses that infect fungi.
Ribosomal -1 frameshifting A translation mechanism for polycistronic genes in which ribosomes slip back by one nucleotide on a slippery or shifty heptamer (typically “X XXY YYZ”, where the same letter represents identical nucleotides) that is followed by an RNA element forming a pseudoknot or stem-loop structure.
Spheroplasts Isolated cells enclosed in partially digested cell walls.
T = 1 icosahedral symmetry Symmetry of a capsid composed of 60 subunits, which form 12 pentameric capsomeres.

Introduction Rosellinia necatrix megabirnavirus 1 (RnMBV1, species Rosellinia necatrix megabirnavirus 1, genus Megabirnavirus, family Megabirnaviridae) was first isolated in 2009 from a phytopathogenic ascomycetous fungus, Rosellinia necatrix, which is the causal agent of white root rot disease and causes severe damage to many perennial crops. The virus was discovered during routine screens for mycoviruses conferring hypovirulence on the host fungus, as part of a project led by Dr. Naoyuki Matsumoto that was aimed at virocontrol, a form of biocontrol using viruses, and that was initiated in the 1990s in Japan. The research group was inspired by the success of the virocontrol of chestnut blight disease using hypoviruses in Europe. RnMBV1 inhibits the growth and reduces the virulence of R. necatrix. It has a bipartite double-stranded RNA (dsRNA) genome that is encapsidated in non-enveloped spherical particles 52 nm in diameter. Among the established taxa to which multipartite dsRNA viruses are assigned, namely the families Reoviridae, Partitiviridae, Chrysoviridae, Quadriviridae and Megabirnaviridae, and the genus Botybirnavirus, the polycistronic gene-encoding strategy is observed only in megabirnaviruses. Thus, in 2011 RnMBV1 was assigned to newly established taxa, the genus Megabirnavirus in the family Megabirnaviridae. “Birna-” is derived from the bipartite segments, and “Mega-” refers to a genome size (approximately 16 kbp) much greater than those of members of the families Birnaviridae and Picobirnaviridae (approximately 6 kbp and 4 kbp, respectively). The family Megabirnaviridae currently comprises a single genus, Megabirnavirus, to which only the species Rosellinia necatrix megabirnavirus 1 is assigned. In addition, several mycoviruses that are phylogenetically related to RnMBV1 and share common properties with it have been reported in the last five years, although they have not been officially assigned to these taxa.
In this article the properties of RnMBV1 are summarized and compared with those of related viruses.

Virion Properties The megabirnavirus genome is encapsidated in non-enveloped spherical particles that are 52 nm in diameter (Fig. 1). Virus particles contain a bisegmented dsRNA genome (Fig. 2). Each segment seems to be packaged separately, based on the two following observations: first, purified virus particles include an unequal and variable molar ratio of two dsRNA segments; second, artificial virion transfection into hosts can trigger the emergence of mutants that completely lack dsRNA2. Virions are constructed from the major capsid protein (CP), which is 136 kDa. Purified virions also contain minor CP-RdRP fusion proteins that are greater than 250 kDa (Fig. 1). Purified virions can be introduced to spheroplasts from the natural and experimental fungal hosts, R. necatrix and Cryphonectria parasitica, respectively.

Genome Organization The megabirnavirus genome consists of two segments, termed dsRNA1 and dsRNA2 (Fig. 2, Table 1). Each segment has two nonoverlapping putative open reading frames (ORFs). The larger segment (dsRNA1, 8.9 kbp) contains ORF1 and ORF2, which encode the major structural protein (CP, P1) and RNA-dependent RNA polymerase (RdRP, P2), respectively. The smaller segment (dsRNA2, 7.2 kbp) contains ORF3 and ORF4, encoding hypothetical proteins P3 and P4, respectively, with unknown function. Both genome segments contain an extremely long 5′-untranslated region (UTR) of over 1.6 kbp. The 3′-UTRs of both segments are relatively short (<0.4 kbp). The terminal sequences of the 5′-UTR and 3′-UTR are conserved between both segments. The 5′-terminal 24-bp and 3′-terminal 8-bp sequences are strictly conserved between the RnMBV1 segments (Fig. 3).


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20947-3


Fig. 1 Transmission electron micrograph (TEM) image and structural proteins of purified RnMBV1 particles. (a) TEM image of negatively stained virions of RnMBV1, the sole member of the family Megabirnaviridae. The scale bar represents 100 nm. (b) Viral proteins composing purified RnMBV1 viral particles (VP). The viral proteins were run on an SDS-PAGE gel and stained with Coomassie brilliant blue R-250. The black and gray arrows indicate the bands for CP and CP-RdRP fusion proteins, respectively. The virus-free (VF) control is included. M indicates the molecular size marker for proteins. These figures are reproduced from Chiba, S., Salaipeth, L., Lin, Y., et al., 2009. A novel bipartite double-stranded RNA mycovirus from the white root rot fungus Rosellinia necatrix: Molecular and biological characterization, taxonomic considerations, and potential for biological control. Journal of Virology 83, 12801–12812, with permission from the American Society for Microbiology.

Fig. 2 Genome organization of RnMBV1. Doubled lines and boxes indicate dsRNAs and ORFs, respectively. Numbers near the boxes indicate the position of each ORF on the positive strands. The position of the putative slippery sequence for ribosomal frameshifting is indicated with an arrow. Reproduced from Sato, Y., Miyazaki, N., Kanematsu, S., et al., 2019. ICTV virus taxonomy profile: Megabirnaviridae. Journal of General Virology 100, 1269–1270, with permission from the Microbiology Society.

Table 1 Single member in the genus Megabirnavirus, the family Megabirnaviridae

Virus species | Abbreviation | DsRNA segment no. (length in bp, encoded proteins; size in kDa) | GenBank accession no.
Rosellinia necatrix megabirnavirus 1 | RnMBV1 | 1 (8931; CP, 136; CP + RdRP, 266) | AB512282
 | | 2 (7180; P3, 152; P4, 24) | AB512283

Genome Expression and Replication Replication of megabirnavirus is predicted to occur in virus particles, as is the case for other dsRNA viruses. The fact that purified virions include CP-RdRP fusion proteins supports this hypothesis. CP-RdRP fusion products are likely translated from bicistronic transcripts of dsRNA1 via ribosomal -1 frameshifting. ORF2, encoding RdRP, lies downstream of the CP-encoding ORF1 on the same positive strand, in the -1 reading frame relative to ORF1 (Fig. 2). There is a putative slippery sequence (5′-A AAA AAC-3′) immediately upstream of the stop codon for ORF1. There are also sequence elements that potentially form pseudoknot/stem-loop structures upstream and downstream of the ORF1 stop codon. The long 5′-UTR of megabirnavirus contains multiple mini-ORFs and hypothetical internal ribosomal entry sites (IRESs). It remains to be experimentally determined whether translation initiation of ORF1 and ORF3 is mediated by these IRESs. It is also unknown whether or how ORF4 is expressed.
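As a minimal sketch of how candidate frameshift sites of this kind could be located, the snippet below scans an RNA sequence for heptamers that fit the "X XXY YYZ" pattern given in the Glossary. It checks only the heptamer itself; it does not test the reading frame, the distance to the ORF1 stop codon, or the downstream pseudoknot or stem-loop. The function name and the toy sequence are hypothetical and are not taken from the RnMBV1 genome.

```python
import re

# Glossary pattern "X XXY YYZ": three identical bases, three identical bases,
# then any base. The lookahead allows overlapping candidates to be reported.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2([ACGU])\3\3[ACGU]))")


def find_slippery_heptamers(rna):
    """Return (0-based position, heptamer) for every X XXY YYZ match in rna."""
    rna = rna.upper().replace("T", "U")  # accept cDNA-style input as well
    return [(m.start(), m.group(1)) for m in SLIPPERY.finditer(rna)]


# Toy sequence (invented): the heptamer from the text followed by a UAG stop codon.
toy = "GGC" + "AAAAAAC" + "UAG"
hits = find_slippery_heptamers(toy)
print(hits)

# The atypical heptamers reported for related viruses (5'-G RAA AAC-3')
# do not satisfy the strict pattern, so this scan does not report them.
atypical_hits = find_slippery_heptamers("GGAAAAC")
print(atypical_hits)
```

A stricter search would also require the heptamer to sit in the ORF1 frame and would look for a structured element a few nucleotides downstream; those criteria are left out here to keep the sketch short.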

Virion Structure The three-dimensional structure of RnMBV1 was determined by cryo-electron microscopy at 15.7 Å resolution (Fig. 4). Each capsid is composed of 60 homodimers of CP with T = 1 icosahedral symmetry, like many other fungal dsRNA viruses. A distinctive feature of megabirnavirus virions is the presence of 120 putative large protrusions on the surface of the capsid shell. The protrusions, with a width of ~45 Å and a height of ~50 Å, are observed around the icosahedral two- and threefold axes, while trenches are observed around the icosahedral fivefold axes.

Fig. 3 Comparison of the terminal sequences of the 5′- and 3′-UTRs of the two dsRNA segments of RnMBV1. Multiple sequence alignments of the 5′-UTRs (a) and 3′-UTRs (b) of the two dsRNA segments were performed with ClustalW2. Asterisks indicate positions where an identical base is present. Accession numbers of the sequences are listed in Table 1.

Fig. 4 Three-dimensional cryo-electron microscopy reconstruction of virions of RnMBV1 at a resolution of 15.7 Å. The surface representation of RnMBV1 particles viewed along an axis with fivefold symmetry (a) or twofold symmetry (b). Molecular boundaries of the A and B subunits are drawn with continuous orange and red, respectively. Courtesy of Naoyuki Miyazaki.

The inner and outer diameters of the RnMBV1 capsid shell, excluding the protrusions, are 190 and 210 Å, respectively. For Saccharomyces cerevisiae virus L-A in the family Totiviridae, grooves on the capsid shell could play an enzymatic role in cap snatching from cellular mRNA. Investigating the functions of the protrusions exposed on the surface of RnMBV1 virions would be interesting future work.
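The subunit counts quoted in this section follow from simple arithmetic, sketched below. A T = 1 icosahedral lattice has 60 equivalent positions; with a CP dimer at each position the shell contains 120 CP copies, matching the 120 protrusions. The mass estimate uses the 136 kDa CP size given earlier and ignores the minor CP-RdRP fusion proteins and the genome, so it is a rough lower bound for the protein shell only. Variable names are illustrative.

```python
# Values taken from the text; the calculation itself is an illustrative sketch.
POSITIONS_T1 = 60        # equivalent positions in a T = 1 icosahedral lattice
CP_PER_POSITION = 2      # each position is occupied by a CP homodimer
CP_MASS_KDA = 136        # major capsid protein mass

cp_copies = POSITIONS_T1 * CP_PER_POSITION
pentameric_vertices = POSITIONS_T1 // 5          # 12 fivefold vertices
shell_mass_mda = cp_copies * CP_MASS_KDA / 1000  # CP only, fusion proteins ignored

print(cp_copies, pentameric_vertices, shell_mass_mda)
```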

Biological Properties The biological traits of megabirnavirus have been extensively studied in the RnMBV1-W779 strain. RnMBV1 persistently infects the natural host R. necatrix and spreads via inter-cellular cytoplasmic exchange and cell division. Furthermore, RnMBV1 can be horizontally transmitted to compatible fungal hosts and inhibits the growth of R. necatrix, which is a devastating phytopathogen that causes white root rot in fruit trees. Due to the hypovirulence it confers on R. necatrix (Fig. 5), RnMBV1 is regarded as a potential biological control agent for plant disease management. RnMBV1 can be inoculated into the ascomycetous fungus Cryphonectria parasitica. C. parasitica is not only the causal agent of chestnut blight disease, but also a model fungus for studying virus-virus and virus-host interactions. As in R. necatrix, RnMBV1 causes growth inhibition and hypovirulence in C. parasitica. In a C. parasitica mutant (Δdcl2) that lacks a component of the antiviral RNA interference (RNAi) machinery (dicer-like protein 2), accumulation of RnMBV1 is ~20-fold higher than in the wild type. The symptoms induced by RnMBV1 are enhanced in Δdcl2. These observations suggest that RNAi in

Fig. 5 Mycelial growth of virus-carrying field strain W779 and virus-cured isogenic strain W1015 of R. necatrix. (a) Colony morphology of R. necatrix strains W779 and W1015. The fungal strains, W779 and W1015, were grown on PDA for 5 days in the dark and photographed. (b) Mycelial growth on apple rootstocks. Twigs of Japanese pear (1.5 cm long) were placed for 3 weeks on 5-day-old PDA cultures of the fungal strains W779 and W1015. Japanese pear twigs covered with mycelia of each fungal strain were fixed on stems of apple rootstocks (M. prunifolia var. ringo) that had been pre-cultured in a horticultural soil medium. Inoculated apple plants were cultured in the soil for an additional 2 weeks. The mycelial expansion is shown by white arrows, while inoculation sites are denoted by arrowheads. Levels of mycelial growth on apple rootstocks are equivalent to virulence levels. These figures are reproduced from Chiba, S., Salaipeth, L., Lin, Y., et al., 2009. A novel bipartite double-stranded RNA mycovirus from the white root rot fungus Rosellinia necatrix: Molecular and biological characterization, taxonomic considerations, and potential for biological control. Journal of Virology 83, 12801–12812, with permission from the American Society for Microbiology.

C. parasitica contributes to its tolerance to RnMBV1, which is stably maintained in both the wild type and Δdcl2 genotypes during repeated subculture on media. There is little information on the vertical transmission of megabirnaviruses in nature. The vertical transmission rate of RnMBV1 was investigated in C. parasitica, in which asexual spores form readily, and was found to be very low (<1%) in both the wild type and Δdcl2 genotypes of C. parasitica. All three well-characterized megabirnaviruses, namely RnMBV1 and two unassigned viruses termed Sclerotinia sclerotiorum megabirnavirus 1 (SsMBV1) and Rosellinia necatrix megabirnavirus 2 (RnMBV2), have bi-segmented dsRNA genomes and are associated with host phenotypic alterations. However, virion transfection of host fungal strains often leads to genome alterations, including genome rearrangements and segment loss. Examination of such RnMBV1 transfectants suggests that the ORF3- and ORF4-encoded proteins are dispensable. Even the complete loss of dsRNA2 is compatible with virus viability, as discussed below.

Taxonomic and Phylogenetic Considerations The Megabirnaviridae family currently consists of a single genus, Megabirnavirus, to which only one species, Rosellinia necatrix megabirnavirus 1, is assigned. Based on identity of the deduced amino acid sequence of RdRP, megabirnaviruses are placed in a


Table 2 List of the unclassified viruses closely related to megabirnavirus

Virus | Abbreviation | DsRNA segment no. (length in bp, encoded proteins; size in kDa) | GenBank accession no.
Sclerotinia sclerotiorum megabirnavirus 1 | SsMBV1 | dsRNA1 (8806; P1, 137; P1 + RdRP, 267) | KP686398
 | | dsRNA2 (7909; P3, 179; P4, 40) | KP686399
Rosellinia necatrix megabirnavirus 2 | RnMBV2 | dsRNA1 (8985; P1, 138; CP + RdRP, 266) | LC062704
 | | dsRNA2 (7959; P3, 165; P4, 35) | LC062705
Fusarium pseudograminearum megabirnavirus 1 | FpgMBV1 | dsRNA1 (8951) | MH057692
 | | dsRNA2 (5337) | MH057693
Pleosporales megabirnavirus 1 | PMbV1 | dsRNA1 (8845; P1, 131; P1 + RdRP, 258) | KT601119
 | | dsRNA2 (5136; P3, 117) | KT601120
Rosellinia necatrix megabirnavirus 3 | RnMBV3 (a) | dsRNA1 (8967; P1, 131; P1 + RdRP, 265) | LC333756
Entoleuca megabirnavirus | EnMBV1 (a) | dsRNA1 (8927; P1, 131; P1 + RdRP, 265) | MF375886
Rhizoctonia solani megabirnavirus 1 | RsMBV1 (b) | Not determined (b) | KX349071 (b)
Rhizoctonia solani RNA virus-HN008 | RsRV-HN008 | dsRNA1 (7596; P1, 128; RdRP, 140) | KP861921

(a) Presence/absence and sequence of dsRNA2 remain to be determined.
(b) Only partial sequence of RdRP is available.

[Fig. 6 image: phylogenetic trees for (a) P2 (RdRP) and (b) P1 (CP); scale bars 0.20. Groups labeled in both panels: Megabirnaviridae; Megabirna-related, unclassified viruses; Chrysoviridae. Sequences in (a): AKJ87315.1 SsMBV1, BAU24262.1 RnMBV2, BAI48016.1 RnMBV1, ALO50147.1 PMbV1, AVD68671.1 EnMBV1, BBB86809.1 RnMBV3, AKO82515.1 RsRV-HN008, AAM68953.1 HvV145S, AAM95601.1 PcV, ADG21213.1 VdCV1. Sequences in (b): AKJ87314.1 SsMBV1, BAU24261.1 RnMBV2, BAI48015.1 RnMBV1, ALO50146.1 PMbV1, BBB86808.1 RnMBV3, AVD68670.1 EnMBV1, AKO82514.1 RsRV-HN008, AAM68954.1 HvV145S, ADG21214.1 VdCV1, AAM95602.1 PcV.]

Fig. 6 Phylogenetic analysis of megabirnavirus and related, unclassified viruses. Phylogenetic trees were constructed based on the complete deduced amino acid sequences of RdRP (a) and P1 (b). Full names of megabirnavirus and related viruses can be found in Tables 1 and 2. The chrysoviruses (HvV145S: Helminthosporium victoriae 145S virus; PcV: Penicillium chrysogenum virus; and VdCV1: Verticillium dahliae chrysovirus 1) are included as an outgroup. The sequences obtained from GenBank were aligned using MUSCLE. The evolutionary history was inferred using the Neighbor-Joining method and the evolutionary distances were computed using the Poisson correction method. Bootstrap support percentages (1000 replicates) are indicated next to the branches. All steps were conducted using MEGA7 software.

phylogenetic group distinct from the known dsRNA mycoviruses, such as members of the Chrysoviridae and Totiviridae families. Although megabirnaviruses are phylogenetically related to these dsRNA virus families, they possess genome organizations and virion structures that are distinct from those of these viruses. Recently, several unclassified mycoviruses that are closely related to the sole member of the Megabirnaviridae family have been discovered (Table 2, Fig. 6). Among them, SsMBV1 and RnMBV2 are most closely related to RnMBV1 and have been characterized in detail. The genome organizations and sizes of these two viruses are very similar to those of RnMBV1 (Tables 1 and 2, Fig. 7). SsMBV1 and RnMBV2 also have bi-segmented linear dsRNA genomes, which consist of a dsRNA1 of 8.8–8.9 kbp and a dsRNA2 of 7.2–7.9 kbp. Both segments of each virus contain two ORFs, a long 5′-UTR of more than 1.6 kbp and a relatively short 3′-UTR. The signature sequence for translation of RdRP via ribosomal -1 frameshifting is also conserved among RnMBV1, SsMBV1 and RnMBV2. The conservation of the 5′-UTR sequence between dsRNA1 and dsRNA2 is stricter in SsMBV1 and RnMBV2 than in RnMBV1. SsMBV1 was found in an ascomycete fungus, Sclerotinia sclerotiorum. Unlike the P3 of RnMBV1 or RnMBV2, the P3 of SsMBV1 contains a conserved papain-like protease domain, which is commonly found in positive-sense single-stranded RNA viruses such as hypoviruses. Phylogenetic analysis suggests that this papain-like protease domain in the SsMBV1 genome might have been horizontally transferred from Cryphonectria hypovirus 1. SsMBV1 moderately attenuates the growth of S. sclerotiorum. RnMBV2 was isolated from the same host species, R. necatrix, as RnMBV1, although the two infected host strains were collected from different prefectures in Japan. RnMBV2 would be classified as a species distinct from RnMBV1, based on phylogenetic distances, the amino acid sequence identity of the hypothetical proteins and virulence to the host.
Phylogenetic analysis based on the

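[Fig. 7 image: schematic maps of dsRNA1 and dsRNA2 with ORF1, ORF2 (RdRP), ORF3 and ORF4; scale bar 3 kb. Putative slippery heptamers shown on dsRNA1: RnMBV1, AAAAAAC; SsMBV1, AAAAAAC (papain-like protease domain marked in ORF3); RnMBV2, AAAAAAC; FpgMBV1, GAAAAAC; PMbV1, GGAAAAC; RnMBV3, GGAAAAC (dsRNA2 marked "?"); EnMBV1, GGAAAAC (dsRNA2 marked "?"); RsRV-HN008, single dsRNA.]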

Fig. 7 Genome organization of megabirnavirus and related, unclassified viruses. Abbreviations of virus names correspond to the viruses listed in Tables 1 and 2. Doubled lines indicate dsRNA. ORF1, ORF2, ORF3 and ORF4 are represented by green, orange, purple and yellow boxes, respectively. The position and sequence of the putative slippery heptamer adjacent to the stop codon of ORF1 are indicated with an arrow. The position of the papain-like protease domain in ORF3 of SsMBV1 is indicated with a red box.

deduced amino acid sequence of RdRP and CP indicates that RnMBV2 is more closely related to SsMBV1 than to RnMBV1 (Fig. 6). A single infection by RnMBV2 causes no obvious effects on R. necatrix, in contrast to infection by RnMBV1. However, co-infection by RnMBV2 and a partitivirus, Rosellinia necatrix partitivirus 1 (RnPV1), induces apparent growth inhibition of R. necatrix, although single infection by either virus does not. Enhanced RnPV1 replication likely underlies this synergistic interaction between RnMBV2 and RnPV1. In addition to SsMBV1 and RnMBV2, several unclassified viruses closely related to megabirnaviruses have been found in diverse ascomycetous and basidiomycetous fungi. These include Fusarium pseudograminearum megabirnavirus 1 (FpgMBV1), Pleosporales megabirnavirus 1 (PMbV1), Rosellinia necatrix megabirnavirus 3 (RnMBV3), Entoleuca megabirnavirus 1 (EnMBV1), Rhizoctonia solani megabirnavirus 1 (RsMBV1) and Rhizoctonia solani RNA virus HN008 (RsRV-HN008) (Table 2, Fig. 6). FpgMBV1, PMbV1 and RsRV-HN008 have been detected in single infections of their respective original host fungi, while RnMBV3, EnMBV1 and RsMBV1 have been detected in mixed infections with other viruses by next-generation sequencing-based virome analyses. Thus, only the sequence of dsRNA1 (RnMBV3 and EnMBV1) or the partial sequence of RdRP (RsMBV1) has been reported, and the complete genome organizations of these three viruses remain unknown. The dsRNA1 sequences of FpgMBV1, PMbV1, RnMBV3 and EnMBV1 are similar to that of RnMBV1 in several respects: (1) dsRNA size (8.8–9.0 kbp), (2) 5′-UTR length (>1.6 kbp), (3) location of ORF2 (the RdRP-encoding gene) in the -1 reading frame relative to ORF1 and (4) presence of an atypical shifty heptamer (5′-G RAA AAC-3′, where R represents A or G), resembling the typical one (5′-A AAA AAC-3′), immediately upstream of the stop codon of ORF1 (Fig. 7).
The dsRNA2 sequences of FpgMBV1 and PMbV1 (5.1–5.3 kbp) are much shorter than the dsRNA2s of RnMBV1, SsMBV1 and RnMBV2 (7.2–8.0 kbp). RsRV-HN008 is distinct from the other megabirna-related viruses; it has a monosegmented genome with a short 5′-UTR (39 bp) and two ORFs in the same reading frame (Fig. 7). All of these viruses, including RsRV-HN008, are placed in the same phylogenetic clade based on the identity of the deduced amino acid sequence of CP or P1, as well as RdRP (Fig. 6). Further consideration is required for the cogent classification of these viruses and the setting of criteria for the family Megabirnaviridae.
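The evolutionary distances underlying the trees in Fig. 6 were computed with the Poisson correction method. A minimal sketch of that correction for one pair of aligned protein sequences is given below: with p the proportion of differing residues, the distance is d = -ln(1 - p). How gaps were treated in the published analysis is not stated in the caption, so skipping any column that contains a gap is an assumption made here, and the function name and toy sequences are hypothetical.

```python
import math


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p) for two aligned protein sequences.

    p is the proportion of differing residues over the columns in which
    neither sequence has a gap ('-'); this gap handling is an assumption.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        return math.inf  # correction is undefined when every site differs
    return -math.log(1.0 - p)


# Toy alignment (invented): 2 differences over 10 ungapped columns, so p = 0.2.
d = poisson_distance("MKTAYIAKQR", "MKSAYLAKQR")
print(round(d, 4))
```

A full reconstruction would compute this distance for every pair of sequences and pass the resulting matrix to a neighbor-joining routine; only the distance step is shown.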

Future Perspectives Functions of Megabirna-P3 and -P4 It has been confirmed that P3, but not P4, of RnMBV1 is expressed in infected fungal tissues; however, elucidating the functions of P3 and P4 remains a challenge for future research. Artificial virion transfection of R. necatrix with RnMBV1 occasionally leads to the loss of dsRNA2, which is accompanied by the emergence of rearranged segments derived from dsRNA1. RnMBV1 mutants lacking dsRNA2 are capable of replication and packaging, but show impaired replication and attenuated hypovirulence in fungal hosts in comparison with wild-type RnMBV1. Furthermore, dsRNA2-lacking RnMBV1 mutants appear to require rearranged dsRNA1 derivatives, in addition to intact dsRNA1, to stably infect the host fungus. Similarly, SsMBV1 mutants lacking dsRNA2 emerge upon transfection of wild-type SsMBV1 into S. sclerotiorum. SsMBV1 mutants without dsRNA2 also accumulate lower amounts of dsRNA1 than the wild type. Thus, dsRNA2 of megabirnaviruses seems to be involved in efficient replication, translation and/or stability of dsRNA1. DsRNA2 is also likely to contribute to increased virulence to hosts. It remains to be elucidated whether megabirna-P3 and -P4 or the dsRNA itself plays a role in these contexts.

Further Reading
Chiba, S., Salaipeth, L., Lin, Y., et al., 2009. A novel bipartite double-stranded RNA mycovirus from the white root rot fungus Rosellinia necatrix: Molecular and biological characterization, taxonomic considerations, and potential for biological control. Journal of Virology 83, 12801–12812.
Kanematsu, S., Shimizu, T., Salaipeth, L., et al., 2014. Genome rearrangement of a mycovirus Rosellinia necatrix megabirnavirus 1 affecting its ability to attenuate virulence of the host fungus. Virology 450–451, 308–315.
Miyazaki, N., Salaipeth, L., Kanematsu, S., et al., 2015. Megabirnavirus structure reveals a putative 120-subunit capsid formed by asymmetrical dimers with distinctive large protrusions. Journal of General Virology 96, 2435–2441.
Salaipeth, L., Chiba, S., Eusebio-Cope, A., et al., 2014. Biological properties and expression strategy of rosellinia necatrix megabirnavirus 1 analysed in an experimental host, Cryphonectria parasitica. Journal of General Virology 95, 740–750.
Sato, Y., Miyazaki, N., Kanematsu, S., et al., 2019. ICTV virus taxonomy profile: Megabirnaviridae. Journal of General Virology 100, 1269–1270.
Wang, M., Wang, Y., Sun, X., et al., 2015. Characterization of a novel megabirnavirus from Sclerotinia sclerotiorum reveals horizontal gene transfer from single-stranded RNA virus to double-stranded RNA virus. Journal of Virology 89, 8567–8579.

Mitoviruses (Mitoviridae) Bradley I Hillman and Alanna B Cohen, Rutgers University, New Brunswick, NJ, United States © 2021 Elsevier Ltd. All rights reserved.

Glossary Anastomosis Fusion between hyphal branches allowing for exchange of cytoplasm, ions, and organelles, and the most common means of horizontal transmission of mycoviruses. Conidia Asexual, non-motile fungal spores. Haplotype Group of alleles inherited together. In the case of haploid fungi, it constitutes the genetic makeup of the individual. Non-retroviral endogenous RNA (NERVE) Genetic elements comprised of fragments or complete genomes derived from non-retroviral RNA viruses found embedded in eukaryotic genomes. Protoplast A cell that has its cell wall removed, allowing for membrane fusion and permeabilization. RNA silencing The negative regulation or inhibition of gene expression by non-coding RNA elements. Also called RNA interference (RNAi). Vegetative incompatibility Genetic differences between fusing vegetative hyphae of fungi that result in unviable heterokaryons and prevent horizontal mycovirus transmission.

The genus Mitovirus contains arguably the simplest of all viruses known: an RNA molecule, often of less than 3 kb, encodes only an RNA-dependent RNA polymerase (RdRp) that serves to direct replication of the simple genome. They are called mitoviruses because they reside in mitochondria. In common with several other fungal viruses, mitoviruses have no capsid protein, but rather consist of RNA, likely encapsulated in lipid vesicles together with the virally encoded RdRp. Mitoviruses are now known to be abundant in fungi – possibly the most common of all fungal viruses – and they are now suspected to be components of the plant virome as well. Most mitoviruses have been identified through double-stranded (ds) RNA screening of fungal cultures, a commonly used method to screen fungal culture collections for virus presence. Mitovirus dsRNA is usually relatively abundant in infected cultures, consistent with either a dsRNA or a positive-sense (+) RNA genome. Free (+)RNA is more abundant in cultures than free negative-sense (−) RNA, consistent with a (+)RNA mitovirus genome. Phylogenetic relatedness of mitovirus RdRp coding sequences to (+)RNA viruses rather than (−)RNA viruses or dsRNA viruses supports their classification as (+)RNA viruses. Although they were first characterized in the mid-1990s, relatively little progress has been made on details of their replication and life cycles. This is in large part because of the lack of good infectivity systems. As mitoviruses have no protein-containing particles to deliver infectious nucleic acid, and delivery into mitochondria is required to initiate infection, transmission has been accomplished to date only vertically via asexual or sexual spores, or horizontally by hyphal anastomosis. No one has yet developed a robust reverse genetics system – that is, infectious cDNA clones – for any mitovirus. Recent demonstration of mitovirus transmission to marked heterologous species, genera, and families opens the door for deeper examination of host range and phenotypic effects of a single mitovirus in different host backgrounds.

Genome Structure There is relatively little variation in mitovirus genome size and structure (Fig. 1). The largest mitovirus genome sequenced to date is ~5.0 kb, from the basidiomycete Heterobasidion annosum; the smallest is ~2.2 kb, from the ascomycete Sclerotinia sclerotiorum. In all cases, a single ORF encodes the RdRp and no other known proteins. A hallmark of fungal mitovirus RNAs is the presence of UGA codons distributed throughout the ORF in varying numbers, encoding tryptophan (Trp) rather than serving as terminators of translation (see below). There is no evidence for post-translational processing of mitovirus-encoded proteins and no evidence for the use of shorter-than-full-length subgenomic messenger (m) RNAs for mitovirus gene expression. Non-translated sequences at the 5′-end of the RNA vary from 15 to 957 nucleotides; non-translated sequences at the 3′-end vary from 0 to 1788 nucleotides. Specific host and viral factors required for mitochondrial genome replication and expression have not been investigated in any depth, primarily because of the lack of infectious viral cDNA clones that can be manipulated and because of the extreme difficulty associated with manipulating host mitochondrial gene expression.

Fig. 1 Genome maps drawn to scale of representative viruses belonging to the genus Mitovirus and close relatives. Color-coded regions represent conserved domains: RdRp, RNA-dependent RNA polymerase (red); MP, movement protein (green); CP, capsid protein (blue); maturation protein (orange); and Lys, lysis protein (purple). Within each genus, there is variability with respect to genome size, RdRp protein size, and number of nucleotides in the non-translated sequences at the 5′- and 3′-termini. Size ranges of each of these elements within members of each genus are provided.

Encyclopedia of Virology, 4th Edition, Volume 4
doi:10.1016/B978-0-12-809633-8.21324-1

Accessory RNAs Associated With Mitovirus Infections In addition to the single, replication-competent RNA of mitoviruses, accessory RNA molecules may be found as well. Accessory RNAs that may be associated with RNA virus infections include satellite RNAs, defective or defective-interfering RNAs, and satellite/defective RNA hybrids. Satellite RNAs (S-RNAs) are unrelated by nucleotide sequence to the parent virus, except possibly for 5′- or 3′-terminal nucleotides; defective RNAs (D-RNAs), also called defective genomes, are incomplete deletion mutants derived from the parent RNA; defective-interfering RNAs (DI-RNAs) are defective RNAs that have been demonstrated to interfere with parent virus replication or function. S-RNAs, D-RNAs, and DI-RNAs are all common in fungal RNA virus infections, and all have been identified in mitovirus infections. A heterogeneous S-RNA population was associated with infection of the mitovirus Ophiostoma novo-ulmi mitovirus 3A (OnuMV-3A) in the fungus Sclerotinia homoeocarpa, but there was no evidence that the satellite affected accumulation or replication of the parent virus. Similarly, D-RNAs were identified in O. novo-ulmi strain Ld infected with multiple mitoviruses, but there was no evidence for interference with parent virus replication. In the Botrytis cinerea isolate CanBc-1c-78, the mitovirus BcMV-1 was found to be associated with a deletion mutant of the parent viral genome that fits the criteria of a DI-RNA by reducing accumulation of the parent viral RNA. As with other mitovirus investigations, studies on S-RNAs, D-RNAs, and DI-RNAs are hampered by the lack of robust infectivity systems.

Phenotypic Effects of Mitovirus Infection Many fungal viruses have little measurable effect on fungal colony morphology, growth rate in culture, or virulence, while others may have profound effects. Mitoviruses reflect this spectrum of biological consequences as well, some resulting in symptomless infections but others causing measurable and perhaps severe pathology. The first mitoviruses were identified based on their effects on virulence of the Dutch elm disease fungus in natural settings and colony phenotypes of O. novo-ulmi in culture, but that system was particularly difficult to tease apart from the standpoint of satisfying Koch’s postulates for individual mitovirus elements because they were numerous, coinfecting, and resulted in different phenotypes in the fungus. The effects of the individual mitoviruses were difficult to quantify in the absence of traditional virology methods such as particle isolation and inoculation to uninfected individuals. In contrast, the first mitovirus identified based on genetic and molecular properties in C. parasitica was a single element that had only a slight measurable effect on phenotype, so the challenge was to develop isogenic, virus-free fungal isolates and determine whether the minor changes to growth rate in culture and fungal virulence were caused by the virus itself. Only recently did Suzuki and colleagues develop a system to separate unambiguously mitovirus effects from host nuclear and mitochondrial backgrounds. Hygromycin-resistant, virus-free isolates were used as recipients in protoplast fusions with virus-containing, hygromycin-sensitive isolates. Following repeated anastomosis and single conidial isolation, resulting mitovirus-containing isolates bearing the nuclear and mitochondrial genotypes of the original recipient isolate were obtained and maintained in continuous culture. Several mitoviruses appear to have profound effects on fungal phenotype, including virulence on plant hosts. 
In the ascomycete pathogen of many field crops, Chalara elegans (=Thielaviopsis basicola), reduced fungal growth, reduced melanin production, and reduced virulence have been associated with mitovirus presence. Furthermore, smaller and fewer mitochondria were associated with mitovirus-bearing isolates compared to their mitovirus-free counterparts. Similarly, a mitovirus from the broad host range plant pathogen Botrytis cinerea was shown to have deleterious effects on its fungal host and on the infected mitochondria themselves. In each of these two cases, the mitovirus was not completely eliminated from the fungal host, so conclusions about direct causality and quantification of disease effects were tempered. Mixed infection of a single fungal thallus by multiple mitoviruses has been demonstrated in several instances. In O. novo-ulmi, single conidial isolates of debilitated, multiply-infected isolates were used to help identify mitoviruses that affected fungal phenotype and those that did not. Multiple mitoviruses have also been shown to infect single isolates of Sclerotinia and Botrytis. Still unknown is whether multiple mitoviruses infect and replicate together in single mitochondria, or whether instead each mitochondrion within the population of mitochondria in a fungal thallus is infected with a single mitovirus. As noted for other mitovirus infectivity studies, experiments to address this question are challenging.

[Fig. 2 tree image. Recoverable labels: tips are fungal and plant mitoviruses grouped as Mitovirus A and Mitovirus B, with Ourmiavirus, Narnavirus, and Levivirus as the remaining labeled groups; scale bar, 0.7 substitutions per site.]

Fig. 2 Maximum likelihood (ML) phylogenetic tree based on multiple sequence alignment of full-length RNA-dependent RNA polymerase (RdRp) protein sequences from 130 mitoviruses. Sequences were aligned with MUSCLE and the tree was inferred in IQ-TREE with model selection. A VT+F+I+G4 model of evolution was selected based on the Akaike information criterion (AIC) and Bayesian information criterion (BIC). Bootstrap support above 60% is displayed above branches, calculated from 2000 ultrafast bootstrap replicates. Major clades are colored to denote the following groups: genus Levivirus (yellow), genus Narnavirus (red), genus Ourmiavirus (gray), genus Mitovirus clade B (blue), and genus Mitovirus clade A (green). Assembled mitovirus RNA sequences from plants form a monophyletic clade within mitovirus clade A and are denoted by a darker shade of green.
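
For reference, the two criteria named in the Fig. 2 caption have standard definitions, reproduced here as a reminder rather than as a description of the authors' exact computation. The symbols are introduced for this note only: k is the number of free parameters of the substitution model plus tree, n is the sample size (assumed here to be the number of alignment sites, the usual convention in phylogenetics), and \hat{L} is the maximized likelihood. For both criteria, the model with the lowest score is preferred.

```latex
\[
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
\]
```

Because BIC penalizes each added parameter by ln n rather than 2, it tends to favor simpler models than AIC on long alignments; agreement between the two, as reported in the caption, makes the choice less dependent on the penalty.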

Phylogenetic Relationships Phylogenetic reconstruction of 130 mitovirus sequences reveals a well-supported monophyletic group containing two major clades; a smaller subset of those isolates is provided in Fig. 2. Clade A contains mitovirus sequences determined from a range of ascomycete and basidiomycete fungi as well as all identified plant mitoviruses, which form an independent monophyletic group. Mitovirus clade B comprises virus isolates from similar fungi, but no plant mitoviruses. High bootstrap values support the separation of the mitovirus clade from its sister clade containing the narnaviruses and ourmiaviruses. It is notable that several fungal taxa harbor mitoviruses from both clades A and B, and that individual mitovirus species from a single fungal species are distributed broadly within those clades. These observations suggest that mitoviruses have entered fungal taxa such as Ophiostoma, Sclerotinia, and Botrytis independently multiple times.

Taxonomy and Nomenclature Under current ICTV taxonomy, the family Narnaviridae contains two genera: the genus Narnavirus and the genus Mitovirus. Continued reassessment of the overall taxonomy of these and related viruses is needed, as members of the genus Narnavirus are more closely related to the plant-infecting viruses in the genus Ourmiavirus, in the recently approved family Botourmiaviridae, than they are to the mitoviruses. Thus, it is likely that the mitoviruses and narnaviruses will eventually be separated into different families. The genus Mitovirus currently contains all members discussed in this article. At this time, other than the association with different host taxa, there are no solid biological criteria on which to subdivide the taxon. Based on the strong evidence for at least two distinct mitovirus clades, one of which includes all known presumed plant-associated mitoviruses and related sequences, it might be useful to subdivide this broad taxon into smaller taxa. As the number of characterized mitoviruses and putative mitovirus sequences grows, it is expected that such a taxonomic proposal will soon be entertained by the ICTV.

Transmission Mitovirus transmission is not fundamentally unlike transmission of other fungal viruses, but their mitochondrial localization leads to some important differences compared to cytoplasmic viruses. Like other fungal viruses, mitoviruses depend on hyphal anastomosis for horizontal transmission. As such, vegetative incompatibility (VIC) genes and the associated VIC phenotype play a fundamental role in transmission from one fungal thallus to another in natural settings. Mitoviruses are efficiently transmitted upon anastomosis of vegetatively compatible isolates in culture, but details of transmission are only recently coming to light. Considering the mechanism of transmission as it pertains to the mitochondria themselves, two basic possibilities present themselves: mitovirus-containing mitochondria from an infected isolate could enter an uninfected isolate after hyphal anastomosis and then fuse with virus-free mitochondria in the previously uninfected isolate, resulting in delivery of virus by mitochondrial fusion; or the virus-containing mitochondria could enter the uninfected strain and simply take over, eventually replacing the population of uninfected mitochondria with infected ones. In the former model, the resulting mitochondrial haplotype of the newly infected strain would either be the same as it was previously, or would be a hybrid of the two mitochondrial haplotypes; in the latter model, the mitochondrial haplotype of the newly infected strain would be that of the infecting strain. Evidence to date strongly supports the first model: incoming mitochondria deliver virus to resident mitochondria and do not themselves replace the resident population. This has recently allowed for mitovirus transmission among different species of the same genus, among genera within a family, and even across family lines. These transmission experiments have also shown that a mitovirus from one fungal host species may or may not be stable in a new host. More such experiments are needed to better understand factors that regulate mitovirus/host stability. Vertical transmission of mitoviruses occurs via both sexual and asexual spores. Most mitoviruses have been characterized in ascomycetes, and their transmission has been more thoroughly characterized in those taxa. Transmission through asexual spores such as conidia in ascomycetes is often very efficient, at or near 100% in single conidial isolates derived from an infected isolate. Experimental transmission through sexual ascospores in ascomycetes has been shown to follow a somewhat predictable pattern. In C. parasitica, there are two mating types, and mitochondria have been shown to be inherited maternally: all mitochondria in ascospore progeny are derived from the female parent in sexual crosses. This pattern is common, but not universal, in fungi. The C. parasitica mitovirus CpMV1-NB631 was shown to be transmitted to ~50% of ascospores derived from matings in which the female isolate was infected with CpMV1-NB631, but no transmission occurred when the male isolate in the matings was the infected partner. While that general pattern of virus inheritance follows the pattern predicted from mitochondrial inheritance, it is unclear why only approximately 50% rather than 100% of progeny in these experiments were infected.

Engineering Mitoviruses for Infectivity Reverse genetics can be effected for most positive-strand RNA viruses through the use of infectious cDNA clones. Usually this is a fairly straightforward process: a full-length cDNA clone of the viral RNA is produced in a manner allowing for in vitro transcription of positive-sense RNA representing a replica of the viral RNA itself. The transcript is infectious when introduced into cells. The cDNA clones can then be manipulated through traditional molecular methods to examine functions of specific sequences. To date, no robust reverse genetics system has been reported for mitoviruses. The fundamental problem in developing a true reverse genetics system is delivery of viral-length RNA to mitochondria to initiate the infection process. It is difficult to know how many attempts at such infections have been made – we know of several – but none have yet been reported as successful. Another approach to examining mitovirus replication involves attempting replication and expression not in mitochondria but in the cytosol, as with most other positive-sense RNA viruses. For mitoviruses, this involves mutating each UGA codon in the ORF to UGG, thereby allowing it to direct cytoplasmic incorporation of the amino acid tryptophan (Trp) rather than translation termination. This approach would effectively turn a mitovirus into a narnavirus, which has the same fundamentally simple genome structure, encoding only RdRp, but is naturally localized to cytoplasm rather than mitochondria. Limited success in this approach has been reported, but we are unaware of peer-reviewed journal articles summarizing such studies. Such experiments are of interest from the standpoint of basic virology and examination of factors differentiating narnaviruses and mitoviruses, but they are not replacements for studies examining mitovirus RNA replication in mitochondria.
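
The UGA-to-UGG recoding step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a published protocol: the function name `recode_uga_to_ugg` and the toy sequence are hypothetical, the input is assumed to be an RNA ORF given in frame from its first codon, and the ORF is assumed to end in UAA or UAG so that no genuine terminator is rewritten.

```python
def recode_uga_to_ugg(orf_rna: str) -> str:
    """Return the ORF with every in-frame UGA codon changed to UGG.

    Assumes `orf_rna` starts at the first base of the first codon.
    Out-of-frame occurrences of "UGA" are left untouched because the
    sequence is processed codon by codon, not by plain text replacement.
    """
    codons = [orf_rna[i:i + 3] for i in range(0, len(orf_rna), 3)]
    return "".join("UGG" if codon == "UGA" else codon for codon in codons)


# Hypothetical toy ORF: AUG AAA UGA GCU UAA
# Note the out-of-frame "UGA" spanning the first two codons (A-UGA-AA).
toy_orf = "AUGAAAUGAGCUUAA"
recoded = recode_uga_to_ugg(toy_orf)
print(recoded)  # AUGAAAUGGGCUUAA
```

Working codon by codon matters here: a naive string replacement of "UGA" would also alter the out-of-frame match at the start of the toy sequence and change the encoded protein.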

Host Defense Within fungi, a primary host defense mechanism against virus attack is thought to be vegetative incompatibility systems, which limit intraspecific hyphal fusion and accordingly limit horizontal virus transmission. This barrier is well-understood in some fungi, including C. parasitica, and has been demonstrated to restrict mitovirus transmission experimentally in that fungus and several other fungi. Presumably it functions in nature in fungi with VIC systems as well.


Active, nucleic acid-based antiviral defense systems in plants and fungi are similar to each other, but differ in details and complexity. Both plants and fungi have RNA silencing-based defense systems. Viral suppressors of silencing encoded by related plant and fungal viruses, including the fungus-infecting members of the family Hypoviridae and the plant-infecting Potyviridae, have been characterized in detail, and the hypovirus silencing suppressor, p29, has been shown to function when expressed in plants. There is currently no evidence that mitoviruses encode suppressors of RNA silencing, and comparison of CpMV1 accumulation among various RNA silencing-deficient and silencing-competent strains indicated that the virus was unaffected by RNA silencing. CpMV1 did not induce expression of either the dicer-like protein DCL2 or the Argonaute-like protein AGL2, the two RNAi-associated proteins that are known to be components of the RNA virus defense response pathway in fungi. Furthermore, the replication level of CpMV1 in fungal strains deleted for the dcl2 or agl2 genes did not differ from that in the wild-type strain, indicating no difference in susceptibility. Studying the interactions between fungal – or plant – RNA defense systems and mitoviruses will help answer important fundamental questions about mitovirus biology.

Codon Usage and Implications for Mitovirus Biology and Evolution Mitoviruses were initially described as mitochondrial elements based on their biology in the Dutch elm disease fungal system, Ophiostoma novo-ulmi, and their molecular, genetic, and subcellular properties in the chestnut blight system, Cryphonectria parasitica. Although few comprehensive studies have been done to date in the many mitovirus systems described at the primary sequence level, deep sequencing and subsequent mining of genomic and transcriptomic data have resulted in important advances in our understanding of the biology and evolutionary history of mitoviruses. One of the hallmarks of mitovirus genome structure is their use of the mitochondrial genetic code for translation of their only gene product, the viral RdRp. The observation in the C. parasitica mitovirus CpMV1-NB631 that the genome contains numerous UGA codons in an otherwise open reading frame (ORF) encoding a putative RdRp led to the hypothesis that the element is obligately mitochondrial. UGA is a termination codon in the cytosol but encodes tryptophan in most fungal mitochondria, and therefore this CpMV1-NB631 element was predicted to encode a functional RdRp in mitochondria, but to have no capacity for translation in the cytosol. As mitovirus sequences accumulated, the universal presence of UGA codons in predicted RdRp ORFs became evident as a general property of these elements. The accumulation in databases of many mitovirus sequences allowed for systematic analysis of Trp-encoding UGA vs. UGG codons in the context of fungal host UGA vs. UGG codon use. These analyses showed that mitovirus UGA codon use mirrored that of the host fungus: that is, fungal taxa that rarely used UGA to encode Trp tended to host mitoviruses with few UGA codons.
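
The reasoning above, that an ORF containing in-frame UGA codons yields a full-length product only under the mitochondrial code, can be made concrete with a short Python sketch. Everything named here is an assumption for illustration: the toy sequence is invented, the helper names `translate` and `uga_fraction_of_trp` are hypothetical, and the mitochondrial code is taken to be the mold mitochondrial code (NCBI translation table 4), which differs from the standard code only in reading UGA as tryptophan. Not every fungal mitochondrion uses exactly this table.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code (NCBI translation table 1), codons in UCAG order; '*' = stop.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}
# Assumed mitochondrial code: identical except that UGA encodes Trp (W).
MOLD_MITO = {**STANDARD, "UGA": "W"}


def translate(rna: str, code: dict) -> str:
    """Translate in frame from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = code[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def uga_fraction_of_trp(rna: str) -> float:
    """Fraction of in-frame Trp codons (UGA or UGG, mitochondrial reading) that are UGA."""
    codons = [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]
    uga, ugg = codons.count("UGA"), codons.count("UGG")
    return uga / (uga + ugg) if (uga + ugg) else 0.0


# Hypothetical toy ORF: AUG UGA GCU UGG AAA UGA UAA
toy_orf = "AUGUGAGCUUGGAAAUGAUAA"
cytosolic = translate(toy_orf, STANDARD)       # terminates at the first UGA
mitochondrial = translate(toy_orf, MOLD_MITO)  # reads through both UGA codons
print(cytosolic, mitochondrial, uga_fraction_of_trp(toy_orf))
```

For the toy sequence, the cytosolic reading stops after the initiator methionine, whereas the mitochondrial reading yields the six-residue peptide MWAWKW. The UGA fraction computed by the second helper is the kind of per-genome statistic that could be compared between a mitovirus and its host mitochondrial genes.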

Origin and Evolution The phylogenetic tree of plant and fungal mitovirus sequences does not exclude the possibility that fungal mitoviruses derive from their plant counterparts. However, the likely ancient origin of eukaryotic mitoviruses, coupled with the earlier appearance on earth of fungi rather than plants argues for fungi as the earlier mitovirus host. Possible scenarios to envision include entry of a plant mitovirus into a closely associated sympatric fungus, such as an endophyte or mycorrhizal fungus, and subsequent expansion of fungal mitovirus host range and adaptation to fungal mitochondrial codon preferences – i.e., progressive gain of UGA codons within coding sequences – following heterologous fungal anastomosis events and associated mitochondrial fusion events. It is also easy to envision this scenario moving in the other direction: that a fungal mitovirus devoid of UGA codons within the coding sequence similarly entered a plant mitochondrion, took up residence, and expanded from there into other plants. Perhaps more than other viruses, the extreme simplicity of mitovirus genomes raises questions about how close they are to progenitors of RNA viruses. Recent deep and comprehensive phylogenetic analysis supports the idea that mitoviruses are ancient in origin, with the intriguing possibility that progenitors of contemporary mitoviruses entered the eukaryotic lineage along with the bacterial progenitors of the mitochondrial endosymbiont, possibly by the loss of a capsid protein gene from a progenitor bacterial levivirus. Evolution of mitoviruses within fungi would then parallel evolution of fungal (and plant) mitochondria. 
Further downstream evolution of viruses in the lineage represented by mitoviruses would presumably involve escape from mitochondria and then either (1) adaptation to a similarly cell-bound but non-mitochondrial, cytosolic lifestyle in the case of narnaviruses and possibly fungal botourmiaviruses, or (2) acquisition of simple capsid protein and movement protein genes and escape from the cell in the case of plant ourmiaviruses.

Plant Mitoviruses Signature sequences of mitoviruses were among the first non-retroviral endogenous RNA (NERVE) sequences identified in plants, described initially in the late 1990s. Mitovirus sequences have been identified in both mitochondrial and nuclear DNA genomes of plants. Their initial entry and presumably subsequent integration into plant genomes appears to have been an ancient event, possibly dating from the early evolution of clubmosses nearly half a billion years ago. The extent to which mitoviruses have been active members of the plant virome remains a matter of active investigation that will be explored further as broader and deeper genomic coverage of plant phyla becomes available. Some studies suggest that mitoviruses may have entered into plant lineages multiple times, possibly being lost from a given lineage and then regained through subsequent mitochondrial exchange. Recent evidence based on mining of transcriptome data strongly suggests that mitoviruses are active, contemporary residents of plants and that they form a monophyletic cluster embedded within one of the two major clades of fungal mitovirus elements (see Fig. 2). Furthermore, analysis of plant mitovirus-related NERVE sequences also supports the finding that these elements form a monophyletic group that clusters with the putative contemporary plant mitoviruses discovered from transcriptome data. Unlike in fungal mitochondria, UGA in plant mitochondria is a translational stop codon rather than a tryptophan codon; thus, putative plant mitoviruses would not be expected to contain any UGA codons in ORFs, and this is indeed the case. As mitoviruses are now understood to be a component of the plant virome, we anticipate that other studies will investigate their relative importance in plants. A key question is whether they are cryptic in plants or whether they have a measurable and significant effect on plant development and biology. An interesting finding from associative plant transcriptome studies suggested higher mitovirus expression levels in flowers than in other tissues, but these studies were not directed toward asking questions about mitovirus biology. Experimental studies with isogenic plants, similar to those that have been performed in fungi, will help address the possibility. Many of the same types of studies that have been performed in fungal systems are possible in plant systems, and many of the same specific questions will be addressed, for example: is there a measurable effect on plant phenotype; can horizontal transmission be demonstrated and what are the mechanistic details; how do RNA defense systems impact plant mitoviruses? While genome and transcriptome data mining are very valuable tools for mitovirus discovery and analysis, examination of archival dsRNA studies that were performed for discovery of unknown viruses in plants may also be a useful approach. As with fungi, most plant viruses contain RNA genomes; in the case of plants, a preponderance of viruses contain positive-sense RNA genomes, and thus replicative form dsRNA accumulates to visually detectable levels in most. Initial discovery of mitoviruses in fungi, and subsequent discovery of most other fungal mitoviruses, has come through screening of samples from culture collections for dsRNA presence. Mitovirus-infected plants might be expected usually to contain a single signature genome-size dsRNA of ~2.5–3 kb, identifiable in archival and published gel photographs, but evidence to date suggests that plant mitovirus dsRNA accumulation is minimal, so such data mining might not lead to further plant mitovirus discovery.


Mycoreoviruses (Reoviridae)
Bradley I Hillman and Alanna B Cohen, Rutgers University, New Brunswick, NJ, United States
© 2021 Elsevier Ltd. All rights reserved.
This is an update of B.I. Hillman, Mycoreoviruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00569-0.

Glossary
Anastomosis Fusion between hyphal branches allowing for exchange of cytoplasm, ions, and organelles, and the most common means of horizontal transmission of mycoviruses.
Ascospores Sexual spores of an ascomycete.
Conidia Asexual, non-motile fungal spores.
Haplotype Group of alleles inherited together. In the case of haploid fungi, it constitutes the genetic makeup of the individual.
Protoplast A cell that has its cell wall removed, allowing for membrane fusion.
Reassortment Exchange of genome segments resulting from co-infection of two or more strains of a multisegmented virus such as a reovirus.
RNA silencing The negative regulation or inhibition of gene expression by non-coding RNA elements. Also called RNA interference (RNAi).
Vegetative incompatibility Genetic differences between fusing vegetative hyphae of fungi that result in unviable heterokaryons and prevent horizontal mycovirus transmission.

The 9th Report of the International Committee on Taxonomy of Viruses (ICTV) lists 15 genera of viruses in the family Reoviridae, divided into two subfamilies, Sedoreovirinae and Spinareovirinae, that infect mammals, invertebrates, plants, protists, and fungi. Members of many of the reovirus genera in the 9th Report replicate in organisms representing more than one kingdom or phylum: for example, all plant reoviruses replicate in and persistently infect their insect vectors, and similarly many of the mammalian reoviruses replicate in their invertebrate vectors. The genus Mycoreovirus comprises three fungus-infecting reovirus species, Mycoreovirus 1, Mycoreovirus 2, and Mycoreovirus 3. The exemplar strains of the first two species, mycoreovirus 1 (MyRV1) and MyRV2, were isolated from the filamentous ascomycete fungus Cryphonectria parasitica, and that of the last species is from the soilborne fungus Rosellinia necatrix, also an ascomycete but representing a different fungal order. A fourth proposed species of the genus, infecting the fungal plant pathogen Sclerotinia sclerotiorum, has been described (here designated MyRV4). Another recently described fungal reovirus, also infecting S. sclerotiorum, falls phylogenetically outside of the established genus Mycoreovirus. That virus, Sclerotinia sclerotiorum reovirus 1 (SsReV1), will nevertheless be discussed in this article.

Mycoreoviruses often cause disease in their infected hosts, resulting in greatly reduced virulence, reduced growth in culture, reduced laccase accumulation, and reduced sporulation relative to wild-type, virus-free cultures of the same genetic background (Fig. 1). The mycoreoviruses are most closely related to members of the genus Coltivirus of tick-borne reoviruses (Fig. 2). Both Colorado tick fever virus (CTFV) and the closely related Eyach virus (EyaV) cause disease in mammals, and both viruses replicate in their arthropod vectors as well as in their mammalian hosts. A recently characterized coltivirus from bats was also found to replicate in mammalian cells. The coltiviruses are not well understood at the molecular level and so have shed only a little light on the study of mycoreoviruses. Coltiviruses are more closely related to the genus Orthoreovirus, which includes the common human pathogen mammalian orthoreovirus (MRV), than to members of the other two genera of the family Reoviridae (Orbivirus and Rotavirus) that have been well studied at the structural and molecular levels.

Filamentous fungi are valuable subjects for investigation of eukaryotic viruses. This has been especially true for ascomycetes, which usually are haploid throughout their vegetative phases and thus readily amenable to tools of classical genetics and to relatively simple gene knockout and knockdown experiments. The plant pathogenic ascomycete fungi Cryphonectria parasitica, Rosellinia necatrix, and S. sclerotiorum have all been exceptional hosts for examination of fungal viruses: they cause important plant diseases, are very stable in culture, showing a consistent morphology upon continued maintenance and subculture, are easily transformed and transfected, and can be examined through classical genetics. Because of their historical importance as plant pathogens and the potential for their control using natural or genetically engineered viruses, a large array of diverse viruses that cause stable morphological changes and/or changes in virulence of their fungal hosts have been identified and characterized in these fungi.

Structure-Function Relationships
Relatively little is known of the details of mycoreovirus structure, but all indications from electron micrographs of negatively stained virus particles and from genome sequence analysis are consistent with placement in the subfamily Spinareovirinae, which is best studied in the orthoreoviruses. The orthoreoviruses, typified by mammalian orthoreovirus (MRV), are distinguished from members of the subfamily Sedoreovirinae, such as orbiviruses and rotaviruses, in that they have identifiable pentameric turrets on top of each

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21326-5


Fig. 1 Morphologies of isogenic Cryphonectria parasitica colonies infected with three mycoviruses. (a) Uninfected colony, strain EP155; (b) Strain EP155 infected with hypovirus CHV1/EP713; (c) Strain EP155 infected with mycoreovirus MyRV1/Cp9B21; (d) Strain EP155 infected with mycoreovirus MyRV2/CpC18.

Fig. 2 Neighbor-joining (NJ) phylogeny of members of the family Reoviridae based on a multiple sequence alignment of full-length RdRp protein sequences. Amino acid sequences were aligned using MUSCLE. Phylogenetic tree was reconstructed using MEGA version 7 under the Poisson model with 1000 bootstrap replicates. Numbers above branches represent bootstrap support above 70%. Taxa that do not belong to a named clade represent unclassified viruses within Reoviridae. Taxa are colored according to host type: fungus (orange), mammal (red), invertebrate (gray), plant (green), and fish (blue).


fivefold axis of the core particle, and these have been demonstrated to be involved in capping of nascent mRNA. The protein that was experimentally demonstrated to be the mycoreovirus guanylyltransferase is homologous to turret proteins of orthoreoviruses that have the same function, supporting the grouping of mycoreoviruses within the Spinareovirinae subfamily of the family Reoviridae.

Movement of fungal viruses within the mycelium and during horizontal or vertical virus transmission is also relatively poorly understood. Viruses that have been investigated are generally found in hyphal tip cells and move with the growing mycelium; however, the location of mycoreoviruses within the mycelium has not been investigated.

The steps in the infection process of the mammalian orthoreovirus have been investigated in considerable detail. Virus entry into cells via vesicles occurs by receptor-mediated endocytosis, whereupon the μ1 protein, which is myristoylated at its N-terminus and inserted into the inner membrane of the vesicle, cleaves autocatalytically, resulting in pore formation in the vesicle membrane and subsequent virus release into the cytoplasm. Both of the C. parasitica mycoreoviruses have homologs of the orthoreovirus μ1 protein, with strong myristoylation signals at the amino termini of the corresponding proteins and putative sites for autocatalytic cleavage. This suggests that a similar mechanism of egress from vesicles is a component of the infection cycle of these reoviruses. Interestingly, although the R. necatrix mycoreovirus, MyRV3/RnW370, contains a homolog of the μ1 protein, it does not contain a glycine residue at the penultimate position of the N-terminus of this deduced protein, and based on its sequence it has a very low predicted probability of myristoylation. It does, however, contain the N/P putative cleavage motif at a similar position and in a similar environment as those in the other viruses.

It is relatively easy to cure C. parasitica, R. necatrix, and S. sclerotiorum of their associated mycoreoviruses. In C. parasitica, single asexual spores are usually virus-free, and in R. necatrix, cultures initiated from excised hyphal tips (the terminal 2–8 cells) are also virus-free. In S. sclerotiorum, colonies regenerated from individual protoplasts derived from virus-infected cultures may be virus-free. This contrasts, for example, with cultures infected with the well-studied hypovirus of C. parasitica, Cryphonectria hypovirus 1 (CHV1), in which most or sometimes all conidia contain virus and virus-infected cultures cannot easily be cured by hyphal tip isolation. It is likely that the details of mycoreovirus movement within the mycelium, as well as of horizontal and vertical transmission, are quite different from those of hypoviruses, which have no capsid protein, are more closely related to positive-sense single-stranded RNA viruses such as plant potyviruses and animal picornaviruses, and whose replication is associated with the fungal trans-Golgi network.

Taxonomy and Nomenclature
Naming of fungal viruses is often problematic because infectivity studies are very difficult with fungal viruses and have not been done with most. In nature, fungal viruses do not exit completely from an infected isolate and enter an uninfected one exogenously. Instead, they move from one isolate to another only after the hyphae of the two isolates have fused (anastomosed), in which case the contents of one or usually more cells from the infected and uninfected isolate are mixed. Mixed infections are common and symptomless infections are the norm for fungal viruses. For these reasons, fungal virus names usually contain reference not only to the host genus and species, but to the host strain of origin as well.

The genus name Mycoreovirus was a natural one to use because it is descriptive. For the sake of simplicity and consistency with other viral nomenclature, species are numbered progressively as they are described. The fungal species and isolate in which a particular virus was identified are provided after the species number. In this system, the first mycoreovirus species described was designated Mycoreovirus 1. The only isolate of this virus species identified to date was from C. parasitica strain 9B21, so the virus is designated mycoreovirus 1/Cp9B21 or MyRV1/Cp9B21. MyRV2 represents a separate virus species, even though the two viruses were isolated from the same fungal species only a few miles apart, because it shares much less sequence similarity with MyRV1 at both the nucleotide and amino acid levels (<50%) than would be expected for two viruses in a single species. Surprisingly, both of these species are monotypic: these two virus isolates, each representing a different species, are the only two mycoreoviruses isolated to date from C. parasitica, even though thousands of isolates infected with members of the family Hypoviridae have been identified worldwide. In contrast, mycoreoviruses have been isolated from dozens of isolates of R. necatrix from different parts of Japan, but all are closely related to each other, indicating that they represent strains of a single virus species, Mycoreovirus 3.

The two distinct reoviruses characterized from the phytopathogenic ascomycete S. sclerotiorum are not closely related to each other and most likely represent independent introductions to the fungus by viruses that evolved elsewhere. One of the viruses falls within the same clade as the other mycoreoviruses and, for consistency here and in keeping with the ICTV species names, is designated Mycoreovirus 4. In fact, the predicted amino acid sequences of the 12 MyRV4 segments share an average of approximately 80% identity with homologous segments of MyRV3, and terminal nucleotide sequences of MyRV3 and MyRV4 are identical, indicating that they are recently diverged species. The other Sclerotinia reovirus, which has been designated Sclerotinia sclerotiorum reovirus 1 (SsReV1), falls well outside of the clade containing the four viruses within the genus Mycoreovirus; in fact, SsReV1 is more distant from members of the genus Mycoreovirus than it is from members of the genus Coltivirus (Fig. 2). SsReV1 shows some properties distinct from those other viruses, discussed below, and has been proposed to constitute the basis for a new genus within the family Reoviridae based on these phylogenetic relationships and predicted coding for proteins not found in the four species of the genus Mycoreovirus.
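The identity figures quoted above (less than 50% between MyRV1 and MyRV2, approximately 80% between MyRV3 and MyRV4) are pairwise comparisons of aligned sequences. The following is a minimal sketch of how such a value can be computed from two already-aligned sequences. The function name and the toy sequences are illustrative assumptions and are not taken from the mycoreovirus genomes; columns containing a gap in either sequence are simply skipped, which is only one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must come from the same pairwise or multiple alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned peptide fragments, for illustration only.
toy_x = "MKT-LLVAGH"
toy_y = "MRTALLV-GH"
print(percent_identity(toy_x, toy_y))
```

Values obtained this way depend on the alignment method and on how gaps are treated, so they are best compared only within a single analysis.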


Fig. 3 Diagram illustrating the genome segments of mycoreovirus 1/Cp9B21 drawn to scale with respect to size, open reading frame (ORF), and 5′ and 3′ untranslated regions (UTR). Segment size, ORF in number of amino acids and predicted protein product size, and putative protein function assigned to each genome segment are shown.

Genome Structures, Organizations, and Relationships
Three members of the genus Mycoreovirus examined to date appear to have 11 segments of dsRNA that are required for infection. Isolates of one of the viruses, MyRV3, have been found to have either 12 or 11 segments (see below). Whether any of the 12 dsRNA segments of the closely related virus MyRV4 is dispensable for stable infection of S. sclerotiorum is unknown. There is significant sequence similarity among the larger segments of the four mycoreovirus species and between homologous segments of the mycoreoviruses and their closest relatives, the coltiviruses, but this similarity becomes less apparent in the middle segments and is not apparent at all in the smallest segments. This feature is shared with other members of the family Reoviridae, in which the more distant relationships are generally revealed only in the large segments.

Each of the 11 required segments of mycoreoviruses appears to contain a single open reading frame (Fig. 3). Segments 1–5 of the mycoreoviruses are homologous to segments 1–5 of the coltiviruses (segments 4 and 5 of MyRV2 are transposed relative to the others). Based largely on similarity with other reoviruses, their predicted functions are: S1: RNA-dependent RNA polymerase (core protein); S2: dsRNA-binding; putative methyltransferase (core protein); S3: Guanylyltransferase (turret protein); S4: Myristoylated membrane penetration protein (outer capsid); S5: Cytoskeleton-interacting (core). Segment 6 of the mycoreoviruses is predicted to encode a nucleic acid binding core protein and is homologous to segment 10 of the two coltiviruses. Segment 7 of MyRV1 encodes a proline-rich protein with similarity to several viral and nonviral proline-rich domains. Homologies among the smaller mycoreovirus proteins (from S8 to S11 or S12) and other reovirus deduced proteins, including those of coltiviruses, are unknown.

One of the indications of close relationship among reoviruses is conserved terminal sequences. This also is a good indication of whether or not pseudorecombinants can be generated by co-infection with two related viruses. In the case of the mycoreoviruses, MyRV1, MyRV2, and MyRV3 have different conserved terminal sequences, consistent with their taxonomic separation. Terminal sequences of MyRV3 and MyRV4 are identical, supporting their close relationship.
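The comparison of terminal sequences described above can be illustrated with a short sketch that extracts the nucleotides shared at the 5′ and 3′ ends of a set of segments. The segment sequences below are invented toy strings, not real mycoreovirus termini, and the function name is an assumption made for this example. The sketch relies on the standard-library function os.path.commonprefix, which compares strings character by character.

```python
import os


def shared_termini(segments):
    """Return (shared 5' end, shared 3' end) for a list of plus-strand sequences."""
    five_prime = os.path.commonprefix(segments)
    # Reverse every sequence so the 3' ends line up at position 0,
    # take the common prefix, then reverse the result back.
    three_prime = os.path.commonprefix([s[::-1] for s in segments])[::-1]
    return five_prime, three_prime


# Hypothetical segment sequences, for illustration only.
toy_segments = [
    "GAUCAAUGGCUUACGGAUCCGCAGUC",
    "GAUCAUUCCGGAAUGCCAAGGCAGUC",
    "GAUCAAGGCUUAGCCAUAGCAGUC",
]
print(shared_termini(toy_segments))
```

Running the same function on the segments of two different viruses and comparing the results is one simple way to check whether their conserved termini match.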

Mycoreovirus 1
Although MyRV1 was not the first of the two C. parasitica reoviruses identified, fungal cultures infected with this virus proved to be more stable and more easily studied than those infected with MyRV2 (see below), so early molecular investigations focused on this virus. When grown on solid media in Petri dishes (e.g., a defined complete medium or potato dextrose agar (PDA)), MyRV1-infected colonies are deep orange in color and have few aerial hyphae compared to their uninfected counterparts. As with other C. parasitica viruses, the phenotype of the infected culture has very little to do with the host isolate but is determined almost entirely by the virus. Fungal isolates infected with MyRV1 are much less virulent than uninfected isolates, and are among the most debilitated of any virus-infected C. parasitica cultures studied to date. Although the virus accumulates to reasonably high concentrations in infected colonies, it is transmitted very poorly through conidia, at rates of only 2%–5%. This may account in part for the rarity of the virus in nature.


Expression of MyRV1 Gene Products
Functional analysis of the MyRV1 genome was initiated by cloning the 11 individual segments into a baculovirus expression vector and expressing them in insect cells. All 11 segments were expressed, resulting in 11 identifiable protein products on polyacrylamide gels. Only one of the proteins, the segment 3 product, has been studied functionally. This protein was found to be active in autoguanylylation assays, confirming it as the viral guanylyltransferase. Deletion and site-directed mutational analysis determined that the amino acid sequence EPAGYHPRPSIVVPHYFVFR constituted the catalytically active site of the MyRV1/Cp9B21 guanylyltransferase. The Hx8H motif (two histidine residues separated by eight residues) was identified as absolutely conserved in all three members of the genus Mycoreovirus, as well as in the structurally related genera Coltivirus, Orthoreovirus, Aquareovirus, Cypovirus, Dinovernavirus, Oryzavirus, and Fijivirus. In all of the above genera in which the guanylyltransferase has been identified functionally, the Hx8H motif has been found within the sequence. The core consensus sequence for the guanylyltransferase within this group of genera was a/vxxHxxxxxxxxHhyf/lvf, with only the H residues being absolutely conserved.
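The Hx8H motif described above can be located in a protein sequence with a simple regular expression. The sketch below applies it to the 20-residue active-site sequence quoted in the text. The pattern names and the helper function are assumptions made for this illustration, and the second pattern encodes only the N-terminal part of the consensus (a/v, two residues, H, eight residues, H), leaving out the trailing positions.

```python
import re

# Active-site sequence of the MyRV1/Cp9B21 guanylyltransferase, as quoted in the text.
ACTIVE_SITE = "EPAGYHPRPSIVVPHYFVFR"

# Two histidines separated by exactly eight residues of any kind.
HX8H = re.compile(r"H.{8}H")

# N-terminal part of the consensus: A or V, any two residues, then the Hx8H motif.
CORE_START = re.compile(r"[AV]..H.{8}H")


def find_motif(pattern, sequence):
    """Return (0-based start, matched residues) for the first match, or None."""
    match = pattern.search(sequence.upper())
    if match is None:
        return None
    return match.start(), match.group()


print(find_motif(HX8H, ACTIVE_SITE))        # (5, 'HPRPSIVVPH')
print(find_motif(CORE_START, ACTIVE_SITE))  # (2, 'AGYHPRPSIVVPH')
```

A match of this kind only flags a candidate site. In the work summarized above, the assignment rested on autoguanylylation assays and mutational analysis, not on pattern matching.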

Mycoreovirus 2
The first virus that was tentatively identified as a mycoreovirus was MyRV2. Like many of the virus-infected C. parasitica cultures, the one that was found to contain this virus was isolated from a canker on an American chestnut tree. The circumstances of the discovery were somewhat unusual in that only one of 36 fungal isolates from that particular canker was virus infected; the rest were virus-free. Although it is not unusual to find mixed fungal infections within a canker or to isolate virus-containing and virus-free cultures from one canker, the ratio of 1/36 is extraordinary. The phenotype of the infected culture, designated C-18 (the 18th isolate from canker C on that particular tree), was distinct from the phenotype of MyRV1/Cp9B21-infected cultures described above in that it was light brown in color and had more aerial mycelium (Fig. 1). MyRV1 and MyRV2 have very similar, dramatic negative effects on fungal virulence. The recent finding that the original C-18 culture is co-infected with the unrelated virus CHV4, and that CHV4 helps stabilize MyRV2 in C. parasitica, likely as a result of suppression of RNA silencing, adds interesting complexity to the understanding of mycoreoviruses.

A major difference between the two C. parasitica mycoreoviruses is their stability and transmissibility. MyRV2 is easily lost upon subculture of the fungus, while this has never been observed with MyRV1. Furthermore, MyRV2/CpC18 is extremely difficult to transmit from an infected isolate to an isogenic uninfected isolate by hyphal anastomosis, a property that is not seen with other C. parasitica viruses. It is presumed that these properties are related, but the reasons for them have not been fully elucidated.

Both MyRV1 and MyRV2 have been transmitted to uninfected fungal isolates by inoculating C. parasitica protoplasts with purified virus particle preparations and allowing the protoplasts to regenerate cell walls, form a hyphal network, and grow into a single colony. The ability to infect protoplasts efficiently using purified virus particle preparations is an interesting and useful feature of C. parasitica reoviruses and has recently been used for a number of other viruses. This has allowed for making isogenic virus-infected and virus-free isolates, starting with different genotypes of C. parasitica regardless of vegetative incompatibility group, and for experimental transmission of mycoreoviruses to different species, genera, and families of fungi.

Mycoreovirus 3
White rot is a root disease of fruit trees that can be limiting to production. Control of the fungus that causes the disease, Rosellinia necatrix, by chemical means is difficult and not economically feasible. Plant pathologists in Japan, where the disease is particularly severe, have sought to use virus-infected strains to control the disease, leading to the identification of several viruses. This plant/fungus interaction represents an interesting contrast to the chestnut/C. parasitica interaction. As a root disease, there are challenges and opportunities for biological control of a fungal pathogen with viruses that do not apply to aerial diseases. One of the viruses under investigation for biocontrol of R. necatrix is the mycoreovirus MyRV3, which causes reduced virulence of the fungus. Unlike the monotypic C. parasitica reoviruses, different strains of MyRV3 have been isolated from a variety of R. necatrix strains from around Japan. Of the R. necatrix reoviruses, the virus isolate that has been most thoroughly characterized is MyRV3/RnW370. In surprising contrast to the C. parasitica viruses, MyRV3/RnW370 was found to contain 12 rather than 11 segments. However, the presence of 12 segments is not a necessary feature of MyRV3: R. necatrix isolates derived from subculture of original virus-infected isolates may have either 12 or 11 segments. Experiments to investigate virus composition and transmission have been performed on hyphal tip cultures from infected R. necatrix isolates, resulting in demonstration that these viruses behave like the C. parasitica mycoreoviruses. When only 11 segments are present in MyRV3 subculture isolates, segment 8 is the one that is absent from the full complement. With most reoviruses, sequence conservation among species and genera is evident in the larger segments, but much less so in the smaller segments, and this is true in the mycoreoviruses and related genera. 
Consistent with this general trend, sequence comparison between the 12 segments of MyRV3 and the 12 segments of the two coltiviruses has suggested nothing about possible function of the apparently dispensable segment 8. It is intriguing to think that perhaps MyRV3 segment 8 is vestigial for reoviruses in fungi and is required only in another host, past, or present. This would be similar to the leafhopper-transmitted phytoreovirus, wound tumor virus (WTV), in which deletion


mutations in any of three segments may be found upon successive serial, insect-free virus passage, or long-term virus maintenance in plants, whereupon resulting mutant viruses become defective in their transmission properties and incapable of replicating in their leafhopper vectors. An apparent major difference between MyRV3 and WTV is that in MyRV3 there is no evidence for terminal remnants of segment 8 in deletion mutant strains; it appears to be either present or entirely absent. In the well-characterized mutants of WTV, the deleted segments were not completely absent; rather, shorter mutant segments containing the two termini and varying amounts of adjacent sequence remain, ensuring that a total of 12 segments remain. Whether this represents a difference between the dsRNA segment sorting and packaging mechanisms of phytoreoviruses and mycoreoviruses is not known. In this line of inquiry, the close association of fungi with mites in natural settings may be significant to the evolutionary biology of mycoreoviruses: it may be no coincidence that their closest relatives are the tick-borne coltiviruses, with ticks and mites both in the arachnid subclass Acari. Unfortunately, mites are very difficult experimental subjects and cell cultures are not currently available. Furthermore, much less is known about coltivirus gene function than is known about many of the other members of the family Reoviridae that are pathogenic to humans, making it more difficult to pursue this line of research from a strictly bioinformatics standpoint.

Sclerotinia sclerotiorum Reoviruses
Mycoreovirus 4
MyRV4 from S. sclerotiorum is most closely related to MyRV3 from R. necatrix. Although the two were identified in fungi representing distinct classes of ascomycetes, they are more similar to each other than are MyRV1 and MyRV2, which were both isolated from C. parasitica only a few miles apart. MyRV4 infection of S. sclerotiorum reduces virulence of the fungus on its plant hosts and also suppresses host non-self recognition, or vegetative incompatibility, in the fungus. In doing so, MyRV4 appears to enhance its own spread in nature by lowering the existing barriers to spread within heterogeneous field populations. Like its closest relative MyRV3, MyRV4 contains 12 segments of dsRNA. Each of the 12 segments of MyRV4 has a homologous MyRV3 segment, so these viruses will be good candidates for comparative genomics and for potential coinfection studies. Unlike MyRV3, in which stable virus isolates containing only 11 dsRNA segments (i.e., lacking segment 8) have been identified, no MyRV4 isolates containing only 11 dsRNA segments have yet been reported. Segment 8 of MyRV4 shares 77% nucleotide identity with its MyRV3 counterpart, so it would not be surprising to find similar MyRV4 isolates lacking this segment. MyRV4 is also associated with distinct alterations to fungal colony morphology in culture, and with alterations in fungal gene expression. Suppression of expression of the het genes, which are associated with heterokaryon, or vegetative, compatibility, explains the observed suppression of vegetative incompatibility and increased virus transmission among diverse fungal strains. Reduction in expression of reactive oxygen species (ROS) genes by MyRV4 infection also likely contributes to the same phenotype.

Sclerotinia sclerotiorum Reovirus 1
In addition to its greater phylogenetic distance from the four members of the genus Mycoreovirus, the other Sclerotinia reovirus, SsReV1, differs from them in that it causes no identified morphological or phenotypic change to the host fungus. SsReV1 was originally isolated from a hypovirulent strain of S. sclerotiorum, but infection with purified virus demonstrated that it was not the cause of the hypovirulent phenotype. A notable feature of the SsReV1 genome is the presence of two domains that were identified in several other viruses and eukaryotes, but not in mycoreoviruses. One of the domains, a dsRNA binding motif, was found especially in DNA viruses and also in cellular host genomes. The other domain, the Reo σC capsid domain, was more commonly found in other reoviruses but also in DNA-containing viruses. A possible new reovirus genus encompassing SsReV1 has not yet been described.

Effects of Mycoreoviruses on Fungal Gene Expression
Considerable information has been amassed on the impact of CHV1, a positive-sense RNA virus, on its fungal host (see Hypoviridae). In contrast, study of the effects of mycoreovirus infection on fungi is in its infancy. The first study addressing these questions was done by microarray analysis using mRNA isolated from isogenic fungal isolates of C. parasitica infected with either MyRV1 or MyRV2, and comparing results to the same fungal strain that was uninfected or infected with either of two different CHV1 strains, and with fungal mutants defective in virulence characteristics. To date, these experiments have been performed only on EST-based arrays representing approximately 20% of the total C. parasitica gene complement. Overall, there was consistency in the effects of the two C. parasitica mycoreoviruses on host gene expression: MyRV1 infection resulted in differential expression of 6.5% of the genes on the array, whereas MyRV2 infection affected expression of 5.8% of those genes. As might be expected based on their phenotypes, similar but distinct suites of genes were up- or downregulated in isogenic fungal isolates infected with the two reoviruses. Approximately 60% of the genes whose expression was affected were the same whether infection was by MyRV1 or MyRV2, and all but one of those genes were altered in the same direction. Some of these groups of genes are in common with


those that are differentially regulated in cultures infected with the unrelated hypoviruses, but there are predictable differences. For example, hypovirus infection of C. parasitica results in female infertility, whereas mycoreovirus infection does not, and this is reflected in the expression of two genes predicted to be involved in the C. parasitica mating response. Both mf2-1, which encodes the fungal pheromone precursor, and Csp12, which encodes a homolog of the yeast Ste12-like transcription factor, were substantially downregulated in hypovirus-infected fungal isolates, which are female sterile, whereas there was much less effect on expression of these genes in either of the mycoreovirus-infected C. parasitica isolates. Virus is transmitted to ascospores at a rate of approximately 50% or less when the female parent is infected, but there is no virus transmission to ascospore progeny if the male parent in a mating is infected. The complete genomes of several strains of C. parasitica have been sequenced, and investigations based on high-throughput RNAseq methodologies are underway.

Coinfections of Mycoreovirus and Other Viruses
Double infections of the well-characterized hypovirus CHV1/EP713 and MyRV1/Cp9B21 revealed that presence of the hypovirus caused an increase in both the concentration and the vertical transmission rate through conidia of the mycoreovirus, but that the mycoreovirus did not affect the concentration or transmission rate of the hypovirus. Furthermore, transgenic expression of only the hypovirus protein p29 also resulted in increased accumulation and transmission of MyRV1/Cp9B21. This is consistent with evidence that p29 serves as a suppressor of RNA silencing (= RNA interference) during hypovirus infection, and that this effect acts in trans to support enhanced mycoreovirus replication.

Coinfection of C. parasitica with CHV4 was recently shown to facilitate infection with MyRV2. Single infections of either the original host strain, C-18, or the commonly used virus-free strain EP155 with MyRV2 were less stable upon serial subculture than were the same strains when co-infected with CHV4, which itself is symptomless but stable in C. parasitica. Using C. parasitica mutants defective in the dicer-like gene dcl2, which is an integral component of the fungal RNA silencing-based defense response against virus infection, it was shown that the putative silencing suppressor of CHV4 was responsible for trans-stabilization of MyRV2.

SsReV1 was also first identified as a coinfection with another virus. In this case, a bipartite dsRNA virus was identified along with the 11-segment reovirus in S. sclerotiorum isolate SCH94. The doubly infected isolates had a distinct colony morphology compared to virus-free isolates, but isolates infected with only SsReV1 were indistinguishable from uninfected isolates, indicating that SsReV1 alone did not have a discernible effect on the fungus. Unlike the above case with MyRV2, there was no evidence that SsReV1 was unstable without its original coinfecting virus.

Mycoreovirus Genome Rearrangements

Numerous alterations and rearrangements of mycoreovirus genome segments have been identified and characterized. The first of these, discussed above, involves complete loss of segment 8 of MyRV3 in some fungal isolates. As far as we are aware, this is the only example of contemporary observed loss of an entire reovirus segment. It is unknown whether this segment is dispensable because it is not required for virus infection of the fungus in culture or because it is required for infection of an as-yet unknown arthropod host. Other segment alterations and rearrangements have been characterized during experimental infection with MyRV1. These include deletion of part or all of specific coding regions and duplication of all or part of the coding regions of specific segments. Some of these segment alterations occurred in isolates singly infected with MyRV1, some arose in isolates coinfected with MyRV1 and the hypovirus CHV1, and some arose from MyRV1 infection of transgenic fungal isolates expressing the p29 RNA silencing suppressor protein. Many of the MyRV1 segment mutants are associated with growth phenotypes and differences in viral RNA accumulation.

Coding region duplication among MyRV1 segments is an interesting phenomenon that has not been observed in other systems. When MyRV1 is influenced by the CHV1 silencing suppressor p29, in-frame extensions of the coding sequences of segments 1, 2, 3, and 6 have been observed, resulting in predicted larger protein products from those segments. Each of these products is predicted to be a structural protein, and while the predicted longer-than-wildtype proteins have been observed in infected cells, it is not yet known whether and how they might be incorporated into particles. This type of segment coding region extension is unusual: in coding region duplications observed within rotaviruses, the most thoroughly studied example of such rearrangements, the expressed coding region is not extended in-frame; rather, a second coding region is inserted head-to-tail downstream of the segment’s unaltered cognate coding sequence. Such rearrangements would not be predicted to alter the natural coding sequence of the rearranged segment.

Deletions in MyRV1 segments 4 and 10, which encode non-structural proteins, are tolerated. These deletions may be in-frame, resulting in a predicted fusion gene product, or out-of-frame, resulting in predicted truncated gene products. Surprisingly, virus in which both segments 4 and 10 are completely replaced by their deletion mutant counterparts is viable, probably representing the only known example of a reovirus mutant that can tolerate loss of the coding capacity of two segments. Also of interest, this double deletion mutant replicates to levels similar to those of wildtype MyRV1. The next step in mycoreovirus research is to develop a robust reverse genetics system to complement studies such as those above. Furthermore, discovery and characterization of new members of the genus Mycoreovirus, and of similar fungal reoviruses, will continue as high-throughput sequence-based methods dominate the realm of virus discovery.


Further Reading

Attoui, H., Mertens, P.P.C., Becnel, J., et al., 2012. Family Reoviridae. In: King, A.M.Q., Adams, M.J., Carstens, E.B., Lefkowits, E.J. (Eds.), Virus Taxonomy: Ninth Report of the International Committee for the Taxonomy of Viruses. New York: Elsevier, Academic Press, pp. 541–637.
Aulia, A., Andika, I.B., Kondo, H., Hillman, B.I., Suzuki, N., 2019. A symptomless hypovirus, CHV4, facilitates stable infection of the chestnut blight fungus by a coinfecting reovirus likely through suppression of antiviral RNA silencing. Virology 533, 99–107.
Enebak, S.A., Hillman, B.I., MacDonald, W.L., 1994. A hypovirulent Cryphonectria parasitica isolate with multiple, genetically unique dsRNA segments. Molecular Plant-Microbe Interactions 7, 590–595.
Eusebio-Cope, A., Sun, L., Tanaka, T., et al., 2015. The chestnut blight fungus for studies on virus-host and virus/virus interactions: From a natural to a model host. Virology 477, 164–175.
Eusebio-Cope, A., Suzuki, N., 2015. Mycoreovirus genome rearrangements associated with RNA silencing deficiency. Nucleic Acids Research 43, 3802–3813.
Hillman, B.I., Supyani, S., Kondo, H., Suzuki, N., 2004. A reovirus of the fungus Cryphonectria parasitica that is infectious as particles and related to the Coltivirus genus of animal pathogen. Journal of Virology 78, 892–898.
Hillman, B.I., Suzuki, N., 2004. Viruses of Cryphonectria parasitica. Advances in Virus Research 63, 423–472.
Kanematsu, S., Arakawa, M., Oikawa, Y., et al., 2004. A reovirus causes hypovirulence of Rosellinia necatrix. Phytopathology 94, 561–568.
Liu, L., Cheng, J., Fu, Y., et al., 2017. New insights into reovirus evolution: Implications from a newly characterized mycoreovirus. Journal of General Virology 98, 1132–1141.
Osaki, H., Wei, C.Z., Arakawa, M., et al., 2002. Nucleotide sequences of double-stranded segments from hypovirulent strain of the white root rot fungus Rosellinia necatrix: Possibly of the first member of the Reoviridae from fungus. Virus Genes 25, 101–107.
Supyani, S., Hillman, B.I., Suzuki, N., 2006. Baculovirus expression of the 11 Mycoreovirus-1 genome segments and identification of the guanylyltransferase-encoding segment. Journal of General Virology 88, 342–350.
Suzuki, N., Supyani, S., Maruyama, K., Hillman, B.I., 2004. Complete genome sequence of Mycoreovirus 1/Cp9B21, a member of a new genus in the family Reoviridae isolated from the chestnut blight fungus, Cryphonectria parasitica. Journal of General Virology 85, 3437–3448.
Tanaka, T., Eusebio-Cope, A., Sun, L., Suzuki, N., 2012. Mycoreovirus genome alterations: Similarities to and differences from rearrangements reported for other reoviruses. Frontiers in Microbiology 3, 186.
Wu, S., Cheng, J., Fu, Y., et al., 2017. Virus-mediated suppression of host non-self recognition facilitates horizontal transmission of heterologous viruses. PLoS Pathogens 13 (3), e1006234. doi:10.1371/journal.ppat.1006234.

Mymonaviruses (Mymonaviridae)
Daohong Jiang, Huazhong Agricultural University, Wuhan, China
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Biological control A method to prevent pests and parasites using beneficial organisms; here it specifically means using mycoviruses to control fungal diseases.
Hypovirulence Reduced virulence of a fungal pathogen caused by mycovirus infection. The hypovirulence and hypovirus of the chestnut blight/Cryphonectria parasitica system is a classic example for studying mycoviruses and biological control of plant fungal diseases.
Mymonaviruses A group of mononegaviruses that was originally identified from a fungus.
Sclerotinia sclerotiorum An ascomycetous pathogen that can attack more than 400 species and subspecies of plants.

Introduction

The first mycovirus was discovered by Hollings in 1962 in a diseased mushroom, Agaricus bisporus. In the same year, an active substance capable of stimulating the production of interferon in mammals was found in Penicillium spp., and this substance was identified as double-stranded (ds)RNA extracted from mycovirus particles. dsRNA elements were also found in a hypovirulent strain of the chestnut blight pathogen (Cryphonectria parasitica) in the 1970s. These dsRNA elements represented the first identified virus (a hypovirus) that is devoid of a coat protein. These studies led to the notion that the genomes of fungal viruses are dsRNA. Since then, dsRNA has served as a marker for mycoviruses and is still used today to judge whether a fungal strain is infected by a mycovirus. Screening by dsRNA extraction with cellulose has detected a group of mycoviruses from various fungi that produce dsRNA, as their genomes or replicative forms, in infected cells. Some of these viruses are capsidless and phylogenetically related to plant single-stranded, positive-sense RNA (+ssRNA) viruses. Such fungal RNA viruses, exemplified by hypoviruses and endornaviruses, are now classified as +ssRNA viruses rather than as dsRNA viruses by the International Committee on Taxonomy of Viruses (ICTV).

Through high-throughput RNA sequencing (RNA-Seq) of hypovirulent strain AH98 of Sclerotinia sclerotiorum, an important necrotrophic ascomycetous pathogen that attacks numerous dicotyledonous plants, we found that strain AH98 was infected by a negative-stranded (−) RNA virus, designated Sclerotinia sclerotiorum negative-stranded RNA virus 1 (SsNSRV-1). This was the first discovery of a (−)ssRNA virus as an infectious entity in a fungus. Meanwhile, Dr. H. Kondo’s research group in Japan discovered an incomplete sequence of an L-protein-like gene in the genome of the powdery mildew fungus by performing an exhaustive search using sequences of known negative-strand RNA viruses. They also found two entire L-protein genes and one partial gene of unknown mononegaviruses in two independent transcriptomic data sets of Clarireedia (Sclerotinia) homoeocarpa. In 2016, these findings led the ICTV to create the new family Mymonaviridae within the order Mononegavirales, containing one genus, Sclerotimonavirus, with Sclerotinia sclerotimonavirus as the type species, the exemplar virus of which is SsNSRV-1. The name Mymonaviridae is derived from Myco and Mononegavirales.

Phylogenetic Status of Mymonaviridae and its Related Families

The family Mymonaviridae belongs to the order Mononegavirales (class Monjiviricetes, subphylum Haploviricotina, phylum Negarnaviricota, realm Riboviria). Based on phylogenetic analysis of the full-length L protein (RNA replicase), the members of Mymonaviridae are closely related to those of Lispiviridae, Paramyxoviridae, Filoviridae, Pneumoviridae and Rhabdoviridae, and are also phylogenetically related to members of Nyamiviridae, Bornaviridae, Xinmoviridae, Artoviridae and Chuviridae. Currently, there is only one genus (Sclerotimonavirus) in this family. However, the members of Mymonaviridae can be grouped into three clusters, and more genera may need to be established to accommodate viruses that differ significantly from those in Sclerotimonavirus (Fig. 1).

Species and Tentative Species in the Family Mymonaviridae

At present, the genus Sclerotimonavirus has seven species, namely Sclerotinia sclerotimonavirus, Hubei sclerotimonavirus, Drop sclerotimonavirus, Dadou sclerotimonavirus, Illinois sclerotimonavirus, Glycine sclerotimonavirus and Phyllosphere sclerotimonavirus. However, it is clear that there are at least eight additional viruses: Penicillium cairnsense negative-stranded RNA virus 1, Narnavirus sp.-QDH90729.1 (tentative name), Lentinula edodes negative-strand RNA virus 1, Penicillium adametzioides negative-stranded RNA virus 1 and Alternaria tenuissima negative-stranded RNA virus 1 each appear to represent a novel species; Botrytis cinerea mymonavirus 1 and Sclerotinia sclerotiorum negative-stranded RNA virus 7 together represent a novel species; and TSA1, -2 and -3 from Clarireedia homoeocarpa may also represent a novel species (Table 1).

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21345-9


Fig. 1 Phylogenetic analysis of members and tentative members of the family Mymonaviridae and other related families by the Maximum Likelihood method. Viruses for which the complete RNA replicase sequence is known were selected to construct this phylogenetic tree. In total, 83 RNA replicase sequences from viruses in the order Mononegavirales were analyzed. The phylogenetic analyses were conducted in MEGA7 (Kumar, S., Stecher, G., Tamura, K., 2016. Molecular Biology and Evolution 33, 1870–1874). Bootstrap values from 1000 replicates are indicated at the nodes. Virus names and accession numbers are shown on the right at the corresponding positions.

Table 1 List of members and tentative members in the family Mymonaviridae

| Species name | Virus name | Available sequence (nt) | Accession number | Host |
|---|---|---|---|---|
| Sclerotinia sclerotimonavirus | Sclerotinia sclerotiorum negative-stranded RNA virus 1 | 10002 | NC_025383 | Fungi, Sclerotinia sclerotiorum |
| | Sclerotinia sclerotiorum negative-stranded RNA virus 1-A | 7735 | MF444277 | Fungi, S. sclerotiorum |
| | Sclerotinia sclerotiorum negative-stranded RNA virus 3-A | 9919 | MF444280 | Fungi, S. sclerotiorum |
| | Sclerotinia sclerotiorum negative-stranded RNA virus 3-USA | 10009 | NC_026732.1 | Fungi, S. sclerotiorum |
| Hubei sclerotimonavirus | Hubei rhabdo-like virus 4 | 10003 | NC_032783.1 | Animal, Arthropod |
| Drop sclerotimonavirus | Sclerotinia sclerotiorum negative-stranded RNA virus 4-A | 9564 | MF444282.1 | Fungi, S. sclerotiorum |
| | Sclerotinia sclerotiorum negative-stranded RNA virus 4 | 9707 | NC_043483.1 | Fungi, S. sclerotiorum |
| | Sclerotinia sclerotiorum negative-stranded RNA virus 2-A | 9450 | MF444279.1 | Fungi, S. sclerotiorum |
| Dadou sclerotimonavirus | Soybean leaf-associated negative-stranded RNA virus 3 | 6218 | KT598228.1 | Plant or associated fungi, soybean or soybean leaf microbe |
| Illinois sclerotimonavirus | Soybean leaf-associated negative-stranded RNA virus 2 | 7321 | KT598227 | Plant or associated fungi, soybean or soybean leaf microbe |
| Glycine sclerotimonavirus | Soybean leaf-associated negative-stranded RNA virus 1 | 9041 | KT598225 | Plant or associated fungi, soybean or soybean leaf microbe |
| | Fusarium graminearum negative-stranded RNA virus 1 | 9072 | MF276904.1 | Fungi, Fusarium graminearum |
| Phyllosphere sclerotimonavirus | Soybean leaf-associated negative-stranded RNA virus 4 | 5317 | KT598229 | Plant or associated fungi, soybean or soybean leaf microbe |
| Tentative members | Penicillium adametzioides negative-stranded RNA virus 1 | 6713 | MK584858 | Fungi, Penicillium adametzioides |
| | Alternaria tenuissima negative-stranded RNA virus 1 | 8904 | MK584852 | Fungi, Alternaria tenuissima |
| | Narnavirus sp. [a] | 9843 | QDH88671.1 | Grassland soil |
| | Lentinula edodes negative-strand RNA virus 1 | 11563 | BBI93117.1 | Fungi, Lentinula edodes |
| | Penicillium cairnsense negative-stranded RNA virus 1 | 8759 | MK584851 | Fungi, Penicillium cairnsense |
| | Narnavirus sp. [a] | 9885 | MN035745 | Grassland soil |
| | Botrytis cinerea mymonavirus 1 | 7863 | MH648611.1 | Fungi, Botrytis cinerea |
| | ShTAS3 [b] | 8062 | JW828891.1 | Fungi, Clarireedia homoeocarpa |
| | ShTAS2 [b] | 8645 | JW826636.1 | Fungi, Clarireedia homoeocarpa |
| | ShTAS1 [b] | 9775 | JU091017 | Fungi, Clarireedia homoeocarpa |
| | Sclerotinia sclerotiorum negative-stranded RNA virus 7 | 7819 | MF444285 | Fungi, Sclerotinia sclerotiorum |

[a] This virus is most likely incorrectly named: narnaviruses belong to the family Narnaviridae and have +ssRNA genomes of <3.0 kb.
[b] ShTAS1, -2 and -3 are not really virus names, but names for transcriptome shotgun assemblies from Sclerotinia homoeocarpa (renamed Clarireedia homoeocarpa).

Virion

So far, virion morphology has been defined for SsNSRV-1 and Fusarium graminearum negative-stranded RNA virus 1 (FgNSRV-1), whereas the virions of the other viruses are still unknown. The virions of SsNSRV-1 and FgNSRV-1 are morphologically similar. SsNSRV-1 virions are filamentous, 25–50 nm in diameter and about 1000 nm in length, while FgNSRV-1 particles are 35–50 nm in diameter and about 1200 nm in length. The virions have envelope-like structures (EVLS) and resemble the virions of viruses in the family Filoviridae, but lack spikes on the surface. When a virion is broken, nucleocapsid complexes (RNPs) are released. RNPs are filamentous, branched, left-handed, helical structures with tight or loose coils. RNPs of SsNSRV-1 have a diameter of 20–22 nm and a length of 400–2000 nm, while the size of RNPs is unknown for FgNSRV-1. The nucleocapsid consists of polymerized nucleoprotein (N) monomers (Fig. 2).

The Genomic Structure

Because the full-length genome sequences of some viruses are not available, and novel viruses with genomes larger or smaller than those of the fully sequenced mymonaviruses are frequently discovered, the size range of the viral genome cannot be accurately determined.


Fig. 2 Morphology and structure of virions of Sclerotinia sclerotiorum negative-stranded RNA virus 1 (SsNSRV-1). (A) Filamentous, possibly enveloped virions (marked by arrowheads), and nucleoprotein-RNA complexes (RNPs). (B) Purified tight (left) or loose (right) coils of RNPs. (C) Rings that make up the coils and nucleoprotein (NP) monomers (marked by arrowhead). Virions and RNPs were purified from mycelia of strain AH98 and virus-transfected strain Ep-1PNA367-PT2 of Sclerotinia sclerotiorum and negatively stained with 2% PTA [(wt/vol), pH 7.4]. Reproduced from Liu, L, Xie, J, Cheng, J, et al., 2014. Fungal negative-stranded RNA virus that is related to bornaviruses and nyaviruses. Proceedings of the National Academy of Sciences of the United States of America 111, 12205–12210.

Among the currently known viruses in the family Mymonaviridae, the genome size appears to range between 9000 and 11,600 nt, and the largest genome is that of Lentinula edodes negative-strand RNA virus 1. The genome carries 5–7 open reading frames (ORFs); SsNSRV-1 has six (Fig. 3(A)). There are conserved non-coding regions between genes. For example, in the SsNSRV-1 genome, the conserved sequence of the non-coding regions is 3′-(A/U)(U/A/C)UAUU(U/A)AA(U/G)AAAACUUAGG(A/U)(G/U)-5′ (Fig. 3(B)). The N protein is encoded by ORF II, and the L protein (RNA replicase) is encoded by ORF V. There is usually an ORF (ORF VI) following ORF V. The small ORF IV and ORF VI are undetectable in some mymonaviruses. The gene arrangement of mymonaviruses differs from that of other mononegaviruses. The L protein, the most conserved protein, shows sequence similarity among different members of the genus, whereas the other proteins show no significant similarity among members. The functions of the N and L proteins have been determined, while the functions of the other proteins are unknown. In fact, only the transcriptional pattern of the SsNSRV-1 genome is known: the ORFs can be transcribed independently, although ORF V and ORF VI are transcribed together (Fig. 3(C)). Viruses of this family occasionally produce defective RNA molecules. For example, in S. sclerotiorum, there are defective RNAs in which both the 3′-terminus (including ORF I and ORF II, and even ORF III) and the 5′-terminus are deleted (Fig. 3(D)). It is likely that a similar phenomenon occurs in other mymonaviruses and that the "defective" RNAs replicate in fungi. The fact that Botrytis cinerea mymonavirus 1 has only three ORFs supports this notion. This observation may suggest that certain shorter versions of mymonavirus genomes, resulting from deletions of dispensable genes, are replication competent.
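The conserved gene-junction sequence quoted above can be turned into a simple pattern search. The following is a minimal sketch, not the method used by the authors: it assumes the genome is supplied as a 5′-to-3′ string, converts each "(A/U)"-style alternative into a regular-expression character class, and scans the sequence in the 3′-to-5′ direction in which the consensus is written. The function names and the toy sequence are hypothetical and serve only for illustration.

```python
import re

# Consensus as printed in the text, read 3'->5' on the genomic RNA.
CONSENSUS_3TO5 = "(A/U)(U/A/C)UAUU(U/A)AA(U/G)AAAACUUAGG(A/U)(G/U)"


def consensus_to_regex(consensus: str) -> str:
    """Convert '(A/U)'-style alternatives into regex character classes."""
    return re.sub(
        r"\(([ACGU/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )


def find_junctions(genome_5to3: str) -> list:
    """Return 0-based match positions, counted from the 3' end of the genome."""
    # Normalise to RNA letters and reverse (no complement) to read 3'->5'.
    seq_3to5 = genome_5to3.upper().replace("T", "U")[::-1]
    pattern = re.compile(consensus_to_regex(CONSENSUS_3TO5))
    return [m.start() for m in pattern.finditer(seq_3to5)]


# Toy example (hypothetical sequence, not real SsNSRV-1 data).
toy_3to5 = "CCC" + "AUUAUUUAAUAAAACUUAGGAG" + "GGG"
print(consensus_to_regex(CONSENSUS_3TO5))
print(find_junctions(toy_3to5[::-1]))
```

Because the search uses exact character classes, it reports only perfect matches to the consensus; junctions that deviate at one or more positions would need a more tolerant scoring scheme.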

The Impact on the Host

The impact of mymonaviruses on their hosts is very difficult to determine. SsNSRV-1 was successfully introduced into a virus-free S. sclerotiorum strain by protoplast transfection. The pathogenicity of the virus-infected strain declined significantly compared with that of the virus-free strain. Botrytis cinerea mymonavirus 1 was also found to have a significant inhibitory effect on the pathogenicity of its host B. cinerea; in addition, Lentinula edodes negative-strand RNA virus 1 may affect the formation and development of the fruiting bodies of its host, the shiitake mushroom (Lentinula edodes). FgNSRV-1 had no significant effects on the growth, pathogenicity, sporulation or toxin synthesis of its host F. graminearum. The impact of other members of the family Mymonaviridae has not been reported.


Fig. 3 Genome organization and characteristics of Sclerotinia sclerotiorum negative-stranded RNA virus 1 (SsNSRV-1). (A) Genome length and organization. Boxes indicate the position and length of each ORF, which are labeled with Roman numerals, except for two ORFs, N (II) and L (V), which encode the nucleoprotein (N) and RNA-dependent RNA polymerase (L). A conserved domain in L is shown. (B) Alignment of the putative gene-junction sequences between ORFs in a 3′-to-5′ orientation. Conserved sequences [3′-(A/U)(U/A/C)UAUU(U/A)AA(U/G)AAAACUUAGG(A/U)(G/U)-5′] are highlighted in blue. Different shades indicate levels of conservation, with the darkest color indicating the highest conservation. (C) Deduced transcription map based on 5′- and 3′-RACE. (D) Schematic diagrams of four defective genomic RNAs of SsNSRV-1; dotted lines represent deleted regions of the genome. Reproduced from Liu, L, Xie, J, Cheng, J, et al., 2014. Fungal negative-stranded RNA virus that is related to bornaviruses and nyaviruses. Proceedings of the National Academy of Sciences of the United States of America 111, 12205–12210.

Host and Distribution

The hosts of viruses in the family Mymonaviridae are mostly fungi, but two viruses (tentatively called Narnavirus sp.-QDH90729.1 and Narnavirus sp.-QDH88671.1) were identified from grassland soil, and Hubei rhabdo-like virus 4 was discovered in an arthropod or arthropod-associated organism. Interestingly, Narnavirus sp.-QDH88671.1 belongs to the same species as Hubei rhabdo-like virus 4, suggesting that they may share the same hosts. Three viruses were identified from the phyllosphere. Some viruses from the phyllosphere are considered to be closely related to viruses from pathogenic fungi that grow on the same plants; hence, those viruses likely have fungal hosts. Viral strains of the species Sclerotinia sclerotimonavirus, such as SsNSRV-1, are known to be widely distributed in China and have also been detected in both the United States and Australia. SsNSRV-4 (species Drop sclerotimonavirus) has been detected in S. sclerotiorum in the United States and Australia. Botrytis cinerea mymonavirus 1 is distributed in 11 provinces of China and is also found in S. sclerotiorum. Hence, viruses in the family Mymonaviridae are widely spread around the world.

Further Reading

Amarasinghe, G.K., Ayllón, M.A., Bào, Y., et al., 2019. Taxonomy of the order Mononegavirales: Update 2019. Archives of Virology 164, 1967–1980.
Hao, F., Wu, M., Li, G., 2018. Molecular characterization and geographic distribution of a mymonavirus in the population of Botrytis cinerea. Viruses 10, 432.
Kondo, H., Chiba, S., Toyoda, K., Suzuki, N., 2013. Evidence for negative-strand RNA virus infection in fungi. Virology 435, 201–209.
Lin, Y.H., Fujita, M., Chiba, S., et al., 2019. Two novel fungal negative-strand RNA viruses related to mymonaviruses and phenuiviruses in the shiitake mushroom (Lentinula edodes). Virology 533, 125–136.
Liu, L., Xie, J., Cheng, J., et al., 2014. Fungal negative-stranded RNA virus that is related to bornaviruses and nyaviruses. Proceedings of the National Academy of Sciences of the United States of America 111, 12205–12210.
Marzano, S.L., Domier, L.L., 2016. Novel mycoviruses discovered from metatranscriptomics survey of soybean phyllosphere phytobiomes. Virus Research 213, 332–342.
Marzano, S.L., Nelson, B.D., Ajayi-Oyetunde, O., et al., 2016. Identification of diverse mycoviruses through metatranscriptomics characterization of the viromes of five major fungal plant pathogens. Journal of Virology 90, 6846–6863.
Mu, F., Xie, J., Cheng, S., et al., 2017. Virome characterization of a collection of Sclerotinia sclerotiorum from Australia. Frontiers in Microbiology 8, 2540.
Nerva, L., Turina, M., Zanzotto, A., et al., 2019. Isolation, molecular characterization and virome analysis of culturable wood fungal endophytes in esca symptomatic and asymptomatic grapevine plants. Environmental Microbiology 21, 2886–2904.
Shi, M., Lin, X., Tian, J., et al., 2017. Redefining the invertebrate RNA virosphere. Nature 540, 539–543.
Wang, L., He, H., Wang, S., et al., 2018. Evidence for a novel negative-stranded RNA mycovirus isolated from the plant pathogenic fungus Fusarium graminearum. Virology 518, 232–240.

Narnaviruses (Narnaviridae)
Rosa Esteban and Tsutomu Fujimura, Institute of Biology and Functional Genomics, CSIC/University of Salamanca, Salamanca, Spain
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Ribozyme RNA with a catalytic activity.

Introduction

The narnaviruses 20S RNA and 23S RNA (ScV20S and ScV23S, respectively) are positive strand RNA viruses found in the yeast Saccharomyces cerevisiae. These two viruses are assigned to the genus Narnavirus of the family Narnaviridae. Massive RNA sequencing has begun to detect many narnavirus-like sequences in plant and insect sources. Like other fungal viruses, 20S and 23S RNAs have no extracellular transmission pathway. They are transmitted horizontally by mating, or vertically from mother to daughter cells. It is believed that the high frequency of mating or hyphal fusion that occurs in the host life cycle makes an extracellular route of transmission dispensable for the viruses. The thick cell wall of fungi may also form a formidable barrier.

The lack of extracellular transmission may, in turn, explain two prominent features of narnaviruses. First, they are persistent viruses and do not kill the host cells. If their infection caused damage or disadvantage to the host, the viruses might have perished during the course of evolution because they lack an escape route. Second, because there is no extracellular phase, the viruses do not need to form virions to protect their RNA genomes in the extracellular environment. In addition, they do not need machinery to ensure exit from, or reentry into, a host. The lack of a virion structure may sound peculiar to those who are familiar with infectious viruses. Considering that viruses are selfish parasites, however, it is natural for them to shed genes or functions unnecessary for their existence. Consequently, the genomes of narnaviruses are simple and small: they encode only a single protein, the RNA-dependent RNA polymerase (RdRp). This may contribute to their persistence by reducing the number of viral proteins that might interfere with metabolism vital for the host. The simplicity of their RNA genomes, each encoding a single protein, together with the development of 20S and 23S RNA virus launching systems from yeast expression vectors, makes narnaviruses a good model system to investigate replication and the molecular basis for intracellular persistence of RNA viruses.

Historical Background

20S RNA was first described in 1971 as a single-stranded RNA (ssRNA) species that accumulated in yeast cells transferred to 1% potassium acetate, a standard procedure to induce sporulation in yeast under nitrogen starvation conditions. Because of its mobility relative to 25S and 18S rRNAs, the species was named 20S RNA. Later it was found, however, that the accumulation of 20S RNA was not related to the sporulation process, because haploid cells, which do not sporulate, also accumulate 20S RNA under nitrogen starvation conditions. It was also found that 20S RNA is a cytoplasmic genetic element. The recognition of 20S RNA as a viral entity, however, had to wait until its characterization by cloning and sequencing in 1991. 23S RNA was first reported in 1992. Both viruses were recognized as members of a new family, Narnaviridae (naked RNA virus), a taxonomic group that first appeared in the 2000 edition of the ICTV report. The other genus in the family is Mitovirus, whose members are found in the mitochondria of fungi, many of which are pathogenic to plants. All members of the family have small RNA genomes (2–3 kilobases) that encode single proteins, their RNA-dependent RNA polymerases, and reside either in the cytoplasm (narnaviruses) or in the mitochondria (mitoviruses) of the host.

doi:10.1016/B978-0-12-809633-8.20941-2

Fig. 1 RNA extracted from nitrogen-starved yeast cells carrying no virus (lane 1), 20S RNA virus alone (lane 2), 23S RNA virus alone (lane 4) or both viruses together (lane 3) was separated in an agarose gel and visualized by ethidium bromide staining. 20S and 23S RNAs, together with the rRNAs, are indicated.

Fig. 2 Genomic organization of 20S and 23S RNA viruses (A) and their launching plasmids (B). (A) Diagrams of 20S and 23S RNAs and the proteins encoded by them, p91 and p104, respectively. A–D represent motifs conserved among RdRps from positive strand and dsRNA viruses, and 1–3 indicate amino acid stretches conserved between p91 and p104. (B) The complete cDNA of the 20S or 23S RNA genome is inserted downstream of the constitutive PGK1 promoter in a yeast expression vector in such a way that positive strands are transcribed from the promoter. The HDV ribozyme is adjoined directly to the 3′ end of the viral genome.

Viral Genomes

Many laboratory strains of S. cerevisiae harbor 20S RNA virus and fewer strains contain 23S RNA virus. Both viruses are compatible in the same host. The presence of 20S and 23S RNA viruses does not confer phenotypic changes on the host. Under nitrogen starvation conditions, the amounts of the viral genomes become almost equivalent to those of rRNAs (>100,000 copies/cell; Fig. 1). In contrast, vegetatively growing cells contain much lower amounts of the viral RNAs (5–20 copies/cell). Fig. 2(A) shows the genome organization of narnaviruses. Both 20S and 23S RNAs are small (2514 and 2891 nt, respectively) and each genome encodes a single protein: a 91 kDa protein (p91) in the case of 20S RNA and a 104 kDa protein (p104) in the case of 23S RNA. The 5′ untranslated regions in both RNAs are extremely short: 12 nt in 20S RNA and only 6 nt in 23S RNA. These RNAs lack poly(A) tails at the 3′ ends. The same RNA can serve as a template both for translation and for negative-strand synthesis. The antigenomic (or negative strand) RNAs have no coding capacity for protein and are present at much lower copy numbers (a few per cent) than the genomic (or positive strand) RNAs under the induction conditions. The double-stranded forms of 20S and 23S RNAs are known and are called W and T, respectively. These double-stranded RNAs (dsRNAs) accumulate when the cells are grown at 37°C, a rather high temperature for yeast (the optimal temperature for growth is about 28°C). These dsRNAs are not intermediates of replication but byproducts. Replication proceeds from a positive strand to a negative strand and then to a positive strand.

The proteins encoded in the viral genomes are not processed to produce smaller fragments with distinct functions. Both proteins contain amino acid motifs well conserved among RdRps from positive strand and double-stranded RNA viruses (Fig. 2(A)). Their RdRp consensus motifs are most closely related to those of RNA bacteriophages such as Qβ. In addition, p91 and p104 share stretches of amino acid sequences (denoted by 1–3 in Fig. 2(A)) in the same order throughout the molecules, indicating a close evolutionary relationship between these two viruses.
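As a rough consistency check on the figures quoted above, the genome lengths, 5′ untranslated region lengths and protein masses can be compared directly. The sketch below is illustrative only; it assumes an average mass of about 110 Da per amino acid residue, a common rule of thumb that is not taken from the source, and the function names are hypothetical.

```python
# Assumption (not from the text): average mass of one amino acid residue.
AVG_RESIDUE_MASS_DA = 110.0

# (genome length in nt, 5' UTR length in nt, RdRp mass in kDa), as quoted in the text.
NARNAVIRUSES = {
    "20S RNA": (2514, 12, 91.0),
    "23S RNA": (2891, 6, 104.0),
}


def max_codons(genome_nt: int, utr5_nt: int) -> int:
    """Upper bound on codons available downstream of the 5' UTR."""
    return (genome_nt - utr5_nt) // 3


def approx_residues(mass_kda: float) -> float:
    """Approximate protein length estimated from its molecular mass."""
    return mass_kda * 1000.0 / AVG_RESIDUE_MASS_DA


def coding_fraction(genome_nt: int, mass_kda: float) -> float:
    """Approximate fraction of the genome occupied by the RdRp coding region."""
    return 3.0 * approx_residues(mass_kda) / genome_nt


for name, (genome_nt, utr5_nt, mass_kda) in NARNAVIRUSES.items():
    print(
        f"{name}: at most {max_codons(genome_nt, utr5_nt)} codons, "
        f"about {approx_residues(mass_kda):.0f} residues, "
        f"coding fraction about {coding_fraction(genome_nt, mass_kda):.2f}"
    )
```

Under this assumption, the estimated protein lengths fit within the available codons and the single RdRp gene accounts for nearly the whole of each genome, consistent with the description of narnavirus genomes as encoding one protein with very short untranslated regions.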

Ribonucleoprotein Complexes as a Viral Entity

Yeast is also a natural host for dsRNA totiviruses, called ScV-L-A and ScV-L-BC. Like narnaviruses, totiviruses have no extracellular transmission pathway. However, these viruses have gag and pol genes, and their dsRNA genomes are encapsidated into intracellular viral particles. In contrast, 20S and 23S RNA viruses have no capsid genes to form virion structures. Then, how do these viruses exist inside the cell and establish a persistent infection without a protective coat? Earlier studies demonstrated that 20S RNA migrated as “naked RNA” in sucrose gradients. Furthermore, deproteination with phenol had no apparent effect on its mobility. Because protein provides a large part of the molecular mass in virions, these data clearly indicate that narnaviruses are capsidless. When specific antibodies against their RdRps became available, however, it was realized that each RNA genome is associated with its RdRp and this interaction is specific; that is, p91 is associated only with 20S RNA and p104 only with 23S RNA. These ribonucleoprotein complexes reside in the cytoplasm and are not associated with the nucleus, mitochondria, or intracellular membranous structures. Further studies indicated that most of the positive strands of 20S and 23S RNA viruses under induction conditions are associated with their own RdRps in a 1:1 stoichiometry. The complexes lack host proteins and can be formed even in E. coli if the RNAs and their RdRps are expressed from vectors. These complexes are called “resting complexes” to distinguish them from the “replication complexes” described in the following section. Negative strands are present at much lower amounts compared to positive strands. Available data indicate that they also form complexes with their own RdRps. These findings suggest that the formation of ribonucleoprotein complexes between the viral RNA and its RdRp is important for the life of 20S and 23S RNA viruses.

Replication Intermediates Lysates prepared from virus-induced cells have an RNA-dependent RNA polymerase activity. The activity is insensitive to actinomycin D or α-amanitin and is thus independent of a DNA template. The majority of in vitro products are positive strands of 20S RNA. Synthesis of negative strands accounts for a few percent compared to that of positive strands, thus reflecting the positive/negative strand ratio in the lysates. There is no, or very little, de novo synthesis in vitro. Therefore, radioactive nucleotides are unevenly distributed into 20S RNA positive strand products with more incorporation into the 3′ end region. Replication complexes that synthesize 20S RNA positive strands have a single-stranded RNA backbone and migrate in native agarose gels as a broad band corresponding to ssRNA ranging in size from 2.5 to 5 kb. These complexes consist of a full-length negative strand template (2.5 kb) and a nascent positive strand of less than unit-length, probably held together by the polymerase machinery. Deproteination with phenol converts them to double-stranded RNA. Therefore, W dsRNA is not a replication intermediate but a byproduct. It is likely that growth at high temperature (37°C) destabilizes replication complexes, thus resulting in the accumulation of double-stranded RNA. Upon completion of RNA synthesis in vitro, the positive strand products as well as the negative strand templates are released from replication complexes. It is likely that, in the cell, the released negative strands are immediately recruited to another round of positive strand synthesis, because the majority of negative strands in lysates are present in replication complexes engaging in the synthesis of positive strands. Interestingly, both positive and negative strands released from replication complexes are associated with protein. Because replication complexes contain at least one p91 molecule per complex, p91 is a good candidate for the protein.
Although resting complexes can be formed in E. coli, lysates prepared from these cells showed no viral RNA polymerase activity. It is likely that formation of replication intermediates requires a host factor(s).

Generation of Narnaviruses in Vivo As mentioned earlier, the presence of narnaviruses does not render phenotypic changes to the host. This has hindered studies on replication or virus/host interactions using yeast genetics. This obstacle has been overcome by developments in generating 20S and 23S RNA viruses in vivo from a yeast expression vector (Fig. 2(B)). In either case, the complete viral cDNA was inserted in the vector downstream of a constitutive promoter in such a way that positive strands can be transcribed from the promoter. The 3′ end of the viral sequence was directly fused to the hepatitis delta virus (HDV) antigenomic ribozyme. Therefore, intramolecular cleavage by the ribozyme will create transcripts in vivo having the 3′ termini identical to the viral 3′ end. The efficiency of virus launching is high. Twenty to 70% of the cells transformed with the vector generated the virus. The primary transcripts expressed from the vectors have non-viral sequences (about 40 nt) at the 5′ ends. The generated viruses, however, possessed the authentic viral 5′ ends without the extra sequences. It is likely that the 5′ non-viral extension was eliminated by a 5′ exonuclease. Using these launching systems, it has been demonstrated that each RdRp is essential and specific for replication of its own viral RNA. Because negative strands cannot be decoded to the RdRps, vectors in which the viral cDNAs were reversed failed to generate the virus. These negative strand-expressing vectors, however, successfully generated narnaviruses, if active polymerases were provided in trans from a second vector. Therefore, both 20S and 23S RNA viruses can be generated from either positive or negative strands expressed from a vector.

Cis-Acting Signals for Replication 20S and 23S RNA genomes share the same 5 nt inverted repeats at the 5′ and 3′ termini (5′-GGGGC…GCCCC-OH). Extensive analysis was done modifying each nucleotide at the 3′ ends. It was found that the 3rd and 4th Cs from the 3′ termini are essential for replication in both viruses. While the 3′ terminal and penultimate Cs can be eliminated or changed to other nucleotides


Fig. 3 Comparison of the 3′ terminal secondary structures in the positive (+) and negative (−) strands of 20S and 23S RNA viruses, with the top half domain of tRNATyr. The non-templated A residues at the viral 3′ termini are placed in parenthesis. The consecutive four Cs essential for replication are boxed (green). A second cis-signal (the mismatched pair of purines) present in the positive strand of 23S RNA virus is circled (green). Y, R, and N stand for pyrimidine, purine, and any base, respectively.

without affecting virus generation, the generated viruses recovered the wild type Cs at the termini. Therefore, the consecutive four Cs at the 3′ terminus are essential for these viruses (Fig. 3). In contrast, the G immediately upstream of the four Cs at the 3′ end is dispensable for replication in both viruses. 23S RNA virus requires an additional 3′ cis signal for replication. The stem-loop structure proximal to the 3′ end contains a mismatched pair of purines in the stem (Fig. 3). This mismatched pair is essential for replication but the virus tolerates any combination of purines at the pair. On the other hand, changing the purines to pyrimidines or eliminating one of the purines at the mismatched pair blocked virus generation. The distance between the mismatched pair and the 3′ terminal four Cs and/or their spatial configuration appears to be critical, because shortening or increasing the length of the stem between the two sites by more than one base pair abolished virus launching. It is not known whether 20S RNA virus has a similar cis signal in the stem-loop structure proximal to the 3′ end. The first G upstream of the four Cs at the 3′ end is located at the bottom of the stem structure. This G, as mentioned earlier, can be changed to another nucleotide without impairing replication, as long as the modified nucleotide is hydrogen-bonded at the bottom of the stem. The negative strands of 20S and 23S RNA viruses also possess four consecutive Cs at the 3′ ends. Using the two-vector system mentioned above, it has been found that the 3rd and 4th Cs from the 3′ end are essential for replication. Similar to the positive strands, the 3′ terminal and penultimate Cs can be eliminated or changed to other nucleotides without affecting virus generation and the generated viruses recovered the wild type four Cs at the 3′ ends. Therefore, the consecutive four Cs at the 3′ end of the negative strand are again a cis signal for replication.
The 5′ ends of viral positive strands have not been analyzed extensively. Elimination of the 5′ terminal G or changing it to another nucleotide had no effect on virus generation, and the generated viruses recovered this G at the 5′ ends.
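The terminal features described above (the 5 nt inverted repeats 5′-GGGGC…GCCCC-OH and the run of four Cs at the 3′ end) can be checked on a sequence with a few lines of code. The following is a minimal sketch only: the function names are hypothetical, the internal stretch of the toy sequence is an arbitrary placeholder rather than viral sequence, and the non-templated 3′ A found on some molecules is not handled.

```python
def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna.upper()))


def has_inverted_terminal_repeat(rna: str, length: int = 5) -> bool:
    """True if the first `length` nt equal the reverse complement of the last `length` nt."""
    rna = rna.upper()
    if len(rna) < 2 * length:
        return False
    return rna[:length] == reverse_complement(rna[-length:])


def three_prime_c_run(rna: str) -> int:
    """Count the consecutive Cs at the 3' terminus."""
    rna = rna.upper()
    return len(rna) - len(rna.rstrip("C"))


# Only the termini (GGGGC ... GCCCC) are taken from the text;
# the middle part is a made-up placeholder.
toy_genome = "GGGGC" + "AUAUAUAU" + "GCCCC"
print(has_inverted_terminal_repeat(toy_genome))
print(three_prime_c_run(toy_genome))
```

A check of this kind only confirms the presence of the terminal motifs; it says nothing about the stem-loop context that the mutational analysis shows to be important.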


Cis-Signals for Formation of Ribonucleoprotein Complexes Narnaviruses, as mentioned earlier, exist as ribonucleoprotein complexes in the host cytoplasm. In the absence of the HDV ribozyme, RNA transcribed from the launching vectors failed to generate viruses because of the presence of non-viral extensions at the 3′ end. The transcripts, however, can be decoded to viral polymerases and the polymerases can form complexes in vivo with the transcripts, thus providing an assay system to analyze cis-signals for formation of ribonucleoprotein complexes. By immunoprecipitation with antiserum specific to p104, it has been found that the bipartite 3′ cis-signal for replication (more specifically, the mismatched pair of purines and the 3rd and 4th Cs from the 3′ end) is essential for 23S RNA positive strand to form complexes with the polymerase. In the case of 20S RNA virus, a similar in vivo assay indicates that the 3′ cis signal for replication (in particular the 3rd and 4th Cs from the 3′ end) is also important for formation of ribonucleoprotein complexes with p91. When isolated resting complexes of 20S RNA virus were analyzed in vitro, however, it was found that p91 physically interacts with 20S RNA at three different sites, not only at the 3′ end but also at the 5′ terminal and central regions of the molecule. The 5′ binding site is located at the second stem-loop structure from the 5′ terminus and is essential for replication. The central site interacts less strongly with p91 than the other two sites, and remains to be defined further. Computer prediction indicates that the 5′ and 3′ termini of 20S RNA (and 23S RNA) are brought into close proximity by a long-distance RNA/RNA interaction (Fig. 4). This may allow a single molecule of p91 to interact simultaneously with both ends of 20S RNA genome in a resting complex.

Narnavirus Persistence in the Host mRNA degradation in yeast, as in other eukaryotes, is initiated by shortening the 3′ poly(A) tail followed by decapping at the 5′ end. Then the decapped mRNA is degraded by the potent Xrn1p/Ski1p 5′ exonuclease as well as by a 3′ exonuclease complex called the exosome. The RNA genomes of narnaviruses, as mentioned earlier, have no 3′ poly(A) tails, thus resembling intermediates of mRNA degradation. This suggests that these RNA genomes are vulnerable to the exonucleases involved in mRNA degradation. In fact, the copy numbers of 20S and 23S RNAs increase several-fold in strains having mutations in SKI genes such as SKI2, SKI6, and SKI8. These mutations were originally identified by their failure to lower the copy numbers of L-A dsRNA totivirus and its satellite RNA M. It is known that the SKI2, SKI6, and SKI8 gene products are components or modulators of the exosome. These

Fig. 4 Secondary structures at the 5′ and 3′ end regions of 23S and 20S RNA positive strands, as predicted by the MFOLD program. The AUG initiation codons (green) and the stop codons (red) for p104 and p91 are boxed. Stem-loop-II where the 5′ binding site for p91 is located is shown. Note that about 150 nt from the ends in each viral genome there are inverted repeats of 8–12 nt long that bring both 5′ and 3′ ends to a close proximity.


observations suggest that the 3′ end of the viral genome is constantly nibbled by 3′ exonucleases. Therefore, one of the reasons for narnaviruses to form ribonucleoprotein complexes may be to protect their 3′ ends from exonuclease cleavage. The fact that the 3rd and 4th Cs from the 3′ end are important to form complexes in both viruses fits this hypothesis because binding of the RdRp to these nucleotides would block progression of the exonuclease and protect the internal region. As described earlier, mutations introduced at the terminal and penultimate positions at the 3′ end had no deleterious effects on virus launching and the generated viruses recovered the wild type sequences. This suggests that the terminal and penultimate positions at the 3′ ends are not only vulnerable to cleavages but also accessible to the repair machinery. The 3′ ends of these viruses may undergo constant turnover at these positions. It is not known whether the 3′ end repair is carried out by the replicase machinery during the replication process, or by host enzymes. The following evidence may favor the latter case. The 3′ terminal structures of 20S and 23S RNAs resemble a half of tRNA, the so-called “top-half” domain, consisting of the acceptor stem and T stem (Fig. 3). The domain provides the determinants necessary for specific interactions with tRNA-related enzymes such as the tRNA nucleotidyltransferase (CCA-adding enzyme). This raises the possibility that 20S and 23S RNAs nibbled at the 3′ ends by 3′ exonucleases are repaired to the wild type sequences by the CCA-adding enzyme. The fact that 15%–30% of both positive and negative strands of 20S and 23S RNAs possess an unpaired A at the 3′ ends supports this possibility. Furthermore, that the 3′ repair is confined to the terminal and penultimate positions is consistent with the catalytic activity expected for the CCA-adding enzyme. Concerning the 5′ end, the first nucleotides in both 20S and 23S RNAs are four consecutive Gs (Fig. 4). 
It is known that oligo G tracts inhibit progression of the Xrn1/Ski1 5′ exonuclease. Furthermore, these consecutive Gs are buried at the bottom of a long stem structure in both viruses. These features suggest that 20S and 23S RNAs by themselves are resistant to the 5′ exonuclease. This was confirmed by modifying the 5′ terminal stem structure of 20S RNA. Destabilizing the stem structure severely affected 20S RNA virus generation from a vector in wild type cells, but the same vector successfully generated the virus in SKI1-deleted cells. Therefore, the 5′ terminal stem structure is critical for the virus to evade SKI1 surveillance. The initiation codon of p91 is located in the middle of the 5′ terminal stem structure. If p91 bound to this stem in the ribonucleoprotein complex, then such a stable binding would interfere with translation of new p91 molecules from the RNA. In this context, it may make sense that the 5′ binding site of p91 in the complexes is located at the second stem-loop structure from the 5′ end. By binding simultaneously to this site and also to the 3′ end of the same RNA molecule, p91 may stabilize the long-distance RNA-RNA interactions that bring the 5′ and 3′ ends of the RNA into proximity, and help the RNA to form an organized structure by further interacting with the central binding site. Furthermore, because the host cytoplasm is filled with a great variety of RNA molecules, formation of complexes between the viral polymerase and its template RNA may also facilitate its replication by discriminating against non-viral RNAs as templates.

Further Reading Buck, K.W., Esteban, R., Hillman, B.I., 2005. Narnaviridae. In virus taxonomy, VIIIth report of the ICTV. Fauquet, C.M., Mayo, M.A., Maniloff, J., Desselberger, U., Ball, L.A. (Eds.), London: Elsevier/Academic Press, pp. 751–756. Esteban, L.M., Rodríguez-Cousiño, N., Esteban, R., 1992. T double-stranded (dsRNA) sequence reveals that T and W dsRNAs form a new RNA family in Saccharomyces cerevisiae: Identification of 23S RNA as the single-stranded form of T dsRNA. Journal of Biological Chemistry 267, 10874–10881. Esteban, L.M., Fujimura, T., García-Cuéllar, M.P., Esteban, R., 1994. Association of yeast viral 23 S RNA with its putative RNA-dependent RNA polymerase. Journal of Biological Chemistry 269, 29771–29777. Esteban, R., Fujimura, T., 2003. Launching the yeast 23S RNA narnavirus shows 5′ and 3′ cis-acting signals for replication. Proceedings of the National Academy of Sciences of the United States of America 100, 2568–2573. Esteban, R., Vega, L., Fujimura, T., 2005. Launching of the yeast 20S RNA Narnavirus by expressing the genomic or anti-genomic viral RNA in vivo. Journal of Biological Chemistry 280, 33725–33734. Esteban, R., Vega, L., Fujimura, T., 2008. 20S RNA narnavirus defies the antiviral activity of SKI1/XRN1 in Saccharomyces cerevisiae. Journal of Biological Chemistry 283, 25812–25820. Fujimura, T., Esteban, R., 2004. Bipartite 3′ cis-acting signal for replication in Yeast 23 S RNA virus and its repair. Journal of Biological Chemistry 279, 13215–13223. Fujimura, T., Solórzano, A., Esteban, R., 2005. Native replication intermediates of the yeast 20S RNA virus have a single-stranded RNA backbone. Journal of Biological Chemistry 280, 7398–7406. Kadowaki, K., Halvorson, H.O., 1971. Appearance of a new species of ribonucleic acid synthesized in sporulation cells of Saccharomyces cerevisiae. Journal of Bacteriology 105, 826–830. Matsumoto, Y., Wickner, R.B., 1991. Yeast circular RNA replicon: Replication intermediates and encoded putative RNA polymerase. Journal of Biological Chemistry 266, 12779–12783. Rodríguez-Cousiño, N., Esteban, L.M., Esteban, R., 1991. Molecular cloning and characterization of W double-stranded RNA, a linear molecule present in Saccharomyces cerevisiae: Identification of its single-stranded RNA form as 20S RNA. Journal of Biological Chemistry 266, 12772–12778. Solórzano, A., Rodríguez-Cousiño, N., Esteban, R., Fujimura, T., 2000. Persistent yeast single-stranded RNA viruses exist in vivo as genomic RNA.RNA polymerase complexes in 1:1 stoichiometry. Journal of Biological Chemistry 275, 26428–26435. Vega, L., Sevillano, L., Esteban, R., Fujimura, T., 2014. Resting complexes of the persistent yeast 20S RNA narnavirus consist solely of the 20S RNA viral genome and its RNA polymerase p91. Molecular Microbiology 93, 1119–1129. Wejksnora, P.J., Haber, J.E., 1978. Ribonucleoprotein particle appearing during sporulation in yeast. Journal of Bacteriology 134, 246–260. Wesolowski, M., Wickner, R.B., 1984. Two new double-stranded RNA molecules showing non-Mendelian inheritance and heat inducibility in Saccharomyces cerevisiae. Molecular and Cellular Biology 4, 181–187.

Phlegiviruses (Unassigned) Karel Petrzik, Biology Center CAS, Institute of Plant Molecular Biology, České Budějovice, Czech Republic © 2021 Elsevier Ltd. All rights reserved.

Introduction Although the first reported virus-associated fungal disease (La France disease) was described in a basidiomycete, the cultivated button mushroom Agaricus bisporus, the vast majority of currently known mycoviruses are derived from ascomycetous fungi. Despite their rarity, the basidiomycete-infecting viruses are classified as belonging to several virus families including Endornaviridae, Narnaviridae, Partitiviridae and Megabirnaviridae. The exception is phlegiviruses, wherein all currently known viruses infect basidiomycetous hosts. Like many other mycoviruses, phlegiviruses are not associated with any adverse impact on fruiting bodies or mycelia. Lentinula edodes virus (LeV) dsRNA has been found in malformed cultures of shiitake mushroom and also in healthy-looking fruiting bodies and actively growing mycelia. Although phlegiviruses have yet to be officially classified by the International Committee on Taxonomy of Viruses (ICTV), the genus “Phlegivirus” was proposed to accommodate previously described “phlegiviruses” belonging to 5 species (Table 1). Justification for creating the genus “Phlegivirus” is based on biological properties, percent sequence identity and similarity in genome size and organization. Furthermore, members of the proposed genus are distinctly different from other ICTV-recognized virus genera.

Virion Properties Although repeated attempts were made to isolate virus particles from purified preparations of LeV, PgLV1, and RfV1, no viral particles have been observed by electron microscopy. A purification protocol using polyethylene glycol precipitation, ultracentrifugation, and sucrose gradient centrifugation has been successful in purifying dsRNA but not in isolating virions. At about 3539 nm, the length of the observed linear dsRNA corresponds to the genome size of about 11.8 kb for this virus. Most probably,

Table 1 Members of the proposed genus “Phlegivirus”

Virus | Abbreviation | GenBank accession no. | Genome size (nt) | Reference
Lentinula edodes virus | LeV | AB429556 | 11,282 | Magae (2012)
Phlebiopsis gigantea large virus 1 | PgLV1 | NC_013999 | 11,563 a | Kozlakidis et al. (2009)
Pterostylis sanguinea leaf-associated virus A | PsVA | KU291925 | 10,716 a | Ong et al. (2018)
Rhizoctonia fumigata virus 1 | RfV1 | KM657432 | 9907 a | Li et al. (2015)
Thelephora terrestris virus 1 | TtV1 | NC_028921 | 10,316 | Petrzik et al. (2016)

a Partial sequence.

Fig. 1 Atomic force microscopy image of a single linear Lentinula edodes virus dsRNA. Picture from: Magae, Y., 2012. Molecular characterization of a novel mycovirus in the cultivated mushroom, Lentinula edodes. Virology Journal. 9, 60.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20942-4


Fig. 2 Genome arrangement and motifs of Thelephora terrestris virus 1. Schematic representation of the genomic organization of TtV1 shows the presence of two ORFs (ORF1 and ORF2). The dotted-line box indicates a possible extension of ORF1 via a translational frameshift mechanism. Positions of 2A-like motif, transmembrane motif, NUDIX hydrolase, S7 phytoreovirus domain, and RdRp conserved domain are indicated by boxes.

Fig. 3 Comparison of the amino acid sequences flanking 2APro-like motifs in mycoviruses.

Fig. 4 NUDIX hydrolase motif (pfam00293) in phlegiviruses.

Fig. 5 Alignment of the conserved motifs (pfam02123) of RdRp for phlegiviruses.

phlegiviruses are present in hosts as naked dsRNA and are somehow stabilized, perhaps by host proteins. Furthermore, the phlegivirus genomes do not code for capsid protein (CP) (Figs. 1 and 2).
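The correspondence between the measured dsRNA length (about 3539 nm) and the genome size (about 11.8 kb) quoted above is a simple length-to-base-pair conversion. The sketch below illustrates the arithmetic. The rise of 0.30 nm per base pair is an assumption chosen here because it reproduces the figure given in the text; the conversion factor actually used is not stated in this article, and the function name is hypothetical.

```python
def contour_length_to_bp(length_nm: float, rise_nm_per_bp: float = 0.30) -> float:
    """Estimate the number of base pairs from a measured contour length.

    rise_nm_per_bp is an assumed axial rise per base pair, not a value
    taken from the article.
    """
    return length_nm / rise_nm_per_bp


estimated_bp = contour_length_to_bp(3539.0)
print(f"{estimated_bp / 1000:.1f} kb")
```

A different assumed rise per base pair would shift the estimate proportionally, so the result should be read as an approximate consistency check rather than a measurement of genome size.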

Genome Organization The monopartite dsRNA genome of phlegiviruses is about 11 kbp long and contains two large ORFs in different frames on the genomic plus strand. ORF1 encodes a protein with calculated molecular mass of about 200 kDa, and ORF2 encodes a protein with molecular mass in the range of 140–170 kDa. There are no in-frame stop codons upstream from the ORF2 start codon, but there is a ribosomal -1 frameshift sequence (AAAAAAA in TtV1, GGGAAAC in LeV, GGGUUUU in PgLV1, and AAAUUUC in RfV1) with a pseudoknot structure. This raises the possibility that ORF2 could be translated as a fusion protein with ORF1. The ribosomal


Table 2 Summary of genome features of phlegiviruses

Virus* | 2A cleavage | NUDIX | S7 phytoreovirus | Slippery motif | ORF1 (kDa) | ORF2 (kDa)
LeV | No | Yes | No | GGGAAAC | 218 | 162
PgLV1 | Yes | Yes | Yes | GGGUUUU | 219 | 158
PsVA | No | Yes | Yes | No | 208 | 140 a
RfV1 | No | Yes | No | AAAUUUC | 198 | 144
TtV1 | Yes | Yes | Yes | AAAAAAA | 202 | 174

a Partial sequence.
*See Table 1 for full names of viruses.

Fig. 6 An unrooted maximum-likelihood tree of phlegiviruses and related mycoviruses. Scale bar corresponds to 0.2 amino acid substitution per site. The tree is based on the amino acid sequences of the region between motifs I and VII of the RdRp. Numbers on nodes show bootstrap values above 70% (1000 replicates).


Table 3 Pairwise amino acid identity (%) of complete RdRp and ORF1 of phlegiviruses

RdRp below diagonal, ORF1 above diagonal. See Table 1 for full names of viruses.

frameshift sequence has not been detected in PsVA, but stable stem-loop structures that could assist in pausing translating ribosomes were predicted in this virus close to the slippery site. Polyprotein processing is intrinsic for many viruses. In phlegiviruses PgLV1 and TtV1, the picornavirus proteinase 2Apro-like motif was found on the N-terminal part of ORF1. This supports a hypothesis that the protein could be further processed by cleavage. This motif comprises the seven aa residues G/DxExNPG and the N-terminal proline of 2B protein (Fig. 3). The motif has been found also in Rosellinia necatrix mycovirus 2-W and in Fusarium graminearum hypoviruses 1 and 2. Based on the expected activity of 2Apro, the N-terminal 301 and 90 aa fragments of PgLV1 and TtV1 polyprotein, respectively, could be released. The ORF1-encoded protein is unrelated to any protein in GenBank. In all phlegiviruses, however, there is a region having similarity to the NUDIX hydrolase domain (pfam00293) that is close to the N-terminal part of ORF1. NUDIX hydrolases constitute a diverse superfamily of pyrophosphatases that are widespread among eukaryotes, bacteria, archaea, and some viruses. They catalyze hydrolysis of nucleoside diphosphates with various substrate specificities and function to clear the cell of potentially deleterious endogenous metabolites. Previously, the NUDIX domain has been found only in dsDNA poxviruses, but the NUDIX domain in phlegiviruses differs from that of the dsDNA viruses and forms a distinct cluster in phylogenetic analysis. With the exception of RfV1, one or more transmembrane domains are predicted on the N-terminal part of the ORF1 protein upstream from the NUDIX domain in all phlegiviruses. ORF2 encodes a protein with calculated molecular mass of 140–170 kDa and having the RNA-dependent RNA polymerase (RdRp) catalytic domain. An S7-phytoreovirus-like domain is located in PgLV1, PsVA, and TtV1 close to the N-terminus of the ORF2-encoded protein.
The phytoreovirus S7 domain is widely distributed in diverse dsRNA viruses within proteins known to be viral core proteins having nucleic acid binding activities. One can speculate that ORF2 encodes a structural protein that binds and protects the genomic RNA. The 5′UTR region is 25 to 1154 nt long and the 3′UTR is 29–113 nt long. Disparity in the UTR sequences could be caused by deletions in the 5′UTR sequences. There is no polyadenylation or secondary structure on the 3′ ends (Figs. 4 and 5, Table 2).
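The slippery heptamers listed for the -1 ribosomal frameshift (AAAAAAA, GGGAAAC, GGGUUUU, and AAAUUUC) all fit the commonly described X XXY YYZ pattern, in which XXX is any three identical nucleotides, YYY is AAA or UUU, and Z is any nucleotide except G. The following sketch scans an RNA sequence for such heptamers. The consensus is an assumption brought in for illustration, since this article lists only the individual motifs; the function name is hypothetical, and no pseudoknot or stem-loop prediction is attempted.

```python
import re

# Assumed consensus X XXY YYZ: three identical nucleotides, then AAA or UUU,
# then any nucleotide except G. The lookahead lets overlapping hits be found.
SLIPPERY_SITE = re.compile(r"(?=(([ACGU])\2\2([AU])\3\3[ACU]))")


def find_slippery_sites(rna: str) -> list[tuple[int, str]]:
    """Return (0-based position, heptamer) for every candidate slippery site."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in SLIPPERY_SITE.finditer(rna)]


# The four heptamers reported in the text are each recognized.
for heptamer in ("AAAAAAA", "GGGAAAC", "GGGUUUU", "AAAUUUC"):
    print(heptamer, find_slippery_sites(heptamer))
```

A sequence match alone is a weak predictor of frameshifting. A downstream structure such as the pseudoknot mentioned above is normally also required, so hits from a scan like this are only candidates.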

Biological Properties All phlegiviruses are associated with infections of basidiomycetes. In the case of PsVA, the mycorrhizal fungi have not been identified to species level, but their sequences are closest to those of a Ceratobasidium sp. and a Rhizoctonia sp. Like many other mycoviruses, phlegiviruses are not associated with an adverse impact on fruiting bodies or mycelia. LeV dsRNA has been found in malformed cultures of Lentinula edodes and also in healthy-looking fruiting bodies and actively growing mycelia. No correlation has been determined between LeV incidence and growth rate of the mycelium. Furthermore, the fact that more than 40% of monokaryotic progeny originating from the basidiospore of the fruiting body have been shown to contain LeV dsRNA indicates the persistence of dsRNA during sexual reproduction. The absence of capsid protein in phlegiviruses is speculated to be related to the feeding habits of the bark- and fungus-feeding beetles and oribatid mites (Acari), which can efficiently transfer the virus-infected fungus to a new environment. Under these conditions, the CP would no longer be necessary for the virus to exit from the host cell, for its protection from the environment, or for it to create a new infection.

Evolutionary Relationships among Phlegiviruses All five phlegiviruses have almost identical genome arrangements and are monophyletic in a maximum-likelihood phylogenetic tree inferred from the RdRp region between motifs I and VII of the polymerase (Fig. 6). ClustalW comparison has shown 41%–54% nt identity across the genomes. Amino acid identity among the homologous proteins was 12%–44% for ORF1 proteins and 12%–55% for RdRps (Table 3). Megabirnaviruses (family Megabirnaviridae) are closely related to phlegiviruses, while chrysoviruses (family Chrysoviridae) and totiviruses (family Totiviridae) represent more distant groups as revealed by phylogenetic analysis based on RdRp. Megabirnaviruses differ, however, in genome components and their arrangement. They have two genome segments separately encapsidated in


isometric particles. dsRNA1 encompasses two overlapping ORFs. ORF1 encodes the CP and ORF2 encodes the RdRp, which is expressed as a fusion product with CP, probably via ribosomal frameshifting.
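The identity percentages quoted above are derived from pairwise alignments. As a rough illustration of how such a figure is obtained from two already aligned sequences, the sketch below counts identical positions and ignores any column containing a gap. This choice of denominator is an assumption; alignment programs differ in how they treat gaps, so values computed this way need not match those in Table 3 or those reported by ClustalW. The function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments, not real viral sequences.
print(percent_identity("AC-GU", "ACAGA"))
```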

Future Perspectives It seems highly probable that phlegiviruses translate their genome as polyproteins having molecular mass exceeding 350 kDa. The genome of potyviruses, which is of similar length (about 10 kb) but consists of positive-sense ssRNA, is translated as a polyprotein that is cleaved by virus-coded proteinases into up to 10 mature proteins. Neither protease motif nor putative cleavage sites have been identified, however, on either ORF1 or ORF2 of phlegiviruses, and thus the presence of a presumed cleavage enzyme (if one exists) remains obscure. The 7-aa motif resembling the picornavirus proteinase 2Apro cleavage site has been found in only two of the five viruses, which makes it improbable that cleavage via this motif is important and common for this group. Other than the RdRp, the presence of the NUDIX domain across all members of the proposed genus is the only constant feature of phlegiviruses indicating their origin from a common ancestor and supporting the phylogenetic relationships as inferred from the RdRp sequence. The role of this domain in the life-cycle of the viruses still needs to be clarified.
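The 2A-like motif discussed in this article (the seven residues G/DxExNPG followed by the N-terminal proline of the downstream protein) can be expressed as a simple pattern. The sketch below searches a protein sequence for the motif and reports the length of the N-terminal fragment that would be released if processing occurred between the final G and the P. It is an illustration only: the function name and the toy peptide are invented, and a pattern match does not demonstrate that processing takes place.

```python
import re

# G or D, any residue, E, any residue, then NPG, followed by the proline
# that would become the N-terminus of the downstream product.
TWO_A_LIKE = re.compile(r"[GD].E.NPGP")


def predicted_2a_fragments(protein: str) -> list[int]:
    """Return the length of the N-terminal fragment for each motif hit.

    The predicted break lies after the seventh motif residue (G) and
    before the following P.
    """
    return [m.start() + 7 for m in TWO_A_LIKE.finditer(protein.upper())]


# Invented toy peptide containing one motif (DVESNPG followed by P).
toy_peptide = "MKLLGDVESNPGPTT"
print(predicted_2a_fragments(toy_peptide))
```

Applied to a full ORF1 translation, the same calculation would give fragment lengths comparable to the 301 and 90 aa values mentioned for PgLV1 and TtV1, provided the motif positions match those reported.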

Further Reading Donnelly, M.L.L., Hughes, L.E., Luke, G., et al., 2001. The ‘cleavage’ activities of foot-and-mouth disease virus 2A site-directed mutants and naturally occurring ‘2A-like’ sequences. Journal of General Virology 82, 1027–1041. Ghabrial, S.A., Castón, J.R., Jiang, D., Nibert, M.L., Suzuki, N., 2015. 50-plus years of fungal viruses. Virology 479–480, 356–368. Kozlakidis, Z., Hacker, C.V., Bradley, D., et al., 2009. Molecular characterisation of two novel double-stranded RNA elements from Phlebiopsis gigantea. Virus Genes 39, 132–136. Li, Y., Xu, A., Zhang, L., et al., 2015. Molecular characterization of a novel mycovirus from Rhizoctonia fumigata AG-Ba isolate C-314 Baishi. Archives of Virology 160, 2371–2374. Liu, H., Fu, Y., Xie, J., et al., 2012. Evolutionary genomics of mycovirus-related dsRNA viruses reveals cross-family horizontal gene transfer and evolution of diverse viral lineages. BMC Evolutionary Biology 12, 91. Magae, Y., 2012. Molecular characterization of a novel mycovirus in the cultivated mushroom, Lentinula edodes. Virology Journal 9, 60. McLennan, A.G., 2006. The nudix hydrolase superfamily. Cellular and Molecular Life Sciences 63, 123–143. Nibert, M.L., 2007. ‘2A-like’ and ‘shifty heptamer’ motifs in penaeid shrimp infectious myonecrosis virus, a monosegmented double-stranded RNA virus. Journal of General Virology 88, 1315–1318. Ong, J.W.L., Li, H., Sivasithamparam, K., et al., 2018. Novel and divergent viruses associated with Australian orchid-fungus symbioses. Virus Research 244, 276–283. Petrzik, K., Sarkisova, T., Starý, J., et al., 2016. Molecular characterization of a new monopartite dsRNA mycovirus from mycorrhizal Thelephora terrestris (Ehrh.) and its detection in soil oribatid mites (Acari: Oribatida). Virology 489, 12–19. Won, H.K., Park, S.J., Kim, D.K., et al., 2013. Isolation and characterization of a mycovirus in Lentinula edodes. Journal of Microbiology 51, 118–122.

Plant and Protozoal Partitiviruses (Partitiviridae) Hanna Rose and Edgar Maiss, Leibniz University Hannover, Hannover, Germany © 2021 Elsevier Ltd. All rights reserved.

Nomenclature A Adenine aa Amino acids b Bases BCV Beet cryptic virus bp Base pairs C Cytosine C. Cryptosporidium CCCV2 Crimson clover cryptic virus 2 CCV Carrot cryptic virus, Cannabis cryptic virus CCV1 Carnation cryptic virus 1 CP Coat protein CsCL Caesium chloride CSpV1 Cryptosporidium parvum virus 1 DCV2 Dill cryptic virus 2 dsRNA Double-stranded RNA FCCV Fragaria chiloensis cryptic virus FCV Fig cryptic virus G Guanine HTCV2 Hop trefoil cryptic virus 2 ICTV International Committee on Taxonomy of Viruses Isl. Isolate

Glossary Buoyant densities in CsCl Density, equal to that of CsCL. Determined by density gradient centrifugation (also called isopycnic centrifugation), a method to separate molecules in CsCL according to their density. Mono-, bi- and tripartite Class of genomes: One, two or three individually packaged genome segments of a virus. Monocistronic (m)RNA that contains one open reading frame with one initiation and one stop codon translated into one protein. Negative-contrast electron microscopy Method, in which the background is more electron dense as the one of the object of interest, which thus is stained lighter than the background.

NCBI National Center for Biotechnology Information ORF Open reading frame PCV Pepper cryptic virus PeCV Persimmon cryptic virus PmV1 Primula malacoides virus 1 PTGS Post-trancriptional gene silencing RCCV2 Red clover cryptic virus 2 RCV-1 Rose cryptic virus 1 RdRp RNA-dependent RNA polymerase RsCV2 Raphanus sativus cryptic virus 2 T Thymine T. Trifolium U Uracil UTR Untranslated region VCV Vicia cryptic virus VLP Virus-like-particle VP Viral protein WCCV White clover cryptic virus a Alpha b Beta δ Delta

Post-transcriptional gene silencing (PTGS) Defense mechanism of plants in which dsRNA, e.g., originating from a virus, is degraded into small interfering RNA, which targets homologous ssRNA. Root nodules Structures on the roots of plants that form a symbiotic relationship with nitrogen-fixing bacteria. Silencing suppressor (VSR) Viral suppressors of gene silencing: Proteins that are able to inhibit the plant defense mechanism “post-transcriptional gene silencing” (PTGS) at different steps. Type species A Type Species is a species whose name is linked to the use of a particular genus name. The genus so typified will always contain the Type Species (Mayo et al., 2002: “The Type Species in virus taxonomy”).

Introduction Members of the family Partitiviridae (partitivirids) are small isometric particles, 25–43 nm in diameter, with a segmented double-stranded (ds) RNA genome. They cause no symptoms in their hosts, are not transmissible mechanically or by vectors, and infect plants, fungi or protozoa. Partitivirids were first discovered in 1968/69 by Pullen, who described spherical virus-like particles (VLPs) with a diameter of about 28 nm. These VLPs occurred in apparently healthy sugar beet varieties and beet seeds. It was possible neither to transmit them to several herbaceous test plants via aphids or mechanical methods nor to remove them by heat therapy. During this time, the term “cryptic” was established. In 1977, Kassanis et al. confirmed the wide distribution of this virus in beet species and named it beet cryptic virus 1 (BCV1). They also discovered a high seed transmission rate (90%) in commercial beet varieties, identified non-infected beet species and established a purification method for BCV1. In 1980 two further VLPs were discovered, Vicia cryptic virus (VCV) and carnation cryptic virus 1 (CCV1), both with characteristics similar to those of BCV1. In 1981, Lisa et al. showed that the genome of CCV1 consists of segmented dsRNA. A few years later (1988) it was shown that isolated CCV1 particles possess RNA-dependent RNA polymerase


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21517-3


(RdRp) activity catalyzing the synthesis of dsRNA in vitro. During this period, Natsuaki's research group in Japan also discovered VLPs in plants, which they designated "temperate viruses", such as spinach temperate virus and beet temperate virus. It turned out that cryptic and temperate viruses are synonymous. In 1984 Kassanis proposed retaining the name module "cryptic" in order to distinguish these viruses from latent viruses, which, in contrast to cryptic viruses, are usually transmissible mechanically or by vectors. With the isolation of white clover cryptic virus 1, 2, and 3 (WCCV1, 2, and 3), Boccardo and colleagues were able to show that the VLPs are indeed cryptic viruses and originate from the plant in which they replicate. The genome of each of these three viruses consists of two dsRNA segments, confirming previous assumptions of a segmented genome. In the fifth ICTV report in 1991, the family Partitiviridae with the genus Partitivirus, in which cryptic and temperate viruses were classified, was included for the first time. The criteria for this genus were isometric particles and a genome consisting of two dsRNA species. Two years later, Xie et al. found a third cryptic virus in beet and named it beet cryptic virus 3 (BCV3). It was also possible to determine the nucleotide and derived amino acid sequence of one dsRNA segment and to identify it as encoding a putative RdRp with conserved motifs that are also present in the RdRps of other viruses. The taxonomy of the Partitiviridae changed in 1995–2000 with the sixth and seventh ICTV reports, in that the genus Chrysovirus, which contained two fungus-infecting viruses, was assigned to the family. With the eighth ICTV report in 2005, this genus formed its own family, the Chrysoviridae, since its species had characteristics clearly distinguishable from those of the partitivirids, such as a genome consisting of four dsRNA segments.
In 1995–2000, the family Cryptoviridae, containing plant-infecting cryptoviruses, was removed and its genera Alphacryptovirus and Betacryptovirus were transferred to the Partitiviridae. Members of these two genera are serologically and morphologically distinguishable. From then on, more and more partitivirids were discovered and sequenced, for example WCCV1 by Boccardo et al. in 2005 and VCV by Blawid et al. in 2007. A milestone in resolving the 3D capsid structure was achieved in 2008 by Ochoa and colleagues through transmission electron cryomicroscopy and subsequent 3D reconstruction of particles of the fungus-infecting gammapartitiviruses Penicillium stoloniferum virus S and F (formerly genus Partitivirus). In the following two years further structures were analyzed; in general, the capsid is composed of 120 subunits arranged as 60 coat protein (CP) dimers with T = 1 icosahedral symmetry. In the ninth ICTV report in 2012, the new genus Cryspovirus was created in the family Partitiviridae, named after the only protozoan-infecting virus, Cryptosporidium parvum virus 1, which was discovered in 1997 by Khramtsov et al. As more sequence data from potential members of this family became available, better phylogenetic comparisons could be made and taxonomic restructuring became necessary. Nibert et al. explained in 2014 that species of the genera Partitivirus, Alphacryptovirus, and Betacryptovirus separate into four distinct clades, two of which (now Alpha- and Betapartitivirus) contain both plant- and fungus-infecting species. As a result, the changes were approved in 2017 in the tenth ICTV report, which lists the Partitiviridae with the genera Alpha-, Beta-, Gamma-, and Deltapartitivirus as well as Cryspovirus. To date, partitivirids have been discovered worldwide in fungi and plants (as well as in one protozoon), and owing to modern high-throughput sequencing techniques their number is continuously increasing.

Current Taxonomy of the Partitiviridae The family Partitiviridae currently comprises the five genera Alpha-, Beta-, Gamma-, and Deltapartitivirus, as well as Cryspovirus. The alpha- and betapartitiviruses include fungus- as well as plant-infecting species, whereas deltapartitiviruses infect plants and gammapartitiviruses infect fungi. The only species in the genus Cryspovirus infects protozoa. The phylogenetic tree, based on the amino acid sequences of the partitiviral RdRps, gives an overview of the taxonomic organization in this encyclopedia. Overall, the family comprises 45 assigned members, of which 16 are plant-infecting species, and 15 unassigned members, comprising twelve plant-infecting partitivirids and two unclassified, related fungal viruses. The unclassified but related viruses within the individual genera are listed separately. Table 1 gives an overview of the composition of the individual genera. To assign a species to the Partitiviridae, the genus and species demarcation criteria established by the ICTV need to be fulfilled. One criterion is the host, since the members of different genera may have different hosts (plants and fungi) or only one host (plants or fungi or protozoans). With regard to the genome segments and the proteins expressed, each genus has characteristic lengths and sizes. Additionally, there are guidelines for the percentage identity between RdRp and CP amino acid sequences: viruses assigned to different genera have an RdRp amino acid sequence identity of less than 24%, and different species show an amino acid sequence identity of less than 90% for the RdRp and less than 80% for the CP. As of February 2020, 28 completely sequenced genomes of plant-infecting species were available in NCBI GenBank. Table 2 lists all assigned and unclassified but related plant-infecting species with the respective sizes of genome segments and proteins as well as accession numbers.
Species that have been proposed as members of the Partitiviridae but are not yet listed by the ICTV are shown in Table 3.

Table 1 Overview of the composition of the individual genera of the Partitiviridae

Genus | Assigned species | Host: plant/fungal | Unclassified, related species | Host: plant/fungal
Alphapartitivirus | 14 | 4/10 | 8 | 6/2
Betapartitivirus | 17 | 7/10 | 4 | –/4
Gammapartitivirus | 8 | –/8 | 6 | –/8
Deltapartitivirus | 5 | 5/– | 4 | 5/–
Cryspovirus | 1 | protozoan | 3 | protozoan
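The identity guidelines quoted above for the RdRp and CP can be read as a simple decision rule. The following is only a minimal sketch in Python: the function name `compare_isolates` and its return strings are hypothetical, the two species thresholds are combined with a logical "and" as an assumption, and the host and segment-size criteria are not modelled at all.

```python
# Thresholds quoted in the text (percent amino acid sequence identity).
GENUS_RDRP_MAX = 24.0    # viruses of different genera: RdRp identity below this value
SPECIES_RDRP_MAX = 90.0  # different species: RdRp identity below this value
SPECIES_CP_MAX = 80.0    # different species: CP identity below this value


def compare_isolates(rdrp_identity: float, cp_identity: float) -> str:
    """Apply the identity guidelines to one pairwise comparison.

    Hypothetical helper: it covers only the percentage-identity part of the
    demarcation criteria, not host range or segment and protein sizes.
    """
    if rdrp_identity < GENUS_RDRP_MAX:
        return "different genera"
    # Assumption: both species thresholds must be met at the same time.
    if rdrp_identity < SPECIES_RDRP_MAX and cp_identity < SPECIES_CP_MAX:
        return "different species"
    return "not separated by the identity criteria"


if __name__ == "__main__":
    print(compare_isolates(60.0, 50.0))
```

Pairs that fall between the thresholds on only one protein are reported as not separated here; a real assignment would weigh the remaining criteria as well.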


Table 2 Overview of the plant-infecting partitivirids with sizes of genome segments, proteins and accession numbers

Genus/virus name | Size of dsRNA segments [b]; protein [aa (kDa)] | GenBank accession

Alphapartitivirus
Beet cryptic virus 1 (BCV1) | 1: RdRp 2008; 616 (72) / 2: CP 1783; 489 (53) | NC_011556.1 / NC_011557.1
Carrot cryptic virus (CCV) | 1: RdRp 1971; 616 (72) / 2: CP 1776; 490 (54) | NC_038824.1 / NC_038823.1
Vicia cryptic virus (VCV) | 1: RdRp 2012; 616 (72) / 2: CP 1779; 487 (53) | NC_007241.1 / NC_007242.1
White clover cryptic virus 1 (WCCV1) | 1: RdRp 1955; 616 (72) / 2: CP 1708; 487 (54) | NC_006275.1 / NC_006276.1

Unclassified – Alphapartitivirus related
Arabidopsis halleri partitivirus 1 | 1: RdRp 1959; 585 (68) / 2: CP 1763; 487 (55) | NC_030889.1 / NC_030890.1
Dill cryptic virus 1 isl. IPP_hortorum | 1: RdRp 2013; 616 (73) / 2: CP 1837; 490 (54) | NC_022614.1 / NC_022615.1
Diuris pendunculata cryptic virus isl. SW3.3 | 1: RdRp 2010; 621 (73) / 2: CP 1806; 496 (55) | JX156424.1 / JX891460.1
Raphanus sativus cryptic virus 1 | 1: RdRp 1866; 573 (67) / 2: CP 1791; 505 (56) / 3: CP 1778; 505 (55) | NC_008191.1 / NC_008190.1 / DQ181927
Red clover cryptic virus 1 isl. IPP_Nemaro | 1: RdRp 1936; 616 (73) / 2: CP 1710; 487 (54) | NC_022616.1 / NC_022617.1
Rose partitivirus isl. PB | 1: RdRp 1937; 586 (68) / 2: CP 1811; 487 (54) | KU896858.1 / KU896859.1

Betapartitivirus
Cannabis cryptic virus (CCV) | 1: RdRp 2397; 746 (87) / 2: CP 2266; 672 (74) | NC_031134.1 / NC_031130.1
Crimson clover cryptic virus 2 (CCCV2) | 1: RdRp 2444; 746 (87) / 2: CP 2354; 674 (75) | NC_038837.1 / NC_038838.1
Dill cryptic virus 2 (DCV2) | 1: RdRp 2430; 745 (87) / 2: CP 2354; 673 (75) | NC_021147.1 / NC_021148.1
Hop trefoil cryptic virus 2 (HTCV2) | 1: RdRp 2431; 746 (87) / 2: CP 2349; 673 (75) | NC_021098.1 / NC_021099.1
Primula malacoides virus 1 (PmV1) | 1: RdRp 2390; 723 (84) / 2: CP 2344; 673 (75) | NC_013109.1 / NC_013110.1
Red clover cryptic virus 2 (RCCV2) | 1: RdRp 2430; 745 (87) / 2: CP 2353; 673 (76) | NC_021096.1 / NC_021097.1
White clover cryptic virus 2 (WCCV2) | 1: RdRp 2435; 746 (87) / 2: CP 2348; 673 (76) | NC_021094.1 / NC_021095.1

Deltapartitivirus
Beet cryptic virus 2 (BCV2) | 1: RdRp 1575; 475 (54) / 2a: VP1 1598; 426 (49) / 2b: VP2 1522; 393 (45) | NC_038846.1 / NC_038845.1 / NC_038847.1
Beet cryptic virus 3 (BCV3) | partial genome only
Fig cryptic virus (FCV) | 1: RdRp 1696; 472 (54) / 2: CP 1415; 337 (38) | NC_015494.1 / NC_015495.1
Pepper cryptic virus 1 (PCV1) | 1: RdRp 1563; 479 (54) / 2: CP 1512; 412 (48) | NC_037095.1 / NC_037096.1
Pepper cryptic virus 2 (PCV2) | 1: RdRp 1609; 478 (54) / 2: CP 1525; 430 (49) | NC_034159.1 / NC_034167.1

Unclassified – Deltapartitivirus related
Fragaria chiloensis cryptic virus isl. NCGR CFRA 9089 | 1: RdRp 1734; 479 (56) / 2: CP 1479; 348 (39) / 3: CP? 1465; 346 (39) | NC_009519.1 / NC_009521.1 / NC_009520.1
Persimmon cryptic virus isl. SSPI | 1: RdRp 1577; 477 (54) / 2: CP 1491; 415 (47) | NC_017989.1 / NC_017988.1
Raphanus sativus cryptic virus 2 | 1: RdRp 1717; 477 (55) / 2: CP 1521; 346 (38) / 3: VP 1485; 347 (39) | NC_010343.1 / NC_010344.1 / NC_010345.1
Rose cryptic virus 1 isl. ShB-1 | 1: RdRp 1749; 479 (56) / 2: CP 1485; 348 (39) / 3: VP 1446; 346 (39) | NC_010346.1 / NC_010347.1 / NC_010348.1

Unassigned
Alfalfa cryptic virus 1 (ACV1) | No entry in GenBank
Carnation cryptic virus 1 (CCV1) | No entry in GenBank
Carrot temperate virus 1 (CTV1) | No entry in GenBank
Carrot temperate virus 2 (CTV2) | No entry in GenBank
Carrot temperate virus 3 (CTV3) | No entry in GenBank
Carrot temperate virus 4 (CTV4) | No entry in GenBank
Hop trefoil cryptic virus 1 (HTCV1) | No entry in GenBank
Hop trefoil cryptic virus 3 (HTCV3) | No entry in GenBank
Radish yellow edge virus (RYEV) | No entry in GenBank
Ryegrass cryptic virus (RCV) | No entry in GenBank
Spinach temperate virus (STV) | No entry in GenBank
White clover cryptic virus 3 (WCCV3) | No entry in GenBank

isl.: isolate; VP: viral protein.

Virion Properties Partitivirids appear as small isometric particles, 25–43 nm in diameter in this encyclopedia. The capsid is composed of 120 subunits arranged as 60 dimers with T = 1 icosahedral symmetry. The viruses have at least two separately encapsidated genome segments encoding an RdRp and a CP. Partitivirids have been detected in numerous plant organs, for example stems, roots, cotyledons and petals. The virions are located mainly in the cytoplasm and less often in the nucleus or nucleoli of parenchyma cells. Replication takes place in a semi-conservative mode.

Alphapartitivirus Alphapartitiviruses are between 25 nm and 40–50 nm in diameter; the type species WCCV1 has an average size of 34 nm. In negative-contrast electron micrographs, the particles appear rounded and ring-shaped because the stain penetrates the center (Fig. 1). The buoyant densities in CsCl are 1.392 g cm⁻³ for WCCV1 virions and 1.37 g cm⁻³ for VCV virions. Encapsidation is performed by a single major CP with a molecular weight of 51–57 kDa. The RdRp has a size of 68–73 kDa. In the case of Raphanus sativus cryptic virus 1, a third genome segment of so far unknown function was found.

Betapartitivirus Members of the genus Betapartitivirus are between 25 nm and 38 nm in diameter. In negative-contrast electron microscopy, WCCV2 appears as rounded particles that are not penetrated by stain (Fig. 2). The buoyant density in CsCl of WCCV2 virions is 1.375 g cm⁻³ and, as in the alphapartitiviruses, there is a single major CP, with a molecular weight of 71–77 kDa, and an RdRp of 77–87 kDa. Betapartitiviruses have the largest proteins among all genera in this family.


Table 3 Overview of the plant-infecting partitivirids which are not yet listed by the ICTV. Sizes of available genome segments, proteins and accession numbers are given

Putative genus/virus name | Size of dsRNA segments [b]; protein [aa (kDa)] | GenBank accession

Alphapartitivirus
Alopecurus myosuroides partitivirus 1 (Black grass cryptic virus 1) | 1: RdRp 1984; 616 (72) / 2: CP 1835; 489 (54) | HG005156.1 / LN713935.1
Alopecurus myosuroides partitivirus 2 (Black grass cryptic virus 2) | 1: RdRp 2037; 515 (60) / 2: CP 1956; 520 (54) | LN713936.1 / LN713937.1
Medicago sativa alphapartitivirus 1 | 1: RdRp 1922; 586 (69) / 2: CP 1707; 491 (54) | MF443256.1 / MF443257.1
Medicago sativa alphapartitivirus 2 | 1: RdRp 1939; 586 (68) / 2: CP 1764; 491 (54) | MK292288.1 / MK292289.1
Pyrus pyrifolia cryptic virus 2 | 1: RdRp 1945; 586 (68) / 2: CP 1788; 491 (54) | LC221826.1 / LC221827.1
Raphanus sativus cryptic virus 4 | 1: RdRp 1976; 616 (72) / 2: CP 1751; 490 (55) | MF686921.1 / MF686922.1
Spinach cryptic virus 1 isl. SRR1766311 | 1: RdRp 1966; 616 (73) / 2: CP 1762; 488 (54) | KX784754.1 / KX784755.1
Spinach cryptic virus 1 isl. SRR1766329 | 1: RdRp 1961; 616 (73) / 2: CP 1753; 488 (54) | KX784756.1 / KX784757.1

Deltapartitivirus
Carnation cryptic virus 3 | 1: RdRp 1573; 475 (54) / 2: CP 1562; 411 (47) | KX826917.1 / KX826918.1
Citrullus lanatus cryptic virus | 1: RdRp 1603; 477 (54) / 2: CP 1466; 407 (46) | KY081285.1 / KY081284.1
Cucumis melo cryptic virus | 1: RdRp 1502; 477 (55) / 2: CP 1715; 480 (55) | MH479772 / MH479773
Medicago sativa deltapartitivirus 1 | 1: RdRp 1584; 477 (54) / 2: CP 1353; 394 (46) | MF443258.1 / MF443259.1
Pittosporum cryptic virus 1 | 1: RdRp 1967; 482 (55) / 2: CP 1525; 405 (47) | LN680393.2 / LN680394.2
Pyrus pyrifolia cryptic virus 1 | 1: RdRp – / 2: CP 1523; 419 (48) / 3: CP 1481; 411 (48) | no entry in GenBank / LC221824.1 / LC221825.1
Tea-oil camellia deltapartitivirus 1 | 1: RdRp 1712; 477 (56) / 2: CP 1504; 343 (39) / 3: CP 1353; 344 (38) | MH814756.1 / MH814757.1 / MH814758.1

Genus unclassified
Lolium rigidum partitivirus 1 | 1: RdRp – / 2: CP 1939; 516 (57) | no entry in GenBank / HG005148.1
Lolium rigidum partitivirus 2 | 1: RdRp – / 2: CP 1841; 520 (58) | no entry in GenBank / HG005149.1
Lolium rigidum partitivirus 3 | 1: RdRp – / 2: CP 1236; 350 (39) | no entry in GenBank / HG005150.1
Raphanus sativus cryptic virus 3 | 1: RdRp 1609; 481 (55) / 2: CP 1581; 374 (43) | NC_011705.1 / NC_011706.1

isl.: isolate; VP: viral protein.

Cryspovirus Cryptosporidium parvum virus 1 appears as isometric, non-enveloped virions of about 31 nm in diameter. Negative-contrast electron microscopy revealed single-layered particles that are not penetrated by stain and show short protrusions on their surfaces. The buoyant density in CsCl is 1.39–1.44 g cm⁻³, and there is a single major CP with a molecular weight of 37 kDa and an RdRp of 62 kDa.


Fig. 1 Alphapartitivirus. Negative-contrast electron micrograph of particles of an isolate of White clover cryptic virus 1, the type species of the genus Alphapartitivirus. The bar represents 50 nm. Reproduced from Ghabrial, S.A., Bozarth, R.F., Buck, K.W., Martelli, G.P., Milne, R.G., 2000. Partitiviridae. In Virus Taxonomy: Seventh Report of the International Committee on Taxonomy of Viruses, New York: Academic Press, pp. 503–513.

Fig. 2 Betapartitivirus. Negative-contrast electron micrograph of particles of an isolate of White clover cryptic virus 2. The bar represents 50 nm. Reproduced from Ghabrial, S.A., Bozarth, R.F., Buck, K.W., Martelli, G.P., Milne, R.G., 2000. Partitiviridae. In Virus Taxonomy: Seventh Report of the International Committee on Taxonomy of Viruses, New York: Academic Press, pp. 503–513.

Deltapartitivirus Deltapartitivirus particles of PCV1 and PCV2 are approximately 30 nm in diameter. Like the betapartitiviruses, these particles are not penetrated by stain in negative-contrast electron micrographs, and the buoyant density in CsCl of BCV2 was determined as 1.36 g cm⁻³. The single major CP ranges in molecular weight from 38 to 49 kDa and the RdRp is approximately 54 kDa. A recently described putative member of this genus, Cucumis melo cryptic virus, has a longer dsRNA2 encoding a CP of 55 kDa. Beet cryptic virus 2, Fragaria chiloensis cryptic virus (isl. NCGR CFRA 9089), Raphanus sativus cryptic virus 2 and rose cryptic virus 1 (isl. ShB-1) contain a third genome segment, the function of which has not yet been conclusively clarified.

Genome Organization and Replication Strategy Genome Organization Members of the family Partitiviridae have at least two monocistronic genome segments; in a few cases a third segment has been detected (see above). There are several hypotheses about the function of this additional genome segment. Tzanetakis et al. suggested in 2008 that, owing to the similarity of segments 2 and 3 of Fragaria chiloensis cryptic virus, rose cryptic virus 1 and Raphanus sativus cryptic virus 2 to each other, segment 3 could encode another CP that arose by a duplication event before species segregation. Thus, both CPs might have retained their function. Another hypothesis, put forward by Chen in 2006, was that the additional segments could be satellite elements. For BCV2 and rose cryptic virus 2, strains that do not contain the third segment have been detected in plants. In a review in 2014, Nibert et al. suspected that the strains with the additional CP-like segment were present in a mixed infection and that the associated RdRp had not yet been detected or had been lost. If the RdRp had been lost, the other RdRp would have to be able to replicate all dsRNA segments. The sizes of the two main genome segments of partitivirids differ between the genera and are characteristic of each genus. Table 4 gives an overview of the ranges of genome and protein sizes for segments 1 and 2 (taken from Table 2 in Section “Current Taxonomy of the Partitiviridae”). A schematic representation of the alphapartitiviral genome of Vicia cryptic virus is shown in Fig. 3.

5′ and 3′ UTRs The 5′ UTRs of partitivirids show some genus-specific conserved sites, as shown for the alphapartitiviruses in Fig. 4. The 5′ UTRs of both dsRNA segments of the assigned species BCV1, CCV, VCV, and WCCV1 were aligned; regions that are 100% conserved are shaded black, and sequences that are not conserved in all viruses are shaded gray. In the case of WCCV1 the


Table 4 Overview of the ranges of genome and protein sizes from segments 1 and 2 (taken from Table 2 in Section “Current Taxonomy of the Partitiviridae”)

Genus | dsRNA1 [bp] | dsRNA2 [bp] | RdRp [aa] | CP [aa]
Alphapartitivirus | 1866–2013 | 1708–1837 | 573–621 | 487–505
Betapartitivirus | 2390–2444 | 2266–2354 | 723–746 | 672–674
Cryspovirus | 1786 | 1374 | 524 | 319
Deltapartitivirus | 1563–1749 | 1479–1521 | 477–479 | 346–415
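As an illustration of how a row of Table 4 follows from Table 2, the sketch below recomputes the Alphapartitivirus ranges from the ten alphapartitivirus and alphapartitivirus-related entries of Table 2. The names `ALPHA_ROWS` and `column_ranges` are hypothetical, and the third segment of Raphanus sativus cryptic virus 1 is deliberately left out because Table 4 covers segments 1 and 2 only.

```python
# (dsRNA1 [b], dsRNA2 [b], RdRp [aa], CP [aa]) transcribed from Table 2,
# alphapartitiviruses and alphapartitivirus-related viruses, in table order.
ALPHA_ROWS = [
    (2008, 1783, 616, 489),  # Beet cryptic virus 1
    (1971, 1776, 616, 490),  # Carrot cryptic virus
    (2012, 1779, 616, 487),  # Vicia cryptic virus
    (1955, 1708, 616, 487),  # White clover cryptic virus 1
    (1959, 1763, 585, 487),  # Arabidopsis halleri partitivirus 1
    (2013, 1837, 616, 490),  # Dill cryptic virus 1
    (2010, 1806, 621, 496),  # Diuris pendunculata cryptic virus
    (1866, 1791, 573, 505),  # Raphanus sativus cryptic virus 1 (segments 1 and 2)
    (1936, 1710, 616, 487),  # Red clover cryptic virus 1
    (1937, 1811, 586, 487),  # Rose partitivirus
]


def column_ranges(rows):
    """Return a (min, max) pair for each column of equal-length tuples."""
    return [(min(col), max(col)) for col in zip(*rows)]


if __name__ == "__main__":
    labels = ["dsRNA1 [b]", "dsRNA2 [b]", "RdRp [aa]", "CP [aa]"]
    for label, (low, high) in zip(labels, column_ranges(ALPHA_ROWS)):
        print(f"{label}: {low}-{high}")
```

The same helper can be applied to the rows of any other genus; whether unclassified, related viruses are included changes the resulting ranges.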

Fig. 3 Schematic genome organization of the alphapartitivirus Vicia cryptic virus (NC_007241.1, NC_007242.1). The RNA is represented by a black bar on which the open reading frames are illustrated as green (RdRp) and blue (CP) boxes. The numbers outline the first nucleotide position of the respective feature.

Fig. 4 Alignment of alphapartitiviral 5′ UTRs of both dsRNA segments. Areas with 100% identity are black-shaded and gray-shaded sequences are not conserved in all viruses. BCV1: beet cryptic virus 1; CCV: carrot cryptic virus; VCV: Vicia cryptic virus.

5′ ends of the two dsRNA segments differ, which could be due to the fact that not all nucleotides could be determined correctly. It is noteworthy that the 5′ UTRs of the second genome segments are all longer. Similar findings were described for the 5′ UTRs of betapartitiviruses, almost all of which begin with the pentanucleotide AGAUU and show a highly conserved region at positions 39–56 (Fig. 5). Furthermore, two conserved stem-loop structures are present, which are potentially involved in dsRNA replication and/or virion assembly (Fig. 6). Deltapartitiviruses seem to have three different types of extreme 5′ ends (assuming all ends have been determined correctly). Nearly all tripartite species (FCCV, RsCV2, and RCV-1), with the exception of BCV2, begin with the heptanucleotide GAUAAUG (Fig. 7, type 1). In a second variant, the genome segments of most bipartite species (PCV1 and PCV2, BCV2, and RNA2 of PeCV) begin with AGAAUU (type 2). Only FCV and RNA1 of persimmon cryptic virus show the nucleotides GGA(A/U)U at the 5′ end (type 3). The 5′ UTR of Cryptosporidium parvum virus 1 does not fit any of these types; both segments start with GGAAA(A/G).
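The terminal sequences listed above lend themselves to a simple prefix check. The sketch below is illustrative only: the labels and the function name `classify_five_prime_end` are hypothetical, the patterns are transcribed from the text, and a real analysis would need correctly determined termini, which the text itself flags as uncertain.

```python
import re

# Extreme 5' terminal patterns as quoted in the text; labels are illustrative.
FIVE_PRIME_TYPES = [
    ("deltapartitivirus type 1", re.compile(r"GAUAAUG")),
    ("deltapartitivirus type 2", re.compile(r"AGAAUU")),
    ("deltapartitivirus type 3", re.compile(r"GGA[AU]U")),
    ("CSpV1-like", re.compile(r"GGAAA[AG]")),
    ("betapartitivirus-like", re.compile(r"AGAUU")),
]


def classify_five_prime_end(seq: str) -> str:
    """Label a plus-strand sequence by the pattern found at its extreme 5' end."""
    rna = seq.upper().replace("T", "U")  # accept cDNA-style input as well
    for label, pattern in FIVE_PRIME_TYPES:
        if pattern.match(rna):  # match() anchors at the first nucleotide
            return label
    return "unclassified"


if __name__ == "__main__":
    # Toy sequences, not taken from GenBank records.
    for toy in ("GAUAAUGCCA", "agaattcgga", "GGAUUCAA", "GGAAAGUU", "AGAUUCAA"):
        print(toy, "->", classify_five_prime_end(toy))
```

Because the patterns do not overlap at the first five or six positions, the order of the list does not affect the result.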


Fig. 5 Alignment of betapartitiviral 5′ UTRs of both dsRNA segments. Areas with 100% identity are black-shaded and gray-shaded sequences are not conserved in all viruses. CCCV2: crimson clover cryptic virus 2; CCV: Cannabis cryptic virus; DCV2: dill cryptic virus 2; HTCV2: hop trefoil cryptic virus 2; PmV1: Primula malacoides virus 1; RCCV2: red clover cryptic virus 2; WCCV2: White clover cryptic virus 2.

Fig. 6 RNA foldings of partial 5′ UTRs (nucleotides 1–54) of RCCV2 dsRNA1 (A) and dsRNA2 (B).

Within the 3′ UTRs, there are fewer conserved sequences than in the 5′ UTRs. The 3′ UTRs of alpha- and betapartitiviruses show poly(A) stretches, which can be interrupted by other nucleotides; deltapartitiviruses and Cryptosporidium parvum virus 1 lack this feature (Fig. 8). In general, these A-rich regions seem to be longer in RNA2 than in RNA1. These findings raise the question of whether this could be an imitation of common poly(A) tails. Additionally, the adenosine content of the last 50 nucleotides at the 3′ end is significantly higher in alpha- and betapartitiviruses than in the other genera. Betapartitiviruses show another relatively conserved feature: they end with three to four Cs at the extreme 3′ end (Fig. 8).
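The comparison of adenosine content in the last 50 nucleotides can be reproduced for any plus-strand sequence with a few lines of Python. This is a sketch under stated assumptions: the function name `terminal_adenosine_fraction` is hypothetical, the window of 50 nucleotides is taken from the text, and the example sequence is a toy string rather than a real 3′ UTR.

```python
def terminal_adenosine_fraction(seq: str, window: int = 50) -> float:
    """Fraction of A residues in the last `window` nucleotides of a plus strand.

    If the sequence is shorter than the window, the whole sequence is used.
    """
    tail = seq.upper()[-window:]
    if not tail:
        raise ValueError("empty sequence")
    return tail.count("A") / len(tail)


if __name__ == "__main__":
    # Toy sequence: a GC-rich body followed by an A-rich tail ending in Cs.
    toy = "GC" * 50 + "A" * 25 + "C" * 25
    print(terminal_adenosine_fraction(toy))
```

Applied to segments of several genera, such a value would allow the kind of comparison described above, although no significance test is included here.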

RNA1 & 2 In general, RNA1 is larger than RNA2 and encodes an RdRp. Its size differs between the genera: the RdRps of alphapartitiviruses are about 72 kDa in size, those of betapartitiviruses about 87 kDa, that of CSpV1 62 kDa and those of deltapartitiviruses about 54 kDa. There are several


Fig. 7 Comparison of three different types of 5′ ends of deltapartitiviruses (not aligned). Areas with 100% identity are black-shaded. Gray-shaded sequences are not conserved among all viruses. BCV2: beet cryptic virus 2; FCV: fig cryptic virus; FCCV: Fragaria chiloensis cryptic virus; PCV1/2: pepper cryptic virus 1/2; PeCV: persimmon cryptic virus; RCV1: rose cryptic virus 1; RsCV2: Raphanus sativus cryptic virus 2.

Fig. 8 3′ poly(A) stretches (underlined) and terminally located Cs (boxed) of both segments of one alpha- and one betapartitivirus. VCV: Vicia cryptic virus, α; WCCV2: White clover cryptic virus 2, β.

known motifs that occur in RdRps of partitivirids. Motifs I and II cannot be located precisely, but motifs III to VIII can be identified with slight variations; examples are given for WCCV1 (α), WCCV2 (β), CSpV1 (Cryspovirus) and PCV1 (δ). Motif III can be detected as K-X-R-3X-(A,G,P), motif IV as D-W-2X-(F, Y)-D, motif V as G-(I, V)-X-S-G, motif VI as G-D-D, motif VII as L-(G, S, V)-Y, and motif VIII as R. The smaller RNA2 encodes a CP with a size of about 54 kDa for alphapartitiviruses, 75 kDa for betapartitiviruses and 37 kDa for CSpV1. In the case of the deltapartitiviruses there seem to be two types of CP, which can be distinguished by size: one is about 49 kDa and the other about 38 kDa. Compared with the RdRps, the CPs contain few clearly conserved sequence stretches. Some motifs have been described, the first of which is called “SQLY” and varies considerably between species and genera. A second motif, “PGPL3XF”, is found in alphapartitiviruses and, in a deviating form, in CSpV1.
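The consensus patterns given for motifs III to VII translate directly into regular expressions. The sketch below is an illustration, not a validated motif finder: the names `RDRP_MOTIFS` and `find_rdrp_motifs` are hypothetical, the motifs are assumed to occur in the order III to VII along the protein, motif VIII (a single R) is omitted because it is too unspecific to search for on its own, and the example protein is a toy string.

```python
import re

# Consensus patterns from the text, written as regular expressions
# ("X" = any residue, alternatives in brackets).
RDRP_MOTIFS = {
    "III": r"K.R.{3}[AGP]",  # K-X-R-3X-(A,G,P)
    "IV": r"DW.{2}[FY]D",    # D-W-2X-(F,Y)-D
    "V": r"G[IV].SG",        # G-(I,V)-X-S-G
    "VI": r"GDD",            # G-D-D
    "VII": r"L[GSV]Y",       # L-(G,S,V)-Y
}


def find_rdrp_motifs(protein: str) -> dict:
    """Return the 0-based start of each motif, searching in N-to-C order.

    Each search starts after the previous hit; motifs that are not found
    are reported as None and do not move the search position.
    """
    sequence = protein.upper()
    hits = {}
    position = 0
    for name, pattern in RDRP_MOTIFS.items():
        match = re.compile(pattern).search(sequence, position)
        if match is None:
            hits[name] = None
        else:
            hits[name] = match.start()
            position = match.end()
    return hits


if __name__ == "__main__":
    toy = "MM" + "KARLLLA" + "TT" + "DWSSFD" + "TT" + "GIASG" + "TT" + "GDD" + "TT" + "LGY" + "R"
    print(find_rdrp_motifs(toy))
```

Short patterns such as GDD or L-(G,S,V)-Y occur by chance in many proteins, which is why the sketch constrains each search to the region after the preceding motif.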

Replication Strategy In the plant cell, dsRNA usually triggers RNAi as a plant defense mechanism, leading to post-transcriptional gene silencing (PTGS), which is responsible for its degradation. To circumvent this defense, viruses have evolved silencing suppressors (VSRs). No VSR has yet been detected for partitivirids, so they are believed to perform transcription/replication within the particle. Every virus particle contains one or two RdRp molecules anchored to the CP as well as one or two dsRNA segments coding for either the CP or the RdRp. The RdRp acts in a semi-conservative manner, producing plus-strands from the minus-strand in this encyclopedia. In a 3D reconstruction of the capsid of Penicillium stoloniferum virus S (PsV-S), possible pores were discovered that are probably associated with the transport of plus-strand RNA from the particle into the cytoplasm during transcription. The newly synthesized plus-strand remains in the particle while the parental plus-strand is released into the cytoplasm. There, RdRps and CPs are translated by cellular ribosomes and new virions are packaged: RdRps and RNAs associate and interact with the CPs, whereupon the particle is fully assembled. Protein-protein interaction studies of alpha- and betapartitiviruses gave varying results. For example, the expected interaction between RdRp and CP could be demonstrated only for alphapartitiviruses and not for betapartitiviruses. Furthermore, in the case of alphapartitiviruses, the CP dimers seem to be located in the nuclear membrane, whereas those of betapartitiviruses are found in inclusions within the cytoplasm of epidermis cells. These findings indicate different replication strategies, which still have to be elucidated.

Transmission Partitivirids cannot be transmitted mechanically, by grafting or by natural vectors. Within the plant or protozoon, no cell-to-cell movement is observed except during cell division or gamete fusion in the case of CSpV1. From one plant to another, partitivirids can be transmitted at high rates via seeds, ovules and pollen. It was found in crosses of BCV-infected and non-infected sugar beet plants


(either female or pollen) that a non-infected female plant crossed with pollen from an infected plant led to an infection rate in progeny plants of 43%, and the reverse cross to a rate of 82%. When both plants were infected, the infection rate in progeny plants was 100%. Partitivirids cannot be eliminated by meristem culture or thermotherapy and remain in their hosts for life. In experiments on Dianthus, Szego and colleagues showed that carnation cryptic virus 1 (CCV1) was still present in the plants after 16 years of in vitro cultivation.

Virus-Host-Interaction Infection with partitiviruses is asymptomatic, which is also referred to as a latent infection. To date, there is hardly any evidence of effects that partitiviruses have on their hosts. One study, however, showed that expression of the gene TrEnodDR1 from Trifolium repens in Lotus japonicus resulted in reduced growth of the transformants and, owing to an increase in endogenous abscisic acid, a smaller number of root nodules formed upon infection with Mesorhizobium loti. Within this gene, sequences corresponding to the CP gene of WCCV1 were detected, leading to the conclusion that an infection of T. repens with WCCV1 could be mutualistic. Another study screened perennial ryegrass with regard to salt toxicity, and one result was a clone containing a partial sequence of an RdRp from the putative deltapartitivirus RsCV2. These findings may be evidence for mutualistic relationships between plants and partitivirids. This effect does not seem to be limited to plants: studies of CSpV1 indicate that a strain-dependent increase in virus concentration in its protozoan host is associated with increased C. parvum replication in cell cultures and higher oocyst counts in dairy calves. Another impact on the host plant is horizontal gene transfer. CP-like sequences have been shown to be present in Arabidopsis and Nicotiana species. However, most inserts are pseudogenes with internal stop codons. It has also been reported that no inserts could be detected in the genomes of infected plants.

Further Reading Blawid, R., Stephan, D., Maiss, E., 2008. Alphacryptovirus and betacryptovirus. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. Amsterdam: Elsevier, pp. 98–104. Boccardo, G., Milne, R.G., Luisoni, E., Lisa, V., Accotto, G.P., 1985. Three seedborne cryptic viruses containing double-stranded RNA isolated from white clover. Virology 147 (1), 29–40. Chiba, S., Kondo, H., Tani, A., et al., 2011. Widespread endogenization of genome sequences of non-retroviral RNA viruses into plant genomes. PLoS Pathogens 7 (7), e1002146. Ghabrial, S.A., Ochoa, W.F., Baker, T.S., Nibert, M.L., 2008. Partitiviruses. General features. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. Amsterdam: Elsevier, pp. 68–75. Khramtsov, N.V., Upton, S.J., 2000. Association of RNA polymerase complexes of the parasitic protozoan Cryptosporidium parvum with virus-like particles. Heterogeneous system. Journal of Virology 74 (13), 5788–5795. Lesker, T., Maiss, E., 2013. In planta protein interactions of three alphacryptoviruses and three betacryptoviruses from white clover, red clover and dill by bimolecular fluorescence complementation analysis. Viruses 5 (10), 2512–2530. Lesker, T., Rabenstein, F., Maiss, E., 2013. Molecular characterization of five betacryptoviruses infecting four clover species and dill. Archives of Virology 158 (9), 1943–1952. Nakatsukasa-Akune, M., Yamashita, K., Shimoda, Y., et al., 2005. Suppression of root nodule formation by artificial expression of the TrEnodDR1 (coat protein of White clover cryptic virus 1) gene in Lotus japonicus. Molecular Plant-Microbe Interactions (MPMI) 18 (10), 1069–1080. Nibert, M.L., Ghabrial, S.A., Maiss, E., et al., 2014. Taxonomic reorganization of family Partitiviridae and other recent progress in partitivirus research. Virus Research 188, 128–141. Nibert, M.L., Woods, K.M., Upton, S.J., Ghabrial, S.A., 2009. 
Cryspovirus: A new genus of protozoan viruses in the family Partitiviridae. Archives of Virology 154 (12), 1959–1965. Ochoa, W.F., Havens, W.M., Sinkovits, R.S., et al., 2008. Partitivirus structure reveals a 120-subunit, helix-rich capsid with distinctive surface arches formed by quasisymmetric coat-protein dimers. Structure 16 (5), 776–786. Roossinck, M.J., 2010. Lifestyle of plant viruses. Philosophical Transactions of The Royal Society B Biological Sciences 365 (1548), 1899–1905. Sabanadzovic, S., Valverde, R.A., Brown, J.K., Martin, R.R., Tzanetakis, I.E., 2009. Southern tomato virus. The link between the families Totiviridae and Partitiviridae. Virus Research 140 (1–2), 130–137. Tzanetakis, I.E., Price, R., Martin, R.R., 2008. Nucleotide sequence of the tripartite Fragaria chiloensis cryptic virus and presence of the virus in the Americas. Virus Genes 36 (1), 267–272. Vainio, E.J., Chiba, S., Ghabrial, S.A., et al., 2018. ICTV virus taxonomy profile: Partitiviridae. Journal of General Virology 99 (1), 17–18.

Relevant Websites https://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=11012 Complete genomes: Partitiviridae. NCBI. https://talk.ictvonline.org/ictv-reports/ictv_online_report/dsrna-viruses/w/partitiviridae Partitiviridae. dsRNA Viruses. ICTV. https://viralzone.expasy.org/168?outline=all_by_species Partitiviridae ~ ViralZone page.

Quadriviruses (Quadriviridae) Hideki Kondo, Okayama University, Kurashiki, Japan José R Castón, National Center for Biotechnology, Spanish National Research Council, Madrid, Spain Nobuhiro Suzuki, Institute of Plant Stress and Resources (IPSR), Okayama University, Kurashiki, Japan © 2021 Elsevier Ltd. All rights reserved.

Glossary Quadriparticulate virus Virus with the four genomic segments separately packaged into four different particles. Quadripartite genome The essential genome is separated into four genomic segments.

T = 1 icosahedral symmetry Symmetry of a capsid composed of 60 subunits, which form 12 pentameric capsomeres.

Introduction Screens involving filamentous fungi, particularly phytopathogenic fungi, have detected an enormous number of novel mycoviruses (or fungal viruses) that have unique biological and molecular features, and that enhanced our knowledge of virus diversity and evolution. Many of them have double-stranded RNA (dsRNA) genomes and isometric particles. DsRNA mycoviruses are currently classified into six families, Chrysoviridae (includes both traditional chrysoviruses with four segments as well as proposed others with three to seven segments), Megabirnaviridae (two segments), Partitiviridae (mostly two segments), Quadriviridae (four segments), Reoviridae (11 or 12 segments), and Totiviridae (nonsegmented), and one genus Botybirnavirus (two segments, family unassigned). Some dsRNA mycoviruses have also been proposed as the members of tentative families or genera, such as Alternaviridae (four segments), Fusagraviridae (nonsegmented), Megatotiviridae (nonsegmented) and Phlegivirus (nonsegmented), the virion morphology for which is largely unknown, except for alternaviruses. The Quadriviridae is a recently established virus family, and its members are non-enveloped spherical viruses with quadripartite dsRNA genomes. The genus Quadrivirus within the family currently contains a single species Rosellinia necatrix quadrivirus 1. Rosellinia necatrix quadrivirus 1 (RnQV1) was discovered during a screen of over 1000 Japanese isolates of the white root rot fungus, Rosellinia necatrix (class Sordariomycetes), which is an ascomycete able to infect over 400 plants. The name quadrivirus originates from the ‘quadripartite’ dsRNA genome. At present, no species demarcation criteria exist for the genus Quadrivirus since only one species has been assigned to this genus. The virion structure of quadriviruses has unique features as compared to those of other known dsRNA viruses. 
Quadriviruses appear to be phylogenetically closely related to totiviruses (members of the genus Totivirus, which have unsegmented dsRNA genomes). The goals of this article are to give an overview of the features of the well-studied quadriviruses and to compare them with those of tentative members of the family Quadriviridae and other related dsRNA viruses.

Virion Structure and Composition The virion of RnQV1 is non-enveloped, isometric, about 45 nm in diameter (Fig. 1) and likely packages each of the four dsRNA genome segments, which are 3.5–5.0 kbp in length, separately. The buoyant density of quadrivirus virions is not known. The size of quadriviruses is similar to that of chryso-, partiti- and totiviruses (25–43 nm) and megabirnaviruses (~52 nm), which also separately encapsidate their segments, but is much smaller than that of mycoreoviruses (~80 nm), which package all of their dsRNA segments into a single multi-layered virion. Most spherical dsRNA mycoviruses have a single-shelled T = 1 capsid formed by 60 asymmetric homodimers of the capsid protein (a 120-subunit T = 1 capsid). An exception is the chrysovirus particle, which is composed of 60 copies of a single capsid protein with a duplicated helix-rich domain (a genuine T = 1 capsid). The RnQV1 capsid is composed of 60 copies of a heterodimer of two structural proteins, P2 (1356 or 1357 residues) and P4 (1061 or 1059 residues), which are encoded by dsRNA2 and dsRNA4, respectively, although the 383-residue C-terminal region of P2 is absent from the mature viral particle (Fig. 2). The P2-P4 heterodimers are organized in a quaternary structure similar to that of reovirus, chrysovirus and totivirus. The two structural proteins of RnQV1 isolate W1075 are cleaved into several polypeptides without altering capsid structural integrity, likely during virion purification and/or in the infected fungal host, whereas in isolate W1118 P2 and P4 remain nearly intact (Fig. 3(a)). A similar capsid assembly involving two structural proteins may be suggested for the botybirnaviruses Botrytis porri RNA virus 1 and Alternaria alternata botybirnavirus 1, and for a betachrysovirus, Magnaporthe oryzae chrysovirus, which possibly belongs to the newly proposed genus Betachrysovirus (currently designated Cluster II) of the family Chrysoviridae (see Fig. 5 for their phylogenetic relationships). 
Posttranslational cleavage/degradation similar to that seen in RnQV1-W1075 virions was also observed for the two putative structural proteins of these dsRNA mycoviruses, even though their structural characteristics are still unclear. RnQV1 P2 and P4 lack sequence similarity, but they have a similar α-helical domain that is the structural signature shared with most dsRNA viruses. In addition to organizing the viral genome and RNA-dependent RNA-polymerase (RdRp) molecules, P2 and


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20948-5


Fig. 1 Quadrivirus particle structure. Electron micrograph of negatively stained particles of RnQV1-W1118 (scale bar, 100 nm).

Fig. 2 Three-dimensional cryo-EM reconstruction of RnQV1 virions at 3.7 Å resolution. (a) Radially color-coded surface-shaded virion T = 1 capsid viewed along an icosahedral two-fold axis showing 12 outwardly protruding decamers (orange). Bar = 100 Å. (b) Ribbon diagrams of P2 (blue) and P4 (yellow) proteins (top view). N and C termini are indicated. Symbols indicate icosahedral symmetry axes. (c) Surface-shaded virion T = 1 capsid viewed along an icosahedral two-fold axis highlighting the A (blue) and B (yellow) structural subunits. P2 C-terminal regions are clearly seen as hooks embracing the P4 outer surface. Reproduced from Fig. 1(f) of Mata, C.P., Luque, D., Gómez-Blanco, J., et al., 2017. Acquisition of functions on the outer capsid surface during evolution of double-stranded RNA fungal viruses. PLoS Pathogens 13, e1006755.

P4 may have acquired new functions by inserting complex domains in preferential insertion sites at the capsid outer surface, likely related to enzyme activity. The P2 insertion has a fold similar to that of gelsolin and profilin, two actin-binding proteins with a function in cytoskeleton metabolism, whereas the P4 insertion suggests protease activity involved in cleavage of the P2 C-terminal region. Except for reoviruses, whose 10–12 dsRNA genome segments are densely packaged within a single particle (~40 bp/100 nm³, which corresponds to a spacing between dsRNA strands of 25–30 Å), most of the known multi-segmented dsRNA mycoviruses are multi-particulate: their dsRNA segments are encapsidated separately, and they might have a lower density at the central region of the capsid, with a single loosely packed dsRNA molecule (~20 bp/100 nm³, with an interstrand spacing of ~40–45 Å in the case of chrysoviruses and partitiviruses). A similar value for packed dsRNA, 25 bp/100 nm³, assuming one dsRNA molecule per capsid, was proposed for quadriviruses. Such low density at the central region of the capsid may increase dsRNA mobility, which might be necessary for maximum activity of the RdRp complex. Interestingly, the amounts of dsRNA1 and dsRNA4 in quadrivirus virions were consistently lower than those of dsRNA2 and dsRNA3 (Fig. 3(b)), and a similar virion dsRNA profile was observed in a chrysovirus, Helminthosporium victoriae virus 145S (HvV145S, proposed genus Alphachrysovirus, formerly the genus Chrysovirus). Thus, it is also tempting to speculate that two heterogeneous dsRNA sets, dsRNA1 + dsRNA4 (8.6 kbp total) and dsRNA2 + dsRNA3 (8.5 kbp total), where each set always shows a similar accumulation ratio, are separately packed in two different capsids. If this is the case, an alternative density for the quadrivirus capsid with two


Fig. 3 Quadrivirus dsRNA and protein components. (a) SDS-PAGE pattern of structural proteins of RnQV1-W1118 and -W1075. (b) Agarose gel electrophoresis of RnQV1-W1118 and -W1075 genomic dsRNA segments. The genomic dsRNA segments of a mycoreovirus (MyRV1 S1–S6, and others are not shown) and a DNA size marker (M) were used as size standards. Reproduced from Fig. 1(b) and (c) of Lin, Y.H., Hisano, S., Yaegashi, H., Kanematsu, S., Suzuki, N., 2013. A second quadrivirus strain from the phytopathogenic filamentous fungus Rosellinia necatrix. Archives of Virology 158, 1093–1098, with permission from Springer.

heterologous dsRNA molecules could conceivably be calculated as 50 bp/100 nm³. See the other article of this section, “Structure of Double-Stranded RNA Mycoviruses” by Castón et al., for further discussion of this possibility.
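The packing-density argument above is a matter of simple proportionality and can be sketched numerically. The short Python example below is illustrative only: it assumes that the 25 bp/100 nm³ figure applies to a capsid holding one segment of average length, and it uses the RnQV1-W1075 segment lengths reported in this article. The interior volume it derives is an assumption of the sketch, not a measured value, and the function name is hypothetical.

```python
# Segment lengths (bp) of RnQV1-W1075 as reported in this article.
SEGMENTS_BP = {"dsRNA1": 4942, "dsRNA2": 4352, "dsRNA3": 4099, "dsRNA4": 3685}


def packing_density(total_bp, volume_nm3):
    """Return the dsRNA packing density in bp per 100 nm^3."""
    return 100.0 * total_bp / volume_nm3


# Assumption: 25 bp/100 nm^3 corresponds to one segment of mean length,
# which fixes a hypothetical interior volume for the capsid.
mean_bp = sum(SEGMENTS_BP.values()) / len(SEGMENTS_BP)
interior_volume_nm3 = 100.0 * mean_bp / 25.0

# The two speculated co-packaged sets.
set_a = SEGMENTS_BP["dsRNA1"] + SEGMENTS_BP["dsRNA4"]
set_b = SEGMENTS_BP["dsRNA2"] + SEGMENTS_BP["dsRNA3"]

density_a = packing_density(set_a, interior_volume_nm3)
density_b = packing_density(set_b, interior_volume_nm3)

print(f"dsRNA1 + dsRNA4: {set_a} bp, {density_a:.1f} bp/100 nm^3")
print(f"dsRNA2 + dsRNA3: {set_b} bp, {density_b:.1f} bp/100 nm^3")
```

Under these assumptions both sets come out close to 50 bp/100 nm³, that is, roughly double the single-segment value, because each set is about twice the mean segment length.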

Genome Organization and Expression Quadriviruses have a multi-segmented genome consisting of four monocistronic linear dsRNA segments. The complete genome sequence of the RnQV1-W1075 isolate totals 17,078 bp. Each dsRNA segment ranges from 3.7 to 4.9 kbp (GC contents are 51.9%–54.3%) and has a single large open reading frame (ORF) covering 86%–97% of its segment size (Fig. 4). The largest segment, dsRNA1 (4942 bp), codes for a protein of unknown function (P1, 1602 amino acid residues, 178.2 kDa). The third segment, dsRNA3 (4099 bp), codes for the RdRp (P3, 1310 amino acids, 146.8 kDa). The second and fourth segments, dsRNA2 (4352 bp) and dsRNA4 (3685 bp), code for the two structural proteins, P2 (1356 amino acids, 147.4 kDa) and P4 (1061 amino acids, 113.2 kDa), respectively. The quadripartite nature of quadriviruses resembles that of the traditional chrysoviruses, including several members of the proposed genus Alphachrysovirus, and of alternaviruses, members of the proposed family Alternaviridae. However, the total size of the quadrivirus genome (3.7–4.9 kbp, 17.1 kbp total) is approximately 1.5–1.7 fold larger than those of traditional chrysoviruses (2.5–3.6 kbp, ~12.6 kbp total) and alternaviruses (1.4–3.6 kbp, 10.4 kbp total). The 5′- and 3′-terminal nucleotide sequences of the four dsRNA genome segments of RnQV1 are conserved (the positive-sense strand sequence of RnQV1-W1075 is 5′-C/UACGAAU-CAUGAGAAUAUUCG/A-3′, Fig. 4). Several conserved sequence stretches are also present in both untranslated regions (UTRs), and the 3′-ends are better conserved than the 5′-end regions. The heterogeneity at the extreme termini, C or U at the 5′-end and G or A at the 3′-end, is notable. As is found in typical chrysoviruses and some partitiviruses, “CAA” repeats were also found in the 5′-UTR, or within the adjacent coding region, of each RnQV1 segment except for dsRNA3. 
The (CAA) repeat sequences are known to serve as enhancer elements in the 5′-UTRs of tobamoviruses, which are plant alpha-like viruses. Recently, the 5′-UTRs of two typical chrysoviruses, Cryphonectria nitschkei chrysovirus 1 and HvV145S (215 and 293 nt, respectively), were found to have internal ribosomal entry site (IRES) activities, while those of


Fig. 4 Genomic organization of RnQV1 isolate W1075. The genome consists of four monocistronic dsRNA segments. dsRNA3 encodes the putative replicase (RdRp), while dsRNA2 and dsRNA4 encode structural proteins. dsRNA1 encodes a putative non-structural protein with unknown function. The sequence stretches conserved among the segments are shown below dsRNA4. Solid lines and colored open boxes represent genomic dsRNAs and open reading frames on the positive-sense strands, respectively.

quadriviruses are relatively short (23–45 nt) and show no IRES activity. The significance of (CAA)n repeats in translation may therefore differ between quadriviruses and chrysoviruses. Alternaviruses and members of the genera Alphapartitivirus and Betapartitivirus have poly(A) tails or poly(A) tracts either at the 3′-terminus or near the 3′-terminus of the coding strand, whereas quadriviruses and most other dsRNA viruses have no such poly(A)-rich sequences. Although little is known about the replication of multi-segmented dsRNA mycoviruses, their replication and transcription appear to occur inside particles and to involve virion-associated RdRps, as in other well-studied dsRNA viruses such as rice dwarf virus (a plant reovirus, genus Phytoreovirus, family Reoviridae) and yeast L-A virus (Saccharomyces cerevisiae virus L-A, genus Totivirus, family Totiviridae). Therefore, RnQV1 P3 (RdRp) may be involved in genomic dsRNA replication and transcription, although the presence of RdRp in quadrivirus virions remains to be confirmed. Putative small pores are located at the five-fold (~11 Å-diameter hole) and three-fold (~7 Å-diameter hole) axes of the RnQV1 capsid. This feature resembles that of other single-shelled T = 1 capsids of dsRNA mycoviruses, suggesting that conformational changes in the P2 and/or P4 regions that face the channel wall might allow the exit of viral transcripts. In addition, the two putative insertion domains in the P2 and P4 proteins, which are located on the outer surface of the capsid, are hypothesized to possess enzyme activities. Interestingly, in the case of the yeast L-A virus capsid, the inserted domain is located at a trench on the outer surface and is responsible for catalyzing the unique “cap-snatching” reaction, which is known as an alternative method for capping mRNA.
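The segment and protein sizes given in this section for RnQV1-W1075 can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only; it assumes that each ORF spans three nucleotides per amino acid residue plus one stop codon, which is a simplification introduced here and not a statement from the cited studies, and the function name is hypothetical.

```python
# (segment length in bp, encoded protein length in amino acids), RnQV1-W1075,
# values as stated in this article.
GENOME = {
    "dsRNA1": (4942, 1602),
    "dsRNA2": (4352, 1356),
    "dsRNA3": (4099, 1310),
    "dsRNA4": (3685, 1061),
}


def orf_coverage(segment_bp, protein_aa):
    """Fraction of the segment occupied by the ORF.

    Assumption: ORF length = 3 nt per residue plus one stop codon.
    """
    return 3 * (protein_aa + 1) / segment_bp


total_bp = sum(length for length, _ in GENOME.values())
coverage = {name: orf_coverage(*values) for name, values in GENOME.items()}

print(f"total genome size: {total_bp} bp")
for name, fraction in coverage.items():
    print(f"{name}: ORF covers {100 * fraction:.1f}% of the segment")
```

Under this assumption the four segments sum to 17,078 bp and the ORF coverage ranges from about 86% (dsRNA4) to about 97% (dsRNA1), in agreement with the figures quoted above.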

Molecular and Biological Properties Three RnQV1 isolates were found in R. necatrix strains W1075, W1118 and W726, which were obtained from Japanese pear orchards in Saga, Ibaraki, and Saga Prefectures of Japan, respectively. RnQV1-W1075 and W1118 have been completely sequenced, while only the complete sequence of dsRNA3 is available for RnQV1-W726. Nucleotide sequence identities between RnQV1-W1075 and W1118 vary from 67% (dsRNA1 and dsRNA3) to 70% (dsRNA2). Amino acid sequence identities between the two virus isolates are 72% (P1), 81% (P2), 74% (P3) and 82% (P4). The W726 dsRNA3 (RdRp) shows 86% and 66% nucleotide sequence (98% and 75% amino acid sequence) identities to those of W1075 and W1118, respectively. Rabbit polyclonal antibodies against RnQV1-W1075 P2 recognize RnQV1-W1118 P2 in western blot analysis. Since a number of field isolates displayed dsRNA profiles similar to that of quadriviruses, it seems that quadriviruses are widespread in the R. necatrix population. RnQV1 persistently infects its natural fungal host R. necatrix and is transmitted intracellularly during cell division and/or mycelial cell fusion (hyphal anastomosis). Transfection of protoplasts from R. necatrix and Cryphonectria parasitica (an experimental model fungal host) with purified RnQV1 particles was unsuccessful. Little is known about vertical transmission of the virus during sexual or asexual sporulation, or about extracellular transmission in nature. RnQV1 is stably maintained in the host fungus during successive subcultures for a long period of time. However, viral infection has no discernible phenotypic effects on host colony morphology, suggesting that it is a symptomless infection. The spread of RnQV1 throughout fungal colonies is slower than that of other R. 
necatrix-infecting viruses, such as Rosellinia necatrix megabirnavirus 1 and Rosellinia necatrix partitivirus 1, because its dsRNA distribution within a colony is uneven, forming sectors with low levels of virus accumulation. RnQV1 does not block transgene silencing, and thus, it appears to have no RNA silencing suppressor activity. RnQV1 does not upregulate antiviral RNA silencing-related genes, while some other viruses do in R. necatrix. Next-generation sequencing analyses suggested that sense-strand-specific viral small RNA peaks were found at the 3′ termini of RnQV1 segments. Nevertheless, the


Fig. 5 Phylogenetic relationships of RnQV1 and other related dsRNA viruses. A neighbor-joining tree was constructed from an alignment of the RdRp amino acid sequences of quadriviruses and selected dsRNA viruses including totiviruses, chrysoviruses and botybirnaviruses. Numbers at the nodes represent bootstrap percentages (1000 replicates). The scale bar represents the amino acid distances.

abundance of RnQV1 small RNAs was significantly lower than those of other studied R. necatrix dsRNA viruses, suggesting that RnQV1 is sequestered from the RNA silencing machinery, or that it has unknown counter-defense mechanism(s).

Phylogenetic and Evolutionary Relationships of RnQV1 to Other dsRNA Mycoviruses Virus-like dsRNA elements, namely Amasya cherry disease (ACD)-associated dsRNAs with sequence similarity to RnQV1 dsRNA segments together with two novel mycoviruses (chrysovirus and partitivirus), were identified from diseased cherry trees. RnQV1 P3 shows moderate levels of amino acid sequence identities to polypeptides encoded by ACD large (L) dsRNA3 (35%) and L dsRNA4 (36%), which putatively encode RdRps that are related to each other, and are clustered together with quadriviruses in the


phylogenetic tree (Fig. 5). RnQV1 P1 shows weak amino acid sequence identities to polypeptides encoded by ACD L dsRNA1 (19%) and L dsRNA2 (27%). Although no ACD-associated dsRNA segments encoding structural proteins similar to RnQV1 P2 (dsRNA2) and P4 (dsRNA4) have been detected, it might be that these quadrivirus-like dsRNAs are derived from two related mycoviruses. Similarly, quadrivirus-like dsRNA elements, together with two other mycoviruses, were also isolated from the cherry leaves that had Cherry chlorotic rusty spot disease (CCRS). The etiology of ACD and CCRS occurring in Turkey and Italy, respectively, is still unknown. However, a Spanish fungal disease called Cherry leaf scorch (CLS), that has very similar symptoms to CCRS and mycovirus-like dsRNA-elements, is suggested to be caused by Apiognomonia erythrostoma (class Sordariomycetes). Thus, ACD- and CCRS-associated dsRNAs are likely of a fungal virus origin. RnQV1 P3 shows 22%–33% amino acid sequence identities to RdRps of other dsRNA mycoviruses, including the members of the families Totiviridae (unsegmented dsRNA genome) and Chrysoviridae (multi-segmented dsRNA genome). Phylogenetic analyses show that quadriviruses are more closely related to members of the family Totiviridae than to other known quadripartite dsRNA viruses, such as chrysoviruses and alternaviruses (Fig. 5, alternaviruses are not included). Quadriviruses and members of the genus Totivirus that have unsegmented genomes with two large ORFs for a capsid protein and a RdRp, form a well-supported monophyletic cluster within the Totiviridae family clade, suggesting that they descended from a common ancestor except for a unique totivirus, Ustilago maydis virus H1 (UmV-H1), that has a single ORF encoding a polypeptide with a capsid protein and RdRp domains. UmV-H1 and yeast L-A totivirus commonly have additional components (i.e., satellite dsRNAs), which encode secreted toxin precursor proteins responsible for the killer phenomena. 
UmV-H1 is phylogenetically related to recently discovered unsegmented dsRNA mycoviruses (relatives of UmV-H1) and botybirnaviruses (two-segmented dsRNA genome) (Fig. 5). Therefore, it is necessary to revise the taxonomy of the current members of the genus Totivirus based on phylogenetic relationships. Similar to quadriviruses, the betachrysovirus and botybirnavirus capsids may be composed of two different structural proteins. In addition, some plant-infecting deltapartitiviruses may have three dsRNA genome segments and encode two different structural proteins. The acquisition of two different structural protein genes (quadriviruses and probably betachrysoviruses), the duplication of a structural protein gene (botybirnaviruses and presumably some tripartite partitiviruses) and the duplication of a helix-rich domain within a single gene (alphachrysoviruses) possibly occurred independently during the course of evolution of these dsRNA mycoviruses. At the time of writing this article, a new quadrivirus, designated Leptosphaeria biglobosa quadrivirus 1, with a quadripartite dsRNA genome, was reported from the blackleg disease fungus Leptosphaeria maculans (anamorph Phoma lingam, class Dothideomycetes) infecting oilseed rape (Brassica napus). This virus shares many molecular characteristics with RnQV1, but its RdRp is more closely related to the putative RdRps encoded by ACD L dsRNA3 and dsRNA4, and CCRS L dsRNA3 and dsRNA4.
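The percent amino acid identities quoted in this article are pairwise values derived from sequence alignments. As a minimal sketch of how such a value can be obtained from two already-aligned sequences, the Python function below counts identical residues over the alignment columns in which neither sequence has a gap. This is only one of several conventions in use, the function name is hypothetical, and the two short sequences are invented toy strings, not real RdRp sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment, for illustration only.
toy_seq_1 = "MKT-AGDDLV"
toy_seq_2 = "MRTQAGDD-V"

print(f"{percent_identity(toy_seq_1, toy_seq_2):.1f}% identity")
```

Because the denominator can be defined in different ways (alignment length, shorter sequence length, or gap-free columns as here), identity values from different studies are comparable only when the same convention and alignment method are used.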

See also: Structure of Double-Stranded RNA Mycoviruses

Further Reading Chiba, S., Castón, J.R., Ghabrial, S.A., Suzuki, N., ICTV Report Consortium, 2018. ICTV virus taxonomy profile: Quadriviridae. Journal of General Virology 99, 1480–1481. Kondo, H., Kanematsu, S., Suzuki, N., 2013. Viruses of the white root rot fungus, Rosellinia necatrix. Advances in Virus Research 86, 177–214. Kozlakidis, Z., Covelli, L., Di Serio, F., et al., 2006. Molecular characterization of the largest mycoviral-like double-stranded RNAs associated with Amasya cherry disease, a disease of presumed fungal aetiology. Journal of General Virology 87, 3113–3117. Lin, Y.-H., Chiba, S., Tani, A., et al., 2012. A novel quadripartite dsRNA virus isolated from a phytopathogenic filamentous fungus, Rosellinia necatrix. Virology 426, 42–50. Lin, Y.-H., Hisano, S., Yaegashi, H., Kanematsu, S., Suzuki, N., 2013. A second quadrivirus strain from the phytopathogenic filamentous fungus Rosellinia necatrix. Archives of Virology 158, 1093–1098. Luque, D., Mata, C.P., Gonzalez-Camacho, F., et al., 2016. Heterodimers as the structural unit of the T = 1 capsid of the fungal double-stranded RNA Rosellinia necatrix quadrivirus 1. Journal of Virology 90, 11220–11230. Luque, D., Mata, C., Suzuki, N., Ghabrial, S., Castón, J., 2018. Capsid structure of dsRNA fungal viruses. Viruses 10, 481. Mata, C.P., Luque, D., Gómez-Blanco, J., et al., 2017. Acquisition of functions on the outer capsid surface during evolution of double-stranded RNA fungal viruses. PLoS Pathogens 13, e1006755. Sato, Y., Castón, J.R., Suzuki, N., 2018. The biological attributes, genome architecture and packaging of diverse multi-component fungal viruses. Current Opinion in Virology 33, 55–65. Shah, U.A., Kotta-Loizou, I., Fitt, B.D., Coutts, R.H., 2019. Identification, molecular characterization, and biology of a novel quadrivirus infecting the phytopathogenic fungus Leptosphaeria biglobosa. Viruses 11, 9. Yaegashi, H., Sawahata, T., Ito, T., Kanematsu, S., 2011. 
A novel colony-print immunoassay reveals differential patterns of distribution and horizontal transmission of four unrelated mycoviruses in Rosellinia necatrix. Virology 409, 280–289. Yaegashi, H., Shimizu, T., Ito, T., Kanematsu, S., 2016. Differential inductions of RNA silencing among encapsidated double-stranded RNA mycoviruses in the white root rot fungus, Rosellinia necatrix. Journal of Virology 90, 5677–5692.

Relevant Website https://talk.ictvonline.org/ictv-reports/ictv_online_report/dsrna-viruses/w/quadriviridae Quadriviridae.

Totiviruses (Totiviridae)1 Bradley I Hillman and Alanna B Cohen, Rutgers University, New Brunswick, NJ, United States © 2021 Elsevier Ltd. All rights reserved. This is an update of S.A. Ghabrial, Totiviruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00518-5.

Glossary Anastomosis Fusion between hyphal branches allowing for exchange of cytoplasm, ions, and organelles, and the most common means of horizontal transmission of mycoviruses. Conidia Asexual, non-motile fungal spores. Mycoviruses Viruses that infect and multiply in fungi. Protoplast A cell that has its cell wall removed, allowing for membrane fusion. Pseudoknot A secondary structure in viral mRNA that slows movement of the ribosome and may cause a frameshift that allows entry to an alternative reading frame during translation.

Ribosomal frameshifting Ribosomes switching reading frame on an mRNA, in response to the presence of a slippery site and/or a pseudoknot, to synthesize a fusion protein or a polyprotein from two overlapping reading frames. RNA silencing The negative regulation or inhibition of gene expression by non-coding RNA elements, also called RNA interference (RNAi). Totivirids Viruses within the family Totiviridae. Totiviruses Viruses within the genus Totivirus. Vegetative incompatibility Genetic differences between fusing vegetative hyphae of fungi that result in unviable heterokaryons and prevent horizontal mycovirus transmission.

Introduction The discovery of the killer phenomenon in the 1960s in yeast (Saccharomyces cerevisiae) and in the smut fungus (Ustilago maydis) eventually led to the discovery of the isometric double-stranded (ds) RNA mycoviruses with nonsegmented genomes, currently classified in the genus Totivirus (family Totiviridae), a name that distinguishes them from the partitiviruses, which have segmented genomes. An interesting feature of these viruses became apparent: virus-infected killer strains of yeast or smut secrete a protein toxin to which they are immune, but which is lethal to sensitive cells. The precursor to the killer toxin is now known to be encoded by a satellite dsRNA, which is dependent on a helper totivirus for encapsidation and replication. Unlike the totiviruses associated with the yeast and smut killer systems, other members of the family Totiviridae are not known to be associated with killer phenotypes. The other accepted genera in the family Totiviridae include the viruses that infect filamentous fungi (members of the genus Victorivirus) and those that infect parasitic protozoa (members of the genera Giardiavirus, Leishmaniavirus, and Trichomonasvirus). Genera proposed to be included in the family Totiviridae include Artivirus and Eimeriavirus, neither of which contains toxin-producing member viruses. The viruses constituting the family Totiviridae, including proposed members, will be referred to here as totivirids, distinguishing them from members of the genus Totivirus. Totivirids that infect fungi and protozoa were the first dsRNA viruses with undivided genomes. Members of the new dsRNA virus family Amalgaviridae also have undivided genomes with an architecture similar to that of members of the Totiviridae, but they are phylogenetically more closely related to the Partitiviridae family of viruses. Purified preparations of several totivirids and similar viruses contain dsRNA species suspected of being satellite or defective dsRNAs. 
An extreme example of this phenomenon is the relationship between yado-nushi virus 1 (YnV1), an unclassified totivirus most closely related to members of the genus Giardiavirus, which hosts and encapsidates a viral RNA, yado-kari virus 1 (YkV1), that is most closely related to members of the positive-sense (+)RNA virus family Caliciviridae. The realization by the scientific community of the breadth of the family Totiviridae is leading to broader interest in these viruses among virologists. Further spurring this interest is the realization that several of these viruses have measurable effects on hosts of economic importance, including reduction of virulence or fitness in filamentous fungi, disease in crustaceans and fish, and enhanced virulence in protozoans that cause human disease. As a group, the totivirids constitute a rapidly expanding family, with metagenomic information revealing new members on a continual basis. In many instances, host-virus relationships are unknown, but the growing set of tools available for study is demonstrating the biological importance of these viruses. In addition to examining the similarities and differences among members of the family Totiviridae (totivirids), this article will focus on the totivirids that infect filamentous fungi, a group of viruses that are phylogenetically more closely related to each other

1 Authors’ note: This article is an update of the same chapter in the Third Edition written by Dr. Said A. Ghabrial. Dr. Ghabrial died before the rewrite of this article began, and thus was never able to have input into the content of the updated chapter. So even though roughly half of the content here is originally Dr. Ghabrial’s, he is not listed as a co-author.


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21347-2


than to other totiviruses and utilize a different strategy to express their genomes. The yeast and smut totiviruses and associated killer systems, as well as the totivirids infecting parasitic protozoa, are discussed elsewhere in this encyclopedia.

Taxonomy of and Evolutionary Relationships Among Totivirids The family Totiviridae currently accommodates five genera, Giardiavirus, Leishmaniavirus, Totivirus, Trichomonasvirus, and Victorivirus. There are two additional proposed genera within the family, Artivirus and Eimeriavirus, which are not yet recognized by the International Committee on Taxonomy of Viruses (ICTV). The genus Victorivirus, whose name is derived from the specific epithet of Helminthosporium victoriae, was erected to accommodate HvV190S and is currently the largest in the family Totiviridae, with 14 ICTV-recognized members and many tentative victoriviruses that infect filamentous fungi. The members of one genus can be distinguished from those of another genus by genome expression strategy, phylogenetic relationship, and/or natural host range (see below). The family encompasses a broad range of viruses characterized by isometric particles, ~40 nm in diameter, that contain a monosegmented dsRNA genome coding for a capsid protein (CP) and an RNA-dependent RNA polymerase (RdRp). Viruses within the Totiviridae are found in the major clade that contains most of the dsRNA viruses, including the largest dsRNA family, the Reoviridae. This branch that includes dsRNA viruses is thought to have arisen from the branch of (+)ssRNA viruses containing the alpha- and flaviviruses, and the dsRNA branch in turn is hypothesized to have given rise to the negative-sense RNA viruses. Comparisons of the predicted amino acid sequences of totivirid RdRps indicate that they share significant sequence similarity and characteristically contain eight conserved motifs. This sequence similarity is common to all totivirids so far characterized, including the totivirids that infect yeast, smut, filamentous fungi, and plants as well as those infecting parasitic protozoa. 
It is notable that an expanded phylogenetic tree based on RdRp sequences of all dsRNA viruses makes it apparent that the viruses currently contained within, or proposed to be contained within, the family Totiviridae do not form a monophyletic group. In particular, the genera Victorivirus, Leishmaniavirus, Trichomonasvirus, and the proposed genus Eimeriavirus consistently group together, but the genera Totivirus and Giardiavirus and the proposed genus Artivirus do not group with the others. Instead, each of those last three represents a separate small clade within the greater clade of dsRNA viruses related to the totivirids, with viruses representing families including the Chrysoviridae and Quadriviridae, as well as several clades not yet classified, intercalated into that larger “Toti-Chryso” clade. These facts suggest the necessity of taxonomic reorganization of these members of the family Totiviridae and related viruses with a similar genome organization but different phylogenetic relationships. Phylogenetic analysis based on multiple alignments of amino acid sequences of the conserved motifs of totivirid RdRps (Fig. 1) shows that the viruses infecting filamentous fungi (victoriviruses) are most closely related to each other and form a distinct, large, well-supported cluster (bootstrap value of 97%). Likewise, phylogenetic trees based on the CP sequences show a topology similar to that of trees based on RdRp sequences, although in some cases with less support (Fig. 2). The leishmaniaviruses LRV1, LRV2, and LRV4 (percent sequence identities of 46%) are most closely related to each other and form a discrete cluster with 100% bootstrap value. This is also true for the two yeast viruses ScV-L-A and ScV-L-BC (RdRp identity of 32%), members of the genus Totivirus. UmV-H1, the third member in the genus Totivirus, forms a phylogenetic clade distinct from the cluster of the two yeast totiviruses. This may reflect the difference in RdRp expression strategy between these viruses (see below).
It is of interest that HvV190S and related victoriviruses are phylogenetically more closely related to the protozoan-infecting members of the Totiviridae, including trichomonasviruses, leishmaniaviruses, and the proposed genus Eimeriavirus, typified by the unclassified Eimeria brunetti RNA virus 1 (EbRV-1), which infects an apicomplexan parasitic protozoan, than to the yeast viruses (Figs. 1 and 2). It was previously hypothesized that leishmaniaviruses and the fungal totivirids are of ancient origin, having existed in a single-cell-type progenitor prior to the divergence of fungi and protozoa. The closer relationship of victoriviruses to protozoal viruses than to yeast viruses in the genus Totivirus is consistent with this idea.
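
The percent identity values quoted above (for example, 32% between the RdRps of the two yeast viruses) are derived from pairwise comparison of aligned sequences. The following Python sketch shows one way such a value can be computed from two rows of an alignment. It is an illustration only: the function name, the toy sequences, and the choice to skip gapped columns are assumptions made here, not the procedure used for the figures in this article, and other denominators give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a sequence alignment.

    Columns in which either sequence has a gap are skipped. This is one
    common convention; it is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real RdRp sequences).
toy_a = "MKT-LLGDDWV"
toy_b = "MRTALLGDD-V"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity over ungapped columns")
```

In practice the two rows would be taken from a multiple alignment such as the MUSCLE alignment described in the legend to Fig. 1.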

Virion Properties The buoyant densities in CsCl of virions of members of the family Totiviridae range from 1.36 to 1.43 g cm⁻³, and the sedimentation coefficients of these virions range from 160S to 190S (S20,w, in Svedberg units). Particles lacking nucleic acid sediment with apparent sedimentation coefficients of 90–105S. Isolates of ScV-L-A and UmV-H1 may have additional components, containing satellite or defective dsRNAs, with different sedimentation coefficients and buoyant densities. Virion-associated RdRp activity can be detected in all totivirids examined to date. Protein kinase activity is associated with HvV190S virions; capsids contain phosphorylated forms of CP.

Virion Structure and Composition The totivirids have isometric particles, approximately 40–50 nm in diameter, with icosahedral symmetry (Fig. 3). The capsids are single-shelled and are composed of a single major polypeptide. The capsids consist of 120 CP subunits of molecular mass in the range of 76–98 kDa. The capsid structures of six members or tentative members of the family Totiviridae have been determined

Totiviruses (Totiviridae)

Fig. 1 Maximum Likelihood (ML) phylogenetic tree constructed based on the RNA-dependent RNA polymerase (RdRp) sequences of selected members and proposed members of the family Totiviridae. Not all family members were included for space considerations, but the overall topology of the tree generally reflects the topology of the tree when all available sequences were included. A single member each of the families Quadriviridae and Chrysoviridae was included to indicate their positions within the larger clade that currently constitutes the family Totiviridae. The deduced RdRp amino acid sequences of members of the family Totiviridae were aligned using the program MUSCLE. The phylogenetic tree was generated using the program IQ-TREE with 1000 ultrafast bootstrap replicates. A VT+F+I+G4 model of evolution was selected based on Akaike and Bayesian information criteria. Bootstrap support values above 70% are indicated on branches.

using cryo-transmission electron microscopy combined with three-dimensional image reconstruction (Fig. 4). In all cases, the capsids of the fungal totivirids are made up of 60 asymmetric CP dimers arranged in a so-called 'T=2' layer, equivalent to a T=1 structure in which the structural units are homodimers rather than monomers. There are structural differences among the viruses examined to date. Surface differences of particles are apparent across the family. Compared to the yeast L-A capsid, the HvV190S capsid shows relatively smoother outer surfaces, but both are more apparently icosahedral, while the trichomonasvirus TVV1, giardiavirus GLV, and proposed artiviruses OmRV and IMNV are more apparently spherical in cryoEM reconstructions. IMNV has an overall shape like OmRV, but it uniquely has fiber protrusions at the pentamers. The quaternary organization of the HvV190S particle is notably similar to that of yeast L-A and the cores of the larger dsRNA viruses of plants and animals: the A-subunits cluster around the fivefold axis and B-subunits around the threefold axis. The ScV-L-A CP removes the 5′ cap structure of host mRNA and covalently attaches it to the histidine residue at position 154. The decapping activity is required for efficient translation of viral RNA. The published yeast L-A capsid structure reveals a trench at the active site of decapping. As in other totivirids, a single gene encodes the capsid of HvV190S, but the HvV190S capsid comprises two closely related major CPs, either p88 and p83 or p88 and p78. The capsids of all other totivirids so far characterized appear to contain only a single major CP. Interestingly, HmTV-1-17, a totivirid infecting a filamentous fungus, has similar capsid heterogeneity to that of HvV190S.
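
The capsid stoichiometry described above lends itself to a short worked calculation. The Python sketch below multiplies out the figures given in the text (60 dimers, hence 120 CP subunits of 76–98 kDa) to estimate the protein mass of a capsid shell. The variable and function names are illustrative, and the estimate deliberately ignores the dsRNA, any CP–RdRp fusion copies, and post-translational processing of the CP.

```python
DIMERS_PER_CAPSID = 60             # 'T=2' layer: 60 asymmetric CP dimers
SUBUNITS_PER_DIMER = 2             # one A-subunit and one B-subunit
CP_MASS_RANGE_KDA = (76.0, 98.0)   # CP molecular mass range given in the text

subunits = DIMERS_PER_CAPSID * SUBUNITS_PER_DIMER


def capsid_protein_mass_mda(cp_mass_kda: float, copies: int = subunits) -> float:
    """Protein mass of the shell in MDa (CP copies only, no RNA or RdRp)."""
    return cp_mass_kda * copies / 1000.0


low, high = (capsid_protein_mass_mda(m) for m in CP_MASS_RANGE_KDA)
print(f"{subunits} CP subunits; shell protein mass about {low:.2f} to {high:.2f} MDa")
```

Under these assumptions the shell protein mass falls between roughly 9 and 12 MDa across the family.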
Purified HvV190S virion preparations contain two types of particles, 190S-1 and 190S-2, which differ slightly in sedimentation rates (190S-1 is resolved as a shoulder on the slightly faster sedimenting component 190S-2) and capsid


Fig. 2 Maximum Likelihood phylogenetic tree constructed based on the capsid protein (CP) sequences of selected members and proposed members of the family Totiviridae. The viruses and strains selected and the methods used were the same as those used in Fig. 1. A VT+F+G4 model of evolution was selected based on Akaike and Bayesian information criteria.

composition. The 190S-1 and 190S-2 virions are believed to represent different stages in the virus life cycle. The 190S-1 capsids contain p88 and p83, occurring in approximately equimolar amounts, and the 190S-2 capsids comprise similar amounts of p88 and p78. p88 and p83 are phosphoproteins, whereas p78 is nonphosphorylated (Fig. 5). Totivirid virions encapsidate a single molecule of dsRNA, 4.6–7.0 kbp, with tentative members of the family ranging up to 7.8 kbp. Some totivirids may additionally contain satellite dsRNAs or defective dsRNAs, which are encapsidated separately in capsids encoded by the totivirid genome.

Genome Organization and Expression In general, the genome organization of the totivirids infecting fungi and other eukaryotes is similar: each virus genome contains two large open reading frames (ORFs); the 5′-proximal ORF encodes a CP and the 3′ ORF encodes an RdRp (Fig. 6). Two distinct RdRp expression strategies have been reported for viruses in the family Totiviridae: those that express RdRp as a fusion protein (CP–RdRp) by ribosomal frameshifting, such as Saccharomyces cerevisiae virus L-A and the viruses that infect parasitic protozoa; and those that synthesize RdRp as a separate nonfused protein by an internal initiation mechanism, as shown for HvV190S and proposed for all of the other totivirids that infect filamentous fungi. An exception to this rule is Ustilago maydis virus H1 (UmV-H1), currently classified in the genus Totivirus. The UmV-H1 genome contains only a single ORF that encodes a polyprotein, predicted to be autocatalytically processed by a viral papain-like protease to generate the CP and RdRp proteins. In most Giardiavirus, Totivirus, and proposed Artivirus members, a –1 frameshift regulates translation between the two main ORFs; in members of Trichomonasvirus a –2 frameshift regulates translation, and in Leishmaniavirus a +1 frameshift regulates translation between the two main ORFs. In each of these cases, a fusion protein consisting of the CP and RdRp occasionally results, and that fusion product is incorporated in virions. In most members of the genus Victorivirus, a translational stop-start separates the two main ORFs, with the termination codon of the CP ORF overlapping the initiation codon of the RdRp ORF in the tetranucleotide sequence AUGA or the pentanucleotide sequence UAAUG. Other translational details in members or proposed members of the family Totiviridae


Fig. 3 Negative contrast electron micrograph of particles of an isolate of Helminthosporium victoriae virus 190S, the type species of the genus Victorivirus. Scale bar = 50 nm. Reprinted from Ghabrial, S.A., 2008. Totiviruses. In: Mahy, B.W.J., Van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology (Third Edition), New York: Academic Press, pp. 163–174.

Fig. 4 Structural comparison of totivirids, OmRV, and other totivirus-related viruses. Electron density volume maps of Omono River virus (OmRV, 8.3 Å resolution), infectious myonecrosis virus (IMNV, 8.0 Å resolution), Saccharomyces cerevisiae virus L-A (ScV-L-A, 8.5 Å resolution), Helminthosporium victoriae virus 190S (HvV190S, EMDB Accession code: 5726, 7.5 Å resolution), Trichomonas vaginalis virus 1 (TVV1, EMDB Accession code: 2184, 6.7 Å resolution), and Giardia lamblia virus (GLV, EMDB Accession code: 5948, 6.0 Å resolution). The 8.5 Å electron density volume map of ScV-L-A was generated from the crystallographic capsid model (PDB ID: 1M1C) using Chimera. The surfaces of the structures were colored according to the distance from the center of the virion (green < 131 Å, yellow < 193 Å, red < 220 Å, light blue < 250 Å, blue < 280 Å). Reprinted from Okamoto, K., Miyazaki, N., Larsson, D.S.D., et al., 2016. The infectious particle of insect-borne totivirus-like Omono River virus has raised ridges and lacks fibre complexes. Springer. Scientific Reports 6, 1–14.

may include additional small ORFs, additional protease cleavage sites, 2A-like and pseudo 2A-like sequences, and pseudoknot sequences that facilitate frameshifting. The totivirids express their RdRps either as CP–RdRp (gag-pol-like) fusion proteins or as separate nonfused proteins. Expression of RdRp as a CP–RdRp fusion protein via −1 ribosomal frameshifting has been well documented for ScV-L-A. Virion-associated CP–RdRp has been detected as a minor protein in the capsids of ScV-L-A, ScV-L-BC, TVV, and GLV. Expression of RdRp as a fusion protein in LRV1-1, LRV1-4, or LRV2-1 by +1 ribosomal frameshifting or ribosomal hopping has been proposed. The overlap region between ORF1 and ORF2 of ScV-L-A (130 nt), LRV1-1 and LRV1-4 (71 nt), and GLV (122 nt) contains the structures necessary for ribosomal frameshifting, including a slippery site and a pseudoknot structure, to promote fusion of ORF1 and ORF2 in vivo. Although the overlap region in TVV is short (14 nt), it contains a potential ribosomal slippage heptamer. There is currently no evidence for CP–RdRp fusion proteins in HvV190S or other victoriviruses. The overlapping region in the dsRNA genome of HvV190S is AUGA, where the initiation codon of the RdRp ORF overlaps the termination codon of the CP ORF, suggesting


Fig. 5 Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis of HvV190S sedimenting components 190S-1 and 190S-2. Purified preparations of HvV190S contain two types of particles, 190S-1 and 190S-2, which differ slightly in sedimentation rates. The 190S-1 capsids contain p88 and p83, occurring in approximately equimolar amounts, and the 190S-2 capsids comprise similar amounts of p88 and p78. The capsid proteins p88 and p83 are phosphorylated, whereas p78 is nonphosphorylated. Reprinted from Ghabrial, S.A., 2008. Totiviruses. In: Mahy, B.W.J., Van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology (Third Edition), New York: Academic Press, pp. 163–174.

Fig. 6 Genome organizations of selected members and proposed members of the family Totiviridae. The dsRNA genome encompasses two large open reading frames (ORFs) with the 5′ ORF encoding a capsid protein (CP) and the 3′ ORF encoding an RNA-dependent RNA polymerase (RdRp). Details are described in the text. Except for the members of the genus Victorivirus, translation of the RdRp ORF proceeds as a consequence of a frameshift from the CP ORF, resulting in a CP–RdRp fusion protein that is a component of the capsid. Translation of the RdRp of members of the genus Victorivirus proceeds by internal ribosome initiation following termination of translation of the CP sequence at a translational stop-start site.

that expression of RdRp occurs by a mechanism different from translational frameshifting. For HvV190S, the stop codon of the CP ORF (nucleotide position 2606-UGA-2608) overlaps with the start codon (nucleotide position 2605-AUG-2607) for the RdRp ORF in the sequence AUGA. Other victoriviruses have either the pentanucleotide UAAUG or, in the case of Alternaria alternata victorivirus 1 (AalVV1), a putative −1 frameshift. But in none of these instances has a CP–RdRp fusion protein been identified in translation experiments or in virus particles; rather, each appears to represent a slightly different mechanism for termination of the CP and initiation of the RdRp. The 5′ end of the plus strand of all totivirid dsRNAs examined to date is uncapped and the 3′ end is not polyadenylated. As noted, ScV-L-A uses a cap-snatching mechanism to provide capped mRNAs for efficient translation. The 5′ end of HvV190S dsRNA is uncapped and highly structured and contains a relatively long (289 nucleotides) 5′ leader with two minicistrons. These structural features of the 5′ untranslated region (UTR) of HvV190S dsRNA suggest that the CP-encoding ORF1 (with its AUG present in suboptimal context according to the Kozak criteria) is translated via a cap-independent mechanism. The 5′-UTR of the leishmaniavirus LRV1-1 functions as an internal ribosome entry site (IRES). Translation of the uncapped giardiavirus GLV mRNA in Giardia lamblia is initiated on a unique IRES element that contains sequences from a part of the 5′-UTR and a portion of the capsid coding region. IRES activities are also observed in the 5′-UTRs of the victoriviruses HvV190S and Rosellinia necatrix victorivirus 1. The UGA codon at position 2606–2608 of the HvV190S genomic plus strand was verified by site-directed mutagenesis to be the authentic stop codon for ORF1.
The RdRp-encoding downstream ORF2 of HvV190S is in a –1 frame with respect to ORF1 and is expressed via an internal initiation mechanism (a coupled termination–initiation mechanism is proposed). The HvV190S RdRp is detectable as a separate, virion-associated component, consistent with its independent translation from ORF2.
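
The frame relationship between ORF1 and ORF2 at a stop-start junction can be checked with simple modular arithmetic on the codon positions quoted above. The Python sketch below does this for the HvV190S coordinates and also scans a sequence for the AUGA and UAAUG motifs. The function names and the toy sequence are hypothetical; a motif hit by itself does not show that the stop codon is in frame with ORF1, which would have to be confirmed against the annotated ORF.

```python
def relative_frame(start_codon_pos: int, stop_codon_pos: int) -> int:
    """Frame of the downstream ORF relative to the upstream ORF.

    Both arguments are 1-based positions of the first nucleotide of the
    codon. The result is normalized to -1, 0, or +1.
    """
    shift = (start_codon_pos - stop_codon_pos) % 3
    return shift - 3 if shift == 2 else shift


def find_stop_start_overlaps(rna: str, motifs=("AUGA", "UAAUG")):
    """Return (motif, 1-based position) for each motif occurrence.

    Only the motif is located; whether the embedded stop codon is in
    frame with the upstream ORF is not checked here.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for motif in motifs:
        idx = rna.find(motif)
        while idx != -1:
            hits.append((motif, idx + 1))
            idx = rna.find(motif, idx + 1)
    return sorted(hits, key=lambda hit: hit[1])


# HvV190S coordinates from the text: AUG at 2605-2607, UGA at 2606-2608.
hvv190s_frame = relative_frame(start_codon_pos=2605, stop_codon_pos=2606)
print("ORF2 frame relative to ORF1:", hvv190s_frame)

# Toy sequence (hypothetical) containing one AUGA junction.
print(find_stop_start_overlaps("GGCAUGACC"))
```

For the HvV190S coordinates the function returns −1, in agreement with the statement that ORF2 is in a –1 frame with respect to ORF1.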


The viral genome of IMNV, a monosegmented dsRNA virus infecting penaeid shrimp and tentatively assigned to the family Totiviridae in the proposed genus Artivirus, encompasses two nonoverlapping ORFs with ORF1 encoding a polyprotein comprised of a putative RNA-binding protein and a CP. The coding region of the RNA-binding protein is located in the first half of ORF1 and contains a dsRNA-binding motif in the first 60 amino acids. The second half of ORF1 encodes a CP, as determined by amino acid sequencing, with a molecular mass of 99 kDa. ORF2 encodes a putative RdRp with the eight conserved motifs characteristic of totivirids. Phylogenetic analysis based on the RdRp clustered IMNV with other invertebrate-infecting members in the family Totiviridae and separate from the five established genera. Important novel features of the genome organization of IMNV have significant bearing on how the viral proteins are expressed, and these features appear to be shared among closely related viruses and are continuing to evolve. These features include two encoded ‘2A-like’ motifs, which are likely involved in ORF1 polyprotein ‘cleavage’, a 199 nt overlap between ORF1 and ORF2, and the presence of a ‘slippery heptamer’ motif and predicted RNA pseudoknot in the region of ORF1–ORF2 overlap. The latter features probably allow ORF2 to be translated as a fusion with ORF1 by −1 ribosomal frameshifting. The generation of CP as a polyprotein and the potential involvement of the IMNV-encoded ‘2A-like’ peptides (GDVESNPGP and GDVEENPGP) in processing of the polyprotein to release the major CP represent novel features that are not shared by closely related viruses in the proposed Artivirus genus of totivirids. Interestingly, more divergent viruses are predicted to have “pseudo 2A” sequences that could give rise to functional “2A-like” peptides by random mutation.
In contrast, the potential expression of RdRp as a CP–RdRp fusion protein via ribosomal frameshifting is a common strategy utilized by other totivirids for expressing their RdRps.
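
A candidate ‘slippery heptamer’ of the kind mentioned above can be located in an overlap region with a simple pattern search. The Python sketch below assumes the canonical −1 frameshift motif X XXY YYZ from the general frameshifting literature (X any nucleotide, Y either A or U, Z any nucleotide except G); this definition is an assumption brought in for illustration and is not taken from this article. The function name and the toy sequence are hypothetical, and a match says nothing about whether a downstream pseudoknot is present or whether frameshifting actually occurs.

```python
import re

# Canonical -1 slippery heptamer X XXY YYZ (assumed definition):
#   XXX = three identical nucleotides, YYY = AAA or UUU, Z = A, C, or U.
# The lookahead allows overlapping candidates to be reported.
SLIPPERY = re.compile(r"(?=(([ACGU])\2{2}([AU])\3{2}[ACU]))")


def find_slippery_heptamers(rna: str):
    """Return (1-based position, heptamer) for each candidate site."""
    rna = rna.upper().replace("T", "U")
    return [(m.start() + 1, m.group(1)) for m in SLIPPERY.finditer(rna)]


# Toy overlap region (hypothetical sequence).
candidates = find_slippery_heptamers("CCGGGUUUACC")
print(candidates)
```

A fuller analysis would also check that the heptamer lies in the correct reading frame of ORF1 and is followed, after a short spacer, by a predicted pseudoknot or stem-loop.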

Virus Replication Cycle The replication cycle of totivirids has mainly been derived from in vitro studies of virion-associated RNA polymerase activity and the isolation of particles representing various stages in the replication cycle. In in vitro reactions, the RNA polymerase activity associated with virions of the fungal totivirids ScV-L-A, UmV-H1, and HvV190S, isolated from lag-phase cultures, catalyzes end-to-end transcription of dsRNA, by a conservative mechanism, to produce mRNA for CP, which is released from the particles. Purified ScV-L-A virions, isolated from log-phase cells, contain a less-dense class of particles, which package only plus-strand RNA. In in vitro reactions, these particles exhibit a replicase activity that catalyzes the synthesis of minus-strand RNA to form dsRNA. The resultant mature particles, which attain the same density as that of the dsRNA-containing virions isolated from the cells, are capable of synthesizing and releasing plus-strand RNA. A proposed life cycle of HvV190S is depicted in Fig. 7. Host-encoded protein kinase and protease have been shown to be involved in post-translational modification of CP. Phosphorylation and proteolytic processing are proposed to play a role in the virus life cycle; phosphorylation of CP may be necessary for its interaction with viral nucleic acid and/or phosphorylation may regulate dsRNA transcription/replication. Proteolytic processing and cleavage of a C-terminal peptide, which leads to dephosphorylation and the conversion of p88 to p78, may play a role in the release of the plus-strand RNA transcripts from virions (Fig. 7).

Biological Properties The overall host range of the Totiviridae, including proposed members, includes an increasing breadth of eukaryotes. In addition to fungal and protozoal hosts, totivirids have been isolated from oomycetes, plants, insects, crustaceans, fish, and bats. The first such virus identified was a nonsegmented dsRNA virus with isometric particles, designated penaeid shrimp infectious myonecrosis virus (IMNV), isolated from diseased penaeid shrimp and tentatively assigned to the family Totiviridae. IMNV is the causal agent of the shrimp myonecrosis disease characterized by necrosis of skeletal muscle, particularly in the distal abdominal segments and tail fan. Phylogenetic analysis based on the RdRp region of viruses in the family Totiviridae suggests that Giardia lamblia virus (GLV; genus Giardiavirus) is the closest relative to IMNV. Related viruses infecting fish, insects, and bats cluster together phylogenetically and are proposed to constitute a new genus, Artivirus, which is not yet recognized by the International Committee on Taxonomy of Viruses (ICTV). The complete nucleotide sequences of many members and tentative members of the family Totiviridae have now been described. There are no known natural vectors for the transmission of the fungal totivirids. They are transmitted intracellularly during cell division, sporogenesis, and cell fusion. Although the yeast totiviruses are effectively transmitted via ascospores, the victoriviruses infecting the ascomycetous filamentous fungi are essentially eliminated during ascospore formation. The leishmaniaviruses and trichomonasviruses are propagated during cell division and are not infectious as purified virions. Members of the genera Giardiavirus, Totivirus, Victorivirus, and the proposed genus Artivirus such as IMNV, on the other hand, are infectious as purified virions. 
Successful transfection of the protozoon Giardia lamblia has also been accomplished via electroporation with plus-strand RNA transcribed in vitro from GLV dsRNA. Whereas IMNV causes a myonecrosis disease in its crustacean host, GLV is associated with latent infection of its flagellated protozoan human parasite G. lamblia. GLV is released into the medium without lysing the host cells and the extruded virus can infect many virus-free isolates of the protozoan host. There is limited information on experimental host ranges for the fungal totivirids because infectivity as particles has been demonstrated for only a few. As a consequence of their intracellular modes of transmission, the natural host ranges of fungal totivirids are limited to individuals within the same or closely related vegetative compatibility groups, but particle-based and RNA-based infectivity systems are allowing for broader host range studies. Furthermore, mixed infections with two or more unrelated viruses are


Fig. 7 Life cycle of HvV190S. Mature virions contain a single dsRNA molecule and their capsids are composed entirely or primarily of the capsid protein (CP) p88. Virions representing different stages of the virus life cycle can be purified from the infected fungal host Helminthosporium victoriae, including the well-characterized 190S-1 and 190S-2 virions. These two types of virions differ in sedimentation rate, phosphorylation state, and CP composition; 190S-1 capsids contain p88 and p83, whereas the 190S-2 capsids contain p88 and p78 (p88 is the primary translation product of the CP gene; p83 and p78 represent post-translational proteolytic processing products of p88 at its C-terminus). p88 and p83 are phosphorylated, whereas p78 is nonphosphorylated. The virions with phosphorylated CPs (p88 + p83) have significantly higher transcriptase activity in vitro than those containing the nonphosphorylated p78. Transcription occurs conservatively and the newly synthesized plus-strand RNA is released from the virions. Phosphorylation of CP is catalyzed by a host kinase, and is proposed to play a regulatory role in transcription/replication. A host-encoded protease catalyzes the proteolytic processing of phosphorylated p88; this occurs in two steps, leading first to p83 (the generation of the 190S-1 virions) and then to p78 (190S-2 virions). The conversion p88 → p83 → p78 is proposed to play a role in the release of the plus-strand RNA transcripts from virions. The released plus-strand RNA is the RNA that is translated into CP and RNA-dependent RNA polymerase (RdRp) and packaged in capsids assembled from the primary translation product p88. It is not known whether p88 is phosphorylated before or after assembly. Synthesis of minus-strand RNA occurs on the plus-strand RNA template inside the virion; phosphorylation may be involved in turning on the replicase activity. Reprinted from Ghabrial, S.A., 2008. Totiviruses. In: Mahy, B.W.J., Van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology (Third Edition), New York: Academic Press, pp. 163–174.

common, probably as a consequence of the ways by which fungal viruses are transmitted in nature. Dual infection of yeast with ScV-L-A and ScV-L-BC and of the filamentous fungus Sphaeropsis sapinea with SsRV-1 and SsRV-2 are examples of mixed infections involving totivirids. Dual infection of H. victoriae with the victorivirus HvV190S and the chrysovirus Helminthosporium victoriae virus 145S (HvV145S) is an example of a mixed infection involving two unrelated viruses belonging to two different families. The victorivirus from the phytopathogenic ascomycete fungus Rosellinia necatrix, Rosellinia necatrix victorivirus 1 (RnVV1), was eliminated from its host fungus by hyphal tipping and was reintroduced to its natural host and introduced to the fungus Cryphonectria parasitica by transfection. Negligible pathogenic effect was observed in either host. However, transfection of RnVV1 into an RNA silencing-deficient strain of C. parasitica revealed phenotypic effects of the virus in that strain, demonstrating that the virus is targeted by antiviral RNA silencing in the fungus. Many totivirids appear to be associated with symptomless infections of their hosts, but there is increasing evidence of biological impact of others. IMNV, an unclassified virus tentatively assigned to the proposed new genus Artivirus in the family Totiviridae, causes a myonecrosis disease in shrimp. The closely related Omono River virus isolated from Culex mosquitoes causes cytopathic effect in infected cells. Several recent studies suggest that members of the genera Leishmaniavirus and Trichomonasvirus enhance their respective parasite-associated disease in humans by inducing proinflammatory responses at infection sites. These relationships are becoming recognized as increasingly important impacts of members of the family Totiviridae, and are covered in other chapters of this volume. Multiple infections of totivirids with other totivirids or with unrelated viruses are common.
In most instances, such mixed infections have not been dissected biologically. For example, a single isolate of Trichomonas vaginalis may be infected with all of the


Fig. 8 Colony morphology of virus-free and virus-infected isolates B2 and A9, respectively, and three representative transfectants (B2-HvV190S-1, B2-HvV190S-2, and B2-HvV190S-3) of Helminthosporium victoriae. All fungal isolates were cultured on potato dextrose agar containing 0.5% (wt/vol) yeast extract medium. Reprinted from Xie, J., Havens, W.M., Lin, Y.-H., Suzuki, N., Ghabrial, S.A., 2016. The victorivirus Helminthosporium victoriae virus 190S is the primary cause of disease/hypovirulence in its natural host and a heterologous host. Virus Research 213, 238–245, Copyright (2015), with permission from Elsevier.

four known species of TVV. Mixed infections with HvV190S and HvV145S were initially associated with a debilitating disease of the fungal host. The two viruses are transmitted together via hyphal anastomosis, and diseased isolates are characterized by reduced growth, excessive sectoring, aerial mycelial collapse, and generalized lysis. The disease phenotype of H. victoriae, the causal agent of Victoria blight of oats, is of special interest not only because diseased isolates are hypovirulent, but also because it was among the first examples of the viral etiology of hypovirulence of an important plant pathogen. It was only relatively recently that the role of the victorivirus HvV190S in the fungal disease was elucidated. Colonies regenerated from fungal protoplasts infected singly with HvV190S, but not those infected singly with HvV145S, showed the disease phenotype (Fig. 8).

Virus–Host Relationships The yeast killer system, comprised of a helper totivirus (ScV-L-A) and associated satellite dsRNA (M-dsRNA), is one of the few known examples where virus infection is beneficial to the host. The ability of yeast strains to produce killer toxins to which they are themselves immune confers an ecological advantage over sensitive strains. The use of killer strains in the brewing industry provides protection against contamination with adventitious sensitive strains. Totiviruses maintain only the genes that are essential for their own survival (RdRp and CP), but killer viruses host accessory satellite segments that may be beneficial to their survival. The host cells have evolved to support only a defined level of virus replication, beyond which virus infection may become pathogenic. Because of its amenability to genetic studies, the yeast–virus system has provided significant information on the host genes required to prevent viral cytopathology. Classical genetic studies with S. cerevisiae defined the first systems of chromosomal genes that regulate the copy number of a eukaryotic virus: the copy numbers of the totivirus ScV-L-A and its satellite M-dsRNAs were found to be downregulated by genes designated SKI (superkiller) or upregulated by genes designated MAK (maintenance of killer). More recently, the ecological flexibility of host/virus interactions in this system has been revealed: two host genes required for RNA silencing (Argonaute, AGO1, and Dicer, DCL1), and thus for interference with ScV-L-A multiplication, were found to be repeatedly gained or lost in different yeast strains, leading to the hypothesis that retaining these general virus resistance genes is more beneficial to the yeast in some circumstances, whereas eliminating these resistance genes and thus facilitating the presence of the killer viruses is more beneficial to the host in other circumstances.
While much less well-defined than the yeast system, the HvV190S victorivirus that infects the plant pathogenic fungus H. victoriae also utilizes host-encoded proteins, in this case a protein kinase and a protease for post-translational modification of its CP. Phosphorylation and proteolytic processing of CP may play a role in regulating transcription and the release of plus-strand transcripts from virions (Fig. 7). The H. victoriae-virus system provides a well-studied and useful model system for examining virus–host interactions in a plant pathogenic fungus. A major attribute of this system is that the virus-infected H. victoriae isolates exhibit a disease phenotype


(Fig. 8). Furthermore, HvV190S can infect the heterologous fungal host C. parasitica, where it also causes a disease phenotype in strains deficient in the RNA silencing pathway (Δdcl-2). It was previously demonstrated that the fungal gene Hv-p68, encoding a novel alcohol oxidase/RNA-binding protein, is upregulated as a result of virus infection, suggesting that upregulation of this gene might play a role in virus pathogenesis. Overexpression of Hv-p68 in virus-free fungal isolates, however, resulted in a significant increase in colony growth and did not induce a disease phenotype. Thus, overexpression of Hv-p68 per se is not sufficient to induce the disease phenotype in the absence of virus infection. Because dsRNA-binding proteins are known to sequester dsRNA and suppress antiviral host defense mechanisms, it is feasible that overexpression of the dsRNA-binding protein Hv-p68 may lead to the induction of the disease phenotype by suppressing host defense, but additional studies are needed to determine whether or not Hv-p68 upregulation has a role in viral pathogenesis.

Conclusions Viruses in the family Totiviridae have common properties such as a two-ORF genome architecture and a T = 1 capsid composed of 60 asymmetrical capsid protein dimers. Taxonomic considerations of totivirids have proceeded in large part by expanding the family Totiviridae to accommodate new genera as they are formally described. The family now contains five ICTV-approved genera and two tentative genera, and is among the virus families with the most diverse host organisms. The genera are differentiated from one another in host range and gene expression. Elucidating the capsid structures of representatives of different genera and proposed genera of the family Totiviridae helps to inform the taxonomy of these viruses. More definitive information on variation in life cycles and genome expression strategies will also aid in putting this large and expanding family of viruses in taxonomic perspective. It is easy to envision a taxonomy in which the viruses considered here represent multiple families, but the utility of such a restructuring for the community of virologists and biologists would need to be demonstrated, and the restructuring should perhaps proceed only after the breadth of the family is better understood.

Further Reading
Castón, J.R., Luque, D., Trus, B.I., et al., 2006. Three-dimensional structure and stoichiometry of Helminthosporium victoriae 190S totivirus. Virology 347, 323–332.
Chiba, S., Lin, Y.-H., Kondo, H., Kanematsu, S., Suzuki, N., 2013. A novel victorivirus from a phytopathogenic fungus, Rosellinia necatrix, is infectious as particles and targeted by RNA silencing. Journal of Virology 87 (12), 6727–6738.
De Lima, J.G.S., Teixeira, D.G., Freitas, T.T., Lima, J.P.M.S., Lanza, D.C.F., 2019. Evolutionary origin of 2A-like sequences in Totiviridae genomes. Virus Research 259, 1–9.
Fichorova, R., Fraga, J., Rappelli, P., Fiori, P.L., 2017. Trichomonas vaginalis infection in symbiosis with Trichomonasvirus and Mycoplasma. Research in Microbiology 168, 882–891.
Ghabrial, S.A., 2008. Totiviruses. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology (Third Edition). New York: Academic Press, pp. 163–174.
Ghabrial, S.A., Castón, J.R., Jiang, D., Nibert, M.L., Suzuki, N., 2015. 50-plus years of fungal viruses. Virology 479–480, 356–368. doi:10.1016/j.virol.2015.02.034.
Ghabrial, S.A., Dunn, S.E., Li, H., Xie, J., Baker, T.S., 2013. Viruses of Helminthosporium (Cochliobolus) victoriae. Advances in Virus Research 86, 289–325.
Goodman, R.P., Ghabrial, S.A., Fichorova, R.N., Nibert, M.L., 2011. Trichomonasvirus: A new genus of protozoan viruses in the family Totiviridae. Archives of Virology 156 (1), 171–179.
Jamal, A., Sato, Y., Shahi, S., et al., 2019. Novel victorivirus from a Pakistani isolate of Alternaria alternata lacking a typical translational stop/restart sequence signature. Viruses 11, 577.
Li, F., Du, J., Wu, Z., et al., 2019. Identification and genetic analysis of a totivirus isolated from the Culex tritaeniorhynchus in Northern China. Archives of Microbiology 202 (4), 807–813. doi:10.1007/s00230-019-01788-9.
Nibert, M.L., 2007. ‘2A-like’ and ‘shifty heptamer’ motifs in penaeid shrimp infectious myonecrosis virus, a monosegmented double-stranded RNA virus. Journal of General Virology 88, 1315–1318.
Okamoto, K., Miyazaki, N., Larsson, D.S.D., et al., 2016. The infectious particle of insect-borne totivirus-like Omono River virus has raised ridges and lacks fibre complexes. Scientific Reports 6, 33170.
Sasai, S., Tamura, K., Tojo, M., et al., 2018. A novel non-segmented double-stranded RNA virus from an Arctic isolate of Pythium polare. Virology 522, 234–243.
Soldevila, A.I., Ghabrial, S.A., 2001. A novel alcohol oxidase/RNA-binding protein with affinity for mycovirus double-stranded RNA from the filamentous fungus Helminthosporium (Cochliobolus) victoriae. Journal of Biological Chemistry 276, 4652–4661.
Wickner, R.B., Ghabrial, S.A., Nibert, M.L., Patterson, J.L., Wang, C.C., 2012. Family Totiviridae. In: King, A.M.Q., Adams, M.J., Carstens, E.B., Lefkowitz, E.J. (Eds.), Virus Taxonomy: Classification and Nomenclature of Viruses: Ninth Report of the International Committee on Taxonomy of Viruses. London: Elsevier Academic Press, pp. 639–650. doi:10.1016/B978-0-12-384684-6.00052-5.
Xie, J., Havens, W.M., Lin, Y.-H., Suzuki, N., Ghabrial, S.A., 2016. The victorivirus Helminthosporium victoriae virus 190S is the primary cause of disease/hypovirulence in its natural host and a heterologous host. Virus Research 213, 238–245.
Xin, C., Wu, B., Li, J., et al., 2016. Complete genome sequence and evolution analysis of Eimeria stiedai RNA virus 1, a novel member of the family Totiviridae. Archives of Virology 161, 3571–3576.

Yado-kari Virus 1 and Yado-nushi Virus 1 (Unassigned) Subha Das, Okayama University, Kurashiki, Japan; Nobuhiro Suzuki, Institute of Plant Science and Resources (IPSR), Okayama University, Kurashiki, Japan. © 2021 Elsevier Ltd. All rights reserved.

Glossary
2A-like motif A conserved self-processing peptide motif that cleaves a growing polypeptide chain co-translationally at a particular cleavage site. These motifs were first identified in members of Picornaviridae, where they separate structural proteins from replication-associated proteins. The motif is now known to be present in diverse RNA viruses.
Capsid protein A protein subunit of the capsid, the protective shell of a virus, which is usually composed of multiple copies of the same or different protein subunits and encases the viral genome.
Mutualism Interspecific cooperation in which two organisms belonging to different species live together and benefit each other.
Trans-encapsidation or hetero-encapsidation Packaging of nucleic acid entities such as satellites by a capsid protein encoded by an unrelated independent virus.
Yado-kari In Japanese, borrowing a room in which to live; in biology, the hermit crab.
Yado-nushi In Japanese, the owner of the room.

Introduction Typically, mycoviruses have either double-stranded (ds) or positive-sense single-stranded ((+)ss) RNA genomes. However, a few studies have also indicated the presence of mycoviruses with negative-sense ((−)ss) RNA genomes and geminivirus-like circular ssDNA genomes. Mixed mycovirus infections are common among fungi, with occasional documented interplay between two unrelated viruses. Such virus–virus interactions can be categorized into three types: apparently neutral, synergistic, and antagonistic. In synergistic interactions, one virus enhances the titer of a co-infecting virus, subsequently resulting in symptom expression different from that induced in a host singly infected by either of the viruses. One such example is the enhancement of accumulation and vertical transmission of mycoreovirus 1 (MyRV1) in the chestnut blight-causing fungus Cryphonectria parasitica by a co-infecting Cryphonectria hypovirus 1 (CHV1). In addition, in Rosellinia necatrix, Rosellinia necatrix megabirnavirus 2 (RnMBV2) reduces host virulence and growth only in the presence of Rosellinia necatrix partitivirus 1 (RnPV1). In return, RnMBV2 enhances accumulation of RnPV1 two-fold. Individually, neither of the viruses causes any alterations to host virulence. In antagonistic interactions, one virus impairs the replication or accumulation of another co-infecting virus. For example, in C. parasitica, replication and lateral transmission of Rosellinia necatrix victorivirus 1 are impaired by either a co-infecting MyRV1 or a mutated CHV1 (lacking the RNA silencing suppressor). Replication impairment by MyRV1 is mediated by enhanced antiviral RNA silencing via transcriptional induction of its key genes, namely dicer-like 2 (DCL2) and argonaute-like 2 (AGL2). Recently, a unique mutualistic interplay between two RNA mycoviruses, Yado-nushi virus 1 (YnV1) and Yado-kari virus 1 (YkV1), has been described from a debilitated field isolate of R. necatrix, strain W1032, from Nagano Prefecture, Japan. YnV1, an independent genuine double-stranded RNA (dsRNA) virus, trans-encapsidates the replicative-form dsRNA of the capsidless YkV1, whose genomic RNA is (+)ssRNA, along with its RNA-dependent RNA polymerase (RdRp), and is thereby responsible for YkV1 replication inside the host. In this way, despite being a (+)ssRNA virus, YkV1 likely replicates like a dsRNA virus inside the hetero-capsid (particles) composed of YnV1 capsid protein. In return, YkV1 trans-enhances the accumulation of YnV1 by two- to four-fold, making the relationship mutualistic rather than commensal. This unique mutualistic interaction differs from the aforementioned virus–virus synergistic interactions, where two bona fide viruses (with the ability to infect and replicate inside the host independently) interact and complement each other. The YkV1/YnV1 interaction is also distinct from interactions between helper viruses and subviral elements (such as satellite or defective viruses). Unlike the genomes of satellite or defective viruses, YkV1's genome is replicated by its own RdRp. In this article, we discuss the genome characteristics of the two viruses and their unique mutualistic interplay in R. necatrix. In addition, we give an overview of other viruses or virus-like entities that may have similar YkV1/YnV1-like partnerships.

Virion Morphology Particles formed by YnV1 are spherical in shape and approx. 40 nm in diameter (Fig. 1). The YkV1 virus genome and RdRp are also encased by the same capsid.

Genome Characteristics of YnV1 At least three different strains of YnV1 (designated as YnV1-A, B and C) have been reported so far from R. necatrix, while a strain from Sclerotium rolfsii with sequence similarity to YnV1 has been partially characterized. All three YnV1 genomes are composed of a single


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.20949-7


Fig. 1 Electron micrograph of negatively stained virus particles. The scale bar represents 20 nm. Courtesy of Dr. Hideki Kondo at Institute of Plant Science and Resources, Okayama University.

dsRNA element. The lengths of YnV1-A (GenBank Accession No. LC061478), B (LC006254) and C (LC006256) are 8971 bp, 8958 bp, and 8952 bp, respectively. The genome of YnV1-A contains two open reading frames (ORFs), ORF1 (4965 bp) and ORF2 (3339 bp). YnV1-A harbors 5′ and 3′ untranslated regions (UTRs) of 150 bp and 392 bp, respectively. YnV1-A ORF1 encodes a capsid protein with a zinc finger-like motif at the N-terminal side (Fig. 2). ORF2 encodes a putative RdRp that contains a Phytoreo_S7 (pfam07236) domain upstream of the RdRp motifs (Fig. 2). All the domains identified in YnV1-A are also conserved in YnV1-B and C. The Phytoreo_S7 domain is conserved across different virus families including Reoviridae, Chrysoviridae and Endornaviridae. It is assumed that this domain has been transferred horizontally between fungal and plant viruses over evolutionary time. In several non-phytoreoviruses, the domain is located downstream of the RdRp domain, whereas in YnV1, as in members of Endornaviridae and Chrysoviridae, the Phytoreo_S7 domain is located upstream of the RdRp domain. Regarding the function of this domain, a previous study demonstrated that the minor inner core protein of rice dwarf virus (a phytoreovirus), which carries a Phytoreo_S7 domain, possesses dsRNA-binding capability. Therefore, this domain could be associated with viral RNA synthesis occurring at the inner core of particles. The exact role of the Phytoreo_S7 domain in YnV1 is yet to be determined. Pairwise comparison showed 8%–10% sequence divergence at the amino acid level in the CP regions among all three strains of YnV1, while the sequence divergence was only 6%–7% between their corresponding RdRp regions. It is predicted that YnV1 CP is processed either by self-cleavage activity of YnV1 itself or by host-derived proteases.
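As a quick consistency check on the YnV1-A figures quoted above, the listed UTR and ORF lengths can be summed and compared with the reported genome length. The sketch below uses only the numbers given in the text; the function and variable names are illustrative. The remainder it reports is simply the length not assigned to any listed feature, and no interpretation of how the two ORFs are arranged relative to each other is implied.

```python
# Lengths (in bp) quoted in the text for YnV1-A (GenBank LC061478).
GENOME_LENGTH = 8971
FEATURES = {"5' UTR": 150, "ORF1": 4965, "ORF2": 3339, "3' UTR": 392}


def unassigned_length(total: int, features: dict[str, int]) -> int:
    """Return the number of bases not covered by the listed features."""
    return total - sum(features.values())


print(unassigned_length(GENOME_LENGTH, FEATURES))  # 125
```

The listed features sum to 8846 bp, leaving 125 bp that the text does not attribute to either ORF or UTR.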

Genome Characteristics of YkV1 The ssRNA genome of YkV1 is 6310 bp in length, presumably existing in the host as a dsRNA replicative form. The genome contains a single large ORF (4293 bp) encoding a putative polyprotein, with the RdRp domain situated close to the center of the N-terminal portion (Fig. 2). The lengths of its 5′ and 3′ UTRs are 796 bp and 1221 bp, respectively. In addition to the RdRp domain, YkV1 also contains a picornavirus 2A-like domain (Fig. 2). This domain was first reported in foot-and-mouth disease virus (FMDV) and later in several other members of the family Picornaviridae and in diverse viruses such as animal dsRNA toti-like viruses and reoviruses. The picornavirus 2A is a self-processing or ribosome-skipping peptide that separates structural proteins from replication-associated proteins by cleaving a growing polyprotein at the C-terminal side of a conserved motif, DxExNPGP (where x is any amino acid). The cleavage or ribosomal skip occurs co-translationally, which is entirely distinct from bona fide protease-associated cleavage of viral polyproteins. The YkV1 polyprotein is also predicted to undergo self-processing or ribosome skipping mediated by the 2A-like motif. Recently, a next-generation sequencing approach combined with Sanger sequencing showed that a single Spanish R. necatrix isolate (Rn454) harbors three additional yado-kari viruses, designated YkV2, YkV3 and YkV4. While YkV2 contains a single large ORF like its YkV1 counterpart, the genome of YkV4 consists of two ORFs (Fig. 2). Interestingly, YkV3 has two genome variants, one with a single-ORF organization and the other with a double-ORF arrangement like that of YkV4. Notably, another Spanish isolate of R. necatrix, Rn95-16, also harbored YkV4. YkV2 to YkV4 share greater sequence similarity (approximately 60% identity in the RdRp region) with one another than with YkV1 (approximately 30% identity). A feature distinguishing YkV1 from YkV2 to YkV4 is the presence of a poly(A) tail at the 3′ end of the genome. Notably, the 2A-like motifs of YkV1 and YkV2 are situated at almost the same position where ORF1 of YkV3 (two-ORF variant, not shown in Fig. 2) and YkV4 ends (Fig. 2). This


Fig. 2 Schematic representation of the genomic organization of yado-nushi virus 1 (YnV1) and all four yado-kari viruses (YkV1, YkV2, YkV3 and YkV4). The RdRp domains are highlighted in red for all the viruses. The zinc-finger and Phytoreo_S7 motifs on YnV1 are highlighted in purple and green, respectively. In the YnV1 genome, -1 FS indicates a possible -1 frameshift. The arrows indicate the cleavage positions within the conserved 2A-like domains (between amino acids G and P) of YkV1, YkV2 and YkV3. The molecular weight (in kDa) of each putative polypeptide is shown above its corresponding open reading frame (ORF).

suggests the possibility that yado-kari viruses may encode at least two distinct polypeptides, i.e., the RdRp for genome replication and transcription and another putative protein with unknown function.
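The predicted 2A-mediated separation described above can be illustrated with a short Python sketch that scans a protein sequence for the DxExNPGP consensus and splits it between the G and the final P, as indicated by the arrows in Fig. 2. This is a minimal illustration under stated assumptions: the function names are hypothetical, the strict consensus is taken literally from the text (real 2A-like motifs may deviate from it), and the toy sequence is invented rather than taken from any yado-kari virus.

```python
import re

# Consensus from the text: DxExNPGP, where x is any amino acid.
# The lookahead keeps the trailing P out of the match, so match.end()
# falls exactly between G and P, the predicted separation point.
MOTIF_2A = re.compile(r"D.E.NPG(?=P)")


def find_2a_sites(protein: str) -> list[int]:
    """Return 0-based indices of the P residue that begins each downstream product."""
    return [m.end() for m in MOTIF_2A.finditer(protein.upper())]


def split_at_2a(protein: str) -> list[str]:
    """Split a polyprotein at every predicted G|P junction."""
    pieces, start = [], 0
    for site in find_2a_sites(protein):
        pieces.append(protein[start:site])
        start = site
    pieces.append(protein[start:])
    return pieces


# Hypothetical toy sequence, NOT a real YkV1 polyprotein fragment.
toy = "MKTAYLLDVESNPGPMSRQ"
print(find_2a_sites(toy))  # [14]
print(split_at_2a(toy))    # ['MKTAYLLDVESNPG', 'PMSRQ']
```

Applied to a real polyprotein, the two pieces would correspond to the upstream product ending in ...NPG and the downstream product beginning with P, consistent with the suggestion that yado-kari viruses may encode at least two distinct polypeptides.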

Phylogenetic Placements of YnV1 and YkV1 The ORF1-encoded CP of YnV1 has no significant sequence similarity with any of the viral proteins deposited so far in GenBank. However, the ORF2-encoded putative RdRp shows modest levels of sequence identity to the corresponding regions of dsRNA viruses such as Giardia lamblia virus (GLV) and Glomus sp. RF1 medium virus (GRF1V-M). Phylogenetically, YnV1 is also distantly related to relatively common dsRNA mycoviruses, i.e., members of the families Totiviridae, Quadriviridae, Megabirnaviridae and Chrysoviridae, and of the proposed family Fusagraviridae, such as Sclerotinia sclerotiorum dsRNA mycovirus-L (also known as Sclerotinia sclerotiorum non-segmented virus L or SsNsV-L) (Fig. 3(A)). Phylogenetic analysis with the putative RdRp region of YkV1 showed that it is most closely related to viruses such as Penicillium aurantiogriseum foetidus-like virus 1 (PaFlV1), Aspergillus foetidus slow virus S2 (AfV-S2) and Fusarium poae mycovirus 2 (FpMyV2) (Fig. 3(B)). Interestingly, YkV2, YkV3, and YkV4 are phylogenetically more closely related to FpMyV2 than to their Japanese counterpart YkV1. Indeed, based on RdRp-based phylogenetic analysis, it can be assumed that yado-kari viruses are distantly related to animal-infecting (+)ssRNA viruses belonging to the family Caliciviridae. It is noteworthy that both families, Totiviridae and Caliciviridae, are members of the extended picorna-like superfamily and are assumed to share the same ancestor.

Proposed Replication Model for YnV1 and YkV1 It is anticipated that, like other typical dsRNA viruses, YnV1 replicates and transcribes its genome using its own RdRp. YkV1 hijacks capsids from YnV1 and utilizes them as its replication site (Fig. 4). As the RdRp of YkV1 is also encased inside the YnV1-encoded capsid, both replication (synthesis of negative-strand RNA) and transcription (synthesis of positive-strand RNA) are presumed to be carried out by its own RdRp. This way, despite being a ssRNA virus, YkV1 likely exploits a replication strategy similar to a dsRNA


Fig. 3 Phylogenetic trees constructed with amino acid sequences of RNA-dependent RNA polymerase (RdRp) regions from (A) yado-nushi virus 1 (YnV1) and (B) yado-kari virus 1 (YkV1), and their related viruses. The figure has been reproduced with permission from Hisano, S., Zhang, R., Faruk, M.I., Kondo, H., Suzuki, N., 2018. A neo-virus-lifestyle exhibited by a (+)ssRNA virus hosted in an unrelated dsRNA virus: taxonomic and evolutionary considerations. Virus Research 244, 75–83.

virus inside its fungal host. This model clearly distinguishes YkV1 from other satellite RNAs or satellite RNA viruses that may or may not encode their own CP, but surely do not encode their own RdRp, therefore depending upon RdRp of their corresponding helper viruses.


Fig. 4 Proposed model for mutualistic interactions between yado-nushi virus 1 (YnV1) and yado-kari virus 1 (YkV1).

Molecular Entities Sharing Similar YkV1/YnV1-Like Interactions The novel interplay between YkV1 and YnV1 can be mistaken for the interactions between satellites and their helper viruses. Satellite or defective RNAs and viruses are associated with helper viruses, on which they must depend for genome replication. By contrast, YkV1 encodes its own RdRp and therefore does not depend upon YnV1 for genome replication, although YkV1 diverts the capsid of YnV1 for use as its replication site. It seems that YkV1/YnV1-like mutualistic virus–virus interactions between unrelated viruses may not be so uncommon in nature. Umbraviruses, which have a (+)ssRNA genome, do not encode a CP and therefore form no particles. Unlike YkV1, umbraviruses are speculated to replicate in rearranged host membranes like other (+)ssRNA viruses, infecting their hosts systemically and independently of their helper viruses. For encapsidation and aphid transmission, they depend solely on (+)ssRNA assistor viruses, i.e., luteoviruses, which are distinct from umbraviruses. In the absence of luteoviruses, the umbravirus genomes form filamentous ribonucleoprotein structures, and their ORF3 and ORF4 products are predicted to help them in long-distance movement (to cause systemic infection) and cell-to-cell movement, respectively. Apparently, YkV1 does not have this type of survival mechanism in the absence of YnV1. Papaya sticky disease or “meleira” is a common disease of papaya in Brazil and Mexico. The causal agent of the disease is a complex of a dsRNA virus named papaya meleira virus (PMeV), a tentative member of the proposed family Fusagraviridae, and a (+)ssRNA virus termed papaya meleira virus 2 (PMeV2). The genome of PMeV is approximately 10 kbp in size and contains two ORFs, encoding a putative CP and an RdRp. This virus can only produce typical disease symptoms when accompanied by PMeV2, a 4.5 kb ssRNA virus. 
The genome of PMeV2 contains two ORFs and is closely related to those of umbraviruses, although no homolog of the umbravirus movement protein is detected in PMeV2. Like that of YkV1, the genome of PMeV2 is trans-encapsidated by the CP of its partner, PMeV. Unlike in the aforementioned umbravirus–luteovirus interplay, PMeV2, an ssRNA virus, behaves like YkV1 and acquires CP from the dsRNA fusagravirus PMeV, which acts like YnV1 in this case. PMeV2 helps PMeV to establish severe disease symptoms in papaya plants. Despite the fact that PMeV does not encode any movement proteins, it can cause systemic infection in papaya. It is assumed that the virus uses laticiferous vessels for systemic movement in the plant. It would be interesting to examine the role of the PMeV2 ORF1-encoded hypothetical protein in cell-to-cell or long-distance virus movement. Another example of comparable virus–virus interplay is the interaction between rice tungro bacilliform virus (RTBV) and rice tungro spherical virus (RTSV). Rice tungro is a devastating disease of rice, especially in South and Southeast Asia, causing significant losses in rice production each year. The disease is caused by two morphologically different viruses: RTBV, a reverse-transcribing dsDNA virus (pararetrovirus) that forms bacilliform capsids, and RTSV, a (+)ssRNA virus that forms isometric capsids. In mixed infections, both viruses are transmitted in a semi-persistent manner by leafhoppers. However, on its own, RTBV is not transmitted by leafhoppers; for this, it depends on RTSV. While RTSV can infect plants on its own, causing either no symptoms or mild disease symptoms, severe disease manifestation occurs only when it is accompanied by RTBV. A similar mutualistic relationship also exists between human hepatitis B virus (HBV, a dsDNA hepadnavirus) and a viroid-like circular (−)ssRNA virus satellite, hepatitis D virus (HDV, deltavirus). HDV is well known for encoding a ribozyme and hepatitis delta antigen (HDAg). 
However, for the completion of its life cycle (virion assembly, transmission and cell to cell movement), HDV requires its helper virus HBV. Again, while YkV1 encodes its own RdRp for independent genome replication, HDV depends on host RNA polymerase II and its own ribozyme to replicate in a rolling-circle manner and transcribe its genome in a way similar to the


replication strategy of plant-infecting viroids belonging to the family Avsunviroidae. This feature is different from general satellite RNAs or satellite RNA viruses. Thus, HDV is regarded as a satellite-like RNA by the International Committee on Taxonomy of Viruses.

YkV1/YnV1-Like Virus Combinations in Other Fungi Phylogenetically, YkV1 is related to other (+)ssRNA viruses infecting filamentous fungi other than R. necatrix, including PaFlV1 (infecting Penicillium aurantiogriseum), AfV-S2 (infecting Aspergillus foetidus) and FpMyV2 (infecting Fusarium poae). It is noteworthy that, like YkV1, these viruses have genomes with a single-ORF architecture, encoding an RdRp with a 2A-like motif at the C-terminal side of the polyprotein. Interestingly, like YkV1, PaFlV1, AfV-S2, and FpMyV2 are also accompanied by either totiviruses or toti-like viruses with a two-ORF genome organization. This suggests that there could be an intimate relationship between (+)ssRNA and dsRNA viruses similar to that of YkV1/YnV1 in other fungi as well. These combinations of viruses also share some common features. For example, the genome of each of these YkV1-like (+)ssRNA viruses is always smaller than that of its presumed dsRNA virus partner. It is noteworthy that R. necatrix strains carrying YkV2 to YkV4 are also co-infected by multiple dsRNA viruses with the two-ORF, undivided genome architecture. However, none of the dsRNA viruses detected in the fungi harboring these yado-kari viruses shared even moderate levels of sequence similarity with YnV1.

Predicted Past and Future of YkV1 Phylogenetic analyses showed that YkV1 is distantly related to calici-like viruses. This suggests that it might have evolved from a calici-like virus after losing the CP gene. This possibility has been proposed for other capsidless ssRNA viruses as well. Most probably, after losing its CP gene, YkV1's progenitor survived inside the host by capturing CP from a co-infecting full-fledged toti-like dsRNA virus and using the resulting capsid as its site of replication. Another possibility is that YkV1 evolved from a satellite RNA that somehow acquired the RdRp gene from a calici-like virus and then started hijacking CP from a co-infecting toti-like dsRNA virus. The idea of encountering a toti-like host virus is not implausible, as mixed viral infection is very common in fungi, including R. necatrix. In the future, to become a full-fledged virus, YkV1 would need either to acquire a CP gene from a compatible dsRNA virus (most probably a toti-like virus) or to find a way to utilize host-derived membranous vesicles as its replication site, as hypoviruses do.

Concluding Remarks and Future Directions The neo-viral lifestyle of two unrelated viruses (YkV1 and YnV1) was first described in 2016. Since then, potential YkV1/YnV1-like virus combinations have been detected in other fungal hosts. It would be worth examining whether these viruses share a neo-viral lifestyle similar to the one exhibited by YkV1 and YnV1. Indeed, many aspects of this unique lifestyle are still not fully understood, and several open questions remain regarding this mutualistic interplay. (1) Does the 2A-like motif cleave the yado-kari polyprotein co-translationally by a ribosome-skipping mechanism? (2) How does YnV1 efficiently trans-encapsidate the genome and RdRp of YkV1? (3) Are the YkV1 and YnV1 genomes encapsidated together or separately? (4) Where does the assembly origin of YkV1 reside in its genome? (5) What is the molecular mechanism responsible for the enhancement of YnV1 replication by YkV1? The influence of YkV1 and YnV1 on their host's biology is also not fully understood. Hopefully, future research will find answers to these questions.

Further Reading
Arjona-Lopez, J.M., Telengech, P., Jamal, A., et al., 2018. Novel, diverse RNA viruses from Mediterranean isolates of the phytopathogenic fungus, Rosellinia necatrix: Insights into evolutionary biology of fungal viruses. Environmental Microbiology 20, 1464–1483.
Hibino, H., 1996. Biology and epidemiology of rice viruses. Annual Review of Phytopathology 34, 249–274.
Kozlakidis, Z., Herrero, N., Ozkan, S., Bhatti, M.F., Coutts, R.H., 2013. A novel dsRNA element isolated from the Aspergillus foetidus mycovirus complex. Archives of Virology 158, 2625–2628.
Littlejohn, M., Locarnini, S., Yuen, L., 2016. Origins and evolution of hepatitis B virus and hepatitis D virus. Cold Spring Harbor Perspectives in Medicine 6 (1), a021360.
Osaki, H., Sasaki, A., Nomiyama, K., Tomioka, K., 2016. Multiple virus infection in a single strain of Fusarium poae shown by deep sequencing. Virus Genes 52, 835–847.
Ryabov, E.V., Taliansky, M.E., Robinson, D.J., et al., 2012. Umbravirus. In: King, A.M.Q., Adams, M.J., Carstens, E.B., Lefkowitz, E.J. (Eds.), Virus Taxonomy. Ninth Report of the International Committee on Taxonomy of Viruses. London: Elsevier Academic Press, pp. 1191–1195.
Sá Antunes, T.F., Amaral, R.J., Ventura, J.A., et al., 2016. The dsRNA virus papaya meleira virus and an ssRNA virus are associated with papaya sticky disease. PLoS One 11 (5), e0155240. doi:10.1371/journal.pone.0155240.
Wolf, Y.I., Kazlauskas, D., Iranzo, J., et al., 2018. Origins and evolution of the global RNA virome. mBio 9. doi:10.1128/mBio.02329-18.
Yaegashi, H., Nakamura, H., Sawahata, T., et al., 2013. Appearance of mycovirus-like double-stranded RNAs in the white root rot fungus, Rosellinia necatrix, in an apple orchard. FEMS Microbiology Ecology 83, 49–62.
Yang, X., Cheng, A., Wang, M., et al., 2017. Structures and corresponding functions of five types of picornaviral 2A proteins. Frontiers in Microbiology 8, 1373. doi:10.3389/fmicb.2017.01373.
Zhang, R., Hisano, S., Tani, A., et al., 2016. A capsidless ssRNA virus hosted by an unrelated dsRNA virus. Nature Microbiology 1, 15001.

Yeast L-A Virus (Totiviridae) Reed B Wickner, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, United States Tsutomu Fujimura and Rosa Esteban, Institute of Biology and Functional Genomics, CSIC/University of Salamanca, Salamanca, Spain Published by Elsevier Ltd.

Glossary
Decapping Removal of the methylated GMP in 5'-5' linkage at the 5' end of most eukaryotic mRNAs.
Ribosomal frameshifting Ribosomes changing reading frame on an mRNA, in response to a special mRNA structure, to synthesize a protein from two overlapping reading frames.

Introduction The L-A virus of baker's/brewer's yeast, Saccharomyces cerevisiae, is the exemplar strain (ScV-L-A) of the species Saccharomyces cerevisiae virus L-A (genus Totivirus, family Totiviridae). L-A is one of several RNA viruses infecting this organism, each of which spreads via the cell–cell fusion of mating, rather than by the extracellular route. The totivirus L-A, like members of the very similar L-BC family of viruses (ScV-L-BC), is a single-segment 4.6 kbp double-stranded RNA (dsRNA) virus encapsidated in icosahedral particles with a single major coat protein called Gag. The 20S (Saccharomyces 20S RNA narnavirus, ScNV20S) and 23S RNA replicons (Saccharomyces 23S RNA narnavirus, ScNV23S) are cytoplasmic single-stranded RNA (ssRNA) replicons that are naked except for their bound RNA-dependent RNA polymerases. The L-A virus serves as the helper virus for any of several smaller satellite dsRNAs, called M dsRNAs, each encoding a secreted protein toxin and immunity to that toxin, producing the ‘killer’ phenomenon. Killer strains can eliminate some of the competition by this means, although less than 5% of wild strains harbor a killer dsRNA, suggesting there are costs to carrying this replicon. The killer phenotype was used to study the genetics of M dsRNAs and the helper L-A. Several functional variants of L-A were defined based on their interactions with different M dsRNAs and with the host, and host mutants affected in virus expression or propagation were also examined.

History In 1963, Makower and Bevan reported that some yeast strains secrete a toxin that kills other yeasts. This led to the identification by Bevan and by Fink of cellular dsRNAs, and later of viral particles, correlated with the killer phenomenon. Studies of the chromosomal genes affecting the killer system revealed that the Kex2 protease, identified by its requirement for toxin secretion, was the long-sought proinsulin-processing enzyme. The Mak3 N-acetyltransferase, whose acetylation of Gag is needed for viral assembly, established consensus sequences for such enzymes. The loss of the L-A virus in mak3 mutants revealed a second dsRNA species of the same size, called L-BC, which is unrelated to the killer system.

Virion Structure L-A virions have icosahedral symmetry, but contain 120 Gag monomers per particle, in apparent violation of the rules of quasi-equivalence. In fact, each virion is composed of 60 asymmetric dimers of Gag (Fig. 1), a feature that is common to the cores of all dsRNA viruses that have been characterized. One type of Gag molecule makes contact with the icosahedral fivefold and twofold axes, but not with the threefold axes. The second type of Gag is in contact with the threefold axis, but not with either the fivefold or twofold axes (Fig. 1). The two environments of Gag lead to two distinct conformations, suggesting that Gag may be more flexible than some other coat proteins. The L-A virion has more volume per nucleotide than do ssRNA or dsDNA viruses. These facts may reflect the requirement that the dsRNA moves inside the particle as it is transcribed by the RNA-dependent RNA polymerase that is fixed to the inner virion wall (see below). Pores at the fivefold axes are assumed to allow entry of nucleotides and exit of (+) strand transcripts, but retention of the dsRNA genome. A trench on the outside contains His154, the central active site residue of the mRNA-decapping activity to which the 7-methyl-GMP structure becomes covalently attached (Fig. 2), and from which it can then be transferred to new viral (+) strands. A layered density observed for the viral dsRNA may reflect the dsRNA’s rigid structure and the fact that it is forced to press against the inner capsid wall.


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21283-1

Yeast L-A Virus (Totiviridae)


Fig. 1 Wire diagram of the L-A virus capsid. The major coat protein, Gag, is found in two nonequivalent positions: ‘A’ molecules contact the fivefold and twofold axes, while ‘B’ molecules contact the threefold axes. Reproduced from Naitow, H., Tang, J., Canady, M., Wickner, R.B., Johnson, J.E., 2002. L-A virus at 3.4 Å resolution reveals particle architecture and mRNA decapping mechanism. Nature Structural Biology 9, 725–728, with permission from Nature Publishing Group.

Fig. 2 Ribbon diagram of a single Gag molecule. The trench on the outer surface includes His154, the Gag residue to which 7meGMP is covalently attached by the decapping activity.

Genome Organization The single segment of the L-A genome is a 4.6 kbp dsRNA with two long open reading frames (ORFs) (Fig. 3). The 5′ ORF encodes the major coat protein, called Gag in analogy with retroviruses, while the longer 3′ ORF, called Pol, encodes the RNA-dependent RNA polymerase, and has homology with similar enzymes of other ssRNA and dsRNA viruses. Pol is expressed only as a fusion protein whose amino end is a nearly complete Gag molecule. The fusion is carried out by a –1 ribosomal frameshift event in the region of overlap of the two ORFs. An RNA pseudoknot, in the region of overlap of the ORFs, slows ribosome progression at a point on the mRNA (of the form X XXY YYZ (0 frame indicated)) where bonding of the A-site and P-site tRNAs to the mRNA is nearly as good in the –1 frame as in the 0 frame. About 1% of ribosomes slip into the –1 frame, and, continuing translation, make the Gag–Pol fusion protein.
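The X XXY YYZ rule can be illustrated with a minimal sketch that scans an mRNA for such heptamers in the 0 frame and shows why the tRNAs can re-pair after a –1 slip. The sequence below is a hypothetical toy example, not the L-A sequence, and the function names are illustrative.

```python
def find_slippery_sites(mrna, frame_start=0):
    """Return (index, heptamer) for X XXY YYZ motifs aligned with the 0 frame."""
    sites = []
    # The heptamer begins one base before a codon boundary: X | XXY | YYZ.
    for i in range(frame_start + 2, len(mrna) - 6, 3):
        hep = mrna[i:i + 7]
        if hep[0] == hep[1] == hep[2] and hep[3] == hep[4] == hep[5]:
            sites.append((i, hep))
    return sites


def codons_before_and_after_slip(hep):
    """Return P-site and A-site codons in the 0 frame and in the -1 frame."""
    zero_frame = (hep[1:4], hep[4:7])   # XXY in the P site, YYZ in the A site
    minus_one = (hep[0:3], hep[3:6])    # XXX in the P site, YYY in the A site
    return zero_frame, minus_one


# Hypothetical toy mRNA; codons in the 0 frame are AUG GCG GGU UUA GGC UAA.
toy_mrna = "AUGGCGGGUUUAGGCUAA"
sites = find_slippery_sites(toy_mrna)
zero_frame, minus_one = codons_before_and_after_slip(sites[0][1])

print(sites)       # [(5, 'GGGUUUA')]
print(zero_frame)  # ('GGU', 'UUA')
print(minus_one)   # ('GGG', 'UUU')
```

In both frames the first two bases of each codon are unchanged, so only the third (wobble) position differs after the slip, which is consistent with pairing being nearly as good in the –1 frame as in the 0 frame.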

Replication Cycle Transcription Viral particles have an RNA-dependent RNA polymerase that is due to the Pol part of the Gag–Pol fusion protein. Transcription is a conservative reaction, meaning that the parental strands remain together after the synthesis of a new (+) strand. The L-A (+) strand transcripts are extruded from the particles into the cytoplasm where they serve as both mRNA and as the species that is


Fig. 3 Genome organization. The L-A (+) strand encodes Gag and Pol in overlapping reading frames. The mRNA lacks 3′ polyA structures, but gets a 5′ cap stolen from cellular mRNAs. The location of the packaging site on the RNA and the region of the Pol protein segment that recognizes the packaging site are indicated. The RNA sites for replication, and the parts of Pol with homology to other RNA-dependent RNA polymerases (RdRp) are indicated. There is a cryptic in vitro RNA binding site that is inhibited by the protein region immediately C-terminal to it.

packaged by coat proteins to make new viral particles. L-A (+) strand transcripts that are not degraded immediately have two possible fates: they may be translated or they may be encapsidated. The cap-stealing activity of Gag transfers the cap 7meGMP from cellular mRNAs to some L-A (+) strands, directing them to the translation apparatus, while only uncapped L-A (+) strands are encapsidated. The (+) strand transcripts of the smaller M dsRNA or of deletion mutants of L-A are often retained within the particle where they may be replicated. This is called ‘headful replication’ because the volume of the particle seems to be the determinant of how many dsRNA molecules may accumulate in each particle.

RNA Packaging The RNA packaging site (Fig. 3) is a stem–loop structure about 400 nucleotides from the 3′ end of the L-A (+) strand. This stem–loop has an essential A residue protruding on the 5′ side of the stem. The loop sequence is also important, but the stem sequence is not critical, as long as the stem structure can form. The packaging site on the RNA is recognized by the proximal part of the Pol domain of the Gag–Pol fusion protein. The Gag part of Gag–Pol is part of the capsid structure, so the Gag–Pol fusion protein structure assures packaging of viral (+) strands in new particles.

RNA Replication The (−) strand synthesis reaction is called replication (Fig. 4). This reaction involves recognition of the internal binding site on the L-A (+) strands by Pol, followed by interaction with the now nearby 3′ end and initiation of new RNA (−) chains. Multiple rounds of RNA synthesis can proceed within the viral particle in the ‘headful replication’ process discussed above.

Viral Translation The translation apparatus is the battleground of RNA viruses with their hosts. Poliovirus protease cleaves eIF4G so that host capped mRNAs cannot be translated but its own internal ribosomal entry site-containing mRNAs are used. Influenza virus steals caps from host mRNAs, as does L-A. The L-A coat protein removes caps from cellular mRNAs and transfers them to newly synthesized L-A (+) strands.


Fig. 4 Replication cycle of the L-A virus. Both (+) and (−) RNA strand synthesis occur within the viral particles, but at different stages of the cycle. RNA (+) strands (transcripts) are extruded from the particles and serve as both mRNA and the species packaged to make new particles. The (+) strands with a cap are translated while those without a cap are encapsidated or degraded. Translation of viral mRNA is blocked by Ski proteins. N-terminal acetylation of Gag by Mak3p is necessary for viral assembly.

L-A (+) strand transcripts lack the 3′ polyA structures typical of eukaryotic mRNAs, and so are at a distinct disadvantage for translation. The requirement for a 3′ polyA structure for translation is imposed by the ribosome-associated Ski (superkiller, after the phenotype of the mutants) proteins (except Ski1p/Xrn1p). These Ski proteins both degrade non-polyA mRNAs and block their translation. In the absence of these Ski proteins, non-polyA mRNAs are expressed with the same kinetics as polyA+ messages. The SKI1/XRN1 gene encodes a 5′→3′ exoribonuclease specific for uncapped mRNA. As a defense against its degradation of viral mRNA, L-A’s Gag protein has a decapping activity that removes caps from cellular mRNAs, transfers them to viral (+) strands protecting them from Ski1p, and leaves some cellular mRNAs as ‘decapitated decoys’, uncapped RNAs that may serve as alternative targets for the exonuclease action. In the absence of this decapping activity, viral mRNAs are not expressed, but deletion of the SKI1/XRN1 gene restores viral mRNA expression. Together with the Ski proteins, a mitochondrial nuclease, Nuc1p, is released during meiosis/sporulation and protects cells from toxin overexpression during this process. Nuc1p is homologous to EndoG, an apoptosis nuclease of mammals, suggesting an evolutionary origin of apoptosis as an antiviral process.

L-A Genetics Several natural variants of L-A have been described. The ability of an L-A variant to support M dsRNA replication was first called [HOK] (helper of killer) and then H when it was found to be a property of L-As such as L-A-H, L-A-HN, or L-A-HNB. Making


several chromosomal MAK genes dispensable for M propagation was named [B] (bypass) and is found on L-A-HNB. Ability to exclude L-A-H was named [EXL] (exclusion) and then just E on L-A-E. Insensitivity to the action of [EXL] was called [NEX] (nonexcludable), and then shortened to N as in L-A-HN or L-A-HNB. The molecular basis of these interactions has resisted study because of the current inability to obtain L-A dsRNA from a cDNA clone. M dsRNA can be supported from a cDNA clone of L-A, but the L-A virus has not been shown to be regenerated from these transcripts.

Other RNA Replicons in Yeast: L-BC, 20S RNA, 23S RNA 20S RNA was discovered as an RNA species which appeared when cells were placed under conditions that induce meiosis and spore formation, namely near-starvation for nitrogen and provision of acetate as a carbon source. It was shown that 20S RNA is an independent replicon, and can be made independent of the sporulation or meiosis processes. 20S RNA encodes its own RNA-dependent RNA polymerase which is bound to the otherwise naked cytoplasmic RNA. The mechanism of 20S RNA replication control by culture conditions remains to be elucidated. 23S RNA is a related, but independent yeast replicon which was found as a dsRNA form (called T). 23S RNA also encodes its RNA-dependent RNA polymerase. Both 20S RNA and 23S RNA can be launched from cDNA clones to form replicating virus. 20S RNA (ScNV20S) and 23S RNA (ScNV23S) are classified into the genus Narnavirus within the family Narnaviridae. L-BC is a totivirus (genus Totivirus, family Totiviridae), related to L-A but independent of it. Its dsRNA is essentially the same size as that of L-A although its copy number is usually about tenfold lower. Thus it was not detected until chromosomal mutants that lose L-A were examined. L-BC does not interact with the killer system, and confers no obvious phenotype, so its genetics has not been extensively explored.

See also: Totiviruses (Totiviridae). Viral Killer Toxins


ALGAL VIRUSES

Algal Marnaviruses (Marnaviridae) Marli Vlok and Curtis A Suttle, University of British Columbia, Vancouver, BC, Canada Andrew S Lang, Memorial University of Newfoundland, St. John’s, NL, Canada © 2021 Elsevier Ltd. All rights reserved. This is an update of A.S. Lang, C.A. Suttle, Marnaviruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00567-7.

RNA Viruses Infecting Marine Protists Most studies on marine RNA viruses have been on those infecting either charismatic megafauna or animals of economic importance. Over the last three decades this research has extended to include RNA virus isolates infecting protists as well as RNA metagenomic studies. Through this, the known marine RNA viruses now include representatives of all four Baltimore RNA virus groups, with hosts spread across the eukaryotic tree of life. Of these viruses, those infecting single-celled eukaryotes are thought to be the dominant portion of the free-floating RNA virus community. The first RNA virus isolated and characterized that infects a marine protist was Heterosigma akashiwo RNA virus (HaRNAV). This virus was isolated from the Strait of Georgia in coastal British Columbia and it infects and causes the lysis of the marine, toxic bloom-forming, unicellular, photosynthetic alga H. akashiwo. HaRNAV particles are non-enveloped icosahedrons, with an average diameter of 25 nm (Table 1). The positive-sense monocistronic genome is 8.6 kb in size and has distinct domain organization characteristics that contributed to its classification as the founding member of the family Marnaviridae within the order Picornavirales. The polyprotein translation is under the control of a putative 5′ internal ribosome entry site (IRES), with the non-structural proteins (helicase [S3H], 3C protease [3CPro], and RNA-dependent RNA polymerase [RdRp]) encoded at the N terminus and the structural proteins (VP2, -4, -3, and -1) at the C terminus (Fig. 1). HaRNAV particles become visible within infected cells 48 h post infection. Like other RNA viruses it has a relatively large burst size, at 460–520 particles per cell. 
One notable sign of viral-induced intracellular cytopathology is the appearance of membranous vesicles within the cytoplasm; such structures are the site of RNA replication for other positive-sense single-stranded (ss) RNA viruses and are well characterized for picornaviruses such as poliovirus. This is followed by the appearance of assembled viral particles throughout the cytoplasm, and these can be at sufficient density for the particles to form crystalline arrays. Additional icosahedral (25–32 nm in diameter) ssRNA viruses (genomes ranging from 4.4 to 13.2 kb) infecting other marine protists have since been isolated and characterized. These include the dinoflagellate-infecting Heterocapsa circularisquama RNA virus (HcRNAV) of the family Alvernaviridae, Chaetoceros tenuissimus RNA virus types I and II (CtenRNAV01 and CtenRNAVII), Chaetoceros socialis f. radians RNA virus (CsfrRNAV), Chaetoceros sp. (CspRNAV) and Rhizosolenia setigera (RsRNAV) RNA viruses that infect centric diatoms, the pennate diatom-infecting Asterionellopsis glacialis RNA virus (AglaRNAV), and Aurantiochytrium RNA virus (AuRNAV), which infects a heterotrophic marine flagellate. There is also one known double-stranded (ds) RNA virus infecting a marine protist, MpRNAV-01B, which infects the photosynthetic Micromonas pusilla and represents the only species in the genus Mimoreovirus, subfamily Sedoreovirinae, family Reoviridae. Some studies have suggested other protist RNA viruses might exist in cultures. Two notable examples are the observation of RNA virus-like particles in Ostreococcus sp. and the determination of RdRp sequences for two viruses that infect the pennate diatom Cylindrotheca closterium. Recently, the potential host range of protist-infecting RNA viruses has also been extended to the holobiont of the multicellular protist Delisea pulchra. 
The detected viruses included +ssRNA viruses affiliated to the Picornavirales as well as some dsRNA viruses with sequence similarity to members of the Totiviridae. Metagenomic studies focused on planktonic marine RNA viruses have also revealed many novel viruses, including some complete viral genomic sequences. The majority of these show affiliation with the order Picornavirales.

Development of a Taxonomic Framework and Resulting Changes in Taxonomy In addition to HaRNAV’s classification as the founding member of the family Marnaviridae, four of the above-mentioned +ssRNA viruses had previously been formally taxonomically classified. These are AuRNAV, classified within the unassigned genus Labyrnavirus, and CtenRNAV01, CsfrRNAV and RsRNAV, which were classified within the unassigned genus Bacillarnavirus. Until recently, the International Committee on the Taxonomy of Viruses (ICTV) taxonomic classification system required inclusion of biological information such as host range, replication cycle, virus particle structure, and serology. However, recognizing the need to update the taxonomic system, the ICTV recently announced they would accept the formal taxonomic classification of viruses known only from metagenomic studies. We have recently applied a sequence-based framework for analysis of twenty marine RNA viruses that provides the justification for taxonomic classification of these viruses within the family Marnaviridae. The 20 viruses include the original and sole representative of the family, HaRNAV, 7 additional viruses represented by isolates, and 12 viruses discovered using metagenomics. Phylogenetic analysis of the RdRp domain sequences placed the 20 marine RNA virus sequences in a strongly supported monophyletic group relative to other Picornavirales sequences (Fig. 2). The original genus within the Marnaviridae, Marnavirus, is basal within the clade and the analysis places the 20 viruses into seven clades that we defined as genera within the family Marnaviridae. These include the


doi:10.1016/B978-0-12-809633-8.21323-X


Table 1 Isolates in the family Marnaviridae that infect marine protists

Virus | Abbreviation | Host | Genus | Particle diameter (nm) | Genome size (kb) | Latent period (hours) | Burst size (particles per cell)
Heterosigma akashiwo RNA virus | HaRNAV | Heterosigma akashiwo (Raphidophyte) | Marnavirus | 25 | 8.6 | 48 | 460–520
Rhizosolenia setigera RNA virus | RsRNAV | Rhizosolenia setigera (Diatom) | Bacillarnavirus | 32 | 11.2 | 48 | 1010–3100
Chaetoceros tenuissimus RNA virus 01 | CtenRNAV01 | Chaetoceros tenuissimus (Diatom) | Bacillarnavirus | 31 | 9.4 | <24 | 1.0 × 10^5
Chaetoceros tenuissimus RNA virus type II | CtenRNAVII | Chaetoceros tenuissimus (Diatom) | Sogarnavirus | 35 | 9.5 | 24–48 | 287
Chaetoceros species RNA virus 02 | CspRNAV | Chaetoceros sp. (Diatom) | Sogarnavirus | 32 | 9.0–10.0 | <48 | ND^a
Chaetoceros socialis f. radians RNA virus 01 | CsfrRNAV | Chaetoceros socialis f. radians (Diatom) | Bacillarnavirus | 22 | 9.5 | <48 | 66
Aurantiochytrium single stranded RNA virus | AuRNAV | Aurantiochytrium (Thraustochytrid) | Labyrnavirus | 25 | 9.0 | 8 | 5.8 × 10^3–6.4 × 10^4
Asterionellopsis glacialis RNA virus | AglaRNAV | Asterionellopsis glacialis (Diatom) | Kusarnavirus | 31 | 9.5 | 48 | ND

^a ND, not determined.


Fig. 1 Viral genomic organizations of members within the family Marnaviridae. The isolates and metagenomic-assembled viruses have either a mono- or dicistronic genome organization, which include a putative 5′ internal ribosome entry site (IRES), S3 helicase, 3C protease/peptidase, RNA-dependent RNA polymerase (RdRp), putative intergenic region (IGR) IRES (in dicistronic genomes), four capsid domains (VP2, VP4, VP3, and VP1), and a poly(A) tail.

Fig. 2 Unrooted maximum-likelihood phylogeny of the RNA-dependent RNA polymerase amino acid sequences of members in the order Picornavirales. Sides of grey triangles represent longest and shortest branches, with the base width indicating the number of genera present in the collapsed clade. Members of the family Marnaviridae are coloured by genus: Marnavirus (blue), Labyrnavirus (red), Locarnavirus (purple), Kusarnavirus (yellow), Bacillarnavirus (pink), Salisharnavirus (orange), and Sogarnavirus (green). SH-like branch support values are indicated at nodes when >0.70 and the scale bar indicates substitutions per site.


Table 2 Species in the family Marnaviridae

Genus | Species | Virus name | Abbreviation | Accession number | Size (nt) | Source
Marnavirus | Heterosigma akashiwo RNA virus^a | Heterosigma akashiwo RNA virus | HaRNAV | NC_005281 | 8587 | Heterosigma akashiwo
Labyrnavirus | Aurantiochytrium single stranded RNA virus^a | Aurantiochytrium single stranded RNA virus | AuRNAV | NC_007522 | 9035 | Aurantiochytrium sp.
Locarnavirus | Jericarnavirus B^a | Marine RNA virus JP-B | JP-B | NC_009758 | 8926 | Coastal marine
Locarnavirus | Sanfarnavirus 2 | Marine RNA virus SF-2 | SF-2 | KF412901 | 9321 | Coastal wastewater
Locarnavirus | Sanfarnavirus 1 | Marine RNA virus SF-1 | SF-1 | JN661160 | 8970 | Coastal wastewater
Locarnavirus | Sanfarnavirus 3 | Marine RNA virus SF-3 | SF-3 | KF478836 | 8648 | Coastal wastewater
Kusarnavirus | Astarnavirus^a | Asterionellopsis glacialis RNA virus | AglaRNAV | NC_024489 | 8842 | Asterionellopsis glacialis
Bacillarnavirus | Chaetoceros tenuissimus RNA virus 01 | Chaetoceros tenuissimus RNA virus 01 | CtenRNAV1 | AB375474 | 9431 | Chaetoceros tenuissimus
Bacillarnavirus | Rhizosolenia setigera RNA virus^a | Rhizosolenia setigera RNA virus | RsRNAV | AB243297 | 8877 | Rhizosolenia setigera
Bacillarnavirus | Chaetoceros socialis f. radians RNA virus 01 | Chaetoceros socialis f. radians RNA virus 01 | CsfrRNAV1 | NC_012212 | 9467 | Chaetoceros socialis f. radians
Salisharnavirus | Britarnavirus 4^a | Marine RNA virus BC-4 | BC-4 | MH171300 | 8593 | Coastal/oceanic marine
Salisharnavirus | Palmarnavirus 473 | Marine RNA virus PAL473 | PAL473 | NC_029309 | 6360 | Coastal marine
Salisharnavirus | Britarnavirus 1 | Marine RNA virus BC-1 | BC-1 | MG584187 | 8638 | Coastal marine
Salisharnavirus | Palmarnavirus 128 | Marine RNA virus PAL128 | PAL128 | NC_029306 | 8660 | Coastal marine
Sogarnavirus | Britarnavirus 2 | Marine RNA virus BC-2 | BC-2 | MG584188 | 8843 | Coastal marine
Sogarnavirus | Palmarnavirus 156 | Marine RNA virus Pal156 | PAL156 | NC_029307 | 7897 | Coastal marine
Sogarnavirus | Britarnavirus 3 | Marine RNA virus BC-3 | BC-3 | MG584189 | 8496 | Coastal marine
Sogarnavirus | Jericarnavirus A | Marine RNA virus JP-A | JP-A | NC_009757 | 9236 | Coastal marine
Sogarnavirus | Chaetenuissarnavirus II^a | Chaetoceros tenuissimus RNA virus type II | CtenRNAVII | NC_025889 | 9562 | Chaetoceros tenuissimus
Sogarnavirus | Chaetarnavirus 2 | Chaetoceros species RNA virus 02 | CspRNAV2 | AB639040 | 9417 | Chaetoceros sp.

^a Type species.

previously unassigned genera Bacillarnavirus and Labyrnavirus, and we chose the names Kusarnavirus, Locarnavirus, Salisharnavirus and Sogarnavirus for the four new genera. Capsid polyprotein sequence comparisons were used to delineate the species within genera (Table 2), with a cut-off of 75% pairwise amino acid identity. Among the 20 viruses and 7 genera, there is a mixture of mono- and dicistronic genome organizations (Table 1; Fig. 1). While the majority of the 20 viruses have a dicistronic genome organization, HaRNAV and SF-3 have a single predicted polyprotein encoded in their genomes. HaRNAV is the only representative of the genus Marnavirus, but SF-3 is one of four viruses falling within the genus Locarnavirus. There is precedent for such different genomic organizations within the Picornavirales, with both mono- and bipartite genomes found within the Secoviridae.
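A species cut-off of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: the text gives the 75% pairwise amino acid identity threshold for capsid polyprotein sequences but not the alignment method or grouping rule, so the gap handling, the single-linkage grouping, the "at or above the cut-off means same species" convention, and the toy sequences are all hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def group_species(aligned, cutoff=75.0):
    """Group sequences whose pairwise identity is at or above the cutoff."""
    parent = {name: name for name in aligned}

    def find(name):
        # Follow parent links to the group representative (union-find).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(aligned, 2):
        if percent_identity(aligned[a], aligned[b]) >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for name in aligned:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical aligned capsid fragments (toy data, not real sequences).
toy_alignment = {
    "virus_A": "MKTAYIAKQR",
    "virus_B": "MKTAYIAKQK",
    "virus_C": "MSTLWVGHQR",
}
species = group_species(toy_alignment)
print(species)  # [['virus_A', 'virus_B'], ['virus_C']]
```

Here virus_A and virus_B share 90% identity and fall into one group, while virus_C is below the cut-off against both and forms its own group.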

Marine RNA Virus Quasispecies RNA viruses exist in a mutation-selection balance in which a genetically diverse population of variant genomes, derived from a single progenitor, is organized around the fittest genome. This is known as the quasispecies concept, and it is important in the evolution of RNA viruses. Single-nucleotide variant analysis of six Marnaviridae members’ sequences identified in RNA virus metagenomic data sets from around the globe suggests that these viruses also exist as quasispecies, with many low-frequency mutations detectable across conserved domains. The distribution of viral genotypes across marine waterbodies is poorly understood. Patterns in the variant analysis showed that some of the viruses exhibited quasispecies-like populations with synonymous low-frequency mutations in locations distinct from the original site of discovery. Some mutations provide a competitive advantage to the virus, and they can become established in the viral population. Mutations occurring at >99% frequency were detected in these datasets across multiple viral domains and geographic locations, which indicates that turnover of the dominant genotypes had occurred.
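The distinction between low-frequency variants and near-fixed mutations can be illustrated with a minimal sketch that computes per-site variant frequencies from read counts. The 5% ceiling for "low-frequency" is an assumed value chosen for illustration; only the 99% threshold appears in the text. The reference sequence and counts are toy data.

```python
def classify_variants(reference, pileup, low_max=0.05, fixed_min=0.99):
    """Label each non-reference base by its frequency at that site.

    pileup is a list with one dict per reference position, mapping a base
    to the number of reads supporting it. low_max is an assumed threshold.
    """
    calls = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage at this site
        ref_base = reference[pos]
        for base, n in sorted(counts.items()):
            if base == ref_base or n == 0:
                continue
            freq = n / depth
            if freq > fixed_min:
                label = "near-fixed"
            elif freq <= low_max:
                label = "low-frequency"
            else:
                label = "intermediate"
            calls.append((pos, ref_base, base, freq, label))
    return calls


# Toy reference and read counts (hypothetical data).
toy_reference = "ACGU"
toy_pileup = [
    {"A": 1000},
    {"C": 980, "U": 20},
    {"G": 5, "A": 995},
    {"U": 600, "C": 400},
]
calls = classify_variants(toy_reference, toy_pileup)
for pos, ref_base, base, freq, label in calls:
    print(pos, ref_base, base, label)
```

In this toy example, a near-fixed call means that the base dominating the sampled population differs from the reference genome, which is the kind of signal interpreted above as turnover of the dominant genotype.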

Importance of RNA Viruses in Marine Microbial Ecology It has traditionally been believed that most of the estimated 10^7 marine virus particles per ml of seawater have DNA genomes and infect bacteria. Viral abundance is most frequently determined using epifluorescence or electron microscopy, or flow cytometry. These methods are good for detecting larger DNA viruses, but they lack the sensitivity and specificity to detect small RNA genomes


or, in the case of electron microscopy, differentiate RNA viruses from small DNA viruses. Use of fluorometry to determine the abundance of RNA- and DNA-containing virus particles suggested that at least 50% of the virioplankton in coastal waters have RNA genomes. Using quantitative RT-PCR, picorna-like RNA virus genome abundances were estimated to range from 0.2 to 5 × 10^7 per ml, thereby representing 14%–90% (mean approximately 40%) of the total coastal marine virus populations sampled. If this holds true for other marine environments, it could have huge implications for our understanding of viral production and the contribution of RNA viruses to biogeochemical processes. Considering that the majority of viruses identified in the study were picorna-like viruses, known to infect eukaryotes, we can assume that protists actually contribute more to marine viral dynamics than previously thought. The hosts for the isolated Marnaviridae members (Table 1) include ecologically important protists and these viruses are thought to have significant effects on both the frequency and duration of plankton blooms. For example, HaRNAV infects the bloom-forming Raphidophyte Heterosigma akashiwo. Only some strains of H. akashiwo are susceptible to infection by HaRNAV, and the pattern of susceptibility does not appear to relate to geographic origin of the strain in question. When the virus is added to a growing culture of susceptible H. akashiwo, complete lysis of the culture generally occurs within 1 week or less. Diatoms are integral members of marine phytoplankton communities and represent the largest proportion of host organisms for viruses in the family Marnaviridae. Diatoms are distributed worldwide and are responsible for 35%–75% of the primary production in the oceans. Diatom blooms occur seasonally in nutrient-rich ecosystems, where a succession of species is observed. Like HaRNAV, the studied diatom viruses are highly host specific, frequently exhibiting strain specificity. 
Although burst sizes range widely, from 66 (CsfrRNAV) to 1.0 × 10^5 (CtenRNAV01) particles per cell, these viruses can decimate host populations and alter population dynamics, species succession and the functioning of the biological carbon pump. The survival of host cells within a pool of viral pathogens suggests that there are regulatory factors affecting successful infection. For example, C. tenuissimus was found to be consistently detected at high abundances in both seawater and sediments off the coast of Japan where viruses infecting it are also known to occur in high abundance. While virus resistance could account for some of this, environmental niche partitioning was also shown to be an important contributor to viral infection, with temperature and salinity affecting viral infections of the same host organism.

Summary Work in the ocean, with both culturing and metagenomics approaches, has shown that marine systems harbor a whole new world of ssRNA viruses. The majority of these have shown affiliation with the order Picornavirales, and many of these have recently been formally classified within the family Marnaviridae. However, despite their shared characteristics and phylogenetic affiliations based on conserved domains, most of the genomes have no recognizable similarity to any other known sequences, viral or otherwise. More work is clearly needed with the individual marine RNA viruses, and even more viruses need to be brought into culture, sequenced and characterized. The ssRNA viruses isolated from marine systems infect basal eukaryotes, protists, and so these viruses may be ancestral to the related viruses infecting higher organisms such as mammals. There is no doubt that study of marine RNA viruses will further our understanding of RNA virus evolution.

Intraspecies host specificity of a single-stranded RNA virus infecting a marine photosynthetic protist is determined at the early steps of infection. Journal of Virology 81, 1372–1378. Moniruzzaman, M., Wurch, L.L., Alexander, H., et al., 2017. Virus-host relationships of marine single-celled eukaryotes resolved from metatranscriptomics. Nature Communications 8, 16054–16064. Munn, C.B., 2006. Viruses as pathogens of marine organisms – From bacteria to whales. Journal of the Marine Biological Association of the United Kingdom 86, 453–467. Nagasaki, K., Tomaru, Y., Katanozaka, N., et al., 2004. Isolation and characterization of a novel single-stranded RNA virus infecting the bloom-forming diatom Rhizosolenia setigera. Applied and Environmental Microbiology 70, 704–711. Nagy, P.D., Pogany, J., 2006. Yeast as a model host to dissect functions of viral and host factors in tombusvirus replication. Virology 344, 211–220. Nelson, D.M., Tréguer, P., Brzezinski, M.A., Leynaert, A., Quéguiner, B., 1995. Production and dissolution of biogenic silica in the ocean: revised global estimates, comparison with regional data and relationship to biogenic sedimentation. Global Biogeochemical Cycles 9, 359–372. Scheffter, S.M., Ro, Y.T., Chung, I.K., Patterson, J.L., 1995. The complete sequence of Leishmania RNA virus LRV2-1, a virus of an Old World parasite strain. Virology 212, 84–90. Shirai, Y., Tomaru, Y., Takao, Y., et al., 2008. Isolation and characterization of a single-stranded RNA virus infecting the marine planktonic diatom Chaetoceros tenuissimus Meunier. Applied and Environmental Microbiology 74, 4022–4027. Shi, M., Lin, X.-D., Tian, J.-H., et al., 2016. Redefining the invertebrate RNA virosphere. Nature 540, 539–543. Smetacek, V., 1999. Diatoms and the ocean carbon cycle. Protist 150, 25–32. Sommer, U., Adrian, R., De Senerpont Domis, L., et al., 2012. Beyond the Plankton Ecology Group (PEG) model: Mechanisms driving plankton succession. 
Annual Review of Ecology, Evolution, and Systematics 43, 429–448. Steward, G.F., Culley, A.I., Mueller, J.A., et al., 2013. Are we missing half of the viruses in the ocean? The ISME Journal 7, 672–679. Steward, G.F., Rappe, M.S., 2007. What’s the ‘meta’ with metagenomics? The ISME Journal 1, 100–102. Steward, G.F., Wikner, J., Cochlan, W.P., Smith, D.C., Azam, F., 1992. Estimation of virus production in the sea: ii. Field results. Marine Microbial Food Webs 6, 79–90. Suttle, C.A., 2005. Viruses in the sea. Nature 437, 356–361. Tai, V., Lawrence, J.E., Lang, A.S., et al., 2003. Characterization of HaRNAV, a single-stranded RNA virus causing lysis of Heterosigma akashiwo (Raphidophyceae). Journal of Phycology 39, 343–352. Takao, Y., Mise, K., Nagasaki, K., Okuno, T., Honda, D., 2006. Complete nucleotide sequence and genome organization of a single-stranded RNA virus infecting the marine fungoid protist. Journal of General Virology 87, 723–733. Tarr, P.I., Aline, R.F., Smiley, B.L., Keithly, J., Stuart, K., 1988. LR1: A candidate RNA virus of Leishmania. Proceedings of the National Academy of Sciences of the United States of America 85, 9572–9575. Tomaru, Y., Katanozaka, N., Nishida, K., et al., 2004. Isolation and characterization of two distinct types of HcRNAV, a single-stranded RNA virus infecting the bivalve-killing microalga Heterocapsa circularisquama. Aquatic Microbial Ecology 34, 207–218. Tomaru, Y., Mizumoto, H., Nagasaki, K., 2009a. Virus Researchistance in the toxic bloom-forming dinoflagellate Heterocapsa circularisquama to single-stranded RNA virus infection. Environmental Microbiology 11, 2915–2923. Tomaru, Y., Nagasaki, K., 2007. Flow cytometric detection and enumeration of DNA and RNA viruses infecting marine eukaryotic microalgae. Journal of Oceanography 63, 215–221. Tomaru, Y., Takao, Y., Suzuki, H., Nagumo, T., Nagasaki, K., 2009b. Isolation and characterization of a single-stranded RNA virus infecting the bloom-forming diatom Chaetoceros socialis. 
Applied and Environmental Microbiology 75, 2375–2381. Tomaru, Y., Toyoda, K., Kimura, K., 2018. Occurrence of the planktonic bloom-forming marine diatom Chaetoceros tenuissimus Meunier and its infectious viruses in Western Japan. Hydrobiologia 805, 221–230. Tomaru, Y., Toyoda, K., Kimura, K., et al., 2012. First evidence for the existence of pennate diatom viruses. The ISME Journal 6, 1445–1448. Tomaru, Y., Toyoda, K., Kimura, K., et al., 2013. Isolation and characterization of a single-stranded RNA virus that infects the marine planktonic diatom Chaetoceros sp. (SS08-C03). Phycological Research 61, 27–36. Vlok, M., Lang, A.S., Suttle, C.A., 2019a. Application of a sequence-based taxonomic classification method to uncultured and unclassified marine single-stranded RNA viruses in the order Picornavirales. Virus Evolution 5, vez056. Vlok, M., Lang, A.S., Suttle, C.A., 2019b. Marine RNA virus quasispecies are distributed throughout the oceans. mSphere 4, e00157-–19. Wang, A.L., Wang, C.C., 1986. Discovery of a specific double-stranded RNA virus in Giardia lamblia. Molecular and Biochemical Parasitology 21, 269–276. Wang, A.L., Wang, C.C., 1986. The double-stranded RNA in Trichomonas vaginalis may originate from virus-like particles. Proceedings of the National Academy of Sciences of the United States of America 83, 7956–7960. Weinbauer, M.G., 2004. Ecology of prokaryotic viruses. FEMS Microbiology Reviews 28, 127–181. Wolf, Y.I., Kazlauskas, D., Iranzo, J., et al., 2018. Origins and evolution of the global RNA virome. mBio 9. e02329-18. Wu, B., Zhang, X., Gong, P., et al., 2016. Eimeria tenella: A novel dsRNA virus in E. tenella and its complete genome sequence analysis. Virus Genes 52, 244–252.

Algal Mimiviruses (Mimiviridae)
Ruth-Anne Sandaa and Håkon Dahle, Department of Biological Sciences, University of Bergen, Bergen, Norway
Corina PD Brussaard, NIOZ Royal Netherlands Institute for Sea Research, Den Burg, Texel, The Netherlands and Utrecht University, Utrecht, The Netherlands
Hiroyuki Ogata and Romain Blanc-Mathieu, Institute for Chemical Research, Kyoto University, Kyoto, Japan
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Algal viruses Viruses infecting eukaryotic algae.
Intein A “protein intron” that catalyzes self-splicing at the protein level. Self-splicing is carried out by the autocatalytic excision of the intein from the precursor host protein in which it is located, with concomitant ligation of the flanking protein segments (“exteins”) of the precursor. Inteins often possess a homing endonuclease domain, which confers mobility on the intein-encoding DNA.
MIGE (Major Interspersed Genomic Element) Repeated sequences found throughout genomes.
NCLDV Nucleocytoplasmic large DNA viruses, a proposed order of viruses containing several families that all share certain genomic and structural characteristics.
NCVOG Nucleo-Cytoplasmic Virus Orthologous Groups, a database of homologous genes vertically inherited within at least one viral family in the NCLDV group.
ORF Open reading frame, the part of a reading frame that has the ability to be translated.
Phylogenetic diversity A measure of biodiversity that incorporates phylogenetic differences between species.
Phylogenetics The study of evolutionary history and relationships among individuals or groups of species or populations.
Taxonomic richness The number of different taxa represented in an ecological community.
Virophage Small double-stranded DNA viruses that require co-infection with another virus for their replication.

History
During the last decades it has become clear that viruses are the most abundant biological entities in aquatic environments. Through their lytic activity, viruses are linked to several vital ecological processes, such as the structuring of host communities and biogeochemical cycling. They are also key players in the evolution of their hosts. Viral abundances in aquatic systems are typically on the order of 10^6–10^7 viruses per mL, and it is estimated that viral lysis is responsible for releasing as much as 10 billion tons of organic carbon into the ocean per day. Viral activity not only contributes to nutrient recycling (which fuels productivity in the upper euphotic ocean) but also affects the efficiency with which organic matter is transferred to higher trophic levels and transported to the depths of the oceans. The viral shunt is also proposed to increase the efficiency of the carbon pump. This is because viral lysis of host cells releases labile N- and P-rich organic matter (e.g., peptides) that can be re-used to grow new cells, while the less labile, carbon-enriched parts of the host cell (e.g., the cell wall) tend to aggregate and sink.

Most marine viruses infect the numerically dominant microorganisms, including the unicellular algae (phytoplankton). As early as 1981, large icosahedral viruses infecting Chlorella were reported, and in 1982 a virus infecting the photosynthetic protist Micromonas pusilla was described. Viruses have now been isolated for all taxonomic groups of phytoplankton. Originally, all double-stranded (ds) DNA icosahedral viruses infecting photosynthetic protists were classified into the family Phycodnaviridae. Recent phylogenies based on Nucleocytoplasmic Large DNA Virus (NCLDV) orthologous genes, however, show that many of the large dsDNA algal viruses belong to the family Mimiviridae (Fig. 1).
The enormous diversity found within this family suggests that Mimiviridae has a deep co-evolutionary history with its marine protist hosts, dating back to the early history of life. The families Mimiviridae and Phycodnaviridae have been shown to dominate the NCLDV populations in marine metagenomes, with concentrations of more than 10^4 genomes per mL in the photic zone. In this article we describe the phytoplankton viruses belonging to the family Mimiviridae.

One of the first isolated large dsDNA viruses that was later assigned to the family Mimiviridae was Phaeocystis pouchetii virus (PpV), isolated in 1995. A few years later (1998), two more viruses belonging to this group were characterized, namely Chrysochromulina ericina virus (CeV 01B) and Pyramimonas orientalis virus (PoV 01B). The genome sizes of these viruses were the largest ever reported at that time, at 474 and approx. 560 kbp, respectively. All these viruses were isolated from coastal waters in a Norwegian fjord (Raunefjorden). Some years later (2005), the characterization of different Phaeocystis globosa viruses (PgVs), isolated from the southern North Sea, was published. Some of these PgVs belong to the family Phycodnaviridae and are denoted PgV Group II, whilst those belonging to the family Mimiviridae are called PgV Group I. PgV-16T (Group I, genome size: 460 kbp) was the first of the algal mimiviruses (members of the family Mimiviridae) to be fully sequenced (in 2013). In the following decade several new viruses were linked to this family, namely Aureococcus anophagefferens virus (AaV), Haptolina ericina virus (HeV RF02), two Prymnesium kappa viruses (PkV RF01, PkV RF02), Prymnesium parvum virus (PpDNAV), and Tetraselmis virus (TetV-1). In addition, metagenome-assembled genomes have been recovered for three non-isolated Mimiviridae, i.e., two Organic Lake phycodnaviruses (OLPV 1 and 2) and Yellowstone Lake mimivirus (YSLGV). The

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21318-6


Fig. 1 Maximum likelihood phylogenetic reconstruction of DNA polymerase family B (PolB) using RAxML. Members of Phycodnaviridae were used as an outgroup. Circles on internal nodes indicate bootstrap values above 70% (light gray) and above 90% (dark gray). Viral hosts are color coded as heterotrophs (blue) and phototrophic algae (green). The scale bar indicates the average number of substitutions per site. The proposed subfamilies Megavirinae and Mesomimivirinae are not yet approved by the ICTV.

first two are nearly complete genomes, while the third is partial. These three viruses are phylogenetically most closely related to algae-infecting mimiviruses, suggesting that phytoplankton are their hosts.

As early as 2005, a search of mimivirus proteomes against Sargasso Sea metagenomes, both released the year before, indicated the presence of mimivirus relatives in the marine environment. Subsequent analyses of several marker gene sequences (DNA polymerase type B, DNA-directed RNA polymerase (RNApol), asparagine synthetase (AsnS) and DNA mismatch repair enzyme (MutS)) from various marine genomic datasets hinted at the vast phylogenetic diversity of Mimiviridae and their large taxonomic richness. Recently, a diversity analysis of DNA-directed RNA polymerase subunits 1 and 2 retrieved from bacteria-sized metagenomes collected during the Tara Oceans expedition indicated that the taxonomic richness of the family Mimiviridae exceeds that of a domain of cellular organisms (i.e., Bacteria). Importantly, the vast majority of this untapped diversity falls in the vicinity of the algae-infecting rather than the heterotroph-infecting clade, although the hosts may include non-photosynthetic organisms.

General Properties
All algae-infecting viruses in the family Mimiviridae isolated so far have icosahedral capsids 130 to 310 nm in diameter and dsDNA genomes. They all replicate in the cytoplasm of the host cell (in areas defined as virus factories), thanks to their own well-conserved transcription machinery, and produce lytic viruses that are transmitted horizontally. The latent period ranges from 10 to 72 h (Table 1). Phylogenetic analysis of DNA polB (Fig. 1) shows that they form a strongly supported monophyletic group within the family Mimiviridae and are more similar to viruses infecting heterotrophic protists than to viruses infecting other autotrophic protists in the family Phycodnaviridae. Nevertheless, several genes of Phycodnaviridae origin are detected in the genomes of Mimiviridae viruses, suggesting a common origin. A comparative genomics study of NCLDVs reported

Table 1 Viruses infecting unicellular eukaryotic algae belonging to the family Mimiviridae

| Group/subfamily | Strain | Host species | Host class | Size (nm) | Genome size (kbp) | GC (%) | Linear/circular genome | Latent period (h) | Replication in host cytoplasm | Virophage | Origin of isolation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Unclassified algae-infecting Mimiviridae | AaV | Aureococcus anophagefferens | Pelagophyceae | 140 | 340 | 29 | Linear | 60–72 | + | No | East coast, USA |
| Mesomimivirinae | CeV 01B | Haptolina ericina (a) | Prymnesiophyceae | 160 | 474 | 25 | Linear | 14–19 | + | No | West coast, Norway |
| Mesomimivirinae | HeV RF02 | Haptolina ericina and Prymnesium kappa | Prymnesiophyceae | 190 × 160 | 530 | ND | ND | 14–18 | + | ND | West coast, Norway |
| Mesomimivirinae | PgV-16T | Phaeocystis globosa | Prymnesiophyceae | 150 | 470 | 32 | Linear | 10 | + | Yes | Southern North Sea |
| Unclassified algae-infecting Mimiviridae | PkV RF01 | Prymnesium kappa and Haptolina ericina | Prymnesiophyceae | 310 | ND | ND | ND | 24–32 | + | ND | West coast, Norway |
| Mesomimivirinae | PkV RF02 | Prymnesium kappa | Prymnesiophyceae | 160 | 507 | ND | ND | 12–16 | + | ND | West coast, Norway |
| Unclassified algae-infecting Mimiviridae | PoV 01B | Pyramimonas orientalis and Haptolina ericina (a) | Prasinophyceae | 220 × 180 | 560 | ND | ND | 14–19 | + | ND | West coast, Norway |
| Mesomimivirinae | PpV 01 | Phaeocystis pouchetii | Prymnesiophyceae | 130–160 | 485 | ND | ND | 12–18 | + | ND | West coast, Norway |
| Mesomimivirinae | PpDNAV | Prymnesium parvum | Prymnesiophyceae | 221 | ND | ND | ND | 24–48 | + | ND | East coast, England |
| Unclassified algae-infecting Mimiviridae | TetV-1 | Tetraselmis sp. | Prasinophyceae | 226–257 | 668 | 41 | Circular | ND | + | No | Coast of Oʻahu, Hawaii |

(a) Host taxonomy reassigned from Chrysochromulina ericina to Haptolina ericina.
ND: not determined.


Table 2 Characteristic proteins of algae-infecting Mimiviridae. DNA polB: DNA polymerase type B; MCP1: major capsid protein 1; RNA pol II: DNA-directed RNA polymerase; AsnS: asparagine synthetase; MutS7: DNA mismatch repair enzyme; eIF-4E: eukaryotic translation initiation factor 4E.

| Strain | DNA polB | MCP1 | A32-like virion packing ATPase | RNA pol II | MutS7 | AsnS | eIF-4E |
|---|---|---|---|---|---|---|---|
| AaV | + | + | + | + | + | No | No |
| CeV 01B | + | + | + | + | + | + | + |
| PgV-16T | + | + | + | + | + | + | + |
| PkV RF01 | + | + | + | + | + | + | + |
| TetV-1 | + | + | + | + | + | + | + |
| HeV RF02 | + | + | + | + | + | + | + |
| PkV RF02 | + | + | + | + | + | + | + |
| PoV 01B | + | + | ND | ND | + | ND | ND |
| PpV 01 | + | + | ND | ND | + | ND | ND |
| PpDNAV | + | + | ND | ND | ND | ND | ND |

ND: not determined.

22 proteins unique to the Phycodnaviridae-Mimiviridae clade (not seen in other NCLDVs) and a conservative estimate of 74 proteins present in their last common ancestor. For comparison, these numbers were 27 and 66, respectively, for the Chordo/Entomopoxvirus clade, whose shared ancestry is well acknowledged. It should be noted that the genomes of algae-infecting mimiviruses, as well as that of the distant phycodnavirus Heterosigma akashiwo virus HaV53, were not available for comparison at that time. Among the 22 proteins previously reported as unique to the Phycodnaviridae-Mimiviridae clade, those with notable functions that are also encoded in some algae-infecting mimiviruses are transcription factor TFIIB, putative glycosyltransferases, prolyl hydroxylase, DnaJ-like chaperone and cytidine/deoxycytidine deaminase.

Four of the isolated viruses are fully sequenced, with genome sizes ranging from 340 to 668 kbp (Table 1). Three of these viruses have linear genomes (CeV, PgV-16T, and AaV), while the genome of TetV-1 is circular. Most of these genomes exhibit high A + T contents, with G + C contents of 23%–32%. One exception is TetV-1, which has a G + C content of 41% (Table 1). A general feature of these viruses is a relatively low share of common “core” genes and a high proportion of unique genes that are not shared with other members of the family Mimiviridae. Of the NCLDV orthologous genes, DNA polB (NCVOG0038), major capsid protein 1 (MCP1, NCVOG0022), A32-like virion packing ATPase (NCVOG0249), RNApol (RNA pol subunit I: NCVOG0274 and subunit II: NCVOG0271), asparagine synthetase (AsnS, NCVOG0061), eukaryotic translation initiation factor 4E (eIF-4E, NCVOG4661) and MutS7 (NCVOG2626) are all shared among the sequenced mimivirus genomes (Table 2), except that AaV lacks AsnS and eIF-4E homologs. DNA repair enzymes are essential for cellular life and have also been detected in some viruses with large genomes.
Viral homologs of MutS7 show homology to MutS7 in Epsilonproteobacteria and in the mitochondrial genome of octocorals. The presence of a MutS7 homolog in a viral genome therefore provides strong evidence for placement in the family Mimiviridae. Among these core genes, the presence of an additional copy of RNA pol subunit 2 in the genomes of algae-infecting mimiviruses, with the exception of AaV, is a notable feature that can be used to discriminate between algae- and heterotroph-infecting members of the family Mimiviridae. The seven core genes mentioned above are present in the unpublished, partly sequenced genomes of PkV RF01, PkV RF02, and HeV RF02 (Table 2). For PpV, PoV, and PpDNAV, the presence of only some of these genes has been tested by PCR followed by sequencing. All three viruses have DNA polB and MCP1, while PoV and PpV additionally contain MutS7 (Table 2). None of these three viruses has been tested for the presence of the A32-like virion packing ATPase or RNA polymerase subunit II.

Another feature shared between algae- and heterotroph-infecting Mimiviridae is the presence of several tRNAs and other proteins involved in translation. In line with their smaller genome size (Fig. 1), algae-infecting Mimiviridae possess fewer genes coding for translation-related proteins than heterotroph-infecting mimiviruses. Mimiviruses infecting heterotrophs encode several aminoacyl-tRNA synthetases, ranging from up to 20 for Tupanvirus soda lake (genome size: 1.4 Mb) and 7 for Megavirus chiliensis (genome size: 1.2 Mb) to as few as one (Ile-aaRS) for CroV (genome size: 730 kbp), whereas none has been reported in TetV-1, CeV 01B, PgV-16T, or AaV. The notable exception is a catalytic domain for Asn/Asp-tRNA synthetase encoded in the genome of CeV 01B. The closest homologs of this ORF are found in Prochlorococcus phages P-SSM2, 5, and 7. However, algae-infecting mimiviruses possess no fewer tRNA genes (AaV and PgV: 8, CeV: 12, TetV: 10) than other mimiviruses.
tRNA genes tend to occur in clusters in both algae- and heterotroph-infecting mimiviruses. CeV, PgV and TetV encode the translation initiation factor 4E (eIF-4E), which is shared and monophyletic with other mimivirus eIF-4E proteins. The stramenopile-infecting mimivirus AaV encodes the translation elongation factor eEF-1 alpha. eEF-1 alpha is shared with heterotroph-infecting mimiviruses, but not with other algae-infecting mimiviruses. AaV eEF-1 alpha has its closest homologs in genomes of diatoms (62% amino acid (aa) identity) and is very distantly related to the eEF-1 alpha of heterotroph-infecting mimiviruses (aa identity <30%). AaV additionally codes for the translation initiation factor eIF-5A, which shows 56% aa identity to an Aureococcus anophagefferens homolog; the only, yet distant, viral homolog is seen in Orpheovirus. TetV-1 encodes the only viral homolog of the eukaryotic translation initiation factor eIF-1A. PgV-16T possesses the translation elongation factor eEF3, which is not seen in other Mimiviridae and has 63% aa identity with the eEF3 of Emiliania huxleyi. Interestingly, chloroviruses (genus Chlorovirus, family Phycodnaviridae) also encode an eEF3 sequence, most similar (60% aa identity) to a homolog in Chlorella variabilis. The sparsity of translation factors in algae-infecting mimiviruses, and their similarity to homologs in the hosts (or host relatives), indicate recurrent acquisition of these genes from the hosts.
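The amino acid identities quoted above (for example, 62% between AaV eEF-1 alpha and its diatom homologs) are pairwise statistics derived from sequence alignments. As a minimal sketch, assuming two sequences that have already been aligned and padded with "-" for gaps, percent identity could be computed as follows. The function name, the toy sequences and the choice of denominator (columns in which neither sequence has a gap) are illustrative assumptions; alignment tools differ in how they define the denominator, and the text does not state which convention was used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and
    therefore have equal length. This denominator is one common convention,
    not necessarily the one used for the values quoted in the text.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not real viral or host sequences.
    print(round(percent_identity("MKV-LTA", "MRVALTA"), 1))
```

In the toy alignment, six columns are compared and five match, so the value is five sixths of 100. Real comparisons would start from an alignment produced by a dedicated tool rather than hand-aligned strings.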


The biological underpinnings of the acquisition of these translation factors are unclear, but the factors may be needed to counteract the shutdown of the host translation machinery upon infection.

Seven of the ten cultured algal mimiviruses infect haptophytes belonging to the class Prymnesiophyceae, two infect hosts belonging to the Prasinophyceae and one infects a host belonging to the Pelagophyceae. These hosts are deeply separated in the eukaryotic tree, demonstrating the wide host range of viruses belonging to this group. Most viruses infecting eukaryotic hosts are thought to be specialists, infecting only one species or one strain within a species. Most (7) of the cultured algal mimiviruses are species- or strain-specific, while the other three can infect different species (HeV RF02, PkV RF01, PoV 01B). HeV RF02 and PkV RF01 both infect different strains of Haptolina ericina and one strain of Prymnesium kappa, while PoV 01B infects one strain each of Pyramimonas orientalis and Haptolina ericina (Table 1). The host groups of these algal mimiviruses are ubiquitous and diverse, and are found in the epipelagic layer of tropical, temperate and Arctic oceans. They play ecologically important roles in the epipelagic ocean; moreover, some form blooms or are mixotrophs, grazing on bacteria and protists. Viral control of the population dynamics of these hosts can thus have important consequences for carbon and nutrient fluxes and for trophic transfer efficiency.

Currently, there is no official taxonomic subdivision within the Mimiviridae specific to the algal viruses. In recognition of the phylogenetic affinity of the distinct clade of algae-infecting mimiviruses, and to acknowledge the clear separation of these viruses from the family Phycodnaviridae, the subfamily Mesomimivirinae was recently proposed. It encompasses CeV 01B, HeV RF02, PgV-16T, PkV RF02, PpDNAV, and PpV 01, together with OLPV 1 and 2 (Fig. 1).
However, not all algal Mimiviridae viruses cluster within the subfamily Mesomimivirinae. Some separate into distinct lineages (Fig. 1), namely PkV RF01, AaV, and PoV 01B together with TetV-1. This group of unclassified algae-infecting mimiviruses indicates that a huge diversity remains to be discovered, and novel distinct taxonomic groups will certainly be described as more related viruses are isolated and sequenced.
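The counts given in this section (seven isolates on Prymnesiophyceae, two on Prasinophyceae, one on Pelagophyceae; six isolates in the proposed Mesomimivirinae and four unclassified; three isolates infecting more than one species) can be cross-checked with a small lookup table. The sketch below is illustrative only: the dictionary and set names are hypothetical, and the entries simply restate the assignments given in the text and Table 1.

```python
from collections import Counter

# Host class of each cultured isolate, restated from the text and Table 1.
HOST_CLASS = {
    "AaV": "Pelagophyceae",
    "CeV 01B": "Prymnesiophyceae",
    "HeV RF02": "Prymnesiophyceae",
    "PgV-16T": "Prymnesiophyceae",
    "PkV RF01": "Prymnesiophyceae",
    "PkV RF02": "Prymnesiophyceae",
    "PpV 01": "Prymnesiophyceae",
    "PpDNAV": "Prymnesiophyceae",
    "PoV 01B": "Prasinophyceae",
    "TetV-1": "Prasinophyceae",
}

# Cultured isolates placed in the proposed subfamily Mesomimivirinae.
MESOMIMIVIRINAE = {"CeV 01B", "HeV RF02", "PgV-16T", "PkV RF02", "PpDNAV", "PpV 01"}

# Isolates reported to infect more than one host species.
MULTI_SPECIES = {"HeV RF02", "PkV RF01", "PoV 01B"}


def class_counts() -> Counter:
    """Number of cultured isolates per host class."""
    return Counter(HOST_CLASS.values())


def unclassified() -> set:
    """Cultured isolates that fall outside the proposed Mesomimivirinae."""
    return set(HOST_CLASS) - MESOMIMIVIRINAE


if __name__ == "__main__":
    print(dict(class_counts()))
    print(sorted(unclassified()))
    print(len(HOST_CLASS) - len(MULTI_SPECIES), "species- or strain-specific isolates")
```

Keeping the assignments in one structure makes it easy to confirm, for instance, that every cultured member of the proposed subfamily infects a prymnesiophyte, and to update the tallies as new isolates are described.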

Proposed Subfamily Mesomimivirinae
This subfamily consists of six virus isolates, all infecting haptophytes belonging to the Prymnesiophyceae. Virus characteristics vary strongly within this subfamily, as reflected in capsid size (130–221 nm) and in the gene contents of the two representatives that have been sequenced, namely PgV-16T (GenBank accession no. KC662249) and CeV 01B (GenBank accession no. KT820662). Both PgV-16T and CeV 01B contain genomic repeats, MIGEs (Major Interspersed Genomic Elements), with twelve copies in the PgV genome and six copies in CeV. The MIGEs of PgV and CeV group in separate clusters based on their phylogeny, indicating separate evolution from a single copy to multiple copies by duplication. Another feature common to these two viral genomes is the presence of inteins. PgV-16T contains only a single intein, within the coding region of a helicase, while the genome of CeV 01B contains an unusually large number (8) of inteins inserted in seven different genes, with a strong bias toward DNA-processing enzymes. Two similar inteins are detected in two unrelated genes, a Lon protease and a DEAD-box helicase. This is quite unusual, as an intein normally spreads among homologous (but intein-free) genes with the use of the intein-encoded homing endonuclease. PgV-16T also contains two transposases with strong similarity to viral homologs in OLPV1 (also present in two copies). Furthermore, PgV-16T contains several DNA methylases and type II restriction endonucleases, suggesting that the viral genome contains modified nucleotides, as also found for Paramecium bursaria Chlorella virus 1 (Phycodnaviridae). An interesting feature is the finding of a virophage genome (PgVV) in the PgV-16T genome data, with only three genes homologous to those of virophages detected in the OLPV metagenomes and of Mavirus, the virophage of CroV. Surprisingly, no small virions are detected in the cytoplasm together with PgV during infection.
Nor is a capsid-like gene detected in the PgVV genome. It is therefore suggested that PgVV exists either as a linear plasmid-like molecule or as a randomly integrated provirus. If so, PgVV might be the first example of a virophage/transposonvirion intermediate mobile element. The partially assembled genome sequences of two other PgV strains (PgV-12T and PgV-14T, GenBank accession no. HQ634147.1 and HQ634144.1, respectively) are similar in gene content and identity (as high as 98%) to the fully sequenced PgV-16T genome. All these PgVs are part of PgV Group I, with similar large dsDNA genomes and clustering based on phylogeny (DNA polB) and phenotypic characteristics.

PgV-07T, also part of Group I, has been shown to contain a lipid membrane, detected by staining with lipophilic dye and by chloroform and diethyl ether treatment. Intact polar lipid (IPL) analysis showed that the IPL composition of PgV-07T differs from that of its host, P. globosa G(A). In contrast to EhV, which infects the prymnesiophyte Emiliania huxleyi, no viral glycosphingolipids (vGSLs) were detected. PgV seems to obtain its lipid membrane in the host cytoplasm and not from the cell membrane, as observed for EhV. The lipid membrane of PgV is argued to be an intracapsid membrane, similar to what has been proposed for PpV (infecting P. pouchetii) on the basis of electron cryomicroscopy. PgV-07T is highly sensitive to a combination of phosphate (P)-limited and light-stressed growth of its host, which results in a 95% loss in infectivity of the progeny viruses. Host metabolism strongly regulates the proliferation of PgV, with a prolonged latent period and reduced burst size under low and high light intensity (host energy limitation and inhibition) as well as under nutrient limitation.
Not only phosphorus was shown to strongly limit PgV production; iron-limited and, in particular, nitrogen-limited host growth also resulted in lower burst sizes than under resource-replete conditions (i.e., 70%, 72% and 92% reduction, respectively). P. globosa occurs in temperate coastal waters with strong gradients in light and nutrient availability. Overall, all hosts of the Mesomimivirinae viruses are found in waters with strong seasonal variability in growth-related environmental variables. Given the important ecological function of these host species, the magnitude of viral control over them will vary temporally, geographically and with depth.
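The burst-size reductions quoted above are relative to resource-replete conditions. As a small worked sketch of that arithmetic, the code below converts between a percent reduction and the remaining burst size. The replete burst size of 100 viruses per cell is a hypothetical placeholder chosen for readability, not a value from the text, and the function names are illustrative; the nutrient-to-percentage mapping follows the order given in the sentence above.

```python
# Percent reductions in burst size relative to replete conditions,
# in the order listed in the text (phosphorus, iron, nitrogen).
REPORTED_REDUCTION_PCT = {"phosphorus": 70.0, "iron": 72.0, "nitrogen": 92.0}

# Hypothetical replete burst size (viruses per cell); NOT from the source.
HYPOTHETICAL_REPLETE_BURST = 100.0


def percent_reduction(replete: float, limited: float) -> float:
    """Percent reduction of a limited burst size relative to the replete one."""
    if replete <= 0:
        raise ValueError("replete burst size must be positive")
    return 100.0 * (replete - limited) / replete


def burst_after_reduction(replete: float, reduction_pct: float) -> float:
    """Burst size remaining after a given percent reduction."""
    return replete * (100.0 - reduction_pct) / 100.0


if __name__ == "__main__":
    for nutrient, pct in REPORTED_REDUCTION_PCT.items():
        remaining = burst_after_reduction(HYPOTHETICAL_REPLETE_BURST, pct)
        print(f"{nutrient}-limited: {remaining:.0f} viruses per cell remain")
```

The two functions are inverses of each other, so a measured pair of burst sizes can be turned into a percent reduction and compared directly with the reported values.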


Unclassified Algae-Infecting Members in the Family Mimiviridae
AaV (GenBank accession no. KJ645900) infects Aureococcus anophagefferens, which causes “brown tides” in coastal and estuarine waters of the United States, China and South Africa. The ecological and economic consequences of these blooms are enormous, as the blooms inhibit the penetration of light and kill seagrass and juvenile shellfish. The virus was isolated off the east coast of the United States, outside New York. AaV is the smallest of the viruses infecting photosynthetic protists within the family Mimiviridae, with a sequenced genome of 370 kbp and a capsid size of 140 nm. The genome architecture of AaV indicates that the ancestral virus had an even smaller genome, which has been expanded through gene duplication and assimilation of genes from several sources, including its host. AaV has been shown to be an outlier within the family Mimiviridae. In sequence-based analyses, AaV encodes many genes that are more similar to genes of mimiviruses than to those of other NCLDVs, and its core NCLDV genes show higher sequence similarity to, and in a number of cases a clear orthologous relationship with, mimivirus homologs. However, AaV is missing the asparagine synthetase, the polyA polymerase and the eIF-4E present in other mimiviruses. An additional feature unique to AaV among mimiviruses is a duplication of the largest subunit of RNA polymerase II (rpb1). Although AaV possesses a second, yet incomplete (222 aa against >1000 aa for canonical proteins), version of the second-largest RNA polymerase subunit (rpb2), the two copies seem to be only distantly related, as is the case for other algal mimiviruses. It is also notable that twenty-two AaV genes have their highest sequence similarity with phycodnavirus genes. In addition, several photolyase genes and genes involved in DNA repair are found, as in several other large DNA viruses.
Phylogenetically, AaV forms a long branch within the group of unclassified algae-infecting mimiviruses, with the closest (but still distant) relative being PkV RF01 (Fig. 1). PkV RF01, infecting the haptophytes P. kappa and H. ericina, has the largest capsid size (310 nm) of all the algae-infecting mimiviruses cultured so far. Interestingly, PkV RF01 infects the same hosts as HeV RF02 and PkV RF02, which belong to the subfamily Mesomimivirinae, indicating the complexity of virus-host interactions and, further, the difficulties in naming viruses. Similar to HeV RF02, PkV RF01 is able to proliferate in more than one genus, indicating that a broad host range may be common in marine ecosystems, as the ecological advantage seems obvious. The persistent hosts of PkV RF01, H. ericina and P. kappa, do not normally form massive blooms, unlike many of the hosts of other isolated algal viruses (e.g., PgVs, AaV, EhV), but are present at low densities in different seasons. As coevolution between a host and its parasite will tend towards host survival rather than high mortality, these virus-host systems may represent the most typical systems in marine environments: low-density (non-blooming) hosts and viruses that appear to have co-evolved for a long time. Although its closest, yet very distant, relative is AaV, PkV RF01 encodes the AsnS and eIF-4E mimivirus core genes (Table 2).

Two mimiviruses infecting Chlorophyta have been described, namely PoV 01B and TetV-1. Both viruses have large capsids (220–257 nm) and genomes (560–668 kbp). Based on DNA polB and MutS7 phylogeny, they are more closely related to each other than to other viruses within this algal Mimiviridae group. PoV 01B was isolated from the western coast of Norway, while TetV-1 was isolated from the coast of Hawaii. PoV 01B has unfortunately been lost from our culture collection. The genome of TetV-1 (GenBank accession no. KY322437) contains several key genes never detected in viruses before.
These genes include a mannitol metabolism enzyme, a saccharide degradation enzyme and the key fermentation genes pyruvate formate-lyase and pyruvate formate-lyase activating enzyme. The presence of the two latter enzymes suggests that TetV-1 has the potential to manipulate the fermentative pathway of its host. It is speculated that green algae use fermentation to survive periods with low oxygen concentration. Thus, TetV-1 might use the viral homologs of these genes to meet the energy requirement for virus production, especially after the increased bacterial respiration that follows bloom termination by viruses. The saccharide degradation enzyme in TetV-1, alpha-galactosidase, is suggested to enhance cell lysis, as the cell wall of Tetraselmis contains up to 7% galactose and 21% galacturonic acid. The same function has been suggested for the mannitol 1-phosphate dehydrogenase gene, as the host, Tetraselmis, uses mannitol as an osmolyte under hypersaline conditions. Manipulation of such a system might therefore induce host lysis.

Further Reading

Baudoux, A., Brussaard, C., 2005. Characterization of different viruses infecting the marine harmful algal bloom species Phaeocystis globosa. Virology 341, 80–90.
Claverie, J.-M., Abergel, C., 2018. Mimiviridae: An expanding family of highly diverse large dsDNA viruses infecting a wide phylogenetic range of aquatic eukaryotes. Viruses 10, 506.
Gallot-Lavallee, L., Blanc, G., Claverie, J.-M., 2017. Comparative genomics of Chrysochromulina ericina virus (CeV) and other microalgae-infecting large DNA viruses highlight their intricate evolutionary relationship with the established Mimiviridae family. Journal of Virology 91 (14), e00230-17.
Hingamp, P., Grimsley, N., Acinas, S.G., et al., 2013. Exploring nucleo-cytoplasmic large DNA viruses in Tara Oceans microbial metagenomes. The ISME Journal 7, 1678–1695.
Iyer, L.M., Balaji, S., Koonin, E.V., Aravind, L., 2006. Evolutionary genomics of nucleo-cytoplasmic large DNA viruses. Virus Research 117, 156–184.
Jacobsen, A., Bratbak, G., Heldal, M., 1996. Isolation and characterization of a virus infecting Phaeocystis pouchetii (Prymnesiophyceae). Journal of Phycology 32, 923–927.
Johannessen, T.V., Bratbak, G., Larsen, A., et al., 2015. Characterisation of three novel giant viruses reveals huge diversity among viruses infecting Prymnesiales (Haptophyta). Virology 476, 180–188.
Moniruzzaman, M., LeCleir, G.R., Brown, C.M., et al., 2014. Genome of brown tide virus (AaV), the little giant of the Megaviridae, elucidates NCLDV genome expansion and host–virus coevolution. Virology 466–467, 60–70.
Sandaa, R.A., Heldal, M., Castberg, T., Thyrhaug, R., Bratbak, G., 2001. Isolation and characterization of two viruses with large genome size infecting Chrysochromulina ericina (Prymnesiophyceae) and Pyramimonas orientalis (Prasinophyceae). Virology 290, 272–280.
Santini, S., Jeudy, S., Bartoli, J., et al., 2013. Genome of Phaeocystis globosa virus PgV-16T highlights the common ancestry of the largest known DNA viruses infecting eukaryotes. Proceedings of the National Academy of Sciences 110, 10800–10805.
Schvarcz, C.R., Steward, G.F., 2018. A giant virus infecting green algae encodes key fermentation genes. Virology 518, 423–433.
Wagstaff, B.A., Vladu, I.C., Barclay, J.E., et al., 2017. Isolation and characterization of a double stranded DNA Megavirus infecting the toxin-producing haptophyte Prymnesium parvum. Viruses 9, 40.

Miscellaneous Algal Viruses (Alvernaviridae, Bacilladnaviridae, Dinodnavirus, Reoviridae)

Keizo Nagasaki, Kochi University, Nankoku, Japan
Yuji Tomaru, Japan Fisheries Research and Education Agency, Kanagawa, Japan
Corina PD Brussaard, NIOZ Royal Netherlands Institute for Sea Research, Den Burg, Texel, The Netherlands and Utrecht University, Utrecht, The Netherlands

© 2021 Elsevier Ltd. All rights reserved.
This is an update of K. Nagasaki, C.P.D. Brussaard, Algal Viruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00359-9.

Introduction

With the realization during the last three decades that viruses are highly abundant in various aquatic environments, interest in aquatic viruses has increased significantly. Viruses are now recognized as important biological agents that not only regulate the population dynamics, succession and diversity of host organisms in marine systems, but also influence the functioning of aquatic food webs and biogeochemical cycles (energy and matter fluxes). Members of the microbial food web, including eukaryotic algae, can become infected by viruses. In the present article, we summarize the characteristics of some RNA and DNA algal viruses that do not belong to the officially established virus families Phycodnaviridae, Mimiviridae, and Marnaviridae (which are discussed separately in other chapters).

Algal single-stranded RNA Viruses

Genus Dinornavirus

Heterocapsa circularisquama RNA virus (HcRNAV), the type species of the genus Dinornavirus in the family Alvernaviridae, is a positive-sense ssRNA virus infecting the bivalve-killing, bloom-forming dinoflagellate H. circularisquama. The virion is icosahedral, approximately 30 nm in diameter, harbours an ssRNA genome of around 4.4 kb in length, and is propagated in the host cytoplasm, often forming a crystalline array. Its genome is a linear positive-sense ssRNA lacking a 5′-cap structure and a 3′-poly(A) tail, and has a stem-loop structure at the 3′-end. HcRNAV clones can be roughly divided into two types (types UA and CY) based on their intraspecies host specificity patterns; each type shows its own strain-specific infectivity, and the two patterns are complementary to each other. These two HcRNAV types can coexist in natural water. Typical HcRNAV clones of type UA and CY (HcRNAV34 and HcRNAV109, respectively) were fully sequenced (DDBJ accession numbers: AB218608 and AB218609, respectively); they are approximately 97% identical at the nucleotide sequence level. Each genome has two open reading frames (ORFs). ORF-1 encodes a putative polyprotein having at least the serine protease domain and the RdRP domain. In about half of the tested virus clones, a specific 15-nt deletion was found; however, it is not involved in determining intraspecies host specificity. ORF-2, coding for the single major capsid protein (MCP), is unlikely to be a polyprotein gene because the molecular mass directly estimated by SDS polyacrylamide gel electrophoresis agrees well with the value predicted from the deduced amino acid sequence of ORF-2. Between the two virus clones tested, the stem-loop structures at the 3′-end differ in stem length and loop size; this may affect their replication efficiency.
Similarity analysis for the deduced amino acid sequences of RdRP revealed that HcRNAV is evolutionarily quite distant from any of the land and aquatic viruses that have been genetically studied. This is also supported by a phylogenetic analysis of the RdRP amino acid sequence. Genomic comparison revealed that complementary host ranges of HcRNAV may be related to the amino acid substitution patterns in ORF-2, within the MCP-encoding gene. In addition, the tertiary structure of the MCPs predicted by using computer modelling indicated that many of the amino acid substitutions were located in regions on the outer portion of the viral capsid proteins exposed to the ambient water environments. This suggests that the intraspecies host specificity of HcRNAV is determined by nano-structural differences of the viral surface, which may affect its binding affinity to the host cell. By a cross-reactivity test, H. circularisquama clones are also divided into two types according to the sensitivity spectra of the two HcRNAV types; however, the two host types are indistinguishable when their morphology or the sequences of the internal spacer regions of the ribosomal RNA genes are compared. There are at least two distinct and independent host/virus systems between H. circularisquama and HcRNAV; i.e., multiple types of host and virus systems coexist within natural blooms of H. circularisquama and their combinations are regulated by exquisite molecular mechanisms.

Encyclopedia of Virology, 4th Edition, Volume 4. doi:10.1016/B978-0-12-809633-8.21317-4

Algal double-stranded RNA Viruses

Genus Mimoreovirus

The Micromonas pusilla reovirus (MpRV, originally termed MpRNAV) is a double-stranded (ds) RNA virus infecting the cosmopolitan prasinophyte, M. pusilla. The polysegmented dsRNA genome of MpRV identified it as a member of a large family, Reoviridae, which groups nonenveloped viruses with segmented (9–12 linear segments) dsRNA genomes. The virus was characterized by sequence, morphological and physicochemical properties and is thus far the first and only member of the genus Mimoreovirus. Its genome is composed of 11 segments ranging between 741 and 5792 bp, with a total size of 25,563 bp. When isolated, the virus coexisted with a large genome-sized dsDNA virus infecting the same M. pusilla LAC38 strain (recently renamed M. commoda; van Baren et al., 2016). It has a narrow host range and a latent period of 36 h (based on a one-step lytic virus growth cycle and TEM enumeration of the virus particles), and its infectivity is sensitive to temperatures >35°C. Non-ionic detergents and chloroform or diethyl ether did not affect infectivity, while alcohol and acetone did. Intact virus particles are 90–95 nm in diameter (based on TEM images); CsCl-purified capsids show a compact inner structure of 75 nm surrounded by a 15 nm outer layer of protein. The smooth surface of the 50 nm subcore particles indicates that MpRV belongs to the non-turreted Reoviridae (subfamily Sedoreovirinae). Within the phylogenetic tree built with Reoviridae polymerase (RdRP) sequences, the branch of MpRV dissects the tree, separating the groups of turreted and non-turreted viruses. As M. pusilla is evolutionarily older than the hosts of other members of the Reoviridae, the topology of the tree suggests that the branch of MpRV could be ancestral. An interesting feature is the unusual length of segment 1 (5792 bp), encoding an outer layer protein 1877 aa long (VP1), similar to non-turreted intact particles of other viruses. If this is indeed a transient envelope structure, MpRV would be the first of the Reoviridae to have acquired a constitutive additional outer coat or pseudo-envelope structure, the consequence of budding of virus particles from the cell membrane or budding into the endoplasmic reticulum during morphogenesis.
The many repeats within the VP1 sequence suggest that it may have arisen from duplication of amino acid fragments, followed by diversification of the sequence. The mechanism and the constraints that have driven such an evolution are not clear, although very recently a model of stem-loop formation (based on the crystal structure of the polymerase of a mammalian orthoreovirus) was proposed to explain such duplications.

Algal single-stranded DNA Viruses

Genus Bacilladnavirus

Diatoms are important for marine primary production and for the support of coastal fisheries. To date, approximately 20 viruses capable of infecting diatoms, including centric and pennate genera, have been isolated and characterized. Diatom viruses are divided into two groups based on their genome type: single-stranded (ss) RNA or ssDNA viruses. Those harbouring ssRNA are now classified in the genus Bacillarnavirus within the family Marnaviridae. Diatom-infecting viruses that harbour ssDNA belong to “genus Bacilladnavirus”. The virions of bacilladnaviruses are icosahedral, 32–38 nm in diameter, and accumulate in the host nucleus (Fig. 1). In addition to icosahedral virions, fibrous or rod-shaped structures, 17–27 nm in width and >0.5 µm in length, are often observed in diatom nuclei infected with bacilladnaviruses. The infectivity and role of the rod-shaped particles are still unknown. The ssDNA diatom virus genome is primarily a 5–6 kb, covalently closed, circular ssDNA with a <1 kb linear, complementary strand. An exception is CsetDNAV (Chaetoceros setoensis DNA virus), which infects the bloom-forming diatom C. setoensis. The CsetDNAV genome is composed of a covalently closed circular ssDNA and eight short complementary fragments (67–145 nt in length). The genome of bacilladnaviruses possesses at least two major open reading frames (ORFs), encoding a putative replication-related protein and structural proteins. The replication proteins of bacilladnaviruses display a low similarity to those of circoviruses. Their infection is highly species-specific and strain-specific. As mentioned above, they are also considered to have a significant impact on the ecology of their host populations.

Fig. 1 Transmission electron micrograph of an ultra-thin section of Chaetoceros setoensis host strain, NIES-3712, 48 h after CsetDNAV inoculation. Virus-like particles (V, approximately 33 nm in diameter) are accumulated in the host nucleus (N). ‘m’ indicates mitochondrion. Photographed by Y. Tomaru.


Fig. 2 Field emission SEM images of HcDNAV particles; pentagonal (A) and hexagonal projection (B), and the viroplasm of HcDNAV-infected host Heterocapsa circularisquama (C, D). Note that many virus particles do not completely form icosahedral shapes in the viroplasm (i.e., they have deformed shapes), but are equipped with protrusions at the vertex position, and that fibrous materials (arrowheads) are also present. Reproduced from Takano, Y., Tomaru, Y., Nagasaki, K., 2018. Visualization of a dinoflagellate-infecting virus HcDNAV and its infection process. Viruses 10, 554.

Algal double-stranded DNA Viruses (not Belonging to Either of the Families Phycodnaviridae or Mimiviridae)

Genus Dinodnavirus

Heterocapsa circularisquama DNA virus (HcDNAV), the type species of the genus Dinodnavirus, is a dsDNA virus infecting the bloom-forming dinoflagellate H. circularisquama. The virion is icosahedral and around 200 nm in diameter; each five-fold vertex is decorated with a protrusion (Fig. 2). Like HcRNAV mentioned above, HcDNAV is reported to play a significant role in the disintegration of H. circularisquama blooms. HcDNAV has a large icosahedral capsid (180–210 nm in diameter) and its genome is estimated to be approximately 356 kbp in length. During multiplication, virions emerge from a specific cytoplasmic compartment, called the “viroplasm” or “virus factory,” which is induced by the virus during infection (Fig. 2). The HcDNAV PolB sequence is closely related to the PolB sequence of asfarviruses, which have dsDNA genomes and are exemplified by African swine fever virus. In addition, the amino acid sequence of HcDNAV PolB showed a rare amino acid substitution within a sequence containing a highly conserved motif.

Reference

van Baren, M.J., Bachy, C., Reistetter, E.N., et al., 2016. Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants. BMC Genomics 17, 267.

Further Reading

Attoui, H., Jaafar, F.M., Belhouchet, M., et al., 2006. Micromonas pusilla reovirus: A new member of the family Reoviridae assigned to a novel proposed genus (Mimoreovirus). Journal of General Virology 87, 1375–1383.
Brussaard, C.P., Noordeloos, A.A., Sandaa, R.A., et al., 2004. Discovery of a dsRNA virus infecting the marine photosynthetic protist Micromonas pusilla. Virology 319, 280–291.
Coy, S.R., Gann, E.R., Pound, H.L., et al., 2018. Viruses of eukaryotic algae: Diversity, methods for detection, and future directions. Viruses 10, 487.
Nagasaki, K., 2008. Dinoflagellates, diatoms and their viruses. Journal of Microbiology 46, 235–243.
Nagasaki, K., Shirai, Y., Takao, Y., et al., 2005. Comparison of genome sequences of single-stranded RNA viruses infecting the bivalve-killing dinoflagellate Heterocapsa circularisquama. Applied and Environmental Microbiology 71, 8888–8894.
Nagasaki, K., Tomaru, Y., Takao, Y., et al., 2005. Previously unknown virus infects marine diatom. Applied and Environmental Microbiology 71, 3528–3535.
Takano, Y., Tomaru, Y., Nagasaki, K., 2018. Visualization of a dinoflagellate-infecting virus HcDNAV and its infection process. Viruses 10, 554.
Tomaru, Y., Hata, N., Masuda, T., et al., 2007. Ecological dynamics of the bivalve-killing dinoflagellate Heterocapsa circularisquama and its infectious viruses in different locations of Western Japan. Environmental Microbiology 9, 1376–1383.
Tomaru, Y., Katanozaka, N., Nishida, K., et al., 2004. Isolation and characterization of two distinct types of HcRNAV, a single-stranded RNA virus infecting the bivalve-killing microalga Heterocapsa circularisquama. Aquatic Microbial Ecology 34, 207–218.
Tomaru, Y., Toyoda, K., Kimura, K., 2015. Marine diatom viruses and their hosts: Resistance mechanisms and population dynamics. Perspectives in Phycology 2, 69–81.

Phycodnaviruses (Phycodnaviridae)

James L Van Etten and David D Dunigan, University of Nebraska–Lincoln, Lincoln, NE, United States
Keizo Nagasaki, Kochi University, Nankoku, Japan
Declan C Schroeder, University of Reading, Reading, United Kingdom and University of Minnesota, St. Paul, MN, United States
Nigel Grimsley, Integrative Biology of Marine Organisms Laboratory, Banyuls-sur-Mer, France and Sorbonne University, Banyuls-sur-Mer, France
Corina PD Brussaard, NIOZ Royal Netherlands Institute for Sea Research, Den Burg, Texel, The Netherlands and Utrecht University, Utrecht, The Netherlands
Jozef I Nissimov, University of Waterloo, Waterloo, ON, Canada

© 2021 Elsevier Ltd. All rights reserved.
This is an update of J.L. Van Etten, M.V. Graves, Phycodnaviruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy and Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00571-9.

History

The concept that viruses might have a major impact on the marine environment began almost 30 years ago with the discovery that seawater contains around 10^10 viruses per liter. At any one time, 20%–40% of the photosynthetic microorganisms in the ocean are infected with a virus. Consequently, these viruses contribute to microbial composition and diversity, nutrient cycling, carbon flow, and other biogeochemically important processes in aqueous environments. This huge viral population consists of both bacterial and algal viruses, and they are important because phytoplankton, consisting of cyanobacteria and eukaryotic algae, fix about half of the global carbon dioxide. Beginning in the early 1970s, viruses or virus-like particles (VLPs) have been reported in many taxa of eukaryotic algae. However, most of the early reports described single accounts of microscopic observations. This situation changed in the 1980s with the discovery of a family of large double-stranded (ds) DNA-containing viruses (referred to as chloroviruses) that infect and replicate in certain strains of unicellular, eukaryotic, symbiotic, chlorella-like green algae; these symbiotic algae are often referred to as zoochlorellae. Known chlorovirus hosts are associated with the protozoan Paramecium bursaria (Fig. 1(A)), the coelenterate Hydra viridis or the heliozoan Acanthocystis turfacea. Zoochlorellae are resistant to viruses in their symbiotic state. Fortunately, the zoochlorellae from P. bursaria, as well as those from A. turfacea, can be grown independently in culture, and these cultured, naturally endosymbiotic algae serve as hosts for many large dsDNA viruses. The lytic chloroviruses can be produced in large quantities and assayed by plaque formation (Fig. 1(B)) using standard bacteriophage techniques. The prototype chlorovirus is Paramecium bursaria chlorella virus 1 (PBCV-1).
Large polyhedral, dsDNA-containing viruses that infect marine algae are also ubiquitous throughout the world. These include viruses that infect the unicellular coccolithophore Emiliania huxleyi (EhV viruses, genus Coccolithovirus) and the other bloom-forming haptophyte Phaeocystis globosa (PgV viruses, genus Prymnesiovirus), the noxious bloom-forming raphidophyte Heterosigma akashiwo (HaV viruses, genus Raphidovirus), viruses that infect filamentous brown algae, e.g., Ectocarpus sp. (EsV viruses, genus Phaeovirus), and prasinoviruses (genus Prasinovirus) that infect the smallest eukaryotic cell, Ostreococcus, and related species in the class Mamiellophyceae. Prasinoviruses infecting Micromonas were first reported in 1979 in samples of coastal seawater from British Columbia. However, more Micromonas viruses were not discovered until the 1990s, which led to the appreciation of their large diversity. The genomic sequence of the first prasinovirus, Ostreococcus tauri virus 5 (OtV5), was published in 2008. Subsequently, other viruses infecting Ostreococcus sp. have been sequenced, and the genomes of viruses infecting the genera Bathycoccus and Micromonas were analyzed. Many genomes from the genus Prasinovirus have now been sequenced, and their worldwide diversity and abundance are well documented. Although all the phycodnaviruses probably arose from a common ancestor, they can have different lifestyles. For example, phaeoviruses have a lysogenic phase in their life cycle (which is rare among algal viruses); the virus particles are only seen in sporangial cells of their host. Moreover, phaeoviruses also infect the giant kelp seaweed that separated from Ectocarpus brown algae (genus Ectocarpus) around 90 million years ago, further confirming the ancient evolutionary history of algal viruses. Because the first algal viruses discovered were large dsDNA viruses, it was initially assumed that algae were only infected by large dsDNA viruses.
However, this perception has changed, and algal viruses with single-stranded (ss) RNA, dsRNA and ssDNA genomes have been discovered in the past two decades. These more recently discovered algal viruses are described in other articles in this 4th Edition of the Encyclopedia of Virology entitled “Other algal viruses” and “Marnaviruses (Marnaviridae).”

Encyclopedia of Virology, 4th Edition, Volume 4. doi:10.1016/B978-0-12-809633-8.21291-0

Fig. 1 Chlorella cells and chlorovirus PBCV-1. (A) Paramecium bursaria and its symbiotic chlorella cells. (B) Plaques formed by PBCV-1 on a lawn of Chlorella variabilis NC64A. (C) Five-fold averaged cryo-electron micrograph of PBCV-1 reveals a long narrow cylindrical spike structure at one vertex and fibers extending from one unique capsomer per trisymmetron. (D) PBCV-1 attached to the cell wall as viewed by the quick-freeze, deep etch procedure. Note the virions attached to the wall by fibers. (E) Initial attachment of PBCV-1 to a C. variabilis NC64A cell wall. (F) Attachment of PBCV-1 to the algal wall and digestion of the wall at the point of attachment. This occurs within 1–3 min PI. (G) The generation of a continuous membrane-lined tunnel from the virus to the host (arrows). (H) Virus particles assemble in defined areas in the cytoplasm, named virus assembly centers, at ~4 h PI. Note both DNA-containing (dark centers) and empty capsids. (I) A model depicting PBCV-1 assembly into infectious particles, including generation of the internal lipid membrane. (J) Localized lysis of the cell plasma membrane and cell wall and release of progeny viruses at 8 h PI. This figure is modified with permission. (A)–(F) and (H)–(J) are modified from Van Etten, J.L., Dunigan, D.D., 2016. Giant chloroviruses – Five easy questions. PLoS Pathogens 12, e1005751, and (G) is modified from Milrot, E., et al., 2017. PLoS Path. 13, e1006562, with permission.

Taxonomy and Classification

Members of the family Phycodnaviridae constitute a large, genetically and morphologically diverse group of viruses with eukaryotic algal hosts from both inland and marine waters. Accumulating genetic evidence indicates that the phycodnaviruses, together with the poxviruses, iridoviruses, asfarviruses, ascoviruses, mimiviruses, marseilleviruses, pithoviruses, molliviruses, faustoviruses and possibly pandoraviruses, have a common evolutionary ancestor, perhaps arising at the point of eukaryogenesis 3 billion or more years ago. Collectively, these large dsDNA viruses are referred to as nucleocytoplasmic large DNA viruses (NCLDVs), and it has been proposed that these viruses should be included in a new order named Megavirales. Phylogenetic analyses of B-type DNA polymerases encoded by all members of the Phycodnaviridae indicate that they are more closely related to each other than to other large dsDNA viruses and that they form a monophyletic group, consistent with a common ancestor. The phycodnaviruses fall into six clades based on phenotypic characteristics and host species, and these clades were originally given genus status. Often, the genera can be distinguished by additional properties, e.g., lytic versus lysogenic lifestyles or linear versus circular genomes. Among viruses with known host species, members of the genus Chlorovirus infect fresh water algae; however, chlorovirus DNA sequences are often reported in metagenomic studies from the ocean, which suggests that at least some chloroviruses infect marine algae or some other marine organism(s). In contrast, members of the other five genera currently listed in the International Committee on Taxonomy of Viruses (ICTV) database (Coccolithovirus, Phaeovirus, Prasinovirus, Prymnesiovirus, and Raphidovirus) infect marine algae, although DNA sequences from inland water sources indicate that prasinoviruses also exist in inland water. Complicating the taxonomy of the Phycodnaviridae is the fact that genomic sequencing of several recently isolated large dsDNA algal viruses indicates that they are more closely related to viruses in the family Mimiviridae than they are to the Phycodnaviridae. The largest viruses listed in the genus Prymnesiovirus (PgV group I viruses) belong to the Mimiviridae family, and so these viruses have been removed from this article. However, the PgV group II viruses will probably remain in the Phycodnaviridae family, and so they are included in this article.
In addition, a couple of viruses that infect prasinophytes also belong in the Mimiviridae family. These mimivirus-like viruses are described in another article in this 4th Edition of the Encyclopedia of Virology entitled “Algal viruses belonging to a subgroup within the Mimiviridae family.” However, the majority of the viruses that infect prasinophytes belong in the Phycodnaviridae, and so these members remain in this article. Table 1 lists representative viruses from the six genera in the Phycodnaviridae. Another complication is that the predicted amino acid sequences of homologous genes encoded by many viruses now classified as belonging to the same genus can differ significantly; we expect that these genera will eventually be given family status and that many of the individual viruses will be given genus status.

Virion Structure and Composition

Morphology

Originally, all phycodnaviruses were assumed to be large icosahedral structures (100–220 nm in diameter) with a multi-laminate shell surrounding an internal membrane enclosing an electron-dense core. However, recent experiments indicate that not all phycodnavirus structures are identical. For example, fivefold-symmetry averaging 3D reconstructions revealed that one of the vertices in the chlorovirus PBCV-1 has a spike structure (Fig. 1(C)). External fibers that facilitate attachment to the host also extend from one of the 66 capsomers in each trisymmetron of PBCV-1 (Fig. 1(D)). Coccolithoviruses and phaeoviruses are reported to have an external membrane in addition to an internal membrane. It is thought that these viruses utilize an animal virus-like infection strategy, entering their hosts via either an endocytotic or an envelope fusion mechanism. The capsid is then disrupted and the viral DNA is transferred to the nucleus of the host. During the formation of these viruses, they acquire their external membrane by budding through the host plasma membrane.

Table 1 Representative members of the Phycodnaviridae

Genera             Type                                  Representative  Genome sizes  Number of CDSs
Chloroviruses      NC64A viruses                         PBCV-1          330 kb        416
                   SAG viruses                           ATCV-1          288 kb        329
                   Pbi viruses                           MT325           325 kb        331
                   Osy viruses                           Osy-NE5         327 kb        357
                   Hydra viruses                         HVCV-1          300 kb        unknown
Coccolithoviruses  Emiliania huxleyi viruses             EhV-86          407 kb        472
Phaeoviruses       Ectocarpus viruses                    EsV-1           336 kb        231
                   Feldmannia viruses                    FsV             155 kb        150
                   Hicksia viruses                       HincV-1         240 kb        unknown
                   Myriotrichia viruses                  MclaV-1         320 kb        unknown
                   Pilayella viruses                     PlitV-1         280 kb        unknown
                   Laminaria digitata viruses            Ldif            unknown       unknown
Prasinoviruses     Micromonas viruses                    MpV1            184 kb        244
                   Ostreococcus viruses                  OtV1            192 kb        232
                   Bathycoccus viruses                   BpV1            199 kb        203
Raphidoviruses     Heterosigma akashiwo viruses          HaV             275 kb        246
Prymnesioviruses   Phaeocystis globosa group II viruses  PgV II          unknown       unknown
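The genome sizes and CDS counts in Table 1 can be combined into a rough gene density (CDSs per kb) for the representatives where both values are reported. The following is a minimal sketch of that calculation; the dictionary and function names are illustrative choices, and only the numbers are taken from Table 1.

```python
# Genome size (kb) and number of CDSs, copied from Table 1.
# Entries with an "unknown" value in either column are left out.
TABLE_1 = {
    "PBCV-1": (330, 416),
    "ATCV-1": (288, 329),
    "MT325": (325, 331),
    "Osy-NE5": (327, 357),
    "EhV-86": (407, 472),
    "EsV-1": (336, 231),
    "FsV": (155, 150),
    "MpV1": (184, 244),
    "OtV1": (192, 232),
    "BpV1": (199, 203),
    "HaV": (275, 246),
}


def cds_per_kb(genome_kb: float, n_cds: int) -> float:
    """Rough gene density: predicted coding sequences per kb of genome."""
    return n_cds / genome_kb


densities = {name: cds_per_kb(kb, cds) for name, (kb, cds) in TABLE_1.items()}

# Print the representatives from the most to the least gene-dense.
for name, value in sorted(densities.items(), key=lambda item: -item[1]):
    print(f"{name:8s} {value:.2f} CDS/kb")
```

This is only a coarse comparison: CDS counts depend on the annotation criteria used for each genome, so differences between viruses should be read with caution.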

Physicochemical and Physical Properties

The Mr of PBCV-1 is ~1 × 10^9 daltons and the S20,w is >2000 S. Some chlorovirus particles are disrupted in CsCl and by freeze-thawing. Chloroviruses are generally stable in sucrose; those that are not are stable in iodixanol. Non-ionic detergents do not affect infectivity, whereas organic solvents do. The MpoV prasinoviruses, infecting Micromonas polaris, are isolated on temperate and Arctic host strains. Temperate MpVs (infecting M. pusilla and M. commoda) remain infectious after freezing, whereas the Arctic MpoVs are sensitive to low temperature. Some prasinoviruses do not survive freeze-thawing in seawater as such, but do survive if a cryoprotectant such as 10% glycerol or dimethyl sulfoxide is added. The Arctic MpoVs are highly sensitive to temperature shifts (0–7°C) comparable to those found during a season or with depth. MpoV-45T has its highest infectivity at 3°C, with a drop of over 80% at lower and higher temperatures, whereas MpoV-46T shows the same dynamics but with optimal infectivity at 0°C. The infectivity of MpoV-44T is highest at 7°C, while that of MpoV-47T is largely independent of temperature.

Nucleic Acids The viruses currently listed by the ICTV as members of the Phycodnaviridae contain dsDNA genomes ranging from 170 to ~560 kbp. However, as noted above, some of the viruses with the largest genomes will eventually be transferred to the family Mimiviridae. Therefore, a more realistic size range for genomes in the family Phycodnaviridae is ~100 to 440 kbp. The G + C content of the phycodnavirus genomes ranges from 30% to 55%. Chlorovirus genomes contain methylated bases, both 5-methylcytosine (5mC) and N6-methyladenine (6mA). The percentage of methylated bases in chloroviruses ranges from 0.1% 5mC and no 6mA to 47% 5mC and 37% 6mA. The methylation status of the genomes of other algal viruses is unknown.

Proteins The chlorovirus PBCV-1 virion has 148 different virus-encoded proteins and at least one host-encoded protein, ranging in size from <10 to >200 kDa. PBCV-1 has many predicted membrane proteins and multiple capsid-like proteins (CP). In addition, the PBCV-1 virion has three glycoproteins, at least two myristylated proteins and several phosphoproteins. The PBCV-1 major CP, Vp54, consists of two eight-stranded, antiparallel β-barrel, “jelly-roll” domains connected by an α-helix. A recent 3.5 Å resolution structure of the PBCV-1 virion identified 14 minor proteins associated with the virion structure. Proteomic analysis of coccolithovirus EhV-86 indicated that it is composed of at least 28 virus-encoded proteins, 23 of which are predicted to be membrane proteins, one major CP, two lectin-binding proteins, a thioredoxin and a Ser/Thr protein kinase.

Phycodnaviruses (Phycodnaviridae)

Lipids The PBCV-1 virion contains 5%–10% lipid. The lipid is located in a single bilayered membrane inside the glycoprotein shell and is required for virus infectivity. The coccolithovirus EhV-86 has an external lipid membrane and probably also has an internal membrane. Glycosphingolipids (GSLs) comprise more than 80% of the EhV-86 lipidome. Many of the prasinoviruses, including the Arctic MpoVs (infecting Micromonas polaris), are sensitive to chloroform, which suggests that they probably have a lipid membrane. However, other MpV isolates do not lose their infectivity after treatment with chloroform. Prasinovirus MpV-08T, infecting the temperate Micromonas, also contains a lipid membrane, which is enriched in intact polar lipids (IPLs) in comparison with its host (e.g., the proportion of phosphatidylglycerols was greatly reduced). Other IPLs were found only in MpV (i.e., digalactosyldiacylglycerol, sulfoquinovosyldiacylglycerol and diacylglycerylhydroxymethyltrimethyl-alanine), implying that these are produced by altering IPL-bound fatty acids during viral replication.

Carbohydrates At least three of the chlorovirus PBCV-1 proteins are glycosylated, including the major CP Vp54. Vp54 has four Asn-linked glycans, which are unusual in many respects: (1) the glycans are attached to Asn by a β-glucose linkage, which is rare in nature; (2) the asparagines are not located in the typical Asn-X-Thr/Ser consensus sequon; (3) the glycans are highly branched and consist of 9–10 neutral monosaccharides; (4) all four glycoforms contain a dimethylated rhamnose as the capping residue of the main chain, a hyper-branched fucose residue and two rhamnose residues with opposite absolute configurations; and (5) these glycans do not resemble any structures previously reported in the three Domains of Life. Unlike other glycoprotein-containing viruses, PBCV-1 encodes most, if not all, of the machinery required to glycosylate its major CP.
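Point (2) above can be made concrete with a small scan for the canonical sequon. The following is a minimal sketch, assuming the common refinement that X in Asn-X-Ser/Thr is any residue except proline; the function names, the toy sequence and the example positions are hypothetical and are not taken from Vp54.

```python
import re

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that begin an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def sites_outside_sequon(protein: str, glycosylated: list[int]) -> list[int]:
    """Return the glycosylated Asn positions (1-based) that lie outside any sequon."""
    canonical = set(sequon_positions(protein))
    return [pos for pos in glycosylated if pos not in canonical]


if __name__ == "__main__":
    toy = "MANGTALNPSQNASW"  # hypothetical sequence, for illustration only
    print("Asn in canonical sequons:", sequon_positions(toy))
    print("Glycosylated Asn outside a sequon:", sites_outside_sequon(toy, [3, 8]))
```

Applied to a real capsid sequence with experimentally mapped glycosylation sites, a non-empty result from `sites_outside_sequon` would correspond to the atypical attachment sites described for Vp54.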

Genomes Many complete or nearly complete draft genome sequences are currently available for members of the Phycodnaviridae, including about 45 chloroviruses, 14 coccolithoviruses, 4 phaeoviruses and 19 prasinoviruses. There is considerable variation in genome structure among members of the Phycodnaviridae. The chloroviruses have linear, nonpermuted genomes that are 290–370 kb in length and code for 330–416 predicted proteins (coding sequences, CDSs). However, in total ~640 unique CDSs are predicted to be coded by 41 chloroviruses and, because any one virus encodes no more than 416 CDSs, the chloroviruses exhibit extensive genetic diversity. Furthermore, some of the chlorovirus genes encode proteins with as many as 3 distinct domains that serve different functions, which contributes to their genetic diversity. Therefore, the genetic information coded by the chloroviruses is larger than the number of genes suggests. The termini of the 331 kb chlorovirus PBCV-1 genome consist of 35-nucleotide-long, covalently closed hairpin loops that exist in one of two forms (flip and flop); the two forms are complementary when the 35-nucleotide sequences are inverted. Each hairpin loop is followed by an identical 2.2 kb inverted repeat sequence; the remainder of the genome consists primarily of single-copy DNA. The PBCV-1 genome has 416 CDSs that have 40 or more codons. Many of the core protein genes are in conserved clusters and are referred to as gene gangs. The coding regions of some of the genes overlap slightly, and early and late genes are dispersed throughout the PBCV-1 genome, although there is some clustering of the early virus genes. The CDSs are evenly distributed on both strands and intergenic space is minimal. One exception is a 1788 bp sequence near the middle of the genome, which is a polycistronic gene containing 11 tRNAs.
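The flip/flop relationship of the hairpin termini can be read as one form being the reverse complement of the other. A minimal sketch of that check follows; the 10-nucleotide toy loop is invented for illustration and is not the PBCV-1 terminal sequence, and the function names are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_flip_flop_pair(flip: str, flop: str) -> bool:
    """True if `flop` equals the inverted (reversed) complement of `flip`."""
    return flop.upper() == reverse_complement(flip)


if __name__ == "__main__":
    toy_flip = "ATTGCCGTAC"  # hypothetical loop sequence
    toy_flop = reverse_complement(toy_flip)
    print(toy_flip, toy_flop, is_flip_flop_pair(toy_flip, toy_flop))
```

With real terminal sequences, the same comparison would be run on the two 35-nucleotide loop forms recovered from sequencing reads.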
Marine prasinoviruses whose known hosts are unicellular green algae in the class Mamiellophyceae have G + C contents that range from 35% to 50%, with 203–269 compactly arranged CDSs (with an average intergenic distance of about 40 bp), 4–11 tRNAs, and terminal inverted repeats of 250 to 2150 bp. Among these prasinoviruses, viruses infecting Bathycoccus sp. have the fewest CDSs, but they each encode 2 very large predicted proteins (ranging from 3126 to 5689 amino acids) of unknown function that make up to 14% of the genome. About half of the sequenced Ostreococcus lucimarinus viruses carry a central inversion of 32 kbp. Remarkably, the overall diversity of predicted proteins in marine prasinoviruses is less than that observed between the genomes of their host genera, suggesting that these viruses may have evolved by host-switching, a hypothesis supported by the incongruence of host and viral phylogenies. Four genomes of freshwater viruses resembling prasinoviruses both in their genome structures and their phylogenetic positioning (but with unknown host species) have been assembled from metagenomic samples taken from lakes in North America and a coastal lake in China. They have genomes of 171–181 kb encoding 225–248 predicted CDSs. Members of the genus Coccolithovirus have large dsDNA genomes (376–421 kbp), 444–548 predicted CDSs, and a G + C content of 40%. These viruses also encode up to 6 tRNAs, and the middle of their genomes is characterized by a large section of repeats, inversions and insertions. Complete genome and DNA polymerase-based phylogenetic analyses cluster EhV isolates into distinct sub-clades, based on geographical region and time of isolation. Importantly, EhVs harbor genes for a near-complete sphingolipid biosynthesis pathway, which are homologous to those found in the host genome (see below).
Although most of the predicted EhV genes have no assigned function, and most resemble genes in the Domain Eukarya (and by extension, their unicellular algal host), more than 15% are most similar to bacterial genes. The definitive structure of the 335 kbp phaeovirus EsV genome is unknown. Several experiments suggested that the genome is circular; however, DNA sequencing indicates that it has defined ends with inverted repeats. The virus is predicted to encode 231 CDSs, 48% of which resemble proteins in the public databases. About 12% of the EsV genome consists of tandem repeats and portions of the genome have ssDNA regions. The genome of the HaV53 strain (274,793 bp in length) contains 246 CDSs and 3 tRNA-coding sequences.

Virus Replication Chlorovirus PBCV-1 attaches rapidly and specifically to the external surface of cell walls, but not to protoplasts, of its host Chlorella variabilis NC64A. The PBCV-1 virion spike structure makes the first contact with the host cell wall (Fig. 1(F)). The virus spike is too narrow to deliver DNA and likely serves to puncture the wall before it is jettisoned. Following host cell wall degradation by a virus-associated enzyme(s) (Fig. 1(G)), the virus internal membrane fuses with the host membrane, forming a membrane-lined tunnel between the virus and the host (Fig. 1(H)), and leaving an empty capsid attached to the surface. Interestingly, the membrane-lined tunnel between the virus and host is so narrow that the compressed DNA (ca. 0.2 bp nm⁻³) in the virion must pass into the host in a linear manner; once inside the host it appears to condense again. The virus-host membrane fusion process triggers rapid depolarization of the host plasma membrane, probably initiated by a virus-encoded K+ channel located in the virus internal membrane. This hypothesis is supported by the fact that infection by the chloroviruses is inhibited by the K+ channel blockers barium and/or cesium. The depolarization results in the release of K+ from the cell. This rapid loss of K+ from the host and the associated water fluxes reduce the host turgor pressure, which aids ejection of viral DNA and virion-associated proteins into the host. Host membrane depolarization also inhibits many host secondary transporters and prevents infection by a second virus. None of the chloroviruses encode a recognizable DNA-dependent RNA polymerase (DdRp) gene, so it is assumed that PBCV-1 DNA and viral-associated proteins quickly move to the nucleus and commandeer the host transcription machinery. Early transcription begins 5–10 min post-infection (PI). Viral DNA synthesis begins at 60–90 min PI and presumably starts in the nucleus before moving to the cytoplasm.
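For a sense of scale, the packing density quoted above (read here as about 0.2 base pairs per cubic nanometer) can be converted into a volume for the 331 kb PBCV-1 genome mentioned under Genomes. The sketch below does only that arithmetic. Treating the packed DNA as a sphere is a simplifying assumption made here for illustration, not a statement about the actual shape of the packaged genome, and the function names are hypothetical.

```python
import math


def packed_volume_nm3(genome_bp: float, density_bp_per_nm3: float) -> float:
    """Volume occupied by a genome packed at a given density (bp per nm^3)."""
    if density_bp_per_nm3 <= 0:
        raise ValueError("density must be positive")
    return genome_bp / density_bp_per_nm3


def equivalent_sphere_diameter_nm(volume_nm3: float) -> float:
    """Diameter of a sphere with the given volume (assumed geometry)."""
    return 2.0 * (3.0 * volume_nm3 / (4.0 * math.pi)) ** (1.0 / 3.0)


if __name__ == "__main__":
    volume = packed_volume_nm3(331_000, 0.2)  # figures quoted in the text
    diameter = equivalent_sphere_diameter_nm(volume)
    print(f"volume ~ {volume:.3g} nm^3, equivalent sphere diameter ~ {diameter:.0f} nm")
```

The point of the calculation is only that a genome of this size at this density fills a volume on the order of a million cubic nanometers, which is why passage through a narrow tunnel requires the DNA to be threaded linearly.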
At 3–4 h PI, assembly of PBCV-1 capsomers begins in localized regions of the cytoplasm, which become prominent 4–5 h PI (Fig. 1(H)). These localized regions, called virus assembly centers or virus factories, consist of host cisternae that are derived from the endoplasmic reticulum next to the nuclear membrane (Fig. 1(I)). The cisternae are localized at the periphery of the viral assembly centers and are cleaved into single bilayered membranes, which then move to the central region of the assembly centers. Capsomers form around these membranes, leading to the formation of empty virions at the periphery of the virus assembly centers, where DNA packaging occurs. By 5–6 h PI the cytoplasm fills with infectious progeny viruses, and localized lysis of the host cell membrane and cell wall releases progeny at 6–8 h PI (Fig. 1(J)). Each infected algal cell releases ~1000 newly formed virus particles, of which ~25% form plaques. Mechanical disruption of cells releases infectious virus particles 30–50 min prior to cell lysis, indicating that the virus is mature and does not acquire its glycoprotein capsid by budding through the host plasma membrane as it is released from the cell. The coccolithovirus EhV virion has an external lipid membrane and the virus encodes 6 RNA polymerase subunits, none of which are packaged in the virion. EhV enters the host cell via either endocytosis or an envelope fusion mechanism by interacting with lipid rafts on the host membrane, which are used as focal points for attachment and entry. Upon entry, EhV internalization and breakdown of the virion capsid take place instantaneously, quickly followed by the release of the virion genome into the nucleus. There, early virus transcription occurs using host RNA polymerases, followed by transcription of the rest of the viral genes in the cytoplasm, where capsid assembly and viral DNA synthesis occur.
The newly assembled viruses are transported to the plasma membrane where they acquire the inner lipid layer by association with intracellular lipid bodies, and they are released from the cell by budding through lipid rafts where they acquire a GSL-rich outer membrane. A large proportion of newly formed viruses are also released when the host cell eventually lyses. The EhV latent period can be as short as 4–8 h PI and EhV burst sizes range from 50 to 1000 new particles per lysed cell. The proportion of infectious EhV progeny can be as high as 30% or as low as 1%, depending on the type of EhV and host (e.g., calcifying or not). Prymnesioviruses and prasinoviruses have somewhat longer latent periods than the coccolithoviruses. Burst size for the prymnesioviruses is between 275 and 410 and up to 330 per lysed host cell for prasinovirus MpV. Host responses to EhV infection are diverse and can often be used as early markers of infection. For example, EhV infection triggers a series of metabolic events, including host lipid and fatty acid remodeling that includes the production of viral-encoded GSLs that are essential for infection, differential induction of nitric oxide and reactive oxygen species, induction of caspase activity, enhanced metacaspase expression, activation of programmed cell death and autophagy pathways, and impairment of the host photosystem apparatus. Furthermore, EhV infection results in the excess production of “sticky” extracellular polysaccharides, which are exuded from infected cells and facilitate aggregation of cellular content and debris into larger organic carbon-containing particles. Phaeovirus EsV initiates its life-cycle by infecting free-swimming, wall-less gametes of its host. Virus particles enter the cell by fusion with the host plasma membrane and release a nucleoprotein core particle into the cytoplasm, leaving remnants of the capsid on the surface. The viral core moves to the nucleus within 5 min PI.
One important feature that distinguishes the EsV lifecycle from the other phycodnaviruses is that the viral DNA becomes integrated into the host genome and is transmitted mitotically to all cells of the developing alga. The viral genome remains latent in vegetative cells until it is expressed in the algal reproductive cells, the sporangia or gametangia. Massive viral DNA replication occurs in the nuclei of these reproductive cells, followed by nuclear breakdown and viral assembly that continues until the cell becomes densely packed with virus particles. Virus release is stimulated by the same factors that induce discharge of gametes from the host, i.e., changes in temperature, light, and water composition. This synchronization facilitates interaction of viruses with their susceptible host cells.

Less is known about replication of the prasinoviruses and raphidoviruses, although formation of the raphidovirus HaV occurs in the cytoplasm and the nucleus remains intact and separate from the viroplasm that consists of a fibrillar matrix. Ultimately, viral production results in the disruption of organelles, lysis of the cell and release of the virus particles. Unlike the chloroviruses, which are very host specific, some of the raphidoviruses can infect a range of host isolates within specific algal species; however, there is no evidence that they cross the species barrier.
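The burst sizes and infectious fractions quoted in this section combine into a simple estimate of infectious progeny per lysed cell. The sketch below does that arithmetic with the approximate figures given for PBCV-1 (about 1000 particles, about 25% plaque-forming) and the reported EhV ranges (50 to 1000 particles, 1% to 30% infectious). The function name is hypothetical, the numbers are the rounded values from the text rather than new measurements, and pairing the extremes of the two EhV ranges is an assumption made only to bound the estimate.

```python
def infectious_progeny(burst_size: float, infectious_fraction: float) -> float:
    """Expected number of infectious particles released per lysed cell."""
    if burst_size < 0:
        raise ValueError("burst_size must be non-negative")
    if not 0.0 <= infectious_fraction <= 1.0:
        raise ValueError("infectious_fraction must be between 0 and 1")
    return burst_size * infectious_fraction


if __name__ == "__main__":
    # Approximate values quoted in the text.
    pbcv1 = infectious_progeny(1000, 0.25)
    # Assumed pairings of the reported EhV extremes, for bounding only.
    ehv_low = infectious_progeny(50, 0.01)
    ehv_high = infectious_progeny(1000, 0.30)
    print(f"PBCV-1: ~{pbcv1:.0f} infectious particles per cell")
    print(f"EhV: between ~{ehv_low:.1f} and ~{ehv_high:.0f} per cell")
```

The spread of several hundred-fold between the two EhV bounds illustrates why both burst size and infectivity need to be reported when comparing virus-host systems.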

Virus Transcription After entry of the chlorovirus PBCV-1 DNA into the cell, its DNA and viral-associated proteins quickly move to the nucleus and commandeer the host transcription machinery. Early transcription of PBCV-1 begins 5–10 min PI. In this immediate-early phase of infection, host transcription rates decrease and the host transcription machinery is reprogrammed to transcribe viral DNA. Some early viral transcripts are synthesized in the absence of protein synthesis. Details of reprogramming are unknown, but host chromatin remodeling is probably involved because PBCV-1 encodes and packages an enzyme that methylates Lys-27 in histone 3. Circumstantial evidence indicates that this methylation represses host transcription following PBCV-1 infection. In addition, host chromosomal DNA degradation begins within 5 min PI, presumably by PBCV-1-encoded and packaged DNA restriction endonucleases, which also aids in inhibition of host transcription and release of the host transcription machinery for the benefit of the virus. PBCV-1 dominates the infected cells in the early phase prior to initiation of viral DNA synthesis, which begins at 60–90 min PI. Microarray experiments indicated that 63% of the PBCV-1 genes are expressed early, whereas late genes are expressed (after 60 min PI) until the end of virus replication; 43% of the genes are expressed both early and late. Remarkably, an RNA-seq based transcription profile revealed that ~50 PBCV-1 genes are expressed within the first 7 min PI. By 60 min PI, essentially all of the PBCV-1 genes are expressed at some level and ~40% of the poly(A+)-containing RNAs in the infected cell are PBCV-1 transcripts. As expected, the synthesis of late transcripts requires translation of early virus genes. Consensus promoter regions for early and late PBCV-1 genes have not been identified definitively; however, the sequence AATGACA is common in the 100 nucleotides preceding the ATG start codon of most early PBCV-1 genes.
Furthermore, the 50 nucleotides preceding the ATG start codons are usually >70% A + T. Transcription of some PBCV-1 genes appears to be complex. For example, some gene transcripts exist as multiple bands and these patterns can change between early and late times in the virus life-cycle. Transcription of prasinovirus genomes has been analyzed by RNA-Seq in Micromonas pusilla infected with MpV-SP1 in continuous light and in Ostreococcus tauri infected with OtV5 in a 12 h day/12 h night growth regime. In both cases, the availability of complete host and viral genomes allowed the expression of all host and viral genes to be monitored over the viral life-cycles. In M. pusilla the continuous light experiments focused on comparing the effects of different nutrient phosphate levels, whereas in O. tauri a day/night cycle was used to mimic natural growth conditions because host gene expression is influenced strongly by light and the cell cycle. When M. pusilla growth was limited by 10-fold less phosphate, only 30% of host cells were lysed by MpV-SP1 virus at 24 h PI, compared to 70% lysis in controls; about 5.8% of the differentially regulated genes were due to the viral infection. In O. tauri the host cell growth was mainly synchronous, with S-phase (DNA synthesis) occurring in late afternoon to early night and cell division in early night. Under these conditions, prasinovirus transcripts were much less abundant in the day and host cell lysis occurred mainly during the following day. During the night, about 4.4% of host genes showed increased expression relative to uninfected controls, likely representing both genes de-repressed for viral virulence and genes induced for host defense. Practically all of the MpV-SP1 and OtV5 genes were expressed during infections, except for a few CDSs residing in the terminal inverted repeat regions.
Coccolithovirus EhV transcription also begins in the host nucleus using the host DdRp even though EhV encodes 6 RNA polymerase subunits; however, none of them are packaged in the virus. Transcriptional analysis suggests that viral RNA polymerase gene expression occurs in a secondary transcriptional phase (occurring at least one hour PI and demarcated by a transition from a unique-promoter driven, locus-specific transcriptional activity to a genome wide transcriptional activity), potentially indicative of a biphasic replication cycle with both nuclear and cytoplasmic phases.
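The two sequence features noted earlier in this section for early PBCV-1 genes (the AATGACA sequence within the 100 nucleotides preceding the ATG, and an A + T content above 70% in the 50 nucleotides preceding it) can be checked mechanically. The following is a minimal sketch, assuming the coding strand is supplied 5'→3' and that the 0-based index of the A of the start codon is known; the function names and the toy sequence are hypothetical, and genes on the opposite strand would need to be reverse-complemented first.

```python
EARLY_MOTIF = "AATGACA"  # sequence reported upstream of most early PBCV-1 genes


def upstream(genome: str, atg_index: int, length: int) -> str:
    """Return up to `length` nucleotides immediately 5' of the start codon."""
    start = max(0, atg_index - length)
    return genome[start:atg_index].upper()


def at_fraction(seq: str) -> float:
    """Fraction of A and T bases in a sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def early_promoter_features(genome: str, atg_index: int) -> dict[str, bool]:
    """Report the two upstream features described for early PBCV-1 genes."""
    return {
        "has_motif": EARLY_MOTIF in upstream(genome, atg_index, 100),
        "at_rich": at_fraction(upstream(genome, atg_index, 50)) > 0.70,
    }


if __name__ == "__main__":
    # Toy upstream region (hypothetical): GC spacer, motif, then an AT-rich stretch.
    toy_upstream = "GCGC" + EARLY_MOTIF + "TA" * 25
    toy_genome = toy_upstream + "ATGGCC"
    print(early_promoter_features(toy_genome, len(toy_upstream)))
```

Because no consensus promoter has been defined definitively, a scan of this kind only flags candidates; it does not establish that a gene is expressed early.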

Ecology Eukaryotic algae are important components of both inland water and marine environments; however, the significance of viruses in these systems is only beginning to be appreciated. The diversity of marine prasinoviruses was first noted by using the DNA polymerase gene found in all NCLDVs as a genetic marker. Using degenerate oligonucleotide primers and subsequent sequencing revealed viruses that were likely to infect the microalga Micromonas. Subsequently, diverse viruses infecting the species Ostreococcus tauri and Bathycoccus sp. (also members of the algal class Mamiellophyceae) were identified in coastal samples from the North Sea and Mediterranean Sea. The availability of several prasinovirus genomes permitted screening of very large metagenomic datasets, such as those produced by the Tara-Oceans expedition, for the presence of prasinoviruses and showed them to be one of the most abundant types of NCLDV in marine waters. A new degenerate primer pair, in combination with next generation sequencing of PCR products, revealed enormous viral diversity in different marine samples. The viral diversity and distribution are dependent on environmental conditions affecting host growth, e.g., nutrient and light availability and temperature.

Controlled one-step infection experiments with prasinoviruses demonstrated that nutrient limitation typically results in longer latent periods, reduced burst sizes and delayed lysis of the host as compared to control treatments. For example, the burst size of MpV-08T was reduced by 70% under phosphate limitation as well as under nitrogen and iron limitation. Micromonas growth is not strongly affected by light intensity, and this may be the reason that MpV proliferation is also not impaired by low or high light, independent of whether the host is phosphorus limited or not. MpV proliferation is probably not dependent on photophosphorylation during infection, similar to chlorovirus production in Chlorella. Besides nutrients and light, temperature is also an important host growth-regulating variable, and indeed in the Arctic it impacts MpoV burst size and progeny virus infectivity. Phycodnaviruses can play major roles in the termination of marine algal blooms and research efforts are actively focusing on elucidating the natural history of these algal/virus systems. For example, infection by EhV and PgV is one of the primary mechanisms for terminating Emiliania huxleyi and Phaeocystis globosa blooms, respectively. EhV infection and replication are affected by light, and newly formed virus particles can be transmitted to surrounding regions by ‘hitch-hiking’ on zooplankton and by marine aerosols. Likewise, infection by the raphidovirus HaV is regarded as one of the factors influencing the dynamics and termination of Heterosigma akashiwo blooms. HaV has a significant impact on H. akashiwo populations and changes the clonal composition of the alga. Based on host-virus cross reactivity tests, HaV isolates were divided into three groups (I-III) according to their intraspecies host range. Chloroviruses are ubiquitous in inland water collected throughout the world, and titers as high as thousands of infectious particles per ml have occasionally been reported.
However, typically the titer is 1–100 infectious particles per ml. The titers are seasonal, with the highest titers in the spring and late fall as monitored in a lake in the temperate zone. It is not known if chloroviruses replicate exclusively in zoochlorellae symbiotic with paramecia and heliozoae or if the viruses have another host(s). In fact, it is not known if these zoochlorellae exist free of their hosts in natural environments. However, zoochlorellae contained within their symbiotic hosts are refractory to infection by chloroviruses because the viruses cannot contact the algae. In the case of the ciliate Paramecium bursaria, chloroviruses can attach to the outer membrane of the paramecia without infecting them, which puts the viruses in a good position to encounter zoochlorellae if the paramecium is disrupted by a feeding predator. For example, copepods feed on paramecia by engulfing them and pass viable zoochlorellae through their digestive system intact, which leads to virus infection and a rapid increase in viral titer. This gives rise to the concept of predation being an ecological catalyst. Other predators such as Didinium can forage on zoochlorellae-bearing paramecia and rupture the cell, which results in the release of zoochlorellae that can then be infected. The lysogenic phaeoviruses EsV and FsV infect the small filamentous brown algae Ectocarpus sp. and Feldmannia sp., respectively. Lysogeny is consistent with the observation by early investigators that VLPs appeared infrequently in eukaryotic algae and only at certain stages of algal development. The apparent lack of infectivity by many of these previously observed VLPs in eukaryotic algae is also consistent with a lysogenic lifestyle. The VLPs might either infect the host and resume a lysogenic relationship or be excluded by pre-existing lysogenic viruses. The ultimate role of the viruses in the ecology of their host is still unknown.
Lysogenic phaeoviruses have also recently been reported to infect the large perennial Laminaria macroalgae (kelp). Kelp form underwater forests that dominate coastlines on all continents and constitute some of the most productive and structurally complex ecosystems in the world, supporting diverse marine communities. The role that the phaeoviruses play in kelp ecology is completely unknown. To summarize, phycodnaviruses contribute to microbial composition and diversity, nutrient cycling, carbon flow, and other biogeochemically important processes in aqueous environments.

Resistance to Phycodnavirus Infections Resistance to virus infection has been studied to some degree for members of three genera within the family Phycodnaviridae. In the case of the chloroviruses, resistance to virus infection occurs quite frequently and most of the time this is due to a change in the host receptor such that the chloroviruses are unable to attach to the host. Coccolithoviruses infect the marine microalga Emiliania huxleyi. The alga has different life forms including: (1) diploid coccolith-bearing or naked, non-motile cells that are susceptible to the virus; (2) diploid coccolith-bearing or naked, non-motile cells that are resistant to the virus; and (3) haploid, non-calcifying, organic scale-bearing swarming cells that were originally reported to be resistant to virus infection. However, recent evidence indicates that virus-derived lipids are present in E. huxleyi haploid cells, indicating that the cells are actually infected. In addition, virus transcripts are detected in these cells in the absence of cell lysis due to virus infection. This superimposed state of being both infected and resistant is reminiscent of Schrödinger’s cat; i.e., of being simultaneously both dead and alive. In the green microalga Ostreococcus tauri, resistance to prasinovirus infection arises frequently in culture, and in Micromonas sp. the frequency of spontaneous resistance is temperature-dependent. RNA-Seq transcription analysis of many independent OtV5-resistant clonal lines obtained from OtV5-infected cultures revealed overexpression of all genes spanning half of the physical length of chromosome 19, and physical rearrangements of this chromosome are also observed in karyotypic analyses of these lines. This chromosome, which carries many predicted glycosyltransferase genes, is variable in its genetic structure in different wild-type algal strains, and appears to be involved in viral defense.

Phycodnavirus Genes Encode Some Interesting and Unexpected Proteins Many chlorovirus-encoded proteins are either the smallest or among the smallest proteins of their class. In addition, homologous genes in the chloroviruses can differ in nucleotide sequence by as much as 50%, which translates into amino acid differences of 30%–40%. Therefore, comparative protein sequence analyses can identify conserved amino acids in proteins as well as regions that tolerate amino acid changes. The small sizes and the finding that many chlorovirus-encoded proteins are ‘user friendly’ have resulted in the biochemical and structural characterization of many PBCV-1-encoded proteins. Examples include: (1) 5mC and 6mA DNA methyltransferases and their companion DNA site-specific endonucleases (some of which are sold commercially); (2) the smallest eukaryotic ATP-dependent DNA ligase, which is also sold commercially; (3) the smallest type II DNA topoisomerase (the virus enzyme cleaves dsDNAs ~50 times faster than the human type II DNA topoisomerase); (4) a small prolyl-4-hydroxylase that converts Pro-containing peptides into hydroxyl-Pro-containing peptides in a sequence-specific fashion; and (5) one of the smallest proteins (82 amino acids) to form a functional K+ channel. In fact, the prasinovirus MpV-12T encodes a functional 78 amino acid K+ channel protein. These virus-encoded K+ channels are under intensive investigation and there are over 60 publications on them. The chloroviruses are also unusual because they encode enzymes involved in sugar metabolism. Three PBCV-1-encoded enzymes, glutamine:fructose-6-phosphate aminotransferase, UDP-glucose dehydrogenase, and hyaluronan synthase, are involved in the synthesis of hyaluronan, a linear polysaccharide composed of alternating β-1,4-glucuronic acid and β-1,3-N-acetylglucosamine residues. All three genes are transcribed early in PBCV-1 infection and hyaluronan accumulates on the external surface of the infected cells.
Other chloroviruses encode a chitin synthase, and chitin accumulates on the external surface of these infected cells. Two PBCV-1-encoded enzymes, GDP-D-mannose dehydratase and fucose synthase, comprise a three-step pathway that converts GDP-D-mannose to GDP-L-fucose and GDP-L-rhamnose. Both of these monosaccharides are in the glycans attached to the virus major CP. As noted earlier, PBCV-1 encodes most if not all of the machinery to glycosylate its major CP, including several glycosyltransferases. PBCV-1 is the first virus known to encode 5 functional enzymes involved in polyamine biosynthesis and catabolism: ornithine decarboxylase (ODC), homospermidine synthase, agmatine iminohydrolase, N-carbamoyl-putrescine amidohydrolase and a polyamine acetyltransferase. ODC catalyzes the decarboxylation of ornithine to putrescine, which is the first and the rate-limiting enzymatic step in polyamine biosynthesis. Not only is the PBCV-1-encoded ODC the smallest known ODC, the PBCV-1 enzyme is also interesting because it decarboxylates arginine more efficiently than ornithine. Homospermidine synthase is a virion-associated protein. The chloroviruses also encode many other interesting and unexpected enzymes that have been characterized, including an aquaglyceroporin, a Ca2+-transporting ATPase, a K+ transporter, a pyrimidine dimer-specific glycosylase, an aspartate transcarbamylase, and a thymidylate synthase X. Another set of unexpected proteins is encoded by the EhV coccolithoviruses. These include enzymes that comprise a near-complete de novo sphingolipid biosynthesis pathway believed to have been acquired from the host via horizontal gene transfer (HGT). Importantly, in eukaryotes, sphingolipids are common building blocks of membrane lipids and lipid rafts. The first, and likely rate-limiting, enzyme in this pathway is serine palmitoyltransferase (SPT).
Its inhibition by myriocin (a sphingosine equivalent which is a specific and effective inhibitor of SPT in yeast and mammalian cells) halts successful EhV replication in a dose dependent manner. Other putative proteins coded by the different EhVs include: a phosphate permease, polyubiquitin, and a sialidase. In contrast to the chloroviruses, the gene content of many prasinoviruses has only come to light in the last decade and very few of the predicted gene functions have been experimentally verified. Exceptions to this are certain cation transporters, whose function has been tested by heterologous expression in yeast or mammalian cells. Some prasinoviruses encode phosphate or ammonium transporters that are thought to confer nutritional advantages to their host cells during infection. These, and numerous other genes are likely to have been acquired by HGT, either from the host or from other microbes. About 30% of the prasinovirus genes show similarity to genes with known functions, and include a diversity of interesting genes whose number increases with every new virus sequenced. All prasinoviruses encode methylases and glycosyltransferases that are likely involved in modifications of host or viral DNA or proteins important for the viral life-cycle. Many Ostreococcus or Micromonas viruses encode putative enzymes for amino acid synthesis (for example 3-dehydroquinate synthase, acetolactate synthase, oxovalerate aldolase) or sugar synthesis (for example 6-phosphofructokinase). Near-complete genomes of some prasinoviruses with unknown host species have been assembled from brackish or freshwater lakes, and encode genes for a multidrug resistance protein and for histone H3, which is likely to be involved either in host or viral gene expression and/or virus DNA packaging.

Perspectives
Eukaryotic algae are important components of both inland water and marine environments. However, the influence of viruses on host population dynamics in these systems is only beginning to be appreciated. The identification and characterization of large dsDNA viruses that infect eukaryotic algae is clearly in its infancy. For example, metagenomic studies, such as DNA sequencing in the Sargasso Sea, indicate that many algal viruses with unknown hosts exist in a variety of aqueous environments. This lack of knowledge certainly carries over to soil samples. Furthermore, with a few possible exceptions, little is known about the replication

cycles of most algal viruses. The few algal-virus systems that are under investigation indicate that the various systems differ extensively, e.g., in their infection processes, in their replication cycles, consequences of infection (lytic versus lysogenic), and probably in many other aspects. The development of efficient systems that allow scientists to conduct molecular manipulations of the algal virus genomes would provide a huge boost to the study of algal viruses. Such techniques are currently lacking. Finally, it should be noted that algal viruses encode many interesting proteins, and even some that have potential for commercial exploitation. Unfortunately, it is also true that most of the putative proteins in these viruses have unknown functions and they often have no homology to proteins in the public domain databases. Who knows how many more unexpected algal virus encoded proteins await discovery?

Acknowledgments
We want to thank all of the many researchers who have contributed to the growing field of algal virology, especially close colleagues for useful discussions, and those whose work has not been mentioned because of space limitations.


INVERTEBRATE VIRUSES

An Introduction to Viruses of Invertebrates
Peter J Krell, Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON, Canada
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
BV Budded virus (baculoviruses)
G Virion glycoprotein, structural protein
gRNA Genomic (or virion) RNA
GV Granulovirus
ICTV International Committee on Taxonomy of Viruses
IRES Internal ribosome entry site on ssRNAs
L Long segment of a tripartite RNA genome
M Medium segment of a tripartite RNA genome
N Virion nucleoprotein, structural protein of core
NPV Nucleopolyhedrovirus
NS Nonstructural protein
ODV Occlusion derived virus (baculoviruses)
P Virion phosphoprotein, structural protein
RdRp RNA dependent RNA polymerase (of RNA viruses)
S Short segment of a tripartite RNA genome
sgRNA Subgenomic RNA, especially for Nidovirales viruses
TRS Transcription regulating sequence (Nidovirales viruses)
+ve ssRNA Positive sense (mRNA sense) single stranded RNA
−ve ssRNA Antisense (anti mRNA sense) single stranded RNA
VP Virion protein

Glossary
Granulovirus It is a term used for insect baculoviruses which produce granules composed of granulin, each containing a single virion, which protect the virus from the environment.
Internal ribosome entry site (IRES) It is a sequence in an mRNA which allows for cap-independent translation. This is common to the Picornavirales order but occurs in the RNAs of others as well.
International Committee on Taxonomy of Viruses It is an international committee which has a defined organizational structure and which meets on an annual basis to review proposals on new taxonomies, or changes in taxonomies. Once proposals are formally accepted the committee updates the taxonomy and posts it to the ICTV Website at “See Relevant Websites section”.
Nucleopolyhedrovirus It is a term used for the insect baculoviruses which produce large polyhedral particles composed of polyhedrin in which virions are embedded to protect them from the environment.
RNA dependent RNA polymerase (RdRp) It is an RNA polymerase which synthesizes RNA copies from an RNA template strand. It is an enzyme which allows an RNA virus to transcribe its genome into antigenome RNA, viral mRNA or subgenomic mRNAs.
Subgenomic RNAs (sgRNAs) They are smaller segments of RNA produced from a long RNA template with each segment encoding a different ORF, or in the case of the Nidovirales different combinations of ORFs.
Transcription regulating sequence (TRS) It is a short sequence in viral RNA at the beginning of the first ORF or between subsequent ORFs, especially for viruses in the Nidovirales order, that allows transcription of subgenomic mRNAs, with each sgRNA being translated into the product of a different ORF.

Introduction
Considering that invertebrate species outnumber the total of all other global species by about 9 to 1 and, other than bacteria, have a longer evolutionary history, it is not surprising that they harbor the majority of virus species on earth. Although most people think of invertebrates as arthropods, including insects and spiders, they also include protozoans, annelids, echinoderms and molluscs. For the sake of simplicity and acknowledging the paucity of information on viruses in other groups of invertebrates, this review will focus mostly on viruses of arthropods, specifically insects, arachnids and crustaceans and, due to their economic importance, molluscs. While many viruses, notably the arboviruses of mammals and plants, infect both their invertebrate vectors and vertebrate or plant hosts, the rest are restricted to individual species of animals. Table 1 summarizes the different species of invertebrate viruses based on the Family taxon. For the sake of simplicity, names of only taxa from the Family level (and Order) and below are given. The higher ranks of Class, Subphylum, Phylum, Kingdom and Realm, favored by the International Committee on Taxonomy of Viruses (ICTV), may be of relevance to those interested in virus evolution and bioinformatics, but at a practical level have little value to most practicing virologists and are not included in these summaries. As one example of the complete taxonomy of a virus of invertebrates, the taxonomy of roniviruses would have to include all of Realm Riboviria, Kingdom Orthornavirae, Phylum Pisuviricota, Subphylum none, Class Pisoniviricetes, Order Nidovirales, Suborder Ronidovirineae, Family Roniviridae, Subfamily Okanovirinae, Genus Okavirus, Subgenus Tipravirus, Species Gill-associated virus. It would be too cumbersome to provide such a complete taxonomy for all viruses in this article and would probably be of little interest. 
Encyclopedia of Virology, 4th Edition, Volume 4. doi:10.1016/B978-0-12-814515-9.00097-7

Table 1 Families and Genera of Viruses Replicating in or Associated with Invertebrate Hosts

| Family/Subfamily (Order) | Genus, Subgenus, Type Species (total # in genus) | Genome type, segments and size | Morphology, T value for icosahedral capsids | Major invertebrate host |
|---|---|---|---|---|
| Alphatetraviridae (Hepelivirales) a | Betatetravirus, Nudaurelia capensis beta virus (7); Omegatetravirus, Nudaurelia capensis omega virus (3) | +ve ssRNA; Beta: monopartite, 6.6 kb; Omega: bipartite, 2.5 and 5.3 kb | Icosahedral T=4, 42 nm diameter | Lepidoptera |
| Artoviridae (Mononegavirales) | Hexartovirus, Caligid hexartovirus (2); Peropuvirus, Pteromalus puparum peropuvirus (6) | −ve ssRNA, monopartite, 12.5 kb | Spherical, flexuous nucleoprotein core, 100 to 130 nm, enveloped | Wasps, barnacles, copepods, pill worms, woodlice, ticks |
| Ascoviridae (Pimascovirales) | Ascovirus, Spodoptera frugiperda ascovirus 1a (3); Toursvirus, Diadromus pulchellus toursvirus (1) | Circular dsDNA, 100 to 200 kb | Core 80 × 300 nm, enveloped; 130 × 200 to 400 nm allantoid or reniform virion | Lepidoptera, parasitoid wasps |
| Baculoviridae | Alphabaculovirus, Autographa californica multiple nucleopolyhedrovirus (55); Betabaculovirus, Cydia pomonella granulovirus (26); Deltabaculovirus, Culex nigripalpus nucleopolyhedrovirus (1); Gammabaculovirus, Neodiprion lecontei nucleopolyhedrovirus (2) | Circular dsDNA, 82 to 180 kb | Budded virus: helical nucleocapsids 30 to 60 nm × 250 to 300 nm, enveloped. Occlusion derived virus (ODV): embedded in a polyhedral occlusion body. NPV polyhedra 0.15 to 5.0 µm diameter. GV granules 0.12 to 0.5 µm diameter | Lepidoptera, hymenoptera, diptera |
| Bidnaviridae (Polivirales) | Bidensovirus, Bombyx mori bidensovirus (1) | Linear ssDNA, ambisense, bipartite, 6 and 6.5 kb | Icosahedral T=1, 20 to 24 nm, separate capsid for each of 4 ssDNAs | Silkworm |
| Birnaviridae | Entomobirnavirus, Drosophila X virus (2) | dsRNA, bipartite, 3.2, 3.4 kb | Icosahedral T=3, 60 to 70 nm diameter | Drosophila, mosquitoes |
| Carmotetraviridae (Tolivirales) | Alphacarmotetravirus, Providence virus (1) | +ve ssRNA, monopartite, 6.2 kb | Icosahedral T=4, 40 nm diameter | Lepidoptera |
| Circoviridae (Cirlivirales) | Circovirus, Porcine circovirus 1 (43); Cyclovirus, Human associated cyclovirus 8 (51) | ssDNA, circular, 1.7 to 1.9 kb | Icosahedral T=1, 15 to 25 nm diameter | Cockroach, dragonfly, mosquito, ants, spiders, ticks |
| Dicistroviridae (Picornavirales) | Aparavirus, Acute bee paralysis virus (6); Cripavirus, Cricket paralysis virus (4); Triatovirus, Triatoma virus (5) | +ve ssRNA, monopartite, 8.0 to 10.2 kb | Icosahedral, pseudo T=3, 30 nm diameter | Bees, ants, aphids, drosophila, triatomine bugs |
| Genomoviridae (Geplafuvirales) | Gemycircularvirus, Sclerotinia gemycircularvirus 1 (43); Gemyduguivirus, Dragonfly associated gemyduguivirus 1 (1); Gemykibivirus, Dragonfly associated gemykibivirus 1 (16) | ssDNA, circular, 2.2 kb | Icosahedral T=1, 29 nm diameter | Dragonfly, mosquito |
| Hytrosaviridae | Glossinavirus, Glossina hytrosavirus (1); Muscavirus, Musca hytrosavirus (1) | dsDNA, circular, 124 to 190 kb | Rod shaped, 50 to 65 nm × 500 to 1,000 nm, helical core, enveloped | Diptera and perhaps hymenoptera |
| Iflaviridae (Picornavirales) | Iflavirus, Infectious flacherie virus (15) | +ve ssRNA, linear, monopartite, 8 to 10.5 kb, VPg at 5' end, poly A at 3' end | Icosahedral, pseudo T=3, 22 to 30 nm diameter | Honey bees, parasitic mites, lepidoptera, aphids, planthoppers |
| Iridoviridae, Betairidovirinae (Pimascovirales) | Chloriridovirus, Invertebrate iridescent virus 3 (5); Decapodiridovirus, Decapod iridescent virus 1 (1); Iridovirus, Invertebrate iridescent virus 6 (2) | dsDNA, linear, 163 to 220 kb, circularly permuted, terminally redundant, methylated | Icosahedral T=147 to 217, 150 to 200 nm diameter, inner membrane, naked or enveloped | Mosquitoes, lepidoptera, isopods, crabs, shrimp |
| Lispiviridae (Mononegavirales) | Arlivirus, Lishi arlivirus (6) | −ve ssRNA, linear, bisegmented, 7.1, 4.4 kb | Unknown, identified solely from sequence data | Spiders |
| Malacoherpesviridae (Herpesvirales) | Aurivirus, Haliotid herpesvirus 1 (1); Ostreavirus, Ostreid herpesvirus 1 (1) | dsDNA, linear, 134 to 212 kb | Icosahedral T=16 nucleocapsid, spherical 150 to 200 nm diameter virion, enveloped | Molluscs |
| Mesoniviridae, Hexponivirinae (Nidovirales) | Alphamesonivirus (10); Subgenus Namcalivirus, Alphamesonivirus 1 (2 species) | +ve ssRNA, linear, monopartite, 20 kb, 5' cap, 3' polyA tail | Spherical, 50 to 120 nm, enveloped with spike protein | Mosquito |
| Nimaviridae | Whispovirus, White spot syndrome virus (1) | dsDNA, circular, 280 to 307 kb | Ellipsoid, nucleocapsid 70 to 170 nm × 210 to 420 nm, with tegument and envelope | Crustaceans, shrimp |
| Nodaviridae (Nodamuvirales) | Alphanodavirus, Nodamura virus (5) | +ve RNA, linear, bipartite, 1.4 and 3.1 kb | Icosahedral T=3, 25 to 33 nm diameter | Coleoptera |
| Nudiviridae | Alphanudivirus, Oryctes rhinoceros nudivirus (2); Betanudivirus, Heliothis zea nudivirus (1) | dsDNA, circular, 96 to 232 kb | Nucleocapsid rod-shaped helical core, enveloped virus 81 to 100 × 200 to 415 nm | Beetles, lepidoptera, crustaceans |
| Nyamiviridae (Mononegavirales) | Berhavirus, Sipunculid berhavirus (3); Crustavirus, Wenzhou crustavirus (3); Nyavirus, Nyamanini nyavirus (3); Orinovirus, Orinoco orinovirus (1); Socyvirus, Soybean cyst nematode socyvirus (1); Tapwovirus, Tapeworm tapwovirus (1) | −ve ssRNA, linear, monopartite, 12.2 kb (bipartite for Tapwovirus) | Enveloped spherical particles, 100 to 130 nm diameter | Ticks, insects, echinoderms, sipunculids, crustaceans, nematodes, tapeworms (land and sea birds) |
| Parvoviridae, Densovirinae (Piccovirales) | Aquambidensovirus, Decapod aquambidensovirus 1 (2); Blattambidensovirus, Blattella blattambidensovirus 1 (1); Hemiambidensovirus, Hemipteran hemiambidensovirus 1 (2); Iteradensovirus, Lepidopteran iteradensovirus 1 (5); Miniambidensovirus, Orthopteran miniambidensovirus 1 (1); Pefuambidensovirus, Blattodean pefuambidensovirus 1 (1); Protoambidensovirus, Lepidopteran protoambidensovirus 1 (2); Scindoambidensovirus, Orthopteran scindoambidensovirus 1 (3) | ssDNA, linear, 4 to 6 kb | Icosahedral T=1, 21 to 28 nm diameter | Crickets, crayfish, Hymenoptera, Orthoptera, Hemiptera, Lepidoptera, aphids, crustaceans, mosquitoes, echinoderms, prawns |
| Parvoviridae, Hamaparvovirinae (Piccovirales) | Brevihamaparvovirus, Dipteran brevihamaparvovirus 1 (2); Hepanhamaparvovirus, Decapod hepanhamaparvovirus 1 (1); Penstylhamaparvovirus, Decapod penstylhamaparvovirus 1 (1) | ssDNA, linear, 4 to 6 kb | Icosahedral T=1, 21 to 28 nm diameter | Decapods, diptera |
| Permutotetraviridae | Alphapermutotetravirus, Thosea asigna virus (2) | +ve ssRNA, monopartite, 5.7 kb | Icosahedral T=4, 40 nm diameter | Lepidoptera |
| Phasmaviridae (Bunyavirales) | Feravirus, Ferak feravirus (1); Jonvirus, Jonchet jonvirus (1); Orthophasmavirus, Kigluaik phantom orthophasmavirus (9); Sawastrivirus, Sancia sawastrivirus (1); Wuhivirus, Insect wuhivirus (1) | −ve ssRNA, tripartite; Fera: 1.5, 4.2, 6.9 kb; Jon: 1.7, 5.4, 6.9 kb; Ortho: 2.2, 2.8, 6.7 kb; Sawastri: 1.6, 4.2, 7.2 kb | Nucleocapsid pseudocircular ribonucleoprotein core, spherical, 80 nm diameter virus, enveloped. Jonchet virions can also appear rod shaped with a diameter of 60 nm by up to 600 nm in length | Mosquitoes, cockroach, midges, water striders, dragonflies, bees |
| Phenuiviridae (Bunyavirales) | Goukovirus, Gouléako virus (3); Phasivirus, Badu phasivirus (5); Phlebovirus, Rift Valley fever phlebovirus (60); Wenrivirus, Shrimp wenrivirus (1) | −ve ssRNA, tripartite; Gouko: 1.1, 3.2, 6.4 kb; Phasi: 1.3, 4.1, 6.8 kb; Phlebo: 1.7, 3.2, 6.4 kb | Virions typically spherical with diameters 80 to 130 nm, with surface proteins arranged in a T=12 icosahedral lattice and three RNP coils in the core | Gouko, mosquitoes; Phasi, mosquitoes and flies; Phlebo, mosquito vector; Wenri, shrimp |
| Polycipiviridae (Picornavirales) | Chipolycivirus, Chironomus riparius virus 1 (2); Hupolycivirus, Hubei hupolycivirus (1); Sopolycivirus, Solenopsis invicta virus 2 (11) | +ve ssRNA, monopartite, 10 to 12 kb | Icosahedral, pseudo T=3, 33 nm diameter | Ants, midges |
| Polydnaviridae | Bracovirus, Cotesia melanoscela bracovirus (32); Ichnovirus, Campoletis sonorensis ichnovirus (21) | dsDNA, circular, multipartite; Bracovirus: 15 to 30 segs, 2 to 31 kb each, 25 to 600 kb aggregate size; Ichnovirus: 20 to 50 segs, 2 to 20 kb each, 150 to 250 kb aggregate size | Bracovirus: multiple rod shaped nucleocapsids, enveloped; Ichnovirus: prolate ellipsoid, enveloped | Parasitoid hymenoptera |
| Poxviridae, Chordopoxvirinae (Chitovirales) | Centapoxvirus, Yokapox virus (2) | dsDNA genome phylogenetically more Orthopoxvirus-like than any entomopoxvirus | Poxvirus like morphology | Mosquito, but kills suckling mice and replicates in Vero cells |
| Poxviridae, Entomopoxvirinae (Chitovirales) | Alphaentomopoxvirus, Melolontha melolontha entomopoxvirus (7); Betaentomopoxvirus, Amsacta moorei entomopoxvirus (16); Deltaentomopoxvirus, Melanoplus sanguinipes entomopoxvirus (1); Gammaentomopoxvirus, Chironomus luridus entomopoxvirus (2) | dsDNA, linear, 230 to 375 kb | Brick shaped, 140 to 260 × 220 to 450 nm, enveloped, complex morphology | Mosquitoes, coleoptera, lepidoptera, orthoptera |
| Reoviridae, Sedoreovirinae (Reovirales) | Cardoreovirus, Eriocheir sinensis reovirus (1); Phytoreovirus, Wound tumor virus (3); Seadornavirus, Banna virus (3) | dsRNA, all with 12 segments; Cardoreovirus: 0.7 to 3.7 kb; Seadornavirus: 0.9 to 3.7 kb; Phytoreovirus: 1.4 to 4 kb | Icosahedral, 2 shells, T=2 (inner shell), T=13 (intermediate shell), 60 to 70 nm diameter | Cardo, crabs; Phyto, leafhopper (and plants); Seadorna, mosquitoes (and mammals) |
| Reoviridae, Spinareovirinae (Reovirales) | Aquareovirus, Aquareovirus A (7); Coltivirus, Colorado tick fever coltivirus (5); Cypovirus, Cypovirus 1 (16); Dinovernavirus, Aedes pseudoscutellaris reovirus (1); Idnoreovirus, Idnoreovirus 1 (5) | dsRNA, 9 to 12 segments; Aqua: 11 segs, 0.8 to 3.9 kb; Colti: 12 segs, 0.7 to 4.3 kb; Cypo: 10 segs, 1 to 4.2 kb; Dinoverna: 9 segs, 1.1 to 3.8 kb; Idno: 10 segs, 1 to 4.8 kb | Icosahedral, single (T=2) or dual (T=2/T=13) shell; Aqua: dual shell, 75 nm diameter; Colti: dual shell, 60 to 80 nm diameter; Cypo: single shell, 65 nm diameter; Dinoverna: single shell, T=2 only, 50 nm diameter; Idno: dual shell, T=2/T=13, 70 nm diameter | Aqua, crab and oyster; Colti, ticks and mosquitoes; Cypo, lepidoptera; Dinoverna, Aedes mosquitoes; Idno, hymenoptera |
| Rhabdoviridae (Mononegavirales) | Almendravirus, Puerto Almendras almendravirus (6); Alphanemrhavirus, Xingshan alphanemrhavirus (2); Caligrhavirus, Lepeophtheirus caligrhavirus (3); Sigmavirus, Drosophila melanogaster sigmavirus (7) | −ve ssRNA, monopartite, 10.8 to 16.1 kb | Bullet shaped, helical rod shaped nucleocapsid, 45 to 100 nm × 100 to 430 nm, with matrix and envelope | Mosquito, butterflies, bed bugs, nematodes, sea lice, flies |
| Roniviridae (Nidovirales) | Okavirus (Subgenus Tipravirus), Gill-associated virus (3) | +ve ssRNA, monopartite, 26 kb | Bacilliform, 40 to 60 nm × 150 to 200 nm, enveloped with peplomers | Crustaceans (prawns) |
| Sarthroviridae | Macronovirus, Macrobrachium satellite virus 1 (1) | +ve ssRNA, monopartite, 0.8 kb | Icosahedral T=?, 15 nm diameter, naked | Prawns, crayfish, butterflies, water bugs |
| Solinviviridae (Picornavirales) | Invictavirus, Solenopsis invicta virus 3 (1); Nyfulvavirus, Nylanderia fulva virus 1 (1) | +ve ssRNA, monopartite, 8 to 11 kb, 5' VPg, 3' polyA | Icosahedral T=3, 27 to 29 nm diameter | Ants |
| Tospoviridae (Bunyavirales) | Orthotospovirus, Tomato spotted wilt orthotospovirus (26) | −ve ssRNA, tripartite, 3, 4.8, 8.8 kb | Pseudocircular ribonucleoprotein core, 80 to 120 nm diameter, enveloped | Thrips |
| Xinmoviridae (Mononegavirales) | Anphevirus, Xincheng anphevirus (7) | −ve ssRNA, monopartite, 12 kb | Not seen, based solely on sequence | Mosquitoes, drosophila, orthoptera |

a The taxon Order, if any, is in brackets after the family or subfamily name. The taxon Subfamily, if any, follows the Family name. The number of species in a genus is in brackets after the Genus name.
Note: Some families in this list also include viruses that infect more than invertebrates.

Nevertheless, those interested in the higher levels of classification ordained by the ICTV for any of these viruses can consult the ICTV Taxonomy Database for any virus species mentioned, at “See Relevant Websites section”. This site is updated on a regular basis following approval of new taxa by the ICTV. Any reference to taxa in the articles in this Volume is based on the taxonomy posted as of Aug 31, 2020.

Below are provided brief synopses of the virus families, in alphabetical order, which include viruses which infect invertebrates. In addition to family names, names of additional taxa including Order, Subfamily, Genus and Species are also provided. The total number of species for each genus is included in brackets after the genus name and is based on the ICTV Taxonomy list as of August 31, 2020. Also provided are the years that the corresponding names for the taxa were first approved by the ICTV. Note that by the rules of orthography of the ICTV, the names of taxa (e.g., Order, ending in “virales”, Family, ending in “viridae”, Subfamily, ending in “virinae”, Genus, one word ending in “virus”, and Species, two or more words ending in “virus”) are in italics and names of viruses are in normal Roman letters. Also given are the names of exemplar viruses of each type species. It will be noticed that in most cases the species names are derived directly from the names of a virus classified into that species, the only difference being that the species name is in italics and the virus name is in normal font. Fuller descriptions of most of these are provided in the rest of the articles in Viruses of Invertebrates.
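The suffix conventions just described can be illustrated with a short Python sketch. It is only an illustration of the rules as summarized in this article, not an ICTV tool: the function name `guess_rank` and the returned labels are hypothetical, and the sketch deliberately ignores ranks the paragraph does not list (for example Suborder, ending in “virineae”) as well as species names that carry a trailing strain designation.

```python
from typing import Optional

# Suffix rules for single-word taxon names, as summarized in the text.
# (Assumption: only the ranks named in the paragraph above are covered.)
_SINGLE_WORD_SUFFIXES = [
    ("virales", "Order"),
    ("viridae", "Family"),
    ("virinae", "Subfamily"),
    ("virus", "Genus"),
]


def guess_rank(name: str) -> Optional[str]:
    """Guess the taxonomic rank of a name from its suffix, or return None."""
    words = name.split()
    if not words:
        return None
    if len(words) == 1:
        lowered = words[0].lower()
        for suffix, rank in _SINGLE_WORD_SUFFIXES:
            if lowered.endswith(suffix):
                return rank
        return None
    # Two or more words whose final word ends in "virus" -> Species.
    if words[-1].lower().endswith("virus"):
        return "Species"
    return None


if __name__ == "__main__":
    for taxon in ["Nidovirales", "Roniviridae", "Okanovirinae",
                  "Okavirus", "Gill-associated virus"]:
        print(taxon, "->", guess_rank(taxon))
```

A name such as Spodoptera frugiperda ascovirus 1a, which ends in a strain designation rather than “virus”, returns `None` here, which shows that the suffix rule is a useful first approximation rather than a complete description of ICTV species names.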

Artoviridae
Family: Artoviridae [ICTV 2018] (Order Mononegavirales, ICTV 1990)
Genus: Hexartovirus [ICTV 2019] (2 species)
Type Species: Caligid hexartovirus [ICTV 2019]
Species Exemplar: Lepeophtheirus salmonis negative-strand RNA virus-1
Genus: Peropuvirus [ICTV 2016] (6 species)
Type Species: Pteromalus puparum peropuvirus [ICTV 2016]
Species Exemplar: Pteromalus puparum negative strand RNA virus-1

Artoviruses, originally discovered by high throughput sequencing, belong to the order Mononegavirales and hence have a single negative sense (−ve) ssRNA genome which is up to 12.5 kb in length. The two genera in the Artoviridae family (Arto from arthropod) are Peropuvirus (from Pteromalus puparum negative-strand RNA virus, PpNSRV-1) with six species and Hexartovirus (from the crustacean host class Hexanauplia and artovirus) with two species. PpNSRV-1 was initially reported in 2017 in a pteromalid wasp, Pteromalus puparum, hence the name of the virus. Other members of the Peropuvirus genus have been found in other parasitic wasps, pillworms, woodlice, odonates and copepods, while members of the Hexartovirus genus are associated with copepod sea lice or barnacles. PpNSRV-1 virions, the only artovirus virions to be described, are enveloped with a diameter of 100–130 nm. PpNSRV-1 is transmitted vertically through both male and female wasps and is present in various tissues throughout the life stages. While PpNSRV-1 increases adult longevity, infection does have a fitness cost. PpNSRV-1 particles are present in the follicular cells and ovaries of the parasitoid wasp host and in intracellular vesicles of the digestive tract. Virions for the other artoviruses have not been described as they have been identified solely from metagenomic studies. Collectively, artovirus genomes encode a nucleocapsid protein (N), a glycoprotein (G) and an RNA directed RNA polymerase domain in the large (L) protein, along with some proteins of unknown function, U1 to U3. Transcription initiation and termination motifs, typical of other Mononegavirales viruses, flank the independently transcribed ORFs.

Ascoviridae
Family: Ascoviridae [ICTV 1998] (Order Pimascovirales, ICTV 2019)
Genus: Ascovirus [ICTV 1998] (3 species)
Type Species: Spodoptera frugiperda ascovirus 1a [ICTV 1998]
Species Exemplar: Spodoptera frugiperda ascovirus 1a
Genus: Toursvirus [ICTV 2015] (1 species)
Type Species: Diadromus pulchellus toursvirus [ICTV 1998]
Species Exemplar: Diadromus pulchellus toursvirus 4a (DpTV)

Ascoviruses are unusual viruses of lepidopteran insects that appear to be horizontally transmitted by endoparasitoid wasps during oviposition and have a wide tissue tropism. They are bacilliform or allantoid shaped virions about 130–150 nm in diameter and 300–400 nm in length with an inner, enveloped, 80 × 300 nm core surrounded by a second, outer layer of protein subunits. Their large circular dsDNA genome is 100–200 kb in size encoding 117–180 genes. In addition to their highly complex structures, ascoviruses initiate an unusual cytopathology starting in the nucleus followed by enlargement of the infected cell and its eventual cleavage into 10–30 vesicles similar to apoptotic vesicles. Virion morphogenesis, including development of the inner core surrounded by de novo synthesized membrane and envelopment by an outer membrane, is completed within the vesicles as the cell ruptures. As the virions accumulate near the periphery of the vesicles, they often form occlusion bodies in a “foamy” vesicular matrix. When these abundant, highly refractile vesicles accumulate in the hemolymph they form a milky white appearance. Ascoviruses share 9 core genes with iridoviruses and the giant marseilliviruses infecting ameba. Phylogenetic analysis suggests that iridoviruses and ascoviruses share a common ancestor which is shared with marseilliviruses.


Baculoviridae
Family: Baculoviridae [ICTV 1975]
Genus: Alphabaculovirus [ICTV 2008] (55 species)
Type Species: Autographa californica multiple nucleopolyhedrovirus [ICTV 1979]
Species Exemplar: Autographa californica multiple nucleopolyhedrovirus (C6)
Genus: Betabaculovirus [ICTV 2008] (26 species)
Type Species: Cydia pomonella granulovirus [ICTV 1991]
Species Exemplar: Cydia pomonella granulovirus (Mexican 1)
Genus: Deltabaculovirus [ICTV 2008] (1 species)
Type Species: Culex nigripalpus nucleopolyhedrovirus [ICTV 2004]
Species Exemplar: Culex nigripalpus nucleopolyhedrovirus (Florida 1997)
Genus: Gammabaculovirus [ICTV 2008] (2 species)
Type Species: Neodiprion lecontei nucleopolyhedrovirus [ICTV 2002]
Species Exemplar: Neodiprion lecontei nucleopolyhedrovirus

Baculoviruses are among the most studied of the invertebrate viruses. These viruses infect only insects, specifically lepidoptera (alpha and betabaculoviruses), diptera (deltabaculoviruses) and hymenoptera (gammabaculoviruses). Although baculovirus-like viruses in occlusion bodies have been described in shrimp, their taxonomic status has not been verified. From a historical perspective, baculoviruses were initially recognized as agents which caused jaundice (or flacherie) disease of silkworm. The “polyhedrosis” which was associated with this disease was due to polyhedral bodies observed in liquefied silkworms which died of the disease. In the late 1860s Louis Pasteur raised the hypothesis that the “flacherie” disease of silkworm was due to a bacterium; in the 1920s the agent was identified as a virus by Glaser and Paillot. The involvement of polyhedrosis diseases in reducing pest insect populations led to a lot of research interest in identifying and characterizing the causative agents and using them for biological control. 
Perhaps the first evidence of their potential for insect control was the unforeseen introduction of ghNPV, a nuclear polyhedrosis virus (NPV), through parasitoids from Finland, into a population of the European spruce sawfly, Gilpinia hercyniae, a serious pest of spruce in Canada in 1938. As reported by a mentor for the author of this article, Fred Bird of the Insect Pathology Research Institute in Sault Ste Marie, Ontario, Canada, by 1945 the sawfly infestation had essentially collapsed, and it is still being kept in check by this virus. This was followed shortly by control of another sawfly pest, Neodiprion sertifer, by introduction from Sweden of NeseNPV, now commercialized as Neochek-s, by Gerry Wyatt of Watson and Crick fame. Gerry observed that in NPV DNA the ratios of A:T and of G:C were 1 irrespective of the NPV source, leading to the theory of the double stranded nature of the DNA double helix. As is often the case with fundamental research, the study of the molecular biology of baculoviruses, largely by the laboratories of Lois Miller (University of Georgia) and Max Summers (Texas A&M University), led to the development of the baculovirus expression vector system (BEVS), now in universal use for production of bioreagents and vaccines, including against COVID-19, and as gene delivery vectors. Baculoviruses are characterized by two morphological forms, budded virus (BV) and occlusion-derived virus (ODV). Budded viruses, as the name implies, are virions which have budded from infected cells to form rod-shaped virions with helical nucleocapsids measuring 30–60 nm in diameter by 250–300 nm in length, surrounded by an envelope derived from the plasma membrane and containing the major envelope protein GP64 or F. The ODV has an almost identical nucleocapsid and is surrounded by an inner nuclear membrane-derived envelope and, as the name implies, is occluded in an occlusion body, made of either polyhedrin (nucleopolyhedroviruses) or granulin (granuloviruses, GVs). 
The circular dsDNA genome of a given baculovirus is identical between the budded and occlusion derived forms and ranges in size from 82 (NeleNPV) to 179 kb (XecnGV), encoding 81–183 ORFs. Unsurprisingly the baculovirus phylogeny, based on 38 core baculovirus genes, matches that of the taxonomy, with the deltabaculoviruses and gammabaculoviruses being more ancient and the alphabaculoviruses being divided into two clades representing Groups I and II, respectively. During infection, the polyhedra or granules disassemble in the alkaline insect gut, releasing the ODVs which pass through the peritrophic membrane of the gut lumen, aided by metalloprotease enhancins. Virion nucleocapsids enter the microvilli of midgut epithelial cells via membrane fusion, assisted by a complex of per os infectivity factors. In contrast to the ODVs, BVs infecting other cells of the host or tissue culture cells enter via receptor-mediated endocytosis. In both cases the nucleocapsids are released into the cytoplasm of the cells to initiate infection. Nucleocapsids are transported to the nuclei where they become uncoated and initiate a temporally regulated cascade of immediate early, delayed early, late and very late virus gene transcription. Following nuclear hypertrophy, viral DNA replication and viral protein synthesis, nucleocapsids begin to be formed within the nucleus. Early in infection, nucleocapsids escape the ring zone in the nucleus and, through actin-mediated transport, traverse the cytoplasm to the plasma membrane, via microtubules, collecting at so-called budding sites where the major envelope protein GP64 (or F for some baculoviruses) has formed foci, in association with another viral protein, ME53. GV envelopes do not contain a fusion protein, suggesting that budded GVs might not play a role in vivo. 
As the nucleocapsids bud through the plasma membrane they are enveloped by the membrane and are released as BVs which spread through the hemocoel or trachea to other tissues, or cell to cell transmission in tissue cultures. Late in infection, and in the same cells producing BVs, the nucleocapsids remain in the nucleus. They become enveloped by a membrane derived from the inner nuclear envelope which contains, among other viral proteins, the per os infectivity factors forming the occlusion-derived viruses. The ODVs are then assembled within polyhedra or granules based on the major occlusion body protein, polyhedrin (alpha, delta and gamma-baculoviruses) or granulin (betabaculovirus).

An Introduction to Viruses of Invertebrates

At the end of the infection the concerted activity of viral chitinase and cathepsin enzymes results in the liquefaction of the larvae and dissemination of the polyhedra or granules in the environment to infect other insects. This dissemination is assisted by a curious baculovirus-induced modification of behavior in which the infected larvae migrate to the tops of the plants they are on prior to death, a condition called “Wipfelkrankheit”. As noted earlier, baculoviruses have also been developed as high-level protein expression systems, now widely commercialized and used for the development of bioactive reagents. This was a consequence of fundamental research completely unrelated to their use as biological control agents. At regular seminars held on Friday afternoons in the early 1980s at the Chicken Oil Company in College Station, Texas, home of Texas A&M University, Gale Smith and others from the Max Summers lab noted that polyhedrin was the major protein produced in baculovirus-infected cells, amounting to as much as 50% of the total protein. This led to a more in-depth characterization of the polyhedrin promoter by Gale Smith, Max Summers and Malcolm Fraser in 1983 (in the absence of sequencing), showing that sequences upstream of the polyhedrin ORF did contain a powerful polyhedrin promoter and were capable of directing expression of human β interferon. This promoter was the basis for developing the baculovirus expression vector system (BEVS). The system has since been expanded to utilize the p10 promoter, another highly expressed late promoter, and others such as the orf46 promoter of SeMNPV. BEVS has led to the production of literally thousands of different proteins, some of pharmacological and diagnostic importance, such as specific antibodies, immunotherapeutic agents, virus-like particles and vaccines. Baculoviruses are also being considered as gene therapy vectors, with some encouraging early results.
For example, baculoviruses expressing a Shiga toxin A subunit known to kill cancer cells successfully reduced tumors in mice with MCF7-induced breast cancer after injection into the tumor. This author was not surprised to learn that BEVS was being used to generate diagnostic reagents and vaccines to address the 2019/2021 COVID-19 pandemic.

Bidnaviridae

Family: Bidnaviridae [ICTV 2011] (Order Poliovirales, ICTV 2008)
Genus: Bidensovirus [ICTV 2011] (1 species)
Type Species: Bombyx mori bidensovirus [ICTV 2011]
Species Exemplar: Bombyx mori bidensovirus 3 (VD1)

Bidensoviruses, which cause a densonucleosis in infected silkworm larvae, were originally classified as parvoviruses because of their single-stranded linear DNAs encapsidated separately into small icosahedral capsids. However, bidensoviruses BmBDV-2 and BmBDV-3 are quite different from other densoviruses in that they have a bisegmented ssDNA genome that in aggregate (13 kb) is larger than the 4–6 kb of densoviruses, encode a DNA polymerase but no densovirus ORFs, have a different transcription strategy and share no sequence identity with densovirus genomes. Consequently, in 2012 they were reclassified in the new family Bidnaviridae, genus Bidensovirus, type species Bombyx mori bidensovirus. Bidensoviruses are naked, 20–24 nm diameter icosahedral viruses with a bisegmented linear ssDNA genome (6.5 kb VD1 and 6 kb VD2) with ITRs, in which the complementary strands of each segment are separately encapsidated, giving 4 virion types. Both strands of both segments generate transcripts expressing different proteins and hence the genome has an ambisense organization. To date, bidensoviruses have been isolated only from silkworm and hence the three isolates (Yamanashi, Zhenjiang and Indian) are referred to as BmBDVs. In terms of evolution, bidensoviruses are quite unusual since different parts of the genome appear to originate from different sources. Based on bioinformatic analysis, the BmBDV NS1 and VP may have been derived from parvoviruses, the DNApolB might have evolved from the hypothetical ancestral polintoviruses, P133 may have come from an insect reovirus and NS3 from granuloviruses. One strand of ambisense VD1 encodes transcripts for nonstructural proteins NS2 and NS1 and structural protein VP, while the other strand encodes transcripts for protein-primed PolB.
Though the ORFs for NS2 and NS1 overlap, two different transcripts, with alternative initiation sites, are made. Leaky scanning of the VP ORF results in two VPs. One strand of ambisense VD2 encodes transcripts for the nonstructural protein NS3 and the other strand encodes the structural protein P133. NS1 has DNA binding, ATPase and helicase activities, suggesting it is a multifunctional protein important in virus replication. The function of NS2 is still unknown, though it shares slight sequence similarity with dnaA, a chromosomal replication initiator protein, and with a DNA-binding response regulator. A homolog of NS3 in a densovirus is important for viral DNA replication. The VPs are the major structural proteins, which can self-assemble if expressed in a baculovirus expression system, while P133 is a minor structural protein. Interestingly, P133 is homologous to the outer capsid protein of cypoviruses, members of the Reoviridae family. PolB is a protein-primed DNA polymerase that appears to be processed by post-translational cleavage into three peptides with sizes of 53 (detected in purified virions), 70 and 120 kDa. Infection of silkworm intestinal epithelial cells appears to be by receptor-mediated endocytosis through attachment of the virion VPs to NSD-2, an amino acid transporter protein that serves as the cell receptor. During infection there is nuclear hypertrophy with the appearance, late in infection, of a dense homogeneous structure in the nuclei and, concomitantly, vacuolization of mitochondria and endoplasmic reticulum and formation of a large autophagosome. The dead columnar epithelial cells are sloughed off, along with the released viruses, into bead-like frass. The mode of replication of BmBDV DNA is unknown, though the absence of replicative intermediates typical of parvovirus DNA replication seems to discount rolling circle replication.
Rather, the protein-primed nature of the BmBDV DNApolB and the presence of ITRs at the genome ends which could form a panhandle, suggest replication of BmBDV DNA might occur by protein priming, as is well described for replication of adenovirus DNA.

Infection of early instars lengthens the time between instars, while infection of late instars can still result in pupation and adult emergence, but the weight and survival of pupae are compromised.

Birnaviruses of Invertebrates (Entomobirnavirus)

Family: Birnaviridae [ICTV 1984]
Genus: Entomobirnavirus [ICTV 1993] (2 species)
Type Species: Drosophila X virus [ICTV 1991]
Species Exemplar: Drosophila X virus

Viruses belonging to three genera of the family Birnaviridae infect non-mammalian vertebrates. The fourth genus, Entomobirnavirus, comprises the entomobirnaviruses, which infect insects exclusively. The first was originally discovered as a contaminant in a fruit fly cell culture, and at least four other birnaviruses have also been found in free-living mosquitoes and drosophila. One birnavirus recovered in a mosquito cell line, Espirito Santo virus, was found in a sample from a Dengue-3 virus-infected patient from Brazil. Like other birnaviruses, they have icosahedral T = 13 symmetry and a dsRNA genome which, as the name implies, is bisegmented. One segment is about 3.2 kb and the other about 3.4 kb in length, with VPg, a virion protein derived from VP1 and linked to the RNA genome, covalently attached at the 5′ end of the positive sense strand of each ds genomic RNA. The 3.4 kb segment A encodes a 114 kDa polyprotein precursor which is proteolytically cleaved by the VP4 protease into structural capsid proteins in the order, on the polyprotein, preVP2, VP4 and VP3. A fourth VP, VP5, is encoded in frame with VP4. In addition to being a capsid protein, VP3 suppresses antiviral RNAi. The 3.2 kb segment B encodes the RdRp. Little is known of the entomobirnavirus replication cycle, but it is likely that genome replication and transcription are similar to those of other birnaviruses, covered elsewhere in the encyclopedia. Entomobirnaviruses cause a cytopathic effect in infected insect cells, including cell aggregation and ending in cell death. Drosophila X virus can cause a persistent infection, often resulting in oxygen starvation and death.

Dicistroviridae

Family: Dicistroviridae [ICTV 2002] (Order Picornavirales, ICTV 2008)
Genus: Aparavirus [ICTV 2009] (6 species)
Type Species: Acute bee paralysis virus [ICTV 2004]
Species Exemplar: Acute bee paralysis virus
Genus: Cripavirus [ICTV 2002] (4 species)
Type Species: Cricket paralysis virus [ICTV 1982]
Species Exemplar: Cricket paralysis virus
Genus: Triatovirus [ICTV 2015] (5 species)
Type Species: Triatoma virus [ICTV 2002]
Species Exemplar: Triatoma virus

Dicistroviruses are a group of highly stable +ve sense ssRNA viruses which, unlike other viruses in the Picornavirales, have a bicistronic genome encoding two ORFs separated by an intergenic internal ribosome entry site (IRES), a characteristic which led to their name. They are 30 nm diameter icosahedral viruses with pseudo T = 3 symmetry, composed of 4 viral proteins, VP1, 2, 3 and 4. As a group, dicistroviruses have a broad host range including many insects, such as bees, ants, crickets, aphids, drosophilids, lepidoptera and triatomines, the vectors of Chagas disease, and other arthropods such as shrimp and crabs. Thus, they are of great economic and medical importance, and some dicistroviruses have been touted as biological control agents against their pest hosts. The 8.5–10 kb +ve sense ssRNA genome has a VPg covalently linked to the 5′ end and contains two ORFs, ORF1 and ORF2. ORF1 is preceded by a 5′ IRES which, along with the intergenic IRES between the two ORFs and a 3′ polyA tail, allows for translation of both ORFs in a cap-independent manner. ORF1 encodes nonstructural proteins including a helicase, the VPg, a protease and the RNA-dependent RNA polymerase (RdRp). ORF2 encodes the four structural proteins VP1, 2, 3 and 4. While the receptor for attachment to cells is still unknown, dicistroviruses enter cells via clathrin-mediated endocytosis.
The rest of the replication cycle is similar to that of the picornaviruses, including the synthesis of the ORF1 and ORF2 polyproteins, which are autoproteolytically cleaved into the smaller functional proteins. The VPg is uridylated at a tyrosine residue by the viral RdRp, allowing it to serve as a primer for replication of both the -ve and +ve sense strands. Most dicistroviruses infect mainly the gut and shed virus into the gut lumen, from which it is eliminated in the feces, enabling efficient transmission of the virus. However, dicistroviruses infecting hymenoptera infect all developmental stages and replicate in most tissues. Infection is associated with nutritional stress and in bees can be manifested in neurological signs such as paralysis, trembling and flightlessness, and ultimately death. Infections due to dicistroviruses have been associated, along with other factors, with the worldwide colony collapse disorder of beehives in 2006–2007 and continue to plague the honey industry. Dicistroviruses also cause losses in the crab and shrimp industries, being the causative agents of sleeping disease in crabs, due to Mud crab virus, and Taura syndrome in penaeid shrimp, due to Taura syndrome virus. Many dicistroviruses are transmitted horizontally through the fecal-oral route, while in hymenoptera virus transmission can be food-borne, venereal, vector-borne and vertical.

Hytrosaviridae

Family: Hytrosaviridae [ICTV 2011]
Genus: Glossinavirus [ICTV 2011] (1 species)
Type Species: Glossina hytrosavirus [ICTV 2011]
Species Exemplar: Glossina pallidipes salivary gland hypertrophy virus (Uganda)
Genus: Muscavirus [ICTV 2011] (1 species)
Type Species: Musca hytrosavirus [ICTV 2011]
Species Exemplar: Musca domestica salivary gland hypertrophy virus

Hytrosaviruses are large dsDNA viruses of diptera, mostly tsetse flies, houseflies and syrphid flies. Though the symptoms of salivary gland hypertrophy in tsetse flies were first described in the 1930s, the etiological viral agent, appearing as virus-like rods, was not described until 1978. In hytrosavirus-infected insects the salivary glands swell to up to four times their normal size, which is the origin of the name salivary gland hypertrophy virus. They are now classified in the family Hytrosaviridae, which has two genera, Glossinavirus and Muscavirus, each named after the host of the virus in the genus. Each genus has a single species, which is the type species: Glossina hytrosavirus in genus Glossinavirus and Musca hytrosavirus in genus Muscavirus. Though not always lethal, infection is associated with reproductive dysfunctions including infertility and distorted mating behaviors. The Glossina hytrosavirus, Glossina pallidipes salivary gland hypertrophy virus, is particularly problematic in rearing tsetse flies, the vector of African sleeping sickness, for the very effective sterile male release program using the sterile insect technique (SIT). Once in a colony, the hytrosavirus causes the colony to collapse, as happened in the tsetse fly culture at Seibersdorf, Austria. Virions are rod-shaped with a diameter of 50–65 nm and a length of 500–1000 nm, with a bilipid membrane envelope and circular dsDNA genomes 124–190 kb in length.
They are not only structurally similar to other large dsDNA invertebrate viruses like baculoviruses, nudiviruses and nimaviruses, but they also share about 12 of the 38 core genes common to them. Little is known of their replication cycle. Infection appears to occur through the tracheal system. Viral capsids are released into the cytoplasm through membrane fusion. Through the cell’s microtubule system, the capsids move to the nucleus to initiate transcription of immediate early, early and late genes. More detailed studies of these early stages are hampered by the lack of a tractable cell culture system. Nucleocapsids can be observed leaving the nucleus through nuclear pores and exiting the cell by budding through the membrane of cells bordering the salivary gland lumen. Transmission of Glossina hytrosavirus in tsetse flies is both horizontal, through feeding on contaminated blood, and vertical, through trans-ovum transmission or through infected milk. Transmission of Musca hytrosavirus is largely horizontal, through gregarious feeding of the houseflies.

Iflaviridae

Family: Iflaviridae [ICTV 2011] (Order: Picornavirales, ICTV 2008)
Genus: Iflavirus [ICTV 2008] (15 species)
Type Species: Infectious flacherie virus [ICTV 2002]
Species Exemplar: Infectious flacherie virus

Iflaviruses are small icosahedral viruses 22–30 nm in diameter with a monopartite 8–10.5 kb +ve ssRNA genome with a VPg at the 5′ end and a templated polyA tail at the 3′ end. They infect only arthropods, namely insects, including honeybees, lepidoptera, aphids and planthoppers, and arachnids, such as mites parasitic on honeybees. The genus Iflavirus, numbering 15 species to date, includes the type species Infectious flacherie virus, represented by the silkworm virus Infectious flacherie virus, from which the genus and species names are derived. Their cellular replication cycle is thought to be similar to that of other viruses in the Picornavirales, including cap-independent translation due to an IRES 5′ to the polyprotein ORF and autocatalytic cleavage of the polyprotein into structural proteins and the RdRp. Iflaviruses are transmitted largely through contaminated food and often cause a permissive infection, sometimes resulting in vertical transmission. Some symptomatic infections do occur, resulting in diarrhea, developmental abnormalities and death of the host. Iflavirus infections of honeybees have been linked, along with other bee viruses, to colony collapse disorders, and as such they are of economic importance.

Iridoviruses of Invertebrates (Betairidovirinae)

Family: Iridoviridae [ICTV 1975] (Order: Pimascovirales, ICTV 2019)
Subfamily: Betairidovirinae [ICTV 2016]
Genus: Chloriridovirus [ICTV 1981] (5 species)
Type Species: Invertebrate iridescent virus 3 [ICTV 1999]
Species Exemplar: Mosquito iridescent virus
Genus: Decapodiridovirus [ICTV 2018] (1 species)
Type Species: Decapod iridescent virus 1 [ICTV 2018]
Species Exemplar: Shrimp hemocyte iridescent virus
Genus: Iridovirus [ICTV 1971] (2 species)
Type Species: Invertebrate iridescent virus 6 [ICTV 1999]
Species Exemplar: Chilo iridescent virus

The Iridoviridae family represents large icosahedral viruses classified into two subfamilies. Viruses in the subfamily Alphairidovirinae infect ectothermic vertebrates including bony fishes, amphibians and reptiles. The second subfamily, Betairidovirinae, includes all iridescent viruses of invertebrates. It has three genera: Chloriridovirus with five species, Decapodiridovirus with one species and the original genus Iridovirus, with two species. The invertebrate iridescent viruses are among the largest icosahedral viruses, with diameters up to 200 nm and symmetries of T = 147 to 217. Though they have an icosahedral structure, they can be more complex, with an internal membrane enveloping the nucleoprotein core, which contains a linear dsDNA genome of 163–220 kb that is circularly permuted with terminal redundancies. While some are naked, iridoviruses that bud from infected cells have an additional membrane surrounding them. They have a fairly broad host range, infecting 6 orders of insects, including mosquitoes and lepidoptera, crustaceans such as isopods and decapods (decapodiridoviruses), molluscs such as oysters, and even some vertebrates. For example, irido-like viruses cause gill necrosis and oyster velar virus disease in oysters. The viruses are highly stable in water but thermolabile and sensitive to dry conditions. Viruses enter susceptible cells via clathrin-mediated endocytosis, in the case of IIV-6 using ORF 096L as the virus attachment molecule. Transcription is temporally regulated into immediate early and delayed early phases and a viral RNA polymerase II-mediated late phase. Genome-size or slightly longer DNA molecules are first generated in the nucleus and, after transport to the cytoplasm, form concatemeric DNA, facilitated by their terminal redundancy, and are methylated. The concatemeric viral DNA is packaged by a “headful” process in which slightly longer than genome-length DNA is packaged into virions as circularly permuted, terminally redundant molecules.
Virions are released from infected cells by cell lysis or by budding from the host cell membrane. Like many DNA viruses, iridoviruses code for anti-apoptotic proteins, like IIV-6 ORF193R, that are expressed early to protect the virus from the cells’ apoptotic machinery. However, late in infection they also induce apoptosis, mediated by viral proteins like the Chilo iridescent virus 389L protein (iridoptin), which aids in viral dissemination to adjacent cells. Transmission between insects can be peroral or by endoparasitic hymenoptera. Infection can be patent, with an accumulation of high numbers of virions giving rise to a visible iridescence, from which the viruses originally got their name. Patent infection is normally lethal, whereas covert infection results in only a low level of virus particles but still reduces host fitness.
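
The headful mechanism described for iridovirus DNA packaging can be illustrated with a minimal sketch. It assumes an idealized head-to-tail concatemer and a fixed headful length. The function name and the toy lengths (a 100-unit genome and a 110-unit headful) are hypothetical, chosen only to keep the arithmetic visible, and are not measurements from any iridovirus.

```python
def headful_packaging(genome_len, headful_len, n_virions):
    """Cut successive headfuls from an idealized head-to-tail concatemer.

    Returns one (start, end, redundancy) tuple per packaged molecule.
    start and end are positions on the unit-length genome map (0-based);
    redundancy is the length of the sequence repeated at both ends.
    """
    if headful_len <= genome_len:
        raise ValueError("a headful must be longer than one genome length")
    redundancy = headful_len - genome_len
    molecules = []
    position = 0  # running coordinate along the concatemer
    for _ in range(n_virions):
        start = position % genome_len
        end = (position + headful_len) % genome_len
        molecules.append((start, end, redundancy))
        position += headful_len  # the next cut begins where this one ended
    return molecules


if __name__ == "__main__":
    # Hypothetical toy lengths, not measured values.
    for start, end, redundancy in headful_packaging(100, 110, 4):
        print(f"starts at {start}, ends at {end}, terminal repeat of {redundancy}")
```

With these toy values each successive molecule starts 10 units further along the genome map than the one before and carries a 10-unit repeat at its ends, which is what circular permutation with terminal redundancy amounts to.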

Malacoherpesviridae

Family: Malacoherpesviridae [ICTV 2008] (Order: Herpesvirales, ICTV 2008)
Genus: Aurivirus [ICTV 2012] (1 species)
Type Species: Haliotid herpesvirus 1 [ICTV 2012]
Species Exemplar: Haliotid herpesvirus 1
Genus: Ostreavirus [ICTV 2008] (1 species)
Type Species: Ostreid herpesvirus 1 [ICTV 2008]
Species Exemplar: Ostreid herpesvirus 1

The malacoherpesviruses are herpesviruses that infect a variety of marine invertebrates, notably molluscs like oysters (superfamily Ostreoidea) and abalones (sea snails in the family Haliotidae), and have a global economic impact, particularly in hatcheries. The oyster herpesvirus, OsHV-1, was the first malacoherpesvirus discovered, in adult Eastern oysters, Crassostrea virginica, in 1972. It was also found to be responsible for mass mortality in Pacific oyster (C. gigas), European flat oyster (Ostrea edulis) and other oyster species. Disease outbreaks due to a second malacoherpesvirus, Haliotid herpesvirus 1 (HaHV-1), were also discovered in a variety of abalone species such as Haliotis diversicolor supertexta, H. laevigata, H. rubra and others, beginning in 2005. The malacoherpesviruses have a typical herpesvirus morphology consisting of a 70–80 nm diameter icosahedral capsid with T = 16, surrounded by a trilaminar unit membrane, giving an overall diameter of about 100–180 nm. However, the malacoherpesviruses lack the tegument usually associated with herpesviruses and instead have a 5 nm translucent gap between the capsid and outer viral membrane, filled with fine fibrils. The linear dsDNA genomes of only two malacoherpesviruses have been sequenced: the ostreavirus OsHV-1 genome with 207,439 bp (NC_005881) and the aurivirus abalone herpesvirus Victoria/AUS/2009 genome with 211,518 bp (NC_018874). Both genomes are structured as in other herpesviruses, in the order TRL-UL-IRL-X-IRS-US-TRS.
For example, the OsHV-1 genome is organized into a 167,843 bp unique long (UL) region and a 3,370 bp unique short (US) region, each of which can occur in two orientations and each of which is flanked by inverted repeats. UL is flanked by the 7,584 bp TRL and IRL, and US is flanked by the 9,774 bp TRS and IRS. However, the malacoherpesviruses differ from the amniote herpesviruses in the family Herpesviridae and the herpesviruses of fishes and amphibians in the family Alloherpesviridae in that they infect only invertebrate hosts and their genomes are phylogenetically distinct. Consequently, during the re-examination of the order Herpesvirales they were classified in 2008 in a new family, Malacoherpesviridae, with two genera: Ostreavirus for the oyster herpesviruses and Aurivirus for the abalone herpesviruses. The prefix “malaco” in the family name is derived from the Greek malakos, meaning soft, a term used to denote molluscs.
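
The text gives a length for every OsHV-1 region in the TRL-UL-IRL-X-IRS-US-TRS arrangement except X. As a quick consistency check, the sketch below sums the six quoted regions and reports how many base pairs remain for X once a total genome length is supplied. The constant and function names are hypothetical, and treating the whole remainder as X is an assumption of this sketch rather than a statement from the source.

```python
# Region lengths in bp as quoted in the text for OsHV-1 (X is not given).
OSHV1_REGIONS = [
    ("TRL", 7_584),
    ("UL", 167_843),
    ("IRL", 7_584),
    ("IRS", 9_774),
    ("US", 3_370),
    ("TRS", 9_774),
]


def named_region_total(regions=OSHV1_REGIONS):
    """Sum of the regions whose lengths are stated explicitly."""
    return sum(length for _, length in regions)


def remainder_for_x(genome_bp, regions=OSHV1_REGIONS):
    """Base pairs left over after the named regions (assumed here to be X)."""
    return genome_bp - named_region_total(regions)


if __name__ == "__main__":
    print("named regions:", named_region_total(), "bp")
```

Calling `remainder_for_x` with the genome length quoted for OsHV-1 gives the size this bookkeeping leaves for the X region.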

Mesoniviridae

Family: Mesoniviridae [ICTV 2012] (Order: Nidovirales, ICTV 1996)
Subfamily: Hexponivirinae [ICTV 2018]
Genus: Alphamesonivirus [ICTV 2002] (9 species)
Subgenus: Namcalivirus [ICTV 2018]
Type Species: Alphamesonivirus 1 [ICTV 2012]
Species Exemplar: Cavally virus

Mesoniviruses are 50–120 nm diameter enveloped mosquito viruses with a monopartite 20 kb +ve sense ssRNA genome with a 5′ cap and 3′ polyA tail. For such a small group of viruses, very similar in morphology and with a relatively tight nucleic acid phylogeny, mesoniviruses have a complicated taxonomy: a suborder Mesnidovirineae of the order Nidovirales, family Mesoniviridae, subfamily Hexponivirinae and one genus, Alphamesonivirus, with the type species Alphamesonivirus 1. The genus is further subdivided into eight subgenera. The subgenera Karsalivirus and Namcalivirus each have two species, while each of the other six subgenera has only one species. The genome consists of seven ORFs. The ORFs for the replicative enzymes, ORF1a and ORF1b, overlap and are translated by a ribosomal frame shift, facilitated by overlapping GGAUUUU sequences. ORF3a and 3b might similarly be translated by a ribosomal frame shift. ORF1a encodes a viral protease for autocatalytic processing of pp1a and pp1ab into smaller proteins. ORF1a and ORF1b together encode a polyprotein that is processed by autocatalytic cleavage into the RdRp, helicase, exoribonuclease and two methyltransferases. The ORFs downstream of ORF1b encode structural proteins: two spike protein subunits, S1 and S2, the nucleocapsid protein N and four membrane-spanning proteins (M). These are translated from two subgenomic RNAs, the synthesis of which is regulated by transcription regulating sequences (TRSs). The envelope consists of protruding spike-like projections and an integral membrane protein anchoring it to the nucleoprotein core. To date, mesoniviruses have been identified only in mosquitoes, apart from one report of a mesonivirus from aphids based on next generation sequencing. While they replicate in some insect cells, they could not replicate in several mammalian and avian cell lines studied. Infection results in a variety of cytopathic effects, associated with a high and rapid yield of progeny viruses.
Some mesoniviruses are maintained in the population by vertical transmission from female mosquitoes to their progeny, including male offspring. Horizontal transmission may occur via contaminated legs, wings and even saliva from naturally infected mosquitoes.
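
The ribosomal frame shift at the GGAUUUU sequence mentioned for ORF1a and ORF1b can be pictured with a small sketch. It locates the heptamer in an RNA string and then reads codons in the original frame up to a chosen point before stepping back one nucleotide, assuming a −1 shift. The function names, the toy sequence and the placement of the reading frame relative to the heptamer are all hypothetical and are not taken from any mesonivirus genome.

```python
SLIPPERY = "GGAUUUU"  # heptamer quoted in the text


def find_slippery_sites(rna, motif=SLIPPERY):
    """Return 0-based start indices of every match, overlapping ones included."""
    rna = rna.upper().replace("T", "U")
    sites = []
    i = rna.find(motif)
    while i != -1:
        sites.append(i)
        i = rna.find(motif, i + 1)
    return sites


def codons(seq):
    """Split seq into complete triplets, dropping any trailing remainder."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def read_with_minus1_shift(rna, shift_at):
    """Codons read in the original frame up to shift_at, then in the -1 frame.

    shift_at is the index of the nucleotide that would be read next in the
    original frame; after the slip, reading resumes one nucleotide earlier.
    """
    return codons(rna[:shift_at]) + codons(rna[shift_at - 1:])


if __name__ == "__main__":
    toy = "AUGGGAUUUUAGCCAUAA"  # hypothetical sequence, frame chosen by hand
    site = find_slippery_sites(toy)[0]
    print("unshifted:", codons(toy))
    print("shifted:  ", read_with_minus1_shift(toy, site + 6))
```

In this toy sequence the unshifted frame meets a UAG stop codon immediately after the heptamer, whereas the shifted reading continues past it, which is the general way a frame shift lets a downstream ORF be translated as an extension of the upstream one.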

Nimaviridae

Family: Nimaviridae [ICTV 2002]
Genus: Whispovirus [ICTV 2002] (1 species)
Type Species: White spot syndrome virus [ICTV 2008]
Species Exemplar: White spot syndrome virus-CN

White spot syndrome virus is the only species in the Nimaviridae family. Variants of White spot syndrome virus (WSSV), the only virus in the species, cause a highly contagious disease, White spot disease (WSD), after which the virus was named. The first WSD outbreak was in China and Taiwan around 1992, and the disease spread to other shrimp farming regions such as Japan, Thailand, the Americas, France and the Middle East. The virus has a broad host range among crustaceans, including shrimp, crayfish, crab and even lobster, resulting in large economic losses in the related industries. Originally, because of the baculovirus-like rod shape of the nucleocapsid and the large circular dsDNA genome, the earlier names of these viruses included Chinese baculo-like virus, Penaeus monodon baculovirus and white spot baculovirus. The names were changed to WSSV followed by letters designating geographic origin, and the viruses were first classified as the species White spot syndrome virus 1 in the family Baculoviridae. However, due to significant differences in morphology and phylogeny between these viruses and baculoviruses, they were reassigned to a new family, Nimaviridae (named for the threadlike tail, “nima” in Latin, of the virions), with a single genus, Whispovirus. The genus Whispovirus derives its name from the sole species, White spot syndrome virus, with White spot syndrome virus-CN (WSSV-CN) as the exemplar isolate. This species is represented by 15 other isolates whose genomes have been completely sequenced, including WSSVs from China, Japan, Taiwan, Thailand, Mexico, Australia, India, Brazil and Ecuador.
WSSVs are large enveloped virions, 70–170 nm in diameter by 210–320 nm in length, with a thread-like extension and an intermediate tegument layer, containing a 65 × 300 nm rod-shaped nucleocapsid reminiscent of baculovirus nucleocapsids. In addition to about 50 structural proteins, the virus also has lipid components. The large circular dsDNA genome is 281–309 kb in size and has nine internal homologous repeats similar to the hrs of baculovirus genomes. Only about 150 of the predicted 530 ORFs are thought to be functional. The WSSV genome also encodes some microRNAs targeting both viral and host genes. While there is some genotypic variability among the isolates, all are considered to be from the same species. WSSV enters cells by caveola- or clathrin-mediated endocytosis and alters host metabolism. Transcription is temporally regulated and divided into immediate early, early, late and very late phases. While most mRNAs have 3′ polyA tails and 5′ caps, there is no evidence of splicing, but some have IRESs which could be used to regulate translation. Like baculoviruses, the early genes have host-like RNApolII promoters, while late genes have a late gene promoter. Replication takes place entirely in the nucleus, including de novo formation of the intranuclear viral envelope and morphogenesis of nucleocapsids. The virus appears to target stomach, gills, lymphoid organs and connective tissue but not hepatopancreas and midgut, both of endodermal origin. Signs of infection include lethargy, a reddish coloration of the body and white spots, from which the virus derives its name.

Nodaviridae

Family: Nodaviridae [ICTV 1981] (Order Nodamuvirales, ICTV 2019)
Genus: Alphanodavirus [ICTV 1997] (5 species)
Type Species: Nodamura virus [ICTV 1981]
Species Exemplar: Black beetle virus

Alphanodaviruses replicate exclusively in insects, notably wax moth larvae, grass grubs, tsetse fly, reduviid bugs and mosquitoes. Viruses from genus Betanodavirus, the second of two genera in the family, infect fish. In 1975, the black beetle virus, from black beetles (Heteronychus arator) in New Zealand, was the first nodavirus discovered. The virions are icosahedral with T = 3 symmetry and diameters of 32–33 nm, and consist of a single capsid protein (CP) or its two cleavage products. The Flock house virus CP is 44 kDa and the two cleavage products β and γ are 39 and 4 kDa in size, respectively. The +ve ssRNA genome is bisegmented, with segments of about 3.1 and 1.4 kb. Following entry into the cytoplasm, three mRNAs are produced: RNA1, which codes for the catalytic subunit of RdRp, RNA2, which codes for CP, and RNA3, which codes for B1 and B2, which function as suppressors of RNAi. Infection of hosts usually results in stunting, paralysis and death. Complete genome sequences have been deposited for black beetle virus, Boolarra virus, five Flock House viruses, two Nodamura viruses and one Pariacoto virus.
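
The triangulation number T quoted for icosahedral capsids, such as the value given here for nodaviruses, follows the Caspar and Klug relation T = h² + hk + k² for non-negative integers h and k, and an idealized capsid of triangulation number T is built from 60T subunits. The sketch below, with hypothetical function names, lists the (h, k) pairs that produce a chosen T and the corresponding subunit count. It is a geometric illustration only and says nothing about how a particular virus arranges its proteins.

```python
def triangulation_pairs(t_number, max_index=30):
    """All (h, k) with h >= k >= 0 and h*h + h*k + k*k == t_number."""
    pairs = []
    for h in range(max_index + 1):
        for k in range(h + 1):
            if h * h + h * k + k * k == t_number:
                pairs.append((h, k))
    return pairs


def subunit_count(t_number):
    """Number of subunits in an idealized capsid of triangulation number T."""
    return 60 * t_number


if __name__ == "__main__":
    # T values that appear in this chapter.
    for t in (3, 16, 147, 217):
        print(t, triangulation_pairs(t), subunit_count(t))
```

For T = 3 this gives 180 subunits, consistent with a capsid assembled from a single capsid protein.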

Nudiviridae

Family: Nudiviridae [ICTV 2013]
Genus: Alphanudivirus [ICTV 2013] (2 species)
Type Species: Oryctes rhinoceros nudivirus [ICTV 2013]
Species Exemplar: Oryctes rhinoceros nudivirus (Ma07)
Genus: Betanudivirus [ICTV 2013] (1 species)
Type Species: Heliothis zea nudivirus [ICTV 2013]
Species Exemplar: Heliothis zea nudivirus (1)

The first nudivirus recognized as a virus is probably the “non-occluded baculovirus” that infected the Rhinoceros beetle in Samoa and which was used in 1967 as an effective biological control agent against this beetle. Since then it has had a convoluted taxonomic history, having originally been classified as a baculovirus even though it did not form occlusion bodies like other baculoviruses. On the basis of phylogenetic analysis, the nudiviruses are now classified in their own family, Nudiviridae, organized into two genera: Alphanudivirus, with two species including Oryctes rhinoceros nudivirus as the type species, and Betanudivirus, with Heliothis zea nudivirus as the sole member and type species. Nudivirus virions are enveloped and have a rod-shaped nucleocapsid and a circular dsDNA genome of 96–232 kb, much like that of baculoviruses. The alphanudiviruses are more rounded in shape, measuring 100 × 200 nm, while the betanudiviruses are more rod-shaped, measuring 80 × 300–400 nm. They differed initially from baculoviruses in that they were non-occluded, or naked, resulting in their current taxonomic designation as “nudi” viruses. However, some nudiviruses have enveloped or non-enveloped occlusion bodies, which in some cases are facultative. They infect many orders of insects and are a major problem for the shrimp industry. Not much is known about their replication cycle. OrNV attaches to cells and is taken in by pinocytosis. It is released into the cytoplasm upon fusion with lysosomal membranes and degeneration of the lysosomes.
Presumably the nucleocapsids are transported to the nucleus, where the DNA is released, initiating replication around 7 h post infection. Transcription is temporally regulated into early, intermediate and late stages, with at least 100 transcripts being expressed. Viral envelopes, nucleocapsids and episomally replicated viral DNA assemble in the nucleus, and the virions are released into the cytoplasm and then bud through the plasma membrane. Heliothis zea nudivirus 1 (HzNV-1) was originally found in a persistently infected cell line, IMC Hz-1. HzNV-1 causes gonadal atrophy in adult Galleria mellonella. In cells, HzNV-1 causes both a productive (lytic) infection, resulting in a high titer of virus and cell death, and, in some cells, a persistent (latent) infection. Persistently infected cells can be passaged without cell lysis, and viral DNA can be detected in episomal or integrated form. Two viral genes appear to be correlated with the switch between productive and latent infections. When HzNV-1 hh1 is expressed at a high level in latently infected cells, a lytic infection ensues. However, if pag1-encoded miRNAs targeting hh1 are produced, a latent infection occurs. As mentioned, nudiviruses have a broad host range in insects and shrimp. Transmission is peroral, through feeding on feces containing virus shed from infected midgut epithelia, and parenteral, through mating. Consequences of infection can include reduced size, crippling, reduced fecundity, host sterility and death by fat body disintegration of infected individuals.

Nyamiviridae
Family: Nyamiviridae [2013] (Order Mononegavirales, ICTV 1990)
Genus: Berhavirus [ICTV 2018] (3 species)
Type Species: Sipunculid berhavirus [ICTV 2018]
Species Exemplar: Běihǎi rhabdo-like virus 4
Genus: Crustavirus [ICTV 2015] (3 species)
Type Species: Wenzhou crustavirus [ICTV 2015]
Species Exemplar: Běihǎi rhabdo-like virus 6
Genus: Nyavirus [ICTV 2013] (3 species)
Type Species: Nyamanini nyavirus [ICTV 2013]
Species Exemplar: Sierra Nevada virus
Genus: Orinovirus [ICTV 2018] (1 species)
Type Species: Orinoco orinovirus [ICTV 2018]
Species Exemplar: Orinoco virus
Genus: Socyvirus [ICTV 2015] (1 species)
Type Species: Soybean cyst nematode socyvirus [ICTV 2013]
Species Exemplar: Soybean cyst nematode virus 1
Genus: Tapwovirus [ICTV 2018] (1 species)
Type Species: Tapeworm tapwovirus [ICTV 2018]
Species Exemplar: Wēnzhōu tapeworm virus 1

Nyamiviruses, being in the Order Mononegavirales, have a non-segmented -ve ssRNA genome ranging from 8 to 12 kb in size, with the exception of Tapwovirus, which has a bi-segmented -ve ssRNA genome with an aggregate size of 10.5 kb. The family Nyamiviridae consists of six genera and a total of 12 species. The family is named for the Nyavirus genus (derived from Nyamanini Pan in South Africa, where the first Nyavirus, Nyamanini virus, was isolated). Viruses in that genus are tick-borne and can also infect birds. Many viruses in this family have been inferred to exist only from next generation sequencing data, so little virological and biological information is available beyond their association with a broad range of invertebrates. Viruses in the genus Berhavirus are associated with marine echinoderms and sipunculids; those of genus Crustavirus, as the name implies, with crustaceans; those of Orinovirus with moths; those of Socyvirus with plant parasitic nematodes; and those of Tapwovirus with tapeworms. The Nyamanini virus virions are enveloped and spherical, with diameters of 100–130 nm, and consist of 5–6 structural proteins, including the nucleocapsid protein (N), glycoprotein (G) and RdRp (L), assigned on the basis of sequence similarity to other mononegavirus sequences. The genomes of the different nyamiviruses contain 5–6 ORFs encoding structural proteins, including N, G, P and L, and some proteins of unknown function. Nyamanini virus replicates in the nucleus, but other aspects of virus replication are unknown, a major limitation of identifying viruses solely through metagenomics.

Parvoviruses of Insects (Densovirinae and Hamaparvovirinae)
Note: The family taxonomy of the Parvoviridae underwent a major reorganization as this article was being written. Below is the updated taxonomy as of August 2020.
Family: Parvoviridae [1975] (Order Piccovirales, ICTV 2019)
Subfamily: Densovirinae [ICTV 1993]
Genus: Aquambidensovirus [ICTV 2019] (2 species)
Type Species: Decapod aquambidensovirus 1 (2) [ICTV 2019]
Species Exemplar: Cherax quadricarinatus densovirus
Genus: Blattambidensovirus (1) [ICTV 2019] (1 species)
Type Species: Blattella blattambidensovirus 1 [ICTV 2019]
Species Exemplar: Blattella germanica densovirus
Genus: Hemiambidensovirus [ICTV 2019] (2 species)
Type Species: Hemipteran hemiambidensovirus 1 (2) [ICTV 2019]
Species Exemplar: Dysaphis plantaginea densovirus
Genus: Iteradensovirus [ICTV 2013] (5 species)
Type Species: Lepidopteran iteradensovirus 1 (5) [ICTV 2013]
Species Exemplar: Bombyx mori densovirus 1
Genus: Miniambidensovirus [ICTV 2019] (1 species)
Type Species: Orthopteran miniambidensovirus 1 (1) [ICTV 2019]
Species Exemplar: Acheta domestica miniambidensovirus
Genus: Pefuambidensovirus [ICTV 2019] (1 species)
Type Species: Blattodean pefuambidensovirus 1 (1) [ICTV 2019]
Species Exemplar: Periplaneta fuliginosa densovirus
Genus: Protoambidensovirus [ICTV 2019] (2 species)
Type Species: Lepidopteran protoambidensovirus 1 (2) [ICTV 2019]
Species Exemplar: Galleria mellonella densovirus
Genus: Scindoambidensovirus [ICTV 2019] (3 species)
Type Species: Orthopteran scindoambidensovirus 1 (3) [ICTV 2019]
Species Exemplar: Acheta domestica densovirus
Subfamily: Hamaparvovirinae
Genus: Brevihamaparvovirus [ICTV 2019] (2 species)
Type Species: Dipteran brevihamaparvovirus 1 (2) [ICTV 2019]
Species Exemplar: Aedes albopictus C6/36 cell densovirus
Genus: Hepanhamaparvovirus [ICTV 2019] (1 species)
Type Species: Decapod hepanhamaparvovirus 1 (1) [ICTV 2019]
Species Exemplar: Penaeus merguiensis hepandensovirus
Genus: Penstylhamaparvovirus [ICTV 2019] (1 species)
Type Species: Decapod penstylhamaparvovirus 1 (1) [ICTV 2019]
Species Exemplar: Decapod penstyldensovirus 1

Viruses in the family Parvoviridae are small, naked icosahedral viruses with a small, linear, single-stranded DNA genome about 4–6 kb in length. Parvoviruses are classified into three subfamilies based largely on the nature of their host. Those in the Parvovirinae subfamily infect vertebrates, while those in the two subfamilies Densovirinae and Hamaparvovirinae infect invertebrate hosts. Densovirinae has eight genera, including the original Iteradensovirus and what used to be the genus Ambidensovirus, which following phylogenetic studies (2019) has been reorganized into seven genera, as indicated above. A second, new subfamily, Hamaparvovirinae, was created; three of its five genera accommodate densoviruses that infect invertebrates, namely Brevihamaparvovirus (originally Brevidensovirus), Hepanhamaparvovirus (originally Hepandensovirus) and Penstylhamaparvovirus (originally Penstyldensovirus). Densoviruses were first discovered in the early 1960s as the cause of an epizootic of densonucleosis (the name reflecting the dense appearance of nuclei due to the accumulation of viroplasm in the nuclei of infected cells) in a commercial mass rearing of greater wax moth larvae used for fishing bait in France. The virions are naked and icosahedral, with T = 1 symmetry, measure 21–26 nm in diameter and contain a monopartite ssDNA genome 4–6 kb in length encoding two or three NS proteins and two to four VPs, depending on the species. The ambisense densoviruses (in the seven ambidensovirus genera) package both the -ve and +ve ssDNA strands, but in separate virions, while the other, monosense, densoviruses package primarily the -ve ssDNA strand. The imperfect palindromic sequences at each end of the genome fold into hairpins, providing a free 3′ OH for initiating DNA replication by the host DNA polymerase. A wide variety of expression strategies is used to generate the different proteins, including splicing, leaky scanning and overlapping promoters and ORFs.
As a group, the densoviruses have a very broad host range, including six orders of insects, decapod crustaceans such as shrimp and crayfish, and echinoderms such as starfish. Parvovirus infections are polytropic in the tissues they infect, are often lethal and spread easily. Disease can be devastating to commercial enterprises, especially industrial shrimp and crayfish farming, silk production and the insects-for-food industry, such as cricket farming. Some insect parvoviruses might be developed as biocontrol agents against insect pests like ants and mosquitoes, and perhaps some agricultural insect pests. For example, densovirus was used to successfully control Limacodidae pests of oil palm plantations and Galleria mellonella in beehives.

Polydnaviridae
Family: Polydnaviridae [ICTV 1984]
Genus: Bracovirus [ICTV 1999] (32 species)
Type Species: Cotesia melanoscela bracovirus [ICTV 1990]
Species Exemplar: Cotesia melanoscela bracovirus
Genus: Ichnovirus [ICTV 1990] (21 species)
Type Species: Campoletis sonorensis ichnovirus [ICTV 1999]
Species Exemplar: Campoletis sonorensis ichnovirus

Polydnaviruses were first recognized in the late 1970s in the laboratories of Don Stoltz (Dalhousie University, Canada) and Brad Vinson (Texas A&M University, USA), who noticed virus-like particles in the calyx region of the ovaries, first of Apanteles (later renamed Cotesia) melanoscelus and then of other parasitic Hymenoptera. Polydnaviruses are associated with the reproductive tracts of braconid (bracoviruses) and ichneumonid (ichnoviruses) parasitoids. Injected along with parasitoid eggs into host caterpillars, polydnaviruses subvert the host immune and physiological systems to the benefit of the developing parasitoid eggs and larvae. Polydnaviruses are also characterized by a multipartite circular dsDNA genome, referred to originally as a polydisperse genome, from which the family Polydnaviridae took its name. Because of the distinct differences, particularly in morphology and phylogeny, between polydnaviruses from braconid wasps and those from ichneumonid wasps, the viruses were classified in two genera, Bracovirus and Ichnovirus, respectively. Another unique feature of polydnaviruses is that their genomes exist in both an integrated and an extrachromosomal form. The integrated provirus form allows for vertical transmission, and some of the integrated genes are responsible for the generation of virions in the calyx cells. The extrachromosomal form is packaged into virions and, once delivered during oviposition, expresses genes that take over host immunity and physiology to benefit the developing parasitoid.
The polydnavirus virion DNA is a collection of a discrete number of circular dsDNA molecules in non-equimolar abundance. The number and size of the individual circles vary depending on the genus and species. Bracovirus virions measure 50 nm in diameter × 30–80 nm in length, consist of rod-shaped nucleocapsids, one or more of which are surrounded by a nucleus-derived membrane, and resemble baculovirus budded viruses. It was because of this morphology that bracoviruses were originally classified along with the baculoviruses in 1982, with the bracovirus Apanteles melanoscelus virus (now Cotesia melanoscelus polydnavirus) being recognized as a species in the then genus Baculovirus in the family Baculoviridae. The bracovirus dsDNA genome consists of 15–30 circles ranging in size from 2 to 50 kb, with aggregate sizes of 125–600 kb. Bracovirus (BV) nucleocapsids assemble and acquire an envelope in the nucleus, forming intranuclear virions in calyx cells near the lumen. These “mature” calyx cells fill with virions, which eventually rupture the nuclear envelope and escape into the calyx lumen when the cells disrupt. Ichnoviruses have quite a different morphology. The lenticular nucleocapsid measures 80–90 nm in diameter × up to 300 nm in length. Single (from Campopleginae wasps) or multiple (from Banchinae wasps) nucleocapsids are surrounded by two unit membranes, the inner one derived from within the nucleus, often surrounding a hook at one end, and the outer one derived from the plasma membrane during budding into the calyx lumen. Ichnovirus circular dsDNA genomes consist of 20–50 circles ranging in size from 2 to 38 kb, with aggregate sizes of around 250 kb. Ichnovirus replication is similar to that of bracoviruses in that nucleocapsids assemble in a virogenic stroma in the nucleus, where they also acquire an envelope. From that point ichnovirus development differs from that of the bracoviruses, as the enveloped ichnovirus subvirions exit the nucleus and then bud into the calyx lumen through the apical plasma membrane. There has been a flurry of research activity surrounding the nature of the integrated and extrachromosomal (virion) DNA, the origin of the genomes and the expression of viral genes within the parasitoid and, following oviposition, in the parasitized host. During virus production the viral DNA segments of the integrated genomes are amplified, excised and circularized prior to encapsidation into polydnavirus nucleocapsids destined for envelopment and secretion into the calyx lumen. The “replication” genes remain integrated but are amplified in calyx cells. They are then expressed to produce mostly proteins associated with virus production. Many polydnavirus genes are derived from the parasitoids that harbor them, and consequently the phylogeny of the virus often reflects the phylogeny of its parasitoid host.

Poxviruses of Invertebrates (Entomopoxvirinae)
Family: Poxviridae [ICTV 1978] (Order Chitovirales, ICTV 2019)
Subfamily: Entomopoxvirinae [ICTV 1978]
Genus: Alphaentomopoxvirus [ICTV 2004] (7 species)
Type Species: Melolontha melolontha entomopoxvirus [ICTV 1976]
Species Exemplar: Melolontha melolontha entomopoxvirus
Genus: Betaentomopoxvirus [ICTV 2004] (16 species)
Type Species: Amsacta moorei entomopoxvirus [ICTV 1976]
Species Exemplar: Amsacta moorei entomopoxvirus
Genus: Deltaentomopoxvirus [ICTV 2019] (1 species)
Type Species: Melanoplus sanguinipes entomopoxvirus [ICTV 1976]
Species Exemplar: Melanoplus sanguinipes entomopoxvirus
Genus: Gammaentomopoxvirus [ICTV 2004] (6 species)
Type Species: Chironomus luridus entomopoxvirus [ICTV 1976]
Species Exemplar: Chironomus luridus entomopoxvirus
Subfamily: Chordopoxvirinae [ICTV 1978]
Genus: Centapoxvirus [ICTV 2016] (2 species)
Type Species: Yokapox virus [ICTV 2016]
Species Exemplar: Yoka poxvirus

The Poxviridae family is subdivided into two subfamilies. Poxviruses of vertebrates, like the familiar smallpox virus, are classified in the subfamily Chordopoxvirinae, while those in the Entomopoxvirinae subfamily infect only insects. The Entomopoxvirinae subfamily is currently divided into four genera: Alphaentomopoxvirus with 7 species, Betaentomopoxvirus with 16 species, Gammaentomopoxvirus with 6 species and Deltaentomopoxvirus with 1 species. The entomopoxviruses, first discovered by Constantine Vago in 1963, have been isolated from four orders of insects, namely Coleoptera, Orthoptera, Lepidoptera and Diptera. One poxvirus, Yoka poxvirus, was isolated from pools of mosquitoes in 1972, but phylogenetically it is more similar to orthopoxviruses than to any entomopoxvirus. Yoka poxvirus is classified in the subfamily Chordopoxvirinae, genus Centapoxvirus, species Yokapox virus. Moreover, it replicates in Vero and other mammalian cells and causes death in intracranially inoculated suckling mice, suggesting that it may not be an insect virus.
Given that it was isolated from a pool of mosquitoes, it is more likely to be a contaminating chordopoxvirus, perhaps from a blood meal. Entomopoxvirus virions are ovoid, measuring 140–260 nm × 220–450 nm, with a complex morphology. The entomopoxvirus genome is a linear dsDNA ranging in size from 229 to 308 kb with an A + T content of 79%–82%, and shares 49 core genes with the chordopoxviruses. Unlike the chordopoxviruses of vertebrates, entomopoxviruses are embedded in a spherical occlusion body, the spheroid, made of spheroidin, in much the same manner that baculoviruses are embedded in a polyhedrin- or granulin-based occlusion body. Similar to other poxviruses, as described elsewhere in this encyclopedia, entomopoxviruses replicate in the cytoplasm in viroplasms, first as immature particles that mature into enveloped brick-shaped intracellular mature virions, which can be released by cell lysis or by budding through the plasma membrane as external enveloped virions, allowing systemic spread of the infection. Late in infection, and unlike chordopoxviruses, mature entomopoxvirions become severally occluded in spheroids, which remain intracellular until cell lysis and allow insect-to-insect spread. Another unique feature of entomopoxviruses is that they produce a viral protein called fusolin, which can be occluded with virions into spheroids and can also form “spindles”, which are not occluded. The spindles are thought to disrupt the peritrophic membrane, allowing the virus access to the epithelial cells of the larval midgut. Fusion of the outer EPV membrane with the plasma membrane allows access to the cell cytoplasm, where replication occurs. Virus spread in the insect starts with the hemocytes and proceeds to the fat body, the main site of virus replication, and other tissues. The signs of the disease vary depending on the virus and the host.
EPV-infected lesser cornstalk borers become unresponsive and sluggish, with their cuticle changing from brown to orange, while infected black-soil scarabs develop a white-spotted or mottled epidermis. EPV-infected Lepidoptera can become whitish, while some increase in size due to delayed pupation. Although entomopoxviruses can kill their insect hosts, the time to kill is too long for them to be feasible for use in the control of insect pests.

Reoviruses of Invertebrates (Sedoreovirinae and Spinareovirinae)
Family: Reoviridae [ICTV 1974] (Order Reovirales, ICTV 2019)
Subfamily: Sedoreovirinae [ICTV 2009]
Genus: Cardoreovirus [ICTV 2008] (1 species)
Type Species: Eriocheir sinensis reovirus [ICTV 2008]
Species Exemplar: Callinectes sapidus reovirus 1
Genus: Phytoreovirus [ICTV 1978] (3 species)
Type Species: Wound tumor virus [ICTV 1976]
Species Exemplar: Wound tumor virus (34)
Genus: Seadornavirus [ICTV 2004] (3 species)
Type Species: Banna virus [ICTV 1999]
Species Exemplar: Banna virus-China
Subfamily: Spinareovirinae [ICTV 2009]
Genus: Aquareovirus [ICTV 1990] (7 species)
Type Species: Aquareovirus A [ICTV 1999]
Species Exemplar: American oyster reovirus 13p2
Genus: Coltivirus [ICTV 2009] (5 species)
Type Species: Colorado tick fever coltivirus [ICTV 1974]
Species Exemplar: Colorado tick fever-Florio
Genus: Cypovirus [ICTV 1990] (16 species)
Type Species: Cypovirus 1 [ICTV 1999]
Species Exemplar: Bombyx mori cypovirus 1
Genus: Dinovernavirus [ICTV 2008] (1 species)
Type Species: Aedes pseudoscutellaris reovirus [ICTV 2008]
Species Exemplar: Aedes pseudoscutellaris reovirus
Genus: Idnoreovirus [ICTV 2004] (5 species)
Type Species: Idnoreovirus 1 [ICTV 2004]
Species Exemplar: Diadromus pulchellus idnoreovirus 1
Genus: Orthoreovirus [ICTV 1991] (10 species)
Type Species: Mammalian orthoreovirus [ICTV 1999]
Species Exemplar: Mahlapitsi orthoreovirus

The reoviruses have a broad host range that includes insects, algae, fungi, fish and crustaceans such as crabs, as well as humans and other vertebrates, plants, and the invertebrate vectors of both. The Reoviridae family is subdivided into two subfamilies, Sedoreovirinae with six genera and Spinareovirinae with nine genera.
The two subfamilies differ in the nature of the virion surface, with the Spinareovirinae having a turreted or “spiked” surface (hence the prefix “Spina”, Latin for spike) while the Sedoreovirinae have a smooth surface (“Sedo” is Latin for smooth). The better-known reoviruses are those infecting humans, other mammals and other vertebrates. Those from the Rotavirus and Orthoreovirus genera infect only vertebrates, while viruses in the genera Coltivirus, Orbivirus and Seadornavirus are considered arboviruses, which infect both vertebrates and their arthropod vectors (mostly ticks, mosquitoes, midges, gnats and sandflies). Lesser-known reoviruses infect algae (Mimoreovirus) and fungi (Mycoreovirus). In addition to infecting plants, reoviruses in the genera Fijivirus, Oryzavirus and Phytoreovirus also infect their insect vectors (e.g., the glassy-winged sharpshooter). Reoviruses infecting only invertebrates are in the genera Cardoreovirus (crabs), Cypovirus (insects), Dinovernavirus (mosquitoes) and Idnoreovirus (Hymenoptera and Diptera). The aquareoviruses, as the name implies, infect aquatic vertebrates like fish but also aquatic invertebrates, including shellfish such as oysters and crustaceans. This summary includes six genera of invertebrate-infecting reoviruses: Cardoreovirus, Phytoreovirus and Seadornavirus in the Sedoreovirinae, and Cypovirus, Dinovernavirus and Idnoreovirus in the Spinareovirinae. Of the Spinareovirinae insect reoviruses, most cypoviruses have 10 dsRNA segments and infect lepidopteran insects like Lymantria dispar (gypsy moth larvae), idnoreoviruses have 10 dsRNA segments and infect Diadromus pulchellus (a hymenopteran parasitoid), and the dinovernavirus has 9 dsRNA segments and infects Aedes mosquitoes. Most of the Sedoreovirinae reoviruses have 12 dsRNA segments (though Macrobrachium nipponense reovirus has 10 dsRNA segments). Seadornavirus is an arbovirus of humans, pigs and cattle that also infects its mosquito vectors, and two seadornaviruses appear to infect only mosquitoes.
As noted, some reoviruses, like the arboviruses in the genera Coltivirus (Colorado tick fever virus) and Orbivirus (Bluetongue virus), are vectored by invertebrates which are themselves infected, including midges, mosquitoes, ticks and sandflies. Also, some reoviruses infect plants and are vectored by invertebrates carrying the viruses. These are described in more detail in other articles in this Encyclopedia. Like other reoviruses, most invertebrate reoviruses have a double-shelled icosahedral capsid with diameters ranging from 60 to 80 nm (though idnoreovirus capsids are smaller, at about 50 nm in diameter). While cypoviruses and dinovernaviruses have a single shell with T = 2, aquareoviruses, coltiviruses, idnoreoviruses, cardoreoviruses, phytoreoviruses and seadornaviruses have two shells, the inner with T = 2 and the outer with T = 13 icosahedral symmetry. As noted above, they have 9–12 dsRNA segments. These range in size from 0.76 to 4.8 kb, with each segment, except in cypoviruses, encoding one protein. In cypoviruses, segment 5 encodes two proteins by ribosomal skipping. Cypovirus was the first invertebrate reovirus discovered, in 1934, as polyhedra in the cytoplasm of infected cells of silkworm larvae; these viruses were hence initially named cytoplasmic polyhedrosis viruses, from which the current genus name was derived.

Aquareoviruses (“Aqua” for the aqueous environment common to their hosts) have been isolated from oysters and crabs. Coltiviruses (from Colorado tick fever virus) replicate in ticks as well as mammalian hosts. Idnoreoviruses (from insect-derived non-occluded reovirus) were first isolated from houseflies in 1978, while the lone dinovernavirus (from double-stranded RNA, icosahedral, “nove” meaning nine dsRNA segments) was isolated from an Aedes pseudoscutellaris cell line in 2005. The first cardoreovirus (from “carcinus” for crab and “dodeca” for 12 dsRNA segments) was isolated in 2004 from diseased mitten crabs (Eriocheir sinensis). Banna virus, in the genus Seadornavirus (from Southeast Asian “dodeca” for 12 dsRNA segments), was first discovered in 1987 in a patient with encephalitis and fever in China. Phytoreoviruses (from “phyto”, Greek for plant) were initially discovered in 1941 as wound tumor viruses of plants transmitted by leafhoppers, in which the virus also replicates. The invertebrate reoviruses range from having a limited host range to a broad one. Aquareoviruses have been isolated from the American oyster and geoduck, and from shore, swimmer and mitten crabs. Coltiviruses replicate in ticks, the main vector of Colorado tick fever, and in mosquitoes. Cypoviruses have been isolated from Lepidoptera and Hymenoptera. Idnoreoviruses infect Hymenoptera, and dipterans like the housefly, olive fly and fruit flies. So far the dinovernavirus has been isolated solely from persistently infected Aedes pseudoscutellaris cells. Seadornaviruses have been found in mosquitoes, cardoreoviruses in crabs and prawns, and phytoreoviruses in various plants and their leafhopper vectors. Infection and replication of the invertebrate reoviruses closely mimic those of other reoviruses, so the reader is referred to other reovirus chapters. One unique feature of the cypoviruses is that they form polyhedra which can retain infectivity for years.
The major polyhedrin protein is translated in the cytoplasm late in infection in large excess and virions are occluded in the polyhedrin at the periphery of the virogenic stroma in which morphogenesis of the virions occurs.

Rhabdoviruses of Invertebrates (Rhabdoviridae, Almendravirus, Alphanemrhavirus, Caligrhavirus, Sigmavirus)
Family: Rhabdoviridae (Order Mononegavirales, ICTV 1990)
Genus: Almendravirus [ICTV 2016] (6 species)
Type Species: Puerto Almendras almendravirus [ICTV 2016]
Species Exemplar: Puerto Almendras virus (Lo-39)
Genus: Alphanemrhavirus [ICTV 2018] (2 species)
Type Species: Xingshan alphanemrhavirus [ICTV 2018]
Species Exemplar: Xingshan nematode virus 4 (XSNXC3292)
Genus: Caligrhavirus [ICTV 2018] (3 species)
Type Species: Lepeophtheirus caligrhavirus [ICTV 2018]
Species Exemplar: Lepeophtheirus salmonis rhabdovirus 127 (LSRV127)
Genus: Sigmavirus [ICTV 2012] (7 species)
Type Species: Drosophila melanogaster sigmavirus [ICTV 2012]
Species Exemplar: Drosophila melanogaster sigmavirus (HAP23)

Like reoviruses, rhabdoviruses as a group, comprising 30 genera, have a very broad host range, including humans and other mammals (e.g., genera Lyssavirus, Tibrovirus, Vesiculovirus), fish (genera Novirhabdovirus, Perhabdovirus and Sprivivirus), birds (genus Tupavirus) and plants (Varicosavirus and the nucleorhabdoviruses). Many of the vertebrate rhabdoviruses (like those in the genera Curiovirus, Ephemerovirus, Tibrovirus and Vesiculovirus) are also arboviruses that have been isolated from arthropod vectors like mosquitoes, midges and ticks. The arthropod-specific rhabdoviruses belong to four genera in the Rhabdoviridae family: Almendravirus with five species, Alphanemrhavirus with two species, Caligrhavirus with three species and the most studied, Sigmavirus, with seven species. Not covered in this introductory section are rhabdoviruses in other genera that are arboviruses infecting vertebrates and found in arthropod vectors such as midges, sandflies and mosquitoes, and those found in arthropods such as aphids, planthoppers, leafhoppers and mites that transmit virus to plants but in which the virus does not replicate.
The higher-level taxonomic designation of these viruses is highly convoluted but, for the sake of completeness, these viruses belong to the Order Mononegavirales, Class Monjiviricetes, Subphylum Haploviricotina, Phylum Negarnaviricota and Realm Riboviria. In terms of morphology and virus replication, the insect rhabdoviruses are similar to other rhabdoviruses covered elsewhere in this Encyclopedia. They have a helical nucleocapsid surrounded by an envelope with spikes, and a matrix between the nucleocapsid and envelope. The bullet-shaped virions measure 80 nm in diameter × 100 nm in length (sigmaviruses) or 40–55 nm in diameter × 190–460 nm in length (almendraviruses). Alphanemrhavirus was identified only through metagenomic sequencing of pooled parasitic nematode populations, so it is not certain whether an infectious virion form exists for this genus. The 10.5–14.5 kb -ve ssRNA genome encodes, in the 3′ to 5′ direction, the four canonical rhabdovirus structural genes, for the nucleoprotein (N), which is the major protein of the nucleocapsid; the phosphoprotein (P), which ensures proper placement of L on the genome; the matrix protein (M), which is the major structural protein between the nucleocapsid and envelope and helps regulate genome RNA synthesis; and the glycoprotein (G), which is the major envelope glycoprotein and acts as the virus attachment protein; followed by the gene for the non-structural large protein (L), which is responsible for RdRp activity, 5′ capping and 3′ polyadenylation. In addition, the sigmavirus genome has an additional ORF, X, between P and M, involved in host-virus interaction, and the almendravirus genome encodes an additional class 1a viroporin between G and L. In some cases, such as the sigmaviruses, invertebrate rhabdovirus sequences are found integrated into the genomes of Drosophila and some mosquitoes.

The insect rhabdovirus life cycle is similar to that of other rhabdoviruses. Virions attach to host cell receptors through the G glycoprotein, inducing low pH-dependent receptor-mediated endocytosis. Transcription occurs in the cytoplasm in a “stop-start”, L- and P-dependent process that generates separate monocistronic mRNAs for each of the proteins in decreasing molar abundance from the 3′ to the 5′ end of the genome. Each mRNA has a common leader at its 5′ end, which is capped, and ends in a polyA tail at the 3′ end derived by reiterative copying. Late in infection, following translation of viral proteins and when the concentration of N protein has increased to sufficient levels, transcription switches to genome replication. During genome replication the full-length antigenome +ve ssRNA is produced and serves as the replicative intermediate, the template for full-length -ve sense genome ssRNA. The genome ssRNA is then encapsidated in N protein in a P protein-dependent fashion. The formed nucleocapsids and M protein attach to the plasma membrane of infected cells and bud through it to release mature virions, which can then infect adjacent cells. Invertebrate rhabdoviruses have been found in several insects, such as fruit flies, butterflies, mosquitoes and bees, as well as in nematodes and sea lice (copepods). For the most part, transmission is vertical through either the female or male parent, though sigmaviruses can be transmitted experimentally by injection.

Roniviridae
Family: Roniviridae [ICTV 2002] (Order Nidovirales, ICTV 1996)
Subfamily: Okanivirinae [ICTV 2018]
Genus: Okavirus [ICTV 2002] (3 species)
Subgenus: Tipravirus [ICTV 2018]
Species: Gill-associated virus [ICTV 2002]
Species Exemplar: Yellowhead virus

To date all roniviruses (from “rod-shaped nidovirus”) are from crustaceans, principally the prawn Penaeus monodon. Two viral diseases are associated with roniviruses: one caused by Yellowhead virus in black tiger shrimp, which first appeared in Thailand in 1990, and the second by Eriocheir sinensis ronivirus, which causes black gill disease in the Chinese mitten crab. Yellowhead virus appears to be transmitted horizontally, while gill-associated virus can be transmitted vertically. Roniviruses differ from other nidoviruses in having a bacilliform, 40–60 nm × 150–200 nm, enveloped virion. Like other nidoviruses, they have a long, 26 kb, +ve ssRNA monopartite genome. The ronivirus genome encodes a 5′-terminal replicase complex (ORF1) followed by ORF2 for the p20 nucleoprotein, ORF3 for pp3 (producing the gp116 and gp64 envelope glycoproteins) and ORF4 of unknown function. The replicase ORFs, ORF1a and ORF1b, are associated with a slippery sequence and an RNA pseudoknot that allow a -1 ribosomal frameshift and read-through to produce a pp1ab replicase. Unlike other nidoviruses, roniviruses (and toroviruses) lack the common 5′ leader. Preceding ORFs 2 and 3 are intergenic regions (IGRs) acting as a terminator and promoter. Replication probably follows that of other nidoviruses, like the coronaviruses described elsewhere, to generate a nested set of subgenomic mRNAs. Unlike other nidoviruses, which use a discontinuous strategy to generate the different sgRNAs, roniviruses appear to use a “continuous” transcription strategy.
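
The effect of the -1 ribosomal frameshift between ORF1a and ORF1b mentioned above can be illustrated with a minimal sketch. It assumes a made-up toy RNA (not a ronivirus sequence), a hypothetical slip position and illustrative function names; the slippery sequence and pseudoknot themselves are not modelled. The sketch only shows that reading in frame stops at the first stop codon, whereas stepping back one nucleotide at the slip site lets reading continue into a longer product, in the way that pp1ab extends beyond the ORF1a product.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def codons(seq, start=0):
    """Split seq into complete triplets, beginning at index start."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]

def until_stop(codon_list):
    """Return the codons that precede the first stop codon."""
    read = []
    for codon in codon_list:
        if codon in STOP_CODONS:
            break
        read.append(codon)
    return read

def read_in_frame(seq):
    """Read seq in frame 0 until the first stop codon."""
    return until_stop(codons(seq))

def read_with_minus1_shift(seq, slip_end):
    """Read in frame 0 up to slip_end, then step back one nucleotide.

    slip_end is the index just past the last codon decoded before the
    slip (a multiple of 3 in this toy model).
    """
    before = codons(seq[:slip_end])
    after = codons(seq, slip_end - 1)
    return until_stop(before + after)

# Toy RNA, invented for illustration only (assumption, not viral sequence).
TOY_RNA = "AUGGCUUUAAACGGGUAAGCCAUUGA"
SLIP_END = 12  # hypothetical slip position, just after the codon AAC

if __name__ == "__main__":
    print("in frame:  ", read_in_frame(TOY_RNA))
    print("-1 shifted:", read_with_minus1_shift(TOY_RNA, SLIP_END))
```

In this toy case the in-frame read ends after five codons, while the frameshifted read continues for eight codons before meeting a stop codon in the new frame.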

Sarthroviridae
Family: Sarthroviridae [ICTV 2015]
Genus: Macronovirus [ICTV 2015] (1 species)
Type Species: Macrobrachium satellite virus 1 [ICTV 2015]
Species Exemplar: Extra small virus

Extra small virus (XSV) derives its name from the extremely small size of its naked icosahedral particle, about 15 nm in diameter, with the name sarthrovirus derived from small arthropod virus. These viruses are +ve ssRNA satellite viruses, depending for their replication on the RdRp of Macrobrachium rosenbergii nodavirus (MrNV). XSV belongs to the sole and type species, Macrobrachium satellite virus 1, in the single genus Macronovirus of the family Sarthroviridae. They mostly affect different species of prawns, but have also been isolated from crayfish, beetles and water bugs, and replicate in mosquito cell lines, suggesting that mosquitoes might be a vector. The genomic RNA is +ve sense and 796 nt in size, with a 3′ polyA tail and an upstream polyadenylation signal. It encodes two ORFs for the coat proteins CP16 and CP17, made in equimolar amounts, with the ORF for CP16 initiating at an internal methionine codon. Infected prawns present with lethargy, opaqueness of the abdominal muscle and degeneration of the telson and uropods, followed by death around five days after gross signs of infection are observed. Immersion in a mixture of MrNV and XSV results in 100% mortality by 12 days post infection. They are a major problem for shrimp farms in China, India, Australia, the French West Indies, Taipei and Thailand.

Solinviviridae
Family: Solinviviridae [ICTV 2016] (Order Picornavirales, ICTV 2008)
Genus: Invictavirus [ICTV 2016] (1 species)
Type Species: Solenopsis invicta virus 3 [ICTV 2016]
Species Exemplar: Solenopsis invicta virus (DM)
Genus: Nyfulvavirus [ICTV 2016] (1 species)
Type Species: Nylanderia fulva virus 1 [ICTV 2016]
Species Exemplar: Nylanderia fulva virus 1 (Florida initial)

An Introduction to Viruses of Invertebrates


Solinviviruses were first discovered through a bioprospecting approach using metagenomics of ant populations, showing the value of this approach for identifying novel and perhaps useful biological control agents. Solinvivirus virions are icosahedral, measuring 27–29 nm in diameter, with T = 3 symmetry, surface projections and cup-like depressions. The virions are composed of two capsid proteins, VP1 and a VP1 fused to the capsid projection domain, CPD (VP1-CPD), and a +ve ssRNA genome of about 10 kb with a 5′ VPg and a 3′ polyA tail. The genome encodes a single polyprotein ORF (Nylanderia fulva virus) or two polyprotein ORFs via frameshifting (Solenopsis invicta virus). The nonstructural proteins (helicase, protease and RdRp) are located at the amino end of the polyprotein, while the structural proteins VP1 (or VP1-CPD) and VP2 are at the carboxy end. It is thought that the polyprotein is cleaved into the individual proteins by the viral protease, much as poliovirus polyproteins are. There is some evidence of subgenomic mRNAs for the structural proteins. Viruses enter the host by feeding, gaining access to the alimentary canal of larval stage ants (NfV-1) or adults (SINV-3). Based on the genome organization, replication is thought to be similar to that of polioviruses and caliciviruses. Replication takes place in the cytoplasm, with the genomic +ve ssRNA acting first as a transcript and later as a template for the production of the genome-length -ve sense ssRNA replicative strand. Virus assembly presumably occurs by encapsidation of the genomic +ve sense ssRNA by the viral capsid proteins. Virus spread is intercolonial through trophallaxis. Solinviviruses are specific to fire ants of the genus Solenopsis, which exhibit signs of infection including a decline in queen weight, reduced feeding and fecundity, and larval mortality. Virus can be transmitted mechanically by consumption of crickets that have consumed dead, SINV-3-infected worker ants.
Solinviviruses are being developed as biological control agents using protein baits, with lower doses resulting in chronic infection and higher doses of 10⁷ to 10⁹ virus particles per ml leading to colony collapse. The major problem for biopesticide development is the production of sufficient virus.

Tetraviruses (Families Alphatetraviridae, Carmotetraviridae, Permutotetraviridae)
Family: Alphatetraviridae [ICTV 2016] (Order Hepelivirales, ICTV 2019)
Genus: Betatetravirus [ICTV 1997] (7 species)
Type Species: Nudaurelia capensis beta virus [ICTV 1995]
Species Exemplar: Nudaurelia cytheria capensis β virus
Genus: Omegatetravirus [ICTV 1997] (3 species)
Type Species: Nudaurelia capensis omega virus [ICTV 1995]
Species Exemplar: Nudaurelia capensis ω virus
Family: Carmotetraviridae [ICTV 2011] (Order Tolivirales, ICTV 2019)
Genus: Alphacarmotetravirus [ICTV 2011] (1 species)
Type Species: Providence virus [ICTV 2004]
Species Exemplar: Providence virus
Family: Permutotetraviridae [ICTV 2011]
Genus: Alphapermutotetravirus [ICTV 2011] (2 species)
Type Species: Thosea asigna virus [ICTV 2004]
Species Exemplar: Thosea asigna virus

The tetraviruses were all at one time members of the family Tetraviridae, sharing a unique T = 4 icosahedral symmetry, capsid proteins with a common jelly-roll architecture that form a monophyletic group, a host range restricted to insects, and a +ve sense ssRNA genome with a VPg at the 5′ end and a polyA tail at the 3′ end. However, some features differed among the original species in the family, particularly the nature of the 3′ end, the existence of either a mono- or bipartite genome, the phylogeny of the RdRp, and the nature of the ORFs and their translation. In particular, the alphatetraviruses (family Alphatetraviridae) have either a monopartite genome (genus Betatetravirus), which encodes the replicase and, through a subgenomic RNA, the capsid protein, and ends in a structured 3′ tRNA-like end, or a bipartite genome (genus Omegatetravirus), in which one segment encodes the replicase and the other the capsid protein. The replicases of both genera are alpha-like supergroup replicases.
Viruses in family Carmotetraviridae have a monopartite genome with one ORF encoding a carmo-like supergroup replicase, generated by read-through of a stop signal, and a second ORF encoding the capsid protein, expressed from a subgenomic RNA. The permutotetraviruses also have a monopartite genome, with a highly structured 3′ terminal sequence. Their replicase and VPg are like those of the dsRNA birnaviruses, including a permuted active site, hence the family name Permutotetraviridae. Except for these perhaps subtle differences, particularly in RdRp phylogeny and genome organization, most other characteristics are shared, so these viruses are treated here collectively as tetraviruses. The monopartite genomes of the betatetraviruses of the Alphatetraviridae, the carmotetraviruses and the permutotetraviruses are 6.6, 6.2 and 5.7 kb in size, respectively. The two segments of the bipartite genomes of the omegatetraviruses of the Alphatetraviridae are about 2.4 and 5.3 kb in size, with the larger one encoding at least the replicase and the smaller one encoding the capsid protein ORFs. The capsid proteins are translated from subgenomic RNAs derived from the gRNA of either the monopartite genomes or the smaller omegatetravirus gRNA. The iconic T = 4 icosahedral capsid is about 40 nm in diameter, and the capsid proteins of all tetraviruses are monophyletic. While the capsids usually encapsidate viral RNA, they can also encapsidate non-viral RNAs, allowing for the generation of reassortant viruses and gene exchange. When the capsid proteins alone are expressed in a baculovirus expression system, they self-assemble and, under acidic conditions, condense into tetravirus-like VLPs. It had long been thought that tetraviruses replicate only in insects (particularly in midgut cells of larvae) and in cell culture, but it was recently shown that they also infect and replicate in lab-infected HeLa and MCF7 cancer cells, as well as in cowpea plants. They are found mostly as persistent infections.
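The T numbers quoted for these capsids can be related to subunit counts through the Caspar-Klug rule. The rule is not spelled out in the text but is standard: T = h² + hk + k² for non-negative integers h and k, and a capsid with triangulation number T is built from 60T subunits. A minimal Python sketch, with function names chosen here for illustration only:

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number for lattice steps h and k."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def subunit_count(t: int) -> int:
    """Number of chemically identical subunits in an icosahedral capsid of number T."""
    return 60 * t


# T = 1 (h=1, k=0), T = 3 (h=1, k=1), T = 4 (h=2, k=0)
for h, k in [(1, 0), (1, 1), (2, 0)]:
    t = triangulation_number(h, k)
    print(f"T = {t}: {subunit_count(t)} subunits")
```

Under this rule the T = 4 tetravirus capsid contains 240 copies of the capsid protein, compared with 180 for a T = 3 capsid and 60 for a T = 1 capsid.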


Taxa of Other Viruses of Invertebrates
Not included in the summary descriptions above are taxa about which little is known and those whose designation has relied largely on sequence data alone, obtained by next-generation high-throughput sequencing and metagenomic studies of environmental samples. For many of these, it has not otherwise been confirmed that they are actually viruses of insects. Because environmental samples were the source of the sequences, the nucleic acid might simply have been a contaminant, or may have been present because of a blood meal taken from an infected animal. While sequence data can predict phylogeny and protein composition, beyond identifying the source of the material analyzed they cannot accurately provide basic virological data such as morphology, replication cycle, biological characteristics and true hosts. Below is a listing of some of these taxa along with some minimal associated information.

Circoviridae [ICTV 1993] (Order Cirlivirales, ICTV 2019) contains two genera, Circovirus [ICTV 1993] and Cyclovirus [ICTV 2015]. Circoviruses are characterized by a circular, 1.7–1.9 kb ssDNA genome within an icosahedral capsid with T = 1 symmetry. The monopartite, covalently closed ssDNA genome is ambisense and encodes two proteins, the capsid protein and the replication-associated protein required for rolling circle DNA replication. Two of the best-known circoviruses are porcine circovirus and beak and feather disease virus of parrots. Commercially available vaccines such as CircoFLEX, Circogard and Circumvent PCV2 G2, derived from virus-like particles synthesized in the insect baculovirus expression vector system, are effective against some porcine circovirus infections. While viruses from most species in both genera of the Circoviridae infect humans, other mammals and birds, some virus sequences are associated with spiders and insects, including ants, mosquitoes, cockroaches and dragonflies.
All of the insect-specific species, such as Ant associated cyclovirus 1, are referred to as "associated cyclovirus" or "associated circovirus". This is because the viruses have been identified solely through virus discovery approaches using next-generation sequencing or metagenomics. Consequently, we have no information about the true nature of these viruses.

Genomoviridae [ICTV 2015] (Order Geplafuvirales, ICTV 2019) includes genus Gemyduguivirus [ICTV 2016], with the single and type species Dragonfly associated gemyduguivirus 1; genus Gemykibivirus [ICTV 2016], with 16 species including the type species Dragonfly associated gemykibivirus [ICTV 2016]; and genus Gemycircularvirus [ICTV 2015], with 43 species including Dragonfly associated gemycircularvirus and Mosquito associated gemycircularvirus. All are associated with dragonflies and mosquitoes. Genomoviruses, at least those characterized by more than sequence analysis, are similar to circoviruses despite being in a different order: they have a circular ambisense ssDNA genome encoding Rep and capsid proteins, packaged in an icosahedral capsid with T = 1 symmetry. The genome is 2.2 kb in size and the capsids measure 29 nm in diameter. Following entry into the cytoplasm, the genomovirus DNA becomes double stranded, and replication is thought to proceed by a rolling circle mechanism. Genomoviruses are associated with a broad range of sources, including invertebrates, humans, other mammals, birds and fungi.

Lispiviridae [ICTV 2018] (Order Mononegavirales, ICTV 1990) has a single genus, Arlivirus [ICTV 2015], with six species, Lishi arlivirus [ICTV 2015] being the type species. Viruses in this family have a -ve ssRNA genome and are associated with spiders. Based on the sequence of Lishi spider virus 1 from China, the lispivirus genome is bisegmented -ve ssRNA. The larger segment is 7,051 nt in size and encodes the L protein, while the smaller one, 4,426 nt in size, encodes the G, N and VP4 proteins.
As the only evidence provided for this virus is from genome sequencing, there is no further information on, for example, virion morphology, replication cycle, transmission or host range.

Phasmaviridae [ICTV 2016] (Order Bunyavirales, ICTV 2016) includes genus Sawastrivirus [ICTV 2018], whose single and type species, Sanxia sawastrivirus [ICTV 2018], represented by Sanxia water strider virus 2, is associated with water strider insects. Viruses of genus Orthophasmavirus [ICTV 2016] infect mosquitoes, cockroaches and midges. The genome is tripartite -ve ssRNA, with S (for N protein), M (for G protein) and L (for L protein, RdRp) segments of 1.7, 5.2 and 6.8 kb, respectively. Virions in the family are spherical, about 80 nm in diameter, and enveloped. Viruses of genera Jonvirus [ICTV 2018] and Feravirus (from Ferak virus) [ICTV 2018] were initially isolated in the Aedes cell line C6/36 from pools of mosquitoes collected in 2004 in the Tai National Forest in Côte d'Ivoire. Both show typical bunyavirus morphology, except that Jonvirus also occurs as enveloped rods measuring 60 nm by up to 600 nm (the genus name Jonvirus refers to Jonchet, a French pick-up sticks game in which the shape of the sticks resembles this unusual morphology).

Polyciviridae [ICTV 2017] (Order Picornavirales, ICTV 2008) has three genera, Chipolycivirus, Hupolycivirus and Sopolycivirus [ICTV 2017]. The viruses are thought to be about 33 nm in diameter and icosahedral in shape with pseudo T = 3 symmetry, and to have a 10–12 kb monopartite +ve ssRNA genome. Genomes of chipolyciviruses were isolated from midges such as Chironomus riparius, while those of sopolyciviruses were isolated from a variety of ant species. Most of the information on this family is based on large-scale sequencing of environmental samples and metagenomic approaches, so little is known of the biology of its members, such as morphology and replication strategy.

Phenuiviridae [ICTV 2016] (Order Bunyavirales, ICTV 2016) viruses contain a tripartite -ve ssRNA genome in an RNP structure within an enveloped virion, 80–120 nm in diameter, with surface proteins organized in a T = 12 icosahedral structure. Some members of the family infect insects. Genus Wenrivirus [ICTV 2018], with the single and type species Shrimp wenrivirus [ICTV 2018], is associated with shrimp. Viruses of genus Goukovirus [ICTV 2016] (named for Gouléako village, where it was first found) and genus Phasivirus [ICTV 2016] are found exclusively in mosquitoes. Viruses of genus Phlebovirus [ICTV 1981] infect the mosquitoes that serve as vectors for some members of the genus.

Tospoviridae [ICTV 2018] (Order Bunyavirales) is a family of plant viruses. One genus in the family, Orthotospovirus [ICTV 2016], which includes Tomato spotted wilt orthotospovirus, consists of viruses that replicate in the midgut of their thrips vectors. After replication, the virus moves to the salivary glands, from where it can be transmitted to plants. Like other bunyaviruses, these viruses are pseudocircular in shape with a tripartite -ve ssRNA genome of 3, 4.8 and 8.8 kb in size.

Xinmoviridae [ICTV 2018] (Order Mononegavirales, ICTV 1990) has a single genus, Anphevirus [ICTV 2015], with seven species, Xincheng anphevirus being the type species. The genome of Xincheng mosquito virus is a monopartite 12,774 nt

Table 2  Families of viruses with an invertebrate arthropod host and mammalian host, often causing human diseases

Family (a) | Genus, type species (number of species in genus) (a) | Genome type and size | Morphology | Invertebrate vector
Asfarviridae (Asfuvirales) | Asfivirus, African swine fever virus (1) | dsDNA, linear, 170 to 194 kb | Spherical, 175 to 215 nm diameter, with icosahedral capsid core (170 to 190 nm diameter), internal and external envelope | Soft ticks (Ornithodoros), psyllids, odonates
Flaviviridae (Amarillovirales) | Flavivirus, Yellow fever virus (53) | +ve ssRNA, 9 to 13 kb | Spherical, 40 to 60 nm diameter, enveloped | Mosquitoes, ticks
Nairoviridae (Bunyavirales) | Orthonairovirus, Dugbe orthonairovirus (15) | -ve ssRNA, tripartite, 1 to 3 kb; 3.2 to 4.9 kb; 6.8 to 12 kb | Spherical, 80 to 120 nm, enveloped | Ticks
Nyamiviridae (Mononegavirales) | Nyavirus, Nyamanini nyavirus (3); Socyvirus, Soybean cyst nematode socyvirus (1) | -ve ssRNA, 1 or 2 segments, total 12.2 kb | Spherical, 100 to 130 nm, enveloped | Ticks, crustaceans, nematodes
Orthomyxoviridae (Articulavirales) | Quaranjavirus, Quaranfil quaranjavirus (2); Thogotovirus, Thogoto thogotovirus (2) | -ve ssRNA, 6 to 7 segs, 0.9 to 2.3 kb | Pleomorphic to spherical, 80 to 120 nm diameter | Ticks
Peribunyaviridae (Bunyavirales) | Orthobunyavirus, Bunyamwera orthobunyavirus (88) | -ve ssRNA, tripartite, 1, 4 and 6 to 8 kb | Spherical, 80 to 120 nm diameter | Mosquitoes
Phasmaviridae (Bunyavirales) | Feravirus, Ferak feravirus (1) | -ve ssRNA, tripartite, 1.5, 4.2 and 6.8 kb | Spherical, 80 to 120 nm diameter | Mosquitoes, cockroaches, water striders, drosophila, midges
Phenuiviridae (Bunyavirales) | Phlebovirus, Rift Valley fever phlebovirus (60) | -ve ssRNA, tripartite, 1.7, 3.2, 6.4 kb | Spherical, 80 to 120 nm diameter, surface envelope with T = 12 symmetry | Mosquitoes, sand flies
Reoviridae, Sedoreovirinae (Reovirales) | Orbivirus, Bluetongue virus (22); Seadornavirus, Banna virus (3) | dsRNA, segmented; Orbi- 10 segs, 0.8 to 4.0 kb; Seado- 12 segs, 0.9 to 4.0 kb | Orbi- icosahedral T = 2/T = 13, 80 nm diameter; Seado- icosahedral T = 2/T = 13, 60 to 70 nm diameter | Mosquitoes, midges, gnats, sand flies, ticks
Reoviridae, Spinareovirinae (Reovirales) | Coltivirus, Colorado tick fever coltivirus (5) | dsRNA, 12 segments, 0.7 to 4.4 kb | Icosahedral T = 2/T = 12, 60 to 80 nm diameter | Ticks, mosquitoes
Rhabdoviridae (Mononegavirales) | Curiovirus, Curionopolis curiovirus (41); Ephemerovirus, Bovine fever ephemerovirus (8); Hapavirus, Flanders hapavirus (16); Ledantevirus, Le Dantec ledantevirus (16); Sripuvirus, Niakha sripuvirus (8); Tibrovirus, Tibrogargan tibrovirus (7); Vesiculovirus, Indiana vesiculovirus (16) | -ve ssRNA, 10.7 to 16.1 kb | Bullet shaped, rod shaped nucleocapsid, 45 to 100 nm × 100 to 430 nm, with matrix and envelope | Mosquitoes, midges, sandflies, ticks, fleas, blackflies (vertebrate hosts: mammalian, avian, lizards, reptiles)
Togaviridae (Martellivirales) | Alphavirus, Sindbis virus (31) | +ve ssRNA, 10 to 12 kb | Spherical, 65 to 70 nm, enveloped | Mosquitoes, ticks, lice

(a) The taxon Order is in brackets after the family or subfamily name. The number of species in each genus is given in brackets after the genus name.


Table 3  Some common arboviruses, taxonomy, listed by Order, vertebrate host and their invertebrate vectors

Arbovirus | Order | Family | Genus | Vertebrate host | Insect and arachnid vectors
Bagaza virus | Amarillovirales | Flaviviridae | Flavivirus | Game birds | Culex mosquitoes, midges
Entebbe bat virus | Amarillovirales | Flaviviridae | Flavivirus | Bats | Mosquitoes
Dengue hemorrhagic fever virus | Amarillovirales | Flaviviridae | Flavivirus | Humans, primates | Aedes aegypti and Aedes albopictus mosquitoes
Japanese encephalitis virus | Amarillovirales | Flaviviridae | Flavivirus | Humans, wading birds, pigs, horse | Culex mosquitoes
Kyasanur forest disease virus | Amarillovirales | Flaviviridae | Flavivirus | Camels, rodents, monkeys | Hyalomma hard ticks
Louping ill virus | Amarillovirales | Flaviviridae | Flavivirus | Grouse, sheep | Ixodes ricinus ticks
Murray Valley encephalitis virus | Amarillovirales | Flaviviridae | Flavivirus | Humans, Ardeid water birds | Culex annulirostris mosquitoes
Powassan virus | Amarillovirales | Flaviviridae | Flavivirus | Humans | Ixodes and Dermacentor ticks
Spondweni virus | Amarillovirales | Flaviviridae | Flavivirus | Humans | Aedes circumluteolus mosquitoes
Saint Louis encephalitis virus | Amarillovirales | Flaviviridae | Flavivirus | Humans, birds | Culex pipiens mosquitoes
Tick-borne encephalitis virus | Amarillovirales | Flaviviridae | Flavivirus | Humans, rodents | Ixodidae ticks
West Nile virus | Amarillovirales | Flaviviridae | Flavivirus | Humans, birds, horse | Culex mosquitoes
Yellow fever virus | Amarillovirales | Flaviviridae | Flavivirus | Humans, primates | Aedes, Haemagogus and Sabethes mosquitoes
Zika virus | Amarillovirales | Flaviviridae | Flavivirus | Humans, primates | Aedes mosquitoes
Thogoto virus | Articulavirales | Orthomyxoviridae | Thogotovirus | Humans, wild and domestic vertebrates | Hyalomma hard ticks
African swine fever virus | Asfuvirales | Asfarviridae | Asfivirus | Pig, warthog, bushpig | Ornithodoros soft ticks
Bunyamwera virus | Bunyavirales | Peribunyaviridae | Orthobunyavirus | Humans | Aedes aegypti mosquitoes
California encephalitis virus | Bunyavirales | Peribunyaviridae | Orthobunyavirus | Humans | Mosquitoes
Crimean-Congo fever virus | Bunyavirales | Nairoviridae | Orthonairovirus | Humans, wild and domestic animals | Hyalomma hard ticks
Jamestown Canyon virus | Bunyavirales | Peribunyaviridae | Orthobunyavirus | Humans, white-tailed deer | Mosquitoes
La Crosse virus | Bunyavirales | Peribunyaviridae | Orthobunyavirus | Humans, chipmunks, squirrels | Ochlerotatus triseriatus mosquitoes
Oropouche fever virus | Bunyavirales | Peribunyaviridae | Orthobunyavirus | Humans, primates, sloth, marsupials, birds | Aedes serratus and Culex quinquefasciatus mosquitoes, midges
Heartland virus | Bunyavirales | Phenuiviridae | Bandavirus | Humans, white-tailed deer, raccoons | Amblyomma americanum ticks
Rift Valley fever virus | Bunyavirales | Phenuiviridae | Phlebovirus | Humans, ovine, goat, sheep, ruminants | Culex tritaeniorhynchus and Aedes vexans mosquitoes
Sandfly fever virus | Bunyavirales | Phenuiviridae | Phlebovirus | Humans | Psychodidae sandflies
Toscana virus | Bunyavirales | Phenuiviridae | Phlebovirus | Humans, dogs, goats | Phlebotomus sandflies
Barmah forest fever virus | Martellivirales | Togaviridae | Alphavirus | Humans, marsupials | Aedes vigilax and Culex annulirostris mosquitoes
Chikungunya | Martellivirales | Togaviridae | Alphavirus | Humans, primates | Aedes albopictus, Aedes aegypti mosquitoes
Eastern equine encephalitis virus | Martellivirales | Togaviridae | Alphavirus | Humans, birds | Culiseta melanura and Cs. morsitans mosquitoes
Mayaro fever virus | Martellivirales | Togaviridae | Alphavirus | Humans, primates | Haemagogus mosquitoes
O'nyong-nyong | Martellivirales | Togaviridae | Alphavirus | Humans | Anopheles mosquitoes
Ross River virus | Martellivirales | Togaviridae | Alphavirus | Humans, macropods | Ochlerotatus vigilax and Culex mosquitoes
Sindbis disease virus | Martellivirales | Togaviridae | Alphavirus | Humans, passerine birds | Aedes, Culex and Culiseta mosquitoes
Venezuelan equine encephalitis virus | Martellivirales | Togaviridae | Alphavirus | Humans, horses, donkeys, zebras | Aedes albopictus and Culex mosquitoes
Western equine encephalitis virus | Martellivirales | Togaviridae | Alphavirus | Humans, birds | Culex tarsalis mosquitoes
Bovine ephemeral fever virus | Mononegavirales | Rhabdoviridae | Ephemerovirus | Bovine | Culicoides oxystoma and nipponensis biting midges
Vesicular stomatitis virus | Mononegavirales | Rhabdoviridae | Vesiculovirus | Humans, bovine, swine, equine | Culicoides midges, sandflies, blackflies
African horse sickness virus | Reovirales | Reoviridae, Sedoreovirinae | Orbivirus | Horses, donkeys, mules, zebras | Culicoides imicola midges
Banna virus | Reovirales | Reoviridae, Sedoreovirinae | Seadornavirus | Humans, pigs, cattle | Culex, Aedes and Anopheles mosquitoes
Bluetongue virus | Reovirales | Reoviridae, Sedoreovirinae | Orbivirus | Goats, cattle, wild ruminants | Culicoides imicola and Culicoides variipennis midges
Colorado tick fever virus | Reovirales | Reoviridae, Spinareovirinae | Coltivirus | Rodents, ground squirrels | Dermacentor ticks
Equine encephalosis virus | Reovirales | Reoviridae, Sedoreovirinae | Orbivirus | Horses, donkeys, zebras | Culicoides midges


Table 4  Plant viruses transmitted by invertebrate vectors (The Order is given in brackets after the family name. The number of species in the genus is given in brackets after the genus name)

Family | Genus (type species) | Genome type and size | Morphology | Invertebrate vector (transmission type)
Betaflexiviridae, Quinvirinae (Tymovirales) | Carlavirus, Carnation latent virus (53) | +ve ssRNA, 5.8 to 9 kb | Helical, 12 to 13 × 470 to 1,000 nm | Aphid and whitefly (non-circulative, semi-persistent)
Bromoviridae (Martellivirales) | Alfamovirus, Alfalfa mosaic virus (1); Cucumovirus, Cucumber mosaic virus (4) | +ve ssRNA, tripartite; Alfa- 2.1 to 2.2/2.6; Cucumo- 3.0/3.2 to 3.4 kb | Alfa- bacilliform, 18 × 30 to 57 nm; Cucumo- icosahedral T = 3, 26 to 35 nm diameter | Aphid (non-circulative, non-persistent)
Caulimoviridae (Ortervirales) | Caulimovirus, Cauliflower mosaic virus (13); Cavemovirus, Cassava vein mosaic virus (2); Rosadnavirus, Rose yellow vein virus (1); Soymovirus, Soybean chlorotic mottle virus (4); Tungrovirus, Rice tungro bacilliform virus (1) | dsDNA, open circular, 7.0 to 8.2 kb | Icosahedral T = 7, 45 to 50 nm diameter, or bacilliform (Tungrovirus), 30 nm × 60 to 900 | Aphid, mealybug, leafhopper (non-circulative, semi-persistent)
Closteroviridae (Martellivirales) | Ampelovirus, Grapevine leafroll-associated virus 3 (12); Closterovirus, Beet yellows virus (16); Crinivirus, Lettuce infectious yellows virus (14) | +ve ssRNA; Ampelo- and Clostero- monopartite, 15.5 to 19.3 kb; Crini- bisegmented, 7.8 and 9.1 kb | Ampelo- and Clostero- single flexuous helix, 3.4 to 3.8 × 650 to 2,200 nm; Criniviruses, bipartite helices, 10 to 13 × 650 to 85/10 to 13 × 700 to 900 | Aphid, whitefly, soft scale, mealybug (non-circulative, semi-persistent)
Fimoviridae (Bunyavirales) | Emaravirus, European mountain ash ringspot-associated emaravirus (11) | -ve ssRNA, quadripartite, 1.3/1.6/2.0/6.9 kb | Spherical, 80 to 90 nm, enveloped | Eriophyid mite (?)
Geminiviridae (Geplafuvirales) | Begomovirus, Bean golden yellow mosaic virus (424) | +ve ssDNA, bipartite, 2.5/2.6 kb | Twinned icosahedra T = 1, 22 × 38 nm | Whitefly, leafhopper, aphid, treehopper (circulative, non-propagative, persistent)
Luteoviridae (Tolivirales) | Enamovirus, Pea enation mosaic virus 1 (5); Luteovirus, Barley yellow dwarf virus (13); Polerovirus, Potato leafroll virus (26) | +ve ssRNA, monopartite, 5.3 to 5.7 kb | Icosahedral T = 3, 23 to 30 nm diameter | Aphid (circulative, non-propagative, persistent)
Nanoviridae (Mulpavirales) | Babuvirus, Banana bunchy top virus (3); Nanovirus, Subterranean clover stunt virus (8) | +ve ssDNA, multipartite, 6 to 8 segments, 1 kb each | Icosahedral T = 1, 18 to 19 nm diameter | Aphid, planthopper (circulative, non-propagative, persistent)
Phenuiviridae (Bunyavirales) | Tenuivirus, Rice stripe tenuivirus (8) | -ve and ambisense ssRNA, 4 to 5 segments | 4 to 5 helical circular helices, 3 to 10 nm diameter | Planthopper (circulative, propagative, persistent)
Potyviridae (Patatavirales) | Macluravirus, Maclura mosaic virus (10); Potyvirus, Brugmansia mosaic virus (183); Tritimovirus, Wheat streak mosaic virus (6); Rymovirus, Ryegrass mosaic virus (3) | +ve ssRNA, 8.0 to 12 kb | Flexuous helix, 12 to 15 nm × 650 to 850 nm | Aphid, mite, whitefly (non-circulative, non-persistent)
Reoviridae (Reovirales) | Phytoreovirus, Wound tumor virus (3) | dsRNA, 12 segments, 1.1 to 4.4 kb | Icosahedral, double shell T = 2/T = 13, 70 nm diameter | Leafhopper (circulative, propagative, persistent)
Rhabdoviridae (Mononegavirales) | Cytorhabdovirus, Lettuce necrotic yellows cytorhabdovirus (23); Alphanucleorhabdovirus, Potato yellow dwarf alphanucleorhabdovirus (9) | -ve ssRNA, 12.0 to 14.5 kb | Bullet to bacilliform shape, 45 to 100 nm × 130 to 350 nm, enveloped | Aphid, leafhopper, planthopper (circulative, propagative, persistent)
Secoviridae (Picornavirales) | Nepovirus, Tobacco ringspot virus (40); Fabavirus, Broad bean wilt virus 1 (7); Sequivirus, Parsnip yellow fleck virus (3) | +ve ssRNA; Nepo- bipartite, 3.9/7.5 kb; Faba- bipartite, 1.58/3.4 kb; Sequi- monopartite, 9 kb | Icosahedral, naked, pseudo T = 3; Nepo- 28 to 30 nm diameter; Faba- 28 to 38 nm diameter; Sequi- 25 to 30 nm diameter | Nepo- nematode, mite, thrips; Faba- aphid, beetle, leafhopper (non-circulative); Sequi- aphid, leafhopper (circulative, propagative)
Tospoviridae (Bunyavirales) | Orthotospovirus, Tomato spotted wilt orthotospovirus (26) | -ve ssRNA, tripartite, 3, 4.8, 8.8 kb | Spherical, enveloped, 80 to 120 nm diameter, 3 RNPs | Thrips (circulative, propagative)
Virgaviridae (Martellivirales) | Tobravirus, Tobacco rattle virus (3) | +ve ssRNA, bipartite, 4.5/6.8 kb | Helical, 22 × 180 to 215; 22 × 46 to 115 | Nematodes

Note: Adapted from Whitfield, A.E., Falk, B.W., Rotenberg, D., 2015. Insect vector-mediated transmission of plant viruses. Virology 479–480, 278–289. Dietzgen, R.G., Mann, K.S., Johnson, K.N., 2016. Plant virus-insect vector interactions: Current and potential future research directions. Viruses 8, 303. doi:10.3390/v8110303.


-ve ssRNA encoding N, G, L and a protein from an ORF of unknown function. These viral genomes have been associated with Anopheles mosquitoes, drosophila and orthoptera. However, as only sequence data are available, no other characteristics, such as morphology or replication, are known.

Arboviruses in Their Invertebrate Vectors
Covered only superficially, if at all, in this article are the arboviruses responsible for arboviral diseases such as yellow fever, dengue and Zika, even though part of their life cycle involves replication in the insect vector. Nevertheless, Table 2 summarizes the families of arboviruses and their insect vectors, and Table 3 lists some of the better-known arboviruses with their taxonomy and vector species. The reader is referred to other chapters in this Encyclopedia that are devoted to more detailed coverage of these viruses.

Plant Viruses in Their Invertebrate Vectors
While many plant viruses are vectored by invertebrates (Table 4), they are not discussed further here; the reader is referred to the corresponding chapters on plant viruses. Transmission of plant viruses can be defined as circulative (the virus enters the insect body and circulates, usually through the hemolymph, prior to transmission) or non-circulative (the virus does not enter the insect body but is transmitted physically through the contaminated stylet or foregut of the insect), and as propagative (the virus enters the insect and replicates in it prior to transmission, usually through the salivary glands) or non-propagative (the virus enters the insect and is usually transported to the salivary glands, but does not replicate in it prior to transmission). Transmission can also be defined as persistent, in which the virus is taken up by the insect by circulative means and invades the salivary gland; semi-persistent, in which viruses are internalized and retained in the insect foregut but do not enter the tissues; and non-persistent, in which viruses are retained on the distal tip of the stylet only until they are released during feeding. A summary of some of these invertebrate-vector transmitted plant viruses is provided in Table 4.
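The transmission categories defined above can be captured in a small data structure. The following Python sketch is only one possible encoding: the class name, field names and retention-site strings are assumptions made for illustration, and the rule that a propagative virus must also be circulative follows from the definitions in the text (a virus must enter the insect body before it can replicate there).

```python
from dataclasses import dataclass

# Where the virus is retained in the vector -> persistence label,
# following the definitions given in the text.
PERSISTENCE_BY_SITE = {
    "salivary gland": "persistent",
    "foregut": "semi-persistent",
    "stylet": "non-persistent",
}


@dataclass(frozen=True)
class TransmissionMode:
    circulative: bool      # virus enters the insect body and circulates
    propagative: bool      # virus replicates in the insect before transmission
    retention_site: str    # key of PERSISTENCE_BY_SITE

    def __post_init__(self) -> None:
        if self.propagative and not self.circulative:
            raise ValueError("a propagative virus must also be circulative")
        if self.retention_site not in PERSISTENCE_BY_SITE:
            raise ValueError(f"unknown retention site: {self.retention_site!r}")

    def label(self) -> str:
        parts = ["circulative" if self.circulative else "non-circulative"]
        if self.circulative:
            parts.append("propagative" if self.propagative else "non-propagative")
        parts.append(PERSISTENCE_BY_SITE[self.retention_site])
        return ", ".join(parts)


# Two combinations that appear in Table 4.
aphid_luteovirus = TransmissionMode(True, False, "salivary gland")
aphid_potyvirus = TransmissionMode(False, False, "stylet")
print(aphid_luteovirus.label())
print(aphid_potyvirus.label())
```

The labels produced this way match the parenthetical descriptions used in Table 4, for example "circulative, non-propagative, persistent" for aphid-borne luteoviruses.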

Retrotransposons Associated With Invertebrates
Three virus families associated with invertebrates, all in order Ortervirales (for reverse-transcribing viruses), actually comprise retrotransposons. While isolated mostly from plants, retrotransposons from these three families have also been found associated with invertebrates. Retrotransposons in family Pseudoviridae [ICTV 1998], genus Hemivirus [ICTV 1998], type species Drosophila melanogaster copia virus, are associated with drosophila and mosquitoes. Retrotransposons in family Belpaoviridae [ICTV 2017], genus Semotivirus [ICTV 2017], type species Ascaris lumbricoides virus, are associated with roundworm, lepidoptera and drosophila. Retrotransposons in family Metaviridae [ICTV 1998], genus Errantivirus [ICTV 1998], type species Drosophila melanogaster Gypsy virus, and genus Metavirus [ICTV 1998], type species Saccharomyces cerevisiae Ty3 virus, are associated with drosophila, Trichoplusia ni and Bombyx mori. The genome of the errantivirus Trichoplusia ni TED virus is 7,510 bp.

Acknowledgments
Much of the information for this summary was gleaned from preprints of the articles on invertebrate viruses prepared for this Encyclopedia of Virology, 4th edition. I acknowledge the contributions of these authors and thank them for allowing me to use their information. I leaned heavily on the ICTV web sites for the wealth of knowledge they provide on up-to-date taxonomy (some of which changed as I was writing this article), on the summary articles on different taxa in the 10th report of "The ICTV Report on Virus Classification and Taxon Nomenclature", and on the numerous "proposals" that accompany the "history" provided through the taxonomy site. The articles in the main body of this volume do not cover all the families of viruses infecting invertebrates that are covered in this summary. In part that is because often not much is known, or in some instances only sequence data are available. I hope the reader will find the summary tables (Tables 1–4) of use, as they attempt to collate the wealth of diversity of invertebrate viruses and their taxa. I have made every effort to be accurate but, considering the breadth of information, it is easy to overlook some important information or to err in transcription (the writing kind). For these I apologize.

See also: Ascoviruses (Ascoviridae). Baculoviruses: General Features (Baculoviridae). Bidensoviruses (Bidnaviridae). Bunyaviruses of Arthropods (Mypoviridae, Nairoviridae, Peribunyaviridae, Phasmaviridae, Phenuiviridae, Wupedeviridae). Dicistroviruses (Dicistroviridae). Entomobirnaviruses (Birnaviridae). Hytrosaviruses (Hytrosaviridae). Iflaviruses (Iflaviridae). Iridoviruses of Invertebrates (Iridoviridae). Mesoniviruses (Mesoniviridae). Nimaviruses (Nimaviridae). Nodaviruses of Invertebrates and Fish (Nodaviridae). Nudiviruses (Nudiviridae). Parvoviruses of Invertebrates (Parvoviridae). Poxviruses of Insects (Poxviridae). Reoviruses of Invertebrates (Reoviridae). Reoviruses (Reoviridae) and Their Structural Relatives. Rhabdoviruses of Insects (Rhabdoviridae). Sarthroviruses (Sarthroviridae). Solinviviruses (Solinviviridae). Tetraviruses (Alphatetraviridae, Carmotetraviridae, Permutotetraviridae).


Further Reading
Bateman, K.S., Stentiford, G.D., 2017. A taxonomic review of viruses infecting crustaceans with an emphasis on wild hosts. Journal of Invertebrate Pathology 147, 86–110.
Bonning, B.C., 2019. Insect Molecular Virology: Advances and Emerging Trends. Caister Academic Press.
Dietzgen, R.G., Mann, K.S., Johnson, K.N., 2016. Plant virus-insect vector interactions: Current and potential future research directions. Viruses 8, 303. doi:10.3390/v8110303.
Kibenge, F.S.B., Godoy, M.G., 2016. Aquatic Virology. New York: Academic Press.
Whitfield, A.E., Falk, B.W., Rotenberg, D., 2015. Insect vector-mediated transmission of plant viruses. Virology 479–480, 278–289.
Williams, T., Bergoin, M., van Oers, M., 2017. Diversity of large DNA viruses of invertebrates. Journal of Invertebrate Pathology 147, 4–22.
Young, P.R., 2018. Arboviruses: A family on the move. In: Hilgenfeld, R., Vasudevan, S. (Eds.), Dengue and Zika: Control and Antiviral Treatment Strategies. Advances in Experimental Medicine and Biology 1062. Springer. doi:10.1007/978-981-10-8727-1_1.

Relevant Websites
https://talk.ictvonline.org/information/w/faq/386/how-to-write-virus-and-species-names How to write virus, species, and other taxa names.
https://talk.ictvonline.org International Committee on Taxonomy of Viruses (ICTV).
https://talk.ictvonline.org/ictv-reports/ictv_online_report The ICTV Report on Virus Classification and Taxon Nomenclature.
https://viralzone.expasy.org ViralZone root.
https://talk.ictvonline.org/taxonomy Virus Taxonomy: 2019 Release. International Committee on Taxonomy of Viruses (ICTV).

Ascoviruses (Ascoviridae)
Sassan Asgari, The University of Queensland, Brisbane, QLD, Australia
Dennis K Bideshi, California Baptist University, Riverside, CA, United States and University of California, Riverside, CA, United States
Yves Bigot, INRAE – French National Research Institute for Agriculture, Food and Environment, Nouzilly, France
Brian A Federici, University of California, Riverside, CA, United States
© 2021 Elsevier Ltd. All rights reserved.
This is an update of B.A. Federici, Y. Bigot, Ascoviruses, In Encyclopedia of Virology (Third Edition), Edited by Brian W.J. Mahy, Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00347-2.

Glossary
Apoptosis Genetically programmed cell death.
Apoptotic bodies Cell vesicles resulting from apoptosis.
Caspase Protease that activates a major portion of programmed cell death.
Endoparasitic wasps Species of insect parasites belonging to the order Hymenoptera, which lay their eggs in insects where the wasp larvae develop.
Per os infection Infection by feeding.
Programmed cell death Genetically programmed cascade of proteases and nucleases that cleave DNA and proteins within a cell, leading to its death.
Reniform Shaped like a kidney.
Transovarial transmission Transmission of virus inside the egg.
Virion-containing vesicles Vesicles containing virions formed by ascoviruses by rescue of apoptotic bodies induced by ascovirus infection.

Introduction The family Ascoviridae is one of the newest families of viruses, established in 2000 to accommodate several species of a newly recognized type of DNA virus that attacks larvae of insects of the order Lepidoptera. It consists of two genera, Ascovirus and Toursvirus; for practicality, here we refer to the collective members of the family Ascoviridae as ascoviruses. Viruses of this family produce large, enveloped virions, measuring 130 nm in diameter by 300–400 nm in length, and when viewed by electron microscopy have a reticulated surface appearance. They are typically bacilliform or reniform in shape, and contain a circular double-stranded DNA genome that, depending on the species, ranges from 119 to 200 kbp. Whereas the virions of ascoviruses are structurally complex like those of other large DNA viruses that attack insects, such as those of iridoviruses (family Iridoviridae) and entomopoxviruses (family Poxviridae), they differ from these in two significant aspects. First, ascoviruses are transmitted from diseased to healthy lepidopteran larvae or pupae by female endoparasitic wasps when these lay eggs in their hosts. Second, ascoviruses have a unique cell biology and cytopathology in which shortly after infecting a cell, they induce apoptosis and then rescue the developing apoptotic bodies and convert these into virion-containing vesicles. This aspect of viral reproduction apparently evolved to disseminate virions to the larval hemolymph (blood) where they could contaminate the ovipositors of female wasps so that the virus could be transmitted to new hosts. Ascoviruses appear to occur worldwide, wherever there are endoparasitic wasps and larvae of species belonging to the lepidopteran family Noctuidae. However, as these viruses have been discovered relatively recently and their signs of disease are not commonly known in the scientific community, relatively few ascovirus species have been described.

History The first ascoviruses were discovered during the late 1970s in southern California where they were found causing disease in larvae of moths belonging to the lepidopteran family Noctuidae. Diseased larvae were recognized by the presence of hemolymph that was very white and opaque, in marked contrast to the hemolymph of healthy larvae which is translucent and slightly green (Fig. 1). The color and opacity of the hemolymph in diseased larvae was shown to be due to the presence of high concentrations of vesicles that contained virions (Fig. 2). The white hemolymph and virion-containing vesicles are diagnostic for the disease, and the name for this group, ascoviruses (derived from the Greek asco meaning “sac”), was chosen on the basis of the latter characteristic. Since the discovery of the first ascovirus, ascoviruses have been isolated as the cause of disease in many species of noctuid larvae. In addition, an ascovirus that attacks the pupal stage of a species belonging to the family Yponomeutidae was discovered in the 1990s in France.

Encyclopedia of Virology, 4th Edition, Volume 4. doi:10.1016/B978-0-12-809633-8.21548-3

Fig. 1 Major characteristics of the disease typically caused by ascoviruses in lepidopteran larvae. (A) and (B) Healthy and ascovirus-infected larvae, respectively, of the cabbage looper, Trichoplusia ni, infected with Trichoplusia ni ascovirus. Note the opaque white hemolymph in the infected larva. (C) Spot plate containing hemolymph from healthy (left) and infected larvae (right). (D, E) Sections through lobes of fat body from a healthy and infected larva, respectively, of the fall armyworm, Spodoptera frugiperda, infected with Spodoptera frugiperda ascovirus. Note the greatly hypertrophied cells in the fat body of the infected larva. The cells in most of this tissue have already cleaved into viral vesicles. N, nuclei.

Distribution and Taxonomy With respect to distribution, ascoviruses have been reported from the United States, Europe, Australia, and Indonesia, and it is highly probable that they occur worldwide. This is because their most common hosts, larvae of lepidopteran species belonging to the family Noctuidae, the largest family of the order Lepidoptera, as well as their most common vectors, endoparasitic wasps of the families Braconidae and Ichneumonidae, are distributed throughout the world. Although only a few ascovirus species have been described to date, there are probably many, including variants, that occur worldwide. Thus, given the common occurrence of their hosts and vectors, it is possible that ascoviruses are very common insect viruses. That they have not been discovered in large numbers, as baculoviruses (family Baculoviridae) have been, is probably because they cause a chronic disease with few easily detectable signs, making it difficult for individuals not familiar with the disease to recognize diseased larvae in field populations. At present, five species of ascoviruses are officially recognized based on a combination of properties including the relatedness of 28 key genes coding for, among others, the DNA polymerase and major capsid protein, the percent identity between their sequenced genomes, their lepidopteran host range, and their tissue tropism (Table 1). They are split into two genera, Ascovirus and Toursvirus. The type species of the Ascovirus genus is Spodoptera frugiperda ascovirus 1a (SfAV-1a), with the other species being Trichoplusia ni ascovirus 2a (TnAV-2a) and Heliothis virescens ascovirus 3a (HvAV-3a). To date, the Toursvirus genus contains only one virus species, originally called Diadromus pulchellus ascovirus 4a (DpAV-4a) and now named Diadromus pulchellus toursvirus (DpTV-4a). The Arabic numeral reflects the order in which each species was formally recognized, whereas the lower case letter indicates variants of the type species. Variants of the type species are distinguished by consecutive lower case letters; for example, TnAV-2b and 2c would represent two different variants of TnAV-2a recognized subsequently. 
Herein, these viruses are referred to by their acronyms without the numerical and lower case suffix. Members of the Ascovirus genus have been isolated from many more insect species than those listed above, but these isolates have turned out to be variants of known ascoviruses, and therefore they have not been named after the host from which they have been isolated. For example, ascoviruses related to TnAV and HvAV have been isolated from noctuid species such as Autographa precationis, Helicoverpa zea, Helicoverpa armigera, and Helicoverpa punctigera; however, they do not bear the name of their host of isolation. What this implies is that ascoviruses belonging to the TnAV and HvAV species have a broad and overlapping host range among different noctuid species, although this has only been tested experimentally to a limited extent.
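
The abbreviation convention described above (acronym, Arabic numeral for the order of species recognition, lower case letter for the variant) can be illustrated with a small parser. This is a minimal sketch only: the regular expression, the `AscovirusName` type, and the function names are hypothetical and introduced here for illustration, and the sketch assumes every abbreviation has exactly the form acronym, hyphen, numeral, single lower case letter (for example TnAV-2a).

```python
import re
from typing import NamedTuple


class AscovirusName(NamedTuple):
    """Parts of an abbreviation such as 'TnAV-2a' (hypothetical helper type)."""
    acronym: str         # e.g. "TnAV"
    species_number: int  # Arabic numeral: order of formal recognition
    variant: str         # lower case letter: variant of the type species


# Assumed pattern for illustration: letters, hyphen, digits, one lower case letter.
_NAME = re.compile(r"^([A-Za-z]+)-(\d+)([a-z])$")


def parse_abbreviation(text: str) -> AscovirusName:
    """Split an abbreviation into acronym, species numeral, and variant letter."""
    match = _NAME.match(text)
    if match is None:
        raise ValueError(f"not in acronym-numeral-letter form: {text!r}")
    acronym, number, variant = match.groups()
    return AscovirusName(acronym, int(number), variant)


def is_variant_of(candidate: str, reference: str) -> bool:
    """True if both abbreviations share acronym and numeral but differ in letter."""
    a, b = parse_abbreviation(candidate), parse_abbreviation(reference)
    same_species = (a.acronym, a.species_number) == (b.acronym, b.species_number)
    return same_species and a.variant != b.variant
```

Under these assumptions, `is_variant_of("TnAV-2c", "TnAV-2a")` is true, whereas comparing TnAV-2a with HvAV-3a is false because both the acronym and the numeral differ.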

Virion Structure and Composition Depending on the species, the virions of ascoviruses are either bacilliform or reniform in shape, with complex symmetry, and very large, measuring about 130 nm in diameter by 300–400 nm in length. Although variations occur in the virion structure among various ascovirus species, their structural components appear to be essentially the same. Ascovirus virions consist of an inner particle surrounded by an outer envelope (Fig. 2). The inner particle is complex containing a DNA/protein core as well as an apparent internal lipid bilayer surrounded by a distinctive layer of protein subunits. Thus, the virion appears to contain two lipid membranes: one associated with the inner particle and the other forming the lipid component of the envelope. In negatively stained preparations, virions have a distinctive reticulate appearance, which is thought to be due to superimposition of subunits on the surface of the internal particle with those in the envelope.


Fig. 2 Structural and morphological characteristics of ascovirus virions and virion-containing vesicles. (A) Wet mount preparation viewed with phase microscopy of hemolymph from a Spodoptera frugiperda larva infected with Spodoptera frugiperda ascovirus (SfAV). The spherical refractile bodies are virion-containing vesicles. (B, C) Electron micrographs of ultrathin sections through viral vesicles produced by Trichoplusia ni ascovirus (TnAV) and SfAV, respectively. (D) Matrix of the occlusion body produced by SfAV. The occlusion consists of virions, protein, and small spherical vesicles. (E, F) Negatively stained virions of SfAV and TnAV, respectively. Note the reticulate appearance of the virions. (G, H) Electron micrographs of ultrathin cross sections through inner particles of SfAV after formation (G) and during envelopment (H). (I) Ultrathin cross section through a fully developed virion of TnAV.

As indicated by the size and complexity of the virions, the genome of ascoviruses is large, and consists of a single molecule of double-stranded circular DNA. The genomes of SfAV, TnAV, HvAV, or variants of these, in the genus Ascovirus, and of DpTV-4a, the sole virus in the genus Toursvirus, have been sequenced (Table 1). The SfAV-1a genome is 157 kbp and codes for at least 120 proteins, whereas the TnAV genomes are slightly larger, 174–186 kbp, and code for at least 165–178 proteins. The genomes of HvAV variants range from 186 to 200 kbp and code for approximately 180–190 potential proteins. The DpTV-4a genome is 119 kbp in size and encodes about 119 proteins. Based on gel analyses, ascovirus virions contain more than 12 structural polypeptides ranging in size from 12 to 200 kDa. Furthermore, proteomic analysis revealed that the SfAV-1a virion is composed of at least 21 proteins, whereas similar analyses identified at least 67 proteins in the virion of HvAV-3i. The marked difference in the virion protein profiles of SfAV-1a and HvAV-3i could be due to inadvertent contamination with non-structural proteins or a higher level of structural complexity in the HvAV-3i virion. Regardless, the two major virion proteins in ascovirus virions are, invariably, the major capsid protein and members of an unusual large cationic protein family (ascovirus P64 homologs) that functions in condensing and encapsidating ascovirus genomic DNA. In addition to proteins and the DNA genome, the presence of an envelope as detected by electron microscopy, as well as experiments with detergents and organic solvents, indicate that virions contain a substantial lipid component. And, as in other enveloped viruses of eukaryotes, it is likely that the virion also contains carbohydrate in the form of glycoproteins, though at present none have been identified.

Table 1 Members in the genus Ascovirus and Toursvirus belonging to the family Ascoviridae

Species name (a)                      Virus abbreviation  Genome (bp) (b)/ORFs (c)  Accession number
Ascovirus
Spodoptera frugiperda ascovirus 1a    SfAV-1a             156,922/123               [AM3988432]
Trichoplusia ni ascovirus 2a          TnAV-2a             N/A (d)                   [AJ312707]
                                      TnAV-6a             174,059/165               [DQ517337.1]
                                      TnAV-6b             185,664/178               [KY434117]
Heliothis virescens ascovirus 3a      HvAV-3a             N/A (d)                   [AJ279817.1]
                                      HvAV-3e             186,262/180               [EF133465.1]
                                      HvAV-3f             198,157/190               [KJ755191.1]
                                      HvAV-3g             199,721/194               [JX491653.1]
                                      HvAV-3h             190,519/181               [KU170628.1]
                                      HvAV-3i             185,650/188               [MF781070.1]
                                      HvAV-3j             191,718/189               [LC332918.1]
Toursvirus
Diadromus pulchellus toursvirus 4a    DpTV-4a             119,343/119               [CU469068.1]
Tentative species
Spodoptera exigua ascovirus 5a        SeAV-5a
Spodoptera exigua ascovirus 6a        SeAV-6a
Helicoverpa armigera ascovirus 7a     HaAV-7a
Helicoverpa punctigera ascovirus 8a   HpAV-8a

(a) Recognized type species (bold text) and variants of type species (plain text).
(b) Complete genome sequence; bp, base pairs.
(c) Putative number of open reading frames (ORFs).
(d) Complete genome sequence not available.

Transmission and Ecology One of the most interesting features of ascoviruses is that their transmission from host to host appears to be dependent on their being vectored by female endoparasitic wasps (order Hymenoptera). Ascoviruses are extremely difficult to transmit per os, with typical infection rates averaging less than 15% even when larvae are fed as many as 10⁵ virion-containing vesicles in a single dose. In contrast to this, infection rates for caterpillars injected with as few as 10 virion-containing vesicles are typically greater than 90%. Moreover, experiments with parasitic wasps show that they can effectively transmit ascoviruses to their noctuid hosts. For example, when females are allowed to lay eggs in ascovirus-infected noctuid caterpillars, thereby contaminating their ovipositor, and then allowed to lay eggs in healthy larvae, the majority of the latter contract ascovirus disease. Interestingly, though the parasitoid eggs hatch in their infected noctuid hosts, the parasitoid larvae die as the ascovirus disease develops in the caterpillar. Under field conditions, the prevalence of ascovirus disease in caterpillars is correlated with rates of parasitization by endoparasitic wasps. When wasps from these populations are collected in the field and allowed to oviposit in healthy caterpillars reared in the laboratory, the latter often exhibit ascovirus disease within a few days. Thus, laboratory and field studies provide sound evidence that the primary mechanism for the transmission of ascoviruses attacking noctuid larvae is through being vectored mechanically by parasitic wasps. No evidence has been found in the lepidopteran hosts for transovum or transovarial transmission. In the case of DpTV (genus Toursvirus), the association of the virus with its wasp and caterpillar hosts is much more intimate. DpTV DNA is carried in wasp nuclei as a circular molecule, and small numbers of virions are produced in the oviducts of females. 
However, the virus does not cause noticeable pathology in the wasp host. The females lay eggs in the pupal stage of the lepidopteran host, Acrolepiopsis assectella, introducing small numbers of toursvirus virions along with the wasp eggs. These virions invade lepidopteran host cells, replicate, and initiate destruction of major host tissues. The wasp larva then emerges from the egg and feeds on the host tissues and toursvirus virions. The DpTV genome is carried extrachromosomally by both male and female wasps, where it is apparently transmitted from generation to generation transovarially. These observations make ascoviruses the only known group of viruses pathogenic to insects that are primarily dependent on vectors for their transmission. Now that the characteristics of the disease are known, field studies in the southeastern United States and California are beginning to show that ascoviruses are probably the most common type of virus to occur during most of the year in populations of several important noctuid pests, including the cabbage looper, Trichoplusia ni, fall armyworm, Spodoptera frugiperda, and the corn earworm, Helicoverpa zea. Prevalence rates range from 10% to 25%, depending on the species and time of the year, with the highest rates of infection, as noted above, being correlated with high levels of parasitization. In South Carolina, ascovirus infection rates as high as 60% have been reported in populations of noctuid larvae at the end of summer.

Fig. 3 Major stages of cellular pathogenesis caused by a typical ascovirus, a process that resembles apoptosis. After infection, the nucleus enlarges and the nuclear membrane invaginates, and then lyses. Subsequently, the plasmalemma of the cell invaginates and coalesces with cytoplasmic membranes, apparently formed de novo, thereby dividing the cell into a cluster of virion-containing vesicles. These vesicles dissociate and are liberated into the hemolymph as the basement membrane of infected tissues degenerates. Virion assembly becomes apparent as the nuclear membrane lyses, and continues throughout all subsequent stages of vesicle formation.

Host Range The experimental host range of ascoviruses varies with the viral species. TnAV, HvAV, and SeAV have a broad host range and are capable of replication in a variety of noctuid species, as well as in selected species belonging to other families of the order Lepidoptera. In contrast, the experimental host range of SfAV is limited to other species of the genus Spodoptera. DpTV can replicate in hymenopteran and lepidopteran hosts closely related to its natural host species, A. assectella. To propagate virus in the laboratory, all ascoviruses can be grown in their larval or pupal hosts. To infect caterpillars, they are injected with virus in the third, fourth, or early fifth instar, and virion-containing vesicles are harvested from the hemolymph 5–7 days later. While ascoviruses can be propagated in lepidopteran host cell lines, virus pathology in vitro is usually not as distinct as that observed in vivo. In addition, virus replication in vitro does not appear to be as high as that in vivo.

Pathology and Pathogenesis

Signs of Disease In both natural and laboratory environments, the initial signs of ascovirus disease are very subtle, and this probably accounts for why ascoviruses were discovered only recently. The most obvious sign of disease within 24 h of infection is a decrease in the normal rate of feeding. The feeding rate continues to slow as the disease progresses, and as a result larvae fail to gain weight or advance in development. Healthy larvae, particularly in the early stages of development, will easily quadruple their weight and size in a period of 3 or 4 days, whereas ascovirus-infected larvae cease to grow and may actually lose weight. This feature of ascovirus disease is almost impossible to detect in infected larvae in the field. However, it is easily noticed under laboratory conditions when infected and healthy larvae are reared side by side over a period of a few days. A second feature easily noted in the laboratory is that ascovirus diseases are chronic, though usually fatal. When infected during early stages of development, ascovirus-diseased larvae often survive for 2 or 3 weeks beyond the time at which most healthy larvae have completed their development and pupated. Signs of disease other than these are minor, but include the inability to completely cast the molted cuticle, a bloated thoracic region, and a white or creamy discoloration and hypertrophied appearance of the larval body at advanced stages of disease development.


Cytopathology and Cell Biology In comparison to all other known viruses, the most distinctive property of ascoviruses is the unusual cytopathology that leads to the formation of the virion-containing vesicles. This process resembles apoptosis, and previous studies of the SfAV-1a genome have shown that it encodes a functional executioner caspase, synthesized 9 h after infection, which by itself is capable of inducing apoptosis (Fig. 3); at present, no other virus is known to encode a functional caspase. Although in vitro studies have demonstrated that knock-down of SfAV-1a caspase by RNA interference significantly reduced the formation of apoptotic vesicles, it is unclear whether caspase homologs in other ascoviruses participate in this process. For example, in a similar study a caspase-like gene in HvAV-3e was shown not to be implicated in apoptosis but was essential for virus replication. At the cellular level in vivo, the disease begins with extraordinary hypertrophy of the nucleus accompanied by invagination of sections of the nuclear envelope, followed by a corresponding enlargement of the cell. Cells typically grow to 5 to 10 times the diameter of uninfected cells. As the nucleus enlarges, the nuclear envelope ruptures and disintegrates into fragments. At about this stage, the cell plasmalemma begins to invaginate along “planes” toward the now anucleate cell center. Concomitantly, sheets of membrane form closely adjacent to mitochondria that accumulate along the planes. As this process continues, the membrane sheets coalesce and join the invaginating plasmalemma, thereby cleaving the cell into a cluster of 20 to more than 30 vesicles, ranging in size from 5 to 10 µm in diameter. This aspect of ascovirus cellular pathology resembles the formation of apoptotic bodies during apoptosis. However, rather than dissipate as the cell dies, the developing apoptotic bodies are rescued by the virus and progress to form vesicles in which virions continue to assemble. 
These virion-containing vesicles, also referred to as viral vesicles, typically remain in the tissue until the basement membrane ruptures, though on occasion cell hypertrophy can be so great that the enlarging cell erupts out through the basement membrane of the infected tissue, releasing large fragments of the infected cell directly into the hemolymph. Analysis of ascovirus genomes shows that, unlike many other large DNA viruses, ascoviruses encode several lipid-metabolizing enzymes that are likely involved in the process of converting developing apoptotic bodies into virion-containing vesicles. Recent transcriptome analyses have shown that genes coding for a patatin-like phospholipase, a PlsC phosphate acyltransferase, a fatty acid elongase, and an esterase/lipase in SfAV-1a are highly expressed during later stages of virus infection in vivo. These enzymes are known to modify fatty acid chain length and the degree of lipid saturation, presumably generating novel lipid components in the virion and/or the membranes of viral vesicles. Regarding the latter, it is possible that these lipids, in part, function in providing stability to viral vesicles as they circulate in hemolymph for an extended period encompassing several days or weeks, and when these vesicles are disseminated through oviposition by female parasitic wasps. Another interesting aspect of this cytopathology is that the mitochondria, apparently under ascovirus control, are reprogrammed to assist synthesis of membranes that cleave the cell into the viral vesicles in which most replication occurs. Although the process by which viral vesicles are cleaved from cells varies among different ascoviruses, the histopathology is similar among virtually all ascoviruses. Vesicles accumulate in the tissues where they are formed, but as these tissues degenerate during disease progression, the basement membrane of infected tissues deteriorates and ruptures, allowing the vesicles to spill out into the hemolymph. 
There they accumulate, reaching concentrations as high as 10⁷–10⁸ vesicles per ml within 3–4 days of infection. There is some evidence that viral replication proceeds within the vesicles as they circulate in the hemolymph, and thus this tissue must also be considered one of the tissues attacked by ascoviruses. In fact, because such high concentrations of viral vesicles are found in the hemolymph, this tissue could be considered a major site of infection, particularly if it is eventually shown that these viruses continue to replicate in the vesicles as they circulate in the hemolymph. Despite the chronic nature of the disease caused by ascoviruses, virion-containing vesicles are present in the hemolymph within 2 or 3 days of infection. When the virus replicates in cells in vitro, the vesicles are formed within 12–16 h of infection. The rapid development and circulation of the viral vesicles in the hemolymph probably evolved to enhance transmission of the virus by parasitic wasps.

Tissue Tropism The cytopathology of ascoviruses is consistent among different viral species; however, considerable variation occurs with respect to the tissues attacked, that is, in which replication occurs. TnAV, HvAV, and SeAV exhibit a relatively broad tissue tropism infecting the tracheal matrix, epidermis, fat body, and connective tissue. Differences exist between these species in that some HvAV variants infect the epidermis much more extensively than TnAV variants, whereas some of the latter can also replicate more extensively in fat body cells, but appear only to do this when larvae are infected early in their development. Alternatively, the type species, SfAV, and its variants have a very narrow tissue tropism, with the fat body being the primary site of infection. DpTV occurs in the nuclei of all tissues of its wasp host, but appears to only produce progeny in ovarial tissues. In its lepidopteran pupal host, it attacks and replicates in a wide variety of tissues.

Replication and Virion Assembly Although there have been few biochemical studies of viral DNA replication or protein synthesis, studies carried out with ascoviruses in vivo and in vitro show that progeny virions first appear about 12 h after infection. Virion assembly is initiated after the nucleus ruptures, and occurs prior to and during the cleavage of the cell into viral vesicles. The first recognizable structural component of the virion to form is the multilaminar layer of the inner particle. Based on its ultrastructure, this layer consists of a unit membrane and an exterior layer of protein subunits. As the multilaminar layer assembles, a dense nucleoprotein core aggregates on the interior surface. This process continues until the inner particle is complete. After formation, the inner particle is enveloped by membranes within the cell or vesicle. These membranes are apparently synthesized de novo. Thus, the assembly of the virions is reminiscent of that in other viruses with complex virions, such as the iridoviruses, herpesviruses, and poxviruses, where the virions differentiate after association of the precursors of virion structural components. After formation, the virions of TnAV accumulate toward the periphery of the vesicle where they often form inclusion bodies, that is, aggregations of virions (Fig. 2). In SfAV, occlusion bodies are formed in which the virions are actually occluded in a “foamy” vesicular matrix that consists of a mixture of protein and minute spherical vesicles. When viewed with phase microscopy, these viral inclusion and occlusion bodies are phase bright, and are largely responsible for the highly refractile appearance of the vesicles. Ascoviruses do not typically form the types of occlusion bodies characteristic of other types of DNA insect viruses, such as baculoviruses and entomopoxviruses.

Origin and Evolution The subject of viral evolution over millions of years has received relatively little study due to the lack of a fossil record. Moreover, viruses are considered polyphyletic, and thus most of the more than 70 families of viruses are thought to have originated independently. In this regard, ascoviruses may provide a unique opportunity to obtain insights into virus evolution over long periods. Phylogenetic comparisons of ascovirus genes sequenced to date including those coding DNA polymerase and major capsid protein as well as 20 other proteins indicate that these viruses evolved from a lepidopteran iridovirus (family Iridoviridae). Iridoviruses, in turn, appear to share a common ancestor with two families of Megavirales, Marseilleviridae and Pithoviridae which attack certain aquatic unicellular organisms.

Future Perspectives At present, too little is known about ascoviruses to assess whether they are or will turn out to be of economic importance. Their poor infectivity per os makes it highly unlikely they will ever be developed as viral insecticides, especially given the successful advent of insect-resistant transgenic crops. However, as more entomologists become familiar with the disease caused by ascoviruses, it may be shown that in habitats rarely treated with chemical insecticides, such as transgenic crops, these viruses are responsible for significant levels of natural pest suppression, particularly where parasitic wasps are abundant. Such findings would encourage even greater emphasis on the development of biological control and other more environmentally sound methods of pest control. With respect to the cell biology of viral vesicle formation, ascoviruses provide an interesting model for how apoptosis can be manipulated at the molecular level. Additionally, study of the unusual process by which ascoviruses rescue the developing apoptotic bodies to form viral vesicles could lead to insights into how cells manipulate the cytoskeleton and mitochondria. Finally, it is possible that viral vesicles will provide a unique anucleate cellular system for studying the replication of a complex type of enveloped DNA virus in vitro.

Further Reading
Asgari, S., 2006. Replication of Heliothis virescens ascovirus in insect cell lines. Archives of Virology 151, 1689–1699.
Asgari, S., 2007. A caspase-like gene from Heliothis virescens ascovirus (HvAV-3e) is not involved in apoptosis but is essential for virus replication. Virus Research 128, 99–105.
Asgari, S., Bideshi, D.K., Bigot, Y., Federici, B.A., Cheng, X.-X., 2017. ICTV virus taxonomy profile: Ascoviridae. Journal of General Virology 98, 4–5.
Asgari, S., Davis, J., Wood, D., Wilson, P., McGrath, A., 2007. Sequence and organization of the Heliothis virescens ascovirus genome. Journal of General Virology 88, 1120–1132.
Bideshi, D.K., Bigot, Y., Federici, B.A., Spears, T., 2010. Ascoviruses. In: Asgari, S., Johnson, K.N. (Eds.), Insect Virology. Great Britain: Caister Academic Press, pp. 2–34. (ISBN 978-1-904455-71-4).
Bideshi, D.K., Demattei, M.V., Rouleux-Bonnin, F., et al., 2006. Genomic sequence of Spodoptera frugiperda ascovirus 1a, an enveloped, double-stranded DNA insect virus that manipulates apoptosis for viral reproduction. Journal of Virology 80, 11791–11805.
Bideshi, D.K., Spears, T., Zaghloul, H.A.H., et al., 2018. Ascovirus P64 homologs: A novel family of large cationic proteins that condense viral genomic DNA for encapsidation. Biology 7, 44.
Bideshi, D.K., Tan, Y., Bigot, Y., Federici, B.A., 2005. A viral caspase contributes to modified apoptosis for virus transmission. Genes and Development 19, 1416–1421.
Bigot, Y., Rabouille, A., Doury, G., et al., 1997. Biological and molecular features of the relationships between Diadromus pulchellus ascovirus, a parasitoid hymenopteran wasp (Diadromus pulchellus) and its lepidopteran host, Acrolepiopsis assectella. Journal of General Virology 78, 1149–1163.
Chen, Z.S., Cheng, X.-W., Wang, X., et al., 2018. Proteomic analysis of the Heliothis virescens ascovirus 3i (HvAV-3i) virion. Journal of General Virology 100, 301–307.
Cheng, X.W., Wang, L., Carner, G.R., Arif, B.M., 2005. Characterization of three ascovirus isolates from cotton insects. Journal of Invertebrate Pathology 89, 193–202.
Federici, B.A., 1983. Enveloped double-stranded DNA insect virus with novel structure and cytopathology. Proceedings of the National Academy of Sciences of the United States of America 80, 7664–7668.
Federici, B.A., Govindarajan, R., 1990. Comparative histopathology of three ascovirus isolates in larval noctuids. Journal of Invertebrate Pathology 56, 300–311.
Federici, B.A., Vlak, J.M., Hamm, J.J., 1990. Comparison of virion structure, protein composition, and genomic DNA of three ascovirus isolates. Journal of General Virology 71, 1661–1668.
Govindarajan, R., Federici, B.A., 1990. Ascovirus infectivity and the effects of infection on the growth and development of noctuid larvae. Journal of Invertebrate Pathology 56, 291–299.
Pellock, B.J., Lu, A., Meagher, R.B., Weise, M.J., Miller, L.K., 1996. Sequence, function, and phylogenetic analysis of an ascovirus DNA polymerase gene. Virology 216, 146–157.
Piégu, B., Asgari, S., Bideshi, D., Federici, B.A., Bigot, Y., 2015. Evolutionary relationships of iridoviruses and divergence of ascoviruses from invertebrate iridoviruses in the superfamily Megavirales. Molecular Phylogenetics and Evolution 84, 44–52.
Wang, L., Xue, J., Seaborn, C.P., Arif, B.M., Cheng, X.W., 2006. Sequence and organization of the Trichoplusia ni ascovirus 2c (Ascoviridae) genome. Virology 354, 167–177.
Zaghloul, H.A.H., Hice, R., Arensburger, P., Federici, B.A., 2018. Transcriptome analysis of the Spodoptera frugiperda ascovirus in vivo provides insights into how its apoptosis inhibitors and caspase promote increased synthesis of viral vesicles and virion progeny. Journal of Virology 91 (23), e00874.

Baculovirus–Host Interactions: Repurposing Host-Acquired Genes (Baculoviridae)
A Lorena Passarelli, Kansas State University, Manhattan, KS, United States
© 2021 Elsevier Ltd. All rights reserved.

Glossary
Apoptosis A type of programmed cell death important for the normal development of organisms and used as a defense mechanism by cells to prevent the replication of a pathogen and its further propagation. The process is characterized by DNA fragmentation and cell blebbing. Caspases (cysteine-aspartic proteases), which cleave proteins after aspartic acid residues, are the active enzymes that drive the process.
Co-option Acquisition by an existing gene of a different function and/or regulation as a result of natural selection.
Horizontal gene transfer Lateral transfer of genes between genomes, which has a role in evolution and adaptation and differs from vertical gene transfer, the transfer of genes from parent to offspring.
Host range Group of host species that are susceptible to and permissive for infection by a virus.

Introduction Viruses are obligate intracellular parasites that depend on cellular metabolism and factors to support their multiplication and propagation between hosts. Viral replication cycles, which involve the disassembly and reassembly of viral components to generate new infectious viral particles, provide ample opportunities for viral and host proteins to interact, for genome segments to be exchanged and rearranged, and for competition over abundant or limited resources needed by both viruses and hosts. These interactions lead to an arms race between viruses trying to multiply and host cells mounting immune defenses to halt virus multiplication and ameliorate any cytopathic effects. The process leads to co-evolution of viruses and their hosts. Viruses need to carefully choreograph their host defense weaponry, as killing a host too quickly can result in a dead-end host, curtailing desired levels of virus multiplication and spread. There are many known examples of viruses having acquired host genes that benefit their replication and, in some cases, these genes have been co-opted to defeat the host. The horizontal transfer of genes is bidirectional: viruses acquire genes from a host or another co-infecting pathogen, and hosts acquire viral genes. This review will provide background on gene acquisition by viruses from a host or other organism within that host. It will also summarize gene acquisition by a host from a virus. The general role and examples of host-captured genes in baculoviruses will be discussed in more detail, focusing on two broad groups: viral host range genes and genes that affect host defense pathways.

Viruses Acquiring Genes From Their Hosts Viruses may acquire genes from other organisms, and these may be maintained in viral genomes if they are beneficial for virus replication. Acquired genes may serve to manipulate the immune system of the host by evading, avoiding or hindering immune factors that curtail virus replication. In addition, virus-acquired host genes can enhance virus replication in the native host or expand the host range of the virus, further ensuring its successful spread. Captured genes typically evolve faster than their cellular counterparts, often making the lineage of acquired genes hard to determine and study. In some cases, specific genes have been acquired independently multiple times, for example, IL-10 in herpesviruses and poxviruses. Viruses may acquire genes from their host, a co-infecting pathogen (e.g., a virus or bacterium) or the microbiome of the host (Fig. 1). The transfer of DNA may be complex, involving three or more pathogen-host interactions. For example, bacterial viruses that, along with their obligate intracellular bacterial host, co-infect a eukaryotic organism may facilitate DNA exchange between these organisms. Bacteriophage WO from Wolbachia, Gram-negative bacteria that infect several arthropods, carries a eukaryotic association module, a segment of DNA in the bacterial virus with potentially functional eukaryotic gene domains (e.g., spider toxin, protease cleavage sites), indicating the lateral transfer of genes and the potential role of these domains in factor mimicry.

Hosts Acquiring Genes From Infecting Viruses Although it is more widely discussed how viruses capture genes from their hosts to supplement their replication advantages, the reciprocal also occurs, namely, host organisms acquiring genes from the viruses that infect them. As with virus co-opted genes, hosts can use viral genes for specific purposes and the function of the gene product may be modified. Viruses can integrate into host genomes and may remain as part of the host genome indefinitely. Acquired genes may be co-opted and function as antiviral defense genes, lending a twist to the arms race between viruses and their hosts, where the virus or its host acquired a gene from its


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21551-3


Fig. 1 Gene flow to viruses. Movement of genes to viruses from the host organism, co-infecting pathogens or organisms in the microbiome of the host. Reciprocal movements may also take place.

foe to defeat it. Analyses have indicated that viral RNA-dependent RNA polymerase and capsid genes, two virus-specific genes, are found nested in cellular genomes and, in some cases, expressed, but their specific function in the host genome remains to be determined. Horizontal gene transfer may also involve non-viral invaders; for example, the pillbug Armadillidium vulgare was recently shown to have acquired a gene from the bacterium Wolbachia. Gene transfer can also occur by the integration of virus genes into a host genome, where viral genes usually evolve at the pace of host genes. Some viruses use integration of their genome into host genomes as part of their replication strategy, while others may integrate depending on the environment or to hide from host defense pressures. There is ample evidence of viral genomes “fossilized” in host genomes, underscoring the long relationship and co-evolution between viruses and hosts. A striking example of virus evolution is provided by polydnaviruses. Polydnaviruses have symbiotic relationships with two large, evolutionarily distinct groups of parasitic wasps, ichneumonid and braconid wasps. The wasps parasitize lepidopteran larvae as they oviposit. During oviposition, the eggs are injected along with polydnavirus particles. The encapsidated genomes of polydnaviruses encode genes that interfere with the immunity and physiology of lepidopteran larvae, allowing egg development. In turn, polydnaviruses require the wasp host to spread the virus, since the viral genes that encode structural and replication proteins are located in the wasp genome and not in the encapsidated polydnavirus genome. The ovaries of braconid wasps were reported to express 29 genes that have homologs in, and are derived from, nudiviruses (viruses evolutionarily closely related to baculoviruses). A significant number of these 29 genes (18 genes, or 62%) are also homologs of baculovirus core genes.
Baculovirus core genes are carried by all baculoviruses sequenced to date and are essential for baculovirus replication. Thus, it appears that polydnaviruses evolved from the integration of nudiviruses into the genomes of wasps, aiding wasp reproductive success and expanding the gene repertoire of wasps to more effectively parasitize hosts.

Baculovirus Replication Baculoviruses belong to the family Baculoviridae and the prototype baculovirus is Autographa californica multiple nucleopolyhedrovirus (AcMNPV). Baculoviruses naturally infect insects, yet baculovirus virions can enter, but cannot replicate in, many other types of animal cells. The majority of baculoviruses discovered to date infect the larval stage of insects in the order Lepidoptera, which comprises moths and butterflies. A few other baculoviruses, considered more ancient, infect insects in the orders Diptera (mosquitoes) and Hymenoptera (sawflies). Baculoviruses carry a double-stranded circular DNA genome, ranging in length from 80 to 180 kilobase pairs, that is contained in an enveloped, rod-shaped nucleocapsid. Baculovirus genes are transcribed from both DNA strands and are separated by short intergenic regions, and baculovirus genomes can be engineered to incorporate large segments of foreign DNA. During prototypical baculovirus replication, two types of infectious virions are produced: budded and occlusion-derived virions. Both virion types are produced in the nucleus of infected cells and have identical genomes and capsids, but different envelopes, envelope proteins, and virion morphology. The composition of their envelopes and an environmentally stable protein matrix surrounding the occlusion-derived virions are essential to carry out distinct functions in the replication cycle of the virus. The budded virus exits the nucleus and buds through the cell plasma membrane at late times post infection, obtaining the virus-encoded fusion protein necessary for entry into new cells. Thus, the budded virus is responsible for spreading infection between cells and is the form of the virus used in cell culture. The occlusion-derived virus is produced at very late times post infection. In multiple NPVs (MNPVs), multiple nucleocapsids are co-enveloped and then occluded in a protein matrix composed of the protein polyhedrin.
In single NPVs (SNPVs), single nucleocapsids are enveloped and embedded in polyhedra. Polyhedra are protein crystals containing virions that form an occlusion body. Polyhedrin protects virions in the environment once they are released from an infected host, allowing transmission of the virus to other hosts. Additional information on the replication of baculoviruses is provided in other articles in this volume.


Fig. 2 Baculovirus genes with homology to genes in other organisms. Baculoviruses have acquired genes that fulfill several purposes. Proposed origin of genes or homologies to cellular genes is indicated in parentheses. Genes in addition to those presented here also have cellular homologs. Bro, baculovirus repeated open reading frame; chiaA, chitinase A; ctl, conotoxin-like; cys protease, cysteine protease; dUTPase, deoxyuridine 5′-triphosphate nucleotidohydrolase; egt, ecdysteroid UDP-glucosyltransferase; he65, hr4 left/EcoRI 65-kilodaltons; hel, helicase; iap, inhibitor of apoptosis; mmp, matrix metalloproteinase; mtase, methyltransferase; pcna, proliferating cell nuclear antigen; ptp, protein phosphatase; rnr, ribonucleotide reductase; ser/thr kin, serine/threonine kinase; snf gta, sucrose non-fermenting global transactivator; sod, superoxide dismutase; sox, sulfhydryl oxidase; vef, viral enhancing factor; vfgf, viral fibroblast growth factor; vubi, viral ubiquitin. Information was obtained mainly from Hughes, A.L., Friedman, R., 2003. Genome-wide survey for genes horizontally transferred from cellular organisms to baculoviruses. Molecular Biology and Evolution 20, 979–987; and Thézé, J., Takatsuka, J., Nakai, M., Arif, B., Herniou, E.A., 2015. Gene acquisition convergence between entomopoxviruses and baculoviruses. Viruses 7, 1960–1974.

Baculovirus Acquisition of Genes At least 25% of baculovirus proteins have been shown to directly interact with host factors. The lineages of many of these interacting viral factors can be traced to cellular counterparts, indicating that horizontal gene transfer took place in different baculoviruses at different times in evolution and that these interactions have persisted on an evolutionary scale. A systematic study of virus-host interacting proteins using published work revealed that 29 viral proteins had been shown to interact with cellular host factors or pathways. The study of an overlapping host repertoire between two groups of insect viruses, entomopoxviruses and baculoviruses, suggested analogous selection pressures from their respective hosts, and this may have led to similar genetic acquisitions and repurposing of genes to ensure virus replication. Phylogenetic analyses using these acquired genes indicated that genes had been transferred to insect viruses following coinfections of insects with viruses and other organisms. The origin of several virus-acquired genes was traced to genes obtained from bacteria, and multicellular or unicellular eukaryotes (Fig. 2). In some cases, acquired genes could be traced to the host order, Lepidoptera, and there was clear indication that horizontal gene transfer may have taken place. The authors also cited examples of viral genes that seem to have no corresponding cellular homolog and are only carried by virus(es). Since some of these genes are conserved in both virus families, this implied that the acquired genes conferred a fitness advantage on the viruses that retained them. In addition to gene acquisition from the host and co-option of genes, many baculovirus genes are under positive selection, and this serves to avoid the immune system of the host or other pressures.
Genes under positive selection include viral genes involved in DNA replication (e.g., helicase), structural genes (e.g., ODV-e66), transcription genes (e.g., lef-4), apoptosis inhibitor genes (e.g., p35) and others, painting a complete picture of adaptation that ensures fitness. Some of these genes have homologies to host genes (e.g., helicase) and others do not (e.g., p35). A total of 15 proteins, or 11% of proteins, from the Bombyx mori NPV had recognizable counterparts in the host Bombyx mori, and a number of them affect the physiology of the host but are not essential for virus replication. These included protein kinases 1 and 2, viral superoxide dismutase, DNA polymerase, desmoplakin, inhibitor of apoptosis 1 and 2, UDP-glycosyltransferase, viral fibroblast growth factor, viral ubiquitin, snf2 global transactivator, methyltransferase, viral chitinase A, viral cathepsin, and protein tyrosine phosphatase. In addition, other baculovirus genes with other functions or essential for virus replication also had homology to Bombyx mori genes, indicating broad acquisition of genes from lepidopteran hosts.


A number of viral genes appear to be devoted to relieving translational blocks imposed by the host to prevent the translation of viral products, many in the eIF2α–PKR pathway. Some baculoviruses encode protein kinase 2 (PK2), which inhibits eIF2α by binding to the N-lobe of PKR. This forms a pseudokinase that targets the insect heme-regulated inhibitor-like eIF2α kinase, thereby enhancing virus production. It was concluded that the viral PK2 may have been acquired from a lepidopteran host to compete for shared translation-specific factors. One pathway used by several viruses for replication or spread within or away from the infected cell is actin polymerization. Baculoviruses encode P78/83, a capsid-associated protein with Wiskott-Aldrich syndrome protein-like domains similar to those of actin nucleation-promoting factors. P78/83 is necessary for virus replication. It is a functional factor that activates the host Arp2/3 complex (proteins that regulate the actin cytoskeleton, serving as nucleation sites that seed new actin filaments) after it is trafficked to the nucleus of infected cells, promoting actin polymerization. Baculoviruses can mobilize by assembling G-actin monomers from the host into F-actin filaments. Baculoviruses use actin to move to the nucleus of cells, to egress from the nucleus and to transit through the cytoplasm before budding. Furthermore, nuclear F-actin filaments can embed into the nuclear envelope, disrupt it and facilitate virus egress. Although nuclear actin in eukaryotes and its roles in chromatin remodeling, transcription and processing of RNAs have been more widely studied in recent years, the role of nuclear actin in virus replication is still in the discovery phase.
Thus, baculoviruses have co-opted a common cytoplasmic function to promote actin assembly in the nucleus of cells and as an instrument that disrupts nuclear envelope integrity, so that virions transit into the cytoplasm rather than exiting through the nuclear envelope pores. Baculovirus genes acquired from the host may supply functions that optimize virus replication or facilitate host expansion; however, in some cases, the acquired gene has been co-opted to carry out a new essential function. The envelope fusion protein F of some baculoviruses may have been derived from an insect gene, as indicated by its similarity to the predicted F proteins of three Drosophila species and the mosquito Anopheles gambiae. Alternatively, baculoviruses may have acquired the f gene from f-like genes carried in endogenous retroviruses embedded in insect hosts.

Baculovirus Host Range Genes The host range of baculoviruses has been explored in insects and in cells. Studying the replication of baculoviruses in diverse insects defined alternative hosts in the laboratory that were permissive for virus replication in addition to the host from which the virus was isolated. This defined a broader host range for some members of the Baculoviridae. Surveying infected individuals in nature is expanding our knowledge of the host range of each virus, while genomic sequencing of baculoviruses is further defining viral species that can be found in more than one host. In cells, host-determining factors that expanded the host range of a virus have been described, and it appears that baculoviruses have acquired genes that allow more efficient replication or transmission in some hosts. Baculoviruses can be considered generalists, infecting several insects, or specialists, infecting a small number of insects, depending on the species. In general, they have a relatively narrow host range, a characteristic that is not optimal for agriculturists needing to combat more than one insect pest during a growing season but is advantageous in that they do not target beneficial insects (e.g., pollinators). AcMNPV was originally isolated from the host species Autographa californica and can infect over 35 insect species, while a genetically similar virus isolated from another species, BmNPV from Bombyx mori, infects slightly over a dozen insect species in a laboratory study but has only been successfully isolated from Bombyx mori in the wild. This poses the question of whether additional or specific genes allow some viruses to expand their host range.
Having the appropriate host receptor binding protein for viral attachment may allow virus entry into cells, but successful and complete virus replication may not be supported unless viral genes interact properly with host genes and effectively defeat, reduce or circumvent immune responses. Several baculovirus genes have been characterized that are necessary for a productive infection in another insect host or in a cell line derived from a different host, suggesting that acquisition of genes occurred to expand the host range of a virus. These genes include six factors with seemingly little in common, either structurally or functionally, which suggests that they affect different mechanisms to alter host range, and viruses may have acquired these genes at different times. Briefly, the six baculovirus genes defined as host range genes are described as follows. (1) The viral helicase gene p143 (ac95) is conserved in all baculovirus genomes. Its product, P143, is a helicase required for viral DNA replication and late gene expression. Changing two amino acids in P143 allowed AcMNPV to be lethal in non-permissive Bombyx mori larvae. (2) The AcMNPV p35 gene (ac135), an apoptosis inhibitor gene carried by some baculoviruses, is necessary to prevent apoptosis in cell lines derived from Spodoptera frugiperda but is not necessary for successful replication in cell lines derived from Trichoplusia ni. Mutant viruses lacking p35 are significantly less lethal than wild-type virus in S. frugiperda larvae but not in T. ni larvae. Baculoviruses have other apoptosis inhibitors (P49, IAPs) and in some cases more than one gene copy is present. This indicates that virus replication is dependent on blocking the process of apoptosis in some species but not others, and several factors may be encoded to ensure the proper host-virus interactions. (3) The late expression factor-7 gene (lef-7, ac125) enhances viral late gene expression and DNA replication.
Its deletion affects virus replication in some hosts and cell lines (SF-21) but not in others (TN-368). The BmNPV lef-7 showed little or no replication defects in BmN cells or B. mori larvae. (4) The host cell factor-1 (hcf-1, ac70) is found only in some baculoviruses. It stimulates a late viral promoter in some cell lines (e.g., TN-368), where it is necessary for optimal virus DNA synthesis and virion production. Deletion of hcf-1 affects mortality of T. ni larvae. It is not required in SF-21 cells and a virus with a deletion of hcf-1 can replicate in SF-21 cells and S. frugiperda larvae, but protein synthesis shut-off was impaired. Expression of


HCF-1 allowed Hyphantria cunea MNPV to replicate in non-permissive TN-368 cells. (5) The host range factor-1 gene, hrf-1, is encoded by Lymantria dispar MNPV. When expressed in AcMNPV, it allows AcMNPV to replicate in semi-permissive cells derived from L. dispar and in L. dispar larvae by allowing translation to proceed. A homolog of hrf-1 is also carried by Orgyia pseudotsugata MNPV (OpMNPV), a virus that replicates in L. dispar, suggesting that acquisition of the gene was necessary for these viruses to productively infect L. dispar. However, the OpMNPV hrf-1 did not allow AcMNPV to replicate in L. dispar-derived cells, suggesting differences in functionality or in the interactions with other host or viral factors. (6) The AcMNPV ie-2 (ac151) gene encodes a transcriptional transactivator that acts in the presence of the powerful transactivator IE-1. IE-2 also enhances viral DNA replication and arrests the cell cycle. It is not necessary in T. ni-derived cells, in contrast to S. frugiperda-derived cells, but AcMNPV mutants lacking ie-2 still have diminished replication in T. ni. It is possible that specific genes encoded by some baculoviruses optimize virus replication in some hosts and, depending on host innate immunity, the presence of such genes may determine whether virus replication is successful. The inability to test alternative hosts in some cases precludes obtaining a complete picture of specific gene advantages in those hosts. Altogether, various auxiliary genes functioning at the levels of transcription, translation and DNA biosynthesis are often non-essential for the ability of a virus to infect some hosts but essential in others. Their conservation in some virus species may be indicative of efficacious infectivity in specific hosts. Acquisition of host range genes also suggests that viral genes interact with specific host genes to augment viral replication.

Virus Defense and Resistance to Baculoviruses Several genes that block apoptosis are encoded by baculoviruses to prevent host cell death and thereby allow virus replication. These genes work at different steps in the apoptotic pathway, suggesting different acquisitions during their evolution and the importance of carrying genes that effectively block apoptosis. The gene p35 was discovered in a spontaneously generated mutant of AcMNPV that had a mutation within p35. A virus with this mutation failed to prevent specific cells (i.e., SF-21) from undergoing apoptosis during infection. The p35 gene product, P35, inhibits effector or executioner caspases, cysteine-aspartic proteases, by binding to them in a stoichiometric manner and preventing proteolytic cleavage of their aspartic acid-containing substrate proteins. A similar gene product, P49, is encoded by some baculoviruses but works earlier in the apoptotic cascade than P35, inhibiting initiator caspases. Thus, P49 can rescue p35 mutant viruses by working upstream of P35. Inhibitor of apoptosis genes, iaps, have homologs widespread in metazoans. The viral IAP from Orgyia pseudotsugata multiple NPV, Op-IAP3, binds the insect IAP, stabilizing it so that its otherwise transitory ability to block cell death is maintained. This ensures an effective block of cell death and allows virus replication. The baculovirus Hemileuca sp. NPV encodes a functional serpin. Serpins are serine proteinase inhibitors and mediate the regulation of proteinases involved in the phenoloxidase process, a process that melanizes tissues or pathogens in invertebrates and has an important role in immune response and tissue healing. The baculovirus serpin, Hesp018, inhibited trypsin, chymotrypsin and plasmin. In addition, it blocked bacteria-activated phenoloxidase responses in the hemolymph of the tobacco hornworm Manduca sexta. Expression of hesp018 in AcMNPV resulted in increased budded virus production in Sf9 cells and increased virulence in T. ni larvae.
Other baculoviruses encoding serpin-like genes have not been discovered, indicating that Hemileuca sp. NPV gains a particular advantage in its host from encoding hesp018. The viral serpin gene differs from noctuid and bombycoid serpin-4 genes. Thus, it was concluded that hesp018 was either acquired long ago and slowly diverged or was a recent acquisition that changed at a fast rate compared to the cellular homolog. Other baculoviruses may have other genes that help control insect humoral immunity. For many years it was thought that an advantage of using baculoviruses as pest control agents for important agricultural and forestry plants was that the pests would have difficulty developing resistance to viruses with which they had co-evolved for so long. However, in recent years, it has been found that a specific viral gene in the Cydia pomonella granulovirus involved in transcription, pe38, is necessary for infectivity in codling moths. Resistant insects were identified that could be infected by a virus carrying a mutation within pe38. Acquired host resistance in codling moths has also been documented and has been mapped to dominant gene mutation(s) in sex-linked chromosomes, but the specific resistance genes have not yet been identified. Although the finding that insects have acquired resistance to virus isolates indicates adaptation, this area has been understudied and deserves more attention. In the past, the limited availability of lepidopteran genetic tools and genomic sequences hindered developments in the field, although this is now beginning to change. In addition to baculovirus genes that allow a productive infection in additional hosts and that interfere with host immunity, other baculovirus genes that aid in other processes, including virus assembly, viral DNA replication, host physiology manipulation, and virus dispersal, have homologs in other organisms (Fig. 2).

Concluding Remarks A significant percentage of baculovirus genes have counterparts in cellular organisms, reflecting the close interaction between the two and showing how changing environmental conditions lead to gene exchanges that benefit one and counteract the other. Studies on baculovirus gene lineages indicate that baculoviruses have usurped many of their genes from their insect hosts (some specifically traced to lepidopterans) or bacteria. Homologs of many baculovirus genes are found in insects, prokaryotes, other


eukaryotes and archaea, heightening the richness of tools and strategies acquired through evolution. Thus, baculoviruses have evolved to contain a toolbox suitable for replication in diverse insect hosts. Host-acquired genes include genes important for immunity, spread, genome replication, transcription and other processes. Detailed analyses of the phylogenetics of each gene will provide a more comprehensive picture as to when gene acquisition took place, the cellular organism from which genes were acquired, additional host interactions, and insights into gene function. The fluidity of gene transfer suggests the need for baculoviruses to optimize their replication strategies in different hosts through time and under diverse pressures. Genes from a host or from another organism present in a host can be acquired by a virus to gain selective advantages, depending on environmental cues. These can include fighting host defenses, mounting offenses to debilitate host processes and expanding host range. Viruses can use different strategies to further their replication capabilities: mimicry, where gene products mirror host proteins and serve as decoys; co-option, the adaptation of different functions of an original host protein to aid in virus replication; and gene acquisition, to gain an additional function not previously encoded. Altogether, gene acquisition adds tools to a viral toolbox that can be effectively used to either defeat the host or enhance virus replication. Viral genes acquired from other organisms generally have low similarity to, and simpler domains than, the cellular gene, making it difficult to define their origin. In addition, viral genes incur additional and more rapid changes than host genes, making it harder to discern the directionality of horizontal gene transfer events. Co-opted genes can acquire a different function, further muddying identification of gene origins.
Genes transferred between viruses and their hosts have co-evolved in a continuous arms race for the survival of each. Virus-host networks can be studied using molecular methods that provide a footprint of gene acquisition. More specifically, the transfer of genes and the relationships (genetic or behavioral) between viruses and their hosts can be studied using diverse systems, including high-throughput methods and genome-wide screens that map host genes that affect various viral replication steps; next generation sequencing of host and viral genomes combined with detailed and structure-based phylogenetics to define evolutionary lineages; molecular interactomes using the yeast two-hybrid system, mass spectrometry complex identification or computational algorithms to identify the interactions between factors; identification of microbiome components to further define the interactions and gene transfer between organisms and viruses; and host experiments supporting in vitro ones, when possible. The amalgamation of various methods will help define the cross-talk and mobility of genes between viruses, their hosts, host microbiomes and additional parasitic organisms replicating in a host. Co-option of genes aids in our understanding of the different potential functions of gene products in different settings throughout evolution.

Classification (Compact) Family: Baculoviridae; Genera: Alphabaculovirus, Betabaculovirus, Deltabaculovirus, Gammabaculovirus; Species: Autographa californica multiple nucleopolyhedrovirus (for example).

Virion Structure Virion structure is complex. Capsids have helical symmetry (200–450 nm in length, 30–100 nm in diameter). Two enveloped forms of virus are produced: budded and occluded viruses.

Genome One double-stranded circular DNA genome, 80–180 kilobase pairs in length.

Replication Cycle Insects ingest occlusion bodies containing occlusion-derived viruses and released virions infect midgut epithelial cells. Following budding from midgut cells, the budded virions infect insect cells in the hemocoel, including hemocytes and cells in various tissues. Infected cells produce more budded virus for systemic spread within an insect and occluded virus for insect-to-insect spread. Upon host death, occluded virions are dispersed for new hosts to ingest. In a few baculoviruses, particularly deltabaculoviruses, the infection is midgut restricted.

Epidemiology Insects acquire infection through consumption of the occluded form of the virus as a contaminant of their food. In some cases, the virus can be transmitted transovum or transovarially. Baculoviruses have a relatively narrow host range and can cause epizootics, altering insect population dynamics.

Pathogenesis The tissues and cuticle of infected insects liquefy through the concerted activity of baculoviral chitinase and cathepsin enzymes, dispersing occluded virus. Cuticle discoloration may be apparent. At the cellular level, cells round up and the nucleus is enlarged. At very late times post infection, cells exhibit occlusion or granular bodies containing virions with one or more co-enveloped nucleocapsids.

Diagnosis Infected insect larvae have a lighter color and their cuticle is fragile, easily breaking to the touch and dispersing virus. Infected larvae climb to the top of plant foliage, a virus-induced behavior called Wipfelkrankheit or tree-top disease, and this allows more efficient virus spread on the foliage below, upon larva liquefaction or melting.

Further Reading
Clem, R.J., 2015. Viral IAPs, then and now. Seminars in Cell & Developmental Biology 39, 72–79.
Hill, T., Unckless, R.L., 2017. Baculovirus molecular evolution via gene turnover and recurrent positive selection of key genes. Journal of Virology 91, e01319.
Hughes, A.L., Friedman, R., 2003. Genome-wide survey for genes horizontally transferred from cellular organisms to baculoviruses. Molecular Biology and Evolution 20, 979–987.
Katsuma, S., Kawaoka, S., Mita, K., Shimada, T., 2008. Genome-wide survey for baculoviral host homologs using the Bombyx genome sequence. Insect Biochemistry and Molecular Biology 38, 1080–1086.
Kong, M., Zuo, H., Zhu, F., et al., 2018. The interaction between baculoviruses and their insect hosts. Developmental and Comparative Immunology 83, 114–123.
Thézé, J., Takatsuka, J., Nakai, M., Arif, B., Herniou, E.A., 2015. Gene acquisition convergence between entomopoxviruses and baculoviruses. Viruses 7, 1960–1974.

Baculoviruses: General Features (Baculoviridae)
Vera ID Ros, Wageningen University and Research, Wageningen, The Netherlands
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
AcMNPV Autographa californica multiple nucleopolyhedrovirus (multiple refers to the multiple nature of nucleocapsids within individual ODV virions of nucleopolyhedroviruses as opposed to those where only a single nucleocapsid is present within each virion from an ODV)
AgMNPV Anticarsia gemmatalis multiple nucleopolyhedrovirus
BmNPV Bombyx mori nucleopolyhedrovirus
CfMNPV Choristoneura fumiferana multiple nucleopolyhedrovirus
CpGV Cydia pomonella granulovirus
CrleGV Cryptophlebia leucotreta granulovirus
CuniNPV Culex nigripalpus nucleopolyhedrovirus
HearNPV Helicoverpa armigera nucleopolyhedrovirus
LdMNPV Lymantria dispar multiple nucleopolyhedrovirus
NeleNPV Neodiprion lecontei nucleopolyhedrovirus
SeMNPV Spodoptera exigua multiple nucleopolyhedrovirus
SpliNPV Spodoptera littoralis nucleopolyhedrovirus
SpltNPV Spodoptera litura nucleopolyhedrovirus
XecnGV Xestia c-nigrum granulovirus

Glossary
Budded virus (BV) One of the two virion types of baculoviruses that is produced when nucleocapsids bud from the plasma membrane of an infected cell and mediates the systemic cell-to-cell spread within the infected insect.
Granule The OB of granuloviruses (plural granules).
Granulin The major protein in granulovirus OBs.
Granulovirus (GV) Name used for all baculoviruses that form granular-shaped OBs of which the major matrix protein is granulin, comprising all viruses from the genus Betabaculovirus.
Nucleopolyhedrovirus (NPV) Name used for all baculoviruses that form polyhedral-shaped OBs of which the major matrix protein is polyhedrin, comprising all viruses from the genera Alphabaculovirus, Deltabaculovirus and Gammabaculovirus.
Occlusion body (OB) Proteinaceous crystalline structure protecting the infectious ODVs, responsible for horizontal insect-to-insect spread.
Occlusion derived virus (ODV) One of the two virion types of baculoviruses that is assembled in the nucleus where nucleocapsids obtain an inner nuclear membrane-derived envelope, and that becomes occluded within the proteinaceous crystalline matrix of the occlusion body.
Polyhedrin The major protein in nucleopolyhedrovirus OBs.
Polyhedron The OB of nucleopolyhedroviruses (plural polyhedra).

Historical Perspective The history of the discovery of baculoviruses is closely linked to the development of the silk industry, which originated in China and arrived in Europe during medieval times. The dramatic effect of baculoviruses on insect hosts was already described in 1527 by Marcus Hieronymus Vida, bishop of Alba in Italy, long before the identity of baculoviruses was revealed. His poem “Bombycum” (The silkworm) includes a description of liquefying silkworms (caterpillars of the silk moth Bombyx mori), later known as jaundice disease and characteristic for baculovirus infections. After the invention of light microscopy, Maestri and Cornalia independently described refractile occlusion bodies which were commonly polyhedron shaped, leading to the naming of the disease observed in the silkworms as ‘polyhedrosis’ by the mid-1800s. Glaser, Paillot, and colleagues, between 1913 and 1928, demonstrated that the polyhedrosis disease of caterpillars was due to a filterable agent and therefore must have a viral etiology. With the perfection of insect tissue culture methodology, William Trager in 1935 reported on the ability of the Grasserie virus, now known as Bombyx mori nucleopolyhedrovirus (BmNPV), to replicate in silkworm tissue culture cells. This led to the opportunity of studying the replication of these viruses under more controlled conditions than possible when working with whole insect larvae and free of other microbial agents.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21549-5

Table 1 Number of species per baculovirus genus

Genus               Member species (ICTV)a    Related, unclassified species
Alphabaculovirus    53                        15
Betabaculovirus     26                        0
Deltabaculovirus    1                         2
Gammabaculovirus    2                         2
Total               82                        19

a Based on the 10th report of the International Committee on Taxonomy of Viruses (ICTV) (2019, as per https://talk.ictvonline.org/ictv-reports/ictv_online_report/dsdna-viruses/w/baculoviridae).

Nomenclature, Taxonomy, and Classification The family Baculoviridae comprises insect viruses with 80–180 kbp circular double-stranded DNA (dsDNA) genomes encoding 100–200 proteins. The family name is derived from the rod-shaped (baculum, Latin for stick) nucleocapsids. In the 10th report of the ICTV (2019; see “Relevant Website” section), a total of 82 baculovirus species are recognized and a further 19 tentative species are listed (Table 1). Four genera are distinguished within the Baculoviridae, based on phylogeny, genome characteristics, host range and occlusion body (OB) morphology: Alphabaculovirus (lepidopteran-specific nucleopolyhedroviruses (NPVs)) with 53 species, Betabaculovirus (lepidopteran-specific granuloviruses (GVs)) with 26 species, Deltabaculovirus (dipteran-specific NPVs) with one species, and Gammabaculovirus (hymenopteran-specific NPVs) with two species. Based on phylogenetic inferences but not yet recognized taxonomically, lepidopteran-specific alphabaculoviruses are further divided into Group I and Group II alphabaculoviruses (Fig. 1). These two groups differ significantly in gene content; most notably, Group I alphabaculoviruses encode GP64 as the BV fusion protein, whereas Group II alphabaculoviruses encode Fusion protein (F) but lack gp64. Baculoviruses are unique among viruses in having a biphasic replication cycle with two morphologically distinct virion types, budded viruses (BVs) and occlusion derived viruses (ODVs) (Fig. 2). Early in infection, BVs are produced when newly formed nucleocapsids bud out of the insect cell, obtaining an envelope from the cell membrane (Fig. 3). BVs are responsible for cell-to-cell transmission within the infected insects (as well as in cell culture). Later in infection, ODVs are formed in the host nucleus where nucleocapsids obtain a membrane from microvesicles derived from the inner-nuclear membrane. The ODVs are subsequently embedded in a polyhedral (NPV) or granular (GV)-shaped proteinaceous crystalline matrix, forming viral OBs. The occlusion body form is responsible for horizontal insect-to-insect spread. ODVs may have one or several nucleocapsids within a single envelope, depending on the species of baculovirus, forming single (S) or multiple (M) NPVs, respectively. Baculoviruses infect larval stages of species belonging to the insect orders Lepidoptera, Diptera (family Culicidae) and Hymenoptera (suborder Symphyta). NPVs are found in these three insect orders, while GVs are restricted to the Lepidoptera.
Baculoviruses are normally named according to the initial host from which they were isolated and the type of OB/ODV associated with it. While this might cause some confusion, since the same virus might occur in a range of insect species, this approach is still applied today. Originally baculovirus abbreviations used the first letter of the genus and species name of the host insect followed by NPV or GV depending on the type of occlusion body. Hence the MNPV from the alfalfa looper Autographa californica is abbreviated as AcMNPV and the MNPV from Choristoneura fumiferana is abbreviated as CfMNPV. Those baculoviruses which have historically had a two-letter species code continue to maintain those abbreviations. However, as more baculoviruses were being described, a two-letter species designation system was untenable and now a four-letter species identification based on the first two letters of each of the genus and species names of the host insect from which the virus was first isolated is used. Hence the NPV from Helicoverpa armigera is abbreviated as HearNPV and that from Neodiprion lecontei is abbreviated as NeleNPV. If identical letterings might result, the third or later letters are used instead of the second, as seen for SpltNPV for an NPV from Spodoptera litura that was chosen since SpliNPV was already used for an NPV from Spodoptera littoralis. Although the S and M are useful morphological descriptors in the names indicating if nucleocapsids exist either singly or in multiples within ODV virions, they do not appear to have any taxonomic relevance and so in more recently described baculoviruses these designations have been dropped but have been maintained for those baculoviruses in which there is a strong historical precedence. To determine whether an isolate belongs to an existing virus species or should be the basis of a new species requires biological and molecular information. 
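The four-letter abbreviation convention described above can be sketched in a few lines of Python. This is only an illustration of the rule as worded in the text, not an official ICTV procedure. The function name `abbreviate`, the `taken` set used to track abbreviations already in use, and the choice to vary only the second letter of the species part on a clash are assumptions made for the example.

```python
def abbreviate(host, ob_type, taken):
    """Build a four-letter virus abbreviation such as 'HearNPV'.

    host    -- binomial name of the insect host, e.g. 'Helicoverpa armigera'
    ob_type -- suffix for the virus type, e.g. 'NPV' or 'GV'
    taken   -- set of abbreviations already in use (hypothetical bookkeeping)
    """
    genus, species = host.split()
    # Keep letters only, so that 'c-nigrum' yields 'cnigrum'.
    genus_letters = [c for c in genus if c.isalpha()]
    species_letters = [c.lower() for c in species if c.isalpha()]
    genus_part = (genus_letters[0] + genus_letters[1]).capitalize()
    first = species_letters[0]
    # Try the second species letter first; on a clash use the third, fourth, ...
    for letter in species_letters[1:]:
        candidate = genus_part + first + letter + ob_type
        if candidate not in taken:
            taken.add(candidate)
            return candidate
    raise ValueError("no unused abbreviation found for " + host)


used = set()
names = [
    abbreviate("Helicoverpa armigera", "NPV", used),
    abbreviate("Neodiprion lecontei", "NPV", used),
    abbreviate("Spodoptera littoralis", "NPV", used),
    abbreviate("Spodoptera litura", "NPV", used),
    abbreviate("Xestia c-nigrum", "GV", used),
]
print(names)
```

With the hosts registered in this order, the Spodoptera litura virus receives SpltNPV because SpliNPV is already taken by the Spodoptera littoralis virus, matching the example in the text. Historical two-letter codes such as AcMNPV are not covered by this sketch.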
A Kimura two-parameter method including the information from all 38 baculovirus core genes can be used for species demarcation, along with additional information such as host range and genome organization.
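As a rough illustration of the distance measure named above, the Kimura two-parameter (K2P) distance between two aligned nucleotide sequences can be computed as d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. The sketch below applies this to a single pair of pre-aligned sequences. In practice the comparison would be made on alignments of the core genes, and the distance thresholds used for species demarcation are not given here, so none are assumed. The function name `k2p_distance` is chosen for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip alignment gaps and ambiguous bases.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten sites: P = 0.1, Q = 0.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(d, 4))
```

The correction makes the distance slightly larger than the raw proportion of differing sites, because it accounts for multiple substitutions at the same site.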

Morphology Viruses in all four baculovirus genera have two structurally and biochemically distinct virion phenotypes (BVs and ODVs; Fig. 2) and a biphasic replication cycle (Fig. 3). These virions consist of cylindrical, rod-shaped nucleocapsids contained within a lipid bilayer envelope. The nucleocapsids contain the dsDNA genome and range in size from 30 to 60 nm in diameter and, depending on the size of the genome, 250–300 nm in length. They consist of a single circular supercoiled dsDNA packed into a protein capsid. The nucleocapsids have a flat base structure on one end and a nipple-like cap on the other end. BVs contain a single nucleocapsid surrounded by a loose-fitting envelope. The BV envelope, obtained from the cellular plasma membrane during budding, is modified at the nipple end by the occurrence of spikes or peplomers consisting of viral glycoproteins such as GP64 or F proteins. ODVs contain one or several nucleocapsids present in a parallel array, enveloped in a membrane derived from the inner nuclear membrane in the nucleus (for alphabaculoviruses) or in the nuclear-cytoplasmic milieu after loss of the nuclear membrane (for betabaculoviruses). ODVs are embedded in paracrystalline occlusions forming either polyhedra (NPVs) or granules (GVs). Polyhedra are roughly cuboidal (although other shapes have been described) and contain multiple ODVs. Alphabaculovirus OBs are 0.15–5.0 µm in size and contain many ODVs, each containing, depending on the virus species, either single (S) or multiple (M) nucleocapsids. Betabaculovirus OBs (granules) are ovicylindrical in shape and much smaller, measuring 0.12–0.50 µm. Each granule characteristically contains a single virion with a single nucleocapsid. Within the genus Deltabaculovirus only a single species has been classified, Culex nigripalpus nucleopolyhedrovirus. The ODVs of CuniNPV are embedded in an OB composed of a crystalline matrix of a viral protein with no homology to polyhedrin or granulin. The OBs range in size from 0.5 to 5.0 µm and contain few (1–4) or many (>50) ODVs depending on the strain of the virus. Gammabaculovirus OBs contain multiple ODVs, each containing a single nucleocapsid. Genome sequences of gammabaculoviruses revealed that they do not encode an envelope fusion protein as found in other baculoviruses, questioning the role of the BV phenotype in the biology of gammabaculoviruses.

Fig. 1 Schematic representation of baculovirus phylogeny, showing the different genera and other (sub) groups, based on the baculovirus core-genome phylogeny of Thézé, J., Lopez-Vaamonde, C., Cory, J.S., Herniou, E.A., 2018. Biodiversity, evolution and ecological specialization of baculoviruses: A treasure trove for future applied research. Viruses 10, 366. The original phylogeny was obtained from maximum likelihood inference analysis of the concatenated amino acid alignment of 37 baculovirus core genes. Values at nodes represent bootstrap values (100 replicates).

Fig. 2 Baculovirus virions and nucleocapsids. The two baculovirus virion phenotypes, occlusion derived virus (ODV) and budded virus (BV), are illustrated as diagrams with shared and phenotype-specific components. Taken from the International Committee on Taxonomy of Viruses (ICTV) website (https://talk.ictvonline.org/).

Fig. 3 Schematic representation of the alphabaculovirus infection cycle. A lepidopteran larva ingests plant material contaminated with occlusion bodies (OBs) (a). The OBs dissolve in the alkaline midgut, resulting in release of the occlusion derived viruses (ODVs). The ODVs pass the peritrophic membrane and bind to and fuse with the midgut epithelial cell membrane, after which the nucleocapsids enter the cell (primary infection) (b). The parental nucleocapsids migrate towards the nucleus for the first round of replication forming new nucleocapsids that migrate to the plasma membrane, or directly migrate to the plasma membrane. The nucleocapsids bud out of the cell, acquiring an envelope from the plasma membrane which contains the viral fusion protein (GP64 or F-protein; depicted as red spikes), forming budded viruses (BVs) (c). BVs are responsible for cell to cell spread (secondary or systemic infection). The BVs infect surrounding columnar or regenerative midgut cells via endocytosis. From the midgut, the infection is spread to other larval tissues via the hemolymph and the trachea (d). At the very late stage of infection, the nucleocapsids remain inside the host cell nucleus to be enveloped by a microvesicle membrane that is derived from the inner-nuclear membrane, forming new ODVs. These are subsequently occluded in a protein matrix, forming new OBs. Finally, the cells lyse and the exoskeleton is degraded, resulting in liquefaction of the larvae and release of the OBs into the environment. 1. Occlusion body; 2. Occlusion derived virus; 3. Basal lamina; 4. Regenerative midgut cell; 5. Budded virus; 6. Epithelial midgut cell; 7. Peritrophic membrane; 8. Midgut lumen; 9. Trachea. Drawings produced by Bob Boogaard.

Genomes, Gene Content, Organization Baculovirus genome size varies considerably among the currently sequenced genomes, ranging from 81,755 bp for the gammabaculovirus Neodiprion lecontei NPV (NeleNPV), encoding 89 open reading frames (ORFs), to 178,733 bp for Xestia c-nigrum GV (XecnGV), encoding 181 ORFs. Baculovirus ORFs are defined as those encoding proteins of at least 50 amino acids with minimal (<75 nucleotides) overlap with each other. The guanine-cytosine (GC) content of baculovirus genomes ranges from 32.4% for Cryptophlebia leucotreta granulovirus (CrleGV) to 57.5% for Lymantria dispar multiple nucleopolyhedrovirus (LdMNPV). Baculovirus genes are distributed on both strands, in both orientations. The gene order does not follow a general pattern but is conserved among related baculoviruses. Intergenic regions are short, and promoters and/or 3′ untranslated regions of flanking genes may overlap. A characteristic of most baculovirus genomes is the presence of several homologous repeat regions (hrs), consisting of repeats of short, often palindromic sequences, typically 30–40 nt in length. Cell culture assays revealed that hrs function as origins of replication and enhancers of transcription. Comparison of all fully sequenced baculovirus genomes of all four genera has revealed a set of 38 core genes (Table 2), which are shared among all baculovirus species. Most of these genes encode proteins involved in basic functions, including DNA replication, gene transcription, progeny virion assembly and structure, and oral infectivity. Additional sets of genes are shared within genera, or among groups of related baculoviruses. Other genes are unique to one or a few baculovirus species. Baculovirus transcription is temporally regulated in four phases: immediate early, delayed early, late and very late. Expression of genes in one phase is dependent on the expression of genes in the preceding phase. The expression of some genes extends through several phases. Early genes are expressed prior to DNA replication and are transcribed by host RNA polymerase II. Late and very late genes are transcribed by a viral RNA polymerase. The very late genes comprise two highly expressed genes: the polyhedrin or granulin gene, encoding the OB matrix protein, and the p10 gene, encoding a 10 kDa protein that forms fibrillar structures important for OB formation and release.

Table 2 Overview of the baculovirus core genes and the function of the encoded proteins

Baculovirus core genesa      Function of the encoded protein
lef-1 (ac14)                 DNA primase
lef-2 (ac6)                  DNA replication/associated with LEF-1
dnapol (ac65)                DNA polymerase
p143 (ac95)                  Helicase
alkexo (ac133)               Alkaline exonuclease
lef-4 (ac90)                 mRNA capping enzyme
lef-8 (ac50)                 RNA polymerase subunit
lef-9 (ac62)                 RNA polymerase subunit
p47 (ac40)                   RNA polymerase subunit
lef-5 (ac99)                 Transcription initiation factor
vlf1 (ac77)                  Very late factor 1 (p10 and polh expression)
ac53                         U-box/RING-like domains, NC formation
vp1054 (ac54)                NC assembly
vp39 (ac89)                  Major capsid protein
38k (ac98)                   NC assembly
p49 (ac142)                  NC assembly
P6.9 (ac100)                 DNA condensation, NC protein
desmop (ac66)                Desmoplakin, NC protein
odv/bv c42 (ac101)           NC associated protein
p18 (ac93)                   Egress of NCs
p48 (ac103)                  BV production and ODV envelopment
p74 (pif-0) (ac138)          Per os infectivity factor
pif-1 (ac119)                Per os infectivity factor
pif-2 (ac22)                 Per os infectivity factor
pif-3 (ac115)                Per os infectivity factor
pif-4 (ac96)                 Per os infectivity factor
odv-e56 (pif-5) (ac148)      Per os infectivity factor
pif-6 (ac68)                 Per os infectivity factor
pif-7 (ac110)                Per os infectivity factor
P95 (pif-8) (ac83)           Per os infectivity factor and NC protein
odv-e18 (ac143)              ODV envelope-protein
odv-e25 (ac94)               ODV envelope-protein
odv-e27 (ac144)              ODV envelope-protein
gp41 (ac80)                  Tegument protein
ac109                        Structural protein with unknown function
p33 (ac92)                   Sulfhydryl oxidase/preventing oxidative stress
ac78                         Interacts with sulfhydryl oxidase
ac81                         Non-structural protein of unknown function

Functional categories used to group the rows, in order: Replication; Gene expression; Nucleocapsid (NC) assembly; ODV morphogenesis/oral infectivity; Other.

a Name in brackets refers to the gene number on the AcMNPV genome, the first baculovirus complete genome to be sequenced.
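The ORF definition and the GC-content figures quoted in the genome section lend themselves to a short worked example. The sketch below is a minimal illustration in Python. The function names `gc_content` and `meets_orf_definition` are invented for the example, the cutoffs of 50 amino acids and 75 nucleotides are taken from the text, and the input sequence is a toy string rather than real genome data.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases of a DNA sequence."""
    seq = sequence.upper()
    counted = sum(1 for base in seq if base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / counted


def meets_orf_definition(protein_length_aa, overlap_nt,
                         min_length_aa=50, max_overlap_nt=75):
    """Apply the ORF criteria from the text.

    An ORF is accepted when it encodes at least `min_length_aa` amino acids
    and overlaps a neighbouring ORF by fewer than `max_overlap_nt` nucleotides.
    """
    return protein_length_aa >= min_length_aa and overlap_nt < max_overlap_nt


# Toy sequence (not a real baculovirus sequence): 4 of 8 bases are G or C.
print(gc_content("GGCCAATT"))
# A 120-residue ORF overlapping its neighbour by 10 nt passes the filter.
print(meets_orf_definition(120, 10))
```

Applied to a full genome sequence, `gc_content` would return values in the 32.4% to 57.5% range reported above for CrleGV and LdMNPV.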

Evolution Baculoviruses are related to other insect viruses with large dsDNA genomes, including nudiviruses (Nudiviridae), bracoviruses (genus Bracovirus, family Polydnaviridae; endogenized viruses derived from the nudiviruses) and hytrosaviruses (Hytrosaviridae), all of which are exclusively pathogenic to arthropods, harbor rod-shaped enveloped nucleocapsids and replicate in the nucleus. Nudiviruses and baculoviruses are sister groups, bracoviruses are nested within the nudiviruses and hytrosaviruses are an outgroup. Nudi- and baculoviruses first evolved around 310 Mya in the Paleozoic Era during the Carboniferous Period with the first insects. Further virus diversification is linked to the diversification of insect orders. More distantly related to these insect viruses are whispoviruses, infecting shrimp, of the family Nimaviridae. The baculovirus lineage appeared shortly after the appearance of holometabolous insects – insects with larval stages undergoing complete metamorphosis. Baculoviruses have been characterized only in holometabolous insects from the orders Diptera, Hymenoptera and Lepidoptera. Phylogenetic analyses revealed ancient co-evolution between baculoviruses and these insects, and also showed that the baculovirus ancestor was already specialized to infect larval stages of holometabolous insects. Within the baculoviruses, the four genera form distinct and major lineages, reflecting co-evolution of baculoviruses and their insect hosts. The deltabaculoviruses are the most ancient baculoviruses and are ancestral to the gammabaculoviruses. Both are ancestral to the lepidopteran-infecting alpha- and betabaculoviruses, which are sister groups. Alphabaculoviruses are further subdivided into two groups: the monophyletic Group I alphabaculoviruses are nested within the Group II alphabaculoviruses (Fig. 1). Two additional subgroups, clades Ia and Ib, can be distinguished within Group I alphabaculoviruses, while clades IIa, IIb and IIc are found within Group II alphabaculoviruses (Fig. 1).
Phylogenetic analyses of whole genome sequences have greatly expanded our understanding of baculovirus diversity and evolution and showed that host shifts played a major role in the diversification of baculoviruses. Colonization of new ecological niches has sometimes led to viral radiation. Furthermore, extensive horizontal gene transfer between baculoviruses and insect hosts and between baculoviruses and other viruses has occurred.
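The branching order described in this section can be written down as a small tree to make the sister-group statements concrete. The following Python sketch is an illustrative simplification only. It assumes a strictly bifurcating topology read from the prose (hytrosaviruses as outgroup, nudiviruses as sister to baculoviruses, and deltabaculoviruses, then gammabaculoviruses, branching off before the alpha/beta split). It omits bracoviruses (nested within nudiviruses), whispoviruses and the Group I/II subdivision, and it carries no branch lengths or support values. The names `TREE`, `leaves`, `are_sisters` and `to_newick` are invented for the example.

```python
# Nested tuples: each tuple is an internal node, each string a tip.
BACULOVIRIDAE = (
    "Deltabaculovirus",
    ("Gammabaculovirus", ("Alphabaculovirus", "Betabaculovirus")),
)
TREE = ("Hytrosaviridae", ("Nudiviridae", BACULOVIRIDAE))


def leaves(node):
    """Return the set of tip names below a node."""
    if isinstance(node, str):
        return {node}
    tips = set()
    for child in node:
        tips |= leaves(child)
    return tips


def are_sisters(node, clade_a, clade_b):
    """True if some bifurcating node splits exactly into clade_a and clade_b."""
    if isinstance(node, str):
        return False
    if len(node) == 2:
        split = {frozenset(leaves(node[0])), frozenset(leaves(node[1]))}
        if split == {frozenset(clade_a), frozenset(clade_b)}:
            return True
    return any(are_sisters(child, clade_a, clade_b) for child in node)


def to_newick(node):
    """Serialize the topology in Newick format (without the final ';')."""
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


print(to_newick(TREE) + ";")
print(are_sisters(TREE, {"Alphabaculovirus"}, {"Betabaculovirus"}))
print(are_sisters(TREE, {"Nudiviridae"}, leaves(BACULOVIRIDAE)))
```

The Newick string produced this way can be loaded into standard tree viewers, although it should be treated as a summary of the text rather than as an inferred phylogeny.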

Infection Cycle Insects become infected by baculoviruses when an insect larva consumes viral OBs during feeding (Fig. 3). The OBs dissolve rapidly in the alkaline environment of the larval midgut, releasing the ODVs. In order to access the target columnar midgut epithelial cells, the released virus must first penetrate the peritrophic membrane (a mesh of chitin fibrils linked to glycoproteins and proteoglycans) lining the gut. This process is facilitated by enhancins (metalloproteases), which are encoded by most GVs and some NPVs. Enhancins are also thought to increase the fusion efficiency between the viral and cellular membranes during infection. The ODV envelope fuses with the host cell plasma membrane, releasing the nucleocapsids into the cytoplasm. Fusion is preceded by binding of the ODV to the microvilli via a specific, protease-sensitive receptor. Attachment and fusion require a set of proteins called per os infectivity factors (PIFs). These proteins are encoded by core genes, and currently nine pif genes (pif-0 to pif-8) are known (Table 2). The released nucleocapsids migrate along cytoplasmic actin filaments to nuclear pores. The nucleocapsid is uncoated either at the nuclear pore or as it enters the nuclear pore to release the DNA core into the nucleus. In the nucleus, transcription of viral genes is initiated, followed by DNA replication, late gene transcription, viral protein translation and nucleocapsid assembly. Nucleocapsids are transported to the basal side of the infected midgut cell where they form BVs by budding out of the cell, thereby acquiring an envelope from the host cell plasma membrane which has been modified to contain viral proteins such as GP64 or F. BVs are responsible for cell to cell spread within the insect host (secondary or systemic infection). Cell entry of BVs occurs via clathrin-mediated endocytosis.
Most baculoviruses disseminate throughout the insect, infecting all tissues, via the hemolymph (after passing the basal lamina) and/or via the tracheal system. For some baculoviruses, including some GVs, gammabaculoviruses and deltabaculoviruses, infections are limited to host midgut epithelial cells. The role of BVs in these infections, if any, remains unclear. At the very late stage of infection, cells switch to ODV production and OB formation (Fig. 3). Within the nucleus, nucleocapsids acquire an envelope from the inner-nuclear membrane. The virions then become occluded within occlusion bodies (polyhedra or granules). Systemic infection is so effective that following death, up to 25% of the larval mass is due to polyhedra. The virally encoded, late-expressed enzymes chitinase and cathepsin aid cell lysis and exoskeleton degradation, liquefying the larvae and releasing the OBs into the environment. These enzymes are not encoded by baculoviruses whose infection is limited to the midgut. For these, the virus particles are released by sloughing of infected midgut cells and by excretion of viruses.

Baculovirus Transmission and Host Behavioral Manipulation OBs are highly persistent in the environment. OBs can even survive passage through the gastrointestinal tract of birds, facilitating their long distance dispersal. Baculoviruses also ensure enhanced dispersal by modifying host behavior. Baculovirus-induced behavioral changes were first reported in 1891 by Hofmann, representing the oldest written record of parasitic manipulation of host behavior. Infected Lymantria monacha larvae lost their appetite and migrated to the top of trees where they liquefied, termed by Hofmann “Wipfelkrankheit” (tree-top disease). Such climbing prior to death has been observed for a range of baculovirus–caterpillar combinations, and infected caterpillars were found to move towards the light. In addition, infected caterpillars may show increased locomotion behavior (hyperactivity or hypermobility). These behavioral changes are thought to enhance viral spread, by increasing the area over which the virus is distributed and by making deceased caterpillars more visible to predators including birds. Recent studies have revealed part of the molecular mechanisms underlying these virus-induced behavioral changes. A virally encoded protein tyrosine phosphatase (PTP) is involved in the induction of hyperactivity in BmNPV-infected B. mori larvae as well as in AcMNPV-infected S. exigua larvae. Since ptp is restricted to Group I alphabaculoviruses, other viral genes are most likely involved in hyperactivity induced by baculoviruses outside Group I. Furthermore, it was found that the egt gene, present in nearly all baculoviruses and encoding an ecdysteroid UDP-glucosyl transferase (EGT), plays a role in baculovirus-induced tree-top disease in certain virus-host combinations. In LdMNPV-infected L. dispar larvae, egt is involved in the induction of tree-top disease; however, the egt gene of AcMNPV did not induce tree-top disease in S. exigua or Trichoplusia ni larvae. Baculoviruses apparently possess multiple mechanisms to modify host behavior, dependent on the specific virus-host combination studied. In addition to horizontal transmission of OBs, by either consumption of OB-contaminated food or by cannibalism, baculoviruses are also vertically transmitted. Natural and laboratory populations often harbor covert baculovirus infections. The molecular basis for covert baculovirus infections is currently unclear; however, these covert viruses can be vertically transmitted from parents to offspring. Baculoviruses therefore have a mixed-mode transmission, with both horizontal and vertical transmission, possibly allowing virus persistence when opportunities for horizontal transmission are limited. Covert infections can be re-activated to produce lethal disease, followed by horizontal transmission.

Baculovirus Expression Technology The discovery in the 1980s that the polyhedrin promoter of AcMNPV could be exploited to produce high levels of recombinant proteins in insect cells marks the beginning of the baculovirus expression vector (BEV) system. Since then, baculovirus-based expression technology has developed extensively and is now widely used to produce proteins of interest and commercial vaccines for both human and veterinary use, such as the Circumvent® PCV-M G2 vaccine developed by Merck against both circovirus and Mycoplasma hyopneumoniae in pigs, and the Cervarix vaccine against human papillomavirus. As a BEV is a eukaryotic expression vector, many of the common post-translational modifications of mammalian proteins also occur in the BEV system, although glycoproteins produced in insect cells generally carry more uniform but less complex glycans than those produced in mammalian cells. Efforts are under way to “mammalianize” glycoprotein processing in insect cells to further improve the authenticity of baculovirus-expressed mammalian proteins. Furthermore, recombinant baculoviruses are used as gene delivery vectors for mammalian cells and as expression vectors for adeno-associated virus (AAV)-based gene therapy products.

Baculoviral Insecticides Since their discovery, baculoviruses have been studied mainly for their value as potent, biologically based insect pesticides. Baculoviruses are naturally occurring pathogens and therefore have a much smaller environmental impact than chemical insecticides in the management of agricultural and forest pest insects. Furthermore, baculoviruses are highly specific with a limited host range, have no negative effects on non-target organisms and can be mass-produced, formulated, packaged and stored. More than 60 baculovirus-based pesticides have been utilized worldwide to control diverse insect pests. One of the most effective is Anticarsia gemmatalis MNPV (AgMNPV), which was successfully used to control the velvetbean caterpillar A. gemmatalis, an abundant soybean pest in Brazil and other South American countries. The use of AgMNPV reduced larval populations by 80%, the same level as for insecticide treatment. Other highly successful examples of baculovirus insecticides include Cydia pomonella GV (CpGV) to control the codling moth C. pomonella in apples, pears and walnuts, Helicoverpa armigera NPV (HearNPV) to control the cotton bollworm H. armigera, a highly polyphagous pest worldwide, and Spodoptera exigua MNPV (SeMNPV) to control the beet armyworm S. exigua, a highly polyphagous and widely distributed pest both in the field and in greenhouses. Although baculoviruses have been successfully applied, they have some limitations compared to chemical insecticides, including a slower speed of kill, a very narrow host range (which may hinder commercialization), a susceptibility to UV light and high production costs (large scale production still relies on in vivo systems). In addition, resistance against baculoviruses may develop, as occurred in some C. pomonella populations in Europe.
Since the speed of kill is much faster for smaller instars (and since older instars may bore into stems or fruits, where they are unreachable by any pesticide), the effectiveness of baculovirus application can be optimized by proper monitoring in the field, enabling a timely application. To further shorten the time to kill and to increase the efficacy of baculoviruses, recombinant baculoviruses have been developed in the laboratory by deleting baculovirus genes, by adding genes encoding insect-specific toxins from scorpions or spiders, by exchanging genes among different baculoviruses or by reorganizing the genome. The stability of baculoviruses in pesticide formulations can be further improved by adding new and more effective additives. Despite their limitations, baculoviruses have a promising future as biopesticides and can be successfully implemented in integrated pest management strategies.


Baculoviruses: Molecular Biology and Replication (Baculoviridae)
Monique M van Oers, Wageningen University and Research, Wageningen, The Netherlands
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
AcMNPV Autographa californica multiple nucleopolyhedrovirus
BmNPV Bombyx mori nucleopolyhedrovirus
ChchNPV Chrysodeixis chalcites nucleopolyhedrovirus
CuniNPV Culex nigripalpus nucleopolyhedrovirus
HearNPV Helicoverpa armigera nucleopolyhedrovirus
LeseNPV Leucania separata nucleopolyhedrovirus
MaviNPV Maruca vitrata nucleopolyhedrovirus
OpMNPV Orgyia pseudotsugata multiple nucleopolyhedrovirus
SeMNPV Spodoptera exigua multiple nucleopolyhedrovirus

Glossary
Budded virus (BV) The type of baculovirus virion that is formed by budding from the plasma membrane of infected cells and mediates the systemic spread of infection in the infected insect.
Granulovirus (GV) A betabaculovirus, forming granular-shaped occlusion bodies containing only a single virion each.
Homologous repeat (hr) A DNA sequence that typically comprises a series of imperfect palindromes and is repeated at various locations in a baculovirus genome. Baculovirus homologous repeats function as origins of DNA replication and as transcriptional enhancers.
Nucleocapsid (NC) Nucleoprotein helical structure forming the core of the enveloped baculovirus.
Nucleopolyhedrovirus (NPV) Indication used for all baculoviruses that form polyhedral-shaped occlusion bodies (OBs) in the nucleus (compare granulovirus, GV, with granular OBs).
Occlusion body (OB) A typical polyhedral or granular-shaped structure made of a paracrystalline protein matrix in which the baculovirus ODVs are embedded and that is found in the nucleus of infected cells.
Occlusion-derived virus (ODV) The type of baculovirus virion that is assembled in the nucleus and becomes embedded within the paracrystalline matrix of the occlusion body.
PIF proteins A conserved series of crucial per os infectivity factors that are encoded by baculovirus genomes and are specifically required for oral infection.
Polyhedrin The major occlusion body protein of nucleopolyhedroviruses.

Introduction
Baculoviruses are insect-infecting viruses characterized by rod-shaped, enveloped virions that are packaged in viral occlusion bodies (OBs). These OBs allow the virus to survive outside a host insect for long periods. Baculoviruses replicate in the nucleus of infected cells and are further characterized by their double-stranded (ds), circular DNA genomes, ranging in size from 80 to 180 kbp. Historically, baculoviruses were divided based on their OB appearance into granuloviruses (GVs), with a single virion per granular-shaped OB, and nucleopolyhedroviruses (NPVs), with many virions per polyhedral-shaped OB. The family Baculoviridae is subdivided into four genera. Lepidopteran-infecting baculoviruses are classified in the genus Alphabaculovirus or Betabaculovirus, while the Gammabaculovirus and Deltabaculovirus genera encompass the baculoviruses that infect hymenopteran and dipteran insects, respectively. So far, only one deltabaculovirus is known, Culex nigripalpus (Cuni) NPV, which infects mosquito larvae. The alphabaculoviruses, on the other hand, form the most widely studied group of baculoviruses, primarily because permissive insect cell cultures are available for many of these viruses. Autographa californica multiple nucleopolyhedrovirus (AcMNPV) is the most thoroughly analysed alphabaculovirus. Furthermore, AcMNPV and Bombyx mori (Bm) NPV are used as expression vectors to produce (therapeutic) proteins, sub-unit vaccines and virus-like particles in cultured insect cells or caterpillars. AcMNPV is also an important tool to produce viral vectors for gene therapy purposes, especially recombinant adeno-associated virus (AAV). In addition, AcMNPV is being used for mammalian cell transduction, and research is ongoing to use AcMNPV in human medicine as a gene delivery vector.
A number of alpha- and betabaculoviruses have been developed into bio-insecticides, while biocontrol programs based on gammabaculoviruses to control sawflies also exist. The remainder of this article will focus on viruses belonging to the genus Alphabaculovirus.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21550-1


Infection Cycle
Two Virion Phenotypes
Two distinct types of virions are produced in the infection cycle of alphabaculoviruses: the budded virus (BV) and the occlusion-derived virus (ODV). The latter type is embedded in OBs. BVs represent the first phenotype to be produced by an infected cell. BVs consist of a single nucleocapsid surrounded by an envelope obtained by budding from the plasma membrane of the infected cell. As the infection progresses, ODVs are assembled in the nucleus of infected cells and acquire an envelope that is derived from the inner nuclear membrane. Because the BV and ODV envelopes derive from different sources, they differ significantly in lipid and protein composition (see Fig. 1). Unlike the ODVs of the baculoviruses classified in the other genera, the ODVs of alphabaculoviruses can contain single (S) or multiple (M) nucleocapsids (NCs) per virion, depending on the virus species. The S and M designation (SNPV and MNPV; e.g., AcMNPV) does not hold taxonomic significance above the species level, but is in practice a convenient way to discriminate virus isolates that were obtained from the same host but belong to different viral species. Towards the end of the infection, the ODVs become occluded in a para-crystalline protein matrix to form OBs that accumulate in the nucleus of the infected cell. OBs represent a feature that is common to all the currently classified baculoviruses, but the shape and size of the OBs depend on the virus isolate. The AcMNPV OBs are approximately 10 µm in diameter and are clearly visible using a light microscope. In NPVs these OBs have polyhedral shapes, hence the historical name “nuclear polyhedrosis” for the observed pathology that gave the NPVs their name. The major component of the OBs of most baculoviruses is the OB matrix protein, polyhedrin, which forms the bulk of the para-crystalline array in which the ODVs are embedded. Betabaculoviruses have an evolutionarily related, though slightly different protein, called granulin. Mature OBs are surrounded by a carbohydrate layer known as the polyhedral envelope or calyx. The alphabaculovirus-specific phosphoprotein called PP34, or the polyhedral envelope protein, is the major protein associated with the calyx in this group of viruses. The OBs of an AcMNPV pp34 deletion mutant were found to have an increased sensitivity to alkaline disruption and an enhanced virulence in fourth instar Spodoptera exigua larvae, suggesting that the polyhedral calyx may stabilize the OBs in the environment. EM images of such a mutant showed OBs with a pitted appearance, apparently due to the loss of ODV particles.

Fig. 1 Schematic diagram of alphabaculovirus virion phenotypes showing BV and ODV structures. Components common to both virion phenotypes are shown in the center and components unique to each phenotype are indicated on the left and right. Note the BV glycoprotein spikes at both ends of the BV, as well as the ODV envelope proteins, which include a complex of per os infectivity factors (PIFs). PM = plasma membrane; INM = inner nuclear membrane. Adapted from Encyclopedia of Virology 3rd edition based on new insights (courtesy of D.A. Theilmann) and originally derived from Blissard, G.W., 1996. Cytotechnology 20, 73–93 with kind permission from Springer Science and Business Media.


Oral and Systemic Infection
The infection cycle is initiated when insects orally ingest OBs by consuming contaminated plant material. The OBs pass through the foregut, enter the midgut, and disassemble in response to the alkaline environment there. This process releases the ODVs, which then traverse the peritrophic membrane (a protein–chitin structure that lines the gut) in order to infect the columnar epithelial cells of the lepidopteran midgut. This infection process may be aided by metalloproteases called enhancins, which degrade the peritrophic membrane and are encoded by certain alpha- and most betabaculoviruses. The ODVs bind to microvilli on the surface of midgut epithelial cells and are believed to enter the cells by direct fusion of the ODV envelope with the plasma membrane. A set of ten baculovirus per os infectivity factors (PIF proteins), embedded in the ODV envelope, is essential for oral infectivity and is involved in binding and fusion. Upon entry into the midgut epithelial cell, the NCs are released and transported along actin filaments to the nucleus, where transcription of viral genes is initiated. Viral DNA replication and NC assembly occur within the nucleus in a viral structure known as the virogenic stroma (see Fig. 2). After an initial round of replication in the gut, BVs are produced at the basal side of the columnar cells and these BVs spread the infection systemically to the other tissues. BVs carry a viral fusion protein in their envelope, allowing them to enter host cells by endocytosis. A recent study showed that internalization of AcMNPV BVs is facilitated by actin polymerization and dynamin. Microtubules, on the other hand, are involved in intracellular transport of BV-loaded endosomes. While still in an early endosomal stage, the viral envelope fuses with the endosomal membrane, releasing the NCs into the cytoplasm. These NCs then move to the nucleus along actin filaments. Most cells and tissues of the host insect are permissive for infection by BVs. These secondarily infected cells first produce more BVs, and subsequently also produce ODVs and OBs.

Dissemination of OBs
Finally, the nuclei of the infected host cells become filled with OBs. At least two viral enzymes, chitinase and cathepsin, assist in the release of OBs into the environment in a process called liquefaction: the disruption of infected cells and tissues by the breakdown of chitin and proteins, respectively. A third protein, called P10, helps to disintegrate the infected cell nuclei, thereby further dispersing individual OBs. The OBs released into the environment can remain stable and infectious for years if protected from ultraviolet light. The spread of OBs into the environment is probably aided by viral modification of host behaviour. Many alphabaculovirus-infected caterpillars show pre-death climbing behaviour (Wipfelkrankheit) and can finally be found hanging from the upper segments of the plants on which they fed. Release of the OBs at such elevated locations is believed to enhance transmission by contaminating the lower regions of the plant. Up in the plants, the infected caterpillars are also easier prey for predatory birds, which have been shown to aid in dispersal of OBs over large distances. In some baculovirus species, the egt gene, which encodes the enzyme ecdysteroid UDP-glucosyltransferase (EGT) that prevents larval moulting, is needed for pre-death climbing behaviour. In other species, EGT is not required for climbing behaviour, or rather plays an indirect role by prolonging the life span of infected caterpillars, thereby providing the time in which the caterpillars start to climb. The exact induction mechanism therefore remains unclear and most likely varies depending on the combination of virus and host species. In addition, it was shown that both AcMNPV and BmNPV induce hyperactivity in infected caterpillars, for which the viral protein tyrosine phosphatase (PTP) is responsible, a protein specific for the clade of alphabaculoviruses to which these two viruses belong.

Genome Organization and Content
All baculoviruses have circular, double-stranded DNA genomes that are negatively supercoiled. The complete genomic sequences of over 200 alphabaculoviruses have been deposited in GenBank (NCBI). The smallest alphabaculovirus genome is that of Maruca vitrata (Mavi) NPV at 111,953 bp, while Leucania separata (Lese) NPV has the largest alphabaculovirus genome at 168,041 bp. The numbers of predicted genes with open reading frames (ORFs) of 50 amino acids or greater range from 126 (MaviNPV) to 169 (LeseNPV). The best studied alphabaculovirus, AcMNPV, has a genome size of 133,894 bp with 154 predicted genes and was the first baculovirus genome to be sequenced in its entirety. A distinct feature of nearly all baculoviruses is the presence of regions with homologous repeats (hrs) distributed over the genome. The hr regions contain a series of repeated palindromic sequences and can vary significantly in length. In the AcMNPV genome seven hrs are present, which contain repeats of an imperfect palindrome with a central EcoRI restriction site. Transient transcription and plasmid replication assays have shown for alphabaculoviruses that the hr sequences function as transcription enhancers and as replication origins. In addition, single-copy non-hr replication origins have been identified in the genomes of AcMNPV, Orgyia pseudotsugata (Op) MNPV, and Spodoptera exigua (Se) MNPV. Baculovirus genomes contain tightly packed ORFs with very short intergenic regions. Early and late genes are dispersed over the genome and both strands of the dsDNA genome carry approximately equal numbers of ORFs. Baculovirus genomes are relatively stable, but do display some plasticity, resulting in small variations in genome sequence when comparing virus isolates from the field. During evolution, recombination events and transposon insertions have led to the integration of new genes from co-infecting viruses or from the host insect.
This has sometimes resulted in extensive differences in gene content between baculoviruses belonging to different viral species. However, a set of 38 core genes is conserved in all sequenced baculoviruses (marked blue in Fig. 3) and as such forms the baculovirus genomic hallmark. The core genes are often involved in crucial processes in the viral replication cycle, such as midgut entry, DNA replication, transcription of viral genes and virion assembly. (For an overview table of the core genes, please see the supplementary material.) However, not all core genes are essential to allow the virus to infect a host; they may also play a role in virus fitness. On the other hand, genes that are not categorized as core genes may be crucial to the infection process for the viruses that carry these genes. Phylogenetic analysis using the combined sequences of the core genes divides the baculoviruses into four major evolutionary clades, and these correspond with the four genera in the family Baculoviridae mentioned above (Fig. 4). Such a phylogenetic analysis also shows that the alphabaculoviruses are subdivided into two distinct clades designated as group I and group II. The group I clade appears to be relatively cohesive, with a well-defined phylogeny, whereas the group II clade represents a less homogeneous collection of viruses. On top of the 38 baculovirus core genes, a number of genes are conserved in all lepidopteran-infecting baculoviruses (alpha- and betabaculoviruses; purple text in Fig. 3). Other genes are fully conserved only in alphabaculoviruses (printed in red) or are restricted to group I (green text). Examples of the latter category are the ptp gene mentioned earlier and the gp64 gene, which encodes the group I BV fusion protein (see below). Several other genes are exclusively found in group II alphabaculoviruses (data not shown). Some baculovirus genes are, as far as we know, unique to a particular isolate. Many non-core genes have also been characterized in detail and these are often correlated with specific, sometimes even essential, functionalities (Table 1). Orthologues of essential genes (whether core or non-core genes) are likely to possess similar functionalities when present in other baculovirus clades. Baculoviruses that lack particular orthologues of non-core genes may encode evolutionarily unrelated, but functionally similar proteins to be able to execute these crucial tasks. Functions of a number of genes important for alphabaculoviruses are described in the sections below.

Fig. 2 Electron and light micrographs of baculovirus-infected insect cells. (a) An insect cell infected with AcMNPV showing the nuclear virogenic stroma (VS) that contains developing nucleocapsids and the OBs showing ODVs in the occlusion process (arrows). Scale = 1 µm. (b) An OB of the MNPV type, showing the process of occlusion of ODVs, containing multiple nucleocapsids in each virion. Many ODVs can be found in an OB. (c) A mature OB of the SNPV type, showing a single nucleocapsid in each virion (ODV) and many ODVs per occlusion body. Also note the formation of the polyhedral envelope (PE). (d) An NPV nucleocapsid budding from the plasma membrane (PM) of an infected cell. (e) Light microscopy of Tn5b1–4 cells infected with AcMNPV producing large relatively uniform OBs. (a–d) Courtesy of R.R. Granados and (e) kindly provided by G.W. Blissard.

Fig. 3 Genomic map of the archetype alphabaculovirus, AcMNPV. The genome contains 154 open reading frames that encode predicted proteins of 50 or more amino acids. Functional groups of genes are highlighted (colored arrows), as well as baculovirus core genes (blue text), genes present in all lepidopteran baculoviruses (purple text), those present in all lepidopteran NPVs (in red) and those specific for Group I NPVs (in green). The inner circle shows the EcoRI restriction map of the C6 strain of AcMNPV. Locations of homologous repeat or hr regions (which contain repeats of EcoRI restriction sites) are indicated.

Fig. 4 Neighbour-joining tree based on the amino acid alignment of 29 (of 38) core genes as identified in the first 29 sequenced baculovirus genomes. The alignment comprised 16349 positions. All branches have bootstrap values exceeding 50%. Bootstrap values >95% are given along the branches. Adapted from Jehle, J.A., Blissard, G.W., Bonning, B.C., et al., 2006. Archives of Virology 151, 1257–1266 with permission from Springer Nature Switzerland AG.
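The gene counts quoted in this section rest on a simple annotation rule: an ORF is counted when it encodes 50 or more amino acids. The following Python sketch illustrates that threshold only. The function names are hypothetical, the sequence is treated as linear (a circular genome would also need ORFs that span the origin to be considered), and the ATG-to-stop definition of an ORF is an assumption made for illustration, not the annotation pipeline used for the published genomes.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_aa: int = 50) -> List[Tuple[str, int, int]]:
    """Return (strand, start, end) for ATG..stop ORFs of at least min_aa residues.

    Coordinates are 0-based, end-exclusive, and relative to the strand that
    was scanned. The sequence is treated as linear (illustrative assumption).
    """
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon of a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    n_aa = (i - start) // 3  # residues, stop codon excluded
                    if n_aa >= min_aa:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs
```

With the default threshold, a 50-codon ORF is reported while a 49-codon ORF is not, which mirrors the cut-off stated in the text and in the legend of Fig. 3.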

Baculovirus Gene Expression
Temporal Regulation of Transcription
Baculovirus genes are expressed in a temporal cascade, beginning with immediate early and delayed early gene expression, which is followed by late and very late gene expression. After the initial uncoating of the viral genome in the nucleus, transcription of viral early genes is initiated by host RNA polymerase II, resulting in early transcription and the subsequent production of proteins associated with viral DNA replication and late gene expression. A variety of regulatory and even structural protein genes are also transcribed in the early phase. Production of early gene products results in the initiation of viral DNA replication, and the assembly and activity of a virus-encoded RNA polymerase. This viral RNA polymerase recognizes unique viral late promoters, resulting in the transcription of genes encoding viral structural proteins as well as other late genes. Late gene transcription starts just prior to or concomitantly with the onset of DNA replication. Very late gene expression occurs at the terminal part of the replication cycle and includes the hyper-expressed viral genes polh and p10, encoding the major occlusion body protein polyhedrin and the 10K protein (P10) mentioned above, which forms fibrillar structures in the nucleus and cytoplasm.

Table 1 Conservation of non-core genes in lepidopteran baculoviruses [a,b]

A. Genes conserved in all alpha- and betabaculoviruses
Gene (AcMNPV ORF) | Function
polh (ac8) [c] | Occlusion body matrix, granulin in GVs
bion (ac13) | Baculovirus protein associated with inner and outer nuclear membrane (BION), involved in nuclear lipid accumulation
f protein gene (ac23) [d] | BV fusion protein (group II NPVs) or truncated homolog (group I NPVs)
dbp (ac25) [c] | DNA binding protein
lef6 (ac28) | Unknown role in late gene transcription
fgf (ac32) | Fibroblast growth factor
v-ubi (ac35) | Ubiquitin-like protein
pp31 (ac36) | PP31; phosphoprotein with ssDNA binding capacity
lef11 (ac37) | Essential factor for DNA replication
adprase (ac38) | ADP-ribose pyrophosphatase (ADPRase)
fp25 (ac61) [c] | Nuclear transport of ODV envelope proteins
gp37 (ac64) | Fusolin-like protein, unclear function
ac82 | Telokin-like protein
p12 (ac102) | ODV-associated, nuclear localisation of G-actin
pif9 (ac108) [c] | Per os infectivity factor
p24 (ac129) | Capsid protein
pp34 (ac131) [c] | Polyhedral envelope-associated phosphoprotein (PEP)
me53 (ac139) | NC-associated protein that interacts with GP64 in virus budding
ac145 [c] | Possible role in oral infectivity
ac146 | Structural protein of BV and ODV
ac75 [c], ac76 [c], ac106/107 [c] | Unknown function

B. Genes conserved in all alphabaculoviruses
Gene (AcMNPV ORF) | Function
orf1629 (ac9) | P78/83; essential structural protein of the nucleocapsid
egt (ac15) [e] | Ecdysteroid UDP-glucosyltransferase, inhibits moulting
pkip (ac24) | Stimulates the activity of the viral protein kinase-1 in vitro
ac51 | DNAJ domain protein, unknown function
chaB (ac60) [e] | Unknown function
vp80 (ac104) | Essential structural protein of the nucleocapsid
p10 (ac137) [e] | Major fibrillar structural protein, OB formation and dissemination
exon0 (ac141) [e] | NC-associated, BV nuclear egress
ie0 (ac141 and ac147) | Transcriptional trans-activator, DNA replication
ie1 (ac147) [e] | Transcriptional trans-activator, DNA replication
ac18, ac102 [e] | Unknown functions

C. Genes conserved in group I and in almost all group II alphabaculoviruses
Gene (AcMNPV ORF) | Function
arif1 (ac20/21) | Rearrangement of cellular actin
odv-e66 (ac46) [f] | ODV-E66
lef10 (ac53a) [f] | Late expression factor 10
iap2 (ac71)/iap3 | RING domain protein, possible apoptosis inhibition
gp16 (ac130) | Unknown function
p26 (ac136) | Unknown function
ac17, ac19, ac26, ac29 [f], ac43, ac55, ac56, ac58/59, ac120 | Unknown functions

D. Genes conserved in all group I alphabaculoviruses
Gene (AcMNPV ORF) | Function
ptp (ac1) | Protein tyrosine phosphatase, OB formation, behavioural manipulation
bv-odve26 (ac16) | BV and ODV envelope-associated protein
iap1 (ac27) | RING domain protein, possible apoptosis inhibition
gta (ac42) | Global trans-activator-like protein
ac443 | Homolog of ascovirus iap/RING finger protein
gp64 (ac128) | BV-specific glycoprotein required for virion entry and budding
lef7 (ac125) [g] | Late gene expression, role in DNA replication
ie2 (ac151) | RING finger domain protein, transcriptional activation
pe38 (ac153) [g] | RING domain protein, transcriptional activation
ac4 [g], ac14 [g], ac30, ac73, ac74 [g], ac79, ac91 [g], ac111 [g], ac114, ac117 [g], ac124, ac132 | Unknown functions

[a] Conservation levels were retrieved from Rohrmann, G., 2010. Chapter 12 – The AcMNPV genome: Gene content, conservation, and function. Baculovirus Molecular Biology, third ed. Bethesda, MD: National Center for Biotechnology Information. Available from: https://www.ncbi.nlm.nih.gov/books/NBK138304/. Information on the function of BION (ac13) was retrieved from Nagamine, T., Inabe, T., Saiko, Y., 2019. Virology 532, 108–117. Available at: https://doi.org/10.1016/j.virol.2019.04.006.
[b] A similar table, but for the baculovirus core genes, can be found in the chapter Baculoviruses: General Features by V.I.D. Ros.
[c] Also present in gammabaculoviruses.
[d] Also present in the deltabaculovirus CuniNPV.
[e] Also present in individual members of one or more other lineages.
[f] Also present in individual members of one or more other lineages.
[g] Also present in individual members of one or more other lineages.

Early gene expression
Transfected purified baculovirus DNA is infectious, indicating that no viral proteins are required to initiate or mediate early transcription from the viral genome. Early viral gene expression does, however, depend upon the host cell’s RNA polymerase II complex and is sensitive to the drug α-amanitin. The early genes have been divided into two categories, immediate early (IE) and delayed early (DE) genes. IE genes require only cellular factors for expression, whereas DE genes are either dependent on, or substantially upregulated by, prior viral gene expression. Early gene products are involved primarily in gene regulation, host modification and DNA replication, or are required for late gene expression. Many IE genes have a common motif, TATA-N(24–26)-CAGT, at their transcription start site. The promoters of IE and DE genes resemble typical eukaryotic RNA pol II promoters and contain host cell transcription factor binding sites. The key viral regulatory proteins for early (and late) gene transcription are the immediate early proteins IE0 and IE1, encoded by the essential ie0 gene complex. This ie0 gene has two promoters and produces spliced and non-spliced transcripts. It is the only known baculovirus gene that produces two viral proteins via spliced transcripts, although several more baculovirus mRNAs are known to be generated through splicing. The IE0 protein contains the entire IE1 amino acid sequence, but in addition has an N-terminal extension that is variable in length depending on the viral species (54 amino acids in AcMNPV). Both the IE0 and IE1 proteins are present during the entire infection process because the ie0 transcript can also produce IE1, in addition to IE0, by internal translation initiation at the start codon of the ie1 sequence. IE0 has peak expression prior to viral DNA replication, while IE1 becomes more abundant than IE0 when replication starts. IE1 levels continue to increase until the final stages of the infection. IE0 appears to be specific to alphabaculoviruses. IE0/1 is a typical acidic domain transcriptional trans-activator, similar to the herpesvirus VP16 protein, and activates viral gene transcription by both enhancer-dependent and enhancer-independent mechanisms. Both proteins form dimers that bind to the genomic hr sequences, which serve as the transcriptional enhancers. IE0/1 is also essential for the viral replication complex, where it is believed to play the role of an origin binding protein.
IE0 and IE1 can support viral DNA replication and BV production equally, but for some promoters (lef4, lef6, ac79, p35, ac18, lef3, ac111, and 39K) IE0 is a more potent activator than IE1. This difference in potency is seen only when IE1/IE0 are expressed at relatively low levels, in line with the situation at early times post-infection. It has been suggested that IE0 expression in midgut cells may stimulate rapid DNA replication and BV production to allow the progeny virus to quickly escape the midgut and allow systemic dissemination of the infection. Additional IE genes include the alphabaculovirus-specific ie2 and pe38 genes, which have been shown to augment viral transcription and, as indicated previously, viral DNA replication. Unlike ie0/1, both ie2 and pe38 can be deleted without interfering with the capacity to replicate. For viruses with ie2 or pe38 gene deletions, however, BV production and viral DNA replication are less efficient. Some of the most highly expressed viral IE genes are he65 and me53. The presence of G-actin within the nucleus upon infection has been shown to be in part due to the function of HE65, whereas ME53 is needed for optimal virus production and in infected cells is associated with GP64 at foci thought to be budding sites. Overall, IE genes play crucial regulatory roles that coordinate the infection cycle. The gp64 gene, encoding the AcMNPV BV fusion protein, is transcribed early in infection by host factors, but also has a late promoter element. DE gene expression requires other viral factors, such as IE0/IE1, to achieve above-basal levels of expression. An example of a DE promoter is the AcMNPV pp31 promoter, which contains a TATA-CAGT sequence at the transcription start site and is dependent on IE1 for activated transcription. The above-mentioned genes lef4, lef6, ac79, p35, ac18, lef3, ac111, and 39K also belong to the DE genes. In general, DE gene products are required for DNA replication and/or late gene expression.
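The early promoter motif described above, a TATA box followed 24 to 26 nucleotides downstream by CAGT, can be located with a simple pattern search. The Python sketch below is illustrative only: the function name is hypothetical, it assumes an unambiguous A/C/G/T sequence, it scans a single strand, and an exact sequence match is of course no proof that a site functions as a promoter.

```python
import re
from typing import List, Tuple

# A lookahead is used so that overlapping candidates are all reported.
# Group 1 is the whole motif, group 2 the 24-26 nt spacer.
_IE_MOTIF = re.compile(r"(?=(TATA([ACGT]{24,26})CAGT))")


def find_ie_motifs(seq: str) -> List[Tuple[int, int]]:
    """Return (start, spacer_length) for each TATA-N(24-26)-CAGT match.

    start is the 0-based position of the first T of the TATA box. When more
    than one spacer length fits at the same start, the longest is reported
    because the quantifier is greedy.
    """
    hits = []
    for match in _IE_MOTIF.finditer(seq.upper()):
        hits.append((match.start(1), len(match.group(2))))
    return hits
```

To scan the complementary strand, the same function can be applied to the reverse complement of the sequence.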

Late and very late gene expression

Baculovirus late genes are transcribed by a virus-encoded, DNA-dependent, α-amanitin-insensitive RNA polymerase that recognizes baculovirus late promoters. Studies of the purified NPV RNA polymerase have shown that it is composed of four major proteins: LEF4, LEF8, LEF9, and P47. Homologs of the corresponding genes have been identified in all baculovirus genomes sequenced to date. LEF4 has been shown to have enzymatic activity for RNA capping, and LEF8 and LEF9 contain conserved motifs found in other RNA polymerases. With only very rare exceptions, the baculovirus late promoters recognised by this viral RNA polymerase contain the sequence TAAG, and transcriptome analysis has shown that the 5′ end of late transcripts corresponds to the second position of this motif. In the model late promoters that have been examined, only very short sequences immediately upstream and downstream of the TAAG sequence were identified as components of the late promoter. Regions farther downstream have been shown to affect levels of very late gene expression. Using transient assays, twenty AcMNPV genes were identified as important or necessary for late gene expression: the late expression factor genes (lef1 to lef12), pp31, p47, DNApol, vlf1, helicase, p35, ie1/0, and ie2. Many of the lef genes are required for DNA replication, a prerequisite for late gene expression (see below). Typical examples of late genes are those encoding the structural proteins of the nucleocapsids and the envelope proteins of BVs and ODVs. Both gp64 and me53 have an early and a late promoter, allowing expression in both phases. At very late times post-infection, transcription of the two hyper-expressed late genes (polh and p10) is upregulated, resulting in accumulation of the polyhedrin and P10 proteins to extremely high levels.
The burst of very late gene transcription is mediated by very late factor 1 (VLF1), which binds to an A/T-rich region, called a burst sequence, located downstream of the DTAAG motif in the polh and p10 promoter regions. The hyper-expression of these two genes during the very late phase forms the basis of the baculovirus expression system. This recombinant protein production platform uses the promoters of either or both of these genes to efficiently express heterologous genes of interest in cultured insect cells.
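The late promoter rule described in this section (a TAAG motif, with the 5′ end of the transcript at the second position of the motif) can be sketched as a short search routine. This is a minimal illustration under stated assumptions: the function name and the toy sequence are hypothetical, only the strand given as input is scanned, and the presence of TAAG alone does not show that a site is used as a promoter.

```python
def late_promoter_starts(seq, motif="TAAG"):
    """Return 0-based indices of the predicted transcript 5' ends.

    For every occurrence of the motif, the predicted start is the second
    base of the motif (the first A of TAAG), as stated in the text.
    """
    seq = seq.upper()
    starts = []
    i = seq.find(motif)
    while i != -1:
        starts.append(i + 1)          # second position of the motif
        i = seq.find(motif, i + 1)    # continue the search after this hit
    return starts


# Synthetic toy sequence (hypothetical), containing two TAAG motifs.
toy = "CCATAAGTTTAAGC"
predicted = late_promoter_starts(toy)
print(predicted)
```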

Viral DNA Replication

Baculovirus DNA replication occurs in the nuclei of infected cells. In cultured Spodoptera frugiperda cells infected with AcMNPV, it is first detected at 6–8 h post-infection and peaks at 16–18 h post-infection. Several viral proteins have been shown to be crucial for AcMNPV DNA replication. These include DNA polymerase, the helicase P143 and several LEF proteins (LEF1, LEF2, LEF3,


LEF11). LEF1 and LEF2 interact with each other and possess primase activity. LEF3 is a single-stranded DNA binding protein that forms a complex with P143. Chemical cross-linking experiments showed that the viral trans-activator IE1 interacts with this complex. As shown in transient assays and by using mutant viruses, IE0 or IE1 can each support viral DNA replication, but both proteins are required to achieve full levels of viral replication. IE1 and presumably IE0 bind hr sequences in the baculovirus genome and therefore may function as origin-activating proteins. ME53 is a protein with two zinc-finger motifs, but its exact role in virus production is not clear. The exact role of LEF11 also remains enigmatic. LEF7 is an F-box protein that is required for optimal levels of DNA replication in AcMNPV. Despite its important role in DNA replication, the lef7 gene is conserved only in the genomes of group I and a limited number of group II alpha- and betabaculoviruses. A number of other viral proteins have been shown to augment viral DNA replication in AcMNPV without being absolutely essential. These include the anti-apoptotic protein P35, the inhibitor of apoptosis protein 1 (IAP1) and the group I alphabaculovirus-specific proteins IE2 and PE38. IAP1, IE2 and PE38 are all RING domain proteins. Both IE2 and PE38 are transcriptional activators (see above) and have ubiquitin ligase activity. However, it is not known whether this enzymatic function plays a role in DNA replication. The mechanism by which the baculoviral genome replicates is not precisely known, but high molecular mass viral DNA molecules that are suggestive of genomic concatemers have been reported in infected cells. It has therefore been suggested that baculovirus DNA may replicate following a rolling circle model. More recently, it has been proposed that baculoviruses may use a recombination-dependent mechanism of viral DNA replication.
In reality, replication may even involve a combination of the two, with recombination being used to resolve the concatemeric products of the rolling circle process. A protein that plays a role in producing genome-length DNA molecules is alkaline nuclease (AN). Alkaline nuclease family members repair dsDNA breaks and are involved in homologous recombination. The baculovirus AN may, therefore, be required for DNA processing before the progeny genomes are packaged into the nucleocapsids. The ssDNA-binding protein LEF3 associates with AN, and this complex may play a crucial role in resolving replicative intermediates. Another protein that might be involved in this process is DBP (DNA binding protein), a protein that protects DNA against nuclease activity.

Virion Morphogenesis and Protein Composition

Nucleocapsids

The baculovirus life cycle is biphasic and produces two virion phenotypes, BV and ODV. The NCs of both virion phenotypes are structurally identical. The NCs are rod-shaped and measure approximately 30–60 nm by 250–300 nm. They have characteristic cap and base structures (see Fig. 1), and several viral proteins are specifically associated with one or the other of these structures. NCs assemble in the ring zone of the nuclear virogenic stroma. In this process the viral dsDNA genome is densely packed within the NC in a spiral fashion. The small, highly basic protein P6.9 is directly associated with the viral DNA and contains ca. 40% arginine and 30% serine/threonine residues. The highly basic nature of this protein serves to neutralize the negative charges of the nucleic acid and aids in the condensation and packaging of the viral DNA. The condensed viral DNA is coated with the major capsid protein VP39. Recently, it was found that the 38K protein is crucial for the dephosphorylation of P6.9 during NC assembly. In contrast, uncoating of the viral DNA upon entry into the nucleus involves phosphorylation of P6.9. Other AcMNPV proteins associated with the NCs include P24, VP80, VP1054, BV/ODV-C42, P95, ORF1629, EXON0 (AC141), AC142 and VLF1, and for several of these proteins more information about their function is available. The ac142 gene is essential and is required for NC formation. ORF1629 encodes the essential phosphoprotein P78/P83, which is associated with the basal end of the NC and is needed for NC transport towards the nucleus. Deletion studies showed that the genes encoding BV/ODV-C42, VP80, VP1054, P6.9 and P95 are also essential for NC/BV assembly. In addition, P95 has a domain required for oral infectivity of ODVs. VP80 and EXON0 are required for efficient egress of NCs (see also next section). Immunogold labelling has shown that VP80 binds to one end of the NCs (although it is not yet clear which end).
VP80 also has DNA binding affinity, which may explain why VP80 deletion mutants do not assemble properly. Interestingly, the very late gene transcription factor VLF1 is also required for proper NC assembly and, like VP80, is typically associated with one end of the NCs. Q-PCR analysis showed that deletion of either the vp80 or the vlf1 gene did not lower the levels of viral DNA produced. VLF1 is also not required for the production of genome-size DNA molecules, as previously suggested. Another NC-associated protein, VP1054, shares characteristics with the cellular DNA binding protein PURα and, like this cellular protein, has affinity for GGN-repeat motifs in DNA and RNA. GGN-repeat islands appear to be conserved in baculovirus genomes. A long (GGN)n stretch is, for instance, present in AcMNPV on the strand complementary to orf1629, in which the corresponding CCN repeat encodes a proline-rich domain in the essential protein P78/P83. As such, VP1054 might play a role in DNA packaging. Testing this hypothesis is rather difficult, though, as artificial mutation of the GGN repeats will directly interfere with the function of P78/P83 encoded on the complementary strand.
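The link between a GGN repeat on one strand and a proline-rich domain encoded on the other strand follows from base pairing and the standard genetic code, in which all four CCN codons specify proline. The sketch below illustrates this with a short synthetic sequence; the sequence and helper names are hypothetical and do not reproduce the actual orf1629 sequence.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_proline_codon(codon):
    """In the standard genetic code, every CCN codon encodes proline."""
    codon = codon.upper()
    return len(codon) == 3 and codon[:2] == "CC" and codon[2] in "ACGT"


# Hypothetical coding-strand fragment made of four CCN codons.
coding = "CCACCGCCTCCC"
codons = [coding[i:i + 3] for i in range(0, len(coding), 3)]

# The opposite strand, read 5' to 3', carries a run of GGN triplets.
opposite = reverse_complement(coding)
ggn_run = re.search(r"(?:GG[ACGT]){3,}", opposite)

print(codons, opposite, ggn_run.group(0))
```

Because the two repeats are the same stretch of double-stranded DNA read on opposite strands, changing the GGN repeat necessarily changes the CCN codons, which is the experimental difficulty noted above.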

BV Morphogenesis

After their assembly in the virogenic stroma, NCs move along virus-induced nuclear actin filaments to the nuclear periphery. The interaction with these actin filaments requires the NC protein VP80, which contains paramyosin-like domains. The NCs are believed to acquire an initial envelope when they egress from the nucleus to the cytoplasm to become BVs. By an unknown mechanism, the


initial envelope is lost and NCs are further transported to the cell surface, most likely utilizing microtubules. The AcMNPV protein EXON0 (AC141) seems to play a crucial role here, as it is found to co-purify with beta-tubulin and to bind to kinesin-1. Deletion of the exon0 gene strongly reduced BV formation. At the cell surface, the NCs bud from the modified plasma membrane and as a consequence acquire an envelope carrying viral-encoded glycoproteins. Typical for the BV envelope of group I alphabaculoviruses is the presence of the major envelope glycoprotein GP64, which is a class III envelope fusion protein. GP64 is not present in ODVs and is therefore a BV-specific protein. Group II NPVs, GVs and deltabaculoviruses carry a functional equivalent, the F protein, which is a class I envelope fusion protein. Gammabaculoviruses do not have a BV fusion protein and as a consequence do not produce BVs, and, hence, are gut restricted.

GP64 glycoprotein characteristics

GP64 proteins are highly conserved, with approximately 80% amino acid sequence identity among all GP64 protein ectodomains examined. As indicated above, GP64 is structurally a class III glycoprotein that is phosphorylated, palmitoylated, and heavily glycosylated. It is found on the surface of infected cells and on the virion as a disulphide-linked trimer of GP64 monomers. GP64 functions in both viral entry and exit. In viral entry, GP64 is important for host cell binding and membrane fusion. Little is known of the details of GP64 interactions with host cell receptor(s), and the molecules that can serve as the host cell receptor for BV binding remain unknown. However, GP64 binding appears to be highly promiscuous, and this feature has been exploited, for instance, in the use of AcMNPV as a transduction vector for mammalian cells and in the use of GP64 for pseudotyping retrovirus-based gene therapy vectors. The GP64 protein has also been used for peptide display on BV particles, e.g., for immunization purposes or drug screening. After host cell binding mediated by GP64, the BVs are internalized via endocytosis, and the low pH of the endosome triggers a conformational change in the GP64 trimers, resulting in activation of membrane fusion. Studies of GP64-mediated membrane fusion have indicated that large, short-lived complexes of approximately 10 or more GP64 trimers form immediately prior to membrane fusion and are likely the unit structure of the membrane fusion mechanism. In addition, the opening of the fusion pore occurs rapidly after low pH triggering. Unlike F and many other membrane fusion proteins, the GP64 protein does not require a prior internal cleavage for maturation or activation of its fusogenic capacity. Studies of a gp64 gene knockout virus showed that GP64 is also necessary for efficient budding of progeny BVs.
Because GP64 is found on the surface of infected cells, concentrated in discrete areas, it is thought that these concentrations of GP64 represent the sites of virion budding. ME53 co-localizes with the envelope glycoprotein GP64 in the same plasma membrane foci and does so in a GP64-dependent manner. ME53 is encoded by all alphabaculovirus genomes, and BV production is strongly attenuated in mutants lacking me53. Interestingly, ME53 is also found in association with isolated nucleocapsids and as such may bridge the NCs with the GP64-containing envelope (possibly via the cytoplasmic tails of GP64) and thereby assist in budding.

F protein properties

F proteins have been studied most extensively in SeMNPV (SE8), HearNPV (HA133), and LdMNPV (LD133). F proteins presumably serve a similar role in host receptor binding as GP64, although they appear to recognize a different host cell receptor. Both F and GP64 are glycosylated, low-pH-activated membrane glycoproteins present as trimeric spikes in the BV envelope. F is a class I envelope fusion protein that becomes fusogenic after an internal cleavage by the cellular protease furin. This cleavage yields the F1 and F2 components, which are interconnected by disulphide bonds. Unlike GP64, F proteins are not specific to BVs. Proteomic analyses of AcMNPV, HearNPV, ChchNPV and the deltabaculovirus CuniNPV have also identified F proteins in the ODVs. However, more detailed studies will be necessary to understand the potential role of the F protein in ODVs. All viruses that encode a GP64 protein (group I alphabaculoviruses) also encode a homolog of the F protein (AC23 in AcMNPV, OP21 in OpMNPV). This F homolog can be deleted with no substantial effect in cell culture infections as long as the gp64 gene is intact. However, the conservation of f gene homologs in the genomes of the group I alphabaculoviruses suggests an important function, and it was recently shown that deletion of ac23 from the AcMNPV genome results in delayed mortality of infected larvae. Thus, while the F proteins found in group II alphabaculoviruses are essential entry proteins, the F homologs in group I are not essential but appear to serve an accessory role that is important in the pathogenicity and/or virulence of the virus. Evolutionarily, the F protein is most likely the original baculovirus BV fusion protein, as homologs of this protein are also encoded by insect genomes.
An ancestor of the group I alphabaculoviruses probably obtained a gp64 gene by horizontal gene transfer, possibly from an insect retrovirus or via a retrotransposon, and the encoded GP64 protein has taken over the envelope fusion function in this clade of viruses. Recent cryo-EM analysis of BV particles has shown that BVs are ovoid rather than rod-shaped (the shape obtained when using chemical fixation and shown in previous drawings). Cryo-EM has also shown that both GP64 and F are concentrated at both ends of the virion, although spikes are sometimes also found on the lateral sides of the ovoid (Fig. 5).

ODV Assembly and ODV Envelopes

Nucleocapsids destined to be incorporated into ODVs are retained in the nucleus and are enveloped by a lipid bilayer that is derived from the inner nuclear membrane. AcMNPV ORF142 and ODV-EC27 are essential for successful ODV assembly and for the production of infectious virus. ODV envelopes have been shown to contain a number of proteins that are not present in BVs (see Fig. 1). The most abundant AcMNPV ODV-envelope-specific proteins include ODV-E18, ODV-E25, PIF5 (ODV-E56/ODVP6E), ODV-E66, and PIF0 (P74). In more recent years, additional ODV envelope proteins have been identified that are less


Fig. 5 Cryo-EM images of budded virus particles of group I and group II alphabaculoviruses. The ovoid BVs of AcMNPV (a) and SeMNPV (b) have an internal nucleocapsid (NC) surrounded by an envelope derived from the plasma membrane, from which spikes (arrows) protrude at both ends. The spikes in AcMNPV are trimers of the envelope protein GP64, while SeMNPV has trimeric F protein spikes. Spikes might also be found along the lateral sides of the BVs. A pocket (P) of unknown composition is present between the lateral sides of the nucleocapsid and the envelope in these cryo-EM images. Scale bars represent 100 nm. The images are courtesy of J.W.M. van Lent and have been published before as part of a larger figure in Wang, Q., Bosch, B.J., Vlak, J.M., et al., 2016. Budded baculovirus particle structure revisited. Journal of Invertebrate Pathology 134, 15–22, and are used with permission of Elsevier, Amsterdam, the Netherlands.

abundantly present but are absolutely required for oral infectivity. Therefore, they have been termed per os infectivity factors (PIFs). The previously discovered P74, ODV-E56 and P95 were subsequently renamed PIF0, PIF5 and PIF8, respectively. With the recent discovery of PIF9 (AC108), ten PIFs have been identified in total. Viruses with pif gene mutations are infectious when their BVs are injected into caterpillars, but do not cause mortality upon oral infection with ODVs. The only exception is PIF8, whose gene contains a DNA element (the NAE domain) that is required for BV formation. PIF8 is also associated with NCs, whereas the other PIFs are ODV-envelope specific. Most of the studies on PIF genes were done with AcMNPV or HearNPV. PIF proteins, with the exception of PIF5, associate into a macromolecular protein complex called the ODV entry complex. This complex has a stable core consisting of PIF1–4. The other PIFs in the entry complex are associated with the core with relatively lower affinity. One of the biological functions of the entry complex, apart from mediating midgut infection, is the protection of the individual PIF proteins against proteolytic degradation under alkaline conditions. PIF proteins are highly specific and cannot easily be exchanged between viruses, possibly due to these complex protein-protein interactions in the ODV entry complex.

Tegument Proteins

GP41 is an ODV-specific glycoprotein encoded by a core gene. GP41 does not co-purify with either the NC or the envelope fraction and is therefore considered a tegument protein, located between the NC and the envelope. Studies of a temperature-sensitive mutant of AcMNPV with a codon replacement in the gp41 gene showed that GP41 plays a critical role in virus assembly. When grown at the non-permissive temperature, the gp41 ts mutant failed to produce ODVs and OBs and, in addition, NCs did not egress from the nucleus to form BVs. GP41 therefore appears to play a key role in the assembly of both virion phenotypes, even though it has been detected only in ODVs. Tegument proteins appear to be acquired within the nucleus, when the ODV NCs become enveloped. Since BVs do not contain GP41, it was suggested that proteins that associate loosely with the NCs in the nucleus might be lost when the NCs migrate from the nucleus to the cytoplasm to become BVs.

Appendix A Supplementary Material

Supplementary data associated with this article can be found in the online version at doi:10.1016/B978-0-12-809633-8.21550-1.

See also: Baculoviruses: General Features (Baculoviridae)


Further Reading

Blissard, G.W., Theilmann, D.A., 2018. Baculovirus entry and egress from insect cells. Annual Review of Virology 5, 113–139.
Boogaard, B., van Oers, M.M., van Lent, J.W.M., 2018. An advanced view on baculovirus per os infectivity factors. Insects 9, 84.
Chen, Y.R., Zhong, S., Fei, Z., et al., 2013. The transcriptome of the baculovirus Autographa californica multiple nucleopolyhedrovirus in Trichoplusia ni cells. Journal of Virology 87, 6391–6405.
Han, Y., van Oers, M.M., van Houte, S., Ros, V.I.D., 2015. Virus-induced behavioural changes in insects. In: Mehlhorn, H. (Ed.), Host Manipulations by Parasites and Viruses. Parasitology Research Monographs 7. Cham: Springer, pp. 149–174.
Harrison, R.L., Herniou, E.A., Jehle, J.A., et al., 2018. ICTV virus taxonomy profile: Baculoviridae. Journal of General Virology 99, 1185–1186.
Huang, Z., Pan, M., Zhu, S., et al., 2017. The Autographa californica multiple nucleopolyhedrovirus ac83 gene contains a cis-acting element that is essential for nucleocapsid assembly. Journal of Virology 91, e02110-16.
Lai, Q., Wu, W., Li, A., et al., 2018. The 38k-mediated specific dephosphorylation of the viral core protein P6.9 plays an important role in the nucleocapsid assembly of Autographa californica multiple nucleopolyhedrovirus. Journal of Virology 92, e01989-17.
Marek, M., Merten, O.W., Galibert, L., Vlak, J.M., van Oers, M.M., 2011. Baculovirus VP80 protein and the F-actin cytoskeleton interact and connect the viral replication factory with the nuclear periphery. Journal of Virology 85, 5350–5362.
Ohkawa, T., Volkman, L.E., Welch, M.D., 2010. Actin-based motility drives baculovirus transit to the nucleus and cell surface. Journal of Cell Biology 190, 187–195.
Rohrmann, G.F., 2010. Chapter 12 – The AcMNPV genome: Gene content, conservation, and function. In: Baculovirus Molecular Biology, third ed. Bethesda, MD: National Center for Biotechnology Information. Available at: https://www.ncbi.nlm.nih.gov/books/NBK138304/.
Stewart, T.M., Huijskens, I., Willis, L.G., Theilmann, D.A., 2005. The Autographa californica multiple nucleopolyhedrovirus ie0–ie1 gene complex is essential for wild-type virus replication, but either IE0 or IE1 can support virus growth. Journal of Virology 79, 4619–4629.
van Oers, M.M., Pijlman, G.P., Vlak, J.M., 2015. Thirty years of baculovirus-insect cell protein expression: From dark horse to mainstream technology. Journal of General Virology 96, 6–23.
Vanarsdall, A.L., Okano, K., Rohrmann, G.F., 2006. Characterization of the role of very late expression factor 1 in baculovirus capsid structure and DNA processing. Journal of Virology 80, 1724–1733.
Wang, Q., Bosch, B.J., Vlak, J.M., et al., 2016. Budded baculovirus particle structure revisited. Journal of Invertebrate Pathology 134, 15–22.
Wang, X., Shang, Y., Chen, C., et al., 2019. Baculovirus per os infectivity factor complex: Components and assembly. Journal of Virology 93, e02053-18.
Williams, T., Bergoin, M., van Oers, M.M., 2017. Diversity of large DNA viruses of invertebrates. Journal of Invertebrate Pathology 149, 4–22.

Bidensoviruses (Bidnaviridae)

Qin Yao, Zhaoyang Hu, and Keping Chen, Jiangsu University, Zhenjiang, China

© 2021 Elsevier Ltd. All rights reserved.

Classification

Bombyx mori bidensovirus (BmBDV) is one of the causative agents of fatal densonucleosis in the silkworm. Its single-stranded DNA genome is similar to that of the densoviruses of the family Parvoviridae. BmBDV was previously known as Bombyx mori densovirus type 2 (BmDNV-2) and was assigned to the subfamily Densovirinae in the family Parvoviridae. However, unlike the genomes of parvoviruses, the BmBDV genome consists of two ssDNA segments packaged in separate capsids and encodes a DNA polymerase B. Considering these unusual properties, this virus was excluded from the family Parvoviridae in the Eighth Report of the International Committee on Taxonomy of Viruses (ICTV). In 2012, BmBDV was assigned to the new, sole and type species Bombyx mori bidensovirus in the new genus Bidensovirus within the new family Bidnaviridae. BmBDV is currently the only known virus with a single-stranded DNA genome that encodes a DNA polymerase. Characterizing its basic biology is important for understanding this new virus family, and other bidnaviruses can be expected to be isolated and characterized in the future.

Genome Organization and Expression Strategy

BmBDV has two linear single-stranded DNA segments of about 6.5 kb (viral DNA1, VD1) and 6 kb (viral DNA2, VD2). The complementary strands of the genomic DNA, and probably also VD1 and VD2, are encapsidated separately. Compartmentalizing the genome of such a bipartite virus carries an obvious cost, because particles that together contain at least one copy of each segment have to co-infect the same host cell during primary and systemic infection. If the different particle types are present at the same frequency, this cost is minimized, whereas deviations from a one-to-one ratio increase the cost of genome multipartition. However, genome segments accumulate at different frequencies in BmBDV virions and in BmBDV-infected midguts: the copy number of the short segment VD2 is higher than that of the long segment VD1. Faba bean necrotic stunt virus (FBNSV, family Nanoviridae) has an even more complex genome composition of eight circular ssDNA molecules encapsidated in individual icosahedral particles. The various segments of FBNSV occur at very different frequencies within individual host plants and virus particles as well as aphid vectors. Several BmBDV isolates have been obtained from silkworms, and the genomes of three of them, the Yamanashi isolate (Japan), the Zhenjiang isolate (China), and an Indian isolate, have been sequenced. Sequence analysis of cloned VD1 and VD2 demonstrates that they are different DNA molecules that share a common 3′-terminal sequence of 53 nucleotides. Both DNAs contain complementary terminal repeats in which the characteristic terminal palindromes observed in parvoviruses are not found (Fig. 1). The genomic structure of VD1 is similar to that of the self-synthesizing DNA transposons (hypothetical viruses termed polintoviruses), which are characterized by terminal inverted repeats and a unique set of proteins such as a protein-primed DNA polymerase B (PolB) and an ATPase.
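The cost of genome multipartition described above can be illustrated with a simple probability sketch. It assumes, purely for illustration, that a cell takes up a fixed number of particles and that each particle independently belongs to one of the particle types with a probability equal to that type's relative frequency. These assumptions, the function name and the example numbers are not taken from the BmBDV literature.

```python
from itertools import combinations


def prob_all_segments(freqs, n_particles):
    """Probability that n independently drawn particles include every type.

    freqs are relative frequencies of the particle types (any positive
    scale). The result follows from inclusion-exclusion over the sets of
    types that could be missing.
    """
    total = sum(freqs)
    p = [f / total for f in freqs]
    result = 0.0
    for k in range(len(p) + 1):
        for missing in combinations(p, k):
            result += (-1) ** k * (1.0 - sum(missing)) ** n_particles
    return result


# Two particle types entering a cell as four particles (hypothetical numbers).
equal = prob_all_segments([1, 1], 4)    # one-to-one ratio
skewed = prob_all_segments([4, 1], 4)   # four-to-one ratio
print(equal, skewed)
```

Under these assumptions the chance of receiving a complete genome is highest at a one-to-one ratio and falls as the ratio becomes more skewed, which is the sense in which unequal segment frequencies raise the cost.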
VD1 carries four genes: ns2, ns1, and vp are located on the 5′-half of one strand and polB on the 5′-half of the complementary strand. ns2 and ns1 consist of 381 and 951 nt, respectively, and overlap by 218 nt. vp and polB consist of 1500 nt and 3318 nt, respectively. There is an 8 nt overlap between ns1 and vp. VD2 contains two genes, each located on the 5′-half of one of the two complementary strands: the ns3 gene consists of 669 nt and the p133 gene of 3483 nt. Conserved initiator elements (a TATA box and a CAGT box) and a polyadenylation signal were found in the upstream and downstream regions of these genes, respectively. Between the Zhenjiang and Yamanashi isolates of BmBDV, VD1 shares 98.4% identity and VD2 shares 97.7% identity. Transcript mapping shows that three mRNA molecules of 1.1, 1.5, and 3.3 kb are transcribed from VD1 of BmBDV. The 1.1 kb transcript corresponds to ns2 and ns1. Further studies suggest that ns2 and ns1 are controlled by overlapping promoters, P5 and P5.5, respectively. The transcriptional start sites of ns2 and ns1 are located 21 nt upstream and 3 nt downstream of the ns2 initiation codon, respectively. Transcription of ns2 and ns1 terminates at the same site. The 1.5 kb transcript, derived from vp, starts 57 nt upstream of the initiation codon and terminates 22 nt downstream of the termination codon. The major structural proteins (VPs) are synthesized from this mRNA by a leaky scanning mechanism. The 3.3 kb transcript, corresponding to polB, starts 22 nt upstream of the start codon and ends 13 nt downstream of the termination codon. Sequence analysis showed conserved initiator elements, a TATA box and a CAGT box, in the upstream regions of the viral genes. The activity of promoter P5 was investigated using a luciferase reporter assay, and the core elements of P5 were determined using a series of truncation mutants.
The results suggest that P5 can drive reporter expression in vitro, and that the TATA box, the CAGT box and the upstream region from nt −205 to nt −236 are important for promoter activity. In VD2, transcription of ns3 initiates 29 nt upstream of the first ATG and terminates 59 nt downstream of the stop codon. The transcriptional initiation and termination sites of p133 are located at nt 5367 and nt 1771 of VD2, respectively. Alternative splicing of mRNA, which is common in parvoviruses, is not observed in BmBDV.
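As a rough consistency check, the transcript sizes reported in this section can be compared with the gene lengths and the mapped start and stop positions. The sketch below assumes that the quoted gene lengths include the stop codon, that the upstream and downstream distances are counted from the first base of the start codon and the last base of the stop codon, and that poly(A) tails are ignored. These conventions and the helper name are assumptions, so the results should be read as approximate.

```python
def transcript_length(orf_nt, upstream_nt, downstream_nt):
    """Approximate transcript length: 5' leader + coding region + 3' trailer."""
    return upstream_nt + orf_nt + downstream_nt


# VD1: ns2 (381 nt) and ns1 (951 nt) overlap by 218 nt.
ns_span = 381 + 951 - 218                  # combined coding span of ns2 and ns1
vp_tx = transcript_length(1500, 57, 22)    # vp transcript
polb_tx = transcript_length(3318, 22, 13)  # polB transcript

# VD2: ns3 from its offsets; p133 from its mapped coordinates
# (nt 5367 down to nt 1771, both ends counted).
ns3_tx = transcript_length(669, 29, 59)
p133_tx = 5367 - 1771 + 1

for name, size in [("ns2/ns1 span", ns_span), ("vp", vp_tx),
                   ("polB", polb_tx), ("ns3", ns3_tx), ("p133", p133_tx)]:
    print(f"{name}: {size} nt")
```

Under these assumptions the ns2/ns1 span and the polB transcript agree well with the reported 1.1 kb and 3.3 kb mRNAs, while the vp estimate comes out slightly above the reported 1.5 kb.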

Pathology of Silkworm Associated With BmBDV

BmBDV replicates predominantly in the nuclei of columnar cells of the larval midgut epithelium and leads to fatal densonucleosis in the silkworm. Symptoms of BmBDV-infected silkworm larvae include anorexia, diarrhea, and flaccidity. In diseased larvae, the

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00020-5


Fig. 1 Transcription map of BmBDV Zhenjiang isolate. The viral genome segments (VD1 and VD2) are depicted, flanked by terminal repeats (ITR), including the viral promoters (P5/5.5, P21, P97, P10 and P89). The viral mRNAs, transcription initiation and termination sites, and the proteins that they encode are shown. Polyadenylation is shown by a squiggly line. The open reading frames (ORFs) used to encode the viral proteins are shown.

color of the posterior midgut changes to pale yellow. Silkworms infected by BmBDV before the first day of the 4th instar discharge bead-like frass. Virus infection causes slow growth and prolongs the instars. The different developmental stages of silkworm larvae differ in their sensitivity to infection by BmBDV. The LT50 of first-stage larvae is about 21 days, indicating that BmBDV has a long pathogenic process. BmBDV-infected 5th instar larvae can develop to the adult stage and produce the next generation, but the weight and survival rate of the pupae and the cocoon production decrease slightly. The main lesions occur in the nuclei of infected cells. Because of the accumulation of viral DNA, the nuclei of virus-infected cells become hypertrophied and stain intensely with methyl green or Feulgen reagent, or exhibit an orange-yellow or red fluorescence after staining with acridine orange at acidic pH. The hypertrophied nuclei of infected cells are 2.5 times larger than those of normal cells, and a voluminous, dense, homogeneous structure appears in each infected nucleus at the late stage of infection. Infected columnar cells eventually degenerate and are discharged into the gut lumen, after which the mature virions in the degenerated cells are excreted along with the frass. However, cells of the midgut epithelium infected by BmBDV are not as readily discharged as those from larvae infected by Bombyx mori densovirus (BmDNV). This is consistent with the fact that BmBDV infection is chronic, whereas BmDNV causes acute disease. Two non-viral proteins of 45.5 kDa and 44 kDa accumulate in BmBDV-infected degenerated cells in the midgut epithelium of the silkworm, and this accumulation is inversely correlated with that of viral structural proteins. TUNEL staining showed that the accumulation of these proteins in degenerated cells infected by BmBDV was not due to apoptosis.
Pathological changes are also observed in the goblet cells of infected midgut epithelium at the late stages of infection. Although the nuclei of goblet cells are infected and degenerate, the extent of damage is far less than in columnar cells. Ultrastructural changes in cells infected by BmBDV are observed in both the cytoplasm and the nucleus. In the cytoplasm, vesicles or cisternae in which small spherical particles accumulate can be observed near the granular endoplasmic reticulum, indicating the accumulation and transport of viral protein to the nucleus. During the replication of the virus, cellular organelles such as mitochondria and endoplasmic reticulum become vacuolized and degenerate. Meanwhile, numerous lysosomes and a large autophagosome containing degenerated organelles appear. In the nucleus, the heterochromatin becomes very condensed and discrete. Hypertrophy of the virus-infected nucleus follows infection and a virogenic stroma appears. The virogenic stroma is less electron-dense than the surrounding nuclear matrix and occupies most of the nucleus. Simultaneously, the fibrillar and granular components of the nucleolus are segregated, and the hypertrophied nucleolus is pushed to the nuclear periphery. Electron-dense discrete sites of replication within the virogenic stroma appear early in infection. Following the multiplication of

Bidensoviruses (Bidnaviridae)

761

the virus, these discrete sites in which virions were assembled increase in size, At the end of infection, the greatly enlarged nuclei are filled with thousands of small spherical virions, 20–24 nm in diameter. The virions enter the cytoplasm through the enlarged nucleopores or by destruction of the nuclear envelope. The complete virions are then liberated into the gut lumen along with the degenerated cells.

Viral Non-Structural Proteins

The BmBDV non-structural (NS) proteins are likely essential for viral replication, as they are in parvoviruses. Bidensoviruses encode the DNA polymerase PolB, which clusters with the DNA polymerases of polintoviruses. NS1 of BmBDV shows homology to densovirus NS1 and could be involved in replication of the virus. A superfamily 3 helicase motif, which is conserved in NS1 of parvoviruses, was identified in BmBDV NS1. However, another motif conserved in parvoviruses, the initiator (replicator) motif, is lacking in BmBDV NS1. In parvoviruses, NS1 has multiple functions involved in viral DNA replication, such as sequence-specific DNA binding as well as helicase, ATP-dependent site-specific endonuclease and ATPase activities. The NS1 of BmBDV also exhibits DNA-binding, ATPase and helicase activities, and it interacts with PolB. These facts imply that BmBDV NS1 is also a multifunctional protein, which possibly plays an important role in virus replication. BmBDV NS1 is a phosphorylated protein, and the phosphorylation regulates its ATPase activity. The ns3 gene encodes a 27 kDa protein named NS3. Bioinformatics analysis suggests that NS3 possesses two zinc-finger motifs, six putative N-glycosylation sites and four putative phosphorylation sites. Homologs of BmBDV NS3 have also been identified in some densoviruses and baculoviruses. The homolog of NS3 in Junonia coenia densovirus is absolutely required for viral DNA replication, implying that BmBDV NS3 might play a similarly important role in the BmBDV life cycle. The NS3-like proteins from diverse viruses share a conserved Zn-finger motif. Inhibitors of apoptosis (IAP proteins) in baculoviruses also have Zn-finger motifs. It has been hypothesized that NS3-like proteins might represent a novel family of apoptosis inhibitors in arthropod-infecting viruses.
NS2 was identified in BmBDV-infected silkworm midguts by comparative proteomics but was not detected in virions, indicating that it is a non-structural protein. NS2 of BmBDV shares no homology with the NS2 proteins of parvoviruses and little is known about it. NS2 shares slight sequence similarity with some chromosomal replication initiator proteins (DnaA) and DNA-binding response regulators. In virus-infected larvae, transcription of the ns2 gene was first detected 28 h post inoculation in low amounts, and in higher amounts at late stages of infection. Immunofluorescence showed that NS2 ultimately concentrates on the nuclear membrane in BmN cells at late stages. Similarly, adenovirus death protein (ADP), an Asn-glycosylated integral membrane protein, ultimately localizes to the nuclear membrane; it is expressed early but is greatly amplified at late stages of infection. As ADP is required for efficient cell lysis and virus release, BmBDV NS2 might have the same function in facilitating cell lysis and virus release. As yet, however, the function of NS2 remains obscure.

Viral Structural Protein

Similar to parvoviruses, bidensoviruses are small, non-enveloped spherical viruses, 20–24 nm in diameter, with equiaxial symmetry. However, the detailed ultrastructure of this virus remains to be elucidated. Viral structural proteins perform a wide variety of structural and biological functions during the virus life cycle, including host cell surface receptor recognition, pathogenicity determination, viral genome encapsidation, and detection and evasion of the host immune response. BmBDV contains two genes, vp and p133, encoding the major capsid proteins (VPs) and the minor capsid protein, respectively. Two VPs are produced by leaky scanning from the first and second ATG of vp. The VPs contain an N-terminal glycine-rich region as well as jelly-roll domains similar to those of parvovirus VPs. The conserved phospholipase A2 domain, which is essential for virus entry in parvoviruses, is not found in BmBDV. The BmBDV VPs expressed in a baculovirus expression system can self-assemble into virus-like particles. p133 encodes a 133 kDa protein named P133. Western blot analysis showed that antisera against the N-terminus and against the C-terminus of P133 recognized the same specific 133 kDa protein in virions and that no other proteins could be detected, suggesting that the p133 gene encodes a single structural protein. Amino acid sequence alignment showed that the C-terminus is conserved between P133 and some viral structural proteins in the family Reoviridae, such as VP3 of cypovirus. In reoviruses, P133 homologs are located in the outer capsid shell and are involved in host recognition and virus entry. It appears most likely that P133 has a similar function. A 53 kDa protein was detected in purified virions using a specific antiserum against PolB, and three peptides with molecular masses of around 120, 70 and 53 kDa were also detected in virus-infected silkworm midgut, demonstrating that polB encodes a structural protein processed by post-translational cleavage.
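The production of two VPs from the first and second ATG of vp can be illustrated with a minimal sketch. The sequence below is an invented toy ORF, not the BmBDV vp gene, the function names are hypothetical, and the second ATG is assumed to lie in the same reading frame as the first; the sketch only shows that initiation at two start codons of one reading frame yields nested products that share their 3′ (C-terminal) end.

```python
# Toy illustration of leaky scanning: some ribosomes initiate at the first ATG,
# others scan past it and initiate at the next ATG downstream.
# ASSUMPTIONS: the sequence is invented (not the BmBDV vp gene) and the second
# ATG is placed in frame with the first.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_from(seq, start):
    """Return the ORF that begins at `start` and ends at the first in-frame stop codon."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]
    return seq[start:]  # no in-frame stop codon: read to the end


def leaky_scanning_products(mrna, n_starts=2):
    """Return the ORFs initiated at the first `n_starts` ATG codons, scanning 5' to 3'."""
    products = []
    pos = mrna.find("ATG")
    while pos != -1 and len(products) < n_starts:
        products.append(orf_from(mrna, pos))
        pos = mrna.find("ATG", pos + 1)
    return products


# 5' leader, first ATG, spacer codons, second (in-frame) ATG, shared 3' part, stop codon.
toy_mrna = "GGCACC" + "ATG" + "GGTGGAGGC" + "ATG" + "GCTTCAGAC" + "TAA" + "GC"
long_vp, short_vp = leaky_scanning_products(toy_mrna)

print("long product :", long_vp)
print("short product:", short_vp)
print("shared 3' end:", long_vp.endswith(short_vp))
```

In this toy case the shorter product is an N-terminally truncated form of the longer one; whether the two BmBDV VPs relate to each other in exactly this way is not stated in the text.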

Viral Replication

Unlike other ssDNA viruses, including parvoviruses, bidensoviruses encode a protein-primed DNA polymerase B rather than an endonuclease, indicating that bidensoviruses employ a replication mechanism different from that of parvoviruses. In fact, replicative intermediates of viral DNA similar to those of parvoviruses were not detected during virus infection, strongly suggesting that bidensovirus genomic DNA does not replicate by the rolling-hairpin (modified rolling-circle) mechanism that is thought to be used by all parvoviruses. However, details of the replication mechanisms of the bidensoviruses have not yet been defined. It is important to further characterize the virus-specific DNA polymerase to understand the replication mechanism of bidensoviruses. Tijssen and Bergoin suggested that BmBDV DNA could replicate like the DNA of adenoviruses. In this model, the terminal protein (TP), which is covalently linked to the 5′-terminus of the genomic DNA, serves as the primer for initiation of viral DNA replication. The terminal complementary repeats of bidensoviruses would form panhandle structures, resulting in a double-stranded molecule that would then serve as a template for further replication of both minus and plus progeny strands. The 53 kDa peptide encoded by the polB gene might be the TP of BmBDV, but the details of this peptide remain to be elucidated. Moreover, the termini of the BmBDV and polintovirus genomes start with the tetranucleotide repeat GTGT/GTGT, which is similar to the terminal GTAGTA sequence of adenovirus DNAs. In adenovirus, the GTA at positions 4–6 of the template DNA, rather than that at positions 1–3, is used as the template for formation of the pTP-CAT intermediate, and this intermediate then jumps back to position 1 of the template to start elongation. This jumping-back mechanism ensures the integrity of the terminal sequences during replication of the linear genome.

Receptor-mediated attachment and entry are essential first steps in the virus life cycle. The interaction between the BmBDV attachment protein and its receptor is very specific. The silkworm nsd-2 gene determines the resistance of the silkworm to BmBDV. Its allele encodes a 12-pass transmembrane protein, +NSD-2, which is homologous to members of the amino acid transporter family and is expressed predominantly in the posterior part of the midgut. A deletion of 9 transmembrane domains at the C-terminus of +NSD-2 results in resistance to BmBDV. +NSD-2 can interact with the VP of BmBDV in vitro. These facts suggest that +NSD-2 serves as a functional receptor for BmBDV and that the target site recognized by the virus is present in the deleted portion of the membrane protein. However, the BmBDV genome can replicate at a low level and the viral VP can be detected before 48 h postinfection in resistant silkworms, indicating that +NSD-2 regulates virus replication in addition to its function as a cell receptor protein for BmBDV.
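The jumping-back initiation described above for adenovirus DNA can be sketched as a toy model. The template is written 3′ to 5′ and the product 5′ to 3′, so that position i of the product pairs with position i of the template. The function names are hypothetical and the short template is used for illustration only; the sketch is not a model of BmBDV replication, whose mechanism has not been defined.

```python
# Toy model of protein-primed "jumping-back" initiation.
# The template is written 3'->5' and the product 5'->3', so index i of the
# product pairs with index i of the template.
# ASSUMPTION: the short template below is illustrative only.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-by-base complement (no reversal, see the orientation note above)."""
    return "".join(COMPLEMENT[base] for base in seq)


def jumping_back_replication(template_3to5):
    """Return the product strand made by initiating at positions 4-6 and jumping back."""
    # Step 1: the precursor terminal protein (pTP) is extended opposite
    # template positions 4-6 (indices 3..5), giving e.g. pTP-CAT.
    primer = complement(template_3to5[3:6])
    # Step 2: the pTP-trinucleotide jumps back and pairs with positions 1-3.
    # This only works if the template starts with a trinucleotide repeat.
    if complement(template_3to5[0:3]) != primer:
        raise ValueError("jumping back requires a terminal trinucleotide repeat")
    # Step 3: elongation resumes at template position 4 (index 3).
    return primer + complement(template_3to5[3:])


template = "GTAGTAGTTATT"          # 3'-GTAGTA... terminal repeat
product = jumping_back_replication(template)

print("template 3'->5':", template)
print("product  5'->3':", product)
print("full-length copy:", product == complement(template))
```

Although synthesis starts internally, the product is a complete copy of the template, which is the sense in which the mechanism preserves the terminal sequence.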

Virus Evolution

The linear single-stranded DNA genome structure of bidensoviruses, as well as the dimensions and morphology of the virions, resemble those of parvoviruses. Similar to the insect densoviruses, BmBDV infection leads to a fatal densonucleosis. BmBDV NS1 is homologous to densovirus NS1 and possesses a conserved superfamily 3 helicase domain. Although the major structural protein VP of BmBDV does not share significant sequence similarity with that of parvoviruses, the BmBDV VP contains counterparts to all 8 β-strands that form the core of the jelly-roll fold in parvoviruses, suggesting that it is a homolog of the parvovirus VP. These features strongly suggest that bidnaviruses originated from a parvovirus ancestor. The main feature distinguishing bidensoviruses from other ssDNA viruses is that the VD1 segment encodes the protein-primed DNA polymerase B. A phylogenetic tree indicates that the PolB of BmBDV was derived from a polintovirus. In BmBDV VD2, p133 encodes the 133 kDa minor structural protein (P133), which shares sequence similarity with structural proteins of some reoviruses. In the corresponding phylogenetic analysis, bidensoviruses cluster with members of the genus Cypovirus. The non-structural protein NS3 shows some homology with NS3 of some densoviruses, and homologs of NS3 were also found in some granuloviruses and nucleopolyhedroviruses. Phylogenetic analysis shows that BmBDV NS3 clusters with granuloviruses, suggesting that BmBDV most likely acquired NS3 from a granulovirus rather than a densovirus. Altogether, it appears that bidnaviruses evolved from a parvovirus ancestor, from which they inherited a jelly-roll capsid protein and a superfamily 3 helicase, and that the genome of the ancestral parvovirus later integrated into a polintovirus genome, acquiring the polintovirus PolB gene along with terminal inverted repeats to generate the VD1 genome segment.
It is possible that the VD2 segment of BmBDV evolved directly from the VD1 segment by replacement of the parvoviral and polintoviral genes with genes from reoviruses and baculoviruses.


Bunyaviruses of Arthropods (Mypoviridae, Nairoviridae, Peribunyaviridae, Phasmaviridae, Phenuiviridae, Wupedeviridae)
Sandra Junglen, Charité - University Medicine Berlin, Berlin, Germany
© 2021 Published by Elsevier Ltd.

Glossary

Ambisense: Coding strategy used by some bunyaviruses in which a single-stranded genome segment encodes proteins in both positive- and negative-sense orientations.
Arbovirus: A virus transmitted to vertebrates by hematophagous (blood-feeding) insects.
Arthropod-specific virus: A virus that infects only arthropods and cannot be transmitted to vertebrates.
Bunyavirus: Any member of the order Bunyavirales.
Cap snatching: Cleavage of the capped 5′ end of a cellular mRNA to produce short oligonucleotides that are used to prime viral mRNA transcription. The cellular sequences are incorporated into the viral mRNAs.
Insect-specific virus: A virus that infects only insects and cannot be transmitted to vertebrates.
Unclassified: A virus not yet classified as a member of a taxon.

Introduction

When Gouléako virus and Herbert virus were discovered as the first insect-specific bunyaviruses, bunyaviruses were organized as a family comprising the five genera Phlebovirus, Orthobunyavirus, Hantavirus, Tospovirus and Nairovirus. All bunyaviruses previously detected in blood-feeding arthropods were able to replicate in vertebrate cells and were considered to be arboviruses. The detection in mosquitoes of bunyaviruses that replicated efficiently in insect cells but were unable to infect vertebrate cells was therefore a novelty. Moreover, these viruses did not group with any of the classical arthropod-associated bunyavirus genera (Phlebovirus, Orthobunyavirus and Nairovirus) and defined new deep-rooting lineages in the bunyavirus phylogeny. Many other insect-specific bunyaviruses were discovered in subsequent years, giving rise to the reorganization of the family into an order that at present comprises twelve families. Table 1 gives an overview of selected classified arthropod-specific bunyaviruses. As most of these viruses were identified by metagenomic sequencing of pooled arthropod specimens and no virus isolates are available, the following paragraphs are limited mainly to those viruses for which the phenotypic and molecular characteristics have been determined.

Virion Structure

The morphology of virions has been analyzed for viruses of the genera Herbevirus, Feravirus, Goukovirus, Phasivirus and Jonvirus. Herbert virus (genus Herbevirus) and Badu virus (genus Phasivirus) particles are spherical and enveloped, with diameters of 90–110 nm and 130 nm, respectively (Fig. 1). The enveloped virions of Gouléako virus (genus Goukovirus) and Ferak virus (genus Feravirus) are pleomorphic with a diameter of 70–120 nm. Jonchet virus (genus Jonvirus) has a morphology that is unique in the order Bunyavirales: it has two types of enveloped virions, with either a tubular morphology of about 60 nm up to 600 nm or a spherical morphology with a diameter of about 80 nm. The virions consist of four structural proteins, termed Gn, Gc, N and L in analogy to the bunyavirus terminology.

Genome and Coding Strategies of the Viral Genomes

The genomes consist of three single-stranded RNA segments of negative-sense or ambisense polarity. The segments are designated L (large), M (medium) and S (small) and have complementary terminal sequences that are conserved among viruses of the same genus and may also be conserved within the same family but differ between families. The terminal nucleotides have been identified for only the insect-specific bunyaviruses shown in Table 2. The conserved terminal nucleotides of the 3′ and 5′ ends are complementary, allowing the RNA segments to base-pair and form panhandle structures. The complete genomes of all classified bunyaviruses have been sequenced. However, genome annotations are available for only those viruses that were isolated in cell culture and analyzed by molecular methods such as protein sequencing by mass spectrometry. The genome organization for those species is shown in Fig. 2. The L segment encodes the RNA-dependent RNA polymerase (RdRp), also named L protein. The M segment encodes the two glycoproteins, which are translated as a precursor polyprotein (glycoprotein precursor, GPC) and posttranslationally cleaved into Gn and Gc. Viruses of the genera Feravirus and Jonvirus in the family Phasmaviridae encode an


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00037-0


Table 1  Selected classified arthropod-specific bunyaviruses in the order Bunyavirales

Family            Genus             Species                                 Host
Mypoviridae       Hubavirus         Myriapod hubavirus (a)                  Millipede
Nairoviridae      Shaspivirus       Spider shaspivirus (a)                  Spider
                  Striwavirus       Strider striwavirus (a)                 Water strider
Peribunyaviridae  Herbevirus        Herbert herbevirus (a)                  Mosquito
                                    Kibale herbevirus                       Mosquito
                                    Tai herbevirus                          Mosquito
                  Shangavirus       Insect shangavirus (a)                  n.d.
Phasmaviridae     Feravirus         Ferak feravirus (a)                     Mosquito
                  Jonvirus          Jonchet jonvirus (a)                    Mosquito
                  Orthophasmavirus  Anopheles orthophasmavirus              Mosquito
                                    Culex orthophasmavirus                  Mosquito
                                    Ganda orthophasmavirus                  Bee
                                    Kigluaik phantom orthophasmavirus (a)   Phantom midge
                                    Odonate orthophasmavirus                Dragonfly
                                    Qingling orthophasmavirus               Dragonfly
                                    Wuchang cockroach orthophasmavirus 1    Cockroach
                                    Wuhan mosquito orthophasmavirus 1       Mosquito
                                    Wuhan mosquito orthophasmavirus 2       Mosquito
                  Sawastrivirus     Sanxia sawastrivirus (a)                Water strider
                  Wuhivirus         Insect wuhivirus (a)                    n.d.
Phenuiviridae     Beidivirus        Dipteran beidivirus (a)                 n.d.
                  Goukovirus        Cumuto goukovirus                       Mosquito
                                    Gouleako goukovirus (a)                 Mosquito
                                    Yichang insect goukovirus               n.d.
                  Horwuvirus        Horsefly horwuvirus (a)                 Fly
                  Hudivirus         Dipteran hudivirus (a)                  n.d.
                  Hudovirus         Lepidopteran hudovirus (a)              Moth
                  Ixovirus          Blackleg ixovirus (a)                   Tick
                                    Norway ixovirus                         Tick
                                    Scapularis ixovirus                     Tick
                  Laulavirus        Laurel Lake virus (a)                   Tick
                  Mobuvirus         Mothra mobuvirus (a)                    Moth
                  Phasivirus        Badu phasivirus (a)                     Mosquito
                                    Dipteran phasivirus                     n.d.
                                    Fly phasivirus                          n.d.
                                    Phasi Charoen-like phasivirus           Mosquito
                                    Wutai mosquito phasivirus               Mosquito
                  Pidchovirus       Pidgey pidchovirus (a)                  Moth
Wupedeviridae     Wumivirus         Millipede wumivirus (a)                 Millipede

(a) Type species for the corresponding genus. n.d., not determined.

Fig. 1 Morphology of selected insect-specific bunyaviruses. Negative-stained virions of Herbert virus (A), Gouléako virus (B), Ferak virus (C), and Jonchet virus (D) sedimented by ultracentrifugation. Scale bars represent 100 nm.

additional non-structural protein named NSm. The NSm protein of Jonchet virus is encoded in the N-terminal part of the GPC and is posttranslationally cleaved, as observed in phleboviruses. In contrast, the NSm protein of Ferak virus has a unique coding strategy: it is encoded in a second small ORF upstream of the GPC ORF that overlaps the N terminus of the GPC


Table 2  Consensus 3′ and 5′ terminal nucleotides of selected arthropod-specific bunyaviruses

Family            Genus             Virus                   3′ terminus      5′ terminus
Phasmaviridae     Feravirus         Ferak virus             UCAUCAUUUGU …    AGUAGUAAACA …
                  Jonvirus          Jonchet virus           UCAUCAU …        AGUAGUA …
                  Orthophasmavirus  Kigluaik phantom virus  UCGUCGUGCG       AGCAGCAUGC
Peribunyaviridae  Herbevirus        Herbert virus           UCAUCAC …        AGUAGUG …
Phenuiviridae     Goukovirus        Gouléako virus          UGUGU …          ACACA …
                  Phasivirus        Badu virus              UGUGUUUCUG       ACACAAAGAC

Fig. 2 Schematic coding strategy of selected arthropod-specific bunyaviruses. The viral RNA genome is shown as a black line with the length in nucleotides indicated above each segment. Viral mRNAs are represented as blue arrows with boxes indicating the host-derived primer sequences. Proteins are shown as green boxes with predicted protein sizes given in or beside the boxes.

ORF. The S segment encodes the N protein and, in the case of fera- and jonviruses, also a non-structural protein named NSs, in different overlapping reading frames in the complementary-sense RNA. The NSs ORF is located upstream of the N ORF. The two proteins are translated from the same mRNA species as a result of alternative initiation at different AUG codons. While Shuangao insect virus 1 (genus Shangavirus) lacks an NSs, it has been predicted to encode an NSm protein. All other arthropod-specific bunyaviruses do not seem to encode NSm or NSs proteins, in contrast to the vertebrate-infecting bunyaviruses, most of which encode both proteins. Goukoviruses have the shortest bunyavirus genomes identified to date, consisting of S, M and L segments of about 1.1, 3.2 and 6.4 kb, respectively. The L protein of herbeviruses contains an additional region of about 150 amino acids and is thus longer than that of the related orthobunyaviruses. However, the glycoprotein precursor of herbeviruses is ca. 600 amino acids shorter than that of orthobunyaviruses, as the variable region of the Gc protein as well as the NSm protein are lacking.
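The complementarity of the terminal nucleotides listed in Table 2 can be checked with a short sketch. It assumes that each terminus in the table is written from the terminal nucleotide inward, so that position i of the 3′ end faces position i of the 5′ end in the panhandle, and it counts the G-U wobble pair as a possible RNA base pair. The function and variable names are hypothetical, and only the nucleotides actually listed in the table are used.

```python
# Check whether the 3' and 5' termini from Table 2 can base-pair position by position.
# ASSUMPTIONS: both termini are written from the terminal nucleotide inward, and
# G-U wobble pairs are counted alongside Watson-Crick pairs.

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

# (3' terminus, 5' terminus) as listed in Table 2, without the elided positions.
TERMINI = {
    "Ferak virus": ("UCAUCAUUUGU", "AGUAGUAAACA"),
    "Jonchet virus": ("UCAUCAU", "AGUAGUA"),
    "Kigluaik phantom virus": ("UCGUCGUGCG", "AGCAGCAUGC"),
    "Herbert virus": ("UCAUCAC", "AGUAGUG"),
    "Gouléako virus": ("UGUGU", "ACACA"),
    "Badu virus": ("UGUGUUUCUG", "ACACAAAGAC"),
}


def count_pairs(three_prime, five_prime):
    """Return (Watson-Crick pairs, wobble pairs) for aligned terminal positions."""
    pairs = list(zip(three_prime, five_prime))
    wc = sum(pair in WATSON_CRICK for pair in pairs)
    wobble = sum(pair in WOBBLE for pair in pairs)
    return wc, wobble


for virus, (end3, end5) in TERMINI.items():
    wc, wobble = count_pairs(end3, end5)
    print(f"{virus}: {wc} Watson-Crick + {wobble} wobble pairs out of {len(end3)} positions")
```

Under these assumptions every listed position can pair, which is consistent with the panhandle structures described for these segments.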


Viral Replication Cycle

Infection of the target cell is believed to be mediated by interaction of the glycoproteins with cellular receptors. However, neither cellular receptors nor the roles of the individual glycoproteins in attachment and entry have been identified for any arthropod-specific bunyavirus. Viral replication takes place in the cytoplasm. The negative-strand RNA is replicated via a complementary full-length positive-sense RNA. The mRNA carries a methylated, capped nonviral (primer) sequence at its 5′ end, is truncated at its 3′ end compared with the full-length genomic RNA, and serves as the template for translation. Further analyses of Herbert virus, Gouléako virus, Jonchet virus and Ferak virus have shown that the nonviral 5′ sequences are obtained from host cell mRNAs by a cap-snatching mechanism that is mediated by an endonuclease activity of the viral L protein. The 3′ ends of the mRNAs terminate ca. 100–200 nucleotides before the end of the genomic RNA. There is no apparent termination signal. Transcription in arthropod-specific bunyaviruses is similar to that of the plant- and vertebrate-infecting bunyaviruses. Assembly and release of bunyaviruses occur at the Golgi complex, and similar observations have been made for Herbert virus. Two types of premature spherical viral particles of high and low electron density, termed intracellular dense viruses (IDV) and intracellular annular viruses (IAV), respectively, were detected in Golgi vesicles. Maturation and budding of Herbert virions occur at the membrane of Golgi vesicles filled with IAV and IDV.
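The relation between genome segment and mRNA described above (a host-derived 5′ primer added, ca. 100–200 nucleotides missing at the 3′ end) can be turned into a rough length estimate. This is a sketch under stated assumptions: the primer length of 12 nucleotides is an illustrative value that is not given in the text, the function name is hypothetical, and the segment sizes are the approximate goukovirus figures quoted in the genome section.

```python
# Rough length estimate for a cap-snatched bunyavirus mRNA.
# ASSUMPTION: the host-derived primer length (12 nt) is illustrative and not
# taken from the text. Segment sizes (about 1.1, 3.2 and 6.4 kb) and the
# 100-200 nt 3' truncation are the approximate figures quoted in this article.


def mrna_length_range(segment_nt, primer_nt=12, truncation_nt=(100, 200)):
    """Return (shortest, longest) estimated mRNA length in nucleotides."""
    shortest = primer_nt + segment_nt - max(truncation_nt)
    longest = primer_nt + segment_nt - min(truncation_nt)
    return shortest, longest


# Approximate goukovirus segment sizes in nucleotides.
GOUKOVIRUS_SEGMENTS = {"S": 1100, "M": 3200, "L": 6400}

for name, size in GOUKOVIRUS_SEGMENTS.items():
    low, high = mrna_length_range(size)
    print(f"{name} segment ({size} nt): mRNA of roughly {low}-{high} nt")
```

The estimate only illustrates the bookkeeping; measured transcript lengths would depend on the actual primer and termination site of each mRNA.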

Host Associations, Virus Maintenance Cycles and Pathogenicity

Progress in metagenomic sequencing has enabled the detection of bunyavirus-like sequences in a huge diversity of arthropods. For example, bunyavirus sequences have been found in beetles, bees, dragonflies, moths, water striders, millipedes, spiders and springtails, to name just a few, indicating that bunyaviruses have coevolved with their arthropod hosts. Despite the accumulation of sequence data from diverse arthropod hosts, little is known about the routes of transmission of these viruses and their pathogenicity in arthropod hosts.


Dicistroviruses (Dicistroviridae)
Yanping Chen, Bee Research Laboratory, Agricultural Research Service, US Department of Agriculture, Beltsville, MD, United States
Steven M Valles, Center for Medical, Agricultural and Veterinary Entomology, Agricultural Research Service, US Department of Agriculture, Gainesville, FL, United States
© 2021 Elsevier Ltd. All rights reserved.
This is an update of P.D. Christian, P.D. Scotti, Dicistroviruses, In Encyclopedia of Virology (Third Edition), Edited by Brian W.J. Mahy, Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00608-7.

Glossary

Dipteran: A member of the insect order Diptera: true flies.
Hemipteran: A member of the insect order Hemiptera: true bugs (including aphids).
Horizontal transmission: Transmission from one individual to another in the same generation.
Hymenopteran: A member of the insect order Hymenoptera: wasps and bees.
Intergenic region: Region between the two open reading frames in the dicistrovirus genome.
Lepidopteran: A member of the insect order Lepidoptera: moths and butterflies.
Orthopteran: A member of the insect order Orthoptera: crickets and grasshoppers.
Pathogenicity: The ability of a virus to cause damage and disease in a host.
Penaeid: A shrimp from the family Penaeidae.
Polyprotein: A protein that is cleaved after synthesis to produce a number of smaller functional proteins.
Vertical transmission: Transmission of virus directly from an infected mother to her offspring.
Virion: Infective form of a virus outside a host cell, consisting of a nucleic acid core and a protein coat.
VPg: A virally encoded protein covalently linked to the 5′ end of the viral genome.

Introduction

For years, viruses in the family Dicistroviridae had no definitive taxonomic placement and were referred to as “picorna-like” simply because of their similarities to mammalian picornaviruses: an RNA genome, the size of the virions, the composition of the capsids, and other biophysical characteristics. Genome sequencing revolutionized biological science across many fields, including virus taxonomy. In 1998, the first complete genome sequence of an insect picorna-like virus, Drosophila C virus (DCV), was published and was predicted to have a genome organization different from that of mammalian picornaviruses. Unlike picornaviruses, which have monopartite, monocistronic genomes, DCV has a monopartite, bicistronic genome with replicase proteins encoded by a 5′-proximal ORF and capsid proteins encoded by a 3′-proximal ORF. The two ORFs are separated by an intergenic region (IGR), suggesting that this virus belongs to a previously undescribed virus family. Subsequently, the increasing availability of complete genome sequences of insect picorna-like viruses and of phylogenetic data confirmed the distinct genome architecture of the dicistroviruses and conclusively defined the Dicistroviridae, a taxonomic designation that was officially adopted by the ICTV in 2002. The family currently includes fifteen members officially recognized by the ICTV and many additional unclassified species.

Taxonomy and Classification

The family Dicistroviridae is a member of the order Picornavirales and currently comprises 15 species classified into three genera, Aparavirus (derived from acute bee paralysis virus), Cripavirus (derived from cricket paralysis virus), and Triatovirus (derived from triatoma virus). There are several other potential candidates for the family, but these have yet to be accepted as species by the ICTV. For the purposes of this article, we limit our discussion to the officially classified species shown in Table 1.

Biophysical Properties

Dicistrovirus virions are roughly spherical as observed by electron microscopy, with a particle diameter of approximately 30 nm and no envelope (Fig. 1(a)). The virions are composed of 60 protomers, each comprising a single molecule of each of the major capsid proteins VP1, VP2, and VP3 (Fig. 1(b)). The major capsid proteins VP1 to VP3 exhibit a jelly-roll β-sandwich fold and form the capsid shell with pseudo-T = 3 icosahedral symmetry. A smaller protein, VP4, is also present inside the virion and is in contact with the RNA genome. In Cricket paralysis virus (CrPV) and Israeli acute paralysis virus (IAPV), VP4 is located on the internal surface of the five-fold axis below VP1; in contrast, it is disordered in Triatoma virus (TrV). In most species, a protein precursor (VP0) is present, which is cleaved to yield capsid proteins VP3 and VP4 during virion maturation. An Asp-Asp-Phe (DDF) motif, which is part of the VP1 subunit and conserved among dicistroviruses, may be involved in VP0 cleavage. There are spike-like


doi:10.1016/B978-0-12-814515-9.00006-0

Table 1  Members of the family Dicistroviridae. Isolate and vernacular names are shown in brackets

Genus        Species (isolate name)                                           Accession #  Abbreviation
Aparavirus   Acute bee paralysis virus (a) (Acute bee paralysis virus)        [AF150629]   ABPV
             Israeli acute paralysis virus (Israeli acute paralysis virus)    [EF219380]   IAPV
             Kashmir bee virus (Kashmir bee virus)                            [AY275710]   KBV
             Mud crab virus (Mud crab virus)                                  [HM777507]   MCV
             Solenopsis invicta virus 1 (Solenopsis invicta virus 1)          [AY634314]   SINV-1
             Taura syndrome virus (Taura syndrome virus)                      [AF277675]   TSV
Cripavirus   Cricket paralysis virus (a) (Cricket paralysis virus)            [AF218039]   CrPV
             Aphid lethal paralysis virus (Aphid lethal paralysis virus)      [AF536531]   ALPV
             Drosophila C virus (Drosophila C virus)                          [AF014388]   DCV
             Rhopalosiphum padi virus (Rhopalosiphum padi virus)              [AF022937]   RhPV
Triatovirus  Triatoma virus (a) (Triatoma virus)                              [AF178440]   TrV
             Black queen-cell virus (Black queen cell virus)                  [AF183905]   BQCV
             Himetobi P virus (Himetobi P virus)                              [AB017037]   HiPV
             Homalodisca coagulata virus 1 (Homalodisca coagulata virus 1)    [DQ288865]   HoCV-1
             Plautia stali intestine virus (Plautia stali intestine virus)    [AB006531]   PSIV

(a) Type species for the genus.

Fig. 1 Virion structure. (a) Negative-stained electron micrograph of CrPV virions, which appear spherical in shape. (b) Schematic diagram of a dicistrovirus virion showing the surface packing of the coat proteins (VP1, VP2, and VP3) (courtesy of A. E. Mechaly, with permission). (c)–(e) Comparison of the surface structures of, from left to right, Triatoma virus, Israeli acute paralysis virus, and Cricket paralysis virus. The color scale indicates the distance from the particle center (Å). Protrusions are shown in red and depressions are shown in blue (courtesy of A. E. Mechaly, with permission). (a) Courtesy of Carl Reinganum, with permission.


protrusions on the surface of virions that are formed by two antiparallel β-strands from the CD loop of VP3 and a C-terminal β-strand of VP1; however, the size of the protrusions varies among different dicistroviruses (Fig. 1(c)–(e)). Unlike many picornaviruses, some dicistroviruses such as Cricket paralysis virus (CrPV), Black queen cell virus (BQCV), and IAPV do not contain a hydrophobic pocket in capsid protein VP1; in picornaviruses this pocket is naturally occupied by a lipid and can be targeted by capsid-binding antiviral compounds. Proteins account for 70% of virion weight. The approximately 200 kDa nonstructural polyprotein and 100 kDa structural polyprotein are encoded by ORF 1 and ORF 2, respectively. Virions contain three major structural (capsid) viral proteins, VP1, VP2, and VP3. The size of these capsid proteins ranges from 24 to 40 kDa; an exception is Taura syndrome virus (TSV), in which VP1 is 55 kDa. A fourth, smaller capsid protein (VP4) of roughly 4.5–9 kDa has also been reported in some species. Virions are stable in acidic conditions (down to pH 3.0) and resistant to detergents and organic solvents such as ether and chloroform. Mature virions have a buoyant density of 1.34–1.39 g ml⁻¹ in CsCl in the pH range of 7–9 and sedimentation coefficients of between 153 and 167 S. However, physicochemical properties have not been fully established for all members of the family. A summary of the biophysical properties of dicistrovirus virions is provided in Table 2.

Organization of the Dicistrovirus Genome The positive-sense single-stranded RNA genomes of dicistroviruses are monopartite and dicistronic and approximately 8500–10,000 nt in size. The RNA genomes comprise two non-overlapping ORFs of approximately 5500 and 2600 nt that are separated by an untranslated intergenic region (IGR) of ~190 nt and flanked by 5′ and 3′ untranslated regions (UTRs) (Fig. 2). The length of the UTRs at each end of the genome varies. The 5′-proximal and 3′-proximal ORFs encode non-structural and structural protein precursors, respectively. Each ORF is preceded by a specific RNA structure identified as an internal ribosome entry site (IRES), which allows for the initiation of translation in a cap-independent manner. Components of the non-structural

Table 2

Summary of some biophysical properties of dicistroviruses.

Genus        Species                        Molecular weight of major capsid proteins (kDa)ᵃ  Buoyant density in CsCl (g ml⁻¹)  Particle diameter (nm)
Aparavirus   Acute bee paralysis virus      35, 33, 24, 9                                     1.34                              30
             Israeli acute paralysis virus  no data available                                 1.33                              27
             Kashmir bee virus              41, 37, 25, 6                                     1.37                              30
             Mud crab virus                 53, 35, 22                                        no data available                 30
             Solenopsis invicta virus 1     41, 36, 25 (22)                                   1.35                              31
             Taura syndrome virus           55, 40, 24 (58)                                   1.34                              31
Cripavirus   Cricket paralysis virus        35, 34, 30 (43)                                   1.37                              27
             Aphid lethal paralysis virus   34, 32, 31 (41)                                   1.34                              27
             Drosophila C virus             31, 30, 28, 9 (37)                                1.34                              27
             Rhopalosiphum padi virus       31, 30, 28 (41)                                   1.37                              27
Triatovirus  Triatoma virus                 39, 37, 33                                        1.39                              30
             Black queen-cell virus         34, 32, 29, 6                                     1.34                              30
             Himetobi P virus               37, 33, 28                                        1.35                              29
             Homalodisca coagulata virus 1  no data available                                 no data available                 no data available
             Plautia stali intestine virus  33, 30, 26, 5                                     no data available                 30

ᵃ Minor virion components are shown in brackets. These are presumed to be precursors of VP4–VP3.

Fig. 2 Organization of the dicistrovirus genome. The RNA genome contains two non-overlapping open reading frames (ORFs) separated by an intergenic region (IGR) and flanked by the 5′ and 3′ untranslated regions (UTRs). The 5′-proximal ORF encodes the nonstructural proteins: RNA helicase (Hel), 2 × VPg, cysteine protease (Pro), and RNA-dependent RNA polymerase (RdRp). The 3′-proximal ORF encodes structural proteins: VP2, VP4, VP3, and VP1. The internal ribosome entry sites (IRESs) are located within the 5′-UTR and the IGR.


polyprotein include an RNA helicase (Hel), 3C-like cysteine protease (Pro), and RNA-dependent RNA polymerase (RdRp) lying in the order (5′–3′) Hel-Pro-RdRp. The structural proteins that build the capsid are arranged in the order VP2-VP4-VP3-VP1. VP4 is cleaved from VP3 during the maturation of the virion. A small genome-linked virus protein (VPg), derived from ORF1, is covalently attached to the 5′ end of the genome and plays an important role in RNA replication. The VPg sequence is repeated in most dicistrovirus genomes, with the number of repeats being species-dependent. The presence of multiple VPg copies is predicted to allow multiple progeny RNA molecules to be produced per template. The poly(A) tail at the 3′ end of the genomes is presumed to function synergistically with the IRES to enhance IRES-mediated translation efficiency.
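The genome layout described in this section can be sketched as a simple coordinate map. The following Python sketch is illustrative only: the ORF and IGR lengths are the approximate values quoted above, whereas the two UTR lengths (`ASSUMED_5UTR_NT`, `ASSUMED_3UTR_NT`) are placeholder assumptions, since UTR length varies between species and no specific values are given here. The class and function names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    length_nt: int  # approximate length in nucleotides


# ASSUMPTION: UTR lengths vary between species; these two values are
# placeholders chosen only so that the example adds up to a plausible genome.
ASSUMED_5UTR_NT = 500
ASSUMED_3UTR_NT = 200

# ORF and IGR lengths are the approximate figures quoted in the text.
GENOME = [
    Region("5' UTR (5'-UTR IRES)", ASSUMED_5UTR_NT),
    Region("ORF1 (non-structural: Hel-Pro-RdRp, VPg)", 5500),
    Region("IGR (IGR IRES)", 190),
    Region("ORF2 (structural: VP2-VP4-VP3-VP1)", 2600),
    Region("3' UTR", ASSUMED_3UTR_NT),
]


def layout(regions):
    """Return (name, start, end) tuples using 1-based inclusive coordinates."""
    coords = []
    pos = 1
    for region in regions:
        coords.append((region.name, pos, pos + region.length_nt - 1))
        pos += region.length_nt
    return coords


def total_length(regions):
    """Sum of all region lengths in nucleotides."""
    return sum(region.length_nt for region in regions)


if __name__ == "__main__":
    for name, start, end in layout(GENOME):
        print(f"{start:>5}-{end:<5} {name}")
    print("total:", total_length(GENOME), "nt")
```

With these placeholder UTRs the total falls inside the 8500–10,000 nt range given for the family; the two ORFs and the IGR alone account for roughly 8290 nt, so the UTRs together make up the remainder.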

Virus Replication and Genome Expression The ways in which the dicistrovirus capsid proteins interact with host cellular receptors to facilitate entry are not fully understood. It is likely, however, that dicistroviruses enter host cells via clathrin-mediated endocytosis. The recent development of infectious full-length clones of dicistroviruses will facilitate the study of virus replication and genome expression. Dicistroviruses are considered to use the same general replication mechanisms common to picornaviruses. Following viral attachment to the host cell, conformational changes occur in the capsid, facilitating genome release into the cytoplasm. The RNA genome is immediately translated into polyproteins. The multiplication of the viral genomic RNA involves the synthesis of a negative-strand RNA intermediate from the positive strand, which, in turn, serves as a template for the production of progeny positive-strand RNAs. The viral RNA-dependent RNA polymerase (RdRp), named 3Dpol, is primarily responsible for RNA synthesis. 3Dpol uses VPg (also called 3B) at the 5′ terminus of the genomic RNA as a primer to initiate the replication process. The first step in protein-primed initiation is the uridylylation of a tyrosine residue at position 3 of VPg by 3Dpol. The resulting VPg-pUpU then serves as a primer for the synthesis of both negative- and positive-strand RNA at the 3′- or 5′-end of the template genomes. The viral genomic RNA also functions as mRNA, which is translated into proteins for the production of new virion materials. The initiation of protein synthesis occurs immediately upon entry into the host cell and coincides with the shutdown or downregulation of host cell protein synthesis. For example, CrPV infection inhibits the host translational machinery, a process called host shutoff, which is concomitant with an increase in CrPV viral protein synthesis.
Translation of the dicistronic RNA genome of dicistroviruses proceeds directly from internal ribosome entry sites (IRESs), RNA stem-loop structures located within the 5′-UTR and the IGR. Although the mechanism of IRES-directed translation initiation has not been fully defined and varies among different IRES elements, the IGR IRES of dicistroviruses is one of the best-studied IRES elements. The IGR IRES can initiate translation without codon-anticodon base pairing between the initiation AUG codon and the initiator Met-tRNA. Moreover, the IGR IRES can directly recruit and bind the ribosome in the absence of canonical initiation factors. The activities of the IGR IRES depend on two conserved and independently folded domains that work in concert to mediate ribosome recruitment and direct translation initiation. In most cases, the initiation codon is CCU (Pro) while the first codon of VP2 encodes an alanine residue. It has been shown empirically that the glutamine at the 5′ end of PSIV VP2 can be replaced with any other amino acid to produce a mature protein in an in vitro translation system. In contrast, the 5′-UTR IRES is not well conserved within the group and there are no clear structural homologies between the 5′-UTR IRES and the IGR IRES. Translation activity of the IGR IRES is stronger than that of the 5′-UTR IRES. However, a study with CrPV showed that IGR IRES translational activity is coupled with 5′-UTR IRES translational activity, and that IGR IRES translation of capsid proteins is stimulated when the expression of non-structural proteins is active.

Host Range To date, all members of the Dicistroviridae have been isolated from arthropods. Most dicistroviruses have multiple hosts. The hymenopteran viruses Acute bee paralysis virus (ABPV), Kashmir bee virus (KBV), IAPV, and BQCV, which were originally identified in the European honey bee (Apis mellifera), have recently been detected in other bee species. This is likely due to the sharing of foraging sites as well as the high mutation rates of RNA viruses, which drive viral adaptation and host expansion. In addition, it is evident that these bee viruses are capable of replicating in the ectoparasitic mite Varroa destructor, which acts as a vector of viruses in honey bee colonies. Studies have shown that the host of predilection for Solenopsis invicta virus 1 is the red imported fire ant, Solenopsis invicta. However, in areas where S. invicta and its congener Solenopsis geminata are sympatric, the virus will infect both species. SINV-1 has also been detected in the black imported fire ant, S. richteri. Dicistroviruses found in Hemiptera (the combined orders of Heteroptera and Homoptera) also have multiple hosts. Rhopalosiphum padi virus (RhPV) has been isolated from laboratory and field populations of the aphids Rhopalosiphum padi, R. maidis, R. rufiabdominalis, Schizaphis graminum, Diuraphis noxia, and Metopolophium dirhodum. Aphid lethal paralysis virus (ALPV) has been isolated from several species of aphids, including R. padi, M. dirhodum, and Sitobion avenae; more recently, it has been identified in the honey bee A. mellifera. Himetobi P virus (HiPV) and Plautia stali intestine virus (PSIV) are found in true bugs rather than aphids, with HiPV having been isolated from the leafhoppers Laodelphax striatellus, Sogatella furcifera, and Nilaparvata lugens, and PSIV from the brown-winged green bug, Plautia stali.
TrV is also a virus of hemipterans and has been found in the hematophagous triatomine bug Triatoma infestans, which is also a vector of the protozoan agent responsible for Chagas disease in South America.


DCV has a host range restricted to dipterans and has been isolated from Drosophila melanogaster and the sibling species D. simulans. Taura syndrome virus (TSV) is a virus of penaeid shrimps and has been isolated from a number of shrimp species including Litopenaeus vannamei, L. stylirostris, Metapenaeus ensis, and Penaeus monodon. P. chinensis is highly susceptible to infection in experimental bioassays. CrPV was originally isolated from the field crickets, Teleogryllus oceanicus and T. commodus. This virus has an exceptionally broad host range as it has subsequently been isolated from a further 20 species belonging to five taxonomic orders: Orthoptera, Hymenoptera, Lepidoptera, Hemiptera, and Diptera, as well as a range of cultured insect cells. CrPV is the only dicistrovirus isolated from lepidopterans – in fact from ten lepidopteran species. All of the above records refer to the natural host range of the viruses. To a certain extent, studies on the experimental host range of dicistroviruses are limited and those that have been carried out have not substantially extended the known host ranges. Again the exceptions are CrPV and DCV. CrPV replicates in a number of established insect cell lines including those from Drosophila, the hemipteran Agallia constricta, and the lepidopterans Pieris rapae, Plutella xylostella, Spodoptera ornithogalli, and Trichoplusia ni. In addition to insect cell lines, CrPV has also been found to replicate readily in larvae of the greater waxmoth, Galleria mellonella. This is an easy insect to rear and maintain and virus yields can be very high. Apart from CrPV, the only other dicistrovirus shown to replicate in cultured cells is DCV, which multiplies in several Drosophila cell lines (some DCV isolates also replicate in the greater waxmoth). Reports that TSV can replicate in some mammalian cell lines have never been substantiated and may simply be attributable to the production of a cytopathic effect in the absence of virus replication.

Pathology and Transmission Dicistroviruses are significant pathogens of agricultural and health importance. Dicistrovirus infections vary considerably in virulence and pathogenicity. While in some instances dicistroviruses may not produce noticeable pathological symptoms, most dicistrovirus infections result in impaired functions, severe disease, reduced life expectancy, and increased mortality in infected hosts. RNA interference (RNAi) is a major immune defense mechanism against viruses in arthropods. Viruses, in turn, have evolved to encode RNA silencing suppressor proteins that counteract the antiviral defenses of their hosts. The activity of virus-encoded suppressors of host RNAi (VSRs) has been identified in dicistroviruses including DCV, CrPV, and IAPV. This evolutionary arms race between dicistroviruses and their hosts shapes the pathogenicity of the viruses in their hosts. Most dicistroviruses exhibit a tissue tropism toward some part of the alimentary canal, often replicating in epithelial cells of the gut and subsequently shedding virus particles into the gut lumen, where the virus accumulates in the feces and serves as inoculum. DCV is a viral pathogen of the model organism D. melanogaster, which has often been used for deciphering virus-host interactions. DCV infection triggers nutritional stress in infected flies as a result of intestinal obstruction; the virus displays an affinity for the crop, the food-storage organ of the digestive tract. The related virus, CrPV, does not trigger the same pathology. The hymenopteran viruses ABPV, IAPV, KBV, and BQCV are highly pathogenic to the honey bee, A. mellifera. The viruses attack all developmental stages and castes of honey bees and replicate in most tissues of infected bees, including the gut, fat body, mandibular glands, hypopharyngeal glands, hemolymph, and nerve tissue.
ABPV and IAPV infections are associated with paralysis of adult bees, abnormal trembling of the wings and body, flightlessness and mortality. ABPV, IAPV, and KBV infections can be lethal at times, especially in the presence of the parasitic Varroa mite. BQCV affects mainly developing queen larvae and pupae, causing infected pupae to turn dark and quickly die, with the wall of the queen cell becoming a dark color, a characteristic symptom of infection. The hymenopteran virus infections in honey bees have often been linked to worldwide colony losses, including the losses caused by the colony collapse disorder (CCD), a devastating malady that wiped out bees by the millions in the U.S. during the winter of 2006–2007. The fecal-oral route features prominently in the transmission of SINV-1. Uninfected fire ants acquire SINV-1 infections when exposed to baits containing crude homogenates of SINV-1-infected worker ants and purified preparations of the virus. Virus transmission in this manner is easy and facilitates its use as a biopesticide (bait formulation) to control this pest ant. Fire ant colonies infected with SINV-1 can exhibit a chronic asymptomatic condition, or, under certain stressors, the colony exhibits significant mortality. Interestingly, SINV-1 prevalence in field colonies of fire ants is correlated strongly with increasing temperature; the virus prevalence is very high during warmer seasons and nearly undetectable during colder parts of the year. TSV and Mud crab virus (MCV) are two dicistroviruses of crustaceans. MCV is associated with sleeping disease (SD) in farmed Portunidae crabs. TSV is responsible for one of the more devastating diseases of penaeid shrimp, Taura syndrome, which affects the aquaculture industry worldwide. 
TSV infects postlarval, juvenile, and adult stages of shrimp; it has a broad tissue tropism and the infection can spread to different types of tissues, including the antennal gland, stomach, gill, lymphoid organ, sub-cuticle, and connective tissues. TSV infection involves acute and chronic phases at the farm level. During the acute phase of the infection, infected shrimp display a variety of disease symptoms including anorexia, discoloration, lethargy, and erratic swimming behavior. The mortality of infected hosts can be up to 95%. Shrimp that survive the acute stage pass through a transition phase and then become long-term (chronic) carriers of the virus. The Dicistroviridae family includes members whose hosts are agricultural and urban insect pests. SINV-1 was the first virus identified in the red imported fire ant, Solenopsis invicta, a pest whose sting is known to cause serious medical problems in people. SINV-1 infects all developmental stages and caste members of the fire ant, exhibiting a tissue tropism for gut tissue, especially the midgut. S. invicta queens infected with SINV-1 have lower body weights, reducing the probability of colony


founding. The potential use of SINV-1 as a biopesticide for pest control is currently under investigation. RhPV, ALPV, HiPV, and PSIV infect hemipteran pests that are important vectors of plant diseases. The viral infections primarily affect gut tissues of the hemipteran hosts. The host of TrV is a blood-sucking reduviid bug, which is an important vector of the protozoan parasite Trypanosoma cruzi responsible for Chagas disease in humans. TrV is the sole viral pathogen of triatomine bugs and replicates within the intestinal epithelial cells of the hosts, causing a delayed molting cycle and a high mortality rate; its use as a biological control agent for vectors of Chagas disease has been proposed. HoCV-1 infects and causes increased mortality in the polyphagous glassy-winged sharpshooter, an important insect vector of the xylem-limited bacterial plant pathogen Xylella fastidiosa. CrPV, which grows rather efficiently in Drosophila, is highly pathogenic to the European olive fruit fly, a possible target pest for this virus. Transmission of dicistroviruses involves both horizontal and vertical pathways. Ingestion and passage through the alimentary canal feature prominently in the acquisition and spread of dicistrovirus infections. Studies with the hymenopteran dicistroviruses ABPV, IAPV, KBV, and BQCV demonstrate that transmission in honey bees involves multiple pathways including food-borne, venereal, vector-borne, and vertical transmission. The detection of honey bee dicistroviruses in colony food stores, the digestive tract of infected bees, and feces suggests that horizontal transmission via the fecal-oral route is a prominent mechanism of dissemination. The detection of viruses in adult drones, semen, and the spermatheca of queens (the organ in which sperm is stored) implies transmission via the venereal route, where viruses are transmitted from infected males to females during mating.
The most consequential route of horizontal transmission is via the parasitic mite Varroa. The Varroa mite acquires the virus from infected bees and transmits it to healthy bees. In addition to facilitating virus transmission, Varroa parasitism also suppresses honey bee immunity, thereby activating virus replication and increasing the susceptibility of infected hosts to further pathogenic infection. The detection of viruses in eggs, as well as in larval stages of bee hosts that are not normally associated with the virus vector, suggests that viruses might be transmitted vertically from infected queens to their offspring. Evidence of vertical transmission has also been identified in other dicistroviruses. ALPV RNA can be detected in the developing embryos inside infected females of the aphid host, R. padi, suggesting that ALPV can be vertically transmitted. RhPV can be transmitted not only vertically between aphid hosts but also horizontally via the plant, which serves as a passive reservoir. However, RhPV does not replicate in the plant. DCV is vertically transmitted and is associated with transovum transmission on the egg surface. A similar transovum transmission is also found with CrPV, where the surface sterilization of eggs with dilute hypochlorite blocks the transmission of the virus to emerging nymphs. Evidence of horizontal transmission is also documented for other dicistroviruses. The transmission of TSV is predominantly by cannibalism of infected or dead shrimp by healthy shrimp. The virus can also be spread from one farm to another by seagulls and aquatic insects. In the case of TrV and its host, T. infestans, the virus is transmitted through the fecal–oral pathway, where infected insects excrete the virus in their feces and healthy insects become infected by coprophagy.
The leafhopper-infecting virus Homalodisca coagulata virus 1 (HoCV-1) spreads most readily in high population densities through contact among infected individuals, contact with virus-contaminated surfaces, and/or as an aerosol in leafhopper excreta. With DCV, the natural route of transmission in Drosophila is poorly characterized, as the majority of virus infection assays in Drosophila utilize injection. However, when uninfected males are placed with infected females, the males become infected (and vice versa). In fact, even if uninfected female flies are placed on the same media that infected males have been allowed to feed on for several hours, the females become infected, clearly indicating that the virus is transmitted horizontally among individuals.

Geographic and Strain Variation Some dicistroviruses infect specific hosts with a wide geographic distribution or have been found to infect a number of species spread over a large geographic range. Several studies have looked at strain variation between geographic isolates of dicistroviruses using a variety of techniques. CrPV and DCV possess a range of biological and serological characteristics that differentiate geographical isolates. With CrPV, two major serogroups have been identified that separate Australian and New Zealand isolates from North American isolates. With DCV, isolates from different localities varied both in their pathogenicity and in virus yield after being injected into virus-free flies. Additional genetic studies on CrPV and DCV have utilized ribonuclease T1 fingerprinting and, subsequently, polymerase chain reaction and restriction endonuclease analysis. Estimates of the maximum nucleotide divergence between isolates of CrPV and DCV were about 10%. In the case of CrPV, the North American isolates were quite distinct from the antipodean isolates, a finding that reflects the serological data. More recently, a number of molecular studies with the honey bee dicistroviruses have been undertaken to determine the levels of genetic variation between isolates of these viruses. In most instances, slightly different regions of the virion protein coding regions (usually regions of VP3) have been used, which makes a direct comparison between studies quite difficult. Nevertheless, IAPV and ABPV isolates show levels of nucleotide identity between 90% and 100%, and isolates from the same geographic region are more closely related than isolates from more distant locations. One such study revealed that viruses isolated from different central European regions were more similar to each other than to isolates from North America or from the UK. Similar patterns are also evident for BQCV and KBV, with nucleotide identity within species at 90%–100%.
To put these values into perspective, the region used for the KBV studies was 75% identical to the same region from ABPV.


For TSV, the situation is slightly different, with nucleotide identities between isolates from North and Central America and Asia ranging from 95% to 100%. These lower levels of diversity may indicate that TSV has rapidly spread into many of the regions and hosts where it is now found, a hypothesis that is supported to some extent by the rapid emergence of the disease over the last two decades. There is also some evidence that suggests there are serological differences between isolates. The RdRp region of the SINV-1 genome was sequenced from infected ant colonies collected from across the USA, northern Argentina, and northern Taiwan. Nucleotide sequences exhibited an overall identity of >90% between geographically separated samples. A total of 171 variable nucleotide sites (representing 22.4% of the region amplified) were mapped across the SINV-1 RdRp alignment and no insertions or deletions were detected. Phylogenetic analysis at the nucleotide level revealed the clustering of Argentinean sequences, distinct from the USA sequences. SINV-1 RdRp sequences derived from populations of S. invicta from northern Taiwan resided within the multiple USA groupings.
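The identity and variable-site figures quoted in this section follow from simple counts over an alignment. The Python sketch below illustrates the arithmetic under the assumption of an ungapped alignment (consistent with the absence of insertions or deletions reported for the SINV-1 RdRp region). The function names and the toy sequences are hypothetical and are not taken from the studies described; the sketch does not reproduce the methods those studies used.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def variable_sites(alignment: list[str]) -> int:
    """Number of alignment columns with more than one distinct nucleotide."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def region_length(n_variable: int, percent_variable: float) -> float:
    """Region length implied by a variable-site count and its percentage."""
    return n_variable / (percent_variable / 100.0)


if __name__ == "__main__":
    # Toy, made-up sequences used only to exercise the functions.
    toy = ["ACGUACGUAC", "ACGUACGAAC", "ACGUACGUAC"]
    print(percent_identity(toy[0], toy[1]))
    print(variable_sites(toy))
    # 171 variable sites reported as 22.4% of the amplified region.
    print(round(region_length(171, 22.4)))
```

Applied to the reported values, 171 variable sites at 22.4% imply an amplified region of roughly 760 nt. Note that the share of variable columns across a multi-sequence alignment can exceed the divergence between any single pair of sequences, which is why 22.4% variable sites is compatible with pairwise identities above 90%.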

Relationships Within the Family Phylogenetic relationships within the Dicistroviridae family are depicted in Fig. 3. The three genera, Aparavirus, Cripavirus, and Triatovirus, form distinct clades. Triatoviruses are distinguished from aparaviruses and cripaviruses by the presence of prominent projections on the virion surface, a DDF motif in the homologous location in VP3, and the absence of icosahedrally ordered VP4. Aparaviruses and cripaviruses differ from each other in distinctive features of the IRES. The typical IGR IRES of cripaviruses has a conserved bulge sequence (UGAUCU and UGC), while those of aparaviruses have a different bulge sequence (UGGUUACCCAU and UAAGGCUU) and an additional stem loop in the 3′ region of the IGR IRES. Phylogenetic analysis of deduced amino acid sequences of the three major coat proteins (ORF 2) shows three distinct clades (Fig. 3), supporting the classification of species into three genera in the Dicistroviridae family.

Fig. 3 Phylogenetic tree constructed from the amino acid identity of the structural proteins encoded by ORF2, showing the relationships among members of the family Dicistroviridae. Numbers on the branches indicate the percentage of bootstrap support out of 1000 replications. The scale indicates amino acid distance. The tree was generated using the neighbor-joining algorithm implemented in MEGA7.


Similarity With Other Taxa Members of the family Dicistroviridae have similarities with viruses in the families Iflaviridae, Picornaviridae, Marnaviridae, and Secoviridae. The genomes of viruses of these taxa are positive-sense ssRNAs with a VPg and a poly(A) tail, and are translated into autoproteolytically processed polyprotein(s). Nonstructural proteins contain sequence motifs for helicase (Hel), 3C-like cysteine proteinase (Pro), and RNA-dependent RNA polymerase (RdRp) with the characteristic gene order (Hel)-(Pro)-(RdRp). Virions contain capsid proteins organized in a module containing three related jelly-roll domains that form non-enveloped, isometric particles of pseudo-T = 3 symmetry and 30 nm diameter.

Acknowledgments We would like to thank Dr. Peter D. Christian for all his help and support.


Entomobirnaviruses (Birnaviridae) Marco Marklewitz, Institute of Virology, Charité – University Medicine Berlin, Berlin, Germany © 2021 Elsevier Ltd. All rights reserved.

Glossary
Antagonist An agent that binds a target, consequently blocking or dampening a biological process.
Cytopathic effect Structural changes of infected cells resulting from a viral infection.
Metagenome Summary of all genomes and genes obtained from an environmental sample. The data are generated through non-targeted shotgun sequencing of nucleic acids extracted from the sample.
Open reading frame Continuous stretch of codons that begins with a start codon and ends at a stop codon and that can be translated into a protein.
Phylogenetic clade A group of organisms derived from a common ancestor species.
Positive-sense Also known as 'plus-strand'; refers to the translation sense of the viral RNA (5′ to 3′), which may be directly translated into the desired viral proteins.
Protease An enzyme that cuts or cleaves proteins.
Prototype virus The first representative virus of a species or genus.
Ribonucleoprotein complex A nucleoprotein that contains RNA.
RNA interference A biological mechanism of inhibiting gene expression or translation. Interfering RNA molecules bind certain mRNA molecules, leading to their neutralization.

Classification (Compact) Entomobirnaviruses are small RNA viruses that infect exclusively insects and they are the only ones in the Birnaviridae family that do. When discovered in 1958 during a series of experiments with Sigma virus (genus Sigmavirus, family Rhabdoviridae), the prototype entomobirnavirus Drosophila X virus was identified as a contaminant in laboratory-reared fruit flies of the species Drosophila melanogaster (Table 1). The virus has also been detected in various Drosophila melanogaster cell lines. However, Drosophila X virus has not yet been detected in wild populations and thus its origin is still unclear. It was speculated that it could have pre-existed in Drosophila broods in a non-pathogenic form or that it might have originated as a contaminant in fetal bovine serum used in cell culture. Over 50 years later, published in 2012, the first tentative entomobirnavirus was identified in free-living insects.

Table 1 Classified and unclassified entomobirnaviruses, their host, country, and year of initial detection. Accession numbers provided refer to the NCBI GenBank database (https://www.ncbi.nlm.nih.gov/genbank/).

Virus | Host | Country | Year | Accession numbers
Drosophila X virusᵃ | Drosophila melanogasterᵇ | France | 1958 | U60650, AF196645
mosquito X virus | Anopheles sinensis | Yunnan province, China | 2009 | JX403941, JX403942
Culex Y virusᵃ | Culex pipiens complex | Bad Segeberg, Germany | 2010 | JQ659254, JQ659255
culicine-associated Z virus | Ochlerotatus caspius, Ochlerotatus detritus | Camargue region, France | 2011 | KF298271, KF298272
Espirito Santo virusᵃ | unknown | Brazil | 2010 | JN589003, NJ589002
Eridge virus | Drosophila immigrans | United Kingdom | 2011 | KU754527, KU754528
unclassified:
Thirlmere virusᵃ | unknown | United Kingdom | 1980 | not available
Port Bolivar virusᵃ | Aedes sollicitans | Galveston County, Texas, United States | 2013 | MT263973, MT263974

ᵃ virus isolate available.
ᵇ laboratory contamination.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21552-5

Entomobirnaviruses (Birnaviridae)


Fig. 1 Phylogenetic relationship of entomobirnaviruses. Phylogenetic analysis of entomobirnaviruses and prototype members of the birnavirus genera Avibirnavirus, Blosnavirus, and Aquabirnavirus. Pictograms illustrate the source of the respective virus: laboratory contamination, mosquito, fruit fly.

Culex Y virus was isolated from hibernating Culex pipiens complex mosquitoes collected in a cave in Bad Segeberg, Germany. Today, natural infections with entomobirnaviruses have been detected in Drosophila immigrans and in various Culicidae sp. mosquitoes of France, China, Germany, and the United Kingdom. The genus Entomobirnavirus belongs to the family Birnaviridae and is currently one of four established genera (Fig. 1). Entomobirnaviruses form a monophyletic clade within the family Birnaviridae and are distantly related to the other three birnavirus genera that comprise vertebrate infecting viruses. Apart from the species Drosophila X virus, which is represented by the prototype entomobirnavirus Drosophila X virus, a second species, mosquito X virus, has formally been established in the genus. In addition, there are six unclassified but suggested entomobirnaviruses that have not yet been officially assigned to species in the genus (Table 1). Drosophila X virus is a widely used model pathogen for the study of different RNAi mechanisms in insect cells. As an insect virus, Drosophila X virus can be propagated in cell culture under standard laboratory conditions. Together with Eridge virus, which has been detected in Drosophila immigrans during a metagenomic study, Drosophila X virus forms a sister clade to the other five mosquito-associated entomobirnaviruses: Espirito Santo virus, Culex Y virus, mosquito X virus, culicine-associated Z virus, and Port Bolivar virus (Fig. 1). However, the host of Espirito Santo virus remains unknown as it has been detected in an Aedes albopictus cell line inoculated with serum from a Dengue virus (genus Flavivirus, family Flaviviridae) infected patient. In addition, Thirlmere virus has been isolated from a water sample in cells derived from Drosophila melanogaster (Table 1). The virus failed to replicate in other insect, plant and various vertebrate cells. 
A relationship between Thirlmere virus and Drosophila X virus has been determined based on antigenic, biochemical, and physical properties. The genome of Thirlmere virus has not yet been sequenced.

Virion Structure Like other members of the family Birnaviridae, entomobirnavirus virions are non-enveloped with a capsid made by the viral protein 2 (VP2) and a diameter of about 60–70 nm (Fig. 2). Drosophila X virus particles have a density of 1.345 g/ml. The virions contain ribonucleoprotein complexes formed by the genome and multiple copies of the capsid protein.

Genome Like all birnaviruses, entomobirnaviruses possess a bisegmented, linear, double-stranded RNA genome encoding several viral proteins (VP). Segment A of Drosophila X virus is about 3.4 kbp in length and encodes a 114 kDa polyprotein precursor, preVP2-VP4-VP3. VP4 (24 kDa) is a viral serine-alanine protease that releases preVP2 (55 kDa) and VP3 (35 kDa) by cleavage at its own N- and C-termini in the polyprotein (Fig. 3). In Drosophila X virus the cleavage sites are located at Ser500/Ala501 (between preVP2 and VP4) and Ser723/Ala724 (between VP4 and VP3) (Fig. 3). The mature VP2 (47 kDa), which represents the viral capsid protein and type-specific antigen, is generated by subsequent serial cleavages of small peptides at the C-terminus of preVP2, most likely by cellular proteases. VP2 dominates in mature virus particles; however, progenitor proteins, including preVP2, can also be found. Multiple copies of the nucleoprotein VP3 are associated with the two genome segments, forming ribonucleoprotein complexes inside the virion. VP3 has also been shown to act as an RNAi antagonist that suppresses antiviral RNAi. An additional protein of unknown function (VP5, 27 kDa) is encoded by a small, 714-nucleotide internal open reading frame. In contrast to Drosophila X virus, Eridge virus, Espirito Santo virus, and Port Bolivar virus, the initiation codon AUG of this open reading frame is mutated to GUG in Culex Y virus, mosquito X virus, and culicine-associated Z virus (Fig. 3). This open reading frame is likely to be expressed by a −1 ribosomal frameshift that is facilitated by a particularly shift-prone (“slippery”) sequence (5′-…UUUUUUAA…) located 22 nucleotides downstream of the VP5 initiation codon and conserved among entomobirnaviruses. Segment B of Drosophila X virus is about 3.2 kbp in length. It encodes the VP1 protein, which represents the viral RNA-dependent RNA polymerase. This protein is covalently linked to the 5′-end of the positive-sense genome strand of both segments (VPg), though free forms can be found inside the virion (Fig. 3). Like all RNA viruses, entomobirnaviruses carry highly conserved motifs in their polymerases that are believed to correspond to RNA polymerase functions. However, entomobirnaviruses lack the highly conserved Gly-Asp-Asp (GDD) motif that is postulated to form the catalytic site of RNA virus polymerases. Instead, the RdRp of the entomobirnavirus Drosophila X virus contains three downstream Asp-Asp (DD) residues. They may function as homologs but are located in a more spatially distinct region than the conserved GDD motif in RNA virus polymerases. For this reason, the polymerases of entomobirnaviruses may form a distinct subgroup of RNA virus polymerases.

Fig. 2 Infected cells and purified virus particles of Culex Y virus. Aedes albopictus cells infected with Culex Y virus 24 h post infection. Cells show cell aggregation as a consequence of the infection with the virus (A). Electron microscopy of purified Culex Y virus particles. Particles were negatively stained using uranyl acetate and suggest icosahedral symmetry. Scale bar represents 100 nm (B).

Fig. 3 Schematic representation of entomobirnavirus genome organization. Numbers refer to nucleotide and amino acid sequence position in the genome and proteins of Drosophila X virus. Protease cleavage sites are shown in bold. VPg = genome-linked RNA-dependent RNA polymerase (VP1).
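The cleavage-site coordinates given in the Genome section for the Drosophila X virus polyprotein can be turned into residue ranges for the three primary products. The following is a minimal sketch, not a published method: the function names are illustrative, and the average residue mass of roughly 110 Da is a common rule of thumb assumed here, not a value taken from the text. The C-terminal end of VP3 is left open because the polyprotein length is not stated.

```python
# Cleavage sites reported for the Drosophila X virus segment A polyprotein.
# Each tuple is (last residue of the upstream product, first residue of the next one).
CLEAVAGE_SITES = [(500, 501), (723, 724)]
PRODUCTS = ["preVP2", "VP4", "VP3"]

# ASSUMPTION: ~110 Da average residue mass (rule of thumb, not from the source text).
AVG_RESIDUE_KDA = 0.110


def product_ranges(sites, names, polyprotein_length=None):
    """Return {name: (start, end)}; end is None when the C-terminus is not specified."""
    ranges = {}
    start = 1
    for name, (last, first_next) in zip(names, sites):
        ranges[name] = (start, last)
        start = first_next
    # The final product runs from the last cleavage site to the polyprotein end.
    ranges[names[len(sites)]] = (start, polyprotein_length)
    return ranges


def approx_mass_kda(start, end):
    """Rough mass estimate from residue count, rounded to one decimal place."""
    return round((end - start + 1) * AVG_RESIDUE_KDA, 1)


ranges = product_ranges(CLEAVAGE_SITES, PRODUCTS)
for name, (start, end) in ranges.items():
    if end is None:
        print(f"{name}: residues {start} to C-terminus")
    else:
        print(f"{name}: residues {start}-{end}, ~{approx_mass_kda(start, end)} kDa")
```

Under this assumption the estimates for preVP2 (500 residues) and VP4 (223 residues) come out close to the 55 kDa and 24 kDa quoted for these proteins, which is a useful sanity check on the reported cleavage positions.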

Life Cycle Knowledge of the replication cycle of entomobirnaviruses is scarce. Transcription and replication are facilitated by the viral RNA-dependent RNA polymerase (VP1). The VP1 protein of Drosophila X virus contains a consensus GTP-binding site and likely has a self-guanylylation activity. Genome replication takes place in the cytoplasm. Translation is performed by the cellular machinery using capped mRNAs that lack 3′-poly(A) sequences. Mature virus particles can be found scattered or arranged as crystalline arrays in the cytoplasm.

Epidemiology Due to their insect-specificity, entomobirnaviruses do not have epidemiological significance. However, entomobirnaviruses have been shown to be present in free-living insects in China, the United States, and different countries in Europe.


Clinical Features Infections with the entomobirnavirus Drosophila X virus have been determined to cause oxygen starvation in Drosophila melanogaster. The symptoms occur three to four days before death in dose-dependent in vivo experiments. However, these flies and Drosophila-derived cell lines can also be persistently infected.

Pathogenesis Knowledge of the pathogenesis of entomobirnaviruses is scarce. Drosophila melanogaster flies infected with Drosophila X virus showed a wide distribution of virus particles in the brain, digestive tract, muscles, Malpighian tubules, ovaries, testis, and thorax. Examination of flies directly after the onset of anoxia showed that gut and trachea cells, as well as the muscle sheath of different organs, were mainly affected. It is speculated that an early invasion of trachea cells by the virus leads to the sensitivity of flies to anoxia. Experimental infections of various vertebrate and plant cell lines showed no replication of these viruses in non-insect cells. Culex Y virus, Espirito Santo virus, and Port Bolivar virus cause a cytopathic effect (CPE) in different cell lines derived from Aedes albopictus (Fig. 2). The CPE is lethal to the cells, which die shortly after being infected with the virus. In contrast, Drosophila X virus and Thirlmere virus are not able to infect cells of Aedes albopictus.

Further Reading Chung, H.K., Kordyban, S., Cameron, L., Dobos, P., 1996. Sequence analysis of the bicistronic Drosophila X virus genome segment A and its encoded polypeptide. Virology 225 (2), 359–368. Cook, S., Chung, B.Y.W., Bass, D., et al., 2013. Novel virus discovery and genome reconstruction from field RNA samples reveals highly divergent viruses in dipteran hosts. PLoS One 8 (11), e80720. Franzke, K., Leggewie, M., Sreenu, V.B., et al., 2018. Detection, infection dynamics and small RNA response against Culex Y virus in mosquito-derived cells. Journal of General Virology 99 (12), 1739–1745. Huang, Y., Mi, Z., Zhuang, L., et al., 2013. Presence of entomobirnaviruses in Chinese mosquitoes in the absence of Dengue virus coinfection. Journal of General Virology 94 (Pt. 3), 663–667. Kelly, D.C., Ayres, M.D., Howard, S.C., et al., 1982. Isolation of a Bisegmented Double-stranded RNA Virus from Thirlmere reservoir. Journal of General Virology 62, 313–322. L’Heritier, P.H., 1958. The hereditary virus of Drosophila. Advances in Virus Research 5, 195–245. Marklewitz, M., Gloza-Rausch, F., Kurth, A., et al., 2012. First isolation of an entomobirnavirus from free-living insects. Journal of General Virology 93 (Pt. 11), 2431–2435. Shwed, P.S., Dobos, P., Cameron, L.A., Vakharia, V.N., Duncan, R., 2002. Birnavirus VP1 proteins form a distinct subgroup of RNA-dependent RNA polymerases lacking a GDD motif. Virology 296 (2), 241–250. Teninges, D., 1979. Protein and RNA composition of the structural components of Drosophila X virus. Journal of General Virology 45, 641–649. Teninges, D., Ohanessian, A., Richard-Molard, C., Contamine, D., 1979. Isolation and biological properties of Drosophila X virus. Journal of General Virology 42, 241–254. Tesh, R.B., Bolling, B.G., Guzman, et al., 2020. Characterization of Port Bolivar virus, a novel entomobirnavirus (Birnaviridae) isolated from mosquitoes collected in East Texas, USA. Viruses 12 (4), (Epub ahead of print). 
van Cleef, K.W.R., Van Mierlo, J.T., Miesen, P., et al., 2014. Mosquito and Drosophila entomobirnaviruses suppress dsRNA- and siRNA-induced RNAi. Nucleic Acids Research 42 (13), 8732–8744. Vancini, R., Paredes, A., Ribeiro, M., et al., 2012. Espirito Santo virus: A new birnavirus that replicates in insect cells. Journal of Virology 86 (5), 2390–2399. Webster, C.L., Longdon, B., Lewis, S.H., Obbard, D.J., 2016. Twenty-five new viruses associated with the Drosophilidae (Diptera). Evolutionary Bioinformatics 12 (Suppl. 2), 13–25.

Relevant Website https://talk.ictvonline.org/ictv-reports/ictv_online_report/dsrna-viruses/w/birnaviridae/1017/genus-entomobirnavirus Genus: Entomobirnavirus. Birnaviridae. dsRNA Viruses. ICTV.

Hytrosaviruses (Hytrosaviridae)
Henry M Kariithi, Kenya Agricultural and Livestock Research Organization, Nairobi, Kenya
Irene K Meki, French National Center for Scientific Research, Montpellier, France
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
Hytrosa Is derived from two Greek words: “hypertrophia” (“excess nourishment”) and “sialoadenitis” (“salivary gland inflammation”)
Hytrosaviridae Is a family of entomopathogenic dsDNA viruses that infect dipteran insects

Glossary
Aposymbiotic Is an organism that is functionally devoid of its natural symbiont (mutualist, commensalist, or parasitic).
Corpora allata/cardiaca Are a complex of endocrine glands that secrete and store juvenile hormones, which are involved in the regulation of metamorphosis and tissue development in insects.
Horizontal transmission Is the direct (e.g., air-borne, food-borne or venereal) or indirect (e.g., intermediate host vector) transmission of an organism among related or unrelated individuals of the same generation of an ecosystem.
Hyperplasia Is enlargement of an organ or tissue caused by an increased proliferation of cells, which become multilayered and are capable of replication.
Hypertrophy Is enlargement of an organ or tissue from increased size of cells, which are incapable of dividing.
Hytrosa Is derived from “hypertrophia sialoadenitis”, the Greek words meaning “excess nourishment” and “salivary gland inflammation”, respectively.
Peritrophic matrix Is a semipermeable membranous layer that lines the insect gut and is composed of proteins, glycoproteins and chitin microfibrils.
Vertical transmission Is the transovum (egg surface) or transovarian (within the egg) transmission of an organism between individuals of successive generations (e.g., from a mother to its offspring).
Vitellogenesis Is the process of egg yolk formation through deposition of nutrients into oocytes or female germ cells involved in reproduction of lecithotrophic (non-feeding larval development) organisms.

Introduction Hytrosaviruses (also known as salivary gland hypertrophy viruses or SGHVs; Hytrosaviridae family) are rod-shaped, enveloped viruses with large, circular double-stranded DNA (dsDNA) genomes of 120–190 kbp in length. These viruses infect the hematophagous tsetse fly (Diptera: Glossinidae), the filth-feeding common housefly, Musca domestica (Diptera: Muscidae) and the phytophagous syrphid fly, Merodon equestris (Diptera: Syrphidae). The distinguishing diagnostic feature of SGHV infections is the induction of salivary gland hypertrophy (SGH) syndrome in the salivary glands (SGs) of their adult insect hosts. Observations of a morphologically (rod-shaped viral particles) and symptomologically (occurrence of SGH symptoms) similar virus in the male accessory gland filaments of the solitary braconid wasp, Diachasmimorpha longicaudata Ashmead (Hymenoptera: Braconidae), suggest the existence of SGHVs in other insect hosts. The intrinsic properties of SGHVs (i.e., chronic covert infection of adult stages of their hosts), combined with the need to dissect SGs for diagnosis, probably hinder the discovery of more members of the Hytrosaviridae family.

The SGH symptoms were first reported in the tsetse species Glossina pallidipes in the early 1930s in Zululand, South Africa, during a survey of the prevalence of African trypanosomes, the causative agents of Africa’s ancient scourge, African trypanosomiasis (sleeping sickness and nagana in humans and animals, respectively) (see Table 1). Decades later (1970s and 1980s), SGH symptoms were noted to occur twice as often in males as in females, to favor development of Trypanosoma parasites, and to cause reproductive dysfunctions in the infected flies. The second description of SGH was in the 1970s in southern France in 31% and 54% of sampled populations of M. equestris var. nobilis and M. equestris var. transversalis, respectively. The third report of SGH symptoms was in the 1990s in adult populations of M. domestica at a dairy farm in Florida, USA, during a survey of parasitic nematode infections.

Of particular interest are the negative impacts of SGHV infections in tsetse mass production factories. These factories are required for production of sexually sterile males that are used for tsetse vector control via the environmentally friendly sterile insect technique (SIT), a form of insect birth control with a solid track record. For instance, in the 1980s, a G. pallidipes colony maintained at the now defunct Kenya Trypanosomiasis Research Institute in Kenya collapsed unexpectedly a few years after its establishment due to poor productivity. Subsequent SGH outbreaks caused the collapse of two more G. pallidipes colonies at a tsetse production facility in Seibersdorf, Austria (1987 and 2001), and another colony at Kality in Ethiopia (in 2012). The poor productivity and eventual collapse of the Seibersdorf colonies jeopardized the implementation of an ambitious SIT-mediated control of G. pallidipes from a 25,000 square kilometer area of under-utilized fertile land in the Southern Rift Valley of Ethiopia. The Ethiopian SIT project was to be a model demonstrating that tsetse flies can be eradicated from mainland Africa, after the hailed successful eradication of G. austeni populations on Zanzibar Island. This turn of events precipitated concerted research efforts to characterize the tsetse virus and develop integrated mitigation strategies to salvage tsetse mass production. The research also included the housefly-infecting virus, which was deemed both a potential risk in the magmeal production industry and a potential biocontrol agent due to its ability to suppress ovarian development in the insect. To date, the genomes of two strains of SGHVs from different tsetse fly populations, and one strain from the housefly, have been fully sequenced.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21553-7

Table 1 Chronological history of the discovery and characterization of SGHVs

Year periods covered: 1930–1940, 1970–1979, 1980–1989, 1990–1999, 2000–2009, 2010–2018

Milestones in SGHV research (in chronological order)
• First observation of salivary gland hypertrophy (SGH) syndrome in Glossina pallidipes in Zululand, South Africa.
• SGH symptoms noted to occur preferentially in males rather than females.
• SGHV reported in other tsetse species (G. morsitans and G. fuscipes).
• SGHV reported in two varieties of the narcissus bulb fly (Merodon equestris var. nobilis and M. equestris var. transversalis) in southern France.
• SGH reported to be a common diagnostic feature in wild populations of G. pallidipes.
• Collapse of a G. pallidipes colony in Kenya attributed to SGHV infections.
• Demonstration of per os transmission of infectious GpSGHV particles in tsetse flies.
• SGHV shown to reduce insemination rates, fecundity and lifespan of colonized G. pallidipes.
• Reports of SGH symptoms in tsetse flies from Zimbabwe and Côte d’Ivoire.
• SGHV linked to poor productivity of two G. pallidipes colonies maintained at the Insect Pest Control Laboratory (IPCL) in Seibersdorf, Austria.
• SGHV demonstrated to be transmitted from artificially infected mothers to their offspring.
• First description of SGHV cytopathology in tsetse salivary glands.
• SGHV infections reported in G. morsitans Swyenatoni and G. brevipalpis.
• SGHV reported for the first time in the housefly, Musca domestica L., in Florida, USA.
• SGHV shown to infect milk glands, mid-gut and male accessory glands of G. pallidipes.
• Collapse of an Ethiopian-derived G. pallidipes colony at IPCL, Seibersdorf, Austria.
• SGHV shown to infect male accessory glands of G. m. morsitans Westwood.
• Full sequencing of genomes of GpSGHV and MdSGHV.
• GpSGHV and MdSGHV proposed and established in a new virus family, Hytrosaviridae.
• Transcription analysis of MdSGHV.
• Description of the full proteome, ultra-structural features and morphogenesis of GpSGHV.
• Demonstration of global distribution of MdSGHV in M. domestica populations.
• SGHV-like virus described in accessory gland filaments of the solitary braconid wasp Diachasmimorpha longicaudata.
• Antiviral drugs (valacyclovir/acyclovir) demonstrated to interrupt GpSGHV replication in G. pallidipes.
• Demonstration that microbiota influence vertical transmission of GpSGHV in G. pallidipes.
• Successful management of GpSGHV infections and complete eradication of overt SGH symptoms in G. pallidipes colonies using a combination of antiviral drugs and modified feeding regimes.
• GpSGHV shown to modulate different molecular pathways in G. pallidipes and G. m. morsitans.
• Full sequencing of the genome of a GpSGHV strain from Ethiopian tsetse flies.
• The first evidence that microRNAs (miRNA) and RNA interference (RNAi) are involved in GpSGHV infections in G. pallidipes.

Taxonomy and Classification When first discovered, the tsetse SGHVs were described as “virus-like particles” due to their morphological resemblance to viruses that had already been described in other organisms such as Drosophila, mosquitoes, and nematodes. The SGHVs were erroneously presumed to be arboviruses (based on their morphological similarities to other viruses that replicate in the SGs and are transmitted by hematophagous arthropods such as mosquitoes, ticks, sand flies and gnats), or nudiviruses (based on their sizes and enveloped rod-shaped morphology). However, full genome sequencing data of SGHVs in 2008 did not support the placement of these viruses within any of the established families of invertebrate viruses. The tsetse fly and housefly SGHVs showed limited but significant gene homologies, and were subsequently classified into separate genera within a new Hytrosaviridae family. Glossina pallidipes salivary gland hypertrophy virus (GpSGHV) is classified in the type species Glossina hytrosavirus of the genus Glossinavirus, and Musca domestica salivary gland hypertrophy virus (MdSGHV) is classified in the species Musca hytrosavirus of the genus Muscavirus. The name of the virus family (Hytrosa) was derived from the Greek words hypertrophia and sialoadenitis, meaning “excess nourishment” and “salivary gland inflammation”, respectively. Compared to GpSGHV, MdSGHV is more structurally and symptomologically similar to the virus that induces SGH symptoms in M. equestris (MeSGHV).


Fig. 1 Maximum likelihood (ML) phylogenetic trees of conserved hytrosavirus (SGHV) genes. Based on DNA polymerase, SGHVs relate more closely to linear dsDNA viruses (e.g., Herpesviridae, Iridoviridae and Phycodnaviridae) than to circular dsDNA viruses (A). Panels B and C show ML trees of concatenated sequences of per os infectivity factors (P74 or PIF-0, PIF-1, PIF-2 and PIF-3) and ODV-E66, respectively. SGHV PIFs branched more distantly from nudiviruses than from baculoviruses. GenBank accession numbers of the viruses are indicated in tree branches (square brackets). Virus abbreviations: ASFV, African swine fever virus; GpSGHV-Uga/Eth, Glossina pallidipes SGHV Uganda/Ethiopia strain; MdSGHV, Musca domestica SGHV; HHV-6B, Human betaherpesvirus 6B; PCMV, Porcine cytomegalovirus; BoHV-1, Bovine herpesvirus 1; RFHV, Retroperitoneal fibromatosis-associated herpesvirus; MfRV, Macaca fuscata rhadinovirus; ApMV, Acanthamoeba polyphaga mimivirus; PbCV-1, Paramecium bursaria Chlorella virus 1; EsV-1, Ectocarpus siliculosus virus 1; SfAV-1a, Spodoptera ascovirus 1a; LDV-1, Lymphocystis disease virus 1; IIV-6, Invertebrate iridescent virus 6; AtIV, Aedes taeniorhynchus iridescent virus; AmEPV, Amsacta moorei entomopoxvirus L; CbEP, Choristoneura biennis entomopoxvirus L; VACV, Vaccinia virus; McV, Molluscum contagiosum virus; WSSV, Shrimp white spot syndrome virus; NeleNPV, Neodiprion lecontei nucleopolyhedrovirus; NeabNPV, Neodiprion abietis nucleopolyhedrovirus; HearNPV, Helicoverpa armigera nucleopolyhedrovirus; CpGV, Cydia pomonella granulovirus; PxGV, Plutella xylostella granulovirus; AcMNPV, Autographa californica multiple nucleopolyhedrovirus; LdMNPV, Lymantria dispar multiple nucleopolyhedrovirus; SeMNPV, Spodoptera exigua multiple nucleopolyhedrovirus; OpMNPV, Orgyia pseudotsugata multiple nucleopolyhedrovirus; OrNV, Oryctes rhinoceros nudivirus; HzNV-2, Helicoverpa zea nudivirus 2; HzNV-1, Heliothis zea nudivirus 1; GbNV, Gryllus bimaculatus nudivirus; PmNV, Penaeus monodon nudivirus; HgNV, Homarus gammarus nudivirus.

Similarities With Other Virus Taxa GpSGHV and MdSGHV are structurally similar to members of other arthropod-infecting virus families such as Baculoviridae, Nudiviridae, and Nimaviridae. The SGHVs share 12 of the 38 core genes that have been described in baculoviruses, nudiviruses, nimaviruses, and some bracoviruses. The structural and genomic features shared by SGHVs and other large dsDNA viruses include enveloped, rod-shaped virions, circular dsDNA genomes, and replication in the nucleus of infected insect cells. However, the SGHVs functionally differ from viruses such as the baculoviruses in the lack of occlusion bodies and lower lethality (i.e., SGHVs rarely kill their insect host). Based on the DNA polymerase II (B-family; polB) gene, which is present and conserved in all large dsDNA viruses, SGHVs relate more closely to invertebrate large linear dsDNA viruses than to circular dsDNA viruses. The linear dsDNA viruses that cluster together with SGHVs include members of the families Herpesviridae (120–240 kb), Iridoviridae (140–303 kb), Poxviridae (130–375 kb), Phycodnaviridae (100–560 kb), and Mimiviridae (1200 kb) (Fig. 1(A)). At the amino acid level, the polB proteins of GpSGHV and MdSGHV are 31% identical, and their best matches are to the alcelaphine herpesvirus polB (24.9% and 22.6% identities, respectively). The most significant similarity between the SGHVs and other large invertebrate dsDNA viruses is the possession of genes encoding four of the core and highly conserved per os infectivity factor proteins (PIF-0 or P74, PIF-1, PIF-2, and PIF-3) (Fig. 1(B)). The PIFs are essential for entry of baculoviruses and nudiviruses into their hosts’ midgut cells. At the amino acid level, the GpSGHV and MdSGHV PIF homologs are closely related (31%–39% identical) and appear to have evolutionarily diverged more distantly from nudiviruses than from baculoviruses.
The genomes of SGHVs also encode homologs to a late expression protein, and a major structural component of occlusion-derived virus (ODV) envelopes of lepidopteran baculoviruses (ODV-E66 protein). ODV-E66 is a chondroitinase that plays roles during baculovirus infections in the midguts of lepidopteran larvae. The ODV-E66 of SGHVs cluster together with their homologs in baculoviruses (Fig. 1(C)), but its role during the SGHV infection has not been established.
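The percent-identity figures quoted above for polB and the PIF homologs are computed from pairwise protein alignments. The sketch below shows one common way to derive such a number from two already aligned sequences. It is illustrative only: the toy sequences are hypothetical and are not SGHV proteins, and the choice to ignore gapped columns is an assumption, since the source does not state which denominator convention was used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    ASSUMPTION: gapped columns are excluded from the denominator; other
    conventions (e.g., dividing by the full alignment length) also exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real viral sequences).
toy_a = "MKT-LLDYA"
toy_b = "MRTGLL-FA"
print(f"{percent_identity(toy_a, toy_b):.1f}% identical")
```

Because the result depends on the alignment and on how gaps are counted, identity values from different studies are only comparable when the same convention is used.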


In addition to polB and PIFs, the two SGHVs encode homologs to four of the six subunits of the DNA-dependent RNA polymerase (DdRp) complex reported in baculoviruses and nudiviruses. DNA viruses use the DdRp to transcribe their mRNAs. The DdRp complex subunits present in SGHVs include the late expression factors 4, 5, 8, and 9 (LEF-4, LEF-5, LEF-8, and LEF-9). The multifunctional LEF-4 is an mRNA capping enzyme, LEF-5 is a transcription initiation factor, LEF-8 plays critical roles in the activation of the late gene promoter, and LEF-9 forms a part of the catalytic center of the DdRp complex. It is notable that whereas LEF-9 is encoded by a single open reading frame (ORF) in MdSGHV, baculoviruses and nudiviruses, GpSGHV has two adjacent ORFs coding for the N- and C-terminal portions of this enzyme. Both GpSGHV and MdSGHV lack homologs to the remaining two of the six DdRp subunits, i.e., P47 and very late expression factor 1 (VLF-1) proteins, which play roles in systemic viral infections, and capsid assembly/viral DNA packaging, respectively. Taken together, the presence of the above-mentioned gene homologs in the SGHVs suggests their common ancestry and similar modes/mechanisms of entry into the host cells and gene expression.

Virion Structure The MdSGHV and GpSGHV virions are non-occluded, non-icosahedral, rod-shaped particles, 50–65 nm in diameter and 500–1000 nm in length, and contain a bilayer lipid envelope. The SGHV particles have a density of 1.153 g cm⁻³ on a 10%–60% Nycodenz gradient. There are, however, distinct morphological differences between the SGHV virions. The rigid, rod-shaped GpSGHV virions are longer (50–65 nm × 1000 nm) than those of MdSGHV (65 nm × 550 nm). The enveloped virions of the unclassified MeSGHV measure 50–60 nm in diameter and 500–600 nm in length. The GpSGHV virions are asymmetrical, with a rounded and a conical end, and consist of a thin, central, electron-dense helical nucleocapsid core (40 nm in diameter). The nucleocapsid core is surrounded by a 10 nm-thick proteinaceous tegument matrix, which is encapsulated by an outer lipid bilayer envelope. The outer surface of the virion is studded with left-handed helical polymeric spikes (13 nm long with 15 nm periodicity) composed of viral and host-derived protein dimers (23 spikes × 24 helical turns = 1104 envelope dimers). GpSGHV particles are more fragile (the viral envelopes are readily depleted by treatment with common buffers), which partially accounts for their low infectivity compared to the MdSGHV particles. The MdSGHV virions have a regularly spaced, braided, bead-like surface topology and are rounded on both ends.

Genome Organization Detection of relaxed and supercoiled bands after gel electrophoresis of purified viral DNA, and the absence of end-labeling of undigested DNA suggested that SGHV genomes were circular dsDNA molecules. Currently, two strains of GpSGHV isolated from Ugandan (GpSGHV-Uga; GenBank Acc. no. EF568108) and Ethiopian G. pallidipes (GpSGHV-Eth; GenBank Acc. no. KU050077.1) have been fully sequenced. The organization and the main features of the genome of GpSGHV-Eth compared to that of GpSGHV-Uga are schematically represented in Fig. 2. The conserved regions of GpSGHV-Uga and GpSGHV-Eth genomes are 98.1% similar at the nucleotide level. However, compared to the GpSGHV-Uga, the GpSGHV-Eth genome has insertions and deletions in 17 and 20 ORFs, respectively, while 11 ORFs are deleted and 24 ORFs are novel. Only one MdSGHV strain has so far been fully sequenced (MdSGHV GenBank Acc. no. NC_010671.1), but other uncharacterized isolates of the virus have been detected in housefly populations sampled from various geographical locations. The MdSGHV genome presumably has a genome organization similar to that of GpSGHV. The GpSGHV-Uga, GpSGHV-Eth and MdSGHV differ in their genome sizes (190,032 bp, 190,291 bp and 124,279 bp, respectively), and G + C contents (28.0%, 27.9% and 43.5%, respectively). Approximately 2.8% and 1.7% of the GpSGHV and MdSGHV genomes, respectively, harbor tandem direct repeat sequences (drs). The numbers and sizes of the drs differ among the SGHV strains: 14 drs (52–147 bp) in GpSGHV-Eth, 15 drs (52–246 bp) in GpSGHV-Uga, and 18 drs (30–380 bp) in MdSGHV. The drs are clustered in some genomic regions, but they are less clustered in MdSGHV than in GpSGHV. Three GpSGHV-Eth drs (dr9, 11, and 13) and seven MdSGHV drs (dr5, 6, 7, 10, 16, 17, and 18) are localized within coding regions of specific ORFs. All the GpSGHV-Uga drs are localized in noncoding regions. 
In baculoviruses, similar repeat regions are usually located between genes, and are hypothesized to influence viral replication (as transcription enhancers and/or origins of DNA replication), evolution, and pathobiology. The SGHVs lack the palindromic homologous repeat (hrs) regions found in other invertebrate circular dsDNA viruses such as baculoviruses, nudiviruses and nimaviruses. Most of the ORFs in the genomes of GpSGHV harbor the TATA-like box promoter elements (89.7%), poly(A) signals (83.9%), and late transcriptional initiation motifs (T/G/A)TAAG (35.5%). The transcriptional elements in the MdSGHV genome are largely unknown, but most of its ORFs are enriched with the TAAG motifs, poly(A) signals and TATA-like box promoter elements. Notably, 13 GpSGHV-Eth genes are non-canonical, five of which are homologs to known regulatory genes, including the MAL7P1.132 gene, lef-9, thymidylate synthase (ts), cluster of differentiation antigen protein-48 (cd-48) and metalloendopeptidase (mp-nase).
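The late transcriptional initiation motif (T/G/A)TAAG mentioned above translates directly into a regular expression. The following is a minimal sketch of scanning a region upstream of an ORF for this motif. The function name and the toy sequence are hypothetical and chosen for illustration; the sketch scans one strand only and does not reproduce the annotation pipeline used for the SGHV genomes.

```python
import re

# (T/G/A)TAAG written as a character class followed by TAAG.
# The lookahead lets finditer report overlapping hits as well.
LATE_MOTIF = re.compile(r"(?=([TGA]TAAG))")


def find_late_motifs(upstream: str):
    """Return (0-based position, motif) for each (T/G/A)TAAG hit on the given strand."""
    seq = upstream.upper()
    return [(m.start(), m.group(1)) for m in LATE_MOTIF.finditer(seq)]


# Hypothetical upstream region (not taken from an SGHV genome).
toy_upstream = "ccgtATAAGttcTTAAGgcaCTAAGaa"
for position, motif in find_late_motifs(toy_upstream):
    print(position, motif)
```

In the toy sequence, ATAAG and TTAAG are reported, whereas CTAAG is not, because C is not one of the three bases allowed at the first position of the motif.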

Modes of Infection and Gene Expression The mechanism(s) of SGHVs life cycle within their insect hosts is unclear. Proteomic and transcriptomic studies have however provided evidence for the expression of pif genes, but it is unclear whether PIFs are involved in the virus initial attachment and


Hytrosaviruses (Hytrosaviridae)

Fig. 2 Circular representation of the GpSGHV-Eth genome and its comparison (percentage identity) to the GpSGHV-Uga genome. The open reading frames (ORFs) are presented starting with the ATG initiating codon of the p74 gene (coded by SGHV-Eth001). The outermost two concentric rings and the arrows represent the positions and orientations of the transcription potential of each ORF. BLASTn comparison (98.1% identity at the nucleotide level) between the two SGHV genomes is indicated by the third ring (from outside). The black peaks represent the deviation from the average % G + C content of the entire genome. ORFs are color-coded based on their homologies between the two genomes as shown in the figure key. The names of the annotated ORFs are indicated. Unlabeled ORFs (light gray color) represent hypothetical/uncharacterized ORFs. The figure is drawn to scale. Gene abbreviations: odv-e66, occlusion-derived virus envelope protein 66; paxp-1, paired box protein Pax-1; lef, late expression factor; ADPRase, ADP-ribose pyrophosphatase; pif, per os infectivity factor; RpoD, RNA polymerase sigma (σ) factor; cd-48, Cell division cycle protein 48; chmU, chlacomycin gene U; cg-30, Zinc finger protein CG30; pe38, Major immediate early protein 38; gadd-34, growth arrest and DNA damage protein 34; trap, trp RNA-binding attenuation protein; bcr-abl, an oncogene arising from the fusion of the breakpoint cluster (bcr) gene with chromosomal Abelson murine leukemia (c-abl) proto-oncogene; ABC transporter, ATP-binding cassette transporter.

infection of the host’s midgut epithelial cells. Available data suggest low frequency of per os infection, most likely due to the peritrophic matrix (PM) barrier that may reduce the penetration of viral particles into the hemocoels. Following initial oral infection, SGHVs exploit the host’s tracheal system as a conduit to breach the basal laminal barrier in the midgut to access the SGs, the primary tissue of viral replication. Viral capsids are released into the cell cytoplasm via fusion of viral and host’s vesicle membranes, and then trafficked (via the host’s microtubule motor complexes) to the cell nucleus for viral gene transcription, viral DNA genome replication, packaging and assembly of progeny nucleocapsids in the virogenic stroma. Based on the complexity and significant homologies of SGHV-expressed proteins to well-annotated proteins in other viruses, the current school of thought is that the viral replication involves sequential expression of the immediate early (transcription factors), early (DNA replication genes), and late (viral replication proteins) genes. However, a lack of a suitable cell culture to support SGHV replication has so far hindered the elucidation of the precise transcriptional timings of the SGHV genes. Transmission electron microscopy (TEM) has provided evidence that the SGHV nucleocapsids exit the nucleus of infected SG cells via the nuclear pore complex, and subsequently acquire their glycoprotein-containing envelopes in the cell cytoplasm, presumably via the ER-Golgi system. After maturation, the enveloped viral particles egress by migrating to and budding out of the cell membranes bordering the SG lumens (for MdSGHV) or lysis of the luminal cell membranes (for GpSGHV).

Major Structural Proteins Proteomic studies have revealed that the SGHV structural complex comprises a blend of virally-encoded and host-derived proteins, with the nucleocapsid and envelope sections of the GpSGHV and MdSGHV virions composed of at least 45 and 29 structural proteins, respectively. Of the GpSGHV structural proteins, 10, 15, and 20 proteins are localized in the envelope, nucleocapsid, and tegument matrix, respectively. The structural proteins in GpSGHV and MdSGHV are almost identical, most of

Table 2 Twenty-nine homologous structural proteins of SGHVs and other dsDNA viruses

GpSGHV proteins (a)

MdSGHV homolog

Homologs in other dsDNA viruses

Functional annotation

Homologs to nucleocapsid structural proteins SGHV103 Viral capsid associated-like protein (GpSGHVEth113)



Viral DNA encapsidation.

SGHV062



Viral capsid-associated protein 1054 NeabNPV (VP1054) Smc protein TniAV-2c (ORF147) AmEPV (AMV214) Major core protein P4a (A10L gene) MsEPV (MSV152) LCDV CDC48 protein (ORF 209R)

GpSGHV

SGHV083 SGHV052 Cell division-cycle protein 48 like protein (cd-48) (SGHV107/108) SGHV154

GpSGHV-Eth

P53 transcription factor-like protein (GpSGHVEth069) GpSGHVEth091 Nucleocapsid protein (GpSGHVEth053) cd-48 (GpSGHVEth117/118)

Nucleocapsid protein (GpSGHVEth169)

Homologs to tegument structural proteins SGHV010 Desmoplakin (GpSGHVEth009) ABC transporter ABC transporter (GpSGHVEth071) (SGHV064) SGHV071 putative capsid protein 3 (GpSGHVEth078) SGHV086 Tegument protein (GpSGHVEth094) SGHV110/111 Metalloproteinase (mp-nase)-38 (GpSGHVEth121/122) Homologs to envelope proteins P74 (GpSGHVEth001) P74 (SGHV001) PIF-1 (GpSGHVEth112) PIF-1 (SGHV102) PIF-2 (SGHV053)

PIF-2 (GpSGHVEth054)

PIF-3 (SGHV076)

PIF-3 (GpSGHVEth083)

SGHV045

LEF-3 (GpSGHVEth046)

SGHV072

Thiol oxidase (GpSGHVEth079) Metalloprotease (GpSGHVEth096)

SGHV088

Homologs to infected Class-II chitinase (SGHV027) Core-like protein (SGHV051) SGHV040 SGHV032/33 DNA polymerase (SGHV079)

MdSGHV013 – vacuolar sorting ATPase (MdSGHV033)

Chromosomal structural positioning – Viral DNA assembly. Proteins export to ubiquitin-proteasome complex during late viral infection.



Wiseana iridovirus ORF026





MsEPV desmoplakin (Ac66)



AmEPV ABC transporter (ORF AMV130) ApMV capsid protein 3

Nucleocapsid egress from virogenic stroma. Multifunctional protein.

MdSGHV090 –



Megavirus chiliensis ankyrin repeat protein (ORF398) SpliGV mp-nase-38

Hijacking of the host’s cell ubiquitin machinery. Disrupts plasma membrane to initiate infection.

P74 (MdSGHV039) PIF-1 (MdSGHV029) PIF-2 (MdSGHV089) PIF-3 (MdSGHV106) MdSGHV083

SpliNPV P74 NeabNPV PIF-1

Essential for per os infection.

MdSGHV102

ASFV FAD-dependent thiol oxidase (ORF pB119L) AmEPV vaccinia G1L metalloprotease (AMV258)

matrix metalloproteinase (MdSGHV036)

MdSGHV017

cell-specific viral proteins (ICSVPs) Class-II chitinase – (GpSGHVEth027) LEF-4 (GpSGHVEth052) MdSGHV087 MdSGHV070 LEF-8 (GpSGHVEth041) LEF-9 (GpSGHVEth033/34) MdSGHV074 DNA polymerase DNA polymerase (GpSGHVEth087) (MdSGHV001)

AmFV PIF-2 EupSNPV PIF-3 MaviMNPV LEF-3

PrGV type II chitinase (ORF010) GbNV LEF-4 (Ac90) SpliNPV LEF-8 (Ac50) GbNV LEF-9 (ORF24) AmEPV RNA polymerase TF (AMV054)

Virus replication and late gene expression. Required for correct virus assembly. Morphogenesis of infectious virions.

Promote terminal host liquefaction. Promoter recognition and stabilization of late and very late viral transcripts.

Viral DNA replication.

Table 2 (continued)

GpSGHV | GpSGHV-Eth | MdSGHV homolog | Homologs in other dsDNA viruses | Functional annotation

SGHV078 | Ac81-like (GpSGHVEth086) | Ac81-like (MdSGHV108) | OrNV Ac81-like protein (ORF004) | Virus interaction with host cell cytoskeleton.

Thymidylate synthase (TS) (SGHV035/36) | TS (GpSGHVEth036/37) | TS (MdSGHV012) | MsEPV TS (ORF MSV238) | De novo nucleotide biosynthesis.

(a) Hypothetical (non-annotated) proteins are indicated by their respective ORF numbers in the viral genomes. Abbreviations: PIF, per os infectivity factor; LEF, late expression factor; EupSNPV, Euproctis pseudoconspersa single nucleopolyhedrovirus; AmFV, Apis mellifera filamentous virus; MaviMNPV, Maruca vitrata multi-nucleopolyhedrovirus; SpliNPV, Spodoptera litura nucleopolyhedrovirus; ASFV, African swine fever virus; MsEPV, Melanoplus sanguinipes entomopoxvirus; GbNV, Gryllus bimaculatus nudivirus; NeabNPV, Neodiprion abietis nucleopolyhedrovirus; OrNV, Oryctes rhinoceros nudivirus; AmEPV, Amsacta moorei entomopoxvirus; PrGV, Pieris rapae granulosis virus; TniAV-2c, Trichoplusia ni ascovirus 2c; SMC, structural maintenance of chromosomes protein; BP-NLS, bipartite nuclear localization signal; LCDV, Lymphocystis disease virus; ApMV, Acanthamoeba polyphaga mimivirus.

which are homologs to structural proteins described in several insect viruses. The major structural proteins of GpSGHV, their homologs in other invertebrate dsDNA viruses and their functional annotations are summarized in Table 2. As shown in Table 2, the most notable GpSGHV nucleocapsid structural proteins include homologs involved in viral DNA assembly and encapsidation, chromosomal structural positioning, and protein turnover during late viral infection. The homologous structural tegument proteins include those involved in the disruption of the plasma membrane to initiate infection, nucleocapsid egress from the virogenic stroma, and hijacking of the host cell ubiquitin machinery. Notable envelope structural proteins include homologs involved in oral infection, viral replication and late gene expression, correct virus assembly, and morphogenesis of infectious virions. Additionally, there are proteins that could not be associated with the structural (nucleocapsid, tegument, envelope) components of GpSGHV, and were designated “infected cell-specific viral proteins” (ICSVPs). Notable ICSVPs include homologs to proteins essential for viral DNA replication, late and very late viral transcription, nucleocapsid envelopment, and terminal host liquefaction (Table 2). In addition to the virally-encoded proteins, a mature GpSGHV particle contains at least 50 host-derived cellular proteins, almost 50% of which are tegument proteins. At least 13 of the host-derived cellular proteins are potentially incorporated into the virions, where they may serve specific and auxiliary roles in viral morphogenesis. Although there is no evidence for the presence of host-derived cellular proteins in MdSGHV, the derivation of the viral envelope during the cytoplasmic envelopment process implies incorporation of a complex of host proteins and lipids.

Host Range Tsetse fly is the only known natural host for GpSGHV, in which the virus predominantly causes chronic asymptomatic (covert) infections. GpSGHV is highly specific to tsetse species, and there is no evidence that it infects or replicates in heterologous hosts such as the housefly. The susceptibility of tsetse flies to GpSGHV infection differs widely amongst tsetse species, of which G. pallidipes is the most susceptible. There is evidence for the existence of at least 15 GpSGHV haplotypes whose prevalence differs spatially and genetically among different wild populations of tsetse flies in East, Central, and West Africa. The common housefly (M. domestica L.) is the natural host for MdSGHV, in which the virus causes only acute symptomatic (overt SGH) infections. Under laboratory conditions, MdSGHV can infect other insects such as the obligate hematophagous stable fly (Stomoxys calcitrans), the autumn housefly (Musca autumnalis), and a larval predator of the housefly, the black dump fly (Hydrotaea aenescens). Although MdSGHV does not induce the diagnostic overt SGH symptoms in these other insects, the virus does significantly affect ovarian development and fly mortality in the stable fly and black dump fly. These observations suggest that MdSGHV may reside asymptomatically in other muscids, which could serve as alternative or reservoir hosts. It is, however, unknown whether these alternative hosts can transmit infectious viral particles to healthy conspecifics under natural field conditions. MdSGHV also fails to infect or replicate in tsetse flies when virus suspensions are injected into the flies.

Pathogenesis and Tissue Tropism All SGHVs induce the same gross pathology (i.e., SGH symptoms) in their respective host insects (adult stages), but the cytopathologies are distinct for each genus (Fig. 3). Both glands of the SG pair are equally affected, swelling to four times their normal size or more, and the enlargement typically covers the entire length of the distal regions of the glands. The precise mechanism(s) underlying the development of SGH symptoms are unknown. In tsetse flies, GpSGHV significantly alters the synthesis of host proteins, notably those associated with blood feeding, immunity, cellular proliferation, homeostasis, cytoskeletal traffic and regulation of protein turnover.


Fig. 3 Pathogenesis and the main structural/morphological features of SGHVs. The infected salivary glands (SGs) are swollen to four times their normal size or more (A and D). MdSGHV induces hypertrophy (i.e., enlargement of both cytoplasm and nucleus of SG cells) (B), while GpSGHV induces hyperplasia (i.e., enlarged, multi-layered cytoplasm of SG cells; the nucleus is not enlarged) (E). The MdSGHV and GpSGHV particles are shown in panels C and F, respectively. The nucleocapsid core (nc), tegument (tg) and envelope (en) structural components of the GpSGHV particle are shown in the inset of panel F. Scale bar in C is 300 nm. Adapted from Kariithi, H.M., Meki, I.K., Boucias, D.G., Abd-Alla, A.M.M., 2017. Hytrosaviruses: Current status and perspective. Current Opinion in Insect Science 22, 71–78. doi:10.1016/j.cois.2017.05.009 (with permission).

Infection of non-SG tissues by the SGHVs is associated with various pathological effects, including reproductive dysfunction, female infertility and distorted mating behaviors. These effects are more apparent in flies that exhibit overt SGH symptoms, and the characteristics of the pathologies are distinct for each genus (see below).

Pathogenesis in the Salivary Glands In the tsetse fly, GpSGHV causes SG hyperplasia, whereby the cytoplasmic (but not nuclear) compartments of the infected cells are enlarged. The SG cells become multi-layered (Fig. 3) and are capable of replicating. The cellular proliferation is thought to be part of the host’s response to GpSGHV infection, resulting from virus-induced reprogramming of differentiated SG cells. Notably, only G. pallidipes exhibits overt SGH symptoms, and even in this species, overt SGH symptoms are the exception rather than the rule. Other tsetse species harbor only chronic asymptomatic GpSGHV infections, without any notable host fitness cost (development and survival). The mechanism(s) underlying the switch from asymptomatic to symptomatic infection states are currently unknown, but it is suspected that certain host and/or viral factors, and the host’s microbiota, influence the development of overt SGH. When adult flies are artificially injected (intra-hemocoelically) with GpSGHV suspensions, overt SGH symptoms do not develop in the injected parental generation; rather, the symptoms develop in the F1 progeny produced by the virus-injected mothers. In the housefly, MdSGHV causes SG hypertrophy, whereby both the cytoplasmic and nuclear compartments of the infected gland cells are enlarged (Fig. 3), but the SG cells are incapable of dividing. Unlike GpSGHV, MdSGHV induces overt SGH symptoms in 100% of infected individuals within 3 days post infection (i.e., the virus does not infect asymptomatically). Additionally, older adult houseflies show increased resistance to MdSGHV infection, which is partially attributed to the maturation of the PM barrier in older flies.

Pathogenesis in Other Host Tissues In tsetse flies, overt SGH symptoms are associated with testicular degeneration, ovarian abnormalities, severe necrosis and degeneration of germaria, and reduced development, survival, fertility and fecundity of the flies. The testicular degeneration is characterized by vacuolation of the follicles, causing complete shutdown of spermatogenesis. GpSGHV infection also causes disintegration of the accessory reproductive glands of males. Virus infection of tsetse fly tracheal cells and milk glands may cause extensive hypertrophy, but without hyperplasia. Infected milk glands become necrotic, and the reservoir organelles that store the milk secretions are depleted. Virus-induced midgut cellular necrosis impairs nutrient assimilation, which, combined with the SG


pathologies, results in starvation of infected flies. Together, these pathological effects of the virus largely account for the reduced productivity and collapse of tsetse fly colonies. In the housefly, MdSGHV replicates rapidly, resulting in complete shutdown of vitellogenesis by blocking the production of sesquiterpenoids. The replication of MdSGHV in the cells of the corpora allata and corpora cardiaca partially accounts for the shutdown of vitellogenesis. In addition, the ovaries of viremic housefly females are arrested at a pre-vitellogenic stage. MdSGHV infection also alters housefly mating behaviors: viremic females refuse to copulate when paired with healthy males, and viremic males show reduced avidity to initiate courtship with healthy females. TEM examination of various tissues obtained from flies with overt SGH symptoms provided evidence for the presence of viral particles in multiple other tissues, including the crop, midgut lumens, and ovarioles.

Viral Latency One of the outstanding questions to emerge from SGHV research is how GpSGHV is maintained in an asymptomatic state in the tsetse host. The proposed hypotheses are that the virus (1) is integrated into the host chromosomes; (2) occurs as an episome in the nuclei; or (3) persists as a low-level, covert infection in the tsetse fly organs. In all these cases, it appears that GpSGHV is ‘under the control’ of the host, and the question is what type of control is exerted. Southern blot analysis using selected GpSGHV genes as probes did not provide any evidence for integration of SGHVs into host genomes as proviruses. The predominant asymptomatic GpSGHV infection state is thought to represent either a sub-lethal persistent infection or viral latency. During persistent infections, a virus remains in specific cells, in which progeny virions are continually produced at low levels without excessive damage to the host’s cells. During latency, viral genome copies and proteins are present in infected cells for a period of time, but without detectable formation of infectious viral particles. Additionally, depending on the tissue tropism, a virus can cause both persistent and latent infections in the same host concurrently, but in different tissues. Based on its pathobiology during asymptomatic infections, GpSGHV is thought to exist in both persistent and latent infection states at the same time. A persistent infection state in the tsetse SG tissue is supported by the release of low amounts of virus particles (approximately 10^2 viral genome copies) via saliva during feeding by an asymptomatic fly. In this state, the low level of virus replication does not result in the development of overt SGH symptoms. At the same time, the virus may latently infect non-SG tissues such as the tracheal cells, in which viral DNA, but no viral transcripts, is detectable.
During viral latency or persistence, only minimal numbers of viral genes are expressed to evade the host’s immune system, and the virus does not induce the overt SGH symptoms or reproductive dysfunctions observed during the symptomatic infection state. RNA interference (RNAi) is one possible mechanism by which GpSGHV latency is maintained in tsetse flies. This possibility was informed by findings from other, well-studied insect viral systems, in which viruses are under the control of the host’s RNAi machinery, particularly the small interfering RNA (siRNA) and micro RNA (miRNA) pathways. Research into this aspect of GpSGHV infection has revealed that the virus provokes an RNAi defense response in G. pallidipes, as evidenced by significant downregulation of some key genes in the siRNA arm of the RNAi machinery (e.g., Argonaute) in symptomatic G. pallidipes flies compared to artificially or naturally infected, asymptomatic flies. Knockdown of the Argonaute-2 gene led to increased susceptibility to GpSGHV infection, as evidenced by increased viral replication. Additionally, evidence suggests that the virus alters the host’s miRNA expression profiles in G. pallidipes (thereby targeting the host’s immune genes and participating in viral immune evasion).

Transmission and Epidemiology The transmission dynamics of GpSGHV and MdSGHV are distinct in both wild and laboratory-bred tsetse fly and housefly species, which could be attributed to differences in the hosts’ ecologies and life histories.

GpSGHV Transmission Dynamics in the Tsetse Fly GpSGHV is transmitted both vertically (mother-to-offspring, either trans-ovum or through infected milk gland secretions) and horizontally (fly-to-fly). Horizontal transmission is more prevalent and well documented in tsetse mass-rearing facilities, where the flies are fed using an in vitro membrane-feeding regime. Under these conditions, up to 10 cages of flies (at average densities of approximately 75 flies per cage) are fed on the same membrane in succession. It is estimated that, during a single 10–15 min feeding event, approximately 10^6–10^9 viral genome copies are released via saliva secretions by an infected fly. The viral particles released by infected flies are infectious per os to susceptible uninfected flies that feed on the same membrane. Infection of the tracheal system, milk glands, germline cells, nurse cells, and the oocytes of the ovaries ensures vertical viral transmission to the larvae, which develop within the mother’s womb (adenotrophic viviparity) and feed on milk gland secretions. Additionally, GpSGHV is efficiently transmitted to the F1 progeny produced by intra-hemocoelically injected mothers. These F1 progeny exhibit overt SGH symptoms, the proportion of which gradually increases from the first to the fourth gonotrophic (G) cycle (0%–3% in G1, 10%–30% in G2,


40%–60% in G3, and 100% in G4 onwards). Except in G. pallidipes, injection of GpSGHV into third-instar larvae of tsetse flies does not lead to the development of overt SGH symptoms in the adults that emerge from the virus-injected larvae. Some of the maternally inherited tsetse microbiota have been implicated in GpSGHV vertical transmission, as evidenced by the absence of overt SGH symptoms in the F1 progeny produced by aposymbiotic mothers. Interestingly, the absence and/or presence of certain microbiota such as Wolbachia may influence vertical transmission of GpSGHV. Evidence suggests a causal link between the expression of overt SGH symptoms and microbiota, as attested by the absence of the symbiont in G. pallidipes (exhibits overt SGH) and its presence in most other tsetse species (without overt SGH). There is no evidence for father-to-progeny transmission of GpSGHV, which could be due to reduced virus transmission rates rather than to a total failure of transmission. The mechanisms responsible for the circulation and maintenance of GpSGHV in wild tsetse populations are unclear. Further, there is no evidence for mechanical transmission of the virus via contact between flies, mating, or fecal contamination. This limits horizontal transmission as the main mode of GpSGHV dispersal amongst wild tsetse populations in the field, especially because tsetse flies are known to aggregate on specific mammalian hosts for blood feeding. When feeding on an animal, tsetse flies create blood pools at the bite sites on their hosts. GpSGHV-infected flies could deposit infectious viral particles via saliva into the blood pools, which would be picked up by, and infect, other susceptible flies. This may facilitate fly-to-fly transmission of the virus without the virus necessarily replicating in the mammalian hosts.
The dynamics of such horizontal GpSGHV transmission probably depends on feeding behaviors of specific tsetse species, their host preferences, feeding regimes and susceptibility to GpSGHV infections. Nevertheless, these factors partially explain the significantly low prevalence of GpSGHV in the wild tsetse populations (0.4%–15%) compared to colonized laboratory populations (almost 100% based on a diagnostic polymerase chain reaction [PCR] method).
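The proportions of F1 progeny with overt SGH reported above for intra-hemocoelically injected mothers can be turned into a rough expectation for a cohort. The sketch below only restates those published percentage ranges as a lookup; the function name and the cohort size are hypothetical, and it makes no claim about variation between colonies.

```python
# Share of F1 progeny with overt SGH, by the mother's gonotrophic cycle
# (percent ranges as quoted in the text; the G4 entry covers G4 onwards).
OVERT_SGH_PERCENT = {1: (0, 3), 2: (10, 30), 3: (40, 60), 4: (100, 100)}


def expected_symptomatic(cycle: int, n_progeny: int) -> tuple[float, float]:
    """Return the (low, high) expected number of F1 flies with overt SGH."""
    if cycle < 1 or n_progeny < 0:
        raise ValueError("cycle must be >= 1 and n_progeny must be >= 0")
    low, high = OVERT_SGH_PERCENT[min(cycle, 4)]
    return n_progeny * low / 100, n_progeny * high / 100


# Hypothetical cohort of 200 progeny per gonotrophic cycle.
for g in range(1, 6):
    print(g, expected_symptomatic(g, 200))
```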

MdSGHV Transmission Dynamics in the Housefly MdSGHV is transmitted mainly horizontally during the gregarious feeding of houseflies. There is no evidence of vertical (mother-to-progeny) or sexual transmission of MdSGHV in infected houseflies. During feeding, viremic houseflies release an estimated 10^6 MdSGHV genome copies onto the food substrates; these are infectious and serve as inoculum to infect most (≥65%) of the co-feeding healthy conspecifics. When newly eclosed (approximately 2 h-old) adult houseflies are force-fed with MdSGHV preparations under laboratory conditions, ≥50% of the treated flies display SGH symptoms. Unlike in the GpSGHV-tsetse infection system, MdSGHV-infected houseflies remain infectious until their death. However, the dynamics of MdSGHV per os infection in nature may differ from those in a laboratory setting, because newly eclosed flies do not imbibe food until several hours after eclosion. Within 12–24 h post-eclosion, synthesis of the protective gut PM increases the housefly’s resistance to orally ingested virus. Under laboratory conditions, oral infection of newly eclosed adults with low viral quantities is successful only during the initial six hours post eclosion. Further, oral challenge of housefly larvae with MdSGHV suspensions does not result in detectable SGH symptoms in the adults, implying that MdSGHV is acquired only during the adult stage. This raises the question of how MdSGHV is maintained within wild populations of the housefly. Recently, mechanical transmission of MdSGHV amongst wild populations of houseflies has been suggested as an alternative to per os transmission. This is thought to occur through cuticular wounding events, which circumvent the PM barrier in the midgut by introducing the virus directly into the fly’s hemocoel. The likelihood of mechanical transmission of MdSGHV is partially supported by the increased activity of male houseflies, which increases cuticular wounding, especially during courtship.
Notably, males harbor higher viral titers and prevalence than females. Mechanical transmission of MdSGHV is also supported by the fact that topical application of virus preparations to the labellum and spiracular regions of older housefly adults, which are more likely to be wounded, induces SGH symptoms in 10% and 70% of the flies, respectively.

Diagnosis and Management of SGHV Infections GpSGHV is presumably introduced into tsetse mass-rearing facilities from asymptomatic field-collected materials (e.g., pupae), or those derived from already existing colonies, that are used to establish new colonies or replenish existing ones. The asymptomatic infections and vertical transmission of GpSGHV allow the spread and long-term maintenance of the virus in the colonies. Undefined stress- and/or genetic-related factors may trigger outbreaks of SGH, which result in fly mortality, reduced fecundity and colony collapse.

Diagnosis There are no obvious external clinical signs of SGHV infection in either tsetse flies or houseflies. However, in tsetse flies, hyperplastic SGs appear as pale outlines with irregular ridges in the abdomens of male flies. The discoloration is probably due to the extension of gland cells towards the cell lumina, resulting in constricted gland lumens. Enlarged, chalky-white glands are often observed in MdSGHV-infected houseflies. For the GpSGHV-tsetse model, a simple, sensitive and reliable non-destructive PCR-based diagnostic assay has been developed, which allows the screening of virus infections in individual live


tsetse flies. For this method, a single intermediate fly leg is excised, followed by DNA extraction, PCR amplification of a conserved viral gene (odv-e66), and analysis of the PCR products on agarose gels.
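When individual flies are screened with a PCR assay of this kind, the results are usually summarized as a prevalence. The sketch below is an illustration only: it assumes a simple count of PCR-positive flies out of those tested and uses the Wilson score interval, which is a standard statistical choice made here and not a method described in the source. The function name and the example counts are hypothetical.

```python
import math


def wilson_interval(positives: int, tested: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if tested <= 0 or not 0 <= positives <= tested:
        raise ValueError("need 0 <= positives <= tested and tested > 0")
    p = positives / tested
    denom = 1 + z**2 / tested
    centre = (p + z**2 / (2 * tested)) / denom
    half = z * math.sqrt(p * (1 - p) / tested + z**2 / (4 * tested**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical screen: 15 PCR-positive flies out of 100 tested.
low, high = wilson_interval(15, 100)
print(f"prevalence 15.0% (interval {100 * low:.1f}% to {100 * high:.1f}%)")
```

An interval of this kind makes it easier to judge whether a measured prevalence in a small sample is compatible with a target threshold, although it does not account for assay sensitivity or specificity.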

Management The best option for managing GpSGHV infections in tsetse mass rearing is an integrated approach with two components. The first component, referred to as the clean feeding system (CFS), is based on strict sanitation, regular and routine monitoring of virus infections and of the occurrence of overt SGH symptoms, and quarantine of infected fly colonies. The CFS is a modification of the in vitro membrane-feeding regime that aims to interrupt horizontal virus transmission within the tsetse colonies, and consists of three feeding rounds. On the same membrane, teneral flies are always fed first, the rest of the fly cages are fed in a second round, and the oldest colony flies are fed last in the third round. This prevents uninfected young colony flies from picking up virus that may be deposited onto feeding membranes by older infected flies. The second component is the supplementation of blood meals with antiviral drugs (e.g., valacyclovir), which are administered at low doses (non-detrimental to the fly’s DNA synthesis). Virally encoded thymidylate synthase enzymes convert the drugs into active metabolites, which subsequently block viral replication and thus reduce viral titers and shedding. In practice, CFS breaks the cycles of SGH outbreaks and thus rapidly restores colony productivity (within six months) without additional investments in specialized equipment or reagents. The SGH outbreaks are eliminated by the administration of antiviral drugs, which can be withdrawn in case of undesirable side effects or the development of drug resistance. After withdrawal of the antiviral drugs, CFS protocols alone can reduce viral infections to low levels (i.e., <10% virus prevalence) that do not compromise colony productivity. Overall, CFS is recommended in colonies with low or no observable SGH outbreaks to prevent the development and outbreak of SGH symptoms.
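The three-round feeding order of the CFS can be sketched as a simple sort. This is an illustrative sketch only: the `Cage` record, the `oldest_from_days` cut-off and its default of 60 days are hypothetical, since the source does not state the age at which colony flies count as "oldest".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cage:
    cage_id: str
    age_days: int     # age of the flies in this cage
    is_teneral: bool  # teneral flies are fed first under the CFS


def feeding_round(cage: Cage, oldest_from_days: int) -> int:
    """Return the CFS feeding round (1, 2 or 3) for a cage."""
    if cage.is_teneral:
        return 1
    if cage.age_days >= oldest_from_days:  # hypothetical cut-off for "oldest"
        return 3
    return 2


def feeding_order(cages: list[Cage], oldest_from_days: int = 60) -> list[Cage]:
    """Order cages for one membrane: round 1, then round 2, then round 3."""
    return sorted(cages, key=lambda c: (feeding_round(c, oldest_from_days), c.age_days))


cages = [
    Cage("C3", age_days=75, is_teneral=False),
    Cage("C1", age_days=1, is_teneral=True),
    Cage("C2", age_days=20, is_teneral=False),
]
print([c.cage_id for c in feeding_order(cages)])
```

Sorting by round first and age second keeps the youngest cages ahead of older ones within each round, which matches the stated aim of keeping young flies away from virus deposited by older infected flies.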

Conclusions The SGHVs are a small group of large dsDNA viruses currently restricted to dipteran insects. However, the detection in Hymenoptera of virus particles that induce SGH symptoms and are morphologically similar to GpSGHV and MdSGHV suggests that the family Hytrosaviridae may have more members. The majority of the SGHV genes are hypothetical (unknown functions) and have limited or no homology to known genes in other invertebrate viruses. The pathological differences observed among the SGHVs (MdSGHV vs. GpSGHV, and between different GpSGHV strains) appear to reside in the genetics of both the SGHVs and their respective hosts. The ecologies and the life histories of SGHV hosts may have significantly influenced SGHV-host coevolution. The roles of other co-infecting microorganisms, particularly the microbiota, cannot be overlooked. The GpSGHV-induced expression of overt SGH symptoms in certain Glossina species and not others could be attributed to the hosts’ inherent abilities to mount a robust immune response to counteract the viral infections. It is now evident that different wild and laboratory-bred populations of tsetse fly species are infected by different lineages of GpSGHV, even in G. pallidipes (the only Glossina species known to exhibit overt SGH symptoms). Further, RNAi is a major contributor to the maintenance of the asymptomatic/latent infection state of GpSGHV in tsetse, as evidenced by the virus-induced provocation of an RNAi defense response via the siRNA arm of the RNAi machinery, and modulation of the host’s miRNA profiles during the symptomatic infection state.

Further Reading
Abd-Alla, A.M.M., Cousserans, F., Parker, A.G., et al., 2008. Genome analysis of a Glossina pallidipes salivary gland hypertrophy virus reveals a novel, large, double-stranded circular DNA virus. Journal of Virology 82 (9), 4595–4611.
Abd-Alla, A.M., Kariithi, H.M., Cousserans, et al., 2016. Comprehensive annotation of Glossina pallidipes salivary gland hypertrophy virus from Ethiopian tsetse flies: A proteogenomics approach. Journal of General Virology 97 (4), 1010–1031.
Abd-Alla, A.M.M., Vlak, J.M., Bergoin, M., et al., 2009. Hytrosaviridae: A proposal for classification and nomenclature of a new insect virus family. Archives of Virology 154 (6), 909–918.
Amargier, A., Lyon, J., Vago, C., Meynadier, G., Veyrunes, J., 1979. Discovery and purification of a virus in the gland hyperplasia of insects. A study on Merodon equestris F. (Diptera, Syrphidae). Comptes rendus hebdomadaires des séances de l'Académie des sciences. Série D: Sciences naturelles 289 (5), 481–484.
Coler, R., Boucias, D., Frank, J., et al., 1993. Characterization and description of a virus causing salivary gland hyperplasia in the housefly, Musca domestica. Medical and Veterinary Entomology 7 (3), 275–282.
Garcia-Maruniak, A., Maruniak, J.E., Farmerie, W., Boucias, D.G., 2008. Sequence analysis of a non-classified, non-occluded DNA virus that causes salivary gland hypertrophy of Musca domestica, MdSGHV. Virology 377 (1), 184–196.
Jehle, J.A., Abd-Alla, A.M., Wang, Y., 2013. Phylogeny and evolution of Hytrosaviridae. Journal of Invertebrate Pathology 112 (Suppl. 1), S62–S67.
Kariithi, H.M., 2013. Glossina hytrosavirus control strategies in tsetse fly factories: Application of infectomics in virus management. Wageningen, The Netherlands: Wageningen University and Research.
Kariithi, H.M., Boucias, D.G., Murungi, E.K., et al., 2018. Coevolution of hytrosaviruses and host immune responses. BMC Microbiology 18 (Suppl. 1), 183.
Kariithi, H.M., van Lent, J.W., Boeren, S., et al., 2013. Correlation between structure, protein composition, morphogenesis and cytopathology of Glossina pallidipes salivary gland hypertrophy virus. Journal of General Virology 94 (1), 193–208.
Kariithi, H.M., Yao, X., Yu, F., et al., 2017. Responses of the housefly, Musca domestica, to the hytrosavirus replication: Impacts on host’s vitellogenesis and immunity. Frontiers in Microbiology 8, 583.
Lietze, V.-U., Abd-Alla, A., Vreysen, M., Geden, C.C., Boucias, D.G., 2011. Salivary gland hypertrophy viruses: A novel group of insect pathogenic viruses. Annual Review of Entomology 56, 63–80.
Meki, I.K., 2018. Hytrosavirus in tsetse flies: Phylogeography and molecular mode of action. Wageningen, The Netherlands: Wageningen University and Research.
Orlov, I., Drillien, R., Spehner, D., et al., 2018. Structural features of the salivary gland hypertrophy virus of the tsetse fly revealed by cryo-electron microscopy and tomography. Virology 514, 165–169.
Vreysen, M.J.B., Saleh, K.M., Lancelot, R., Bouyer, J., 2011. Factory tsetse flies must behave like wild flies: A prerequisite for the sterile insect technique. PLoS Neglected Tropical Diseases 5 (2), e907.

Relevant Website www.ictv.global/report/hytrosaviridae Hytrosaviridae. Hytrosaviridae. dsDNA Viruses. ICTV.

Iflaviruses (Iflaviridae) Bryony C Bonning, University of Florida, Gainesville, FL, United States Sijun Liu, Iowa State University, Ames, IA, United States © 2021 Elsevier Ltd. All rights reserved.

Glossary Endogenous viral element (EVE) Virus-derived DNA sequence in the genome of an organism. Internal ribosome entry site (IRES) Complex secondary RNA structure that allows translation initiation independent of a 5′ cap.

UTR Untranslated region in an mRNA at the 5′ or 3′ end of an open reading frame. VPg Viral protein genome-linked; attaches to the 5′ end of positive strand viral RNA and functions to prime RNA synthesis.

Classification The family Iflaviridae is classified under the order Picornavirales (realm Riboviria) and shares a number of important characteristics with other families in that order. Iflaviruses have non-enveloped, icosahedral virions ranging from 22 to 30 nm in diameter (Fig. 1). The virions encase a single copy of a positive sense, single-stranded RNA. The genome is non-segmented and 8–10.5 kb in length, excluding the 3′ poly(A) tail. Translation occurs directly from the genomic RNA via an internal ribosome entry site (IRES) to produce a single polyprotein. This polyprotein is post-translationally cleaved to generate the structural (capsid) and nonstructural proteins. The type species of the family Iflaviridae is Infectious flacherie virus (IFV), and all members of the family as recognized by the International Committee on Taxonomy of Viruses (ICTV; Table 1) belong to the genus Iflavirus. The name “ifla” is derived from Infectious flacherie virus. Iflavirus species are demarcated by host range and by <90% amino acid identity in the sequence of the capsid protein precursor. Based on this, Deformed wing virus (DWV) and Varroa destructor virus 1 (VDV1), with 95% amino acid identity, appear to be different isolates of the same species. An additional 17 unclassified RNA viruses appear to belong to the genus Iflavirus on the basis of their genome organization (Table 2). The diversity of iflaviruses is evident from phylogenetic analysis, which shows three major clades (Fig. 2) with no relationship between virus relatedness and host range. The dicistroviruses Cricket paralysis virus (CrPV) and Drosophila C virus (DCV) group separately from Perina nuda virus (PnV) and Ectropis obliqua virus (EoV), which in turn differ distinctly from other members of the family.
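The identity criterion described above can be illustrated with a minimal Python sketch. It assumes the two capsid protein precursor sequences have already been aligned to equal length with "-" as the gap character; the function names, the gap handling, and the toy sequences are illustrative assumptions rather than an ICTV procedure, and host range (the other criterion) is not modelled.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_criterion(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when identity is at or above the threshold (same-species candidates)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy, made-up aligned fragments (not real viral sequences).
toy_a = "MKTLLVAGAV-SLLQ"
toy_b = "MKTLLVAGAVNSLLQ"
print(percent_identity(toy_a, toy_b))
print(meets_identity_criterion(toy_a, toy_b))
```

In practice the alignment would come from a dedicated aligner, and the choice of how to treat gap columns can shift the result by a few percent, which matters for pairs close to the threshold.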

Virion Structure The smooth-surfaced, icosahedral virions are composed of four structural proteins: three major capsid proteins (VP1-VP3, ranging from 28 to 35 kDa in size) and a minor structural protein (VP4, 4–12 kDa in size). The virion structure is similar to that of the dicistroviruses, for which VP4 is located on the interior of the capsid and may function to bind genomic RNA within the virion. The icosahedral particles have T = 3 symmetry and are made up of 60 protomers with a basic structure similar to those of mammalian picornaviruses. Low pH promotes genome release from the virion, implicating host cell entry via endosomes.

Genome The iflavirus genome is non-segmented and 8–10.5 kb in length, excluding the 3′ poly(A) tail. The genome is translated into a single polyprotein (Fig. 3). Cap-independent translation of the iflavirus polyprotein is mediated by an internal ribosome entry site (IRES) within the 5′ UTR. The polyprotein is cleaved by the virus-encoded protease. Structural proteins are encoded in the 5′-proximal region while non-structural proteins are encoded in the 3′-proximal region of the genome. The structural proteins are preceded by a leader protein (L) of unknown function that is removed from VP2 prior to capsid assembly. The nonstructural proteins are encoded in the order RNA helicase (Hel), a chymotrypsin-like 3C protease (pro), and RNA-dependent RNA polymerase (RdRP), similar to other small RNA viruses. The iflavirus RdRP has a canonical RGD motif, which is typical for RNA polymerases of positive sense, single-stranded RNA viruses. The genome organization of iflaviruses strongly resembles that of picornaviruses. The 11.5 kDa VPg protein covalently associates with the 5′ end of the positive and negative strand viral RNA and may be required to prime RNA synthesis, similar to the situation in Picornaviridae. The location of the sequence encoding the VPg protein has yet to be resolved. The construction of an infectious clone of a Deformed wing virus genome provides a useful tool for increasing understanding of iflavirus biology.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21554-9

Fig. 1 Iflavirus particles. (A) Transmission electron micrograph of Varroa destructor virus 1 (VDV1) particles within a Varroa mite. Bar, 150 nm. (B) Purified VDV1 particles. Bar, 50 nm. Photographs courtesy of Dick Peters. (C) Surface view of a virion of Infectious flacherie virus (IFV) along a 5-fold axis reconstructed by cryo-electron microscopy. Bar, 10 nm. Courtesy of J. Hong.

Table 1 Species within the genus Iflavirus

Species | Virus abbreviation | Accession number | Host
Antheraea pernyi iflavirus | LnApIV | KF751885 | Chinese (oak) tussar moth
Brevicoryne brassicae virus | BBV | EF517277 | Cabbage aphid
Deformed wing virus | DWV | AJ489744 | Honey bee
Dinocampus coccinellae paralysis virus | DcPV | KF843822 | Braconid wasp
Ectropis obliqua virus | EoV | AY365064 | Tea looper
Infectious flacherie virus | IFV | AB000906 | Silkworm
Lygus lineolaris virus 1 | LyLV1 | JF720348 | Tarnished plant bug
Lymantria dispar iflavirus 1 | LdIV1 | KJ629170 | Gypsy moth
Nilaparvata lugens honeydew virus 1 | NLHV1 | AB766259 | Brown planthopper
Perina nuda virus | PnV | AF323747 | Ficus transparent wing moth
Sacbrood virus | SBV | AF092924 | Honey bee
Slow bee paralysis virus | SBPV | EU035616 | Honey bee
Spodoptera exigua iflavirus 1 | SEIV1 | JN091707 | Beet armyworm
Spodoptera exigua iflavirus 2 | SEIV2 | JN870848 | Beet armyworm
Varroa destructor virus 1 | VDV1 | AY251269 | Honey bee

Table 2 Unclassified, putative members of Iflaviridae

Proposed name | Virus abbreviation | Accession number
Bombyx mori iflavirus | BMIV | LC068762
Ceratitis capitata iflavirus 1 | CcIV1 | GAMC01001920
Ceratitis capitata iflavirus 2 | CcIV2 | GAMC01020602
Formica exsecta virus 2 | FeV2 | KF500002
Graminella nigrifrons virus 1 | GnV1 | KP866792
Halyomorpha halys virus | HhV | KF699344
Heliconius erato iflavirus | HeIV | KJ679438
King virus | KgV | KX779454
Kinkell virus | KkV | KU754510
La Jolla virus | LJV | KP714074
Laodelphax striatella honeydew virus 1 | LsHV1 | KF934491
Laodelphax striatellus iflavirus 2 | LsIV2 | KM272628
Moku virus | MV | NC031338
Nilaparvata lugens honeydew virus 2 | NlHV2 | AB826459
Nilaparvata lugens honeydew virus 3 | NlHV3 | AB826460
Osiphanes invirae iflavirus 1 | OiIV1 | KR534892
Thaumetopoea pityocampa iflavirus 1 | TpIV1 | KP217032

Fig. 2 Phylogenetic relationship between viruses in the family Iflaviridae based on RdRP sequences. The dicistroviruses CrPV (Cricket paralysis virus), NP_647481.1, and DCV (Drosophila C virus), NP_044945.1, were used as outgroups. RdRP sequences were aligned using MAFFT version 7 (Multiple Alignment using Fast Fourier Transform) and phylogenetic analysis was conducted using the maximum likelihood (ML) method with 10,000 bootstraps. Evolutionary distance marker indicates 0.3 substitutions per amino acid position. RdRP sequences used were BbV (Brevicoryne brassicae virus), YP_001285409.1; DWV (Deformed wing virus), AST47884.1; DcPV (Dinocampus coccinellae paralysis virus), YP_009111311.1; EoV (Ectropis obliqua virus), NP_919029.1; IFV (Infectious flacherie virus), NP_620559.1; LnApIV (Antheraea pernyi iflavirus), YP_009002581.1; LyLV1 (Lygus lineolaris virus 1), YP_009505598.1; LdIV1 (Lymantria dispar iflavirus 1), YP_009047245.1; NLHIV1 (Nilaparvata lugens honeydew virus 1), YP_009505599.1; PnV (Perina nuda virus), NP_277061.1; SBV (Sacbrood virus), ATA66298.1; SBPV (Slow bee paralysis virus), ADI46683.1; SeIV1 (Spodoptera exigua iflavirus 1), YP_004935363.1; SeIV2 (Spodoptera exigua iflavirus 2), YP_009010984.1.

Fig. 3 Genome organization of viruses within the family Iflaviridae. The genome encodes a single polyprotein with structural proteins (VP1 to VP4) toward the N-terminus, and nonstructural proteins (helicase, protease, RNA-dependent RNA polymerase) toward the C-terminus. An internal ribosome entry site (IRES) in the 5′ untranslated region (UTR) allows for translation directly from the genomic RNA. A covalently linked protein, VPg, is present at the 5′ end of the genome. The polyprotein produced on translation of the single open reading frame is auto-catalytically cleaved to generate the individual structural and nonstructural proteins. There are two systems for naming of iflavirus structural proteins, based on genome position and sequence identity, or on size or molecular mass. The system used by the ICTV, based on molecular mass, is adopted here.

Life Cycle Iflaviruses have been isolated only from arthropods (Table 1), with host ranges limited to one or a few closely related species. Iflaviruses may be vertically and/or horizontally transmitted. For example, DWV is vertically transmitted from queen honey bees to worker and drone progeny, but may also be transmitted from worker to worker either directly through food exchange, or indirectly via the Varroa mite. In the latter case, virus is released directly into the body cavity of the honey bee.

Epidemiology Analysis of the incidence and distribution of honey bee viruses highlighted the dynamic nature of virus infection across hives, locations and seasons. The Varroa mite plays a key role in changing the viral landscape with increased DWV prevalence and titer, and selection of a subset of viral variants. Co-infection of a host by iflaviruses and viruses from other families is typical.

Clinical Signs Iflaviruses are transmitted primarily by ingestion of contaminated diet. Many iflaviruses infect the host persistently without inducing disease symptoms. These viruses are transmitted vertically, which is characteristic of covert, persistent or chronic infection. There are several exceptions, however, in which infection is symptomatic, with signs including discoloration, diarrhea, developmental abnormalities and death of the host. Infectious flacherie virus infects mainly the goblet cells in the midgut epithelium of the silkworm and results in accumulation of fluids, known as the flacherie phenotype. Flacherie disease causes diarrhea and results in significant losses to the sericulture industry in China and Japan. The flacherie phenotype is also observed on infection of other lepidopteran species by other iflaviruses. In addition, fluid accumulation under the skin affects honey bee larvae infected with Sacbrood virus, resulting in fatal sacbrood disease. Iflaviruses of the honey bee have received particular attention as a result of research into global honey bee declines. Viral infection (by DWV in particular) is exacerbated by the presence of the Varroa mite, which vectors viruses and weakens the honey bee, thereby reducing anti-viral defenses. At high titers, DWV impairs honey bee development, resulting in insects with deformed wings that are unable to fly. DWV can also affect learning behavior and aggression, in addition to lifespan.

Pathogenesis Iflaviruses replicate in the cytoplasm of host cells and can be observed as crystalline arrays of virions (Fig. 1). Virus-induced vesicular structures appear to play an important role in virus replication, similar to other picornaviruses. The impact of replication on host cell macromolecular synthesis has not been established for members of this family. Disease manifests when viral titers reach high levels. Information on the manner of disease development is currently limited.

Diagnosis/Detection Methods Serological methods such as western blot and enzyme-linked immunosorbent assay (ELISA) can be used for detection of viral proteins. Iflaviruses (with the exception of DWV and VDV1, which are viruses of the same species) are serologically distinct, with no serological relationships between family members. Reverse transcriptase-polymerase chain reaction (RT-PCR) with primers based on unique genome sequences provides an accurate and specific approach for detection of viral sequence. The use of multiple primer pairs is recommended to reduce the likelihood that the detected viral sequence derives from endogenous viral elements rather than from complete viral genomes. RT-PCR can also be used for detection of negative strand RNA (indicative of virus replication), provided that the appropriate controls (including positive sense RNA) are included. Quantitative PCR (RT-qPCR) can be used for determination of virus titer (genome equivalents) with reference to an appropriate viral sequence standard. A high viral titer would indicate virus replication in a given species, rather than virus that is merely associated with an insect, for example on the surface or in the gut. Cell lines that support replication are available for some iflaviruses, including DWV, PnV, and Lymantria dispar iflavirus 1 (LdIV1). The availability of such cell lines will facilitate further study of the biology of this virus family.
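The conversion of an RT-qPCR readout into genome equivalents by reference to a standard can be sketched as follows. This is a minimal illustration, assuming a dilution series of a quantified viral sequence standard and the usual linear relationship between the quantification cycle (Ct) and log10 of the starting copy number; the function names and the dilution-series values are hypothetical and are not taken from any study cited here.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def genome_equivalents(ct, slope, intercept):
    """Invert the standard curve to estimate copies in an unknown sample."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 would mean perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series of a standard (copies per reaction).
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [30.1, 26.8, 23.5, 20.2, 16.9]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
print(slope, intercept)
print(genome_equivalents(23.5, slope, intercept))
print(amplification_efficiency(slope))
```

Estimates are only meaningful for Ct values that fall within the range covered by the standards; samples outside that range would need to be diluted or the series extended.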

Prevention Maintenance of virus free insect colonies and cell lines in isolated, clean facilities is recommended. Management of the Varroa mite in honey bee colonies is essential for preventing escalation of viral titers to potentially damaging levels. Colonies with enhanced hygienic behavior are better able to survive in the presence of Varroa mites.

Further Reading Carrillo-Tripp, J., Bonning, B.C., Miller, W.A., 2015. Challenges associated with research on RNA viruses of insects. Current Opinion in Insect Science 8, 62–68. Carrillo-Tripp, J., Krueger, E.N., Harrison, R.L., et al., 2014. Lymantria dispar iflavirus 1 (LdIV1), a new model to study iflaviral persistence in lepidopterans. Journal of General Virology 95, 2285–2296. de Miranda, J.R., Genersch, E., 2010. Deformed wing virus. Journal of Invertebrate Pathology 103 (Suppl. 1), S48–S61. Fannon, J.M., Ryabov, E.V., 2016. Iflavirus (deformed wing virus). In: Liu, D. (Ed.), Molecular Detection of Animal Viral Pathogens. Boca Raton: CRC Press, pp. 37–46. Lamp, B., Url, A., Seitz, K., et al., 2016. Construction and rescue of a molecular clone of Deformed Wing Virus (DWV). PLoS One 11 (11), e0164639. Martin, S.J., Highfield, A.C., Brettell, L., et al., 2012. Global honey bee viral landscape altered by a parasitic mite. Science 336 (6086), 1304–1306. McMenamin, A.J., Genersch, E., 2015. Honey bee colony losses and associated viruses. Current Opinion in Insect Science 8, 121–129. Oers, M.M.V., 2010. Genomics and biology of iflaviruses. In: Johnson, K.N., Asgari, S. (Eds.), Insect Virology. Norfolk, UK: Caister Academic Press, pp. 231–250. Runckel, C., Flenniken, M.L., Engel, J.C., et al., 2011. Temporal analysis of the honey bee microbiome reveals four novel viruses and seasonal prevalence of known viruses, nosema, and crithidia. PLoS One 6 (6), e20656. Wilfert, L., Long, G., Leggett, H.C., et al., 2016. Deformed wing virus is a recent global epidemic in honeybees driven by Varroa mites. Science 351 (6273), 594–597.

Relevant Websites https://www.ncbi.nlm.nih.gov/genome/?term=Iflavirus GEO DataSets Result. NCBI. NIH. https://talk.ictvonline.org/ictv-reports/ictv_online_report/positive-sense-rna-viruses/picornavirales/w/iflaviridae ICTV Report on Iflaviridae. https://viralzone.expasy.org/278 ViralZone.

Iridoviruses of Invertebrates (Iridoviridae) İkbal Agah İnce, Department of Medical Microbiology, Acıbadem University School of Medicine, Istanbul, Turkey © 2021 Elsevier Ltd. All rights reserved.

Glossary DE Delayed early gene transcription. IE Immediate early gene transcription. IFN Interferon. IIV Insect Iridovirus. Iridovirid Generic designation of members of the family Iridoviridae.

L Late gene transcription. NCLDVs Nucleocytoplasmic large DNA viruses, a non-taxonomic grouping of large DNA viruses that replicate exclusively in the cytoplasm or, like iridoviruses, in both the cytoplasm and the nucleus.

Introduction Iridovirids, a generic designation for members of the family Iridoviridae, are nucleocytoplasmic large dsDNA viruses (NCLDVs) that infect both invertebrates (invertebrate iridoviruses, IIVs) and ectothermic vertebrates (vertebrate iridoviruses, VIVs). Infections may be asymptomatic or range in severity from minor reductions in host fitness to systemic disease and large-scale mortality. Currently, iridovirids are classified according to virion particle size, host preference, presence of a DNA methyltransferase gene, GC content and phylogeny based on the amino acid sequence of the major capsid protein (Table 1). Although new iridovirids are continuously being discovered, the genomes of only a limited number of IIVs have been sequenced, and it is therefore hard to quantify or estimate precisely the genetic heterogeneity and variability within each genus. Additionally, it is still not clear what type of information or criteria would be sufficient to determine whether newly discovered iridovirids represent new species or are just variants or strains. The advent of phylogenetic analysis based on complete genome sequences now provides a superior method of differentiation and classification (Fig. 1). Using pan-genomic data, which requires the raw sequencing data of the available complete genomes, might well advance the analysis of viral phylogenies and may ultimately lead to the development of new criteria for virus classification in general.

doi:10.1016/B978-0-12-809633-8.21555-0

Table 1 Comparison of Betairidovirinae genera based on their different qualities

Genus | Virion size | Hosts | GC content | DNA methylation
Iridovirus | 120–130 nm | Arthropods, particularly insects | 29%–32% | Absent
Chloriridovirus | 180 nm | Diptera with aquatic larval stages, mainly mosquitoes | 48% | Absent
Decapodiridovirus | 158 nm | Crustaceans | 34.58% | Absent

Fig. 1 RAxML maximum likelihood phylogeny of the family Iridoviridae based on complete iridovirid genome sequences, as described in İnce, İ.A., Özcan, O., Ilter-Akulke, A.Z., Scully, E.D., Özgen, A., 2018. Invertebrate iridoviruses: A glance over the last decade. Viruses 10, E161.

Classification of Iridovirids The invertebrate iridoviruses belong exclusively to the subfamily Betairidovirinae of the family Iridoviridae. This subfamily consists of three genera: Iridovirus, with two species including the type species Invertebrate Iridovirus 6; Chloriridovirus, with five species including the type species Invertebrate Iridovirus 3; and, more recently, Decapodiridovirus, with a single species, the type species Decapod Iridovirus 1. There are 26 core genes present in all 45 iridovirid genome sequences, which form four clades: one clade for each of the three vertebrate-infecting genera (Ranavirus, Lymphocystivirus and Megalocytivirus) and one clade for the invertebrate iridoviruses, recently split into three genera (Iridovirus, Chloriridovirus and Decapodiridovirus). Because the vertebrate and invertebrate iridoviruses have independent monophyletic origins, the subfamilies Alphairidovirinae and Betairidovirinae are now recognized. The currently accepted approach for iridovirid taxonomy is to use the 26 conserved iridovirid genes in phylogenetic reconstructions. Subfamilies are distinguished by primary host species (i.e., vertebrate vs. invertebrate) and methylation content. Genera are distinguished by principal host species, sequence-based phylogenetic analysis and sequence identity, with members of different genera displaying <50% sequence identity among a concatenated set of 26 core genes. Members of a species share >95% sequence identity within the set of 26 core genes, phylogenetic relatedness, a co-linear gene order, and similar genome size (±10%) and G + C content. Although these criteria provide a rationale for the current taxonomy, they are rather poorly defined when it comes to assigning new members to the iridovirids. The genomes of many iridovirids have now been sequenced completely, and phylogenetic reconstructions based on complete genome sequences or genome methylation patterns may provide additional resolution when compared to approaches that use only a subset of genes.

Fig. 2 Graphical representation of the iridovirid structure. Source: ViralZone: www.expasy.org/viralzone. SIB Swiss Institute of Bioinformatics.

Morphology and Composition IIV-6, formerly called Chilo iridescent virus, is the most studied invertebrate iridescent virus (IIV). The 212 kbp IIV-6 genome is packed into an icosahedral capsid with a T = 147 lattice. The virion is composed of four concentric domains: an outer membrane derived from the host plasma membrane (present in virions that are released by budding), a proteinaceous capsid layer, an intermediate lipid membrane with associated polypeptides, and a DNA-protein core (Fig. 2). Initially measured at 120–130 nm in diameter in ultrathin sections, the virion was shown by cryoEM and 3D image reconstruction to have a maximum diameter of 185 nm, including the fibrils emanating from its surface.
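The triangulation number quoted above can be made concrete with a short worked sketch. It uses the standard Caspar-Klug relation T = h² + hk + k² and the idealized counts of 60T quasi-equivalent positions, 12 pentavalent capsomers and 10(T − 1) hexavalent capsomers; the function names are illustrative, and the code only enumerates lattice index pairs consistent with a given T rather than stating which pair a particular virus adopts.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number for lattice indices (h, k)."""
    return h * h + h * k + k * k


def lattice_indices(t: int):
    """All (h, k) with h >= k >= 0 whose triangulation number equals t."""
    pairs = []
    h = 0
    while h * h <= t:
        for k in range(h + 1):
            if triangulation_number(h, k) == t:
                pairs.append((h, k))
        h += 1
    return pairs


def capsomer_counts(t: int):
    """Idealized counts for an icosahedral lattice with triangulation number t."""
    return {
        "positions": 60 * t,        # quasi-equivalent positions
        "pentavalent": 12,          # one at each icosahedral vertex
        "hexavalent": 10 * (t - 1),
    }


print(lattice_indices(147))
print(capsomer_counts(147))
print(lattice_indices(3))
```

The same relation covers the small T = 3 lattice described for the iflaviruses earlier in this volume, so the sketch can be reused for both size extremes.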

IIV-6 Persistence and Sensitivity to External Factors IIVs are highly stable in water, but dry soil rapidly inactivates IIV-6. IIV-6 is thermolabile and is rapidly inactivated once temperatures rise above 55 °C. Furthermore, solar ultraviolet radiation reduces IIV-6 infectivity, especially in aqueous habitats. Exposure to various solvents and enzymes can also affect IIV-6 infectivity, and this sensitivity is seen both in cell culture and in whole insects. In addition, the degree of sensitivity differed depending on whether the assay was performed in cell culture or in whole insects. Notably, when the IIV-6 lipid bilayer was disrupted, the virus was still capable of infecting cultured cells, but not insect larvae when the inoculum was injected into the haemocoel. These observations highlight the need for studies on the role of the lipid component of the virus in (1) the process of entry and (2) host specificity in different hosts and cell lines. This may also explain the observed differences in sensitivity to organic solvents and enzymes documented between vertebrate and invertebrate viruses.

Host Range and Pathology The host range of IIV-6 includes over 100 insect species belonging to six orders: Coleoptera, Diptera, Hemiptera, Hymenoptera, Lepidoptera and Orthoptera. Additionally, IIV-31 can infect several different isopod species, as well as Drosophilidae. Iridovirid-like particles have also been detected in marine invertebrates. Patent infection with IIV turns the host an iridescent blue and is mostly considered lethal. However, inapparent infections are more common and reduce the reproductive capacity and longevity of adult hosts. In some cases, the link between viral presence and mortality is obscure. For instance, a link between insect mortality and co-infection with Nosema species and IIV-6 was suggested, but subsequent studies failed to find a significant correlation between IIV-6 and colony collapse disorder in the USA. IIV-6 can be propagated in nineteen different insect cell lines, with varying levels of susceptibility. In some insects, extracellular host factors may influence viral infectivity, and not all cell lines derived from susceptible hosts can be infected with IIV-6. Interestingly, reptiles and amphibians fed IIV-infected insects appear to become infected, and propagation of IIV-6 has also been achieved in some poikilothermic vertebrate cell lines. Intraperitoneal injections of large doses of native or ultraviolet-irradiated IIV-6 were lethal to frogs and mice, while heat- or antiserum-inactivated IIV-6 had no lethal toxicity.

Vertebrates can become infected with IIVs, and lethality has been observed under laboratory conditions at high doses, but natural infections are rarely symptomatic or lethal, suggesting that IIV replication is more limited in vertebrate hosts and that the host immune system is able to inhibit viral replication. Mass-produced feeder insects might be a source of viral disease for reptiles and amphibians kept as pets. However, large-scale IIV outbreaks have not yet been observed. IIV-6 infection causes different immune responses in vertebrate and invertebrate hosts. In invertebrates (e.g., Drosophila), autophagy and/or apoptosis may not be a typical host response to IIV-6. IIV-6-encoded proteins interfere with insect RNAi immune responses, while the JAK-STAT pathway does not provide immunity against an IIV-6 infection. In mammalian cells, IIV-6, like many other DNA viruses, can induce a type I IFN-dependent antiviral response, which is likely triggered when a cytosolic sensor recognizes the viral dsRNA that is known to be produced in the course of IIV-6 transcription. Overall, studies suggest that IIVs can occasionally cross-infect certain vertebrates, but that their immune systems may be able to inhibit replication under natural conditions, reducing or preventing symptoms. Another possibility is that IIVs do not replicate productively in ectothermic vertebrates and thus do not result in disease.

Genome Organization and Codon Usage The IIV-6 genome consists of a single linear 212,482 bp dsDNA molecule with 28.63% G + C content. Compared to other animal viruses, the IIV-6 genome has a unique arrangement, with circular permutations and terminal redundancies that cause variations in the number of direct terminal repeats during DNA replication and genome packaging. While the coding capacity of the genome was estimated initially at 215 non-overlapping open reading frames (ORFs), the total number of ORFs rises to 468 when both non-overlapping and overlapping ORFs are considered. Whether overlapping ORFs should be taken into account is still a point of debate. Importantly, gene transcripts of iridovirids show no evidence of introns, and viral mRNAs lack poly(A) tails. However, biochemical and in silico evidence suggests the existence of viral microRNAs (miRNAs) that might modulate viral gene expression. In addition, IIVs, apart from IIV-3, do not encode a DNA methyltransferase, unlike VIVs, among which SGIV is the exception. In silico prediction of the methylation sites of 50 iridovirid genomes has demonstrated that there is a clear distinction in methylation patterns between IIVs and VIVs, except for IIV-3 among the IIVs and SDDV and LCDV-1 among the VIVs, which have an as yet unclear phylogenetic status. To date, 26 core genes have been identified in both IIVs and VIVs. They have been linked to virus replication, gene transcription, protein structure and nucleocapsid assembly. The rest of the encoded proteins may have roles in virus-host interactions, hijacking of host cell machinery and establishment of viral infection.
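A G + C percentage such as the one quoted above is a simple base count over the genome sequence. The sketch below shows the calculation on a toy string; it assumes that ambiguous bases (for example N) are excluded from both the numerator and the denominator, which is one common convention and not necessarily the one used for the published value, and the function name is illustrative.

```python
def gc_content(sequence: str) -> float:
    """Percent G + C among unambiguous bases (A, C, G, T) of a DNA sequence."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequence for illustration only (not taken from any viral genome).
toy_sequence = "ATTATAGCATTTAANNATGC"
print(round(gc_content(toy_sequence), 2))
```

For a full genome the sequence string would be read from a FASTA record for the accession of interest, and the same function could be applied in sliding windows to look for regional variation.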

Virion Proteins Virion proteins have been identified for IIV-1, IIV-2, IIV-6, IIV-9, IIV-22 and IIV-25. However, we have observed that the different experimental settings in these studies do affect the resolution of the gel analyses and thus influence previous results and comparisons. More recently, a comprehensive proteomic analysis (LC-MS/MS) detected 54 proteins in IIV-6. Possible functions of fifteen of the putative ORFs were inferred to be: serine-threonine kinase, dual specificity protein phosphatase, DNA polymerase (viral) N terminal domain, carboxy terminal domain phosphatase, ribonuclease III, tyrosine protein kinase, cathepsin, DNA-binding protein, protein disulphide isomerase, lysosome associated membrane glycoprotein, and a ranavirus enveloped protein homolog. The remaining 39 proteins lacked similarity to any other annotated viral proteins and should be investigated further. Genomic analysis of IIV-9 identified 191 predicted genes, with 20% of its repeated sequences located mainly within the coding regions. 97 of 211 IIV-6 genes have detectable orthologs in IIV-9, whereas 108 out of 191 IIV-9 genes have orthologs in IIV-3. Phosphorylation reactions represent common host responses to iridovirid infection. During IIV-6 infection, increased phosphorylation of the ribosome associated proteins in the early phase of the infection cycle requires the expression of the viral genome. However, phosphorylation events were observed in ribosomal proteins when permissive cells were infected with UV-irradiated IIV-6, suggesting that increasing phosphorylation reactions were likely a host cell response rather than being related to an expression of the viral genome. Iridovirid genomes contain multiple genes coding for kinases and phosphatases, regulators of protein activities over the course of the infection. 
Although kinases and phosphatases are among the most abundant proteins, it is challenging to assign functions to these enzymes without expressing them individually in cell culture, making knock-out viruses, unravelling their subcellular localization, or identifying the protein targets that are specifically phosphorylated. One of the viral kinases of IIV-6, the serine/threonine protein kinase 389L (also called iridoptin), was found in purified form to induce apoptosis in insect cells, but it remains uncertain whether this is a virion-associated protein kinase. The functions of many of these viral kinases remain to be determined. However, some significant successes have been achieved with nucleoside analogues, which are “activated” through phosphorylation by viral and/or host-cell nucleoside kinases, the final target being principally the viral polymerase. Hence, these enzymes represent possible new targets for antiviral therapy. Targeting viral polymerases would serve as a safe model system, for example in areas like fish farming, to develop viral disease mitigation strategies, as kinases play crucial roles in guaranteeing successful viral infections. Furthermore, virally encoded kinases with apoptotic features could be candidates for developing pest insect control agents, for example through the generation of transgenic crops expressing viral kinases that show pest insect-specific toxicity.

Viral Entry, Replication, and Release Strategy Iridovirids undergo nucleocytoplasmic replication. Initially, the virus attaches to cell surface receptors and is engulfed by the host cell via clathrin-mediated endocytosis or macropinocytosis in a pH-dependent manner. IIV-6 ORF 096L is proposed to act as an insect cell adhesion molecule (CAM), which may facilitate “fusion” between the virus shell and the cellular membranes, or possibly between the internal lipid membrane and cellular membranes. Following entry into the host cell, immediate early and delayed early transcripts are synthesized. These transcripts encode proteins that are crucial for viral DNA replication and expression of late genes. Newly synthesized viral DNA is then translocated from the nucleus to the cytoplasm, where a second stage of viral DNA replication results in the formation of large DNA concatemers (Fig. 3), which eventually serve as “feedstock” for packaging into virions. The underlying mechanism is not yet understood. Late viral transcripts are synthesized by a virus-encoded RNA polymerase that has not yet been identified in IIVs. However, there are ORFs with homology to RNA polymerase II, indicating that IIVs likely encode a unique viral transcriptase. Evidence from IIVs and VIVs suggests that a virus-encoded transcriptase synthesizes late viral messages. For example, knock down of the large subunit of the viral transcriptase of SGIV eliminates late protein synthesis. Progeny virions accumulate in large paracrystalline arrays and are released from the host cell either by budding from the host cell membrane or by cell lysis (Fig. 3). The specific classes of lipids in the virion structure can influence virus infectivity. In addition, changes in host cellular lipid composition associated with cell proliferation, apoptosis, cellular stress and viral infection can occur during viral replication, affecting the ability of the virus to infect its hosts.

Fig. 3 Schematic representation of iridoviral replication. This figure was obtained from the 10th report of ICTV with permission Ince, I.A., Özcan, K., Vlak, J.M., van Oers, M.M., 2013. Temporal classification and mapping of non-polyadenylated transcripts of an invertebrate iridovirus. Journal of General Virology 94 (1), 187–192. Chinchar, V.G., Duffus, A.L.J., 2019. Molecular and ecological studies of a virus family (Iridoviridae) infecting invertebrates and ectothermic vertebrates. Viruses 11, 538.


Transcriptional Regulation

In IIV-6-infected cells, viral transcription occurs in a regulated temporal cascade of immediate-early (IE), delayed-early (DE) and late (L) gene expression. However, some IE and DE transcripts are still present during the L stage, and no meaningful correlation has yet been identified between transcription and the prevalence of particular proteins during the course of infection. A quantitative proteomic study provided detailed information about the kinetics of IIV-6 protein levels, showing that the products of IE-class transcripts are mainly involved in nucleoside metabolism, blocking host cell apoptosis, post-translational protein modification and transcriptional activation of DE genes. The DE genes encode viral DNA polymerases, protein kinases, and transcriptional activators of late genes. The major capsid protein MCP (274L), one of the most abundant proteins in IIV-6, was expressed with other late-class transcripts. How viral transcription is regulated in different target hosts is yet to be investigated.

Promoter Elements and Transcriptional Regulation

It is difficult to conclusively identify promoter elements that regulate the temporal expression patterns of iridovirid genes. A very limited number of studies have identified motifs in putative promoter regions that may be important for transcriptional regulation. IIV-6 virion proteins are likely able to activate the promoters, either directly or indirectly, to trigger transcription of a few IE genes. Assays confirmed that without virion proteins the promoters remained inactive, even though the reporter plasmid constructs carrying these promoters were transfected. Interestingly, DNApol, helicase, and mcp transcripts of IIV-6 lack poly(A) tails, and no clear polyadenylation signals were found downstream of these three IIV-6 ORFs. To identify the transcription termination signals of IIV-6 genes, ligation-based amplification of cDNA ends was used, showing that about half of all IIV-6 genes contained complementary TAATG and CATTA motifs in the 3′ UTRs of their mRNAs. These paired motifs may be conserved features that enable formation of hairpins which, in the absence of polyadenylation signals, may serve as transcriptional terminators.
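The complementary TAATG and CATTA motifs described above are reverse complements of each other, which is what would allow them to pair into a hairpin stem. The following is a minimal sketch of how such motif pairs could be located in a 3′ UTR sequence. The function names, the `max_gap` loop-size limit and the example sequence are hypothetical illustrations, not taken from the published analysis.

```python
# Minimal sketch: find a motif followed downstream by its reverse complement.
# Function names, max_gap and the example sequence are illustrative assumptions.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif_pairs(utr: str, motif: str = "TAATG", max_gap: int = 40):
    """Return (motif_start, partner_start) for each motif occurrence that is
    followed, within max_gap nucleotides, by its reverse complement."""
    utr = utr.upper()
    partner = reverse_complement(motif)
    pairs = []
    start = utr.find(motif)
    while start != -1:
        window_start = start + len(motif)
        window_end = window_start + max_gap + len(partner)
        hit = utr.find(partner, window_start, window_end)
        if hit != -1:
            pairs.append((start, hit))
        start = utr.find(motif, start + 1)
    return pairs


# Hypothetical 3' UTR fragment, written as DNA.
example_utr = "GGCTAATGCCCGGGAAACATTATTTT"
print(reverse_complement("TAATG"))
print(find_motif_pairs(example_utr))
```

A hit only indicates that a stem could form. Whether a hairpin actually folds and terminates transcription would need structure prediction and experimental support.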

Induction/Inhibition of Apoptosis in Infections

The principles and dynamics of apoptosis in response to iridovirid infection are similar in vertebrates and invertebrates. During the early stages of infection, host cells trigger the apoptotic machinery to impede the cell-to-cell movement of progeny viruses. However, many viruses have evolved mechanisms to protect their progeny viruses from the adaptive immune system, producing anti-apoptotic proteins to enhance the likelihood of spreading to neighbouring cells. At the same time, viruses also contain genes that may stimulate apoptosis, using it to promote viral dissemination to neighbouring cells. Studies found that expression of immediate-early (IE) viral genes initially inhibited apoptosis, which suggests that one or more IE genes likely function as anti-apoptotic factors (e.g., IIV-6 ORF193R, iap gene). However, IIVs also code for pro-apoptotic factors, which trigger apoptosis at later stages of infection and thereby facilitate viral dissemination.

Concluding Remarks

Notwithstanding the above information, knowledge gaps about iridoviruses still exist. It is not known which host and/or viral factors trigger viral entry into and replication in host cells. The regulatory mechanisms of genes expressed at various stages of infection also remain uncharacterized, and these questions will likely only be answered through functional genomics. Identifying the viral proteins responsible for viral infectivity is essential for developing a genetic recombination system for iridovirids, as iridovirid DNA by itself is not infectious. A genetic recombination system will expand the utility of IIVs as a model for identifying and characterizing iridovirid genetic factors that drive virus-host interactions, affect host fitness, and influence the immune response, ultimately helping to limit or prevent disease in aquatic ecosystems and leading to the development of IIVs as biocontrol agents. IIV-6 is a potential biopesticide, as studies have identified virally encoded genes (e.g., viral kinases) possibly linked to host toxicity, which could be exploited as biocontrol compounds. Additional functional genomics studies and the expansion of genomic resources may lead to the identification of new virulence factors and toxins that could be exploited and engineered for targeted control of certain groups of insects. Furthermore, virally encoded kinases could represent potential antiviral drug targets in areas like fish farming, as they play crucial roles in guaranteeing successful viral infections.


Further Reading

Chinchar, V.G., Duffus, A.L.J., 2019. Molecular and ecological studies of a virus family (Iridoviridae) infecting invertebrates and ectothermic vertebrates. Viruses 11, 538.
Chinchar, V.G., Hick, P., Ince, I.A., et al., 2017a. ICTV virus taxonomy profile: Iridoviridae. Journal of General Virology 98, 890–891.
Chinchar, V.G., Waltzek, T.B., Subramaniam, K., 2017b. Ranaviruses and other members of the family Iridoviridae: Their place in the virosphere. Virology 511, 259–271.
Colson, P., De Lamballerie, X., Yutin, N., et al., 2013. “Megavirales”, a proposed new order for eukaryotic nucleocytoplasmic large DNA viruses. Archives of Virology 158, 2517–2521.
Foster, L.J., 2011. Interpretation of data underlying the link between colony collapse disorder (CCD) and an invertebrate iridescent virus. Molecular & Cellular Proteomics 10. (M110-006387).
Ince, I.A., Özcan, K., Vlak, J.M., van Oers, M.M., 2013. Temporal classification and mapping of non-polyadenylated transcripts of an invertebrate iridovirus. Journal of General Virology 94 (1), 187–192.
İnce, İ.A., Özcan, O., Ilter-Akulke, A.Z., Scully, E.D., Özgen, A., 2018. Invertebrate iridoviruses: A glance over the last decade. Viruses 10, E161.
King, A.M., Lefkowitz, E., Adams, M.J., Carstens, E.B., 2011. Virus Taxonomy: Ninth Report of the International Committee on Taxonomy of Viruses. Amsterdam: Elsevier.
Liu, Y., Tran, B.N., Wang, F., Ounjai, P., Wu, J., Hew, C.L., 2016. Visualization of assembly intermediates and budding vacuoles of Singapore grouper iridovirus in grouper embryonic cells. Scientific Reports 6, 18696.
Tokarz, R., Firth, C., Street, C., Cox-Foster, D.L., Lipkin, W.I., 2011. Lack of evidence for an association between Iridovirus and colony collapse disorder. PLOS ONE 6, e21844.
Williams, T., 2008. Natural invertebrate hosts of iridoviruses (Iridoviridae). Neotropical Entomology 37, 615–632.
Wu, J., Chan, R., Wenk, M.R., Hew, C.L., 2010. Lipidomic study of intracellular Singapore grouper iridovirus. Virology 399, 248–256.
Yan, X., Olson, N.H., Van Etten, J.L., et al., 2000. Structure and assembly of large lipid-containing dsDNA viruses. Nature Structural & Molecular Biology 7, 101–103.
Yan, X., Yu, Z., Zhang, P., et al., 2009. The capsid proteins of a large, icosahedral dsDNA virus. Journal of Molecular Biology 385, 1287–1299.

Relevant Website https://talk.ictvonline.org/ictv-reports/ictv_online_report/dsdna-viruses/w/iridoviridae Iridoviridae. Iridoviridae. dsDNA Viruses.

Mesoniviruses (Mesoniviridae)

Jody Hobson-Peters and Daniel Watterson, Australian Infectious Diseases Research Centre, School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD, Australia

© 2021 Elsevier Ltd. All rights reserved.

Classification

The Mesoniviridae is a new family of positive-sense single-stranded RNA viruses belonging to the order Nidovirales. The mesonivirus genome is organized similarly to those of other nidoviruses, and its size of approximately 20–21 kb falls between those of the large nidovirus families (Coronaviridae, Roniviridae: 26–32 kb) and the small nidoviruses (Arteriviridae: 13–16 kb). This prompted the name: “mesos” means “in the middle” in Greek and “ni” refers to nidovirus. Indeed, the mesoniviruses represent an evolutionary bridge across a large gap in RNA genome size. Similar to other members of the order, the enveloped mesonivirus virions are often observed with projections, at times with large globular heads attached to low-density stalks. Mesoniviruses have no known disease association, and host range analysis indicates that their replication is restricted to mosquitoes and that they are unable to replicate in vertebrate cells in culture. Mesoniviruses have been isolated from a variety of mosquito species collected from diverse geographical locations, suggesting that these viruses are likely to be very common in mosquito populations. However, most of the mesonivirus species have been isolated from Culex spp. mosquitoes (Table 1). Current taxonomy for the Mesoniviridae family categorizes the viruses into a single genus, Alphamesonivirus, within the subfamily Namcalivirus (ICTV, 2018). Within eight subgenera, there are nine species designated Alphamesoniviruses 1–9 (Table 1). These assignments were based on genomic sequence analysis using the computational comparative genomics framework DEmARC (DivErsity pArtitioning by hierarchical Clustering) and five replicative protein domains (3CLpro, NiRAN, RdRp, ZBD, and HEL1) that are characteristic of nidoviruses.

The type species, Alphamesonivirus 1, accommodates the two closely related viruses Nam Dinh virus and Cavally virus, which were both described in 2011 and isolated from various mosquito species collected in Vietnam and Côte d’Ivoire, respectively. New isolates of Alphamesonivirus 1 were subsequently cultured from mosquitoes collected in Australia, Austria, China, Indonesia, Mexico, Thailand, and U.S.A (Table 1). The literature suggests that Alphamesonivirus 1 is the most frequently detected mesonivirus across both geographic locations and mosquito genera (Table 1). The remaining eight classified species are mostly represented by single reports from one country, often with isolation from a single genus of mosquito (Table 1). Apart from the isolates of Alphamesonivirus 1, only Casuarina virus (Alphamesonivirus 4) and Nsé virus (Alphamesonivirus 8) have been isolated from multiple genera of mosquitoes. Three viruses, Yichang, Dianke, and Moumo viruses, remain to be assigned (Table 1). Yichang virus was detected at a frequency of 16.5% in pools of mosquitoes collected in Hubei, China. Pairwise evolutionary distances (PED) based on the conserved domains of ORF1b indicate that this virus is divergent from those that have already been classified. Several attempts to isolate Moumo virus from pools of mosquitoes collected in Côte d’Ivoire were not successful. However, a sequence fragment of 7985 nt was elucidated and enabled phylogenetic comparison with other mesoniviruses. Using antiserum produced to a mix of Alphamesoniviruses 1 and 8, conservation of antigenic sites in the S and M proteins between Alphamesoniviruses 1, 5, 8, and 9 was confirmed. The phylogenetic relationship between the four mesonivirus species was further supported by cross-reactivity analysis in Western blots using numerous peptide-specific antisera, whereby cross-reactivity was more likely to be observed between viruses that were more closely related genetically.

Virion Structure

Electron microscopy studies are limited but indicate that mesonivirus virions are spherical and enveloped, ranging in size from 50 to 120 nm (Table 1). Only one study has assessed mesonivirus particles by cryo-electron microscopy, determining Casuarina virus (Alphamesonivirus 4) virions to have a diameter of 65 nm. On the surface of the particles, most mesoniviruses carry large club-shaped projections. Cryo-tomographic analysis revealed that these spike-like projections are attached to the virion via a low-density stalk and display a globular head, with distinct similarities to the well-characterized spikes of the coronaviruses severe acute respiratory syndrome coronavirus (SARS-CoV) and murine hepatitis virus (MHV) (Fig. 1). The reported size of the projections ranges from 3 to 4 nm for Alphamesonivirus 1 through to 15 nm for Alphamesonivirus 4, but this variation may be attributable to differences in imaging (Table 1). Studies using Alphamesonivirus 4 indicate that these viruses lack a dense nucleoprotein core complex.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00007-2

Table 1 Species in the Family Mesoniviridae, Genus Alphamesonivirus

Columns: Subgenus; Species; Name identification; Isolation region; Date of mosquito collection; Mosquito species; Particle morphology; Titer in C6/36 cells/ml

Rows recoverable from the extracted layout (Subgenus; Species; Name identification; Isolation region; Date of mosquito collection):
Namcalivirus; Alphamesonivirus-1; Cavally virus; Côte d’Ivoire; 2004
Namcalivirus; Alphamesonivirus-1; Nam Dinh virus; Vietnam; 2002
Karsalivirus; Alphamesonivirus-2; Karang Sari virus; Indonesia; 1981
Karsalivirus; Alphamesonivirus-3; Dak Nong virus; Vietnam; 2007
Casualivirus; Alphamesonivirus-4; Casuarina virus; Australia; 2010, 2006
Hanalivirus; Alphamesonivirus-5; Hana virus; Côte d’Ivoire; 2004
Ofalivirus; Alphamesonivirus-6; Ofaie virus; Brazil; 2010
Kadilivirus; Alphamesonivirus-7; Kadiweu virus; Brazil; 2010
Enselivirus; Alphamesonivirus-8; Nsé virus; Côte d’Ivoire; 2004
Menolivirus; Alphamesonivirus-9; Méno virus; Côte d’Ivoire; 2004
Unassigned; Unassigned; Moumo virus; Côte d’Ivoire; 2004
Unassigned; Unassigned; Dianke virus; Senegal; 2012/2013
Unassigned; Unassigned; Yichang virus; China; 2014

Entries whose row assignment is not recoverable from the extracted layout, listed by column:
Name identification: Houston virus; Ngewotan virus; Kamphaeng Phet virus; Bontang Baru virus
Isolation region: Australia; Indonesia, Australia, Thailand; Austria, China, U.S.A; Mexico, Mexico; South Korea, Indonesia
Date of mosquito collection: 2007–2014; 2016–2017; 1981, 2015, 1984–85; 2004, 2010; 2017, 2007/08; 2012, 1981
Mosquito species: Cx. nebulosus, Anopheles spp., Culex spp., Aedes spp., Mansonia spp.; Culex spp., Armigeres spp., Anopheles spp.; Uranotaenia chorleyi; Culex spp., Mansonia spp., Culex spp., Culex spp., Aedes spp., Anopheles spp., Anopheles spp.; Cx. vishnui, Cx. tritaeniorhynchus, Cq. xanthogaster, Culex annulirostris; Culex spp., Aedes spp., Anopheles spp., Uranotaenia spp., Culex vishnui, Culex tritaeniorhynchus, Cx. annulirostris, Cq. xanthogaster, Ae. vigilax, Uranotaenia unguiculata, Culex quinquefasciatus, Culex quinquefasciatus, Ae. albopictus, Culex quinquefasciatus, Culex quinquefasciatus, Ae. taeniorhynchus, Culex pipiens, Cx. vishnui, Cx. tritaeniorhynchus, Cx. vishnui, Cx. australicus, Mosquito pool
Particle morphology: 80 nm with 10 nm spikes; 65 nm with 5 nm spikes; 5 nm spikes; 120 nm with 12 nm spikes (three entries); ~50 nm (two entries); 55–65 nm; 50–80 nm, 3 nm spikes; 60–80 nm; 80 nm; 65 nm with 15 nm spikes
Titer in C6/36 cells/ml: 10^7 PFU; 10^8.3 TCID50; 10^8.8 TCID50; 10^9.3 TCID50; 10^11 PFU; 10^8.5 TCID50; 10^9.7 TCID50

Fig. 1 Mesonivirus particle morphology and spike protein architecture. Left and middle panels: Negative stained transmission-electron micrographs of Nam Dinh virus (Alphamesonivirus 1) and Casuarina virus (Alphamesonivirus 4) particles. Spikes on the virion surface are highlighted with white arrows. Right panel: Proposed mesonivirus virion architecture, with trimeric spike proteins (S) projecting from an ordered M protein/lipid bilayer support. Scale bars indicate 50 nm. Images produced by Dr Daniel Watterson, The University of Queensland, Australia.

Fig. 2 Mesonivirus genome organization. Schematic representation of the typical genome of mesoniviruses using Cavally virus (Alphamesonivirus 1) as the representative. The open reading frames are shown by boxes. In ORFs 1a and 1b, the functional domains (3C-like serine protease (3CLpro), RNA-dependent RNA polymerase (RdRp), zinc-binding (Z), helicase (Hel), exoribonuclease (ExoN), N7-methyltransferase (NMT) and 2′-O-methyltransferase (OMT)) are shaded gray and the ribosomal frame shift (RFS) element is shown. ORF2a encodes the S glycoprotein, which is cleaved (indicated by inverted triangles) to generate S1, S2 and an uncharacterized N-terminal fragment. An alternative ORF (2b) is encoded at the N-terminal end of ORF2a and expresses the N protein (colored in orange). The putative membrane protein is encoded by ORF3a. The function of the predicted protein encoded by ORF3b is not known and the putative ORF4 is indicated by a solid black rectangle.

Genome

Mesoniviruses have a positive-sense single-stranded RNA genome with a 5′ cap and a 3′ poly(A) tail. A feature consistent with other nidoviruses is the organization of the mesonivirus genome into discrete ORFs, each encoding separate functional domains. The mesonivirus genome encodes seven ORFs, ordered ORF1a-ORF1b-ORF2a-ORF2b-ORF3a-ORF3b-ORF4, which are bounded by 5′ and 3′ untranslated regions (Fig. 2). The two largest ORFs, ORF1a and ORF1b, are located at the 5′ end of the genome and overlap by a few nucleotides, with translation regulated by a ribosomal frameshift (RFS) motif with the sequence GGAUUUU, which allows the two respective polyproteins, pp1a and pp1ab, to be expressed. A second possible RFS also exists between ORFs 3a and 3b. Downstream of ORF1b are multiple smaller ORFs that code for the virion proteins. These ORFs at the 3′ end of the genome are expressed from two subgenomic RNAs, as confirmed by Northern blot. Transcription-regulating sequences (TRSs) that are predicted to control the synthesis of the subgenomic RNAs have been identified.
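The ribosomal frameshift motif GGAUUUU named in the Genome section can be located in a genome sequence with a plain string search. The sketch below is only an illustration under stated assumptions: the function name, the `orf_start` parameter and the example sequence are hypothetical, and finding the heptamer says nothing about whether frameshifting actually occurs at that position.

```python
# Minimal sketch: locate the slippery heptamer and report its codon phase.
# Function name, orf_start parameter and example sequence are assumptions.

SLIPPERY_MOTIF = "GGAUUUU"  # frameshift motif quoted in the text


def find_slippery_sites(rna: str, orf_start: int = 0):
    """Return (position, phase) for every occurrence of the motif.

    position is the 0-based index of the first motif nucleotide; phase is
    its offset (0, 1 or 2) within the codons of an ORF starting at orf_start.
    DNA input is accepted and converted to RNA.
    """
    rna = rna.upper().replace("T", "U")
    sites = []
    pos = rna.find(SLIPPERY_MOTIF)
    while pos != -1:
        sites.append((pos, (pos - orf_start) % 3))
        pos = rna.find(SLIPPERY_MOTIF, pos + 1)
    return sites


# Hypothetical fragment around an ORF1a/ORF1b-like overlap.
example_rna = "AUGGCCGGAUUUUAGC"
print(find_slippery_sites(example_rna))
```

Reporting the codon phase alongside the position helps judge whether a hit sits in a plausible register relative to the upstream ORF.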

ORF1a/ORF1b

ORFs 1a and 1b code for the key replicative enzymes. The replicase genes include conserved functional domains that are present in mesoniviruses and larger nidoviruses. The viral 3C-like chymotrypsin-like protease (3CLpro) domain is flanked by transmembrane domains (TM) and is translated from ORF1a (Fig. 2). Consistent with the role of 3CLpro in other nidoviruses, studies with Alphamesonivirus 1 3CLpro have confirmed its autocatalytic activity, as well as its role in the processing of the enzymes encoded by ORF1b (Fig. 2). However, the mesonivirus 3CLpro does feature distinct substrate specificities in comparison to other nidoviruses. Another attribute of some mesoniviruses is that the N terminus of ORF1a can be variable in length, due to the presence of large block insertions of up to almost 200 amino acids in viruses such as Yichang, Karang Sari, and Bontang Baru viruses. Variations on the sequence SKRKGK are commonly seen at the termini of these insertions, but the function of the insertions is unclear. Encoded by ORF1b are the remaining replicative enzymes: an RNA-dependent RNA polymerase (RdRp), a complex zinc-binding motif (Z) linked to a superfamily 1 helicase (Hel), a 3′-to-5′ exoribonuclease (ExoN), an N7-methyltransferase (NMT) and a 2′-O-methyltransferase (OMT) (Fig. 2). As detailed above, these enzymes are processed from the polyprotein by 3CLpro and putative host cell protease(s). NendoU is the only replicase domain that is part of the typical constellation of seven conserved domains of other nidoviruses but is not encoded by mesoniviruses. Of note is the acquisition of ExoN by an ancestral group from which the mesoniviruses descended. This facilitated the increase in replication fidelity necessary for the evolution of the coronaviruses, which boast the largest RNA genome size. It is the presence of conserved functional ExoN and OMT domains within the genomes of mesoniviruses and other larger nidoviruses that distinguishes them from the smaller nidoviruses.


Structural Proteins

The structural and accessory proteins are encoded by five ORFs at the 3′ end of the mesonivirus genome and are translated from subgenomic mRNAs (Fig. 2). Analysis of the mesonivirus genome suggests the presence of eight structural proteins, including the spike glycoprotein (S protein, S1 and S2 subunits) (ORF2a), nucleocapsid protein (N) (ORF2b), and four membrane-spanning proteins (M protein: p17, p18, p19, and p20) (ORF3a). Protein expression analysis could not detect proteins from ORF3b or ORF4. The spike protein encoded by ORF2a is translated as a precursor polyprotein, which undergoes post-translational proteolytic processing to form the spike proteins. N-terminal sequence analysis of Cavally virus and Dak Nong virus spike proteins confirmed the processing of the ORF2a product by a signal peptidase to create S (p77–80) and by a furin-like protease to generate the two spike proteins, S1 (p23) and S2 (p55–57), which share a common N terminus and C terminus with S, respectively. While the signalase cleavage site appears to be relatively variable between the mesoniviruses, the furin-like motif is conserved. The S1 and S2 proteins are post-translationally modified by N-linked glycosylation. The extreme N-terminus of the ORF2a translation product exhibits high sequence variability between the mesoniviruses and has not been detected, either as part of a full-length ORF2a polyprotein or as part of the signalase-cleaved product. An alternative ORF (ORF2b) is, however, encoded within this region (Fig. 2). Translation of this ORF results in the production of the putative 25 kDa hydrophilic nucleoprotein. Immediately downstream of ORF2a are two overlapping ORFs, ORF3a and ORF3b, the former encoding the putative glycosylated membrane (M) protein. In silico analysis of ORF3b proteins suggests features consistent with a class II transmembrane protein, although there is poor sequence conservation and a function for the protein remains to be established. At the 3′-terminal end of the genome, most mesoniviruses encode an additional putative ORF (ORF4). While it is not known whether ORF4 is expressed in cells, and analysis of the putative protein does not yield any indication of possible function, there is a high level of sequence identity at the N-terminal region. Experiments using antiserum produced to the first 223 aa of ORF2a and to the predicted proteins of ORF3b and ORF4 of Cavally virus did not provide any evidence for the presence of these proteins in virus-infected cells.
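The furin-like cleavage of the spike precursor described above can be illustrated with a simple motif scan. The sketch below assumes the general minimal furin consensus R-X-(K/R)-R, with cleavage after the final arginine. The text does not state the exact mesonivirus motif, so the pattern, the function names and the example peptide are all hypothetical.

```python
# Minimal sketch: scan a protein for a furin-like motif and split it there.
# The R-X-[KR]-R pattern is the general minimal furin consensus and is an
# assumption here; the example peptide is invented for illustration.
import re

# Lookahead so that overlapping motif occurrences are all reported.
FURIN_LIKE = re.compile(r"(?=(R.[KR]R))")


def find_furin_like_sites(protein: str):
    """Return (start, motif, cut_index) tuples; cleavage is taken to occur
    after the last arginine, i.e. between cut_index - 1 and cut_index."""
    protein = protein.upper()
    return [(m.start(), m.group(1), m.start() + 4)
            for m in FURIN_LIKE.finditer(protein)]


def split_at_first_site(protein: str):
    """Split the sequence at the first predicted site, or return it whole."""
    sites = find_furin_like_sites(protein)
    if not sites:
        return (protein.upper(),)
    cut = sites[0][2]
    return (protein[:cut].upper(), protein[cut:].upper())


example_peptide = "MKTLLAGRSKRSVDE"  # hypothetical sequence
print(find_furin_like_sites(example_peptide))
print(split_at_first_site(example_peptide))
```

A motif match is only a prediction. The cleavage sites reported in the text were established by N-terminal sequencing of the processed proteins.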

Life Cycle

The tropism of mesoniviruses is restricted to insect cells, and perhaps only to mosquitoes, although robust analysis of replication in a variety of insect cells is warranted. Numerous in vitro infection studies on a range of mammalian and avian cell lines have provided evidence that mesoniviruses do not replicate in vertebrate cell lines. In mosquito cell culture, the degree of cytopathic effect varies depending on the virus. In the Aedes albopictus mosquito cell line C6/36, mesoniviruses replicate to very high titers, frequently exceeding 10^8 infectious units per ml (Table 1). Replication in C6/36 cells is also rapid. Within 24 h, 10^9 to 10^10 RNA copies/ml can be detected in culture supernatant, peaking at 10^11 RNA copies/ml by 48 h postinfection. The viruses also replicate readily in cell lines of Culex mosquito origin, but to lower titers. There is evidence that insect-specific viruses of other families are maintained within the mosquito population via vertical transmission. The detection of Alphamesonivirus 1 RNA in both male and female mosquitoes provides some evidence that mesoniviruses are similarly transmitted from the female mosquito to her progeny. The detection of Dianke virus in pools of male mosquitoes further supports the likelihood of vertical transmission. However, more studies in this area are warranted. It has been shown that mesoniviruses disseminate into the legs and wings of naturally infected mosquitoes and, intriguingly, viral nucleic acid was detected in mosquito saliva expectorated onto honey-baited cards. These data suggest that there may be numerous routes of transmission between mosquitoes.
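Titers in this chapter are quoted on a log10 scale and in two different units, TCID50 and PFU (Table 1). As a rough aid for comparing them, the sketch below converts a log10 titer to a linear value and applies the commonly used theoretical approximation that one TCID50 corresponds to about 0.69 PFU. That factor follows from a Poisson model (ln 2 ≈ 0.69) and is an assumption here, not a value measured for mesoniviruses. The function names are hypothetical.

```python
# Minimal sketch: convert log10 titers and approximate PFU from TCID50.
# The 0.69 factor is the theoretical Poisson-based approximation, assumed
# here for illustration; real assays can deviate substantially from it.
import math

POISSON_FACTOR = 0.69  # approx. ln 2, theoretical PFU per TCID50


def titer_from_log10(log10_titer: float) -> float:
    """Convert a log10 titer (e.g. 8.3) to a linear value per ml."""
    return 10.0 ** log10_titer


def tcid50_to_pfu(tcid50_per_ml: float, factor: float = POISSON_FACTOR) -> float:
    """Approximate PFU/ml from TCID50/ml under the Poisson assumption."""
    return factor * tcid50_per_ml


for log10_tcid50 in (8.3, 8.8, 9.3):
    pfu = tcid50_to_pfu(titer_from_log10(log10_tcid50))
    print(f"10^{log10_tcid50} TCID50/ml ~ 10^{math.log10(pfu):.2f} PFU/ml")
```

On a log scale the conversion shifts a titer by only about 0.16 log10 units, so it does not change the order-of-magnitude comparisons made in the text.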

Reference ICTV, 2018. ICTV Master Species List 2018b.v2. Available at: https://talk.ictvonline.org/files/master-species-lists/m/msl/8266.

Further Reading Gorbalenya, A.E., Enjuanes, L., Ziebuhr, J., Snijder, E.J., 2006. Nidovirales: Evolving the largest RNA virus genome. Virus Research 117 (1), 17–37. Hall, R.A., Bielefeldt-Ohmann, H., McLean, B.J., et al., 2016. Commensal viruses of mosquitoes: Host restriction, transmission, and interaction with arboviral pathogens. Evolutionary Bioinformatics Online 12 (Suppl. 2), 35–44. Lauber, C., Goeman, J.J., del Carmen Parquet, M., et al., 2013. The footprint of genome architecture in the largest genome expansion in RNA viruses. PLoS Pathogens 9 (7), e1003500.

Nimaviruses (Nimaviridae)

Peter J Krell, Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON, Canada
Emine Ozsahin, University of Guelph, Guelph, ON, Canada

© 2021 Elsevier Ltd. All rights reserved.

This is an update of J.-H. Leu, J.-M. Tsai, C.-F. Lo, White Spot Syndrome Virus, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy, Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00776-7.

Glossary

DNA-dependent RNA polymerase (DdRp) An RNA polymerase that synthesizes mRNA from a DNA-only template.
Homologous repeat (hr) A sequence of DNA with several copies of imperfect palindromes that is repeated two or more times in the genomes of baculoviruses or whispoviruses.
In situ hybridization (ISH) A technique that uses a labeled complementary DNA or RNA probe to hybridize to and detect a specific DNA or RNA sequence in a section of tissue (in situ).
Internal ribosome entry site (IRES) A sequence in mRNA, either at the 5′ end or internal between ORFs, that allows translation of a downstream internal ORF in a cap-independent manner.
Microarray A DNA microarray is a collection of microscopic dsDNA or oligonucleotide spots that are deposited on a solid glass slide at high density. Each spot commonly represents a single gene. DNA microarrays use DNA–DNA or DNA–RNA hybridization to perform simultaneous large-scale analyses of the expression levels of the corresponding genes.
RNAi (RNA interference) RNAi refers to the introduction of homologous double-stranded RNA to specifically interfere with the expression of the target gene in cells or organisms. It was originally discovered in C. elegans, and is now widely used in many organisms for null mutations. Recent studies suggest that RNAi can also be applied in shrimp through intramuscular injection of dsRNA.

Introduction

White spot syndrome viruses (WSSVs) cause white spot disease, a major economic problem for industrial shrimp and crayfish fisheries worldwide. They have a wide host range, causing lethal disease in shrimp, prawns, crabs, lobster, crayfish and copepods. The first whispovirus to be documented was called white spot bacilliform virus, now named White spot syndrome virus CN. As evidenced by earlier names of isolates, such as Chinese baculo-like virus, Penaeus monodon non-occluded baculoviruses II and III, and white spot baculovirus, these viruses were originally thought to be baculoviruses and were classified with the Baculoviridae. However, due to unique features in their morphology and phylogeny, they were placed in a new family, Nimaviridae, in 2002.

Taxonomy

The Nimaviridae family has a single genus, Whispovirus, with a single species, the type species White spot syndrome virus. All taxon names were approved by the International Committee on Taxonomy of Viruses (ICTV) in 2002. Mercifully, there are no higher ranks of classification (Order, Class, Phylum, Kingdom, Realm) for Nimaviridae. There is one exemplar virus in the species, white spot syndrome virus CN (WSSV-CN). Thirteen additional viruses are recognized by ICTV, though many more isolates have been characterized. While the etymology of the species name is self-evident (from white spot syndrome virus), that of the family name relates to the viral morphology, not the disease. A distinguishing and unique feature of the whispoviruses is that they have a polar thread-like tail, “nima” in Greek, which gave rise to the family name Nimaviridae. Prior to 2002, due to the circular nature of the large dsDNA genome and similarity in morphology to baculoviruses, the white spot syndrome viruses were misclassified in the Baculoviridae (subfamily Nudibaculovirinae, genus Nonoccluded baculovirus). However, their unique morphology, genome structure and phylogeny showed that they were clearly distinct from the baculoviruses and deserving of their own family, Nimaviridae.

Virion Structure and Composition WSSV virions are enveloped, ellipsoid to bacilliform in shape measuring 70–170 nm diameter and 210–420 nm in length with a rod-shaped nucleocapsid 50–80 nm in diameter and 300–420 nm long (Figs. 1 and 2(b)). One end of the nucleocapsid appears open and flattened and the other end is more rounded. By electron microscopy the nucleocapsids have a distinct pattern of a

808

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00060-6

Nimaviruses (Nimaviridae)

(a)

500 nm

(b)

809

500 nm

Fig. 1 Morphology of WSSV virions (a) intact WSSV virion with tail like extension at the smaller end. (b) WSSV nucleocapsid showing the stacked ring structures made up of two rows of VP664-based globular subunits. From Leu, J.H., Tsai, J.M., Lo, C.F., 2008. White spot syndrome virus. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. pp. 450–459.

stacked series of about 16 rings, each about 20 nm in thickness and composed of two rows of 12–14 globular units each about 10 nm in diameter. Between the envelope and nucleocapsid is a tegument layer. A thread-like tail, or “nima” responsible for the family name, extends from the narrower end of the virus. WSSV virions have a buoyant density of 1.2 g cm3 and nucleocapsids have a buoyant density of 1.31 g cm3 and are quite stable in pond water, being viable for 3–4 days. SDS PAGE (Fig. 2(a)) and proteomic analyses have identified about 40 structural proteins ranging from 11 to 664 kDa in size. WSSV virions have a complex SDS-PAGE protein profile (Fig. 2(a)). Early structural protein studies used SDS-PAGE coupled with Western blot analyses and/or N-terminal sequencing of proteins to identify at least six structural proteins. More recent studies have used mass spectrometry to analyze the protein profiles of purified WSSV virions separated by SDS-PAGE and 2-D electrophoresis, and more than 40 structural proteins have now been identified. The major structural proteins are VP664, VP28, VP26, VP24, VP19, and VP15 (Fig. 2(a, b)). VP15 is a basic capsid protein with in vitro DNA-binding activity, and it has high homology to the lysineand arginine-rich DNA-binding proteins of the insect baculoviruses. This protein is therefore thought to be responsible for packaging of the WSSV DNA into the nucleocapsid. VP26, one of five tegument proteins, can interact with nine host proteins including actin, but the importance of this interaction remains unknown. VP26 interacts with beta-integrin to potentially facilitate virus entry, and antiviral proteins such as C-type lectins (MjLecA and LvCTL1) and prohibitins (PHBs). VP19 and VP28 are envelope proteins, and an in vitro physical interaction between VP28 and tegument proteins VP24 and VP26 has been reported. VP664, the major capsid protein, takes its name from its calculated molecular mass of 664 kDa. 
Not only is it the largest protein encoded by the WSSV genome, but it is also the largest viral structural protein ever found. This protein is encoded by the intronless giant ORF (wssv419) of 18,234 nt. Immunoelectron microscopy of purified virions has shown that VP664 forms the globular subunits visible in the nucleocapsid ring structures. VP664 antibodies bind to the nucleocapsid but not envelope of purified virions (Fig. 3). In addition to VP15, a very basic DNA binding nucleoprotein, and VP664, the nucleocapsid contains at least four other minor proteins: VP160A (WSSV344), VP160B (WSSV094), VP60B (WSSV474), and VP51C (WSSV364) (Fig. 2(b)). Neutral lipids, phosphatidylcholine and phosphatidylethanolamine form the major lipids of the envelope, along with the fatty acids stearic acid, eicosapentaenoic acid and docosahexaenoic acid.
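As a rough consistency check on the figures quoted for VP664, the ORF length can be converted to a residue count and an approximate mass. The sketch below assumes that the 18,234 nt include one stop codon and uses a generic average of about 110 Da per amino acid residue; that average is a textbook approximation, not a value taken from this article, and the function names are purely illustrative.

```python
# Rough consistency check of the VP664 figures quoted in the text.
ORF_LENGTH_NT = 18_234         # length of the wssv419 ORF (from the text)
MEAN_RESIDUE_MASS_DA = 110.0   # ASSUMPTION: generic average mass per residue


def orf_to_residues(orf_length_nt: int) -> int:
    """Residues encoded by an ORF, assuming the length includes one stop codon."""
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return orf_length_nt // 3 - 1


def approx_mass_kda(n_residues: int,
                    mean_residue_mass_da: float = MEAN_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from a residue count."""
    return n_residues * mean_residue_mass_da / 1000.0


residues = orf_to_residues(ORF_LENGTH_NT)
mass_kda = approx_mass_kda(residues)
print(f"{residues} residues, ~{mass_kda:.0f} kDa")  # 6077 residues, ~668 kDa
```

Under these assumptions the estimate lands within about 1% of the calculated 664 kDa; an exact mass would require the actual amino acid composition.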

Genome and Phylogeny The nimavirus genome is a large, circular dsDNA (Fig. 2(c)) and one of the largest viral DNA genomes published, rivaling those of some phycodnaviruses, which range from 154,641 bp for a phaeovirus to 459,984 bp for a prymnesiovirus. There are currently 15 fully sequenced nimavirus genomes accessible from GenBank (Table 1), ranging in size from 280,591 bp for WSSV-IN-AP4RU from India to 309,286 bp for WSSV-CN01 from China. Note that Table 1 does not include the genome sequence for WSSV-EG3 (KR083866.1), since it is a duplicate of the WSSV-CN (AF332093) genome and the supporting publication for the WSSV-EG3


Nimaviruses (Nimaviridae)

Fig. 2 WSSV structural proteins and genome. (a) SDS-PAGE profile of purified WSSV virions. The six major and some minor structural protein bands are indicated. The bands corresponding to two host proteins, hemocyanin and actin co-purified with the virion are also indicated. (b) A proposed schematic diagram to show the WSSV virion structure and the location of some WSSV structural proteins and its DNA. (c) Distribution of 33 WSSV structural protein genes in the WSSV-TW genome. The inner circle shows predicted HindIII restriction enzyme cutting sites. Modified from Leu, J.H., Tsai, J.M., Lo, C.F., 2008. White spot syndrome virus. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. pp. 450–459.

genome was withdrawn by the authors. Also not included in this table and article are nimavirus-like endogenous genomes found in host genomes. All genomes have a G + C content of about 41%. Although most of the WSSV genome sequence is unique, 3% of the genome consists of highly repetitive sequences, which are localized to intergenic regions and organized into about nine homologous regions (hrs); the number of repeats of about 250–300 bp within an hr varies among isolates. The hrs contain a variety of repeats, including direct repeats, atypical inverted repeats and imperfect palindromic sequences. Their function in the nimaviruses is unknown, but in baculoviruses hrs have been implicated in DNA replication and as enhancers of early gene transcription. Differences in genome length have been attributed to various indels, such as the insertion of a 1.3 kb transposase sequence in the genome of WSSV-TW and deletions in the WSSV-TH genome, and to variations in the number and size of hrs.
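The G + C content quoted above is a simple base-composition statistic. A minimal sketch of how it could be computed from a genome sequence is given below; the function name and the toy sequence are hypothetical, and ambiguous bases such as N are ignored by assumption.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical toy sequence, NOT a WSSV sequence: 3 of 10 bases are G or C.
toy_sequence = "GATTACAGAT"
toy_gc = gc_percent(toy_sequence)
print(f"G + C content: {toy_gc:.1f}%")  # G + C content: 30.0%
```

Applied to a complete genome record, the same calculation would be expected to give a value near the 41% reported in Table 1.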

[Fig. 3 image panels: (a) scale bar 200 nm; (b) scale bar 100 nm]

Fig. 3 Immunoelectron microscopy analysis of purified WSSV virions detected with VP664 antibody followed by gold-labeled secondary antibody. (a) The anti-VP664 antibody specifically binds to the nucleocapsid and not to the viral envelope. (b) Most of the gold particles are localized at the perimeter of the nucleocapsid. From Leu, J.H., Tsai, J.M., Lo, C.F., 2008. White spot syndrome virus. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. pp. 450–459.

Table 1 Characteristics of fifteen WSSV isolates and their complete genomes (derived from GenBank as of Aug 31, 2020)

Virus abbreviation | Country of origin | Species source | Year collected | Date sequence submitted | Sequence length (bp) | Protein number | % GC | Accession number
WSSV-AU | Australia | Penaeus monodon | 2016 | 2017 Aug | 285,973 | 904(a) | 41 | MF768985.1
WSSV-CN | China | Marsupenaeus japonicus | 1996 | 2014 Nov | 305,119 | 524(a) | 41 | AF332093.3
WSSV-CN01 | China | Marsupenaeus japonicus | 1994 | 2015 Nov | 309,286 | 177 | 40.9 | KT995472.1
WSSV-CN02 | China | Procambarus clarkii | 2010 | 2015 Nov | 294,261 | 164 | 41 | KT995470.1
WSSV-CN03 | China | Litopenaeus vannamei | 2010 | 2015 Nov | 281,148 | 154 | 41 | KT995471.1
WSSV-CN04 | China | Marsupenaeus japonicus | 2012 | 2017 Mar | 281,054 | 157 | 41 | KY827813.1
WSSV-CN-95DFPE(b) | China | Litopenaeus vannamei | 1995 | 2019 Dec | 305,094 | 173 | 41 | MN840357.1
WSSV-CN-Pc | China | Procambarus clarkii | 2015 | 2016 Aug | 300,223 | 191 | 41 | KX686117.1
WSSV-CN-Pc02(c) | China | Procambarus clarkii | 2018 | 2018 Jul | 287,179 | 180 | 41 | MH663976.1
WSSV-EC-15098 | Ecuador | Litopenaeus vannamei | Pre 2017 | 2018 Mar | 288,997 | 150 | 40.9 | MH090824.1
WSSV-IN-AP4RU | India | Litopenaeus vannamei | 2014 | 2017 Dec | 280,591 | 442(a) | 40.8 | MG702567.1
WSSV-K-LV1 | South Korea | Litopenaeus vannamei | 2011 | 2012 Nov | 295,884 | 515(a) | 40.9 | JX515788.1
WSSV-MEX2008 | Mexico | Litopenaeus vannamei | 2008 | 2015 Nov | 300,087 | 184 | 41 | KU216744.2
WSSV-TH | Thailand | Not reported | 1996 | 2001 Mar | 292,967 | 184 | 41.1 | AF369029.2
WSSV-TW | Taiwan | Penaeus monodon | 1994 | 2005 Mar | 307,287 | 532(a) | 41 | AF440570.1

(a) Protein numbers in excess of 200, which may be due to over-annotation, including overlapping ORFs in the count.
(b) From archival paraffin-embedded tissues of shrimp infected with a China CN95 strain of WSSV.
(c) WSSV-CN-Pc02 was originally named Procambarus clarkii virus. In this review the abbreviation WSSV-CN-Pc02 is used to maintain the consensus in the naming of WSSV isolates: WSSV followed by letters indicating the origin (CN for China), Pc for Procambarus clarkii, and the isolate number, 02, since another WSSV-CN-Pc was previously described.
Common names: Penaeus monodon, Asian tiger shrimp or black tiger shrimp; Marsupenaeus (Penaeus) japonicus, Japanese tiger prawn or Kuruma shrimp; Litopenaeus vannamei, Whiteleg shrimp; Procambarus clarkii, Red swamp crayfish.

By analysis of the 15 current whispovirus genomes using MAFFT alignment, the overall sequence identity between genomes is quite high, between 92.44% and 98.91% (Table 2). Despite the high overall identities, the genomes separate into two phylogenetic clades (Fig. 4). The more populous clade is clade B, which includes two subclades, B1 and B2. For the four isolates in subclade B1, the identities range from 95.6% to 97.3% (all above 95%) and their genomes range from 281 to 287 kb in size. The ten isolates within subclade B2 have identities ranging from 95.28% to 98.91%, again all above 95%. Genome sizes in this subclade are among the largest and range from 289 to 309 kb. The identities between genomes from subclade B2 and genomes from subclade B1 range from 93.97% to 97.33%, suggesting they should be part of the same clade B. The single isolate in clade A, WSSV-IN-AP4RU from India, has the lowest identity with any of the other 14 isolates, ranging from 92.44% to 93.62%, and has the smallest genome at 280,591 bp. Moreover, WSSV-IN-AP4RU appears to be an earlier genome compared to all the other WSSV genomes. The clades/subclades described here mimic the clusters observed in

Table 2 Sequence identities among 15 WSSV complete genomes

Fig. 4 Alignment of 15 WSSV genome sequences using MAFFT (version 7.397). Neighbor joining clustered trees were used to calculate the Maximum Likelihoods and bootstrap support values based on 100 bootstraps in R using the Phangorn package. (a) Ancestry relationships of strains (rooted tree). (b) Relatedness of the strains (unrooted tree). Virus clades are shown as A or B and subclades of B are shown as subclade B1 and B2. From: Katoh, K., Standley, D.M., 2013. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Molecular Biology and Evolution 30 (4), 772–780. doi:10.1093/molbev/mst010. Schliep, K.P., 2011. Phangorn: Phylogenetic analysis in R. Bioinformatics 27 (4), 592–593. doi:10.1093/bioinformatics/btq706.

the unrooted phylogenetic tree (Fig. 4(b)) and, for subclades B1 and B2, mimic the two clusters derived for nine genomes using ClustalW alignment reported by Oakey and Smith (2018). Both strands of nimavirus DNA carry protein-coding genes. For those genomes that appear to be well annotated, whispovirus genomes encode from 150 proteins for WSSV-EC-15098 to 191 proteins for WSSV-CN-Pc (based on NCBI “Protein” tables) (Table 1). However, for some isolates, identified by “a” in Table 1, the number of proteins is much higher, between 442 for


Fig. 5 Hematoxylin and eosin (HE) staining of a tissue section from Penaeus monodon cuticular epithelium of the eyestalk infected with WSSV. Degenerated cells characterized by hypertrophied nuclei (black arrows) are readily seen. The cells with normal nuclei are also indicated with arrowheads. Scale bar = 20 μm. From Leu, J.H., Tsai, J.M., Lo, C.F., 2008. White spot syndrome virus. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. pp. 450–459.

WSSV-IN-AP4RU to 904 for WSSV-AU. Given the genome size, these very high numbers suggest that the genomes were over-annotated, e.g., by including smaller ORFs that clearly overlap larger ones and by counting all ORFs even if fully overlapping. Proteomic and genomic studies have identified genes for at least 48 structural proteins, including 31 envelope proteins such as the major envelope proteins VP28 and VP19, five tegument proteins such as the major tegument proteins VP26 and VP24, and nine nucleocapsid proteins such as VP664, the major component of nucleocapsids, and VP15, the major DNA-binding protein. A number of nonstructural proteins were also identified. The 18 immediate early genes identified include ie1, ie2, ie3, and a ubiquitin E3 ligase gene. Genes for four anti-apoptotic proteins (AAP), i.e., AAP1, 2, 3 and 4, were identified. Genes for at least nine enzymes were detected, including thymidylate synthase, protein kinases 1 and 2, a nuclease, DNA polymerase, two ribonucleotide reductases (RR1 and RR2), thymidine kinase/thymidylate kinase and a dUTPase. Given the large size of the genome, it is not surprising to find some additional interesting nimavirus genes, including genes for two anti-melanization proteins, two latency-related genes, and genes for a histone-binding DNA mimic, a mitochondria-targeting protein and a TATA-box binding protein.
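The over-annotation issue raised at the start of this passage (counting small ORFs that lie entirely within larger ones) can be illustrated with a simple coordinate filter. This is only a hypothetical sketch with made-up ORF names and coordinates; it is not the procedure used by NCBI or by the authors of any of the genome annotations, and it ignores strand and reading frame by assumption.

```python
from typing import List, NamedTuple


class Orf(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-" (recorded but not used by the filter)


def drop_nested_orfs(orfs: List[Orf]) -> List[Orf]:
    """Keep only ORFs that are not fully contained within a longer kept ORF."""
    kept: List[Orf] = []
    # Visit the longest ORFs first so that containers are seen before contents.
    for orf in sorted(orfs, key=lambda o: o.end - o.start, reverse=True):
        if any(k.start <= orf.start and orf.end <= k.end for k in kept):
            continue  # fully overlapped by a longer ORF: do not count it
        kept.append(orf)
    return sorted(kept, key=lambda o: o.start)


# Hypothetical toy annotation, NOT real WSSV coordinates.
toy_orfs = [
    Orf("orfA", 100, 1000, "+"),
    Orf("orfB", 200, 500, "-"),    # nested inside orfA
    Orf("orfC", 900, 1500, "+"),   # overlaps orfA only partially, so it is kept
    Orf("orfD", 950, 990, "+"),    # nested inside orfA (and orfC)
]
filtered = drop_nested_orfs(toy_orfs)
print([orf.name for orf in filtered])  # ['orfA', 'orfC']
```

A count produced this way would be expected to fall well below the raw ORF totals flagged with “a” in Table 1, although a real re-annotation would also weigh expression and homology evidence.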

Life Cycle WSSV replication occurs in the tissues of mainly ectodermal (cuticular epidermis, fore- and hindgut, gills and nervous tissue) and mesodermal (lymphoid organ, antennal gland, connective tissue and hematopoietic tissue) origin but not in tissues of endodermal origin (hepatopancreas and midgut). In the early stage of infection, affected cells display nuclear hypertrophy (Fig. 5), nucleoli dissolution, and chromatin margination, and the central area changes into a homogeneous eosinophilic region. The infected cells then develop an intranuclear eosinophilic Cowdry A-type inclusion, which becomes a light basophilic, denser inclusion separated by a transparent zone from the marginated chromatin. The cytoplasm becomes less dense and more lucent. Later in infection, the nuclear membrane is disrupted, causing the intranuclear transparent zone to fuse with the lucent cytoplasm. At the end of cellular degeneration, the nucleus or whole cell disintegrates, leading to loss of cellular architecture. In moribund shrimp, most tissues and organs are heavily infected with the virus, and they exhibit severe multifocal necrosis. Following infection, the first WSSV-positive signals detected are in the stomach, gills, cuticular epidermis, and connective tissue of the hepatopancreas (Fig. 6). At later stages of infection, the lymphoid organ, antennal gland, muscle tissue, hematopoietic tissue, heart, midgut, and hindgut also become positive. As infection proceeds, the stomach, gill, hematopoietic tissue, lymphoid organ, antennal gland, and cuticular epidermis become heavily infected with WSSV, which leads to serious damage and necrosis and/or apoptosis. The entire WSSV replication cycle is nuclear and, in an acutely infected shrimp, can be completed within 24 h post infection. 
WSSV enters susceptible cells via caveola- or clathrin-mediated endocytosis, facilitated in part by the binding of the major envelope glycoprotein VP28 to cell receptors such as integrins, gC1qR and Rab7, and by a complex of about 11 virion proteins that interacts with the cell surface chitin-binding protein (CBP), which may also facilitate endocytosis. Several viruses use the Arg-Gly-Asp (RGD) motif to bind to cellular integrins during infection, and this cell attachment site signature can be identified in at least six WSSV structural proteins. One of these six structural proteins, VP110 (WSV035), can attach to shrimp host cells, and this adhesion can be blocked by synthetic RGDT peptides, suggesting that the RGD motif in VP110 may play a role in WSSV infection. Following endocytosis, WSSV travels through endosomes and, as a consequence of acidification of the endosomes and the interaction between VP28 and Rab7, the nucleocapsids are released into the cytoplasm. By an unknown mechanism, WSSV nucleocapsids enter the nucleus (structurally similar baculovirus rod-shaped nucleocapsids enter nuclei through the nuclear pore


Fig. 6 In situ hybridization analysis of WSSV positive cells (arrows) in (a) the integument, (b) the gill, (c) the stomach, and (d) the heart from experimentally infected Penaeus monodon at 60 hpi. The infected cells shown here in the heart are connective tissue cells; muscle cells are not usually targeted by WSSV. Scale bar = 20 μm. From Leu, J.H., Tsai, J.M., Lo, C.F., 2008. White spot syndrome virus. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. pp. 450–459.

or uncoat at the nuclear pore, so WSSV might use a similar nuclear entry mechanism). Once in the nucleus, transcription of viral genes can begin. Most viral RNAs are 5′ capped and 3′ poly(A) tailed, with no evidence of splicing. Some virion protein genes, such as vp31, vp39b, and vp11, are arranged in a cluster and produce polycistronic mRNAs that use internal ribosome entry sites (IRES) to regulate their translation. Several IRES elements have been identified, for example upstream of vp28 and wsv493 (icp35), and across the coding region of vp31 and vp39b. Immediate early gene expression, like that of immediate early gene 1 (ie1), is mediated by host transcription factors and RNA polymerase II (RNApolII), initiating a temporal transcriptional cascade consisting of immediate early (IE), early (E), late (L) and very late (VL) phases. IE genes such as ie1 can be expressed in the presence of cycloheximide, indicating that their expression relies solely on host factors. The ie1 promoter is characterized by an upstream TATA box and a binding site for the host transcription factor Sp1. About 16 early ORFs have transactivation activity, presumably assisting in later gene transcription. Structural gene promoters incorporate a degenerate ATNAC motif that could be involved in late gene transcription. Unlike baculoviruses, whispoviruses show no evidence of a viral DNA-dependent RNA polymerase (DdRp) and must rely on the host RNApolII and host factors. Thus, early viral transcription factors like IE1, IE2 and IE3 might recognize late gene promoters and recruit the cellular RNApolII and transcription factors, allowing late gene transcription. Viral and host miRNAs may be involved in WSSV infection and latency. For example, M. japonicus miR-7 targets viral wsv477, an early gene involved in virus replication, and when injected reduced WSSV copies 1000-fold late in infection. Studies of WSSV expression identified as many as 89 WSSV miRNAs, the majority in the early phases. Some miRNAs showed tissue-specific expression. 
For example, WSSV-miR-N24 inhibits apoptosis by downregulating host caspase 8. WSSV-miR-66 targets viral wsv094 and wsv177 and WSSV-miR-68 targets wsv248 and wsv309 genes related to latency to virus replication. Not much is known about whispovirus DNA replication. Nevertheless, the WSSV genome contains several genes, identified through homology searches, that encode proteins involved in DNA metabolism and replication. These genes include thymidylate synthase, dUTPase, ribonucleotide reductases (rr1 and rr2; two separate ORFs encoding the two subunits), a chimeric thymidine kinase/thymidylate kinase (TK-TMK), and DNA polymerase. The chimeric TK-TMK gene is a unique feature of WSSV, as these genes are encoded in separate ORFs in other large DNA viruses. The enzymatic activities of dUTPase, ribonucleotide reductase, and TK-TMK have been demonstrated using purified recombinant proteins. Since it does not encode all genes needed for DNA replication, WSSV depends on the host for other replication proteins. It has therefore been proposed that WSSV, like some

[Fig. 7 image panels: (a) scale bar 500 nm; (b) scale bar 167 nm]

Fig. 7 Transmission electron micrograph of WSSV-infected tissues from beneath the cephalothoracic exoskeletal cuticle. (a) The viral particles spread among the necrotic area. The complete viral particles are indicated with arrows and the long empty tubules with an arrowhead. (b) High magnification of virus particles with rod-shaped morphology. A viral particle with an empty nucleocapsid is indicated with an arrow. From Leu, J.H., Tsai, J.M., Lo, C.F., 2008. White spot syndrome virus. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. pp. 450–459.

other DNA viruses such as herpesviruses and SV40, promotes S-phase arrest to maximize use of the DNA replication machinery. Part of the mechanism for S-phase arrest could involve WSSV IE1 and WSSV056, which bind to an Rb homolog in shrimp. This binding could activate host E2F1, leading to S-phase arrest, as demonstrated in Drosophila cells overexpressing these two proteins. Viral morphogenesis begins with de novo formation of the viral envelope, seen at first as fibrillar fragments in the nucleoplasm. These fragments form into the membranous and/or vesicular precursors of the viral envelope. Virus assembly begins with the packaging of the 300 kb viral genome into nucleocapsids. This packaging is mediated by binding of VP15, a basic viral DNA-binding protein, to form nucleoprotein complex fibers. The development of nucleocapsids begins with the formation of long empty tubules (Fig. 7) with a diameter like that of empty nucleocapsids and a segmentation similar to that of nucleocapsids. They can exist individually or may be laterally aligned in groups of two or three to form a larger structure. The assembled tubules break into fragments of 12–14 rings to form empty naked capsids, which are then surrounded by envelopes, leaving an opening at one end. The filamentous nucleoprotein complex enters the capsid through this open end, while simultaneously increasing the diameter of the virion. Finally, the open end of the nucleocapsid is closed, and the envelope narrows at the open end to form the polar tail of the mature virion, which has now obtained its characteristic olive-like shape.

Apoptosis Apoptosis is a cell suicide program that enables multicellular organisms to direct and control cell numbers in tissues and to eliminate cells, including virus-infected cells, that may be harmful to the survival of the organism. If apoptosis occurs during the early stage of infection, virus production is severely limited. Because it inhibits the spread of progeny virus in the host, apoptosis is recognized as an antiviral defense. The classical signs of apoptosis can be identified in WSSV-infected shrimp, including nuclear disassembly, fragmentation of DNA into a ladder, and increased caspase-3 activity. Furthermore, there is a positive correlation between the severity of WSSV infection and the number of apoptotic cells. Among the WSSV target tissues, the subcuticular and abdominal epithelia are the most seriously damaged, and these epithelial tissues exhibit the highest incidence of apoptotic cells. However, it is significant that cells displaying apoptotic characteristics do not contain virions, whereas those containing WSSV virions are not apoptotic. It is therefore reasonable to suppose that apoptosis is employed by shrimp as a protective response to prevent the spread of WSSV. Viruses have evolved to inhibit apoptosis early in infection, allowing the virus to replicate, and to induce it later in infection to enhance virus release from infected cells, and WSSV is no exception. For example, WSSV anti-apoptosis protein 1 (AAP1 or WSSV449) binds to PmCaspase (in P. monodon). The resultant cleavage (at one site) of AAP1 by the caspase results in inhibition of the caspase. In another system, PmCaspase is bound by WSSV134 and WSSV322


preventing caspase cleavage activity. Viral proteins such as VP38 and VP41B can also influence apoptosis by binding the promoter of PjCaspase (in P. japonicus) to either repress expression (VP38) or activate it later in infection (VP41B). WSSV can also induce apoptosis through ICP11, which disrupts the binding of histones to DNA and thereby nucleosome assembly, leading to apoptosis. Another viral protein, WSSV208, binds the mitochondrial protein COX5a, another protein which induces apoptosis by upregulating the transcription of apoptosis-related genes.

Hemocyte Responses to WSSV Infection In crustaceans, the hemocytes (blood cells) mediate many defense-related activities that are essential for nonspecific immunity, including phagocytosis, melanization, encapsulation, cytotoxicity, and clotting. They also produce a vast array of antimicrobial proteins, such as agglutinins, antimicrobial peptides, and lysozymes. WSSV infection decreases the total hemocyte count (THC) in shrimp but not in crayfish. The reduced THC in shrimp is probably due to lysis by the virus of infected hemocytes as well as due to virus-induced apoptosis in both circulating hemocytes and hematopoietic tissue. In addition, circulating hemocytes migrate to tissues that have a high number of virus-infected cells, although this is probably a general defensive response rather than a specific antiviral response. WSSV infection also induces apoptosis in crayfish hemocytes but the percentage of apoptotic hemocytes is much lower in crayfish (1.5%) than in shrimp (20%). WSSV infection differentially affects the three morphologically and functionally distinct crustacean hemocyte types, the semigranular cells (SGCs), granular cells (GCs), and hyaline cells (HCs). In shrimp and crayfish, both SGCs and GCs can be infected with WSSV but SGCs are the preferential target. This suggests that SGCs are more susceptible to WSSV and that the virus replicates more rapidly in SGCs than in GCs. However, the melanization activity of GCs in WSSV-infected crayfish is reduced. The HC type of hemocyte is refractory to WSSV infection in shrimp but their role in antiviral defense is still not clear. On the other hand, GCs and SGCs each play a major defensive role and WSSV infection of these cell types is likely to weaken shrimp defenses and reduce shrimp health. 
Since hemocytes are also necessary for clotting, the low THC also accounts for the phenomenon that hemolymph withdrawn from WSSV-infected shrimp and crayfish always has a delayed (or sometimes completely absent) clotting reaction.

Epidemiology White spot disease is a highly contagious viral disease of penaeid shrimp. Its onset is rapid, and high levels of mortality can result within just a few days. Following the first outbreaks of WSD in Fujian Province, China in 1992 and Taiwan in 1993, this viral disease spread quickly to other shrimp-farming areas, including Japan, Thailand and Korea (1993), the United States (1995), other regions of Southeast Asia including Vietnam, Malaysia and Indonesia (1996), India (1998), Central and South America (1999), and France and Iran (2002). In 2008 it was detected in Mexico, then between 2010 and 2012 in Madagascar, Mozambique and Saudi Arabia and, more recently, since 2016 it has affected Australia. WSD is now considered endemic in almost all shrimp-producing countries, particularly in Asia and the Americas, causing serious economic damage to the shrimp culture industry. WSSV is highly virulent and can cause a cumulative mortality of up to 100% within 3–10 days of the first signs of disease, with annual losses of about one billion dollars. WSSV can be transmitted horizontally, either per os when the shrimp feed on diseased individuals, contaminated food or infected carcasses, or else through exposure to virus particles in the water, in which case the route of infection is primarily through the gills or other body surfaces. The virus is also transmitted vertically from brooder to offspring. However, transmission is not transovarial because the virus appears to attack only young developing oocytes, which die before reaching maturation. It is therefore more likely that vertical transmission is caused by contamination of the egg mass. Penaeid shrimp are highly susceptible to WSSV. Although there is evidence of resistance during the larval and early (younger than PL6) post-larval stages, WSSV can cause disease in shrimp at any growth stage. WSSV has a remarkably broad host range. Almost every species of penaeid shrimp is susceptible to WSSV infection. 
Moreover, the virus can infect other marine, brackish, and freshwater crustaceans, including crayfish, crabs, and spiny lobsters. However, in contrast to penaeid shrimp, infection is often not lethal in these species, and consequently they may serve as reservoirs and carriers of the virus. Furthermore, at least one insect, the shore fly (a member of the family Ephydridae), as well as copepods collected from WSSV-affected farms, have been diagnosed as WSSV-positive by PCR, suggesting that they are also possible reservoir hosts. In epidemiological studies of WSSV, variations in the number of the 54 bp tandem repeats located between the rr1 and rr2 genes have been used as a strain-specific genetic marker. The number of repeats varies greatly when infected shrimp are collected from different ponds or from the same ponds at different times, but in almost every outbreak of WSD, shrimp from the same pond usually have the same number of repeats. This suggests that in each pond, a single WSSV isolate is the causative agent.
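The repeat-number marker described above can be sketched computationally: count the back-to-back copies of the repeat unit in each sample, then compare the counts within and between ponds. The example below is a simplified illustration that assumes exact repeat copies and uses a hypothetical 6-nt stand-in for the 54 bp unit; the sample data, pond labels and function names are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def max_tandem_copies(sequence: str, unit: str) -> int:
    """Largest number of consecutive exact copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    seq, unit = sequence.upper(), unit.upper()
    best = 0
    pos = seq.find(unit)
    while pos != -1:
        copies, i = 0, pos
        while seq.startswith(unit, i):  # extend the run one unit at a time
            copies += 1
            i += len(unit)
        best = max(best, copies)
        pos = seq.find(unit, pos + 1)
    return best


def repeat_counts_by_pond(samples: Iterable[Tuple[str, str]],
                          unit: str) -> Dict[str, Set[int]]:
    """Map each pond to the set of repeat counts seen among its samples."""
    counts: Dict[str, Set[int]] = defaultdict(set)
    for pond, seq in samples:
        counts[pond].add(max_tandem_copies(seq, unit))
    return dict(counts)


UNIT = "ACGTTG"  # HYPOTHETICAL stand-in for the real 54 bp repeat unit
toy_samples = [
    ("pond_A", "CCCC" + UNIT * 6 + "AAAA"),
    ("pond_A", "CC" + UNIT * 6 + "AA"),
    ("pond_B", "CCCC" + UNIT * 9 + "AAAA"),
]
by_pond = repeat_counts_by_pond(toy_samples, UNIT)
print(by_pond)  # {'pond_A': {6}, 'pond_B': {9}}
```

A pond whose samples all share a single repeat count is consistent with infection by one isolate, which is the inference drawn in the text. In practice the repeat number is typically inferred from PCR amplicon size or sequencing, and imperfect repeat copies would require approximate matching.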

Clinical Features and Pathology The most commonly observed clinical sign of WSD in shrimp is white spots in the exoskeleton and epidermis. These may range in size from minute spots to disks several millimeters in diameter which may coalesce into larger plates. The spots may result from


abnormal deposits of calcium salts by the cuticular epidermis or result from disruption to the transfer of exudates from the epithelial cells to the cuticle. Infected animals are lethargic, reduce their food consumption, and display a reddish to pink body discoloration due to expansion of cuticular chromatophores. Moribund shrimp exhibit systemic destruction of target tissues with many infected cells showing homogeneous hypertrophied nuclei. At advanced stages of infection, numerous virus particles are released into the hemolymph from the lesions, causing a general viremia. It should be noted that although the white spots are a typical and characteristic clinical sign of WSD, white spots on the carapace of shrimp can also be caused by other environmental stress factors, such as high alkalinity or a bacterial shell disease. Conversely, moribund shrimp with WSD may have few, if any, white spots. To date, no species of penaeid shrimp is known to show significant resistance to WSD.

Protection of Shrimp Against WSSV Infection Using Vaccination and RNAi Strategies Considering the economic impact of WSSV on the shrimp and crab industries, it is not surprising that much research has gone into mitigating the disease. This includes efforts to enhance the immune system, the use of vaccines and RNAi, and the discovery or development of antivirals. When P. japonicus shrimp survive either natural or experimental WSSV infections, they sometimes show resistance to subsequent challenge with WSSV. This ‘quasi-immune response’ suggests that shrimp may have an innate immune system that includes specific memory. Several vaccines, including subunit, inactivated whole-virus, DNA-based and RNA-based vaccines, using different modes and times of vaccination, have been developed with varying degrees of success. In earlier studies, inactivated bacteria expressing VP28 (but not VP19) provided survival of up to about 71%, but the protection was relatively short-lived, lasting less than 21 days post-vaccination. In another example, oral administration by continuous feeding of Bacillus subtilis spores expressing VP28 antigen, even at low concentrations, provided about 70% protection. Through a meta-analysis, Feng and colleagues (2018) showed that RNA-based vaccines gave the best protection and that, of the subunit vaccines, VP26 had the best protective effect. Among the different routes of administration (oral, immersion and injection), oral administration was the best. RNAi against different WSSV genes has also been tested as a treatment method. For example, injection of M. japonicus with vp28-siRNAs encapsulated with β-1,3-D-glucan inhibited WSSV replication. In a later study, injection of dsRNA targeting the VP28 mRNA provided nearly complete protection against WSSV infection. While RNAi seems to be effective, injection would be too labor-intensive as a treatment regimen. Many natural products have been shown to be effective in limiting virus infection. 
Alginate from the seaweed Sargassum siliquosum provided survival rates of about 56%. Epigallocatechin-3-gallate (EGCG) extracted from green tea reduced WSSV production in the mud crab Scylla paramamosain and Kuruma shrimp (Marsupenaeus japonicus). EGCG treatment also correlated with increased expression of immune-related genes such as those for proPO, Rho, Rab7, p53, TNF-alpha, MAPK and NOS, improving innate immunity. Hesperetin, a flavanone compound from lemons and oranges, has multiple anti-inflammatory, anti-cancer and anti-viral (e.g., against herpes simplex virus type 1, poliovirus, respiratory syncytial virus) characteristics. Treatment of crayfish with hesperetin reduced mortality in WSSV infections and increased expression of innate immunity-related genes such as NF-κB and C-type lectin. Other extracts that have shown promise are ones derived from Cynodon dactylon (Bermuda grass), Ceriops tagal (Indian mangrove), Momordica charantia (bitter melon) and Agathi grandiflora (vegetable hummingbird). Some host proteins, including LvGP (Litopenaeus vannamei glycogen phosphorylase), PmGILT (Penaeus monodon gamma-interferon-inducible lysosomal thiol reductase) and LvSrc64B (L. vannamei Src family kinase), showed antiviral activity against WSSV.

Further Reading
Bao, W., Tang, K.F.J., Alcivar-Warren, A., 2020. The complete genome of an endogenous nimavirus (Nimav-1_LVa) from the Pacific whiteleg shrimp Penaeus (Litopenaeus) vannamei. Genes 11, 94. doi:10.3390/genes11010094.
Feng, S.Y., Liang, G.F., Zu, Z.S., 2018. Meta-analysis of antiviral protection of white spot syndrome virus vaccine to the shrimp. Fish and Shellfish Immunology 81, 260–265.
Kawato, S., Shitara, A., Wang, Y., et al., 2019. Crustacean genome exploration reveals the evolutionary origin of White spot syndrome virus. Journal of Virology 93 (3), e01144-18.
Leu, J.H., Tsai, J.M., Lo, C.F., 2008. White spot syndrome virus. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed. pp. 450–459.
Leu, J.H., Tsai, J.M., Wang, H.C., et al., 2005. The unique stacked rings in the nucleocapsid of the white spot syndrome virus virion are formed by the major structural protein VP664, the largest viral structural protein ever found. Journal of Virology 79, 140–149.
Oakey, H.J., Smith, G.S., 2018. Complete genome sequence of a white spot syndrome virus associated with a disease incursion in Australia. Aquaculture 484, 152–159.
Robalino, J., Bartlett, T., Shepard, E., et al., 2005. Double-stranded RNA induces sequence-specific antiviral silencing in addition to nonspecific immunity in a marine shrimp: convergence of RNA interference and innate immunity in the invertebrate antiviral response? Journal of Virology 79, 13561–13571.
Tsai, J.M., Wang, H.C., Leu, J.H., et al., 2006. Identification of the nucleocapsid, tegument, and envelope proteins of the shrimp white spot syndrome virus virion. Journal of Virology 80, 3021–3029.
van Hulten, M.C.W., Witteveldt, J., Peters, S., et al., 2001. The white spot syndrome virus DNA genome sequence. Virology 286, 7–22.
Verbruggen, B., Bickley, L.K., van Aerle, R., et al., 2016. Molecular mechanisms of White spot syndrome virus infection and perspectives on treatments. Viruses 8. doi:10.3390/v8010023.
Wang, H.C., Hirono, I., Maningas, M.B.B., et al., 2019. ICTV virus taxonomy profile: Nimaviridae. Journal of General Virology 100, 1053–1054.
Witteveldt, J., Cifuentes, C.C., Vlak, J.M., van Hulten, M.C.W., 2004. Protection of Penaeus monodon against white spot syndrome virus by oral vaccination. Journal of Virology 78, 2057–2061.
Yang, F., He, J., Lin, X., et al., 2001. Complete genome sequence of the shrimp white spot bacilliform virus. Journal of Virology 75, 11811–11820.
Zhang, X., Huang, C., Hew, C.L., 2004. Use of genomics and proteomics to study white spot syndrome virus. In: Leung, K.Y. (Ed.), Molecular Aspects of Fish and Marine Biology, Vol. 3: Current Trends in the Study of Bacterial and Viral Fish and Shrimp Diseases. Singapore: World Scientific, pp. 204–236.
Zheng, S.C., Xu, J.Y., Liu, H.P., 2019. Cellular entry of white spot syndrome virus and antiviral immunity mediated by cellular receptors in crustaceans. Fish and Shellfish Immunology 93, 580–588.
Zuidema, D., van Hulten, M.C.W., Marks, H., et al., 2004. Virus–host interaction of white spot syndrome virus. In: Leung, K.Y. (Ed.), Molecular Aspects of Fish and Marine Biology, Vol. 3: Current Trends in the Study of Bacterial and Viral Fish and Shrimp Diseases. Singapore: World Scientific, pp. 237–255.

Relevant Website https://talk.ictvonline.org/ictv-reports/ictv_online_report/dsdna-viruses/w/nimaviridae Nimaviridae. dsDNA Viruses. International Committee on Taxonomy of Viruses (ICTV).

Nodaviruses of Invertebrates and Fish (Nodaviridae)
Kyle L Johnson, The University of Texas at El Paso, El Paso, TX, United States and CQuentia, Memphis, TN, United States
Jacen S Moore, University of Tennessee Health Science Center, Memphis, TN, United States
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
AHPND  Acute hepatopancreas necrosis disease
BBV  Black beetle virus
BFNNV  Barfin flounder nervous necrosis virus
BoV  Boolarra virus
CMD  Covert mortality disease
CMNV  Covert mortality nodavirus
CPE  Cytopathic effects
cryoEM  Electron cryomicroscopy
EMS  Early mortality syndrome
FHV  Flock House virus
MrNV  Macrobrachium rosenbergii nodavirus
NNV  Nervous necrosis virus
NoV  Nodamura virus
PaV  Pariacoto virus
PL  Post-larvae
qPCR  Quantitative PCR
qRT-PCR  Quantitative reverse-transcription PCR
RdRp  RNA-dependent RNA polymerase
RGNNV  Redspotted grouper NNV
RT-LAMP  Reverse-transcription loop-mediated rapid isothermal amplification
RT-PCR  Reverse-transcription PCR
SJNNV  Striped Jack NNV
TPNNV  Tiger puffer NNV
VCMD  Viral covert mortality disease
VER  Viral encephalopathy and retinopathy
VLP  Virus-like particle
VNN  Viral nervous necrosis
WTD  White tail disease

Classification (Compact): Family Nodaviridae, genera Alphanodavirus and Betanodavirus

The family Nodaviridae in the order Nodamuvirales is divided into two canonical genera: Alphanodavirus, with five species including Nodamura virus as the type species, and Betanodavirus, with four species including Striped jack nervous necrosis virus as the type species.

Alphanodaviruses are primarily pathogens of insects. They include Nodamura virus (NoV, a member of the type species), Flock House virus (FHV), black beetle virus (BBV), Boolarra virus (BoV), and Pariacoto virus (PaV).

Viruses of the Betanodavirus genus are the causative agents of viral nervous necrosis (VNN) in infected fish. Consequently, the nomenclature used for most betanodaviruses is the name of the host species followed by “nervous necrosis virus” (NNV). The viruses are classified into four genotypes based on genome homology: Redspotted Grouper NNV (RGNNV), Barfin Flounder NNV (BFNNV), Striped Jack NNV (SJNNV), and Tiger Puffer NNV (TPNNV). RGNNV and SJNNV are serologically distinct from one another and from the cold-water betanodaviruses BFNNV and TPNNV. Corresponding serogroups A (SJNNV), B (RGNNV), and C (BFNNV/TPNNV) have been proposed to reflect these serological relationships. The placement of reassortant viruses that contain RNA1 from RGNNV and RNA2 from SJNNV (RGNNV/SJNNV) in serogroup A and those that contain RNA1 from SJNNV and RNA2 from RGNNV (SJNNV/RGNNV) in serogroup C indicates that the capsid protein encoded by RNA2 is a major antigenic determinant.

The discovery of nodaviruses that infect shrimp and their low sequence homology with either alphanodaviruses or betanodaviruses has led to the proposal of a third genus, Gammanodavirus, to include Macrobrachium rosenbergii nodavirus (MrNV) and Penaeus vannamei nodavirus (PvNV). These viruses have been isolated from the giant river prawn (M. rosenbergii) and the white leg or Pacific white shrimp (P. vannamei), respectively, and share approximately 80% similarity at the amino acid level. Two other shrimp nodaviruses, covert mortality nodavirus (CMNV) and Farfantepenaeus duorarum nodavirus (FdNV), have also been proposed for inclusion in this group. While MrNV remains in the family Nodaviridae, a satellite of MrNV called extra small virus (XSV) that was originally classified as a nodavirus has been reclassified as the sole member of a newly established virus family, the Sarthroviridae. The Sarthroviridae were named for the small arthropod hosts (shrimp, some insects) they infect.

Provisional Nodaviruses

Additional viruses have been proposed as potential members of the family Nodaviridae, as summarized in Table 1, although none has yet been officially classified as a nodavirus by the ICTV.

Table 1 Provisional nodaviruses

Virus name | Abbreviation | Source of isolate
Bat guano-associated virus | BGNV | Bat species, including the Brazilian free-tailed bat (Tadarida brasiliensis), cave myotis (Myotis velifer), evening bat (Nycticeius humeralis), and/or the tricolored bat (Perimyotis subflavus)
Culannivirus | CuNV | Culex annulirostris mosquitoes
Hypsignathus nodavirus | HSNV | Fruit bats (Hypsignathus monstrosus)
Le Blanc nodavirus | LBNV | Nematodes (Caenorhabditis briggsae)
Lutzomyia Piaui nodavirus | LPNV | Phlebotomine sand flies (Lutzomyia longipalpis)
Mosinovirus | MoNV | Culex nebulosus and Culex sp. mosquitoes
Orsay virus | ONV | Nematodes (Caenorhabditis elegans)
Santeuil nodavirus | SNV | Nematodes (Caenorhabditis briggsae)
Wuhan nodavirus | WhNV | Small cabbage white butterfly larvae (Pieris rapae)

Proposed Six Clade Taxonomic Structure

Previous nodavirus phylogenies have relied on the relationships between the capsid proteins, yielding the current two genera (Alphanodavirus and Betanodavirus) and the proposed genus Gammanodavirus. More recent alignments using the RdRp protein sequence have suggested a broader taxonomic structure, particularly when considered together with a number of newly discovered viruses that share significant sequence homology with one or more recognized nodaviruses. Clade 1 includes all of the recognized alphanodaviruses except PaV. Clade 2 includes the betanodaviruses and several newly discovered noda-like viruses isolated from different arthropod hosts. Clade 3 includes PaV, MrNV, and CulV and is considerably more diverse than the other proposed clades, with MrNV being particularly divergent. Clade 4 includes several newly sequenced isolates. Clade 5 includes one isolate from Helicoverpa zea and several newly sequenced isolates from diverse insect hosts. Clade 6 includes SNV, ONV, and LBNV, all isolated from nematodes.

Encyclopedia of Virology, 4th Edition, Volume 4. doi:10.1016/B978-0-12-814515-9.00154-5

Virion Structure

Nodavirus particles consist of a protein shell co-encapsidating one copy each of the two genomic RNA molecules. The three-dimensional structures of several members of the Nodaviridae have been solved by electron cryomicroscopy (cryoEM) and/or X-ray crystallography (BBV, FHV, NoV, PaV, MrNV). The distinctive features of each are described below.

The structures of the alphanodaviruses BBV, FHV, and NoV are closely similar. The viral capsid consists of 180 coat protein subunits arranged with icosahedral symmetry on a T = 3 quasi-equivalent lattice. Although the primary structures of the coat protein subunits are identical, they occupy three quasi-identical locations (A, B, and C) in the icosahedral asymmetric unit. The A subunits are related by five-fold symmetry, while the B and C subunits are related by three-fold symmetry. However, only ~15% of the viral RNA is visible in these structures. Strikingly, cryoEM image reconstruction and the crystal structure of PaV revealed the presence of a dodecahedral cage composed of duplex RNA, representing approximately 35% of the viral single-stranded RNA genome. The highly basic N-terminal regions of the capsid protein subunits that form pentamers (the A subunits) make contacts with the RNA cage structure.

The capsid structure of recombinant virus-like particles (VLPs) of malabaricus grouper nervous necrosis virus (MGNNV; host Epinephelus malabaricus) differs from that of the insect nodaviruses. Its protein fold more closely resembles those of Norwalk virus and tomato bushy stunt virus than that of the alphanodaviruses.

The structural arrangement of the Macrobrachium rosenbergii nodavirus (MrNV) capsid also differs from those reported for the alphanodaviruses. While MrNV shares an 8-stranded antiparallel beta barrel capsid protein structure with the other nodaviruses, it lacks the canonical nodavirus trimeric “spike” at the icosahedral vertex at the 5-fold axis; instead, the MrNV capsid protein forms a dimeric spike in the equivalent position. Many similarities were seen between the MrNV capsid and that of tombusviruses, including the coordination of calcium ions (3 pairs in each asymmetric unit) and an overall similar fold to that of the tombusvirus spike protein.
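The subunit count quoted for the alphanodavirus capsid follows from standard Caspar-Klug quasi-equivalence arithmetic. The short Python sketch below is an illustration only: the function names are hypothetical, and the formulas (T = h² + hk + k², 60T subunits, 12 pentamers, 10(T − 1) hexamers) are the general Caspar-Klug relations rather than anything specific to this chapter.

```python
def triangulation_number(h: int, k: int) -> int:
    """Return the Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative integers, not both zero")
    return h * h + h * k + k * k


def capsid_composition(h: int, k: int) -> dict:
    """Count subunits and capsomers for an icosahedral capsid with lattice indices (h, k)."""
    t = triangulation_number(h, k)
    return {
        "T": t,
        "subunits": 60 * t,        # 60 icosahedral asymmetric units, T subunits in each
        "pentamers": 12,           # one at each 5-fold vertex, for any T
        "hexamers": 10 * (t - 1),  # quasi-6-fold positions
    }


# Lattice indices (1, 1) give T = 3, the lattice described for the alphanodaviruses.
t3 = capsid_composition(1, 1)
print(t3)  # {'T': 3, 'subunits': 180, 'pentamers': 12, 'hexamers': 20}
```

For T = 3 this gives 12 × 5 = 60 subunits in pentamers, matching the 60 A subunits around the 5-fold axes, and 20 × 6 = 120 subunits at the quasi-6-fold positions, matching the 60 B plus 60 C subunits that alternate around the icosahedral 3-fold axes.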

Genome

As shown schematically in Fig. 1, the nodavirus genome is divided into two segments of positive-strand ssRNA with non-methylated 5′ cap structures and lacking poly(A) tails. The larger segment, RNA1, encodes the viral RNA-dependent RNA polymerase (RdRp), while RNA2 encodes the capsid protein precursor (see below). During replication of RNA1, a small subgenomic RNA (RNA3) is synthesized and subsequently replicated. RNA3 encodes proteins B1 and B2, whose functions are described below. For FHV, replication of RNA2 suppresses RNA3 synthesis, yet viral mutants that do not make RNA3 fail to replicate RNA2. If RNA3 is provided in trans from an exogenous source, RNA2 replication is restored. This interdependence may assist in coordinating replication of the two segments, which must be co-encapsidated during virus assembly.

Fig. 1 Schematic of nodavirus genome organization. Genomic RNA1 and RNA2 as well as subgenomic RNA3 are shown schematically. All three are capped (unmethylated cap 0 structure) but not polyadenylated. RNA1 encodes the viral RdRp, RNA3 encodes small proteins B1 and B2 in overlapping reading frames, and RNA2 encodes the capsid protein (CP; betanodaviruses) or the capsid protein precursor alpha, which is autocatalytically cleaved into mature capsid proteins beta and gamma (alphanodaviruses). RNAs 1 and 2 are copackaged into the same virion, while subgenomic RNA3 is not packaged.

Proteins

RNA-Dependent RNA Polymerase (RdRp)

Protein A (110 kDa) is the sole catalytic subunit of the RNA-dependent RNA polymerase that replicates the genomic RNA. It also possesses an RNA capping (guanylyl transferase) activity. This protein is membrane-associated (via either a transmembrane region or a membrane-association region that may involve only a single leaflet of the membrane, depending on the virus) and establishes viral replication complexes in association with the cellular outer mitochondrial membrane. For betanodaviruses, the N-terminus of the RdRp is associated with their ability to replicate at colder temperatures (thermotolerance).

Capsid Proteins

The nodavirus capsid consists of 180 copies of the viral capsid protein (CP). For alphanodaviruses, the CP is initially translated as a precursor, protein alpha, which undergoes an autocatalytic maturation cleavage after assembly to yield the mature capsid proteins beta and gamma found in the virion. For betanodaviruses, this cleavage is not observed, but the capsid protein undergoes an essential conformational change to produce mature virions. The betanodavirus CP contains an N-terminal signal for localization to the nucleolus, where it plays a role in cell cycle arrest. Its accumulation later in infection induces apoptosis via a caspase-dependent mechanism. The C-terminus of the CP may play a role in host specificity.

Proteins B1 and B2

Two small proteins, B1 and B2, are translated from RNA3. B1 is in the same reading frame as the RdRp and is identical to its C-terminus, although not all betanodaviruses are predicted to encode B1. B2 is translated in an overlapping reading frame. In alphanodaviruses, the function of B1 is unknown. For betanodaviruses, B1 serves as an anti-necrotic “death factor” that results in loss of mitochondrial membrane potential. For RGNNV, the subcellular localization of B1 transitions from the cytoplasm to the nucleus late in infection, resulting in cell cycle arrest. B2 suppresses RNA interference (RNAi) by binding to and sequestering double-stranded RNA replication intermediates. Betanodavirus B2 serves as a necrotic “death factor” that upregulates the proapoptotic protein Bax and induces loss of mitochondrial membrane potential but not release of mitochondrial cytochrome c. In alphanodaviruses, however, B2 is not associated with apoptosis, and while FHV infection induces caspase-dependent apoptosis in infected cells, NoV infection does not.

Physical Properties

Nodaviruses are non-enveloped, small (29–32 nm diameter) icosahedral viruses that contain bipartite positive-strand ssRNA genomes. Alphanodavirus particles are resistant to chloroform, diethyl ether, 1 M urea, 20 mM EDTA and 1% sodium deoxycholate, and all except Boolarra virus (BoV) are stable in 1% SDS. Immature provirions are sensitive to these denaturants and to freezing. FHV is stable at 40 °C for 10 min but loses infectivity when heated at 58 °C. FHV, BoV, and NoV are stable over a wide range of pH values (pH 3–10). NoV is unstable in the presence of chloride ions, and BBV and LNVG are unstable when stored in cesium chloride (CsCl), while FHV is stable in CsCl.

Betanodaviruses are resistant to aquatic environments and can survive at low temperatures in seawater. The infectivity of viruses in seawater decreases at temperatures above 25 °C, while viruses under drying conditions lose infectivity at 21 °C. Betanodavirus particles are inactivated by hydrogen peroxide, benzalkonium chloride, iodine, and sodium hypochlorite but are resistant to formalin.

While MrNV particles are stable during storage in a cell lysate or tissue sample at −20 °C and −70 °C, their infectivity is lost upon experimental heat treatment (65 °C for two hours). The chemical stability of MrNV particles is unknown.


Life Cycle

The ecological life cycle of alphanodaviruses is unknown. Nodamura virus was isolated from mosquitoes and found to cause lethal infections in suckling mice when introduced experimentally via the intracranial (IC) route. NoV can also be transmitted from an infected mouse to a naïve one by mosquitoes and can cause lethal infections in both vertebrates (suckling mice and suckling hamsters) and invertebrates (mosquitoes, honeybees, larvae of the greater wax moth, and fruit flies). The remaining alphanodaviruses, including FHV, BBV, and BoV, have been isolated only from insects, and little is known about their transmission in the wild. However, the infectious cycle of the viruses is well understood due to their ready replication in a variety of cultured cells from both vertebrates and invertebrates, and in transformed yeast cells.

Betanodaviruses infect primarily larvae and juvenile fish. They can be transmitted to juvenile farmed fish populations from wild reservoir species or through importation of infected juveniles from other farms. The literature suggests that betanodaviruses most likely infect the intestinal epithelium and peripheral nervous system, are rapidly transmitted to the central nervous system, and result in either death or persistence in survivors. The virus is thought to spread to the environment by decomposition of dead fish.

Little is known about the natural transmission cycle of Macrobrachium rosenbergii nodavirus (MrNV). Both horizontal and vertical (trans-ovum) transmission by the water-borne route have been observed. Outbreaks of WTD may be induced or enhanced by rapid environmental changes such as those in pH, salinity, and temperature. Similarly, survival of shrimp with symptoms of covert mortality disease or viral covert mortality disease decreases sharply at water temperatures above 28 °C.

Epidemiology

Over 50 species of fish are susceptible to betanodavirus infections, in particular the marine fish species striped jack (Pseudocaranx dentex), European sea bass (Dicentrarchus labrax), groupers (Epinephelus species), and flatfishes. These diseases primarily affect farmed fish, which are particularly vulnerable to infectious diseases due to the high densities at which the fish are raised. In general, the earlier the age at which betanodavirus disease signs and symptoms occur, the greater the mortality. In larval striped jack, the disease can occur within two days of hatching, with peak mortality observed at 10 days post-hatching, resulting in almost complete loss of the larvae. In European sea bass, mortality is most frequently seen 30 days post-hatching but may occur in adults as well as larvae and juveniles. The disease has been reported from countries in the Mediterranean region (France, Greece, Israel, Italy, Malta, Portugal, Spain, and Tunisia), in the Atlantic region (Norway and the United Kingdom), in North America (Canada and the United States), and in Asia (Chinese Taipei, India, Indonesia, Iran, Japan, Korea, Malaysia, the Philippines, the People's Republic of China, Thailand, and Vietnam). In Japan, a survey of a representative sample of 30 species in two geographically distinct regions found that most farmed and wild fish tested positive for VNN.

White tail disease (WTD) in the giant river prawn Macrobrachium rosenbergii is caused by infection with MrNV. The disease has been reported in Australia, Chinese Taipei, the French West Indies (the islands of Guadeloupe and Martinique), India, Malaysia, the People's Republic of China, and Thailand. PCR-positive results but no active infection have been obtained with other crustacean species, including brine shrimp (Artemia sp.), red claw crayfish (Cherax quadricarinatus), monsoon river prawn (Macrobrachium malcolmsonii), hairy river prawn (M. rude), Indian white prawn (Penaeus indicus), kuruma prawn (Penaeus japonicus), and the giant tiger prawn (Penaeus monodon). In the latter three marine shrimp species, infection could not be initiated by the oral route, but intramuscular inoculation resulted in accumulation of particles in abdominal muscle, gill, hemolymph, intestine, and stomach that were infectious when inoculated into M. rosenbergii postlarvae (PL). Similar results have also been described in several insect species, including dragonfly (Aeshna sp.), giant water bug (Belostoma sp.), beetle (Cybister sp.), and backswimmer (Notonecta sp.). For MrNV, only larvae, PL, and early juveniles are susceptible, while adults are resistant. No mortality has been reported in subadult or adult prawns naturally or experimentally infected with MrNV.

Covert mortality disease (CMD) or viral covert mortality disease (VCMD) is an emerging shrimp disease caused by covert mortality nodavirus (CMNV). The name CMD derives from the behavior of moribund shrimp, which hide in deep water rather than swimming to the surface or into shallow water. The disease was initially identified in Litopenaeus vannamei (formerly Penaeus vannamei; common name white leg shrimp or Pacific white shrimp); specimens of Marsupenaeus japonicus (Japanese tiger prawn or Kuruma prawn) and Fenneropenaeus chinensis (Chinese white shrimp or fleshy prawn) isolated from diseased ponds exhibit similar symptoms. An outbreak of VCMD in 2012–2014 resulted in heavy losses of farmed M. japonicus and F. chinensis in the municipality of Tianjin and in Hebei and Shandong Provinces in China. A recent three-year epidemiological survey determined the prevalence of CMNV in samples of L. vannamei, F. chinensis, Marsupenaeus japonicus, Macrobrachium rosenbergii, and Penaeus monodon collected from 145 shrimp farm sampling sites in ten coastal provinces in China. The prevalence was highest in Guangdong and Hainan Provinces (approximately 50%), 25%–30% in Hebei, Shandong, Jiangsu, Zhejiang, and Fujian Provinces, 17% in Guangxi Province, and approximately 10% in Liaoning and Tianjin Provinces. Other studies have reported isolation of CMNV-positive shrimp from Ecuador, Thailand, and Vietnam, suggesting a wide distribution.

Clinical Features

VNN

Betanodaviruses are the causative agents of viral nervous necrosis (VNN), also known as viral encephalopathy and retinopathy (VER). The most common clinical signs include aberrant swimming (horizontal swimming, looping, darting, spiral swimming, or whirling), loss of appetite, hyperinflation of the swim bladder, and lightening or darkening of the coloration. Nerve cells in the cerebellum and olfactory lobe exhibit vacuolization, and infected cells have been detected in the spinal cord, medulla oblongata, thalamus, and preoptic area. However, the clinical signs depend on the fish species, developmental stage, water temperature, and the phase of the disease. For example, while juvenile fish may be severely affected, adults are often less so. Further information on VNN may be obtained from the OIE Reference Laboratory for Viral encephalopathy and retinopathy and the excellent review by Bandín and Souto (see Further Reading).

WTD

Clinical signs include lethargy and whitening or opaqueness of the abdominal muscles. PL become opaque and develop a milky, white appearance. This whitish discoloration appears first in the second or third abdominal segment and diffuses gradually in both anterior and posterior directions. In severe cases, the last pair of abdominal legs (uropods) and the tail fan (consisting of uropods and telson) may degenerate. Mortality generally peaks 5–6 days after onset of symptoms and very few PL survive beyond 15 days. PL that do survive can reach normal market size. Due to the possibility of persistent infection, adults can serve as carriers.

VCMD

Clinical signs include hepatopancreatic atrophy with color fading and necrosis, empty stomach and guts, soft shell, slow growth, and often abdominal muscle whitening and necrosis. Daily reports of mortality within a population begin within one month and increase during a 60–80 day period post-stocking, with a total mortality of approximately 80%. Although these symptoms are somewhat similar to those of early mortality syndrome (EMS)/acute hepatopancreas necrosis disease (AHPND), which is caused by a bacterial pathogen, CMD and EMS/AHPND are distinct from one another. As a result, covert mortality disease (CMD) was renamed viral covert mortality disease (VCMD) to reduce confusion between the diseases.

Pathogenesis

Betanodaviruses replicate in the brain, spinal cord, and retina of a number of fish species, resulting in vacuolization of the brain and retina together with cellular necrosis and neural degeneration. Large vacuolar lesions are often seen in larvae and juveniles, while adults exhibit mild or absent symptoms. Infected striped jack larvae also exhibit hyperplasia and degeneration of epithelial cells of the skin, oral cavity, and gills. The tissue distribution appears to differ between larvae/juvenile fish and adults, with greater vacuolization in the spinal cord and medulla oblongata seen in the earlier developmental stages.

In MrNV-infected PL and early juveniles, striated muscles of the tail, abdomen, and cephalothorax are affected. Histological examination reveals acute Zenker’s necrosis of striated muscles, with severe hyaline degeneration, muscular lysis, and necrosis. Abnormal open spaces, moderate edema, and basophilic cytoplasmic inclusions are observed among infected muscle cells. Transmission electron microscopy reveals a disorganized cytoplasm and the presence of not only MrNV but also a satellite virus (extra small virus, XSV). XSV was initially classified as a member of the Nodaviridae but was moved recently into its own family, the Sarthroviridae.

Covert mortality nodavirus (CMNV) causes multifocal myonecrosis in striated muscle, accumulation of vacuoles in the cytoplasm of hepatopancreocytes, eosinophilic inclusions in the tubular epithelium of the hepatopancreas, and spheroids in the lymphoid organ with inclusions and nuclear condensation of the chromatin.

Immune Responses to Nodavirus Infection

Innate Immune Responses

Host immune systems are equipped with a variety of tools in their antiviral arsenal. Innate immune responses to viral infection can typically be classified within seven primary categories: (1) activation of Pattern Recognition Receptors (PRRs) capable of binding Pathogen-Associated Molecular Patterns (PAMPs); (2) production of siRNAs, miRNAs, and piRNAs that target viral RNA for degradation and inhibit replication; (3) activation of signaling pathways such as Toll, NF-κB, and Imd; (4) activation of Jak-STAT pathways, which act in a fashion similar to interferon pathways in mammals and control transcription and antiviral replication; (5) triggering of autophagy, which can recycle damaged organelles and engage in signaling through the PI3K-Akt and TOR pathways; (6) production of interferon-like molecules and inflammatory cytokines; and (7) cytotoxic T cell responses that activate granzymes and perforins.

Activation of PRRs through the binding of viral-associated nucleic acids to Toll-like receptors leads to the activation of a number of signaling molecules, including TRIF, MyD88, NF-κB, Interferon Regulatory Factors (IRFs), and MAPK, that stimulate the production of cytokines, chemokines, and interferon-like proteins to suppress viral replication. Anti-microbial Toll responses from PRRs lead to activation of serine proteases and signal transduction, resulting in nuclear translocation and regulation of effector genes. Both the Toll and Imd pathways are linked to NF-κB signaling in innate immune responses by activation of cellular transcription. Upregulation and expression of genes in both pathways have been observed in nodavirus-infected species.

Silencing of viral RNA occurs most commonly through the production of small interfering RNAs (siRNAs) and microRNAs (miRNAs) that prevent transcription and regulate gene expression.
P-element Induced WImpy testis (PIWI)-interacting RNAs (piRNAs) are a type of RNA distinct from miRNAs whose role is to exert epigenetic control of germline genomic elements in vertebrates. MiRNAs and piRNAs have been identified in invertebrates; however, their role in transcriptional regulation, particularly in insects, is not well understood. Evaluation of their role has suggested that production of piRNA precedes that of siRNA and then wanes. Blockage of piRNA leads to increased viral replication and the accumulation of viral particles, suggesting a regulatory role. In response to these regulatory pathways, nodaviruses have developed mechanisms to suppress both host siRNA and miRNA activity by producing proteins and subgenomic RNAs including B2 and dicer that bind and promote cleavage of dsRNA.

Jak-STAT pathways can respond to viral activation through multiple mechanisms including dephosphorylation, nuclear export, or negative feedback. The activation of this pathway results in changes in the transcription profile of infected organisms. This upregulation is not due to dsRNA alone, but requires active viral replication and production. Jak-STAT pathway mRNA expression is increased in nodavirus-infected hosts, supporting an anti-viral role for this pathway. Importantly, infection of hosts with different nodaviruses led to differential expression of Jak-STAT pathway members, indicating that other pathways may be involved. Furthermore, STAT can also repress gene activity by forming complexes with other transcription factors and chromatin-modifying proteins that regulate NF-κB immune responses. Toll ligands, cecropins, and defensins can also be downregulated, supporting the notion that Jak-STAT activation can lead to inhibition of Toll-regulated genes, thus regulating anti-viral responses.

Autophagy is another mechanism that infected hosts use to regulate viral transcription and translation. In this process, de novo synthesized membranes encompass large cytoplasmic components, including damaged organelles or protein aggregates containing viral components, and join with lysosomes to form autolysosomes, which degrade and recycle digested materials.
Positive and negative regulation of autophagy occurs primarily through the phosphoinositide 3-kinase-Akt (PI3K-Akt) and Target of Rapamycin (TOR) pathways, respectively. Knockdown of autophagy-related genes results in higher viral production and prevents the formation of autolysosomes, supporting the role of autophagy in viral suppression. Phosphorylation of PI3K-Akt was shown to downregulate TOR activation and increase autophagy, which subsequently leads to decreased viral replication.

Interferons and other inflammatory cytokines are a crucial aspect of the host antiviral response and help by inhibiting viral replication and promoting a pro-inflammatory phenotype. Viral pathogens such as NoV can promote activation of Interferon-Stimulated Response Elements (ISREs) through two common pathways. In the first, Pattern Recognition Receptors (PRRs) that include RIG-I (retinoic acid-inducible gene I) and MDA5 (melanoma differentiation-associated protein 5) sense double-stranded RNA in the cytoplasm. Once this occurs, these proteins can trigger the activation of MAVS (mitochondrial antiviral signaling protein), which in turn activates a complex composed of Tank Binding Kinase 1 (TBK1) and IκB kinase-ε (IKKε). Stimulation of the TBK1/IKKε complex drives the phosphorylation of interferon regulatory factors 3 and 7 (IRF3 and IRF7), which translocate to the nucleus and bind to the ISRE promoters that upregulate transcription and translation of interferon-β and other Interferon-stimulated genes (ISGs). In the second, viral PAMPs are recognized through Toll-like receptors 3, 7 and 8 (TLR3, 7, and 8), which specifically recognize viral dsRNA and ssRNA in the endosome and signal through TRIF (TIR-domain-containing adapter-inducing interferon-β) and MyD88 (Myeloid differentiation primary response 88). Activation of these factors triggers phosphorylation of IRF5 and IRF7 and downstream upregulation of ISGs and Type I Interferon production. TRIF can also activate the TBK1/IKKε complex, thus amplifying the antiviral response.

Multiple mechanisms used by viruses to inactivate this pathway have been identified, including the prevention of dephosphorylation, which keeps signaling proteins like MDA5 in an inactive state; blocking of the ATPase activity or prevention of ubiquitination of RIG-I; modulation of miRNAs that block MDA5 or RIG-I upregulation; and direct cleavage or degradation of MDA5, MAVS and RIG-I. Nodaviral RNA replication complexes promote progressive clustering and aggregation of mitochondria within the cell, culminating in malformation of muscle fibers, tissue damage and paralysis. Because nodavirus RdRps localize to the outer mitochondrial membrane, it is critical that these viruses prevent or modulate host antiviral activity. Because the NoV RdRp associates with MAVS (Johnson, personal communication), mechanisms using such strategies as direct cleavage and degradation of mitochondrial membrane-associated proteins like MAVS and likely STING (Stimulator of interferon genes), or their sequestration in virally induced autophagosomes (Johnson, unpublished), are likely to be required to control the innate host response.

Adaptive Immune Responses in Nodavirus Infection

T cells are a crucial component of antiviral immunity and can determine whether a viral infection will be controlled and whether lasting immunity will be elicited. T cell responses to nodavirus infection have been characterized in only a limited number of studies. Data suggest that cytotoxic T lymphocytes (CTLs) play a specific role in the modulation of nodavirus infection that is dependent upon MHC class I and antigen-presenting cells. CTL responses have been characterized by the upregulation of cell markers, inflammatory cytokines, and genes encoding mediators of cell lysis, including granzyme, perforin, and others. Importantly, this activity was limited in some species, which allowed virally infected cells to evade killing. Other studies have confirmed increased levels of T cell markers and type II interferon following immunization with nodavirus, further suggesting that CD4+ T cells may also play a critical role in the reduction of viral loads.

Diagnosis

VNN

Diagnosis of a suspect case of VNN (VER) depends on at least one of the following criteria: abnormal swimming behavior in a susceptible species; typical histopathological lesions in a susceptible species; typical cytopathic effects (CPE) observed in cell cultures; detection of specific antibody reactivity; any positive result obtained by RT-PCR and subsequent genomic sequencing, quantitative PCR, or histopathology; transfer of live fish from an infected farm to another site; or the existence of epidemiological links between an infected farm and another farm. A confirmed case is one in which positive results are obtained from at least two of the following methods: a suspect case that has produced typical CPE in cell culture plus identification of the causative agent by a molecular test or an antibody-based test and a second diagnostic assay that might include isolation from cultured infected cells followed by immunostaining, reverse transcription PCR (RT-PCR), or quantitative RT-PCR (qRT-PCR). While serological diagnostic methods are not yet used as a screening method for assessing the presence of Betanodavirus species in fish populations, a recent study generated novel antibodies suitable for development as a serological test for diagnosis of VNN.

WTD

Diagnosis of a suspect case of WTD requires that at least one of these criteria is met: clinical signs or histopathology consistent with MrNV infection, or a positive result by RT-PCR or qRT-PCR. A confirmed case fulfills two or more of the following: MrNV-positive histopathology, a positive result in target tissues, RT-PCR followed by sequencing, or qRT-PCR. MrNV can be detected by in situ hybridization of histological specimens using a digoxigenin-labeled probe. Although no serological methods have been developed to date, both genome-based (qRT-PCR) and antibody-based (ELISA) tests are available for diagnostic use. The virus can be propagated in nodavirus-susceptible cultured cell lines, including the C6/36 line of the mosquito Aedes albopictus and the SSN-1 line of the fish Channa (Ophicephalus) striatus (striped snakehead). No specific CPE is observed, but viral replication can be detected by acridine orange staining, quantitative RT-PCR, and EM.
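
The suspect and confirmed case definitions for WTD amount to a simple counting rule, which can be sketched in Python. This is only an illustrative encoding of the criteria as summarized above, not a diagnostic tool: the finding labels and the function name are hypothetical, histopathology consistent with MrNV infection and MrNV-positive histopathology are treated as the same finding, and a sequenced RT-PCR product is assumed to imply a positive RT-PCR result.

```python
# Illustrative encoding of the WTD case definitions summarized in the text.
# All identifiers below are hypothetical labels chosen for this sketch.

# Findings that, taken alone, are enough for a "suspect" case.
SUSPECT_FINDINGS = {"clinical_signs", "histopathology", "rt_pcr", "qrt_pcr"}

# Findings that count toward a "confirmed" case (two or more are required).
CONFIRMING_FINDINGS = {
    "histopathology",          # MrNV-positive histopathology
    "target_tissue_positive",  # "positive result in target tissues"
    "rt_pcr_sequencing",       # RT-PCR followed by sequencing
    "qrt_pcr",
}


def classify_wtd_case(findings):
    """Return 'confirmed', 'suspect', or 'unclassified' for a set of positive findings."""
    positive = set(findings)
    # Assumption: a sequenced RT-PCR product implies a positive RT-PCR result.
    if "rt_pcr_sequencing" in positive:
        positive.add("rt_pcr")
    if len(positive & CONFIRMING_FINDINGS) >= 2:
        return "confirmed"
    if positive & SUSPECT_FINDINGS:
        return "suspect"
    return "unclassified"


if __name__ == "__main__":
    print(classify_wtd_case({"clinical_signs"}))
    print(classify_wtd_case({"histopathology", "qrt_pcr"}))
    print(classify_wtd_case(set()))
```

Keeping the two criteria sets separate mirrors the text, in which clinical signs alone can raise suspicion but do not count toward confirmation.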

VCMD

Diagnosis of shrimp exhibiting suspected clinical signs of CMD has been accomplished using various nucleic acid amplification tests including RT-LAMP, nested RT-PCR, and qRT-PCR. The RT-LAMP and RT-PCR assays test RNA isolated from the cephalothorax of suspected CMNV-infected shrimp. Because the RT-LAMP assay uses a fluorescent dye for detection rather than gel electrophoresis or qRT-PCR, it is suitable for use as a rapid detection method in the field.

Treatment

No treatment is available for nodavirus infection.

Prevention

VNN

Virus-contaminated water can be sterilized by UV treatment. Fertilized fish eggs may be washed in ozone-treated seawater to remove viral particles from their outer surfaces. Treatment of rearing water with chlorine or ozone appears effective in controlling the disease in larvae of striped jack, sevenband grouper, barfin flounder, and Atlantic halibut. Immunization with recombinant viral capsid protein or virus-like particles or with formalin-inactivated virus has proven effective in preventing VNN. An inactivated RGNNV preparation that protects sevenband grouper from RGNNV infection is commercially available in Japan. More recently, the production of antisera that are specific for the capsid proteins of SJNNV, RGNNV, TPNNV, and BFNNV suggested the possibility of developing antibody-based vaccines that might prove cross-protective against all four betanodavirus genotypes.

WTD

Prevention may be accomplished using standard procedures for control of crustacean viral diseases, including use of iodophors or formalin to disinfect eggs and larvae. Preventive measures include disinfection of tanks and water supplies and screening of brood stock and post-larvae (PL) to maintain pathogen-free populations. The OIE Manual of Diagnostic Tests for Aquatic Animals recommends against mixed cultures of M. rosenbergii and the giant tiger prawn Penaeus monodon, or rotation between the two species in the same tank, to decrease the likelihood of spread. No vaccine is available for WTD.

VCMD

Preventive measures include deployment of a reverse transcription loop-mediated isothermal amplification (RT-LAMP) rapid detection kit for screening brood stock for CMNV prior to stocking of ponds, and adoption of indoor culture methods. No vaccine is available for VCMD.



Nudiviruses (Nudiviridae)

Yu-Chan Chao, Chih-Hsuan Tsai, and Sung-Chan Wei, Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Baculovirus: Member of the Baculoviridae family of occluded viruses that are pathogenic to insects and other invertebrates. The salient features of these viruses are their rod-shaped nucleocapsids and large, circular, double-stranded, supercoiled DNA genomes.
Episome: A non-integrated, extrachromosomal, closed circular DNA molecule that may be replicated in the nucleus with chromosomal DNA.
Latent infection: Viral infection without cell lysis, in which the viral genome, but not virus particles, is detectable.
Nucleocapsid: Protein shell that encloses the viral genetic material.
Nudivirus: Virus that is generally not packed in an occlusion body.
Occlusion body: A virus-encoded protein lattice that packs and protects the virions in the environment.
Persistent infection: Viral infection without cell lysis, but with the appearance of virus particles.
Productive/lytic infection: A form of viral infection in which large numbers of viruses are produced and that usually results in cell lysis.

Classification

History

Nudiviruses are a highly diverse group of rod-shaped, enveloped viruses with circular double-stranded DNA (dsDNA) genomes. These viruses infect a wide variety of arthropod hosts, including insects and crustaceans (shrimp). The prefix “Nudi-” comes from the Latin “nudus”, meaning naked, as several of the initially described members of this viral group do not produce occlusion bodies, the protein matrices in which virions are embedded in other large dsDNA insect viruses such as baculoviruses. Because their structural features and replication processes are similar to those of baculoviruses, nudiviruses were previously classified as “non-occluded baculoviruses” (NOBs) in the Baculoviridae. However, due to differences in morphological and genomic characters, they were later removed from the family Baculoviridae and remained orphaned for years without a clear classification. Several nudiviruses were also characterized as “intranuclear bacilliform viruses” (IBVs) during this period. Ultimately, in 2013, the new family Nudiviridae was established by the International Committee on Taxonomy of Viruses (ICTV), consisting of two new genera, Alphanudivirus and Betanudivirus. To date, a variety of nudiviruses and nudivirus-like viruses have been reported from various host species belonging to the Lepidoptera, Trichoptera, Diptera, Siphonaptera, Hymenoptera, Neuroptera, Coleoptera, Homoptera, Thysanura, Orthoptera, Acarina, Araneina, and Crustacea (Table 1).

Criteria for Classification

Although nudiviruses were first characterized according to the absence of an occlusion body, an increasing number of members of the Nudiviridae have been found to generate occlusion bodies, so this feature is no longer suitable for classification. Several criteria have been proposed by nudivirus researchers for classifying a virus as a member of the Nudiviridae: (1) virion morphology is typically rod-shaped and enveloped; (2) the virus has a large circular dsDNA genome and possesses the conserved nudivirus core genes (described further below); (3) the virus propagates in the nuclei of infected host cells and induces nuclear hypertrophy; (4) the virus can be transmitted orally or parenterally; and (5) the virus infects hosts at larval and/or adult stages and exhibits diverse tissue and cell tropisms. As most of these criteria may not clearly distinguish nudiviruses from other dsDNA arthropod viruses, molecular phylogenetic analyses using genomic data have become the most reliable means of classifying nudiviruses. These viruses are phylogenetically distinct from their closest viral taxa, e.g., baculoviruses (see Fig. 1). Most new members of the Nudiviridae, either re-assigned from other virus families or new isolates, are classified based on genetic analyses in spite of their various morphologies and physiologies.

Members of the Nudiviridae

Table 1 lists the current well-studied or confirmed nudiviruses, though several potential nudiviruses remain to be definitively described. Current classification places both OrNV and GbNV in the genus Alphanudivirus and HzNV-1 and HzNV-2 in the genus Betanudivirus, with the rest remaining unassigned (Fig. 1). Apart from newly isolated nudiviruses, members of the Nudiviridae have mostly been re-assigned from other virus families. These re-classified viruses were renamed using the species name of the host from which they were first isolated. For instance, Oryctes rhinoceros nudivirus is now the type species of the genus Alphanudivirus. OrNV, from the beetle Oryctes rhinoceros, is a member of that species and is the first

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00008-4


Table 1 Nudiviruses and hosts from which they were isolated

Host order    Host species (common name)                  Nudivirus (abbreviation)
Lepidoptera   Heliothis zea (corn earworm)                Heliothis zea nudivirus-1 (HzNV-1)
Lepidoptera   Heliothis zea (corn earworm)                Heliothis zea nudivirus-2 (HzNV-2)
Diptera       Tipula oleracea (cabbage crane fly)         Tipula oleracea nudivirus (ToNV)
Diptera       Tipula paludosa (European crane fly)        Tipula paludosa nudivirus (TpNV)
Diptera       Drosophila innubila (mycophagous fly)       Drosophila innubila nudivirus (DiNV)
Diptera       Drosophila melanogaster (fruit fly)         Kallithea virus (KV)
Orthoptera    Gryllus bimaculatus (two-spotted cricket)   Gryllus bimaculatus nudivirus (GbNV)
Coleoptera    Oryctes rhinoceros (rhinoceros beetle)      Oryctes rhinoceros nudivirus (OrNV)
Decapoda      Penaeus monodon (tiger shrimp)              Penaeus monodon nudivirus (PmNV)
Decapoda      Litopenaeus vannamei (whiteleg shrimp)      Penaeus vannamei single nucleopolyhedrovirus (PvSNPV)
Decapoda      Homarus gammarus (European lobster)         Homarus gammarus nudivirus (HgNV)

Fig. 1 Phylogeny of large dsDNA arthropod viruses. The tree was generated by Maximum Likelihood (ML) inference analysis using a concatenated amino acid alignment of 37 nudivirus-related genes. Numbers on the nodes indicate ML non-parametric bootstrap support (100 replicates). The white spot syndrome virus (WSSV-TH) was used as the outgroup. This tree has been reproduced with permission from: Bézier, A., Thézé, J., Gavory, F., et al., 2015. The genome of the nucleopolyhedrosis-causing virus from Tipula oleracea sheds new light on the Nudiviridae family. Journal of Virology 89, 3008–3025.

nudivirus discovered; it had been variously named “Rhabdionvirus oryctes”, “baculovirus of Oryctes”, or “Oryctes baculovirus” and was assigned as a subtype in different virus families, e.g., the family Baculoviridae Subgroup C (non-occluded rod-shaped nuclear viruses) and the Oryctes virus family. GbNV, which was isolated from the two-spotted cricket, Gryllus bimaculatus, was previously called the “baculovirus of cricket” or “cricket baculovirus”. Heliothis zea nudivirus is the type species of the genus Betanudivirus, with HzNV-1 as a member of that species. HzNV-1 was isolated from Heliothis zea ovarian tissue cell lines, and its genome was the first nudivirus genome to be sequenced. It has variously been known as “IMC-HZ-I-NOV”, “baculovirus X”, “Hz-1 baculovirus”, and “Hz-1 virus (Hz-1V)”, and it was considered to belong to the type species of the subfamily Nudibaculovirinae in the Baculoviridae. Reassignment of PmNV to the Nudiviridae was more complicated, as this shrimp-infecting virus forms occlusion bodies (as do baculoviruses) and it had been designated “monodon baculovirus” (MBV), “Penaeus monodon baculovirus”, “Penaeus monodon singly enveloped nuclear polyhedrosis virus” (PmSNPV) in the subgenus of single nucleocapsid SNPVs, and “Penaeus monodon


nucleopolyhedrovirus” (PemoNPV) in the former Nucleopolyhedrovirus genus. This virus is now considered an unclassified nudivirus belonging to the family Nudiviridae based on molecular phylogenetic analysis and was consequently renamed “Penaeus monodon nudivirus (PmNV)”. Genomic data on PmNV support this new classification and may even suggest that a distinct genus within the Nudiviridae should be established (described below). However, the name MBV is still commonly used in the literature to describe this virus. Another occluded nudivirus, TpNV, which infects crane flies, was previously named “T. paludosa baculovirus (TpBV)” and “T. paludosa nucleopolyhedrovirus (TpNPV)”. In contrast to the above-mentioned nudiviruses, which were named according to structural features of the virion, HzNV-2, the second nudivirus to be isolated from the corn earworm Heliothis zea, was previously named “gonad-specific virus (GSV)”, “H. zea reproductive virus”, and “Hz-2V”, primarily on the basis of its pathology in its hosts. Newly isolated nudiviruses are now more practically designated according to the hosts from which they have been isolated, such as DiNV, which infects the mushroom-feeding fly Drosophila innubila, and ToNV, which infects the cabbage crane fly Tipula oleracea. Kallithea virus (KV) is the first DNA virus to be isolated from wild D. melanogaster populations and has temporarily been named according to the collection locality.

Phylogeny (Evolution of Nudiviridae and Other dsDNA Arthropod Viruses)

Phylogenetic analyses of viral core genes (described in the “Genome” section) reveal that nudiviruses can be separated from baculoviruses (Fig. 1, shaded in green). The Nudiviridae comprises several clades, including those of the Alphanudivirus and Betanudivirus genera. The newly sequenced PmNV and ToNV genomes suggest the existence of two additional genera. Notably, the phylogeny indicates that the bracoviruses (family Polydnaviridae) may derive from nudiviruses. Bracoviruses form a symbiotic relationship with braconid wasps and are obligatorily parasitic on their lepidopteran hosts. The presence in bracoviruses of a significant number of genes homologous to those of nudiviruses suggests that these polydnaviruses may have originated from the integration of an ancient nudivirus into the genome of a braconid wasp. Nudivirus-like genes have also been identified in Nilaparvata lugens endogenous nudivirus (NlENV) of the plant sap-sucking insect Nilaparvata lugens (brown planthopper), but whether these genes are expressed or functional is uncertain. The phylogeny displayed in Fig. 1 may reflect the evolution of these viruses. The baculovirus and nudivirus lineages seem to have diverged from their common ancestor upon the appearance of holometabolous insects (insects exhibiting completely different appearances in immature and mature stages). The baculovirus lineage became specialized in infecting larvae of the newly evolved dipteran, hymenopteran, and lepidopteran insects. In contrast, the nudivirus lineage evolved to infect larval and adult hosts from a variety of arthropods by means of complex infection strategies. Such strategies, including the establishment of latent infections, might have been instrumental for the nudiviral ancestor of the bracoviruses to integrate its genome into a parasitoid wasp chromosome.

Virion Structure

The enveloped virion of nudiviruses typically harbors a single rod-shaped nucleocapsid. The virions of the two recognized Nudiviridae genera differ in morphology: alphanudiviruses (e.g., OrNV and GbNV) have more rounded virions of ~200 × 100 nm, whereas those of betanudiviruses (e.g., HzNV-1 and HzNV-2) are longer and thinner at ~300–400 × 80 nm. Several nudiviruses exhibit protruding structures within their virions, such as OrNV, which has thickened or “capped” nucleocapsid ends and a tail-like appendage at one end of the nucleocapsid. A filamentous structure has frequently been observed between the nucleocapsids and dilated envelopes of GbNV and HzNV-2, whereas HzNV-1 does not exhibit this structure. Although nudiviruses such as HzNV-1 and GbNV are non-occluded, many nudiviruses form occlusion bodies during viral transmission. The morphology of occlusion bodies also differs among nudiviruses. ToNV and TpNV form enveloped occlusion bodies, PmNV forms non-enveloped occlusion bodies, and both OrNV and HzNV-2 form facultatively occluded virions. Although the occlusion body proteins, or the potential genes encoding them, usually bear low sequence similarity to those of baculoviruses (described further below), the similar morphologies of baculovirus and nudivirus occlusion bodies would seem to preclude using this feature as a criterion to distinguish these two families of viruses.

Transmission and Pathogenesis

Nudiviruses infect a broad diversity of host species, and they exhibit tissue tropisms and transmission modes distinct from those of their sister group, the baculoviruses. Unlike baculoviruses, which primarily infect the midgut epithelium of lepidopteran larvae and are transmitted by feeding (oral route), nudiviruses infect a variety of tissues of both larval and adult hosts in a range of arthropod orders (Table 1), and they can be transmitted through feeding or mating (oral and parenteral routes, respectively). Pathologies are therefore diverse and depend on the stage of the infected host. Specifically, OrNV infects both larval and adult O. rhinoceros beetles via oral transmission and primarily targets midgut epithelial cells. In larvae, the virus spreads to other tissues and causes the infected larvae to die by fat body disintegration. Infected adult beetles exhibit few obvious symptoms,


carrying abundant newly synthesized viruses and releasing them in their feces. GbNV infects its host crickets through fecally contaminated food. It targets the fat body cells of cricket nymphs and adults. GbNV infection is lethal in early nymphs, whereas infected adult crickets may exhibit reduced size or become crippled. PmNV is transmitted via oral ingestion of food contaminated with feces containing viral occlusion bodies. PmNV targets the anterior midgut epithelium of very young post-larvae of tiger shrimp. When infecting post-larvae, juveniles, and adults, PmNV targets the hepatopancreatic tubule epithelium and duct epithelium. PmNV infection has been shown to reduce shrimp productivity. The fly nudivirus, DiNV, diminishes the viability of infected flies and reduces the number of eggs they lay. TpNV targets the blood cells, and infection delays crane fly larval development before the larvae die. The differences in pathology between HzNV-1 and HzNV-2 infections are notable. HzNV-1 can replicate only in cultured cells of lepidopteran tissue, and infection is not evident in its insect host, the corn earworm. In cultured cell lines, HzNV-1 causes a productive infection that kills most of the cells, leaving very few survivors, which become latently infected. During viral latency, viral DNA was found to be present as an episome or as an insertion in the host genome. In contrast, HzNV-2 can replicate in and persistently infect the corn earworm. The predominant means of transmission of HzNV-2 is through mating between infected moths. HzNV-2 infects the reproductive tissues of infected moths, causing malformation of those tissues and host sterility. The mating pheromone generated by infected females attracts mate-seeking males, further transmitting HzNV-2.

Virus Life Cycle

Viral life cycles in infected cells have been studied for only a few nudiviruses. OrNV attaches to and enters cultured cells by pinocytosis, whereby the virus is brought into cells by means of a small vesicle formed by cell membrane invagination. OrNV may enter the nucleus in the form of an unenveloped nucleocapsid, although the uncoating process in the cytoplasm remains unknown. Initial virus replication occurs at ~7 h post infection (hpi), marked by the appearance of cleared, chromatin-free areas in the nucleus. In these areas, envelopes and nucleocapsid shells form first, and the viral nucleoprotein core and DNA condense into these enveloped shells. As early as 12 hpi, the resulting virions enter the cytoplasm and bud out through the plasma membrane, where they acquire a second membrane. Viral budding is greatest at 16 hpi and lasts until at least 36 hpi. The assembly processes of OrNV and some other nudiviruses are very different from those of baculoviruses; baculoviruses produce naked nucleocapsids in the nucleus and acquire their envelopes during budding from the plasma membrane or before occlusion into polyhedra or granule cores. In OrNV-infected cells, enveloped nucleocapsids have always been observed, although virus purified from the cells with only a single envelope exhibits infectivity similar to that of virus that has acquired a second envelope by budding. In the cell line DSIR-HA-1179, OrNV produces at least 27 proteins, of which 14 are thought to be envelope proteins and 13 are associated with the nucleocapsid. The most abundant protein, P13, can be synthesized up to 36 hpi, but synthesis of most proteins has ceased before then. Although homologs of late or very late genes appear in the genome, synthesis of these late proteins does not occur in OrNV. HzNV-1 exhibits two infection stages in cultured cells, i.e., productive (lytic) and persistent (latent) infections (Fig. 2).
DNA replication and virus assembly processes in productive infection of HzNV-1 are similar to those of OrNV. That is, enveloped, rod-shaped virus particles are first formed in the nucleus of infected cells and then filled with capsids containing the viral DNA. Viruses are released from infected cells by cell lysis. The earliest sign of viral DNA replication occurs at 4 hpi, and the replication cycle can be divided into three stages based on the timing at which viral proteins appear: early (0–4 hpi), intermediate (4–8 hpi), and late (beyond 8 hpi). As high titers of virus progeny are produced, most infected cells ultimately die. However, a small percentage (usually less than 5%) of infected cells become persistently infected. During this stage, virus progeny are detected at low titers for a prolonged period of time and for many culture passages without obvious cell lysis. Persistently infected cells can be established by viruses in both productive and latent infection cycles. Upon the establishment of persistent viral infection, virions are not detectable in most of the infected cells, and viral genomes either exist as episomes or are inserted into the host genome (Fig. 2). At the same time, in a very small fraction of cells (less than 0.2%), viruses undergo the productive infection cycle, resulting in viral titers of approximately 10^3 plaque-forming units (PFU)/mL in the culture medium. HzNV-1 persistent infection is therefore more appropriately described as a latent infection, in which virus is usually undetectable and intermittent acute symptoms may occur in infected cells or tissues. During latency, insect cells are resistant to infection with homologous viruses (the same viruses), but the latent virus can be reactivated to produce a productive infection by infecting the cells with heterologous viruses (different viruses). After several serial, high-multiplicity passages of HzNV-1 in cell culture, defective interfering particles (DIPs) appear in the culture.
These particles contain smaller genomes, which might arise from deletions in the HzNV-1 genome. Interestingly, all 37 HzNV-1 proteins identified in virus with a complete genome are also synthesized by DIPs, but the relative rates of synthesis of many of these proteins are altered (thirteen are decreased and five are increased). The contribution of DIPs to the establishment of latency remains unknown. Temporal gene expression profiles have also been analyzed for HzNV-1. During productive infection, gene expression can be categorized into an early stage (0–2 hpi), an intermediate stage (2–6 hpi), and a late stage (after 6 hpi), with at least 101 transcripts expressed; during latent infection, there is only one detectable transcript, the persistency-associated transcript 1 (PAT1), which is expressed from the persistency-associated gene 1 (pag1).


Fig. 2 Bi-phasic infection of HzNV-1 in insect cells. After HzNV-1 enters the infected cell, the virus may be uncoated and enter the nucleus in the form of an unenveloped nucleocapsid. In the nucleus, the viral DNA may insert into the host chromosomal DNA or exist as episomes for latent infection. In this infection stage, viral particles are not detectable and only the pag1 gene expresses its transcript. Occasionally, HzNV-1 switches to productive infection in which viral DNA and proteins are abundantly produced. The enveloped, rod-shaped viral nucleocapsids are formed and filled with viral DNA in the nucleus. Viruses obtain their second envelope from the plasma membrane when they bud out from the cell. The budding eventually leads to the lysis of the infected cell.

Table 2 Comparison of nudivirus genomes

Virus    Genome size (bp)   No. of ORFs (a)   G + C content (%)   Gene density (kb per ORF)
HzNV-1   228,089            155               41.8                1.47
HzNV-2   231,621            113               42.0                2.05
OrNV     127,615            139               42.0                0.92
GbNV     96,944             98                28.0                0.99
PmNV     119,638            115               34.5                1.04
ToNV     145,704            131               25.5                1.11
DiNV     155,555            107               30.0                1.45
KV       152,388            95                38.9                1.60
HgNV     107,063            97                35.3                1.10

(a) ORFs: open-reading frames.
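
The final column of Table 2 can be checked against the genome sizes and ORF counts. The Python sketch below (the function and variable names are illustrative and not from the source) divides genome size in kilobase pairs by the number of ORFs. Under that assumption the tabulated values are reproduced, e.g., 228,089 bp / 155 ORFs ≈ 1.47 for HzNV-1, which indicates that the column reports the average number of kilobase pairs per ORF.

```python
# Genome size (bp) and ORF count for each virus, copied from Table 2.
TABLE_2 = {
    "HzNV-1": (228_089, 155),
    "HzNV-2": (231_621, 113),
    "OrNV": (127_615, 139),
    "GbNV": (96_944, 98),
    "PmNV": (119_638, 115),
    "ToNV": (145_704, 131),
    "DiNV": (155_555, 107),
    "KV": (152_388, 95),
    "HgNV": (107_063, 97),
}


def kb_per_orf(genome_bp, n_orfs):
    """Average genome length per ORF, in kilobase pairs, rounded to two decimals."""
    return round(genome_bp / 1000 / n_orfs, 2)


if __name__ == "__main__":
    for virus, (genome_bp, n_orfs) in TABLE_2.items():
        print(f"{virus}: {kb_per_orf(genome_bp, n_orfs):.2f} kb per ORF")
```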

Genome

General Features

Nudiviruses whose genomes have been fully sequenced are listed in Table 2. These viral genomes vary greatly in size, G + C content, and number of coding sequences, except for the genomes of HzNV-1 and HzNV-2, which share 94% sequence identity. The G + C contents of some nudivirus genomes, e.g., those of GbNV and ToNV, are particularly low, with the lowest being that of ToNV at 25.5%. Similar to other large dsDNA viruses, nudivirus genomes feature direct repeat sequences that are extremely AT-rich (up to 98%) and contain 2–3 copies of short reiterated sequences. However, these direct repeats are not homologous to each other within or between nudivirus genomes, or to those of other large dsDNA viruses. The number of direct repeats also differs among nudivirus genomes.


Table 3 The conserved baculovirus core genes in nudivirus genomes

Gene name                           Encoding product
dnapol                              DNA polymerase
helicase                            Helicase
p47, lef-4, lef-8, lef-9            RNA polymerase subunit
lef-5                               Late expression factor
vlf-1                               Very late expression factor
p74 (pif-0), pif-1, pif-2, pif-3,   per os infectivity factor (PIF)
  19 kDa (pif-4), odv-e56 (pif-5),
  ac68 (pif-6), vp91 (pif-7)
vp39                                Major capsid protein
38K                                 Viral phosphatase
p33                                 Sulfhydryl oxidase
ac81                                Unknown function

For instance, the HzNV-2 genome harbors six direct repeats, the ToNV genome has five, the GbNV genome contains 14, and the DiNV genome has 156 simple direct repeats that account for 5.1% of the genome. Homologous repeat regions composed of direct repeats but with an imperfect palindromic core sequence commonly exist in the genomes of baculoviruses. In contrast, no homologous repeats have been found in most sequenced nudivirus genomes, apart from in the ToNV genome that hosts several direct repeats containing imperfect palindromic cores that are reminiscent of homologous repeats.

Gene Content

Nudivirus genomes typically share a set of 30–32 common genes, the so-called “nudivirus core genes”. Although the membership of this core set is likely to be debated as more nudivirus genomes become available, currently 20 nudivirus core genes homologous to baculovirus core genes (Table 3) are present in the nudivirus genomes sequenced to date. These genes are responsible for different functions in both baculovirus and nudivirus life cycles. Their presence is evidence of the close relationship between nudiviruses and baculoviruses, but phylogenetic analyses of these genes also support their divergence (described in the “Phylogeny (Evolution of Nudiviridae and Other dsDNA Arthropod Viruses)” section). Among the 20 nudivirus core genes shared with baculoviruses, eight encode homologs of baculovirus per os infectivity factors (PIFs). PIFs are conserved among all baculoviruses and are essential for oral infection of insect hosts. Since many nudiviruses are also transmitted per os, the occurrence of baculovirus PIF homologs suggests that nudiviruses and baculoviruses share a conserved mode of infection. Genes encoding the four RNA polymerase subunits homologous to their baculovirus counterparts (P47, LEF-4, LEF-8, and LEF-9) are also present among the nudivirus core genes, as are genes for a late expression factor, LEF-5, and a very late expression factor, VLF-1. Thus, nudiviruses seem to have a baculovirus-like transcription apparatus and to transcribe late genes in a manner similar to baculoviruses. Aside from these 20 shared core genes, some nudiviruses carry a homolog of the baculovirus polyhedrin/granulin (polh/gran) gene. In baculoviruses, polh/gran encodes the protein that forms the occlusion body. ToNV obligatorily forms occlusion bodies, and its occlusion body protein is encoded by a baculovirus polh homolog. However, the gene encoding the major occlusion body protein of another occluded nudivirus, PmNV, does not share sequence similarity with baculovirus polh or gran. Whether the occlusion body proteins of other occluded nudiviruses are encoded by polh/gran homologs awaits investigation. Apart from genes encoding protein products, nudiviruses such as HzNV-1 and KV encode miRNAs. miRNAs are involved in the establishment of HzNV-1 latency and in the regulation of host histone modification. KV transcribes many small RNAs (21 nucleotides) during infection of D. melanogaster. These small RNAs may be the targets of Drosophila antiviral RNA interference (RNAi) responses.

Gene Regulation for Switching Productive and Latent Infections

The molecular mechanism responsible for switching between productive and latent infections has been analyzed for HzNV-1. A 6.2-kilobase early transcript encoded by the hhi1 gene reactivates viruses from latently infected cells. The hhi1 transcript has been detected very early during productive HzNV-1 infection, but it is not detectable during latent infection. When hhi1 is expressed in latently infected cells, most of the cells lyse and release high titers of viral progeny. In contrast, when hhi1 expression is suppressed (e.g., by siRNA), the number of latently infected cell clones increases dramatically. The expression of hhi1 transactivates the expression of viral early genes, which may contribute to switching the virus from latent to productive infection. Latency of HzNV-1 is also well regulated at the molecular level. During viral latency, the non-coding RNA PAT1 (transcribed from pag1) is the only detectable viral transcript. PAT1 is present throughout both latent and productive infections. Overexpression of PAT1 in productively infected cells promotes latent HzNV-1 infection. Interestingly, pag1 encodes at least two miRNAs that target the hhi1 coding region. Both of these miRNAs can reduce the amount of hhi1 transcript within cells to establish latent viral infection. The interactions between hhi1 and pag1 are illustrated in Fig. 3. During HzNV-1 infection, cells exhibiting high levels of hhi1 expression can tolerate the RNA degradation caused by the pag1 miRNAs and enter productive viral infection (Fig. 3(a)). However, if the expression level of hhi1 in a cell is lower, or if pag1 miRNAs are introduced, the hhi1 transcript is suppressed by pag1 via miRNA-mediated degradation and the virus enters latency (Fig. 3(b)). Reactivation of latent virus can be induced by introducing hhi1 transcripts, resulting in virus production (Fig. 3(c)).

Fig. 3 Model for the establishment of productive or latent HzNV-1 infection via hhi1 and pag1 interactions. (a) Infection with HzNV-1 results in high levels of hhi1 expression and a moderate level of pag1 expression. The abundant hhi1 transcript can tolerate the RNA degradation caused by the pag1 miRNAs and transactivates viral early genes, activating productive infection in most of the infected cells. (b) In a low percentage of cells, the level of hhi1 transcript is low or undetectable. Suppression of the hhi1 transcript by pag1 miRNAs allows the virus to enter latency. (c) Viruses in latently infected cells can be reactivated by introducing hhi1 transcript, which results in virus production and cell lysis.

Negative Impacts and Potential Applications

PmNV infection was first identified in Taiwanese shrimp in 1977 and has since been reported worldwide. PmNV infection was initially thought to have caused the collapse of the penaeid shrimp culture industry in Taiwan in 1988. Although subsequent analyses suggested that the collapse may have been due to environmental management or infection by other pathogens, PmNV is still closely linked to losses in many shrimp culture facilities in Taiwan and the Indo-Pacific region. To control PmNV infection, infected stocks need to be eradicated and the hatcheries must be disinfected using chemicals such as formalin, chlorine, and iodophor at sufficient concentrations. The use of OrNV as a biocontrol agent to suppress its beetle host O. rhinoceros is a classical but rare example of (initially) successful biological control. O. rhinoceros damaged both coconut and oil palms on many Pacific islands in the early 1900s. OrNV was first used in Samoan fields in 1967 to control this beetle. Symptomless adult beetles infected with the nudivirus acted as natural vectors for OrNV spread, and beetle populations were effectively kept under control for over 30 years. However, a new wave of invasive O. rhinoceros that was resistant to OrNV infection became established in 2007. Since resistant beetles had already been detected in 1989, it is unclear whether the OrNV biocontrol process catalyzed the emerging resistance in O. rhinoceros.


Parvoviruses of Invertebrates (Parvoviridae)1
Judit J Pénzes and Hanh T Pham, National Institute of Scientific Research – Armand-Frappier Health Research Centre, Laval, QC, Canada
Qian Yu, School of Life Sciences, Jiangsu University, Zhenjiang, China
Max Bergoin, National Institute of Scientific Research – Armand-Frappier Health Research Centre, Laval, QC, Canada
Peter Tijssen, National Institute of Scientific Research – Armand-Frappier Health Research Centre, Microbiology and Immunology, Laval, QC, Canada
© 2021 Elsevier Ltd. All rights reserved.

Glossary

Ambisense and monosense An ambisense single-stranded RNA virus (as in segmented negative-stranded viruses) is a virus in which both the −ve and +ve sense RNA segments encode proteins. For ambisense ssDNA parvoviruses, both the +ve ssDNA genome and its −ve copy are packaged and, after conversion to dsDNA in the recipient host, encode mRNAs and proteins. In contrast, for monosense ssDNA parvoviruses, genes are located only on the −ve strands, which are predominantly packaged.
APOBEC APOBEC (“apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like”) is a family of evolutionarily conserved cytidine deaminases. mRNA editing, as well as hypermutation, is a mechanism of generating protein diversity.
EVE-derived immunity (EDI) The widespread recruitment of endogenous viruses as regulatory elements for immune genes, which points to a systematic evolutionary process in their co-option for host immunity.
Monophyly Members of a monophyletic taxon are all descendants of one common ancestor. Modern taxonomy aims to establish exclusively monophyletic taxa. A good example is the vertebrate taxon of amniotes, which unifies mammals, birds and reptiles.
Paraphyly A taxon is considered paraphyletic if it excludes at least one lineage from its definition, even though the excluded lineage(s) clearly derive(s) from the same common ancestor as the lineages included in the defined taxon. A well-known example is the definition of reptiles as a taxon that does not include birds.
Polyphyly A taxon is polyphyletic if it includes lineages derived from a common ancestor that is itself excluded from the taxon definition. Warm-blooded animals are a good example, as this group unifies birds and mammals but excludes reptiles, although all three derive from the same reptile-like common amniote ancestor.

Introduction

Parvoviruses of invertebrates are named densoviruses (DVs, formerly densonucleosis viruses or DNVs). However, the characteristics that demarcate them from vertebrate parvoviruses (PVs) have become increasingly blurred with the discovery of a third lineage (now classified as subfamily Hamaparvovirinae), which encompasses both vertebrate and invertebrate parvoviruses and is related to several endogenous viral elements (EVEs) identified as integrations in the genomes of members of all major arthropod lineages. Linear single-stranded DNA (ssDNA) virus families of animals are relatively scarce because: (1) ssDNA cannot be transcribed to yield translation products such as a viral DNA polymerase; (2) linear ssDNA cannot, in contrast to linear single-stranded RNA, be replicated without loss of genome termini (corresponding to Okazaki fragments), and hence of genomic integrity, unless complex hairpin structures or a protein-primed mechanism are introduced to create double-stranded DNA first; (3) the absence of a complementary DNA strand as a repair template increases the mutation rate and therefore limits the maximum genome length (2–12 kb, with very few exceptions) to that of RNA viruses. PVs and DVs have small icosahedral capsids, 18–27 nm in diameter, with a limited packaging capacity. Although most DVs are highly pathogenic, some are essential for the life cycle of their hosts. Infection with Dysaphis plantaginea densovirus (DplDV) is required to produce the winged morph in asexual clones of the rosy apple aphid and for colony dispersal to neighboring plants. In this mutualistic relationship, DplDV has a negative impact on rosy apple aphid reproduction but contributes to the survival of the aphids by inducing wing development and promoting dispersal. The ssDNA viruses are, apart from the bidnaviruses, exceptionally widespread in all domains of life despite the limited number of virus families. Metagenomic studies have dramatically boosted our knowledge of the distribution of ssDNA viruses in the biosphere, from the human gut to hot springs. The diversity of ssDNA viruses appears to be determined by two principal factors: extremely high nucleotide substitution rates that approach those of RNA viruses, facilitating adaptation to different environments, and pervasive recombination involving both DNA and RNA donors.

1 This article is dedicated to Prof. Michael Rossmann for his essential contributions to the structure of arthropod densoviruses and to Prof. Su De Ming, a pioneer of parvovirology in China. Both forerunners passed away while this article was in preparation.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00009-6


Discovery, Taxonomy and Evolution of Densoviruses; of Polyphyly, Paraphyly and its Tangled Relationship With Vertebrate-Infecting Parvoviruses

DVs are defined as small icosahedral viruses containing a linear, monopartite ssDNA genome with terminal hairpins, which facilitate rolling hairpin replication (RHR). Although DV genomes are monopartite, in the ambidensoviruses both the + and − ssDNA strands are coding, in an ambisense strategy. The first DV was discovered in the early 1960s, when an epizootic decimated Galleria mellonella L. larvae that were being mass-reared as fishing bait in France. Electron microscopic examination of thin sections from diseased larvae revealed hypertrophied nuclei containing an electron-dense viroplasm with thousands of tiny viral particles. After purification, these particles were observed to be small, isometric, nonenveloped virions, 22–25 nm in diameter, and biochemical analysis showed that they contained single-stranded DNA. The virus causing the disease was designated densonucleosis virus, a name later simplified to densovirus (DV). Diseases presenting the same type of histopathological features, and caused by the same type of viruses, were later described not only in Lepidoptera but also in other invertebrates. The cloning and sequencing of many DV genomes during the last two decades, while confirming their relatedness to the vertebrate parvoviruses, revealed an extraordinary diversity, very likely reflecting the evolutionary diversity of their hosts. The current Parvoviridae taxonomy reflects the actual evolutionary relationships among family members, regardless of host affiliation. Nevertheless, members of the family that infect vertebrates are still referred to as parvoviruses, and those that infect invertebrates are traditionally united under the densovirus designation, even though classifying the family in this way would introduce polyphyly.
In contrast to most DNA viruses, DVs lack a DNA polymerase gene and use a pleiotropic non-structural protein 1 (named NS1 or Rep) for RHR. It became apparent that, throughout the Parvoviridae family, the superfamily 3 (SF3) helicase domain of NS1 is the only motif conserved at the protein sequence level; hence, NS1-based phylogeny has served as a basis of classification since 2014. This method initially conformed with the segregation of the family into subfamilies Parvovirinae and Densovirinae (sensu lato; referring to all invertebrate parvoviruses at the time), following the classification based on vertebrate versus invertebrate host affiliation. However, the discovery of the first members of the current invertebrate-infecting genera Brevihamaparvovirus, Penstylhamaparvovirus, and Hepanhamaparvovirus revealed a diversity among densoviruses that was unmatched by the rather well-conserved vertebrate-infecting Parvovirinae. Members of the latter share NS1 and VP proteins that are clearly homologous, as shown by detectable sequence similarity along essentially the entire length of these proteins. In contrast, densoviruses could be linked together only by the detectable sequence similarity of the above-mentioned short SF3 helicase domain (approx. 200 amino acids (aa) long). If only the helicase domain is considered, however, certain densoviruses, such as hepanhamaparvoviruses, show the same degree of aa sequence similarity to certain other densoviruses as they do to members of the vertebrate Parvovirinae. Since 2012, parvoviruses of a novel, divergent lineage, unified under the name “chapparvovirus”, have frequently been detected in association with kidney and liver tissue, as well as with various secretions (such as blood) and excretions (in outbreaks of enteritis) of vertebrates.
Unexpectedly, phylogenetic evidence revealed a close relationship between these vertebrate-infecting parvoviruses and invertebrate-infecting densoviruses, such as members of the genera Hepanhamaparvovirus, Penstylhamaparvovirus, and Brevihamaparvovirus. To further complicate the situation, EVEs evidently derived from ancient members of the “chapparvovirus” lineage, previously believed to be yet another new genus of Parvovirinae, have been detected in arachnid, chilopod, entognath and coleopteran arthropod genomes. These findings led to the establishment of the first Parvoviridae subfamily of mixed host affiliation, designated Hamaparvovirinae. Although the subfamily itself is well supported, Hamaparvovirinae still appears to be the most heterogeneous of the three subfamilies: the relationships between its three major clades (Penstylhamaparvovirus-Brevihamaparvovirus; Hepanhamaparvovirus and affiliated unclassified relatives; and Ichthamaparvovirus and Chaphamaparvovirus) have low support, and the clades lack homologous capsid protein sequences (Fig. 1). This could suggest that Hamaparvovirinae may not be long-lived once novel parvoviruses related to these three clades are discovered and provide more insight into the evolutionary relationships within this diverse group. Back in 2014, the unification of several ambisense densovirus genera into the large and diverse genus Ambidensovirus created another major, unforeseen problem. Even then, Ambidensovirus did not conform to the demarcation criteria established at the time, although the ambisense genome organization proved to be a strong enough argument to support its existence. Since then, however, multiple densoviruses have been described that eventually rendered Ambidensovirus paraphyletic rather than monophyletic, as some members are more closely related to monosense iteradensoviruses than they are to other ambidensoviruses.
This suggests that the orientation of the genome coding order is subject to plasticity and that either the monosense or the ambisense strategy could probably have been “re-invented” on multiple occasions (Fig. 1). The selective constraints driving this conversion, however, are not well understood. Today, to reflect the diversity of the clade and to avoid the above-mentioned paraphyly problem, ambisense DVs are segregated into seven genera, namely Miniambidensovirus, Protoambidensovirus, Aquambidensovirus, Scindoambidensovirus, Hemiambidensovirus, Blattambidensovirus, and Pefuambidensovirus. Despite the divergence in genome organization, members of these seven genera and of Iteradensovirus share several features with each other but not with members of any other genus in the Parvoviridae: (1) a shared NS1-based and helicase domain-based phylogeny; (2) VP proteins that appear to derive from a common ancestral VP protein gene, and similar capsid structures; (3) at least 32% identity between the NS1 proteins of any itera- or ambidensovirus, but only much lower identities with other densoviruses; (4) possession of a phospholipase A2 (PLA2) domain in their largest minor capsid protein sequence. Based on these criteria, ambisense DVs and monosense iteradensoviruses are now united in a subfamily of their own, subfamily Densovirinae. Given the large number of EVEs discovered in various invertebrate genomes that appear to be closely related to this clade, and considering the current pace of virus discovery, the current eight genera are expected to be joined soon by several new ones. The current taxonomy of DVs is summarized in Table 1.

Fig. 1 Bayesian phylogenetic inference of family Parvoviridae based on the SF3 helicase domain. For easier visibility, subfamily Parvovirinae has been collapsed. Each tip label represents a separate species of its assigned genus (though murine chapparvovirus is the same species as mouse kidney parvovirus). The current (2020) taxonomy is presented as branch labels; subfamily-level taxonomy is presented to the right. Blue branches indicate lineages that lack the phospholipase A2 domain in their large minor structural proteins. Branches presented in multiple colors correspond to genera introduced by the 2020 taxonomy to resolve the former paraphyletic situation of ambisense densoviruses, previously classified as one genus. Ambisense genome organization is indicated by two small arrows facing toward each other. The host spectrum of each species or subfamily is indicated by the silhouette of the host animal. The reliability of the topology is indicated by posterior probability values, represented by the circles at the nodes: the larger and redder a circle, the higher the support for the given node, with bright, large red circles indicating maximal support. This figure was created with BEAST v1.10.4.
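The NS1 identity criterion mentioned above (at least 32% identity between the NS1 proteins of any itera- or ambidensovirus) can be illustrated with a minimal Python sketch that computes percent identity from two pre-aligned protein sequences. The function names, the choice to skip gapped columns, and the toy sequences are assumptions made for illustration; the source does not specify how identity was calculated, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must already be aligned (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # assumption: gapped columns are excluded
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_criterion(aligned_a: str, aligned_b: str,
                             threshold: float = 32.0) -> bool:
    """True if identity reaches the threshold (32% is the value in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy aligned fragments, not real NS1 sequences.
    print(percent_identity("MKV-LA", "MKIQLA"))
    print(meets_identity_criterion("AAAA", "AKKK"))
```

In practice the alignment itself would come from a dedicated alignment program, and the result depends on the alignment parameters chosen.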

Biology, Pathology and Host Range

So far, DVs have been found in almost all major clades of the phylum Arthropoda, including arachnids, diplopods, decapod crustaceans and hexapods. Among the latter two, six different insect orders, as well as penaeid shrimps (family Penaeidae), crabs, and a crayfish (Cherax quadricarinatus), are known to be infected by DVs. Recent studies have also recovered DVs from oysters, members of the vast and diverse phylum Mollusca. The DV host spectrum also spans deuterostome invertebrates, with DVs derived from sea stars and sea urchins (phylum Echinodermata). Moreover, metagenomic approaches have recovered DVs from eumetazoan animals even as primitive as cnidarians, while the presence of endogenous DV-like elements in nematodes and platyhelminths implies that the DV host spectrum might, in fact, span most eumetazoan invertebrate phyla. The DVs that have been characterized thoroughly were derived mainly from insects (orders Lepidoptera, Diptera, Hymenoptera (e.g., the red imported fire ant), Blattodea, Hemiptera and Orthoptera), but also from decapod crustaceans (giant tiger prawn, crayfish) and echinoderms (starfish and sea urchins).

Table 1 Master list of densovirus species, with acronyms and GenBank accession numbers. Former classification (Cotmore et al., 2019): row group 1, Ambidensovirus; 2, Brevidensovirus; 3, Hepandensovirus; 4, Iteradensovirus; 5, Penstyldensovirus. References are to be found in the GenBank listings. Revision of the family, accepted in 2020 (Penzes et al., 2020): subfamily a, Densovirinae, with ambidensoviruses of the former genus Ambidensovirus (i) and genus Iteradensovirus (ii); subfamily b, Hamaparvovirinae, with genera Brevihamaparvovirus (x), Hepanhamaparvovirus (y) and Penstylhamaparvovirus (z). The 4th column contains the names of the proposed genera of ambisense densoviruses. Row group labels give the former and current taxonomy.

Virus name                                Abbreviation  GenBank accession #  Genus of ambisense DVs

1.a.i
Periplaneta fuliginosa densovirus         PfDV          AF192260             Pefuambidensovirus
Blattella germanica densovirus 1          BgDV1         AY189948             Blattambidensovirus
Culex pipiens densovirus                  CpDV          FJ810126             Protoambidensovirus
Planococcus citri densovirus              PcDV          AY032882             Scindoambidensovirus
Diatraea saccharalis densovirus           DsDV          AF036333             Protoambidensovirus
Galleria mellonella densovirus            GmDV          L32896               Protoambidensovirus
Helicoverpa armigera densovirus           HaDV1         JQ894784             Protoambidensovirus
Junonia coenia densovirus                 JcDV          S47266               Protoambidensovirus
Mythimna loreyi densovirus                MlDV          AY461507             Protoambidensovirus
Pseudoplusia includens densovirus         PiDV          JX645046             Protoambidensovirus
Acheta domesticus densovirus              AdDV          HQ827781             Scindoambidensovirus
Acheta domesticus minidensovirus          AdMDV         KF275669             Miniambidensovirus
Acheta domesticus segmented densovirus    AdSDV         -                    -
Asteroid densovirus                       SsaDV         PRJNA253121          Aquambidensovirus
Cherax quadricarinatus densovirus         CqDV          KP410261             Aquambidensovirus
Solenopsis invicta densovirus             SiDV          SRX023381            Scindoambidensovirus
Myzus persicae densovirus 1               MpDV1         AY148187             Hemiambidensovirus
Dysaphis plantaginea densovirus           DplDV         EU851411             Hemiambidensovirus

2.b.x
Aedes aegypti densovirus 1                AaeDV         M37899
Aedes albopictus densovirus 1             AalDV1        AY095351
Culex pipiens pallens densovirus          CppDV         EF579756
Anopheles gambiae densovirus              AgDV          EU233812
Aedes aegypti densovirus 2                AaeDV2        FJ360744
Aedes albopictus densovirus 2             AalDV2        X74945
Aedes albopictus densovirus 3             AalDV3        AY310877
Haemagogus equinus densovirus             HeDV          AY605055

3.b.y
Penaeus monodon hepandensovirus 1         PmoHDV1       NC_007218
Penaeus chinensis hepandensovirus         PchDV         AY008257
Penaeus monodon hepandensovirus 2         PmoHDV2       EU247528
Penaeus monodon hepandensovirus 3         PmoHDV3       EU588991
Penaeus merguiensis hepandensovirus       PmeDV         DQ458781
Penaeus monodon hepandensovirus           PmoHDV4       FJ410797
Fenneropenaeus chinensis hepandensovirus  FchDV         JN082231

4.a.ii
Bombyx mori densovirus                    BmDV1         AY033435
Casphalia extranea densovirus             CeDV          AF375296
Sibine fusca densovirus                   SfDV          JX020762
Dendrolimus punctatus densovirus          DpDV          AY665654
Papilio polyxenes densovirus              PpDV          JX110122
Helicoverpa armigera densovirus           HaDV2         HQ613271

5.b.z
Penaeus stylirostris penstyldensovirus 1  PstDV1        AF273215
Penaeus monodon penstyldensovirus 1       PmoPDV1       GQ411199
Penaeus monodon penstyldensovirus 2       PmoPDV2       AY124937
Penaeus stylirostris penstyldensovirus 2  PstDV2        GQ475529


However, our knowledge of the true extent of the DV host spectrum is still limited. Invertebrates comprise multiple phyla and constitute over 1 million species. Moreover, some species, e.g., the common house cricket, are infected by at least four different ssDNA viruses. Phylogenetic analysis revealed that densoviruses may be as diverse as their hosts; therefore, a definitive classification is not yet possible (Fig. 1). Hypertrophied nuclei that stain intensely with Feulgen reagent are a salient feature shared by all densovirus-infected tissues characterized to date. Thin sections of heavily infected cells observed by EM show hypertrophied nuclei filled with a virogenic stroma containing thousands of virions, as well as cytoplasmic paracrystalline virion arrays.

General Features

Most of the DVs isolated so far are lethal for their natural hosts. The first symptoms are anorexia and lethargy, followed by flaccidity; progressive paralysis, slow melanization and death then follow. The smokybrown cockroach (Periplaneta fuliginosa), infected by PfDV, displays characteristic hind leg paralysis, uncoordinated movements and a hypertrophied abdomen. The oil palm pests Casphalia extranea and Sibine fusca (in Ivory Coast and Colombia, respectively) display tumor lesions in the gut when infected with CeDV and SfDV. Infections are highly contagious and spread rapidly, leading to severe epizootics both in noxious insects, such as lepidopteran defoliators, mosquito larvae, aphids and cockroaches, and, unfortunately, in mass-rearing facilities for commercially significant invertebrates, such as Bombyx mori in sericulture farms, crickets in cricket farms and shrimp in industrial shrimp farms. A densovirus is the causative agent of the recent extensive outbreak of sea-star wasting disease (SSWD). Owing to their high pathogenicity, DVs have been successfully used to control Limacodidae larvae in oil palm plantations. Most DVs are polytropic and infect the fat body, hypodermis, central nervous system, silk gland, muscular membrane, tracheal cells, Malpighian tubules, foregut, hindgut, hemocytes, ovaries, and molting glands, but not the midgut. However, some iteradensoviruses, such as BmDV (B. mori), CeDV (Casphalia extranea) and SfDV (Sibine fusca), multiply predominantly in the midgut. BmBDV is not a densovirus but a bidnavirus, which causes fatal flacherie disease in the silkworm; in contrast to most parvo-like densoviruses, but similarly to BmDV, it replicates only in midgut columnar cells. B. mandarina and B. mori are considered to have a common ancestor, and both (at least native strains) are susceptible to BmBDV. Many Japanese strains acquired BmBDV resistance through improved breeding. These resistant strains became susceptible to the parvovirus-like BmDV. The bidnavirus BmBDV is therefore likely the native pathogen of Bombyx, whereas BmDV is likely to have originated as a pathogen of other insects. The pyralid Glyphodes pyloalis, a pest of mulberry plantations on sericultural farms, is also susceptible to BmDV and is probably its original host. The host range of the different DVs varies considerably. Some, like GmDV of Galleria mellonella and CeDV, are restricted to their original hosts; others, like AdDV of Acheta domesticus and PfDV of P. fuliginosa, are restricted to closely related hosts; and still others, e.g., Junonia coenia densovirus (JcDV), Mythimna loreyi densovirus (MlDV) and the mosquito DVs, infect many different host species. Although DVs that infect the midgut have a restricted host range, the inverse does not hold for the polytropic viruses, which do not necessarily have a broad host range. Mosquito larvae infected with mosquito DVs exhibit symptoms of paralysis. Mosquito cells cultured in vitro often contain DVs that, despite the lack of cytopathic effects (CPE) in culture, are pathogenic for mosquito larvae by per os infection. The Ld652 cell line derived from the gypsy moth (Lymantria dispar) also seems to be contaminated with a DV despite the lack of a CPE. Iteradensoviruses replicate essentially in the columnar cells of the midgut epithelium. Hepandensoviruses (genus Hepanhamaparvovirus) of penaeid shrimps infect hepatopancreatic tubule epithelial cells, in which infection is characterized by basophilic intranuclear inclusions. The penstyldensoviruses (genus Penstylhamaparvovirus) are cosmopolitan and infect different species of several penaeid shrimp genera. Densoviruses do not infect vertebrates or cells derived from vertebrates, nor was any pathogenic effect observed after inoculation of rabbits or mice with these viruses. However, their ability to integrate into the host chromosome has been exploited for stable expression of foreign proteins in insect cells and somatic transformation of insects.

Shrimp Densoviruses Two groups (genera) of shrimp densoviruses were identified as pathogens responsible for economically significant and virulent disease in farmed shrimp: Penaeus stylirostris penstyldensovirus 1 (PstDV1), formerly known as infectious hypodermal and hematopoietic necrosis virus, and Fenneropenaeus chinensis hepandensovirus (FcHDV), both now members of the Hamaparvovirinae subfamily. They are widespread in nature but are no longer a major economic problem because tolerant shrimp populations have been developed.

Penaeus stylirostris penstyldensovirus 1 (PstDV1) Penaeus stylirostris penstyldensovirus 1 (PstDV1) (genus Penstylhamaparvovirus) was originally discovered in cultured blue shrimp (Penaeus stylirostris) in Hawaii in 1981. Formerly known as infectious hypodermal and hematopoietic necrosis virus (IHHNV), PstDV1 causes acute infection with up to 100% mortality in P. stylirostris. In Pacific white shrimp (P. vannamei) and giant tiger prawn (P. monodon), infection is mostly chronic and results in stunted growth and runt deformity syndrome (RDS) with wrinkled antennae, cuticular deformities of the rostrum and deformed sixth abdominal segments. PstDV1 has been reported to infect various decapod species of the family Penaeidae, in both wild and cultured shrimps worldwide. Only P. merguiensis seems to be refractory to PstDV1 infection. PstDV1 infection occurs in all stages of the shrimp life cycle, including eggs, larvae, postlarvae, juveniles, and adults. Histologic and in situ hybridization analyses showed Cowdry type A inclusion bodies in ectodermal and mesodermal tissues of penaeids such as gills, cuticular epidermis, lymphoid organs, hematopoietic tissues, antennal glands and connective tissues. PstDV1-related EVE sequences, lacking telomeres, have been detected in the germline of P. monodon in Australia, Madagascar and Thailand. These would likely have been eliminated unless they provided beneficial effects, such as EVE-derived immunity (EDI, e.g., mediated by TRIM5α and APOBEC).

Fenneropenaeus chinensis hepandensovirus (FcHDV) FcHDV, previously named hepatopancreatic parvovirus (HPV) (genus Hepanhamaparvovirus), was first reported in banana shrimp, P. merguiensis, and in Indian prawns, P. indicus, in Singapore. FcHDV was later found to infect P. chinensis, P. monodon and many other penaeid species in Asia, Australia, America, and Africa. The virus was also reported in giant freshwater prawn. Infected shrimps do not always show diagnostic gross signs of FcHDV disease, but infection is associated with reduced growth rate in juveniles and can cause chronic mortalities at early larval and postlarval stages. FcHDV-infected shrimps appear to be more susceptible to other viral disease agents such as White Spot Syndrome Virus (WSSV) or Penaeus monodon nudivirus (previously, and erroneously, called Monodon Baculovirus (MBV)). Target tissues for viral replication include the hepatopancreas system and the tubule epithelial cells of the digestive gland. Transmission of PstDV1 and FcHDV can be either horizontal or vertical. Horizontal transmission is by cannibalism or by water contamination and vertical transmission is via infected eggs. Multiple infection with different viruses is commonly found in cultured shrimps.

New invertebrate DVs Sea stars inhabiting the Northeast Pacific Coast and infected with sea star-associated densovirus (SSaDV) have recently experienced an extensive outbreak of wasting disease, leading to their degradation and disappearance from many coastal areas. The signs of the disease are behavioral changes, lesions, loss of turgor, limb autotomy, and death characterized by rapid degradation. SSaDV could be detected in plankton, sediments and nonasteroid echinoderms, providing a possible mechanism for viral spread. SSaDV was detected in museum specimens of asteroids from 1942, suggesting that it has been present on the North American Pacific Coast since at least that time. Since June 2013, millions of sea stars (asteroids) off the west coast of North America have wasted away into slime and ossicle piles due to the disease known as sea-star wasting disease (SSWD). SSaDV is related to densoviruses in the Hawaiian sea urchins (Echinoidea) Colobocentrotus atratus, Echinometra mathaei, and Tripneustes gratilla, which places it in the same species as other DVs of echinoderms. Cherax quadricarinatus densovirus (CqDV), the causative agent of an epizootic in the freshwater crayfish Cherax quadricarinatus in Australia, has a large, ambisense genome of 6334 nt. It is closely related to the marine sea star densovirus (75% identity) and is the first ambidensovirus to be found in decapod crustaceans. Mortalities after infection with CqDV occurred over 4 weeks, with up to 96% cumulative mortality in two earthen ponds stocked with juveniles. The crayfish were weak, anorexic and lethargic. A transmission trial was conducted using a filtered, cell-free extract prepared from infected crayfish as inoculum. The disease was reproduced, with ongoing mortalities occurring in inoculated crayfish over 55 days. Experimentally inoculated crayfish showed gross signs of malaise, anorexia, and disorientation before dying.
Two types of intranuclear inclusion bodies (INIBs) were seen by light microscopy in hematoxylin and eosin (H&E)-stained sections of tissues of endodermal, ectodermal and mesodermal origin. Using metagenomics, densoviral nucleic acids were identified in oysters (Crassostrea ariakensis) and designated CaaDV1 and CaaDV2; the nearly complete genomes of both were assembled (5860 nucleotides (nt) and 4034 nt, respectively). Both genomes displayed an ambisense organization but were divergent from the existing densoviral species. The NS1 proteins of the two CaaDVs shared 43.3%–61.5% amino acid identity with those of SSaDV and CqDV and clustered with them in phylogenetic analyses. The CaaDVs are potential new members of genus Aquambidensovirus, accompanying the echinoderm DVs and CqDV, all of aquatic host origin. Solenopsis invicta densovirus (SiDV) was the first DNA virus discovered in fire ants from Argentina and is related to AdDV and PcDV. The red fire ant, Solenopsis invicta (Buren), was introduced into the United States and, owing to the paucity of natural enemies, is estimated to cause damage of about 6 billion USD annually. SiDV causes a high mortality rate and is a prime candidate for biological control. Recently, a metagenomic study derived partial DV genomes from several novel invertebrate hosts belonging to various, distant arthropod orders. All sequences encompassed at least the complete NS1-encoding open reading frame (ORF). Myriapoda DV 1 and 2 are the first DVs detected in diplopods and appear to be closely related to mosquito-infecting brevidensoviruses (genus Brevihamaparvovirus). Perisesarma bidens hepandensovirus could be a member of the Hepanhamaparvovirus genus, the first that is associated not with penaeid shrimps but with another decapod species, the red-clawed crab (Perisesarma bidens).
The partially sequenced genome of Tetragnatha maxillosa DV which is derived from the long-jawed orb weaver (Tetragnatha maxillosa), suggests it is a novel arachnid DV, probably a divergent member of the Densovirinae subfamily.
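
The identity values quoted above (for example, the NS1 comparisons between the CaaDVs, SSaDV and CqDV) are percentages of matching positions in a pairwise alignment. The following is a minimal sketch of that calculation, assuming two sequences that have already been aligned to equal length with "-" marking gaps. The example fragments are invented for illustration and are not real NS1 sequences, and the choice of denominator (columns in which neither sequence has a gap) is an assumption; published values may follow a different convention.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
seq_a = "MKT-LVGA"
seq_b = "MRTALV-A"
print(round(percent_identity(seq_a, seq_b), 1))
```

In this toy alignment six columns are ungapped and five of them match, so the function reports roughly 83%.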

Biochemical Properties and Purification of Densoviruses Infectious particles of DVs are stable over a broad pH range, from 3 to 9, in various buffers, and at 56 °C for 60 min, except for GmDV, which rapidly loses infectivity when the temperature is raised above 56 °C. DVs therefore remain infectious and survive for a long period in the environment. Purified virus stocks can be stored at −20 °C in Tris-Cl buffer, pH 7.6, for years without significant loss of infectivity. The virions lack lipids and the buoyant density of DVs in CsCl gradients is in the range 1.4–1.44 g cm⁻³; reported values can differ slightly depending on the techniques or buffers used. The major band at 1.39–1.42 g cm⁻³ contains mostly infectious particles, although a minor band of slightly higher density, at 1.45–1.47 g cm⁻³, is also infectious. Particles banding from 1.31 to 1.37 g cm⁻³ are empty or contain incomplete genomes (defective particles). The sedimentation coefficient of a DV is about 110 S. The virion contains 25% DNA and 75% protein and a low percentage of polyamines, such as spermidine. Purification of DVs includes three steps: extraction of virus from infected tissues, virus concentration, and final purification. Virus-containing tissues are lysed by several freeze-thaw cycles and by homogenization in a low ionic strength buffer at pH 6–7. Compounds such as phenylthiourea, 20 mM sodium ascorbate and EDTA are added to reduce melanization. Final steps include differential and isopycnic centrifugation.
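
The banding pattern described above can be summarized as a simple lookup. The sketch below is illustrative only: the function name and return labels are hypothetical, the density windows are copied from the text, and, as the text notes, measured values shift with technique and buffer, so the boundaries should not be read as sharp cut-offs.

```python
def classify_cscl_band(density_g_per_cm3: float) -> str:
    """Map a CsCl buoyant density to the particle type described in the text."""
    d = density_g_per_cm3
    if 1.31 <= d <= 1.37:
        return "empty or defective particles"
    if 1.39 <= d <= 1.42:
        return "major band, mostly infectious particles"
    if 1.45 <= d <= 1.47:
        return "minor band, infectious particles"
    # Densities between or outside the quoted windows are left unassigned.
    return "outside the ranges quoted in the text"


for density in (1.34, 1.40, 1.46, 1.20):
    print(density, "->", classify_cscl_band(density))
```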

Structural Features of Virions The metastable capsid is an essential part of the infectious virion and has multiple functions. These include protecting and isolating the viral nucleic acid from its environment, delivering the viral genome to its replication site, and interacting with the host environment while executing the previous two functions. Like their vertebrate-infecting counterparts, densoviruses possess non-enveloped capsids of 21–25 nm in diameter with T = 1 icosahedral symmetry. The capsid is composed of 60 monomers; each monomer, which constitutes one asymmetric unit, interacts with the others at the twofold, threefold and fivefold symmetry axes when incorporated into the viral particle. The structural protein-encoding ORF, also known as cap, can express up to four different VPs of varying lengths, which all share a common C-terminal region. Some densoviruses possess a split VP gene, which requires alternative splicing in order to be expressed (detailed later). In the case of penstyldensoviruses only one VP of 37 kDa is expressed, so that all 60 subunits are identical. This implies that each monomer can execute the same function throughout the viral life cycle of PstDV. Nevertheless, in general the smallest VP, comprising the common C-terminal region, is expressed at a higher level and is therefore considered the major VP. The larger, less abundant VPs are N-terminally extended variants of the major VP, which contain multiple domains and motifs important in the viral life cycle, such as the PLA2 domain, a calcium-binding domain, and nuclear localization and export signals. Owing to the common C-terminus, the larger VPs are also incorporated into the capsid, albeit at low copy number. In different DVs the size of the major VP varies between 37 and 70 kDa.
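
The figure of 60 monomers follows from the Caspar-Klug relation, in which an icosahedral capsid with triangulation number T = h² + hk + k² contains 60T subunits. The sketch below works through that arithmetic. The function names are illustrative, and the mass estimate assumes a capsid built from a single VP species, as described above for the 37 kDa VP of the penstyldensoviruses; it would not hold for capsids that incorporate several VPs of different sizes.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def subunit_count(h: int, k: int) -> int:
    """Number of capsid subunits, 60 * T."""
    return 60 * triangulation_number(h, k)


def capsid_protein_mass_kda(vp_mass_kda: float, h: int = 1, k: int = 0) -> float:
    """Total protein mass, assuming every subunit is the same VP species."""
    return subunit_count(h, k) * vp_mass_kda


print(subunit_count(1, 0))           # T = 1 capsid
print(capsid_protein_mass_kda(37))   # 60 copies of a 37 kDa VP, in kDa
```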
In contrast to members of the Parvovirinae, for which numerous capsid structures have been determined, only four crystal structures are available for densoviruses, along with two low-resolution structures determined using cryo-EM (Table 2). Two of the high-resolution structures, those of GmDV and AdDV, were determined for DNA-packaged (full) infectious virions, of which only AdDV displayed three ordered pyrimidine bases at the luminal surface of the 3-fold symmetry axis. This ordering is unexpected given the lack of icosahedral symmetry of the packaged genome. As in the Parvovirinae VP structures, a significant portion of the N-terminal region of the major capsid VP is disordered in DVs. Interestingly, the BmDV1 structure currently represents the only parvovirus VP structure in which the last 40 C-terminal residues are also disordered. The parvovirus capsid is composed of monomers with a single jelly roll core. This highly conserved structural fold consists of eight β-strands arranged in two four-stranded antiparallel β-sheets packed across a hydrophobic interface, with the strands traditionally labeled from B to I. When folded, alphabetically adjacent strands lie in opposite sheets, so that a BIDG sheet and a CHEF sheet are formed, linked by loops that create the often surface-exposed variable regions (VRs). For the Parvoviridae the BIDG sheet is complemented by an N-terminal fifth strand, designated βA. In the GmDV VP structure, which is the prototype for the Densovirinae, the EF and GH loops are further divided into five and four sub-loops, respectively (Fig. 2). While the GH loop is the longest and forms most of the surface features, it is significantly shorter than the corresponding loop in Parvovirinae members, at 97 aa compared to 226 aa in canine parvovirus. As in the Parvovirinae, the GH loop is the most variable loop among ambidensoviruses.
At the twofold symmetry axis, all parvoviruses harbor an α-helix (αA), although densoviruses also contain a second α-helix within the EF loop. PstDV also possesses a third helix in the CD loop and displays significantly shorter EF and GH loops than any of the other known densoviral capsid structures (Fig. 2).

Table 2 Densoviral capsid structures resolved to date

Virus    Empty/Full   Structure determination method   Resolution (Å)   PDB-ID   Ordered residues of total (major capsid protein)
AdDV     Full         X-ray crystallography            3.5              4MGU     23–418 of 418 (VP4)
GmDV     Full         X-ray crystallography            3.6              1DNV     22–437 of 437 (VP4)
JcDV     Empty        Cryo-EM                          8.7              N/A      N/A
AalDV2   Full         Cryo-EM                          15.6             N/A      N/A
BmDV1    Empty        X-ray crystallography            3.1              3P0S     43–454 of 494 (VP3)
PstDV    Empty        X-ray crystallography            2.5              3N7X     31–329 of 329 (VP1)
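
The extent of terminal disorder discussed in the text can be read directly from the residue ranges listed in Table 2. The sketch below, with illustrative names, computes the number of unobserved N- and C-terminal residues and the ordered fraction for the four crystal structures. It assumes that each ordered range is contiguous, which the table implies but does not state.

```python
from typing import NamedTuple, Tuple


class OrderedRange(NamedTuple):
    first: int   # first ordered residue
    last: int    # last ordered residue
    total: int   # length of the major capsid protein


# Residue ranges transcribed from Table 2 (crystal structures only).
TABLE_2 = {
    "AdDV": OrderedRange(23, 418, 418),
    "GmDV": OrderedRange(22, 437, 437),
    "BmDV1": OrderedRange(43, 454, 494),
    "PstDV": OrderedRange(31, 329, 329),
}


def disordered_termini(r: OrderedRange) -> Tuple[int, int]:
    """Return (N-terminal, C-terminal) counts of residues not seen in the structure."""
    return r.first - 1, r.total - r.last


def ordered_fraction(r: OrderedRange) -> float:
    """Fraction of the protein that is ordered, assuming a contiguous range."""
    return (r.last - r.first + 1) / r.total


for virus, r in TABLE_2.items():
    n_term, c_term = disordered_termini(r)
    print(f"{virus}: N-term {n_term}, C-term {c_term}, ordered {ordered_fraction(r):.2f}")
```

For BmDV1 this gives 40 unobserved C-terminal residues, in agreement with the text, whereas the other three structures are ordered through to the last residue.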


Fig. 2 Densovirus VP structure. (A) Cartoon ribbon diagrams of the ordered common VP structures of Galleria mellonella densovirus (GmDV) (top), and Acheta domestica densovirus (AdDV), Bombyx mori densovirus (BmDV1), and Penaeus stylirostris densovirus (PstDV) (bottom). The first ordered N-terminal residue and last C-terminal residue are labeled. The conserved β-core and αA helix are colored in black and labeled in GmDV. Loops and subloops within the major loops are labeled by the major loop name and a number, given according to their order in GmDV. The approximate fivefold symmetry axis is marked by a pentagon, the threefold by a triangle, and the twofold by an ellipse. (B) The GmDV VP structure (Ambidensovirus, proposed Protoambidensovirus) superimposed on the VPs of AdDV (left), BmDV1 (middle), and PstDV (right). Diversity in the surface loops is evident.

An important and differentiating feature of the densoviral VP is the domain-swapped conformation observed at its N-terminus (Fig. 3). The densoviral βA strand is a direct extension of the βB strand and interacts, via hydrogen bonds, metal ions, or hydrophobic interactions, with the twofold-related neighboring monomer instead of with the βB strand of the same subunit, as happens in the Parvovirinae capsid structures (Fig. 3). Thus, the luminal βBIDG sheet of the jelly roll core is still extended into a βABIDG sheet as in the Parvovirinae, and the first observed N-terminal residue is also positioned underneath a 5-fold axis, but in this case that of the neighboring VP subunit. This swapped conformation is hypothesized to provide extra stability. The PstDV VP monomer, however, despite displaying the domain-swapped conformation, is incapable of providing this presumed structural stability. The overall capsid morphology of DVs can be divided into two types: larger capsids, with diameters of ~235 Å at the depressions and ~260 Å at the protrusions, and smaller capsids, which measure 215–250 Å and are the smallest capsids so far described for the Parvoviridae (Fig. 4). In the former group, which includes GmDV, AdDV, and BmDV1, the capsid surface is smooth with small spike-like protrusions surrounding the 5-fold axes. In GmDV the spikes, formed by the EF4 sub-loop, appear smaller than in BmDV1 and AdDV because the protruding GH2 sub-loop fills the depression surrounding them. In GmDV, a smaller second protrusion is formed by the BC loop. The twofold axes lie in a depression. In the second group, containing PstDV and AalDV2, prominent protrusions surround the 5-fold axis, forming two rim-like concentric circles. The 2- and 3-fold symmetry axes have depressions (Fig. 4).


Fig. 3 Multimeric interactions of densoviral and parvoviral VPs. (A) Ribbon diagrams of the interactions between βA and βB at the twofold symmetry axis of Galleria mellonella densovirus (GmDV) and canine parvovirus (CPV). The eight-stranded core and the additional βA strand, which performs the domain swapping, are colored. (B) Interactions of three threefold-related VPs of GmDV and CPV. The triangle indicates the threefold axis and the pentagon the fivefold axis. Note the open annulus-like structure at the threefold axis of the densovirus, compared to the more closed arrangement in the vertebrate parvoviruses.

The 5-fold symmetry axis of the DV capsids, like that of the Parvovirinae, contains a channel with a direct opening to the surface. The inner wall of the channel is lined by large hydrophobic residues in all four structures, proposed to provide an interacting surface for a glycine-rich stretch of residues when the N-terminus is externalized during endosomal escape. The densoviral threefold axis, however, is radically different from that of the Parvovirinae: instead of the protrusions observed in the Parvovirinae, it is occupied by an additional opening besides the fivefold pore (Fig. 4). The opening can be small and formed by GH loops that do not even interdigitate between neighboring monomers (PstDV1 of genus Penstylhamaparvovirus), or it can be occupied by a β-annulus-like structure (subfamily Densovirinae). The latter is comparable to the threefold axis of (+) ssRNA viruses, such as tomato bushy stunt virus of the Tombusviridae and southern bean mosaic virus of the Solemoviridae. The annulus is formed by charged and flexible residues, with a ~10-Å wide opening in GmDV. This opening is the least pronounced in BmDV1.

Fig. 4 Densoviral capsid structures. The capsid surface images show the high-resolution structures of Galleria mellonella densovirus (GmDV), Acheta domestica densovirus (AdDV), Bombyx mori densovirus (BmDV1) and Penaeus stylirostris densovirus (PstDV), in comparison with the structures of two vertebrate-infecting parvoviruses, namely adeno-associated virus 2 (AAV2) (Dependoparvovirus) and canine parvovirus (CPV) (Protoparvovirus). The resolution of each structure is provided in parentheses. The scale bar shows the color code of the radial distance from the capsid center. An icosahedral symmetry diagram indicating the positions of the visible symmetry axes in the capsid images is also shown. The lower-resolution structures obtained by cryo-electron microscopy (in gray) are shown for Aedes albopictus densovirus 2 (AalDV2) and Junonia coenia densovirus (JcDV) at the bottom right-hand side.

Biophysical Features and Functions Associated With the Densovirus Capsid Compared to members of the Parvovirinae, little is known about the functions of densovirus VPs, and the available information is based mostly on studies of lepidopteran ambidensoviruses. By comparing the VP4 major capsid proteins of two closely related lepidopteran ambidensoviruses, GmDV and JcDV, eight variable, exposed regions were identified, located in the vicinity of the fivefold and threefold symmetry axes. Mutating the GH-loop-associated residues in JcDV to their counterparts in GmDV decreased the ability of the virus to cross the host midgut epithelium and reduced JcDV virulence when the virus was introduced through the natural, gastrointestinal route. When infecting Spodoptera frugiperda hosts ex vivo, the mutated virus became mis-targeted and accumulated in subcellular compartments of midgut epithelial cells instead of reaching its target receptors in the basal tight junctions. Recently, the Helicoverpa armigera densovirus 2 (HaDV2) VPs were shown to enhance the activity of the structural promoter by 35-fold compared to the activation by NS. This suggests that the densoviral capsid proteins may have non-structural functions besides the expected structural ones. To date, the biophysical properties have been investigated only in the case of AdDV. Heating of infectious AdDV particles to 70 °C resulted in increased PLA2 activity accompanied by genome ejection, while the capsids remained intact when checked by negative-stain EM. Structural studies of these emptied particles, however, did not display the increased electron density in the fivefold channels that is associated with the VP N-terminal externalization required to expose the PLA2. Our discovery of PLA2 in capsids of most vertebrate and invertebrate parvoviruses shed light on how the densovirus enters the cell and later escapes the endosome or lysosome. During clathrin-mediated endocytosis the PLA2-containing region of VP1u is externalized through the channel at the 5-fold axis, which then enables the virion to breach the endosomal membrane. Almost all parvoviruses contain a calcium-binding loop (GPGN) and the active site (DxxAxxHDxxY) required for PLA2 activity in the N-terminus of the largest minor capsid protein. Brevidensoviruses, penstyldensoviruses and hepandensoviruses (subfamily Hamaparvovirinae) lack this enzymatic activity; they probably enter by direct membrane fusion or penetration at the plasma membrane, or might have evolved other, as yet unexplored, mechanisms to breach the endosomal membrane.
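
The two PLA2 signatures named above, the calcium-binding loop GPGN and the active site DxxAxxHDxxY, can be located in a protein sequence with simple pattern matching. The following sketch is illustrative: the function names are hypothetical, "x" is read as any single residue, and the test sequence is synthetic rather than a real VP1 N-terminus. A motif hit of this kind only flags a candidate and does not demonstrate enzymatic activity.

```python
import re

# "x" in the motif notation is taken to mean any single residue.
CALCIUM_LOOP = re.compile(r"GPGN")
CATALYTIC_SITE = re.compile(r"D..A..HD..Y")


def find_pla2_motifs(protein: str) -> dict:
    """Return 0-based start positions of both PLA2 signatures."""
    seq = protein.upper()
    return {
        "calcium_binding_loop": [m.start() for m in CALCIUM_LOOP.finditer(seq)],
        "catalytic_site": [m.start() for m in CATALYTIC_SITE.finditer(seq)],
    }


def has_pla2_signature(protein: str) -> bool:
    """True only if both signatures are present."""
    hits = find_pla2_motifs(protein)
    return bool(hits["calcium_binding_loop"]) and bool(hits["catalytic_site"])


# Synthetic sequence constructed to contain both motifs (not a real VP1).
synthetic_vp1_nterm = "MSTAGPGNSLEKPVDQLAKEHDIAYSR"
print(find_pla2_motifs(synthetic_vp1_nterm))
print(has_pla2_signature("MSTAGSLEKPVAKEIAYSR"))
```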

Densovirus Genome Structure and Replication As a general rule, capsids of monosense DVs package primarily the “−” DNA strand (complementary to mRNA), while the ambisense DVs package both the “+” and the “−” strands in separate particles. There is great variety in the organization of DV genomes, for example in genome size and in the genome termini, which are imperfect palindromic sequences at each extremity able to fold into duplex telomeric structures designated hairpins. The structural similarities of DV and vertebrate parvovirus genomes suggest common strategies for the replication of their DNA, by single-strand displacement by a DNA polymerase, and for its encapsidation by helicase hexamers through the channel at the 5-fold axis of the preformed capsids. The self-priming hairpin at the 3′ end of the genome serves as a primer for complementary strand synthesis according to the rolling-hairpin replication model, generating a concatemeric intermediate that is resolved into monomers by the nicking activity of Rep. In this replication model, the 5′ and 3′ terminal hairpins of all DVs with inverted terminal repeats (ITRs), which include the terminal hairpins, can exist in either of two orientations, termed “flip” and its reverse-complement, “flop”. The significant differences in size and structure of the palindromic 5′- and 3′-terminal sequences among the different genera of DVs might reflect their dependence on specific cellular factors necessary for the replication of their genome or for promoting their encapsidation. Ambisense DVs have ITRs, whereas monosense genomes can have unique terminal hairpins (e.g., brevidensoviruses) or ITRs (e.g., iteradensoviruses).
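
The relationship between the two hairpin orientations can be made concrete with a reverse-complement operation. In the sketch below the terminal sequence is a short invented example, not a real densoviral telomere. Because the palindrome is imperfect (the central bases do not pair with themselves), "flip" and its reverse complement "flop" are distinct sequences, whereas a perfect palindrome would be identical in both orientations.

```python
# Translation table for DNA complementation.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_perfect_palindrome(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq).upper()


# Hypothetical imperfect palindrome: the arms pair, the central TTG does not.
flip = "ACGTTTGACGT"
flop = reverse_complement(flip)

print(flip, flop)
print(is_perfect_palindrome(flip))    # imperfect palindrome
print(is_perfect_palindrome("ACGT"))  # perfect palindrome
```

Here the arms (ACGT at each end) are the same in both orientations and only the unpaired center differs, which is the feature that distinguishes flip from flop.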
An N-terminal HuHuuu amino acid motif (u, bulky hydrophobic residue), together with one or two other motifs, was identified in the Rep protein (HUH endonuclease superfamily). It is conserved in two vast classes of proteins, one of which is involved in the initiation and termination of rolling-circle DNA replication, or RCR (Rep proteins), and the other in the mobilization (conjugal transfer) of plasmid DNA (Mob proteins). One of the two additional conserved motifs is located upstream and the other, with a conserved Tyr residue, downstream. The major role of these HUH endonucleases is processing a range of mobile genetic elements by catalyzing cleavage and rejoining of single-stranded DNA, using an active-site Tyr residue to make a transient 5′-phosphotyrosine bond with the DNA substrate, such as in rolling-circle replication, in various types of transposition and in intron homing. HUH enzyme activities require a divalent metal ion, the coordination of which is provided by the HUH motif, to facilitate cleavage by locating and polarizing the scissile phosphodiester bond. During DV replication the 3′ hairpin provides a free 3′-OH group from which leading-strand synthesis can be initiated using host cell enzymes. The replication of this 3′ hairpin is thus incomplete and requires a site-specific nick to generate a second 3′-OH at a specific site (the terminal resolution site, trs) in the hairpin so that the complementary strand can be extended to completion. Both terminal hairpins on the two strands can then be refolded, while the hairpins at the two ends are separated by a duplex. Rep proteins also contain a C-terminal superfamily 3 (SF3) hexameric 3′–5′ helicase domain. RCR uses this 3′–5′ helicase activity acting on the template strand to facilitate DNA unwinding at the replication fork. The crystal structure of an SF3 DNA helicase, Rep40, from adeno-associated virus 2 (AAV2) delineates the expected Walker A and B motifs, but also reveals an unexpected "arginine finger" that directly implies the requirement of Rep40 oligomerization for ATP hydrolysis and helicase activity. The peptide linker between the HUH endonuclease and the SF3 helicase domains has a critical role in helicase oligomerization.
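
The HuHuuu notation describes a degenerate motif: two histidines, each followed by bulky hydrophobic positions. A sketch of how such a motif could be scanned for is given below. The set of residues counted as "bulky hydrophobic" (F, I, L, M, V, W, Y) is an assumption made for this example, since the text does not enumerate them, and both test sequences are synthetic. As with any short degenerate motif, chance matches are expected, so a hit is only a starting point for identifying a Rep endonuclease domain.

```python
import re

# Assumed residue set for "u" (bulky hydrophobic); not specified in the text.
BULKY_HYDROPHOBIC = "FILMVWY"

# H-u-H-u-u-u, with u drawn from the assumed set.
HUH_MOTIF = re.compile(f"H[{BULKY_HYDROPHOBIC}]H[{BULKY_HYDROPHOBIC}]{{3}}")


def find_huh_motifs(protein: str) -> list:
    """Return (0-based start, matched text) for each HuHuuu match."""
    return [(m.start(), m.group()) for m in HUH_MOTIF.finditer(protein.upper())]


# Synthetic sequences, for illustration only.
print(find_huh_motifs("MAKRHVHILVQ"))  # contains HVHILV
print(find_huh_motifs("MAKRHVHGLVQ"))  # glycine breaks the hydrophobic run
```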

Expression Strategies The expression strategies among the DVs differ considerably (Figs. 5 and 6). The promoters of the rep and cap cassettes of the ambisense DV genomes tend to be located, at least partially, in the ITRs. In addition to unspliced transcripts, splicing and leaky scanning, and even overlapping promoters/ORFs (brevidensoviruses), are used to produce a large array of proteins from a compact genome. For ambisense DVs, it is striking that an overlapping 3′ untranslated region (UTR) exists, which is shared by both the NS and the VP transcripts. In GmDV two types of transcription strategies have been described. The transcription and translation map of GmDV (Fig. 5(a)) is a model for closely related densoviruses (JcDV, MlDV, PiDV, DsDV, HaDV1, etc.) from lepidopterans. Analogous to vertebrate parvoviruses, the NS gene cassette is on the left and is read from left to right. The VP-encoding cassette, however, is positioned on the complementary strand and is read from right to left. Both expression cassettes are driven by promoters positioned in the ITRs. The relative amounts of the various NS products depend on (1) the strength of the splicing signal and (2) the relative strength of the NS1 and NS2 initiation codons. The relative amounts of VP products depend on the leakiness of the scanning ribosomes and do not involve splicing. CpDV, on the other hand, although sharing the leaky scanning VP cassette with the lepidopteran ambidensoviruses, has evolved a remarkably complex NS expression cassette. The NS-1, NS-2, and NS-3 genes in the 5′ half of one strand are organized into five open reading frames (ORFs) owing to the split of both NS-1 and NS-2 into two ORFs. The ORF encoding the capsid proteins is positioned in the 5′ half of the complementary strand. The expression of NS proteins is controlled by two promoters, P7 and P17, driving the transcription of a 2.4-kb mRNA encoding NS-3 and of a 1.8-kb mRNA encoding NS-1 and NS-2, respectively.
The two NS mRNA species have a small 53-nt intron in the middle of the sequence. The NS expression strategy of AdDV (Fig. 5(b)) is rather like that of GmDV. However, different combinations of splice donor and acceptor sites result in proteins with different N-terminal sequences rather than in the truncated sequences seen for other parvoviruses. The intron in the NS mRNA occurs in about half of the NS transcripts. Ia, Ib, and II are introns that occur in alternative VP transcripts. NS transcripts start at nt 192 and yield NS3. However, a fraction of these transcripts is spliced just upstream of this start codon (intron), leading to translation of NS1 from 856-AUG, which has a poor initiation environment, and, through leaky scanning, of NS2 from 875-AUG. The 3′ ends of the AdDV NS and VP transcripts overlap in the middle of the genome at a 3′ UTR sequence, a trait shared by all ambisense DVs described to date. VP transcripts are initiated at nt 5235, and VP1 initiation is at nt 5230. The short 5′-UTR predicts inefficient initiation (leaky scanning) and could be responsible for the production of a nested set of N-terminally extended viral proteins. However, removal of either of the two alternative introns in ORF-B (Ia or Ib) did not connect the exons in ORF-B and ORF-A in frame, so only nonstructural proteins could be produced from nt 5230, and VP2 could be produced directly from the first AUG in ORF-A when this splicing occurred. An alternative intron II, which is mutually exclusive with introns Ia and Ib since it removes the ORF-B splice acceptor, connects ORF-B and ORF-A in frame so that VP1 can be produced from nt 5230 (Fig. 5(b)). The NS expression strategy of the ambisense pefuambidensovirus, PfDV, is the same as for GmDV. Structural analysis of cDNAs complementary to mRNAs from the region coding for structural proteins suggested alternative splicing and polyadenylation as a means of generating the structural proteins of PfDV.
The NS cassette of the ambisense blattambidensovirus, BgDV1, yields an extra NS product through an extra splice variant, identical to the last 273 codons of NS1 (Fig. 5(c)). Alternative splicing in the VP transcript yields a VP2 product with a unique N terminus, as for AdDV (Fig. 5(b)). Although their transcription strategy has not been resolved yet, uninterrupted potential ORFs corresponding to non-structural viral proteins or capsid proteins of Myzus persicae densovirus (MpDV) and Dysaphis plantaginea densovirus (DplDV), both ambisense hemiambidensoviruses, were in some cases found within EVEs identified in the aphid genome. Little is known about the expression strategy of the Aquambidensovirus genus members. The predicted expression of the SSaDV contig in sea stars is represented in Fig. 5(d). The phenomenon of possessing a VP expression cassette composed of two ORFs, united by alternative splicing, appears in four separate DV lineages (e.g., Figs. 5(b) and (c)). The variation in donor and acceptor site sequence and position, as well as in the number of spliced transcripts, suggests that this strategy has evolved independently multiple times during ambisense DV evolution.
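
Whether a given intron unites two ORFs in frame, as described for the split VP cassettes, reduces to modular arithmetic on the transcript coordinates: after the intron is removed, the distance from the upstream start codon to a codon of the downstream ORF must be a multiple of three. The sketch below illustrates this with hypothetical 0-based coordinates on a linear pre-mRNA. The numbers are invented and are not the AdDV coordinates quoted in the text, which lie on the complementary strand.

```python
def joins_in_frame(upstream_start: int, intron_start: int,
                   intron_end: int, downstream_codon: int) -> bool:
    """Check whether splicing places a downstream ORF in the upstream frame.

    Coordinates are 0-based on the pre-mRNA. The intron spans
    [intron_start, intron_end). downstream_codon is the first base of any
    codon of the downstream ORF, located at or after intron_end.
    """
    if not (upstream_start < intron_start < intron_end <= downstream_codon):
        raise ValueError("expected start < intron_start < intron_end <= downstream codon")
    intron_length = intron_end - intron_start
    # Distance between the two positions once the intron has been removed.
    spliced_distance = downstream_codon - intron_length - upstream_start
    return spliced_distance % 3 == 0


# Hypothetical example: two alternative introns with the same donor site.
print(joins_in_frame(0, 30, 60, 100))  # 30-nt intron: distance 70, out of frame
print(joins_in_frame(0, 30, 61, 100))  # 31-nt intron: distance 69, in frame
```

A one-nucleotide shift in the acceptor site is enough to change the outcome, which is consistent with alternative introns of the same cassette yielding different products.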


Fig. 5 Transcription strategies of ambisense densoviruses. Panels (a, b, c, d) display representatives of four ambisense densoviral genera, according to the new taxonomy classification of subfamily Densovirinae. The thick line in the middle represents the coding genome, while genome termini are drawn in thinner lines. The fine lines indicate the starts, ends and intron positions on the mRNAs, and promoters at the 5′ ends are indicated by small arrows according to the direction of the transcription they drive. Genes and proteins are shown as colored boxes. Thick, bent arrows indicate the leaky scanning translation sites.


Fig. 6 Transcription strategies of monosense densoviruses. Panels (a, b, c, d) display representatives of four monosense genera, according to the proposed new taxonomy of family Parvoviridae, classifying Iteradensovirus to the original subfamily Densovirinae, whereas the remaining three genera are all members of Hamaparvovirinae. The thick line in the middle represents the coding genome, while genome termini are drawn in thinner lines. The fine lines indicate the starts, ends and intron positions on the mRNAs and promoters are indicated by small arrows according to the direction of the transcription they drive. Genes and proteins are shown as colored boxes. Thick, bent arrows indicate the leaky scanning translation sites.

As for monosense DVs, the transcription maps and expression of the iteradensoviruses do not show great variety (Fig. 6(a)). The two nonstructural (NS) genes are expressed by overlapping promoters with alternate transcription starts at either side of the NS1 start codon. Leaky scanning yields 4 VP products, of which VP1 has PLA2 activity. So far, BmDV1 is the only iteradensovirus with a tandem repeat of 45 nts in the intergenic region. Like the iteradensoviruses, brevidensoviruses do not need mRNA splicing since they possess two overlapping P7/7.4 NS promoters (monocystic) located close to the alternate transcription initiation sites, positioned at either side of the NS1 initiation codon (Fig. 6(b)). Their genomes harbor only one polyadenylation signal, at which both the NS and VP transcript populations co-terminate. VPs are expressed by leaky scanning from only one transcript.


In the penstyldensoviruses (Fig. 6(c)), NS1 and NS2 have separate promoters (P2 and P12; Pham, 2014), where the location of P12 overlaps the intron of NS1, possibly through sequence and motif constraints. The absence of hairpins in the telomeres (Direct Repeats, DR) calls into question whether PstDV is an authentic parvovirus. Little is known about the transcription strategy of Penaeus monodon hepandensovirus. It has been reported to have a unique expression strategy (Fig. 6(d)). Whereas in other densoviruses the initiation codon of NS2 is downstream of that of NS1, in this virus the ns2 gene is located upstream of NS1. The ORF of VP is exceptionally large compared to the protein observed in SDS-PAGE (92 kD vs 57 kD).

Conclusions Advances in viral metagenomic approaches in recent years have revolutionized the discovery of novel viruses (EVEs and exogenous viruses) from a high diversity of invertebrate hosts. This resulted in suggestions for a significant overhaul of the classification and taxonomy of ambisense densoviruses and the addition of an ancient lineage of highly divergent parvoviruses (Hamaparvovirinae subfamily) that infect an exceptionally broad range of vertebrate and invertebrate hosts. For instance, chapparvoviruses found in fish are more closely related to those from invertebrates than they are to those of amniote vertebrates, suggesting that parvoviruses evolved to infect vertebrates on at least two independent occasions. Currently, host species susceptible to DVs and/or possessing EVEs of suspected DV origin encompass the basal animal phylum Cnidaria as well as both proto- and deuterostome animals. The future directions of DV research must assess and cope with this enormous diversity, a product of parvovirus evolution that may span a 500-million-year-long interval.

Acknowledgments The authors wish to acknowledge the support received from the Natural Sciences and Engineering Research Council of Canada.

Reference Cotmore, S.F., Agbandje-McKenna, M., Canuti, M., et al., 2019. ICTV virus taxonomy profile: Parvoviridae. Journal of General Virology 100, 367–368.

Further Reading
Cotmore, S.F., Tattersall, P., 2006. Structure and organization of the viral genome. In: Kerr, J.R., Cotmore, S.F., Bloom, M.E., Linden, R.M., Parrish, C.R. (Eds.), Parvoviruses. London, UK: Hodder Arnold, pp. 73–94.
Feschotte, C., Gilbert, C., 2012. Endogenous viruses: Insights into viral evolution and impact on host biology. Nature Reviews Genetics 13, 283–296.
Flegel, T.W., 2006. Shrimp parvoviruses. In: Kerr, J.R., Cotmore, S.F., Bloom, M.E., Linden, R.M., Parrish, C.R. (Eds.), Parvoviruses. London, UK: Hodder Arnold, pp. 487–493.
Ilyina, T.V., Koonin, E.V., 1992. Conserved sequence motifs in the initiator proteins for rolling circle DNA replication encoded by diverse replicons from eubacteria, eucaryotes and archaebacteria. Nucleic Acids Research 20, 3279–3285.
Krupovic, M., 2013. Networks of evolutionary interactions underlying the polyphyletic origin of ssDNA viruses. Current Opinion in Virology 3, 578–586.
Mietzsch, M., Pénzes, J.J., Agbandje-McKenna, M., 2019. Twenty-five years of structural parvovirology. Viruses 11. doi:10.3390/v11040362.
Pénzes, J.J., de Souza, W.M., Agbandje-McKenna, M., Gifford, R.J., 2019. An ancient lineage of highly divergent parvoviruses infects both vertebrate and invertebrate hosts. Viruses 11. doi:10.3390/v11060525.
Rosario, K., Duffy, S., Breitbart, M., 2012. A field guide to eukaryotic circular single-stranded DNA viruses: Insights gained from metagenomics. Archives of Virology 157, 1851–1871.
Tijssen, P., Bando, Y., Li, Y., et al., 2006. Evolution of densoviruses. In: Kerr, J.R., Cotmore, S.F., Bloom, M.E., Linden, R.M., Parrish, C.R. (Eds.), Parvoviruses. London, UK: Hodder Arnold, pp. 55–68.
Tijssen, P., Pénzes, J.J., Yu, Q., Pham, H.T., Bergoin, M., 2016. Diversity of small, single-stranded DNA viruses of invertebrates and their chaotic evolutionary past. Journal of Invertebrate Pathology 140, 83–96.
Zádori, Z., Szelei, J., Lacoste, M.C., et al., 2001. A viral phospholipase A2 is required for parvovirus infectivity. Developmental Cell 1, 291–302.

Relevant Websites https://en.wikipedia.org/wiki/Ambidensovirus Ambidensovirus. Wikipedia. https://talk.ictvonline.org/ictv-reports/ictv_online_report/ssdna-viruses/w/parvoviridae/1047/subfamily-densovirinae Subfamily: Densovirinae. Parvoviridae. ssDNA Viruses. ICTV.

Polydnaviruses (Polydnaviridae)
Anne-Nathalie Volkoff, Diversity, Genomes and Insects-Microorganisms Interactions, National Institute of Agricultural Research, University of Montpellier, Montpellier, France
Elisabeth Huguet, Research Institute on Insect Biology, French National Center for Scientific Research, University of Tours, Tours, France
© 2021 Elsevier Ltd. All rights reserved.

Nomenclature
AsIV Apophua simplicipes Ichnovirus
BV Bracovirus
BVSPER Bracovirus Structural Proteins Encoding Regions
CcBV Cotesia congregata Bracovirus
CiBV Chelonus inanitus Bracovirus
CpBV Cotesia plutellae Bracovirus
CrV Cotesia rubecula Virus (bracovirus)
CsIV Campoletis sonorensis Ichnovirus
CvBV Cotesia vestalis Bracovirus
DRJ Direct Repeat Junction
Egf Epidermal growth factor
GfBV Glyptapanteles flavicoxis Bracovirus
GfIV Glypta fumiferanae Ichnovirus
HdIV Hyposoter didymator Ichnovirus
HfIV Hyposoter fugitivus Ichnovirus
HIM Host Integration Motif
ICTV International Committee on Taxonomy of Viruses
IV Ichnovirus
IVSPER Ichnovirus Structural Proteins Encoding Regions
MdBV Microplitis demolitor Bracovirus
PDV Polydnavirus
PTP Protein Tyrosine Phosphatase
RU Replication Unit
TnBV Toxoneuron nigriceps Bracovirus
TrIV Tranosema rostrale Ichnovirus

Glossary
Encapsulation Encapsulation is a cellular immune response used against pathogens that are too large to be phagocytosed. This response is employed by insect larvae in response to parasitism by wasp parasitoid eggs for example. Encapsulation involves the aggregation of insect immune cells (hemocytes) around the invading body.
Hemocyte The insect blood cells.
Koinobiont parasitoid A parasitoid whose host continues to develop after parasitization.
Melanization In insect immunity, melanization is an immune effector mechanism involved in the killing of pathogens and the eggs of parasitoid wasps. Melanization involves a series of reactions such as the conversion of tyrosine to melanin precursors and the cross-linking of proteins to form a layer of melanin that surrounds and sequesters an invading body. The death of the parasitoid egg presumably occurs either by oxidative damage or by starvation.
Oviposition Act of laying an egg or eggs.
Parasitoid wasp A wasp the adult stage of which is free-living and whose immature stages feed and develop within or on the bodies of other organisms (usually an insect) that will eventually be killed.
Rep/Helicase A single-stranded DNA-dependent ATPase involved in DNA replication.
Virulence genes Genes responsible for the biological alterations observed in the organism where they are expressed. This term is borrowed from bacterial pathogens that use “virulence factors” to establish on or within a host and enhance their potential to cause disease.

Classification (Compact) Polydnaviruses (PDVs) are enveloped large DNA viruses with a segmented double-stranded DNA (dsDNA) genome. All PDVs are mutualistically associated with parasitic wasps (also named parasitoids) of lepidopteran larvae. The Polydnaviridae family was proposed by Don Stoltz in 1984 and recognized by the ICTV in 1991. PDVs are divided into two genera: Bracovirus, for PDVs associated with wasp subfamilies of the family Braconidae, and Ichnovirus, for those associated with wasp subfamilies of the family Ichneumonidae. BVs are found in wasps from the “microgastroid complex”, a monophyletic assemblage of ~50,000 species belonging to six braconid subfamilies. IVs are presently described in two ichneumonid subfamilies, the Campopleginae and the Banchinae, and they are estimated to be associated with at least 14,000 species of wasps.

Life Cycle of PDVs Among dsDNA viruses, PDVs are the only ones known to be mutualistically associated with multicellular organisms, parasitoid wasps. These insects are characterized by a free-living adult stage and parasitic immature stages. Their host, usually an insect, dies at the end of parasitoid development. Parasitoid wasps associated with PDVs are all koinobiont endoparasites of lepidopteran larvae, i.e. female wasps lay eggs in the body of a host caterpillar, which continues to develop as the parasitoid’s offspring matures.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21556-2


Fig. 1 Life cycle of a parasitoid wasp associated with a polydnavirus. Virus particles are produced in the wasp female ovaries during the pupal and adult stages. Production occurs exclusively in a specialized tissue, the calyx. The produced virus particles are released in the oviducts, and then injected into the parasitoid insect host, a caterpillar, when the female wasp lays its egg(s). Once in the caterpillar, polydnavirus particles infect all the tissues of the insect, viral DNAs reach the nuclei and the genes encoded by the packaged polydnavirus genome are expressed. Successful development of the wasp progeny depends on the expression of these viral genes, which leads to physiological alterations in the caterpillar that are beneficial for the wasp (inhibition of the caterpillar host immune response, modulation of its development, etc.). Once the parasitoid larva (or larvae) ends its development, the mature parasitoid larva egresses from the caterpillar, which dies. The parasitoid then spins a cocoon within which pupation takes place, leading to emergence of a new adult wasp.

BVs and IVs share a common life cycle. They both persist in all cells of the wasp as integrated proviruses. Consequently, PDVs are transmitted vertically through the germline of wasps. During the pupal and adult stages, PDVs replicate in the nuclei of specialized ovarian cells, in a region named the calyx (Fig. 1). Many virions are produced by calyx cells. BVs are released by lysis of the replicative cells, whereas IVs are released by budding. Virions are then stored in the lumen of the oviducts to be injected into the parasitoid host upon oviposition by the female wasp. PDVs rapidly infect host cells, but no viral replication occurs in the parasitized insect. Infection of the parasitoid host tissues is followed by expression of viral genes on which the survival of wasp offspring depends. Indeed, expression of PDV genes disables host immune responses and alters host development and metabolism.

Virion Structure The virions of BVs and IVs are morphologically and structurally distinct (Fig. 2). BV virions resemble some non-occluded baculoviruses. They have cylindrical, often tailed nucleocapsids surrounded by a single envelope, formed in the nucleus of calyx cells. BV virions contain one or several nucleocapsids. BV nucleocapsids each contain a single DNA molecule. The capsids can vary in length from 30 to 80 nm (excluding tails), with a width around 50 nm.


Fig. 2 Morphology of polydnavirus particles: differences between Bracoviruses (upper panel) and Ichnoviruses (lower panel). Both bracoviruses and ichnoviruses are produced in a specialized region of the female wasp ovaries, named the calyx; on the left, views of dissected ovaries of parasitoid wasps indicating localization of the calyx. In the center, the upper panel shows bracovirus virions in the oviduct lumen of a braconid female wasp observed by transmission electron microscopy (TEM); the lower panel shows on the left ichnovirus subvirions assembled in the nucleus of a calyx cell, and on the right mature virions present in the oviduct lumen of an ichneumonid female wasp. NC, nucleocapsid; env., envelope; VS, virogenic stroma; iE, inner envelope; oE, outer envelope. Scale bar: 200 nm. Pictures by M. Frayssinet (ovaries) and M. Ravallec (TEM), DGIMI, INRA. On the right of the figure, schematic representations of bracovirus and ichnovirus virions. For bracoviruses, a virion containing multiple nucleocapsids is represented.

IVs have lenticular nucleocapsids surrounded by two unit membranes. The inner envelope is formed in the nucleus of calyx cells and often presents a “hook-like” structure (or tail) of variable length. The outer envelope originates from the plasma membrane and is acquired during virion budding. Nucleocapsids are uniform in size: they measure more than 300 nm in length and approximately 80–90 nm in diameter. Virions from campoplegine wasps contain a single nucleocapsid, whereas virions of banchine wasps may have multiple nucleocapsids. Note that the virions described in the campoplegine Bathyplectes spp. enclose multiple punctiform nucleocapsids. There are presently no data on the number of circular DNA molecules contained in IV nucleocapsids.

Morphogenesis of PDV Particles PDVs replicate in the nuclei of calyx cells. Both BV and IV nucleocapsids are assembled in virogenic stroma and acquire an envelope in the nucleus (Fig. 3). The origin of this envelope acquired in the nucleus, and whether it has the same origin in BVs and IVs, remain unknown. BV morphogenesis has been extensively followed for Chelonus inanitus BV. During the first steps of replication, virions are assembled and progressively fill the nucleus; “mature” calyx cells, located basally near the oviducts, undergo successive lysis of the nuclear envelope, then of the plasma membrane, leading to the release of the mature virions into the oviduct lumen. The production of IV particles, on the other hand, relies on a different process. Enveloped subvirions formed in the nucleus exit the latter by budding through the nuclear envelope. They finally access the oviduct lumen by budding through the apical plasma membrane.

The PDV Packaged Genome The genome packaged into PDV virions consists of multiple circular, double-stranded DNAs that are non-equimolar in abundance. The number and size of DNA segments vary depending on the species (Table 1). BV genomes usually have from 15 to 30 DNA segments ranging in size from 2 to 50 kb, with estimated genome sizes from 125 kb to almost 600 kb. IV genomes consist of 20 to 50 DNA segments, ranging in size from 2 to 28 kb, with estimated genome sizes around 250 kb. IVs associated with banchine wasps have slightly larger genomes and display a higher degree of genome segmentation compared to campoplegine IVs.


Fig. 3 Schematic representation of the different steps of Bracovirus and Ichnovirus morphogenesis in the wasp calyx cells. On the left, a schematic representation of the wasp ovary indicates the position of the calyx (in red) in braconids and ichneumonids. Bracovirus (BV) and ichnovirus (IV) virions are assembled in the nucleus of the calyx cells, in specific regions named virogenic stroma. For BVs, along with nucleocapsid production, two-unit membrane envelopes appear in the nucleus. Nucleocapsids are enveloped to form mature virions that accumulate in the nucleus. Release of virions occurs by disintegration first of the nuclear and then of the plasma membrane. For IV, assembly of sub-virions (i.e. nucleocapsids surrounded by a single two-unit membrane envelope) takes place in the vicinity of the virogenic stroma. Sub-virions then migrate towards the nuclear envelope where they bud to the calyx cell cytoplasm. Sub-virions then reach the plasma membrane and bud through it to finally reach the calyx lumen where they accumulate as mature virions surrounded by two envelopes. In contrast to what occurs in BV-producing cells, IV-producing cells do not undergo cell lysis; each calyx cell can a priori produce virus particles during the entire female adult life.

Table 1 Features of the packaged genome of selected polydnaviruses. The table indicates the name of the parasitoid species carrying the polydnavirus, the number and length of the segments present in the packaged genome, the total genome size, and the number of predicted genes encoded by the genome.

Parasitoid species | Polydnavirus | Segment number and size | Genome size | Number of predicted genes | Reference
Braconid wasps
Microplitis demolitor | MdBV | 15 (3.6–34.3 kb) | 190 kb | 61 | Webb et al. (2006)
Cotesia congregata | CcBV | 35 (5.0–41.6 kb) | 580 kb | 222 | Bézier et al. (2013); Chevignon et al. (2014)
Cotesia vestalis | CvBV | 35 (2.6–39.2 kb) | 540 kb | 157 | Chen et al. (2011)
Toxoneuron nigriceps | TnBV | >27 (3.9–13.9 kb) | >203 kb | >42 | Ibrahim (2010); PhD thesis
Glyptapanteles flavicoxis | GfBV | >29 (3.8–50.7 kb) | >581 kb | >193 | Desjardins et al. (2008)
Ichneumonid campoplegine wasps
Campoletis sonorensis | CsIV | 24 (6.1–19.6 kb) | 247 kb | 101 | Webb et al. (2006)
Hyposoter didymator | HdIV | 50 (2.5–3.6 kb) | 250 kb | 135 | Dorémus et al. (2014)
Hyposoter fugitivus | HfIV | 56 (2.6–8.9 kb) | 246 kb | 150 | Tanaka et al. (2007)
Tranosema rostrale | TrIV | >40 (4.1–10.1 kb) | ~250 kb | >89 | Tanaka et al. (2007)
Ichneumonid banchine wasps
Glypta fumiferanae | GfIV | 105 (1.5–5.2 kb) | ~291 kb | 101 | Lapointe et al. (2007)
Apophua simplicipes | AsIV | >132 (~1.0–4.0 kb) | ~304 kb | 186 | Djoumad et al. (2013)

MdBV, Microplitis demolitor Bracovirus; CcBV, Cotesia congregata Bracovirus; CvBV, Cotesia vestalis Bracovirus; TnBV, Toxoneuron nigriceps Bracovirus; GfBV, Glyptapanteles flavicoxis Bracovirus; CsIV, Campoletis sonorensis Ichnovirus; HdIV, Hyposoter didymator Ichnovirus; HfIV, Hyposoter fugitivus Ichnovirus; TrIV, Tranosema rostrale Ichnovirus; GfIV, Glypta fumiferanae Ichnovirus; AsIV, Apophua simplicipes Ichnovirus.
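
As a rough illustration of the degree of segmentation reported in Table 1, the average segment length can be obtained by dividing the genome size by the number of segments. The short Python sketch below does this for the six genomes whose segment count and genome size are both given as exact values in the table. The dictionary and function names are hypothetical, chosen only for this example, and the average says nothing about the non-equimolar abundance of the segments.

```python
# Illustrative sketch only. Values are copied from Table 1 and restricted to
# genomes whose segment count and genome size are given as exact values there.
# The names PACKAGED_GENOMES and mean_segment_kb are hypothetical.
PACKAGED_GENOMES = {
    # polydnavirus: (number of segments, genome size in kb)
    "MdBV": (15, 190),
    "CcBV": (35, 580),
    "CvBV": (35, 540),
    "CsIV": (24, 247),
    "HdIV": (50, 250),
    "HfIV": (56, 246),
}


def mean_segment_kb(name: str) -> float:
    """Average segment length in kb: genome size divided by segment number."""
    segments, genome_kb = PACKAGED_GENOMES[name]
    return genome_kb / segments


if __name__ == "__main__":
    for name in PACKAGED_GENOMES:
        print(f"{name}: {mean_segment_kb(name):.1f} kb per segment on average")
```

With these values, the two Hyposoter ichnoviruses have the shortest average segments of the six genomes.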

Sequencing of PDV packaged genomes revealed that they generally encode more than 100 predicted genes. The genes encoded by PDV genomes differ in nature between BVs and campoplegine IVs (Table 2); interestingly, banchine IVs share gene families with BVs (PTPs and viral ankyrins). The most represented gene family in BVs, in terms of number of genes, is the PTPs, with generally more than 30 genes identified in the sequenced genomes. For IVs, the most represented gene family is the repeat-element genes, also often with more than 20 members identified in IV genomes; they have no similarity with known genes, but all contain a 540 nt motif that makes them related. Both BV and IV genomes possess genes encoding proteins harboring ankyrin domains (viral ankyrins); however, whether BV and IV viral ankyrins are related or derive from independent acquisitions in the two wasp lineages has not yet been demonstrated. As a rule, based on analysis of all the PDVs sequenced, no typical virus replication genes have been identified in PDV packaged genomes. Nonetheless, a helicase gene was identified in Cotesia vestalis BV; however, further work revealed that it was actually a Rep/Helicase from an intact Helitron transposable element inserted into a virus segment.

Table 2 Gene families identified in polydnavirus genomes. The main viral gene families encoded by bracoviruses (BV, on the left) and by ichnoviruses (IV) associated with campoplegine wasps (on the right) are listed. Conserved families are those encountered in all sequenced genomes of a polydnavirus genus; for BVs, gene families found in several species but not all are also indicated. Specific gene families are examples of genes found in a single polydnavirus species, bearing in mind that only a few polydnavirus packaged genomes have been sequenced so far.

BVs | IVs associated with campoplegine wasps
Gene families conserved in all sequenced BVs (and in IVs associated with banchine wasps): Protein tyrosine phosphatase (PTP); Viral ankyrin | Gene families conserved in all sequenced IVs associated with campoplegine wasps: Repeat-element gene; Cys-motif protein; Viral ankyrin; Viral innexin; Polar-residue rich protein (PRRP); N-gene
Gene families conserved in some BVs: Cystatin; CrV1-like protein; Lectin-like |
Specific gene families: Egf-motif (MdBV) | Specific gene families: Glycine-proline rich (HdIV)

MdBV, Microplitis demolitor Bracovirus; HdIV, Hyposoter didymator Ichnovirus.

Function of the Genes Encoded by PDV Packaged Genome The packaged genome is transferred to the host of the parasitoid upon parasitism, where the genes are for the most part expressed. These genes are thus considered as “virulence” genes, responsible for the physiological alterations observed in the parasitized insect and on which successful development of the wasp progeny relies. A number of these genes, such as PTPs, viral ankyrins, cystatins, and viral innexins, have similarities with eukaryotic genes. The hypothesis is that these genes have an insect, probably a wasp, origin rather than a viral origin. PDVs were shown early on to be necessary for successful development of the wasp. They are indeed involved in immune suppression (e.g., changes in hemocyte composition, cellular spreading, encapsulation and melanization) and alterations of development (e.g., modification of growth hormone titers) or metabolism (e.g., changes in levels of total carbohydrates, lipids or proteins) in the parasitized insect. PDV physiological effects have been shown by phenotypic approaches but also indirectly by transcriptomic approaches. However, amongst the hundred or more genes encoded by PDVs, the physiological roles in parasitism of only a few have been investigated so far (Table 3). This is probably due to the absence of similarity to known functional proteins for the majority of PDV-encoded proteins, but also, and most probably mainly, to the technical difficulty of studying these viruses, which do not replicate in the insect where their genes are expressed and functional. Apart from physiological effects, recent studies have also highlighted ecological roles for PDVs. PDV infection affects protein expression in the salivary glands of the parasitized insect; in particular, the abundance of proteins that act as elicitors of the plant response is affected. This leads to induction of the plant immune response and emission of volatiles that differ from those induced by a non-infected phytophagous insect, and consequently impacts trophic interactions between the plant, the phytophagous insect, its parasitoids and even the hyperparasitoids. Because PDVs are essential for parasitoid survival, it is expected that these mutualistic viruses impact parasitoid host range. Indeed, PDVs were shown in at least one case to be involved in the specialization of parasitoids to their hosts.

Fate of the PDV Packaged Genome in the Parasitized Insect Larval development of parasitoids within their lepidopteran hosts lasts several days, a time period during which PDVs continue to be active. Expression of PDV genes is detected early after parasitism and throughout parasitoid development in the parasitized insect. Since PDV particles are non-replicative (absence of replication genes), this raises the question of how PDV DNA is maintained in the lepidopteran host of the parasitoid. Early studies in cultured cells showed that portions of PDV segments were capable of integration into the genome of lepidopteran cells. More recently, integration of BV segments into genomic DNA of the host was shown in vivo. Integration into host cells occurs rapidly after infection and in association with a domain termed the “host integration motif” (HIM) (Fig. 4), which was identified in most MdBV genomic DNAs and several CcBV circles. This finding suggests that BV genomic segments integrate into host cells through a shared mechanism. However, no specific host genomic target could be identified, despite the fact that integration was not totally random within the lepidopteran host genome. Nonetheless, this process allows PDVs to persist throughout parasitism.
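
The legend of Fig. 4 describes the integrated form in more detail: the circle is opened within the roughly 100 bp HIM, a 40–50 bp stretch is lost, and the remaining HIM sequences become the two extremities (J1 and J2) of the integrated segment. The Python sketch below is a toy string model of that description, not a representation of the molecular mechanism. The function name, the `him_start` coordinate and the assumption that the lost stretch lies in the middle of the HIM are all hypothetical choices made for illustration.

```python
def integrate_circle(circle: str, him_start: int, him_len: int = 100,
                     lost_len: int = 45) -> tuple[str, str]:
    """Toy model of HIM-associated opening of a circular BV segment.

    circle    : sequence of the circular segment, written linearly
    him_start : index at which the HIM begins (hypothetical coordinate)
    him_len   : HIM length (about 100 bp in the Fig. 4 legend)
    lost_len  : length of the stretch lost on opening (40-50 bp in the legend)

    Returns (linear_integrated_form, lost_sequence).
    """
    if not 0 <= him_start < len(circle):
        raise ValueError("him_start must be an index within the circle")
    if not 0 < lost_len < him_len <= len(circle):
        raise ValueError("expected 0 < lost_len < him_len <= len(circle)")
    # Rotate the circular sequence so that the HIM starts at index 0.
    rotated = circle[him_start:] + circle[:him_start]
    # ASSUMPTION: the lost stretch sits in the middle of the HIM.
    left = (him_len - lost_len) // 2
    lost = rotated[left:left + lost_len]
    # Opening the circle at the lost stretch leaves the two remaining HIM
    # parts at the extremities of the linear form (the J1/J2 junctions).
    linear = rotated[left + lost_len:] + rotated[:left]
    return linear, lost
```

In this model the linear form is shorter than the circle by exactly `lost_len`, and it begins and ends with the two HIM parts that flanked the lost stretch. Which extremity corresponds to J1 and which to J2 is not specified here.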


Table 3 Examples of the effects of polydnaviruses (PDVs) or PDV genes on the physiology of the parasitized caterpillar host. PDV infection affects either the host immune response (Immunity) or its development (Development). For each gene studied, the observed effect is indicated.

Name of the gene | PDV | General function affected in the parasitized caterpillar | Effect of the PDV gene | Reference
Lectin-like | CpBV | Immunity, cellular immunity | Inhibition of encapsulation | Lee et al. (2008)
Glc1.8 | MdBV | Immunity, cellular immunity | Inhibition of encapsulation | Beck and Strand (2005)
CpBV-CrV1 | CpBV | Immunity, cellular immunity | Inhibition of encapsulation | Kumar and Kim (2015)
CcV1 | CcBV | Immunity, cellular immunity | Inhibition of phagocytosis | Labropoulou et al. (2008)
Glc1.8, ptp-H2, ptp-H3 | MdBV | Immunity, cellular immunity | Inhibition of phagocytosis | Pruijssers and Strand (2007)
ptp-H2 | MdBV | Immunity, cellular immunity | Induction of hemocyte apoptosis | Suderman et al. (2008)
TnBVank1 | TnBV | Immunity, cellular immunity | Induction of hemocyte apoptosis | Salvia et al. (2017)
p-vank-1 | CsIV | Immunity, cellular immunity | Inhibition of hemocyte apoptosis | Fath-Goodin et al. (2009)
ank-h4, ank-n5 | MdBV | Immunity, humoral immunity | Inhibition of antimicrobial peptides production | Bitra et al. (2012), Thoetkiattikul et al. (2005)
Not determined | CvBV | Immunity, humoral immunity | Inhibition of PO activity | Wang et al. (2018)
Not determined | HdIV | Immunity, antiviral immunity | Activation of antiviral gene transcription | Darboux et al. (2019)
TnBVank3, TnBVank1 | TnBV | Development, hormonal regulation | Arrest of ecdysteroidogenesis | Ignesti et al. (2018), Salvia et al. (2018)
CpBV-H4 | CpBV | Development, molting | Down-regulation of insulin signaling | Kumar et al. (2016)
CpBV_E94k | CpBV | Development, growth | No data | Kim and Hepat (2016)
vinnexinG | CsIV | Development, growth | No data | Hasegawa et al. (2017)
TnBVank1 | TnBV | Development, growth | Decrease of amino acid transport | Di Lelio et al. (2014)

CcBV, Cotesia congregata Bracovirus; CpBV, Cotesia plutellae Bracovirus; CsIV, Campoletis sonorensis Ichnovirus; CvBV, Cotesia vestalis Bracovirus; HdIV, Hyposoter didymator Ichnovirus; MdBV, Microplitis demolitor Bracovirus; TnBV, Toxoneuron nigriceps Bracovirus.

A consequence of this property of PDVs is the occurrence of lateral gene transfers into lepidopteran genomes. Indeed, insertions of bracovirus sequences are present in the genomes of certain moth and butterfly lineages; the viral genes present in these sequences have been co-opted by lepidopteran species to confer some protection against pathogens.

The Proviral Genome Maintained in the Wasp Genome PDV sequences persist in all cells of the wasp as integrated proviruses. PDV chromosomal genomes include two functional components, the viral segments encoding the virulence genes and the viral replication genes (Fig. 4). The segmented proviral sequences are amplified, excised and circularized as dsDNA molecules that are packaged into the particles to be delivered to the caterpillar host. By contrast, the DNA encoding the replication genes, which is also amplified, is not encapsidated and remains in the wasp. The replication genes are expressed specifically in calyx cells during virus replication and encode proteins that, for the majority of them, are found associated with the virus particles (by proteomic analyses of purified particles); they constitute the viral machinery that produces the PDV particles. These viral machineries derive from the PDV viral ancestor, which differs between BVs and IVs.

Organization of PDV Proviral Segments in the Wasp Genome Proviral segment organization in the wasp genome differs between BVs and IVs. The first data came from the analysis of genomic clones from two Glyptapanteles braconid wasps. It was the first demonstration that BV proviral segments were organized in “macroloci” containing several segments arranged in tandem array. The organization of BV proviral segments in macroloci was confirmed by later studies on several Cotesia species and M. demolitor. The number of macroloci and the number of segments differ between species, but globally, braconid wasps harbor in their genome a dozen viral loci containing from a single segment to 18 segments. For M. demolitor, availability of the whole genome allowed the demonstration that the eight loci containing MdBV proviral segments were flanked by large distances of wasp genomic DNA, suggesting a wide distribution of the viral loci within the wasp genome. Organization in clusters may facilitate concerted segment DNA amplification during BV replication. Indeed, it was shown that BV segments are amplified within replication units (RU) composed of several segments and including intersegmental sequences that are not encapsidated. The hypothesis suggested by the authors is that the concatemer is first excised and then sub-divided into individual segments. The organization of proviral segments is quite different in ichneumonids. The genomes of two campoplegine species, Hyposoter didymator and Campoletis sonorensis, reveal that IV segments are highly dispersed across the wasp genomes. Therefore, the question of how the coordinated amplification of all the viral loci is regulated remains unanswered. In its linear form, each PDV segment is flanked at its extremities by a direct repeat, named “direct repeat junction” (DRJ). 
DRJs are the regions where homologous recombination takes place allowing excision of the segment; consequently, a single copy of the DRJ sequence is present in the circular molecule (Fig. 3). In IVs, no consensus sequence has been identified. Conversely, in BVs,

Polydnaviruses (Polydnaviridae)

855

Fig. 4 The different forms of PDV genomes. In the wasp, PDV genome sequences are present as integrated proviruses composed of two components. One component consists of proviral segments, each of which is flanked by direct repeat junctions (DRJs; dark blue arrows) and encodes virulence genes (blue squares). Homologous recombination at the DRJs allows segment excision and leads to the formation, in the calyx cells, of double-stranded circular DNA molecules. For BVs, amplification of segments within replication units was shown to precede segment excision. The other proviral component corresponds to genes belonging to the “BV or IV structural proteins encoding regions” (“BVSPERs” or “IVSPERs”; red squares). In the case of BVs, these regions correspond to genes of nudiviral origin, whereas in the case of IVs the genes belong to a virus yet to be described. Expression of these genes in the calyx cells allows production of viral particles and packaging of the double-stranded circular molecules. These non-replicative PDV particles are then injected into the caterpillar along with wasp eggs upon parasitism. Expression of PDV virulence genes occurs early after parasitism and is in part necessary for successful wasp development. PDV DNA was shown to persist in the genome of cultured lepidopteran cells. In the case of BVs, integration of BV DNA segments into the genomic DNA of the host during parasitism was described for molecules containing a 100 bp “Host Integration Motif” (HIM; green square). During integration, BV circles are opened within the HIM, a short 40–50 bp sequence of the circle is lost, and the remaining HIM sequences flanking this lost sequence become the extremities of the BV sequences integrated in the host DNA. These extremities are named Junction 1 and Junction 2 (J1 and J2; dark green lines). The fate of molecules that do not contain HIM sequences awaits investigation. The wasp genome is represented by a gray line and the lepidopteran genome by an orange line.

DRJs contain a conserved tetramer AGCT, shown to be the site where the segment circularizes. Note that this motif differs from the host integration motif (HIM) mentioned above.
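The excision step described above can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the sequence `toy` and the function `excise_segment` are hypothetical, and recombination is reduced to a cut-and-join between two copies of the AGCT tetramer, which ignores the rest of the DRJ and the proteins involved.

```python
MOTIF = "AGCT"  # conserved tetramer reported within the DRJs

def excise_segment(provirus: str, motif: str = MOTIF) -> tuple[str, str]:
    """Toy model of segment excision by recombination between two motif copies.

    Returns (circle, remaining_locus). The circle is written linearly,
    starting at the recombination site; its last base joins back to its first.
    """
    left = provirus.find(motif)
    right = provirus.rfind(motif)
    if left == -1 or left == right:
        raise ValueError("two copies of the motif are needed to recombine")
    circle = provirus[left:right]                   # one motif copy plus the segment body
    remaining = provirus[:left] + provirus[right:]  # flanks rejoined over one motif copy
    return circle, remaining

# Hypothetical sequence: flank, motif, segment body (lowercase), motif, flank.
toy = "ttAGCTggcatgccAGCTaa"
circle, locus = excise_segment(toy)
print(circle)  # AGCTggcatgcc
print(locus)   # ttAGCTaa
```

In this model one copy of the tetramer ends up on the circle and one stays in the wasp locus, so no bases are gained or lost overall.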

The PDV Viral Machineries

One of the major findings of the last 10 years in the field of PDVs has been the discovery, by coupling calyx transcriptome and virus particle proteome analyses, of the virus ancestors and of the virus-derived machineries that allow the production of PDV particles. Importantly, BVs and IVs do not share a common viral ancestor; they have different evolutionary origins. BVs descend from the integration of a nudivirus genome into the genome of an ancestral wasp, whereas IVs originate from the integration of a still undetermined virus. Nudiviruses represent a sister group of baculoviruses, with which they share a set of core genes. A large proportion of nudiviral structural core genes was found to be maintained in the braconid wasp genome. These nudivirus-like genes potentially (1) code for the different subunits of a DNA-dependent RNA polymerase and the lef-5 initiation factor involved in transcription; (2) are involved in producing and assembling viral nucleocapsids; (3) code for a set of per os infectivity factors (PIFs) that could be involved in infection of host cells; or (4) are constituents of the viral particles (see Table 4). Several lines of evidence suggest that nudivirus-like genes in braconid wasp genomes have retained their ancestral function. First, nudivirus-like gene expression dynamics are consistent with the different steps in virion formation. Second, proteomic analysis showed that products corresponding to capsid or envelope components can indeed be present in the BV virions. Finally, RNA-interference experiments in M. demolitor showed that the Lef-4 and Lef-9 RNA polymerase subunits are necessary for transcription of viral structural genes involved later in the virion cycle. RNA interference in this model also revealed that vp39, p74, and pif-1 are required for virion formation, whereas vlf-1 and the integrase are required for DNA excision and circularization.
Notably, however, none of the braconid genomes sequenced so far possesses a DNA polymerase of viral origin, suggesting that viral amplification is under the control of wasp DNA polymerases. Half of the nudiviral genes identified within the genome of the braconid wasp Cotesia congregata are localized in a 17 kb region referred to as the “nudiviral cluster”, whereas the other nudiviral genes are dispersed in the wasp genome. A nudiviral cluster was similarly found in the M. demolitor genome, displaying conserved synteny of structural genes with the Cotesia congregata nudiviral cluster. This observation shows that these nudiviral clusters have remained stable since the divergence of the C. congregata and M. demolitor species 53 MYA.

Table 4 Nudivirus genes, with known or predicted functions, identified in braconid wasp genomes

Gene name                       MdBV  CcBV  CiBV
DNA replication/processing
  helicase                      +     nd    nd
  integrase                     +     nd    +
Transcription by DNA-dependent RNA polymerase
  p47 RNA polymerase subunit    +     +     nd
  lef-4 RNA polymerase subunit  +     nd    +
  lef-5 initiation factor       +     +     nd
  lef-8 RNA polymerase subunit  +     +     +
  lef-9 RNA polymerase subunit  +     nd    nd
Packaging, assembly, morphogenesis
  38k phosphatase               +     +     +
  vp39 major capsid protein     +     +     +
  vlf-1 multifunctional         +     nd    +
  p33 sulfhydryl oxidase        +     nd    nd
Envelope component/infectivity
  pif-0 (p74)                   +     +     +
  pif-1                         +     nd    +
  pif-2                         +     nd    +
  pif-3                         +     +     nd
  pif-4                         +     +     +
  pif-5                         +     +     +
  pif-6                         +     +     nd
  pif-8 (vp91)                  +     nd    +
  11k                           -     +     +
Particle component
  HzNVOrf9-like                 +     +     +
  HzNVOrf106-like               +     +     +
  HzNVOrf118-like               +     +     +

MdBV, Microplitis demolitor Bracovirus; CcBV, Cotesia congregata Bracovirus; CiBV, Chelonus inanitus Bracovirus. +, gene present; -, gene absent; nd, gene not detected to date. Gene names in bold correspond to core genes that are also conserved in baculoviruses.

Regarding IVs, virus-derived genes encoding virion structural proteins were all found in distinct regions of the genomes of the campoplegine Hyposoter didymator and Campoletis sonorensis, and of the banchine Glypta fumiferanae. These regions were termed “IV Structural Proteins Encoding Regions” (IVSPERs). Five IVSPERs ranging in size from 2 to 30 kbp were identified in each of the H. didymator and C. sonorensis genomes, and three IVSPERs were identified in the G. fumiferanae genome. The large majority of the replication genes are thus clustered in the wasp genome, although both campoplegine wasp genomes also contain one or two isolated genes. Clear homology was detected for several IVSPER genes across these closely and distantly related species. A function in particle assembly and trafficking was demonstrated for six H. didymator replication genes through RNA-interference experiments, clearly showing that IVSPER genes have classical viral functions.
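The presence calls in Table 4 lend themselves to simple set queries, for example to list the genes detected in all three bracoviruses. The sketch below transcribes only a few rows of the table for illustration; the dictionary `TABLE4_SUBSET` and the function `detected_in_all` are hypothetical names, not part of any published pipeline.

```python
# Calls in the column order MdBV, CcBV, CiBV, copied from a subset of Table 4.
# "+" = gene present, "-" = gene absent, "nd" = not detected to date.
TABLE4_SUBSET = {
    "helicase": ("+", "nd", "nd"),
    "integrase": ("+", "nd", "+"),
    "lef-8": ("+", "+", "+"),
    "vp39": ("+", "+", "+"),
    "pif-0 (p74)": ("+", "+", "+"),
    "11k": ("-", "+", "+"),
}

def detected_in_all(table: dict[str, tuple[str, ...]]) -> list[str]:
    """Return, in sorted order, the genes called '+' in every genome."""
    return sorted(gene for gene, calls in table.items()
                  if all(call == "+" for call in calls))

print(detected_in_all(TABLE4_SUBSET))  # ['lef-8', 'pif-0 (p74)', 'vp39']
```

Note that "nd" means only that a gene has not been detected so far, so a gene missing from this list is not necessarily absent from the genome.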

Concluding Remarks

Polydnaviruses are unusual viruses because of their atypical life cycle and their genome composed of two functional components. The virus particles are non-replicative and merely serve as gene transfer systems that deliver wasp virulence genes to the host of the parasitoid. As a consequence, the virus status of PDVs could be questioned. Nonetheless, through the replication genes that derive from the viral ancestors, PDVs still possess typical virus characteristics, as the endogenous viral elements present in the wasp genomes have clearly retained their ancient viral functions. Polydnaviruses have a complex life style and many aspects of their classical, molecular and evolutionary virology remain unresolved. For example, how are these genomes regulated to produce the viral particles? What mechanisms underlie particle assembly? Where do the envelopes that form in the nuclei come from? How do these integrated genomes evolve? Many such questions will have to be answered before the biology of these unique domesticated viruses becomes clearer.

Further Reading

Beckage, N.E., Drezen, J.M., 2012. Parasitoid Viruses: Symbionts and Pathogens. Academic Press. p. 312. doi:10.1016/C2009-0-64055-1.
Burke, G.R., 2019. Common themes in three independently derived endogenous nudivirus elements in parasitoid wasps. Current Opinion in Insect Science 32, 28–35. doi:10.1016/j.cois.2018.10.005.
Burke, G.R., Simmonds, T.J., Sharanowski, B.J., Geib, S.M., 2018. Rapid viral symbiogenesis via changes in parasitoid wasp genome architecture. Molecular Biology and Evolution 35, 2463–2474. doi:10.1093/molbev/msy148.
Drezen, J.M., Leobold, M., Bézier, A., et al., 2017. Endogenous viruses of parasitic wasps: Variations on a common theme. Current Opinion in Virology 25, 41–48. doi:10.1016/j.coviro.2017.07.002.
Feschotte, C., Gilbert, C., 2012. Endogenous viruses: Insights into viral evolution and impact on host biology. Nature Reviews Genetics 13, 283–296. doi:10.1038/nrg3199.
Lorenzi, A., Ravallec, M., Eychenne, M., et al., 2019. RNA interference identifies domesticated viral genes involved in assembly and trafficking of virus-derived particles in ichneumonid wasps. PLOS Pathogens 15 (12), e1008210. doi:10.1371/journal.ppat.1008210.
Pichon, A., Bézier, A., Urbach, S., et al., 2015. Recurrent DNA virus domestication leading to different parasite virulence strategies. Science Advances 1, e1501150. doi:10.1126/sciadv.1501150.
Strand, M.R., Burke, G.R., 2014. Polydnaviruses: Nature’s genetic engineers. Annual Review of Virology 1, 333–354. doi:10.1146/annurev-virology-031413-085451.
Strand, M.R., Burke, G.R., 2019. Polydnaviruses: Evolution and function. Current Issues in Molecular Biology 6, 163–182. doi:10.21775/cimb.034.163.

Poxviruses of Insects (Poxviridae)

Basil Arif and Lillian Pavlik, Laboratory for Molecular Virology, Great Lakes Forestry Centre, Sault Ste Marie, ON, Canada
Remziye Nalçacıoğlu, Hacer Muratoğlu, and Cihan İnan, Department of Molecular Biology and Genetics, Karadeniz Technical University, Trabzon, Turkey
Mehtap Yakupoğlu, Trabzon University, Trabzon, Turkey
Emine Özsahin, University of Guelph, Guelph, ON, Canada
Ismail Demir, Kazım Sezen, and Zihni Demirbağ, Department of Biology, Karadeniz Technical University, Trabzon, Turkey

© 2021 Elsevier Ltd. All rights reserved.

Glossary

Alkaline midgut: The midgut of lepidopteran larvae has a pH of about 10.5 and contains proteases active at this high pH.
Apoptosis: Programmed cell death. Viruses, toxins, etc. induce apoptosis in certain cells.
Ecdysteroids: Arthropod hormones responsible for molting and development.
Epithelial cells: A layer of cells that forms part of the midgut and is permissive to virus infection.
Fusolin: The main protein of the spindle body. The N-terminus is associated with degradation of the peritrophic membrane and enhances virus access to epithelial cells.
gfp: A gene encoding the green fluorescent protein, GFP.
Juvenile hormone: A group of hormones that regulate larval growth and prevent metamorphosis.
Lateral bodies: Proteinaceous bodies present in the virion as a single unit on one side of a unilateral concave core (alphaentomopoxviruses) or as two units on either side of a cylindrical viral core (beta- and gammaentomopoxviruses).
Malpighian tubules: A system that eliminates metabolic waste and regulates osmolarity.
Peritrophic membrane: A non-cellular, semi-permeable membrane composed mainly of chitin, which lines the larval midgut. It improves digestion, protects against chemical and mechanical damage and acts as a barrier to pathogenic infections.
Phylogeny: The evolutionary relationships among genetically related organisms (here viruses), inferred by comparing homologous gene or protein sequences.
Spheroid: A proteinaceous body in which mature virions are embedded. The spheroid also contains an endogenous alkaline protease. It protects the virus and delivers it to the larval midgut.
Spindles: Paracrystalline bodies that in some entomopoxviruses become occluded into the spheroid late in the infection cycle.
Tracheoblasts: Cells derived from epidermal cells that line the trachea and give rise to the tracheoles.

Introduction

Poxviruses have long been associated with smallpox, a most serious infection caused by two poxviruses, Variola major and Variola minor (genus Orthopoxvirus), which resulted in millions of human deaths before it was eradicated, with the last case reported in October 1977. Entomopoxviruses (EPVs) were discovered in 1963 by C. Vago, who noted very close structural similarities to poxviruses of vertebrates. EPVs, however, are distinguished by the synthesis of a roughly spherical structure, termed the spheroid or occlusion body, into which virions are embedded at the end of the viral replication cycle. The spheroid consists mainly of one protein, spheroidin (110–115 kDa), and functions much like the polyhedra of nucleopolyhedroviruses. Spheroids and polyhedra afford the occluded virions a certain amount of protection against inactivating environmental conditions such as desiccation, UV light and heat, but are also the means of delivering the virus to permissive midgut cells in the host insect. To date, EPVs have been isolated from Coleoptera, Orthoptera, Lepidoptera and Diptera, and from the German cockroach, Blattella germanica (Order: Blattodea). As in other poxviruses, EPVs replicate in the cytoplasm, primarily because the virus particle carries a DNA-dependent RNA polymerase to initiate the early stages of gene expression. The virus forms an electron-dense virogenic stroma, or viral factories, where virions are assembled as spherical immature particles. These virus particles develop into brick-shaped intracellular mature virions (IMVs) that are either released upon cell lysis or acquire a second double membrane from the trans-Golgi and bud as external enveloped virions (EEVs). Mature virions become occluded in spheroids late in the infection cycle and remain intracellular until cell lysis.
Intracellular particles are highly infectious and, indeed, cells infected with Amsacta moorei EPV (AMEV) can cause a systemic infection in the gypsy moth, Lymantria dispar, by per os inoculation. Virus purified from spheroids does not cause a systemic infection in L. dispar by per os inoculation, as this insect is not its natural host.

Classification The family Poxviridae is divided into two subfamilies, Chordopoxvirinae, which encompasses several genera of viruses that infect chordates and Entomopoxvirinae, which encompass poxviruses of insects and has been subdivided to contain four genera:

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00021-7

● Alphaentomopoxvirus, of which the species Melolontha melolontha entomopoxvirus is represented by Melolontha melolontha EPV (MMEV). Members of this genus infect coleopteran insects (beetles). Virions are 450 × 250 nm in size and ovoid in shape. They contain a unilateral concave core and a single lateral body. The genome size of these viruses is in the order of 260–370 kb.
● Betaentomopoxvirus, of which Amsacta moorei entomopoxvirus is the type species. Viruses of this genus infect lepidopteran insects (butterflies and moths). The virions contain a cylindrical core and a “sleeve-shaped” lateral body, and are similar in shape to alphaentomopoxviruses (ovoid) but smaller in size (250 × 350 nm).
● Deltaentomopoxvirus, of which Melanoplus sanguinipes entomopoxvirus is the type species. Members include orthopteran EPVs such as Melanoplus sanguinipes entomopoxvirus ‘O’ (MSEV).
● Gammaentomopoxvirus, of which Chironomus luridus entomopoxvirus is the type species. Members of this genus infect dipteran insects (mosquitoes and true flies). The virions are brick shaped (320 × 230 × 110 nm) and contain two lateral bodies and a biconcave core.
● Unassigned species in Entomopoxvirinae: Diachasmimorpha entomopoxvirus

Host range, morphology and virus size are the criteria that distinguish the genera. Poxviridae and Iridoviridae have been grouped as eukaryotic nucleocytoplasmic large DNA viruses (NCLDVs), which have double-stranded genomes and complete part or all of their replication and capsid assembly in the cytoplasm. Ascoviridae may also be included in this group. An order, Megavirales, has been proposed to include the above virus families along with Mimiviridae and Phycodnaviridae.

Virus/Midgut Interactions

Infection starts when larvae ingest plant material contaminated with spheroids that contain occluded virus particles (Fig. 1). Spheroids dissolve in the alkaline environment of the midgut with the aid of an endogenous alkaline protease. Released virions cross the peritrophic membrane and attach to permissive epithelial cells. Infection is enhanced in part by a virus-encoded protein termed fusolin, which appears to cause significant disruption of the peritrophic membrane. Fusolin is the major protein of the spindle-like structures that are occluded into the spheroids of most EPVs (Fig. 2). Some EPVs, such as Anomala cuprea EPV (ACEV) and Pseudaletia separata EPV (PSEV), do not occlude the spindles. Fusolins are conserved at the amino acid level and exhibit significant identity with an apparently unrelated baculovirus (BV) protein, GP37. Both fusolin and GP37 have a 20 amino acid leader sequence that is cleaved post-translationally. Interestingly, GP37 forms spindle-like structures in the cytoplasm of certain nucleopolyhedrovirus-infected cells (Fig. 3). Fusolin not only enhances the infectivity of EPVs; it also appears to interact functionally with BVs and boosts the infectivity of some nucleopolyhedroviruses (NPVs). The spindle protein, not the spheroid, causes the disintegration of the peritrophic membrane in the silk moth, which allows greater numbers of NPV virions to penetrate the microvilli. A series of ACEV fusolin deletion mutants were constructed, expressed in a BV expression system and assayed for enhancement of Bombyx mori NPV (BmNPV) infectivity. The N-terminus of fusolin (aa 1–253) was found to be essential for enhancing the infectivity of BmNPV in the silk moth. Interestingly, that part of the N-terminus is a chitin-binding domain that causes degradation of the peritrophic membrane. Glycosylation of N191 is also essential to fusolin activity and stability in midgut juices. The carboxy-terminus does not appear to play a role in this function.

As with many other viruses, EPVs induce apoptosis upon infection of certain cells. Most but not all EPVs contain inhibitor of apoptosis genes (iap), but their anti-apoptotic properties have not been established. BVs express a virus-encoded protein, P35, that targets cellular caspases to inhibit apoptosis. AMEV encodes P33, a functional homolog of the baculovirus P35 protein, which inhibits programmed cell death in certain insect and human cell lines. It is specific for effector but not initiator caspases, and mutagenesis of the caspase cleavage site abolishes cleavage and the subsequent anti-apoptotic activity.

Fig. 1 Choristoneura biennis entomopoxvirus occluded within a spheroid. Arrow points to virions maturing within the spheroid.

Fig. 2 Virions and spindles (arrow) occluded within a spheroid.

Fig. 3 GP37 spindles in the cytoplasm of CF70 cells infected with Choristoneura fumiferana defective nucleopolyhedrovirus (CfDEFNPV).

Replication in Larvae

Much like the occlusion bodies of baculoviruses, the spheroid functions not only to give the virions a certain amount of protection in the environment but, more importantly, to deliver the virus to permissive midgut cells. The spheroid contains an endogenous alkaline protease that aids in dissolution to release the virions efficiently. Infection starts when the outer envelope protein of the virion attaches to and fuses with the plasma membrane, allowing entry of the viral core and lateral bodies into the cell cytoplasm. The entry–fusion complex in chordopoxviruses (ChPVs) has been substantially elucidated, primarily in vaccinia virus, and the H3 component of the complex is a homolog of AMEV amv248. Table 1 lists AMEV ORFs potentially encoding components of the entry–fusion complex and their orthologs in Fowlpox virus.

Table 1 Potential EPV genes related to fusion entry complex and orthologs in Fowlpox virus

Predicted function                                      AMEV homolog  MSEV homolog  Chordopoxvirus homolog
Structural
Myristylated protein (Cop-G9R)                          AMEV 035      MSEV-Tuc 121  FWPV-Iowa 127
Myristylated Entry/Cell-cell fusion protein (Cop-A16L)  AMEV 118      MSEV-Tuc 090  FWPV-Iowa 181
Unknown MP (Cop-J5L)                                    AMEV 232      MSEV-Tuc 142  FWPV-Iowa 136
S-S bond formation pathway protein (Cop-F9L)            AMEV 243      MSEV-Tuc 094  FWPV-Iowa 112
Entry and Cell-Cell Fusion (Cop-A21L)                   AMEV 249      MSEV-Tuc 209  FWPV-Iowa 186
IMV MP/Virus entry (Cop-A28L)                           AMEV 186      MSEV-Tuc 132  FWPV-Iowa 192
Entry and Cell-Cell Fusion (Cop-H2R)                    AMEV 127      MSEV-Tuc 060  FWPV-Iowa 139

Fig. 4 Viroplasm of CBEV infected cells and maturing budded virions. Intracytoplasmic virus particles are at various stages of maturity. Arrows point to electron-dense virogenic stroma.

Virus replication takes place exclusively in the cytoplasm of infected cells, in electron-dense viroplasm or virus factories where immature virion cores are formed. Maturation of the virions in some hosts occurs after occlusion into the spheroids (Figs. 1 and 2). Extracellular virus may also mature following budding from infected cells (Fig. 4). EPVs infect four orders of insects, Lepidoptera, Orthoptera, Diptera and Coleoptera and, while the virus is generally infective to larvae, an EPV of the European spruce bark beetle, Ips typographus, appears to target adult insects only. The primary larval tissue that supports virus replication is the fat body, although other tissues also become infected. Melanoplus sanguinipes EPV (MSEV) is more restricted and infects fat tissue and haemocytes. A more comprehensive picture of the infection pathway is seen when AMEV tagged with a gfp marker is used to infect the gypsy moth, Lymantria dispar, by the per os route. Since gfp is under the control of the very late spheroidin promoter, GFP expression indicates that the virus has completed productive replication. The first signs of fluorescence are in the haemocytes, which appear to carry the virus to the rest of the larval tissues. Unlike BVs, for which the trachea is the conduit that disseminates the virus systemically, AMEV does not appear to infect tracheoblasts (Fig. 5) until the very late stages of larval infection, by which time the virus has spread to all susceptible tissues. The fat tissue nevertheless remains the main site of productive replication, where the virus progressively invades all cells (Fig. 6). Virus replication also occurs in the silk glands and the hind- and midgut. The Malpighian tubules are refractory to infection. At the latter stages of infection, the gypsy moth larva becomes totally fluorescent with GFP (Fig. 7).

Fig. 5 AMEV expressing GFP in haemocytes proximal to uninfected trachea. Reproduced from Perera, S., Zhen, L., Pavlik, L. and Arif B. (2010). Entomopoxviruses. In Insect Virology (Edited by: Sassan Asgari and Karyn N. Johnson), pp 83–102. Caister Academic Press, UK.

Fig. 6 Gypsy moth fat tissue infected by AMEV gfp+ sph at different hours post infection (pi). Reproduced from Perera, S., Zhen, L., Pavlik, L. and Arif B. (2010). Entomopoxviruses. In Insect Virology (Edited by: Sassan Asgari and Karyn N. Johnson), pp 83–102. Caister Academic Press, UK.

To date, EPVs have not been used to any extent in the control of agricultural or forest insect pests, primarily because the virus takes a long time to kill the insect, usually 4–5 weeks. Some correlation has been observed between the host and the time it takes to kill the larvae. For example, coleopteran EPVs can take up to 37 weeks, and the time to death may be temperature dependent, whereas lepidopteran EPVs take about 4 weeks. Apart from general morbidity, symptoms of infected larvae differ from host to host. Larvae of the lepidopteran lesser cornstalk borer, Elasmopalpus lignosellus, become unresponsive and sluggish just before they die. Their cuticle changes from brown to reddish in color and the hemolymph turns white or light blue. EPV-infected larvae of the black-soil scarab, Othnonius batesi, develop a characteristic white-spotted or mottled exterior caused by the accumulation of infected fat body in the last segment, which encloses the rectal sac. Lepidopteran larvae such as the Winter moth, Operophtera brumata, and the Army cutworm, Euxoa auxiliaris, become whitish, but those of the Salt marsh caterpillar, Estigmene acrea, remain dark after infection. Larvae of the Eastern spruce budworm, Choristoneura fumiferana, infected with CFEV, and of the Two-year-cycle budworm, C. biennis, infected with CBEV, become much larger in size because pupation is either delayed or inhibited until they die. EPVs interfere with metamorphosis by modulating the levels of juvenile hormone (JH) and ecdysteroids in the hemolymph. Larvae treated with JH appear to show similar symptoms. When infected with EPV, many die as larval/pupal or pupal/adult intermediates. It appears that upon infection with an EPV, the level of JH is elevated with a concomitant reduction of the ecdysteroid titer.

Fig. 7 Whole gypsy moth larva at the latter stages of infection with AMEV gfp+ sph, exhibiting GFP expression due to virus infection.

Table 2 Sequenced entomopoxvirus genomes

Viral genome                  Size (bp)   ITR size (bp)
Anomala cuprea EPV            245,717     22,978
Amsacta moorei EPV            232,392     9458
Choristoneura biennis EPV     307,691     23,817
Choristoneura rosaceana EPV   282,895     13,406
Adoxophyes honmai EPV         228,750     5617
Mythimna separata EPV         281,182     7347
Melanoplus sanguinipes EPV    236,120     7201
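As a worked example of reading Table 2, the share of each genome taken up by its inverted terminal repeats can be computed directly from the two columns. The sketch below assumes one ITR copy at each genome end (hence the factor of two); the names `GENOMES` and `itr_fraction` are illustrative only.

```python
# (genome size in bp, ITR size in bp), values copied from Table 2.
GENOMES = {
    "Anomala cuprea EPV": (245_717, 22_978),
    "Amsacta moorei EPV": (232_392, 9_458),
    "Choristoneura biennis EPV": (307_691, 23_817),
    "Choristoneura rosaceana EPV": (282_895, 13_406),
    "Adoxophyes honmai EPV": (228_750, 5_617),
    "Mythimna separata EPV": (281_182, 7_347),
    "Melanoplus sanguinipes EPV": (236_120, 7_201),
}

def itr_fraction(genome_bp: int, itr_bp: int) -> float:
    """Fraction of the genome covered by both ITR copies (assumes two copies)."""
    return 2 * itr_bp / genome_bp

# Print the genomes from shortest to longest ITR with the ITR share in percent.
for name, (size, itr) in sorted(GENOMES.items(), key=lambda kv: kv[1][1]):
    print(f"{name:28s} {100 * itr_fraction(size, itr):5.1f}%")
```

Under this assumption the two ITR copies of Anomala cuprea EPV account for roughly 19% of its genome, whereas those of Adoxophyes honmai EPV account for about 5%.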

Genomics

Conserved and Shared Genes

Poxvirus genomes share 49 core genes completely conserved among members of Chordopoxvirinae and Entomopoxvirinae. The conserved genes play roles in key functions such as transcription, DNA replication and virion structural assembly; they are considered essential for virus survival and may even reflect a minimum poxvirus genome. At least 90 genes, including the 49 core genes, are common to all ChPV genera. While the central region of the genome contains genes required for essential functions, the termini contain genes specific to a particular genus, some of which are involved in virus/host interactions. Some genes in the termini encode virulence proteins that counteract the host’s immune responses. During the co-evolution of ChPVs with their hosts, the virus acquired genes encoding proteins needed to evade the host's immune responses. A similar process has most likely occurred in EPVs and will probably be elucidated as more studies are conducted on the interactions of EPVs with their natural hosts. Several EPV genes and their homologs in ChPVs have been characterized with respect to their temporal transcription and viral functions. The AMEV protein kinase gene (amv197) is expressed early and encodes a protein of 299 amino acids with conserved serine/threonine protein kinase (PK) domains. In a yeast two-hybrid system, this enzyme and another PK encoded by amv153 interacted specifically with a number of viral proteins, including two involved in the fusion entry complex. Interestingly, amv133 encodes a protein with two types of enzyme activity, an esterase and a protease. Compared with ChPVs, only seven EPV genomes have been completely sequenced to date (Table 2). The EPV genomes are distinguished by a very high A + T content of 79%–82%. A characteristic poxvirus genome, including those of EPVs, comprises a highly conserved central region, where genes and ORFs show collinearity, flanked by variable terminal regions.
In relation to ChPVs, the central region of the EPV genomes has undergone significant gene rearrangements during the evolutionary separation of the two poxvirus subfamilies. Among the EPVs, comparison of gene order and arrangement in the central part of the genomes of AMEV, CREV, CBEV and MSEV (Fig. 8) reveals almost complete collinearity in the first three genomes, which are betaentomopoxviruses. In the genome of MSEV, gene order has been almost totally rearranged, confirming its divergence from the betaentomopoxviruses, as recognized by the ICTV, which classified it in the new (2019) genus Deltaentomopoxvirus (Fig. 9). The first EPV genome to be completely sequenced was the 236-kb MSEV genome, containing 267 ORFs. In 2000, the sequence of AMEV (232 kb) became available and was assumed to have 294 protein-coding ORFs. The sequences of five other genomes were reported later (Table 2) and, as in other poxviruses, EPV genomes contain inverted terminal repeat regions (ITRs), which range from 5.6 kb to about 24 kb in length (Table 2).
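The A + T content quoted above is simply the fraction of A and T bases in the genome sequence. A minimal sketch of that calculation follows; the function name `at_content` and the example sequence are hypothetical, and ambiguous bases such as N are skipped by assumption.

```python
def at_content(seq: str) -> float:
    """Return the fraction of A and T among the unambiguous bases of `seq`."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # ignore N and other symbols
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)

# Hypothetical 10-base example: 8 of 10 bases are A or T.
print(at_content("AATTAATTGC"))  # 0.8
```

Applied to a complete EPV genome sequence, a value between 0.79 and 0.82 would correspond to the range reported in the text.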

Fig. 8 Gene collinearity in the central part of the genomes of AMEV, CREV, CBEV and MSEV.
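One simple way to put a number on the collinearity compared in Fig. 8 is to ask what fraction of shared genes can be kept while preserving their relative order in both genomes. The sketch below uses the longest increasing subsequence for this purpose. It is an illustrative assumption rather than the method used to produce the figure; the gene labels are hypothetical, and inverted blocks are counted as rearranged.

```python
from bisect import bisect_left

def collinear_fraction(order_a: list[str], order_b: list[str]) -> float:
    """Fraction of shared genes that keep the same relative order in both genomes."""
    position_in_b = {gene: i for i, gene in enumerate(order_b)}
    shared = [position_in_b[g] for g in order_a if g in position_in_b]
    if not shared:
        return 0.0
    # Longest strictly increasing subsequence via patience sorting.
    tails: list[int] = []
    for pos in shared:
        i = bisect_left(tails, pos)
        if i == len(tails):
            tails.append(pos)
        else:
            tails[i] = pos
    return len(tails) / len(shared)

genome_a = ["g1", "g2", "g3", "g4", "g5"]
print(collinear_fraction(genome_a, ["g1", "g2", "g3", "g5", "g4"]))  # 0.8
print(collinear_fraction(genome_a, ["g5", "g4", "g3", "g2", "g1"]))  # 0.2
```

With such a measure, nearly collinear genomes would score close to 1, and a heavily rearranged genome would score much lower.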

Fig. 9 Unrooted phylogeny of EPVs based on the protein, spheroidin. Reproduced from Perera, S., Zhen, L., Pavlik, L. and Arif B. (2010). Entomopoxviruses. In Insect Virology (Edited by: Sassan Asgari and Karyn N. Johnson), pp 83–102. Caister Academic Press, UK.

The initial prediction of 294 protein-coding ORFs in the genome of AMEV is probably an overestimate, and the actual number may be significantly lower. A method based on purine content and the average aspartate-glutamate-serine composition was described to predict functional ORFs in A + T rich genomes. By this method, 45 of the previously annotated ORFs in the AMEV genome are likely to be non-coding and cannot be considered genes. Using the Z-curve method, two other studies predicted that 38–48 previously annotated AMEV ORFs do not encode proteins. Based on the latter two studies, the AMEV genome encodes 246–256 functional proteins, significantly fewer than the 294 ORFs reported earlier.
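The revised gene counts follow from simple subtraction of the predicted non-coding ORFs from the original annotation. The short sketch below reproduces that arithmetic; the function name `coding_range` is illustrative.

```python
ANNOTATED_ORFS = 294  # ORFs in the original AMEV annotation

def coding_range(annotated: int, noncoding_low: int, noncoding_high: int) -> tuple[int, int]:
    """Return (minimum, maximum) number of ORFs still predicted to be coding."""
    return annotated - noncoding_high, annotated - noncoding_low

# Z-curve studies: 38-48 annotated ORFs predicted to be non-coding.
print(coding_range(ANNOTATED_ORFS, 38, 48))  # (246, 256)
# Composition-based method: 45 ORFs predicted to be non-coding.
print(ANNOTATED_ORFS - 45)                   # 249
```

The estimate from the composition-based method, 249, falls within the 246–256 range derived from the Z-curve studies.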

Table 3 Poxvirus gene families with orthologs in AMEV and MSEV

ALI (alanine-leucine-isoleucine) motif, subgroup 1
  AMEV ORF: AMV257
  MSEV ORF: MSV024, 026, 196, 204
ALI (alanine-leucine-isoleucine) motif, subgroup 2
  AMEV ORF: AMV055, 057, 175, 177
  MSEV ORF: MSV023, 194, 195
Leucine-rich repeats (LRR)
  AMEV ORF: AMVITR01, AMV005, 014, 076, 134
  MSEV ORF: MSV008, 009, 010, 011, 013, 014, 015, 016, 018, 186, 227, 228, 239, 240, 241, 253, 254, 255, 257, 260, 261
Methionine-threonine-glycine (MTG)
  AMEV ORF: AMV194, 207, 209
  MSEV ORF: MSV021, 191, 198, 199
Tryptophan (W) repeat
  AMEV ORF: AMV029, 254
  MSEV ORF: MSV027, 029, 034, 197, 205, 252
Serine-cysteine-glycine (SCG)
  AMEV ORF: AMV176
  MSEV ORF: MSV062, 214, 215, 216, 217
17K ORF (KilA-N domain-containing proteins)
  AMEV ORF: AMV56, 176, 178
  MSEV ORF: MSV062, 214, 215, 216, 217

The genome sequence of the EPV infecting the Oriental beetle, Anomala cuprea (genus Alphaentomopoxvirus), revealed that it is 245,717 bp long, smaller than previously reported, and contains the 49 core genes conserved in poxviruses. Because only one genome from the genus Alphaentomopoxvirus has been sequenced, compared with five from the genus Betaentomopoxvirus, it is too early to draw general conclusions about the major differences between viruses of the two genera. However, some differences are worth mentioning. For example, ACEV does not have a homolog of the apoptosis inhibitor P33 but contains serpins, which are known to inhibit apoptosis in ChPVs. The ITRs of ACEV and CBEV are in the order of 23–24 kb, significantly longer than those of other EPVs (Table 2). Also, at the terminus of each ITR of ACEV there is a concatemer resolution motif, which has been reported to be essential for resolving concatemeric DNA during replication of ChPV genomes.

Gene Families

EPVs contain genes that can be assembled into seven gene families (Table 3). Of these families, the AMEV genome contains 23 genes in six families and MSEV has 43 genes in five families. To date, the function of these genes has not been ascertained. However, it is reasonable to assume that members of the ALI (alanine-leucine-isoleucine) family and the 17K ORF family play functional roles in transcription and replication of DNA. The ALI gene family is characterized by an amino-terminal motif that contains invariant residues of alanine, leucine, and isoleucine. With the exception of AMV257, all members of the ALI motif family contain the Bro-N domain (N-terminus of baculovirus BRO proteins). The function of the baculovirus BRO proteins is still not clear, but it has been suggested that they are DNA-binding proteins and may also be involved in regulating host DNA replication and/or transcription. The 17K ORF gene family comprises five AMEV proteins and homologs of the 17K ORF of Heliothis armigera entomopoxvirus (HAEV). This family has a putative DNA-binding domain (KilA-N domain) present in many proteins of eukaryotic and bacterial DNA viruses, which implies a role in transcription and/or replication.

Entomopoxvirus Phylogeny

A variety of criteria have been used in phylogenetic analysis, including morphology, gene content, gene order and arrangement, and even individual genes. Relatively recently, single nucleotide polymorphism (SNP) and alignment-free methods have been used to illustrate the phylogeny of poxviruses. Collinearity between any two genomes reflects conservation of genes and of their order within the genomes. Based on the phylogeny of concatenated amino acid sequences of the 49 core genes, MSEV was observed to be distinctly divergent from betaentomopoxviruses. Similarly, phylogeny based on spheroidins revealed that the betaentomopoxviruses cluster together and are divergent from the alphaentomopoxviruses and deltaentomopoxviruses (Fig. 9). Indeed, a new genus was proposed, now called Deltaentomopoxvirus and comprising the orthopteran EPVs: MSEV, Calliptamus italicus EPV (CIEV), Gomphocerus sibiricus EPV (GSEV), Oedaleus asiaticus EPV (OAEV), and Anacridium aegyptium EPV (AAEV). These five EPVs are evolutionarily divergent from the alpha- and betaentomopoxviruses. For the same genes, betaentomopoxviruses have a low degree of identity with MSEV. For example, the spheroidin of MSEV is about 20% identical to those of betaentomopoxviruses such as AMEV, CBEV and HAEV, and to those of the alphaentomopoxviruses MMEV and ACEV. In contrast, 76%–92% amino acid identity was detected among betaentomopoxvirus spheroidins. Moreover, the organization and gene order among the lepidopteran betaentomopoxviruses are quite conserved, particularly in the central region of the genomes. Such conservation is not observed between betaentomopoxviruses and those unassigned in the classification scheme. The molecular data that revealed the divergence between betaentomopoxviruses and MSEV validate the decision by the ICTV to remove the latter from the genus Betaentomopoxvirus and reclassify it as a member of the genus Deltaentomopoxvirus.
While virus groups such as EPVs and BVs have descended from different ancestral origins, they have evolved remarkably similar processes for infecting the natural host. Analysis by secondary structure alignments revealed 33 clusters of homologous genes common to EPVs and BVs and, not surprisingly, DNA polymerase is among the homologous clusters. The genes were independently acquired from the insect host, other viruses, eukaryotes and bacteria. This gene acquisition convergence may indicate adaptations to conserved interactions of the virus with the natural host. In fact, 15 genes appear to have been acquired by EPV and BV core genomes from the natural host by horizontal gene transfer.



Acknowledgment We are indebted to Ms. Erika Provenzano for expending time and effort in literature searches and procuring relevant published manuscripts to aid in the preparation of this review. Genomics research on EPVs was supported by a major grant from Genome Canada through the Ontario Genomics Institute and three grants from the Scientific and Technological Research Council of Turkey (TUBITAK). Figs. 5, 6 and 9 have been published previously in the book “Insect Virology”, Caister Academic Press, Norfolk, UK. The publisher is hereby acknowledged.

Further Reading
Arif, B.M., 1995. Recent advances in the molecular biology of entomopoxviruses. Journal of General Virology 76, 1–13.
Becker, M., Moyer, R., 2007. Subfamily Entomopoxvirinae. In: Mercer, A.A., Schmidt, A., Weber, O. (Eds.), Poxviruses. Basel, Switzerland: Birkhauser Verlag, pp. 253–271.
Kurstak, E., 1991. The entomopoxviruses. In: Arif, B.M., Kurstak, E. (Eds.), Virus of Invertebrates. Marcel Dekker Inc, pp. 179–195.
McLysaght, A., Baldi, P.F., Gaut, B.S., 2003. Extensive gene gain associated with adaptive evolution of poxviruses. Proceedings of the National Academy of Sciences 100, 15655–15660.
Perera, S., Li, Z., Pavlik, L., Arif, B., 2010. Entomopoxviruses. In: Asgari, S., Johnson, K. (Eds.), Insect Virology. Caister Academic Press, pp. 83–102.
Roberts, D.W., Yendol, W.G., 1973a. Insect poxviruses: Pathology, morphology and development. In: Granados, R.R. (Ed.), Some Recent Advances in Insect Pathology. Miscellaneous Publications of Entomological Society of America, pp. 73–94.
Senkevich, T.G., Bugert, J.J., Sisler, J.R., et al., 1996. Genome sequence of a human tumorigenic poxvirus: Prediction of specific host response-evasion genes. Science 273, 813–816.
Thézé, J., Takatsuka, J., Nakai, M., Arif, B., Herniou, E.A., 2015. Gene acquisition convergence between entomopoxviruses and baculoviruses. Viruses 7, 1960–1974.

Reoviruses of Invertebrates (Reoviridae)
Peter J Krell, Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON, Canada
© 2021 Elsevier Ltd. All rights reserved.
This is an update of P.P.C. Mertens, H. Attoui, Insect Reoviruses, In Encyclopedia of Virology (Third Edition), edited by Brian W.J. Mahy, Marc H.V. Van Regenmortel, Elsevier Ltd., 2008, doi:10.1016/B978-012374410-4.00610-5.

Glossary
Diptera (also known as true flies) An order of insects that comprises mosquitoes, gnats, midges, sand flies, and other flies. They possess a single pair of wings on the mesothorax (the middle of the three segments of the thorax of an insect) and a pair of halteres, which are derived from the hind wings, on the metathorax (the posterior of the three segments of the thorax of an insect). The name is derived from the number of wings (di means two in Greek and pteron refers to the wings).
Hymenoptera This order of insects comprises the sawflies, wasps, bees, and ants. The name refers to the membranous wings of the insects and is derived from ancient Greek (hymen: membrane and pteron: wings). The hind wings of hymenopterans are connected to their forewings by hooks.
Lepidoptera This order of insects comprises butterflies, skippers, and moths. The name is derived from ancient Greek and refers to the minute scales (lepidon) that cover the membranous lanceolate wings (pteron) of adults.
Reovirus A virus in the family Reoviridae characterized by an icosahedral capsid of about 80 nm with one or two shells encompassing a segmented (usually 10–12 segments) dsRNA genome with an aggregate size of about 25 kb. The name reovirus comes from respiratory enteric orphan virus, since these viruses caused respiratory and enteric infections. Because they were originally thought not to be associated with disease, they were erroneously considered orphan viruses.
RNA dependent RNA polymerase (RdRp) An RNA polymerase which produces an RNA product from an ssRNA template.

Taxonomy [in brackets is the first year in which ICTV accepted the taxon name].
Family: Reoviridae [ICTV 1974]. Order: Reovirales [ICTV 2019]. Higher Orders: Realm Riboviria, Kingdom Orthornavirae, Phylum Duplornaviricota, Class Resentoviricetes.

Subfamily Sedoreovirinae [ICTV 2009]
Genus: Cardoreovirus [ICTV 2008] (1 species) Type Species: Eriocheir sinensis reovirus [ICTV 2008] Species Exemplar: Callinectes sapidus reovirus 1
Genus: Phytoreovirus [ICTV 1978] (3 species) Type Species: Wound tumor virus [ICTV 1976] Species Exemplar: Wound tumor virus (34)
Genus: Seadornavirus [ICTV 2004] (3 species) Type Species: Banna virus [ICTV 1999] Species Exemplar: Banna virus-China

Subfamily Spinareovirinae [ICTV 2009]
Genus: Aquareovirus [ICTV 1990] (7 species) Type Species: Aquareovirus A [ICTV 1999] Species Exemplar: American oyster reovirus 13p2
Genus: Cypovirus [ICTV 1990] (16 species) Type Species: Cypovirus 1 [ICTV 1999] Species Exemplar: Bombyx mori cypovirus 1
Genus: Dinovernavirus [ICTV 2008] (1 species) Type Species: Aedes pseudoscutellaris reovirus [ICTV 2008] Species Exemplar: Aedes pseudoscutellaris reovirus
Genus: Fijivirus [ICTV 1978] (9 species) Type Species: Fiji disease virus [ICTV 1979] Species Exemplar: Fiji disease virus
Genus: Idnoreovirus [ICTV 2004] (5 species) Type Species: Idnoreovirus 1 [ICTV 2004] Species Exemplar: Diadromus pulchellus idnoreovirus 1
Genus: Orthoreovirus [ICTV 1991] (10 species) Type Species: Mammalian orthoreovirus [ICTV 1999] Species Exemplar: Mahlapitsi orthoreovirus

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-814515-9.00084-9

Introduction As a group, the reoviruses have a broad host range, including insects, algae, fungi, fish, crustaceans such as crabs, humans and other vertebrates and their invertebrate vectors, and plants and their invertebrate vectors. Since the 2008 Encyclopedia of Virology 3rd Edition, the Reoviridae family has been subdivided, in the Ninth Report of the International Committee on Taxonomy of Viruses in 2009, into two subfamilies, Sedoreovirinae with six genera and Spinareovirinae with nine genera. The two subfamilies differ in the nature of the virion surface, with the Spinareovirinae having a turreted or “spiked” surface (hence the prefix “Spina”, Latin for spike) while the Sedoreovirinae have a smooth surface (“Sedo” is Latin for smooth). Among the better known reoviruses are those infecting humans, other mammals and other vertebrates. Those in genera Rotavirus and, with rare exceptions, Orthoreovirus infect only vertebrates, while viruses in genera Coltivirus (e.g., Colorado tick fever virus), Orbivirus (e.g., bluetongue virus) and one virus in genus Seadornavirus are considered arboviruses, which infect both vertebrates and their invertebrate vectors (mostly ticks, mosquitoes, midges, gnats and sandflies). Lesser known reoviruses infect algae (Mimoreovirus) and fungi (Mycoreovirus). In addition to infecting plants, reoviruses in genera Fijivirus, Oryzavirus and Phytoreovirus also infect their insect vectors (leafhoppers and planthoppers). These are described in more detail in the article on Reoviridae in this Encyclopedia. There are also reoviruses that infect exclusively invertebrates. Many reoviruses infecting only insects are found in the genera Cypovirus and Idnoreovirus. Other reoviruses infecting only insects are found in genera Dinovernavirus, Fijivirus, Orthoreovirus, Phytoreovirus and Seadornavirus. Two genera, Aquareovirus and Cardoreovirus, include reoviruses that infect only aquatic invertebrates such as oysters and crabs, some of economic importance. Like other reoviruses, the reoviruses of invertebrates have a single or double shelled icosahedral capsid with diameters ranging from 60 to 80 nm (though idnoreovirus capsids are smaller, at about 50 nm in diameter).
While cypoviruses and dinovernaviruses have a single shell with T = 2, aquareoviruses, coltiviruses, idnoreoviruses, cardoreoviruses, phytoreoviruses and seadornaviruses have two shells, the inner with T = 2 and the outer layer with T = 13 icosahedral symmetry. They have 8–12 dsRNA segments. These range in size from 0.76 to 4.47 kb with each segment, in general, encoding one protein. For cypoviruses, segment 5 encodes two proteins by ribosomal skipping. Aggregate genome sizes range from 18.1 kb to 25.72 kb (although a size of 32.7 kb was reported for an unclassified Dendrolimus punctatus cypovirus), with a G + C content (based on those reoviruses whose whole genomes have been sequenced) ranging from 33.25% to 43.81%. The morphology, infection and replication of the invertebrate reoviruses closely resemble those of other reoviruses and are covered only briefly here; the reader is referred to the article in this Encyclopedia on Reoviruses (Reoviridae) and Their Structural Relatives.

Spinareovirinae Much of the material on the Spinareovirinae viruses in this article is based on Insect Reoviruses by Peter Mertens and Houssam Attoui in the previous (3rd) edition of the Encyclopedia of Virology, which provided a good foundation for this update. Some spinareoviruses infecting only invertebrates are found in genera Cypovirus (insects), Dinovernavirus (mosquitoes) and Idnoreovirus (Hymenoptera and Diptera). One reovirus in genus Aquareovirus (which, as the name implies, infects aquatic animals) infects only oysters, while others in the same genus infect vertebrates such as fish. In the genus Fijivirus of plant-infecting reoviruses, one virus, Nilaparvata lugens reovirus, has been found only in planthoppers. One virus in genus Orthoreovirus, Mahlapitsi orthoreovirus, infects only bat flies, while Chiqui virus, another, unclassified, orthoreovirus, was isolated from mosquitoes. Table 1 summarizes the recognized species in the Spinareovirinae with viruses infecting invertebrates within the six genera listed above. Of the Spinareovirinae insect reoviruses, most cypoviruses have 10 dsRNA segments, but some have 8, 11 or even 16 segments (Dendrolimus punctatus cypovirus); they infect Lepidoptera such as Lymantria dispar (gypsy moth), and one infects the hymenopteran Polistes hebraeus. Idnoreoviruses have 10 dsRNA segments and infect Hymenoptera such as Diadromus pulchellus (a hymenopteran parasitoid) and Diptera such as Musca domestica. Dinovernavirus has 9 dsRNA segments and was found in the Aedes mosquito cell line AP61. Cypovirus was the first invertebrate reovirus to be discovered, in 1934, as polyhedra in the cytoplasm of infected cells of silkworm larvae; these viruses were therefore initially named cytoplasmic polyhedrosis viruses, from which the current genus name was derived. Aquareoviruses (Aqua for the aqueous environment of their hosts) have been isolated from oysters and crabs.
Idnoreoviruses (from insect-derived nonoccluded reovirus) were first isolated from houseflies in 1978, while the lone dinovernavirus (from double-stranded RNA, icosahedral, nove (meaning nine) dsRNA segments) was isolated from a cell line from Aedes pseudoscutellaris in 2005. The cypoviruses can produce severe to chronic diseases in their infected hosts and represent one of the most significant threats to the silkworm industry. The idnoreoviruses have a core structure that is directly comparable to the cypovirus virion, but they also have an additional ‘outer layer’ of capsid proteins. They do not usually cause a cytopathic effect or disease in their insect hosts but may be involved in modifying the phenotype or sex of the host insect. The dinovernaviruses have been isolated only from cultured insect cells and, like cypoviruses, have a single-shelled capsid, but do not produce polyhedra. The Cytoplasmic polyhedrosis virus group was initially recognized as a genus of viruses infecting insects in 1971 by the International Committee on Taxonomy of Viruses (ICTV). Unlike baculoviruses, which replicate in the nucleus, these viruses replicate and produce polyhedra in the cytoplasm, giving rise to their name as cytoplasmic polyhedrosis viruses. By 1990 the genus name was changed to Cypovirus (sigla: cytoplasmic polyhedrosis virus) and recognized by ICTV. Cypovirus particles are ~70 nm in diameter, contain a segmented dsRNA genome of 10 segments, and belong to the “turreted” reoviruses, with surface projections (turrets) situated at each of the 12 “fivefold” vertices. Unlike most other reoviruses, cypoviruses are single-shelled, with no outer capsid layer. Cypoviruses have one unique feature among reoviruses in that they form polyhedra which can retain infectivity for years.
The major polyhedrin protein is translated in large excess in the cytoplasm late in infection. Virions are occluded in the polyhedrin at the periphery of the virogenic stroma, in which morphogenesis of the virions occurs, and virus particles become singly or multiply embedded. More than 230 cypovirus isolates have been described, with evidence for at least 22 distinct species (based on differences in the migration pattern of their genome segments during electrophoresis, the electropherotype, and on nucleotide sequence divergence). However, very few cypoviruses have so far been isolated from Africa, and many more species are likely to be recognized. The genus Idnoreovirus (sigla: insect-derived nonoccluded reovirus) was formally recognized by ICTV in 2005. Idnoreoviruses have genomes of ten dsRNA segments and are icosahedral particles, ~70 nm in diameter, with a turreted “core”. However, unlike the cypoviruses, they also have an outer capsid protein layer and do not produce polyhedra. Idnoreoviruses have been isolated from various flies including the fruit fly, the small fruit fly (drosophila), the olive fly, and the housefly. They have also been isolated from parasitic wasps. The genus Dinovernavirus (sigla: double-stranded insect, novem (nine in Latin, the number of genome segments) RNA virus) includes the first nine-segmented dsRNA virus. Like the cypoviruses, dinovernaviruses are single-shelled, turreted, and lack an outer capsid layer, but do not produce polyhedra or a “polyhedrin” protein. While the genus Aquareovirus contains mostly reoviruses that infect fish such as salmon, shiner and catfish, one virus in species Aquareovirus A, American oyster reovirus, infects oysters, and aquareoviruses have also been isolated from geoducks and from shore, swimmer and mitten crabs.

Table 1 Spinareovirinae reoviruses infecting invertebrates, their hosts and some of their genome characteristics

Species | Virus | Host | # Segs | Aggregate genome size (bp) | Range of segment sizes (kb) | % G+C
Genus Aquareovirus
Aquareovirus A | American oyster reovirus 13p2 | Oyster | 11 | – | – | –
Genus Cypovirus
Cypovirus 1 | Lymantria dispar cypovirus 1 | Lepidoptera | 10 | 24,732 | 0.94–4.16 | 43.22
Cypovirus 2 | Inachis io cypovirus 2 | Lepidoptera | 10 | 24,041 | 1.19–3.74 | 38.59
Cypovirus 3 | Anaitis plagiata cypovirus 3 | Lepidoptera | – | – | – | –
Cypovirus 4 | Antheraea mylitta cypovirus 4 | Lepidoptera | 11 | – | S9, 1473 | –
Cypovirus 5 | Heliothis armigera cypovirus 5 China | Lepidoptera | 10 | 24,818 | 0.88–4.12 | 36.42
Cypovirus 6 | Aglais urticae cypovirus 6 | Lepidoptera | – | – | – | –
Cypovirus 7 | Mamestra brassicae cypovirus 7 | Lepidoptera | – | – | – | –
Cypovirus 8 | Abraxas grossulariata cypovirus 8 | Lepidoptera | – | – | – | –
Cypovirus 9 | Agrotis segetum cypovirus 9 | Lepidoptera | – | – | – | –
Cypovirus 10 | Aporophyla lutulenta cypovirus 10 | Lepidoptera | – | – | – | –
Cypovirus 11 | Heliothis armigera cypovirus 11 | Lepidoptera | – | – | – | –
Cypovirus 12 | Autographa gamma cypovirus 12 | Lepidoptera | – | – | – | –
Cypovirus 13 | Polistes hebraeus cypovirus 13 | Hymenoptera | – | – | – | –
Cypovirus 14 | Lymantria dispar cypovirus 14 | Lepidoptera | 10 | 25,325 | 0.96–4.44 | 40.29
Cypovirus 15 | Trichoplusia ni cypovirus 15 | Lepidoptera | 11 | 25,120 | 0.20–4.36 | 36.40
Cypovirus 16 | Choristoneura occidentalis cypovirus 16 | Lepidoptera | 8 | 19,016 | 1.17–3.77 | 40.65
Unclassified | Daphnis nerii cypovirus Nanchang | Lepidoptera | 8 | 18,088 | 1.14–3.97 | 38.79
Unclassified | Dendrolimus punctatus cypovirus 22 Macheng | Lepidoptera | 16 | 32,753 | 1.14–4.05 | 39.94
Unclassified | Biston robustus cypovirus | Lepidoptera | 10 | 23,954 | 1.12–4.03 | –
Genus Dinovernavirus
Aedes pseudoscutellaris reovirus | Aedes pseudoscutellaris reovirus | Mosquito AP61 cells | 9 | 23,147 | 1.15–3.82 | 33.95
Unclassified | Fako virus CSW77 | Mosquito | 9 | 23,170 | 1.14–3.82 | 33.25
Genus Fijivirus
Nilaparvata lugens reovirus | Nilaparvata lugens reovirus Izumo (a) | Plant hopper | 10 | 28,699 | 1.43–4.39 | 34.9
Genus Idnoreovirus
Idnoreovirus 1 | Diadromus pulchellus idnoreovirus 1 | Hymenoptera | 10/11 | 25.15 kb | 6 segs: 0.99–4.23 | –
Idnoreovirus 2 | Hyposoter exiguae idnoreovirus 2 | Hymenoptera | 10 | 23.4 kb | 1.35–3.9 | –
Idnoreovirus 3 | Musca domestica idnoreovirus 3 | Diptera | 10 | – | – | –
Idnoreovirus 4 | Dacus oleae idnoreovirus 4 | Diptera | 10 | – | – | –
Idnoreovirus 5 | Ceratitis capitata idnoreovirus 5 | Diptera | 10 | – | – | –
Unclassified | Drosophila S virus | Diptera | 10 | – | – | –
Genus Orthoreovirus
Mahlapitsi orthoreovirus | Mahlapitsi orthoreovirus 2511 (b) | Bat flies | 11 | 23,195 | 1.07–3.93 | 43.81
Unclassified | Chiqui virus COBD38d | Mosquitoes | 9 | 25,028 | 1.15–4.01 | 36.20

(a) Nilaparvata lugens reovirus is the only fijivirus which infects exclusively insects (plant hoppers). All other fijiviruses infect plants and their invertebrate vectors.
(b) Mahlapitsi orthoreovirus is the only orthoreovirus which infects exclusively insects (bat flies); all other orthoreoviruses infect only vertebrates.


This virus has 11 dsRNA segments. Virions are about 80 nm in diameter and have multiple capsid layers. The viruses replicate efficiently in fish and mammalian cell lines within a temperature range of 15–25 °C. While most viruses in the genus Fijivirus infect plants, one fijivirus, Nilaparvata lugens reovirus (NLRV), has been found exclusively in planthoppers and does not seem to replicate in rice. Nevertheless, the virus can be transmitted from hopper to hopper through contaminated rice plants. Virions are double shelled icosahedra about 65–70 nm in diameter with 10 dsRNA segments ranging from 1.43 to 4.39 kb in size. The NLRV RNA genome has a G + C content of 34.9% and an aggregate size of 28.7 kb. Members of the genus Orthoreovirus are vertebrate viruses usually spread through the fecal–oral route. However, one orthoreovirus, Mahlapitsi orthoreovirus, infects bat flies, and Chiqui virus was found only in mosquitoes. While most orthoreoviruses have a genome of 10 dsRNA segments, the Mahlapitsi orthoreovirus has 11 segments ranging in size from 1.07 to 3.93 kb, with an aggregate size of 23,195 bp and a G + C content of 43.81%. Chiqui virus has 9 dsRNA segments ranging in size from 1.15 to 4.01 kb, an aggregate size of 25,028 bp and a G + C content of 36.20%.

Historical Overview Cypoviruses were first recognized by Ishimori in 1934, who observed polyhedra in the cytoplasm of midgut cells of diseased silkworm larvae. Cytoplasmic polyhedrosis has subsequently been recognized as an important cause of economic losses in the Japanese sericulture industry. Cypoviruses have a wide host range, infecting many lepidopteran insect species (more than 200 host species have been identified). Recent reports of cypovirus infections in mosquitoes confirm that they can also infect Diptera. The high levels of sequence divergence between equivalent genome segments of different cypovirus species, and the large numbers of distinct species, suggest that this is an ancient group. Indeed, cypoviruses have been detected in biting midges trapped in amber, indicating that they existed almost 100 million years ago. The American oyster reovirus in genus Aquareovirus was first identified in 1979 in Crassostrea virginica by J.R. Winton and colleagues and was characterized in 1987 as a reovirus with 11 dsRNA segments. The genus Dinovernavirus currently contains a single species, Aedes pseudoscutellaris dinovernavirus; the virus Aedes pseudoscutellaris dinovernavirus 1 (ApDNV-1) was isolated in 2005 from the AP61 cell line established by Varma et al. from the mosquito Aedes pseudoscutellaris. The first idnoreovirus was isolated from the housefly Musca domestica in 1978. Several other idnoreoviruses have since been isolated. Diadromus pulchellus idnoreovirus 1 (DpIRV-1), the most recently isolated idnoreovirus, has a genome composed of 10 dsRNA segments in functional male insects, with an additional segment (derived from genome segment 3) in particles isolated from females or sterile male wasps. The first report of an insect reovirus in genus Fijivirus was in 1991, when Noda and colleagues found Nilaparvata lugens reovirus in apparently healthy brown planthoppers, Nilaparvata lugens, and then characterized it as a reovirus with 10 dsRNA segments.
Other fijiviruses infect plants and their invertebrate vectors. An insect orthoreovirus in the genus Orthoreovirus was first described by PJ Van Vuren and colleagues in 2016, when they isolated Mahlapitsi orthoreovirus from bat flies, Eucampsipoda africana, associated with the African bat Rousettus aegyptiacus in South Africa. As part of their studies they sequenced the 10 dsRNA segments, giving an aggregate genome size of 23,195 bp and a G + C content of 43.81%. A second invertebrate orthoreovirus was Chiqui virus, which was isolated from mosquitoes collected in Colombia by M.A. Contreras-Gutierrez in 2018. Based on genome sequencing of the 9 dsRNA segments, with an aggregate size of 25,028 bp and a G + C content of 36.20%, they concluded that Chiqui virus was in the subfamily Spinareovirinae.

Host Range, Diseases, Transmission, and Distribution Cypoviruses Cypoviruses have been isolated only from arthropods, including members of over 45 genera of Lepidoptera. However, they have also been isolated from Hymenoptera, including the wasps Polistes hebraeus and Diadromus pulchellus, and from Diptera, including the mosquitoes Culex restuans and Uranotaenia sapphirina. One isolate was reported from the freshwater daphnid Simocephalus exspinosus. Experimental studies indicate that cypoviruses cannot infect mammals or mammalian cell cultures. The majority of cypovirus infections produce chronic disease, often without extensive larval mortality. Consequently, many individuals become adults even though they are heavily diseased. Infected larvae can stop feeding as early as 2 days post infection, producing symptoms of starvation. Their body size and weight are often reduced and diarrhea is common. The duration of the larval stage can also increase by ~1.5 times. Infected pupae are frequently smaller, and many diseased adults do not emerge correctly and can be malformed and flightless. Infected females may exhibit a reduced egg-laying capacity. Cypoviruses can be transmitted on the surface of eggs, producing high levels of infection in the subsequent generation. However, disinfection of the egg surface destroys the virus, indicating that cypoviruses are not transovarially transmitted. The minimum infectious dose increases dramatically in later larval instars, although virulence varies significantly with virus strain and host species. Larvae can sometimes recover from cypovirus infections, possibly because the gut epithelium has considerable regenerative capacity and infected cells are shed at each larval molt. Cypoviruses are normally transmitted by an oral–fecal route. Ingested polyhedra on contaminated food materials dissolve in the high pH environment of the insect gut, releasing occluded virus particles, which can then infect the cells lining the gut wall.


Infection is generally restricted to the columnar epithelial cells of the larval midgut, although goblet cells may also become infected. Cypovirus infection spreads throughout the midgut region (and sometimes infects the fat body), although infection of the entire gut has occasionally been observed in some species. The production of very large numbers of polyhedra gives the gut a characteristically creamy-white appearance. The endoplasmic reticulum of infected cells is progressively degraded, the mitochondria become enlarged, and the cytoplasm becomes highly vacuolated. In most cases, the nucleus shows few pathological changes. However, a cypovirus strain has been detected that produces polyhedra within the cell nucleus. In the later stages of infection, cellular hypertrophy is common and microvilli are reduced or absent. Polyhedra are released into the gut lumen by cell lysis and are excreted. The gut pH is lowered during infection, preventing dissolution of progeny polyhedra in the gut fluid.

Idnoreoviruses All known members of the genus Idnoreovirus infect insects, with host species that include Diadromus pulchellus and Hyposoter exiguae (wasps: Hymenoptera); Musca domestica (housefly); Dacus oleae (olive fly); Ceratitis capitata (Mediterranean fruit fly); and Drosophila melanogaster (small fruit fly) (flies: Diptera). Unlike the cypoviruses, idnoreoviruses appear to cause few pathological effects in their hosts, although they may significantly influence the biological properties of individual insects. Drosophila S virus appears to be associated with the “S” phenotype in D. simulans (a maternally inherited morphological trait associated with abnormal bristles). The presence of an additional 3.33 kb dsRNA segment in DpIRV-1 is related to the sex and ploidy of the host, and may play a role in the biology of the host wasp species, possibly by providing information necessary for larval development.

Dinovernavirus ApDNV-1, the only representative of this genus, was isolated from persistently infected A. pseudoscutellaris cells.

Aquareovirus American oyster reovirus 13p2 is the only aquareovirus not isolated from vertebrates but rather from oysters. Although it was originally isolated from oysters, the virus can replicate in Golden shiner fish at temperatures of 23 and 28 °C.

Fijivirus Unlike other fijiviruses which replicate in plants and their invertebrate vectors, Nilaparvata lugens reovirus replicates only in invertebrates, specifically the brown planthopper Nilaparvata lugens and transmission appears to involve contaminated rice plants. The virus does not appear to replicate in rice.

Orthoreovirus Most orthoreoviruses infect vertebrates, including mammals, birds and reptiles; only two orthoreoviruses, Mahlapitsi orthoreovirus and Chiqui virus, infect invertebrates, specifically bat flies and mosquitoes, respectively.

Virion Properties, Genome, and Replication Cypoviruses Large proteinaceous inclusion bodies are characteristically produced during cypovirus infections, usually within the cytoplasm of infected insect cells. The crystalline matrix of these “polyhedra” is primarily composed of a single viral “polyhedrin” protein that is often encoded by the smallest genome segment (Seg-10). The polyhedrin structure appears to protect the occluded virus particles, possibly filling the role of outer capsid proteins found in other reoviruses. Polyhedra are frequently symmetrical (cubic, icosahedral, or irregular) depending on both the virus strain (polyhedrin sequence) and the host species. The polyhedrin molecules appear to be arranged as a face-centered cubic lattice with center-to-center spacing varying between 4.1 and 7.4 nm. In some virus species, the virus particles are occluded singly within small occlusion bodies, which can also aggregate. However, some cypoviruses produce large polyhedra containing large numbers of particles, often regularly spaced within the polyhedrin matrix. “Empty” polyhedra containing no virus particles have also been observed. Cypovirus particles have a single-layered capsid, composed of a central shell, ~57 nm in diameter (by cryoelectron microscopy), which extends to 71.5 nm when the 12 “turrets” or “spikes” on the icosahedral fivefold vertices are included. These turrets are hollow, up to 20 nm in length and 15–23 nm wide (by conventional electron microscopy and negative staining). The cypovirus capsid has a central compartment, ~35 nm in diameter, which normally contains the genomic dsRNA segments. Cypovirus virions are structurally comparable to the cores of other reoviruses, particularly those from the genera with “turreted” cores (e.g., Orthoreovirus, Aquareovirus, and Oryzavirus) (Figs. 1 and 2).
Cypovirus particles generally contain five to six distinct proteins, three of which have been identified as the T2 protein (120 copies, equivalent to the VP3(T2) subcore shell protein of bluetongue virus, or orthoreovirus lambda1); “large protrusion protein” (LPP, 120 copies, comparable to orthoreovirus sigma2); and “turret protein” (TP, 60 copies, comparable to orthoreovirus lambda2). Cypoviruses also contain transcriptase enzyme complexes attached to the inner surface of the capsid shell at the icosahedral fivefold vertices. Two or three of the cypovirus structural proteins are usually >100 kDa.


Fig. 1 Electron micrographs of cypovirus, dinovernavirus, and idnoreovirus particles. (a) Contrast electron micrograph of nonoccluded Orgyia pseudotsugata cypovirus 5 (OpCPV-5) virion. (b) Contrast electron micrograph of Aedes pseudoscutellaris reovirus particles. The bars represent 20 nm. (c) Electron micrographs of purified virus particles (left) and core particles (right) of Hyposoter exiguae idnoreovirus-2 (HeIRV-2) stained with uranyl acetate. (d) Electron micrographs of a virus particle (left) and core particle (right) (stained with sodium phosphotungstate) of purified Dacus oleae idnoreovirus-4 (DoIRV-4). DoIRV-4 virions have small icosahedrally arranged surface projections (probably up to 12 in number), while the cores show much larger ‘spikes’ or ‘turrets’, which may lose a portion near the tip (like those of the cypoviruses). (a) Courtesy of C. L. Hill. (c) Courtesy of Andrea Makkay and Don Stoltz. (d) Courtesy of Max Bergoin.

The structural proteins of Bombyx mori cypovirus 1 (BmCPV-1) are 148 (VP1), 136 (VP2), 140 (VP3), 120 (VP4), 64 (VP6), and 31 kDa (VP7) (Table 2). Polyhedra also contain the 25–37 kDa “polyhedrin protein” (28.5 kDa for BmCPV-1) which represents approximately 95% of the polyhedra protein (by dry weight). The buoyant density of virions in CsCl is 1.44 g cm3, c. 1.30 g cm3 for empty particles, and 1.28 g cm3 for polyhedra. Due to the high level of variation among different cypoviruses, it is unlikely that their homologous proteins can be identified simply by their migration order during polyacrylamide gel electrophoresis (PAGE). Cypoviruses retain infectivity for several weeks at  15, 5, or 251C. The virus retains full enzymatic activity (dsRNA-dependent single-stranded RNA polymerase and capping activity) even after repeated freeze–thawing (up to 60 cycles), even though this disrupts the virus particle into ten active but distinct enzyme/template complexes. Each transcription complex contains one genome segment and a complete transcriptase complex that includes one of the ‘spike’ structures derived from the fivefold axes of the virion capsid. Cations have relatively little effect on cypovirus structure. Heat treatment of virions at 601C for 1 h leads to degradation and release of genomic RNA. Virus particles are relatively resistant to treatment with trypsin, chymotrypsin, ribonuclease A, deoxyribonuclease, or phospholipase. Virion enzyme functions also show some resistance to treatment with proteinase K. However, this

Reoviruses of Invertebrates (Reoviridae)

873

Fig. 2 Cryoelectron microscopy reconstruction of a virion of Orgyia pseudosugata cypovirus-5 (OpCPV-5) to 26 Å resolution. Top left: Reconstruction of a nonoccluded virion; top right: reconstruction of an occluded virion. Bottom left: Cross section of a cryoelectron microscopy reconstruction of a fully occluded virion; bottom center: cross section of a full nonoccluded virion; and bottom right: cross section of an empty virion. The cross sections show evidence of dsRNA packaged as distinct layers and suggest localization of the transcriptase complexes at the fivefold axes of symmetry. Courtesy of C. L. Hill.

Table 2  Coding assignments of the CPV type 1 genome segments

Segment   Size (bp)   Protein           Function/location
Seg-1     4190        VP1 (VP1)         Major core (virion)
Seg-2     3854        VP2 (VP2)         RdRp (virion)
Seg-3     3846        (VP3)             (Virion)
Seg-4     3262        VP3 (VP4)         Possible methyltransferase (virion)
Seg-5     2852        NS1(a) (NS5)      Nonstructural, similar to FMDV 2Apro
Seg-6     1796        VP4 (VP6)         Leucine zipper ATP/GTP binding protein (virion)
Seg-7     1501        NS3, NS4 (VP7)    Nonstructural, with ‘structural’ cleavage products
Seg-8     1328        VP5 (NSP8)        Unknown (anomalous PAGE migration at 55 kDa)
Seg-9     1186        NS5 (NSP9)        Nonstructural, dsRNA binding
Seg-10    994         Polyhedrin        Polyhedrin protein (Pod)

(a) NS1 (NS5): cleaved into NS2 (NS5a) and NS6 (NS5b). RdRp, RNA-dependent RNA polymerase.
Note: From information available at: http://www.iah.bbsrc.ac.uk/dsRNA_virus_proteins/BmCPV-1.htm. Under the ‘Protein’ heading, an alternative nomenclature for the viral proteins is given in brackets. The nomenclature that is used in the Eighth Report of the ICTV and presented on the website: http://www.iah.bbsrc.ac.uk/dsRNA_virus_proteins/protein-comparison.htm.

may reflect retention of enzyme activities despite particle disruption, particularly during the early stages of digestion. Cypovirus particles are resistant to some detergents, including sodium deoxycholate (0.5%–1%), but are disrupted by 0.5%–1% sodium dodecyl sulfate (SDS), which releases the genomic dsRNA. Treatment with Triton X-100, NP40, or urea also causes disruption of the virus


Fig. 3 The genomes of dinovernaviruses and cypoviruses. Genome electropherotypes of members of genera Dinovernavirus (ApDNV-1) and Cypovirus (CPV-1, -2, -3, and -8). The segment position is indicated to the right of each profile.

particle structure. One or two fluorocarbon treatments have little effect on virus infectivity, though treatment with ethanol leads to release of RNA from virions. Viruses and polyhedra are readily inactivated by ultraviolet (UV) irradiation, which dissociates the dsRNA template from the transcriptase complexes. Polyhedra remain infectious for years at temperatures below 20 °C. Virions can be released from polyhedra by treatment with carbonate buffer at pH greater than 10.5 but are disrupted below pH 5. As in the midguts of permissive insects, high pH treatment completely dissolves the polyhedral protein matrix. This process is partly due to increased solubility of polyhedrin at high pH but is also aided by alkaline-activated proteases associated with the polyhedra. Polyhedra (but not virions) contain significant amounts of adenylate-rich oligonucleotides.

In the majority of cases, cypovirus particles contain 10 dsRNA genome segments. However, virus particles containing an 11th small segment (e.g., Trichoplusia ni cypovirus 15, TnCPV-15, and Antheraea mylitta cypovirus 4), 8 segments (in Choristoneura fumiferana cypovirus 16 and unclassified Daphnis nerii cypovirus), and up to 16 segments (in unclassified Dendrolimus punctatus cypovirus 22) have been detected. The genome segments of BmCPV-1 range from 4,190 to 994 bp, with a total genome size of 24,809 bp. The genome segments of other cypoviruses have been estimated by electrophoretic comparisons, and have calculated sizes between 5.6 and 0.6 kb, with a total genome size of 29.2–33.3 kb. The size distribution of the genome segments varies widely among different cypoviruses (e.g., the smallest dsRNA has an estimated size that varies between 0.53 and 1.44 kb). These size differences formed the initial basis for recognition and classification of distinct species (electropherotypes) of cypoviruses, which differ significantly in the migration of at least three genome segments during electrophoresis using 1% agarose or 3% SDS-PAGE.
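The segment sizes quoted for BmCPV-1 can be cross-checked against the stated total. The following is a minimal sketch, assuming the ten sizes listed in Table 2; the function name `genome_summary` is illustrative and not part of any published tool.

```python
# Segment sizes (bp) for BmCPV-1, Seg-1 to Seg-10, as listed in Table 2.
BMCPV1_SEGMENTS_BP = [4190, 3854, 3846, 3262, 2852, 1796, 1501, 1328, 1186, 994]


def genome_summary(segment_sizes_bp):
    """Return (segment count, total bp, largest bp, smallest bp)."""
    return (
        len(segment_sizes_bp),
        sum(segment_sizes_bp),
        max(segment_sizes_bp),
        min(segment_sizes_bp),
    )


count, total, largest, smallest = genome_summary(BMCPV1_SEGMENTS_BP)
print(count, total, largest, smallest)  # 10 24809 4190 994
```

The sum of the tabulated segments reproduces the 24,809 bp total given in the text, and the extremes match the 4,190–994 bp range.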
The genome segment migration patterns of types 1, 12, and 14 have some overall similarity, suggesting that they are more closely related than some other cypoviruses (Fig. 3). More accurate sizes are derived from whole genome sequencing (Table 1). By sequence analysis, the aggregate genome sizes of cypoviruses range from 18,088 bp to 25,325 bp, although the largest size of 32,753 bp was reported for Dendrolimus punctatus cypovirus 22. Aggregate G + C content ranges from 36.42% to 43.22%. The smallest segment was 0.88 kb (Heliothis armigera cypovirus 5) and the largest was 4.44 kb (Lymantria dispar cypovirus 14). The termini of different genome segments within a single cypovirus species are often highly conserved but differ from those reported for other species (Table 3). Choristoneura fumiferana cypovirus 16 (CfCPV-16) shows high levels of sequence variation when compared to CPV-1, -2, -5, -14, or -15 and is therefore considered to be a distinct species, even though it has a similar 5′ end but different 3′ ends to representatives of CPV-5. These data demonstrate that different cypovirus electropherotypes are likely to have different conserved RNA terminal sequences. Large size variations in the genome segments of most cypovirus species (apart from CPV-1, -12, and -14) indicate that the gene assignments for one “type” or species cannot simply be applied to the other cypoviruses. Genome segment coding assignments by in vitro translation of individual genome segments have been published for isolates of CPV-1 and -2. These data and subsequent sequencing studies indicate that the polyhedrin protein is often encoded by the smallest segment.

Cypovirus uptake appears to be a relatively inefficient process in insect cell cultures but can be significantly improved by lipofection. Liposomes deliver cypovirus particles into the cytoplasm, where replication occurs. Cypoviruses do not require particle modification to activate transcriptase and capping enzymes.
The outer coat proteins of other reoviruses need to be removed to activate transcription. Cypovirus polymerase activity can show very pronounced dependence on the presence of S-adenosyl-L-methionine or related compounds, although this varies between different cypovirus species. Virus replication and assembly occur within the host cell cytoplasm, although viral RNA synthesis can sometimes occur within the nucleus. Replication is accompanied by the formation of viroplasm (virogenic stroma or “VIB”) within the cytoplasm, containing


Table 3  The conserved terminal sequences of member viruses of genera Cypovirus, Idnoreovirus, Dinovernavirus, and their comparison to those of genera Fijivirus and Oryzavirus (both are plant viruses)

Virus            (Isolate)     Conserved RNA terminal sequences
Cypovirus 1      (BmCPV-1)     5′-AGUAA…GUUAGCC-3′
                 (DpCPV-1)     5′-AGUAA…GUUAGCC-3′
                 (LdCPV-1)     5′-AGUA/GA/G…Gu/cUAGCC-3′
Cypovirus 2      (LiCPV-2)     5′-AGUUUUA…UAGGUC-3′
Cypovirus 4      (ApCPV-4)     5′-AAUCGACG…GUCGUAUG-3′
                 (AaCPV-4)     5′-AAUCGACG…GUCGUAUG-3′
                 (AmCPV-4)     5′-AAUCGACG…GUCGUAUG-3′
Cypovirus 5      (OpCPV-5)     5′-AGUU…UUGC-3′
Cypovirus 14     (LdCPV-14)    5′-AGAA…CAGCU-3′
Cypovirus 15     (TnCPV-15)    5′-AUUAAAAA…GC-3′
Cypovirus 16     (CfCPV-16)    5′-AGUUUUU…UUUGUGC-3′
Idnoreovirus 1   DpIRV-1       5′-A/U/GCAAUUU…AGUAAAAAAAUnA/GG-3′
Dinovernavirus   ApDNV         5′-AGUUA/U…A/UAGU-3′
Fijivirus        NLRV          5′-AGU…GUUGUC-3′
                 MRCV          5′-AAGUUUUUU…GUC-3′
                 FDV           5′-AAGUUUUUU…GUC-3′
                 RBSDV         5′-AAGUUUUUU…GUC-3′
Oryzavirus       RRSV          5′-GAUAAA…GUGC-3′

Note: Data concerning the terminal regions of members of the Reoviridae are listed at: http://www.iah.bbsrc.ac.uk/dsRNA_virus_proteins/CPV-RNA-Termin.htm.

large amounts of viral proteins and virus particles. The mechanism used to select the individual genome segments for packaging and assembly into progeny particles (exactly one copy of each segment) is unknown. The importance of the conserved terminal regions in this process is indicated by packaging and transcription of a mutant Seg-10 of CPV-1 that contained only 121 bp from the 5′ end and 200 bp from the 3′ end. Particles are occluded within polyhedra, apparently at the periphery of the VIB, from about 15 h postinfection onward. The polyhedrin protein is produced late in infection and in large excess compared to the other viral proteins. It is unknown how polyhedrin synthesis is regulated.
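The Seg-10 deletion mutant described above retains both conserved termini while losing most of the interior. A minimal sketch of that idea follows, assuming the CPV-1 consensus termini from Table 3; the 994-nt sequence is a placeholder whose interior is filled with `N`, not real Seg-10 sequence, and both function names are hypothetical.

```python
FIVE_PRIME = "AGUAA"      # CPV-1 5' consensus (Table 3)
THREE_PRIME = "GUUAGCC"   # CPV-1 3' consensus (Table 3)


def has_conserved_termini(rna, five_prime=FIVE_PRIME, three_prime=THREE_PRIME):
    """True if the strand starts and ends with the given consensus motifs."""
    rna = rna.upper().replace("T", "U")
    return rna.startswith(five_prime) and rna.endswith(three_prime)


def terminal_deletion_mutant(rna, keep_5=121, keep_3=200):
    """Join the first keep_5 and last keep_3 nucleotides, dropping the interior."""
    return rna[:keep_5] + rna[-keep_3:]


# Placeholder Seg-10 (994 nt): real terminal motifs, interior filled with 'N'.
seg10 = FIVE_PRIME + "N" * (994 - len(FIVE_PRIME) - len(THREE_PRIME)) + THREE_PRIME
mutant = terminal_deletion_mutant(seg10)

print(len(seg10), len(mutant))           # 994 321
print(has_conserved_termini(mutant))     # True
print(has_conserved_termini(seg10[5:]))  # False: 5' motif removed
```

The check only tests for the terminal motifs; it says nothing about which internal sequences, if any, also contribute to packaging.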

Idnoreoviruses

Electron microscopy and negative staining of the idnoreovirus particles show that they have no prominent features, are spherical in appearance (icosahedral symmetry), and ~70 nm in diameter. However, they do have a clearly defined outer capsid layer that can be removed to reveal the virus core. Unlike the cypoviruses, idnoreoviruses do not encode a polyhedrin protein, and the virus particles are “nonoccluded”. Idnoreovirus core particles have an estimated diameter of ~60 nm, with 12 icosahedrally arranged surface “turrets” or “spikes”. Limited studies of some viruses within the genus indicate that they are resistant to freon (trichlorotrifluoroethane) and CsCl. They may also be resistant to chymotrypsin. Intact virus particles and cores of the prototype isolate Diadromus pulchellus idnoreovirus-1 (DpIRV-1) have densities of 1.370 and 1.385 g ml⁻¹, respectively, while intact virions and empty particles of Dacus oleae idnoreovirus-4 (DoIRV-4) have densities of ~1.38 and ~1.28 g ml⁻¹, respectively (determined by CsCl gradient centrifugation). The outer capsid layer of Hyposoter exiguae idnoreovirus-2 (HeIRV-2) can be disrupted by brief exposure to 0.4% sodium sarcosinate, releasing the virus core.

The genome of most idnoreoviruses consists of 10 linear segments of dsRNA that are conventionally identified as “genome segment 1” to “genome segment 10” (Seg-1 to Seg-10) in order of reducing molecular weight and increasing electrophoretic mobility during agarose gel electrophoresis (AGE). The total genome of DpIRV-1 contains an estimated 25.15 kb of dsRNA, with individual segments ranging between ~4.8 and ~0.98 kb, and an electrophoretic migration pattern (by 1% AGE) showing five larger and five smaller segments (a “5/5” pattern). However, the virions of DpIRV-1 may be atypical since they can sometimes also contain an 11th, 3.33-kb genome segment, the presence of which is related to the sex and ploidy of the individual wasp host.
This additional dsRNA (migrating between genome segments 3 and 4) contains sequences similar to and therefore possibly derived from Seg-3 (3.8 kb). The genome segments of HeIRV-2 range in size from ~3.9 to ~1.35 kb, with a “4/6” electrophoretic migration pattern (by PAGE). DoIRV-4 contains an estimated 23.4 kb, with segments estimated from ~3.8 to ~0.7 kb and a “5/3/2” electrophoretic migration pattern (by PAGE). Ceratitis capitata idnoreovirus-5 (CcIRV-5) has a “3/3/4” genome segment migration pattern when analyzed by PAGE and has clear similarities to Drosophila melanogaster idnoreovirus-5 (DmIRV-5), suggesting that despite some


Table 4


Coding assignments for the DpIRV-1 genome segments

Segment

Size (bp)

Protein

Seg-1 Seg-2 Seg-3 Seg-x Seg-4 Seg-5 Seg-6 Seg-7 Seg-8 Seg-9 Seg-10

4800 4230 3812 3333 3000 2700 1750 1652 1318 1240 985

VP1 VP2 VP3

Function/location

Contains RdRp motifs Sequence closely related to Seg-3. Presence related to sex and ploidy of the host

VP4 VP5 VP6 VP7 VP8 VP9 VP10

Note: Seg-x: the supernumerary segment that is linked to the sex and ploidy of the insect host. RdRp, RNA-dependent RNA polymerase. More information is available at: http://www.iah.bbsrc.ac.uk/dsRNA_virus_proteins/Idnoreovirus.htm.

serological differences, they belong to the same virus species, specifically Idnoreovirus 5. It is unclear how closely Drosophila S virus is related to the other Drosophila viruses. It is therefore currently classified as a “tentative species” within the genus. By analogy with other reoviruses (e.g., the orbiviruses), genome segment migration patterns (by AGE) are likely to be characteristic of each Idnoreovirus species. Initial sequencing studies suggest that the 3′ termini of DpIRV-1 genome segments are more variable than those of most other reoviruses, with little sign of conservation. However, conserved sequences were detected at the 5′ termini (Table 3), which are different from those of other reovirus species. Except for the sequences of 6 segments from DpIRV-1, no sequence data are currently available for other members of this genus.

Although the proteins of DpIRV-1 have not been extensively characterized, purified virions contain 11 structural proteins with Mr 21–140 kDa (as analyzed by SDS-PAGE). Three of these appeared to be glycosylated (Mr of approximately 21, 15, and 35 kDa). Some of the viral genome segments have been sequenced (as indicated in Table 4). The viral proteins are named VP1–VP10 based on the molecular weight of the genome segment (segment number) from which they are translated.

On the basis of their structural and biochemical similarity, it seems likely that many aspects of the genome organization and replication of the idnoreoviruses will show similarities to those of other reoviruses (particularly the other turreted spinareoviruses). On this basis it is likely that the virus core will contain transcriptase complexes that synthesize mRNA copies of the individual genome segments. These will be exported and translated to produce the viral proteins within the host cytoplasm. These positive-sense RNAs are also likely to form templates for negative-strand synthesis during progeny virus assembly and maturation.
The genome segments that have been characterized represent single genes, with a large ORF, and relatively short noncoding terminal regions.

Dinovernavirus

Transmission electron microscopy of ApDNV (the prototype and only dinovernavirus) showed that purified virions have a structure similar to core particles of turreted reoviruses (Fig. 1(b)), with particular similarities to the single-shelled cypoviruses, with no outer capsid layer. The mean diameter of the virus particle is ~49–50 nm, with a central space (for the viral genome) that is 36–37 nm in diameter. Turrets were clearly visible projecting from the particle surface. The infectivity of ApDNV-1 particles is destroyed by treatment with 0.1% SDS but unaffected by 1% deoxycholate or repeated treatments with Freon 113 or Vertrel-XF. Freezing at −20 or −80 °C, or heating to 50–60 °C for 30 min, abolished ApDNV infectivity, even in the presence of 50% fetal bovine serum (FBS). However, infectivity was stable for up to 3 weeks at room temperature and for at least 5 months at 4 °C. ApDNV is stable at pH 6–8, but infectivity was reduced tenfold at pH 4–5 or pH 9–10. The virion morphology (observed by electron microscopy) becomes distorted below pH 5 and is completely disrupted below pH 3.5.

ApDNV persistently infects AP61 cells without any visible cytopathic effects and has been detected in these cells from several different sources. The number of copies of the ApDNV genome in each persistently infected AP61 cell was estimated by quantitative polymerase chain reaction (PCR) as between 1 and 5, although treatment with 2-aminopurine (2-AM) can increase the number to 60–80 genome copies per cell. The virus can also infect and replicate in mosquito cells, including AA23 and C6/36 (from Aedes albopictus) cells and A20 cells (from Aedes aegypti), but not in AE cells (A. aegypti) or Aw-albus (Aedes w-albus). ApDNV failed to replicate in mammalian cells or in mice. Unlike other reoviruses, the genome of ApDNV contains only 9 segments of dsRNA, with a 4/1/3 AGE migration profile (Fig. 3).
The aggregate genome size is 23,147 bp and the 9 dsRNA segments range from 1.15 to 3.82 kb in size. The G + C content is 33.95% compared to 34.8% for fijiviruses, 44.7% for oryzaviruses, 43% for the cypoviruses, and 63% for idnoreoviruses. The termini of ApDNV contain conserved terminal sequences (Table 3). The first three nucleotides “AGU” are conserved between ApDNV, some fijiviruses, and some cypoviruses; though, unlike the cypoviruses and fijiviruses, the first and last nucleotides of the ApDNV genome segments are complementary (A and U). The mean G + C content of the ApDNV genome is 34%.
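The statement about complementary terminal nucleotides can be illustrated with a short check. This is a sketch only, assuming the first and last consensus nucleotides given in Table 3 for three representative viruses; the dictionary and function names are illustrative.

```python
# Watson-Crick pairing for RNA.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# (first, last) nucleotide of the conserved termini listed in Table 3.
TERMINI = {
    "ApDNV": ("A", "U"),     # 5'-AGUUA/U ... A/UAGU-3'
    "BmCPV-1": ("A", "C"),   # 5'-AGUAA ... GUUAGCC-3'
    "NLRV": ("A", "C"),      # 5'-AGU ... GUUGUC-3'
}


def ends_complementary(first, last):
    """True if the 5'-terminal and 3'-terminal nucleotides can base-pair."""
    return WATSON_CRICK[first] == last


for virus, (first, last) in TERMINI.items():
    print(virus, ends_complementary(first, last))
# ApDNV True
# BmCPV-1 False
# NLRV False
```

Only standard Watson-Crick pairs are considered here; G·U wobble pairing is deliberately ignored to keep the example minimal.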

Table 5  Coding assignments for the ApDNV-1 genome segments

Segment   Size (bp)   Protein   Function by similarity
Seg-1     3817        VP1       Possible cell attachment
Seg-2     3752        VP2       RdRp
Seg-3     3732        VP3       T2
Seg-4     3375        VP4       Nonstructural
Seg-5     3227        VP5       Possible capping enzyme
Seg-6     1775        VP6       Possible NTPase
Seg-7     1171        VP7       Nonstructural
Seg-8     1151        VP8       Possible translational regulation
Seg-9     1147        VP9       Viral inclusion bodies

Note: More information is available at: http://www.iah.bbsrc.ac.uk/dsRNA_virus_proteins/Dinovernavirus.htm.

The putative functions of ApDNV proteins VP1–VP9 (based on sequence similarities to other reovirus proteins) are shown in Table 5. These comparisons suggest that there is no equivalent to the cypovirus polyhedrin gene and appear to confirm the status of ApDNV as a distinct and authentic nine-segmented dsRNA reovirus.

Antigenic and Genetic Relationships

Cypoviruses

Cypoviruses are currently classified within 16 species that were initially identified by their distinctive dsRNA migration patterns. Cross-hybridization analysis of the dsRNA, serological comparisons of the viral proteins, and comparison of RNA sequences confirmed the validity of this classification and have identified further tentative species within the genus. However, only relatively few cypovirus isolates have been characterized, suggesting that there may be many more species yet to be identified. Virus isolates within a single cypovirus electropherotype exhibit high levels of antigenic cross-reaction, as well as efficient cross-hybridization of their denatured genomic dsRNA, even under high-stringency conditions. In contrast, there is little to no evidence of serological cross-reaction between viruses from different electropherotypes (representing different cypovirus species). However, CPV-1 and CPV-12 are exceptions, showing a low but significant level of serological cross-reaction. These viruses also show some overall similarity in their electropherotype pattern and detectable levels of cross-hybridization of their genomic RNA. CPV-14 also shows some similarity in its RNA electropherotype pattern to both CPV-1 and CPV-12, suggesting that it may also show some antigenic relationship and RNA sequence homology with these viruses. The nomenclature currently used to identify different cypovirus isolates takes account of both the virus species and the host species from which the virus was originally isolated (e.g., an isolate of CPV-1 from Bombyx mori would be identified as BmCPV-1). The sequence data that are available for isolates of CPV-1, -2, -5, -14, -15, and -16 allow a comparison of some genes from these viruses.
There is significantly higher conservation in the largest genome segments (possibly as a result of functional constraints), although the level of variation is relatively uniform across the whole genome. Earlier cross-hybridization studies suggested that the level of nucleotide sequence variation is also relatively uniform across the whole cypovirus genome, possibly reflecting the absence of neutralizing antibodies (and therefore antibody selective pressure) in their insect hosts. Different isolates within a single cypovirus species usually show very high levels of nucleotide sequence identity. For example, different isolates of CPV-5 show >98% identity in genome Seg-10 (the polyhedrin gene), while different isolates of CPV-1 show 89%–98% nucleotide sequence identity in this gene. In contrast, comparisons of unrelated species showed only relatively low levels (20%–23%) of sequence identity (Fig. 4). The level of amino acid (aa) identity in the putative viral polymerase varied from 92.9% to 99.5% within a single cypovirus type, and 42.3%–43.3% between different types (Fig. 5; see Table 6 for abbreviations and accession numbers).

Idnoreoviruses

Antigenic relations between different idnoreoviruses have not been analyzed. The phylogenetic relationship between different idnoreoviruses is also unknown (DpRV is the only idnoreovirus for which genome sequence is available to date). However, the sequence of DpRV shows homologies to other reoviruses (aa identities reaching up to 20% with fijiviruses, oryzaviruses, rotaviruses, seadornaviruses, and orbiviruses).

Dinovernaviruses

Sequence comparison of the Aedes pseudoscutellaris reovirus genome to other reoviruses has shown significant aa identities in each of the viral proteins. Members of the genera Cypovirus, Oryzavirus, and Fijivirus, which are also turreted, showed the highest aa identities (19%–31%). Amino acid identity values <30% between RdRps of the various reoviruses can be used to distinguish the

Fig. 4 Phylogenetic tree for polyhedrin proteins from isolates belonging to six Cypovirus species. The scale bar (0.1) indicates the number of substitutions per site. Isolates shown: Cypovirus 1 (LdCPV-1, BmCPV-1, DpCPV-1), Cypovirus 2 (LiCPV-2), Cypovirus 4 (LdCPV-4), Cypovirus 5 (HaCPV-5, OpCPV-5, EsCPV-5), Cypovirus 15 (TnCPV-15), and Cypovirus 16 (CfCPV-16).

members of different genera. The relationship of ApDNV to other reoviruses is illustrated by a phylogenetic tree for the RdRp gene (Fig. 5). Although these sequence comparisons confirm the validity of ApDNV classification within a separate genus, they also indicate significant relationships with the turreted cypoviruses, fijiviruses, and oryzaviruses (aa identities of 26%, 23%, and 22%, respectively). Lower levels of aa identity (20%) were also detected with the mycoreoviruses (which are also turreted).
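The genus-level criterion described above (RdRp aa identity below 30%) can be applied to the identities reported for ApDNV. The following is a minimal sketch using only the percentages quoted in the text; the threshold constant and function name are illustrative, and the example does not compute identities from sequence alignments.

```python
GENUS_THRESHOLD_PCT = 30.0  # RdRp aa identity below this separates genera (per the text)

# ApDNV RdRp aa identities (%) with other turreted reoviruses, as quoted in the text.
APDNV_RDRP_IDENTITY = {
    "Cypovirus": 26.0,
    "Fijivirus": 23.0,
    "Oryzavirus": 22.0,
    "Mycoreovirus": 20.0,
}


def in_different_genus(identity_pct, threshold=GENUS_THRESHOLD_PCT):
    """True if the identity falls below the genus-demarcation threshold."""
    return identity_pct < threshold


closest = max(APDNV_RDRP_IDENTITY, key=APDNV_RDRP_IDENTITY.get)
print(closest)  # Cypovirus
print(all(in_different_genus(v) for v in APDNV_RDRP_IDENTITY.values()))  # True
```

All four comparisons fall below the threshold, consistent with placing ApDNV in a separate genus while still showing its closest relationship to the cypoviruses.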

Sedoreovirinae

The Sedoreovirinae subfamily encompasses six genera: Cardoreovirus, Mimoreovirus, Orbivirus, Phytoreovirus, Rotavirus, and Seadornavirus. Viruses in genus Cardoreovirus infect only crustaceans. Viruses in the single (type) species of genus Mimoreovirus have 11 segments of dsRNA and infect the unicellular and ubiquitous alga Micromonas pusilla. Reoviruses in the genus Orbivirus have 10 dsRNA segments and are considered arboviruses, infecting a wide range of vertebrate (mammalian and avian) hosts and their arthropod vectors, such as gnats, mosquitoes, phlebotomines, and ticks. Although orbiviruses infect invertebrate vectors, they have little to no effect on the infected arthropod and will not be described in this article. Viruses of genus Rotavirus have 11 dsRNA segments, infect only vertebrates, and are transmitted by the fecal-oral route. Homalodisca vitripennis reovirus is the only reovirus in genus Phytoreovirus that infects only an insect, the glassy-winged sharpshooter, and is not found in plants frequented by this pest. While most reoviruses in the genus Seadornavirus are considered arboviruses infecting vertebrates and their invertebrate vectors, two viruses, Kadipiro virus and Liao ning virus, have so far been found only in mosquitoes. A summary of the recognized species of reoviruses in the Sedoreovirinae infecting only invertebrates is provided in Table 7.

Cardoreovirus

The first cardoreovirus (from carcinus for crab and dodeca for 12 dsRNA segments) was isolated in 2004 from diseased mitten crabs (Eriocheir sinensis). The genus has a single species, the type species Eriocheir sinensis reovirus, with Eriocheir sinensis reovirus (EsRV905) as an exemplar virus member. There are at least five other possible members of the genus: Callinectes sapidus reovirus from blue crab, Carcinus mediterraneus reovirus W2 (CMRV) from shore crab, Macrobrachium nipponense reovirus from prawn, Macropipus depurator reovirus P from swimming crab, and Scylla serrata reovirus from mud crab. The icosahedral cardoreovirus virions are about 70 nm in diameter and have a smooth, nonturreted surface, with an RNA-containing core of about 55 nm diameter. All cardoreoviruses have a genome of 12 dsRNA segments, except Macrobrachium nipponense reovirus, which has 10. Replication occurs in the cytoplasm of crab cells and virions appear as rosettes around viral inclusion bodies. In 1966, Vago described the first crustacean virus ever to be identified, the paralysis virus of Macropipus depurator, now known as the cardoreovirus Macropipus depurator reovirus P (the “P” stands for paralysis), which causes a slowly developing paralysis and darkening of the exoskeleton.

Fig. 5 Neighbor-joining phylogenetic tree, built with available polymerase sequences for representative members of 15 recognized genera of family Reoviridae. The abbreviations and accession numbers are those provided in Table 6. The scale bar (0.5) indicates the number of substitutions per site.

Carcinus mediterraneus reovirus W2 (CMRV), isolated from the hepatopancreas of the shore crab, infects interstitial cells of the hepatopancreas, midgut, gills, and haemocytes. Infection of blue crabs in crab fisheries on the Atlantic coast of North and South America with Callinectes sapidus reovirus (CSRV) leads to mortalities of up to 100% in both wild and farmed populations. Diagnosis of CSRV is by in situ hybridization and RT-PCR.

Eriocheir sinensis reovirus (EsRV905) was first reported in 2004 from the freshwater crab Eriocheir sinensis. EsRV is associated with trembling disease, including trembling of the legs, sluggishness, and loss of appetite, in farmed mitten crabs in China. This is an economically important disease of freshwater cultured Chinese mitten crabs in Jiangsu Province. The virus infects connective tissue of the gills, midgut, hepatopancreas, and hemocytes, with mortality reaching 30% in experimental infections. The EsRV WX-2012 genome has an aggregate size of 23,913 bp with a G + C content of 43.80%, and the 12 dsRNA segments range from 1.18 to 4.24 kb in size. Genome segment 1 encodes the RdRp of 1217 aa.

Scylla serrata reovirus, originally named mud crab reovirus, was first isolated from mud crab in southern China in 2007. It causes sleeping disease, a severe disease in marine-cultured mud crab in southern China. The highly pathogenic virus infects connective tissues of the hepatopancreas, gills, and intestines, with mortality reaching 70% in affected farms. Clinical signs include lethargy, anorexia, and a gray discoloration. The virus causes vacuolar degeneration of connective tissues in the heart, stomach, and intestines. During infection the presence of high numbers of virions results in crystal-like inclusion bodies in the cytoplasm. The Scylla serrata reovirus dsRNA genome has an aggregate size of 24,464 bp, with 12 segments ranging in size from 1.16 to 4.33 kb.
Macrobrachium nipponense reovirus was first described in 2016 in the farmed freshwater prawn (oriental river prawn) Macrobrachium nipponense. Although there are no clinical signs of infection, the virus can be associated with smaller shrimp and high levels of mortality.

Table 6  Sequences of the RNA-dependent RNA polymerases (RdRps) used in phylogenetic analysis of different reoviruses (Fig. 5)

Species                                                  Isolate                     Abbreviation      Accession number

Genus Seadornavirus (12 segments)
Banna virus                                              Ch                          BAV-Ch            AF168005
Kadipiro virus                                           Java-7075                   KDV-Ja7075        AF133429
Liao ning virus                                          LNV-NE9712                  LNV-NE9712        AY701339

Genus Coltivirus (12 segments)
Colorado tick fever virus                                Florio                      CTFV-Fl           AF134529
Eyach virus                                              Fr578                       EYAV-Fr578        AF282467

Genus Orthoreovirus (10 segments)
Mammalian orthoreovirus                                  Lang strain                 MRV-1             M24734
                                                         Jones strain                MRV-2             M31057
                                                         Dearing strain              MRV-3             M31058
Mahlapitsi orthoreovirus                                 2511                                          KU198603

Genus Orbivirus (10 segments)
African horse sickness virus                             Serotype 9                  AHSV-9            U94887
Bluetongue virus                                         Serotype 2                  BTV-2             L20508
                                                         Serotype 10                 BTV-10            X12819
                                                         Serotype 11                 BTV-11            L20445
                                                         Serotype 13                 BTV-13            L20446
                                                         Serotype 17                 BTV-17            L20447
Palyam virus                                             Chuzan                      CHUV              Baa76549
St. Croix river virus                                    SCRV                        SCRV              AF133431

Genus Rotavirus (11 segments)
Rotavirus A                                              Bovine strain UK            BoRV-A/UK         X55444
                                                         Simian strain SA11          SiRV-A/SA11       AF015955
Rotavirus B                                              Human/murine strain IDIR    Hu/MuRV-B/IDIR    M97203
Rotavirus C                                              Porcine Cowden strain       PoRV-C/Co         M74216

Genus Aquareovirus (11 segments)
Golden shiner reovirus                                   GSRV                        GSRV              AF403399
Grass carp reovirus                                      GCRV-873                    GCRV              AF260511
Chum salmon reovirus                                     CSRV                        CSRV              AF418295
Striped bass reovirus                                    SBRV                        SBRV              AF450318

Genus Fijivirus (10 segments)
Nilaparvata lugens reovirus                              Izumo strain                NLRV-Iz           D49693

Genus Phytoreovirus (12 segments)
Rice dwarf virus                                         Isolate China               RDV-Ch            U73201
                                                         Isolate H                   RDV-H             D10222
                                                         Isolate A                   RDV-A             D90198
Homalodisca vitripennis reovirus (unclassified)          Fillmore                                      FJ497789

Genus Oryzavirus (10 segments)
Rice ragged stunt virus                                  Thai strain                 RRSV-Th           U66714

Genus Cypovirus (10 segments)
Bombyx mori cytoplasmic polyhedrosis virus 1             BmCPV-1                     BmCPV-1           AF323782
Dendrolimus punctatus cytoplasmic polyhedrosis virus 1   DpCPV-1                     DpCPV-1           AAN46860
Lymantria dispar cytoplasmic polyhedrosis virus 1        LdCPV-1                     LdCPV-1           NC_003017
Lymantria dispar cytoplasmic polyhedrosis virus 14       LdCPV-14                    LdCPV-14          AAK73087
Heliothis armigera cypovirus 14                          HaCPV-14                    HaCPV-14          DQ242048

Genus Mycoreovirus (11 or 12 segments)
Rosellinia anti-rot virus                                W370                        RaRV              AB102674
Cryphonectria parasitica reovirus                        9B21                        CPRV              AY277888

Genus Mimoreovirus (11 segments)
Micromonas pusilla reovirus                              MpRV                        MPRV              DQ126102

Genus Dinovernavirus (9 segments)
Aedes pseudoscutellaris dinovernavirus                   ApDNV                       ApDNV             DQ087277

Genus Cardoreovirus (12 segments)
Eriocheir sinensis reovirus                              Isolate 905                 EsRV              AY542965

Table 7  Sampling of Sedoreovirinae reoviruses infecting invertebrates, their hosts and some of their genome characteristics

Species                        Virus                                        Host                             # Segs   Aggregate genome size (bp)   Range of segment sizes (kb)   % G+C

Genus Cardoreovirus
Eriocheir sinensis reovirus    Eriocheir sinensis reovirus WX-2012          Mitten crab                      12       23,913                       1.18–4.24                     43.80
Unclassified                   Carcinus mediterraneus W2 virus              Shore crab                       12
Unclassified                   Callinectes sapidus reovirus SZ-2007         Blue crab                        12
Unclassified                   Macrobrachium nipponense reovirus            Prawn                            10
Unclassified                   Macropipus depurator reovirus P              Swimming crab                    12
Unclassified                   Scylla serrata reovirus                      Mud crab                         12       24,464                       1.16–4.33

Genus Phytoreovirus
Unclassified                   Homalodisca vitripennis reovirus Fillmore    Glassy winged sharpshooter(a)    12       25,724                       1.04–4.47                     38.70

Genus Seadornavirus
Banna virus                    Banna virus strain JKT-6423                  Mosquitoes                       12       20,682                       0.86–3.75
Kadipiro virus(b)              Kadipiro virus JKT-7075                      Mosquitoes                       12       20,985                       0.77–3.77                     37.68
Liao ning virus                Liao ning virus LNV-NE9712                   Mosquitoes                       12       20,739                       0.76–3.74                     41.36

(a) Although it is a phytoreovirus, Homalodisca vitripennis reovirus has been detected solely in the glassy winged sharpshooter; it has not been detected in any plants (e.g., citrus trees) where the insect is commonly found. Other phytoreoviruses are known to be plant viruses vectored by invertebrates.
(b) Although seadornaviruses are considered arboviruses replicating in vertebrate hosts and their invertebrate vectors, the seadornavirus Kadipiro virus has been isolated solely from mosquitoes.

Phytoreovirus

Phytoreoviruses (from “phyto”, Greek for plant) were initially discovered in 1941 as wound tumor viruses of plants transmitted by leafhoppers, in which the virus also replicates. The icosahedral phytoreovirus virions appear to be double shelled, with a diameter of about 70 nm. While phytoreoviruses are generally considered plant viruses vectored by cicadellid leafhoppers, they induce no marked disease in the leafhopper vector. However, one unclassified reovirus in the genus Phytoreovirus, Homalodisca vitripennis reovirus (HoVRV), appears to have only an insect host, the hemipteran glassy winged sharpshooter. This virus was first described in 2009 and found to be closely related to Rice dwarf virus. It appears to be maintained in the absence of a plant host. The genome of Homalodisca vitripennis reovirus has an aggregate size of 25,724 bp with a G + C content of 38.70%, and the 12 dsRNA segments range in size from 1.04 to 4.47 kb.

Seadornavirus

Banna virus, in the genus Seadornavirus (from Southeast Asian dodeca, for the 12 dsRNA segments), was first discovered in 1987 in a patient with encephalitis and fever in China and is an arbovirus transmitted by Anopheles and Culex mosquitoes. However, two seadornaviruses have been found only in mosquitoes: Kadipiro virus in Culex and Liao ning virus in Aedes dorsalis. Seadornavirus icosahedral virions have a well-defined double-shelled structure with surface capsomers, a diameter of 60–70 nm and a 40–50 nm core. Kadipiro virus was first discovered following growth in mosquito C6/36 cells of material collected from Culex fuscocephalus in Java, Indonesia in 1981. Since then it has been isolated from other mosquitoes, particularly in China. The Kadipiro virus genome has an aggregate size of 20,985 bp with a G + C content of 37.68%, and the 12 dsRNA segments range in size from 0.77 to 3.77 kb. Liao ning virus, another seadornavirus, was initially isolated from Aedes dorsalis mosquitoes in North East China in 1997. The Liao ning virus genome has an aggregate size of 20,739 bp with a G + C content of 41.36%, and the 12 dsRNA segments range in size from 0.76 to 3.74 kb.
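The genome figures quoted for these viruses (aggregate size, G + C content and segment size range) are simple summaries over the segment sequences. The following is a minimal Python sketch of that arithmetic, assuming the segments are available as plain nucleotide strings. The function name `genome_summary` and the toy sequences are hypothetical illustrations, not real viral sequences.

```python
def genome_summary(segments):
    """Summarize a segmented genome given a list of nucleotide strings.

    Returns a tuple: (aggregate size in bp, G + C content in percent,
    shortest segment in kb, longest segment in kb).
    """
    if not segments:
        raise ValueError("at least one segment is required")
    lengths = [len(s) for s in segments]
    total = sum(lengths)
    if total == 0:
        raise ValueError("segments must not all be empty")
    # Count G and C in either case; A and U/T make up the remainder.
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in segments)
    return total, 100.0 * gc / total, min(lengths) / 1000.0, max(lengths) / 1000.0


# Toy, made-up segments used only to exercise the function.
toy_segments = ["AUGGCC", "AUAU", "GGCCAUAUAU"]
size_bp, gc_percent, shortest_kb, longest_kb = genome_summary(toy_segments)
```

Applied to the 12 segment sequences of a real isolate, the same function would return values of the kind listed in the table and text above.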

See also: Reoviruses (Reoviridae) and Their Structural Relatives

Further Reading

Attoui, H., Jaafar, M.F., Belhouchet, M., et al., 2005. Expansion of family Reoviridae to include nine-segmented dsRNA viruses: Isolation and characterization of a new virus designated Aedes pseudoscutellaris reovirus assigned to a proposed genus (Dinovernavirus). Virology 343, 212–223.
Attoui, H., Mertens, P.P.C., Becnel, J., et al., 2012. Reoviridae. In: King, A.M.Q., Adams, M.J., Carstens, E.B., Lefkowitz, E.J. (Eds.), Virus Taxonomy: Ninth Report of the International Committee on Taxonomy of Viruses. San Diego, CA: Elsevier Academic Press, pp. 541–637.
Bateman, K.S., Stentiford, G.D., 2017. A taxonomic review of viruses infecting crustaceans with an emphasis on wild hosts. Journal of Invertebrate Pathology 147, 86–110.
Kibenge, F.S.B., Godoy, M.G., 2016. Reoviruses of aquatic organisms. In: Kibenge, F.S.B., Godoy, M.G. (Eds.), Aquatic Virology. New York: Academic Press, pp. 205–236.
Mertens, P.P.C., Attoui, H., 2008. Insect reoviruses. In: Mahy, B.W.J., van Regenmortel, M.H.V. (Eds.), Encyclopedia of Virology, third ed., pp. 450–459.
Payne, C.C., Mertens, P.P.C., 1983. Cytoplasmic polyhedrosis viruses. In: Joklik, W.K. (Ed.), The Reoviridae. New York: Plenum, pp. 425–504.
Plus, N., Gissman, L., Veyrunes, J.C., Pfister, H., Gateff, E., 1981a. Reoviruses of Drosophila and Ceratitis populations and of Drosophila cell lines: A new genus of the Reoviridae family. Annales de Virologie 132E, 261–270.
Plus, N., Veyrunes, J.C., Cavalloro, R., 1981b. Endogenous viruses of Ceratitis capitata Wied. ‘J.R.C. Ispra strain’ and C. capitata permanent cell lines. Annales de Virologie 132E, 91–100.
Varma, M.G., Pudney, M., Leake, C.J., 1974. Cell lines from larvae of Aedes (Stegomyia) malayensis Colless and Aedes (S) pseudoscutellaris (Theobald) and their infection with some arboviruses. Transactions of the Royal Society of Tropical Medicine and Hygiene 68, 374–382.
Zhou, Y., Qin, T., Xiao, Y., et al., 2014. Genomic and biological characterization of a new cypovirus isolated from Dendrolimus punctatus. PLoS One 9, e113201. doi:10.1371/journal.pone.0113201.

Rhabdoviruses of Insects (Rhabdoviridae)

Andrea González-González, Nicole T de Stefano, David A Rosenbaum, and Marta L Wayne, University of Florida, Gainesville, FL, United States

© 2021 Elsevier Ltd. All rights reserved.

Classification

Rhabdoviruses are negative-sense, single-stranded RNA viruses that infect a wide range of hosts including mammals, birds, fish, reptiles, plants, and arthropods, and have a considerable impact on crop production and public health. The family Rhabdoviridae belongs to the order Mononegavirales and is assigned to group V of the Baltimore system based on its method of mRNA synthesis. The Rhabdoviridae comprise 20 genera and 144 defined species as well as many unclassified isolates. Each genus corresponds to a monophyletic group in a phylogenetic reconstruction inferred using a maximum likelihood algorithm and the full-length sequence of the large (L) gene. The L gene is a reliable molecular marker for taxonomic classification because of both its low rate of recombination and its high level of conservation, with some conserved regions involved in functions such as transcription and replication. Besides sharing a common evolutionary history, members of each genus have similar genomic structure, transmission mode and host range.

Insect rhabdoviruses can be divided into two groups: one whose members infect vertebrate hosts through arthropod vectors (i.e., arboviruses), and one whose members infect solely insects. Members of the genera Curiovirus, Ephemerovirus, Hapavirus, Ledantevirus, Sripuvirus, Tibrovirus, Vesiculovirus and Caligrhavirus have been isolated from arthropod vectors and can also infect vertebrate hosts. In contrast, the genera Almendravirus and Sigmavirus comprise viruses capable of infecting only arthropods. A phylogenetic analysis using the conserved regions of the L protein led to the suggestion of new genera, tentatively called Bahiavirus and Sawgravirus (not yet approved by ICTV), from arthropod hosts; it is not yet known whether they can infect non-arthropod hosts.

Additional novel rhabdoviruses have been identified in insect hosts using next-generation sequencing technologies. Some of these are still in the process of being classified into particular genera, such as the Withyham and Lye Green viruses isolated from Drosophila obscura, Merida virus (MERDV) isolated from mosquitoes, and Apis rhabdovirus-1 (ARV-1) and Apis rhabdovirus-2 (ARV-2) isolated from bees, bumble bees and their mites. Again, the breadth of host range for these viruses remains to be determined.

Almendravirus and Sigmavirus are the largest genera whose members infect only insects. So far, five species are officially classified in Almendravirus and seven in Sigmavirus (Table 1). The Almendravirus species were isolated from mosquitoes in the Americas and the Sigmavirus species from different species of flies collected around the world (Spain, USA, Kenya, France, Ghana and the UK). Recent metatranscriptomic studies have suggested the existence of more sigmavirus species (Table 1) that remain unclassified.

Table 1

Member species of invertebrate rhabdoviruses^a

Genus          Species                                             Abbreviation  Host
Sigmavirus     Drosophila affinis sigmavirus                       DAffSV        Fruit fly
Sigmavirus     Drosophila ananassae sigmavirus                     DAnaSV        Fruit fly
Sigmavirus     Drosophila immigrans sigmavirus                     DImmSV        Fruit fly
Sigmavirus     Drosophila melanogaster sigmavirus (Type species)   DMelSV        Fruit fly
Sigmavirus     Drosophila obscura sigmavirus                       DObsSV        Fruit fly
Sigmavirus     Drosophila tristis sigmavirus                       DTriSV        Fruit fly
Sigmavirus     Muscina stabulans sigmavirus                        MStaSV        False stable fly
Sigmavirus     Drosophila sturtevanti sigmavirus^b                 DStuSV        Fruit fly
Sigmavirus     Drosophila busckii sigmavirus^b                     DBusSV        Fruit fly
Sigmavirus     Drosophila algonquin sigmavirus^b                   DAlgSV        Fruit fly
Sigmavirus     Drosophila montana sigmavirus^b                     DMonSV        Fruit fly
Sigmavirus     Scaptodrosophila deflexa sigmavirus^b               SDefSV        Fruit fly
Sigmavirus     Ceratitis capitata sigmavirus^b                     CCapSV        Mediterranean fruit fly
Sigmavirus     Pararge aegeria rhabdovirus^b                       PAerRV        Speckled wood butterfly
Almendravirus  Arboretum almendravirus                             ABTV          Psorophora albigenu mosquitoes
Almendravirus  Balsa almendravirus                                 BALV          Culex erraticus mosquitoes
Almendravirus  Coot Bay almendravirus                              CBV           Anopheles quadrimaculatus mosquitoes
Almendravirus  Puerto Almendras almendravirus (Type species)       PTAMV         Ochlerotatus fulvus mosquitoes
Almendravirus  Rio Chico almendravirus                             RCHV          Anopheles triannulatus mosquitoes

a Table compiled using information from Longdon et al. (2015), Walker et al. (2018) and the ICTV Taxonomy Report at: https://talk.ictvonline.org/ictv-reports/ictv_online_report/negative-sense-rna-viruses/mononegavirales/w/rhabdoviridae.

b Newly discovered, unclassified Sigmavirus species. Membership in the Sigmavirus genus is implied by isolation from CO2-sensitive flies using RNA-Seq.

Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21557-4


Virion Structure

Like other members of the family Rhabdoviridae, virions of the Sigmavirus and Almendravirus genera are bullet shaped and enveloped. Sigmavirus particles have an approximate diameter of 80 nm and a length of 100 nm, while Almendravirus particles have a 40–55 nm diameter and a 190–460 nm length. The outer surface of the virion carries spikes formed by the glycoprotein, which crosses the lipid envelope. The interior of the virion contains the nucleocapsid, consisting of a single molecule of negative-sense single-stranded RNA associated with the nucleoprotein, the RNA-dependent RNA polymerase and the phosphoprotein. In addition, a matrix protein links the nucleocapsid to the membrane glycoproteins.

Genome

Rhabdoviruses have negative-sense, single-stranded, unipartite RNA genomes 11–16 kb in length containing five canonical structural genes that encode (3′–5′) the nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G) and large protein (L) (Fig. 1). The nucleoprotein constitutes the major component of the viral nucleocapsid and presents the template bases to the polymerase. The phosphoprotein interacts physically with L, ensuring that it is properly placed on the N-RNA template. It also secures the correct synthesis of N. The matrix protein forms the inner structural component of the virion and is thought to regulate genomic RNA transcription. The glycoprotein binds to host cell receptors and induces endocytosis of the virus. The large protein is a component of the nucleocapsid. It carries out functions related to transcription and replication, such as RNA-dependent RNA polymerase (RdRp) activity, 5′ capping and 3′ polyadenylation. Genes are separated by short untranscribed intergenic sequences, and each has conserved transcription stop and start sequences of about 10 nt in length. The genome of some rhabdoviruses includes additional genes located between the structural protein genes; therefore genomes of viruses from different genera may vary in length and organization.

Fig. 1 Genome organization of different insect rhabdoviruses. In contrast to the typical rhabdovirus genome structure consisting of N, P, M, G and L ORFs, the typical Sigmavirus genome also encodes the X protein, whose function is unknown. Drosophila busckii sigmavirus encodes an extra protein between G and L, also of unknown function. Almendravirus genomes do not code for the X protein but encode a class 1a viroporin-like protein ORF located between G and L (filled rectangle). Empty spaces between ORFs correspond to untranscribed intergenic sequence regions. The broken line indicates genome regions not sequenced. Modified from Dietzgen, R.G., Kondo, H., Goodin, M.M., Kurath, G., Vasilakis, N., 2017. The family Rhabdoviridae: Mono- and bipartite negative-sense RNA viruses with diverse genome organization and common evolutionary origins. Virus Research 227, 158–170. Longdon, B., Murray, G.G.R., Palmer, W.J., et al., 2015. The evolution, diversity, and host associations of rhabdoviruses. Virus Evolution 1, 1–12. Walker, P.J., Blasdell, K.R., Calisher, C.H., et al., 2018. ICTV Virus Taxonomy Profile: Rhabdoviridae. Journal of General Virology 99, 447–448.

Sigmavirus Genome

Members of the Sigmavirus genus have genomes that range from 12.4 to 14.5 kb and typically encode six proteins. The N, P, M, G and L proteins have the same structural characteristics as, and are homologous to, the proteins of other rhabdoviruses. The X gene is located between P and M and codes for the protein PP3 (also called X protein), which is presumably involved in host-virus interactions. Differences in genome size are explained by variation in the length of the untranscribed intergenic sequences. For example, there is a 664-nucleotide untranscribed region between G and L in Drosophila affinis sigmavirus (DAffSV) (Fig. 1). Some of the newly discovered, unclassified sigmaviruses isolated from different species of flies share the canonical six-gene genome structure of Drosophila melanogaster sigmavirus (DMelSV), while others deviate from the prototypic genome organization. For example, the Drosophila busckii sigmavirus (DBusSV) genome lacks the X gene; however, it has an accessory ORF of unknown function between the G and L genes. Pararge aegeria rhabdovirus (PAerRV) does not code for the X protein, and its G protein is split into two contiguous ORFs. Other sigmaviruses have been identified from RNA-seq data, including Drosophila montana sigmavirus (DMonSV), Drosophila sturtevanti sigmavirus (DStuSV) and Scaptodrosophila deflexa sigmavirus (SDefSV); however, their genomes are not yet fully sequenced (Fig. 1).

Almendravirus Genome

Members of the Almendravirus genus have genomes that range from 10.5 to 11.5 kb. They encode the five canonical proteins found in members of the family Rhabdoviridae, with the same structural characteristics as, and homology to, the proteins of other rhabdoviruses. In addition, almendraviruses encode a class 1a viroporin-like protein as an independent transcriptional unit between the G and L genes. The Balsa virus (BALV) genome has an extra ORF (Px) within the P gene, the function of which is unknown (Fig. 1).

Other Insect Rhabdovirus Genomes

The recently discovered Apis mellifera rhabdoviruses 1 (ARV-1) and 2 (ARV-2) have genomes about 14.5 kb long. Both have the prototypic rhabdovirus genome structure consisting of the five open reading frames N, P, M, G and L. In contrast, the genomes of other unclassified rhabdoviruses, isolated from bedbugs (Shuangao Bedbug Virus 2), mosquitoes (Wuhan Mosquito Virus 9) and flies (Shayang Fly Virus 3), code only for the G and L proteins plus three different capsid proteins (VP1, VP2 and VP3).

Life Cycle

Sigma virus follows the same basic steps as other rhabdoviruses for cell entry, transcription, translation, replication, and assembly, all of which occur predominantly in the cytoplasm. Rhabdoviruses have five essential proteins that aid in these processes: nucleoprotein (N), glycoprotein (G), RNA-dependent RNA polymerase large subunit (L), phosphoprotein (P), and matrix protein (M).

Initially, the G protein mediates attachment of the virus to the host cell membrane. The virus enters the host cell by receptor-mediated endocytosis, utilizing clathrin-coated pits. The low pH of the endosomal environment triggers a conformational change in the G protein that facilitates fusion of the viral envelope with the host membrane, releasing the nucleocapsid into the cytoplasm, where transcription can begin.

Transcription of the viral RNA is initiated by the detachment of the M protein from the nucleocapsid. Primary transcription of the negative-sense RNA is catalyzed by the viral transcriptase and follows a stop-start pattern starting at the 3′ end of the RNA. Each mRNA transcript begins with an initiation sequence and is terminated by a polyadenylation signal, which results in a string of A’s at the end of the transcript. The essential genes are transcribed as capped, polyadenylated mRNAs in the order N, P, M, G, L. Small RNAs that are uncapped and positive sense, typically considered leader RNAs, are also produced.

Once transcribed, the mRNAs are translated into proteins, which are transported to different areas of the cell. Translation of N, M, P, and L occurs on cytoplasmic polysomes. The G protein, however, is translated on membrane-bound polysomes and then transported from the endoplasmic reticulum to the Golgi apparatus for processing and finally to the plasma membrane.

After a sufficient amount of protein has been produced, genome replication is initiated. Replication depends on encapsidation of the nascent RNA, a process in which the N and P proteins are vital components. The newly synthesized positive-sense antigenome provides a template for synthesis of the negative-sense RNA to be encapsidated. Once initial RNA replication has occurred, secondary transcription, translation, and replication can take place. After its formation, the nucleocapsid attaches to the cell membrane and the M protein aids in the budding process, by which virions are released and can spread the infection to adjacent cells.


Integration Into the Host Genome

SIGMAV-like and other rhabdovirus-like elements homologous to the genes coding for the L (RNA-dependent RNA polymerase), N (nucleoprotein), P (polymerase-associated protein) and G (glycoprotein) proteins are integrated into the genomes of a range of Drosophila species as well as the genomes of the mosquito Aedes aegypti and the tick Ixodes scapularis. There is evidence suggesting that the sequence encoding the viral P protein was integrated into the D. yakuba genome by retrotransposon template switching, and that it is expressed by the host.

Epidemiology

Distribution, Spatial and Temporal

Rhabdoviruses have been found worldwide in natural populations of insects from the families Drosophilidae, Tephritidae, Nymphalidae, Culicidae and Apidae.

Transmission

The most studied rhabdovirus in insects is sigma virus, which has been isolated from different Drosophila species (Table 1). DMelSV (sigma virus isolated from D. melanogaster), DAffSV (sigma virus isolated from D. affinis) and DObsSV (sigma virus isolated from D. obscura) are vertically transmitted by both parents to their offspring, a condition thought to ensure the persistence of vertically transmitted viruses in natural populations despite the fitness cost they can impose on infected flies. In the wild, females transmit DMelSV and DAffSV more efficiently than males. In contrast, DObsSV is transmitted at similar rates by both parental sexes. Additionally, a fly infected by its female parent is more likely to transmit DMelSV to its offspring than a fly infected by its male parent. Thus, while patrilineal transmission can continue for multiple generations, its efficiency decreases with each generation. In contrast, D. affinis and D. obscura males infected by their male parents do not transmit the virus to their offspring. Conveniently, although sigma virus is vertically transmitted in nature, it can be experimentally injected into naïve female flies, where it replicates and is transmitted to the next generation.

Prevalence

In the field, D. melanogaster females tend to have a slightly higher prevalence of sigma virus than males. Prevalence varies from season to season in natural populations, ranging from 6.2% to 28% depending on the year. It may also be influenced by a latitudinal or climatic gradient (lower latitudes having higher prevalence than higher latitudes). In D. obscura, the prevalence of DObsSV ranges from 22% to 73% depending on the collecting site in the UK. Laboratory experiments have demonstrated that DMelSV, DAffSV and DObsSV can persist and replicate in closely related novel hosts.

Population Dynamics

Sigma virus isolated from different natural populations of D. melanogaster across Europe shows very low genetic diversity and no detectable recombination. At the same time, this diversity is geographically structured: viral isolates from a particular population share a pool of genetic variants distinct from those of other populations. Like DMelSV, DObsSV has very low genetic variation. However, in DObsSV the genetic variants are distributed homogeneously among different geographical populations. Mathematical modeling indicates that DObsSV invades natural populations of D. obscura at a fast pace, dramatically increasing its population size. At the level of the single fly, infections by multiple strains of DMelSV have been reported, based on measurements of plaque size and of incubation time to CO2 sensitivity. Sequence variation consistent with multiple infections has also been detected.

Clinical Features

Infection with sigma virus decreases female fecundity in natural populations of D. melanogaster. In addition, in laboratory populations, eggs laid by uninfected females had greater viability and shorter development times than those laid by infected females.

Pathogenesis

The paralysis experienced by D. melanogaster flies after exposure to CO2 has been attributed to damage to the nervous tissue caused by replication of sigma virus in the fly thoracic ganglia. The virus also infects other tissues such as the eyes, brain and other nerves. However, the exact mechanism remains unknown. Although not directly studied in the D. melanogaster-sigma virus system, it can be hypothesized that the G protein plays an important role in this CO2 sensitivity. In other members of the family Rhabdoviridae, such as rabies virus, the viral G protein attaches to different host cell receptors, allowing the virus to enter nerve cells, and mediates the spread of the virus among neurons.

Diagnosis

Sigma virus-infected flies die or become permanently paralyzed after being exposed to high levels of CO2, and this CO2 sensitivity is correlated with viral titer. This symptom is commonly used to screen flies in natural and laboratory populations. Adult flies are exposed to CO2 for 5–15 min at 12 °C and scored after 30 min, by which time infected flies have rarely recovered. While death after exposure to CO2 is very strongly associated with sigma virus infection in D. melanogaster, in other Drosophila species such as D. obscura some infected flies are not sensitive to CO2. Another way to test for the presence of sigma virus is quantitative real-time PCR (qRT-PCR), or first-strand synthesis followed by quantitative PCR.
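Screening results of this kind are usually summarized as a prevalence with an uncertainty range. The following is a minimal Python sketch, assuming each fly is scored simply as recovered or not recovered after the CO2 assay. The function name `wilson_interval` and the example counts are hypothetical, and the Wilson score interval is one common choice of binomial confidence interval rather than a method prescribed by the text. Because some infected flies (for example in D. obscura) are not CO2 sensitive, an assay-based estimate of this kind may understate true prevalence.

```python
import math


def wilson_interval(positives, n, z=1.96):
    """Return (prevalence, lower, upper) using the Wilson score interval.

    positives: number of flies scored as infected (did not recover).
    n: total number of flies screened.
    z: normal quantile; 1.96 corresponds to an approximate 95% interval.
    """
    if n <= 0 or not 0 <= positives <= n:
        raise ValueError("need n > 0 and 0 <= positives <= n")
    p = positives / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return p, max(0.0, center - half), min(1.0, center + half)


# Hypothetical screen: 14 of 100 flies failed to recover from CO2 exposure.
prevalence, lower, upper = wilson_interval(14, 100)
```

The interval narrows as more flies are screened, which matters when comparing prevalence between seasons or collecting sites.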

Further Reading

Ammar, E.-D., Tsai, C.-W., Whitfield, A.E., Redinbaugh, M.G., Hogenhout, S.A., 2008. Cellular and molecular aspects of Rhabdovirus interactions with insect and plant hosts. Annual Review of Entomology 54, 447–468.
Brun, G., 1963. Étude d’une association du virus σ et de son hôte la Drosophile: L’état stabilisé. Thèse, Paris-Orsay, p. 254.
Carpenter, J.A., Obbard, D.J., Maside, X., Jiggins, F.M., 2007. The recent spread of a vertically transmitted virus through populations of Drosophila melanogaster. Molecular Ecology 16, 3947–3954.
Contreras, M.A., Eastwood, G., Guzman, H., et al., 2017. Almendravirus: A proposed new genus of rhabdoviruses isolated from mosquitoes in tropical regions of the Americas. The American Journal of Tropical Medicine and Hygiene 96, 100–109.
Dietzgen, R.G., Kondo, H., Goodin, M.M., Kurath, G., Vasilakis, N., 2017. The family Rhabdoviridae: Mono- and bipartite negative-sense RNA viruses with diverse genome organization and common evolutionary origins. Virus Research 227, 158–170.
Fleuriet, A., 1976. Presence of the hereditary rhabdovirus sigma and polymorphism for a gene for resistance to this virus in natural populations of Drosophila melanogaster. Evolution 30, 735–739.
Fleuriet, A., 1988. Maintenance of a hereditary virus, the Sigma virus in populations of its host, Drosophila melanogaster. Evolutionary Biology 23, 1–30.
Levin, S., Galbraith, D., Sela, N., et al., 2017. Presence of Apis Rhabdovirus-1 in populations of pollinators and their parasites from two continents. Frontiers in Microbiology 8, 2482.
L’Héritier, P.L., Teissier, G., 1945. Transmission héréditaire de la sensibilité au gaz carbonique chez la Drosophile. Publ. Lab. École Normale Supérieure Biologie, vol. 1, pp. 35–76.
Longdon, B., Murray, G.G.R., Palmer, W.J., et al., 2015. The evolution, diversity, and host associations of rhabdoviruses. Virus Evolution 1, 1–12.
Rittschof, C.C., Pattanaik, S., Johnson, L., et al., 2013. Sigma virus and male reproductive success in Drosophila melanogaster. Behavioral Ecology and Sociobiology 67, 529–540.
Tsai, C.W., McGraw, E.A., Ammar, E.D., Dietzgen, R.G., Hogenhout, S.A., 2008. Drosophila melanogaster mounts a unique immune response to the rhabdovirus sigma virus. Applied and Environmental Microbiology 74, 3251–3256.
Walker, P.J., Blasdell, K.R., Calisher, C.H., et al., 2018. ICTV Virus Taxonomy Profile: Rhabdoviridae. Journal of General Virology 99, 447–448.
Walker, P.J., Firth, C., Widen, S.G., et al., 2015. Evolution of genome size and complexity in the Rhabdoviridae. PLOS Pathogens 11, e1004664.
Wayne, M.L., Blohm, G.M., Brooks, M.E., et al., 2011. The prevalence and persistence of sigma virus, a biparentally transmitted parasite of Drosophila melanogaster. Evolutionary Ecology Research 13, 323–345.

Relevant Websites

https://viralzone.expasy.org/7276 Almendravirus ViralZone page.
https://talk.ictvonline.org/ictv-reports/ictv_online_report/negative-sense-rna-viruses/mononegavirales/w/rhabdoviridae/788/genus-almendravirus Genus: Almendravirus Rhabdoviridae Mononegavirales.
https://talk.ictvonline.org/ictv-reports/ictv_online_report/negative-sense-rna-viruses/mononegavirales/w/rhabdoviridae/799/genus-sigmavirus Genus: Sigmavirus Rhabdoviridae Mononegavirales.
https://talk.ictvonline.org/ictv-reports/ictv_online_report/negative-sense-rna-viruses/mononegavirales/w/rhabdoviridae Rhabdoviridae Rhabdoviridae Mononegavirales.
https://viralzone.expasy.org/4656 Sigmavirus ViralZone page.
https://www.genome.jp/virushostdb Virus-Host Database.

Sarthroviruses (Sarthroviridae)

Azeez Sait Sahul Hameed, C. Abdul Hakeem College, Melvisharam, India

© 2021 Elsevier Ltd. All rights reserved.

Glossary

Cell line A permanently established cell culture that will proliferate indefinitely in appropriate cell culture medium.
Pathogenicity The ability of an infectious agent to cause disease.
Post-larvae One of the early developmental stages of freshwater prawn, following the larval stages.
Satellite virus A virus dependent on another virus for replication and transcription.
Vertical transmission Transmission of a virus from an infected brooder to its progeny.

Introduction

The giant freshwater prawn, Macrobrachium rosenbergii, is the crustacean species most favored by prawn farmers and is cultured on a large scale in many countries. Disease is one of the important factors limiting aquaculture production worldwide. Like other animals, prawns are affected by viruses, bacteria, fungi, and metazoan parasites. Far fewer diseases have been reported in freshwater prawn than in marine shrimp because M. rosenbergii is a moderately disease-resistant species. Among the pathogens, prawn viruses are very important and are responsible for huge economic losses, particularly in the hatchery and nursery phases. Reported prawn viruses include Macrobrachium hepatopancreatic parvo-like virus (MHPV), Macrobrachium muscle virus (MMV), infectious hypodermal and hematopoietic necrosis virus (IHHNV) and white spot syndrome virus (WSSV). MHPV, MMV, and IHHNV have each been reported only once in freshwater prawn (Anderson et al., 1990; Tung et al., 1999; Hsieh et al., 2006). M. rosenbergii is not a natural host of WSSV, but the syndrome has been induced experimentally in this species via different routes of infection. A new viral disease, white tail disease (WTD), caused by Macrobrachium rosenbergii nodavirus (MrNV) and Macrobrachium satellite virus 1, has been reported to cause severe mortality in freshwater prawn hatcheries and nursery ponds in different parts of the world.

Geographical Distribution

This virus was first reported in the French West Indies and later in China, Chinese Taipei, Thailand, Australia, and Malaysia. WTD continues to be reported in other prawn-growing countries.

Clinical Signs of WTD and Histopathology

The clinical signs of WTD in infected post-larvae include lethargy and opaqueness of the abdominal muscle that gradually extends from the center to the anterior and posterior sections of the muscle. Degeneration of the telson and uropods is observed in severe cases. The prevalence of opaque, milky post-larvae increases dramatically 1–2 days later, and the changes are particularly obvious in the abdomen (tail). The mortality rate reaches a maximum five days after the first observation of gross signs. The most affected tissues in WTD-infected post-larvae are the striated muscles of the abdomen and cephalothorax and the connective tissues of all organs. Basophilic cytoplasmic inclusions with a diameter of 1–40 µm are found in striated muscles of the abdomen and cephalothorax and in the intratubular connective tissue of the hepatopancreas. No viral inclusions have been observed in epithelial cells of the hepatopancreatic tubules or in midgut mucosal epithelial cells. Muscles exhibit multifocal areas of hyaline necrosis of the fibers, with moderate edema.

Morphology of Virus

Extra small virus particles are non-enveloped, icosahedral and about 15 nm in diameter, with a density in CsCl of 1.325 g/ml, and are serologically unrelated to those of Macrobrachium rosenbergii nodavirus (Fig. 1). Two overlapping coat proteins of about 17 kDa (CP17) and 16 kDa (CP16) are found in extra small virus particles. Because the particles are very small, this viral agent has been called extra small virus (XSV).

doi:10.1016/B978-0-12-809633-8.21558-6

Fig. 1 Negatively stained purified (a) extra small virus (bar = 50 nm) and (b) Macrobrachium rosenbergii nodavirus (bar = 100 nm).

Genome Organization

The genome of XSV is a linear positive-sense RNA molecule of 796 nucleotides with a single ORF located between nucleotides 63 and 587, and a short poly(A) tail of 15–20 nucleotides at the 3′ end. A potential polyadenylation signal (AAUAAA) is located 6 nt upstream of the poly(A) tail. The ORF encodes two capsid proteins, CP16 and CP17 (Fig. 2). Amino acid sequence analysis indicates that CP16 is synthesized by internal initiation at a second methionine codon, 11 codons downstream of the first initiation codon. The two forms of the capsid protein are synthesized in an equimolar ratio.
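The coordinates given for the XSV ORF allow the lengths of the two capsid protein forms to be worked out directly. The following is a minimal Python sketch of that arithmetic. It rests on two assumptions that the text does not state: that positions 63 and 587 are 1-based and inclusive with the stop codon counted inside the ORF, and that the internal methionine codon lies exactly 11 codons after the first one. The function name `orf_lengths` and the constant names are hypothetical.

```python
ORF_START = 63        # first nucleotide of the ORF (from the text; assumed 1-based)
ORF_END = 587         # last nucleotide of the ORF (assumed inclusive, stop codon included)
INTERNAL_OFFSET = 11  # codons between the first and the second initiation codon


def orf_lengths(start, end, internal_offset, includes_stop=True):
    """Return (ORF length in nt, full-length protein in aa,
    internally initiated protein in aa, nt position of the internal start)."""
    nt = end - start + 1
    if nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = nt // 3
    full_aa = codons - 1 if includes_stop else codons
    internal_aa = full_aa - internal_offset
    internal_start_nt = start + 3 * internal_offset
    return nt, full_aa, internal_aa, internal_start_nt


orf_nt, cp17_aa, cp16_aa, cp16_start_nt = orf_lengths(ORF_START, ORF_END, INTERNAL_OFFSET)
```

Under these assumptions the ORF spans 525 nt, the full-length product is 174 residues and the internally initiated product is 163 residues. An 11-residue difference is of the same order as the roughly 1 kDa difference between the two capsid protein forms.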

Taxonomic Position

The genome sequence of XSV has no significant homology with known virus genomes. The XSV genome encodes only the capsid proteins, and the virus depends on the RdRp of MrNV for replication and transcription. It therefore meets the primary criteria for a satellite virus and appears more similar to the known satellite viruses infecting plants (e.g., tobacco necrosis satellite virus) than to those infecting insects (e.g., chronic bee paralysis-associated satellite virus). XSV is a member of the single species Macrobrachium satellite virus 1, in the single genus Macronovirus, family Sarthroviridae. Isolates from different geographical locations, including the French West Indies, Thailand, Taiwan, China and India, have been reported. These isolates showed 96%–99% nucleotide sequence identity in their capsid protein genes, but the encoded proteins are not recognizably similar to proteins in the public sequence databases.
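Identity figures such as the 96%–99% quoted for the capsid protein genes are pairwise measures over aligned sequences. The following is a minimal Python sketch, assuming the two sequences have already been aligned to equal length with "-" marking gaps. The function name `percent_identity` and the toy sequences are hypothetical, and skipping gapped columns is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences.

    Columns in which either sequence has a gap ("-") are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments (not real XSV sequences).
toy_identity = percent_identity("AUGGCUAA-G", "AUGGCAAAUG")
```

In the toy pair, nine columns are compared and eight match, so the function returns a value just under 89%.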

Pathogenicity and Transmission of Disease

A mixture of MrNV and XSV administered by immersion caused 100% mortality in post-larvae by 12 days post infection (d p.i.), indicating the possibility of horizontal transmission of the disease. In the virus-infected group, post-larvae started showing whitish muscles at 7 d p.i., and the proportion of affected animals peaked at 12 d p.i. When post-larvae were challenged by immersion with different combinations of purified MrNV and XSV (Zhang et al., 2006), clinical signs of WTD and high mortality were observed with combinations containing a relatively high proportion of MrNV and a low proportion of XSV. In contrast, there was little sign of WTD and low mortality in post-larvae challenged with a higher proportion of XSV than MrNV, indicating that MrNV plays a key role in causing disease in M. rosenbergii. RT-PCR analysis confirmed the presence of both viruses in experimentally infected post-larvae and in gill tissue, head muscle, stomach, intestine, heart, hemolymph, pleopods, ovaries, and tail muscles of experimentally injected adult prawns. Previous studies found that MrNV and XSV were always associated with WTD (Zhang et al., 2006). Despite these studies, the role played by XSV in the disease, and particularly in its pathogenicity, remains unclear and needs further research.

XSV affects mainly post-larvae of M. rosenbergii and is responsible for severe mortalities in the early life stages of freshwater prawn. Experimental infection of brooders with both MrNV and XSV confirmed vertical transmission from infected broodstock to post-larvae and indicated that the vertical route is the main route of WTD transmission. In infected brooders, ovarian tissue and fertilized eggs were positive for MrNV/XSV by RT-PCR, and mortality of up to 100% was observed in post-larvae hatched from the eggs of infected brooders. Some virus-infected post-larvae can survive to adulthood and act as carriers of the disease. Artemia, which plays an important role in the nutrition of shellfish at the larval stages, may also serve as a vector for MrNV/XSV. Experimental infection and RT-PCR assays indicated that MrNV and XSV, the agents responsible for WTD, can possibly be transmitted to post-larvae of freshwater prawn via the oral route.

Fig. 2 Genome organization of MrNV and XSV (Sarthroviridae). RdRP: RNA-dependent RNA polymerase, CP: capsid protein. Numbers indicate nucleotide positions.

Host Range The early life stages of M. rosenbergii (larvae, post-larvae and early juveniles) are susceptible to XSV. No mortality was observed in naturally or experimentally (MrNV/XSV) infected subadult and adult prawns. In addition, XSV has been detected by PCR in kuruma prawn (Penaeus japonicus), Indian white prawn (Penaeus indicus), giant tiger prawn (Penaeus monodon), dragonfly (Aeshna sp.), giant water bug (Belostoma sp.), beetle (Cybister sp.), backswimmer (Notonecta sp.), hairy river prawn (Macrobrachium rude), monsoon river prawn (Macrobrachium malcolmsonii), brine shrimp (Artemia sp.), and red claw crayfish (Cherax quadricarinatus).

Susceptibility of Cell Line to XSV XSV can be easily propagated in the C6/36 cell line derived from the mosquito Aedes albopictus. This cell line can be cultured at 28 °C in Leibovitz L-15 medium containing 100 International Units ml⁻¹ penicillin, 100 µg ml⁻¹ streptomycin and 2.5 µg ml⁻¹ fungizone, supplemented with 10% fetal bovine serum. The C6/36 cell line was found to be useful for propagation of XSV, and viral replication was confirmed by RT-PCR, acridine orange staining, infectivity studies and electron microscopy. A specific cytopathic effect was not observed in XSV-infected cells, but multiple vacuolations were observed.

Diagnostic Tools XSV can be detected by genome-based and protein-based methods. Diagnostic methods developed in laboratories around the world include histopathology, immunological methods, RT-PCR, loop-mediated isothermal amplification (LAMP), in situ hybridization, and dot blot hybridization. All of these methods were designed to detect both MrNV and XSV in samples.


Control Measures RT-PCR and other user-friendly kits for the detection of XSV are commercially available. Screening of broodstock and seed is therefore strongly recommended to prevent WTD on farms. Broodstock or seed that test positive for WTD should be discarded using proper zoo-sanitary methods. The usual sanitation and control protocols for viral infections are recommended to avoid WTD on prawn farms.

Acknowledgment The author thanks the Department of Biotechnology, Government of India, New Delhi, India, for financial support to carry out research on viruses of shrimp and prawn under the Viral Repository Project (BT/PR12660/AAQ/3/710/2014).

References
Anderson, I.G., Law, T., Shariff, M., Nash, G., 1990. A parvo-like virus in the giant freshwater prawn, Macrobrachium rosenbergii. Journal of Invertebrate Pathology 55, 447–449.
Hsieh, C.Y., Chuang, P.C., Chen, L.C., et al., 2006. Infectious hypodermal and haematopoietic necrosis virus (IHHNV) infections in giant freshwater prawn, Macrobrachium rosenbergii. Aquaculture 258, 73–79.
Tung, C.W., Wang, C.S., Chen, S.N., 1999. Histological and electron microscopic study on Macrobrachium muscle virus (MMV) infection in the giant freshwater prawn, Macrobrachium rosenbergii (de Man), cultured in Taiwan. Journal of Fish Diseases 22, 319–323.
Zhang, H., Wang, J., Yuan, J., et al., 2006. Quantitative relationship of two viruses (MrNV and XSV) in white tail disease of Macrobrachium rosenbergii de Man. Diseases of Aquatic Organisms 71, 11–17.

Relevant Website https://talk.ictvonline.org/ictv-reports/ictv_online_report/positive-sense-rna-viruses/w/sarthroviridae Sarthroviridae. Positive-sense RNA Viruses. ICTV.

Solinviviruses (Solinviviridae)
Steven M Valles, Center for Medical, Agricultural and Veterinary Entomology, Agricultural Research Service, US Department of Agriculture, Gainesville, FL, United States
Andrew E Firth, University of Cambridge, Cambridge, United Kingdom
Published by Elsevier Ltd.

Glossary
Biological Control (= biocontrol) A method to control insect pests using organisms that are natural predators, parasites, or pathogens.
Biopesticide (= biological pesticide) A pesticide whose active ingredient is biologically based, including predators, parasites, or pathogens.
Entomopathogen A pathogen of insects.
Ribosomal Frameshifting Method of translation exhibited by some viruses to fuse proteins encoded by two overlapping open reading frames.
Trophallaxis In social insects, the exchange of food or other nutrients between colony members by mouth-to-mouth or mouth-to-anus feeding.

Introduction Solinviviridae is a new family of viruses that possess non-segmented, single-stranded, positive-sense RNA genomes of 10–11 kilobases. This is a highly divergent group exhibiting phylogenetic, genome, and replication characteristics with aspects that are both picornavirus- and calicivirus-like. Among the two classified species of the family, gene order appears conserved. However, whereas the genome of Nylanderia fulva virus 1 (NfV-1) contains a single large open reading frame, in the genome of Solenopsis invicta virus 3 (SINV-3) this is split into two large open reading frames, with the second open reading frame being accessed by a ribosomal frameshifting mechanism, resulting in the translation of a trans-frame fusion polyprotein. In both species, the non-structural proteins involved in replication (picorna-like helicase, protease and RNA-dependent RNA polymerase) are encoded towards the 5′ end of the genome, and the capsid proteins (a jelly-roll fold domain, an extension domain, and an additional capsid-associated protein) are encoded towards the 3′ end. The capsid proteins may be expressed from genomic or subgenomic RNAs. SINV-3 is the best characterized of the two species and infects the invasive red imported fire ant (Solenopsis invicta Buren); NfV-1 infects the tawny crazy ant (Nylanderia fulva (Mayr)). Although SINV-3 and NfV-1 infect ant species (Insecta: Formicidae), additional unclassified related sequences have been identified in various other arthropods.
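The trans-frame fusion described for SINV-3 can be illustrated with a minimal Python sketch. It assumes a −1 shift (the direction given in the genome description of SINV-3) and uses a hypothetical toy RNA, not the SINV-3 sequence; the function names and the shift position are likewise illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(rna, start=0):
    """Translate codon by codon from `start` until a stop codon or the end."""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def translate_with_minus1_shift(rna, shift_at):
    """Read frame 0 up to index `shift_at` (a codon boundary), then slip
    back one nucleotide and continue reading in the -1 frame."""
    upstream = translate(rna[:shift_at])
    downstream = translate(rna, start=shift_at - 1)
    return upstream + downstream

# Hypothetical toy RNA: frame 0 ends at a UAG stop right after the shift site.
TOY_RNA = "AUGGCUAAAUUUUAGGCAAGUGA"

orf1_only = translate(TOY_RNA)                     # frame 0 product alone
fusion = translate_with_minus1_shift(TOY_RNA, 12)  # trans-frame fusion product
print(orf1_only, fusion)                           # MAKF MAKFLGK
```

Without the shift, the ribosome terminates at the frame 0 stop codon; with it, the last nucleotide of the final frame 0 codon is read a second time and translation continues into the overlapping frame.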

Classification (Compact) The Solinviviridae name is derived from the type species, Solenopsis invicta virus 3. The family currently comprises two genera, each currently containing a single species, Invictavirus (derived from Solenopsis invicta virus 3) and Nyfulvavirus (derived from Nylanderia fulva virus 1). Comparative analyses of protein complement, gene expression, and genome organization of the two members suggest that the family may form a sister group to the Caliciviridae. However, this conclusion will require additional sequence discoveries and analyses of related viruses to resolve the taxonomic position of the group. The Solinviviridae exhibit characteristics that are picornavirus-like and calicivirus-like. Similar to both picornaviruses and caliciviruses, the Solinviviridae encode a superfamily III helicase, a 3C-like chymotrypsin-related cysteine protease and a superfamily I RNA-dependent RNA polymerase. Additional calicivirus-like characteristics exhibited by solinviviruses include a single jelly-roll fold capsid domain, capsid protein encoding in the 3′ end of the genome, production of a capsid-encoding subgenomic RNA (empirically established in SINV-3 only), similar nucleotide sequences at or near the genomic and subgenomic RNA 5′-termini, and the presence of a capsid projection domain. Another calicivirus characteristic exhibited by SINV-3 is its environmental stability and resistance to heat inactivation. SINV-3 remains viable for years within ant corpses. Currently, the genomes of three SINV-3 isolates have been sequenced in their entirety: the DM isolate from North American Solenopsis invicta (GenBank Accession FJ528584), the SF isolate from South American Solenopsis invicta (GU017972), and the Hybrid isolate from Solenopsis invicta x richteri hybrid ants (MF797911). The isolates exhibit highly similar amino acid sequences (>98% identical), with the majority of nucleotide differences being synonymous. The genome of NfV-1 (isolate Florida initial) has also been sequenced in its entirety (KX024775).

Virion Structure Electron microscopy of SINV-3 revealed spherical particles with irregularly spaced surface projections, cup-shaped depressions, and a diameter of 27.3 ± 1.3 nm (Fig. 1). NfV-1 particles were also spherical with irregularly spaced projections, and a diameter of 28.7 ± 1.1 nm. This morphology was also reported in related unclassified viruses, including Kelp fly virus and Riptortus pedestris virus 1. The irregularly spaced surface projections appear to originate at the five-fold axes of symmetry in Kelp fly virus.


Encyclopedia of Virology, 4th Edition, Volume 4

doi:10.1016/B978-0-12-809633-8.21559-8


Fig. 1 Electron micrographs of a negative stain of a Solenopsis invicta virus 3 preparation purified by cesium chloride isopycnic centrifugation. Surface projections (Sp) and cup-shaped depressions (Csd) are evident. Bar size represents 25 nm and 200 nm in the upper and lower panels, respectively.

SINV-3 virions contain two major proteins, viral protein 1 (VP1) and VP1 fused to the capsid projection domain (VP1-CPD) (Fig. 2). VP1 is a single jelly-roll capsid protein and is the major component of the virion; particles are presumed to contain 180 copies of VP1 in a T = 3 icosahedral shell arrangement. The capsid projection domains of VP1-CPD proteins are thought to form the irregularly spaced projections. The VP1:VP1-CPD ratio is estimated at approximately 5:1 in SINV-3. A third protein, viral protein 2 (VP2), encoded downstream of CPD, is detected at only sub-stoichiometric levels in SINV-3 virions. It is presumed that VP2 is proteolytically cleaved from the CPD region, but this has not been determined empirically, and the function of VP2 and its contribution to virion structure or formation are not yet known. For NfV-1, the virion proteins may be slightly different because the genome contains only one long open reading frame. In contrast to SINV-3, no frameshift separates the VP1 and CPD coding sequences (Fig. 2), and it is presumed that all VP1 and CPD polypeptides are produced as the single linked form (VP1-CPD). It is possible that this VP1-CPD polypeptide is subsequently cleaved to produce free VP1. This latter notion is supported by evidence from the related, unclassified Kelp fly virus, whose genome, like that of NfV-1, also contains only a single open reading frame. Two proteins were identified in virions of Kelp fly virus in a molar ratio of 5:1. These proteins may correspond to VP1 and VP1-CPD, respectively.
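The copy numbers implied by these figures can be worked out with a short Python sketch. It relies on the standard relation that an icosahedral shell with triangulation number T contains 60T subunits, and on the assumption (not stated in the text) that the VP1 domain of each VP1-CPD molecule counts towards the 180 shell positions; the function names are illustrative.

```python
def capsid_subunits(t_number):
    """Subunits in an icosahedral shell with triangulation number T (60 * T)."""
    return 60 * t_number

def split_by_ratio(total, a, b):
    """Split `total` copies into two classes in the molar ratio a:b."""
    first = total * a // (a + b)
    return first, total - first

total_copies = capsid_subunits(3)                        # T = 3 shell
free_vp1, vp1_cpd = split_by_ratio(total_copies, 5, 1)   # approximate 5:1 ratio
print(total_copies, free_vp1, vp1_cpd)                   # 180 150 30
```

Under that assumption, a 5:1 ratio would correspond to roughly 150 free VP1 and 30 VP1-CPD molecules per particle. Because the ratio is itself an estimate, these numbers are indicative only.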

Genome Members of the family Solinviviridae have single-stranded, positive-sense RNA genomes of 10–11 kb with polyadenylated 3′ tails. The genome of NfV-1 contains a single large open reading frame. In contrast, the genome of SINV-3 contains a large 5′-proximal open reading frame (spanning approximately three quarters of the genome) and a second, smaller 3′-proximal open reading frame accessed via a −1 ribosomal frameshift at the 3′ terminus of open reading frame 1. Thus, open reading frame 1 can be translated alone, or in conjunction with open reading frame 2 as a trans-frame fusion polyprotein. Regardless of species, the protein order is similar: the structural proteins are encoded downstream of the non-structural proteins. Superfamily III helicase, 3C-like chymotrypsin-related protease and superfamily I RNA-dependent RNA polymerase domains are encoded towards the 5′ end and are followed by a jelly-roll fold domain (VP1), the capsid projection domain (CPD), and the additional capsid-associated protein (VP2) towards the 3′ end. Although not confirmed empirically, a viral protein genome-linked (VPg) is thought to be encoded upstream of the 3C-like protease and covalently attached to the 5′ end of the genome. A predicted dsRNA binding domain with homology to the 1A suppressor of silencing protein of the dicistrovirus Drosophila C virus, perhaps acting as a viral suppressor of RNA interference, is encoded between the RNA-dependent RNA polymerase and jelly-roll fold domains in both SINV-3 and NfV-1. NfV-1 also encodes a predicted Ovarian Tumor (OTU) domain upstream of the helicase. In SINV-3, and presumably NfV-1, the capsid proteins may be expressed from both genomic and subgenomic RNAs, which is a calicivirus-like characteristic. In SINV-3, a 24 nucleotide perfect repeat (AAAGUCCAGUAAGGUUACUGGCAU) occurs in the untranslated region at the 5′ terminus (genome position 18–41) and just upstream of the sequence encoding the dsRNA binding domain (genome position 6664–6687). This sequence repeat may be a promoter sequence for genomic and subgenomic RNA production. Indeed, the 36 nucleotide sequence at genome position 6652 through 6687 exhibits strong similarity with the sequence of the 5′-terminal 41 nucleotides of the genome. Thus, the subgenomic RNA likely starts at genome position 6652, with translation starting at the AUG at genome position 6800–6802. Consistent with this, northern analysis revealed a subgenomic RNA species of approximately 3800 nucleotides. By analogy with the SINV-3 sequence, the potential start site for an NfV-1 subgenomic RNA occurs at genome position 7312, because similar sequence elements (75% identity) are detected at the 5′ end of the genome and in the sequence beginning at nucleotide 7312.

Fig. 2 Genome organization of Solenopsis invicta virus 3 (Invictavirus) and Nylanderia fulva virus 1 (Nyfulvavirus). Predicted protein domains and their relative positions for the genomic and subgenomic RNA include Helicase (Hel), Protease (Pro), RNA dependent RNA polymerase (RdRP), Jelly-roll capsid viral protein 1 (VP1), Capsid projection domain (CPD), Viral protein 2 (VP2), dsRNA binding protein (*), and Ovarian tumor domain (OTU).
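The coordinates quoted for SINV-3 can be cross-checked with a minimal Python sketch. The 24-nt repeat and the genome positions are taken from the text; the toy genome is hypothetical filler built only to place one copy of the repeat at position 18, and the function names are illustrative assumptions.

```python
REPEAT = "AAAGUCCAGUAAGGUUACUGGCAU"  # 24-nt repeat quoted in the text

def find_occurrences(seq, motif):
    """Return the 1-based start of every exact (possibly overlapping) match."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i + 1)
        i = seq.find(motif, i + 1)
    return hits

def span_length(start, end):
    """Length of an inclusive, 1-based coordinate range."""
    return end - start + 1

# Hypothetical toy genome: filler bases around two copies of the repeat.
toy_genome = "C" * 17 + REPEAT + "G" * 20 + REPEAT + "C" * 10
toy_hits = find_occurrences(toy_genome, REPEAT)   # first copy starts at 18

# Coordinate checks using the positions given in the text.
repeat_spans = [span_length(18, 41), span_length(6664, 6687)]
promoter_span = span_length(6652, 6687)
leader_length = 6800 - 6652            # nt between sgRNA start and its AUG
implied_3prime_end = 6652 + 3800 - 1   # from the ~3800-nt northern estimate
```

Both quoted ranges span 24 nucleotides, the 6652–6687 region spans 36, and a subgenomic RNA of about 3800 nucleotides starting at 6652 would end near position 10,450, which is consistent with a 10–11 kb genome. The northern estimate is approximate and includes the poly(A) tail, so this is a plausibility check rather than a measurement.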

Life Cycle Solinviviruses are thought to follow a replication cycle similar to that of picornaviruses and caliciviruses. Once the virus gains entry to the cell, it replicates in the cytoplasm and no DNA stage is formed. The viral genome serves a dual purpose, as the transcript for polyprotein synthesis and as the template for production of the complementary minus strand (i.e., the replicative strand), which, in turn, serves as the template for production of infectious plus-strand genomes. Viral assembly ultimately occurs by encapsidation of a plus-strand genome. SINV-3 and NfV-1 gain entry into the host through the alimentary canal by ingestion, so transmission of the viruses to their ant hosts is easily accomplished by feeding. After the virus infects an individual ant worker, it is spread within the colony by trophallaxis. SINV-3 and NfV-1 exhibit stage-dependent tropism. SINV-3 replication is detected only in adult worker ants. Conversely, the replicative form of NfV-1 appears to be limited to the larval ant stage.
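The template relationship between plus and minus strands can be sketched in a few lines of Python. The fragment below is a hypothetical toy sequence, and the sketch models only base complementarity, not the replication machinery.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def complementary_strand(rna):
    """Return the complementary RNA strand, written 5'->3' (reverse complement)."""
    return rna.translate(COMPLEMENT)[::-1]

plus = "AUGGCUAAAUUU"                       # toy plus-strand fragment
minus = complementary_strand(plus)          # replicative (minus) strand
progeny_plus = complementary_strand(minus)  # copied back from the minus strand
print(minus, progeny_plus == plus)          # AAAUUUAGCCAU True
```

Copying the minus strand regenerates the original plus-strand sequence, which is why a single minus-strand template can yield many infectious plus-strand genomes.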

Epidemiology (Host Specificity/Prevalence/Transmission) Establishing host specificity is of paramount importance when assessing the suitability of an entomopathogen for development and eventual release as a control agent. If the entomopathogen (e.g., a virus) has a broad host range, non-target insect hosts could be negatively and irreparably affected. Host specificity tests have been conducted extensively for SINV-3. Passive and high-dose active exposure of 23 ant species in 14 genera and 4 subfamilies revealed that SINV-3 was capable of replication only in Solenopsis invicta, Solenopsis richteri, and Solenopsis invicta x richteri hybrids. Thus, SINV-3 is host specific for ant species within the saevissima complex of South American fire ants. Studies also revealed that SINV-3 could not replicate in the honey bee (Apis mellifera). The narrow host specificity exhibited by SINV-3 supported its intentional release into the United States population of Solenopsis invicta as a classical biological control agent. Population surveillance studies detected naturally occurring SINV-3 in Solenopsis invicta from the native range (Formosa, Argentina), several U.S. states, and parts of the Caribbean. Because the incidence in the United States was low and irregular, intentional releases of SINV-3 have been completed in California and Florida, with successful establishment of the virus. Interestingly, SINV-3 exhibits a seasonal phenology. The prevalence of the virus is higher during the fall and winter months in Florida. Analysis of the relationship between temperature and prevalence showed a strong negative correlation between SINV-3 presence and warmer temperatures. Temperature can have a significant impact on viral replication, assembly, or host resistance. Based on laboratory studies, SINV-3 transmission occurs through ingestion of virus. Midgut epithelial cells are thought to be the specific target for the virus. After infection is established in midgut cells, the virus spreads to additional tissues by an unknown path. The mode of natural transmission of SINV-3 in Solenopsis invicta is not known. However, the virus was successfully transmitted by crickets that had consumed dead, SINV-3-infected ant workers and were subsequently preyed upon by uninfected Solenopsis invicta colonies. This transmission was purely mechanical because crickets did not support replication of SINV-3. This mode of transmission facilitates the use of SINV-3 as a biopesticide. Indeed, baits have been developed that successfully transmit SINV-3 to fire ant colonies in the laboratory and field. Preliminary host specificity tests with NfV-1 suggest that the virus is specific for Nylanderia fulva ants. However, more extensive testing will be required to firmly establish the host specificity of the virus.

Clinical Features Not applicable.

Pathogenesis The sequelae of SINV-3 infection in Solenopsis invicta begin with a cessation of food retrieval by the worker caste, followed by larval, pupal, and worker mortality and decreased queen fecundity. Very large midden piles composed of brood and worker corpses develop (Fig. 3(A)). Interestingly, in the laboratory the midden piles are created directly on the food source (Fig. 3(B)). The SINV-3 titer in moribund worker ants typically exceeds 10⁹ genome copies per ant. Queen weight declines dramatically in the latter stages of SINV-3 infections in fire ant colonies. Decreases in developing ova and reduced fecundity are also associated with infection. It is not known whether the decrease in feeding causes the observed changes in fecundity and ovaries, or whether SINV-3 has a direct impact on the queen. Despite significant larval mortality in colonies infected with SINV-3, the virus does not appear to replicate or assemble in the larval stage: plus-strand SINV-3 is regularly detected in larvae by RT-PCR, but there is no evidence of replication or assembly. It has been hypothesized that larval mortality is caused by the cessation of food retrieval and feeding and/or by behavioral changes whereby the workers neglect the developing larvae. Reduced foraging in response to virus infection has been reported in other arthropod species. NfV-1 is also associated with a decrease in fecundity in Nylanderia fulva colonies, although this response occurs inconsistently.

Fig. 3 Large midden pile composed of accumulating corpses from a Solenopsis invicta virus 3-infected fire ant colony (A). Often, the midden piles are created on the food source (B). In this case, the food source is the house cricket, Acheta domesticus.


Diagnosis (Detection) Detection of colonies infected with solinviviruses is typically made by molecular analysis of a portion of the genome using RT-PCR or quantitative PCR. In the laboratory, SINV-3 infections may be identified by declining colony health, changes in food consumption and distribution, and increased mortality (midden piles). However, in the field, RT-PCR is required. NfV-1 is best detected by RT-PCR, in the laboratory or field.

Prevention (Biocontrol) SINV-3 and NfV-1 were discovered using a metagenomics approach coupled with next generation sequencing to detect pathogens within the transcriptomes of pest ant species. The sole purpose of the work was pathogen bioprospecting and discovery. SINV-3 and NfV-1 are being investigated for use as natural control agents (biological control) against the invasive ant species Solenopsis invicta and Nylanderia fulva, respectively. Therefore, from an entomological perspective, the objective is the introduction and establishment of these viruses in target ant host populations, rather than disease prevention. SINV-3 has been released as a classical biological control agent into areas where the virus is not found in Solenopsis invicta. In addition, SINV-3 is being developed as a biopesticide. SINV-3-containing baits have been developed and successfully transmit the virus in the laboratory and field. Protein-based baits were superior to sugar- or oil-based baits. The minimum SINV-3 dose required to cause colony collapse was in the range of 10⁷ to 10⁹ viral particles per microliter. Lower doses resulted in a chronic infection with limited impact on colony health. A limitation to commercial development of SINV-3 as a biopesticide is in vitro production of the virus. Virus genome production was possible using a baculovirus-driven expression system, but virus assembly failed to occur.


Relevant Websites https://talk.ictvonline.org/ictv-reports/ictv_online_report/positive-sense-rna-viruses/w/solinviviridae Solinviviridae. Positive-sense RNA Viruses. ICTV. https://viralzone.expasy.org/7936?outline=all_by_species Solinviviridae. ViralZone page.

Tetraviruses (Alphatetraviridae, Carmotetraviridae, Permutotetraviridae)
Rosemary A Dorrington, Rhodes University, Grahamstown, South Africa
Tatiana Domitrovic, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
Meesbah Jiwaji, Rhodes University, Grahamstown, South Africa
© 2021 Elsevier Ltd. All rights reserved.

Introduction Tetraviruses are a diverse group of small, non-enveloped RNA viruses that infect lepidopteran (butterflies and moths) insects. In general, virus particles vary in size from 38 to 41 nm in diameter and share a unique T = 4 icosahedral capsid architecture. It is this feature that led to the name “tetravirus”, derived from the Greek tettares meaning “four”. Tetravirus particles assemble as procapsids comprising 240 copies of a single capsid precursor protein (CP). During maturation, the CP undergoes autoproteolytic cleavage to form the β (58–67 kDa) and γ (6–8 kDa) subunits of the mature capsid. Tetravirus particles encapsidate one or two single-stranded (ss), positive sense (+ve) genomic RNAs. While some tetraviruses have bipartite genomes, the majority have monopartite genomes ranging from 5.3 to 6.6 kb in size and encoding up to four genes. The first tetravirus, Nudaurelia capensis β virus (NβV), was isolated from diseased South African pine emperor moth (Nudaurelia cytherea capensis) larvae in 1968. Structural studies on NβV particles revealed a new type of virus capsid with unusual T = 4 icosahedral symmetry, which led to NβV becoming the type member of a new family of viruses, then named the Tetraviridae. The Tetraviridae were classified into two genera according to their genome organization: viruses with monopartite genomes were classified into the Betatetravirus genus, while viruses with bipartite genomes resided in the Omegatetravirus genus. Comparative analysis of available genome sequences revealed the Tetraviridae to be a highly diverse group of viruses, despite the structural conservation of their capsid. Phylogenetic analysis showed that the RNA-dependent RNA polymerase (RdRp) domains of tetravirus replicases can be clustered into three groups, each related to a different supergroup of RNA viral replicases. Consequently, the International Committee on Taxonomy of Viruses reclassified the Tetraviridae into three new families based upon the characteristics of their replicases, namely the Alphatetraviridae, the Permutotetraviridae and the Carmotetraviridae (Table 1). The increasing availability of metatranscriptomic sequence data and the elucidation of the invertebrate virosphere will serve to clarify the viral diversity present in the invertebrates.

Genome Organization Alphatetraviridae The Alphatetraviridae are classified into two genera according to their genome organization: the genus Betatetravirus contains virus species with monopartite genomes, and viruses with bipartite genomes are classified within the genus Omegatetravirus. Currently, there are three omegatetravirus and seven betatetravirus species (Table 1). The genome sequence of only one betatetravirus (NβV) is available, so it is possible that other species classified in this genus could in fact be reclassified into the Permutotetraviridae or Carmotetraviridae families based upon the characteristics of their replicases. In the omegatetravirus Helicoverpa armigera stunt virus (HaSV), the replicase (REP) is encoded by RNA1, along with three small ORFs, p11, p15 and p8, that overlap with the 3′ end of REP (Fig. 1). The HaSV REP (like those of all sequenced alphatetraviruses) includes three conserved functional domains, namely the methyltransferase (MET), helicase (HEL) and RdRp domains, in an N- to C-terminal modular arrangement characteristic of the alpha-like virus superfamily. The three small ORFs are arranged tandemly and in frame, with p11 and p15 separated by a UGA stop codon, and UAA GGG separating p15 and p8. These ORFs are translated in HaSV-infected cells via a mechanism that is as yet unknown, but their translation products are thought to be involved in initiating the viral replication complex. The HaSV CP is encoded by the smaller RNA2. At its 5′ end, RNA2 encodes a second protein, p17, that is expressed at low levels in HaSV-infected Helicoverpa armigera larval midgut tissue and is involved in packaging of viral RNA (Fig. 1). The viral RNAs (vRNAs) of all sequenced alphatetraviruses (HaSV, Dendrolimus punctatus tetravirus and NβV) have a 5′ cap and terminate in a tRNA-like structure at their 3′ ends that is thought to function in virus replication. The NβV genome encodes the viral REP, whose ORF overlaps the CP ORF located at the 3′ end, and the CP is translated from a subgenomic (sg) RNA originating just upstream of the CP coding sequence (Fig. 1).
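The tandem, in-frame arrangement of small ORFs separated by stop codons can be pictured with a minimal Python sketch that splits one reading frame at its stop codons. The toy sequence is hypothetical; only the two separating stop codons (UGA, then UAA followed by GGG) mirror the description of HaSV, and the function name is an illustrative assumption.

```python
STOPS = {"UAA", "UAG", "UGA"}

def split_frame_at_stops(rna, frame=0):
    """Split one reading frame into (codons, terminating stop or None) segments."""
    segments, current = [], []
    for i in range(frame, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOPS:
            segments.append((current, codon))
            current = []
        else:
            current.append(codon)
    if current:
        segments.append((current, None))
    return segments

# Hypothetical toy frame: three short ORFs in tandem, all in the same frame.
toy = "AUGGCU" + "UGA" + "AAAGGC" + "UAA" + "GGG" + "UUUCCU" + "UAG"
segments = split_frame_at_stops(toy)
stops_seen = [stop for _, stop in segments]
print(stops_seen)   # ['UGA', 'UAA', 'UAG']
```

Because all three segments lie in the same frame, a ribosome would need some way to continue past or reinitiate after each stop codon in order to express the downstream ORFs; as the text notes, the mechanism used by HaSV is unknown.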

Permutotetraviridae There are currently only two permutotetravirus species, namely Euprosterna elaeasa virus (EeV) and the type species, Thosea asigna virus (TaV). Both viruses have monopartite genomes that are highly conserved (82% and 84% identity, respectively) and encode two overlapping ORFs, with the viral REP at the 5′ end and the CP at the 3′ end (Fig. 1). In contrast to the alphatetraviruses, the 5′ terminus of the TaV/EeV vRNA is predicted to be blocked by a genome-linked viral protein (VPg), while the 3′ terminus is predicted to form a conserved pseudoknot. The permutotetravirus CP is translated from a sgRNA originating just upstream of the CP ORF.


doi:10.1016/B978-0-12-814515-9.00010-2


Tetraviruses (Alphatetraviridae, Carmotetraviridae, Permutotetraviridae)

Table 1 Taxonomic classification of tetraviruses

Virus species                       Acronym  Lepidopteran host family  Geographical location  Particle diameter (nm)  GenBank accession no.

Family: Alphatetraviridae
Genus: Omegatetravirus
Nudaurelia capensis ω virus (a)     NωV      Saturniidae               South Africa           41                      [S43937.1]
Helicoverpa armigera stunt virus    HaSV     Noctuidae                 Australia              38                      [KX423453.1]; [L37299]
Dendrolimus punctatus tetravirus    DpTV     Lasiocampidae             China                  40                      [AY594352]; [AY594353]

Genus: Betatetravirus
Nudaurelia capensis β virus (a)     NβV      Saturniidae               South Africa           38                      [AF102884]
Antheraea eucalypti virus (b)       AeV      Saturniidae               Australia              32                      NA
Darna trima virus                   DtV      Limacodidae               Malaysia               38                      NA
Dasychira pudibunda virus           DpV      Lymantriidae              UK                     38                      NA
Calliteara pudibunda virus          CpV
Philosamia cynthia x ricini virus   PxV      Saturniidae               UK                     35                      NA
Pseudoplusia includens virus        PiV      Noctuidae                 USA                    40                      NA
Trichoplusia ni virus               TnV      Noctuidae                 USA                    38                      NA

Family: Permutotetraviridae
Genus: Alphapermutotetravirus
Thosea asigna virus (a)             TaV      Limacodidae               Malaysia, USA          38                      [AF282930], [AF062037]
Euprosterna elaeasa virus           EeV      Zygaenidae                Peru                   38                      [AF461742]

Family: Carmotetraviridae
Genus: Alphacarmotetravirus
Providence virus (a)                PrV      Noctuidae                 USA                    40                      [AF062037]

(a) Type strain for the genus.
(b) Serologically indistinguishable from NβV (Grace and Mercer, 1976).
NA, not available.

The activity of a 2A-like processing site located near the N-terminus of the CP results in the production of a 17 kDa amino-terminal peptide (p17) and the 65 kDa α protein (Fig. 1). Permutotetravirus REPs are structurally related to those of birnaviruses and Drosophila A virus and distantly related to those of members of the picorna-like virus superfamily. In both viruses, the MET and HEL domains of REP are absent and the REP displays an internally permuted C-A-B arrangement of the canonical A, B and C motifs found in the palm subdomain of the RdRp, upstream of which is a putative nucleotidylation (VPg) signal.

Carmotetraviridae

Providence virus (PrV) is the sole species of the Carmotetraviridae family. The virus has a monopartite genome with the CP ORF located at the 3′ end of the vRNA and translated from a sgRNA originating just upstream of the CP AUG. Two 2A-like processing sites are present at the amino terminus of the ORF that likely result in the translation of two small peptides of 7 and 8 kDa, along with the α protein of 68 kDa, which is subsequently cleaved by maturation-dependent autoproteolysis to form the β and γ subunits of the mature virus particle (Fig. 1). The PrV REP ORF is upstream of the CP CDS and encodes a read-through stop codon that results in the translation of a smaller replication accessory protein (p40). The RdRp domain of the PrV REP is related to the replicases of tombus- and umbraviruses, which are (+ve) ssRNA plant viruses in the carmovirus-like supergroup. Unlike any other tetravirus, the PrV genome encodes a third ORF (p130) at its 5′ end, upstream of REP. The p130 ORF encodes a 2A-like processing site at its N-terminus that would result in the translation of two products of 17 kDa and 113 kDa, respectively. The function of p130 in the PrV lifecycle is unknown.
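The co-translational 2A-like processing described for the permutotetravirus and carmotetravirus ORFs can be illustrated with a minimal sketch. It assumes the canonical 2A motif NPGP, with the polypeptide chain breaking between the glycine and the final proline; this motif is not stated in the text, the function name is hypothetical, and the toy sequence is a made-up placeholder rather than real PrV or TaV sequence.

```python
import re

# Assumption: a 2A-like site is recognised by the motif NPGP, and the
# polypeptide chain breaks between the G and the final P (NPG|P).
TWO_A_MOTIF = re.compile(r"NPG(?=P)")


def split_at_2a_sites(polyprotein: str) -> list[str]:
    """Return the products released from a polyprotein by 2A-like processing."""
    products = []
    start = 0
    for match in TWO_A_MOTIF.finditer(polyprotein):
        # The upstream product ends with ...NPG; the next one starts with P.
        products.append(polyprotein[start:match.end()])
        start = match.end()
    products.append(polyprotein[start:])
    return products


# Hypothetical toy ORF with two 2A-like sites near its N-terminus.
toy_orf = "MAAADVEENPGPSSSSDIETNPGPKKKKKK"
for product in split_at_2a_sites(toy_orf):
    print(len(product), product)
```

Two sites give three products, which mirrors the two small peptides plus the α protein predicted for the PrV CP ORF; a single site, as in p130 or the TaV/EeV CP ORF, would give two.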

Viral Replication

Our understanding of tetravirus replication is based on studies of HaSV (Alphatetraviridae) and PrV (Carmotetraviridae) because these two viruses infect and replicate in tissue culture cell lines. In studies on HaSV-infected H. armigera larvae, RNA1 was detected within three days post infection and RNA2 appeared after a further two days. This implies that RNA1 is replicated in the early stages of infection, and that delayed RNA2 replication results in the later expression of CP and p17 in the virus lifecycle. Immunofluorescence microscopy studies of HaSV-infected Spodoptera frugiperda tissue culture cells show that virus replication occurs in punctate structures in the host cell cytoplasm. The REP is trafficked to detergent-resistant membranes derived from the secretory pathway, and membrane targeting is dependent on the presence of two domains: one located at the N-terminus and the other at the C-terminus, within the RdRp domain of the protein.


Fig. 1 Genome organization and gene expression of representative tetraviruses of the Alphatetraviridae, Permutotetraviridae and Carmotetraviridae families. Viral genes encoding the replicase, capsid protein precursor (CP) and replication accessory protein of PrV (p40) are indicated on the maps, as are gene products produced by co-translational (2A) processing in EeV and PrV and by maturation-dependent autoproteolytic cleavage of the capsid α protein to produce the β and γ capsid subunits. Translation products labeled with (?) are predicted to be produced but have yet to be identified in virus-infected cells. The positions of 2A-like processing sequences (*) and the read-through stop codon (▲) are indicated. The bent arrow shows the start of the sgRNAs of the monopartite viruses (NβV, EeV and PrV) from which the viral CP is translated. GenBank accession numbers: KX423453.1 (HaSV RNA1), L37299 (HaSV RNA2), AF102884 (NβV), AF461742 (EeV) and AF062037 (PrV).


The PrV REP also localizes to detergent-resistant, punctate structures in the cytoplasm of persistently infected Helicoverpa zea MG8 tissue culture cells. In S. frugiperda cells expressing EGFP-REP fusion proteins, the REP co-localizes with replicating vRNA, so these punctate structures are the site of viral replication in PrV-infected cells. As with HaSV, the PrV REP appears to be associated with vesicles originating from the Golgi and with secretory vesicles; thus both viruses likely share the same site of replication. The PrV REP ORF is translated into two viral gene products: a truncated accessory replication protein (p40) resulting from the activity of a read-through stop codon, and the full-length REP, which contains the RdRp domain (Fig. 1). p40 also co-localizes with vRNA and is thought to be involved in the establishment of the viral replication complex.

The Tetravirus Capsid

Capsid Structure

X-ray crystallography and cryo-electron microscopy image reconstruction (cryo-EM) have been used to examine the different structural states of tetravirus capsids. The crystal structure of authentic Nudaurelia capensis ω virus (NωV) was determined in 1996 at 2.8 Å resolution. The virion is structurally indistinguishable from the NωV virus-like particles (VLPs) made by expressing the 70 kDa CP in a baculovirus system. The VLP packages random cellular RNA and can be purified in a metastable procapsid state, allowing maturation steps to be precisely controlled by changes in pH on a time scale that is amenable to biochemical and structural analysis. These characteristics have made maturation of this alphatetravirus one of the most accessible systems for the study of eukaryotic icosahedral virus maturation and dynamics. The NωV crystal structure showed that the capsid is icosahedral, 420 Å in diameter, and composed of 240 capsid proteins arranged with T = 4 quasi-symmetry. Thus, the 60 icosahedral asymmetric units each have four subunits, named A–D, in unique chemical environments (Fig. 2(a)). The 644 amino acid capsid subunit has three domains: an exterior immunoglobulin (Ig)-like fold, a central β-barrel in tangential orientation, and an interior helix bundle (Fig. 2(a)). The helix bundle is created by the N- and C-termini of the polypeptide before and after the β-barrel fold, and the Ig-like domain is an insert between the E and F β-strands of the barrel. The exterior Ig domain is unique among nonenveloped viruses, has the least conserved sequence among tetravirus subunits and is presumed to be involved in cell binding and virus tropism.
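The relationship between the T number and the subunit count quoted above can be checked with a short calculation. The sketch below uses the standard Caspar-Klug expression T = h² + hk + k², which the text does not spell out; the function names are illustrative only.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def subunits_per_capsid(t: int) -> int:
    """An icosahedral capsid has 60 asymmetric units, each holding T subunits."""
    return 60 * t


t = triangulation_number(2, 0)      # lattice indices (h, k) = (2, 0) give T = 4
print(t, subunits_per_capsid(t))    # 4 subunits per asymmetric unit, 240 in total
```

For T = 4 this reproduces the four quasi-equivalent positions (A–D) in each of the 60 asymmetric units and the 240 subunits of the capsid.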

Fig. 2 Structure and dynamics of the capsid of the alphatetravirus NωV. (a) The subunit organization in the NωV T = 4 capsid crystal structure. There are four subunits in the asymmetric unit (A in blue, B in red, C in green, and D in yellow). Icosahedral symmetry axes are marked with white symbols; quasi-symmetry axes are marked with black symbols. Quasi-twofold dimer interfaces occur with the bent A–B or flat C–D conformations (right). The well-ordered C-terminal helices in the C–D subunits comprise a molecular switch that stabilizes the flat interface. (b) Ribbon diagram of the C subunit illustrating the three-domain tertiary structure present in all 240 subunits (exterior of capsid on top, interior on bottom). Maturation cleavage occurs within the helical domain between residues Asn570 and Phe571. The molecular switch, present only in the C or D subunits, is highlighted in magenta. (c) Effect of pH on NωV maturation and activity. Red arrows indicate the maturation pathway. In the wild-type virus, the procapsid converts to uncleaved capsid when the pH drops; a maturation cleavage follows, which locks the particle in the capsid conformation after ~10% of the subunits are cleaved. While the large conformational change takes milliseconds, cleavage is a slow process, taking minutes to several hours. The surfaces of the partially matured NωV reconstructions are colored according to their dynamic properties as determined by standard deviation maps. Note the progressive particle stabilization promoted by cleavage. In the cross section of matured particles, it is possible to visualize a dynamic region at the fivefold axis where the lytic γ peptide resides. Infectivity depends on another conformational transition promoted by high pH (black arrow), when the cleaved particle gains lytic activity. No structural model is available for this condition.


Fig. 3 Cryoelectron microscopy reconstructions of the icosahedral tetravirus virions. All four structures are at ~25–30 Å resolution and are viewed down their icosahedral twofold axes. The capsids are ~400 Å in diameter and have T = 4 quasi-icosahedral symmetry. The capsid morphologies are closely similar within the monopartite and bipartite groups (monopartite at top, bipartite at bottom), but differ between the groups. Most notably, within the triangular facets, NβV and PrV have a pitted surface where the surfaces of the omegatetraviruses, NωV and HaSV, are filled in. The permutotetravirus, TaV, also displays the same pitted surface in negative-stain EM photographs as observed for NβV and PrV. The recent crystal structure determination of PrV showed that this is mainly due to different rotations of the Ig-like domains relative to the β-barrels when compared to the subunits in the NωV crystal structure (see Fig. 2). This supports the suggested role of the Ig-like domain in determining the restrictive host cell specificity of these viruses by displaying distinctly different surfaces to cellular receptors.

The interior helix domain is believed to interact with the packaged vRNA genome and to have a role in capsid polymorphism and maturation (see below). In mature capsids, all 240 of the NωV capsid subunits have undergone autoproteolysis between residues N570 and F571 after assembly, leaving the cleaved portion of the C-terminus, called the γ peptide, non-covalently associated with the capsid. This peptide has membrane lytic activity and is essential for virus infectivity. The γ peptide also functions as a quasi-equivalent molecular switch (Fig. 2(a) and (b)). In NωV, a segment of the γ peptide (residues 608–641) is ordered only in the C and D subunits and functions as a wedge between the ABC and DDD morphological units to prevent curvature and create a flat contact. The interface between two ABC units is bent partly due to the lack of ordered γ peptide. The crystal structure of HaSV VLPs is very similar to that of the NωV capsid, with a root-mean-square deviation of 1.08 Å. Importantly, tetravirus subunit structure and the mechanism of autoproteolysis display a strong relationship with those of the T = 3 insect nodaviruses. Indeed, the β-barrel folds and autocatalytic sites can be superimposed with little variation, with the γ peptide in the nodaviruses also involved in quasi-equivalent switches and membrane lytic activity. The differences between tetravirus capsids are most pronounced in the Ig-like domain. The crystal structure of PrV at 3.8 Å is the only high-resolution structure of a tetravirus with a monopartite genome. The PrV subunit retains the three domains and a superimposable autoproteolysis site (all four subunits show a clear break at the cleavage site), but the Ig-like domain and the first and last ordered residues of the termini in the helical domain have different positions and structure compared to NωV and HaSV.
The different Ig domain position was first observed in the cryo-EM reconstruction of NβV as a distinctly different surface structure compared to the omega-like particle surfaces (Fig. 3). Interestingly, the use of subunit termini in PrV is more like that seen in the nodaviruses. The quasi-symmetry switch in PrV has swapped elements compared to the omegatetraviruses and represents a new and unique mixture of structural features among the insect viruses.
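As a back-of-the-envelope check on the autoproteolysis described above, the lengths of the two NωV cleavage products follow directly from the 644-residue subunit and the N570/F571 scissile bond. The sketch below also estimates their masses using an average residue mass of about 110 Da, which is a general rule of thumb and an assumption here, not a value from the text; it further assumes residue numbering starts at 1 with no other processing.

```python
SUBUNIT_LENGTH = 644       # residues in the NωV capsid subunit (from the text)
CLEAVAGE_AFTER = 570       # autoproteolysis between N570 and F571 (from the text)
AVG_RESIDUE_KDA = 0.110    # assumption: ~110 Da per residue on average


def cleavage_products(length: int, cleave_after: int) -> tuple[int, int]:
    """Return the residue counts of the N-terminal and C-terminal products."""
    if not 0 < cleave_after < length:
        raise ValueError("cleavage site must lie inside the polypeptide")
    return cleave_after, length - cleave_after


large_len, gamma_len = cleavage_products(SUBUNIT_LENGTH, CLEAVAGE_AFTER)
large_kda = large_len * AVG_RESIDUE_KDA
gamma_kda = gamma_len * AVG_RESIDUE_KDA

print(f"N-terminal product: {large_len} residues, ~{large_kda:.1f} kDa")
print(f"gamma peptide:      {gamma_len} residues, ~{gamma_kda:.1f} kDa")
print(f"uncleaved subunit:  ~{SUBUNIT_LENGTH * AVG_RESIDUE_KDA:.1f} kDa")
```

Under these assumptions the γ peptide spans residues 571–644 (74 residues), which is consistent with the ordered segment 608–641 lying within it, and the estimated total is close to the 70 kDa reported for the full-length CP.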

Capsid Maturation, Auto-Proteolysis and Dynamics

NωV and HaSV coat proteins expressed in the baculovirus system assemble into round, porous, metastable VLPs called procapsids, 480–490 Å in diameter, when purified at pH 7.6. The procapsid coat proteins remain in the uncleaved precursor form until the pH is reduced. When exposed to acidic conditions (pH 5.0), procapsids undergo a large-scale structural rearrangement (maturation) in which the spherical and porous procapsid condenses into the smaller, angular mature capsid. In NωV the structural transition triggers the autocatalytic cleavage, in which the 70 kDa α protein begins to cleave into the β (62 kDa) and γ (8 kDa) polypeptides, in a reaction that proceeds even at neutral pH (Fig. 2(b)). Small angle X-ray scattering demonstrated that the pH-induced large conformational change occurs in less than 100 ms in NωV, while the autocatalytic cleavage of all subunits takes several hours to complete.

Structure Analysis


Mutagenesis studies have shown that the asparagine at the scissile bond (N570) in NωV is required for cleavage. Substituting threonine for asparagine (N570T) results in assembly of procapsids that do not cleave but continue to undergo the conformational change reversibly as a function of pH (Fig. 2(b)). Cryo-EM structures at 8 Å resolution of cleaved and N570T VLPs demonstrated the role of cleavage in stabilizing the capsid form. There are more extensive inter-subunit contacts in cleaved particles that result in more ordered C-terminal helices, which account for the additional stability of the particle. Cryo-EM was also used to compare particles at intermediate stages of cleavage. Since these intermediates have the same size as a mature capsid, they are superimposable, allowing a detailed analysis of the structural differences at the cleavage site. NωV maturation has also served as a model for the development of an innovative method for cryo-EM image reconstruction that captures the dynamic character of viruses displayed in the cryo-EM ensemble of particles at the moment of freezing. The result is a quantitative portrait of the structural variance derived from protein dynamics. With this method it was possible to quantify the progressive capsid stabilization during maturation (Fig. 2(b)). Another relevant finding reproduced by both analysis approaches was that the four quasi-equivalent subunit positions of NωV cleave at different rates and have specialized functions in the particle. The γ peptides derived from the pentamer subunits are produced first and are organized in a vertical helical bundle that projects towards the particle surface. The identical polypeptides in other quasi-equivalent subunits are produced later and act as molecular switches. They make extensive monovalent contacts and are trapped in positions inappropriate for release.
Functional data derived from liposome assays provided experimental support for the hypothesis that the pentameric helical bundle is a specialized lytic domain of the capsid structure. The lytic activity of mature capsids against cellular membranes is triggered by alkaline conditions (Fig. 2(b)). This condition is found in the midgut of the larvae and seems to be required for infectivity of HaSV virions (see below).
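The pH-dependent maturation pathway summarized in this section can be sketched as a small state machine. This is a deliberately simplified illustration: the state and event names are hypothetical, cleavage is treated as a single all-or-nothing event rather than a gradual process across 240 subunits, and no pH thresholds are encoded because the text gives only representative values (pH 7.6 and pH 5.0).

```python
from enum import Enum


class State(Enum):
    PROCAPSID = "procapsid"
    CAPSID = "uncleaved capsid"
    CLEAVED = "cleaved capsid"
    LYTIC = "cleaved capsid with lytic activity"


def step(state: State, event: str, cleavage_competent: bool = True) -> State:
    """Apply one event; events that do not apply leave the state unchanged."""
    if state is State.PROCAPSID and event == "acidify":
        return State.CAPSID        # fast, large-scale conformational change
    if state is State.CAPSID and event == "neutralize":
        return State.PROCAPSID     # reversible while the subunits are uncleaved
    if state is State.CAPSID and event == "cleave" and cleavage_competent:
        return State.CLEAVED       # slow autoproteolysis locks the capsid form
    if state is State.CLEAVED and event == "alkalinize":
        return State.LYTIC         # alkaline pH triggers lytic activity
    return state


def run(events: list[str], cleavage_competent: bool = True) -> State:
    state = State.PROCAPSID
    for event in events:
        state = step(state, event, cleavage_competent)
    return state


# Wild type: cleavage locks the capsid, so raising the pH no longer reverts it.
print(run(["acidify", "cleave", "neutralize"]))
# Cleavage-defective particle (as for N570T): the transition stays reversible.
print(run(["acidify", "cleave", "neutralize"], cleavage_competent=False))
```

The second call models the behavior reported for the N570T mutant, which cannot cleave and therefore returns to the procapsid form when the pH is raised.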

Pathology

Symptoms and Transmission

Tetraviral infection of lepidopteran larvae occurs primarily by ingestion, and the symptoms of infection vary widely between the tetraviruses. Trichoplusia ni virus (TnV) causes only a slight retardation in the growth of infected Trichoplusia ni larvae. This is in contrast to NβV, which is lethal to infected N. capensis larvae and can act rapidly. NβV-infected Antheraea eucalypti larvae could appear healthy and be feeding normally, but within 12 h they would be dead. Larvae dying of an NβV infection exhibit certain characteristics: they either hang from the branches by the prolegs or fall to the ground, and the contents of their bodies liquefy. All infection studies have shown that the stage at which the tetravirus infection occurs is critical to the outcome. Early instar (L1–L3) H. armigera larvae infected with HaSV cease to feed within 24 h, stunt dramatically, liquefy and die within 4 days. Infection of later instar larvae does not result in any detectable symptoms or lethality, although the larvae do test positive for the presence of HaSV. There have been no reports of infection of pupae or adults by tetraviruses, but there is evidence that vertical transmission may occur.

Host Range

It was long believed that tetraviruses replicate only in lepidopteran midgut cells. Tetraviral infection, based on data collected from HaSV infection of H. armigera larvae, is focused in the midgut of the animal. Regardless of whether HaSV was fed to the larvae or injected into the hemocoel of the animal, vRNA could be detected only in the midgut. Infected cells undergo apoptosis and are sloughed from the midgut of the animal. It is proposed that the virus exploits this host cell response because apoptosis induces cellular acidification, which would induce maturation of the tetravirus particles. The carmotetravirus, PrV, is also targeted to the larval midgut, having first been identified as a persistent infection in an H. zea midgut (MG8) cell line. Recent findings, however, indicate that HaSV and PrV can infect other lepidopteran cell types as well. Both PrV and HaSV readily infect and replicate in S. frugiperda Sf9 cells, which were derived from S. frugiperda wing bud tissue, and PrV has also been shown to infect Spodoptera exigua 1 cells. An explanation for the apparent tropism of tetraviruses for midgut tissue is provided by the observation that HaSV particles require structural rearrangements brought about by exposure to alkaline pH to facilitate binding to host cell receptors, followed by entry and infection of S. frugiperda Sf9 cells. These alkaline pH conditions are provided only by the lepidopteran midgut, and hence infection is confined to midgut tissues. The known host range of carmotetraviruses has recently broadened beyond lepidopteran insects. PrV has been shown to infect and replicate in human cervical (HeLa) and breast (MCF7) cancer cell lines. Remarkably, PrV particles purified from H. zea MG8 cells are able to productively infect Vigna unguiculata (cowpea) plants.
vRNA can be detected at sites distant from the site of infection, and PrV particles purified from cowpea plants are capable of infecting and establishing replication in mammalian cancer cell lines. These data reveal that PrV is a virus that has crossed Kingdom boundaries, retaining the ability to infect and replicate in animal and plant cells. While species of the Nodaviridae family have a similar host range, the initiation of their replication in yeast, plant and mammalian cells needs to be launched from transfected cDNA or RNA.

Persistent Infections

PrV was the first tetravirus known to replicate in tissue culture. The virus was first isolated from a persistently infected H. zea MG8 cell line and, subsequently, PrV has been observed to establish persistent infections in other lepidopteran cell lines, including H. zea fat body (FB33) and S. frugiperda wing bud (Sf9) cell lines. PrV is also able to replicate persistently in human HeLa cells. Lepidopteran larval midgut tissues contain multipotent intestinal stem cells that could be involved in the establishment of persistent PrV infections. Delta, a marker of insect stem cells, is involved in Notch signaling, and in the midgut of insects, cellular Delta protein levels are thought to determine the cell fate. High levels of Delta protein activate the Notch pathway, which in turn leads to the down-regulation of Delta in the daughter cells, resulting in the cells becoming enterocytes. On the other hand, cells with low levels of Delta are destined to become enteroendocrine cells. It is therefore the cellular level of Delta that decides the fate of the cell. PrV infection appears either to target Delta-expressing cells or to induce Delta expression in H. zea MG8 cells (Fig. 4(b,c)). However, Delta is not expressed in the persistently infected H. zea FB33 cell line (Fig. 4(a,c)), suggesting that PrV persistence does not involve the infection of insect midgut stem cells.

Fig. 4 PrV-infected H. zea midgut cells express the intestinal stem cell marker Delta. Detection of CP, Delta and p40/REP in persistently infected H. zea FB33 (a) and MG8 (b) tissue culture cell lines. Cells were probed with rabbit polyclonal anti-PrV CP and anti-rabbit AF633 secondary antibodies to detect the PrV CP. Mouse anti-Delta and anti-mouse AF546 were used to detect the Delta protein. Viral replicase was stained with biotin-conjugated anti-PrV p40 polyclonal antibodies and streptavidin AF488 to detect the PrV REP. All images represent 1 μm optical slices taken using a Zeiss LSM780 laser scanning confocal microscope with a ×63, 0.75 NA objective. Scale bar represents 10 μm. (c) Cell-free lysates from PrV-infected H. zea FB33 and MG8 cells were separated by SDS-PAGE and analyzed by western blotting. The PrV CP was detected using anti-PrV CP (rabbit) polyclonal and anti-rabbit horseradish peroxidase (HRP)-conjugated (goat) secondary antibodies, and Delta was detected using anti-Delta (mouse) primary and anti-mouse HRP (goat) secondary antibodies.

Concluding Remarks

Since the discovery of NβV, tetraviruses have been defined as small (+ve) ssRNA viruses that share a distinctive T = 4 capsid architecture. Their non-structural genes, however, are representative of three diverse virus supergroups: the alpha-like (type IV), the carmo-like (type II) and the picorna-like (type I) supergroups. This implies that modern tetraviruses are the result of converging evolutionary paths that resulted from re-assortment between diverse animal and plant (+ve) ssRNA ancestral viruses. The availability of metatranscriptomic sequence data has significantly expanded the known invertebrate virosphere, and there are reports of potentially new tetravirus species. However, these reports are based on homology with tetravirus REP sequences. The conserved tetravirus CP is not present in the reconstructed viral genomes, and therefore these viruses cannot be considered to be tetraviruses based upon the current definition. The ability of tetravirus capsids to package non-viral RNAs certainly facilitates the evolution of new, re-assorted viruses in the lepidopteran midgut, which must be ongoing and will undoubtedly result in the discovery of new tetravirus species in the future.

Reference

Grace, T.D.C., Mercer, E.H., 1976. Serological relations between twelve small RNA viruses of insects. J. Gen. Virol. 31, 131–134.

Further Reading

Azad, K., Banerjee, M., Johnson, J.E., 2017. Enzymes and enzyme activity encoded by nonenveloped viruses. Annual Review of Virology 4, 221–240.
Doerschuk, P.C., Gong, Y., Xu, N., Domitrovic, T., Johnson, J.E., 2016. Virus particle dynamics derived from CryoEM studies. Current Opinion in Virology 18, 57–63.
Domitrovic, T., Matsui, T., Johnson, J.E., 2012. Dissecting quasi-equivalence in non-enveloped viruses: Membrane disruption is promoted by lytic peptides released from subunit pentamers, not hexamers. Journal of Virology 86, 9976–9982.
Dorrington, R.A., Jiwaji, M., Awando, J.A., de Bruyn, M.-M., 2019. Advances in tetravirus research: New Insight into the infectious virus lifecycle and an expanding host range. In: Bonning, B.C. (Ed.), Insect Molecular Virology: Advances and Emerging Trends. Norfolk: Caister Academic Press, pp. 145–162.
Jiwaji, M., Matcher, G.F., de Bruyn, M.-M., et al., 2019. Providence virus: An animal virus that replicates in plants or a plant virus that infects and replicates in animals? PLoS One 14, e0217494.
Jiwaji, M., Short, J.R., Dorrington, R.A., 2016. Expanding the host range of small insect RNA viruses: Providence virus (Carmotetraviridae) infects and replicates in a human tissue culture cell line. Journal of General Virology 97, 2763–2768.
Penkler, D.L., Jiwaji, M., Domitrovic, T., et al., 2016. Binding and entry of a non-enveloped T = 4 insect RNA virus is triggered by alkaline pH. Virology 498, 277–287.
Routh, A., Domitrovic, T., Johnson, J.E., 2012. Host RNAs, including transposons, are encapsidated by a eukaryotic single-stranded RNA virus. Proceedings of the National Academy of Sciences of the United States of America 109, 1907–1912.
Speir, J.A., Taylor, D.J., Natarajan, P., et al., 2010. Evolution in action: N and C termini of subunits in related T = 4 viruses exchange roles as molecular switches. Structure 18, 700–709.
Tang, J., Kearney, B.M., Wang, Q., et al., 2014. Dynamic and geometric analyses of Nudaurelia capensis ω virus maturation reveal the energy landscape of particle transitions. Journal of Molecular Recognition 27, 230–237.
Zeddam, J.-L., Gordon, K.H., Lauber, C., et al., 2010. Euprosterna elaeasa virus genome sequence and evolution of the Tetraviridae family: Emergence of bipartite genomes and conservation of the VPg signal with the dsRNA Birnaviridae family. Virology 397, 145–154.