Bacterial Regulatory RNA: Methods and Protocols (Methods in Molecular Biology, 2741) [2 ed.] 1071635646, 9781071635643

This second edition details new and updated methods used for studying prokaryotic non-coding RNAs and their protein acco


English Pages 430 [420] Year 2024


Table of contents:
Preface/Summary
Contents
Contributors
Part I: sRNA Discovery
Chapter 1: RNA Extraction from Gram-Positive Bacteria Membrane Vesicles Using a Polymer-Based Precipitation Method
1 Introduction
2 Materials
3 Methods
3.1 Concentration of MVs from Culture Supernatants
3.2 Precipitation of MVs
3.3 Isolation of RNA from MVs with the miRNeasy Kit
4 Notes
References
Chapter 2: Extraction and Purification of Outer Membrane Vesicles and Their Associated RNAs
1 Introduction
2 Materials
2.1 Bacterial Growth
2.2 OMVs Extraction
2.2.1 Ultracentrifugation
2.2.2 Ultrafiltration
2.2.3 Ion-Exchange-Based Columns
2.3 OMVs Purification
2.4 OMVs Analysis
2.5 RNA Extraction and Analysis
2.5.1 RNA Extraction Methods
2.5.2 Quality Control
2.5.3 RNA Analysis
3 Methods
3.1 OMVs Extraction
3.1.1 Bacterial Growth
3.1.2 Ultracentrifugation
3.1.3 Ultrafiltration
3.1.4 Centrifugation Steps
3.2 OMVs Purification
3.2.1 Density Gradient Purification
3.3 OMVs Analysis
3.3.1 OMVs Concentration
FM1-43 Dosage
ZetaSizer
Nanoparticles Tracker Analyzer (NTA)
Bradford Assay
3.3.2 OMVs Observations
Transmission Electron Microscopy (TEM)
SDS-PAGE Electrophoresis
3.4 Extraction of RNAs
3.4.1 RNA Extraction Methods
3.4.2 Quality Control
3.4.3 RNA Analysis
4 Notes
References
Chapter 3: Analysis of Phage Regulatory RNAs: Sequencing Library Construction from the Fraction of Small Prokaryotic RNAs Less...
1 Introduction
2 Materials
2.1 Reagents to Be Supplied by User
3 Methods
3.1 Isolation of the Bacterial sRNAs Using PureLink miRNA Isolation Kit (Invitrogen)
3.2 DNA Digestion Using TURBO DNA-free Kit (Invitrogen)
3.3 RNA Precipitation and Concentration
3.4 Polyacrylamide Gel Electrophoresis (PAGE) of the Isolated Small RNA Molecules
3.5 Elution of sRNAs Less Than 50 nt in Length from the Gel Slice
3.6 Library Preparation Using SMARTer smRNA-Seq Kit for Illumina (Clontech Laboratories)
3.7 Library Validation and Sequencing
4 Notes
References
Chapter 4: Discovering Novel Bacterial Small RNA by RNA-seq Analysis Toolkit ANNOgesic
1 Introduction
2 Installation Guidelines
2.1 Docker
2.2 Singularity
2.3 pip3
3 Methods
3.1 Arguments
3.2 Download the Input Files
3.3 Create an ANNOgesic Project Folder
3.4 Place the Input Files in ANNOgesic Analysis Folders
3.5 Detect Transcripts
3.6 Identification of Factor-Independent Terminators (Optional)
3.7 Download sRNA and nr Databases (Optional)
3.8 Detection of sRNA
3.9 Outputs of sRNA Prediction
3.9.1 sRNA Annotations and Their Scores
3.9.2 Secondary Structures and Sequences of the sRNA Candidates
3.9.3 BLAST Results (for Reference)
3.9.4 Plots
4 Result Interpretation and Experimental Validation
5 Trouble Shooting Guide
6 Note
References
Part II: sRNA Functions
Chapter 5: Ribosome Profiling Methods Adapted to the Study of RNA-Dependent Translation Regulation in Staphylococcus aureus
1 Introduction
2 Materials
2.1 S. aureus Cultures
2.2 Cell Harvesting and Lysis
2.3 Preparation of Total RNA
2.4 Monosome/Polysome Enrichment and Buffer Exchange
2.5 MNase Treatment
2.6 Polysome Analysis and Monosome Isolation
2.7 Hot Phenol RNA Extraction from Monosomes
2.8 Purification of Ribosome Protected Fragments
3 Methods
3.1 S. aureus Cultures (to Be Performed in an L2 Laboratory)
3.2 Cell Harvesting and Lysis (to Be Performed in an L2 Laboratory)
3.2.1 Ice Bath Protocol
3.2.2 Flash-Freezing Protocol
3.3 Preparation of Total RNA for Transcriptomics Analysis
3.4 Monosome/Polysome Enrichment and Buffer Exchange
3.5 MNase Treatment
3.6 Polysome Analysis and Monosome Isolation
3.7 Hot Phenol RNA Extraction from Monosomes
3.8 Purification of Ribosome-Protected Fragments
3.9 Library Preparation
3.10 Data Analysis
3.11 Visualization and Data Interpretation with Ribo-RET Prediction Explorer
4 Notes
References
Chapter 6: CRISPR Interference-Based Functional Small RNA Genomics
1 Introduction
2 Materials
2.1 Design of the Guide Library
2.2 Cloning of the Guide Library
2.3 CRISPRi Screen
3 Methods
3.1 OPTIONAL: Compilation of the List of Targetable Sequences
3.2 OPTIONAL: Identification of the Most Abundant PAM
3.3 Design of the gRNA Library
3.4 Cloning of the gRNAs
3.5 OPTIONAL: Assembly and Cloning of CRISPR Arrays
3.6 Transformation into the Donor Strain E. coli S17-1
3.7 Conjugation into B. thetaiotaomicron
3.8 Growth of the Library Under Selecting Conditions (CRISPRi Screen)
3.9 gDNA Extraction
3.10 PCR Amplification
3.11 Library Preparation and Sequencing
3.12 Data Analysis
4 Notes
References
Chapter 7: Investigation of sRNA-mRNA Interactions in Bacillus subtilis In Vivo
1 Introduction
1.1 Application of Two-Plasmid Systems, Reporter Gene Fusions, and Chromosomal Deletions, Insertions, or Mutations to Study th...
1.2 Determination of Expression Profiles, Half-Lives, and Intracellular Concentrations of sRNA and mRNA
1.3 In Vivo Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction
2 Materials
2.1 Two-Plasmid Systems, Reporter Gene Fusions, and Chromosomal Deletions, Insertions, or Mutations to Study the Effect of sRN...
2.1.1 Growth Media
2.1.2 Vector, PCR Fragment Preparation, and Cloning
2.1.3 Bacterial Strains
2.1.4 Transformation and Selection
2.1.5 β-galactosidase Measurements
2.1.6 LFH Materials
2.2 Northern Blotting
2.2.1 RNA Isolation
2.2.2 PAAGE and Tank Blotting
2.2.3 Agarose Gels and Capillary Blotting
2.2.4 Preparation of Riboprobes/Oligo-DNA Probes (See Chapter 8, Subheading 2.1.1)
2.2.5 Prehybridization, Hybridization of the Probe, and Detection
2.3 Two-Step Quantitative Real-Time PCR
2.3.1 DNase I Treatment
2.3.2 Quantitative cDNA Synthesis
2.3.3 qRT-PCR
2.4 In Vivo Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction
3 Methods
3.1 Two-Plasmid Systems, Reporter Gene Fusions, and Chromosomal Deletions, Insertions, or Mutations to Study the Effect of sRN...
3.1.1 Cloning of Wild-Type or Mutated sRNA Gene into a Plasmid Vector (See Note 1)
3.1.2 E. coli Transformation and Selection of Correct Clones
3.1.3 Bacillus subtilis Transformation
3.1.4 Long-Flanking Homology (LFH)-PCR
3.1.5 Construction of Compensatory Mutations to Confirm Base-Pairing Interactions (See Notes 15, 16, 17, and 18)
3.1.6 Construction of Transcriptional or Translational lacZ Reporter Gene Fusions to Demonstrate Effects of an sRNA on Transla...
3.2 Determination of Expression Profiles, Half-Lives, and Intracellular Concentrations of sRNA and mRNA
3.2.1 Isolation of Total RNA
3.2.2 Separation of sRNA and Short Target mRNA in Polyacrylamide Gels and Subsequent Tank Blotting
3.2.3 Separation of Target mRNAs >1 kb in Agarose Gels and Subsequent Capillary Blotting
3.2.4 Preparation of Riboprobes and Oligo-DNA Probes
3.2.5 Prehybridization, Hybridization, and Exposure
3.3 Two-Step Quantitative Real-Time PCR
3.3.1 DNase I Treatment
3.3.2 Quantitative cDNA Synthesis
3.3.3 qRT-PCR
3.4 In Vivo Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction
4 Notes
References
Chapter 8: In Vitro Methods for the Investigation of sRNA-mRNA Interactions in Bacillus subtilis
1 Introduction
1.1 In Vitro Synthesis and Purification of Labeled or Unlabeled RNA from Denaturing Polyacrylamide Gels
1.2 Investigation of RNA-RNA Interactions by Electrophoretic Mobility Shift Assay (EMSA)
1.3 Determination of Secondary Structures of the sRNA and the sRNA/Target RNA Complex
1.4 Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction
2 Materials
2.1 In Vitro Synthesis and Purification of Labeled or Unlabeled RNA from Denaturing Polyacrylamide Gels
2.1.1 In Vitro Synthesis of RNA
2.1.2 Polyacrylamide Gel (PAAG) Preparation
2.2 Investigation of RNA-RNA Interactions by Electrophoretic Mobility Shift Assay (EMSA)
2.2.1 In Vitro Synthesis of Internally Labeled RNA
2.2.2 In Vitro Synthesis and Purification of 5′ Labeled RNA
2.2.3 Electrophoretic Mobility Shift Assay (EMSA)
2.3 Determination of Secondary Structures of the sRNA and the sRNA/Target RNA Complex
2.3.1 Enzymatic RNA Structure Probing
2.3.2 Chemical RNA Secondary Structure Probing
2.3.3 Chemical Probing with DMS (Dimethyl Sulfate)
2.3.4 Chemical Probing with CMCT (1-Cyclohexyl-3-2-Morpholinoethyl Metho-p-Toluene Carbodiimide Sulfonate)
2.3.5 Chemical Probing with Pb2+
2.3.6 PAAG for Chemical Probing of RNA Secondary Structure
2.4 Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction
2.4.1 DRaCALA (Differential Radial Capillary Action of Ligand Assay)
2.4.2 EMSAs for Binding of an RNA Chaperone to an RNA
2.4.3 EMSAs to Study the Effect of an RNA Chaperone on an RNA-RNA Interaction
2.4.4 RNA Secondary Structure Probing in RNA-protein Complexes
3 Methods
3.1 In Vitro Synthesis and Purification of Labeled or Unlabeled RNA from Denaturing Polyacrylamide Gels
3.1.1 In Vitro Synthesis of the RNA
3.1.2 RNA Purification by PAAG Electrophoresis and Subsequent Elution
3.1.3 In Vitro Synthesis of Internally Labeled RNA
3.1.4 Separation on a 6% Denaturing PAAG and Elution of the Labeled RNA
3.1.5 Sephadex G-50 Column Preparation and Probe Purification (See Note 8, Fig. 1b)
3.1.6 In Vitro Synthesis and Purification of 5′ Labeled RNA
3.1.7 Dephosphorylation of the RNA
3.1.8 Labeling with 32P-[γ-ATP]
3.2 Investigation of RNA-RNA Interactions by Electrophoretic Mobility Shift Assay (EMSA)
3.2.1 In Vitro Synthesis of sRNA and Target RNA and Labeling of One Interaction Partner
3.2.2 Binding Reaction
3.2.3 Gel Separation of Free RNA and Duplex
3.2.4 Determination of the Kd of the RNA-RNA Complex
3.2.5 Localization of the Minimal Inhibitory Sequence of an sRNA
3.2.6 Determination of the Apparent Binding Rate Constant kapp
3.3 Determination of Secondary Structures of the sRNA and the sRNA/Target RNA Complex
3.3.1 Enzymatic Secondary Structure Probing of RNA
3.3.2 RNase and Nuclease S1 Cleavage Reactions
3.3.3 Preparation of T1 Ladder
3.3.4 Preparation of Alkaline Ladder
3.3.5 Determination of the RNA Secondary Structure After Enzymatic Cleavage (See Notes 24-26)
3.3.6 Secondary Structure Probing of RNA-RNA Complexes
3.3.7 Chemical Probing of RNA by Modification with DMS, CMCT, or Pb2+
3.3.8 Chemical Probing of RNA by Modification with DMS
3.3.9 Chemical Probing of RNA by Modification with CMCT
3.3.10 Chemical Probing of RNA by Modification with Pb2+
3.3.11 Determination of the Secondary Structure After Chemical Modification
3.4 Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction
3.4.1 DRaCALA (Differential Radial Capillary Action of Ligand Assay)
3.4.2 RNA-Protein EMSA
3.4.3 Applications of RNA-Protein EMSA
3.4.4 RNA-RNA-Protein EMSA
3.4.5 Evaluation of RNA-RNA-Protein EMSA
3.4.6 RNA Secondary Structure Probing in RNA-protein Complexes
3.4.7 Evaluation of Changes in the Secondary RNA Structure by RNA Chaperone Binding
4 Notes
References
Chapter 9: RNA Double-Helix Hybridization Measured by Fluorescence Correlation Spectroscopy
1 Introduction
1.1 RNA Double-Strand Hybridization
1.2 FCS Theory
2 Materials
2.1 ssRNAs
2.2 Hybridization
2.3 Fluorescence Correlation Spectroscopy (FCS)
2.4 Software
3 Methods
3.1 Hybridization Assay
3.2 Performance of FCS Measurement
3.3 Data Analysis
4 Notes
References
Chapter 10: New Perspectives on Crosstalks Between Bacterial Regulatory RNAs from Outer Membrane Vesicles and Eukaryotic Cells
1 Introduction
2 Regulatory sRNAs Carried by Membrane Vesicles
2.1 sRNAs Carried by OMVs
2.1.1 Crosstalk Between Host Immune System and OMV-Associated sRNAs
2.2 sRNAs Carried by EVs or Other Membrane Vesicles
3 Bacterial sRNAs Regulating Gene Expression in Plant Cells
4 Future Perspectives
References
Chapter 11: Experimental Validation of RNA-RNA Interactions by Electrophoretic Mobility Shift Assay
1 Introduction
2 Materials
2.1 DNA Template
2.2 In Vitro Transcription and Purification of RNA
2.3 Labeling of RNA
2.4 EMSA
3 Methods
3.1 Design of Mutant Derivatives
3.2 Design and Preparation of DNA Template
3.3 In Vitro Transcription and Purification of RNAs
3.4 Labeling of RNAs
3.5 EMSA
4 Notes
References
Chapter 12: Dynamics and Function of sRNA/mRNAs Under the Scrutiny of Computational Simulation Methods
1 Simulations
1.1 Theoretical Framework
1.2 The MM Force Fields
1.3 Free Energy Landscapes and Conformational Ensembles
1.4 Challenges for RNA MD Simulations
2 Dynamical Characteristics of RNAs
2.1 Ensemble Modularity in RNA
2.2 Hierarchy of RNA Free Energy Landscapes
3 MD Simulations Protocols
3.1 Packages to Perform MD Simulations
3.2 Setting Up the Model
3.3 Minimisation, Heating, and Equilibration
3.4 Production
3.5 Enhanced Sampling
4 Advances in RNA Simulations
4.1 Benchmark Applications
4.2 Applications to Regulatory sRNAs
5 Conclusions
6 Movie Legend
References
Chapter 13: Analysis of sRNAs and Their mRNA Targets in Sinorhizobium meliloti: Focus on Half-Life Determination
1 Introduction
1.1 Translation Inhibition as an Independent Signal for the Tryptophan Attenuator
1.2 Considerations of Methods for RNA Isolation and RNA Stability Determination
2 Materials
2.1 Cultivation and Harvest
2.2 RNA Isolation with Spike-In Transcript, Using RNeasy Mini Columns
2.3 DNase Treatment
2.4 RT-qPCR
3 Methods
3.1 Cultivation and Harvest
3.2 RNA Isolation (According to the RNeasy Mini Kit Protocol) and Spike-In
3.3 DNase Treatment
3.4 RT-qPCR
3.5 Cq Determination
3.6 Primer Efficiency Evaluation
3.7 Evaluation of mRNA Half-Life
4 Notes
References
Chapter 14: Evaluation of 5′-End Phosphorylation for Small RNA Stability and Target Regulation In Vivo
1 Introduction
2 Materials
2.1 Strain and Plasmid Constructions
2.2 Cultures for Determination of (s)RNA Steady-State Levels
2.3 Cultures for (s)RNA Half-Life Determinations
2.4 Total RNA Extraction
2.5 Denaturing Urea-Polyacrylamide Gel Electrophoresis for Separation of Total RNA
2.6 Northern Blotting
2.7 Stripping the Nylon Membrane for Probe Removal
2.8 Generation of DIG-Labeled RNA Probes
3 Methods
3.1 Construction of the Test Strain and Recombinant Plasmids for Release of sRNAx
3.2 Cultivation of Bacteria for Determination of (s)RNA Steady-State Levels
3.3 Cultivation of Bacteria for Determination of (s)RNA Decay Rates
3.4 Extraction of Total RNA from the Bacterial Pellets
3.5 Denaturing Urea Polyacrylamide Gel Electrophoresis
3.6 Northern Blotting
3.7 Stripping the Nylon Membrane for Detection of Distinct RNA Species
3.8 Generation of DIG-Labeled RNA Probes by In Vitro Transcription
4 Notes
References
Chapter 15: In-Gel Cyanoethylation for Pseudouridines Mass Spectrometry Detection of Bacterial Regulatory RNA
1 Introduction
2 Materials
2.1 Instruments and Equipment
2.2 Gel Electrophoresis
2.3 In-Gel Cyanoethylation of ψ and RNase Digestion
2.4 Nano Liquid Chromatography
3 Methods
3.1 RNA Purification by Gel Electrophoresis
3.2 In-Gel Cyanoethylation and RNase Digestion
3.3 RNA Digest Products Desalting
3.4 nanoLC-MS/MS
3.4.1 Nano Liquid Chromatography
3.4.2 Mass Spectrometry Analysis
3.5 Data Analysis
4 Notes
References
Part III: sRNA Interactome
Chapter 16: Directed Screening for sRNA Targets in E. coli Using a Plasmid Library
1 Introduction
2 Materials
2.1 Molecular Cloning of sRNA Genes
2.2 mRNA Translational Fusion Construction
2.3 Library Screening
3 Methods
3.1 Molecular Cloning of sRNA Genes
3.2 mRNA Translational Fusion Construction
3.2.1 Insertion DNA Preparation
3.2.2 Recombineering (Fig. 3b)
3.3 Library Screening
3.3.1 TSS Transformation (Fig. 4)
3.3.2 β-Gal Assay Using 96-Well Microplates (Fig. 4)
4 Notes
References
Chapter 17: Defining Bacterial RNA-RNA Interactomes Using CLASH
1 Introduction
2 Materials
2.1 Strains
2.2 Growth Medium
2.3 Buffers
2.4 Solutions
2.5 Enzymes and Reagents (See Notes 1 and 2)
2.6 Consumables
2.7 Adapters and Primers
2.8 Equipment
3 Method
3.1 Experimental Procedure
3.1.1 Cell Growth
3.1.2 Cell Lysis
3.1.3 RNP-HTF Immunoprecipitation with Anti-FLAG Magnetic Beads
3.1.4 TEV Digestion of HTF Epitope
3.1.5 RNase Trimming of Transcripts Cross-Linked to the RBP
3.1.6 Ni-NTA Affinity Purification of RNP-His6 and 3′ Dephosphorylation of Covalently Bound RNAs
3.1.7 Phosphorylation of the 5′ Ends of Cross-Linked RNAs with Radioactive 32P
3.1.8 On-Bead Ligation of the 5′ Linker to the Cross-Linked RNAs
3.1.9 On-Bead Ligation of the App-PE Linker to the 3′ End of the RNAs
3.1.10 Elution of RBP and Cross-Linked RNAs from Ni-NTA Agarose Resin
3.1.11 Trichloroacetic Acid (TCA) Precipitation of RNPs
3.1.12 Proteinase K Digestion of the Purified RBP
3.1.13 Extraction of Cross-Linked RNAs
3.1.14 Reverse Transcription of Purified RNAs
3.1.15 Purification of cDNA Library
3.1.16 Amplification of cDNA Library
3.1.17 Purification of Amplified cDNA Library
3.1.18 Size-Based Selection of cDNA Libraries
3.1.19 Ethanol Precipitation of cDNAs
3.2 Computational Analysis
3.2.1 Bioinformatic Analysis of CLASH Sequencing Output
3.2.2 Preparing a Minimum Input File for the Data Analysis
3.2.3 Removal of Duplicate Hybrids and Folding Unique Chimera
3.2.4 Comparison of Folding Energies of Experimentally Defined Chimeras and Artificially Generated Control Interactions
3.2.5 Clustering of Targets with Common Seed Sequences
3.2.6 Uncovering Enriched Motifs within Seed Sequences
4 Notes
References
Chapter 18: Global Identification of RNA-Binding Proteins in Bacteria
1 Introduction
2 Materials
2.1 Bacterial Culturing
2.2 Buffers and Reagents
2.3 Equipment
3 Methods
3.1 Verification of Polyadenylation of RNA
3.2 RNA Harvest and Northern Blotting
3.3 Determining the Optimal Duration of PAPI Induction
3.4 UV Cross-Linking and Harvesting of Bacterial Cultures
3.5 Cell Lysis
3.6 RNA-Protein Pulldown
3.7 Recycling of Beads
3.8 Protein Analysis by SDS-PAGE
4 Notes
References
Chapter 19: An Integrated Affinity Chromatography-Based Approach to Unravel the sRNA Interactome in Nitrogen-Fixing Rhizobia
1 Introduction
2 Materials
2.1 MS2 Aptamer Tagging of the sRNA
2.2 Culture, Harvest of Bacteria, and Cell Lysis
2.3 Total RNA Extraction from Bacterial Lysates
2.4 Affinity Chromatography
2.5 RT-qPCR Analysis
2.6 Bioinformatics Analysis
3 Methods
3.1 Aptamer Tagging of the sRNAs
3.2 Cell Growth and Pulse Expression of the MS2-sRNA
3.3 Quality Check of the Tagging Strategy
3.3.1 RNA Purification from Harvested Cells
3.3.2 Assessment of Quality and Quantity of RNA in Samples
3.4 Affinity Chromatography
3.5 RNA Purification from the Eluates and Processing for RNAseq
3.6 Protein Purification from the Eluates and Preparation for Proteomics
3.7 Data Analysis (RNA-Seq)
3.8 Data Analysis (Proteomics)
4 Notes
References
Part IV: sRNA Structure
Chapter 20: sRNA Structural Modeling Based on NMR Data
1 Introduction
2 Materials
2.1 Enzymes, Buffers, and Chemicals
2.2 Equipment
2.3 Software
3 Methods
3.1 RNA In Vitro Transcription for NMR Studies
3.2 Experimental Determination of sRNA Secondary Structure
3.3 Structural Modeling of sRNA Using Computational Methods
4 Notes
References
Chapter 21: Circular and Linear Dichroism for the Analysis of Small Noncoding RNA Properties
1 Introduction
2 Materials
2.1 Nucleic Acids, Proteins, Buffers, and Chemicals
2.2 Materials for Synchrotron Radiation Circular Dichroism (SRCD) Spectroscopy
2.3 Materials for Synchrotron Radiation Linear Dichroism (SRLD) Spectroscopy
3 Methods
3.1 Acquisition and Treatment of SRCD Spectra
3.1.1 Sample Preparation and Loading
3.1.2 SRCD Data Acquisition
3.1.3 Spectral Data Treatment
3.1.4 RNA Spectra Analysis
3.1.5 Thermal Scans Acquisition
3.1.6 Specific Case of Proteins Allowing RNA Annealing
3.2 SRLD Spectra Acquisition and Treatment
3.2.1 Sample Loading
3.2.2 SRLD Data Acquisition and Treatment
4 Notes
References
Index

Methods in Molecular Biology 2741

Véronique Arluison, Claudio Valverde (Editors)

Bacterial Regulatory RNA Methods and Protocols Second Edition

METHODS IN MOLECULAR BIOLOGY

Series Editor John M. Walker School of Life and Medical Sciences University of Hertfordshire Hatfield, Hertfordshire, UK

For further volumes: http://www.springer.com/series/7651

For over 35 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in a readily reproducible, step-by-step fashion, opening with an introductory overview and a list of the materials and reagents needed to complete the experiment, followed by a detailed procedure that is supported by a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice. These hallmark features were introduced by series editor Dr. John Walker and constitute the key ingredient in each and every volume of the Methods in Molecular Biology series. Tested and trusted, comprehensive and reliable, all protocols from the series are indexed in PubMed.

Bacterial Regulatory RNA Methods and Protocols Second Edition

Edited by

Véronique Arluison Laboratoire Léon Brillouin LLB, CEA, CNRS UMR 12, CEA Saclay, Gif-sur-Yvette, France; Université Paris Cité, Paris, France

Claudio Valverde Laboratorio de Fisiología y Genética de Bacterias Beneficiosas para Plantas (LFGBBP), Centro de Bioquímica y Microbiología del Suelo (CBMS), Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes – CONICET, Bernal, Buenos Aires, Argentina


ISSN 1064-3745 ISSN 1940-6029 (electronic) Methods in Molecular Biology ISBN 978-1-0716-3564-3 ISBN 978-1-0716-3565-0 (eBook) https://doi.org/10.1007/978-1-0716-3565-0 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature. The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A. Paper in this product is recyclable.

Preface/Summary

Regulatory small non-coding RNAs (sRNAs) are ubiquitous key regulators of gene expression in prokaryotes. They operate mostly at the post-transcriptional level to influence mRNA translation and/or stability, in most cases with the complicity of RNA-binding proteins. Although significant progress has been made in the past 25 years in understanding the function of individual sRNAs, high-throughput RNomics has recently revealed the enormous wealth of non-coding RNA species in a wide variety of prokaryotes, significantly expanding the implications of this class of regulatory molecules and boosting a new field of research whose relevance has yet to be fully measured.

Understanding many of the fundamental processes underlying the evolution, expression, structure, subcellular location, dynamics, and function of sRNAs requires solid experimental approaches that may be applied either singly or in combination to explore key aspects of sRNA biology. This volume collects many of the most important methods recently set up for studying prokaryotic non-coding RNAs and their protein accomplices, and it completes the first edition published in 2018. The methods cover different aspects of the field and are presented in four sections: discovery of sRNAs, their functional analysis, characterization of sRNA interactomes, and structural studies. Each method includes a section with advice and tips from the authors. This volume aims to provide a guidebook for scientists, which we hope will lead to new tools and procedures for further development in the field of sRNA biology.

Véronique Arluison, Gif-sur-Yvette, France
Claudio Valverde, Bernal, Buenos Aires, Argentina

Contents

Preface/Summary
Contributors

Part I: sRNA Discovery

1 RNA Extraction from Gram-Positive Bacteria Membrane Vesicles Using a Polymer-Based Precipitation Method
Paul Briaud and Ronan K. Carroll
2 Extraction and Purification of Outer Membrane Vesicles and Their Associated RNAs
Anaïs Blache and Wafa Achouak
3 Analysis of Phage Regulatory RNAs: Sequencing Library Construction from the Fraction of Small Prokaryotic RNAs Less Than 50 Nucleotides in Length
Sylwia Bloch, Natalia Lewandowska, Wojciech Wesołowski, Aleksandra Łukasiak, Paulina Mach, Bożena Nejman-Faleńczyk, and Grzegorz Węgrzyn
4 Discovering Novel Bacterial Small RNA by RNA-seq Analysis Toolkit ANNOgesic
Chin-Hsien Tai, Deborah Hinton, and Sung-Huan Yu

Part II: sRNA Functions

5 Ribosome Profiling Methods Adapted to the Study of RNA-Dependent Translation Regulation in Staphylococcus aureus
Maximilian P. Kohl, Béatrice Chane-Woon-Ming, Roberto Bahena-Ceron, Jose Jaramillo-Ponce, Laura Antoine, Lucas Herrgott, Pascale Romby, and Stefano Marzi
6 CRISPR Interference-Based Functional Small RNA Genomics
Gianluca Prezza and Alexander J. Westermann
7 Investigation of sRNA-mRNA Interactions in Bacillus subtilis In Vivo
Inam Ul Haq, Peter Müller, and Sabine Brantl
8 In Vitro Methods for the Investigation of sRNA-mRNA Interactions in Bacillus subtilis
Inam Ul Haq, Peter Müller, and Sabine Brantl
9 RNA Double-Helix Hybridization Measured by Fluorescence Correlation Spectroscopy
Arne Werner
10 New Perspectives on Crosstalks Between Bacterial Regulatory RNAs from Outer Membrane Vesicles and Eukaryotic Cells
Moumita Roy Chowdhury and Eric Massé
11 Experimental Validation of RNA-RNA Interactions by Electrophoretic Mobility Shift Assay
Eva Maria Sternkopf Lillebæk and Birgitte Haahr Kallipolitis
12 Dynamics and Function of sRNA/mRNAs Under the Scrutiny of Computational Simulation Methods
Agustín Ormazábal, Juliana Palma, and Gustavo Pierdominici-Sottile
13 Analysis of sRNAs and Their mRNA Targets in Sinorhizobium meliloti: Focus on Half-Life Determination
Robina Scheuer, Jennifer Kothe, Jan Wähling, and Elena Evguenieva-Hackenberg
14 Evaluation of 5′-End Phosphorylation for Small RNA Stability and Target Regulation In Vivo
Alexandra Schilder, Yvonne Göpel, Muna Ayesha Khan, and Boris Görke
15 In-Gel Cyanoethylation for Pseudouridines Mass Spectrometry Detection of Bacterial Regulatory RNA
Antony Lechner and Philippe Wolff

Part III: sRNA Interactome

16 Directed Screening for sRNA Targets in E. coli Using a Plasmid Library
Xing Luo and Nadim Majdalani
17 Defining Bacterial RNA-RNA Interactomes Using CLASH
Sofia Esteban-Serna, Liang-Cui Chu, Mehak Chauhan, Pujitha Raja, and Sander Granneman
18 Global Identification of RNA-Binding Proteins in Bacteria
Thomas Søndergaard Stenum and Erik Holmqvist
19 An Integrated Affinity Chromatography-Based Approach to Unravel the sRNA Interactome in Nitrogen-Fixing Rhizobia
Natalia Isabel García-Tomsig, Antonio Lagares Jr., Anke Becker, Claudio Valverde, and José Ignacio Jiménez-Zurdo

Part IV: sRNA Structure

20 sRNA Structural Modeling Based on NMR Data
Pengzhi Wu and Lingna Yang
21 Circular and Linear Dichroism for the Analysis of Small Noncoding RNA Properties
Florian Turbant, Kevin Mosca, Florent Busi, Véronique Arluison, and Frank Wien

Index

Contributors

WAFA ACHOUAK • CEA, CNRS, Aix Marseille University Lab of Microbial Ecology of the Rhizosphere (LEMiRE), UMR7265 BIAM, Saint-Paul-lez-Durance, France
LAURA ANTOINE • Architecture et Réactivité de l’ARN, CNRS 9002, Université de Strasbourg, Strasbourg, France
VÉRONIQUE ARLUISON • Laboratoire Léon Brillouin LLB, CEA, CNRS UMR 12, CEA Saclay, Gif-sur-Yvette, France; Université Paris Cité, Paris, France
ROBERTO BAHENA-CERON • Architecture et Réactivité de l’ARN, CNRS 9002, Université de Strasbourg, Strasbourg, France
ANKE BECKER • Center for Synthetic Microbiology (SYNMIKRO), University of Marburg, Marburg, Germany
ANAÏS BLACHE • CEA, CNRS, Aix Marseille University Lab of Microbial Ecology of the Rhizosphere (LEMiRE), UMR7265 BIAM, Saint-Paul-lez-Durance, France
SYLWIA BLOCH • Department of Molecular Biology, Faculty of Biology, University of Gdansk, Gdansk, Poland
SABINE BRANTL • Matthias-Schleiden-Institut für Genetik, Bioinformatik und Molekulare Botanik, AG Bakteriengenetik, Friedrich-Schiller-Universität Jena, Jena, Germany
PAUL BRIAUD • Department of Biological Sciences, Ohio University, Athens, OH, USA
FLORENT BUSI • Université Paris Cité, Paris, France; BFA, UMR 8251, Université Paris Cité, CNRS, Paris, France
RONAN K. CARROLL • Department of Biological Sciences, Ohio University, Athens, OH, USA; Molecular and Cellular Biology Program, Ohio University, Athens, OH, USA; Infectious and Tropical Disease Institute, Ohio University, Athens, OH, USA
BÉATRICE CHANE-WOON-MING • Architecture et Réactivité de l’ARN, CNRS 9002, Université de Strasbourg, Strasbourg, France
MEHAK CHAUHAN • Centre for Engineering Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
MOUMITA ROY CHOWDHURY • Department of Biochemistry, RNA Group, Université de Sherbrooke, Sherbrooke, QC, Canada
LIANG-CUI CHU • Centre for Engineering Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
SOFIA ESTEBAN-SERNA • Centre for Engineering Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
ELENA EVGUENIEVA-HACKENBERG • Institute of Microbiology and Molecular Biology, University of Giessen, Giessen, Germany
NATALIA ISABEL GARCÍA-TOMSIG • Structure, Dynamics and Function of Rhizobacterial Genomes (RhizoRNA Lab), Estacion Experimental del Zaidín, Consejo Superior de Investigaciones Científicas (CSIC), Granada, Spain
YVONNE GÖPEL • Department of Microbiology, Immunobiology and Genetics, Max Perutz Labs, University of Vienna, Vienna Biocenter (VBC), Vienna, Austria; Lexogen, Campus Vienna Biocenter 5, Vienna, Austria
BORIS GÖRKE • Department of Microbiology, Immunobiology and Genetics, Max Perutz Labs, University of Vienna, Vienna Biocenter (VBC), Vienna, Austria


SANDER GRANNEMAN • Centre for Engineering Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
INAM UL HAQ • Matthias-Schleiden-Institut für Genetik, Bioinformatik und Molekulare Botanik, AG Bakteriengenetik, Friedrich-Schiller-Universität Jena, Jena, Germany
LUCAS HERRGOTT • Architecture et Réactivité de l’ARN, CNRS 9002, Université de Strasbourg, Strasbourg, France
DEBORAH HINTON • Gene Expression and Regulation Section, Laboratory of Biochemistry and Genetics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
ERIK HOLMQVIST • Department of Cell and Molecular Biology, Biomedical Centre, Uppsala University, Uppsala, Sweden
JOSE JARAMILLO-PONCE • Architecture et Réactivité de l’ARN, CNRS 9002, Université de Strasbourg, Strasbourg, France
JOSÉ IGNACIO JIMÉNEZ-ZURDO • Structure, Dynamics and Function of Rhizobacterial Genomes (RhizoRNA Lab), Estacion Experimental del Zaidín, Consejo Superior de Investigaciones Científicas (CSIC), Granada, Spain
BIRGITTE HAAHR KALLIPOLITIS • Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
MUNA AYESHA KHAN • Department of Microbiology, Immunobiology and Genetics, Max Perutz Labs, University of Vienna, Vienna Biocenter (VBC), Vienna, Austria; Lexogen, Campus Vienna Biocenter 5, Vienna, Austria
MAXIMILIAN P. KOHL • Architecture et Réactivité de l’ARN, CNRS 9002, Université de Strasbourg, Strasbourg, France
JENNIFER KOTHE • Institute of Microbiology and Molecular Biology, University of Giessen, Giessen, Germany
ANTONIO LAGARES JR • Center for Synthetic Microbiology (SYNMIKRO), University of Marburg, Marburg, Germany; Laboratorio de Fisiología y Genética de Bacterias Beneficiosas para Plantas (LFGBBP), Centro de Bioquímica y Microbiología del Suelo, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes – CONICET, Bernal, Argentina
ANTONY LECHNER • Université de Strasbourg, CNRS, Architecture et Réactivité de l’ARN, UPR 9002, Strasbourg, France
NATALIA LEWANDOWSKA • Department of Molecular Biology, Faculty of Biology, University of Gdansk, Gdansk, Poland
EVA MARIA STERNKOPF LILLEBÆK • Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
ALEKSANDRA ŁUKASIAK • Department of Molecular Biology, Faculty of Biology, University of Gdansk, Gdansk, Poland
XING LUO • Laboratory of Molecular Biology, National Cancer Institute, Bethesda, MD, USA
PAULINA MACH • Department of Molecular Biology, Faculty of Biology, University of Gdansk, Gdansk, Poland
NADIM MAJDALANI • Laboratory of Molecular Biology, National Cancer Institute, Bethesda, MD, USA
STEFANO MARZI • Architecture et Réactivité de l’ARN, CNRS 9002, Université de Strasbourg, Strasbourg, France
ERIC MASSÉ • Department of Biochemistry, RNA Group, Université de Sherbrooke, Sherbrooke, QC, Canada


KEVIN MOSCA • Laboratoire Léon Brillouin LLB, CEA, CNRS UMR 12, CEA Saclay, Gif-sur-Yvette, France; Synchrotron SOLEIL, L’Orme des Merisiers Saint Aubin, Gif-sur-Yvette, France; SANOFI, Marcy-l’Etoile, France
PETER MÜLLER • Matthias-Schleiden-Institut für Genetik, Bioinformatik und Molekulare Botanik, AG Bakteriengenetik, Friedrich-Schiller-Universität Jena, Jena, Germany
BOŻENA NEJMAN-FALEŃCZYK • Department of Molecular Biology, Faculty of Biology, University of Gdansk, Gdansk, Poland
AGUSTÍN ORMAZÁBAL • Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal, Buenos Aires, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas, CONICET, CABA, Buenos Aires, Argentina
JULIANA PALMA • Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal, Buenos Aires, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas, CONICET, CABA, Buenos Aires, Argentina
GUSTAVO PIERDOMINICI-SOTTILE • Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal, Buenos Aires, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas, CONICET, CABA, Buenos Aires, Argentina
GIANLUCA PREZZA • Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany
PUJITHA RAJA • Division of Pathway and Infection Medicine, University of Edinburgh, Edinburgh, UK
PASCALE ROMBY • Architecture et Réactivité de l’ARN, CNRS 9002, Université de Strasbourg, Strasbourg, France
ROBINA SCHEUER • Institute of Microbiology and Molecular Biology, University of Giessen, Giessen, Germany
ALEXANDRA SCHILDER • Department of Microbiology, Immunobiology and Genetics, Max Perutz Labs, University of Vienna, Vienna Biocenter (VBC), Vienna, Austria; Doctoral School in Microbiology and Environmental Science, University of Vienna, Vienna, Austria
THOMAS SØNDERGAARD STENUM • Department of Cell and Molecular Biology, Biomedical Centre, Uppsala University, Uppsala, Sweden
CHIN-HSIEN TAI • Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
FLORIAN TURBANT • Laboratoire Léon Brillouin LLB, CEA, CNRS UMR 12, CEA Saclay, Gif-sur-Yvette, France; Synchrotron SOLEIL, L’Orme des Merisiers Saint Aubin, Gif-sur-Yvette, France
CLAUDIO VALVERDE • Laboratorio de Fisiología y Genética de Bacterias Beneficiosas para Plantas (LFGBBP), Centro de Bioquímica y Microbiología del Suelo, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes – CONICET, Bernal, Provincia de Buenos Aires, Argentina
JAN WÄHLING • Institute of Microbiology and Molecular Biology, University of Giessen, Giessen, Germany
GRZEGORZ WĘGRZYN • Department of Molecular Biology, Faculty of Biology, University of Gdansk, Gdansk, Poland
ARNE WERNER • Institute for Biochemistry and Molecular Biology, Department of Chemistry, Faculty of Mathematics, Computer Science and Natural Science, Hamburg University, Hamburg, Germany; Institute for Microbiology, Faculty for Mathematics and Natural Science, University Kiel, Kiel, Germany
WOJCIECH WESOŁOWSKI • Department of Molecular Biology, Faculty of Biology, University of Gdansk, Gdansk, Poland


ALEXANDER J. WESTERMANN • Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, Germany; Institute of Molecular Infection Biology (IMIB), University of Würzburg, Würzburg, Germany
FRANK WIEN • Laboratoire Léon Brillouin LLB, CEA, CNRS UMR 12, CEA Saclay, Gif-sur-Yvette, France
PHILIPPE WOLFF • Université de Strasbourg, CNRS, Architecture et Réactivité de l’ARN, UPR 9002, Strasbourg, France; Plateforme protéomique Strasbourg Esplanade FRC1589 du CNRS, Université de Strasbourg, Strasbourg, France
PENGZHI WU • Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, People’s Republic of China; Department of Biology, Institute of Biochemistry, ETH Zürich, Zürich, Switzerland
LINGNA YANG • Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, People’s Republic of China
SUNG-HUAN YU • Institute of Precision Medicine, National Sun Yat-sen University, Kaohsiung, Taiwan; School of Medicine, College of Medicine, National Sun Yat-sen University, Kaohsiung, Taiwan

Part I sRNA Discovery

Chapter 1

RNA Extraction from Gram-Positive Bacteria Membrane Vesicles Using a Polymer-Based Precipitation Method

Paul Briaud and Ronan K. Carroll

Abstract

Investigations into the biological role and composition of bacterial extracellular vesicles have grown in popularity in recent years. Vesicles perform a variety of functions during interactions with eukaryotic host cells, ranging from antibiotic resistance to immune modulation. It is necessary to isolate vesicles in order to understand their biological functions. Here we describe a polymer-based precipitation method allowing high-yield isolation of extracellular vesicles and their cargo RNA from the Gram-positive bacterium Staphylococcus aureus.

Key words Membrane vesicle, RNA, Gram-positive, Staphylococcus, Extraction, MV, OMV

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_1, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

Extracellular vesicles (EVs) are membrane-derived lipid spheres produced by all domains of life. EVs have different names depending on the organism they originate from and the subcellular location of production. In Gram-negative bacteria, EVs are typically called outer-membrane vesicles (OMVs), as they are pinched off from the outer membrane [1]. For decades, EVs from Gram-positive bacteria (also called MVs for membrane vesicles) were neglected due to the presence of a thick cell wall at the surface, which was thought to hinder the production of MVs [2, 3]. In the late 2000s, the first studies showed that Gram-positive bacteria could produce MVs with the same features as described for Gram-negative bacteria: (i) MVs are made of a lipid bilayer, which originates from the bacterial membrane; (ii) the cargo is composed of nucleic acids (DNA and RNA) and proteins, which appear to be actively sorted into vesicles; (iii) MVs can deliver their content into surrounding cells and trigger a wide array of responses depending on the cargo packaged [2, 4–9]. As with OMVs, the precise mechanism of MV production remains elusive, and species-specific genetic determinants have been implicated [10].

MV functions are closely correlated with the cargo packaged, as toxins, proteases, and other virulence factors can be released in vesicles from pathogenic bacteria [2, 8, 11]. Interestingly, nucleic acid content such as RNA has been shown to play a role in the host cell response. OMV-associated RNA can trigger a pro-inflammatory response via RIG-I-like receptor (RLR) recognition [12]. Also, OMVs from Pseudomonas aeruginosa carry a small RNA that is predicted to interact with eukaryotic mRNA involved in pro-inflammatory cytokine production [13]. More studies are needed to address the role of MV-associated RNAs in interactions with host cells.

Isolation/separation of vesicles from secreted proteins in the supernatant is essential to studying their composition and their interactions with other cells. Different methods have been described to isolate vesicles from eukaryotic cells and Gram-negative bacteria, such as diafiltration coupled with ultracentrifugation [14]. Diafiltration removes particles smaller than the filter size used and concentrates the remaining solution. Then, ultracentrifugation uses centrifugal force to pellet vesicles while less dense particles and molecules remain in the upper phases. Such methods require working with large starting volumes of cultures and rely on expensive equipment such as tangential flow filtration devices and ultracentrifuges. Polymer-based precipitation methods, on the other hand, use volume-excluding polymers to decrease the solubility of vesicles and allow isolation by low-speed centrifugation. Precipitation-based protocols have become increasingly popular over the past few years as they can isolate and purify vesicles from culture supernatants or body fluids faster and more efficiently than previous methods and use common laboratory equipment, reducing the cost and allowing for higher throughput assays than traditional ultracentrifugation techniques.
Here, we describe a polymer-based precipitation method to isolate Gram-positive bacterial MVs using the ExoQuick-TC buffer and the subsequent steps to isolate MV RNA. We use Staphylococcus aureus as the working model.

2 Materials

Prepare all solutions with purified deionized water and sterilize by autoclaving if needed. All solutions can be stored at room temperature.

1. Tryptic Soy Broth (TSB), autoclaved.
2. 250 mL sterilized glass flask.
3. 1.5 mL microcentrifuge tubes.
4. 50 mL conical tubes.
5. 37 °C shaking incubator.
6. Vortex.
7. Pierce™ BCA Protein Assay kit (Thermo Scientific, 23225).
8. β-Mercaptoethanol (Sigma, M6250).
9. Macrosep Advance 100 kDa concentrators (Pall, MAP100C37).

10. 0.22 μm filter membranes.
11. Ethanol 80%.
12. Isopropanol.
13. 37 °C water bath.
14. 50 mL syringes.
15. ExoQuick-TC buffer (System Biosciences, ExoTC10A-1).
16. Phosphate-buffered saline (PBS 1×): 137 mM NaCl, 2.7 mM KCl, 8 mM Na2HPO4, 2 mM KH2PO4, pH 7.4, autoclaved.
17. miRNeasy kit (Qiagen, 217604).
18. RLT buffer from miRNeasy kit.
19. AL buffer from miRNeasy kit.
20. gDNA eliminator column from miRNeasy kit.
21. RNeasy column from miRNeasy kit.
22. RWT buffer from miRNeasy kit.
23. RPE buffer from miRNeasy kit.
24. Turbo DNase kit (Invitrogen, AM1907).
25. Deactivation reagent from Turbo DNase kit.
26. RNase-free water, autoclaved.
27. FM™ 4–64 Dye (Invitrogen™, T3166).
28. Refrigerated centrifuge with swinging buckets.
29. Refrigerated tabletop centrifuge.
30. Optional: 2100 Agilent Bioanalyzer and Agilent RNA 6000 Pico Kit (Agilent, 5067–1513).
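The PBS composition in item 16 is given in millimolar, so the masses to weigh have to be worked out for each batch. The following is a minimal sketch of that arithmetic, assuming anhydrous salts; the dictionary and function names are illustrative and not part of the protocol. If a hydrated salt is used, its molar mass must replace the value shown here.

```python
# Molar masses (g/mol) of the anhydrous salts; replace if a hydrate is used.
MOLAR_MASS = {
    "NaCl": 58.44,
    "KCl": 74.55,
    "Na2HPO4": 141.96,
    "KH2PO4": 136.09,
}

# 1x PBS composition as listed in item 16, in mmol/L.
PBS_1X_MM = {"NaCl": 137.0, "KCl": 2.7, "Na2HPO4": 8.0, "KH2PO4": 2.0}


def grams_needed(conc_mm, volume_l, molar_mass=MOLAR_MASS):
    """Return the grams of each salt needed for `volume_l` litres."""
    if volume_l <= 0:
        raise ValueError("volume_l must be positive")
    return {
        salt: round(mm / 1000.0 * molar_mass[salt] * volume_l, 3)
        for salt, mm in conc_mm.items()
    }


if __name__ == "__main__":
    # Masses for 1 L of 1x PBS; adjust the pH to 7.4 before autoclaving.
    for salt, grams in grams_needed(PBS_1X_MM, volume_l=1.0).items():
        print(f"{salt}: {grams} g")
```

The same function can be reused for other buffers by passing a different composition dictionary.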

3 Methods

3.1 Concentration of MVs from Culture Supernatants

Our lab and others [4, 15] have successfully isolated MVs from S. aureus using a precipitation method. If working with a different bacterial species, we recommend adapting growth conditions to your bacterium of interest.

1. Grow bacterial cultures (25 mL TSB in a 250 mL flask) with shaking (250 rpm) until mid-exponential phase (see Note 1).
2. Centrifuge the 25 mL culture in a 50 mL conical tube at 3000 × g for 15 min at 4 °C.
3. Filter-sterilize the supernatant through a 0.22 μm filter with a 50 mL syringe.
4. Transfer the cell-free supernatant (~15 mL) to the 100 kDa concentrators (see Note 2).
5. Centrifuge at 3000 × g for 60 min at room temperature in a swinging-bucket rotor (see Note 3).

3.2 Precipitation of MVs

1. Transfer the concentrated supernatants into a 1.5 mL microcentrifuge tube, and estimate the volume recovered with a micropipette.
2. Add ExoQuick-TC buffer to the concentrated supernatants for a final ratio of 1:5 (ExoQuick-TC buffer:supernatant) (e.g., if 200 μL of concentrated supernatant was recovered, add 50 μL of ExoQuick-TC buffer).
3. Mix well by vortexing for ~10 s, and incubate on ice at 4 °C overnight.
4. Centrifuge the mixture at 1500 × g for 30 min at 4 °C. A white/beige pellet should be visible.
5. Carefully remove the supernatant by pipetting.
6. Centrifuge again for a few seconds at 1500 × g, and pipette off the remaining supernatant.
7. Resuspend the MV pellet in a suitable volume of ice-cold 1× PBS (see Note 4).
8. Use the BCA assay kit to estimate the amount of vesicles isolated (see Note 5).
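The buffer volume in step 2 depends on how the 1:5 ratio is read. The worked example in the step (50 μL of buffer for 200 μL of supernatant) corresponds to buffer making up one fifth of the final mixture, whereas a strict buffer:supernatant ratio of 1:5 would give 40 μL. The sketch below computes both so the choice is explicit; the function name and the `ratio_refers_to` parameter are illustrative assumptions, and the manufacturer's instructions should take precedence.

```python
def buffer_volume_ul(supernatant_ul, ratio_buffer=1, ratio_total=5,
                     ratio_refers_to="mixture"):
    """Return the precipitation buffer volume (in uL) for a given supernatant.

    ratio_refers_to="mixture":     buffer / (buffer + supernatant) = 1/5,
                                   which matches the worked example in step 2.
    ratio_refers_to="supernatant": buffer / supernatant = 1/5.
    """
    if supernatant_ul <= 0:
        raise ValueError("supernatant_ul must be positive")
    if ratio_refers_to == "mixture":
        if ratio_total <= ratio_buffer:
            raise ValueError("ratio_total must exceed ratio_buffer")
        return supernatant_ul * ratio_buffer / (ratio_total - ratio_buffer)
    if ratio_refers_to == "supernatant":
        return supernatant_ul * ratio_buffer / ratio_total
    raise ValueError("ratio_refers_to must be 'mixture' or 'supernatant'")


if __name__ == "__main__":
    recovered_ul = 200  # volume estimated with a micropipette in step 1
    print(buffer_volume_ul(recovered_ul))                                 # 50.0
    print(buffer_volume_ul(recovered_ul, ratio_refers_to="supernatant"))  # 40.0
```

Whichever convention is used, applying it identically to all samples keeps yields comparable between conditions.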

3.3 Isolation of RNA from MVs with the miRNeasy Kit

We usually perform the RNA extraction directly from the MV pellet from step 6 in Subheading 3.2. The phosphate-buffered saline used to resuspend MVs could reduce or inhibit the lysis step.

1. Add 260 μL of RLT buffer containing β-mercaptoethanol.
2. Vortex for 1 min.
3. Add 80 μL of buffer AL and mix thoroughly with a pipet.
4. Incubate at room temperature for 3 min.
5. Transfer the lysate to a gDNA eliminator column and spin down at 8000 × g for 30 s.
6. Add one volume (340 μL) of isopropanol to the flow-through and pipet up and down 8 times.
7. Transfer the mixture onto an RNeasy column placed on a collection tube.
8. Centrifuge at 8000 × g for 15 s.
9. Pipet 700 μL of RWT buffer onto the column and spin down at 8000 × g for 15 s.
10. Wash the column with 500 μL of buffer RPE and centrifuge at 8000 × g for 15 s.
11. Add 500 μL of 80% ethanol and centrifuge at 8000 × g for 15 s.
12. Place the RNeasy column onto a new collection tube and centrifuge at full speed for 1 min to dry the column.
13. Place the RNeasy column onto a clean 1.5 mL microcentrifuge tube, and add 35 μL of RNase-free water.
14. Incubate for 1 min at room temperature.
15. Elute the MV RNA by centrifuging at full speed for 1 min.
16. Add 4 μL of 10× Turbo DNase buffer and 1 μL of Turbo DNase enzyme to remove DNA.
17. Incubate at 37 °C for 30 min.
18. Add 4 μL of deactivation reagent and mix by pipetting up and down.
19. Centrifuge for 2.5 min at 10,000 × g.
20. Transfer the supernatant (~30 μL) into a fresh 1.5 mL microcentrifuge tube.
21. Determine the concentration and size of the MV RNA with an Agilent 2100 Bioanalyzer (see Note 6).
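Two volumes in the steps above follow from simple bookkeeping: the "one volume" of isopropanol in step 6 equals the lysate volume (260 μL + 80 μL), and the 4 μL of 10× buffer in step 16 brings a 40 μL reaction to 1×. The sketch below reproduces that arithmetic so it can be rescaled if input volumes change. It assumes the full 35 μL of eluate is recovered, which is an idealization, and the helper names are illustrative.

```python
def total_volume_ul(*volumes_ul):
    """Sum component volumes (uL)."""
    return sum(volumes_ul)


def final_strength_x(stock_x, stock_ul, total_ul):
    """Working strength (in 'x') of a stock after dilution into total_ul."""
    if total_ul <= 0:
        raise ValueError("total_ul must be positive")
    return stock_x * stock_ul / total_ul


# Steps 1 and 3: RLT buffer plus buffer AL give the lysate volume.
lysate_ul = total_volume_ul(260, 80)

# Step 6: one volume of isopropanol means a volume equal to the lysate.
isopropanol_ul = lysate_ul

# Steps 13 and 16: eluate + 10x buffer + enzyme (assumes full eluate recovery).
dnase_reaction_ul = total_volume_ul(35, 4, 1)
buffer_strength = final_strength_x(10, 4, dnase_reaction_ul)

if __name__ == "__main__":
    print(f"Isopropanol to add: {isopropanol_ul} uL")
    print(f"DNase reaction volume: {dnase_reaction_ul} uL")
    print(f"DNase buffer working strength: {buffer_strength}x")
```

If less eluate is recovered, recomputing `final_strength_x` shows how far the buffer deviates from 1×.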

4 Notes

1. MVs can form by active budding from the bacterial membrane or passively, by self-assembly of bacterial membrane fragments upon cell death. It is preferable to work with bacteria in mid-exponential phase, as cell lysis is minimal and the amount of MVs arising from cell lysis is reduced. If working at a later time point, samples may contain both active and passive MV forms.
2. Be careful to load the same volume of cell-free supernatant, particularly when working with supernatants from different conditions (e.g., WT vs. mutant). This allows a direct comparison of the MV yield between conditions.
3. If using a swinging-bucket rotor, the position of the concentrator membrane with respect to the rotor axis is not important. However, if using a fixed-angle rotor, the concentrator membrane needs to be perpendicular to the rotor axis.
4. The volume of PBS to add will depend on the amount of MVs produced by the bacteria. We typically resuspend MVs in 500 μL of 1× PBS. After resuspending the MV pellet, it is also possible to increase the purity of the sample by performing a density gradient purification (e.g., Optiprep).
5. Characterization of MV content should be performed after each isolation to estimate the concentration of vesicles isolated. As MVs are composed of proteins and lipids, we routinely use the Pierce BCA protein assay kit for protein concentration estimation and the fluorescent FM4-64 dye for lipid detection. We then normalize MVs by their protein and/or lipid concentration when using them for subsequent assays. We highly recommend performing, at least once, a nanoparticle tracking analysis (e.g., LM10 NanoSight) and/or transmission electron microscopy analysis to visualize MVs directly and ensure that isolation occurred correctly (Fig. 1).
6. The profile and concentration of MV RNA obtained can differ depending on the bacterial species studied. For S. aureus, we routinely obtain concentrations in the ng/μL range. The RNA profile shows an abundance of short-sized RNAs (Fig. 2). The RIN (RNA integrity number) cannot be computed as no defined rRNA bands can be determined. The isolated RNA can be used for sequencing.
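The normalization by protein concentration mentioned in Note 5 can be sketched as follows: fit a straight line to the absorbance of protein standards, interpolate the MV sample, and compute the volume that delivers a chosen protein mass. This is only an illustration under the assumption of a linear response across the standard range; the function names are hypothetical, and the numbers in the usage section are made up for demonstration and are not measurements.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard concentrations must differ")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_conc_ug_per_ml(absorbance, slope, intercept, dilution_factor=1.0):
    """Interpolate a sample concentration from the fitted standard curve."""
    return (absorbance - intercept) / slope * dilution_factor


def volume_for_mass_ul(target_ug, conc_ug_per_ml):
    """Volume (uL) of MV suspension containing target_ug of protein."""
    if conc_ug_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return target_ug / conc_ug_per_ml * 1000.0


if __name__ == "__main__":
    # Illustrative, made-up standards (ug/mL) and absorbance readings.
    standards = [0.0, 250.0, 500.0, 1000.0]
    readings = [0.10, 0.35, 0.60, 1.10]
    slope, intercept = fit_line(standards, readings)
    conc = protein_conc_ug_per_ml(0.60, slope, intercept)
    print(f"Sample: {conc:.0f} ug/mL; "
          f"{volume_for_mass_ul(10, conc):.1f} uL holds 10 ug of protein")
```

Samples reading outside the standard range should be diluted and measured again rather than extrapolated.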

Fig. 1 Nanoparticle tracking analysis (NTA) of MVs isolated from S. aureus. Most of the vesicles concentrate around a particle size of ~82 nm, which is the usual size for vesicles isolated from Gram-positive bacteria


Fig. 2 RNA profile of MV RNA from S. aureus grown at 37 °C for 3 h. A high abundance of short-sized RNAs is observed. The Agilent RNA 6000 Pico Kit was used.

References

1. Roier S, Zingl FG, Cakar F, Schild S (2016) Bacterial outer membrane vesicle biogenesis: a new mechanism and its implications. Microb Cell 3(6):257–259
2. Briaud P, Carroll RK (2020) Extracellular vesicle (EV) biogenesis and functions in Gram-positive bacteria. Infect Immun. https://doi.org/10.1128/IAI.00433-20
3. Brown L, Wolf JM, Prados-Rosales R, Casadevall A (2015) Through the wall: extracellular vesicles in Gram-positive bacteria, mycobacteria and fungi. Nat Rev Microbiol 13(10):620–630
4. Briaud P, Frey A, Marino EC, Bastock RA, Zielinski RE, Wiemels RE, Keogh RA, Murphy ER, Shaw LN, Carroll RK (2021) Temperature influences the composition and cytotoxicity of extracellular vesicles in Staphylococcus aureus. mSphere. https://doi.org/10.1128/msphere.00676-21
5. Choi E-J, Lee HG, Bae I-H, Kim W, Park J, Lee TR, Cho E-G (2018) Propionibacterium acnes-derived extracellular vesicles promote acne-like phenotypes in human epidermis. J Invest Dermatol 138(6):1371–1379
6. Hong S-W, Choi E-B, Min T-K, Kim J-H, Kim M-H, Jeon SG, Lee B-J, Gho YS, Jee Y-K, Pyun B-Y, Kim Y-K (2014) An important role of α-hemolysin in extracellular vesicles on the development of atopic dermatitis induced by Staphylococcus aureus. PLoS One 9(7):e100499
7. Lee J, Lee E-Y, Kim S-H, Kim D-K, Park K-S, Kim KP, Kim Y-K, Roh T-Y, Gho YS (2013) Staphylococcus aureus extracellular vesicles carry biologically active β-lactamase. Antimicrob Agents Chemother 57(6):2589–2595
8. Bitto NJ, Cheng L, Johnston EL, Pathirana R, Phan TK, Poon IKH, O’Brien-Simpson NM, Hill AF, Stinear TP, Kaparakis-Liaskos M (2021) Staphylococcus aureus membrane vesicles contain immunostimulatory DNA, RNA and peptidoglycan that activate innate immune receptors and induce autophagy. J Extracell Vesicles 10:e12080
9. Dean SN, Rimmer MA, Turner KB, Phillips DA, Caruana JC, Hervey WJ, Leary DH, Walper SA (2020) Lactobacillus acidophilus membrane vesicles as a vehicle of Bacteriocin delivery. Front Microbiol 11:710
10. Lee E-Y, Choi D-Y, Kim D-K, Kim J-W, Park JO, Kim S, Kim S-H, Desiderio DM, Kim Y-K, Kim K-P, Gho YS (2009) Gram-positive bacteria produce membrane vesicles: proteomics-based characterization of Staphylococcus aureus-derived membrane vesicles. Proteomics 9(24):5425–5436
11. Choi JH, Moon CM, Shin T-S, Kim EK, McDowell A, Jo M-K, Joo YH, Kim S-E, Jung H-K, Shim K-N, Jung S-A, Kim Y-K (2020) Lactobacillus paracasei-derived extracellular vesicles attenuate the intestinal inflammatory response by augmenting the endoplasmic reticulum stress pathway. Exp Mol Med. https://doi.org/10.1038/s12276-019-0359-3
12. Kaparakis-Liaskos M, Ferrero RL (2015) Immune modulation by bacterial outer membrane vesicles. Nat Rev Immunol 15(6):375–387
13. Koeppen K, Hampton TH, Jarek M, Scharfe M, Gerber SA, Mielcarz DW, Demers EG, Dolben EL, Hammond JH, Hogan DA, Stanton BA (2016) A novel mechanism of host-pathogen interaction through sRNA in bacterial outer membrane vesicles. PLoS Pathog 12(6):e1005672
14. Brennan K, Martin K, FitzGerald SP, O’Sullivan J, Wu Y, Blanco A, Richardson C, Mc Gee MM (2020) A comparison of methods for the isolation and separation of extracellular vesicles from protein and lipid particles in human serum. Sci Rep 10(1):1039
15. Schlatterer K, Beck C, Hanzelmann D, Lebtig M, Fehrenbacher B, Schaller M, Ebner P, Nega M, Otto M, Kretschmer D, Peschel A (2018) The mechanism behind bacterial lipoprotein release: phenol-soluble modulins mediate toll-like receptor 2 activation via extracellular vesicle release from Staphylococcus aureus. MBio 9(6):e01851–e01818

Chapter 2

Extraction and Purification of Outer Membrane Vesicles and Their Associated RNAs

Anaïs Blache and Wafa Achouak

Abstract

Outer membrane vesicles (OMVs), produced by Gram-negative bacteria, and sRNAs are key players in cell-to-cell communication and in the interactions of bacteria with the environment. OMVs act as information carriers and encapsulate various molecules such as proteins, lipids, metabolites, and RNAs. OMVs and sRNAs perform a broad range of functions, from pathogenesis to stress resistance and biofilm formation, and both mediate interkingdom signaling. Various studies indicate that there is a mechanism of intercellular communication mediated by OMV-derived bacterial RNAs that is conserved among certain bacterial species. Here we describe methods for the extraction and purification of vesicles produced by Gram-negative bacteria, such as Pseudomonas brassicacearum and Escherichia coli, and address methods for the extraction of OMVs-derived sRNA and techniques for the analysis of sRNAs.

Key words OMVs, sRNA, Gram-negative bacteria, Extraction methods, Purification methods, OMVs-derived RNA extraction

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_2, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

Gram-negative bacteria produce outer membrane vesicles (OMVs) by pinching off portions of the outer membrane. OMVs, key players in cell-to-cell communication, carry various molecules like proteins, RNAs, secondary metabolites, and lipids [1, 2]. OMVs act as information carriers. A lipid bilayer protects the cargo from environmental stress. For some species, OMVs can activate quorum sensing, immune regulation, and biofilm formation [3, 4]. Secretion of OMVs allows rapid adaptation to environmental changes by transporting signaling molecules, misfolded proteins, toxins, and virulence factors [1, 5–10].

Analysis of the OMV profile of human pathogenic bacteria reveals the presence of sRNAs that have been shown to be internalized by human cells [11, 12]. Pseudomonas aeruginosa OMVs carrying sRNAs have been shown to transfer their sRNAs to human airway epithelial cells and to cells in the mouse lung, attenuating the immune response [13]. sRNAs are key players in cell-to-cell communication, including plant–pathogen interactions [14]. A 115 nt sRNA from the plant pathogen Xanthomonas has been reported to regulate 63 genes associated with signaling and virulence functions in pepper [15]. Although direct evidence is currently lacking, bacterial sRNAs may likewise be transferred from OMVs into plant cells.

Functional studies of OMVs and identification of their cargo are challenging because of the slow extraction process and low purification yield. Most experiments require purified vesicles, for example, in the case of host–pathogen interactions. The most commonly used extraction method is ultracentrifugation. Objects larger than 0.45 μm are eliminated by filtration, but heavy objects like membrane pieces, pili, and flagella remain. During ultracentrifugation, these objects are pelleted together with the OMVs and compromise their purity. Depending on the purpose of the study, a purification step can therefore be mandatory.

The volume of OMVs required will influence the choice of extraction and purification method. The ultracentrifugation process requires less equipment than ultrafiltration, but the volume of the supernatant is limiting. Small-scale extraction and purification involve the ultracentrifugation process and density gradient purification, or the use of a commercial extraction and purification kit. Ultrafiltration allows culture supernatant to be processed without volume limits and can be used to enhance the extraction yield by concentrating a large volume of supernatant. This method has been used for OMVs produced by Escherichia coli with a 100 kDa [16, 17] or a 70 kDa [18] membrane. The large-scale process combines ultrafiltration and gel filtration columns. Once the OMVs have been purified, their RNAs can be extracted, quantified, and quality checked, and then used for RT-PCR/qRT-PCR, Northern blot, or RNA-Seq analysis.
The methods described here are efficient with Gram-negative bacteria such as Pseudomonas brassicacearum and Escherichia coli.

2 Materials

2.1 Bacterial Growth

1. Tryptic Soy Broth (TSB) diluted tenfold or LB Broth.
2. Glass culture flasks.
3. Incubator.

2.2 OMVs Extraction

2.2.1 Ultracentrifugation

1. Centrifuge.
2. Fixed-angle rotor JLA-9.1000 or JLA-16.250.
3. Sterile 1 L or 200 mL bottles able to spin at 10,000 × g.
4. PES filtration units, 0.45 μm or 0.22 μm.
5. Vacuum pump.
6. Ultracentrifuge.
7. Fixed-angle rotor (e.g., 50.2 Ti).
8. Ultracentrifuge bottles (26.3 mL) capable of spinning at 150,000 × g.
9. Phosphate-buffered saline (PBS), pH 7.4.
10. DPBSS (Dulbecco’s phosphate-buffered saline supplemented with salts): 0.901 mM CaCl2, 2.682 mM KCl, 0.4918 mM MgCl2 hexahydrate, 136.8 mM NaCl, and 15.21 mM Na2HPO4 (dibasic).

2.2.2 Ultrafiltration

1. Ultrafiltration cassette: Pellicon Biomax 30 kDa 0.57 m² D screen.
2. Pellicon cassette acrylic holder and assembly.
3. 5 L of 0.1 M NaOH.
4. 10 L of ultrapure H2O.
5. Sterile bottle (5 L).
6. Peristaltic pump.
7. Autoclavable tubing adaptable to the peristaltic pump.
8. Perforated caps.
9. 2 clamps for tubing.
10. pH paper.

2.2.3 Ion-Exchange-Based Columns

1. ExoBacteria™ OMV Isolation Kit for Gram-negative bacteria.
2. Cold room.
3. Shaker.

2.3 OMVs Purification

1. Optiprep density gradient medium.
2. Ultracentrifuge.
3. Fixed-angle rotor (e.g., 50.2 Ti).
4. 8 mL ultracentrifugation tubes capable of spinning at 150,000 × g.
5. Sterile 1 mL syringes.
6. Needles: Sterican 0.70 × 40 mm, 22G × 1 1/2.
7. Ultracentrifuge.
8. Rotor MLA-55.
9. Ultracentrifuge tubes (26.3 mL) capable of spinning at 200,000 × g.

2.4 OMVs Analysis

1. 5 μg/mL lipophilic fluorescent dye FM1–43.
2. Multimode microplate reader.
3. Zetasizer.
4. 50–2000 μL UVette 220–1600 nm.
5. Nanoparticle Tracker Analyzer (NTA).
6. Transmission electron microscope (TEM).
7. 0.1% Phosphotungstic acid.
8. Electrophoresis cell.
9. Generator.
10. Imperial Protein Stain.
11. 20× MES SDS Running Buffer.
12. NuPAGE 10% Bis-Tris gel.
13. LDS sample buffer.
14. Protein all blue standard.
15. Bradford protein estimation kit.

2.5 RNA Extraction and Analysis

2.5.1 RNA Extraction Methods

1. TRIzol Reagent.
2. Chloroform.
3. Isopropanol.
4. Ethanol 75%.
5. RNase-free water.
6. Centrifuge.
7. Vortex.
8. MiVac DNA Concentrator.
9. RNase A/T1.
10. NanoVue spectrophotometer.
11. GIGAPUR RNAse-DEKONTA OFF.

2.5.2 Quality Control

1. Qubit 4 fluorometer.
2. Qubit RNA IQ Assay Kit.
3. Qubit RNA BR Assay Kit.

2.5.3 RNA Analysis

1. TURBO DNA-free Kit.
2. Transcriptor First Strand cDNA Synthesis Kit.
3. Q-PCR LightCycler 480 SYBR Green I Master kit.
4. LightCycler 480 or other real-time PCR system.
5. Thermocycler.

3 Methods

3.1 OMVs Extraction

3.1.1 Bacterial Growth

1. Start an overnight culture from an isolated colony in 10 mL of tenfold-diluted Tryptic Soy Broth (TSB 1/10) or LB Broth.
2. Prepare sterile flasks of 1/10 TSB or LB. Adjust the volume according to the extraction method, leaving 80% of the flask volume as headspace for aeration.
   • Ultracentrifugation: 600 mL in a 3 L flask (see Note 1).
   • Ultrafiltration: 7 × 600 mL in 3 L flasks.
   • ExoBacteria OMV Isolation Kit: 100 mL in a 500 mL flask.
   • Centrifugation steps: 600 mL in a 3 L flask.
3. Grow the cultures at the appropriate temperature and at 140 rpm for 24 h (see Note 2).
4. Dispense the culture into 200 mL or 1 L sterile bottles. To separate vesicles and cells, centrifuge the cultures twice at 10,000 × g for 30 min. Use a new sterile bottle for each centrifugation step, and discard the bacterial pellet.
5. Filter the supernatant through 0.45 μm PES filters. If the OMVs are smaller than 200 nm, filter the supernatant through a 0.22 μm PES filter unit.
6. After this step, a cell-free supernatant is obtained (see Notes 3 and 4).
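The culture volumes in step 2 all correspond to filling one fifth of the flask, i.e., leaving 80% of the flask volume as headspace (600 mL in 3 L, 100 mL in 500 mL). A minimal sketch of this check, with illustrative function names, can help when scaling to other flask sizes:

```python
def max_culture_ml(flask_ml, headspace_fraction=0.8):
    """Largest culture volume that leaves the requested headspace."""
    if not 0 <= headspace_fraction < 1:
        raise ValueError("headspace_fraction must be in [0, 1)")
    return flask_ml * (1.0 - headspace_fraction)


def respects_headspace(culture_ml, flask_ml, headspace_fraction=0.8,
                       tol_ml=1e-6):
    """True if the culture volume leaves at least the requested headspace."""
    return culture_ml <= max_culture_ml(flask_ml, headspace_fraction) + tol_ml


if __name__ == "__main__":
    # Volumes listed in step 2 of Subheading 3.1.1.
    for culture_ml, flask_ml in [(600, 3000), (100, 500)]:
        ok = respects_headspace(culture_ml, flask_ml)
        print(f"{culture_ml} mL in a {flask_ml} mL flask: {ok}")
```

The small tolerance only absorbs floating-point rounding and has no experimental meaning.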

3.1.2 Ultracentrifugation

Ultracentrifugation is the most common method for extracting OMVs. This technique is convenient and repeatable but limits the volume of supernatant that can be processed.
1. Transfer the cell-free supernatant into sterile 26.3 mL tubes for ultracentrifugation. The tubes may be sterilized with 15% H2O2 and washed twice with sterile water (see Note 5).
2. Centrifuge the supernatant at 150,000 × g for 2 h at 4 °C (see Note 6).
3. Filter the phosphate-buffered saline (PBS at pH 7.4) through 0.22 μm filter units.
4. After ultracentrifugation, the OMVs are concentrated in the pellet. Discard the supernatant and resuspend the pellets with the correct volume of sterile PBS (see Note 7).
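Ultracentrifuge speeds are often entered in rpm rather than × g. The sketch below converts between the two with the standard relation RCF = 1.118 × 10^-5 × r × rpm^2 (r in cm). The 9.0 cm radius is a placeholder assumption, not a value from this protocol; use the maximum radius specified for the rotor actually used.

```python
import math

# Standard conversion constant for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_to_rpm(rcf: float, radius_cm: float) -> float:
    """Return the rotor speed (rpm) needed to reach a given RCF (x g)."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius_cm must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rpm_to_rcf(rpm: float, radius_cm: float) -> float:
    """Return the RCF (x g) produced at a given rotor speed and radius."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be non-negative and radius_cm positive")
    return RCF_CONSTANT * radius_cm * rpm ** 2


if __name__ == "__main__":
    radius_cm = 9.0  # hypothetical rotor radius; check the rotor manual
    rpm = rcf_to_rpm(150_000, radius_cm)
    print(f"150,000 x g at r = {radius_cm} cm needs about {rpm:.0f} rpm")
```

Because RCF scales with the square of the speed, a small error in rpm has a larger effect on the force applied, so it is worth checking the radius before each run.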

3.1.3 Ultrafiltration

The ultrafiltration technique is easy and effective but requires rather expensive equipment, such as the acrylic ultrafiltration cassette holder and the ultrafiltration membrane.
1. Place the Pellicon membrane into the holder, and tighten the holder nuts firmly with a wrench to avoid any leakage (see Note 8).

Anaïs Blache and Wafa Achouak

Fig. 1 Ultrafiltration technique: (1) Flushing step with H2O, (2) Concentrating the cell-free supernatant, (3) Cleaning (washing) step with NaOH. Created with BioRender.com

2. Connect the inlet of the peristaltic pump to a 10 L bottle of ultrapure H2O, and the retentate and permeate to trash bottles (Fig. 1). Start the pump at a low flow rate. Check the system for leaks (see Note 9).
3. First, flush the membrane with ultrapure H2O to remove any residual storage solution (0.1 M NaOH). Determine the pH of the permeate and retentate before flushing. Start the pump at the maximum flow rate (120 rpm) to flush the membrane with 3 L of ultrapure H2O.
4. Close the retentate with a clamp so that the entire flow passes through the membrane, and flush with 2 L of ultrapure H2O. Check the pH of the permeate and retentate; if it is equivalent to the pH of the H2O, stop flushing. Otherwise, keep flushing the membrane.
5. Connect the peristaltic pump and the retentate to the supernatant bottle, and the permeate to the trash bottle (Fig. 1).
6. Close the retentate to concentrate the supernatant. When the desired volume of supernatant is reached, open the retentate and close the permeate. Reverse the peristaltic pump direction to collect all the concentrated supernatant.


7. Connect the pump to a bottle of 0.1 M NaOH, with the retentate and permeate going into trash bottles (Fig. 1). Close the retentate and clean the membrane with 3 L of 0.1 M NaOH. Open the retentate and finish the cleaning with 2 L of 0.1 M NaOH. Check the permeate and retentate pH to make sure that the NaOH passes through the membrane.
8. Store the Pellicon cassette at 4 °C in 0.1 M NaOH.
9. Re-filter the concentrated supernatant with 0.45 μm or 0.22 μm PES filters.
10. Ultracentrifuge the supernatant at 150,000 × g for 2 h at 4 °C and resuspend the pellet with the appropriate volume of sterile PBS (depending on the pellet thickness).

3.1.4 Centrifugation Steps

This method allows direct OMV purification with a long, high-speed centrifugation step [19]. Supernatant volumes can be higher than those used with the ExoBacteria™ OMV Isolation Kit (System Biosciences).
1. Sterilize DPBS using 0.22 μm PES filter units.
2. Centrifuge the supernatant at 38,400 × g for 2 h and collect the pellet in 25 mL of sterile Dulbecco’s phosphate-buffered saline (DPBS) with added magnesium and calcium [19].
3. Ultracentrifuge OMVs at 100,000 × g for 1 h (6), and resuspend the pellet in the appropriate volume of sterile DPBS or PBS.

After extraction by ultracentrifugation or ultrafiltration, OMVs are cell-free but not pure. The sample may contain membrane debris, flagella, and pili. To study the vesiculation process or to compare vesicle production between strains, the extraction steps are sufficient. For other investigations, such as host–OMVs interactions, OMVs need to be purified. Flagellin, for example, is a microbe-associated molecular pattern capable of activating the plant immune response [20], so contaminating flagella could confound such studies. The most commonly used technique for purification is density gradient purification. For large-scale extraction, gel extraction columns can be employed. Some authors, such as Collins et al. (2022) [21], use size exclusion chromatography.

3.2 OMVs Purification

3.2.1 Density Gradient Purification

Density gradient purification is the most commonly used method for OMV purification [17, 18, 22, 23]. This method provides high purity but gives a low yield and poor reproducibility.
1. Pool the three 100–200 μL OMVs samples in a 26.3 mL sterile ultracentrifugation tube. Centrifuge the sample at 150,000 × g for 2 h at 4 °C. Discard the supernatant and resuspend the pellet in 410 μL of sterile PBS.

Fig. 2 Density gradient purification. Panels: gradient density fractionation on a stepwise Optiprep gradient (layers from 20% to 45%, with the OMVs in the 45% layer), centrifuged at 150,000 × g for 20 h at 4 °C; collection of fractions; lipid content of each fraction measured with FM1-43 (relative fluorescence per CFU)

2. From the Optiprep 60% density gradient medium, prepare 1.5 mL of each diluted solution: 45%, 40%, 35%, 30%, 25%, and 20%. Dilute the solutions with sterile PBS, except for the 45% solution, which is diluted with the 410 μL OMVs sample. 2.5 mL of the 20% solution is needed (see Note 10).
3. Sterilize an 8 mL tube with 15% H2O2 and wash twice with sterile H2O.
4. Carefully and slowly layer 1 mL of each Optiprep solution into the sterilized tube as described in Fig. 2, beginning with the 45% solution and proceeding in descending order (from 45% to 20%) (see Note 11).
5. Centrifuge the gradient at 150,000 × g for at least 20 h at 4 °C (see Note 12).
6. After centrifugation, extract each 1 mL fraction with a 1 mL sterile syringe and needle. Secure the tube and begin extraction from the top of the gradient (see Note 13).
7. Use the lipophilic fluorescent dye FM1-43 (Thermo Scientific) to quantify the OMVs in each fraction. Run the assay as described in the OMVs analysis section. In the case of P. brassicacearum, OMVs should be found in the third and fourth extraction fractions.
8. Sterilize a 26.3 mL ultracentrifugation tube with 15% H2O2 and wash twice with sterile H2O.
9. Pool the OMVs-containing fractions into the sterile 26.3 mL centrifugation tube. Fill the tube with approximately 15 mL of sterile PBS and equilibrate the tube (±0.05 g).
10. Centrifuge the tube at 200,000 × g for 2 h at 4 °C. Discard the supernatant and resuspend the pellet with the appropriate volume of sterile PBS.
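The gradient solutions in step 2 follow the usual C1V1 = C2V2 dilution rule. The sketch below tabulates stock and PBS volumes for the PBS-diluted layers only; the function and variable names are illustrative. The 45% layer is left out on purpose, since the protocol dilutes it with the OMVs sample rather than with PBS.

```python
def dilution_volumes(stock_pct: float, target_pct: float, final_ml: float):
    """Return (stock_ml, diluent_ml) for a C1V1 = C2V2 dilution."""
    if not 0 < target_pct <= stock_pct:
        raise ValueError("target_pct must be in (0, stock_pct]")
    if final_ml <= 0:
        raise ValueError("final_ml must be positive")
    stock_ml = final_ml * target_pct / stock_pct
    return stock_ml, final_ml - stock_ml


STOCK_PCT = 60.0  # Optiprep stock concentration stated in the protocol

# Final volumes from step 2: 1.5 mL per layer, 2.5 mL for the 20% layer.
# The 45% layer is omitted because it is diluted with the OMVs sample.
LAYERS_ML = {40.0: 1.5, 35.0: 1.5, 30.0: 1.5, 25.0: 1.5, 20.0: 2.5}

if __name__ == "__main__":
    for pct, final_ml in LAYERS_ML.items():
        stock_ml, pbs_ml = dilution_volumes(STOCK_PCT, pct, final_ml)
        print(f"{pct:>4.0f}%: {stock_ml:.3f} mL Optiprep 60% + {pbs_ml:.3f} mL PBS")
```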

3.3 OMVs Analysis

3.3.1 OMVs Concentration

FM1-43 Dosage

The production of OMVs can be estimated with 5 μg/mL of the lipophilic fluorescent dye FM1-43 in a 96-well microplate. As there is no calibration curve available, this assay is only an estimate and does not provide the concentration of OMVs. This method is a quick and efficient way to compare the production of OMVs (see Note 14).
1. Fill each well. Three blanks are required (see Note 15).
2. Incubate for 5 min in the dark.
3. Use a multimode microplate reader. The excitation and emission wavelengths of FM1-43 are 479 nm and 600 nm, respectively.
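One way to turn the plate readings into a comparable value is to subtract the mean of the three blanks and normalize to the CFU count of the source culture, as in the "relative fluorescence per CFU" axis of Fig. 2. The sketch below assumes that CFU counts are available; the function names and all readings are invented for illustration.

```python
from statistics import mean


def blank_corrected(sample_reads, blank_reads):
    """Subtract the mean blank fluorescence from each sample reading."""
    if not sample_reads or not blank_reads:
        raise ValueError("need at least one sample and one blank reading")
    blank = mean(blank_reads)
    return [read - blank for read in sample_reads]


def relative_fluorescence_per_cfu(sample_reads, blank_reads, cfu):
    """Mean blank-corrected fluorescence divided by the CFU count."""
    if cfu <= 0:
        raise ValueError("cfu must be positive")
    return mean(blank_corrected(sample_reads, blank_reads)) / cfu


if __name__ == "__main__":
    # Hypothetical readings in arbitrary fluorescence units.
    samples = [1100.0, 1200.0, 1300.0]
    blanks = [100.0, 100.0, 100.0]
    value = relative_fluorescence_per_cfu(samples, blanks, cfu=1e8)
    print(f"relative fluorescence per CFU: {value:.2e}")
```

Since the assay has no calibration curve, the result is only meaningful for comparing samples measured on the same plate with the same settings.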

ZetaSizer

OMV samples may be analyzed with a ZetaSizer instrument to confirm the presence of OMVs in the sample. The analysis gives the size distribution of nanosized particles and reveals the presence of aggregates, but it does not provide the concentration of OMVs.
1. Put 50 μL of the OMVs sample into a UVette (Eppendorf), and insert it into the ZetaSizer instrument.
2. Use the ZetaSizer software with automatic measurement duration and three measurements per sample.

Nanoparticles Tracker Analyzer (NTA)

The Nanoparticle Tracker Analyzer (NTA) is capable of measuring the concentration of OMVs directly from the sample. This method is rapid, easy, precise, and reproducible, but the NTA equipment is quite expensive [23, 24].
1. Clean and calibrate the NTA machine with ultrapure H2O.
2. Dilute the sample 10^3-fold with PBS.
3. Perform each measurement in triplicate.
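Because the sample is diluted before measurement, the instrument reading has to be multiplied by the dilution factor to recover the concentration in the undiluted sample. The sketch below averages a triplicate and scales it back; the particle counts are invented, and the function name is illustrative.

```python
from statistics import mean, stdev


def stock_concentration(measured_per_ml, dilution_factor=1000.0):
    """Return (mean, standard deviation) in particles/mL of the undiluted sample.

    measured_per_ml: replicate readings taken on the diluted sample.
    dilution_factor: fold dilution applied before measurement.
    """
    if len(measured_per_ml) < 2:
        raise ValueError("need at least two replicate measurements")
    if dilution_factor <= 0:
        raise ValueError("dilution_factor must be positive")
    return (mean(measured_per_ml) * dilution_factor,
            stdev(measured_per_ml) * dilution_factor)


if __name__ == "__main__":
    readings = [2.0e8, 2.2e8, 2.4e8]  # hypothetical triplicate, particles/mL
    avg, sd = stock_concentration(readings)
    print(f"undiluted sample: {avg:.2e} +/- {sd:.1e} particles/mL")
```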

Bradford Assay

The Bradford assay evaluates the concentration of proteins in the sample. The sample must be purified to correlate the protein concentration with the OMV concentration. Follow the Bradford protein estimation kit protocol [25].

3.3.2 OMVs Observations

Transmission Electron Microscopy (TEM)

Direct observation of OMVs is a good method to confirm sample purity and OMVs integrity and to determine the size of OMVs. Transmission electron microscopy (TEM) is the only tool available to observe OMVs accurately. TEM also reveals the presence of flagella and pili, which makes it an efficient tool for checking sample purity. For TEM observations, OMVs should be negatively stained, for example with 0.1% phosphotungstic acid.


SDS-PAGE Electrophoresis

Protein electrophoresis is used to observe the protein composition of OMVs, to compare the production of OMVs, and to check their purity. The protein sizes of flagella and pili are known and can be identified on an SDS-PAGE gel. Antibodies against flagellar or pilus proteins can be used to confirm the absence of these contaminants in the samples.
1. Mix 15 μL of OMVs sample with 5 μL of 4x LDS sample buffer.
2. Heat samples at 95 °C for 10 min.
3. Load 20 μL of OMVs sample onto a 10% Bis-Tris NUPAGE gel.
4. Run at 130 V for 1 h.
5. Stain the SDS-PAGE gel for 1 h in Imperial Protein Stain.
6. Destain the gel overnight in H2O.
7. Observe the protein profile.

3.4 Extraction of RNAs

sRNAs are an important part of the OMVs cargo and key players in cell-to-cell communication. To identify the whole RNA cargo or to study the encapsulation of specific sRNAs, sRNAs must be extracted from OMVs samples. To distinguish sRNAs from the OMVs cargo from sRNAs located at the membrane surface, incubate the samples in the presence and absence of RNAse. Add 1 μL of RNAse A/T1 (Thermo Scientific) to the samples and incubate at 37 °C for 1 h. Use a 0.5 mL 30 kDa Amicon unit to eliminate the RNAse after incubation. Measure sRNA concentrations and compare data from samples with and without RNAse. RNAs are very sensitive to high temperatures. To avoid RNA degradation, always handle RNA samples on ice (see Note 16).

3.4.1 RNA Extraction Methods

1. TRIzol allows the extraction of small RNAs from low-concentration samples [23]. Use the TRIzol Reagent and follow the User Guide from the “Lysis” to the “Isolate RNA” sections (see Note 17).
2. The RNeasy extraction Kit is efficient on OMVs samples [13]. This kit allows extraction without the use of a chemical hood and provides rapid and reproducible results. Follow the manufacturer’s user guide instructions.
3. RNAzol RT Kit (Sigma-Aldrich) is recommended for the extraction of small RNAs ( 7).

3.4 Polyacrylamide Gel Electrophoresis (PAGE) of the Isolated Small RNA Molecules

1. Prepare 7.5 mL of 15% denaturing acrylamide gel by adding: 0.75 mL 10 × TBE buffer, 3.15 g urea, 2.8 mL 40% acrylamide (19:1 acrylamide:bis-acrylamide), 37.5 μL 10% ammonium persulfate, 7.5 μL TEMED, and DEPC-treated water up to 7.5 mL. Pour the gel into a prepared mold immediately (see Note 5).
2. Mix the RNA sample obtained in Subheading 3.3 with an equal amount of 2 × Gel Loading Buffer II (Invitrogen) containing 95% formamide, 18 mM EDTA, 0.025% SDS, 0.025% xylene cyanol, and 0.025% bromophenol blue. Vortex briefly. If needed, centrifuge briefly to bring the mixture to the bottom of the tube. Heat the sample for 5 min at 95 °C to denature the RNA, then place the tube on ice (see Note 6).
3. Pre-run the gel without samples for 15 min in 1 × TBE at 90 V.


Sylwia Bloch et al.

4. Load the RNA samples from step 2 and two RNA ladders, one appropriate for small RNA molecules (e.g., microRNA Marker from New England BioLabs) and a second one for larger RNA fragments in the range of 100–1000 nt (e.g., RNA Marker 1 RTU from A&A Biotechnology, Poland) (see Note 7).
5. Separate the small RNA molecules electrophoretically at 90 V until the samples have fully entered the gel (~45 min), and then continue at 120 V for another 45 min. Monitor the separation and stop it when the leading dye, bromophenol blue, reaches 2 cm above the bottom of the gel (see Note 8).

3.5 Elution of sRNAs Less Than 50 nt in Length from the Gel Slice

1. To recover the gel slice that contains the fraction of sRNAs less than 50 nt in length, excise a band from the gel starting just below the center of the xylene cyanol band (which migrates at approximately 55 nt) and extending to the bromophenol blue band, which migrates at approximately 12 nt in a 15% denaturing polyacrylamide gel.
2. Place the gel piece in a dialysis membrane with a 3.5 kDa molecular weight cut-off (MWCO) filled with sterile 1 × TBE buffer. Seal both ends of the dialysis bag with clips.
3. Put the dialysis bag in the migration chamber of the apparatus and perform electrophoresis in 1 × TBE buffer at 100 V for 45 min. During this step, the sRNA molecules migrate out of the gel into the dialysis buffer (see Note 9).
4. Recover the eluted sRNA from the dialysis bag and precipitate it according to the procedure described in Subheading 3.3.
5. Quantify and assess the purity of the eluted sRNAs using the Qubit™ microRNA Assay Kit.
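The 15% denaturing gel recipe in Subheading 3.4, step 1 can be scaled to other gel volumes with simple proportions. The sketch below reproduces the stated 7.5 mL recipe; the 7 M urea concentration is inferred from 3.15 g in 7.5 mL rather than stated in the protocol, and the function name is illustrative.

```python
UREA_MW = 60.06  # g/mol


def denaturing_gel_recipe(total_ml, gel_pct=15.0, acrylamide_stock_pct=40.0,
                          urea_molar=7.0, tbe_stock_x=10.0):
    """Component amounts for a urea-polyacrylamide gel in 1x TBE.

    urea_molar=7.0 is an inference from the recipe (3.15 g in 7.5 mL).
    APS and TEMED are scaled from the recipe: 5 uL and 1 uL per mL of gel.
    """
    if total_ml <= 0:
        raise ValueError("total_ml must be positive")
    return {
        "10x TBE (mL)": total_ml / tbe_stock_x,
        "urea (g)": urea_molar * UREA_MW * total_ml / 1000.0,
        "40% acrylamide 19:1 (mL)": total_ml * gel_pct / acrylamide_stock_pct,
        "10% APS (uL)": 5.0 * total_ml,
        "TEMED (uL)": 1.0 * total_ml,
    }


if __name__ == "__main__":
    for component, amount in denaturing_gel_recipe(7.5).items():
        print(f"{component}: {amount:.2f}")
    print("DEPC-treated water: up to the final volume")
```

For 7.5 mL this gives 0.75 mL of 10 × TBE, about 3.15 g of urea, and about 2.81 mL of 40% acrylamide, in line with the amounts listed in the protocol.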

3.6 Library Preparation Using SMARTer smRNA-Seq Kit for Illumina (Clontech Laboratories)

This procedure was carried out according to the manufacturer’s instructions, with the following specifications:
1. Begin with 10 ng of PAGE-purified sRNA obtained in the previous section (see Note 10).
2. Perform the polyadenylation reaction using 1 μL of the ATP solution (provided in the kit) at 16 °C for 5 min. Then cool the sample for 3 min on ice.
3. Follow the protocol from the manual and perform cDNA synthesis.
4. Add full-length Illumina adapters via PCR with the Forward and Reverse primers from the kit, following the program indicated by the manufacturer and using 22 cycles, as determined in the pilot experiment.
5. Purify the cDNA library using the Macherey-Nagel NucleoSpin® Gel and PCR Clean-Up kit as described in the manual, and elute in 30 μL of the provided NE buffer.

Analysis of Phage Regulatory RNAs: Sequencing Library Construction. . .

3.7 Library Validation and Sequencing


1. Quantify and assess the purity of the cDNA library using the Qubit™ dsDNA HS Assay Kit.
2. Dilute the library to ~1.5 ng/μL, and evaluate its size by running the sample on an Agilent 2100 Bioanalyzer using the Agilent High Sensitivity DNA kit. Figure 1a, b includes concentrations and profiles of three cDNA libraries generated from the PAGE-enriched sRNAs according to the above procedure.
3. Proceed directly to Illumina sequencing. Figure 1c shows the read length distributions obtained from sequencing the cDNA libraries presented in Fig. 1b, using a NextSeq 500 Illumina sequencer.
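Two small calculations are often useful at this stage: the volumes needed to dilute the library to ~1.5 ng/μL, and the conversion of a mass concentration into molarity for sequencer loading. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the example concentration (12 ng/μL) and mean fragment size (180 bp) are invented values, not results from this chapter.

```python
def dilution_volumes_ul(stock_ng_per_ul, target_ng_per_ul, final_ul):
    """Return (library_ul, diluent_ul) to reach the target concentration."""
    if not 0 < target_ng_per_ul <= stock_ng_per_ul:
        raise ValueError("target must be positive and not exceed the stock")
    if final_ul <= 0:
        raise ValueError("final_ul must be positive")
    library_ul = final_ul * target_ng_per_ul / stock_ng_per_ul
    return library_ul, final_ul - library_ul


def library_molarity_nm(conc_ng_per_ul, mean_size_bp):
    """Convert ng/uL of dsDNA to nM, assuming 660 g/mol per base pair."""
    if conc_ng_per_ul < 0 or mean_size_bp <= 0:
        raise ValueError("concentration must be >= 0 and size positive")
    return conc_ng_per_ul * 1e6 / (660.0 * mean_size_bp)


if __name__ == "__main__":
    lib_ul, diluent_ul = dilution_volumes_ul(12.0, 1.5, final_ul=10.0)
    print(f"mix {lib_ul:.2f} uL library with {diluent_ul:.2f} uL diluent")
    print(f"1.5 ng/uL at 180 bp is about {library_molarity_nm(1.5, 180):.1f} nM")
```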

4 Notes

1. To isolate a small RNA fraction, follow a two-column purification protocol and keep the first flow-through.
2. The final reaction volume may slightly exceed the 100 μL recommended by the manufacturer.
3. The pellet of DNase Inactivation Reagent detaches easily; it is better to leave a small amount of the supernatant behind to avoid carrying it over.
4. At this step, you should combine at least four separate DNA digestion reactions. Be careful: excess glycogen may interfere with cDNA synthesis, so it is better to use it up to a final concentration of 8 μg/μL.
5. All solutions should be prepared in DEPC-treated water. The urea should be completely dissolved before adding 10% ammonium persulfate and TEMED, as these components catalyze gel polymerization.
6. Do not add ethidium bromide to RNA samples or to the gel; you may add this intercalating dye to the RNA ladder only.
7. If ethidium bromide is not originally part of the ladder sample, you may add it to a final concentration of 50 μg/mL after the heating step. Be careful, as heating some RNA ladders is not recommended.
8. Do not stain the gel in ethidium bromide solution.
9. To obtain high-quality intact RNA molecules from polyacrylamide gels, all solutions should be prepared using water treated with DEPC.
10. If required, you can use an even smaller amount of PAGE-purified sRNA, but not less than the 1 ng recommended by the manufacturer.


Acknowledgments

This work was supported by the National Science Centre (Poland) grants No. 2018/29/B/NZ1/00549 to G.W. and No. 2018/30/E/NZ1/00400 to B.N-F.

References

1. Gottesman S, McCullen CA, Guillier M, Vanderpool CK, Majdalani N, Benhammou J, Thompson KM, FitzGerald PC, Sowa NA, FitzGerald DJ (2006) Small RNA regulators and the bacterial response to stress. Cold Spring Harb Symp Quant Biol 71:1–11. https://doi.org/10.1101/sqb.2006.71.016
2. Djapgne L, Oglesby AG (2021) Impacts of small RNAs and their chaperones on bacterial pathogenicity. Front Cell Infect Microbiol 11:604511. https://doi.org/10.3389/fcimb.2021.604511
3. Michaux C, Verneuil N, Hartke A, Giard JC (2014) Physiological roles of small RNA molecules. Microbiology 160:1007–1019. https://doi.org/10.1099/mic.0.076208-0
4. Richards GR, Vanderpool CK (2011) Molecular call and response: the physiology of bacterial small RNAs. Biochim Biophys Acta 1809:525–531. https://doi.org/10.1016/j.bbagrm.2011.07.013
5. Vanderpool CK, Balasubramanian D, Lloyd CR (2011) Dual-function RNA regulators in bacteria. Biochimie 93:1943–1949. https://doi.org/10.1016/j.biochi.2011.07.016
6. Bak G, Lee J, Suk S, Kim D, Young Lee J, Kim KS, Choi BS, Lee Y (2015) Identification of novel sRNAs involved in biofilm formation, motility, and fimbriae formation in Escherichia coli. Sci Rep 5:15287. https://doi.org/10.1038/srep15287
7. Gottesman S, Storz G (2011) Bacterial small RNA regulators: versatile roles and rapidly evolving variations. Cold Spring Harb Perspect Biol 3:a003798. https://doi.org/10.1101/cshperspect.a003798
8. Tree JJ, Granneman S, McAteer SP, Tollervey D, Gally DL (2014) Identification of bacteriophage-encoded anti-sRNAs in pathogenic Escherichia coli. Mol Cell 55:199–213. https://doi.org/10.1016/j.molcel.2014.05.006
9. Bloch S, Lewandowska N, Węgrzyn G, Nejman-Faleńczyk B (2021) Bacteriophages as sources of small non-coding RNA molecules. Plasmid 113:102527. https://doi.org/10.1016/j.plasmid.2020.102527
10. Lee HJ, Hong SH (2012) Analysis of microRNA-size, small RNAs in Streptococcus mutans by deep sequencing. FEMS Microbiol Lett 326:131–136. https://doi.org/10.1111/j.1574-6968.2011.02441.x
11. Kang SM, Choi JW, Lee Y, Hong SH, Lee HJ (2013) Identification of microRNA-size, small RNAs in Escherichia coli. Curr Microbiol 67:609–613. https://doi.org/10.1007/s00284-013-0411-9
12. Furuse Y, Finethy R, Saka HA, Xet-Mull AM, Sisk DM, Smith KL, Lee S, Coers J, Valdivia RH, Tobin DM, Cullen BR (2014) Search for microRNAs expressed by intracellular bacterial pathogens in infected mammalian cells. PLoS One 9:e106434. https://doi.org/10.1371/journal.pone.0106434
13. Choi JW, Kim SC, Hong SH, Lee HJ (2017) Secretable small RNAs via outer membrane vesicles in periodontal pathogens. J Dent Res 96:458–466. https://doi.org/10.1177/0022034516685071
14. Choi JW, Kwon T, Hong SH, Lee HJ (2016) Isolation and characterization of a microRNA-size secretable small RNA in Streptococcus sanguinis. Cell Biochem Biophys 76:293–301. https://doi.org/10.1007/s12013-016-0770-5
15. Nejman-Faleńczyk B, Bloch S, Licznerska K, Dydecka A, Felczykowska A, Topka G, Węgrzyn A, Węgrzyn G (2015) A small, microRNA-size, ribonucleic acid regulating gene expression and development of Shiga toxin-converting bacteriophage Φ24B. Sci Rep 5:10080. https://doi.org/10.1038/srep10080
16. Desgranges E, Caldelari I, Marzi S, Lalaouna D (2020) Navigation through the twists and turns of RNA sequencing technologies: application to bacterial regulatory RNAs. Biochim Biophys Acta Gene Regul Mech 1863:194506. https://doi.org/10.1016/j.bbagrm.2020
17. Diallo I, Provost P (2020) RNA-sequencing analyses of small bacterial RNAs and their emergence as virulence factors in host-pathogen interactions. Int J Mol Sci 21:1627. https://doi.org/10.3390/ijms21051627

Chapter 4

Discovering Novel Bacterial Small RNA by RNA-seq Analysis Toolkit ANNOgesic

Chin-Hsien Tai, Deborah Hinton, and Sung-Huan Yu

Abstract

ANNOgesic is an RNA-seq analysis pipeline that can detect sRNAs and many other genomic features in bacteria and archaea. In addition to listing sRNA candidates, ANNOgesic also generates various formats of data files for visual examination and downstream experimental design. Based on validations from previous studies, the sRNA predictions are accurate and reliable. In this chapter, we outline the sRNA detection algorithm, important parameters used, step-by-step execution, and data interpretation with a B. pertussis study as an example. Following those procedures, novel sRNA can be revealed by ANNOgesic.

Key words Small RNAs, RNA-seq, Computational predictions, ANNOgesic, Genome annotation

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_4, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

Like a Swiss army knife, ANNOgesic [1] is a bacterial and archaeal RNA-seq analysis pipeline with many functional modules. Taking RNA-seq result counts, along with the reference genome sequence and its annotation, ANNOgesic can detect genes, coding sequences (CDSs), tRNAs, rRNAs, transcription start sites (TSSs), processing sites (PSs), transcripts, terminators, and untranslated regions (UTRs), as well as small RNAs (sRNAs), small open reading frames (sORFs), circular RNAs, CRISPR-related RNAs, riboswitches and RNA-thermometers (Fig. 1a), and RNA-RNA or protein–protein interactions. Within the ANNOgesic subcommands (Fig. 1b), transcript assembly and the identification of sRNAs and sORFs are novel features, while other functions are adapted and improved from third-party tools. ANNOgesic also provides both statistics and visual diagrams for easy result evaluation. In this chapter, we focus on the novel feature of sRNA detection.

There are three types of sRNAs in bacteria: intergenic, antisense, and UTR-derived or processed ones [2]. ANNOgesic can detect the first two by comparing the transcripts, annotation profiles, and the coverage files, as shown in Fig. 2a, b. If a transcript does not overlap with CDSs and the length is within 30–500 nucleotides, it is assigned as an intergenic sRNA or antisense sRNA depending on the strand. The third type of sRNA is derived from transcript processing, which can arise from the 5′UTR, the 3′UTR, or within an operon between two CDSs. When TSS and PS information is provided, ANNOgesic can detect all these features. As shown in Fig. 2c, a 5′UTR-derived sRNA starts with a TSS at the 5′ end and is associated with a PS or a sharp drop of coverage at its 3′ end. Figure 2d shows a 3′UTR-derived sRNA starting at a PS at the 3′ end of a CDS. If the sRNA is located between two CDSs of the same transcript and has PSs at both ends, it is an “InterCDs” sRNA as shown in Fig. 2e.

For a transcript to be considered as a sRNA, ANNOgesic takes several scenarios shown in Fig. 3 into consideration. Parameters, which users can adjust, are listed in Table 1 along with their descriptions, arguments, and default values. In brief, when the transcript expression is more than the coverage cutoff and the length is within the length cutoff plus the length tolerance, the transcript is considered to be a sRNA, as shown in Fig. 3a, d. If an expression drop is more than the decrease cutoff, ANNOgesic treats the sRNA as ending at the drop site instead of considering the whole transcript as sRNA (Fig. 3b). However, ANNOgesic combines the two transcripts if the distance between them is within the transcript tolerance and the total length is within the length cutoff (Fig. 3c).
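The intergenic and antisense criteria described in this section can be sketched in a few lines. This is an illustration of the stated rules only, not ANNOgesic's implementation: the `Feature` type, the function names, and the example coverage cutoff are assumptions, and the strand rule (a CDS overlap on the opposite strand means antisense, no overlap means intergenic) is one reading of the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Feature:
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlaps(a: Feature, b: Feature) -> bool:
    """True if the two features share at least one position."""
    return a.start <= b.end and b.start <= a.end


def classify_transcript(tx: Feature, cdss: List[Feature], coverage: float,
                        min_length: int = 30, max_length: int = 500,
                        length_tolerance: int = 0,
                        coverage_cutoff: float = 10.0) -> Optional[str]:
    """Return "intergenic", "antisense", or None for a candidate transcript.

    coverage_cutoff and length_tolerance are illustrative values only.
    """
    length = tx.end - tx.start + 1
    if coverage <= coverage_cutoff:
        return None  # expression must exceed the coverage cutoff
    if not (min_length <= length <= max_length + length_tolerance):
        return None  # outside the length cutoff plus tolerance
    hits = [cds for cds in cdss if overlaps(tx, cds)]
    if any(cds.strand == tx.strand for cds in hits):
        return None  # overlaps a CDS on its own strand
    return "antisense" if hits else "intergenic"


if __name__ == "__main__":
    cds_list = [Feature(1000, 2000, "+")]
    print(classify_transcript(Feature(2100, 2250, "+"), cds_list, coverage=50))
    print(classify_transcript(Feature(1200, 1350, "-"), cds_list, coverage=50))
```

UTR-derived sRNAs are deliberately left out, since detecting them relies on TSS and PS information that this sketch does not model.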

Fig. 1 Features of ANNOgesic. (a) Cartoon depiction of ANNOgesic. Similar to a Swiss army knife, ANNOgesic has many functionalities. (b) Screenshot of the help command, which lists all the ANNOgesic modules


Fig. 2 ANNOgesic detects the various types of sRNAs in bacteria and archaea. By comparing the genome transcript and annotation, ANNOgesic detects sRNAs between two different CDS transcripts when the intergenic sRNA is on the same strand as the CDS (a) or when the sRNA and CDS are antisense (b). When the TSS and PS information is provided, ANNOgesic detects sRNAs that are derived from transcript processing within the 5′UTR (c), the 3′UTR (d), or between two CDSs of the same transcript (e). Read coverage is shown in blue or green

The workflow of ANNOgesic sRNA prediction is shown in Fig. 4. To detect sRNAs, ANNOgesic only needs three input files: the reference genome sequence in FASTA format, the annotation in GFF3 format, and the coverage file in wiggle format from the RNA-seq alignment. If users have TEX+/- libraries from differential RNA-seq analyses (dRNA-seq [3]), ANNOgesic can first predict TSSs and PSs, and then use that information to detect UTR-derived sRNAs, improving the accuracy of sRNA boundaries. If the sequence identity between the predicted sRNA and a known one present in a sRNA database is greater than the threshold, the known sRNA information is included in the output. Since functional sRNAs usually have a stable secondary structure, the free energy change should be negative. To reduce false positives caused by mis-annotated genes that were mistaken for sRNAs, candidates with homologous hits in the NCBI non-redundant (nr) database [4] above the threshold are removed. In addition, TSS/promoter/terminator/sORF information can be used as filters and is included in the results to help users prioritize which sRNA candidates should be validated first. The output includes a list of sRNAs with their positions, sequences, scores, and secondary structure files. All the arguments for the sRNA module can be found by typing “annogesic srna -h”.

Since the majority of RNA-seq projects do not have TEX+/- libraries, in this chapter we focus on detecting intergenic and antisense sRNAs from conventional RNA-seq results. However, a detailed manual for sRNA detection is available online (https://annogesic.readthedocs.io/en/latest/index.html) for users who have TSS data from a TEX+/- library.

Fig. 3 Diagrams explaining the criteria used in ANNOgesic to detect sRNAs. Transcript coverages and sRNAs are shown in blue and orange, respectively. Cutoffs that users can adjust include the coverage cutoff (height of the red dashed line); the length cutoff (length of the brown arrow); the decrease cutoff (length of the orange double arrows); the transcript tolerance (length of the green double arrows); and the length tolerance (length of the purple double arrows). (a) To be considered a sRNA, the expression of the transcript has to be more than the coverage cutoff and its length must be within the range of the length cutoff. (b) When the coverage within the transcript drops more than the decrease cutoff, ANNOgesic assigns a 3′ end to the sRNA at the drop-off position. (c) When the number of nucleotides between two transcripts is less than the transcript tolerance and the length is less than the length cutoff, ANNOgesic merges the two transcripts into one. If the length is longer than the length cutoff, ANNOgesic searches for a potential terminator within the transcript as shown in (b). (d) If the length of the transcript is longer than the length cutoff but within the length tolerance, the transcript is assigned as an sRNA. Descriptions of the cutoffs and their default values are listed in Table 1

Table 1 Cutoffs for ANNOgesic sRNA detection

Cutoff: Coverage cutoff
Description: For specifying the cutoffs of coverage by different types of sRNAs and libraries; since sRNA detection is based on transcript identification, --height is the coverage cutoff from transcript detection
Arguments (a) and defaults (a):
  --height (from transcript): 10
  --min_intergenic_tex_coverage: 0,0,0,40,20
  --min_intergenic_notex_coverage: 0,0,0,30,10
  --min_intergenic_fragmented_coverage: 400,200,0,50,20
  --min_antisense_tex_coverage: 0,0,0,40,20
  --min_antisense_notex_coverage: 0,0,0,30,10
  --min_antisense_fragmented_coverage: 400,200,0,50,20
  --min_utr_tex_coverage: p_0.8,p_0.6,p_0.7
  --min_utr_notex_coverage: p_0.7,p_0.5,p_0.6
  --min_utr_fragmented_coverage: p_0.7,p_0.5,p_0.6
  --min_all_utr_coverage: 50

Cutoff: Decrease cutoff
Description: Defines the end of sRNAs by detecting a sharp coverage drop
Arguments (a) and defaults (a):
  --decrease_intergenic_antisense: 0.1
  --decrease_utr: 0.05

Cutoff: Length cutoff
Description: Defines the length of sRNAs
Arguments (a) and defaults (a):
  --min_length: 30
  --max_length: 500

Cutoff: Length tolerance
Description: For including the sRNAs whose lengths are slightly longer than --max_length; based on the types of sRNAs, different tolerance lengths are used
Arguments (a) and defaults (a):
  --tolerance_intergenic_antisense: 10
  --tolerance_utr: 10

Cutoff: Poly U
Description: For searching for poly U tails of sRNAs
Arguments (a) and defaults (a):
  --search_poly_u: 15
  --min_u_poly_u: 5
  --mutation_poly_u: 2

Cutoff: Transcript tolerance
Description: For detecting transcripts, a temporary coverage drop needs to be considered. --tolerance is the number of nucleotides with coverage lower than --height, but higher than --tolerance_coverage
Arguments (a) and defaults (a):
  --tolerance (from transcript): 5
  --tolerance_coverage (from transcript): 0

(a) Detailed information can be seen in the sections “srna (sRNA detection)” and “transcript (transcript detection)” of the ANNOgesic documentation (https://annogesic.readthedocs.io/en/latest/subcommands.html)


Fig. 4 Workflow of ANNOgesic sRNA prediction. Required input files colored in blue include reference genome sequence, annotation, and expression coverage from the RNA-seq analysis. Optional input information includes annotation files that indicate TSS, PS, transcript, promoter, and/or terminator information, as well as sRNA or nr databases. ANNOgesic outputs a sRNA table, statistics, annotation, and sequences with their secondary structure figures

2 Installation Guidelines

ANNOgesic source code is available in a git repository (https://github.com/Sung-Huan/ANNOgesic). To provide a seamless installation including non-Python dependencies, a Docker image (https://hub.docker.com/r/silasysh/annogesic), in which all the required tools are embedded, is available for users to run ANNOgesic with Docker or Singularity. The step-by-step installation commands follow.

2.1 Docker

The Docker image contains all the third-party tools ANNOgesic needs, so users do not need to download individual packages. The image can be pulled by typing:

$ docker pull silasysh/annogesic

To examine the content and the version of the images:

$ docker images
REPOSITORY           TAG      IMAGE ID       CREATED       SIZE
silasysh/annogesic   latest   e5c39cdef6be   6 days ago    6.54GB
silasysh/annogesic   1.1.14   5b63a4898424   3 weeks ago   6.41GB

To create a Docker container and synchronize the local directory to the container:

$ docker run -t -i -v \
  $ABSOLUTE_PATH/Examples_Bacterial_Regulatory_RNAs:/home/project \
  silasysh/annogesic bash

In the command above, $ABSOLUTE_PATH needs to be replaced by the correct path of Examples_Bacterial_Regulatory_RNAs on the user’s local machine. This command creates and enters a container, and the input files are synchronized to /home/project in the container. Please check the contents, which should be the same as in the git repository:

$ cd /home/project
$ ls
NC002929_2.fna  NC002929_2.gff  README.md  run.sh  wigs
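The same check can be scripted, for example before launching a long run. The sketch below simply tests for the five entries shown in the listing above; the function name is illustrative, and the script is a convenience example rather than part of ANNOgesic.

```python
from pathlib import Path

# Entries expected in the project folder, taken from the listing above.
EXPECTED = ("NC002929_2.fna", "NC002929_2.gff", "README.md", "run.sh", "wigs")


def missing_inputs(project_dir, expected=EXPECTED):
    """Return the expected entries that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_inputs("/home/project")
    if missing:
        print("Missing from /home/project:", ", ".join(missing))
    else:
        print("All expected input files are present.")
```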

In the Docker container, the command lines described in Subheading Methods can be executed directly. 2.2

Singularity

To install ANNOgesic via the Docker image without root permission, one can build a Singularity image as follows:
$ singularity build annogesic.img docker://silasysh/annogesic:latest
$ ls
annogesic.img

Then the ANNOgesic analysis folder can be created by the following command:
$ singularity exec -B $STORAGE_PATH $IMAGE_PATH/annogesic.img \
annogesic create -pj ANNOgesic
$ ls $STORAGE_PATH
ANNOgesic

Here $STORAGE_PATH is the path of the project folder of ANNOgesic and $IMAGE_PATH is the path where annogesic.img is stored.

2.3 pip3

ANNOgesic can also be installed via pip3 from PyPI, with root permission:
$ pip3 install ANNOgesic
$ pip3 install ANNOgesic --upgrade

or without root permission:
$ pip3 install --user ANNOgesic
$ pip3 install ANNOgesic --user --upgrade

To obtain the ANNOgesic source code from the git repository:
$ git clone https://github.com/Sung-Huan/ANNOgesic.git

However, the third-party tools need to be installed separately. The tools that the ANNOgesic sRNA module needs, and their versions, are listed in Table 2 should users wish to install them independently.

Table 2 The required third-party tools and databases for sRNA detection in ANNOgesic
Feature | Tool/Database | Version | Purpose
sRNA | Vienna RNA package [5] | 2.3.2 | Predict secondary structure of sRNAs
sRNA | Blast+ [6] | 2.6.0+ | Search known sRNAs and potential translated proteins
sRNA | nr database [4] | – | Exclude sRNAs which may be proteins
sRNA | BSRD database [7] | – | Search known sRNAs
TSS or PS | TSSpredator [8] | 1.1 | Detect TSS and PS via dRNA-Seq
Terminator | TransTermHP [9] | 2.09 | Identify Rho-independent terminators
Promoter | MEME [10] | 3.2 | Search motif sequences
sRNA target | IntaRNA [11] | 2.0.4 | Predict sRNA-target interaction
sRNA target | RNAup [5, 12] | 2.3.2 | Predict sRNA-target interaction
sRNA target | RNAplex [13] | 2.3.2 | Predict sRNA-target interaction

3 Methods

3.1 Arguments

To view all the ANNOgesic modules listed in Fig. 1b, users can type "annogesic -h". For each module, users can check the arguments with the command "annogesic $MODULE -h"; the example below is for the srna module (only part of the arguments is shown here).
$ annogesic srna -h
...
basic arguments:
--project_path PROJECT_PATH, -pj PROJECT_PATH
  Path of the project folder.
--transcript_files TRANSCRIPT_FILES [TRANSCRIPT_FILES ...], -a TRANSCRIPT_FILES [TRANSCRIPT_FILES ...]
  Paths of the transcript files.
--fasta_files FASTA_FILES [FASTA_FILES ...], -f FASTA_FILES [FASTA_FILES ...]
  Paths of fasta files of reference genome.
...
additional arguments:
--rnafold_path RNAFOLD_PATH
  Path of RNAfold of the Vienna package
--terminator_files TERMINATOR_FILES [TERMINATOR_FILES ...], -e TERMINATOR_FILES [TERMINATOR_FILES ...]
  Paths of the terminator gff files.
...

ANNOgesic classifies arguments into two types: basic and additional. Basic arguments are either required or may influence the results significantly. Additional arguments are not essential but may increase the accuracy by fine-tuning parameters. For example, if the sequencing depth is low, using default values may result in the detection of very few sRNAs. In this case, decreasing cutoffs such as --height (of the transcript module), --min_intergenic_fragmented_coverage, or --min_antisense_fragmented_coverage may be helpful. Moreover, although additional arguments such as --terminator_files are not mandatory, when they are provided the information is included in the output, so that users may prioritize sRNA candidates associated with terminators for experimental validation.

3.2 Download the Input Files

In this chapter, we use our study of a Bordetella pertussis RNA-seq analysis [14] as an example of how to find novel sRNAs with ANNOgesic. The required input files are the reference genome of the bacterium in FASTA format, the annotation file in GFF3 format, and the coverage information from RNA-seq processing in wiggle format. These files, along with a run.sh script containing the commands shown in this chapter, are all in the Examples_Bacterial_Regulatory_RNAs git repository (https://github.com/Sung-Huan/Examples_Bacterial_Regulatory_RNAs). To download and examine the contents, users can type:
$ git clone https://github.com/Sung-Huan/Examples_Bacterial_Regulatory_RNAs.git
$ ls
Examples_Bacterial_Regulatory_RNAs
$ ls Examples_Bacterial_Regulatory_RNAs/
NC002929_2.fna NC002929_2.gff README.md run.sh wigs
$ ls Examples_Bacterial_Regulatory_RNAs/wigs/
S1_1_forward.wig S1_2_forward.wig S2_1_forward.wig S2_2_forward.wig
S3_1_forward.wig S3_2_forward.wig S4_1_forward.wig S4_2_forward.wig
S1_1_reverse.wig S1_2_reverse.wig S2_1_reverse.wig S2_2_reverse.wig
S3_1_reverse.wig S3_2_reverse.wig S4_1_reverse.wig S4_2_reverse.wig

3.3 Create an ANNOgesic Project Folder

Once ANNOgesic is installed, an ANNOgesic analysis folder must be created before sRNAs can be detected, by typing:
$ annogesic create -pj ANNOgesic
$ ls
ANNOgesic

3.4 Place the Input Files in ANNOgesic Analysis Folders

The required input directories are generated when the ANNOgesic analysis folder is created. The reference genome FASTA file should be put in the input/references/fasta_files folder, the genome annotation GFF3 file in the input/references/annotations directory, and the wiggle files in the input/wigs/fragment directory. If users have wiggle files from dRNA-seq, these can be stored in the input/wigs/tex_notex folder.
$ mv NC002929_2.fna ANNOgesic/input/references/fasta_files
$ mv NC002929_2.gff ANNOgesic/input/references/annotations
$ mv wigs/* ANNOgesic/input/wigs/fragment
$ rm -rf wigs

In order to correlate the samples and conditions correctly, users must specify each wiggle file in the format described below:
$LIBRARY_FILENAME:$LIBRARY_TYPE:$CONDITION:$REPLICATE:$STRAND

$LIBRARY_FILENAME is the name of the wiggle file. $LIBRARY_TYPE is "tex" (TEX+) or "notex" (TEX-) if the samples are treated with TEX, or "frag" if they represent a conventional or fragmented library. $CONDITION is the index of the condition (1, 2, 3, ...), whereas $REPLICATE is the index of the replicate, denoted as a, b, c, ... The last field is the strand of the library, either + for forward or - for reverse. The fields are all separated by colons. To simplify the execution of ANNOgesic modules, the input wig libraries can be defined beforehand as "LIBS" with the following command:
$ LIBS="ANNOgesic/input/wigs/fragment/S1_1_reverse.wig:frag:1:a:- \
ANNOgesic/input/wigs/fragment/S1_1_forward.wig:frag:1:a:+ \
ANNOgesic/input/wigs/fragment/S1_2_reverse.wig:frag:1:b:- \
ANNOgesic/input/wigs/fragment/S1_2_forward.wig:frag:1:b:+ \
ANNOgesic/input/wigs/fragment/S2_1_reverse.wig:frag:2:a:- \
ANNOgesic/input/wigs/fragment/S2_1_forward.wig:frag:2:a:+ \
ANNOgesic/input/wigs/fragment/S2_2_reverse.wig:frag:2:b:- \
ANNOgesic/input/wigs/fragment/S2_2_forward.wig:frag:2:b:+ \
ANNOgesic/input/wigs/fragment/S3_1_reverse.wig:frag:3:a:- \
ANNOgesic/input/wigs/fragment/S3_1_forward.wig:frag:3:a:+ \
ANNOgesic/input/wigs/fragment/S3_2_reverse.wig:frag:3:b:- \
ANNOgesic/input/wigs/fragment/S3_2_forward.wig:frag:3:b:+ \
ANNOgesic/input/wigs/fragment/S4_1_reverse.wig:frag:4:a:- \
ANNOgesic/input/wigs/fragment/S4_1_forward.wig:frag:4:a:+ \
ANNOgesic/input/wigs/fragment/S4_2_reverse.wig:frag:4:b:- \
ANNOgesic/input/wigs/fragment/S4_2_forward.wig:frag:4:b:+"
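The library string can also be generated rather than typed by hand. The following is a minimal sketch, not part of ANNOgesic: the function name build_libs is hypothetical, and it assumes that every file follows the naming used in this example (S<condition>_<replicate>_<forward|reverse>.wig) and that all libraries are of type "frag".

```shell
# Build the colon-separated library string for every .wig file in a directory.
# Assumption: file names look like S<condition>_<replicate>_<forward|reverse>.wig
build_libs() {
  dir=$1
  libs=""
  for wig in "$dir"/*.wig; do
    [ -e "$wig" ] || continue                  # directory without .wig files
    name=$(basename "$wig" .wig)               # e.g. S1_2_forward
    cond=$(printf '%s\n' "$name" | cut -d_ -f1 | tr -d 'S')
    rep=$(printf '%s\n' "$name" | cut -d_ -f2)
    side=$(printf '%s\n' "$name" | cut -d_ -f3)
    # Replicate number 1, 2, 3 ... becomes a, b, c ...
    letter=$(printf 'abcdefghijklmnopqrstuvwxyz\n' | cut -c"$rep")
    case $side in
      forward) strand="+" ;;
      reverse) strand="-" ;;
      *) echo "unexpected file name: $wig" >&2; return 1 ;;
    esac
    libs="$libs${libs:+ }$wig:frag:$cond:$letter:$strand"
  done
  printf '%s\n' "$libs"
}
```

With the files placed as described above, `LIBS=$(build_libs ANNOgesic/input/wigs/fragment)` should give the same set of entries as the manual definition, in alphabetical file order.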

It is important to ensure the consistency of the strain names or the accession numbers across FASTA, GFF3, and wiggle files before executing ANNOgesic. As shown in the commands below, the accession number (NC_002929.2) is the same in all three types of files. Moreover, to distinguish which strand the data represents, the coverage values in the wig file from the forward strand need to be positive, whereas those from the reverse strand need to be negative.
$ head ANNOgesic/input/references/fasta_files/NC002929_2.fna
>NC_002929.2 Bordetella pertussis Tohama I chromosome, complete genome
ATGGATTTTCCCCGCGAATTTGATGTGATCGTCGTTGGTGGCGGTCACGCCGGTACGGAGGCAGCCCTGG
CTGCAGCCCGCGCCGGCGCACAGACATTGCTGCTTACCCACAATATCGAGACCCTGGGCCAAATGTCCTG
CAATCCCTCCATCGGGGGGATAGGCAAGGGTCATTTGGTCAAGGAAGTCGATGCGTTGGGCGGCGCGATG
GCTATCGCCACCGACGAGGCAGGTATCCAATTCCGTATTCTCAACAGCTCCAAGGGGCCAGCGGTACGTG
CCACGCGTGCCCAAGCCGACCGGGTGCTGTACCGAAACGCCATACGTGCACAGCTCGAGAACCAGCCCAA
CCTCTGGCTGTTCCAGCAGGCGGTGGACGATCTGATGGTGCAGGGCGACCAGGTGGTGGGCGCCGTTACG
CAGATCGGGTTGCGCTTTCGTGCCCGTACCGTGGTGCTGACGGCTGGGACCTTCCTCAACGGTTTGATTC
ACGTGGGGCTGCAGAACTATTCCGGAGGGCGGGCAGGGGATCCTCCCGCCAATTCCCTGGGCCAGCGGCT
CAAGGAGCTGCAACTTCCGCAAGGCCGCCTGAAAACTGGCACGCCGCCGCGCATCGACGGACGCAGCATC
$ head ANNOgesic/input/references/annotations/NC002929_2.gff
##gff-version 3
#!gff-spec-version 1.21
#!processor NCBI annotwriter
#!genome-build ASM19571v1
#!genome-build-accession NCBI_Assembly:GCF_000195715.1
##sequence-region NC_002929.2 1 4086189
##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=257313
#NC_002929.2 RefSeq region 1 4086189 . + . ID=id0;Dbxref=taxon:257313;Is_circular=true;Name=ANONYMOUS;gbkey=Src;genome=chromosome;mol_type=genomic DNA;old-name=Bordetella pertussis;strain=Tohama I
NC_002929.2 RefSeq gene 1 1920 . + . ID=gene0;Dbxref=GeneID:2664547;Name=gidA;gbkey=Gene;gene=gidA;gene_biotype=protein_coding;locus_tag=BP0001
NC_002929.2 RefSeq CDS 1 1920 . + 0 ID=cds0;Parent=gene0;Dbxref=Genbank:NP_878920.1,GeneID:2664547;Name=NP_878920.1;Note=GidA%3B glucose-inhibited cell division protein A%3B involved in the 5-carboxymethylaminomethyl modification (mnm(5)s(2)U) of the wobble uridine base in some tRNAs;gbkey=CDS;gene=gidA;product=tRNA uridine 5-carboxymethylaminomethyl modification protein;protein_id=NP_878920.1;transl_table=11
$ head ANNOgesic/input/wigs/fragment/S1_1_forward.wig
track type=wiggle_0 name="S1_1_forward"
variableStep chrom=NC_002929.2 span=1
1 3.0271648823
2 4.86051826171
3 5.20160726254
4 5.20160726254
5 5.20160726254
6 5.20160726254
7 5.62796851356
8 6.82178001644
$ head ANNOgesic/input/wigs/fragment/S1_1_reverse.wig
track type=wiggle_0 name="S1_1_reverse"
variableStep chrom=NC_002929.2 span=1
12 -0.170544500411
13 -0.170544500411
14 -0.170544500411
15 -0.170544500411
16 -0.170544500411
17 -0.170544500411
18 -0.170544500411
19 -0.170544500411
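The two checks described above (matching sequence identifiers and the sign convention of the coverage values) can be automated before running any module. The sketch below is only an illustration and is not part of ANNOgesic; the function names are hypothetical, the GFF3 file is assumed to be tab-separated, and the wiggle files are assumed to use variableStep declarations with two-column data lines as in this example.

```shell
# List the sequence IDs found in each file type, sorted and unique.
fasta_ids() { awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | sort -u; }
gff_ids()   { awk -F'\t' '!/^#/ && NF >= 9 { print $1 }' "$1" | sort -u; }
wig_ids()   {
  awk '{ for (i = 1; i <= NF; i++)
           if ($i ~ /^chrom=/) { sub(/^chrom=/, "", $i); print $i } }' "$1" | sort -u
}

# check_ids FASTA GFF WIG...: succeed only if every file uses the FASTA IDs.
check_ids() {
  fasta=$1; gff=$2; shift 2
  ref=$(fasta_ids "$fasta")
  status=0
  [ "$(gff_ids "$gff")" = "$ref" ] || { echo "ID mismatch: $gff" >&2; status=1; }
  for wig in "$@"; do
    [ "$(wig_ids "$wig")" = "$ref" ] || { echo "ID mismatch: $wig" >&2; status=1; }
  done
  return $status
}

# wig_sign_ok WIG STRAND: forward (+) files must not contain negative values,
# reverse (-) files must not contain positive values.
wig_sign_ok() {
  awk -v s="$2" '
    NF == 2 && $1 ~ /^[0-9]+$/ {
      if ((s == "+" && $2 < 0) || (s == "-" && $2 > 0)) bad = 1
    }
    END { exit bad + 0 }' "$1"
}
```

For example, `check_ids ANNOgesic/input/references/fasta_files/NC002929_2.fna ANNOgesic/input/references/annotations/NC002929_2.gff ANNOgesic/input/wigs/fragment/*.wig` returns a non-zero status and names the offending file if any identifier differs.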

3.5 Detect Transcripts

Since the identification of sRNAs in ANNOgesic is based on the positions of expressed transcripts (Figs. 2 and 3), the transcript detection module must be executed first. The only required inputs are the genome annotation (NC002929_2.gff) and the coverage files (defined as $LIBS in Subheading 3.4). The command for transcript identification is the following:
$ annogesic transcript \
-g ANNOgesic/input/references/annotations/NC002929_2.gff \
-fl $LIBS \
-rf all_1 \
-cf gene CDS \
-pj ANNOgesic

The value "all_1" for -rf means that a transcript is annotated as long as one replicate detects it, and that this cutoff is applied to all experimental conditions. The cutoff (-rf) can also be assigned for specific conditions in the format $CONDITION_$NUMBER_OF_REPLICATE. For example, "1_2 2_2" means that a transcript needs to be detected in two replicates for conditions 1 and 2, while "3_1 4_1" means that at least one replicate is needed for conditions 3 and 4. The index of the conditions is described in Subheading 3.4. In addition, -cf is for comparing transcripts with genomic features. With the above command, a statistics report of the relationship between transcripts and genes or CDSs is produced. To examine the output of the transcript detection, users can type:
$ ls ANNOgesic/output/transcripts/
gffs log.txt statistics tables
$ ls ANNOgesic/output/transcripts/gffs/
NC002929_2_transcript.gff
$ ls ANNOgesic/output/transcripts/statistics/
NC002929_2_length_all.png NC002929_2_length_less_2000.png stat_compare_transcript_genome_NC002929_2.csv
$ ls ANNOgesic/output/transcripts/tables/
NC002929_2_transcript.csv

The outputs include the transcript annotation GFF3 file, the histogram of the transcript length distribution (NC002929_2_length_less_2000.png, shown in Fig. 5a), statistics (stat_compare_transcript_genome_NC002929_2.csv, shown in Fig. 5b), and a table (NC002929_2_transcript.csv) with all detailed information (as shown in Fig. 5c), in which the expression of genes and proteins is listed. If the genome annotations already contain transcripts, users can also use that information to identify sRNAs (see Note 6.1).

3.6 Identification of Factor-Independent Terminators (Optional)

If a terminator is located at or near the end of a sRNA, the 3′-end of this sRNA can be clearly defined. Although the presence of a factor-independent terminator, that is, one that does not depend on the termination factor Rho, is not a mandatory input for sRNA identification, such information in the output allows users to focus on sRNA candidates that are associated with such terminators for experimental validation. To detect factor-independent terminators, the FASTA and GFF3 files of the reference genome, the wiggle files, and the transcript GFF3 file are needed. The command is listed below:
$ annogesic terminator \
-f ANNOgesic/input/references/fasta_files/NC002929_2.fna \
-g ANNOgesic/input/references/annotations/NC002929_2.gff \
-a ANNOgesic/output/transcripts/gffs/NC002929_2_transcript.gff \
-fl $LIBS \
-rf all_1 \
-pj ANNOgesic


Fig. 5 Output of transcript detection: (a) Distribution of the transcript length. (b) Statistics of the relationship between transcripts and user-specified genomic features (-cf). (c) A portion of the NC002929_2_transcript.csv file displayed in Excel


The output files of terminator prediction include tables containing all detailed information, a GFF3 file for viewing, statistics results, and the direct output from TransTermHP [9].
$ ls ANNOgesic/output/terminators/
gffs log.txt statistics tables transtermhp_results

The terminator GFF3 file stored in gffs can be used as the input for sRNA detection. The subfolder best_candidates contains reliable terminators with a sharp coverage decrease at the 3′ end. The other two folders, expressed_candidates and non_expressed_candidates, contain the terminators located either in an expressed region without a significant coverage drop or in a non-expressed region, respectively. The directory all_candidates stores the information of all predicted terminators. Only the GFF3 file from best_candidates should be used for the sRNA identification.
$ ls ANNOgesic/output/terminators/gffs/
all_candidates best_candidates expressed_candidates non_expressed_candidates

3.7 Download sRNA and nr Databases (Optional)

To find novel sRNAs, users need to know which ones are already known. The sRNA database that we used in this example is BSRD [7], a stringent sRNA database curated with experimental validations. If a predicted sRNA has BLAST+ [6] hits in the sRNA database (E-value smaller than --blast_e_srna and score higher than --blast_score_srna), that sRNA candidate will be included in the results. On the other hand, if a candidate can be found in the nr database (E-value ≤ --blast_e_nr and score ≥ --blast_score_nr), this candidate may be a gene, not a sRNA; hence, it will be removed. To search sRNAs in existing sRNA databases or the nr database, users can first download the databases and put them in the input/databases directory:
$ wget -cP ANNOgesic/input/databases/ \
https://raw.githubusercontent.com/Sung-Huan/ANNOgesic/master/database/sRNA_database_BSRD.fa
$ wget -cP ANNOgesic/input/databases/ \
ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz

Then unzip the nr sequence database and rename the file as nr.fa:
$ gunzip ANNOgesic/input/databases/nr.gz
$ mv ANNOgesic/input/databases/nr ANNOgesic/input/databases/nr.fa


The details of the parameters are explained in the next section. Since the nr database is about 140GB, users who choose to run ANNOgesic without an nr database should refer to Note 6.2.

3.8 Detection of sRNA

With the steps in Subheadings 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, and 3.7, the inputs for sRNA detection have been established. Now sRNAs can be detected by ANNOgesic with the following command:
$ annogesic srna \
-d blast_srna sec_str blast_nr \
-g ANNOgesic/input/references/annotations/NC002929_2.gff \
-a ANNOgesic/output/transcripts/gffs/NC002929_2_transcript.gff \
-f ANNOgesic/input/references/fasta_files/NC002929_2.fna \
-e ANNOgesic/output/terminators/gffs/all_candidates/NC002929_2_term.gff \
-m \
-cs \
-sf \
-nf \
-nd ANNOgesic/input/databases/nr \
-sd ANNOgesic/input/databases/sRNA_database_BSRD \
-fl $LIBS \
-rf all_1 \
-pj ANNOgesic

The descriptions and default values of these arguments are listed in Table 3. Depending on the library type (conventional or differential with TEX treatments), the settings of some arguments such as --filter_info/-d need to be adjusted accordingly. The available options include tss, sec_str, blast_nr, blast_srna, sorf, term, promoter, and none. When tss is set, the sRNA has to start with a TSS; sec_str requires the free energy change of the secondary structure, normalized by the length of the sequence, to be smaller than a user-defined cutoff (--cutoff_energy); blast_nr requires the number of homologs found in the nr database to be less than the threshold --cutoff_nr_hit; blast_srna includes any sRNA candidate that has homologs in the sRNA database without considering other filters; sorf, term, and promoter limit the sRNA candidates to those that do not overlap with sORFs, are associated with a terminator, or are associated with a promoter motif, respectively; none means that no filter is applied. Since the example in this chapter came from conventional RNA-seq, filters such as promoter and tss do not apply because they require TEX+/- libraries. Including a terminator GFF3 file with --terminator_files/-e allows ANNOgesic to include terminator information in the output, which helps users prioritize experimental validation by testing terminator-associated sRNA candidates first. However, filtering with terminator information increases false negatives, as sRNAs ending with processing sites will be excluded. Moreover, the number of predicted terminators is often very low (only ~300 terminators are listed in best_candidates from the terminator module). Therefore, filtering with term is not recommended. To manage these filters, the additional parameters listed in Table 4 can be set by users according to their needs.

3.9 Outputs of sRNA Prediction

The output of the sRNA prediction is stored in the sRNAs subdirectory. In our example, five directories and three files including log.txt are generated. To examine them, users can type the following command:
$ ls ANNOgesic/output/sRNAs/
blast_results_and_misc figs gffs log.txt sRNA_2d_NC002929_2 sRNA_seq_NC002929_2 statistics tables

Table 3 The basic arguments for sRNA subcommand
Argument | Description | Default
Basic argument
--filter_info, -d | The filters for removing false positives | tss, sec_str, blast_nr, blast_srna
--annotation_files, -g | The path of annotation GFF3 files | –
--transcript_files, -a | The path of transcript GFF3 files | –
--fasta_files, -f | The path of reference genome FASTA files | –
--frag_libs, -fl | The path of wig files with their experimental setting | –
--replicate_frag, -rf | The minimal number of replicates that a sRNA has to be detected in | all_1. The format and meaning of the setting are mentioned in Subheading 3.5
--project_path, -pj | The path of the analysis folder | –
Additional argument
--terminator_files, -e | The path of terminator GFF3 files | –
--mountain_plot, -m | Generate mountain plot | False. The value becomes True by adding --mountain_plot or -m.
--compute_sec_structures, -cs | Compute RNA secondary structure | False. The value becomes True by adding --compute_sec_structures or -cs.
--srna_format, -sf | Format sRNA database. If the database is formatted before, this argument can be turned off | False. The value becomes True by adding --srna_format or -sf.
--nr_format, -nf | Format nr database. If the database is formatted before, this argument can be turned off | False. The value becomes True by adding --nr_format or -nf.
--nr_database_path, -nd | The path of the nr database | –
--srna_database_path, -sd | The path of the sRNA database | –

Table 4 The arguments for managing filters of sRNA detection
Argument | Description | Default
--cutoff_energy, -ce | The maximum folding energy change (normalized by length of gene) | -0.05
--cutoff_nr_hit, -cn | The maximum hits number in the nr database | 0
--blast_e_nr, -en | The maximum e-value for searching in the nr database | 0.0001
--blast_e_srna, -es | The maximum e-value for searching in the sRNA database | 0.0001
--blast_score_nr, -bn | The minimum score for searching in the nr database | 40
--blast_score_srna, -bs | The minimum score for searching in the sRNA database | 40

3.9.1 sRNA Annotations and Their Scores

The annotations of all sRNA candidates are stored in the gffs directory. They can be loaded into the Integrative Genomics Viewer (IGV) [15, 16] or another genome browser for visual examination. There are three subdirectories under the gffs directory: all_candidates has all the unfiltered sRNA candidates; best_candidates lists the predicted sRNAs that passed all filtering criteria set by the user with --filter_info/-d, plus those that have been identified in a sRNA database even if they do not pass all filtering criteria; for_classes allows users to examine the results of individual filters or of different filter combinations.
$ ls ANNOgesic/output/sRNAs/gffs/
all_candidates best_candidates for_classes
$ ls ANNOgesic/output/sRNAs/gffs/all_candidates/
NC002929_2_sRNA.gff
$ ls ANNOgesic/output/sRNAs/gffs/best_candidates/
NC002929_2_sRNA.gff
$ ls ANNOgesic/output/sRNAs/gffs/for_classes/NC002929_2/
class_1_all.gff
class_1_class_2_class_3_class_6_all.gff
class_1_class_3_class_4_all.gff
class_1_class_4_class_6_all.gff
class_2_class_3_class_4_class_6_all.gff
...

The names of the files in for_classes follow the pattern class_$ID_$GENOME. If a strain has multiple genomes, such as chromosomes and plasmids, the results of each genome will be stored in different directories. If the results are from all genomes, $GENOME will be "all". In our case, the strain has only one accession number. Hence $GENOME is "all". Depending on the filters used in --filter_info/-d, different numbers of classes are generated, and the information is stored in stat_sRNA_class_$GENOME.csv in the statistics folder. In our example, sec_str, blast_nr, and blast_srna are applied and the following six classes are listed:
$ cat ANNOgesic/output/sRNAs/statistics/stat_sRNA_class_NC002929_2.csv
1 – Normalized (by length of sRNA) free energy change of the secondary structure is below to -0.05
2 – sRNA candidates start with TSS (3′UTR derived and interCDS sRNA also includes the sRNA candidates which start with processing site.)
3 – sRNA candidates end with terminator (including the candidates ends with processing site).
4 – Running BLAST cannot find the homology in sRNA database.
5 – Running BLAST can find the homology in sRNA database.
6 – Running BLAST cannot find the homology in nr database (the cutoff is 0).

Since the data in the for_classes directory is classified by all different combinations of the filters, users can examine which sRNAs are detected or filtered out by which criteria. For instance, class_4_all.gff lists all potential novel sRNAs; class_5_all.gff lists known sRNAs that have been previously detected; and class_2_class_3_class_4_class_6_all.gff lists novel sRNAs that start with a TSS, end with a terminator, and have no homologs in the nr database. Since the results of all combinatorial filtering scenarios are provided, users can directly select the criteria relevant to their studies. There is no need to re-run the process for different criteria. The scores and detailed information on sRNAs are in the tables directory. The organization of the subdirectories is the same as in gffs, with all_candidates, best_candidates, and for_classes.
$ ls ANNOgesic/output/sRNAs/tables/
all_candidates best_candidates for_classes
$ ls ANNOgesic/output/sRNAs/tables/all_candidates/
NC002929_2_sRNA.csv
$ ls ANNOgesic/output/sRNAs/tables/best_candidates/
NC002929_2_sRNA.csv
$ ls ANNOgesic/output/sRNAs/tables/for_classes/NC002929_2/
class_1_all.csv
class_1_class_2_class_3_class_6_all.csv
class_1_class_3_class_4_all.csv
class_1_class_4_class_6_all.csv
class_2_class_3_class_4_class_6_all.csv
class_2_class_6_all.csv
...
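Because the for_classes file names are built from the class numbers in ascending order, the file for a given combination of criteria can be derived rather than searched for. The helper below is a small sketch based only on the naming seen in the listings above; the function name class_file is hypothetical and not an ANNOgesic command.

```shell
# class_file GENOME EXTENSION CLASS_ID...
# Prints the for_classes file name for the given combination of class numbers,
# e.g. class_2_class_3_class_4_class_6_all.gff
class_file() {
  genome=$1; ext=$2; shift 2
  name=""
  # Sort the class numbers numerically and drop duplicates.
  for id in $(printf '%s\n' "$@" | sort -n -u); do
    name="${name}class_${id}_"
  done
  printf '%s%s.%s\n' "$name" "$genome" "$ext"
}
```

For instance, `class_file all csv 6 4 1` prints class_1_class_4_class_6_all.csv, which can then be looked up under ANNOgesic/output/sRNAs/tables/for_classes/NC002929_2/.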

ANNOgesic uses the average coverage as the ranking score. When promoter information is provided, sRNAs associated with a promoter are given more weight by multiplying their average coverage score by the user-defined constant --ranking_time_promoter (default is 2). The description of each column in the output table is listed in Table 5, and a part of the table is shown in Fig. 6.

3.9.2 Secondary Structures and Sequences of the sRNA Candidates

Secondary structures of all the sRNA candidates in dot-bracket notation and the sequences in FASTA format are stored in sRNA_2d_NC002929_2 and sRNA_seq_NC002929_2, respectively.
$ cat ANNOgesic/output/sRNAs/sRNA_seq_NC002929_2
>NC_002929.2_srna0|NC_002929.2|62|130|-
GGCCCAGGGTCTCGATATTGTGGGTAAGCAGCAATGTCTGTGCGCCGGCGCGGGCTGCAGCCAGGGCTG
>NC_002929.2_srna1|NC_002929.2|6575|6748|-
GGTACGTTCAAACTTGCCTTTTGCCATGGCTGACTCCTGACCTGGATTGAAGCTTCTGACTTGAAGAAGAACTTAGAATTCGATGGTGCCCATGACGCGGATCGAACGCGTGACCTCTCCCTTACCAAGGGAGTGCTCTACCACTGAGCCACATGGGCAAATCTTGGAGCGGGT
...
$ cat ANNOgesic/output/sRNAs/sRNA_2d_NC002929_2
>NC_002929.2_srna0|NC_002929.2|62|130|-
GGCCCAGGGUCUCGAUAUUGUGGGUAAGCAGCAAUGUCUGUGCGCCGGCGCGGGCUGCAGCCAGGGCUG
(((((..(((((((......))))...(((((.....((((((....))))))))))).))).))))). (-27.30)
(((((..(((,,,{..,{{,|}}}...,|{{||,,||((((((.,..)))}})}}}}}})}}}}}))). [-28.86]
(((((..(((...........................((((((....))))))......))).))))). {-17.10 d=16.88}
frequency of mfe structure in ensemble 0.0792712; ensemble diversity 25.35
>NC_002929.2_srna1|NC_002929.2|6575|6748|-
GGUACGUUCAAACUUGCCUUUUGCCAUGGCUGACUCCUGACCUGGAUUGAAGCUUCUGACUUGAAGAAGAACUUAGAAUUCGAUGGUGCCCAUGACGCGGAUCGAACGCGUGACCUCUCCCUUACCAAGGGAGUGCUCUACCACUGAGCCACAUGGGCAAAUCUUGGAGCGGGU
((((.((....)).))))...........(((.((((......(((((.((((((((.......)))))..)))..)))))(((..(((((((((((((.......))))).....(((((((...))))))).((((.......))))..)))))))).)))..))))))).. (-53.80)
((((.((.)).)})))}}.........,,(({.((((..,,{,{{(((.{({{((((.......)})))}}})}}}}})))||,},|((((((((((((.......))))).....(((((((...))))))).((((.......))))..)))))))}.,,}}}}}}})))., [-56.57]
((((.((....)).))))...........(((.((((......(((((..(((((((.......)))))..))...))))).....(((((((((((((.......))))).....(((((((...))))))).((((.......))))..))))))))......))))))).. {-50.20 d=23.49}
frequency of mfe structure in ensemble 0.0112115; ensemble diversity 33.52
...

Table 5 The columns of the sRNA output table
Column | Description
Rank ID | The rank of the sRNA according to the sRNA score
Genome | The NCBI accession number or strain name of the reference genome
Name | If the sRNA has a hit in the sRNA database, its name is the name of the known sRNA; otherwise, ANNOgesic sorts the candidates by position and names them sRNA_0000, sRNA_0001, sRNA_0002, etc.
Start | The starting coordinate of the sRNA
End | The ending coordinate of the sRNA
Strand | The strand of the sRNA. "+" and "–" mean forward and reverse strand, respectively
Start_with_TSS/Cleavage_site | The coordinates of the associated TSS or the cleavage site if they exist; otherwise, the value is NA
End_with_cleavage | If the sRNA is ended by a cleavage site, the value is the coordinate of that cleavage site; otherwise, the value is NA
Candidates | The coordinate range of the sRNA candidates. Users can easily copy and paste it to the genome browser to examine the expression and the neighboring genes
Lib_type | The library in which the sRNA is detected: TEX+/- or fragmented/conventional
Best_avg_coverage | Among all the libraries, the average coverage of the sRNA in each library is calculated. The best one is listed here
Normalized_secondary_energy_change(by_length) | The folding energy change of the sRNA secondary structure, normalized by length. In general, the lower the value, the more stable the secondary structure is
sRNA_types | Whether the sRNA is intergenic, antisense, 5'UTR, 3'UTR, or intragenic
Conflict_sORF | If the sRNA overlaps with a sORF, the value will be the coordinate of the overlapped sORF; otherwise, the value is NA
nr_hit_number | The hit numbers of the sRNA in the nr database. If no BLAST search against the nr database is run, NA is displayed
sRNA_hit_number | The number of times this sRNA candidate hits the database. If no BLAST search against the sRNA database is run, NA is displayed
nr_hit_top3|ID|e-value|score | The protein name, e-value, and BLAST score of the top 3 hits of the sRNA candidate when BLAST searching against the nr database is executed. If no hit can be found, NA is shown
sRNA_hit|e-value|score | If the sRNA candidate has homologues in the sRNA database, the name of the known sRNA, the e-value, and the score are shown here. Otherwise, the value is NA
Overlap_CDS_forward/Overlap_CDS_reverse | If the sRNA candidate overlaps with other genomic features (mainly CDSs) known in the genomic annotation file, the overlapped features are listed here for forward and reverse strands, respectively. If there is no overlap, NA is displayed
Overlap_nts_forward/Overlap_nts_reverse | If the sRNA overlaps with other genomic features, the percentage of the overlapped region relative to the whole sRNA length is shown here. The values are for the forward and reverse strands, respectively. If there is no overlap, NA is displayed
End_with_terminator | The coordinates of the associated terminators. If there is no associated terminator, NA is shown
Associated_promoter | The name of the promoter. If there is no associated promoter, NA is shown
sRNA_length | The number of nucleotides of the sRNA
Avg_coverage:$LIB_NAME | The average coverage of each library for this sRNA candidate. The number of columns corresponds to the number of libraries. If a library has no read mapped on the sRNA, "Not_detected" is displayed

Fig. 6 Screenshot of a part of the sRNA output table opened in Excel

3.9.3 BLAST Results (for Reference)

ANNOgesic provides the BLAST+ alignment results of sRNA candidates against the nr and sRNA databases in nr_blast_$GENOME.txt and sRNA_blast_$GENOME.txt, respectively. They are stored in the blast_results_and_misc directory for users to examine further. Below is the command, and Fig. 7 shows an example of a predicted sRNA (srna149), which is aligned to a known sRNA bprC in the comprehensive bacterial sRNA database BSRD.
$ ls ANNOgesic/output/sRNAs/blast_results_and_misc/
nr_blast_NC002929_2.txt sRNA_blast_NC002929_2.txt

In addition, the names and the number of known sRNAs in the genome detected by ANNOgesic can be found in the file stat_$GENOME_sRNA_blast.csv in the statistics folder. This acts as a positive control of the program and also allows users to focus on novel sRNAs. Since some sRNAs are only expressed under certain conditions, some sRNAs in BSRD may not be in the ANNOgesic results.
$ cat ANNOgesic/output/sRNAs/statistics/stat_NC002929_2_sRNA_blast.csv
NC_002929.2:
sRNA_name amount
BprD2/BprD1 1
BprF2/BprF1 1
BprC 1
BprB 1
BprK 1
BprN2/BprN1 2


Fig. 7 A portion of the results from a BLAST+ search against BSRD showing the alignment of srna149 and the known sRNA bprB. The E-value and score are 1e-77 and 279, respectively

3.9.4 Plots

ANNOgesic can generate three types of plots: dot plots, mountain plots, and the predicted secondary structures of sRNA candidates. Plots are generated by RNAfold from the ViennaRNA Package [5, 17] and stored in the figs directory.

$ ls ANNOgesic/output/sRNAs/figs/
dot_plots mountain_plots sec_plots

$ ls ANNOgesic/output/sRNAs/figs/dot_plots/NC002929_2/
NC_002929.2_srna0_dp.ps
NC_002929.2_srna128_dp.ps
NC_002929.2_srna156_dp.ps
NC_002929.2_srna184_dp.ps
NC_002929.2_srna211_dp.ps
NC_002929.2_srna23_dp.ps
...

$ ls ANNOgesic/output/sRNAs/figs/mountain_plots/NC002929_2/
NC_002929.2_srna0_mountain.pdf
NC_002929.2_srna135_mountain.pdf
NC_002929.2_srna170_mountain.pdf
NC_002929.2_srna205_mountain.pdf
...

$ ls ANNOgesic/output/sRNAs/figs/sec_plots/NC002929_2/
NC_002929.2_srna0_rss.ps
NC_002929.2_srna131_rss.ps
NC_002929.2_srna163_rss.ps
NC_002929.2_srna195_rss.ps
NC_002929.2_srna226_rss.ps
NC_002929.2_srna258_rss.ps
...

The dot plots and the secondary structure plots are in PostScript format (.ps). To view them, users can convert them into PDF format with Ghostscript, which is included in the Docker image and the Singularity image. Otherwise, users can install Ghostscript and convert the files with the following commands:

$ sudo apt-get install ghostscript
$ sudo ps2pdf -dEPSCrop \
ANNOgesic/output/sRNAs/figs/dot_plots/NC002929_2/NC_002929.2_srna114_dp.ps \
NC_002929.2_srna114_dp.pdf
$ sudo ps2pdf -dEPSCrop \
ANNOgesic/output/sRNAs/figs/sec_plots/NC002929_2/NC_002929.2_srna114_rss.ps \
NC_002929.2_srna114_rss.pdf
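When many candidates need to be inspected, the conversion can be scripted. The following is a minimal sketch, assuming that ps2pdf from Ghostscript is on the PATH and that the -dEPSCrop option is used exactly as in the commands given in this section. The function names build_ps2pdf_commands and convert_all are hypothetical helpers and are not part of ANNOgesic.

```python
from pathlib import Path
import shutil
import subprocess


def build_ps2pdf_commands(ps_dir, pdf_dir):
    """Return one ps2pdf command (as an argument list) per .ps file in ps_dir."""
    ps_dir, pdf_dir = Path(ps_dir), Path(pdf_dir)
    commands = []
    for ps_file in sorted(ps_dir.glob("*.ps")):
        # Only the final suffix is replaced, so NC_002929.2_srna114_dp.ps
        # becomes NC_002929.2_srna114_dp.pdf.
        pdf_file = pdf_dir / (ps_file.stem + ".pdf")
        commands.append(["ps2pdf", "-dEPSCrop", str(ps_file), str(pdf_file)])
    return commands


def convert_all(ps_dir, pdf_dir):
    """Run the conversions; requires Ghostscript (ps2pdf) to be installed."""
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf not found; install Ghostscript first")
    Path(pdf_dir).mkdir(parents=True, exist_ok=True)
    for command in build_ps2pdf_commands(ps_dir, pdf_dir):
        subprocess.run(command, check=True)
```

For example, convert_all("ANNOgesic/output/sRNAs/figs/dot_plots/NC002929_2", "dot_plots_pdf") would write one PDF per dot plot into a new dot_plots_pdf directory (an arbitrary output location chosen for this illustration).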

Dot plots show all the possible base pairings as well as those of the minimum free energy structure. An example for srna114, the S1 sRNA validated in the study by K. Moon et al. [14], is shown in Fig. 8a. The figures in sec_plots provide another way to visualize the secondary structure and the free energy of sRNAs. In Fig. 8b, the positional entropy of each nucleotide in srna114 is represented in different colors, with low entropy in red and high entropy in blue. The stems clearly have lower entropy than the loop regions and hence are more stable.

In addition to the two plots above, ANNOgesic outputs mountain plots [17], which also provide useful information for understanding the secondary structure of an sRNA. Important regions of the sRNA, such as a stem-loop where targets bind, can be visualized in mountain plots. The x axis represents the nucleotide position, whereas the y axis is the number of base pairs enclosing the base at that position. Loops are plateaus, where the number of enclosing base pairs does not change; helices correspond to slopes with an increasing or decreasing number of enclosing base pairs; stem-loops are obtuse peaks composed of at least two slopes and a plateau [17]. The mountain plot of srna114 is shown in Fig. 9. The two stem-loops are around residues 15 and 60, consistent with the results in Fig. 8.
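The mountain representation described above can be computed directly from a dot-bracket structure string. The following minimal sketch follows one common convention, in which the height at each position is the number of base pairs that enclose it. The function name mountain_heights is hypothetical, and the sketch is only an illustration, not the code RNAfold uses to draw its plots.

```python
def mountain_heights(structure):
    """Return, for each position, the number of base pairs enclosing it.

    '(' opens a base pair and ')' closes one; any other symbol is treated
    as an unpaired nucleotide.
    """
    heights = []
    depth = 0  # number of base pairs currently open
    for symbol in structure:
        if symbol == "(":
            heights.append(depth)  # pairs enclosing this opening base
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced structure: extra ')'")
            heights.append(depth)  # pairs enclosing this closing base
        else:
            heights.append(depth)  # unpaired base inside 'depth' pairs
    if depth != 0:
        raise ValueError("unbalanced structure: unclosed '('")
    return heights


# A single hairpin: rising slope (helix), plateau (loop), falling slope.
print(mountain_heights("((..))"))
```

For the hairpin "((..))" the heights rise along the 5' side of the helix, stay constant over the loop, and fall along the 3' side, which is the slope-plateau-slope shape of a stem-loop.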


Fig. 8 Two ANNOgesic output plots representing the secondary structure of a predicted sRNA srna114. (a) dot plot. The x and y axes show the sequence of srna114. The dashed grid lines indicate the position of every tenth nucleotide. The dots show the possible base pairing between two nucleotides. The free energy of the pairing is reflected by the sizes of the dots: the lower the energy, the larger the dots and the more stable the pairing. The diagonal line cuts the plot into two triangles. The upper-right one shows all possible pairings, whereas the bottom-left represents only the base pairs with minimal free energy. It is clear that there are two stem regions in srna114: one is pairing between position 2–12 and 32–22, and the other is between 36–55 and 75–87. (b) the entropy plot and the predicted secondary structure of srna114. The color indicates the entropy of each nucleotide as shown in the color bar


Fig. 9 The mountain plot of srna114. The upper panel shows the probabilities of base pairs (red line) and the minimal free energy (blue line). The bottom panel shows the entropy at each nucleotide position


Fig. 10 Two novel sRNAs S1 and S17 in Bordetella pertussis identified by ANNOgesic. The blue and orange coverage tracks are experimental conditions of wild type without and with MgSO4, respectively. In the bottom annotation track, the brown and green arrows represent ORFs and sRNA, respectively. The right arrow represents the forward strand; the left represents the reverse strand. (a) S1 is an intergenic sRNA located on the reverse strand from 2,643,393 to 2,643,507. (b) S17 is an intergenic sRNA located between BP3151 and BP3152. Its position is on the reverse strand from 3,363,126 to 3,363,053

4 Result Interpretation and Experimental Validation

Previous studies have demonstrated that ANNOgesic can successfully detect sRNAs from Helicobacter pylori, Campylobacter jejuni, Escherichia coli, and Bordetella pertussis RNA-seq data with sensitivity >85% [1, 14]. The predictions were further validated by northern blot or functional assays. Once users have a list of predicted sRNAs, they can compare them with known sRNAs from databases and publications. For novel sRNAs, users can perform northern blot analyses to check for the presence of an sRNA in vivo, as we did in the B. pertussis study [14], and/or perform functional studies. To obtain highly reliable candidates, we used a relatively stringent cutoff of 30 for best_ave_coverage, which resulted in 143 sRNAs among 8 samples. All 10 of the ANNOgesic-predicted sRNAs that we tested were confirmed by northern blot analysis. Homology search also suggested that 9 of these 10 sRNAs are conserved within the Bordetella genus.

The expression of two examples, S1 and S17, reported in our previous study is presented in Fig. 10. S1 is an intergenic sRNA; expression of particular S1 sRNA species increased when the bacteria were grown in media containing MgSO4, a condition that represents the transmission mode for the bacterium [18, 19]. In contrast, S17, a sense sRNA, showed higher expression in the experimental condition without MgSO4, a condition that represents the virulence mode for this pathogen [18, 20]. The northern blot results for these two sRNAs can be seen in Fig. 1 in K. Moon et al. [14]. Furthermore, both of these sRNAs can be found in various members of the Bordetella genus (Fig. S2A and S2H of the study from K. Moon et al. [14]).

5 Troubleshooting Guide

In our user support experience, the most common error is a discrepancy in the sequence ID between files of different formats. Therefore, it is important to ensure that the sequence IDs or strain names are the same among all sequence FASTA files, annotation GFF3 files, alignment BAM files, and coverage wiggle files (refer to Subheading 3.5). Also, ANNOgesic expects the wiggle file for the reverse strand to have negative signs (refer to Subheading 3.5), and we recommend using READemption [21] as the RNA-seq pipeline to generate the wiggle files. The duration of the sRNA run depends on the size of the genome, the number of samples, and the computing power. The troubleshooting guide below lists common scenarios, symptoms, and solutions (Table 6).

Figure 3 and Table 1 contain a list of parameters that users may adjust according to their experimental design. For example, sequencing depth affects the length of the detected transcripts; therefore, the coverage cutoff should be adjusted accordingly. If the sequencing depth is shallow, setting a high coverage cutoff (red dashed line in Fig. 11) will generate numerous short transcripts, reducing the accuracy of the sRNA identification, whereas setting the cutoff too low (brown dashed line in Fig. 11) may introduce false negatives.
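The effect of the coverage cutoff can be illustrated with a toy example. The sketch below is not ANNOgesic's transcript detection algorithm. It only reports contiguous stretches whose coverage reaches a cutoff, and both the function name segments_above_cutoff and the coverage values are hypothetical.

```python
def segments_above_cutoff(coverage, cutoff):
    """Return 1-based, inclusive (start, end) segments with coverage >= cutoff."""
    segments = []
    start = None
    for position, depth in enumerate(coverage, start=1):
        if depth >= cutoff and start is None:
            start = position                          # a segment opens here
        elif depth < cutoff and start is not None:
            segments.append((start, position - 1))    # the segment closed
            start = None
    if start is not None:                             # segment runs to the end
        segments.append((start, len(coverage)))
    return segments


# Hypothetical per-nucleotide coverage: a weakly expressed transcript
# (positions 2-4) next to a strongly expressed one (positions 7-11),
# separated by low background signal.
coverage = [0, 12, 14, 13, 4, 3, 40, 45, 20, 42, 41, 0]

for cutoff in (2, 10, 30):
    print(cutoff, segments_above_cutoff(coverage, cutoff))
```

In this toy data set, a cutoff of 2 merges both transcripts into one segment, a cutoff of 10 separates them, and a cutoff of 30 loses the weakly expressed transcript and splits the strongly expressed one into two short fragments.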

Table 6 ANNOgesic Q&A

Scenario: No module can be found
Error message: Traceback (most recent call last): File "ANNOgesic.py", line 6, in <module> from annogesiclib.controller import Controller ImportError: No module named annogesiclib.controller
Solution: Using Python 3.3 or higher will avoid the missing module error.

Scenario: ANNOgesic was installed but an error occurred during the execution
Error message: Traceback (most recent call last): File "bin/annogesic", line 7, in <module> from annogesiclib.controller import Controller ImportError: No module named 'annogesiclib'
Solution: Generate a symbolic link of annogesiclib in bin by:
$ cd ANNOgesic/bin
$ ln -s ../annogesiclib .

Scenario: There is only one line in the result sRNA GFF3 file
Error message: ##gff-version 3
Solution: Given the current threshold, ANNOgesic did not find any sRNA. Users should adjust the parameters listed in Table 1 according to their samples and conditions.

Fig. 11 Schematic showing how different cutoffs affect the prediction outcome. The expression coverage is in blue. If the coverage cutoff is set too high (red dashed line), the low-expression sRNA1 will be missed (false negative), the CDS2 transcript becomes too short, and sRNA3 is created (false positive). If the coverage cutoff is too low (brown dashed line), transcripts may be merged, missing an sRNA prediction (false negative, sRNA2)

6 Notes

1. Running srna command with known transcripts. Instead of generating transcripts from the RNA-seq data, users can use an existing transcript annotation to detect sRNAs. Use the following command to extract transcript information from a known annotation GFF3 file:

$ awk -F '\t' '$3 == "Transcript"' \
$ANNOTATION_PATH/genome_annotation.gff > $TRANSCRIPT_PATH/transcript.gff

The original GFF3 file is $ANNOTATION_PATH/genome_annotation.gff ($ANNOTATION_PATH is the path of genome_annotation.gff). The GFF3 file containing only transcript information will be saved to transcript.gff ($TRANSCRIPT_PATH should be replaced by the path of the directory where transcript.gff is placed). Afterward, transcript.gff is used as the input file for sRNA detection.

$ annogesic srna \
-d blast_srna sec_str \
-g ANNOgesic/input/references/annotations/NC002929_2.gff \
-a $TRANSCRIPT_PATH/transcript.gff \
-f ANNOgesic/input/references/fasta_files/NC002929_2.fna \
-e ANNOgesic/output/terminators/gffs/all_candidates/NC002929_2_term.gff \
-m \
-cs \
-sf \
-sd ANNOgesic/input/databases/sRNA_database_BSRD \
-fl $LIBS \
-rf all_1 \
-pj ANNOgesic
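For users who prefer a script to the awk one-liner in Note 1, the same filter can be sketched in Python. The function name filter_gff3_by_type and the example records are hypothetical. As with the awk command, only lines whose third tab-separated column equals the requested feature type are kept, so header lines such as ##gff-version 3 are dropped.

```python
def filter_gff3_by_type(lines, feature_type="Transcript"):
    """Keep GFF3 lines whose third tab-separated column equals feature_type."""
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Header and comment lines have fewer than three columns and are skipped.
        if len(fields) >= 3 and fields[2] == feature_type:
            kept.append(line)
    return kept


# Hypothetical GFF3 content, for illustration only.
example = [
    "##gff-version 3\n",
    "NC_002929.2\tANNOgesic\tTranscript\t10\t200\t.\t+\t.\tID=tran0\n",
    "NC_002929.2\tRefSeq\tgene\t10\t150\t.\t+\t.\tID=gene0\n",
]

for record in filter_gff3_by_type(example):
    print(record, end="")
```

The comparison is case-sensitive, as it is in the awk command, so feature_type should match the spelling used in the third column of the annotation file at hand.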

2. Running srna command without downloading NCBI nr database. If downloading or storing the ~140 GB NCBI nr database is an issue, ANNOgesic can still detect sRNAs: omit blast_nr from --filter_info and remove the -nd and -nf options. The command becomes the following:

$ annogesic srna \
-d blast_srna sec_str \
-g ANNOgesic/input/references/annotations/NC002929_2.gff \
-a ANNOgesic/output/transcripts/gffs/NC002929_2_transcript.gff \
-f ANNOgesic/input/references/fasta_files/NC002929_2.fna \
-e ANNOgesic/output/terminators/gffs/all_candidates/NC002929_2_term.gff \
-m \
-cs \
-sf \
-sd ANNOgesic/input/databases/sRNA_database_BSRD \
-fl $LIBS \
-rf all_1 \
-pj ANNOgesic

Acknowledgments

This work was funded by the MOST 110-2222-E-110-008-MY3 (S.-H.Y.); by the intramural program of the National Institutes of Health, the National Cancer Institute (C.-H.T.); and by the intramural program of the National Institute of Diabetes and Digestive and Kidney Diseases (D.M.H.). The work utilized the computational resources of the NIH HPC Biowulf cluster. We declare no conflicts of interest.

References

1. Yu SH, Vogel J, Forstner KU (2018) ANNOgesic: a Swiss army knife for the RNA-seq based annotation of bacterial/archaeal genomes. Gigascience 7(9):giy096
2. Chao Y, Vogel J (2016) A 3' UTR-derived small RNA provides the regulatory noncoding arm of the inner membrane stress response. Mol Cell 61(3):352–363
3. Sharma CM, Vogel J (2014) Differential RNA-seq: the approach behind and the biological insight gained. Curr Opin Microbiol 19:97–105
4. National Library of Medicine (2023) US NCBI, nr database. https://www.ncbi.nlm.nih.gov/gene. Accessed 18 Feb 2023
5. Lorenz R, Bernhart SH, Honer Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL (2011) ViennaRNA Package 2.0. Algorithms Mol Biol 6:26
6. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinform 10:421
7. Li L, Huang D, Cheung MK, Nong W, Huang Q, Kwan HS (2013) BSRD: a repository for bacterial small regulatory RNA. Nucleic Acids Res 41(Database issue):D233–D238
8. Dugar G, Herbig A, Forstner KU, Heidrich N, Reinhardt R, Nieselt K, Sharma CM (2013) High-resolution transcriptome maps reveal strain-specific regulatory features of multiple Campylobacter jejuni isolates. PLoS Genet 9(5):e1003495
9. Kingsford CL, Ayanbule K, Salzberg SL (2007) Rapid, accurate, computational discovery of Rho-independent transcription terminators illuminates their relationship to DNA uptake. Genome Biol 8(2):R22
10. Bailey TL, Elkan C (1994) Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol 2:28–36
11. Mann M, Wright PR, Backofen R (2017) IntaRNA 2.0: enhanced and customizable prediction of RNA-RNA interactions. Nucleic Acids Res 45(W1):W435–W439
12. Muckstein U, Tafer H, Hackermuller J, Bernhart SH, Stadler PF, Hofacker IL (2006) Thermodynamics of RNA-RNA binding. Bioinformatics 22(10):1177–1182
13. Tafer H, Hofacker IL (2008) RNAplex: a fast tool for RNA-RNA interaction search. Bioinformatics 24(22):2657–2663
14. Moon K, Sim M, Tai CH, Yoo K, Merzbacher C, Yu SH, Kim DD, Lee J, Forstner KU, Chen Q, Stibitz S, Knipling LG, Hinton DM (2021) Identification of BvgA-dependent and BvgA-independent small RNAs (sRNAs) in Bordetella pertussis using the prokaryotic sRNA prediction toolkit ANNOgesic. Microbiol Spectr 9(2):e0004421
15. Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP (2011) Integrative genomics viewer. Nat Biotechnol 29(1):24–26
16. Thorvaldsdottir H, Robinson JT, Mesirov JP (2013) Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 14(2):178–192
17. Hofacker IL (2003) Vienna RNA secondary structure server. Nucleic Acids Res 31(13):3429–3431
18. Lacey BW (1960) Antigenic modulation of Bordetella pertussis. J Hyg (Lond) 58(1):57–93
19. Trainor EA, Nicholson TL, Merkel TJ (2015) Bordetella pertussis transmission. Pathog Dis 73(8):ftv068
20. Chen Q, Stibitz S (2019) The BvgASR virulence regulon of Bordetella pertussis. Curr Opin Microbiol 47:74–81
21. Forstner KU, Vogel J, Sharma CM (2014) READemption-a tool for the computational analysis of deep-sequencing-based transcriptome data. Bioinformatics 30(23):3421–3423

Part II sRNA Functions

Chapter 5

Ribosome Profiling Methods Adapted to the Study of RNA-Dependent Translation Regulation in Staphylococcus aureus

Maximilian P. Kohl, Béatrice Chane-Woon-Ming, Roberto Bahena-Ceron, Jose Jaramillo-Ponce, Laura Antoine, Lucas Herrgott, Pascale Romby, and Stefano Marzi

Abstract

Noncoding RNAs, including regulatory RNAs (sRNAs), are instrumental in regulating gene expression in pathogenic bacteria, allowing them to adapt to various stresses encountered in their host environments. Staphylococcus aureus is a well-studied model for RNA-mediated regulation of virulence and pathogenicity, with sRNAs playing significant roles in shaping S. aureus interactions with human and animal hosts. By modulating the translation and/or stability of target mRNAs, sRNAs regulate the synthesis of virulence factors and regulatory proteins required for pathogenesis. Moreover, perturbation of the levels of RNA modifications in two other classes of noncoding RNAs, rRNAs and tRNAs, has been proposed to contribute to stress adaptation. However, the study of how these various factors affect translation regulation has often been restricted to specific genes, using in vivo reporters and/or in vitro translation systems. Genome-wide sequencing approaches offer novel perspectives for studying RNA-dependent regulation. In particular, ribosome profiling methods provide a powerful resource for characterizing the overall landscape of translational regulation, contributing to a better understanding of S. aureus physiopathology. Here, we describe protocols that we have adapted to perform ribosome profiling in S. aureus.

Key words Staphylococcus aureus, Translation regulation, Ribosome profiling, Ribo-seq, Ribo-RET, Translation initiation, Start codons

Maximilian P. Kohl, Béatrice Chane-Woon-Ming and Roberto Bahena-Ceron contributed equally with all other contributors.

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_5, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

Staphylococcus aureus is a major opportunistic human pathogen whose prevalence and difficult-to-treat infections represent a burden to healthcare systems worldwide. While skin and soft tissue infections are most common, bacteremia and infective endocarditis are particularly dangerous [1, 2]. In a recent report, the total number of deaths associated with S. aureus infections was estimated at more than one million in 2019 [3]. The reasons why S. aureus is such a successful pathogen include the high prevalence of antimicrobial resistance (AMR) and the speed at which resistance mechanisms evolve [4]. For instance, after methicillin was first used to combat penicillin resistance in 1959, methicillin resistance was reported only 2 years later [4, 5]. Today, more than 50% of clinical isolates in the United States, India, and China carry methicillin resistance [6], and AMR against other classes of antibiotics is continuously on the rise [7, 8]. Consequently, a more detailed understanding of S. aureus pathogenicity is urgently required to develop new therapeutic strategies.

For successful infection and colonization of the host, S. aureus not only expresses a large arsenal of virulence factors but also employs a plethora of mechanisms allowing rapid adaptation to various stimuli and stresses [9–12]. These responses are often multilayered and comprise intricate regulatory networks. Two-component systems, transcriptional regulatory proteins, and alternative sigma factors are critical for transcriptional control of specific sets of genes [11, 13, 14]. In addition to transcriptional control, S. aureus also utilizes translational control to fine-tune its responses to ever-changing conditions. Typically, this is achieved either through the regulation of ribosome binding to mRNA or through the modulation of translational efficiency. Regulation often takes place at the rate-limiting step of translation [11, 15–17], which is the initiation process, whereby the initiation complex assembles at the ribosomal binding site (RBS) [18–29]. Both cis-acting effectors, such as riboswitches and leader peptides, and trans-acting players, like sRNAs, can affect RBS accessibility [15–17, 30]. In S. aureus, several sRNAs form base pairings with their target mRNAs to inhibit or activate their translation by masking or inducing the opening of their RBS, respectively [10, 31].

More recently, it has been proposed that protein yield and activity can also be affected by modulating the elongation phase of translation [32, 33]. Advances in RNA modification and codon bias analyses in different pathogenic bacteria led to new models for translational control during elongation. They showed how stress-induced tRNA or rRNA modifications can optimize the translation of stress response transcripts that use specific codons (for a review, see Ref. [33]).

To fully understand how S. aureus adapts to host environments and stresses, it is crucial to gain knowledge of its translational regulation landscape. This requires the identification and validation of the expression of all functional open reading frames (ORFs) in controlled laboratory conditions and under stress conditions. Ribosome profiling (Ribo-seq) is a powerful next-generation sequencing technique that enables transcriptome-wide measurement of ribosome occupancy, providing a global view of translation and its regulation in different conditions [34, 35]. To perform Ribo-seq, cells are harvested and lysed in a way that preserves ribosome and polysome integrity. The mRNAs are then digested by one specific nuclease, whereby only the actively translated regions of the mRNAs are protected by the ribosomes. This results in ribosome footprints of specific sizes, which are then purified and analyzed by deep sequencing to provide information on the location of ribosomes at the time of harvesting, offering a high-resolution map of ribosome occupancy on mRNAs.

While eukaryotic Ribo-seq has been continuously optimized and has shown reliable ribosome occupancy at single-codon resolution, its application to bacteria has been surprisingly difficult [36]. A systematically revised method for ribosome profiling in Escherichia coli, however, has recently revealed translational pauses at the single-codon level [37]. While high resolution is not necessarily required to determine ribosome occupancy and translation efficiency, method optimization is crucial for the successful application of Ribo-seq in any bacterial species and requires attention to detail. One of the major drawbacks of bacterial Ribo-seq is the use of antibiotic treatment (e.g., chloramphenicol) to freeze ribosomes on translating mRNAs in order to increase the yield of ribosome-protected fragments. For instance, chloramphenicol treatment can lead to context-specific effects on translation because certain codons or codon combinations may be disproportionately affected, leading to biased results. In addition, chloramphenicol may cause ribosome stalling, premature termination, and nonnatural enrichment of ribosomes at translation initiation sites, which can lead to inaccuracies in the interpretation of the data [37, 38]. Therefore, an ideal Ribo-seq protocol should capture ribosomes in the absence of translation inhibitors to minimize artifacts.

Nevertheless, some inhibitors can be useful to tackle specific questions. The pleuromutilin family antibiotic retapamulin (RET) has recently gained popularity for its highly specific activity, which traps ribosomes exclusively during the initiation step, enriching ribosome densities at start codons [39]. As a result, retapamulin-assisted ribosome profiling (Ribo-RET) contributes tremendously to ORF annotation and reveals unexpected cryptic translation initiation sites [39, 40]. These cryptic sites correspond to unannotated start sites located either in intergenic regions or even within coding sequences. In E. coli, Ribo-RET has led to the detection of more than one hundred such cryptic translation initiation sites [39], suggesting that its use holds great promise in revealing novel aspects of translational control in various bacterial species [40].

Furthermore, Ribo-seq has proven extremely useful to monitor the effect of specific translational regulators or RNA modification enzymes on gene regulation at the post-transcriptional level. For instance, Ribo-seq has been used to characterize the multiple targets of the E. coli sRNA RyhB regulated at the translational level [41]. By monitoring translation efficiency and ribosome occupancy in the presence and absence of RyhB, it was possible to uncover both established and novel target genes as well as modes of action of the sRNA.

In this method section, we provide step-by-step instructions for two different Ribo-seq protocols specifically tailored for the investigation of S. aureus, along with our modified version of the Ribo-RET approach [42] (Fig. 1). The Ribo-seq protocols differ in the steps of cell harvesting and lysis. However, they share the procedures for nuclease digestion, monosome isolation, size selection, library preparation, and the pipeline for the bioinformatic analysis. Although both protocols are suitable for generating informative ribosome profiling data, one has greater potential for improved codon resolution ("flash freezing"), whereas the other provides high yields, more gently preserves polysome integrity, and is easier to use ("ice bath").

2 Materials

2.1 S. aureus Cultures

1. A biosafety level 2 (L2) laboratory for work with pathogenic or infectious organisms posing a moderate risk.
2. BHI (Brain Heart Infusion) liquid growth medium: 7.7 g calf brain infusion from 200 g, 9.8 g beef heart infusion from 250 g, 10 g proteose peptone, 2 g dextrose, 5 g sodium chloride, 2.5 g disodium phosphate. Dissolve 37 g of the powder in 1 liter of distilled water and autoclave. Store at 20 °C.
3. Columbia sheep blood agar plates.
4. BHI-agar plates. Dissolve 37 g of BHI powder and 15 g of agar in 1 liter of distilled water and autoclave. Add the antibiotics as required before pouring the plates. Store plates at 4 °C.
5. 14 mL polystyrene sterile tubes.
6. 500 mL Erlenmeyer flasks.
7. Spectrophotometer and 1 mL plastic cuvettes.
8. Retapamulin (resuspend in absolute ethanol to a final concentration of 25 mg/mL and store at -20 °C).

2.2 Cell Harvesting and Lysis

1. Lysis buffer A (10X): 200 mM Tris-HCl pH 8, 1.5 M MgCl2, 1 M NH4Cl, 50 mM CaCl2, 4% Triton X-100, 1% NP-40, 1000 U/mL DNase I (RNase free).
2. Liquid nitrogen (LN).
3. Lysis buffer B (1X): 20 mM Tris-HCl pH 8, 150 mM MgCl2, 100 mM NH4Cl, 5 mM CaCl2, 0.4% Triton X-100, 0.1% NP-40, 100 U/mL DNase I (RNase free).
4. Lysing Matrix B and Fastprep tubes (MP Biomedicals, see Note 1).
5. Stainless steel beads (3.175 mm diameter).
6. MP Biomedicals Fastprep24 bead beating grinder and lysis system (see Note 1).

Fig. 1 Schematic depiction of the Ribo-seq/Ribo-RET experimental workflow using either the "flash freezing" or "ice bath" protocol for harvesting and lysis

2.3 Preparation of Total RNA

1. 15 mL tubes.
2. FastRNA® Pro Blue kit (MP Biomedicals, see Note 1): RNApro® solution and supplied tubes prefilled with Lysing Matrix B.
3. FastPrep-24® Classical bead beating grinder and lysis system (MP Biomedicals, see Note 1).
4. 2 mL tubes.
5. Refrigerated centrifuge.
6. Chloroform:isoamyl alcohol (19:1 v/v).
7. Sodium acetate 3 M pH 5.2.
8. Ethanol 80% (v/v).
9. Nanodrop spectrophotometer.
10. DNase I buffer 10X: 400 mM Tris-HCl, 100 mM NaCl, 10 mM CaCl2, 60 mM MgCl2, pH 7.9.
11. 10 U/μL DNase I (RNase-free).
12. Phenol:chloroform:isoamyl alcohol 25:24:1, pH 4.5–5.0.
13. TBE buffer: 100 mM Tris, 100 mM boric acid, 2 mM EDTA, pH 8.3.
14. Agilent 2100 Bioanalyzer.

2.4 Monosome/Polysome Enrichment and Buffer Exchange

1. Sucrose cushion: 20 mM Tris-HCl pH 8, 150 mM MgCl2, 0.5 M NH4Cl, 0.5 mM EDTA, 1.1 M sucrose.
2. Resuspension buffer R (1X): 20 mM Tris-HCl pH 8, 100 mM NH4Cl, 15 mM MgCl2, 1 mM DTT.
3. 26.3 mL polycarbonate ultracentrifuge tubes.

2.5 MNase Treatment

1. MNase (10 U/μL).
2. SUPERase-In RNase inhibitor (20 U/μL).

2.6 Polysome Analysis and Monosome Isolation

1. Sucrose gradient buffer G (10X): 200 mM Tris-HCl pH 8, 1 M NH4Cl, 150 mM MgCl2, 20 mM DTT.
2. 60% sucrose solution. Dissolve 60 g of D(+)-saccharose in MilliQ H2O, adjust the total volume to 100 mL, and filter it through a 0.45 μm filter.

3. 5% sucrose solution. To make 40 mL of solution, use 4 mL of sucrose gradient buffer G, 3.33 mL of the 60% sucrose solution, and 32.67 mL of MilliQ H2O.
4. 50% sucrose solution. To make 40 mL of solution, use 4 mL of sucrose gradient buffer G, 33.3 mL of the 60% sucrose solution, and 2.67 mL of MilliQ H2O.
5. Open-top polyclear ultracentrifuge tubes (14 × 89 mm, 12.5 mL, Seton).
6. Gradient maker.
7. Piston gradient fractionator.
8. Fraction collector.

2.7 Hot Phenol RNA Extraction from Monosomes

1. 1.5 mL tubes.
2. Phenol:chloroform:isoamyl alcohol 25:24:1, pH 4.5–5.0.
3. SDS 20% (w/v).
4. Ethanol 80% (v/v).

2.8 Purification of Ribosome Protected Fragments

1. 15% polyacrylamide (bis-acrylamide 1/19), 8 M urea gel (8 × 8 cm, 0.75 mm thick).
2. Urea loading dye (8 M urea, 0.025% xylene cyanol, 0.025% bromophenol blue).
3. RNA elution buffer (1X): 0.5 M NH4Ac pH 6.5, 1 mM EDTA, 0.1% SDS.
4. SYBR™ Green II, concentrated 10,000X in DMSO.
5. Phenol (Roti-phenol).
6. Glycoblue.
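The volumes given for the 5% and 50% sucrose solutions in Subheading 2.6 follow from the dilution relation C1 × V1 = C2 × V2. The minimal sketch below reproduces them and can be reused for other gradient concentrations or volumes. The function name sucrose_mix is hypothetical, and the calculation assumes that volumes are additive and that the percentages are w/v.

```python
def sucrose_mix(final_percent, final_volume_ml, stock_percent=60.0, buffer_fold=10.0):
    """Volumes (mL) of 10X buffer G, 60% sucrose stock, and water for one solution."""
    if not 0 <= final_percent <= stock_percent:
        raise ValueError("final concentration must lie between 0 and the stock")
    buffer_ml = final_volume_ml / buffer_fold                      # 10X buffer to 1X
    sucrose_ml = final_volume_ml * final_percent / stock_percent   # C1*V1 = C2*V2
    water_ml = final_volume_ml - buffer_ml - sucrose_ml
    if water_ml < 0:
        raise ValueError("stock is too dilute to reach this concentration")
    return {"buffer_10x_ml": buffer_ml, "sucrose_60_ml": sucrose_ml, "water_ml": water_ml}


for percent in (5, 50):
    mix = sucrose_mix(percent, 40)
    print(percent, {name: round(volume, 2) for name, volume in mix.items()})
```

For 40 mL, the calculation returns 4 mL of buffer G in both cases, with 3.33 mL of 60% sucrose and 32.67 mL of water for the 5% solution, and 33.33 mL of 60% sucrose and 2.67 mL of water for the 50% solution, in agreement with the volumes listed in Subheading 2.6.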

3 Methods

3.1 S. aureus Cultures (to Be Performed in an L2 Laboratory)

1. Streak the bacteria from a glycerol stock onto a blood agar plate and incubate overnight at 37 °C. For strains that require a specific antibiotic, use BHI-agar plates supplemented with the appropriate antibiotic.
2. To prepare a bacterial culture, pick a single colony from the freshly made plate and inoculate 2 mL of BHI medium (supplemented with antibiotic if required) in a 14 mL polystyrene tube. Prepare one independent culture per biological replicate and grow bacteria overnight at 37 °C under agitation at 180 rpm.
3. Dilute the overnight culture to an optical density (600 nm) of 0.05 in 50 mL of prewarmed BHI medium (supplemented with any required antibiotic) in a 250 mL Erlenmeyer flask (see Subheading 3.2 and Note 2).

4. Grow bacteria at 37 °C under agitation at 180 rpm until the culture reaches the desired growth phase. Monitor its optical density. Optimization of culture conditions for different growth phases should be performed for each strain (see Subheading 3.2 and Note 3).
5. For Ribo-RET experiments, add retapamulin to the culture 5 min before harvest at a final concentration of 12.5 μg/mL. This concentration corresponds to 100X the minimum inhibitory concentration (MIC), which we have estimated for S. aureus HG001, in accordance with previously published tests on hundreds of clinical isolates [43].

3.2 Cell Harvesting and Lysis (to Be Performed in an L2 Laboratory)

The method used for cell harvesting and lysis is critical for obtaining reliable high-resolution ribosome profiling data without artifacts [37, 38]. Following the recommendations and procedures from the Buskirk and Sharma groups [37, 44, 45], we have adapted two protocols for ribosome profiling of S. aureus, one involving flash freezing of whole cultures and the other involving rapid cooling in an ice bath, followed by cell lysis at a high magnesium concentration. The flash freezing protocol may be more suitable for analyzing translational pauses at single-codon resolution, while the ice bath protocol results in higher polysome yields and is easier to perform (Fig. 2). Both methods are suitable for ribosome profiling and are complemented by total RNA extraction for RNA-seq analysis.
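Before harvesting, the culture set-up in Subheading 3.1 (steps 3 and 5) involves two simple dilution calculations, which are sketched below. The function names are hypothetical. The overnight optical density of 8.0 is an arbitrary example value, and the sketch assumes that the optical density scales linearly with dilution and that 50 mL is the final culture volume.

```python
def inoculum_volume_ml(overnight_od, target_od=0.05, final_volume_ml=50.0):
    """Volume of overnight culture needed to start a culture at target_od."""
    if overnight_od <= target_od:
        raise ValueError("overnight culture must be denser than the target OD")
    return target_od * final_volume_ml / overnight_od


def retapamulin_stock_volume_ul(culture_volume_ml=50.0, final_ug_per_ml=12.5,
                                stock_mg_per_ml=25.0):
    """Volume (uL) of retapamulin stock giving the final concentration."""
    stock_ug_per_ul = stock_mg_per_ml  # 1 mg/mL equals 1 ug/uL
    return culture_volume_ml * final_ug_per_ml / stock_ug_per_ul


# Example overnight OD600 of 8.0 (hypothetical value; measure the real one
# on a suitably diluted sample).
print(inoculum_volume_ml(8.0))         # mL of overnight culture for 50 mL
print(retapamulin_stock_volume_ul())   # uL of 25 mg/mL stock for 50 mL
```

With the 25 mg/mL stock listed in Subheading 2.1, 25 μL of retapamulin is needed for a 50 mL culture. If part of the culture has been removed for optical density measurements, the remaining volume should be passed as culture_volume_ml.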

Fig. 2 Sedimentation profiles of samples harvested by either the “flash freezing” or “ice bath” protocol. In both cases, the samples were fractionated on a 5–50% sucrose gradient. For the “ice bath” profile, the window corresponding approximately to the 10–50% section of the gradient is displayed. The detected absorbance at 260 nm (A260) is shown as a function of the gradient density. Spectra for lower molecular weight 50S ribosomal subunits (50S), as well as for higher molecular weight 70S ribosomes (70S) and polysomes, are indicated accordingly

3.2.1 Ice Bath Protocol

1. Prepare an ice bath by filling a plastic container with ice and cold water. The ice bath should be large enough to fully submerge the 50 mL culture volume within the 250 mL flask.
2. Take the flasks directly from the 37 °C incubator into the ice bath and swirl them for 3 min.
3. Using a 25 mL serological pipette, quickly transfer 25 mL of the culture into each of two 50 mL falcon tubes previously placed on ice (see Note 2).
4. Centrifuge the culture at 2100 g for 10 min at 4 °C.
5. Carefully discard the supernatants and remove the excess medium with a pipette.
6. Snap-freeze the pellets in liquid nitrogen and store them at -80 °C until lysis or proceed directly.
7. When proceeding with frozen samples, allow the pellets to thaw slowly on ice for approximately 30 min. In the following steps, strictly keep the samples on ice.
8. Add 550 μL of cold lysis buffer B to the pellets, resuspend, and transfer the solutions to 2 mL fastprep tubes with Lysing Matrix B placed on ice (see Note 1).
9. Break the cells using a mechanical lysis system such as the ‘MP Biomedical Fastprep24’ (see Note 1). Perform one round of fastprep at 6 m/s for 40 s.
10. Centrifuge the samples at 16,000 g for 10 min at 4 °C and carefully retrieve the supernatant (see Note 4).
11. The samples that originate from the same culture flask can be pooled again.
12. The following steps can be performed outside the L2 laboratory (see Subheading 3.4).

3.2.2 Flash-Freezing Protocol

1. To harvest the cells, flash freeze 22 mL of the culture directly in liquid nitrogen. To do this, set up a plastic tray filled halfway with liquid nitrogen under a fume hood. Next, using an electronic pipette controller and a 25 mL serological pipette, slowly add the culture drop by drop to the liquid nitrogen while moving the pipette over the tray in a circular motion to avoid the formation of larger aggregates (see Note 5).
2. Use a metallic spatula to gently crush any small aggregates that are still visible after freezing.
3. After careful removal of excess liquid nitrogen (see Note 6), promptly transfer the frozen cells into a 50 mL falcon graduated tube that is filled with metallic beads to approximately the 5–6 mL mark noted on the tube. Keep the Falcon tube containing the frozen culture on dry ice, and if necessary, immerse the tube in liquid nitrogen to avoid thawing at this stage.

4. Flash freeze 2.4 mL of lysis buffer A directly in liquid nitrogen, crush it into smaller blocks using a metallic spatula, and carefully transfer the frozen buffer to the falcon tube containing the frozen cells.
5. Use a bead-beating grinder such as the ‘MP Biomedical Fastprep24’ system with an adapter for a 50 mL falcon tube in order to pulverize the frozen cultures. Six cycles of bead beating for 40 s at 6 m/s are sufficient to break the cells (see Note 7). The pulverized sample can now be stored at -80 °C or used directly for subsequent steps.
6. If you are ready to proceed, allow the cryo-pulverized sample to thaw on ice for approximately 1 h.
7. Centrifuge the sample at 2100 g for 20 min at 4 °C. Carefully retrieve the supernatant and transfer it to a new 50 mL falcon tube.
8. To clear the lysate, centrifuge the sample at 13,880 g for 20 min at 4 °C, collect the supernatant, and carefully transfer it into a new 50 mL falcon tube.
9. The following steps can be performed outside the L2 laboratory (see Subheading 3.4).

3.3 Preparation of Total RNA for Transcriptomics Analysis

1. From the sample prepared either with the flash-freezing or with the ice-bath method, reserve 5 mL of the culture in a 15 mL tube for transcriptomics analysis.
2. In the L2 laboratory, centrifuge the sample containing the 5 mL of the culture at 2100 g for 10 min at 4 °C, remove the supernatant, snap-freeze the pellet in liquid nitrogen, and store it at -80 °C until performing the RNA extraction procedure.
3. If ready for the next step, let the sample thaw on ice for approximately 10 min.
4. Resuspend the bacterial pellet with 1 mL of RNApro® solution and transfer the suspension into a 2 mL tube containing Lysing Matrix B.
5. Break the cells using a mechanical bead-beating system such as the MP Biomedical FastPrep-24 instrument. Perform 40 s of bead-beating at 6 m/s.
6. Centrifuge at 16,000 g for 10 min at 4 °C to pellet the Lysing Matrix B and cell debris.
7. Transfer the supernatant into a 2 mL tube with particular care to avoid particles from the Lysing Matrix B and cell debris. Incubate for 5 min at 20 °C. The following steps can be performed outside of the L2 laboratory.

8. Add 1/2 volume of chloroform, vortex for 10 s, and incubate at 20 °C for 5 min to allow dissociation of ribonucleoproteins. Centrifuge at 16,000 g for 10 min at 4 °C.
9. Transfer the upper aqueous phase into a new 2 mL tube and add 1/2 volume of chloroform:isoamyl alcohol (19:1 v/v). Vortex for 10 s. Centrifuge at 16,000 g for 10 min at 4 °C.
10. Transfer the upper aqueous phase into a new 2 mL tube. Add 1/10 volume of sodium acetate 3 M (pH 5.2) and 2.5–3.0 volumes of cold absolute ethanol. Invert the tube five times for a gentle mix.
11. Store the sample overnight at -20 °C to allow RNA precipitation.
12. Centrifuge at 16,000 g for at least 30 min at 4 °C to pellet RNAs.
13. Carefully discard the supernatant and wash the pellet with 500 μL of cold 80% ethanol. Centrifuge at 16,000 g for 10 min at 4 °C.
14. Carefully and completely remove the supernatant using a micropipette and air dry the pellet for approximately 20 min at 20 °C until all remaining fluid has evaporated. Avoid excessive drying because the RNA pellet can be difficult to dissolve.
15. Dissolve the pellet in 85 μL of MilliQ H2O and measure the concentration of a 1:10 dilution using the Nanodrop spectrophotometer. Save 1 μg for analysis of the RNA quality on an agarose gel.
16. Add 10 μL of 10X DNase I buffer and 5 μL of 10 U/μL DNase I for a total volume of 100 μL and incubate for 1 h at 20 °C to remove DNA contaminants.
17. Stop the reaction by adding 1 volume of acidic phenol:chloroform:isoamyl alcohol 25:24:1. Vortex for 10 s and centrifuge the sample at 16,000 g for 10 min at 4 °C.
18. Transfer the upper aqueous phase into a 1.5 mL tube and add 1 volume of chloroform. Vortex for 10 s and centrifuge the sample at 16,000 g for 10 min at 4 °C.
19. Transfer the upper aqueous phase into a new tube and add 1/10 volume of sodium acetate 3 M (pH 5.2) and 2.5–3.0 volumes of cold absolute ethanol. Invert the tube five times for a gentle mix. Incubate the sample overnight at -20 °C to allow RNA precipitation.
20. Repeat steps 12–14 once more to recover RNA.
21. Dissolve the pellet in 20–50 μL of MilliQ H2O and measure the concentration of a 1:10 dilution using the Nanodrop spectrophotometer. Save 1 μg for RNA analysis on agarose gel.

22. Analyze the samples saved before and after DNase I digestion by electrophoresis on a 1% agarose gel in TBE buffer.
23. Assess the quality of the RNA sample using an Agilent bioanalyzer.
24. Store the sample at -80 °C until RNA-seq analysis.

3.4 Monosome/Polysome Enrichment and Buffer Exchange

To ensure proper polysome isolation, high concentrations of magnesium are used during the initial steps of cell harvesting and lysis. A sucrose cushion is then used to pellet ribosomes and exchange the buffer for the preparation of ribosome-protected mRNA fragments.

1. To begin the polysome purification, load 3 mL of a 1.1 M sucrose cushion at the bottom of 26.3 mL polycarbonate ultracentrifugation tubes.
2. If using the flash freezing protocol, the full volume of cleared lysate should be loaded on top of the cushion. Be careful to avoid mixing with the cushion.
3. If using the ice bath protocol, add 20 mL of lysis buffer B to the sucrose cushion before loading the samples on top. Again, be careful to avoid any mixing of the sample with the cushion.
4. Perform ultracentrifugation at 370,000 g for 2 h and 49 min at 4 °C in a ‘70 Ti’ rotor. It is important to balance the weight of each tube before running the ultracentrifugation so that the force applied during centrifugation is evenly distributed and does not cause any damage to the rotor or the samples. Use lysis buffer B to adjust the weight of each tube.
5. Carefully remove the supernatants without disturbing the polysome pellets. Add 200 μL of lysis buffer B to each tube and slowly rotate the tubes for gentle washing.
6. Remove the buffer and resuspend the pellets in 200 μL of resuspension buffer R. Use a pipette to disrupt the pellets and to resuspend them, but avoid the formation of foam or air bubbles as much as possible. Place the tubes in an icebox at a tilted angle, add a small stirring magnet, and mix for 15–20 min at a low setting (150–200 rpm).
7. To determine the yield, use 1 μL aliquots for 1:10 dilutions in H2O and measure the absorbance at 260 nm in a Nanodrop spectrophotometer. Typically, a total of 25 Absorbance Units (AU) can be obtained, roughly corresponding to 1 mg RNA.
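The yield estimate in step 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the total AU is taken as the A260 of the undiluted sample (normalized to a 1 cm path length) multiplied by the sample volume in mL, and the conversion of 40 μg RNA per AU is the commonly used approximation that matches the "25 AU ≈ 1 mg" figure given above. The function names are ours.

```python
def total_a260_units(a260_diluted, dilution_factor=10.0, sample_volume_ml=0.2):
    """Total absorbance units (AU) in the resuspended pellet.

    Assumption: AU = A260 of the undiluted sample (1 cm path) x volume in mL.
    Defaults follow the protocol: 1:10 dilution, 200 uL of buffer R.
    """
    return a260_diluted * dilution_factor * sample_volume_ml


def rna_mass_mg(total_au, ug_per_au=40.0):
    """Approximate RNA mass, assuming about 40 ug of RNA per A260 unit."""
    return total_au * ug_per_au / 1000.0


if __name__ == "__main__":
    # Illustrative reading: A260 = 12.5 measured on the 1:10 dilution.
    au = total_a260_units(12.5)
    print(f"{au:.1f} AU, about {rna_mass_mg(au):.2f} mg RNA")
```

Keeping the dilution factor and sample volume as parameters makes it easy to reuse the same calculation when the pellet is resuspended in a different volume.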

3.5 MNase Treatment

The buffer R contains a low magnesium concentration in order to perform the nuclease digestion with MNase to generate ribosome-protected mRNA fragments (see Note 8). To optimize the MNase hydrolysis, calcium ions are added to the samples. After the reaction, EGTA is added to quench the MNase and to stop the digestion.
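The reagent volumes used in the digestion steps of this subheading follow from simple proportionality, which the sketch below makes explicit. The MNase dose (625 U for 25 AU, i.e., 25 U per AU), the enzyme concentration (10 U/μL), the 400 μL reaction volume, and the final CaCl2 (5 mM) and EGTA (6 mM) concentrations are taken from the protocol. The stock concentrations of CaCl2 (100 mM) and EGTA (500 mM), and the choice to fill the remaining volume with buffer, are assumptions made only for illustration and should be replaced by the values used in your laboratory.

```python
def mnase_reaction(au=25.0, units_per_au=25.0, mnase_u_per_ul=10.0,
                   sample_ul=200.0, superase_ul=6.0, total_ul=400.0,
                   cacl2_final_mm=5.0, cacl2_stock_mm=100.0):
    """Return the volumes (uL) needed to assemble one MNase digestion.

    cacl2_stock_mm is a hypothetical stock concentration (assumption).
    """
    mnase_ul = au * units_per_au / mnase_u_per_ul          # 625 U -> 62.5 uL
    cacl2_ul = cacl2_final_mm * total_ul / cacl2_stock_mm  # C1*V1 = C2*V2
    fill_ul = total_ul - (sample_ul + superase_ul + mnase_ul + cacl2_ul)
    if fill_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    return {"mnase_ul": mnase_ul, "cacl2_ul": cacl2_ul, "fill_ul": fill_ul}


def egta_quench_ul(reaction_ul=400.0, final_mm=6.0, stock_mm=500.0):
    """Volume of EGTA stock that gives final_mm after addition.

    Accounts for the volume added: C_stock * v = C_final * (V + v).
    stock_mm is a hypothetical stock concentration (assumption).
    """
    return final_mm * reaction_ul / (stock_mm - final_mm)


if __name__ == "__main__":
    print(mnase_reaction())
    print(f"EGTA: {egta_quench_ul():.2f} uL")
```

Scaling `au` is the main use case: the caption of Fig. 3 reports that lower enzyme-to-RNA ratios can also be sufficient, which corresponds to lowering `units_per_au`.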

1. Add 6 μL of SUPERase-In to the 200 μL of ribosome-enriched fraction (25 AU) in resuspension buffer R to protect the mRNAs from other RNases. Note that SUPERase-In does not inhibit the MNase.
2. To perform the nuclease hydrolysis, add 625 U of MNase (10 U/μL). Supplement the sample with CaCl2 to a final concentration of 5 mM for a total volume of 400 μL. Incubate the samples for 1 h at 25 °C in a thermomixer under agitation (1400 rpm).
3. To stop the digestion, add EGTA to a final concentration of 6 mM and mix by gentle pipetting.
4. After the nuclease treatment, keep the samples strictly on ice until loading them onto a sucrose gradient for monosome isolation.

3.6 Polysome Analysis and Monosome Isolation

The samples are then applied onto a 5–50% sucrose gradient to isolate the 70S monosome peaks generated by the S. aureus Micrococcal Nuclease (MNase) digestion, corresponding to the ribosome-protected mRNA fragments. A successful MNase digestion is characterized by the collapse of polysomes and by an enrichment of monosomes. The same protocol is used to visualize untreated polysomes and to estimate their quality (Fig. 3).

1. Prepare 5% and 50% sucrose solutions containing 1/10th of the final volume of sucrose gradient buffer G.
2. Carefully add approximately 5 mL of the 5% sucrose solution to the ‘open-top polyclear’ ultracentrifuge tubes.
3. To reach the bottom of the tube beneath the 5% sucrose solution, use a 10 mL syringe with a long 10 cm needle to slowly add 5 mL of 50% sucrose solution. Be careful not to disrupt or mix the layers.
4. Adjust the final volume by slowly filling the top of the tubes with 5% sucrose solution, and seal the tubes with long and thin caps, avoiding air bubbles.
5. Prepare the gradient using the gradient maker ‘Gradient Master IP’. Let the gradients settle and cool down for at least 1 h at 4 °C.
6. Carefully layer the MNase-treated samples and the undigested controls on top of the gradients.
7. Adjust the weights with the resuspension buffer R. Ultracentrifuge the gradients at 260,000 g for 2 h and 46 min at 4 °C using a ‘SW 41 Ti’ rotor.
8. Fractionate the gradients using a piston gradient fractionator combined with a fraction collector.

Fig. 3 Polysome sedimentation profiles of S. aureus samples harvested by the “ice bath” protocol under differential treatment conditions are fractionated on a 5–50% sucrose gradient. S. aureus cultures were either harvested directly or treated with the antibiotic Retapamulin (+RET) 5 min prior to harvest by rapid cooling in an ice bath and centrifugation. Following harvest and lysis as well as a sucrose cushion step for buffer exchange, samples were either digested in the presence of micrococcal nuclease S7 (+MNase) or mock treated without adding the enzyme (-MNase). The detected absorbance at 260 nm (A260) is shown for the sedimentation profiles corresponding approximately to the 10–50% fractions of the gradient. Note that both RET and MNase treatment are able to collapse polysomes to 70S monosomes (70S). In the given example, approximately 10 AU of RNA were digested with 500 U MNase, although we have found even lower amounts (e.g., 500 U per 20 AU) to be sufficient

9. Pool the fractions containing the isolated 70S monosome peak and snap-freeze them in liquid nitrogen to store them at -80 °C. Alternatively, proceed directly with the treatment of the samples with hot acid phenol-chloroform for RNA extraction.

3.7 Hot Phenol RNA Extraction from Monosomes

1. Transfer 700 μL of acidic phenol:chloroform:isoamyl alcohol 25:24:1 into 1.5 mL tubes and preheat at 65 °C for approximately 10 min in a thermomixer under a fume hood.
2. If monosome fractions were frozen, let them thaw on ice for 10 min. Add 40 μL of SDS 20% to each 750 μL monosome fraction to a final concentration of 1%.
3. Transfer the samples into the tubes containing the heated phenol:chloroform:isoamyl alcohol. Vortex for 10 s.

4. Incubate for 5 min at 65 °C with vigorous mixing. To prevent caps from opening during this incubation, one may place a pipette-tips box lid over the tubes and tape it on firmly. Vortex the samples occasionally during incubation. Cool down the samples on ice for 5 min and centrifuge at 12,000 g for 5 min at 20 °C.
5. Carefully recover the aqueous phase and transfer it into a new tube. Due to the high concentration of sucrose, the aqueous phase might be on the bottom.
6. Add 1 volume of acidic phenol:chloroform:isoamyl alcohol 25:24:1, vortex for 10 s, and incubate for 5 min with vigorous mixing at 20 °C. Centrifuge at 12,000 g for 5 min at 20 °C.
7. Carefully recover the aqueous phase and transfer it into a new tube. Add 1 volume of chloroform, vortex for 10 s, and centrifuge at 12,000 g for 5 min at 20 °C.
8. Carefully recover the aqueous phase and transfer it into a 2 mL tube. Add 1/10 volume of sodium acetate 3 M (pH 5.2) and 2.5–3.0 volumes of cold absolute ethanol. Invert 5 times to mix and incubate overnight at -20 °C to allow precipitation of RNA. Centrifuge at 16,000 g for at least 30 min at 4 °C to pellet RNA.
9. Discard the supernatant and wash the pellet with 200 μL of cold 80% ethanol. Centrifuge at 16,000 g for 10 min at 4 °C. Completely remove the supernatant using a micropipette and air dry the pellet for approximately 20 min at 20 °C.
10. Dissolve the pellet in 10–20 μL of MilliQ H2O and measure the concentration of a 1/10 dilution using the Nanodrop spectrophotometer.
11. Store the samples at -20 °C or proceed with size selection by gel purification.

3.8 Purification of Ribosome-Protected Fragments

The preparation of ribosome-protected footprints can lead to different mRNA fragment sizes. In eukaryotes, the sources of heterogeneity are well characterized, but less is known in bacteria. The average size distribution of bacterial footprints is broader, and the large heterogeneity is believed to arise not only from technical aspects but also from intrinsic properties of the bacterial ribosome [37, 40, 46, 47]. Although most of the protected fragments are typically 24–27 nucleotides long [36, 37], it is recommended to select a wider range of sizes (e.g., 20–40 nucleotides) to capture all relevant footprints [37, 40]. The mRNA-protected fragments are separated by denaturing polyacrylamide gel electrophoresis (PAGE), excised from the gel, and eluted.

1. Prepare a 15% polyacrylamide-(bis acrylamide 1:19) gel in 8 M urea (8 × 8 cm, 0.75 mm thick) and prerun the gel for 30 min at 100 V in 1X TBE buffer before use.
2. Supplement approximately 30 μg of RNA sample with 1 volume (10 μL) of urea loading dye and denature the samples at 80 °C for 2 min. Then, immediately put them on ice.
3. A mixture of RNA oligonucleotides as size markers (e.g., 20 nt, 28 nt, and 40 nt) is prepared in parallel. Add 2 μL of 20 μM RNA oligo, 3 μL of MilliQ H2O, and 5 μL of urea loading dye, and as above, incubate the sample at 80 °C for 2 min.
4. Wash the wells of the gel with a pipette and load the RNA samples and the size markers. Then, run the gel at 150 V until the bromophenol blue and xylene cyanol indicators are well separated.
5. Knowing that on a 15% polyacrylamide gel in 8 M urea, the xylene cyanol and bromophenol blue migrate approximately as a 29 nt and a 9 nt-long RNA oligonucleotide, respectively, a window around the xylene cyanol band can be conveniently excised from the gel.
6. Alternatively, the gel may be stained using 5 μL of SYBR-Green in 50 mL TBE solution for 20 min. Then, place the gel in between two thin layers of plastic foil next to an image of the gel, printed at its actual size, to indicate the excision window with a marker.
7. Excise the desired size range using a sterile scalpel, cut the gel slices into small pieces, and transfer them to 0.5 mL tubes, which have been pierced at the bottom multiple times with a 20-gauge needle.
8. Place the 0.5 mL tubes in 2 mL tubes and centrifuge at 20,000 g for 5 min or until the gel has fully extruded into the recipient tubes.
9. Add 300 μL of RNA elution buffer, 2.5 μL of SUPERase-In as well as 70 μL of phenol to the crushed gel. Incubate the samples overnight at 4 °C under agitation.
10. Pellet the gel by centrifugation at 16,000 g at 4 °C for 15 min and transfer the supernatant to fresh 1.5 mL tubes. Add 1 volume of phenol to the samples, vortex for 1 min, and centrifuge at 12,000 g for 5 min.
11. Carefully recover the aqueous (upper) phase. Add 1 volume of chloroform/isoamyl alcohol (19:1 vol/vol). Vortex for 1 min, centrifuge to separate the phases once more, and recover the aqueous (upper) phase.

12. Add 1/10 volume of sodium acetate 3 M (pH 5.2), 3 μL of Glycoblue, and 2.5–3.0 volumes of cold absolute ethanol. Invert five times to mix and incubate overnight at -20 °C to allow precipitation of RNA.
13. Centrifuge at 16,000 g for at least 30 min at 4 °C. Discard the supernatant and carefully wash the pellets with 100 μL of 80% ethanol.
14. Centrifuge again at 16,000 g for 15 min at 4 °C; carefully discard the supernatant and air dry the pellets for 20 min at 20 °C.
15. Resuspend the pellets in 10 μL MilliQ H2O.
16. Check the quality of the RNA on an Agilent bioanalyzer. The samples are ready for library preparation if adequate quality is achieved.

3.9 Library Preparation

After MNase cleavage, RNA fragments contain a 3′-phosphate (3′-P) and a 5′-hydroxyl (5′-OH). To allow the ligation of adapters, the 3′-P must be dephosphorylated and the 5′-OH phosphorylated before proceeding with the adapter ligation. These reactions are performed using Antarctic Phosphatase (AnP) and T4 polynucleotide kinase (PNK), respectively. These steps are typically performed by the sequencing platform before the use of commercial kits for RNA library preparation via adapter ligation protocols.

3.10 Data Analysis

Ribosome profiling libraries obtained with or without retapamulin treatment (Ribo-RET and Ribo-seq, respectively) and their corresponding total RNA (RNA-seq) are typically sequenced on an Illumina NGS instrument using a single-end run of 50 cycles. The resulting demultiplexed sequencing data in FASTQ format are then processed and analyzed with published or custom Python and R scripts (Fig. 4). The initial stages of the analysis (steps 1–5) rely on the iPython notebooks provided by Mohammad in 2018 [37, 48].
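The trimming and alignment parameters given in steps 1 and 2 of this subheading can be collected in one place so that every library is processed identically. The following sketch only assembles the command lines as argument lists and does not run the tools. The option values are those stated in the text; the file names, the index prefix, and the function names are hypothetical placeholders, and additional options (for example for output naming) may be needed in practice.

```python
import shlex

ADAPTER = "AGATCGGAAGAGCACACGTCT"

# (minimum, maximum) read length kept after trimming, per library type.
LENGTH_WINDOWS = {"ribo": (10, 40), "rna": (35, 50)}


def skewer_command(fastq, library_type):
    """Skewer call with the trimming parameters listed in step 1."""
    min_len, max_len = LENGTH_WINDOWS[library_type]
    return ["skewer", "-x", ADAPTER, "-Q", "10",
            "-l", str(min_len), "-L", str(max_len), fastq]


def bowtie_command(index_prefix, reads):
    """Bowtie (v1) call with the alignment parameters listed in steps 1-2."""
    return ["bowtie", "-v", "2", "-y", "-m", "1", "-a", "--best", "--strata",
            index_prefix, reads]


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    print(shlex.join(skewer_command("riboret_rep1.fastq", "ribo")))
    print(shlex.join(bowtie_command("HG001_index", "trimmed.fastq")))
```

Building the commands as lists rather than strings avoids quoting problems if they are later passed to `subprocess.run`.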

Fig. 4 Workflow for S. aureus Ribo-seq, Ribo-RET, and RNA-seq data analysis. See text for details

1. Preprocessing. Raw reads are filtered and trimmed using Skewer v.0.2.2 [49], with the following parameters: -x AGATCGGAAGAGCACACGTCT -Q 10 and -l 10 -L 40 for Ribo-RET and Ribo-seq data (see Note 9) or -l 35 -L 50 for RNA-seq data (with -x: adapter sequence; -Q: lowest mean quality value (phred-score) allowed before trimming; -l: minimum read length allowed after trimming; -L: maximum read length allowed after trimming). Bowtie v.1.2.2 [50] is then applied to discard any contaminating reads mapping to tRNA or rRNA sequences, using the following parameters: -v 2 -y -m 1 -a --best --strata (with -v 2: report alignments with at most two mismatches; -y: try as hard as possible to find valid alignments when they exist; -m 1: suppress all alignments for a particular read if more than one reportable alignment exists for it; -a --best --strata: report only alignments in the best alignment “stratum,” i.e., those having the lowest number of mismatches).
2. Genome mapping. The remaining reads are uniquely mapped to the S. aureus HG001 reference genome [51], allowing up to two mismatches with Bowtie v.1.2.2 using the same parameters as above.
3. Quality checks. At this stage, the overall quality of the library is evaluated by examining the read allocation following filtering and bowtie alignments, as well as the size distribution and composition of the final reads that have been mapped to the S. aureus HG001 genome (see Note 10).
4. Ribosome density. In ribosome profiling experiments conducted with MNase, the raw ribosome density and read density in coupled RNA-seq data are assigned to the 3′ end of reads. Previous research has established that the 3′ end of footprints offers the most accurate measure of occupancy in bacteria [52]. From the raw densities, normalized densities are calculated for each strand and genomic position in reads per million mapped reads (RPM) and are provided for each library in a wiggle track format as two separate files, *_plus.wig and *_minus.wig. These wig files can be converted into bigWig files, which are in an indexed binary format, using the UCSC utility wigToBigWig. This facilitates their display in a genome browser such as IGV [53] or our custom Shiny application (see Subheading 3.11). The command line is: wigToBigWig input_[minus|plus].wig HG001.chrom.sizes output_[minus|plus].bw.
5. Metagene analyses. Average gene analysis, frame analysis, and asymmetry analysis are performed to evaluate whether the Ribo-seq and Ribo-RET data accurately reflect translation elongation and initiation events. These analyses respectively show the distribution of ribosome-protected footprints (RPFs) across the entire translatome near the start and stop codons, determine the level of occupancy at each reading frame, and establish the degree of dropoff/runoff along CDSs (referred to as the asymmetry score). The average gene analysis is also helpful to estimate the position of the ribosome P site relative to the 3′ end of the mapped footprints for initiating ribosomes. This “shift” value is crucial for analyzing translational initiation sites (TISs; see Subheading 3.10, step 7). To accurately estimate the frame occupancy and asymmetry score for Ribo-seq, codons near the ends of each coding sequence are excluded (specifically, 27 nt downstream of the start codon and 12 nt upstream of the stop codon).
6. Estimation of Translation Efficiencies (TE) and differential TE analysis. Raw read counts are computed from previous raw densities (as described in Subheading 3.10, step 4) for each coding sequence (CDS) of interest in both Ribo-seq and matched RNA-seq libraries.
To remove the effects of translation initiation and termination, reads mapped to the first 10 and last 4 codons of the ORFs are excluded. The resulting RPFs and mRNA counts per gene are combined into two count matrices, ribo_counts.txt and rna_counts.txt, respectively. In addition, a complementary tab-separated sample information file, sample_info.txt, is prepared, which includes information on each sample’s condition (untreated or treated), sequencing type (Ribo- or RNA-seq), and optional batch or known covariate. The deltaTE method developed by Chothani et al. [54] is then applied to identify translationally regulated genes between conditions. This integrative analysis, implemented in R scripts [55], utilizes a DESeq2 [56] generalized linear model with three components: the condition (c), sequencing type (s), and an interaction term combining both (c:s). This approach calculates a deltaTE fold change and associated false discovery rate (FDR) to quantify the extent of translational regulation between conditions. Using this method, differentially transcribed genes (DTGs) and differential translation efficiency genes (DTEGs) can be identified and classified into distinct regulation classes (translationally forwarded, exclusive, buffered, or intensified) by computing and combining changes in RPFs, mRNA counts, and TE for every gene of interest between two conditions. The command line for the basic protocol is: Rscript DTEG.R ./ribo_counts.txt ./rna_counts.txt ./sample_info.txt [0|1]. The last parameter refers to the presence (1) or absence (0) of a potential covariate or batch effect, which needs to be taken into account in the DESeq2 statistical model.
7. Analysis of Translation Initiation Sites (TISs). To identify new alternative and/or regulatory ORFs and computationally map all translational initiation sites (TISs) in S. aureus HG001, we have developed a custom Python script. This script is based on two previously published methods that were used to study E. coli’s proteome from retapamulin-assisted ribosome profiling data (Weaver et al. [57]; Meydan et al. [39]). Our algorithm first scans both strands of the S. aureus HG001 genome in search of putative ORFs that start with ATG, GTG, or TTG codons and end with TAA, TAG, or TGA codons. By default, we include ORFs with at least one sense codon to ensure a comprehensive list of predictions. Then, each ORF is compared to the annotations of S. aureus HG001 and tagged as “inter” or “intra” genic, depending on its overlap with any known genomic feature(s). A supplementary classification system has been developed to further characterize overlaps with annotated coding sequences (CDSs). This system includes seven categories (with ‘N’: N-terminal end, ‘C’: C-terminal end, ‘id’: identical, ‘trunc’: truncated, and ‘ext’: extended):
   1. “NidCid” indicates the same start and stop codons as for the annotated feature (defining the primary translation initiation site, or pTIS);
   2. “NextCid” indicates an in-frame upstream start codon and an identical stop codon (N-terminal extension);
   3. “NextCtrunc” indicates out-of-frame upstream start and stop codons (e.g., an upstream initiation site defining an ORF that partially overlaps with an annotated one);
   4. “NextCext” indicates out-of-frame upstream start codon and downstream stop codon (embedding the known CDS in the new ORF);
   5. “NtruncCid” indicates an in-frame downstream start codon and an identical stop codon (in-frame internal translation initiation site, or “in-frame iTIS”);
   6. “NtruncCtrunc” indicates out-of-frame downstream start codon and upstream stop codon (out-of-frame internal translation initiation site, or “out-of-frame iTIS”);
   7. “NtruncCext” indicates out-of-frame downstream start and stop codons (e.g., an internal initiation site defining an ORF that partially overlaps with an annotated one).
   Furthermore, any ORF found within 100 nucleotides upstream of a CDS start codon is flagged as a putative upstream ORF (uORF) for the corresponding CDS. Several additional criteria are evaluated to reduce the list of interesting new alternative and/or regulatory ORFs. The presence of any of the following Shine-Dalgarno (SD) sequences: ‘GGGGG,’ ‘GGGG,’ ‘GGAGG,’ ‘GAGG,’ ‘GGAG,’ and ‘AGGA’ is first checked in a window spanning from -16 to 4 nts upstream of the detected start codons. The translation initiation site of any predicted ORF can be assessed by examining the ribosome occupancy profiles observed in retapamulin-treated samples. A 3′ density peak of stalled ribosome footprints is expected around 15 nt downstream of the first base of the start codon [57]. The height of this peak (RPM density) is considered an indicator of active translation initiation events, along with its observed distance from the P-site codon. However, since the peak distance can vary under different experimental conditions, the area of putative initiation peaks within an 18 nt window downstream of the first base of the start codon is calculated for each candidate ORF to aid in identifying active TISs. Lastly, to further confirm bona fide ORFs in active elongation state, both normalized expression levels (RPKM) and the percentage of sequence covered by ribosome footprints are calculated for each candidate ORF using ribosome densities observed in control Ribo-seq libraries from the previously generated wig files. All the aforementioned findings are finally compiled into a unified comma-separated file named “ORF_predictions.csv” for further investigation.

3.11 Visualization and Data Interpretation with Ribo-RET Prediction Explorer

When searching for yet unannotated bona fide ORFs and initiation sites, computational predictions alone are insufficient, and candidates need to be individually evaluated. This laborious process, known as sequence gazing, can be greatly facilitated by software designed for data visualization and inspection. To improve the exploration and visualization of all predicted ORFs in the genomic context of S. aureus HG001 (sequence and annotation), we have developed a dedicated Shiny application, which we named “Ribo-RET Prediction Explorer” (Fig. 5).
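The candidates inspected in the application originate from the ORF scan described in Subheading 3.10, step 7. To make the logic of that scan concrete, the sketch below enumerates every start-to-stop ORF on both strands of a sequence. It is not our production script but a minimal illustration under stated assumptions: the sequence is treated as linear, the start codon is counted as the first sense codon, every in-frame start upstream of a stop yields its own candidate, and coordinates are 0-based, half-open, and reported on the forward strand. All function names are illustrative.

```python
STARTS = {"ATG", "GTG", "TTG"}
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_strand(seq, min_sense_codons=1):
    """Yield (start, end) of ORFs on one strand; end includes the stop codon."""
    for frame in range(3):
        open_starts = []  # in-frame starts seen since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOPS:
                for s in open_starts:
                    if (i - s) // 3 >= min_sense_codons:
                        yield s, i + 3
                open_starts = []
            elif codon in STARTS:
                open_starts.append(i)


def find_orfs(genome, min_sense_codons=1):
    """Return sorted (start, end, strand) tuples in forward-strand coordinates."""
    genome = genome.upper()
    n = len(genome)
    orfs = [(s, e, "+") for s, e in scan_strand(genome, min_sense_codons)]
    for s, e in scan_strand(revcomp(genome), min_sense_codons):
        orfs.append((n - e, n - s, "-"))  # map back to the forward strand
    return sorted(orfs)


if __name__ == "__main__":
    for orf in find_orfs("CCATGGTGAAATAGCC"):
        print(orf)
```

Reporting one candidate per start codon, rather than only the longest ORF per stop, is what allows in-frame internal and extended initiation sites to be evaluated against the Ribo-RET signal afterwards.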

Fig. 5 S. aureus ribosome profiling data visualization and translation initiation site inspection within our custom Shiny application. An example gene locus (gehA) is shown to demonstrate how RET treatment (+) affects the profiling data and enables the identification of start sites. The 3′ ends of mapped reads are displayed and their normalized intensity is auto-scaled. It is worth noting the significant shift in ribosome occupancy from within the ORF toward the initiation site upon RET treatment
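The tracks displayed in Fig. 5 are 3′-end densities normalized in RPM, as described in Subheading 3.10, step 4. The sketch below shows one way such tracks could be derived from a list of uniquely mapped reads and written as variableStep wiggle lines. It is an illustration rather than the code used for the figure: the alignment tuples (0-based, half-open start and end plus strand), the function names, and the chromosome name are assumptions, and reading alignments from a Bowtie output file is left out.

```python
from collections import Counter


def three_prime_rpm(alignments):
    """Return (plus, minus) dicts mapping 0-based position -> RPM.

    Each alignment is (start, end, strand) with 0-based, half-open coordinates.
    The 3' end is end - 1 on the plus strand and start on the minus strand.
    """
    plus, minus = Counter(), Counter()
    for start, end, strand in alignments:
        if strand == "+":
            plus[end - 1] += 1
        else:
            minus[start] += 1
    total = len(alignments)
    scale = 1e6 / total if total else 0.0
    return ({p: c * scale for p, c in plus.items()},
            {p: c * scale for p, c in minus.items()})


def wig_lines(density, chrom):
    """Format one strand as variableStep wiggle lines (1-based positions)."""
    lines = [f"variableStep chrom={chrom}"]
    for pos in sorted(density):
        lines.append(f"{pos + 1} {density[pos]:.3f}")
    return lines


if __name__ == "__main__":
    reads = [(0, 25, "+"), (5, 30, "+"), (5, 30, "+"), (100, 128, "-")]
    plus, minus = three_prime_rpm(reads)
    print("\n".join(wig_lines(plus, "HG001")))
```

Normalizing by the total number of mapped reads on both strands keeps the plus and minus tracks of a library on the same scale, which matters when treated and untreated samples are compared side by side.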

To analyze ORF predictions with Ribo-RET Prediction Explorer, users should launch it in RStudio and select the previously generated comma-separated file (see Subheading 3.10, step 7). The application then displays a readily customizable table corresponding to the underlying data frame of the ORF predictions, which includes predicted start codons, nucleotide and protein sequences, and assigned ribosome densities. The table can be sorted by individual variables, such as RPM values, from highest to lowest or vice versa. Users can also use the search bar to directly access specific entries, including genomic features of interest that may overlap a predicted ORF. To display a predicted ORF in its genomic context along with the corresponding normalized ribosome densities, users select the candidate from the table and upload the bigWig files of the samples of interest, which contain the RPM values assigned to the 3′ ends of the reads for each strand and genomic position (see Subheading 3.10, step 4).

Several indicators may be useful when evaluating selected candidates through sequence gazing. In Ribo-RET experiments, the characteristic read pattern that emerges from the RET treatment is the most important predictor of a bona fide start site. While untreated Ribo-seq samples display broad and diverse read distributions throughout ORFs, RET-treated samples are expected to show enriched ribosome occupancy specifically at start codons. A pronounced shift in the read pattern from within putative ORFs toward potential initiation sites is generally a strong indication of their authenticity. Another excellent predictor when evaluating a selected candidate is the presence of an SD sequence upstream of the putative start codon. Although a significant proportion of genes in E. coli lack an SD sequence, SD-led initiation appears to play an important role in S. aureus, and the vast majority of newly discovered initiation sites are expected to harbor an upstream SD motif. Our algorithm scans for potential SD sequences and indicates their presence accordingly (see Subheading 3.10, step 7). In S. aureus, the average distance between the consensus SD sequence and the start codon is slightly longer than in E. coli, where the SD motif is typically separated from the start codon by a 5–7 nt spacer. Spacers that are 4–10 nt long give a good indication of initiation sites, although one cannot exclude that longer spacers adopt a structure that brings the SD sequence and the start codon closer together.

Another criterion for validating a suitable start codon (AUG, GUG, or UUG) is the distance in nucleotides between the 3′ end of the reads and the start codon. Knowledge about the length and nucleotide composition of RPFs generated during the nuclease digestion (see Subheading 3.10, step 3) can offer further insights. We expected that fragments fully protected by intact 70S ribosomes at initiation sites would have 3′ ends that map approximately 16 nucleotides downstream of putative start codons, while enrichment of G nucleotides at their 5′ ends would indicate the presence of the SD sequence. Depending on the length of the spacer, these reads should be approximately 25–35 nucleotides long. However, we have observed examples where the 3′ end of reads maps within the ribosomal P-site, leading to shorter reads of between 15 and 22 nt. This is most likely due to MNase activity, which damages the integrity of the 70S ribosome. As a consequence, the read position cannot always validate the start site. However, it is highly unlikely that the 3′ ends of reads without any putative start codon within a ~20 nt window upstream of their position correspond to initiating ribosome footprints. In contrast, reads whose distance to a putative start codon matches the expected full protection by the ribosome (approximately 14–16 nt) or cuts in the ribosomal P-site (0–2 nt), as well as broad distributions in between, are good candidates.
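The distance and spacer criteria described above can be written down as a small rule set. The following Python sketch is an illustration only and is not part of the Ribo-RET Prediction Explorer: the function names and category labels are hypothetical, and the cut-offs are the approximate ranges quoted in the text, which should be treated as guidelines rather than hard thresholds.

```python
def classify_three_prime_distance(distance_nt):
    """Classify the distance (nt) between a putative start codon and a read 3' end.

    distance_nt is counted downstream of the start codon. The ranges are the
    approximate values quoted in the text (assumed cut-offs, not hard limits).
    """
    if distance_nt < 0 or distance_nt > 20:
        return "unlikely"          # no start codon within ~20 nt upstream of the 3' end
    if distance_nt <= 2:
        return "P-site cut"        # MNase cleavage within the ribosomal P-site
    if distance_nt < 14:
        return "intermediate"      # broad distributions between the two extremes
    if distance_nt <= 16:
        return "full protection"   # footprint of an intact 70S ribosome
    return "within window"         # 17-20 nt: inside the window, beyond full protection


def spacer_supports_initiation(spacer_nt, min_nt=4, max_nt=10):
    """Return True if the SD-to-start-codon spacer lies in the 4-10 nt range."""
    return min_nt <= spacer_nt <= max_nt
```

Such a helper could be applied to each candidate before manual inspection, but it does not replace sequence gazing: the RET-dependent shift in the read pattern remains the primary indicator.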

4 Notes

1. We have developed protocols for S. aureus cell lysis and RNA extraction using the FastPrep-24 bead-beating grinder (MP Biomedicals). Although other bead mill homogenizers can be employed, this may necessitate various adaptations and the development of specific parameters, reagents, and lysing matrices.

2. One advantage of the “ice bath” protocol is its scalability, whereby larger yields can be easily obtained. We found that pelleting 25 mL of culture is optimal for subsequent resuspension and lysis in 2 mL FastPrep tubes. When scaling up, we suggest splitting larger culture volumes into multiple aliquots of this size for cell pelleting and lysis. The samples can be pooled again during later steps of the protocol.

3. The timing of cell harvesting may vary depending on the biological question and the chosen protocol for harvest and lysis. For flash-freezing cultures, the culture volume can be a limiting factor, and it can be laborious to freeze cultures larger than 22 mL. We have therefore found that harvesting at a late point in the growth phase (e.g., OD 600 nm = 3.5–4) is necessary to obtain sufficient material. In contrast, for the ice bath protocol, earlier time points can be used since the culture volume can be readily scaled up.

4. At this point, the Lysing Matrix B in the tubes is quickly resuspended into the supernatant. This can impair purity and yield, especially when handling multiple samples at once. To ensure proper separation and good yields, retrieve the clear supernatants after centrifugation, then respin for another 2–3 min and retrieve the rest.

5. To prevent frozen drops from aggregating in liquid nitrogen, it is important to form them very slowly with the electronic pipette. Additionally, the drops may stick to the walls of the plastic tray, so it is necessary to detach them carefully before removing the liquid nitrogen.

6. When harvesting by flash freezing, be very careful not to transfer any residual liquid nitrogen into the 50 mL Falcon tubes. If liquid nitrogen becomes entrapped in a closed Falcon tube, the tube may explode.

7. To prevent overheating of the system and adapters during multiple rounds of bead beating, which would thaw the sample, it is important to add small blocks of dry ice to the designated spaces in the adapters and to take breaks between individual cycles. Between each cycle, vigorously shake the tubes to redistribute the pulverized culture and beads, briefly reimmerse them in liquid nitrogen, and replace the dry ice in the adapters.

8. When using micrococcal nuclease S7 (MNase), it is important to exercise caution during the analysis step. MNase has a nucleotide sequence bias because it cleaves more efficiently 5′ of A or U than 5′ of G or C in mRNAs [58]. This can lead to apparent periodicity in MNase-generated datasets that may actually be an artifact of the sequence-specific nuclease digestion [37, 40]. As a result, the resolution of MNase-generated datasets at the single-nucleotide level is not ideal, although flash freezing can improve single-codon resolution [37]. The nucleotide sequence bias is not as pronounced in S. aureus Ribo-seq datasets due to the lower genomic GC content (~30%) compared to E. coli (~50%). Caution is also needed when using large amounts of MNase (e.g., 1000 U per 20 AU of RNA) because this can lead to significant mRNA cleavage in the ribosomal P-site, resulting in an increased prevalence of shorter mRNA-protected fragments and overall greater heterogeneity in footprint sizes. To mitigate this effect, a milder digest (e.g., using ~500 U per 20 AU of RNA), more stringent experimental size selection, and filtering during the analysis step can be applied.

9. We have observed that the distribution of read sizes can be strongly affected by variations in cell harvesting protocols and by the balance of ribosomes between the elongation and initiation phases. Under conditions that favor the isolation of initiating ribosomes (such as Ribo-RET experiments), we have observed a bimodal distribution centered at 20 and 25 nucleotides. This particular distribution of read lengths was prominent when ribosomes were isolated using protocols that include a sucrose cushion step. We believe that the sucrose cushion, while purifying ribosomes, also removes tRNAs from the A-site and the initiator tRNA from the P-site, retaining only peptidyl-tRNAs in the P-site during elongation. This distinct A- and P-site occupancy, in turn, might influence nuclease digestion. In recent experiments, we observed that lowering the magnesium concentration during cell lysis to 50 mM facilitated nuclease compatibility, thus alleviating the need for the sucrose cushion step. With this protocol, the bimodal distribution shifted to peaks at 27 and 35 nucleotides. Metagene and read composition analyses for this new protocol suggested that shorter reads may originate from residual elongating ribosomes, while longer reads mainly result from initiating ribosomes.

10. To ensure dependable S. aureus Ribo-RET and Ribo-seq evaluations, a minimum of five million reads should align to the chromosome at this stage. The level of polysomes in Ribo-seq experiments (without retapamulin) may differ depending on the bacterial strain and growth conditions, which could affect the number of RPFs. This factor should be taken into account when determining the sequencing depth.
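Because Note 8 states the MNase amounts per 20 AU of RNA and Note 10 gives a minimum read count, both checks reduce to simple arithmetic. The sketch below is a hedged illustration: the function names are hypothetical, the default of 500 U per 20 AU is the milder digest mentioned in Note 8, and linear scaling of the dose with the RNA amount is an assumption.

```python
def mnase_units(rna_au, units_per_20_au=500.0):
    """Scale the MNase dose linearly with the amount of RNA (in AU).

    Assumes a linear relationship; 500 U per 20 AU is the milder digest
    from Note 8, 1000 U per 20 AU the stronger one.
    """
    if rna_au < 0:
        raise ValueError("RNA amount must be non-negative")
    return rna_au * units_per_20_au / 20.0


def enough_reads(chromosome_aligned_reads, minimum=5_000_000):
    """Check the minimum number of chromosome-aligned reads from Note 10."""
    return chromosome_aligned_reads >= minimum
```

For example, `mnase_units(30)` corresponds to the milder digest for 30 AU of RNA, and `mnase_units(30, 1000)` to the stronger one.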


Acknowledgments

We thank all the team members, Allen Buskirk, Cynthia Sharma and Alexander Mankin for helpful discussions. This work was supported by the Centre National de la Recherche Scientifique (CNRS) and by the French National Research Agency ANR (ANR-21-CE12-0030-01 to [S. M.]). This work of the Interdisciplinary Thematic Institute IMCBio, as part of the ITI 2021-2028 program of the University of Strasbourg, CNRS, and Inserm, was supported by IdEx Unistra (ANR-10-IDEX-0002) and by SFRI-STRAT’US project (ANR 20-SFRI-0012) and EUR IMCBio (ANR-17-EURE-0023) under the framework of the French Investments for the Future Program.

References

1. Cheung GYC, Bae JS, Otto M (2021) Pathogenicity and virulence of Staphylococcus aureus. Virulence 12(1):547–569
2. Tong SYC, Davis JS, Eichenberger E, Holland TL, Fowler VG (2015) Staphylococcus aureus infections: epidemiology, pathophysiology, clinical manifestations, and management. Clin Microbiol Rev 28(3):603–661
3. Ikuta KS, Swetschinski LR, Robles Aguilar G, Sharara F, Mestrovic T, Gray AP et al (2022) Global mortality associated with 33 bacterial pathogens in 2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet 400(10369):2221–2248
4. Stryjewski ME, Corey GR (2014) Methicillin-resistant Staphylococcus aureus: an evolving pathogen. Clin Infect Dis 58(suppl_1):S10–S19
5. Jevons MP (1961) “Celbenin” - resistant Staphylococci. Br Med J 1(5219):124–125
6. Murray CJ, Ikuta KS, Sharara F, Swetschinski L, Robles Aguilar G, Gray A et al (2022) Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis. Lancet 399(10325):629–655
7. Howden BP, Davies JK, Johnson PDR, Stinear TP, Grayson ML (2010) Reduced vancomycin susceptibility in Staphylococcus aureus, including vancomycin-intermediate and heterogeneous vancomycin-intermediate strains: resistance mechanisms, laboratory detection, and clinical implications. Clin Microbiol Rev 23(1):99–139
8. Wu Q, Sabokroo N, Wang Y, Hashemian M, Karamollahi S, Kouhsari E (2021) Systematic review and meta-analysis of the epidemiology of vancomycin-resistance Staphylococcus aureus isolates. Antimicrob Resist Infect Control 10(1):101
9. Felden B, Vandenesch F, Bouloc P, Romby P (2011) The Staphylococcus aureus RNome and its commitment to virulence. PLoS Pathog 7(3):e1002006
10. Bronesky D, Wu Z, Marzi S, Walter P, Geissmann T, Moreau K et al (2016) Staphylococcus aureus RNAIII and its regulon link quorum sensing, stress responses, metabolic adaptation, and regulation of virulence gene expression. Annu Rev Microbiol 70(1):299–316
11. Desgranges E, Marzi S, Moreau K, Romby P, Caldelari I (2019) Noncoding RNA. Microbiol Spectr 7:2
12. Barrientos L, Mercier N, Lalaouna D, Caldelari I (2021) Assembling the current pieces: the puzzle of RNA-mediated regulation in Staphylococcus aureus. Front Microbiol 12:706690
13. Bleul L, Francois P, Wolz C (2021) Two-component systems of S. aureus: signaling and sensing mechanisms. Genes 13(1):34
14. Chan PF, Foster SJ, Ingham E, Clements MO (1998) The Staphylococcus aureus alternative sigma factor sigmaB controls the environmental stress response but not starvation survival or pathogenicity in a mouse abscess model. J Bacteriol 180(23):6082–6089
15. Geissmann T, Marzi S, Romby P (2009) The role of mRNA structure in translational control in bacteria. RNA Biol 6(2):153–160
16. Marzi S, Fechter P, Chevalier C, Romby P, Geissmann T (2008) RNA switches regulate initiation of translation in bacteria. Biol Chem 389(5):585–598
17. Duval M, Simonetti A, Caldelari I, Marzi S (2015) Multiple ways to regulate translation initiation in bacteria: mechanisms, regulatory circuits, dynamics. Biochimie 114:18–29
18. Shine J, Dalgarno L (1974) The 3′-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc Natl Acad Sci 71(4):1342–1346
19. Steitz JA, Jakes K (1975) How ribosomes select initiator regions in mRNA: base pair formation between the 3′ terminus of 16S rRNA and the mRNA during initiation of protein synthesis in Escherichia coli. Proc Natl Acad Sci 72(12):4734–4738
20. Hui A, De Boer HA (1987) Specialized ribosome system: preferential translation of a single mRNA species by a subpopulation of mutated ribosomes in Escherichia coli. Proc Natl Acad Sci 84(14):4762–4766
21. Stormo GD, Schneider TD, Gold LM (1982) Characterization of translational initiation sites in E. coli. Nucleic Acids Res 10(9):2971–2996
22. Kozak M (1999) Initiation of translation in prokaryotes and eukaryotes. Gene 234(2):187–208
23. Gualerzi CO, Pon CL (2015) Initiation of mRNA translation in bacteria: structural and dynamic aspects. Cell Mol Life Sci 72(22):4341–4367
24. Milón P, Rodnina MV (2012) Kinetic control of translation initiation in bacteria. Crit Rev Biochem Mol Biol 47(4):334–348
25. Studer SM, Joseph S (2006) Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol Cell 22(1):105–115
26. Kudla G, Murray AW, Tollervey D, Plotkin JB (2009) Coding-sequence determinants of gene expression in Escherichia coli. Science 324(5924):255–258
27. Espah Borujeni A, Channarasappa AS, Salis HM (2014) Translation rate is controlled by coupled trade-offs between site accessibility, selective RNA unfolding and sliding at upstream standby sites. Nucleic Acids Res 42(4):2646–2659
28. Espah Borujeni A, Cetnar D, Farasat I, Smith A, Lundgren N, Salis HM (2017) Precise quantification of translation inhibition by mRNA structures that overlap with the ribosomal footprint in N-terminal coding sequences. Nucleic Acids Res 45(9):5437–5448
29. Evfratov SA, Osterman IA, Komarova ES, Pogorelskaya AM, Rubtsova MP, Zatsepin TS et al (2016) Application of sorting and next generation sequencing to study 5′-UTR influence on translation efficiency in Escherichia coli. Nucleic Acids Res 45(6):3487–3502
30. Dever TE, Ivanov IP, Sachs MS (2020) Conserved upstream open reading frame nascent peptides that control translation. Annu Rev Genet 54:237–264
31. Geissmann T, Chevalier C, Cros MJ, Boisset S, Fechter P, Noirot C et al (2009) A search for small noncoding RNAs in Staphylococcus aureus reveals a conserved sequence motif for regulation. Nucleic Acids Res 37(21):7239–7257
32. Samatova E, Daberger J, Liutkute M, Rodnina MV (2021) Translational control by ribosome pausing in bacteria: how a non-uniform pace of translation affects protein production and folding. Front Microbiol 11:619430
33. Antoine L, Bahena-Ceron R, Devi Bunwaree H, Gobry M, Loegler V, Romby P et al (2021) RNA modifications in pathogenic bacteria: impact on host adaptation and virulence. Genes 12(8):1125
34. Ingolia NT, Ghaemmaghami S, Newman JR, Weissman JS (2009) Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324(5924):218–223
35. Brar GA, Weissman JS (2015) Ribosome profiling reveals the what, when, where and how of protein synthesis. Nat Rev Mol Cell Biol 16(11):651–664
36. Glaub A, Huptas C, Neuhaus K, Ardern Z (2020) Recommendations for bacterial ribosome profiling experiments based on bioinformatic evaluation of published data. J Biol Chem 295(27):8999–9011
37. Mohammad F, Green R, Buskirk AR (2019) A systematically-revised ribosome profiling method for bacteria reveals pauses at single-codon resolution. elife 8:e42591
38. Marks J, Kannan K, Roncase EJ, Klepacki D, Kefi A, Orelle C et al (2016) Context-specific inhibition of translation by ribosomal antibiotics targeting the peptidyl transferase center. Proc Natl Acad Sci 113(43):12150–12155
39. Meydan S, Marks J, Klepacki D, Sharma V, Baranov PV, Firth AE et al (2019) Retapamulin-assisted ribosome profiling reveals the alternative bacterial proteome. Mol Cell 74(3):481–93.e6
40. Vazquez-Laslop N, Sharma CM, Mankin A, Buskirk AR (2022) Identifying small open reading frames in prokaryotes with ribosome profiling. J Bacteriol 204(1):e00294–e00221
41. Wang J, Rennie W, Liu C, Carmack CS, Prévost K, Caron M-P et al (2015) Identification of bacterial sRNA regulatory targets using ribosome profiling. Nucleic Acids Res 43(21):10308–10320
42. Meydan S, Klepacki D, Mankin AS, Vázquez-Laslop N (2021) Identification of translation start sites in bacterial genomes. Methods Mol Biol 2252:27–55
43. Woodford N, Afzal-Shah M, Warner M, Livermore DM (2008) In vitro activity of retapamulin against Staphylococcus aureus isolates resistant to fusidic acid and mupirocin. J Antimicrob Chemother 62(4):766–768
44. Mohammad F, Buskirk AR (2019) Protocol for ribosome profiling in bacteria. Bio-protocol 9(24):e3468
45. Hadjeras L, Heiniger B, Maaß S, Scheuer R, Gelhausen R, Azarderakhsh S et al (2023) Unraveling the small proteome of the plant symbiont Sinorhizobium meliloti by ribosome profiling and proteogenomics. Microlife 4:uqad012
46. Mohammad F, Woolstenhulme CJ, Green R, Buskirk AR (2016) Clarifying the translational pausing landscape in bacteria by ribosome profiling. Cell Rep 14(4):686–694
47. O’Connor PBF, Li G-W, Weissman JS, Atkins JF, Baranov PV (2013) rRNA:mRNA pairing alters the length and the symmetry of mRNA-protected fragments in ribosome profiling experiments. Bioinformatics 29(12):1488–1491
48. Mohammad F. Bacterial_Pipeline_riboseq. GitHub 2018. Available from: https://github.com/greenlabjhmi/2018_Bacterial_Pipeline_riboseq
49. Jiang H, Lei R, Ding S-W, Zhu S (2014) Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics 15(1):182
50. Langmead B, Trapnell C, Pop M, Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25
51. Caldelari I, Chane-Woon-Ming B, Noirot C, Moreau K, Romby P, Gaspin C et al (2017) Complete genome sequence and annotation of the Staphylococcus aureus strain HG001. Genome Announc 5(32):e00783-17
52. Woolstenhulme CJ, Guydosh NR, Green R, Buskirk AR (2015) High-precision analysis of translational pausing by ribosome profiling in bacteria lacking EFP. Cell Rep 11(1):13–21
53. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G et al (2011) Integrative genomics viewer. Nat Biotechnol 29(1):24–26
54. Chothani S, Adami E, Ouyang JF, Viswanathan S, Hubner N, Cook SA et al (2019) deltaTE: detection of translationally regulated genes by integrative analysis of Ribo-seq and RNA-seq data. Curr Protoc Mol Biol 129(1):e108
55. Chothani S. Detection and classification of differential translation-efficiency genes with the deltaTE method. GitHub 2019. Available from: https://github.com/SGDDNB/translational_regulation
56. Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15(12):550
57. Weaver J, Mohammad F, Buskirk AR, Storz G (2019) Identifying small proteins by ribosome profiling with stalled initiation complexes. MBio 10(2):e02819-18
58. Dingwall C, Lomonossoff GP, Laskey RA (1981) High sequence specificity of micrococcal nuclease. Nucleic Acids Res 9(12):2659–2674

Chapter 6: CRISPR Interference-Based Functional Small RNA Genomics

Gianluca Prezza and Alexander J. Westermann

Abstract

Small RNAs (sRNAs) are versatile regulators universally present in species across the prokaryotic kingdom, yet their functional characterization remains a major bottleneck. Gene inactivation through random transposon insertion has proven extremely valuable in discovering hidden gene functions. However, this approach is biased toward long genes and usually results in the underrepresentation of sRNA mutants. In contrast, CRISPR interference (CRISPRi) harnesses guide RNAs to recruit cleavage-deficient Cas nucleases to specific DNA loci. The ensuing steric hindrance inhibits RNA polymerase assembly at, or migration along, predefined genes, allowing for targeted knockdown screens without major length bias. In this chapter, we provide a detailed protocol for CRISPRi-based functional screening of bacterial sRNAs. Using the abundant microbiota species Bacteroides thetaiotaomicron as a model, we describe the design and generation of a guide library targeting the full intergenic sRNA repertoire of this organism and its application to identify sRNA knockdown-associated fitness effects. Our protocol is generic and thus suitable for the systematic assessment of sRNA-associated phenotypes in a wide range of bacterial species and experimental conditions. We expect CRISPRi-based functional genomics to boost sRNA research in understudied bacterial taxa, for instance, members of the gut microbiota.

Key words CRISPRi, sRNA, Noncoding RNA, Bacteroides, Cas12a, Microbiota, Functional genomics

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_6, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

Bacterial small RNAs (sRNAs) are ~50–250 nucleotide-long noncoding RNA molecules that post-transcriptionally regulate target mRNAs, with broad physiological impacts [1, 2]. Over the past two decades, several dozen sRNAs have been functionally characterized in well-established model species, with regulatory roles in diverse cellular processes ranging from central metabolism to iron homeostasis, quorum sensing, the adaptation to various stresses, and virulence in pathogens [3]. In contrast, the sRNA landscape of gut microbiota species has only recently come into focus [4]. For example, we have predicted 135 intergenic sRNAs in the human gut mutualist Bacteroides thetaiotaomicron and uncovered the function of two of them [5, 6]. Based on gene expression profiling of a corresponding sRNA deletion mutant, we identified a regulatory function of the GibS sRNA in the catabolism of N-acetylglucosamine-containing carbon sources [5]. Another sRNA, MasB, plays a role in the antibiotic sensitivity of B. thetaiotaomicron, with target candidates inferred from RNA pulldown experiments [6]. However, dissecting the functions of the remaining Bacteroides sRNAs individually (and of the sRNA complement of the hundreds to thousands of additional microbiota species) would be laborious and reliant on serendipity. Alternatively, tools to systematically screen for sRNA-associated phenotypes in a high-throughput manner could provide a substantial shortcut toward understanding the functions of these molecules in the microbiota.

Functional genomics, wherein phenotypic screening of loss-of-function mutants is coupled to sequencing-based readouts, has proven valuable for the prediction of bacterial gene function [6–10]. However, most genome-wide perturbation screens rely on random mutagenesis, e.g., as a result of exposing bacterial genomes to transposon insertions [10]. This naturally leads to an underrepresentation of mutations within short genes, including those for sRNAs. For example, in a previously generated B. thetaiotaomicron transposon library composed of ~8 × 10⁴ unique mutants, 53 (~40%) out of 135 intergenic sRNAs had no or only a single insertion [11] (Fig. 1a), which renders phenotype identification for these genes statistically unreliable or plainly impossible. Even in a more recent, very dense transposon library that contained ~3 × 10⁵ unique mutants [7], this number only shrank to 21 sRNAs (~16%) [6] (Fig. 1a). Furthermore, not every insertion results in the functional abrogation of the targeted sRNA, and the large population size of dense libraries can create bottleneck effects, i.e., the stochastic loss of individual mutants in a pool [12]. Together, these aspects highlight the demand for more tailored functional genomics approaches, especially when screening short genes.

Fig. 1 Comparison of B. thetaiotaomicron sRNAs hit in conventional versus CRISPR-based functional genomics. (a) Number of insertions per intergenic sRNA in two distinct transposon libraries based either on insertion sequencing (INseq) [11] or on random barcode transposon-site sequencing (RB-TnSeq) [7]. (b) Number of gRNAs or arrays targeting each annotated sRNA in our CRISPRi library [23]

CRISPR interference (CRISPRi) builds upon the sequence homology of RNA guides to highly specific DNA loci [13]. These guide RNAs (gRNAs) recruit a catalytically inactivated CRISPR-associated (Cas) nuclease to interfere with target gene transcription by sterically hindering RNA polymerase assembly or migration [14]. The only hard requirement for a gene to be targetable via CRISPRi is the presence of a protospacer adjacent motif (PAM) within its promoter or transcribed sequence to be recognized by an ectopically expressed Cas enzyme. The correspondingly designed gRNAs are either expressed as single units or as an array that can be processed into multiple different guides [15]. Due to its high versatility, CRISPRi has enabled the identification of functions and essentiality of coding genes in multiple bacterial species [16–22], yet noncoding genes have not previously been the focus of CRISPRi screens. Only recently have we harnessed the technique to systematically search for sRNA-associated phenotypes in B. thetaiotaomicron [23]. As opposed to the random insertion of transposons and owing to the high specificity of gRNAs, this approach enabled us to target the full suite of intergenic sRNAs annotated in this organism (Fig. 1b) and led to the discovery of an sRNA that confers susceptibility to bile salts [23].

Here, we provide a detailed protocol for CRISPRi-based functional sRNA genomics (Fig. 2). First, a CRISPRi guide library is designed with the help of a Python script that we have developed specifically for this purpose. Next, the gRNA pool is generated by cloning the sequences proposed by the software and introducing them into the bacterial species of interest. Using the growth of B. thetaiotaomicron under bile stress as an example, we illustrate how to use the resulting library for the identification of sRNAs implicated in bacterial physiology under the tested condition. This entails both a step-by-step protocol of the wet lab screening assay and a detailed description of how to analyze the resulting data. While established in Bacteroides, our protocol appears generally suitable for targeting custom gene sets in any genetically tractable bacterial species and will thus boost functional sRNA research.

2 Materials

2.1 Design of the Guide Library

1. Guide library design tools (https://github.com/gprezza/CRISPRi_tools). Detailed instructions can be found under the same link.


Fig. 2 Outline of the CRISPRi protocol described in this chapter. Using the provided Python script, gRNAs and—if desired—CRISPR arrays are designed in Subheadings 3.1, 3.2 and 3.3. After obtaining the corresponding oligonucleotides, the library is assembled (Subheadings 3.4 and 3.5) and introduced into the bacterial species of choice (Subheadings 3.6 and 3.7). The protocol then describes the common steps of using the resulting library for a CRISPRi screen (Subheading 3.8), from data generation (Subheadings 3.9, 3.10 and 3.11) to analysis (Subheading 3.12)

2.2 Cloning of the Guide Library

1. DNase-free water.
2. Plasmid AWP-029 (Addgene ID: 213966).
3. Plasmid AWP-031 (Addgene ID: 213967).
4. SmaI restriction enzyme.
5. NEBuilder® HiFi DNA Assembly Cloning Kit (NEB).
6. Optional: BsmBI restriction enzyme.
7. Kit for extraction and purification of DNA from agarose gels.
8. Agarose.
9. T4 polynucleotide kinase and buffer A (Thermo Scientific).
10. T4 DNA ligase (NEB).
11. Thermocycler.
12. Electrocompetent Escherichia coli S17-1 cells.
13. Electroporation cuvettes (long electrode, 2 mm gap).
14. Electroporator.
15. Super Optimal broth with Catabolite repression (SOC) medium.
16. 10 cm plates and 25 cm² square dishes filled with LB agar + ampicillin (100 μg/mL).
17. LB medium.
18. Ampicillin stock (100 mg/mL).
19. Sterile 80% glycerol.
20. Spectrophotometer for OD measurements in cuvettes.
21. B. thetaiotaomicron type strain VPI-5482.
22. Sterile PBS. An aliquot is also needed in prereduced form, i.e., preincubated in the anaerobic chamber overnight to remove the dissolved oxygen.
23. TYG medium: 20 g/L tryptone, 10 g/L yeast extract, 5 g/L glucose, 1 g/L L-cysteine, 0.5 mg/L hemin, 19.2 mg/L MgSO4·7H2O, 40 mg/L KH2PO4, 40 mg/L K2HPO4, 80 mg/L NaCl, 8 mg/L CaCl2 and 2 g/L NaHCO3.
24. BHIS plates (52 g/L BHI-agar, 1 g/L L-cysteine, 0.5 mg/L hemin, 2 g/L NaHCO3) without or with gentamicin (100 μg/mL) and erythromycin (12.5 μg/mL). Note that the antibiotic concentrations used here are halved compared to their default concentrations to avoid the potential loss of library members that are overly sensitive to these drugs.
25. 1–2 mL cryotubes.
26. Sterile 0.2 mL tubes.

2.3 CRISPRi Screen

1. Appropriate medium for the CRISPRi screening condition.
2. Isopropyl β-D-1-thiogalactopyranoside (IPTG).
3. Lysozyme (25 mg/mL).
4. Cell lysis buffer: 100 mM Tris HCl pH 8.5, 200 mM NaCl, 0.2% SDS, 5 mM EDTA.
5. Proteinase K.
6. RNase A.
7. Phenol:chloroform:isoamyl alcohol (25:24:1) mix.
8. Glycoblue.
9. 3 M NaOAc.
10. 100% and 75% EtOH.
11. High-fidelity PCR kit.
12. PCR cleanup kit.
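The TYG recipe in Subheading 2.2 (item 23) is given per liter, so amounts for other batch sizes follow by linear scaling. The short Python sketch below is offered as a convenience only; the dictionary and function names are hypothetical, and the values are copied from the recipe above.

```python
# Amounts per liter of TYG medium, as listed in Subheading 2.2, item 23.
TYG_PER_LITER = {
    "tryptone": (20.0, "g"),
    "yeast extract": (10.0, "g"),
    "glucose": (5.0, "g"),
    "L-cysteine": (1.0, "g"),
    "hemin": (0.5, "mg"),
    "MgSO4·7H2O": (19.2, "mg"),
    "KH2PO4": (40.0, "mg"),
    "K2HPO4": (40.0, "mg"),
    "NaCl": (80.0, "mg"),
    "CaCl2": (8.0, "mg"),
    "NaHCO3": (2.0, "g"),
}


def scale_recipe(volume_ml, recipe=TYG_PER_LITER):
    """Return the amount of each component needed for volume_ml of medium."""
    factor = volume_ml / 1000.0
    return {
        name: (round(amount * factor, 4), unit)
        for name, (amount, unit) in recipe.items()
    }
```

Calling `scale_recipe(500)` returns the amounts for a 500 mL batch, keeping the unit of each component as listed.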

3 Methods

This protocol was designed for the screening of intergenic sRNAs in B. thetaiotaomicron, with a library of dPb2Cas12a gRNAs/arrays cloned into plasmids AWP-029 and AWP-031, respectively (see Note 1). These plasmids harbor an IPTG-inducible dCas12a and a constitutively expressed cassette with restriction sites for cloning of either gRNAs (AWP-029) or CRISPR arrays (AWP-031). However, the protocol can be easily adapted to other bacterial species, other nucleases, and other plasmid constructs with only minor modifications in the sections pertaining to species-, nuclease-, and plasmid-specific steps.

3.1 OPTIONAL: Compilation of the List of Targetable Sequences

This step generates a file with the sequences of the genes to be targeted, including their promoters (defined by the -p option, with the default set to 0 nt upstream of the transcription start site) and transcribed regions. Skip this step if such a file is already available for your envisaged target gene set.

1. Run the get_sequences_from_gff.py script with the genome sequence and annotation files as input and the optional parameters indicating which genes to include.
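To make explicit what a targetable sequence comprises, the sketch below extracts the promoter plus transcribed region of a single gene from a genome string. It is not the get_sequences_from_gff.py script and does not reproduce its interface: the function and argument names are hypothetical, and the coordinates are assumed to be 1-based and inclusive, as in GFF annotation files.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def target_region(genome, start, end, strand, promoter_nt=0):
    """Return promoter + transcribed sequence, written 5'->3' on the gene's strand.

    start and end are 1-based inclusive forward-strand coordinates (assumed
    GFF convention); promoter_nt is the number of nucleotides upstream of
    the transcription start site to include.
    """
    if strand == "+":
        first = max(start - promoter_nt, 1)       # do not run off the 5' end
        return genome[first - 1:end]
    if strand == "-":
        last = min(end + promoter_nt, len(genome))  # upstream = higher coordinates
        return reverse_complement(genome[start - 1:last])
    raise ValueError("strand must be '+' or '-'")
```

With `promoter_nt=0`, mirroring the default stated above, only the transcribed region is returned.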

3.2 OPTIONAL: Identification of the Most Abundant PAM

If you have not selected the nuclease to be used in the screen, perform this step to identify the most frequently occurring PAM within the target gene space.

1. Run the PAM_frequency.py script on the list of genes to be targeted. This will rank the desired PAMs (default: 5′-TTV-3′ and 5′-NGG-3′) according to their frequency within the customized sequence space.
2. Select the nuclease with the most frequent PAM for the screen (see Note 2).
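The ranking performed in this step can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the PAM_frequency.py script: the function names are hypothetical, PAMs are interpreted with IUPAC codes (V = A, C, or G; N = any base), and occurrences are counted on both strands, including overlapping ones.

```python
import re

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "V": "[ACG]", "N": "[ACGT]"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def pam_regex(pam):
    """Translate an IUPAC PAM (e.g., 'TTV') into a lookahead regex so that
    overlapping occurrences are all counted."""
    return re.compile("(?=" + "".join(IUPAC[base] for base in pam.upper()) + ")")


def count_pam(seq, pam):
    """Count occurrences of a PAM on both strands of seq."""
    pattern = pam_regex(pam)
    seq = seq.upper()
    return len(pattern.findall(seq)) + len(pattern.findall(reverse_complement(seq)))


def rank_pams(sequences, pams=("TTV", "NGG")):
    """Rank PAMs by total frequency across all target sequences."""
    totals = {pam: sum(count_pam(s, pam) for s in sequences) for pam in pams}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

The first entry of the list returned by `rank_pams` is the most frequent PAM in the supplied sequences, which would point to the corresponding nuclease.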

3.3 Design of the gRNA Library

1. Run the design_CRISPRi_gRNAs.py script on the gene sequences to be targeted (from Subheading 3.1, step 1). The script outputs one file containing oligonucleotide sequences for cloning the gRNAs (Subheading 3.4) and, if desired and specified with the -a option, one or more files with oligonucleotides for the cloning of arrays (Subheading 3.5). Note that the design of arrays is only supported for the TTV PAM. With the -nt option, the script also designs nontargeting gRNAs as internal controls. The -s option specifies how many gRNAs per target gene should be designed by the software. Maximizing the number of different gRNAs per individual sRNA decreases the risk of “losing” sRNAs due to inefficient knockdown.
2. Obtain the proposed oligonucleotides in pools (e.g., from a commercial vendor). gRNAs (both targeting and nontargeting) belong to the same pool, while oligonucleotides for arrays must be kept in a separate pool, as defined by the files outputted by the software.
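As a rough illustration of what designing several gRNAs per target involves, the following sketch enumerates candidate spacers that directly follow a 5′-TTV-3′ PAM, using the fact that Cas12a PAMs lie 5′ of the protospacer. It is not the design_CRISPRi_gRNAs.py script: the function names, the 20 nt spacer length, and the first-come selection are assumptions for illustration, and no scoring, strand preference, or off-target filtering is attempted.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_spacers(target, spacer_len=20):
    """List (strand, pam, spacer) for every TTV PAM followed by spacer_len nt,
    searching both strands of target. spacer_len=20 is an assumed value."""
    pattern = re.compile(r"(?=(TT[ACG])([ACGT]{%d}))" % spacer_len)
    target = target.upper()
    hits = []
    for strand, seq in (("+", target), ("-", reverse_complement(target))):
        for match in pattern.finditer(seq):
            hits.append((strand, match.group(1), match.group(2)))
    return hits


def pick_guides(target, per_gene=3, spacer_len=20):
    """Keep at most per_gene distinct spacers, in order of appearance."""
    seen, chosen = set(), []
    for strand, pam, spacer in candidate_spacers(target, spacer_len):
        if spacer not in seen:
            seen.add(spacer)
            chosen.append((strand, pam, spacer))
        if len(chosen) == per_gene:
            break
    return chosen
```

Here `per_gene` plays a role analogous to the number of gRNAs requested per target; targets that yield fewer candidates than requested are the ones most at risk of being “lost” to inefficient knockdown.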

3.4 Cloning of the gRNAs

1. OPTIONAL: If the oligonucleotide pools are lyophilized, they must first be resuspended in water. To do this, spin the tube at full speed for 1 min and add DNase-free water to a final concentration of 50 μM. Resuspend the dehydrated pool by incubating the vial for 5 min at 65 °C, with constant shaking at 800 rpm. Store at -20 °C until needed.
2. Digest 1 μg of AWP-029 with SmaI according to the enzyme supplier’s protocol and purify with a DNA cleanup kit. It is essential that the plasmid is completely digested at the end of this step. If this is not feasible, extract the linearized vector from an agarose gel to remove traces of undigested DNA.
3. Prepare a working solution of the gRNA pool(s) at a concentration of 1 μM by diluting in DNase-free water.
4. Clone the gRNA oligonucleotide pool(s) into the digested AWP-029 in the following reaction: 33 ng of linearized AWP-029, 1 μL of the diluted oligonucleotide pool, 10 μL of NEBuilder HiFi DNA Assembly Master Mix, filled up to 20 μL with DNase-free water. Incubate at 50 °C for 1 h. Immediately thereafter, proceed with the transformation of the ligation reaction (Subheading 3.6) or freeze it at -20 °C.

3.5 OPTIONAL: Assembly and Cloning of CRISPR Arrays

Relevant only if arrays were designed in Subheading 3.3, step 1 with the -a option. The cloning reaction is based on the CRATES protocol [24].
1. OPTIONAL: Resuspend the lyophilized oligonucleotide pool(s) as in Subheading 3.4, step 1.
2. Prepare a working dilution of the array pool(s) in DNase-free water at a final concentration of 10 μM.
3. Phosphorylate the oligonucleotides in the following reaction: 5 μL of 10 μM array oligonucleotide pool, 1.5 μL of T4 polynucleotide kinase, 4 μL of 10× reaction buffer A, 4 μL of 10 mM ATP, 25.5 μL of DNase-free water. Incubate at 37 °C for 30 min, followed by heat-inactivation for 20 min at 65 °C.
4. Anneal the oligonucleotides: Incubate the previous reaction mix at 95 °C for 5 min, followed by cooling to 85 °C in 30-s steps per degree, then cool to 65 °C at a rate of 1 min per degree, followed by cooling down to 15 °C, again in 30-s steps per degree.
5. Repair the nicks in the annealed fragments by adding to the reaction tube: 2.5 μL of T4 DNA ligase, 0.55 μL of T4 polynucleotide kinase buffer A, 2 μL of 10 mM ATP, and 4.95 μL of DNase-free water. Mix and incubate at 16 °C for 5 h, followed by heat-inactivation at 65 °C for 10 min.
6. Load an aliquot of each ligation reaction onto a 2% agarose gel to confirm that only the correctly assembled, 202 nt-long band is detected. If additional bands are present after ligation, the reaction contains misassembled fragments. In this case, load the entire reaction on a 2% preparative agarose gel, extract and purify the 202-nt fragment, and use it for the following steps.
7. Digest 1 μg of AWP-031 with BsmBI according to the supplier’s protocol and purify the product with a DNA cleanup kit. Again, if the plasmid cannot be completely linearized, it may be required to excise the correct band from an agarose gel (see Subheading 3.4, step 2).
8. Ligate the assembled arrays into AWP-031 according to the following reaction scheme: 1 μL of 10× T4 DNA ligase buffer, 80 ng of AWP-031, 0.5 μL of T4 DNA ligase, 0.5 μL of BsmBI restriction enzyme, 3 μL of the assembled pool from Subheading 3.5, step 5 (or the gel-purified fragment from Subheading 3.5, step 6), and DNase-free water to a final volume of 10 μL. Perform the ligation in a thermocycler: 16 °C for 1 h, followed by 45 cycles of digestion (42 °C, 2 min) and ligation (16 °C, 5 min), a final digestion at 55 °C for 20 min, and a heat-inactivation at 80 °C for 10 min. Proceed directly with the transformation (Subheading 3.6) or freeze the sample at -20 °C.

3.6 Transformation into the Donor Strain E. coli S17-1

1. Add 4 μL of each reaction from Subheading 3.4, step 4 and Subheading 3.5, step 8 to separate pre-chilled electroporation cuvettes and place on ice.
2. Incubate a 160-μL aliquot of frozen competent cells per ligation reaction on ice until the cells have thawed.
3. Add 160 μL of thawed competent cells to each cuvette and put on ice.
4. Electroporate the plasmids into E. coli cells with a 2.5 kV pulse.
5. Harvest the electroporated cells immediately and add them to 900 μL of pre-warmed SOC medium. Incubate for 1 h at 37 °C under constant shaking at 300 rpm.
6. For each transformation, add 90 μL of LB medium to a 10-μL aliquot of the cell mix and plate it on a 10 cm LB agar (+ ampicillin) plate. Spread the remaining volume of the transformation on a 25 × 25 cm dish filled with LB agar (+ ampicillin). Incubate both plates overnight at 37 °C.
7. Count the number of colonies on the 10 cm plate and multiply it by a factor of 105 to obtain an estimate of the number of colonies on the large dish (the remaining ~1.05 mL of the transformation is about 105 times the volume of the 10-μL aliquot). To ensure good coverage, you should collect many more colonies than the mere number of individual constructs present in the ligation reaction. We recommend harvesting 50 times more colonies than the number of CRISPR constructs in the pool. Following the above protocol, we routinely obtained 2–4 × 10⁵ colonies per transformation reaction.

Table 1 Oligonucleotides used in this study

ID         Sequence (5′-3′)
AWO-575    CCAAGACGTCAACTCGACC
AWO-576    CTCTCATCCGCCAAAACAG
AWO-577    TTCTTATCTTTGCAGTAATTTC
AWO-578    ACAGTCATTCATCTTTCTGC

8. Screen at least 10 colonies per transformation by colony PCR (using the oligonucleotides in Table 1) and Sanger sequencing to confirm that the majority (ideally, all) of the obtained colonies harbor a construct of interest. For AWP-029 (gRNAs) transformations, use primers AWO-577/AWO-578 (expected amplicon size: 144 bp). For AWP-031 (arrays), use primers AWO-575/AWO-576 (expected amplicon size: 274 bp).
9. Harvest the colonies by adding 10 mL of LB medium to each 25 × 25 cm dish and by gently scraping with a cell spreader without damaging the agar layer. Collect the cell mass and estimate the concentration of each resuspension by measuring the OD of a 1:50 dilution (or another appropriate dilution whose reading falls within the linear range of the spectrophotometer).
10. Adjust each individual cell resuspension to 10 ODs with LB medium and mix with an equal volume of 80% glycerol to generate 5-OD stocks. Freeze and store the stock in a –80 °C freezer.

3.7 Conjugation into B. thetaiotaomicron

1. Inoculate separate tubes containing 5 mL of LB + ampicillin with an aliquot of the above glycerol stocks and incubate at 37 °C with shaking at 220 rpm overnight. Inoculate a colony from a freshly streaked plate of wild-type B. thetaiotaomicron into 5 mL of TYG medium and incubate the culture anaerobically at 37 °C overnight.
2. Subculture the B. thetaiotaomicron overnight culture 1:100 in 5 mL of fresh TYG medium.
3. After ~1 h, subculture the E. coli donor cultures 1:100 in 5 mL of fresh LB medium containing ampicillin. This is done to ensure that the Bacteroides and E. coli cultures reach their required ODs (see below) at roughly the same time. In case the cultures grow differently in your hands, adjust culture conditions appropriately.

4. Once both cultures reach the early exponential phase (OD of 0.4–0.6; takes ~5–6 h in our experience), mix them in a 1:2 ratio of E. coli to B. thetaiotaomicron cells. It is important to maintain an equal library member ratio within the E. coli pool in case the different transformation reactions contain different numbers of unique constructs. For example, if the gRNA transformation contains 120 unique strains and the array transformation only 23 (i.e., about 5.2 times fewer), mix in a 5.2 (gRNA transformation) to 1 (array transformation) OD ratio.
5. Pellet 10 ODs of the cell mix by centrifugation at 4,000 × g for 4 min and resuspend in 1 mL of PBS. Spot the resuspension on three BHIS plates and incubate aerobically at 37 °C overnight.
6. Resuspend the cell lawns in 5 mL of PBS and measure the OD of an appropriate dilution.
7. Plate a 0.2-OD equivalent of the resuspension on 10 cm BHIS plates supplemented with gentamicin and erythromycin and incubate anaerobically at 37 °C for 2–3 days until visible colonies form. In our experience, this results in 1–2 × 10³ colonies per plate. Plan for a sufficiently high number of plates to match the required total number of colonies. As before, aim for at least 50 times more colonies than the number of unique members of your envisaged library.
8. Collect the colonies from all plates in 1 mL of pre-reduced PBS per plate and pool all resuspensions. We have noticed that glycerol stocks prepared at this stage display poor viability after freezing and rethawing. Therefore, the protocol proceeds with a subculture in liquid medium before cryostocks are prepared.
9. Subculture the resuspension mix in 25 mL of TYG medium containing gentamicin and erythromycin (100 μg/mL and 12.5 μg/mL, respectively) at a starting OD of 0.05. Incubate anaerobically at 37 °C for ~16 h.
10. After this incubation, the culture should have reached deep stationary phase (OD ~4). Pellet 80-OD equivalents at 4,000 × g for 4 min and resuspend in 8 mL of TYG.
11. Mix the resuspension with 8 mL of 80% glycerol to obtain 16 mL of a master stock (5 OD in 40% glycerol).
12. Freeze the stock at -80 °C as single-use, 100 μL aliquots in 0.2 mL tubes and a few “master” tubes of 1 mL each.
13. OPTIONAL: We advise sequencing the library at this stage to confirm that all or most of the anticipated library members are indeed present. To this end, thaw a single-use aliquot, pellet it at maximum speed for 2 min, remove the supernatant, and process the pellet according to Subheadings 3.9, 3.10, 3.11 and 3.12 (see below).
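The arithmetic in steps 4 and 7 (OD mixing ratio and number of selection plates) can be sketched as follows. The function names are hypothetical; the default of 1,000 colonies per plate is taken as the conservative end of the range reported above, and the 50-fold coverage follows the recommendation in step 7.

```python
import math


def od_mix_fractions(unique_members):
    """Fraction of the E. coli donor OD to take from each transformation.

    `unique_members` maps a pool name to its number of unique constructs,
    so that every construct is equally represented in the mix.
    """
    total = sum(unique_members.values())
    return {name: count / total for name, count in unique_members.items()}


def plates_needed(n_members, coverage=50, colonies_per_plate=1000):
    """Number of 10 cm selection plates required to reach the coverage."""
    return math.ceil(n_members * coverage / colonies_per_plate)


# Example from step 4: 120 gRNA constructs and 23 arrays.
fractions = od_mix_fractions({"gRNA": 120, "array": 23})
ratio = fractions["gRNA"] / fractions["array"]  # ~5.2 to 1, as in step 4
plates = plates_needed(120 + 23)  # 143 members x 50 = 7,150 colonies
```

For this example library, 7,150 colonies at 1,000 colonies per plate correspond to eight plates.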

3.8 Growth of the Library Under Selecting Conditions (CRISPRi Screen)

To account for biological variation, the steps below should be performed in at least three biological replicates.
1. Inoculate 50 μL of a single-use aliquot in 5 mL of TYG medium and incubate anaerobically at 37 °C overnight.
2. Subculture the overnight culture to a starting OD of 0.05 in two flasks containing the screening medium supplemented with 250 μM of IPTG. One flask will serve as the control and the other flask (to which, e.g., a defined stressor is added) as the test condition. This protocol is based on 1 OD of culture to be collected for gDNA extraction at the end of the experiment. Therefore, subculture in a volume of medium that will allow you to collect such an amount of cells from both flasks.
3. Grow the two cultures to the stationary phase (see Note 3).
4. Collect 1-OD equivalent of cells, centrifuge at 12,000 × g at room temperature for 2 min, and discard the supernatant. Store the cell pellets at -20 °C until further processing.

3.9 gDNA Extraction

This protocol was optimized for a cell mass of 1 OD. If needed, adjust volumes and concentrations accordingly.
1. Thaw the samples and resuspend them in 90 μL of PBS.
2. Add 10 μL of 25 mg/mL lysozyme and vortex briefly.
3. Add 100 μL of cell lysis buffer and incubate at 37 °C for 5 min.
4. Add 1 μL of proteinase K, mix, and incubate at 56 °C for 30 min.
5. Add 0.5 μL of RNase A, mix, and incubate at 56 °C for 5 min.
6. Under a fume hood, add 200 μL of phenol:chloroform:isoamyl alcohol and shake vigorously for 15 s. Centrifuge at 12,000 × g, room temperature, for 15 min.
7. Under a fume hood, collect the aqueous phase and mix with 0.5 μL of glycoblue, 0.1 volumes of 3 M NaOAc, and 3 volumes of 100% EtOH. Incubate overnight at -20 °C.
8. Pellet the gDNA by centrifuging at 4 °C and full speed for 30 min.
9. Aspirate and discard the supernatant without disturbing the pellet.
10. Wash the pellet by adding 500 μL of 75% EtOH and centrifuge at 4 °C, full speed, for 10 min.
11. Discard the supernatant as above. Briefly spin the tube again and carefully remove the remaining supernatant using a 10-μL pipette tip.
12. Air-dry the pellet with an open lid until the ethanol has evaporated (~5 min).
13. Add 50 μL of DNase-free water and resuspend the pellet by incubating at 60 °C for 5 min under constant shaking at 600 rpm.
14. Measure the DNA concentration and store the sample at -20 °C.

3.10 PCR Amplification

1. Assemble a PCR reaction containing 150 ng of template gDNA in a total volume of 50 μL. Use primers AWO-577/AWO-578 for amplifying gRNA constructs (melting temperature: 56 °C, expected product size: 144 bp). If the library also contains arrays, a second reaction with primers AWO-575/AWO-576 is required (melting temperature: 64 °C, expected product size: 274 bp). Using a high-fidelity PCR kit and minimizing the number of PCR cycles is highly recommended. In our experience, 37 cycles are sufficient for the downstream steps.
2. Load 5 μL of the PCR reaction on a 2% agarose gel to check for correct amplification (see Note 4).
3. Clean up the PCR reaction with a kit. Measure the DNA concentration of each sample and store it at -20 °C.

3.11 Library Preparation and Sequencing

1. OPTIONAL: Amplified array amplicons can be pooled with gRNA amplicons derived from the same sample prior to sequencing, thereby reducing the number of sequencing libraries. Make sure to pool the two amplicons in a ratio that reflects the respective number of library members. For example, if your library contains 100 gRNAs and 25 arrays, mix 4 moles of gRNA amplicon per mole of array amplicon. Taking into consideration the different lengths of the two amplicons (144 bp versus 274 bp), this means that you should mix roughly two parts (in nanograms) of gRNA amplicon with one part of array amplicon.
2. Generate sequencing libraries from the amplicon samples. Any protocol that allows library preparation from PCR products is suitable.
3. Choose the sequencing mode to ensure that the entire unique region of each library member will be covered. While sequencing the single spacer is sufficient in the case of gRNAs, include all three spacers for arrays. Thus, sequencing 150 nt-long, paired-end reads is necessary for arrays cloned into AWP-031 and amplified using primers AWO-575/AWO-576. In contrast, if the library only consists of standalone gRNAs, single-end sequencing is usually sufficient. For example, in the case of gRNAs cloned into AWP-029 and amplified using primers AWO-577/AWO-578, 100 nt-long, single-end reads will cover the spacer region in both strands. Either way, as amplicons differ only in the spacer regions, the resulting sequencing library will be of low complexity. To account for this and ensure optimal sequencing performance, we strongly recommend mixing in a highly heterogeneous library. For example, we had success when adding ~50% of the commonly used PhiX spike-in to the samples.
4. Sequence the libraries to a depth of ~5–10 million reads/sample. Note that since 50% of each library is composed of the PhiX spike-in, the actual number of usable reads will only be half the sequencing output.

3.12 Data Analysis

1. OPTIONAL: If samples were sequenced in paired-end mode, merge mate reads. This can be achieved using, e.g., the bbmerge tool [25].
2. Run the count_guides.py script by providing the fastq sequencing files as input. This will count the occurrences of each gRNA/array in the sample.
3. Differences in the relative abundance of specific gRNAs/arrays between the test and control condition can be inferred with the help of tools conventionally used for differential expression analysis, such as edgeR [26] and DESeq2 [28]. As a result, fold changes and p-values between the two conditions will be calculated individually for each gRNA and each array contained in the library (Fig. 3a). As these data may be noisy and potentially confounded by inefficient gRNAs or off-targeting effects, we usually combine abundance changes at the level of individual sRNAs (see next step). However, construct-specific fold changes have their own value, e.g., to pinpoint false negatives in situations when the effect of efficient guides is masked by inefficient constructs against the same target.
4. To merge fold changes and p-values over individual constructs targeting the same sRNA gene, average the fold changes and combine the p-values according to Fisher’s method (Fig. 3b). In our opinion, these data are more robust and facilitate the interpretation of the result of a screen.

Fig. 3 Analysis of CRISPRi data on two levels: that of individual gRNAs and arrays and that of individual sRNAs. Shown for illustrative purposes; the raw data underlying these volcano plots stem from our CRISPRi screen for bile stress-related sRNAs in B. thetaiotaomicron [23]. (a) Construct-level results: differential abundance is calculated for each library member relative to the control condition (Subheading 3.12, step 3). (b) sRNA-level results: merging the data from panel a over all constructs targeting the same sRNA (Subheading 3.12, step 4) gives rise to combined fold changes and p-values
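The merging in step 4 can be sketched in a few lines. Fisher's method combines k independent p-values through the statistic X = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis; for an even number of degrees of freedom the tail probability has a closed form, so no external library is needed. The function names are hypothetical, and averaging on the log2 scale is an assumption made here for illustration.

```python
import math


def fisher_combined_p(pvalues):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(ln p_i) is chi-square distributed with 2k degrees of
    freedom; its survival function is exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in the interval (0, 1]")
    half_x = -sum(math.log(p) for p in pvalues)  # this is X/2
    term, series = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        series += term
    return math.exp(-half_x) * series


def mean_log2_fold_change(log2_fold_changes):
    """Average the log2 fold changes of all constructs for one sRNA."""
    return sum(log2_fold_changes) / len(log2_fold_changes)
```

A single construct returns its own p-value unchanged, which is a convenient sanity check for sRNAs targeted by only one guide.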

4 Notes

1. The CRISPRi approach presented here for screening sRNAs is not restricted to this gene class. Instead, our protocol lends itself to targeting any predefined gene set, including those that tend to be underrepresented in conventional functional genomics screens. For instance, small proteins, encoded by short open reading frames (sORFs), have emerged as another class of important bacterial regulators [26]. Meanwhile, sORFs are well annotated in bacterial model species, e.g., Salmonella enterica harbors 609 high-confidence sORFs in its genome [27]. Due to the gene length bias of random mutagenesis, small protein-associated phenotypes are rarely identified following conventional functional genomics protocols. For example, a reanalysis of a published Salmonella enterica transposon insertion sequencing dataset revealed that 23% of its sORFs did not contain an insertion [28]. We are confident that in the future, the CRISPRi approach described here will also facilitate the functional annotation of bacterial small proteins.
2. When designing the CRISPRi assay, it is advisable to pick a PAM that occurs frequently in the target sequences to maximize the number of efficient gRNAs. However, in some cases, the selection of another PAM can be a valid option, for example, when a well-characterized Cas nuclease is preferred over a non-canonical enzyme. In any case, it is worth noting that our library design software is highly flexible, allowing the user to specify the nuclease (the PAM) on which to build the ensuing screen.
3. In a CRISPRi-based screen, relative strain abundance in the output pool is used as a proxy for the fitness of a given library member. For example, a positive fold change reflects an increase in the relative abundance of a given strain compared to the same strain in the control condition.
This way, however, it is impossible to distinguish between increased resilience (i.e., decreased cell death) and enhanced growth (shorter lag phase, increased growth rate, accumulation to higher cell densities) of a knockdown strain in the test compared to the control condition. In the present case, we screened in media supplemented with a sublethal concentration of bile salts. This means that the selection for/against specific library members depended primarily on the number of cell divisions during the assayed time span. We collected samples in the deep stationary phase and this proved sufficient to identify two sRNAs significantly altering Bacteroides fitness under bile stress. However, to detect more subtle phenotypes, a plausible option could be to extend the selection pressure for longer time periods, e.g., by subjecting the output pool from the first round to subsequent rounds of subculturing and, hence, selection. In any case, sRNA-associated fitness phenotypes identified through a CRISPRi screen should be independently confirmed, e.g., by generating a clean deletion of the respective gene and subjecting the corresponding mutant and its isogenic wild-type to the same experimental condition. In doing so, keep in mind that some phenotypes might depend on direct competition with other strains (as present in a CRISPRi library), yet might be less pronounced, or even vanish, when deletion mutants are grown in isolation. Conversely, the absence of changes for a specific strain in the CRISPRi screen does not exclude that the targeted gene is functionally important in the tested condition, as, for example, other members of the pool could compensate for the loss of the corresponding gene product in a given strain (e.g., via cross-feeding mechanisms).
4. We made several observations with respect to the PCR amplification prior to sequencing (Subheading 3.10) that are worth briefly mentioning. In our experience, the primer pairs used sometimes gave rise to primer dimers. We noticed that this is less likely to occur when using freshly prepared oligonucleotide aliquots.
In any case, the dimers tend to be mostly lost during PCR product purification due to their small size and have never affected the quality of the sequencing data. Furthermore, the array amplicon usually appears as two bands on gels: one with the expected size and a larger one. We determined that this upper band reflects “non-mate” annealing, where DNA strands from two different arrays anneal due to high sequence similarity. However, this, too, did not cause any issues during the remaining steps of the protocol.

References

1. Wagner EGH, Romby P (2015) Small RNAs in bacteria and archaea: who they are, what they do, and how they do it. Adv Genet 90:133–208
2. Storz G, Vogel J, Wassarman KM (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol Cell 43(6):880–891
3. Papenfort K, Melamed S (2023) Small RNAs, large networks: posttranscriptional regulons in gram-negative bacteria. Annu Rev Microbiol 77:23
4. Ryan D, Prezza G, Westermann AJ (2020) An RNA-centric view on gut Bacteroidetes. Biol Chem 402(1):55–72
5. Ryan D et al (2020) A high-resolution transcriptome map identifies small RNA regulation of metabolism in the gut microbe Bacteroides thetaiotaomicron. Nat Commun 11(1):3557
6. Ryan D et al (2023) An integrated transcriptomics-functional genomics approach reveals a small RNA that modulates Bacteroides thetaiotaomicron sensitivity to tetracyclines. bioRxiv
7. Liu H et al (2021) Functional genetics of human gut commensal Bacteroides thetaiotaomicron reveals metabolic requirements for growth across environments. Cell Rep 34(9):108789
8. Price MN et al (2018) Mutant phenotypes for thousands of bacterial genes of unknown function. Nature 557(7706):503–509
9. Chao MC et al (2016) The design and analysis of transposon insertion sequencing experiments. Nat Rev Microbiol 14(2):119–128
10. Cain AK et al (2020) A decade of advances in transposon-insertion sequencing. Nat Rev Genet 21(9):526–540
11. Goodman AL et al (2009) Identifying genetic determinants needed to establish a human gut symbiont in its habitat. Cell Host Microbe 6(3):279–289
12. Abel S et al (2015) Analysis of bottlenecks in experimental models of infection. PLoS Pathog 11(6):e1004823
13. Dominguez AA, Lim WA, Qi LS (2016) Beyond editing: repurposing CRISPR-Cas9 for precision genome regulation and interrogation. Nat Rev Mol Cell Biol 17(1):5–15
14. Qi LS et al (2013) Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152(5):1173–1183
15. McCarty NS et al (2020) Multiplexed CRISPR technologies for gene editing and transcriptional regulation. Nat Commun 11(1):1281
16. Liu X et al (2017) High-throughput CRISPRi phenotyping identifies new essential genes in Streptococcus pneumoniae. Mol Syst Biol 13(5):931
17. Rousset F et al (2018) Genome-wide CRISPR-dCas9 screens in E. coli identify essential genes and phage host factors. PLoS Genet 14(11):e1007749
18. Lee HH et al (2019) Functional genomics of the rapidly replicating bacterium Vibrio natriegens by CRISPRi. Nat Microbiol 4(7):1105–1113
19. Shin J et al (2023) Genome-wide CRISPRi screen identifies enhanced autolithotrophic phenotypes in acetogenic bacterium Eubacterium limosum. Proc Natl Acad Sci U S A 120(6):e2216244120
20. Peters JM et al (2016) A comprehensive, CRISPR-based functional analysis of essential genes in bacteria. Cell 165(6):1493–1506
21. Wang T et al (2018) Pooled CRISPR interference screening enables genome-scale functional genomics study in bacteria with superior performance. Nat Commun 9(1):2475
22. Peters JM et al (2019) Enabling genetic analysis of diverse bacteria with Mobile-CRISPRi. Nat Microbiol 4(2):244–250
23. Prezza G et al (2024) A CRISPR-based genetic screen in Bacteroides thetaiotaomicron reveals a small RNA modulator of bile susceptibility. Proc Natl Acad Sci U S A. https://doi.org/10.1073/pnas.2311323121 (in press)
24. Liao C et al (2019) Modular one-pot assembly of CRISPR arrays enables library generation and reveals factors influencing crRNA biogenesis. Nat Commun 10(1):2948
25. Bushnell B, Rood J, Singer E (2017) BBMerge - accurate paired shotgun read merging via overlap. PLoS One 12(10):e0185056
26. Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26(1):139–140
27. Venturini E et al (2020) A global data-driven census of Salmonella small proteins and their potential functions in bacterial virulence. Microlife 1(1):uqaa002
28. Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15(12):550

Chapter 7 Investigation of sRNA-mRNA Interactions in Bacillus subtilis In Vivo

Inam Ul Haq, Peter Müller, and Sabine Brantl

Abstract

In this chapter, we describe in vivo methods for the analysis of interactions between an sRNA and its target mRNA in B. subtilis. All these methods have been either established or significantly improved in our group and successfully employed to characterize a number of sRNA/target mRNA systems in Bacillus subtilis. Whereas in Chap. 8, we describe a combination of in vitro methods, e.g., EMSA and RNA secondary structure probing, we focus here on the investigation of RNA-RNA interactions in vivo using compatible plasmids or chromosomal insertions and deletions, the elucidation of the mechanisms of action of regulatory sRNAs employing transcriptional and translational reporter gene fusions, as well as the determination of expression profiles, half-lives of sRNA and mRNA, and their intracellular concentrations, and, finally, the investigation of RNA chaperones that promote the sRNA/mRNA interaction. For an in-depth analysis of sRNA-mRNA interactions in B. subtilis, a combination of in vivo and in vitro methods should be applied.

Key words Reporter gene fusions, Plasmid vectors, Compensatory mutations, LFH PCR, Northern blotting, qRT-PCR, RNA chaperones

1 Introduction

1.1 Application of Two-Plasmid Systems, Reporter Gene Fusions, and Chromosomal Deletions, Insertions, or Mutations to Study the Effect of sRNAs on Their Target mRNAs

To date, only four trans-encoded and eleven cis-encoded sRNAs and their targets have been investigated in detail in Bacillus subtilis, the majority of them in our group [1, 2]. For this reason, we present examples from our own work to illustrate the in vivo methods described in detail below. To investigate the interactions between a small regulatory RNA and its target mRNA in vivo, the sRNA has to be expressed in excess (e.g., from a medium copy (10–15 copies/cell) or multicopy plasmid (50–100 copies/cell)) over its target mRNA, which is expressed from the chromosome. To visualize effects on the target protein, the chromosomally located target gene can be provided with a C-terminal 3xFLAG tag that allows protein detection by Western blotting with anti-FLAG M2 antibodies [3, 4]. Effects on target mRNA levels or stability are investigated by Northern blotting or qRT-PCR (see Subheading 2). To find out if an sRNA impacts the translation of its target mRNA, translational reporter gene fusions (usually lacZ fusions) with the 5′ UTR, SD sequence, and the first three to eight codons of the target mRNA are constructed, and the β-galactosidase activity determined in strains that express the sRNA from a plasmid. In case the primary target of the sRNA is a TF (transcription factor), transcriptional lacZ fusions of promoters regulated by this TF should be constructed to assay the effects on downstream targets [4].

The sRNA gene can be cloned either under its native promoter or under an inducible promoter. The latter provides the advantage that the sRNA expression can be turned on specifically under conditions of target gene expression and a negative control (without induction) can be easily included. For constitutive expression under the native sRNA promoter, plasmids pGK14 (eryR), pGK15 (specR), or pGK16 (kanR) that replicate both in E. coli and B. subtilis and have about 15 copies/cell can be used. Multicopy plasmids for constitutive expression are either pPR1 (phleoR) [5] or pUCB2 (kanR catR) [6, 7], which are shuttle vectors based on pUC19 that replicate both in E. coli (selection for ampR) and B. subtilis (50 copies/cell). For tightly controlled inducible overexpression of the sRNA under control of anhydro-tetracycline, pWH353 [8] is suitable (e.g., pWSR1 and its mutants [9]). Plasmid maps are shown in Fig. 1a.

Fig. 1 Plasmid maps. (a) The circular maps of the plasmids described in the text are shown. pGK14-16, pPR1/ E, and pUCB2 are plasmids for constitutive expression of sRNA genes under their own promoters. Plasmid pWH353 [8] codes for the tet promoter and repressor (tetR) for tet-inducible overexpression. (b) Principle of transcriptional and translational reporter gene fusions. (c) Map of plasmid pGM16 used for the construction of transcriptional lacZ fusions

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_7, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

To analyze the impact of an sRNA on its target gene, an sRNA knockout strain has to be constructed, e.g., by LFH (long-flanking homology)-PCR. This method can also be employed to construct start- to stop-codon mutations, to N- or C-terminally tag a target mRNA, or to introduce compensatory mutations into the target gene directly in the chromosome. LFH-PCR generates either a “deletion cassette” to knock out or an “insertion cassette” to knock in genes on the B. subtilis chromosome by homologous recombination. Both cassettes are generated by amplifying upstream and downstream regions of the gene of interest, which are then fused to an appropriate antibiotic resistance gene.

In order to investigate if an sRNA inhibits or activates translation initiation of its target RNA(s), translational lacZ reporter gene fusions are constructed. These fusions should include all base-pairing regions, independent of whether they are located upstream, overlapping with or downstream of the RBS, or within the coding region of the target gene. In B. subtilis, the lacZ gene is not stable when expressed from a plasmid.
Therefore, translational lacZ fusions have to be integrated into a chromosomal region where they do not interfere with other physiological processes. Most frequently, integration into the amyE (amylase E) locus is used, but also the thrZ locus or aprE (major extracellular alkaline protease) gene are suitable. To exclude additional effects of the sRNA on transcription of the target gene, a heterologous weak promoter can be used to direct transcription of the translational target-lacZ fusion. This heterologous promoter has to be strong enough to allow measuring of at least 40–50 Miller units of β-galactosidase activity, but weak enough to ensure an excess of the sRNA over the target-lacZ mRNA to reflect the natural situation. When we used the constitutive weak promoter pI [10] from the S. agalactiae plasmid pIP501 for the expression of an ahrC-lacZ fusion from the chromosomal amyE locus [4], translational repression by SR1 from multicopy plasmid pWH353 carrying the sr1 gene (pWSR1, ref. [8]) was clearly detectable, whereas the use of the strong promoter pIII [5] yielded such a high ahrC-lacZ translational


activity that it could not be down-regulated by SR1 expressed from pWSR1 [4]. To study if an sRNA influences transcription of downstream genes (e.g., in case the primary target to which the sRNA base-pairs is a TF or a histidine kinase), transcriptional lacZ fusions comprising the promoter of the downstream target(s) including all binding regions for the TF have to be constructed and integrated into the amyE [4], thrZ or aprE locus of the B. subtilis chromosome. The sRNA has to be expressed from either its chromosomal locus under its native promoter or in trans from a plasmid.

1.2 Determination of Expression Profiles, Half-Lives, and Intracellular Concentrations of sRNA and mRNA

In many cases, sRNAs affect the stability of their target mRNAs, either directly (by forming a long double-stranded stretch that is a substrate for RNase III or by recruiting an RNase) or indirectly, by inhibiting translation and consequently preventing protection of the mRNA by translating ribosomes. In addition, the expression profiles and intracellular concentrations of both RNAs depend on environmental conditions. The half-life of the target mRNA (and also the sRNA) can be determined by treating a culture with rifampicin, which inhibits transcription initiation by the bacterial RNA polymerase, taking time samples, isolating total RNA, and subjecting it to either Northern blotting or qRT-PCR. Northern blotting can discriminate between different species of the same transcript, such as active forms and inactive pre-forms, 5′ or 3′ ends of transcripts that might be used as sRNAs, and endoribonucleolytic cleavage products of large operons with different stability; this is not possible with qRT-PCR. However, the advantage of qRT-PCR compared to Northern blotting is its higher sensitivity, which allows the analysis of an RNA from a limited number of cells, very long transcripts, and transcripts with very low abundance. The detection of very low-abundance transcripts in B. subtilis can be difficult. The low GC content of B. subtilis results in slightly unspecific primer binding due to less sequence variation and, therefore, the necessity to use longer oligos. To eliminate this problem, we perform a two-step qRT-PCR, in which the reverse transcription introduces an extrinsic sequence stretch that is highly specific in the B. subtilis background (5′ of the complementary RT primer region). The qPCR is then performed with one primer against this extrinsic sequence, resulting in complete background elimination. Furthermore, this extrinsic sequence can be adjusted, allowing fine-tuning of the primer properties, compensation for the low GC content, and optimization of the PCR product length.
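The half-life determination described above comes down to fitting a first-order decay to the signal measured at each time point after rifampicin addition. The following is a minimal sketch, assuming simple exponential decay and loading-corrected signals; the function name and the example values are hypothetical and not taken from this chapter.

```python
import math

def half_life_min(times_min, signals):
    """Estimate an RNA half-life (min) from a rifampicin time course.

    Assumes first-order decay, S(t) = S0 * exp(-k * t), and fits
    ln(signal) against time by ordinary least squares.
    """
    if len(times_min) != len(signals) or len(times_min) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(s) for s in signals]  # signals must be > 0
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # slope of the fit equals -k
    if slope >= 0:
        raise ValueError("signal does not decay over time")
    return math.log(2) / -slope

# Hypothetical band intensities (arbitrary units) after rifampicin addition
times = [0, 2, 4, 8]
signals = [100.0, 50.0, 25.0, 6.25]
print(round(half_life_min(times, signals), 2))
```

A log-linear fit is used here because it needs no external libraries; real time courses often deviate from single-exponential decay, in which case only the initial linear range should be fitted.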
1.3 In Vivo Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction

In case an RNA chaperone binds the sRNA or the mRNA in vitro in an EMSA and/or promotes their interaction, this interaction has to be corroborated in vivo. This can be done by the investigation of translational or transcriptional lacZ fusions in a wild-type strain and an isogenic RNA chaperone knockout strain. Thereby, mutations in the chaperone binding sites in either sRNA or target mRNA can be introduced. In many cases, RNA chaperones affect the stability of the sRNA or its target mRNA. Such effects can be analyzed by half-life determination in Northern blots or qRT-PCR as described in Subheading 3.2. Examples are described in Müller et al. [4].

2 Materials

2.1 Two-Plasmid Systems, Reporter Gene Fusions, and Chromosomal Deletions, Insertions, or Mutations to Study the Effect of sRNAs on Their Target mRNAs

You need all materials and devices to perform cloning experiments: centrifuge, power supply, agarose gel electrophoresis, thermoblock, vortexer, thermocycler for PCR, shaker bath with appropriate cultivation flasks, 37 °C incubator, Petri dishes.

2.1.1 Growth Media

1. TY medium: 16 g bacto tryptone, 10 g yeast extract, 5 g NaCl, dissolve in 1 L distilled water and autoclave.
2. TFB medium: 12 g bacto tryptone, 24 g yeast extract, 4 mL glycerol, dissolved in 900 mL distilled water, autoclaved in 135 mL portions. Before use, add 15 mL phosphate buffer to 135 mL TFB to yield 150 mL.
3. Phosphate buffer: 1 g KH2PO4, 12.54 g K2HPO4, dissolve in 100 mL distilled water and autoclave.
4. 1× Spizizen salts: 6 g KH2PO4, 14 g K2HPO4, 2 g (NH4)2SO4, 1 g Na-citrate × 2 H2O, 0.2 g MgSO4 × 7 H2O; add distilled water to 1 L, adjust pH to 7.0 and autoclave.
5. 10 mL Spizizen medium: mix 10 mL 1× Spizizen salts, 0.1 mL autoclaved 50% glucose, 0.1 mL autoclaved 2% casamino acids, 0.1 mL autoclaved 10% yeast extract and, in case of auxotrophy of the B. subtilis strain, 0.05 mL filter-sterilized 10 mg/mL amino acid.
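Because the recipes above are given for fixed volumes, scaling them is simple proportional arithmetic. The following is a minimal sketch for the Spizizen medium of item 5; the dictionary keys and the function name are illustrative assumptions, while the volumes are those listed above for 10 mL.

```python
# Volumes (mL) per 10 mL of 1x Spizizen salts, as listed in item 5
SPIZIZEN_PER_10_ML = {
    "1x Spizizen salts": 10.0,
    "50% glucose": 0.1,
    "2% casamino acids": 0.1,
    "10% yeast extract": 0.1,
    "10 mg/mL amino acid (auxotrophic strains only)": 0.05,
}

def scale_recipe(recipe, base_volume_ml, target_volume_ml):
    """Scale every component of a recipe linearly to a new batch volume."""
    factor = target_volume_ml / base_volume_ml
    return {name: round(vol * factor, 3) for name, vol in recipe.items()}

# Example: prepare medium based on 50 mL of Spizizen salts
for name, vol in scale_recipe(SPIZIZEN_PER_10_ML, 10.0, 50.0).items():
    print(f"{name}: {vol} mL")
```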

2.1.2 Vector, PCR Fragment Preparation, and Cloning

1. Plasmid vectors
2. Chromosomal DNA from B. subtilis wild-type strain (prepared as described in [3])
3. Restriction enzymes and their corresponding 10× buffers
4. Q5 polymerase with 5× Q5 buffer (NEB)
5. 2 mM dNTP
6. Appropriate oligos (oligodeoxyribonucleotides) for PCR
7. T4 DNA ligase and 10× ligation buffer
8. Kit for the purification of DNA fragments from agarose gels (e.g., from Qiagen)
9. 20× TAE buffer for agarose gels: 96.9 g Tris-HCl, 32.8 g Na-acetate, 5.8 g EDTA; adjust pH to 7.5 with acetic acid
10. Agarose
11. Phenol and chloroform
12. Ethidium bromide (10 g/L)
13. 96% and 80% ethanol
14. 3 M Na-acetate pH 5.0
15. 10 mg/mL tRNA in bidist as a carrier for ethanol precipitation

2.1.3 Bacterial Strains

1. E. coli strain (DH5α or TG1)
2. Wild-type B. subtilis strain (e.g., 168)

2.1.4 Transformation and Selection

1. 0.1 M CaCl2 solution (sterile filtered)
2. Antibiotics: e.g., ampicillin (100 μg/mL)
3. Kit for plasmid preparation
4. X-Gal: 25 mg/mL in DMF (dimethylformamide, store at -20 °C)
5. Starch
6. Potassium iodide solution
7. Glycerol

2.1.5 β-galactosidase Measurements

1. 1 M MgSO4 2. Lysozyme/DNase solution: 8 mg/mL lysozyme and 50 μL of 25 mg/mL DNase I, store in 50 μL aliquots at -20 °C 3. ONPG (ortho-nitrophenyl-β-galactoside): 4 mg/mL in 0.1 M phosphate buffer pH 7.0 (store in 1.6 mL aliquots at -20 °C) 4. Z-buffer: 10.7 g Na2HPO4 × 2 H2O, 6.24 g NaH2PO4 × 2 H2O, 0.75 g KCl, 0.12 g MgSO4, 2.7 mL β-mercaptoethanol (freshly added, Z-buffer is stored in fridge and not autoclaved) 5. Stop solution: 1 M Na2CO3 6. Spectrophotometer, cuvettes
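These reagents are used in the assay of Subheading 3.1.6, which relies on two small calculations: the culture volume to harvest, X = 625 μL : OD600, and the activity for 200 μL lysate, (1500 × OD420) : (1/2 tmin). The following is a minimal sketch of that arithmetic, reading ":" as division and "1/2 tmin" as half the incubation time in minutes. Both readings are interpretations of the printed formulas, and the function names are illustrative.

```python
def sample_volume_ul(od600):
    """Culture volume X (in uL) to harvest: X = 625 uL / OD600."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return 625.0 / od600

def bgal_activity_200ul(od420, t_min):
    """Activity for 200 uL lysate: (1500 * OD420) / (0.5 * t_min)."""
    if t_min <= 0:
        raise ValueError("incubation time must be positive")
    return (1500.0 * od420) / (0.5 * t_min)

# Hypothetical readings
print(sample_volume_ul(1.25))         # 500.0
print(bgal_activity_200ul(0.25, 10))  # 75.0
```

The sketch covers only the 200 μL case stated in the protocol; the corresponding factor for 20 μL lysate is not given in this section and is therefore left out.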

2.1.6 LFH Materials

1. Thermocycler machine 2. Q5 high fidelity DNA polymerase and buffer (NEB) 3. 2 mM dNTPs 4. Appropriate primers 5. 1% or 3% agarose gels 6. Kit for purification of DNA fragments (e.g., Qiagen)
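The primers of item 4 carry the 20–25 nt overhangs described in Subheading 3.1.4, which make neighbouring fragments overlap so that they can prime each other in the joining PCR. The following is a minimal sketch of how such a chimeric primer could be composed. The sequences, the function names, and the default lengths of 20 nt are placeholders chosen for illustration, not primers from this chapter.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def front_reverse_primer(front_end, next_start, overhang_nt=20, anneal_nt=20):
    """Compose the reverse primer for the front cassette.

    The 3' part anneals to the end of the front cassette; the 5' overhang is
    complementary to the start of the adjacent (e.g., resistance) cassette.
    Both inputs are top-strand sequences written 5'->3'.
    """
    if len(front_end) < anneal_nt or len(next_start) < overhang_nt:
        raise ValueError("input sequences are shorter than requested lengths")
    junction = front_end[-anneal_nt:] + next_start[:overhang_nt]
    return reverse_complement(junction)

# Placeholder top-strand sequences (5'->3'); NOT real B. subtilis or cassette sequences
front_end = "ATGGCTAGCTAGGATCCGATCGTACGATCG"
resistance_start = "GGAATTCCATATGAAACGTTTGACCTAGGC"

primer = front_reverse_primer(front_end, resistance_start)
print(len(primer), primer)
```

The sketch checks only the sequence composition; melting temperature, secondary structure, and specificity still have to be evaluated with a dedicated primer design tool.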

2.2 Northern Blotting

2.2.1 RNA Isolation

The best method to prevent RNA degradation before the isolation of total RNA is to add 1 mL RNAprotect Bacteria Reagent (Qiagen) to 0.5 mL B. subtilis culture, vortex, and incubate for 5 min at RT. After centrifugation, the pellets can be flash-frozen in liquid nitrogen and stored until preparation at -20 °C. Common laboratory materials and devices like Eppendorf tubes, pipettes and tips, centrifuge, cooling centrifuge, and thermoblock with thermoshaker are required.

1. Lysis buffer 1: 100 mM NaCl, 50 mM EDTA, 10% saccharose, 3 mg/mL lysozyme
2. Lysis buffer 2: 10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 100 mM Na-acetate, 2% SDS
3. Aqua Phenol
4. Phenol/chloroform: 1:1, freshly mixed
5. 3 M Na-acetate pH 5.0
6. 96% EtOH and 80% EtOH

2.2.2 PAAGE and Tank Blotting

1. Siliconized glass plates. 2. PAAG casting system. 3. Vertical gel electrophoresis system. 4. Tank blotting system. 5. UV cross-linker. 6. 4–6% denaturing PAA solution: Prepare as described in (Chapter 8, Subheading 2.1.2). 7. Whatman paper pieces: Prepare 2 pieces per gel, slightly bigger than the gel size. 8. Nylon membrane: Cut in the same size as the gel. 9. 0.5× and 1× TBE buffer: Prepare from the 10× TBE as described in (Chapter 8, Subheading 2.1.2).

2.2.3 Agarose Gels and Capillary Blotting

1. Horizontal gel electrophoresis system.
2. 1% or 1.5% agarose gel: Prepare in 1× BPTE without ethidium bromide.
3. 10× BPTE: 300 mM Bis-Tris, 100 mM PIPES, 10 mM EDTA with a final pH of 6.5.
4. 50 mM NaOH.
5. Glyoxal mix: 6 mL DMSO, 2 mL deionized glyoxal (60%), 1.2 mL 10× BPTE buffer, 0.6 mL glycerol, 0.2 mL ethidium bromide; aliquots stored at -20 °C.
6. A 7–10 cm high stack of paper towels cut slightly larger than the gel piece to be blotted.
7. 7 pieces of Whatman paper: Cut as above.
8. One piece of nylon membrane: Cut as above.

2.2.4 Preparation of Riboprobes/Oligo-DNA Probes (See Chapter 8, Subheading 2.1.1)

1. Sephadex G-50 (Sigma): 10 g Sephadex G-50, 150 mM NaCl, 10 mM EDTA, 50 mM Tris, pH 8.0. Dissolve in 150 mL bidist, autoclave and store at 4 °C. 2. Glass test tube. 3. Sterile gauze. 4. Swing bucket rotor centrifuge.

2.2.5 Prehybridization, Hybridization of the Probe, and Detection

1. Hybridization oven and the corresponding hybridization glass tubes. 2. 20× SSC: 3 M NaCl and 0.3 M Na-citrate; adjust the pH to 7.2 and autoclave. 3. 100× Denhardt’s solution: 2% bovine serum albumin, 2% Ficoll 400, and 2% polyvinylpyrrolidone in bidist, sterile filtered, and stored in aliquots at -20 °C. 4. Hybridization solution for riboprobes: 5× SSC, 5× Denhardt, 50% formamide, 50 mM Na-phosphate pH 6.7, 0.1% SDS, 1% dextran sulfate. 5. Hybridization solution for oligo-DNA probes: 6× SSC, 3× Denhardt, 0.5% SDS. 6. Herring or salmon sperm DNA: Stock solution 10 mg/mL. 7. Membrane washing solution: 2× SSC, 0.5% SDS. 8. PhosphorImager with screens and the corresponding analysis software.

2.3 Two-Step Quantitative Real-Time PCR

Common laboratory materials and devices like Eppendorf tubes, pipette tips, thermoblock, centrifuge, and ice are required. Moreover, a real-time PCR thermocycler (e.g., the Mx3005P-System from Stratagene) and corresponding 96-well PCR plates are needed.

2.3.1 DNase I Treatment

1. Isolated sample RNA as described in Subheading 3.2.1.
2. DNase I (8 U, RNase-free).
3. DNase I buffer.
4. Phenol.
5. Phenol/chloroform: 1:1, freshly mixed.
6. 3 M Na-acetate pH 5.0.
7. 96% EtOH and 80% EtOH: Prepare as described in (Chapter 8, Subheading 2.1.1).

2.3.2 Quantitative cDNA Synthesis

1. DNase I-treated RNA as described in Subheading 3.3.1 2. 10 mM dNTPs 3. RT primer mix (2 μM each) 4. SuperScript IV Reverse Transcriptase kit (Invitrogen) 5. RNasin (Promega) 6. RNase A (4 mg/mL; Promega)

2.3.3 qRT-PCR

1. cDNA as described in Subheading 3.3.2 2. Primer mix for the gene of interest (5 μM each) 3. Maxima SYBR Green/ROX qPCR Master Mix 2× kit (Thermo Scientific)

2.4 In Vivo Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction

The same materials as in Subheading 2.1 are required.

3 Methods

3.1 Two-Plasmid Systems, Reporter Gene Fusions, and Chromosomal Deletions, Insertions, or Mutations to Study the Effect of sRNAs on Their Target mRNAs

3.1.1 Cloning of Wild-Type or Mutated sRNA Gene into a Plasmid Vector (See Note 1)

1. Cleave the cloning vector with restriction enzyme 1 (see Note 2).
2. Check an aliquot on a 1% agarose gel for complete digestion.
3. Perform a phenol/chloroform extraction and ethanol precipitation of the linearized vector.
4. Dissolve the vector in 10 μL bidist and cleave with restriction enzyme 2.
5. Perform a phenol/chloroform extraction and ethanol precipitation of the double-digested vector.
6. Dissolve in 10 μL bidist and check 2 μL on a 1% agarose gel to determine the amount for ligation or determine the concentration with a NanoDrop (see Note 3).
7. Set up a 50 μL PCR reaction with Q5 polymerase (25–30 cycles) (for example see Subheading 3.1.4) with chromosomal DNA from B. subtilis wild-type strain as a template and appropriate oligos to amplify the sRNA gene (either with or without native promoter, depending on the vector).
8. Purify the PCR fragment from a 3% agarose gel (fragments smaller than 500 bp) with a kit.
9. Dissolve the PCR fragment in 30 μL bidist, and add 10 μL restriction buffer.
10. Add 5 μL of each restriction enzyme (depending on the vector), and incubate for 1–2 h at 37 °C (see Note 4).
11. Perform a phenol/chloroform extraction and ethanol precipitation with 1 μL tRNA as carrier.
12. Dissolve the fragment in 20 μL bidist.
13. Check an aliquot (2–3 μL) on a 3% agarose gel to estimate the amount needed for ligation or determine the concentration with a NanoDrop.
14. Set up the ligation as follows: 1 μL plasmid vector (about 30–50 ng), 5 μL PCR fragment encoding sRNA (about 100–120 ng), 1 μL 10× ligation buffer, 2 μL bidist (or less, in case you need more fragment), 1 μL T4 DNA ligase. Total volume = 10 μL.
15. Incubate at room temperature for at least 1 h (up to overnight incubation).

3.1.2 E. coli Transformation and Selection of Correct Clones

1. Prepare E. coli competent cells according to Hanahan, 1983 [11], or employ commercially available frozen competent E. coli cells.
2. Use 3 μL of the ligation mix for transformation with 100 μL of E. coli competent cells [11].
3. Plate on TY agar plates with appropriate antibiotic (usually 100 μg/mL ampicillin), i.e., one-tenth of transformation mix on the first plate and the rest on the second plate (undiluted).
4. Incubate plates overnight at 37 °C (see Note 5).
5. Pick 19 clones for a colony PCR as described ([12], see Note 6).
6. Check colony PCR on 3% agarose gel to identify recombinant transformants.
7. Cultivate 4 recombinant transformants in 5 mL TFB medium with antibiotic and vigorous shaking at 37 °C.
8. Isolate plasmids as described in [13] or with a kit.
9. Cleave aliquots with the two restriction enzymes used for cloning for 1 h at 37 °C. Check cleavage on a 3% agarose gel.
10. Send positive clones for sequencing (see Note 7).

3.1.3 Bacillus subtilis Transformation

Use the recombinant plasmid for transformation of competent B. subtilis cells as follows (see Note 8):

1. Inoculate Spizizen medium with an overnight culture to OD600 = 0.1.
2. Grow in a shaker bath at 37 °C with 300 rpm.
3. Measure the optical density first after 3.5 h, then every 30 min.
4. Determine the transition point T0 from logarithmic to stationary growth phase.
5. At T0 + 2 h, add 0.5 mL culture to about 1–5 μg of your plasmid in an Eppendorf tube, and use another 0.5 mL culture as negative control (see Notes 8 and 9).
6. Shake for at least 1 h (up to 2 h) at 37 °C in a thermoshaker.
7. Plate 0.1 mL on one TY plate with antibiotic, briefly centrifuge the rest, suspend in 50–100 μL supernatant and plate on the second TY plate. Use the same approach for your negative control.
8. Incubate the three plates overnight at 37 °C.
9. Grow two or three transformants in TY with antibiotic, prepare glycerol cultures and freeze them at -80 °C for later experiments.

3.1.4 Long-Flanking Homology (LFH)-PCR

The following PCR fragments are required (see Fig. 2):

(a) Front and Back cassettes. Both cassettes should be about 1 kb long (see Note 10) with 20–25 bp overhanging complementary sequences to the adjacent cassette.
(b) Resistance cassette. It should have 20–25 bp overhanging complementary sequences to both adjacent cassettes and can be obtained by performing a PCR on a plasmid carrying the desired resistance gene.
(c) Insert. Gene of interest (e.g., carrying mutations, or a FLAG tagging mRNA in the native locus). It should have 20–25 bp overhanging complementary sequences to both adjacent cassettes.

In case of knock-out, the gene to be deleted is replaced by an antibiotic resistance cassette. Based on the purpose of LFH-PCR, the location and direction of the resistance cassette can be changed.

1. Set up individual Q5 PCR reaction as follows: 1 μL template (chromosomal DNA or plasmid), 1 μL oligo A (10 pmol), 1 μL oligo B (10 pmol), 10 μL 5× Q5 DNA polymerase reaction buffer, 5 μL 2 mM dNTPs, 31.25 μL bidist, 0.75 μL Q5 DNA polymerase. Total volume = 50 μL.
2. PCR scheme: 30 s 98 °C denaturation, 30 s 42 °C annealing (see Note 11), 1 min 72 °C elongation (for 1 kb fragment).
3. All individual PCR products are purified from an agarose gel and resuspended in 20 μL.
4. For the final LFH PCR, add 5 μL of each fragment (see Note 12) without any oligos in a Q5 PCR reaction to a final volume of 100 μL and perform 10 cycles as described above in step 2 with 3 min elongation at 72 °C for each cycle.
5. After 10 cycles, add oligo 1 and oligo 8 (10 pmol each) and run 30 cycles with the time scheme shown above and 3 min elongation (see Note 13).
6. Check 5 μL of the PCR product on a 1% agarose gel alongside a medium-range DNA marker.
7. Use the rest of the PCR product to transform B. subtilis.
8. Prepare chromosomal DNA from several transformants [3], and employ them as PCR template with oligos 1 and 8. Use wild-type chromosomal DNA as a negative control.
9. Check PCR yields on a 1% agarose gel under long-wave UV light to avoid mutations (see Note 14).
10. Excise PCR fragments of the expected length (e.g., 3 kb vs. 2 kb in the wild-type), purify, and sequence in both upstream and downstream directions using oligos that bind within the resistance gene (see Fig. 2).

Fig. 2 Working scheme for gene insertion by LFH PCR. Shown is a gene insertion cassette generated by LFH PCR. The upstream and downstream regions of the gene of interest are depicted as front and back cassettes. Oligos with overhangs to the adjacent fragments are denoted by arrows with overhangs in red. ABR, antibiotic resistance gene

3.1.5 Construction of Compensatory Mutations to Confirm Base-Pairing Interactions (See Notes 15, 16, 17, and 18)

1. Based on the complementarity between sRNA and target RNA predicted by IntaRNA [14], design oligos to introduce mutations in one of the complementary regions in the sRNA and the compensatory mutation in the target RNA (see Note 19).
2. Construct a strain with a chromosomal deletion of the sRNA gene, e.g., by LFH PCR (see Subheading 3.1.4).
3. Perform the mutant-PCR with the Q5 polymerase and clone the mutated fragment into the same vector you used for expression of the wild-type sRNA as described in Subheading 3.1.1.
4. Transform the B. subtilis strain carrying the deletion of the sRNA gene with plasmids expressing either wild-type or mutated sRNA as described in Subheading 3.1.3 and prepare glycerol cultures of 5–7 transformants with mutant plasmid.
5. Compare the phenotype of the strains expressing wild-type sRNA and mutated sRNA from a plasmid.
6. In case you detect a clear difference, construct a strain with a compensatory mutation in the target gene by LFH-PCR (see Subheading 3.1.4).
7. Transform this strain with either the plasmid expressing wild-type sRNA or that expressing mutated sRNA as described in Subheading 3.1.3.
8. Perform phenotypical analyses to determine if the mutation in the target gene can be compensated by the mutated sRNA, whereas this should not be the case for the wild-type sRNA, and vice versa.

3.1.6 Construction of Transcriptional or Translational lacZ Reporter Gene Fusions to Demonstrate Effects of an sRNA on Translation of Its Primary Target or Transcription of Downstream Targets

1. For the construction of a transcriptional lacZ fusion, prepare vector pMG16 (see Fig. 1c) by cleavage with BamHI and EcoRI, as described in Subheading 3.1.1.
2. Amplify by PCR the promoter region of the downstream target gene of your sRNA and purify the fragment after cleavage with BamHI and EcoRI, as described in Subheading 3.1.1 (see Note 20).
3. Ligate fragment and vector, transform E. coli as described in Subheadings 3.1.1 and 3.1.2, and plate on TY plates with ampicillin (see Note 21).
4. Perform a colony PCR to identify recombinant clones as described in Subheading 3.1.2, isolate plasmids, and send for sequencing.
5. Linearize one correct recombinant pMG16 plasmid isolated from E. coli with restriction enzyme ScaI (see Note 22) and use it for the transformation of the isogenic B. subtilis wild-type and sRNA knockout strains (the latter constructed as described in Subheading 3.1.4) as described in Subheading 3.1.3. Plate on TY with 100 μg/mL spectinomycin and X-Gal (100 μL/100 mL). In the case of strong promoters, the transformant colonies will be blue.
6. Pick at least 20 clones and transfer them to a TY plate with 0.5% starch and in parallel on a TY plate with spectinomycin. After at least 8–10 h of growth (better overnight), overlay the starch plate with 3 mL potassium iodide solution. Clones without lysis zone indicate that the amyE locus has been interrupted by successful double-crossing over and should be used for the assays.
7. Pick at least 10 colonies of each isogenic strain, and prepare glycerol cultures for the repetition of measurements.
8. Grow at least 6 clones of each isogenic strain in TY with 1 mM MgSO4 (see Note 23) in the shaker bath at 37 °C to the desired optical density, and take samples as follows (see Note 24): measure OD600, then calculate X = 625 μL / OD600; take volume X, centrifuge in an Eppendorf tube (see Note 25), and measure the β-galactosidase activity as follows.
9. Add 5 μL lysozyme/DNase solution.
10. Incubate for 10 min at 37 °C (solution clears off).
11. Centrifuge 2 min (very small pellet will be visible).
12. Take from the supernatant 20 μL or 200 μL (if you expect low activity) and add Z-buffer to a total volume of 800 μL.
13. Add 200 μL ONPG solution (do not forget a negative control: 800 μL Z-buffer + 200 μL ONPG solution).
14. Immediately incubate at 28 °C between 1 min and 45 min until the solution becomes yellowish.
15. Note incubation time in min (tmin).
16. Stop the reaction by adding 0.5 mL 1 M Na2CO3.
17. Measure OD420 against control without supernatant (see Note 26).
18. Calculate β-galactosidase activity for 200 μL Z-lysate as follows (see Note 27): β-galactosidase activity = (1500 × OD420) / (1/2 × tmin).
19. Grow independently at least two more times and measure the β-galactosidase activities to obtain three biological replicates for a statistical analysis.
20. For the construction of translational lacZ fusions of your target mRNA, use the same approach as described in steps 1–9 in this section, but prepare BamHI/EcoRI-cleaved vector pGAB1 instead of pMG16 and PCR-amplify the 5′ UTR, including the SD sequence, and at least the first three codons of your target gene such that they will be in frame with the SD-less lacZ gene (see Notes 28 and 29).
21. When you transform E. coli with the ligation mix, you can directly assay for correct clones when plating on TY plates with ampicillin and X-Gal (blue colonies), since B. subtilis SD sequences are recognized in E. coli, and the empty vector yields white colonies.
22. Proceed as described above in steps 5–8. However, since pGAB1 contains the gene for a thermostable β-galactosidase, its activity has to be measured at 55 °C.
23. In the case of both the transcriptional and the translational fusions, you can transform the sRNA knockout strain with a plasmid for sRNA overexpression to confirm complementation.
24. Furthermore, you can directly use translational lacZ fusions to assay compensatory mutations as described in Subheading 3.1.5.

3.2 Determination of Expression Profiles, Half-Lives, and Intracellular Concentrations of sRNA and mRNA

3.2.1 Isolation of Total RNA

1. Dissolve the pellet in 100 μL lysis buffer 1 (see Note 30), and incubate for 5 min at 37 °C.
2. Add 300 μL of lysis buffer 2 (see Note 30), and gently invert the tube several times.
3. Add 300 μL of hot phenol (65 °C), and incubate the sample in a thermo-shaker for 3 min at 65 °C with vigorous shaking.
4. Centrifuge at 13,000 rpm for 8 min, and transfer the supernatant to a tube containing 300 μL phenol/chloroform and vortex it for 1 min.
5. Centrifuge for 5 min and transfer the upper phase with the RNA to a tube containing 300 μL phenol/chloroform.
6. After centrifugation for 5 min, transfer the supernatant to a tube containing 40 μL 3 M Na-acetate pH 5.0 and 1 mL 96% ethanol, followed by 30 min at -20 °C.
7. After 10 min centrifugation in a cooling centrifuge and washing with 80% ethanol for 2 min, dissolve the RNA in 20 μL bidist.

3.2.2 Separation of sRNA and Short Target mRNA in Polyacrylamide Gels and Subsequent Tank Blotting

1. Add an equal volume of FD to the isolated RNA samples, denature for 5 min at 95 °C, and place on ice.
2. Load samples alongside a labeled marker (e.g., pBR322 × MspI, NEB, see Note 31) onto a 4–6% denaturing PAAG.
3. Run the gel at 300–400 V and 25 mA in 0.5× TBE until BPB (bromophenol blue) is at the expected position based on the size of the investigated RNA.
4. Soak a piece of nylon membrane for 1 min in DEPC water, and afterward, place it in 1× TBE buffer.
5. Soak both foam cushions of a blotting cassette (see Fig. 3) in 1× TBE.
6. Remove one of the glass plates, cut unnecessary parts of the gel, place a piece of Whatman paper on top of the gel, and gently press it.
7. Carefully lift the gel with the Whatman paper, soak it in 1× TBE, and place on one of the foam cushions (see Note 32).
8. Place the wet nylon membrane on the gel and remove air bubbles (see Note 32).
9. Place a wet Whatman paper soaked in 1× TBE on the nylon membrane, and remove air bubbles.
10. Close the blotting cassette, and place it in the blotting chamber containing 1× TBE with the nylon membrane facing the anode.
11. Tank blotting is performed overnight at 4 °C and 20 V.
12. Carefully remove the nylon membrane, and dry it for 2 h at RT or 15–20 min under a hot table lamp.
13. Use a UV cross-linker to covalently cross-link the RNA to the membrane.

Fig. 3 Principle of tank blotting. Left: Principle of assembly for tank blotting as described in the text. Right: blotting chamber of the company BIO-RAD

3.2.3 Separation of Target mRNAs >1 kb in Agarose Gels and Subsequent Capillary Blotting

1. Add 10 μL glyoxal mix to 2 μL total RNA, and incubate for 1 h at 50 °C to denature structured RNA.
2. Load samples into the slots of the BPTE agarose gel, and in one adjacent slot, load BPB with glycerol as mobility marker.
3. Perform electrophoresis at 4 °C and 150 V (50 mA) until BPB has reached the middle of the gel (see Note 33).
4. Incubate the gel for 20 min in 50 mM NaOH on a shaker to cleave larger RNA species, followed by equilibration of the pH for 20 min with 20× SSC (see Note 34).
5. Set a paper towel stack on a glass plate that bridges the vessel containing the 20× SSC buffer reservoir, then add four pieces of dry Whatman paper, followed by two wet ones (wetted with 20× SSC) (see Note 35).
6. Briefly wet the membrane in DEPC water and afterward in 20× SSC, and transfer it on top of the 5–7 cm high paper stack.
7. Place the treated gel on the membrane (all air bubbles must be removed, see Note 32), and on top of it another piece of wet Whatman paper.
8. Take a long Whatman paper strip soaked in 20× SSC, as wide as the gel (see Fig. 4), and place it on top of the stack (remove bubbles) such that the two short edges of the strip rest inside the buffer vessel.
9. Fix a glass plate on top of the stack with a weight (a metal weight or a full 2 L bottle) to ensure even transfer. Conduct capillary blotting for 3 h (short RNA up to 5 kb) or overnight.
10. After blotting, discard all paper and dry the membrane (see Note 36).
11. Bind the transferred RNA covalently to the membrane by UV cross-linking (see Note 37).

3.2.4 Preparation of Riboprobes and Oligo-DNA Probes

1. To prepare riboprobes follow the steps described in Chapter 8, Subheading 3.1.1, Fig. 1 (see Note 38). 2. For oligo-DNA probes, label a 20 nt (or longer) DNA oligo at the 5′ end with PNK as described in (Chapter 8, Subheading 3.1.8). 3. Purify your riboprobe via a Sephadex G-50 column as described in (Chapter 8, Subheading 3.1.5).

3.2.5 Prehybridization, Hybridization, and Exposure

When riboprobes are used, prehybridization and hybridization are performed at 42 °C in formamide-containing solutions. For oligoDNA probes, the hybridization temperature depends on the sequence of the probe. However, oligo-DNA probes against the abundant ribosomal RNAs (5S, 16S, or 23S rRNA) that are used to obtain a loading control can be applied in SSC/SDS solution at temperatures up to 55 °C. 1. Place the membrane with the cross-linked RNA into the hybridization glass tube and prehybridize in 15 mL hybridization

134

Inam Ul Haq et al.

Fig. 4 Downward capillary transfer. The arrangement is described in the text

Fig. 5 Northern blot from a 1.5% agarose gel. Ability of the SR1 mutants to complement the effects of the sr1 knockout strain on the gapA operon. B. subtilis DB104(Δsr1::cat) was transformed with the pWSR1 derivatives expressing mutant SR1 species, grown at 37 °C in TY medium until OD560 = 5.0 and used for the preparation of total RNA. The RNA was treated with glyoxal, separated on a 1.5% agarose gel, blotted onto nylon membrane, and hybridized with a 32P-[α-dATP]-labeled DNA probe specific for gapA. The filter was reprobed against SR1 and afterward against 23S rRNA as loading control. Autoradiogram of the Northern blot is shown. (Taken from [9])

In Vivo Analysis of sRNA-mRNA Interactions in B. subtilis

135

Fig. 6 Northern blotting allows to analyze the expression profile of an antisense RNA and its target mRNA. Expression profiles and intracellular concentrations of bsrE mRNA and SR5 in complex TY and minimal CSE medium. Above: Northern blots. B. subtilis strain DB104 was grown at 37 °C in TY or CSE medium, aliquots taken at different times, and later on used for the preparation of total RNA. RNA was separated on 6% denaturing PAA gels along with a defined amount of in vitro synthesized bsrE RNA or SR5, blotted onto nylon membrane, and hybridized with riboprobes against bsrE mRNA or SR5. Autoradiograms of the corresponding gels are shown. For the correction of loading errors, filters were reprobed with a 32P-[γ-ATP]-labeled oligonucleotide specific for 5S rRNA (framed in red). C1 and C2, 1.0 fmol and 0.2 fmol of in vitro synthesized RNA, respectively. Below: Quantification of the gels. For both RNAs, the amounts calculated using the in vitro synthesized controls are indicated in fmol, which could be used to determine the relative amounts of SR5 and bsrE RNA. The corresponding growth curves are shown using the Y-axis at the right side of each graph. (Taken from [7])

Fig. 7 Northern blotting allows the determination of RNA half-lives. The half-life of rpsO mRNA is compared between the wild-type DB104 strain and the isogenic RNase Y knockout strain. B. subtilis strains were grown in TY medium, rifampicin added, time samples taken, and used for the preparation of total RNA. RNA was separated on 6% denaturing PAA gels and blotted onto nylon membrane. Autoradiograms of the Northern blots are shown. Calculated half-lives with standard deviations are averages of four independent determinations. (Taken from [3])

136

Inam Ul Haq et al.

Fig. 8 Determination of the intracellular concentration of an sRNA. Total RNA is isolated from a defined number of cells (determined by plating dilutions of the culture on agar plates). For comparison, an unlabelled reference sRNA is synthesized in vitro with T7 RNA polymerase, and its concentration determined by Nanodrop. The total RNA containing the sRNA and different amounts of the reference sRNA are separated in a 6% denaturing PAA gel and subjected to Northern blotting. A riboprobe against the sRNA and a DNA-oligo probe against 5S rRNA (loading control) are used. As a control for loss during RNA isolation, cells of an sRNA knockout strain (Δsr1), to which the same amounts of reference RNA are added, are subjected in parallel to the RNA isolation procedure (compare a and b). the autoradiogram of the gel is taken from [16]

solution with 220 μL herring or salmon sperm DNA at 42 °C for 2–4 h.
2. Carefully remove the prehybridization solution.
3. Perform hybridization overnight in 10 mL hybridization solution without herring sperm DNA, but with the radioactively labeled probe (about 10⁶ cpm/mL hybridization solution).
4. Subsequently, remove the probe with a pipette and store it at -20 °C (it can be used up to 10 times).
5. Wash the membrane twice for 20 min each with 20 mL washing solution at 42 °C. The wet filter should never dry out.
6. Place the membrane in a plastic bag and expose it to a PhosphorImager screen in a cassette for 30 min (see Note 39) or overnight.
7. After scanning the plate, perform the analysis with the corresponding PhosphorImager-based program.
8. In all instances, filters have to be reprobed (see Note 40) with a labeled oligo-DNA probe against a stable RNA (5S, 16S, or 23S rRNA) to allow the correction of loading errors.

In Vivo Analysis of sRNA-mRNA Interactions in B. subtilis


Below, more examples of the application of Northern blotting are shown. This method can be used for the investigation of the expression profile of an sRNA or its target mRNA in different culture media (Fig. 6). In this case, defined amounts of in vitro synthesized RNA (see Chapter 8, Subheading 3.1.1) are loaded (C1, C2) to directly determine the amounts of the sRNA SR5 and its target, bsrE mRNA, over the course of growth. Another application is the determination of the half-life of an sRNA or its target mRNA (Fig. 7). The RNA half-life can be compared in different media, under different stress conditions, or in wild-type and RNase knockout strains. To this end, B. subtilis strains are cultivated until the desired growth phase, rifampicin, which inhibits prokaryotic transcription initiation but does not affect elongation or degradation, is added to a final concentration of 200 μg/mL, and time samples are taken for the isolation of total RNA. The example in Fig. 7 shows that rpsO mRNA is degraded by RNase Y, because its half-life is much longer in a Δrny strain. A further application is the determination of the intracellular concentration (number of molecules/cell) of a specific regulatory sRNA, as shown in Fig. 8 for the trans-encoded sRNA SR1 from B. subtilis.

3.3 Two-Step Quantitative Real-Time PCR

3.3.1 DNase I Treatment

1. Dissolve the RNA sample in 86 μL bidist.
2. Add 10 μL of DNase I buffer and 4 μL of DNase I and incubate for 1 h at 37 °C (see Note 41).
3. Perform a phenol/chloroform purification followed by ethanol precipitation as described in Chapter 8, Subheading 3.1.1.

3.3.2 Quantitative cDNA Synthesis

1. Dissolve the DNase I-treated RNA in 10 μL bidist.
2. Add 1 μL of mixed RT oligos (see Note 42) and 1 μL of dNTPs.
3. Incubate for 5 min at 65 °C and transfer to ice.
4. Add 4 μL of 5× RT buffer, 2 μL DTT (see Note 43), and 1 μL RNasin, and incubate for 2 min at 42 °C.
5. Add 0.5 μL of SuperScript IV RT and incubate for 1 h at 42 °C (see Note 44).
6. Add 0.5 μL of RNase A and incubate for 10 min at 37 °C.
7. Perform a phenol/chloroform purification, followed by ethanol precipitation, as described in Chapter 8, Subheading 3.1.1.

3.3.3 qRT-PCR

1. Dissolve the cDNA in 20 μL bidist.
2. Load each reaction on the 96-well plate with 9 μL bidist, 1.5 μL primer mix (see Note 45), and 2 μL cDNA (see Note 46).
3. Add 12.5 μL of the qPCR Master Mix (2×) to each reaction (see Note 47), and load it into the RT-PCR device.
4. PCR scheme: 95 °C for 10 min initial denaturation and 35 cycles of 95 °C 30 s denaturation, 48 °C 30 s annealing, 72 °C 30 s elongation (see Note 48).
5. Melting point analysis: 95 °C 1 min and a thermogradient from 48 to 95 °C (0.5 °C/min).
6. The analysis is performed using the provided software and the ΔΔCt method [15].

3.4 In Vivo Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction

1. Construct a chaperone knockout strain by employing LFH-PCR as described in Subheading 3.1.4 (see Note 49).
2. Prepare chromosomal DNA as described in [3] from the strain carrying the translational target-lacZ fusion in the amyE locus which was constructed as described in Subheading 3.1.6.
3. Use this chromosomal DNA for the transformation of the isogenic wild-type and chaperone knockout strains as described in Subheading 3.1.3.
4. Check for double-crossing over with starch plates and potassium iodine solution as in Subheading 3.1.6.
5. Pick 7 B. subtilis transformants each from the knockout and the wild-type strain and prepare glycerol cultures.
6. Use these transformants for the determination of the β-galactosidase activity of the translational target-lacZ fusion in the isogenic wild-type and chaperone knockout strains (see Note 50).
7. In addition, knock out the sRNA gene by transformation of wild-type and chaperone knockout strains by using chromosomal DNA prepared from an sRNA knockout strain constructed by LFH-PCR.
8. Transfer the plasmid for overexpression of the sRNA gene into the wild-type and the isogenic chaperone knockout strains that are deleted for the chromosomal sRNA gene as described in Subheading 3.1.3.
9. Determine β-galactosidase activities in all strain combinations (Δchaperone/ΔsRNA, Δchaperone/WT sRNA, WT-chaperone/WT sRNA, WT-chaperone/ΔsRNA) as described in Subheading 3.1.6 to find out if the chaperone affects the sRNA/mRNA interaction or if it only affects the translation of the target mRNA (see Note 51).
10. In case the target of your sRNA is a TF that regulates the promoters of downstream genes, use the transcriptional lacZ fusions of the downstream promoters of your target gene (constructed as described in Subheading 3.1.6) to assay the β-galactosidase activities in the presence and absence of the chaperone and the sRNA (all combinations: Δchaperone/ΔsRNA, Δchaperone/WT sRNA, WT-chaperone/WT sRNA, WT-chaperone/ΔsRNA).
11. Introduce point mutations into the binding motif of your RNA chaperone which you have determined by EMSA or secondary structure probing (see Chapter 8, Subheadings 3.2 and 3.3) to confirm that the observed effects are due to binding of the chaperone (see Note 52).
12. Employ Northern blotting or qRT-PCR to determine the half-life of the sRNA and the target mRNA in a wild-type and an isogenic chaperone knockout strain (see Note 53).

4 Notes

1. Here, directional cloning into the vector using two different restriction enzymes is described.
2. First, you have to find out how many μL of each restriction enzyme you need to obtain complete cleavage of a certain amount of the plasmid vector, since you cannot check the completeness of cleavage by the second restriction enzyme.
3. For a single ligation experiment, you need 25–50 ng cleaved vector, i.e., in 10 μL about 250–500 ng.
4. You have to cleave the PCR fragment with the same pair of restriction enzymes as the vector. Normally, 1 h incubation time is enough, but you can extend it to 2–3 h or perform the cleavage overnight in a thermocycler set at 37 °C for 3 h followed by cooling down to 10 °C.
5. In case you clone first into pUC19, you can add 100 μL/mL X-Gal to the plates. The recombinant clones will be white, whereas the transformants with empty pUC19 will be blue. Then you do not need to perform a colony PCR.
6. Colony PCR can be performed with a normal Taq polymerase (e.g., from Solidyne), as this is an analytical method and does not require proofreading during amplification. You can use the same primers as for cloning or primers that bind outside of your fragment directly in the plasmid vector. Colony PCR works best with fresh transformants (in case you plan to perform it later, keep the plates on your bench, not in the fridge).
7. As you used a PCR for amplifying your sRNA gene, you have to confirm the correct sequence of the recombinant plasmids (even a polymerase with proofreading activity such as Q5 can occasionally introduce mutations during the PCR).
8. The cells remain competent for about 2–3 h; afterwards, competence decreases rapidly. As B. subtilis takes up single-stranded DNA, only trimeric plasmids can be successfully used for transformation. In case you isolated your plasmids from an E. coli recA strain (e.g., DH5α), you have to oligomerize them before B. subtilis transformation by cleaving with a single-cutter (cleaving outside the insert) and subsequently ligating them at high concentration to obtain linear oligomers.
9. Add glycerol to the rest of the cells that you do not need on the same day to a final concentration of 15%, aliquot cells in Eppendorf tubes, and freeze them at -80 °C. The frozen cells can be stored for several months. They should be rethawed quickly at 37 °C in either a shaking water bath or a thermomixer and immediately used. When you thaw them slowly, they will lose their competence.
10. In certain cases (e.g., when the binding primer is in the vicinity of a terminator structure), a longer complementary region might be needed.
11. The annealing temperature depends on the primer pair used, but in general, 42 °C works for the majority of oligos.
12. Ideally, 1–2 μg of individual fragments are recommended to obtain a high yield in the final PCR.
13. In some cases when the final yield is very low, two fragments (e.g., front cassette and insert or resistance cassette and back cassette) can be joined in a separate PCR. After gel purification, they are combined in the final PCR.
14. Colonies with the correct insert will yield around 3 kb (the final PCR insert size) while negative colonies will produce wild-type (2 kb) fragments.
15. To confirm the base-pairing interaction between the regulatory sRNA and its target mRNA, both wild-type sRNA/wild-type target mRNA pairs as well as mutants in either sRNA or mRNA that interrupt/prevent base-pairing and compensatory mutants that restore base-pairing between both RNAs have to be analyzed.
16. To ensure that mutations in the sRNA do not affect its intracellular amount, Northern blotting (see Subheading 3.2) has to be employed.
17. In case the sRNA does not only affect translation of the target mRNA but also its stability, target mRNA amounts can be determined by Northern blotting or qRT-PCR (see Subheadings 3.2 and 3.3).
18. In cases where the sRNA does not alter the target mRNA stability, but only its translation, Western blotting with M2 anti-FLAG antibodies can be conducted to evaluate target protein levels.

19. Only mutations that do not alter the amino acid sequence of the target protein can be introduced into the target RNA. In cases where this is not possible, the analysis has to be restricted to mutations in the sRNA that prevent base-pairing with the target mRNA, and compensatory mutations can only be studied in EMSA in vitro (see Chapter 8, Subheading 3.2).
20. Design the PCR oligos for the promoter region in a way that all binding sites for the TF that is the primary target of your sRNA are included.
21. When you transform E. coli with the ligation mix, you will not be able to use blue/white selection for recombinant clones because even the empty pMG16 will yield blue colonies. The reason is that in E. coli several DNA regions upstream of the promoterless lacZ gene in pMG16 are recognized as promoters, whereas the requirements for a promoter are much more stringent in B. subtilis.
22. The linearization of pMG16 derivatives is required to ensure double-crossing over for the insertion of the fragment into the amyE gene. If you do not linearize, you will obtain a lot of single-crossing over clones, which are not stable in the absence of an antibiotic.
23. MgSO4 is required for the activity of the β-galactosidase.
24. You will always observe variations in the β-galactosidase activities between the individual clones. In case one clone shows an aberrantly high or low activity compared to the other 5, you can replace it with another clone from your 10 clones frozen in glycerol.
25. You can either freeze the pellets at -20 °C or start immediately with the activity determination.
26. In case your sample is dark yellow (high activity), dilute the sample 1:5 or 1:10 before measuring, then correct the measured OD.
27. In case you used 20 μL or 100 μL Z-lysate, you must multiply the value by 10 or 2, respectively.
28. Make sure that you include all complementary regions between sRNA and target mRNA in your lacZ fusions (this might include much more than the first three or eight codons).
29. To avoid analyzing a combination of the transcriptional and translational impact of your sRNA on its target gene, replace the target promoter with a heterologous weak promoter. This ensures that you analyze only the effects of the sRNA on target translation. However, you have to make sure that the heterologous promoter is not too strong (see Introduction). Otherwise, the sRNA expressed from its own promoter in the chromosome or from a plasmid might not be in excess over the target mRNA and consequently, you will not be able to detect an effect of the sRNA.
30. Lysozyme and SDS should be added just before use.
31. The marker can be labeled at the 5′ end by following the steps described in Chapter 8, Subheading 3.1.8.
32. Air bubbles can be removed by gently rolling over a glass pipette. In case of removing air bubbles from/between the gel and Whatman paper/Nylon membrane, the glass pipette should be continuously wetted in 1× TBE to prevent gel breakage.
33. This ensures that 5S rRNA (125 nt) used as loading control does not run out of the gel. When a better separation in the higher MW range is desired, the gel should be run to the end (then reprobing of the membrane for 16S or 23S rRNA is needed). The gel can be photographed along a fluorescent ruler on a UV lamp and unnecessary parts trimmed before proceeding to Northern blotting.
34. The gel should always be slightly submerged in the buffer.
35. This serves as a “capillary bridge.”
36. The gel should not contain significant amounts of RNA anymore (check on a UV lamp). The dried membrane is checked under UV light to see if the three rRNA species were evenly transferred.
37. The membrane can now be subjected to prehybridization and hybridization as described in Subheading 2.2.2, step 5. Alternatively, it can be stored between two glass plates at room temperature for later use.
38. Riboprobes can be synthesized by T7 RNA polymerase from either a PCR template or after cloning the corresponding gene in a vector containing the T7 promoter. Cleaning through a Sephadex G-50 column should be performed to remove the unincorporated 32P-[α-UTP].
39. Exposure can be short, e.g., 30 min (in case of probes against 5S rRNA, tmRNA, or very abundant small RNAs) or up to 2 days (probes against other RNA species).
40. Probes can be easily removed by boiling the membrane for 10 min in 1% SDS solution. Before the next probe is applied, 30 min to 1 h prehybridization is recommended as described above. Wet filters can be stored in a tightly closed plastic bag at -20 °C for several years and later used for rehybridizations with other probes.
41. The DNase I reaction has to be complete. To check for the absence of chromosomal DNA, it is recommended to perform a qRT-PCR as described in Subheading 3.3.3 with the DNase I-treated RNA.
42. RT primers should be very specific and the mix has to contain the primers for all genes of interest and all internal controls.
43. The RT buffer and DTT are provided with the SuperScript IV RT kit.
44. The yield can be lowered by strong RNA structures and high GC content. In this case, the temperature can be increased up to 50 °C to resolve RNA secondary structures. The RT primers have to be designed such that they still bind well at these higher temperatures.
45. Primers have to be highly specific, bind with a comparable affinity, and result in a highly efficient PCR with >95% amplification per round. Specificity can be checked in the melting point analysis and efficiency by analysis of a cDNA dilution row. The amplified stretch for all genes of interest and controls should have the same length of around 100 nt. If no suitable primer constellation can be found, the qPCR can be performed against an extrinsic sequence added to the RT primer before. Examples of successfully used extrinsic sequences are: 5′ CAA GAA CAT CTG TAT TCG AAG, 5′ TCT TCA GTG ACA AAA CCA CA and 5′ CCT ATA GAA GCG GAT TTG TC.
46. For highly abundant internal controls, a dilution of up to 1:1000 (5S rRNA or tmRNA) is necessary.
47. The master mix containing the fluorophores for detection should be pipetted in a darkened room.
48. The annealing temperature may have to be adjusted to the primer pairs used, but a uniform annealing temperature for all primer pairs is worthwhile.
49. You can simply replace the chaperone gene with an antibiotic resistance cassette.
50. This will tell you if the chaperone has an effect when both the sRNA and the target mRNA are expressed.
51. Use for each strain 7 transformants and perform the measurements three times with freshly grown cultures to obtain three biological replicates. For repetitions of the measurements, always use the same 7 clones for each strain.
52. With this approach, you can also confirm which of the potential binding motifs (if there are more than one) of the chaperone play a decisive role in vivo.
53. Do not forget reprobing against one of the stable rRNAs or tmRNA to obtain a loading control.
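For the qRT-PCR evaluation (Subheading 3.3.3, step 6, and Note 45), the two calculations involved can be sketched as follows: the relative expression by the ΔΔCt method [15] and the amplification efficiency from a cDNA dilution row. This is an illustrative sketch only, since the software provided with the instrument performs the actual analysis. The ΔΔCt formula assumes that target and internal control amplify with close to 100% efficiency; function names and Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression 2^(-ddCt) of a target gene, normalized to an internal control."""
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    return 2.0 ** -(d_ct_sample - d_ct_control)


def amplification_efficiency(log10_dilutions, ct_values):
    """Efficiency per cycle from the slope of Ct versus log10(template amount).

    A slope of about -3.32 corresponds to a doubling per cycle (efficiency 1.0 = 100%).
    """
    n = len(ct_values)
    mean_x = sum(log10_dilutions) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_dilutions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_dilutions, ct_values))
    slope = sxy / sxx
    return 10.0 ** (-1.0 / slope) - 1.0


# Hypothetical Ct values: target and internal control in a knockout and in the wild type
print(fold_change_ddct(22.0, 15.0, 24.0, 15.0))

# Hypothetical tenfold dilution row of cDNA (undiluted, 1:10, 1:100, 1:1000)
print(round(amplification_efficiency([0, -1, -2, -3], [15.0, 18.32, 21.64, 24.96]), 3))
```

An efficiency well below 0.95 for one primer pair indicates that the pair should be redesigned before ΔΔCt values are compared.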


References

1. Ul Haq I, Müller P, Brantl S (2020) Intermolecular communication in Bacillus subtilis: RNA-RNA, RNA-protein and small protein-protein interactions. Front Mol Biosci 7:178
2. Gimpel M, Brantl S (2017) Dual-function small regulatory RNAs in bacteria. Mol Microbiol 103:387–397
3. Ul Haq I, Müller P, Brantl S (2021) SR7 – a dual-function antisense RNA from Bacillus subtilis. RNA Biol 18:104–117
4. Müller P, Gimpel M, Wildenhain T, Brantl S (2019) A new role for CsrA: promotion of complex formation between an sRNA and its mRNA target in Bacillus subtilis. RNA Biol 16:972–987
5. Brantl S (1994) The copR gene product of plasmid pIP501 acts as a transcriptional repressor at the essential repR promoter. Mol Microbiol 14:473–483
6. Jahn N, Brantl S (2013) One antitoxin – two functions: SR4 controls toxin mRNA decay and translation. Nucleic Acids Res 41:9870–9880
7. Müller P, Jahn N, Ring C, Maiwald C, Neubert R, Meißner C, Brantl S (2016) A multistress responsive type I toxin-antitoxin system: bsrE/SR5 from the B. subtilis chromosome. RNA Biol 13:511–523
8. Geissendörfer M, Hillen W (1990) Regulated expression of heterologous genes in Bacillus subtilis using the Tn10 encoded tet regulatory elements. Appl Microbiol Biotechnol 33:657–663
9. Gimpel M, Heidrich N, Mäder U, Krügel H, Brantl S (2010) A dual-function sRNA from B. subtilis: SR1 acts as a peptide encoding mRNA on the gapA operon. Mol Microbiol 76:990–1009
10. Licht A, Preis S, Brantl S (2005) Implication of CcpN in the regulation of a novel untranslated RNA (SR1) in Bacillus subtilis. Mol Microbiol 58:189–206
11. Hanahan D (1983) Studies on transformation of Escherichia coli with plasmids. J Mol Biol 166:557–580
12. Bergkessel M, Guthrie C (2013) Colony PCR. Methods Enzymol 529:299–309
13. Birnboim HC, Doly J (1979) A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Res 7:1513–1523
14. Busch A, Richter AS, Backofen R (2008) IntaRNA: efficient prediction of bacterial sRNA targets incorporating target site accessibility and seed regions. Bioinformatics 24:2849–2856
15. Livak KJ, Schmittgen TD (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2^(-ΔΔCT) method. Methods 25:402–408
16. Heidrich N, Moll I, Brantl S (2007) In vitro analysis of the interaction between the small RNA SR1 and its primary target ahrC mRNA. Nucleic Acids Res 35:4331–4346

Chapter 8

In Vitro Methods for the Investigation of sRNA-mRNA Interactions in Bacillus subtilis

Inam Ul Haq, Peter Müller, and Sabine Brantl

Abstract

So far, in Bacillus subtilis, only four trans-encoded and 11 cis-encoded sRNAs and their targets have been investigated in detail, the majority of them in our group (rev. in [1, 2]). Here, we describe in vitro methods for the analysis of sRNA/mRNA interactions. All these methods have been either elaborated or significantly improved in our group and successfully applied to characterize a number of sRNA/target mRNA systems in Bacillus subtilis, for which we provide examples from our own work. The in vitro methods comprise the synthesis and purification of labeled and unlabeled RNA, the analysis of sRNA/mRNA interactions in electrophoretic mobility shift assays (EMSAs) including the calculation of their apparent binding rate constants (kapp) and equilibrium dissociation constants (Kd), the localization of minimal regulatory regions of an sRNA, the determination of the secondary structures of both interacting RNAs and their complex, as well as the analysis of RNA chaperones that may promote the sRNA/mRNA interaction.

Key words In vitro-RNA synthesis, RNA-RNA and RNA-protein-EMSA, RNA secondary structure probing, DRaCALA

1 Introduction

1.1 In Vitro Synthesis and Purification of Labeled or Unlabeled RNA from Denaturing Polyacrylamide Gels

A number of applications require in vitro synthesized, gel-purified RNA of a defined concentration: the analysis of sRNA-mRNA interactions, the secondary structure probing of RNA, the addition of defined amounts of sRNA in in vitro translation experiments, the analysis of protein-RNA interactions, etc. Both unlabeled and internally labeled RNA can be synthesized in vitro, the latter by including 32P-[α-UTP]. Generally, commercially available T7 RNA polymerase is employed, and the template DNA should contain the 23 bp long T7 promoter (see Note 1). The easiest way is to synthesize a PCR fragment by adding the T7 promoter sequence to the 5′ PCR primer, purify the PCR fragment from an agarose gel, and use it as template for in vitro transcription (see below). Alternatively, one can clone the gene of interest into a vector containing a T7 promoter and use the recombinant vector as template (see Fig. 1), but this is time-consuming and only necessary when high RNA amounts are needed, e.g., for an NMR analysis.

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_8, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

Fig. 1 Preparation of riboprobes by in vitro transcription. (a) Synthesis of the DNA template for riboprobes. The oligo pair with the T7 promoter sequence is designed in a way that the generated RNA will be complementary to the target RNA. The PCR product is either cloned into a vector or directly used as template. If a plasmid is used as template, the plasmid has to be linearized first. In both cases, the template DNA should be cleaned prior to use to prevent intrinsic degradation of the synthesized probe. (b) Construction of a Sephadex column. The Sephadex G-50 column is assembled in a glass test tube. The column is used once per probe

1.2 Investigation of RNA-RNA Interactions by Electrophoretic Mobility Shift Assay (EMSA)

EMSA is a simple technique that provides a rapid means to analyze sequence-specific interactions between RNAs and is able to resolve complexes of different stoichiometries or conformations. A labeled sRNA is incubated with the unlabeled target RNA, and the formed complex is separated by native PAAG electrophoresis (see Fig. 2). To determine the specificity and binding affinity, a competition assay is employed using an excess of a heterologous unlabeled competitor RNA. EMSA can be employed to determine the Kd (equilibrium dissociation constant) of an sRNA/mRNA complex, the apparent binding rate constant kapp of an sRNA, to investigate the interaction of mutated sRNA or target mRNA species, or to narrow down the minimal inhibitory region of an sRNA.
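How a Kd value can be extracted from EMSA data may be illustrated with a short sketch. It is not the evaluation procedure of this chapter: it assumes simple 1:1 binding, a labeled sRNA concentration far below the Kd (so that the free target concentration equals the total concentration), and bound fractions that have already been quantified from the autoradiogram. Function names and numbers are hypothetical.

```python
import math


def fraction_bound(target_conc, kd):
    """Fraction of labeled sRNA in the complex for 1:1 binding with target in excess."""
    return target_conc / (kd + target_conc)


def fit_kd(target_concs, fractions, lo=1e-12, hi=1e-3, iterations=100):
    """Least-squares estimate of Kd (in M) by golden-section search on log10(Kd)."""
    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((fraction_bound(c, kd) - f) ** 2
                   for c, f in zip(target_concs, fractions))

    a, b = math.log10(lo), math.log10(hi)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


# Synthetic example: bound fractions generated for a Kd of 50 nM
concs = [1e-9, 1e-8, 5e-8, 1e-7, 1e-6]  # unlabeled target RNA, in M
fractions = [fraction_bound(c, 5e-8) for c in concs]
print(fit_kd(concs, fractions))
```

With real data, the target concentrations should bracket the expected Kd, so that both the unbound and the fully shifted plateau are covered.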

1.3 Determination of Secondary Structures of the sRNA and the sRNA/Target RNA Complex

The function of an RNA is determined not only by its primary nucleotide sequence but also by its secondary structure, which reflects the RNA folding, intramolecular base-pairing, and accessibility of bases. While RNA folding simulations (e.g., RNAfold webserver at the Vienna University) can predict secondary structures, experimental

In Vitro Analysis of sRNA-mRNA Interactions in B. subtilis


Fig. 2 EMSA for the study of sRNA-target RNA interactions. Binding assays of various labeled antisense RNA (RNAIII) species (1.25 × 10⁻⁸ M) and their target, repR mRNA encoding the essential replication initiator protein of plasmid pIP501. (a) Autoradiograms of the EMSAs are shown and time points are indicated. (b) Graphic representation of the efficiency of duplex formation. (Taken from [3])

data often differ in structurally important regions. Furthermore, additional factors like RNA binding proteins might be involved which cannot be considered by the prediction programs. Secondary structure probing uses in vitro transcribed 5′-labeled RNA and applies either RNases or chemical probes. All of them have specific requirements regarding the type and spatial accessibility of nucleobases at which they act and their specificity for single- or double-stranded regions. After separation on an 8% denaturing PAAG, each band represents a specific cleavage site in the RNA. Based on these data, the secondary structure of the RNA or an sRNA/target RNA complex can be derived. sRNA and target mRNA secondary structures can also be probed chemically. The advantage is that very long RNAs can be probed from the 3′ end. The disadvantage is that first the unlabeled RNA has to be chemically modified and afterward a reverse transcription with a 5′-labeled primer has to be employed to visualize the result.

1.4 Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction

For many trans-encoded sRNAs of Gram-negative bacteria, the RNA chaperones Hfq or ProQ are required to either stabilize the sRNA or promote its interaction with the corresponding target mRNA. B. subtilis and other Gram-positive bacteria do not encode ProQ, and Hfq has not been shown to be required for the function of any sRNA in B. subtilis (e.g., [1]). Recently, we discovered that the RNA chaperone CsrA can promote the interaction of B. subtilis sRNA SR1 with its primary target ahrC mRNA [2]. It can be assumed that CsrA plays a role for other sRNAs in B. subtilis and/or that other, still unknown RNA chaperones might play a role in promoting sRNA/mRNA base-pairing in Gram-positive bacteria.


In general, DRaCALA (Differential Radial Capillary Action of Ligand Assay) is the simplest and most rapid method to analyze if a putative RNA binding protein can bind RNA. It is suitable to screen for the binding of bigger sets of RNAs or proteins. Afterward, EMSA is used to confirm the results of the DRaCALA and to determine a Kd value which is indicative of the strength of binding as well as to determine important interacting sequence stretches. The promotion of sRNA/mRNA complex formation can be investigated in EMSAs in the presence and absence of the RNA chaperone. Finally, secondary structure probing helps to elucidate the molecular mechanism behind it.

2 Materials

2.1 In Vitro Synthesis and Purification of Labeled or Unlabeled RNA from Denaturing Polyacrylamide Gels

2.1.1 In Vitro Synthesis of RNA

DEPC (diethyl pyrocarbonate)-treated RNase-free bidist (bidistilled water) (add 1 mL DEPC to 1 L bidist and incubate at 37 °C overnight, autoclave afterward), pipette tips, microcentrifuge tubes, isotope laboratory with scintillation counter, and PhosphorImager scanner with corresponding software and screens; gel electrophoresis equipment, thermomixer with thermoblock, UV handlamp with 254 nm wavelength, NanoDrop, Eppendorf centrifuges, and a silica gel glass plate (as for thin-layer chromatography) are required. It is very important to maintain RNase-free conditions throughout the procedure.

1. T7 RNA polymerase and reaction buffer
2. DEPC-treated RNase-free bidist (further abbreviated bidist)
3. 5× NTPs (20 mM each)
4. 1% Triton X-100
5. 0.1 M DTT (dithiothreitol)
6. 0.1 M MgCl2
7. RNasin (RNase inhibitor; Promega)
8. TEN buffer: 150 mM NaCl, 50 mM Tris, and 10 mM EDTA (ethylenediaminetetraacetic acid)
9. Sterilized blade
10. DNase I (RNase-free)
11. Chloroform
12. Phenol
13. tRNA (10 g/L) or glycogen (10 g/L)
14. 3 M Na acetate pH 5.0
15. 96% EtOH
16. 80% EtOH: take 83.3 mL 96% EtOH and add to 16.6 mL RNase-free bidist
17. Formamide loading dye (FD): 916 μL formamide, 34 μL 0.5 M EDTA, 50 μL 1% BPB (bromophenol blue), and xylene cyanol

2.1.2 Polyacrylamide Gel (PAAG) Preparation

1. Vertical gel electrophoresis apparatus.
2. Siliconized glass plates, gel spacers (0.4 mm, 40 × 20 cm), and comb (0.4 mm).
3. 10× TBE buffer pH 8.0: 900 mM Tris base, 900 mM boric acid, 20 mM Na2 EDTA.
4. 6% denaturing PAA: Dissolve 420 g urea (7 M final concentration) in 200 mL distilled water by heating up, add 100 mL 10× TBE, add 150 mL of 40% acrylamide solution (19:1 acrylamide:bisacrylamide), bring the final volume to 1 L and store the solution at 4 °C in a brown glass bottle.
5. APS (ammonium persulfate): 10% (w/v) solution.
6. Tetramethylethylenediamine (TEMED).
7. Transparent plastic foil cleaned with hydrogen peroxide.
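The volumes and amounts in the recipes above follow from the dilution rule C1 × V1 = C2 × V2 and from mass = molarity × volume × molecular weight. The short sketch below recalculates them, which is useful when a different gel percentage or batch volume is needed. The helper names are hypothetical, and the molecular weight of urea (60.06 g/mol) is the only value not taken from the recipes.

```python
UREA_MW = 60.06  # g/mol


def stock_volume(final_conc, final_volume, stock_conc):
    """Volume of stock needed, from C1 * V1 = C2 * V2 (same units for both concentrations)."""
    if final_conc > stock_conc:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc * final_volume / stock_conc


def grams_for_molarity(molarity, volume_l, mol_weight):
    """Mass of solid needed for a given molarity and final volume in liters."""
    return molarity * volume_l * mol_weight


print(stock_volume(6, 1000, 40))                      # mL of 40% acrylamide for 1 L of 6% PAA
print(stock_volume(1, 1000, 10))                      # mL of 10x TBE for 1 L of 1x TBE
print(round(grams_for_molarity(7, 1.0, UREA_MW), 1))  # g urea for 1 L of 7 M
print(round(stock_volume(80, 100, 96), 1))            # mL of 96% EtOH for 100 mL of 80% EtOH
```

For an 8% gel solution, for example, `stock_volume(8, 1000, 40)` gives the volume of 40% acrylamide stock per liter.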

2.2 Investigation of RNA-RNA Interactions by Electrophoretic Mobility Shift Assay (EMSA)

In addition to the materials described in Subheading 2.1, the following materials are required:

2.2.1 In Vitro Synthesis of Internally Labeled RNA

1. 32P-[α-UTP] (3000 Ci/mmol, 20 mCi/mL, Hartmann Analytic).
2. 0.1% Triton X-100.
3. 5× NTPs low U: 500 μM GTP, ATP, CTP, and 20 μM UTP.
4. Whatman paper labels: Add a minute amount of radioactivity mixed with BPB to 3–4 small pieces of Whatman paper and cover with adhesive tape.
5. Sephadex G-50 (Sigma): 10 g Sephadex G-50, 150 mM NaCl, 10 mM EDTA, 50 mM Tris, pH 8.0. Dissolve in 150 mL bidist, autoclave and store at 4 °C.
6. Glass test tube.
7. Sterile gauze.
8. Swing bucket rotor centrifuge.

2.2.2 In Vitro Synthesis and Purification of 5′ Labeled RNA

In addition to the above-described materials (Subheading 2.1), the following are required:

1. CIP (calf intestinal phosphatase) and its reaction buffer
2. 32P-[γ-ATP] (3000 Ci/mmol, 20 mCi/mL, Hartmann Analytic)
3. T4 Polynucleotide kinase (PNK) and its reaction buffer


2.2.3 Electrophoretic Mobility Shift Assay (EMSA)

1. 5× TMN buffer: 100 mM Tris-HCl, 500 mM NaCl, 50 mM MgCl2.
2. 6% native PAA solution: 100 mL 10× TBE, 150 mL of 40% acrylamide solution (19:1 acrylamide:bisacrylamide), and 750 mL DEPC-treated bidist; store the solution at 4 °C in a brown glass bottle.
3. Ammonium persulfate: 10% (w/v) solution.
4. TEMED.
5. Native RNA loading buffer: 50% glycerol in 1× TBE, BPB.
6. tRNA 10 g/L.
7. Vertical gel electrophoresis system.
8. Gel dryer (Bio-Rad).
9. PhosphorImager with screens and corresponding program.

2.3 Determination of Secondary Structures of the sRNA and the sRNA/Target RNA Complex

2.3.1 Enzymatic RNA Structure Probing

A height-adjustable gel chamber (20 cm × 40 cm), two siliconized glass plates (40 × 20 cm), two spacers (40 cm long, 0.4 mm thick), comb (0.4 mm thick), a power supply (2000 V, 40 mA), a gel dryer, Whatman paper, PhosphorImager screens (40 × 20 cm), and a PhosphorImager with the corresponding software are needed.

1. 5× TMN buffer: 100 mM Tris-HCl, 500 mM NaCl, 50 mM MgCl2 (pH 7.5)
2. RNA sequencing buffer
3. Alkaline solution: 500 mM NaOH, 100 mM EDTA (prepare freshly)
4. 10 g/L tRNA
5. RNase T1
6. RNase T2
7. RNase A
8. RNase V1
9. Nuclease S1
10. Stop solution: 50 mM Na-acetate, 100 mM acetic acid
11. 3 M Na acetate pH 5.0
12. 96% ethanol
13. 80% ethanol

2.3.2 Chemical RNA Secondary Structure Probing

1. Superscript IV reverse transcriptase (Invitrogen) with its reaction buffer 2. 10 mM dNTP mix 3. RNasin 4. 0.1 M DTT

5. Buffers for the respective method given below are required 6. Labeled primers 7. KOH 8. 5× removal buffer: 50 mM Tris-HCl pH 7.5, 7.5 mM EDTA, 0.5% SDS

2.3.3 Chemical Probing with DMS (Dimethyl Sulfate)

1. 5× native buffer: 250 mM Tris-HCl pH 7.5, 25 mM MgCl2, 750 mM KCl, 25 mM β-mercaptoethanol 2. 5× SD buffer: 250 mM Tris-HCl (pH 7.5), 5 mM EDTA 3. DMS: Dissolve in 96% ethanol

2.3.4 Chemical Probing with CMCT (1-Cyclohexyl-3-(2-Morpholinoethyl)carbodiimide Metho-p-Toluenesulfonate)

1. 5× native buffer: 250 mM borate-NaOH pH 8.0, 25 mM Mg acetate, 750 mM K acetate, 25 mM β-mercaptoethanol 2. 5× SD buffer: 250 mM borate-NaOH pH 8.0, 5 mM EDTA 3. CMCT

2.3.5 Chemical Probing with Pb2+

1. 5× Buffer: 250 mM Tris-acetate pH 7.5, 25 mM Mg-acetate, 250 mM Na-acetate, 40 mM and 80 mM Pb2+ acetate in bidist (prepare just before use)

2.3.6 PAAG for Chemical Probing of RNA Secondary Structure

1. 8–15% denaturing PAA: Prepare from a 40% acrylamide-bisacrylamide solution (19:1) as described in Subheading 2.1.2. 2. FD; see Subheading 2.1.1. 3. 10× TBE buffer; see Subheading 2.1.2. 4. TEMED. 5. 10% APS.

2.4 Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction

2.4.1 DRaCALA (Differential Radial Capillary Action of Ligand Assay)

Besides general laboratory equipment, such as pipettes, a centrifuge, a thermoblock, gel running devices, and a gel dryer, gel cooling devices and a PhosphorImaging system are required. In addition, pipette tips, reaction tubes, and all chemicals to cast and run native and denaturing PAA gels (see Subheadings 2.1.2 and 2.2.3) are required. 1. Nitrocellulose membrane 2. 5× TMN buffer as described in Subheading 2.2.3 3. tRNA (1 g/L) 4. 32P-labeled RNA of interest as described in Subheadings 2.2.1 or 2.2.2 5. Purified protein of interest and suitable protein dilution buffer


2.4.2 EMSAs for Binding of an RNA Chaperone to an RNA

1. The same as for RNA/RNA EMSAs, except for the unlabeled RNA, see Subheading 2.2.3. 2. Purified protein of interest and suitable protein dilution buffer.

2.4.3 EMSAs to Study the Effect of an RNA Chaperone on an RNA-RNA Interaction

1. The same as for RNA/RNA EMSAs, see Subheading 2.2.3. 2. Purified protein of interest and suitable protein dilution buffer.

2.4.4 RNA Secondary Structure Probing in RNA-Protein Complexes

1. The same as for enzymatic RNA secondary structure probing, see Subheading 2.3.1. 2. Purified protein of interest and suitable protein dilution buffer.

3 Methods

3.1 In Vitro Synthesis and Purification of Labeled or Unlabeled RNA from Denaturing Polyacrylamide Gels

3.1.1 In Vitro Synthesis of the RNA

1. Prepare a PCR template with the T7 promoter sequence (see Note 1). 2. Set up the reaction by adding the following components: 10 μL in vitro transcription template with T7 promoter, 20 μL 10× T7 RNAP reaction buffer, 40 μL 5× NTPs, 2 μL 1% Triton X-100, 20 μL 0.1 M DTT, 4 μL 0.1 M MgCl2, 4 μL RNasin, 5 μL T7 RNA polymerase, 95 μL bidist. Total volume = 200 μL. 3. Incubate the reaction at 37 °C overnight (16–18 h). 4. Add 2 μL DNase I and continue incubation at 37 °C for a further 10 min. 5. Perform a phenol/chloroform extraction. 6. Precipitate the mix by adding 20 μL 3 M Na-acetate pH 5.0, 3 μL of either tRNA (10 g/L) or glycogen (10 g/L) as carrier, and 600 μL of 96% ethanol and placing at -20 °C for 30 min. 7. Centrifuge at 13,000 rpm at 4 °C for 10 min, wash the pellet with 200 μL 80% ethanol, and dissolve the synthesized RNA in 10 μL bidist.
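When the transcription reaction of step 2 is scaled up or down, it helps to confirm that the component volumes still add up to the stated total and to see what final concentration each added stock reaches. The following is a minimal sketch: the volumes are copied from step 2, while the function names are hypothetical helpers introduced only for illustration. The concentrations computed here refer only to what is added separately (for example DTT and MgCl2), not to anything already contained in the 10× reaction buffer.

```python
# Volumes (in microliters) copied from step 2 of Subheading 3.1.1.
TRANSCRIPTION_MIX_UL = {
    "template DNA with T7 promoter": 10.0,
    "10x T7 RNAP reaction buffer": 20.0,
    "5x NTPs": 40.0,
    "1% Triton X-100": 2.0,
    "0.1 M DTT": 20.0,
    "0.1 M MgCl2": 4.0,
    "RNasin": 4.0,
    "T7 RNA polymerase": 5.0,
    "bidist": 95.0,
}


def total_volume_ul(mix):
    """Sum of all component volumes in microliters."""
    return sum(mix.values())


def final_concentration(stock, volume_ul, total_ul):
    """Dilution rule C_final = C_stock * V_added / V_total.

    The result has the same unit as `stock` (e.g. mM in, mM out).
    """
    return stock * volume_ul / total_ul


def scale_mix(mix, factor):
    """Scale every component by the same factor (hypothetical helper)."""
    return {name: volume * factor for name, volume in mix.items()}


if __name__ == "__main__":
    total = total_volume_ul(TRANSCRIPTION_MIX_UL)
    print("total volume (uL):", total)
    # Separately added DTT: 0.1 M = 100 mM stock, 20 uL into the full reaction.
    print("added DTT (mM):", final_concentration(100.0, 20.0, total))
    # Separately added MgCl2: 100 mM stock, 4 uL into the full reaction.
    print("added MgCl2 (mM):", final_concentration(100.0, 4.0, total))
```

Calling `scale_mix(TRANSCRIPTION_MIX_UL, 0.25)` gives the volumes for a 50 μL reaction with unchanged final concentrations.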

3.1.2 RNA Purification by PAAG Electrophoresis and Subsequent Elution

1. Add 10 μL FD to the RNA, incubate for 5 min at 95 °C, and immediately chill on ice. 2. Load the sample on a 6% denaturing PAAG and run until BPB or xylene cyanol is at the expected position (see Note 2). 3. Carefully remove one of the glass plates and place a plastic foil (see Note 3) on top of the gel. 4. Invert the gel and gently transfer it to a silica gel plate covered with a transparent plastic foil. 5. Use a UV hand lamp with a wavelength of 254 nm to spot the synthesized RNA (RNA makes a shadow on the silica gel plate).

6. Cut out the RNA from the gel with the blunt side of a clean blade (see Note 3), and transfer it to an Eppendorf tube. 7. Add 200 μL TEN buffer to the tube, shake gently for 1 h at 37 °C, and transfer the TEN buffer containing the eluted RNA to a tube containing 5 μL tRNA/glycogen (10 g/L), 30 μL Na acetate, and 900 μL 96% EtOH (see Note 4). 8. Repeat the elution by adding 100 μL TEN buffer to the gel pieces and shaking for 1 h at 37 °C, combine both elution fractions, and precipitate at -20 °C for 30 min. 9. Centrifuge at 4 °C, wash the pellet with 80% EtOH, dissolve it in 20 μL bidist, and measure the concentration with a NanoDrop (see Note 5).

3.1.3 In Vitro Synthesis of Internally Labeled RNA

1. Prepare a PCR template with T7 promoter (see Note 1). 2. Add the reaction components as follows: 5 μL template DNA with T7 promoter, 5 μL 10× T7 RNAP reaction buffer, 10 μL 5× NTPs low U, 1.5 μL 0.1 M DTT, 5 μL 0.1% Triton X-100, 1 μL 0.1 M MgCl2, 15.5 μL bidist, 1 μL RNasin, 5 μL 32P-[α-UTP], 1 μL T7 RNA polymerase. Total volume = 50 μL. 3. Incubate the reaction at 37 °C for 4 h or 16 h (in the latter case, final concentrations of 8 mM MgCl2 and 4 mM NTPs are needed). 4. Add 1 μL DNase I and incubate at 37 °C for 10 min. 5. Add 10 μL FD and place the sample at 95 °C for 5 min, followed by quick chilling on ice. 6. For further use in EMSA or secondary structure probing, separate the labeled RNA on a 6% denaturing PAA gel and elute it afterward as described in Subheading 3.1.2. 7. If the RNA is to be used as a riboprobe in Northern blotting, do not add FD; instead, pass the RNA through a Sephadex column as described in Subheading 3.1.5.

3.1.4 Separation on a 6% Denaturing PAAG and Elution of the Labeled RNA

1. Load samples in a 6% denaturing PAAG and run it in 0.5× TBE until the loading dye is at the expected position (see Note 2). 2. Carefully remove one of the glass plates and cover the gel with a thin transparent plastic foil. 3. Place 3–4 Whatman paper labels around the expected position of the labeled RNA. 4. Place a PhosphorImager screen on top of the gel for 1–15 min (see Note 6). 5. Scan the PhosphorImager screen and print the image in 1:1 on an overhead transparency.

6. Determine the location of the labeled RNA with the help of the Whatman paper labels, cut it out with a clean blade, and transfer it to a new Eppendorf tube (see Note 7). 7. Elute the labeled RNA as described in Subheading 3.1.2, and determine the cpm (counts per minute) with a scintillation counter.

3.1.5 Sephadex G-50 Column Preparation and Probe Purification (See Note 8, Fig. 1b)

1. Cut the tip of a 1 mL pipette tip, and place it inside another 1 mL tip (this serves now as a column). 2. Cut a small gauze piece, place it inside the upper pipette tip, and push it to the bottom. 3. Pour 2 mL Sephadex G-50 on top of the gauze so that the column is completely filled. 4. Cut the lid of an Eppendorf tube, and place it in a glass centrifuge tube. 5. Place the column inside the glass tube, and centrifuge for 1 min at 1000 g in a centrifuge with a swing bucket rotor. 6. Discard the Eppendorf tube (with the collected liquid from the Sephadex suspension), and replace it with a new tube with a cut lid. 7. Add the synthesized probe to the column, and centrifuge it as described above. 8. Collect the cleaned probe, add 100 μL DEPC bidist to the column, and centrifuge for 1 min. 9. Collect the cleaned probe and, if necessary, repeat this step with 100 μL bidist again. 10. Measure the cpm/μL.

3.1.6 In Vitro Synthesis and Purification of 5′ Labeled RNA

In vitro synthesize, elute from the gel, and measure the concentration of the unlabeled RNA as described in Subheading 3.1.1, then continue with the following steps:

3.1.7 Dephosphorylation of the RNA

1. Prepare the reaction mix as follows: x μL RNA (30 pmol), 5 μL 10× CIP buffer, 1 μL RNasin, 1 μL CIP, 43-x μL bidist. Total volume = 50 μL. 2. Incubate the reaction at 37 °C for 30 min. Afterward, perform a phenol/chloroform extraction and precipitate the RNA as described above. 3. After centrifugation and washing, dissolve the pellet in 7 μL bidist.
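The volume x that contains 30 pmol RNA follows from the NanoDrop reading and the transcript length. The sketch below is an illustration only: it assumes the commonly used approximation for the molecular weight of single-stranded RNA (number of nucleotides × 320.5 + 159 g/mol), which is not stated in this protocol, and the function names are hypothetical.

```python
def rna_pmol_per_ul(ng_per_ul, length_nt):
    """Convert a mass concentration (ng/uL) into pmol/uL.

    Assumption: MW of ssRNA ~ length * 320.5 + 159 g/mol (approximation).
    ng/uL divided by g/mol gives nmol/uL; times 1000 gives pmol/uL.
    """
    molecular_weight = length_nt * 320.5 + 159.0
    return ng_per_ul / molecular_weight * 1000.0


def cip_reaction_volumes(ng_per_ul, length_nt, pmol_needed=30.0, fill_ul=43.0):
    """Volumes (uL) for the 50 uL dephosphorylation mix of step 1.

    x uL RNA plus (43 - x) uL bidist; the remaining 7 uL are buffer,
    RNasin and CIP as listed in the protocol.
    """
    x = pmol_needed / rna_pmol_per_ul(ng_per_ul, length_nt)
    if x > fill_ul:
        raise ValueError("RNA too dilute: required volume exceeds 43 uL")
    return {
        "RNA": x,
        "10x CIP buffer": 5.0,
        "RNasin": 1.0,
        "CIP": 1.0,
        "bidist": fill_ul - x,
    }


if __name__ == "__main__":
    # Hypothetical example: 200-nt transcript measured at 1000 ng/uL.
    for name, volume in cip_reaction_volumes(1000.0, 200).items():
        print(f"{name}: {volume:.2f} uL")
```

If the RNA is too dilute to fit 30 pmol into 43 μL, the helper raises an error rather than returning a negative water volume; in that case the RNA would have to be concentrated first.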

3.1.8 Labeling with 32P[γ-ATP]

1. Start the reaction by adding the components as follows: 3.5 μL dephosphorylated RNA (15 pmol), 1 μL 10× PNK buffer, 3 μL 32P-[γ-ATP], 0.5 μL RNasin, 1 μL bidist, 1 μL PNK. Total volume = 10 μL. 2. Incubate for 5 min at 37 °C and add 10 μL FD. 3. Denature the sample at 95 °C for 5 min and immediately chill on ice. 4. Load the reaction on a 6% denaturing PAA gel. 5. Run the gel for 1 h at 1200 V. 6. Detect and elute the labeled RNA as described in Subheading 3.1.4.

3.2 Investigation of RNA-RNA Interactions by Electrophoretic Mobility Shift Assay (EMSA)

3.2.1 In Vitro Synthesis of sRNA and Target RNA and Labeling of One Interaction Partner

1. Label the RNA of interest either internally or at the 5′ end and elute it from a denaturing PAAG as described in Subheading 3.1.4. 2. In vitro synthesize and gel-purify the unlabeled target RNA as described in Subheading 3.1.1 and determine its concentration by NanoDrop. 3. If necessary, refold both RNAs by incubating for 5 min at 95 °C in a water bath, followed by immediate chilling on ice for 2 min and subsequent incubation at room temperature.

3.2.2 Binding Reaction

1. Set up the reaction as follows: 1.0 μL labeled sRNA (30,000 cpm), 0.5 μL tRNA (10 g/L), 2.0 μL 5× TMN buffer, x μL target RNA (at least in tenfold excess over the labeled sRNA), 6.5-x μL bidist. Total volume = 10 μL. 2. Start the reaction by incubating at 37 °C; take 1 μL time samples at 0, 1, 2, 5, 10, 15, and 30 min (see Note 9); and add them to tubes with 10 μL FD placed on ice.

3.2.3 Gel Separation of Free RNA and Duplex

1. Prepare a 6% native PAAG and pre-run it in 1× TBE at constant 250 V at 4 °C. 2. Load the samples and run the gel for 3 h at 250 V and 4 °C to separate free RNA and complex (see Note 10). 3. Dry the gel at 80 °C in a gel dryer and expose it with a PhosphorImager screen for at least 2 h or overnight. 4. Detect and quantify the amounts of free and bound sRNA with the PhosphorImager analysis program. 5. Repeat the EMSA with reciprocal labeling to confirm binding specificity (see Note 11). Figure 2 shows an example from the RNAIII/repR mRNA replication control system of plasmid pIP501 [3].


3.2.4 Determination of the Kd of the RNA-RNA Complex

1. Set up the EMSA as above but with constant incubation time and different concentrations of the unlabeled target mRNA. 2. Determine the percentage of the complex for each concentration of unlabeled target RNA by PhosphorImaging. 3. Plot the fraction of bound RNA on the Y-axis with respect to the concentration of unlabeled target RNA (X-axis). This will result in a sigmoidally shaped curve. 4. The Kd corresponds to the concentration of unlabeled target RNA at which 50% of the labeled sRNA is bound.
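The evaluation in steps 2–4 can be sketched numerically. The example below is a simplified illustration, not the authors' analysis software: it computes the bound fraction from the quantified band intensities and estimates the Kd by linear interpolation between the two concentrations that bracket 50% binding. Fitting a full binding curve would be the more rigorous alternative. Function names and the example values are hypothetical.

```python
def fraction_bound(free_signal, complex_signal):
    """Fraction of labeled sRNA in the complex band of one lane."""
    return complex_signal / (free_signal + complex_signal)


def kd_by_interpolation(concentrations, fractions):
    """Concentration of unlabeled target RNA at which 50% is bound.

    Uses linear interpolation between the two neighboring data points
    that bracket a bound fraction of 0.5 (a simplification).
    """
    points = sorted(zip(concentrations, fractions))
    for (c_low, f_low), (c_high, f_high) in zip(points, points[1:]):
        if f_low <= 0.5 <= f_high and f_high > f_low:
            return c_low + (0.5 - f_low) * (c_high - c_low) / (f_high - f_low)
    raise ValueError("50% binding is not bracketed by the data")


if __name__ == "__main__":
    # Hypothetical titration: target RNA in nM and quantified band signals.
    target_nm = [1.0, 3.0, 10.0, 30.0, 100.0]
    free = [900.0, 760.0, 500.0, 250.0, 90.0]
    bound = [100.0, 240.0, 500.0, 750.0, 910.0]
    fractions = [fraction_bound(f, b) for f, b in zip(free, bound)]
    print("estimated Kd (nM):", kd_by_interpolation(target_nm, fractions))
```

The half-saturation concentration equals the Kd only when the labeled RNA is present at a concentration far below the Kd, which is why the unlabeled partner is used in large excess.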

3.2.5 Localization of the Minimal Inhibitory Sequence of an sRNA

1. Design PCR templates coding for different regions or mutated species of the sRNA. 2. Use in vitro transcription with T7 RNA polymerase from these PCR templates to synthesize internally labeled truncated or mutated sRNA species. 3. Perform an EMSA with unlabeled full-length target mRNA and, in parallel, wild-type and truncated or mutated labeled sRNA species. 4. Determine the percentage of complex for each labeled sRNA. 5. The shortest sRNA that yields the same amount of complex as the wild-type sRNA comprises the minimal inhibitory sequence. Figure 3 shows such an approach for the RNA antitoxin SR4 that interacts with the toxin-encoding bsrG mRNA [4].

3.2.6 Determination of the Apparent Binding Rate Constant kapp

1. Set up an EMSA with unlabeled target RNA (e.g., 4 × 10⁻⁹ M) and labeled sRNA at an at least tenfold lower concentration (5000 cpm/3 μL; see Note 12), and incubate the mix at 37 °C. 2. Take 1.8 μL time samples at 1, 2, 5, 10, and 30 min by pipetting them into 10 μL of FD placed on ice to stop the reaction. 3. Load the samples at the same time on a native 6% or 8% PAAG and separate free RNA and complex at 250 V for 3 h in the cold room, dry, and expose the gel as described in Subheading 3.2.3. 4. Calculate the percentage of free sRNA and sRNA/mRNA complex as in Subheading 3.2.4. 5. Determine the time t1/2 after which 50% of the sRNA is bound in complex with the target mRNA (see Fig. 4a). 6. Use the equations in Fig. 4b to calculate the second-order rate constant k2 for duplex formation and, finally, the apparent binding rate constant kapp.
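The equations of Fig. 4b are not reproduced in the text. As a hedged illustration, the sketch below assumes the standard pseudo-first-order treatment, which applies because the unlabeled target RNA is in at least tenfold excess: the observed rate constant is ln 2 / t1/2, and dividing it by the target RNA concentration gives a second-order rate constant in M⁻¹ s⁻¹. Whether this matches the figure term by term is an assumption, and the function name is hypothetical.

```python
import math


def kapp_from_half_time(t_half_s, target_conc_molar):
    """Apparent pairing rate constant (M^-1 s^-1), assuming pseudo-first-order kinetics.

    k_obs = ln(2) / t_half        (s^-1)
    k_app = k_obs / [target RNA]  (M^-1 s^-1)
    """
    if t_half_s <= 0 or target_conc_molar <= 0:
        raise ValueError("half-time and concentration must be positive")
    k_obs = math.log(2) / t_half_s
    return k_obs / target_conc_molar


if __name__ == "__main__":
    # Hypothetical example: 50% duplex after 2 min with 4e-9 M target RNA.
    print(f"kapp = {kapp_from_half_time(120.0, 4e-9):.3e} M^-1 s^-1")
```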

Fig. 3 Determination of the minimal functional region of an antisense RNA. Here, this is shown for the cis-encoded sRNA SR4, an RNA antitoxin that regulates bsrG mRNA encoding a small (39 aa) toxin. Red, minimal inhibitory region of SR4. (a) Schematic representation of the stem and loop regions of SR4, with deletions or nt exchanges indicated in white. (b) Representative binding assays with wild-type and truncated/mutated SR4 derivatives. Wild-type and truncated/mutated SR4 species were 5′ labeled and used in at least 10-fold lower equimolar amounts compared with the full-length bsrG RNA. The concentration of unlabeled wild-type bsrG RNA species is indicated. F, free labeled RNA; D, duplex between SR4 and bsrG RNA. EMSAs with six of these mutants are shown, which were used for the calculation of the kapp values (as described in Fig. 4) depicted in (a). (Taken from [4])

Figure 4 shows an example from the analysis of the antisense RNA of plasmid pT181 that base-pairs with repC mRNA to regulate plasmid replication [5].

Fig. 4 Calculation of the apparent pairing rate constant of an antisense RNA. (a) The antisense RNA (RNAII146) of staphylococcal plasmid pT181 was 32P labeled and its target repC RNA encoding the essential replication initiator protein was used at 2 × 10⁻⁹ M. F and D refer to free antisense RNA and stable RNA duplexes, respectively. The autoradiogram of the EMSA is shown, and time points are indicated. (b) Calculation of kapp. Panel (a) is taken from [5]

3.3 Determination of Secondary Structures of the sRNA and the sRNA/Target RNA Complex

3.3.1 Enzymatic Secondary Structure Probing of RNA

In vitro synthesis, 5′ labeling of the RNA, and gel purification are performed as described in Subheading 3.1.6 (see Notes 13–15). Table 1 provides an overview of the commonly used nucleases and their dilutions.

Table 1 Properties and dilutions for RNases/nuclease S1

Enzyme        Specificity                     Dilutions
RNase T1      ssRNA, 3′ of G                  10- to 1000-fold
RNase T2      ssRNA, 3′ of A >> G, U, C       50- to 1000-fold
RNase A       ssRNA, 3′ of U or C             10^3- to 10^5-fold
RNase V       ds or stacked RNA               Up to 100-fold
Nuclease S1   ssRNA, unspecific               10- to 1000-fold

Notes: More active at lower pH. Sensitive to divalent cations. Nuclease S1 needs 1 mM ZnCl2.

3.3.2 RNase and Nuclease S1 Cleavage Reactions

1. Set up the following cleavage reaction. Do not forget a control sample without RNase. Use three different dilutions for each nuclease (see Note 16): 3 μL labeled RNA (30,000 cpm), 0.5 μL tRNA (10 g/L), 2 μL 5× TMN buffer, 0.5 μL RNasin, 4 μL bidist, 1 μL RNase (T1, T2, A, V or nuclease S1, see Notes 17–19); add the nuclease last. Total volume = 10 μL. 2. Incubate at 37 °C for 5 min.

3. Stop the reaction by adding 10 μL FD. 4. Denature the samples at 95 °C followed by chilling on ice. 5. Keep the samples on ice until gel loading or store them at -20 °C for a longer gel run. 6. Separate 5 μL of each reaction alongside a control without RNase (C), marker, and a ladder (see Subheadings 3.3.3 or 3.3.4) in an 8% denaturing PAAG at 1200 V, 20–40 mA until BPB is 7 cm from the bottom of the gel. 7. Transfer the gel onto Whatman paper, dry it in a gel dryer, and expose it overnight with a PhosphorImager plate. 8. Detect and evaluate your gel with a PhosphorImager-based program.

3.3.3 Preparation of T1 Ladder

1. Set up the following denaturation reaction: 2.0 μL labeled RNA (30,000 cpm), 0.5 μL tRNA (10 g/L), 6.0 μL RNA sequencing buffer. Total volume = 8.5 μL. 2. Incubate at 65 °C for 2 min and then slowly cool down to room temperature (in a beaker or in the thermoblock). 3. Cleave the denatured RNA with RNase T1 as follows: 8.5 μL denatured labeled RNA from above, 0.5 μL RNasin, 1 μL RNase T1 (dilute as determined before). Total volume = 10 μL. 4. Incubate for 5 min at 37 °C and add 10 μL FD. 5. Denature at 95 °C for 2 min and chill on ice prior to gel loading. 6. The rest can be frozen at -20 °C for a second gel run.

3.3.4 Preparation of Alkaline Ladder

1. Set up the following reaction: 2 μL labeled RNA (30,000 cpm), 0.5 μL tRNA (10 g/L), 1.5 μL bidist, 1.0 μL alkaline solution. Total volume = 5.0 μL. 2. Incubate at 95 °C for 20 s. 3. Add 5 μL stop solution. 4. Add 10 μL FD. 5. Denature at 95 °C for 2 min, chill on ice prior to gel loading or store at -20 °C. 6. Load your samples on an 8% denaturing PAAG that has been prerun for 15–20 min at 1000–1200 V, and separate them at 1200 V, 25 mA until BPB is about 7 cm from the bottom of the gel (see Notes 20–23). 7. Transfer your gel onto Whatman paper and dry it in a gel dryer for 30 min at 80 °C. 8. Expose the gel overnight with a PhosphorImager screen and evaluate with the corresponding program.


Fig. 5 Secondary structure probing of the antisense SR5, an RNA antitoxin. (a) secondary structure probing of SR5 with RNases. Purified, 5′-labeled wild-type SR5 was subjected to limited cleavage with the RNases indicated. Digested RNAs were separated on 15% (left) or 8% (centre and right) denaturing PAAGs. Autoradiograms are shown. C, control without RNase treatment; L, alkaline ladder; T1L, T1 digestion under denaturing conditions. The nucleotide positions are indicated. (b) Secondary structure of SR5 consistent with the cleavage data in A. Major (dark symbols) and minor (light symbols in the same colour) cuts are indicated (see box). The four stem loops SL1-SL4 and the single-stranded regions J1-J3 are indicated. (Taken from [6])

Figure 5 shows an example for the secondary structure probing of the RNA antitoxin SR5 that interacts with the bsrE toxin mRNA [6].

3.3.5 Determination of the RNA Secondary Structure After Enzymatic Cleavage (See Notes 24–26)

1. Use a computer prediction (RNAfold web server from Vienna University) as the basis for comparison and evaluation of your experimental results. 2. Use the T1 ladder, the alkaline ladder, and the pBR322×MspI marker to determine the number (from the 5′ end of your RNA) of the first nt at the bottom of the gel for which you see a signal (usually, you cannot see the first 5–10 nt). 3. Subtract the few cuts you see in your control reaction (unspecific self-cleavage) from the cuts obtained with RNases T1, T2, V, A, and nuclease S1. Cuts that one of the RNases or nuclease S1 produces at these same positions cannot be used for the determination of the secondary structure. 4. Use a combination of RNase T1, T2, and A cuts to identify linear single-stranded regions or loop regions. T1 cleaves 3′ of each single-stranded G residue, whereas T2 cleaves in all single-stranded regions, but preferentially 3′ of A residues. RNase A also cleaves in single-stranded regions, but preferentially 3′ of U's and C's. In loops, nt toward the 3′ end are less frequently cleaved by RNases T2 or A, i.e., not all single-stranded A's, U's, or C's might yield signals in these regions. 5. Use cuts of RNase V to identify double-stranded or stacked regions. RNase V does not cleave each base in a double-stranded region, but only a few, and cuts are much more frequently observed in 5′ parts of stems than in 3′ parts. 6. If RNase V1 is no longer commercially available, you have to assign double-strandedness to all regions that are not cleaved by T1, T2, A, or S1. 7. To achieve a good resolution toward the 3′ end, additional longer gel runs can be performed (see Fig. 5a; other examples are found in [1, 5–8]). In Fig. 5b, you see how a secondary structure has been assigned manually. 8. Short double-stranded regions are likely to be in a dynamic state of opening and closing, showing partial protection (as, e.g., the stem of SL3 in Fig. 5).
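The bookkeeping of steps 3–6 can be sketched in code. This is only an illustration of the assignment logic and does not replace the manual evaluation described above. The function and label names are hypothetical, and cut positions are assumed to be 1-based nucleotide numbers counted from the 5′ end.

```python
SINGLE_STRAND_PROBES = ("T1", "T2", "A", "S1")  # cleave unpaired nucleotides
DOUBLE_STRAND_PROBES = ("V",)                   # cleaves paired or stacked regions


def assign_structure(length, cuts, control_cuts, infer_paired_from_protection=False):
    """Label each position of an RNA from nuclease cleavage data.

    cuts: dict mapping probe name -> set of cut positions.
    control_cuts: positions cut in the no-RNase control; these are
        treated as uninformative (step 3).
    infer_paired_from_protection: if True, positions without any cut are
        labeled as inferred double-stranded (step 6, no RNase V available).
    """
    assignment = {}
    for position in range(1, length + 1):
        if position in control_cuts:
            assignment[position] = "uninformative"
            continue
        single = any(position in cuts.get(p, set()) for p in SINGLE_STRAND_PROBES)
        double = any(position in cuts.get(p, set()) for p in DOUBLE_STRAND_PROBES)
        if single and double:
            assignment[position] = "ambiguous"
        elif single:
            assignment[position] = "single-stranded"
        elif double:
            assignment[position] = "double-stranded"
        elif infer_paired_from_protection:
            assignment[position] = "double-stranded (inferred)"
        else:
            assignment[position] = "no cut"
    return assignment


if __name__ == "__main__":
    # Hypothetical cut positions for a 25-nt RNA.
    example_cuts = {"T1": {5, 12}, "T2": {6, 12}, "V": {12, 20, 21}, "S1": {3}}
    result = assign_structure(25, example_cuts, control_cuts={3})
    for position in (3, 5, 12, 20, 10):
        print(position, result[position])
```

Positions labeled "ambiguous" (cut by both single-strand and double-strand probes) may correspond to the dynamically opening short stems mentioned in step 8 and need manual inspection.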

3.3.6 Secondary Structure Probing of RNA-RNA Complexes

The setup is as in Subheading 3.3.1, but either increasing concentrations or increasing incubation times of the unlabeled complementary RNA have to be employed (see Note 27). 1. Set up the following reaction for complex formation prior to the RNase cleavage (see Note 14): 2 μL labeled RNA (30,000 cpm, see Note 28), 0.5 μL tRNA (10 g/L), 2 μL RNA (unlabeled interaction partner; different concentrations, see Note 29), 2 μL bidist, 2 μL 5× TMN buffer. Total volume = 8.5 μL. 2. Incubate at 37 °C for 10 min (or under modified conditions as determined before). 3. Add 0.5 μL RNasin and 1 μL of the corresponding RNase. 4. Incubate at 37 °C for 5 min. 5. Add 10 μL FD and place on ice. 6. Denature for 2 min at 95 °C followed by chilling on ice. 7. Perform PAAG electrophoresis and evaluation as described in Subheading 3.3.1.

3.3.7 Chemical Probing of RNA by Modification with DMS, CMCT, or Pb2+

First, you modify your RNA. Afterward, you anneal a 5′ labeled primer and reverse-transcribe the RNA, yielding cDNA, which is separated on a denaturing PAAG as for enzymatic probing. Primer elongation stops at a modified base. For primer design, see Notes 30 and 31.

3.3.8 Chemical Probing of RNA by Modification with DMS

DMS modifies RNA at the N3 of cytidine and the N1 of adenosine. This allows the detection of unpaired C and A residues. 1. Modify your RNA as described in Table 2. 2. Add 2 μL 3 M Na acetate and 60 μL 96% ethanol and mix. 3. Precipitate for 10 min at -20 °C and centrifuge at 13,000 rpm in a cooled centrifuge. 4. Wash the pellet with 200 μL 80% ethanol and dissolve it in 9 μL sterile bidist. 5. Hybridize the primer by adding 1 μL 5′-end labeled primer (100,000 cpm) and 2 μL dNTP mix, for a total volume of 12 μL. 6. Incubate for 5 min at 65 °C. 7. Afterward, add: 4 μL 5× SuperScript IV buffer, 1 μL 0.1 M DTT, 1 μL RNasin, 1 μL bidist, 1 μL SuperScript IV (2 U/μL). Total volume = 20 μL. 8. Perform reverse transcription for 45 min at 55 °C or for 20 min each at 50 °C, 55 °C, and 60 °C. 9. Remove the template by adding 20 μL of removal buffer and 3 μL KOH. 10. Incubate for 3 min at 90 °C. 11. Incubate for 1 h at 37 °C. 12. Precipitate the cDNA by adding 6 μL 3 M acetic acid, 100 μL 0.3 M Na acetate, and 300 μL 96% ethanol. 13. Incubate at -20 °C for 10 min.

14. Centrifuge 10 min at 13,000 rpm and 4 °C. 15. Wash the pellet with 200 μL 80% ethanol and dissolve in 2.5 μL bidist. 16. Add 2.5 μL FD, denature at 95 °C for 2 min, and chill on ice. 17. Separate either 2.5 or 5 μL of the sample on an 8% denaturing PAAG. 18. Freeze the remaining sample at -20 °C for a later long gel run.

Table 2 Pipetting scheme for DMS modification (volumes in μL)

Sample number              Control   1       2       3       4       5
2 pmol/μL RNA              1         1       1       1       1       1
H2O to 20 μL               13        12      11      11      11      11
Place samples at 90 °C for 1 min and then shift to ice for 1 min. Bring the samples to 20 °C for 5 min and add:
5× Native buffer           4         4       4       4       –       –
5× SD buffer               –         –       –       –       4       4
Let the samples stay at 20 °C for a further 10 min and add:
Total tRNA 1 mg/mL         2         2       2       2       2       2
DMS 1/16 in ethanol        –         1       2       –       2       –
DMS 1/8 in ethanol         –         –       –       2       –       2
Incubation at 20 °C        5 min     5 min   5 min   5 min   2 min   2 min

3.3.9 Chemical Probing of RNA by Modification with CMCT

CMCT modifies RNA at the N3 of uridine and the N1 of guanosine. This allows the detection of unpaired U and G residues. 1. Modify your RNA as described in Table 3. 2. Afterward, use the same procedure for reverse transcription, precipitation, sample preparation, and gel run as in Subheading 3.3.8.

3.3.10 Chemical Probing of RNA by Modification with Pb2+

Pb2+ modifies single-stranded RNA in loops or linear regions. 1. Modify your RNA as described in Table 4. 2. Add 10 μL FD and put the samples on ice to stop the reaction. 3. Denature for 2 min at 95 °C and chill on ice. 4. Separate 2.5 or 5 μL of the sample on an 8% denaturing PAA gel. Store the rest of the samples at -20 °C.
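Before pipetting, the schemes in Tables 2–4 can be sanity-checked by summing each sample column against its target volume (20 μL for DMS and CMCT, 10 μL for lead cleavage). The sketch below uses the volumes as read from the tables, in the order RNA, H2O, 5× buffer (native or SD), tRNA, modifying reagent; the function name is a hypothetical helper.

```python
# Per-sample volumes in uL: [RNA, H2O, 5x buffer, tRNA, modifying reagent].
TABLE_2_DMS = {
    "Control": [1, 13, 4, 2, 0],
    "1": [1, 12, 4, 2, 1],
    "2": [1, 11, 4, 2, 2],
    "3": [1, 11, 4, 2, 2],
    "4": [1, 11, 4, 2, 2],
    "5": [1, 11, 4, 2, 2],
}
TABLE_3_CMCT = {
    "Control": [1, 13, 4, 2, 0],
    "1": [1, 12, 4, 2, 1],
    "2": [1, 11, 4, 2, 2],
    "3": [1, 8, 4, 2, 5],
    "4": [1, 11, 4, 2, 2],
    "5": [1, 8, 4, 2, 5],
}
TABLE_4_LEAD = {
    "Control": [1, 6, 2, 1, 0],
    "1": [1, 4, 2, 1, 2],
    "2": [1, 4, 2, 1, 2],
}


def columns_off_target(table, target_ul):
    """Return the sample names whose volumes do not add up to target_ul."""
    return [name for name, volumes in table.items() if sum(volumes) != target_ul]


if __name__ == "__main__":
    print("Table 2 mismatches:", columns_off_target(TABLE_2_DMS, 20))
    print("Table 3 mismatches:", columns_off_target(TABLE_3_CMCT, 20))
    print("Table 4 mismatches:", columns_off_target(TABLE_4_LEAD, 10))
```

An empty list for a table means every sample column reaches its stated final volume; the same check is useful whenever a scheme is adapted to other reagent amounts.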

Table 3 Pipetting scheme for CMCT modification at 20 °C (volumes in μL)

Sample number              Control   1        2        3        4        5
2 pmol/μL RNA              1         1        1        1        1        1
H2O to 20 μL               13        12       11       8        11       8
Place samples at 90 °C for 1 min and then shift to ice for 1 min. Bring the samples to 20 °C for 5 min and add:
5× Native buffer           4         4        4        4        –        –
5× SD buffer               –         –        –        –        4        4
Let the samples stay at 20 °C for a further 10 min and add:
Total tRNA 1 mg/mL         2         2        2        2        2        2
CMCT (40 mg/mL)            –         1        2        5        2        5
Incubation at 20 °C        20 min    20 min   20 min   20 min   20 min   20 min

Table 4 Pipetting scheme for lead cleavage at 20 °C (volumes in μL)

Sample number              Control   1       2
Labelled RNA               1         1       1
H2O to 10 μL               6         4       4
Put samples to 90 °C for 1 min and then shift to ice for 1 min. Incubate the samples at 20 °C for 5 min and then add:
5× buffer                  2         2       2
Leave the samples at 20 °C for 15 min and then add:
5 mg/mL tRNA               1         1       1
40 mM Pb2+ acetate         –         2       –
80 mM Pb2+ acetate         –         –       2
Incubation at 20 °C        5 min     5 min   5 min

3.3.11 Determination of the Secondary Structure After Chemical Modification

Use the same approach as described for enzymatic probing (see Subheading 3.3.5), except that stops after DMS or CMCT indicate single-stranded C and A or G and U residues, respectively, and lead cleavages indicate single-stranded regions in general.

3.4 Analysis of RNA Chaperones That Promote the sRNA-mRNA Interaction

3.4.1 DRaCALA (Differential Radial Capillary Action of Ligand Assay)

1. Dilute the labeled RNA (2000 cpm) with bidist to 2.5 μL. 2. Add 0.5 μL of tRNA (see Note 32) and 1 μL of 5× TMN buffer (see Note 33). 3. Add 1 μL of the protein of interest (see Note 34) and incubate at 37 °C for 5 min. 4. Mark a spot on the nitrocellulose membrane (see Note 35), and drop 2 μL of the mixture on the membrane (see Note 36). 5. As a negative control, use the protein dilution buffer instead of the protein. Suitable positive controls are known RNA binding proteins like Hfq and RNase A. 6. Let the membrane dry at room temperature, expose it with a PhosphorImager screen overnight (see Note 37), and detect. 7. All proteins will be immobilized at the initial application spot in the middle. If the labeled RNA is bound, the activity will be retained there, forming a black spot in the middle. Unbound RNA will migrate to the periphery and no defined spot in the middle is formed. Intermediary results can be interpreted as partially bound RNA, degraded RNA, or very unstable complexes (see Fig. 6).

Fig. 6 DRaCALA, Differential Radial Capillary Action of Ligand Assay. Left, the principle of the assay is shown. Right [9]: Investigation of whether GapA with or without the bound 39 aa protein SR1P can bind RNA. Negative control, buffer; positive controls, Hfq and RNase A

Figure 6 shows how the potential RNA binding of glyceraldehyde-3P-dehydrogenase A in combination with the small protein SR1P was analyzed by DRaCALA [9].

3.4.2 RNA-Protein EMSA

Set up an EMSA using your labeled RNA of interest and different protein concentrations, including a sample without protein, as follows: 1. Dilute the labeled RNA (2000 cpm) with bidist to 5 μL. 2. Add 1 μL of tRNA (see Note 32) and 2 μL of 5× TMN buffer (see Note 33). 3. Add 2 μL of the protein of interest (see Note 34) and incubate at 37 °C for 5 min. 4. Add 10 μL of native RNA loading buffer and immediately load on a running native PAA gel (see Note 38). 5. Run the gel at 250 V for 3 h at 4 °C (see Note 39). 6. Dry the gel at 80 °C in a gel dryer and expose it with a PhosphorImager plate overnight (see Note 37). 7. Evaluate the RNA-protein binding using a PhosphorImager-based analysis program (see Note 40).

3.4.3 Applications of RNA-Protein EMSA

1. As described in Subheading 3.2.4, the Kd of the interaction can be determined using different protein concentrations. 2. As described in Subheading 3.2.5, regions important for the interaction can be determined using truncated and mutated RNA species (see Fig. 7a). Figure 7 shows how binding of the RNA chaperone CsrA to SR1 or ahrC mRNA was analyzed by EMSA (Taken from [2]).

3.4.4 RNA-RNA-Protein EMSA

Set up an EMSA using your labeled RNA and an increasing amount of unlabeled target RNA with and without the protein of interest as follows: 1. Dilute the labeled RNA (2000 cpm) with bidist to 4 μL. 2. Add 1 μL of tRNA (see Note 32) and 2 μL of 5× TMN buffer (see Note 33). 3. Add 2 μL of the protein of interest (see Note 34) and incubate at 37 °C for 5 min. 4. Add 1 μL of the unlabeled target RNA (see Note 41) and incubate at 37 °C for 10 min (see Note 38). 5. Add 10 μL of native RNA loading buffer and immediately load on a running native PAA gel. 6. Run the gel at 250 V for 3 h at 4 °C (see Note 39). 7. Dry the gel at 80 °C in a gel dryer and expose it with a PhosphorImager screen overnight (see Note 37) and evaluate.


Fig. 7 EMSA for the analysis of RNA chaperone CsrA promoting the SR1/ahrC mRNA interaction. (a) EMSA with purified internally 32P-[α-UTP] labelled RNA and increasing concentration of purified CsrA. 0.15 fmol of labelled wild-type or mutated SR1 were incubated with CsrA at the indicated concentrations in a total volume of 10 μL (final RNA concentration 0.015 nM) for 10 min, followed by separation on an 8% native PAA gel. The investigated mutations are indicated. Autoradiograms of the gels are shown. The Kd value was determined. (b) SR1/ahrC mRNA complex formation in the presence and absence of CsrA. Complex formation was monitored in a 6% native PAA gel. 0.15 fmol internally 32P-[α-UTP]-labeled full-length ahrC mRNA per reaction was used. (Taken from [2])

3.4.5 Evaluation of RNA-RNA-Protein EMSA

1. As a control for protein binding, the RNA-protein complex should be visible in the appropriate samples. 2. If the protein facilitates the RNA-RNA interaction, more complex is found in its presence (see Fig. 7b). By contrast, if the protein interferes with the RNA-RNA interaction by blocking interacting sequences, less complex is observed in its presence. 3. Repeat the experiment with swapped labeled and unlabeled RNA. Sometimes one variant gives better results. 4. In the case of a stable ternary RNA-RNA-protein complex, an additional band will be visible above the other complex bands, but it is also possible that the RNA/RNA interaction displaces the protein from the complex. 5. Furthermore, a combination with RNAs mutated at the protein binding site(s) can prove that the promotion of the RNA-RNA interaction is directly due to protein binding to the RNA. In that case, a control experiment is necessary to show that the mutations do not affect the RNA-RNA interaction.

3.4.6 RNA Secondary Structure Probing in RNA-Protein Complexes

Set up a secondary structure probing experiment with and without the addition of the protein of interest as follows: 1. Dilute the labeled RNA (30,000 cpm) with bidist to 2 μL. 2. Add 0.5 μL of tRNA (see Note 32) and 2 μL of 5× TMN buffer (see Note 33). 3. Add 4 μL of the protein of interest (see Notes 34 and 42) and incubate at 37 °C for 10 min. 4. Add 0.5 μL of RNasin. 5. Add 1 μL of diluted RNase T1, T2, A, or V (see Note 43) and incubate for 5 min at 37 °C. 6. Add 10 μL of FD to stop the reaction. 7. Denature the samples at 95 °C for 5 min and chill on ice. 8. Keep the samples on ice until gel loading or store them at -20 °C for a longer gel run. 9. Separate 5 μL of each reaction alongside a control without RNase, marker, and a ladder (see Subheading 3.3) in an 8% denaturing PAAG at 1200 V, 20–40 mA until BPB is 7 cm from the bottom of the gel. 10. Transfer the gel onto Whatman paper, dry it in a gel dryer, and expose overnight with a PhosphorImager screen. 11. Detect and evaluate your gel with a PhosphorImager-based program.

3.4.7 Evaluation of Changes in the Secondary RNA Structure by RNA Chaperone Binding

1. The general secondary structure probing evaluation is performed as in Subheading 3.3.5.
2. Protein binding sites will be protected in the presence of the protein and, in addition, induced structural changes can be determined (see Fig. 8). The protected regions might be slightly larger than the binding motif.
3. If both RNAs are able to interact with the protein, the experiment should also be repeated with the other RNA.

Figure 8 demonstrates how secondary structure probing was used to uncover small secondary structure changes introduced by the RNA chaperone CsrA in ahrC mRNA (taken from [2]).

Fig. 8 The chaperone CsrA can induce slight structural changes into ahrC mRNA. (a) Secondary structure probing of 5′ labeled ahrC136 mRNA with RNases T1, T2, A and nuclease S1 after binding of increasing amounts of CsrA. Red dots indicate enhanced protection, green dots better accessibility upon CsrA binding. Autoradiograms of the gels are shown. T1L, T1 cleavage under denaturing conditions. Bars indicate the positions of the GGA motifs that are required for CsrA binding. (b) Secondary structure of ahrC136 mRNA. The SD sequence is boxed; CsrA binding motifs are highlighted in black and region G’ for the initial interaction with SR1 in grey. Red and green residues indicate enhanced protection or accessibility, respectively, according to the autoradiogram shown left. (Taken from [2])

4 Notes

1. T7 promoter sequence: 5′ GAA ATT AAT ACG ACT CAC TAT AGG.
2. Gel running time depends on the size of the RNA of interest, which can be estimated from the running positions of BPB and xylene cyanol.
3. A hard plastic foil or overhead transparency can be made RNase-free by wiping it with a tissue soaked in hydrogen peroxide. Blades used for cutting gels can be cleaned in the same way.
4. tRNA should not be used if the synthesized RNA is to be used for EMSA or as binding partner in secondary structure probing.
5. Some in vitro synthesized RNAs show high intrinsic degradation. In such cases, it is recommended to keep the RNA in EtOH after elution from the gel and only withdraw the amount necessary for planned experiments.
6. Initially, place the PhosphorImager screen on top of the gel for 1 min and scan it to see if the labeled RNA alongside the Whatman labels is visible. If the labeled RNA is not visible, increase the exposure time.
7. To determine the precise location of the labeled RNA, cut holes where activity spots (for the labeled RNA as well as for Whatman labels) are printed on the transparency and align it precisely on top of the Whatman labels (blue dots).
8. All the steps should be performed under RNase-free conditions with the use of gloves to prevent probe degradation. The DNA probe should be heated to 95 °C for 5 min and then placed on ice before loading onto the column.
9. Alternatively, the same amount of labeled sRNA can be incubated with different amounts of unlabeled target RNA for 30 min at 37 °C; the reactions are then stopped and separated on a native PAAG as above.
10. Gel electrophoresis should be performed in a cold room with a temperature not exceeding 4 °C. Higher temperatures can lead to dissociation of the formed complexes.
11. To confirm specific binding, an unlabeled heterologous competitor RNA can be added, and complex formation studied. A much higher amount of competitor RNA than of unlabeled sRNA should be needed to displace the labeled target RNA from the complex. Furthermore, in vitro-synthesized mutated sRNA or target RNA species can be employed to narrow down important binding motifs.
12. To determine kapp, make sure that the reaction follows pseudo-first-order kinetics, i.e., that the excess of unlabeled target RNA is so large that its concentration barely changes when a small amount of it is bound by the sRNA. As a control, repeat the EMSA with a threefold lower concentration of labeled RNA (just visible after o/n exposure). The calculated t1/2 should be the same. If this is not the case, the reaction does not follow pseudo-first-order kinetics and a higher concentration of unlabeled target mRNA has to be used. Usually, kapp values for cis-encoded sRNAs (bona fide antisense RNAs) are in the order of 10⁵ to 10⁶ M⁻¹ s⁻¹.
13. After gel purification, it is crucial to avoid self-cleavage of the RNA during storage. Some RNAs show less degradation when stored at 4 °C in bidist. It is also possible to store the RNA as ethanol precipitate at -20 °C until use and only centrifuge, wash, and dissolve it in bidist immediately before setting up the reaction.

14. Some RNA species need proper refolding after gel purification and precipitation. The most common procedure is heating the labeled RNA (in bidist) to 75 °C and slowly cooling it down to room temperature in a water bath. Alternatively, the RNA can be heated in bidist to 95 °C (all incorrectly folded structures are removed), chilled on ice, and then transferred to 37 °C for 15–20 min to allow proper folding. The necessity and protocol for refolding have to be evaluated for each individual RNA.
15. Different RNase or S1 nuclease dilutions should be employed to obtain only partial cleavage (no more than 10% of the RNA should be cleaved). This also allows the identification of easily accessible nucleotides with highly diluted RNases and the detection of breathing structures or less accessible structures with lower dilutions.
16. Enzymes diluted in 50 mM Tris-HCl (pH 7.4) can be stored at 4 °C for several weeks.
17. Nuclease S1 needs at least 1 mM Zn2+, which should be added to the reaction buffer.
18. After the addition of TMN buffer, add the corresponding RNase immediately. If the RNA is kept in TMN on the bench for some minutes, the Mg2+ in the TMN buffer might bind to Mg2+ binding pockets in the RNA and induce self-cleavage, yielding a number of unspecific cuts in the control sample. If these cuts coincide with a cut by RNase T1 or T2, this T1 or T2 cleavage cannot be used to assign a base to a single- or a double-stranded region.
19. For band assignments, markers and ladders have to be prepared. All of them can be stored at -20 °C. As marker, MspI-cleaved pBR322 (from NEB) is suitable and can be labeled at the 5′ end with PNK (as in Subheading 3.1.8). An alkaline ladder prepared with the same RNA shows cleavage at all positions and enables exact counting. A T1 ladder, also prepared with the same RNA, displays all G residues in a specific RNA.
20. Even with a short gel run (BPB to 7 cm from the bottom of the gel), you will be able to count only from nt 5–10 of the RNA onward. The first 5–10 nt from the 5′ end cannot be evaluated. If information about them is essential, chemical probing with a labeled primer has to be used.
21. With short gel runs, you can only properly evaluate the sequence up to nt 120 from the 5′ end in an 8% denaturing PAAG. Therefore, at least one longer gel run (xylene cyanol, which co-migrates with nt 79 from the 5′ end of the labeled RNA, should be run to the bottom of the gel) has to be performed to evaluate the structure of the RNA between 90 and 200 nt.

22. Prior to the use of samples frozen with FD, always denature them again at 95 °C and then chill them on ice.
23. With enzymatic probing, up to 200 nt can be evaluated from the 5′ end. If independently folding regions are identifiable, they can be synthesized in vitro individually and their structures determined (this was done for the sRNA SR1, see ref. [1]).
24. To obtain structural information on the 3′ region of an RNA, the RNA can be labeled at the 3′ end by either ligating labeled pCp to the 3′ end or by adding [α-32P]CTP (other nt do not work well for the enzyme) to the 3′ end by terminal transferase. However, in both cases, it is extremely difficult to purify a distinct singly labeled RNA from an 8% or 12% denaturing PAAG. If you accidentally purify two species that differ by one nt, you will have double bands in each RNase cleavage, which makes it impossible to assign bases to structures. Therefore, chemical probing (see Subheadings 3.3.7, 3.3.8, 3.3.9, and 3.3.10) is recommended for RNAs longer than 200 nt.
25. For the final evaluation of the RNA structure, it is recommended to use a predicted structure (e.g., by the RNAfold web server) for the comparison with the experimental data. Usually, rho-independent terminators at the 3′ end of bacterial RNAs are predicted reliably, whereas parts of 5′ or central regions frequently differ from the predictions.
26. Prior to the probing of the RNA/RNA complex, the concentration of the unlabeled RNA that allows at least 50% of the labeled RNA to be bound has to be determined in an EMSA (see Subheading 3.3).
27. Two analyses have to be performed: the first with 5′-labeled sRNA and an excess of unlabeled target mRNA, the second with 5′-labeled target mRNA and an excess of unlabeled sRNA. With this approach, structural alterations of both interaction partners upon binding can be determined. The structure of each binding partner has to be compared in the absence and presence of its interaction partner to distinguish between intramolecular and intermolecular double strands.
28. Use increasing concentrations of the unlabeled RNA to detect the protection of interacting regions as well as possible changes in the secondary structure induced by binding of the complementary RNA.
29. Prepare a master mix for 5 or 10 RNase cleavages (depending on the number of RNases you plan to use) and distribute 8.5 μL of the master mix to tubes containing 0.5 μL RNasin and 1 μL RNase so that you end up with a 10 μL reaction volume in each tube.

30. Although it is always recommended to use a prediction program, e.g., RNAfold, to get an idea of which regions might be double-stranded, you will only know this after you have performed the experiment. Therefore, you will obtain poor results in chemical probing [4] if you accidentally designed primers against long double-stranded regions. In such a case, you have to design additional primers by moving the primer binding site up or down to solve the structure of the entire RNA.
31. Because you will only know afterward whether you chose appropriate conditions for the modification with DMS or CMCT or for cleavage with Pb2+, use different dilutions of the chemicals or different incubation times.
32. The tRNA saturates the unspecific binding capacity of the plastic surfaces and prevents the RNA of interest from sticking to the tube. Because very low concentrations of labeled RNA are used, this reduces experimental variation. Moreover, unspecific RNA binding of the protein is saturated and potentially contaminating RNase activities are inhibited.
33. TMN buffer is in most cases suitable for the assay. However, if the protein needs specific buffer conditions or cations, an adjustment should be considered.
34. Different protein concentrations should be tried to find a proper concentration. If more volume is necessary for the protein, the amount of water can be reduced to keep the total volume constant. As a control, one sample should contain the protein buffer only, to exclude false positive signals that might be observed, for example, in the presence of Zn2+ or EDTA.
35. It is important to use a membrane that binds proteins but not nucleic acids. The membrane has to be dry; therefore, only nitrocellulose is suitable. The pore size has no influence on the assay.
36. The drop will slowly infiltrate the membrane, forming a growing spot. Samples should be spotted at least 1 cm apart.
37. By using more labeled RNA (>10,000 cpm), detection after several hours is also possible.
38. It is advisable to set up the binding reactions time-delayed and to load them on a running gel to achieve comparable incubation times. In most cases, protein-RNA interactions quickly reach a binding equilibrium, while RNA-RNA interactions can take some time in vitro. The incubation time may need to be adjusted.
39. Gel electrophoresis should be performed in a cold room with a temperature not exceeding 4 °C. Higher temperatures can lead to dissociation of the formed complexes.

40. Bands above the band of the free RNA indicate RNA/protein complexes. If several bands emerge, this can be interpreted as different RNA conformations (appearing simultaneously) or complexes of different stoichiometry (appearing successively at higher protein concentrations).
41. The unlabeled RNA should be in large excess of the labeled one. Different concentrations should be tried and a control with water only implemented. Proper concentrations can be determined as in Subheading 3.2.4. If more volume is necessary for the unlabeled RNA, the amount of water can be reduced to keep the total volume constant.
42. The protein has to be well purified and must be free of contaminating RNases; otherwise, a background of cleavage products prevents the identification of specific cleavage sites. It is also advisable to use a protein concentration high enough to bind the majority of the RNA.
43. As in Subheading 3.3, different enzymes in at least three different dilutions should be used. Moreover, for all protein concentrations used, a sample without RNase has to be included to identify unspecific and protein-caused cleavage products.

References

1. Heidrich N, Moll I, Brantl S (2007) In vitro analysis of the interaction between the small RNA SR1 and its primary target ahrC mRNA. Nucleic Acids Res 35:4331–4346
2. Müller P, Gimpel M, Wildenhain T, Brantl S (2019) A new role for CsrA: promotion of complex formation between an sRNA and its mRNA target in Bacillus subtilis. RNA Biol 16:972–987
3. Brantl S, Wagner EGH (1994) Antisense RNA-mediated transcriptional attenuation occurs faster than stable antisense/target RNA pairing: an in vitro study of plasmid pIP501. EMBO J 13:3599–3607
4. Jahn N, Brantl S (2013) One antitoxin – two functions: SR4 controls toxin mRNA decay and translation. Nucleic Acids Res 41:9870–9880
5. Brantl S, Wagner EGH (2000) Antisense-RNA mediated transcriptional attenuation: an in vitro study of plasmid pT181. Mol Microbiol 35:1469–1482
6. Meißner C, Jahn N, Brantl S (2016) In vitro characterization of the type I toxin-antitoxin system bsrE/SR5 from Bacillus subtilis. J Biol Chem 291:560–571
7. Ul Haq I, Brantl S, Müller P (2021) A new role for SR1 from Bacillus subtilis: regulation of sporulation by inhibition of kinA translation. Nucleic Acids Res 49:10589–10603
8. Heidrich N, Brantl S (2003) Antisense-RNA mediated transcriptional attenuation: importance of a U-turn loop structure in the target RNA of plasmid pIP501 for efficient inhibition by the antisense RNA. J Mol Biol 333:917–929
9. Gimpel M, Brantl S (2016) Dual-function sRNA encoded peptide SR1P modulates moonlighting activity of B. subtilis GapA. RNA Biol 13:916–926

Chapter 9

RNA Double-Helix Hybridization Measured by Fluorescence Correlation Spectroscopy

Arne Werner

Abstract

RNA double-strand hybridization is a key player in gene expression regulation. Single-stranded RNA of up to 300 nucleotides forms Watson-Crick base pairs with complementary messenger RNA. Fluorescence-based single-molecule methods allow RNA-RNA interactions to be studied under physiological conditions. This chapter describes how the dissociation constant of RNA double strands can be determined by fluorescence correlation spectroscopy.

Key words Fluorescence correlation spectroscopy, FCS, RNA, dsRNA hybridization

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_9, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

1.1 RNA Double-Strand Hybridization

RNA double-strand hybridization has been demonstrated to play an essential role in gene expression regulation. Regulatory RNA of up to 300 nucleotides (nt) is known to act by Watson-Crick base pairing with complementary single-stranded mRNA [1]. Antisense RNA silences gene expression when forming Watson-Crick base pairs with mRNA. Double-stranded small inhibitory RNA (siRNA) and microRNA (miRNA) of ~20 bp are key players in RNA interference in eukaryotes: their guide strand directs endonucleolytic cleavage of viral mRNA or represses translation of complementary endogenous mRNA [2]. In prokaryotes, the defense against mobile genetic elements by clustered regularly interspaced short palindromic repeats/Cas (CRISPR/Cas) is based on trans-activating crRNA. RNA-RNA interaction has been investigated biochemically by electrophoretic mobility shift assay (EMSA) [3] and predicted from theoretical thermodynamic parameters, as implemented in IntaRNA [4]. Under physiological conditions, RNA-RNA interaction can be studied with fluorescence-based methods. A method that has been demonstrated to be able to determine dissociation constants of ribonucleic acids in vitro and under native conditions is fluorescence correlation spectroscopy (FCS) [5–8]. FCS is based on the fluctuation of fluorescence intensity due to the stochastic motion of single molecules [9, 10]. The autocorrelation of the fluorescence time trace yields the diffusion time of the molecules. With FCS, diffusion coefficients of 10⁻⁶ to 10⁻⁹ cm² s⁻¹ can be measured. The formation of intermolecular complexes can be monitored through a change in diffusion time (Fig. 1). Interaction with nonlabeled molecules of similar or larger size can be monitored by an increase in the diffusion time of the fluorescent species [8]. Dissociation constants in the nano- to micromolar range have been quantified under physiological conditions [5, 7, 8]. This chapter describes how to determine the dissociation constant of complementary RNA single strands by FCS.

Fig. 1 Schematic representation of fluorescence correlation spectroscopy (FCS). (a) The detection volume of a confocal microscope is defined by the focused laser beam. Fluorescence-labeled molecules move through the detection volume by free three-dimensional diffusion. (b) The emitted light is collected by the detector of the optical setup in a time-dependent manner. Fluctuations of the fluorescence intensity are due to the diffusion of the molecules through the confocal detection volume. (c) Autocorrelation analysis of the fluorescence intensity time trace gives the diffusion time τD. Autocorrelation functions of two different diffusion species with significantly different temporal decay are shown

1.2 FCS Theory

Fluorescence correlation spectroscopy (FCS) measures the diffusion time τD of fluorescence-labeled molecules [9]. Stochastic fluorescence intensity fluctuations δF(t) in the detection volume give the intensity-normalized autocorrelation function G(τ) = ⟨δF(t + τ)δF(t)⟩/⟨F⟩². The principle of FCS is shown schematically in Fig. 1. Using Zeiss software (4.0 SP2; R3.5) [11], the fluorescence autocorrelation function is fitted to a model describing free three-dimensional diffusion and the triplet excited state to determine the number of particles N, the diffusion times τD,i, and the fractions fi of n different diffusion species with ∑fi = 1:

$$
G(\tau) = 1 + \frac{1}{N}\,\frac{1 - T + T\,e^{-\tau/\tau_T}}{1 - T}\,\sum_{i=1}^{n}\frac{f_i}{\left(1 + \dfrac{\tau}{\tau_{D,i}}\right)\sqrt{1 + \dfrac{\tau}{S^{2}\,\tau_{D,i}}}} \qquad (1)
$$
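The model of Eq. 1 can be evaluated numerically to get a feeling for how N, T, τT, S, and the diffusion times shape the curve. The following is a minimal Python sketch, not the Zeiss fitting routine; it assumes the structure parameter enters as τ/(S²·τD,i), and the function and argument names are illustrative only.

```python
import math


def autocorrelation(tau, n_particles, triplet_fraction, tau_triplet,
                    structure_param, fractions, diffusion_times):
    """Evaluate G(tau) of Eq. 1 for n freely diffusing species plus triplet blinking.

    tau, tau_triplet and diffusion_times must share one time unit (e.g. seconds).
    fractions are the f_i of Eq. 1 and must sum to 1.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    # Triplet term: (1 - T + T * exp(-tau / tau_T)) / (1 - T)
    triplet = (1.0 - triplet_fraction
               + triplet_fraction * math.exp(-tau / tau_triplet)) / (1.0 - triplet_fraction)
    # Diffusion term: sum over species of f_i / ((1 + tau/tau_D) * sqrt(1 + tau/(S^2 tau_D)))
    diffusion = sum(
        f / ((1.0 + tau / td) * math.sqrt(1.0 + tau / (structure_param ** 2 * td)))
        for f, td in zip(fractions, diffusion_times)
    )
    return 1.0 + triplet * diffusion / n_particles
```

At τ = 0 the function returns 1 + 1/(N(1 − T)), and for τ much larger than all diffusion times it decays to 1, which is a quick sanity check for any set of fitted parameters.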

The structure parameter S is the ratio of the axial to the radial distance between the maximum and the 1/e² intensity of the focused laser beam, S = z0/r0. Furthermore, the triplet decay time τT and the fractional population of the triplet state T are derived from Eq. 1. When the quantum yields η1 and η2 of two diffusion species differ significantly, Eq. 2 [12] can be applied; it relates the apparent fraction f2′ obtained from the fit to the true fraction f2:

$$
f_2' = \frac{f_2\,\eta_2^{2}}{(1 - f_2)\,\eta_1^{2} + f_2\,\eta_2^{2}} \qquad (2)
$$
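Eq. 2 weights each species by the square of its quantum yield, so a two-species fit is biased toward the brighter species. The Python sketch below shows the forward relation and its algebraic inversion. It assumes that the left-hand side of Eq. 2 is the apparent (fitted) fraction and that f2 on the right-hand side is the true fraction; the function names are illustrative.

```python
def apparent_fraction(f2, eta1, eta2):
    """Eq. 2: fraction of species 2 as seen by the fit, given its true fraction f2."""
    weight1 = (1.0 - f2) * eta1 ** 2
    weight2 = f2 * eta2 ** 2
    return weight2 / (weight1 + weight2)


def true_fraction(f2_apparent, eta1, eta2):
    """Eq. 2 solved for f2: correct a fitted fraction for unequal quantum yields."""
    weight1 = f2_apparent * eta1 ** 2
    weight2 = (1.0 - f2_apparent) * eta2 ** 2
    return weight1 / (weight1 + weight2)
```

For example, if species 2 is twice as bright as species 1, an equimolar mixture (f2 = 0.5) appears in the fit as a fraction of 0.8, and `true_fraction(0.8, 1.0, 2.0)` recovers 0.5.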

The diffusion coefficient Di can be calculated from r0 and τD,i using

$$
D_i = \frac{r_0^{2}}{4\,\tau_{D,i}} \qquad (3)
$$

2 Materials

2.1 ssRNAs

The sequences of the 22 nt (5′-AAGGCUGAGAACGGGAAGCUUU-3′) and 26 nt (5′-GAAAAGCUUCCCGUUCUCAGCCUUGA-3′) ssRNA were derived from the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) mRNA (Gene ID 2597). The RNA was chemically synthesized by IBA (Göttingen) and purified by denaturing polyacrylamide gel electrophoresis (PAGE).

2.2 Hybridization

1. 1× Hybridization buffer (HB): 30 mM Tris HCl pH 6.8, 50 mM NaCl, 2.5 mM MgCl2.
2. 10× HB buffer without MgCl2: 300 mM Tris HCl pH 6.8, 500 mM NaCl.
3. 10× MgCl2: 25 mM MgCl2.
4. RNA stock solution: 1 μM ssRNA, 1× HB.
5. Diethylpyrocarbonate (DEPC)-treated water.

2.3 Fluorescence Correlation Spectroscopy (FCS)

1. LSM510-Meta ConfoCor2 (Carl Zeiss Jena, Germany).
2. 633 nm HeNe laser.
3. Dichroic mirror HFT 633 nm, emission filter LP 650 nm.
4. C-Apochromat 40×, 1.2 NA, water immersion objective (Carl Zeiss, Jena, Germany).
5. Avalanche photodiode detector (SPCM-AQR-13-FC, Perkin Elmer).
6. Siliconized cover slides (22 mm) (Hampton Research, Journey, GB).

2.4 Software

1. Zeiss software (4.0 SP2; R3.5).
2. Excel (2005 Microsoft Corporation, Redmond, USA).
3. Origin (Northampton, USA).

3 Methods

3.1 Hybridization Assay

In the RNA hybridization assay [13], the RNA is diluted in 1× HB and diethylpyrocarbonate (DEPC)-treated water to a final concentration of 1 μM. For titration experiments, different amounts of complementary nonlabeled ssRNA are added to solutions of 100 nM Atto647N-labeled ssRNA. Hybridization is performed by heating for 3 min at 95 °C, followed by cooling to room temperature (RT) at 1 °C per 1.2 min. MgCl2 is then added to a final concentration of 2.5 mM, and the solution is incubated at RT for 30 min.

3.2 Performance of FCS Measurement

The FCS experiments are performed using the LSM510-Meta ConfoCor2 [11] (Carl Zeiss Jena, Germany) with the avalanche photodiode detector (SPCM-AQR-13-FC, Perkin Elmer).
1. Prepare the FCS measurement (Note 1).
2. Switch on the confocal microscope and start the FCS software (4.0 SP2; R3.5).
3. Define the beam path by choosing the excitation. The excitation filter depends on the excitation maximum wavelength and the emission filter depends on the emission maximum wavelength (633 nm HeNe laser: excitation filter HFT 633 and emission filter LP 650). Adjust the pinhole diameter to 90 μm.
4. Using a soft plastic tip, add water to an objective that is suited for FCS measurements (C-Apochromat 40×, 1.2 NA, water immersion objective, Carl Zeiss Jena).
5. Switch on the laser at 633 nm wavelength with 2% or 3% of the 5.0 mW HeNe laser, regulated by an acousto-optic tuneable filter (AOTF) (Notes 2–3).
6. Raise the microscope stage very slowly to focus, with the support of the software, on the water drop above the cover glass. Reflections of the focused light appear at the lower and upper surfaces of the cover glass. The two reflections should be 0.1 μm apart. Once the reflection at the upper glass surface is in focus, move 200 μm into the sample solution.
7. Pipet the nanomolar Cy5 solution onto the cover glass. Raise the laser power slowly until a count rate of 200–400 kHz is reached (Notes 4–5).
8. Adjust the pinhole to define a Gaussian-shaped observation volume. The detection volume can be adjusted in x, y, and z directions. Correct pinhole adjustment is a necessary precondition for reproducible diffusion time measurements. The structure parameter S should be 5–8.
9. Pipet about 20–40 μL of sample and Cy5 solution (~10 nM) onto the cover glass and perform FCS measurements.
10. Examine photobleaching and triplet state population. Adjust laser power and measurement time. Standard measurement times are 1 s or 5 s, which are repeated to increase S/N (Notes 6–8).
11. Save every FCS measurement data set. The data file contains the raw data and the fitting results of the chosen fitting model (Note 9).

3.3 Data Analysis

1. During the FCS measurement, the fluorescence intensity time trace of the confocal detection volume is autocorrelated in real time, giving the autocorrelation function G(τ) (Fig. 1). The decay to half-maximum amplitude characterizes the typical diffusion time of the molecule (Fig. 1c). In Fig. 2, the autocorrelation functions G(τ) of ssRNA and dsRNA are shown.
2. Apply the fitting model of the ConfoCor2 software [11], which describes free three-dimensional diffusion and the triplet excited state (Eq. 1) of a solution of one diffusion species (n = 1). The diffusion time τD, the structure parameter S, the triplet decay time τT, the fractional population of the triplet state T, and the number of particles N are derived from the fit.
3. To process the data of the titration experiment, fit the autocorrelation function to Eq. 1, setting n = 2. To calculate the fraction fdsRNA, the diffusion times of unbound and completely bound labeled RNA determined in step 2 are set as fixed values.
4. To determine the dimensions of the effective detection volume, the diffusion time of a standard fluorophore, Cy5, is measured. The radial distance of the confocal detection volume r0 is derived from Eq. 3.
5. The diffusion coefficient D of a sample diffusion species can be calculated from r0 and τD by applying Eq. 3.
6. To determine the dissociation constant, the fraction of dsRNA fdsRNA is determined using Eq. 1 with n = 2 and the two diffusion times fixed. The fraction fdsRNA is plotted against the concentration of nonlabeled complementary ssRNA, [ssRNA_NL] (Fig. 3). With the concentration of labeled RNA, [ssRNA_L], fixed, Eq. 4 [8] can be applied.
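Steps 4 and 5 amount to using Eq. 3 twice: once to calibrate r0 from a standard with known diffusion coefficient, and once to convert a sample diffusion time into D. A minimal Python sketch follows. The function names are illustrative, and the numerical values in the usage note are placeholders rather than measured data; in particular, the diffusion coefficient of the Cy5 standard must be taken from the literature for the buffer and temperature used.

```python
import math


def radial_waist(d_standard, tau_standard):
    """Eq. 3 solved for r0: r0 = sqrt(4 * D * tau_D) for the calibration dye.

    With D in cm^2/s and tau_D in s, r0 is returned in cm.
    """
    return math.sqrt(4.0 * d_standard * tau_standard)


def diffusion_coefficient(r0, tau_d):
    """Eq. 3: D = r0^2 / (4 * tau_D) for a sample species."""
    return r0 ** 2 / (4.0 * tau_d)
```

With a placeholder standard of D = 4 × 10⁻⁶ cm² s⁻¹ and τD = 50 μs, `radial_waist(4e-6, 50e-6)` gives r0 ≈ 2.8 × 10⁻⁵ cm (about 0.28 μm), and a sample with τD = 200 μs then has D = 1 × 10⁻⁶ cm² s⁻¹. Because r0 cancels, this is equivalent to scaling the standard's D by the ratio of the two diffusion times.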

Fig. 2 FCS measurement of ssRNA and dsRNA diffusion time. (Top) Autocorrelation functions of Atto647N-labeled 26 nt ssRNA (red line) and Atto647N-labeled 26 bp dsRNA (green line)

Fig. 3 Titration experiment of hybridized complementary ssRNA. Solutions, containing a constant amount of Atto647N-labeled 26 nt ssRNA and different amounts of nonlabeled complementary ssRNA, were hybridized and measured with FCS. The fraction of dsRNA is plotted against the concentration of nonlabeled ssRNA and fit to Eq. 4 (green line)

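$$
f_{\mathrm{dsRNA}} = \frac{\left(K_d + [\mathrm{ssRNA\_L}] + [\mathrm{ssRNA\_NL}]\right) - \sqrt{\left(K_d + [\mathrm{ssRNA\_NL}] + [\mathrm{ssRNA\_L}]\right)^{2} - 4\,[\mathrm{ssRNA\_NL}]\,[\mathrm{ssRNA\_L}]}}{2\,[\mathrm{ssRNA\_L}]} \qquad (4)
$$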
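The chapter fits Eq. 4 in Origin; the same fit can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not the author's procedure: Kd is the only free parameter, the sum of squared residuals is minimized by a golden-section search on log10(Kd), and all names and the search bounds are hypothetical. All concentrations must be given in the same unit (e.g., nM), and Kd is returned in that unit.

```python
import math


def fraction_dsrna(kd, labeled, unlabeled):
    """Eq. 4: fraction of labeled ssRNA that is bound in the duplex."""
    total = kd + labeled + unlabeled
    return (total - math.sqrt(total ** 2 - 4.0 * unlabeled * labeled)) / (2.0 * labeled)


def fit_kd(unlabeled_concs, fractions, labeled,
           kd_min=1e-3, kd_max=1e6, iterations=200):
    """Least-squares estimate of Kd by golden-section search on log10(Kd)."""
    def sse(log_kd):
        # Sum of squared residuals between Eq. 4 and the measured fractions
        kd = 10.0 ** log_kd
        return sum((fraction_dsrna(kd, labeled, conc) - frac) ** 2
                   for conc, frac in zip(unlabeled_concs, fractions))

    low, high = math.log10(kd_min), math.log10(kd_max)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        left = high - ratio * (high - low)
        right = low + ratio * (high - low)
        if sse(left) < sse(right):
            high = right  # minimum lies in [low, right]
        else:
            low = left    # minimum lies in [left, high]
    return 10.0 ** ((low + high) / 2.0)
```

As a usage check, titration points can be simulated with `fraction_dsrna(50.0, 100.0, c)` for a series of nonlabeled concentrations c and passed to `fit_kd(concs, fractions, 100.0)`, which should return a value close to 50. With noisy experimental data, a dedicated fitting tool that also reports parameter uncertainties is preferable.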

4 Notes

1. Become familiar with laser safety.
2. For confocal microscopes, special cover glasses are required, which have a thickness of 100 nm.
3. Do not switch the laser on and off at intervals shorter than 15 min.
4. The signal-to-noise ratio (S/N) is determined by the counts per molecule (cpm). In addition, longer measurement times increase S/N. Optimum cpm values depend on the laser power, the dye, and its concentration.
5. Fluorophores suited for FCS should have a high quantum yield and extinction coefficient, be photostable, and show a low tendency to populate the triplet state. The laser power typically ranges between 1 and 10% AOTF for a HeNe laser (633 nm).
6. For titration, use special tubes with low affinity to nucleic acids.
7. To diminish the risk of artifacts originating from changes in dimension and shape of the confocal detection volume, the order of the FCS measurements of the different solutions of a titration experiment can be varied.
8. Evaporation of the sample can be avoided by covering it with a small plastic tube cap.
9. To compensate for changes in the shape of the confocal detection volume during the experiment, the structure parameter can be left free in the fitting routine.

References

1. Cech TR, Steitz JA (2014) The noncoding RNA revolution-trashing old rules to forge new ones. Cell 157:77–94
2. Wilson R, Doudna JA (2013) Molecular mechanisms of RNA interference. Annu Rev Biophys 42:217
3. Ul Haq I, Brantl S, Müller P (2021) A new role for SR1 from Bacillus subtilis: regulation of sporulation by inhibition of kinA translation. Nucleic Acids Res 49:10589–10603
4. Mann M, Wright PR, Backofen R (2017) IntaRNA 2.0: enhanced and customizable prediction of RNA–RNA interactions. Nucleic Acids Res 45:W435–W439
5. Werner A, Konarev PV, Svergun DI, Hahn U (2009) Characterization of a fluorophore binding RNA aptamer by fluorescence correlation spectroscopy and small angle X-ray scattering. Anal Biochem 389:52–62
6. Werner A, Hahn U (2009) Fluorescence correlation spectroscopy based characterisation of aptamer ligand interaction. Meth Mol Biol 535:107–114
7. Eydeler K, Magbanua E, Werner A, Ziegelmüller P, Hahn U (2009) Fluorophore binding aptamers as a tool for RNA visualization. Biophys J 96:3703–3707
8. Werner A, Skakun VV, Meyer C, Hahn U (2011) RNA dimerization monitored by fluorescence correlation spectroscopy. Eur Biophys J 40:907–921
9. Rigler R, Elson ES (eds) (2001) Fluorescence correlation spectroscopy: theory and applications. Springer, Berlin
10. Magde D, Elson ES, Webb WW (1972) Thermodynamic fluctuations in a reacting system – measurement by fluorescence correlation spectroscopy. Phys Rev Lett 29:705–708
11. Weisshart K, Jungel V, Briddon SJ (2004) The LSM 510 META – ConfoCor 2 system: an integrated imaging and spectroscopic platform for single-molecule detection. Curr Pharm Biotechnol 5:135–154
12. Fradin C, Zbaida D, Elbaum M (2005) Dissociation of nuclear import cargo complexes by the protein Ran: a fluorescence correlation spectroscopy study. C R Biol 328:1073–1082
13. Werner A, Skakun VV, Ziegelmüller P, Hahn U (2012) A fluorescence correlation spectroscopy-based enzyme assay for human Dicer. Biol Chem 393:187–193

Chapter 10

New Perspectives on Crosstalks Between Bacterial Regulatory RNAs from Outer Membrane Vesicles and Eukaryotic Cells

Moumita Roy Chowdhury and Eric Massé

Abstract

Regulatory small RNAs (sRNAs) help bacteria to survive harsh environmental conditions by posttranscriptional regulation of genes involved in various biological pathways including stress responses, homeostasis, and virulence. These sRNAs can be carried by different membrane-bound vesicles such as extracellular vesicles (EVs), membrane vesicles (MVs), or outer membrane vesicles (OMVs). OMVs fulfill myriad functions in bacterial cells, including carrying a cargo of proteins, lipids, and nucleic acids including sRNAs. A few interesting studies have shown that these sRNAs can be transported to the host cell by membrane vesicles and can regulate the host immune system. Although there is evidence that sRNAs can be exported to host cells and sometimes can even cross the blood–brain barrier, the exact mechanism is still unknown. In this review, we examine the new techniques implemented in various studies to elucidate the crosstalks between bacterial cells and human immune systems by membrane vesicles carrying bacterial regulatory sRNAs.

Key words Outer membrane vesicles, extracellular vesicles, small RNAs, bacterial RNAs, RNA crosstalk, gene regulation

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_10, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

Bacterial cells survive and proliferate in harsh environmental conditions with the help of a tightly regulated network of genes. An important part of this network consists of RNA-based regulators such as small regulatory RNAs (sRNAs), which bacteria use to swiftly adjust their transcriptome and metabolism to new conditions [1]. Bacterial regulatory sRNAs are 50–500 nucleotides long and can regulate transcriptional, posttranscriptional, and translational processes through base pairing with mRNAs or binding to proteins [2, 3]. These sRNAs are involved in the regulation of various biological processes such as iron homeostasis, quorum sensing, and responses to stresses including oxidative stress, membrane stress, phosphosugar stress, starvation, and persistence [4, 5]. Pathogenic bacteria respond quickly to environmental changes for survival and propagation. Regulatory sRNAs, along with RNA-binding proteins and chaperones, regulate virulence gene expression by posttranscriptional modulation in pathogenic bacteria [6, 7].

Biomacromolecules such as proteins, lipids, lipoproteins, and polysaccharides serve as transporters in eukaryotic cells and are attracting considerable interest in drug delivery and tissue engineering [8]. Transporter proteins carry a diverse range of substances, including metabolites and ions, across cellular membranes and help regulate nutrient levels, cell volume, and cellular waste removal [9]. In eukaryotic cells, extracellular vesicles (EVs) carry proteins, lipids, and nucleic acids including miRNAs; they contribute to intercellular communication and removal of cellular waste and are categorized into exosomes, microvesicles, and apoptotic bodies [10]. Exosomes, which are made of endosomal membranes, communicate with the recipient cell either by cell signaling or by internalization through endocytosis or phagocytosis, and deliver proteins, lipids, various metabolites, mRNAs, miRNAs, and DNA to the recipient cell [11]. Exosomal miRNAs have been shown to be associated with cardiovascular diseases and malignancy and can be used for diagnostic and treatment purposes [12, 13]. High-density lipoproteins (HDL) in mammals also transport a diverse set of molecules, although this transport differs from that of EVs [14]. HDL-bound miRNAs act as biomarkers for various disorders and symptoms of cardiovascular diseases [15].

In the same way, bacterial sRNAs are exported to other bacterial cells or to eukaryotic cells through various membrane-bound secretory vehicles. These vehicles are termed outer membrane vesicles (OMVs) in Gram-negative bacteria [16–19], whereas the vesicles from Gram-positive bacteria are generally known as EVs, as in eukaryotic cells [20–23]. EVs in Gram-positive bacteria are spherical, nano-sized membrane blebs (50–200 nm in diameter) that carry macromolecules required for nutrient sensing, communication between cells, tolerance of environmental stress, and virulence [24]. OMVs are spherical vesicles made of lipid bilayer membranes and are released from the outer membrane of both nonpathogenic and pathogenic Gram-negative bacteria into the extracellular environment [25]. In Gram-negative bacteria, OMV production was first observed by Bishop and Work in an E. coli strain in 1965, and in the following year Knox et al. demonstrated by electron microscopy that these secreted vesicles were derived from the outer membrane [26–28]. OMVs contain proteins, nucleic acids, lipids, and other metabolites and are involved in various physiological processes, such as the transport of biomolecules, accumulation of nutrients, antibacterial activity, modulation of host immune responses, and cellular communication [29, 30]. OMVs play an important role in host-pathogen interaction through biofilm formation, transport of virulence factors such as toxins and adhesins from pathogenic bacteria to host cells, and modulation of the host cell defense response [31, 32]. Various plant-associated Gram-negative bacteria also produce OMVs, which elicit plant immune responses and modulate plant colonization [33, 34]. In the following sections, we review the emerging roles of sRNAs carried by membrane vesicles in the host immune response.

2 Regulatory sRNAs Carried by Membrane Vesicles

Recent findings indicate that EVs or OMVs are involved in the export of bacterial sRNAs and enable their delivery to host cells [35–37]. Membrane vesicles can carry a diverse set of bacterial regulatory sRNAs, including SsrS, SsrA, 10Sa, CsrC, RsaC, RNAIII, transfer RNA (tRNA), transfer-messenger RNA (tmRNA), and miRNA-sized RNAs (msRNAs). The various roles of these sRNAs in host cells are described in Table 1. Whether these regulatory sRNAs have any interspecies effect is still not clear.

Table 1 Biological roles of OMV-carried bacterial regulatory sRNAs

Name of sRNA | Modulation of host immune system or pathogenicity of bacteria | References
4.5S RNA | Regulation of virulence gene expression and secretion | [60]
6S RNA | Roles in cell survival and stress responses in pathogens | [61]
10Sa RNA or SsrA RNA or transfer-messenger RNA (tmRNA) | Plays a key role in rescuing stalled translation in bacteria and regulation of bacterial persistence | [39, 62, 63]
CsrB and CsrC | Regulation of virulence | [64]
GcvB | Involved in limiting uptake and biosynthesis of energy-expensive amino acids during nutrient-enriched conditions | [65]
msRNA^a | Involved in reduction of cytokine expression | [41]
RNAIII | Involved in the regulation of virulence gene expression | [66]
RsaC | Involved in nutritional immunity and adaptation to oxidative stress | [47, 67]
SsrS | Plays a crucial role in acid resistance and in the invasion of epithelial cells | [68]
tRNA | Involved in developing immune cells, immune responses, inflammatory responses | [69]

^a msRNAs are generally 22 nt long, are produced from a precursor RNA with a hairpin structure, and have an msRNA* strand, like eukaryotic miRNAs [70]


Blenkiron et al. showed by confocal microscopy that membrane vesicle-associated E. coli sRNAs are transported into the nucleus and cytoplasm of human bladder cells, but the localization of OMV-associated sRNAs within host cells has not yet been determined [35].

2.1 sRNAs Carried by OMVs

Ghosal et al. identified several novel OMV-associated sRNAs [38]. In their study, OMVs were collected from early stationary phase cultures of E. coli K-12 MG1655, and OMV fractions were distinguished from whole-cell protein extracts by analyzing protein profiles. To determine whether the OMVs carried RNAs, a lipid tracer dye was used to label OMV-associated lipids in combination with an RNA quantification dye, and both were visualized by confocal laser scanning microscopy. Notably, small RNA sequencing identified many characterized sRNAs, such as RyeA, RyfD, 6S RNA (ssrS), 4.5S RNA (SRP-RNA), and tmRNA. Although RNAs were abundantly associated with OMV lipids in most cases, free extracellular RNAs and nucleoprotein complexes were also found [38]. Malabirade et al. found full-length transcripts of 12 OMV-associated sRNAs (Table 2) in Salmonella enterica serovar Typhimurium (S. Typhimurium), of which at least four, namely SsrS, 10Sa/tmRNA, RNase P type B (RnpB), and CsrC, were protected from degradation by OMVs [39]. The sRNAs were characterized by analyzing RNA-Seq data of the RNA extracted from S. Typhimurium OMVs [39].
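Protection of vesicle-enclosed sRNAs from degradation is commonly assessed by comparing RNase-treated and untreated vesicle preparations, for example by qPCR as listed for several studies in Table 2. The sketch below is a hypothetical illustration of the underlying arithmetic and is not taken from any of the cited studies. It assumes roughly 100% amplification efficiency, so that a shift of one Ct cycle corresponds to a twofold change in template; the Ct values and the function name `fraction_protected` are invented for the example.

```python
def fraction_protected(ct_control: float, ct_rnase: float) -> float:
    """Estimate the fraction of an sRNA signal that survives RNase treatment.

    Assumes ~100% PCR efficiency (template doubles every cycle), so the
    remaining fraction is 2 ** -(Ct_RNase - Ct_control).
    """
    delta_ct = ct_rnase - ct_control
    return 2.0 ** (-delta_ct)


# Invented Ct values for illustration only (not data from any cited study).
examples = {
    "sRNA in intact OMVs": (24.0, 24.5),        # small Ct shift
    "free RNA spike-in control": (24.0, 31.0),  # large Ct shift
}

for label, (ct_control, ct_rnase) in examples.items():
    remaining = fraction_protected(ct_control, ct_rnase)
    print(f"{label}: {remaining:.1%} of signal remains after RNase")
```

A small Ct shift between control and RNase-treated vesicles indicates that most of the signal is retained, which is consistent with the RNA being enclosed in the vesicle; a large shift, as for the hypothetical free RNA control, indicates degradation.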

2.1.1 Crosstalk Between Host Immune System and OMV-Associated sRNAs

Although the internalization of OMVs into host cells and the delivery of sRNAs are not yet clearly understood, uptake may occur either by fusion of the OMVs with the host cell membrane followed by cargo delivery or through endocytic pathways [28]. sRNAs carried by OMVs are not only involved in various bacterial gene regulatory processes but can also modulate the host immune system. Evidence from recent studies suggests a potential crosstalk between bacteria and mammalian host cells through OMV-secreted sRNAs, which can reduce certain inflammatory responses in the host cells (Fig. 1).

Fig. 1 Role of OMV-carried sRNAs in bacteria and the interaction between bacteria and host cells

Table 2 Membrane vesicle-carried sRNAs discovered in different environmental conditions

Name of functional sRNA | Bacterial species | Environmental condition | Vesicles | Detection/analysis of sRNA
CsrC [35] | Uropathogenic Escherichia coli (UPEC) strain 536 | Exponential phase (optical density at 600 nm, OD600 = 1.5) | MVs | MV RNA was labeled with 100 μM 5-ethynyl uridine.
tRNAs, 4.5S RNA, 6S RNA, and tmRNA^a [38] | Escherichia coli K-12 substrain MG1655 | Early stationary phase | OMVs | Detection by RNA quantification dye and visualization by confocal laser scanning microscopy.
SsrS, CsrC, 10Sa, rnpB [39] | Salmonella enterica serovar Typhimurium | Early stationary phase | OMVs | OMV-associated RNAs were determined by using DESeq2, which calculates differential presence among intracellular and OMV-associated fractions.
sRNA52320 [40] | Pseudomonas aeruginosa | Early stationary phase | OMVs | Agarose gel electrophoresis and qPCR of control OMVs and RNase A-treated OMVs. The sRNA was protected from RNase by the OMVs.
AA_11134, AA_20050 [41] | Aggregatibacter actinomycetemcomitans | Anaerobic condition | OMVs | sRNAs were detected by SYTO RNASelect Green Fluorescent Cell Stain kit and visualized by laser scanning confocal microscopy.
PG_16418, PG_45033 [41] | Porphyromonas gingivalis | Anaerobic condition | OMVs | As for A. actinomycetemcomitans [41].
TD_2161, TD_15612, TD_16563 [41] | Treponema denticola | Anaerobic condition | OMVs | As for A. actinomycetemcomitans [41].
sR-2509025, sR-989262, sR-7497631, sR-1074949 [42] | Helicobacter pylori | Logarithmic phase (OD600 = 1, 48–72 h culture) | OMVs | sncRNAs were stained with SYBR Safe for visualization, followed by qPCR on control OMVs and RNase A-treated OMVs.
msRNA [43] | Aggregatibacter actinomycetemcomitans | Anaerobic condition | OMVs | OMV-associated msRNA was stained with SYTO RNASelect dye and visualized by 2D and 3D visualization tools.
vsRNA Ile-tRF-5X [44] | Escherichia coli K-12 substrain MG1655 | Logarithmic phase (OD600 = 0.5) | OMVs | RT-qPCR was performed to validate the OMV-associated vsRNA.
SsrA, RsaC, and RNAIII [46] | Staphylococcus aureus MSSA476 | Under vancomycin stress | EVs | qPCR was used to confirm the presence of the EV-associated sRNAs.

^a tmRNA possesses properties of both tRNA and mRNA. Utilizing the function of these two RNAs, tmRNA can release stalled ribosomes during translation and helps degrade nascent polypeptides by targeting them [71]

Koeppen et al. showed that sRNA52320, an OMV-associated sRNA of Pseudomonas aeruginosa, reduces interleukin secretion, followed by attenuation of neutrophil recruitment, in human bronchial epithelial (HBE) cells [40]. In this study, the authors determined that the sRNA was inside OMVs and hence was protected from RNase digestion, which allowed successful delivery of the sRNA into the host cell. OMVs were extracted from early stationary phase cultures of P. aeruginosa PA14, followed by RNA extraction from the OMVs and RNA-seq. Through bioinformatic analysis, the most abundant sRNAs and their targets were predicted, and the most promising sRNA and its targets were selected. The authors showed that sRNA52320 can bind mRNAs encoding kinases related to host-pathogen interaction. For the in vitro analysis, HBE cells were transfected with the sRNA, and the downregulation of multiple kinase proteins in the presence of the sRNA confirmed the bioinformatic prediction [40].

In another study, Choi et al. identified and characterized a few OMV-associated msRNAs from three periodontal pathogens, namely Aggregatibacter actinomycetemcomitans, Porphyromonas gingivalis, and Treponema denticola, by deep sequencing followed by bioinformatic analysis. Remarkably, these msRNAs were found to reduce the expression of cytokines [41]. Expression levels of IL-5, IL-13, and IL-15 were significantly reduced in Jurkat T cells after transfection with three synthetic msRNAs, which were chosen according to three highly expressed msRNAs from each pathogen. All these pathogens were grown anaerobically in specific media, and OMVs were purified from the supernatants, followed by RNA extraction. Small-size complementary DNA libraries from the three pathogens were analyzed through deep sequencing, and msRNAs were predicted by bioinformatic analysis. The authors analyzed the length of highly expressed OMV-associated msRNAs by Northern blots and confirmed that the msRNAs are protected from RNase by the OMVs. They also confirmed the uptake of the OMVs along with the RNAs inside the target NIH3T3 fibroblast cells by confocal fluorescence microscopy, staining OMV-associated lipids with a lipid tracer dye and the RNAs with an RNA quantification dye. Moreover, the authors computationally predicted human target genes related to the immune system by using bioinformatic tools [41].

In another study, two OMV-associated small ncRNAs (sncRNAs) of Helicobacter pylori, namely sR-2509025 and sR-989262, were found to reduce IL-8 secretion in cultured human gastric adenocarcinoma (AGS) cells [42]. OMVs were extracted from log-phase cultures of H. pylori strain J99 grown in microaerobic conditions. Total RNA was isolated, and RNA-seq was performed, followed by bioinformatic prediction of OMV-associated sncRNAs. The authors confirmed that the sncRNAs were packaged inside OMVs and, by fluorescence microscopy, showed the entry of the sncRNAs into the AGS cells. Next, the authors exposed host cells to sncRNAs in the presence or absence of transfection reagent and observed reduced IL-8 secretion only in the presence of the transfection reagent. To support their hypothesis, the authors constructed mutants of H. pylori and showed reduced IL-8 secretion with the OMVs isolated from wild-type H. pylori compared to the OMVs isolated from the mutants [42].

Recently, OMV-associated msRNAs of A. actinomycetemcomitans were found to promote TNF-α production through the activation of TLR-8 and NF-κB signaling pathways in U937 cells [43]. OMVs were purified from anaerobically grown A. actinomycetemcomitans and analyzed by nanoparticle tracking. Total RNA was isolated, and sRNAs were identified through deep sequencing and bioinformatic analysis.
In this study, the authors confirmed the presence of sRNAs inside the OMVs and showed by confocal microscopy that OMVs carrying the sRNAs are internalized into U937 cells. They found a significant reduction in TLR-8 expression along with NF-κB activity with RNase-treated OMV lysates in U937 cells, which is in concordance with their hypothesis. Using high-throughput sequencing of sRNAs associated with Ago2, these authors showed the incorporation of msRNAs into the host RNA-induced silencing complex (RISC). They also showed that OMVs are able to cross the blood–brain barrier and induce the production of TNF-α [43].

Diallo et al. showed that a very small RNA (vsRNA) named Ile-tRF-5X (isoleucine tRNA-derived fragment) from E. coli is carried by OMVs and promotes the expression of MAP3K4 in human HCT116 cells [44]. OMVs were extracted from log-phase cultures of E. coli K-12 MG1655 and characterized through Coomassie blue staining, transmission electron microscopy, and dynamic light scattering. Total RNA was extracted, and the particular vsRNA was detected by splint ligation. The authors confirmed the internalization of OMVs in HCT116 cells by confocal microscopy imaging and the transfer of the sRNA to the host cell by LNA RT-qPCR. As bacterial OMVs can also increase the expression level of MAP3K4, the authors also showed that this induction depends solely on OMV-carried Ile-tRF-5X and not on any other components of OMVs [44].

2.2 sRNAs Carried by EVs or Other Membrane Vesicles

Extravesicular RNAs (evRNAs) have been identified in different bacteria, including Vibrio cholerae, Staphylococcus aureus, and clinical S. aureus strain HG003 [45–47]. By analyzing RNA-Seq data of three different-sized RNA libraries, Blenkiron et al. found more than 50 ncRNAs in the membrane vesicles of uropathogenic Escherichia coli (UPEC) strain 536. In this study, the authors confirmed the delivery of membrane vesicles to bladder carcinoma cells by confocal microscopy and the uptake of membrane vesicle-associated RNAs by cultured cells through droplet digital RT-PCR [35].

3 Bacterial sRNAs Regulating Gene Expression in Plant Cells

A probable cross-kingdom interaction between plant-derived miRNAs and mammalian gene expression has been previously suggested. Zhang et al. discovered plant-derived miRNAs in human serum and hypothesized that plant-derived miRNAs can enter the human circulation system after being taken up through the gastrointestinal tract [48]. They also suggested that these circulating miRNAs can further be packed into microvesicles and delivered to the recipient cell [48]. However, this hypothesis is controversial, as another group has shown ineffective delivery of diet-derived microRNAs to the cross-kingdom recipient [49].

Recent studies have suggested a role for regulatory sRNAs of plant-associated bacteria in adapting to challenging environments during their interaction with plants by providing necessary resources [50]. Small RNA fragments derived from rhizobial tRNAs are found to regulate soybean nodulin genes [51]. Various plant-beneficial and plant-pathogenic bacteria interact with their host cells by OMVs [52, 53]. Proteomic studies have shown that OMVs are involved in plant–bacteria interaction by secreting various periplasmic proteins, including virulence proteins, adhesins, hydrolases, and porins, into the host cells [54–56]. Recent studies have also shown that sRNAs of the fungal pathogen Botrytis cinerea translocate into host plant cells via extracellular vesicles and suppress host immune-responsive genes through the cross-kingdom RNA interference pathway [57, 58]. However, a recent study challenged this idea, reporting no functional evidence for naturally occurring cross-kingdom RNA interference [59]. While investigating the interaction between B. cinerea and tomato, the authors did not find any evidence of sRNA involvement in fungal virulence [59]. Moreover, there is as yet no evidence of bacterial sRNAs being packaged into OMVs and transferred into plant host cells by OMVs.

4 Future Perspectives

In this review, we have tried to show the possible roles of bacterial regulatory sRNAs in host immunity. Recent studies have shown that these regulatory sRNAs can be transferred to host cells through OMVs, similar to the transfer of miRNAs by exosomes and HDL in eukaryotes. A few studies have been performed in mammalian cells, where bacterial sRNAs are found to regulate the immune response, but so far no studies have been performed in plant host cells. While we have tried to gather all the information available so far on the interaction between bacterial sRNAs and host immunity, many of these studies struggled to demonstrate the effect of bacterial OMV-secreted RNAs on host cells. Investigation of the subcellular localization of these sRNAs in host cells is also required to determine their involvement in specific cellular pathways. In many cases, overproduction and transfection are necessary to infer a modulation of candidate genes. While these approaches might suggest an effect of bacterial sRNAs on host mRNAs, additional studies are needed to confirm the natural transfer of bacterial sRNAs into host cells and their plausible effect.

References

1. Carrier M-C, Lalaouna D, Massé E (2018) Broadening the definition of bacterial small RNAs: characteristics and mechanisms of action. Annu Rev Microbiol 72:141–161
2. Waters LS, Storz G (2009) Regulatory RNAs in bacteria. Cell 136:615–628
3. Gottesman S, Storz G (2011) Bacterial small RNA regulators: versatile roles and rapidly evolving variations. Cold Spring Harb Perspect Biol 3:a003798
4. Hoe C-H, Raabe CA, Rozhdestvensky TS, Tang T-H (2013) Bacterial sRNAs: regulation in stress. Int J Med Microbiol 303:217–229
5. Holmqvist E, Wagner EGH (2017) Impact of bacterial sRNAs in stress responses. Biochem Soc Trans 45:1203–1212
6. Djapgne L, Oglesby AG (2021) Impacts of small RNAs and their chaperones on bacterial pathogenicity. Front Cell Infect Microbiol 11
7. Sy BM, Tree JJ (2021) Small RNA regulation of virulence in pathogenic Escherichia coli. Front Cell Infect Microbiol 10:622202
8. Zhang Y, Sun T, Jiang C (2018) Biomacromolecules as carriers in drug delivery and tissue engineering. Acta Pharm Sin B 8:34–50
9. Pizzagalli MD, Bensimon A, Superti-Furga G (2021) A guide to plasma membrane solute carrier proteins. FEBS J 288:2784–2835
10. Kurian TK, Banik S, Gopal D, Chakrabarti S, Mazumder N (2021) Elucidating methods for isolation and quantification of exosomes: A review. Mol Biotechnol 63:249–266
11. Estébanez B, Jiménez-Pavón D, Huang C-J, Cuevas MJ, González-Gallego J (2021) Effects of exercise on exosome release and cargo in in vivo and ex vivo models: A systematic review. J Cell Physiol 236:3336–3353
12. Zheng D et al (2021) The role of exosomes and exosomal microRNA in cardiovascular disease. Front Cell Dev Biol 8
13. Li C, Zhou T, Chen J, Li R, Chen H, Luo S, Chen D, Cai C, Li W (2022) The role of exosomal miRNAs in cancer. J Transl Med 20:6
14. Vickers KC, Michell DL (2021) HDL-small RNA export, transport, and functional delivery in atherosclerosis. Curr Atheroscler Rep 23:38
15. Sui G, Jia L, Song N, Min D, Chen S, Wu Y, Yang G (2021) Aberrant expression of HDL-bound microRNA induced by a high-fat diet in a pig model: implications in the pathogenesis of dyslipidaemia. BMC Cardiovasc Disord 21:280
16. Huang Y, Nieh M-P, Chen W, Lei Y (2022) Outer membrane vesicles (OMVs) enabled bio-applications: A critical review. Biotechnol Bioeng 119:34–47
17. Tran TM, Chng C-P, Pu X, Ma Z, Han X, Liu X, Yang L, Huang C, Miao Y (2022) Potentiation of plant defense by bacterial outer membrane vesicles is mediated by membrane nanodomains. Plant Cell 34:395–417
18. Zhao Z, Wang L, Miao J, Zhang Z, Ruan J, Xu L, Guo H, Zhang M, Qiao W (2022) Regulation of the formation and structure of biofilms by quorum sensing signal molecules packaged in outer membrane vesicles. Sci Total Environ 806:151403
19. Park A-M, Tsunoda I (2022) Helicobacter pylori infection in the stomach induces neuroinflammation: the potential roles of bacterial outer membrane vesicles in an animal model of Alzheimer's disease. Inflamm Regener 42:39
20. Brown L, Wolf JM, Prados-Rosales R, Casadevall A (2015) Through the wall: extracellular vesicles in Gram-positive bacteria, mycobacteria and fungi. Nat Rev Microbiol 13:620–630
21. Liu Y, Defourny KAY, Smid EJ, Abee T (2018) Gram-positive bacterial extracellular vesicles and their impact on health and disease. Front Microbiol 9
22. Wang X, Thompson CD, Weidenmaier C, Lee JC (2018) Release of Staphylococcus aureus extracellular vesicles and their application as a vaccine platform. Nat Commun 9:1379
23. Jeon J, Park SC, Her J, Lee JW, Han J-K, Kim Y-K, Kim KP, Ban C (2018) Comparative lipidomic profiling of the human commensal bacterium Propionibacterium acnes and its extracellular vesicles. RSC Adv 8:15241–15247
24. Bose S, Aggarwal S, Singh DV, Acharya N (2020) Extracellular vesicles: an emerging platform in Gram-positive bacteria. Microb Cell 7:312–322
25. Jan AT (2017) Outer membrane vesicles (OMVs) of Gram-negative bacteria: A perspective update. Front Microbiol 8:1053
26. Knox KW, Vesk M, Work E (1966) Relation between excreted lipopolysaccharide complexes and surface structures of a lysine-limited culture of Escherichia coli. J Bacteriol 92:1206–1217
27. Bishop DG, Work E (1965) An extracellular glycolipid produced by Escherichia coli grown under lysine-limiting conditions. Biochem J 96:567–576
28. Sartorio MG, Pardue EJ, Feldman MF, Haurat MF (2021) Bacterial outer membrane vesicles: from discovery to applications. Annu Rev Microbiol 75:609–630
29. Ahmadi BS, Bruno SP, Moshiri A, Tarashi S, Siadat SD, Masotti A (2020) Small RNAs in outer membrane vesicles and their function in host-microbe interactions. Front Microbiol 11
30. Avila-Calderón ED, del Ruiz-Palma MS, Aguilera-Arreola MAG, Velázquez-Guadarrama N, Ruiz EA, Gomez-Lunar Z, Witonsky S, Contreras-Rodríguez A (2021) Outer membrane vesicles of Gram-negative bacteria: an outlook on biogenesis. Front Microbiol 12:557902
31. Kuehn MJ, Kesty NC (2005) Bacterial outer membrane vesicles and the host–pathogen interaction. Genes Dev 19:2645–2655
32. Cecil JD, Sirisaengtaksin N, O'Brien-Simpson NM, Krachler AM (2019) Outer membrane vesicle-host cell interactions. Microbiol Spectr 7
33. Ionescu M, Zaini PA, Baccari C, Tran S, da Silva AM, Lindow SE (2014) Xylella fastidiosa outer membrane vesicles modulate plant colonization by blocking attachment to surfaces. Proc Natl Acad Sci U S A 111:E3910–E3918
34. McMillan HM, Zebell SG, Ristaino JB, Dong X, Kuehn MJ (2021) Protective plant immune responses are elicited by bacterial outer membrane vesicles. Cell Rep 34:108645
35. Blenkiron C et al (2016) Uropathogenic Escherichia coli releases extracellular vesicles that are associated with RNA. PLoS One 11:e0160440
36. Tsatsaronis JA, Franch-Arroyo S, Resch U, Charpentier E (2018) Extracellular vesicle RNA: A universal mediator of microbial communication? Trends Microbiol 26:401–410
37. Lécrivain A-L, Beckmann BM (2020) Bacterial RNA in extracellular vesicles: a new regulator of host-pathogen interactions? Biochim Biophys Acta (BBA) – Gene Regul Mech 1863:194519
38. Ghosal A et al (2015) The extracellular RNA complement of Escherichia coli. MicrobiologyOpen 4:252–266
39. Malabirade A et al (2018) The RNA complement of outer membrane vesicles from Salmonella enterica serovar Typhimurium under distinct culture conditions. Front Microbiol 9:2015
40. Koeppen K et al (2016) A novel mechanism of host-pathogen interaction through sRNA in bacterial outer membrane vesicles. PLoS Pathog 12:e1005672
41. Choi J-W, Kim S-C, Hong S-H, Lee H-J (2017) Secretable small RNAs via outer membrane vesicles in periodontal pathogens. J Dent Res 96:458–466
42. Zhang H, Zhang Y, Song Z, Li R, Ruan H, Liu Q, Huang X (2020) sncRNAs packaged by Helicobacter pylori outer membrane vesicles attenuate IL-8 secretion in human cells. Int J Med Microbiol 310:151356
43. Han E-C, Choi S-Y, Lee Y, Park J-W, Hong S-H, Lee H-J (2019) Extracellular RNAs in periodontopathogenic outer membrane vesicles promote TNF-α production in human macrophages and cross the blood-brain barrier in mice. FASEB J 33:13412–13422
44. Diallo I, Ho J, Lambert M, Benmoussa A, Husseini Z, Lalaouna D, Massé E, Provost P (2022) A tRNA-derived fragment present in E. coli OMVs regulates host cell gene expression and proliferation. PLoS Pathog 18:e1010827
45. Langlete P, Krabberød AK, Winther-Larsen HC (2019) Vesicles from Vibrio cholerae contain AT-rich DNA and shorter mRNAs that do not correlate with their protein products. Front Microbiol 10:2708
46. Joshi B, Singh B, Nadeem A, Askarian F, Wai SN, Johannessen M, Hegstad K (2021) Transcriptome profiling of Staphylococcus aureus associated extracellular vesicles reveals presence of small RNA-cargo. Front Mol Biosci 7
47. Luz BSRD, Nicolas A, Chabelskaya S, de Rodovalho VR, Le Loir Y, de Azevedo VAC, Felden B, Guédon E (2021) Environmental plasticity of the RNA content of Staphylococcus aureus extracellular vesicles. Front Microbiol 12
48. Zhang L et al (2012) Exogenous plant MIR168a specifically targets mammalian LDLRAP1: evidence of cross-kingdom regulation by microRNA. Cell Res 22:107–126
49. Snow JW, Hale AE, Isaacs SK, Baggish AL, Chan SY (2013) Ineffective delivery of diet-derived microRNAs to recipient animal organisms. RNA Biol 10:1107–1116
50. Harfouche L, Haichar FeZ, Achouak W (2015) Small regulatory RNAs and the fine-tuning of plant–bacteria interactions. New Phytol 206:98–106
51. Ren B, Wang X, Duan J, Ma J (2019) Rhizobial tRNA-derived small RNAs are signal molecules regulating plant nodulation. Science 365:919–922
52. Castiblanco LF, Sundin GW (2016) New insights on molecular regulation of biofilm formation in plant-associated bacteria. J Integr Plant Biol 58:362–372
53. Rudnicka M, Noszczyńska M, Malicka M, Kasperkiewicz K, Pawlik M, Piotrowska-Seget Z (2022) Outer membrane vesicles as mediators of plant–bacterial interactions. Front Microbiol 13
54. Knief C, Delmotte N, Vorholt JA (2011) Bacterial adaptation to life in association with plants – A proteomic perspective from culture to in situ conditions. Proteomics 11:3086–3105
55. Afroz A, Zahur M, Zeeshan N, Komatsu S (2013) Plant-bacterium interactions analyzed by proteomics. Front Plant Sci 4
56. Feitosa-Junior OR, Stefanello E, Zaini PA, Nascimento R, Pierry PM, Dandekar AM, Lindow SE, da Silva AM (2019) Proteomic and metabolomic analyses of Xylella fastidiosa OMV-enriched fractions reveal association with virulence factors and signaling molecules of the DSF family. Phytopathology 109:1344–1353
57. Weiberg A, Wang M, Lin F-M, Zhao H, Zhang Z, Kaloshian I, Huang H-D, Jin H (2013) Fungal small RNAs suppress plant immunity by hijacking host RNA interference pathways. Science 342:118–123
58. Wang M, Weiberg A, Dellota E, Yamane D, Jin H (2017) Botrytis small RNA Bc-siR37 suppresses plant defense genes by cross-kingdom RNAi. RNA Biol 14:421–428
59. Qin S et al (2022) Molecular characterization reveals no functional evidence for naturally occurring cross-kingdom RNA interference in the early stages of Botrytis cinerea-tomato interaction. Mol Plant Pathol
60. Tu Y, Jia X, Yang R, Peng X, Zhou X, Xu X (2019) Genetic regulation of streptococci by small RNAs. Curr Issues Mol Biol 32:39–86
61. Wassarman KM (2018) 6S RNA, a global regulator of transcription. In: Regulating with RNA in bacteria and archaea. Wiley, pp 355–367
62. Lin X, Poeta P, Peng B (2020) Editorial: the molecular mechanisms of antibiotic resistance in aquatic pathogens. Front Cell Infect Microbiol 10
63. Himeno H, Kurita D, Muto A (2014) tmRNA-mediated trans-translation as the major ribosome rescue system in a bacterial cell. Front Genet 5:66
64. Fris ME, Murphy ER (2016) Riboregulators: fine-tuning virulence in Shigella. Front Cell Infect Microbiol 6
65. Lalaouna D, Eyraud A, Devinck A, Prévost K, Massé E (2019) GcvB small RNA uses two distinct seed regions to regulate an extensive targetome. Mol Microbiol 111:473–486
66. Le Huyen KB, Gonzalez CD, Pascreau G, Bordeau V, Cattoir V, Liu W, Bouloc P, Felden B, Chabelskaya S (2021) A small regulatory RNA alters Staphylococcus aureus virulence by titrating RNAIII activity. Nucleic Acids Res 49:10644–10656
67. Lalaouna D et al (2019) RsaC sRNA modulates the oxidative stress response of Staphylococcus aureus during manganese starvation. Nucleic Acids Res 47:9871–9887
68. Ren J, Sang Y, Qin R, Cui Z, Yao Y-F (2017) 6S RNA is involved in acid resistance and invasion of epithelial cells in Salmonella enterica serovar Typhimurium. Future Microbiol 12:1045–1057
69. Zhu C, Sun B, Nie A, Zhou Z (2020) The tRNA-associated dysregulation in immune responses and immune diseases. Acta Physiol (Oxf) 228:e13391
70. Middleton H, Yergeau É, Monard C, Combier J-P, El Amrani A (2021) Rhizospheric plant-microbe interactions: miRNAs as a key mediator. Trends Plant Sci 26:132–141
71. Keiler KC, Ramadoss NS (2011) Bifunctional transfer-messenger RNA. Biochimie 93:1993–1997

Chapter 11

Experimental Validation of RNA–RNA Interactions by Electrophoretic Mobility Shift Assay

Eva Maria Sternkopf Lillebæk and Birgitte Haahr Kallipolitis

Abstract

Regulatory RNAs in bacteria are known to act by base pairing with other RNAs. Interactions between two partner RNAs can be investigated by electrophoretic mobility shift assays. The regions predicted to be engaged in base pairing are analyzed by introducing mutations in one RNA that prevent RNA–RNA complex formation. Next, base pairing is restored by introducing complementary mutations in its partner RNA. Here, we describe the mutational strategy and experimental methods used to validate specific base pairing between two RNA species.

Key words Electrophoretic mobility shift assay, RNA–RNA interactions, Base pairing

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_11, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

Regulatory RNAs are known to control a variety of processes in bacteria, including stress responses and virulence [1–3]. In many cases, RNA-mediated control involves direct base pairing between the RNA regulator and specific regions in target RNAs. For example, small regulatory RNAs (sRNAs) typically affect the translation and/or stability of specific target mRNAs through direct base pairing. RNA–RNA complex formation can be analyzed by electrophoretic mobility shift assay (EMSA). This method was first developed to analyze protein–DNA interactions [4] but can be used for studies of protein–RNA and RNA–RNA interactions as well. Traditionally, 32P is used for labeling of nucleotides (nts), as described in this protocol, but nonradioactive alternatives are also available [5, 6]. For studies of RNA–RNA interactions, one of the RNAs is labeled and mixed with unlabeled partner RNA. The RNA mixture is analyzed by electrophoresis on a native polyacrylamide gel. Base pairing between the labeled RNA and its unlabeled partner RNA results in a more slowly migrating band relative to the unbound labeled RNA (Fig. 1).

Fig. 1 EMSA analysis of RNA–RNA interactions. Sample A contains the labeled RNA (blue) only, whereas in sample B, the labeled RNA is mixed with its unlabeled partner RNA (yellow). The RNA–RNA complex formed in sample B moves more slowly through the native gel, relative to the unbound labeled RNA species in sample A. The bands are visualized by autoradiography

By mutation analyses of the interacting partners, it is furthermore possible to investigate specific regions predicted to be engaged in RNA–RNA base pairing. First, complementary mutations in the RNA partners are designed to: (1) disrupt base pairing between each mutant RNA and its wild-type RNA partner; and (2) restore base pairing between the two mutant partner RNAs. Next, the mutant and wild-type RNA transcripts are prepared and labeled, and their ability to form RNA–RNA complexes is evaluated by EMSA.

We used EMSA to validate the predicted base pairing between regulatory sRNAs and their target mRNAs in L. monocytogenes [7–13]. More specifically, we identified regions engaged in direct base pairing by introducing mutations in the sRNA predicted to disrupt the sRNA–mRNA interaction, followed by the introduction of mutations in the partner mRNA that restore base pairing. Using this strategy, direct base pairing between specific regions of the following sRNA–mRNA pairs could be confirmed: LhrC4-lapB [11], LhrC4-oppA [10], LhrC4-tcsA [8], LhrC4-lmo0484 [7], Rli22-oppA [9], Rli33-1-oppA [9], LhrA-lmo0850 [12], and LhrA-chiA [13].

In a previous chapter, we described how to analyze sRNA–mRNA base pairing using EMSA [14]. Importantly, this method is useful for analyzing direct base pairing between any pair of RNAs. In this updated chapter, the interacting RNA partners are named “RNA-1” and “RNA-2,” and schematic illustrations provide an overview of the strategy and experimental methods used to validate specific base pairing between them. For real examples, we refer to the analyses presented in publications on the sRNA–mRNA pairs mentioned above [7–13].
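The disrupt-and-restore logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 8-nt site and the mutated positions are hypothetical, only Watson-Crick pairs are counted (G-U wobble pairs are ignored), and the function names are ours, not part of any published tool. Each selected nucleotide in the RNA-1 site is swapped for its complement, and the matching RNA-2 sites are derived as reverse complements.

```python
# Watson-Crick partners for RNA; G-U wobble pairs are deliberately ignored.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[nt] for nt in reversed(rna.upper()))


def design_compensatory_pair(site_rna1: str, positions: list[int]) -> dict[str, str]:
    """Swap the chosen nts (0-based) of the RNA-1 site for their complements
    and derive the perfectly complementary RNA-2 sites for both variants."""
    site_rna1 = site_rna1.upper()
    mutated = list(site_rna1)
    for pos in positions:
        mutated[pos] = COMPLEMENT[mutated[pos]]
    site_rna1_mut = "".join(mutated)
    return {
        "RNA-1": site_rna1,
        "RNA-1-mut": site_rna1_mut,
        "RNA-2": reverse_complement(site_rna1),
        "RNA-2-mut": reverse_complement(site_rna1_mut),
    }


def count_pairs(site_a: str, site_b: str) -> int:
    """Count Watson-Crick pairs when two 5'->3' sites are aligned antiparallel."""
    return sum(COMPLEMENT[a] == b for a, b in zip(site_a, reversed(site_b)))


# Hypothetical 8-nt pairing site in RNA-1; positions 2-4 are mutated.
sites = design_compensatory_pair("AGGCUUCA", [2, 3, 4])

if __name__ == "__main__":
    for labeled in ("RNA-1", "RNA-1-mut"):
        for partner in ("RNA-2", "RNA-2-mut"):
            n = count_pairs(sites[labeled], sites[partner])
            print(f"{labeled} + {partner}: {n} of 8 positions paired")
```

In this toy case, the wild-type/wild-type and mutant/mutant combinations retain all eight pairs, whereas the two mixed combinations lose three. The sketch does not check secondary structure, which still has to be evaluated separately for each mutant.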

2 Materials

Make sure that nuclease-free H2O, demineralized H2O (dH2O), 96% ethanol (cold), 3 M sodium acetate pH 4.5, and 70% ethanol (cold) are available for repeated use throughout the protocol.

2.1 DNA Template

1. Overlapping DNA primer sets of interest (10 μM) (see Note 1).
2. dNTP mix: 10 mM each of dATP, dCTP, dGTP, and dTTP.
3. High-fidelity DNA polymerase and reaction buffer. We used Phusion high-fidelity DNA polymerase and 5× Phusion HF buffer (provided by the supplier).
4. 2% agarose gel mix with ethidium bromide (1 μg/mL) in TAE buffer (0.04 M Tris-base, 0.02 M sodium acetate, 1 mM ethylenediaminetetraacetic acid (EDTA)).

2.2 In Vitro Transcription and Purification of RNA

1. Purified DNA templates of interest (see Note 2).
2. In vitro transcription kit. We used the MEGAscript T7 transcription kit.
3. RNA fragments of interest (see Note 3).
4. TBE running buffer (1× TBE): prepared by dilution of 10× TBE buffer with dH2O. 10× TBE consists of 0.89 M Tris-base, 0.89 M boric acid, and 0.02 M EDTA. We used UltraPure™ TBE buffer (10×).
5. 6% denaturing polyacrylamide gel mix (50:1 acrylamide:bisacrylamide): mix 150 mL of acrylamide solution (40%), 60 mL of bisacrylamide solution (2%), 100 mL of 10× TBE buffer, 420 g urea, and dH2O to 1 L. Filter the solution and store it at 4 °C.
6. 10% ammonium persulfate (APS): dissolve 1 g APS in dH2O to a total volume of 10 mL (see Note 4).
7. N,N,N′,N′-tetramethylethylenediamine (TEMED).
8. Glass plates (16 × 19 cm), spacers, and comb (1 mm thick). The comb should optimally have wells large enough to accommodate a volume of 63 μL.
9. Formamide loading dye: formamide, 5 mM EDTA, and 0.01% bromophenol blue (BPB) (see Note 5).
10. Vita-wrap (see Note 6).
11. TLC plate Polygram SIL G/UV254 (Macherey–Nagel).
12. UV lamp. We use a Mineralight multiband ultraviolet lamp, 254/366 nm.
13. Surgical blades.
14. 2 M ammonium acetate pH 5.5: dissolve 15.42 g of ammonium acetate in approx. 50 mL of dH2O. Adjust the pH with CH3COOH and adjust the volume to 100 mL. Sterile-filter and store at 4 °C.
15. RNA phenol pH 4.5: dissolve phenol in dH2O (see Note 7). Mix 100 mL phenol phase, 50 mL aqueous phenol phase, and 0.1 g 8-hydroxyquinoline in a dark glass bottle. Add 3.33 mL 3 M sodium acetate, pH 4.5. Shake to mix and leave to separate overnight (ON). Check the pH of the aqueous phase with pH strips and adjust with sodium acetate if necessary.

2.3 Labeling of RNA

1. Polynucleotide kinase (PNK) (10 U/μL) supplied with 10× PNK buffer.
2. Shrimp alkaline phosphatase (SAP) (1 U/μL).
3. [γ-32P] ATP (6000 Ci/mmol) (Perkin Elmer).
4. EDTA (7.5 mM).
5. RNA purification kit. We use NucleoSpin miRNA, a mini kit for miRNA and RNA purification (Macherey–Nagel).

2.4 EMSA

1. Unlabeled and labeled RNA fragments of interest (see Note 8).
2. EMSA loading buffer: 50% glycerol.
3. 10× structure buffer (Invitrogen) (see Note 9).
4. Yeast RNA (10 mg/mL) (Invitrogen) (see Note 10).
5. 5% native polyacrylamide gel (38:1 acrylamide:bisacrylamide): mix 50 mL 10× TBE, 125 mL acrylamide (40% solution), 65 mL bisacrylamide (2% solution), and dH2O to a total volume of 1 L. Filter the solution and store it at 4 °C.
6. Glass plates (16 × 19 cm), comb, and spacers (1 mm thick). The comb should accommodate at least 20 samples.
7. Whatman Grade 2 Chr Cellulose Chromatography Paper (Frisenette).
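When scaling or adapting the stock recipes listed in Subheadings 2.2 and 2.4, it can help to check the arithmetic behind them. The following Python sketch recomputes the nominal gel percentages, the acrylamide:bisacrylamide ratios, and two molarities from the listed amounts. The function names are ours; the molar masses of urea (60.06 g/mol) and ammonium acetate (77.08 g/mol) are standard values that are not stated in the protocol.

```python
def gel_composition(acryl_ml, bis_ml, total_ml, acryl_pct=40.0, bis_pct=2.0):
    """Return (total % w/v of acrylamide + bisacrylamide, acrylamide:bis mass ratio)."""
    acryl_g = acryl_ml * acryl_pct / 100.0   # grams of acrylamide added
    bis_g = bis_ml * bis_pct / 100.0         # grams of bisacrylamide added
    total_pct = 100.0 * (acryl_g + bis_g) / total_ml
    return total_pct, acryl_g / bis_g


def molarity(mass_g, molar_mass_g_per_mol, volume_l):
    """Return the molar concentration of a solute dissolved to volume_l liters."""
    return mass_g / molar_mass_g_per_mol / volume_l


# Denaturing gel mix: 150 mL acrylamide (40%) + 60 mL bisacrylamide (2%) per 1 L
denaturing = gel_composition(150, 60, 1000)
# Native gel mix: 125 mL acrylamide (40%) + 65 mL bisacrylamide (2%) per 1 L
native = gel_composition(125, 65, 1000)
# 420 g urea per liter of denaturing gel mix
urea_molar = molarity(420, 60.06, 1.0)
# 15.42 g ammonium acetate in 100 mL
ammonium_acetate_molar = molarity(15.42, 77.08, 0.1)

if __name__ == "__main__":
    print(f"Denaturing gel: {denaturing[0]:.2f}% total, ratio {denaturing[1]:.1f}:1")
    print(f"Native gel:     {native[0]:.2f}% total, ratio {native[1]:.1f}:1")
    print(f"Urea: {urea_molar:.2f} M; ammonium acetate: {ammonium_acetate_molar:.2f} M")
```

The results agree with the nominal values given in the recipes (about 6% at 50:1, about 5% at roughly 38:1, and 2 M ammonium acetate) and show that the denaturing mix contains approximately 7 M urea.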

3 Methods

3.1 Design of Mutant Derivatives

Base pairing regions between two RNA molecules can be predicted using web-based programs, such as IntaRNA (http://rnainter.org/IntaRNA/) [15–17]. For simplicity, the protocol for validating RNA–RNA interactions by EMSA is described using two fictive base-pairing RNA molecules, named RNA-1 and RNA-2. The predicted interaction between RNA-1 and RNA-2 is illustrated in Fig. 2a. To validate this interaction, mutant derivatives of RNA-1 and RNA-2 are designed by substituting selected nts within the predicted base-pairing region. First, design a mutant RNA-1 derivative (RNA-1-mut) that is predicted to disrupt base pairing with RNA-2 (Fig. 2b). Next, design a mutant RNA-2 derivative (RNA-2-mut) that restores base pairing with RNA-1-mut but disrupts interaction with wild-type RNA-1 (Fig. 2c, d). The mutant derivatives should be carefully designed to prevent changes in the secondary structure of the RNA. Potential effects of nt substitutions on secondary structure are tested by using an RNA folding web server, such as mfold (http://www.unafold.org/) [18].

Fig. 2 Schematic illustration of mutation analyses of RNA–RNA interactions. The predicted base pairing between the interacting partners RNA-1 and RNA-2 is shown in (a). Complementary mutations in RNA-1 and RNA-2 are designed to disrupt base pairing between the mutant RNAs and their wild-type RNA partner (b, c), whereas base pairing is restored between the two mutant RNAs (d). The mutated nucleotides are indicated in red

3.2 Design and Preparation of DNA Template

1. DNA templates for in vitro transcription of RNA-1 and RNA-2 are prepared by using overlapping primer sets, each consisting of a forward and a reverse primer (see Note 11). From the 5′-end, the forward primer contains (1) four G’s; (2) the T7 promoter sequence as described by the manufacturer for use of the MEGAscript T7 transcription kit; and (3) the DNA sequence encoding the 5′-part of the desired RNA. The reverse primer contains sequence information corresponding to the 3′-part of the desired RNA. Importantly, the forward and reverse primers are designed to contain overlapping sequences of 20–25 nts at their 3′-ends. For preparation of DNA templates encoding mutant derivatives of RNA-1 and RNA-2, the primer sets are designed to contain the substitutions described in Subheading 3.1 and Fig. 2.
2. For each template, prepare a 50 μL DNA template synthesis mix: 1 μL dNTP mix, 5 μL forward primer, 5 μL reverse primer, 10 μL 5× Phusion buffer, 21.5 μL dH2O, and 0.5 μL of Phusion polymerase (2 U/μL).
3. Run the DNA template synthesis reaction in a thermocycler. To extend the overlapping primers, use the following program: 98 °C for 15 s, TM + 3 °C for 15 s, and 72 °C for 15 s (see Note 12).
4. Check the quality and length of the DNA product by loading 2 μL on a 2% agarose gel.

5. Precipitate DNA from the residual 48 μL DNA template synthesis reaction by adding 144 μL 96% ethanol (cold) and 4.8 μL 3 M sodium acetate pH 4.5. Leave the sample for at least 1 h in a -20 °C freezer.
6. Centrifuge at 4 °C for 30 min at 21,000 × g. Remove the supernatant without disturbing the DNA pellet. Wash carefully with 100 μL 70% ethanol (cold). Spin again for 10 min and remove the supernatant. Spin briefly in a tabletop centrifuge and aspirate the residual liquid from the tube.
7. Leave the DNA pellet to air dry at room temperature (RT). Dissolve the pellet in 11 μL dH2O and quantify the DNA using Denovix or equivalent. Adjust the DNA concentration to 0.125 μg/μL.

3.3 In Vitro Transcription and Purification of RNAs

1. For in vitro transcription of the DNA template, use a MEGAscript T7 kit. Reactions of 30 μL are prepared as follows: 3 μL ATP solution, 3 μL CTP solution, 3 μL GTP solution, 3 μL UTP solution, 3 μL 10× reaction buffer, 12 μL of template DNA (0.125 μg/μL), and 3 μL of enzyme mix. Incubate the in vitro transcription reaction ON at 37 °C in an incubator (see Note 13).
2. After ON incubation, add 1.5 μL of TURBO DNase and digest for 15 min at 37 °C in an incubator (see Note 14).
3. Prepare 35 mL of 6% denaturing gel mix with 350 μL APS (10%) and 35 μL TEMED in a Falcon tube and cast the gel.
4. Mix the in vitro transcribed RNA sample with an equal volume of formamide loading dye, then heat for 2 min at 75 °C. Load the sample into a single well (see Note 15). Separate RNA and template DNA by running the gel at 300 V for approx. 45 min.
5. Separate the glass plates and carefully transfer the gel to Vita-wrap. Cover the gel with a single layer of foil. Place the gel on a fluor-coated TLC plate. Visualize the RNA transcript by UV shadowing and mark the band with a marker.
6. Slice out the gel piece and transfer it to a 2 mL safe-lock tube. Add 500 μL of 2 M ammonium acetate pH 5.5, and allow diffusion for 2 h at 17 °C with slow shaking. Add 500 μL of RNA phenol and mix gently. Leave ON at 17 °C with slow shaking.
7. Briefly centrifuge the tube containing the gel piece and transfer the liquid to a clean 2 mL safe-lock tube. Centrifuge for 5 min at 13,000 × g, and transfer the aqueous phase to a clean tube containing 500 μL RNA phenol. Mix by shaking.
8. Centrifuge for 5 min at 13,000 × g and transfer the aqueous phase to a clean tube containing 500 μL chloroform. Vortex for 5 s and repeat the centrifugation step.

9. Transfer the aqueous phase to a clean tube, and add 0.1 volumes of 3 M sodium acetate pH 4.5 and 3.5 volumes of 96% ethanol to precipitate RNA. Leave at -20 °C for at least 1 h.
10. Centrifuge the samples at 4 °C for 30 min at 21,000 × g. Remove the supernatant without disturbing the RNA pellet, and add 1 mL of ice-cold 70% ethanol. Re-centrifuge for 15 min. Carefully remove the supernatant. Spin briefly in a tabletop centrifuge and remove the remaining supernatant. Allow the pellet to air dry at RT for approx. 10 min.
11. Resuspend the RNA pellet in 50 μL of nuclease-free H2O. Quantify and check the integrity of the transcript by Denovix and gel electrophoresis (see Subheading 3.2), respectively. Adjust to a final concentration of 10 pmol/μL RNA and store sample aliquots at -80 °C (see Note 16).

3.4 Labeling of RNAs

1. Dephosphorylate RNA fragments using SAP. For a 10 μL reaction, mix 1 μL of 10× PNK buffer, 1 μL of SAP, 0.8 μL of purified RNA (10 pmol/μL), and 7.2 μL nuclease-free H2O.
2. Incubate at 37 °C for 2 h and heat-inactivate at 65 °C for 10 min. Place on ice and proceed immediately to the next step.
3. Mix 3 μL of dephosphorylated RNA (2.4 pmol) with 0.7 μL 10× PNK buffer, 2 μL [γ-32P] ATP, and 3.3 μL nuclease-free H2O. Add 1 μL PNK enzyme to a final volume of 10 μL.
4. Incubate at 37 °C for 30 min, add 20 μL of 7.5 mM EDTA, and incubate at 75 °C for 15 min. Place on ice for 2 min. Proceed immediately to the next step (see Note 17).
5. Purify the labeled RNA transcripts by using the NucleoSpin miRNA kit. Mix the labeled RNA sample with 200 μL MX binding buffer, load on the column (see Note 18), incubate for 1 min at RT, and centrifuge for 1 min at 13,000 × g. Transfer the column to a clean tube and discard the flow-through.
6. Add 500 μL MW1 wash buffer, and incubate for 1 min at RT. Repeat the centrifugation and discard the flow-through (see Note 19).
7. Add 500 μL MW2 wash buffer, and incubate for 1 min at RT. Re-centrifuge and discard the flow-through.
8. Elute RNA by adding 30 μL nuclease-free H2O to the column; wait for 1 min; centrifuge for 1 min at 13,000 × g.
9. Repeat the elution step with another 30 μL nuclease-free H2O in the same tube. Final RNA concentration: 0.04 pmol/μL (see Note 20).
10. Store 5′-end labeled RNA transcripts at -20 °C until use.
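The amounts quoted in the labeling steps above (2.4 pmol in step 3 and 0.04 pmol/μL in step 9) follow from simple dilution arithmetic, which the Python sketch below retraces. The function and variable names are ours, and the recovery of the column purification is treated as an adjustable assumption: a value of 1.0 reproduces the nominal concentration, whereas 0.95 corresponds to the lower bound given in Note 20.

```python
def diluted_concentration(stock_pmol_per_ul, stock_ul, final_ul):
    """Concentration (pmol/uL) after diluting stock_ul of stock to final_ul."""
    return stock_pmol_per_ul * stock_ul / final_ul


# Step 1: 0.8 uL of RNA at 10 pmol/uL in a 10 uL dephosphorylation reaction
dephosphorylated = diluted_concentration(10.0, 0.8, 10.0)   # pmol/uL

# Step 3: 3 uL of the dephosphorylated RNA goes into the kinase reaction
labeled_pmol = dephosphorylated * 3.0                        # pmol


def eluate_concentration(pmol_loaded, recovery=1.0, elution_ul=60.0):
    """Concentration after two 30 uL elutions, for an assumed column recovery."""
    return pmol_loaded * recovery / elution_ul


nominal = eluate_concentration(labeled_pmol)                  # recovery = 100%
lower_bound = eluate_concentration(labeled_pmol, recovery=0.95)

if __name__ == "__main__":
    print(f"Dephosphorylated RNA: {dephosphorylated:.2f} pmol/uL")
    print(f"RNA in kinase reaction: {labeled_pmol:.2f} pmol")
    print(f"Eluate: {nominal:.3f} pmol/uL (nominal), {lower_bound:.3f} pmol/uL (95% recovery)")
```

The nominal 0.04 pmol/μL used in Table 1 therefore assumes essentially complete recovery of the labeled RNA from the column.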

3.5 EMSA

Set up the EMSA experiment as exemplified for RNA-1 and RNA-2 (see Table 1). First, the binding between wild-type RNA-1 and wild-type RNA-2 (samples 1–5) or mutant RNA-2-mut (samples 6–10) is tested by preparing samples containing 5′-end labeled RNA-1 and increasing amounts of unlabeled RNA-2 or RNA-2-mut, respectively. Second, the binding between RNA-1-mut and RNA-2 (samples 11–15) or RNA-2-mut (samples 16–20) is tested by using 5′-end labeled RNA-1-mut and increasing amounts of unlabeled RNA-2 or RNA-2-mut, respectively.

Table 1 Overview of samples prepared for an EMSA experiment, testing the binding between 5′-end labeled RNA-1 and unlabeled RNA-2 (samples 1–5) or RNA-2-mut (samples 6–10), or the binding between 5′-end labeled RNA-1-mut and unlabeled RNA-2 (samples 11–15) or unlabeled RNA-2-mut (samples 16–20). Numbers correspond to the volumes used in μL; a dash (–) indicates that the component is omitted

Sample no. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20
Labeled RNA-1 (0.04 pmol/μL) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | – | – | – | – | – | – | – | – | – | –
Labeled RNA-1-mut (0.04 pmol/μL) | – | – | – | – | – | – | – | – | – | – | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
Yeast RNA (10 mg/mL) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
Unlabeled RNA-2 (0.08 pmol/μL), 4-fold excess | – | 2 | – | – | – | – | – | – | – | – | – | 2 | – | – | – | – | – | – | – | –
Unlabeled RNA-2 (0.4 pmol/μL), 20-fold excess | – | – | 2 | – | – | – | – | – | – | – | – | – | 2 | – | – | – | – | – | – | –
Unlabeled RNA-2 (2 pmol/μL), 100-fold excess | – | – | – | 2 | – | – | – | – | – | – | – | – | – | 2 | – | – | – | – | – | –
Unlabeled RNA-2 (10 pmol/μL), 500-fold excess | – | – | – | – | 2 | – | – | – | – | – | – | – | – | – | 2 | – | – | – | – | –
Unlabeled RNA-2-mut (0.08 pmol/μL), 4-fold excess | – | – | – | – | – | – | 2 | – | – | – | – | – | – | – | – | – | 2 | – | – | –
Unlabeled RNA-2-mut (0.4 pmol/μL), 20-fold excess | – | – | – | – | – | – | – | 2 | – | – | – | – | – | – | – | – | – | 2 | – | –
Unlabeled RNA-2-mut (2 pmol/μL), 100-fold excess | – | – | – | – | – | – | – | – | 2 | – | – | – | – | – | – | – | – | – | 2 | –
Unlabeled RNA-2-mut (10 pmol/μL), 500-fold excess | – | – | – | – | – | – | – | – | – | 2 | – | – | – | – | – | – | – | – | – | 2
10× structure buffer | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
Nuclease-free H2O | 7 | 5 | 5 | 5 | 5 | 7 | 5 | 5 | 5 | 5 | 7 | 5 | 5 | 5 | 5 | 7 | 5 | 5 | 5 | 5

1. Prepare diluted stocks of unlabeled RNA-2 and unlabeled RNA-2-mut, corresponding to 0.08, 0.4, and 2 pmol/μL.
2. Mix the samples as described in Table 1. Start by adding (a premix of) water, structure buffer, and yeast RNA to the tubes (see Note 21).
3. Incubate the samples for 1 h at 37 °C and place them on ice for 10 min. In the meantime, prepare the EMSA gel (next step).
4. Cast a 5% native acrylamide gel; polymerize with 1% APS and 0.1% TEMED. Pre-run the gel in a cold room (4 °C) at 100 V for 30 min. Use ½× TBE as running buffer.
5. Add 5 μL of 50% glycerol to the tubes immediately before loading, mix by pipetting once, and load 5 μL while the gel is running.
6. Run the gel for 1–1½ h in a cold room (4 °C).
7. Separate the glass plates and transfer the gel to Whatman grade 2 filter paper. Cover with Vita-wrap. Vacuum dry in a gel dryer for approx. 1 h. Radioactive bands are visualized and quantified by autoradiography (see Note 22).
8. A schematic illustration of the resulting EMSA is shown in Fig. 3.
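The volumes and fold-excess values in Table 1 can be recomputed for other stock concentrations with a short Python sketch such as the one below. It assumes the layout of Table 1 (1 μL each of labeled RNA, yeast RNA, and 10× structure buffer, 2 μL of unlabeled RNA where present, and a final volume of 10 μL); the function and constant names are ours.

```python
FINAL_UL = 10.0                 # final volume of each binding reaction
LABELED_PMOL_PER_UL = 0.04      # concentration of the 5'-end labeled RNA
UNLABELED_UL = 2.0              # volume of unlabeled partner RNA, when added
FIXED_UL = {"labeled RNA": 1.0, "yeast RNA": 1.0, "10x structure buffer": 1.0}


def emsa_sample(unlabeled_pmol_per_ul=None):
    """Return (volumes in uL, molar fold excess of unlabeled over labeled RNA).

    Pass None for a sample without unlabeled partner RNA.
    """
    volumes = dict(FIXED_UL)
    volumes["unlabeled RNA"] = 0.0 if unlabeled_pmol_per_ul is None else UNLABELED_UL
    # Water fills the reaction up to the final volume.
    volumes["nuclease-free H2O"] = FINAL_UL - sum(volumes.values())
    unlabeled_pmol = volumes["unlabeled RNA"] * (unlabeled_pmol_per_ul or 0.0)
    labeled_pmol = FIXED_UL["labeled RNA"] * LABELED_PMOL_PER_UL
    return volumes, unlabeled_pmol / labeled_pmol


if __name__ == "__main__":
    for stock in (None, 0.08, 0.4, 2.0, 10.0):
        volumes, fold = emsa_sample(stock)
        print(f"stock={stock}: water {volumes['nuclease-free H2O']:.0f} uL, "
              f"{fold:.0f}-fold excess")
```

With the stock concentrations from Table 1, the sketch returns 4-, 20-, 100-, and 500-fold excess and 5 μL of water (7 μL for samples without unlabeled RNA).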

4 Notes

1. Information on how to design overlapping primers is provided in Subheading 3.2.
2. Information on how to generate purified DNA templates is provided in Subheading 3.2.
3. Information on how to generate the RNA fragments is provided in Subheading 3.3.
4. The solution can be stored for 1 week at RT, but we recommend preparing a fresh batch each time.
5. Mix and aliquot into 1 mL portions. Store at -20 °C.


Fig. 3 Schematic illustration of EMSA analysis of RNA–RNA complex formation. The samples listed in Table 1 are analyzed on a native gel and bands are visualized by autoradiography. The slowly migrating band corresponds to the RNA–RNA complex formed between base pairing variants of RNA-1 and RNA-2. The unbound labeled RNA species migrate faster through the gel; they correspond to the lower bands. The labeled RNA variants are marked by an asterisk (*)

6. Based on our experience, using cling film of a different brand or thickness may disrupt the UV shadowing. Vita-wrap is made of 8 μm thick polyethylene.
7. Dissolving phenol may take a long time. Leave with a magnetic stirrer for 24 h. Allow to separate into organic and aqueous phases for 24 h. If not dissolved, repeat. Wear proper protective gloves and use a flow bench to avoid physical contact with phenol!
8. Information on how to prepare unlabeled and labeled RNA is provided in Subheadings 3.3 and 3.4, respectively.
9. We used the 10× structure buffer provided with the RNase T1 enzyme kit from Invitrogen.
10. We used the yeast RNA provided with the RNase T1 enzyme kit (see Note 9), but it can be purchased separately from Invitrogen.
11. Based on our experience, the DNA templates should be designed to result in RNA transcripts ranging from 50 to 300 nts in length.
12. TM is the melting temperature calculated for the 20–25 nt overlap of the forward and reverse primers.
13. Based on our experience, using an incubator instead of a heating block will minimize evaporation.
14. TURBO DNase (2 U/μL) is included in the MEGAscript kit, but the enzyme can be purchased separately.
15. The sample can be loaded into two or more smaller wells, but we find that using a single well, large enough to accommodate the entire sample, minimizes the loss of transcript in downstream handling.

16. The RNA is now ready for use as “unlabeled RNA” in EMSA experiments. Otherwise, the RNA may be 5′-end labeled as described in Subheading 3.4.
17. Alternatively, store samples at -20 °C ON.
18. Use the blue columns provided with the NucleoSpin miRNA mini kit.
19. Handle with care. The flow-through contains non-incorporated [γ-32P] ATP.
20. The recovery yield of the NucleoSpin miRNA mini kit is >95%.
21. When handling many samples, it is recommended to prepare a premix of the 10× structure buffer, yeast RNA, and nuclease-free H2O.
22. We use a Typhoon FLA 9500 (Cytiva) to visualize the bands and IQTL 8.0 quantification software (Cytiva) to quantify the bands.
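As a complement to Subheading 3.2 and Note 12, the Python sketch below shows one way to check that a forward and a reverse primer overlap at their 3′-ends, to assemble the expected template in silico, and to obtain a rough TM estimate for the overlap. Several assumptions apply: the sequences are short hypothetical examples (the overlap is only 6 nt, whereas the protocol calls for 20–25 nt), the function names are ours, and the Wallace rule used here (2 °C per A/T, 4 °C per G/C) is a crude approximation that is not prescribed by the protocol. The TM calculator recommended by the polymerase supplier should take precedence.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(DNA_COMPLEMENT[nt] for nt in reversed(dna.upper()))


def three_prime_overlap(forward: str, reverse: str) -> str:
    """Longest 3' suffix of the forward primer that is also the 5' prefix of
    the reverse-complemented reverse primer (empty string if none)."""
    rc = reverse_complement(reverse)
    for k in range(min(len(forward), len(rc)), 0, -1):
        if forward[-k:] == rc[:k]:
            return forward[-k:]
    return ""


def assemble_template(forward: str, reverse: str) -> str:
    """Sense strand expected after extension of the two annealed primers."""
    overlap = three_prime_overlap(forward, reverse)
    if not overlap:
        raise ValueError("primers do not overlap at their 3'-ends")
    return forward + reverse_complement(reverse)[len(overlap):]


def wallace_tm(dna: str) -> int:
    """Rough melting temperature (deg C): 2 per A/T plus 4 per G/C."""
    dna = dna.upper()
    return 2 * (dna.count("A") + dna.count("T")) + 4 * (dna.count("G") + dna.count("C"))


# Hypothetical primer pair sharing a 6-nt overlap
forward_primer = "ATGGCTAGCTTGACCG"
reverse_primer = "TGAAGCCTTACGGTCA"

overlap = three_prime_overlap(forward_primer, reverse_primer)
template = assemble_template(forward_primer, reverse_primer)
annealing_temp = wallace_tm(overlap) + 3   # TM + 3 deg C, as in Subheading 3.2

if __name__ == "__main__":
    print("Overlap:", overlap)
    print("Assembled template:", template)
    print("Estimated annealing temperature:", annealing_temp, "deg C")
```

A check of this kind is mainly useful for catching primers that were ordered in the wrong orientation or that lost their overlap when substitutions were introduced for the mutant templates.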

Acknowledgments

The figures were made using Biorender (https://biorender.com/).

References

1. Waters LS, Storz G (2009) Regulatory RNAs in bacteria. Cell 136(4):615–628. https://doi.org/10.1016/j.cell.2009.01.043
2. Jorgensen MG, Pettersen JS, Kallipolitis BH (2020) sRNA-mediated control in bacteria: an increasing diversity of regulatory mechanisms. Biochim Biophys Acta Gene Regul Mech 1863(5):194504. https://doi.org/10.1016/j.bbagrm.2020.194504
3. Thorsing M, Dos Santos PT, Kallipolitis BH (2018) Small RNAs in major foodborne pathogens: from novel regulatory activities to future applications. Curr Opin Biotechnol 49:120–128. https://doi.org/10.1016/j.copbio.2017.08.006
4. Pennings S (1997) Nucleoprotein gel electrophoresis for the analysis of nucleosomes and their positioning and mobility on DNA. Methods 12(1):20–27. https://doi.org/10.1006/meth.1997.0443
5. Daras G, Alatzas A, Tsitsekian D, Templalexis D, Rigas S, Hatzopoulos P (2019) Detection of RNA-protein interactions using a highly sensitive non-radioactive electrophoretic mobility shift assay. Electrophoresis 40(9):1365–1371. https://doi.org/10.1002/elps.201800475
6. Wang F, Yao T, Yang W, Wu P, Liu Y, Yang B (2022) Protocol to detect nucleotide-protein interaction in vitro using a non-radioactive competitive electrophoretic mobility shift assay. STAR Protoc 3(4):101730. https://doi.org/10.1016/j.xpro.2022.101730
7. Dos Santos PT, Menendez-Gil P, Sabharwal D, Christensen JH, Brunhede MZ, Lillebaek EMS, Kallipolitis BH (2018) The small regulatory RNAs LhrC1-5 contribute to the response of Listeria monocytogenes to Heme toxicity. Front Microbiol 9:599. https://doi.org/10.3389/fmicb.2018.00599
8. Ross JA, Thorsing M, Lillebaek EMS, Teixeira Dos Santos P, Kallipolitis BH (2019) The LhrC sRNAs control expression of T cell-stimulating antigen TcsA in Listeria monocytogenes by decreasing tcsA mRNA stability. RNA Biol 16(3):270–281. https://doi.org/10.1080/15476286.2019.1572423
9. Mollerup MS, Ross JA, Helfer AC, Meistrup K, Romby P, Kallipolitis BH (2016) Two novel members of the LhrC family of small RNAs in Listeria monocytogenes with overlapping regulatory functions but distinctive expression profiles. RNA Biol 13(9):895–915. https://doi.org/10.1080/15476286.2016.1208332
10. Sievers S, Lund A, Menendez-Gil P, Nielsen A, Storm Mollerup M, Lambert Nielsen S, Buch Larsson P, Borch-Jensen J, Johansson J, Kallipolitis BH (2015) The multicopy sRNA LhrC controls expression of the oligopeptide-binding protein OppA in Listeria monocytogenes. RNA Biol 12(9):985–997. https://doi.org/10.1080/15476286.2015.1071011
11. Sievers S, Sternkopf Lillebaek EM, Jacobsen K, Lund A, Mollerup MS, Nielsen PK, Kallipolitis BH (2014) A multicopy sRNA of Listeria monocytogenes regulates expression of the virulence adhesin LapB. Nucleic Acids Res 42(14):9383–9398. https://doi.org/10.1093/nar/gku630
12. Nielsen JS, Lei LK, Ebersbach T, Olsen AS, Klitgaard JK, Valentin-Hansen P, Kallipolitis BH (2010) Defining a role for Hfq in Gram-positive bacteria: evidence for Hfq-dependent antisense regulation in Listeria monocytogenes. Nucleic Acids Res 38(3):907–919. https://doi.org/10.1093/nar/gkp1081
13. Nielsen JS, Larsen MH, Lillebaek EM, Bergholz TM, Christiansen MH, Boor KJ, Wiedmann M, Kallipolitis BH (2011) A small RNA controls expression of the chitinase ChiA in Listeria monocytogenes. PLoS One 6(4):e19019. https://doi.org/10.1371/journal.pone.0019019
14. Lillebaek EMS, Kallipolitis BH (2018) Mutational analysis of sRNA-mRNA base pairing by electrophoretic mobility shift assay. Methods Mol Biol 1737:165–176. https://doi.org/10.1007/978-1-4939-7634-8_10
15. Busch A, Richter AS, Backofen R (2008) IntaRNA: efficient prediction of bacterial sRNA targets incorporating target site accessibility and seed regions. Bioinformatics 24(24):2849–2856. https://doi.org/10.1093/bioinformatics/btn544
16. Wright PR, Georg J, Mann M, Sorescu DA, Richter AS, Lott S, Kleinkauf R, Hess WR, Backofen R (2014) CopraRNA and IntaRNA: predicting small RNA targets, networks and interaction domains. Nucleic Acids Res 42(Web Server issue):W119–123. https://doi.org/10.1093/nar/gku359
17. Mann M, Wright PR, Backofen R (2017) IntaRNA 2.0: enhanced and customizable prediction of RNA-RNA interactions. Nucleic Acids Res. https://doi.org/10.1093/nar/gkx279
18. Zuker M (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31(13):3406–3415

Chapter 12

Dynamics and Function of sRNA/mRNAs Under the Scrutiny of Computational Simulation Methods

Agustín Ormazábal, Juliana Palma, and Gustavo Pierdominici-Sottile

Abstract

Molecular dynamics simulations have proved extremely useful in investigating the functioning of proteins with atomic-scale resolution. Many applications to the study of RNA also exist, and their number increases by the day. However, implementing MD simulations for RNA molecules in solution faces challenges that the MD practitioner must be aware of for the appropriate use of this tool. In this chapter, we present the fundamentals of MD simulations, in general, and the peculiarities of RNA simulations, in particular. We discuss the strengths and limitations of the technique and provide examples of its application to elucidate small RNA’s performance.

Key words: Molecular dynamics, Enhanced sampling, Free energy landscape, Conformational ensemble, Small RNAs

Authors Juliana Palma and Gustavo Pierdominici-Sottile have contributed equally to this work.

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-1-0716-3565-0_12.

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_12, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Simulations

The rules governing a system are often known but constitute an exceedingly complex network. Therefore, the paths connecting those rules with the system behaviour remain unclear despite our basic knowledge. In these situations, one can resort to simulations to understand the system’s functioning in more depth. This chapter is devoted to presenting the fundamentals of Molecular Dynamics (MD) simulations of biomolecules, with an emphasis on their implementation in the study of RNAs. The main challenges the MD practitioner faces when simulating RNA molecules in aqueous solutions will be discussed, and examples of successful applications to small RNAs will be provided. We have chosen to give priority to presenting the main concepts of the methods and have tried to keep the number of equations and technicalities to a minimum. Thus, the chapter is meant for those who want to evaluate whether they can use MD simulations to contribute to their research on small RNAs, and for those wanting to make a critical evaluation of the results obtained by others with this technique.

In any simulation, we emulate the way a system functions by creating a model. The model should contain the elements and rules that are essential to reproduce the system’s performance. For a given initial situation, one then evolves the state of the model according to the assumed rules. That is the simulation. Finally, one contrasts the data obtained from the simulation with those of the actual system. This comparison reveals the predictive capacity of the model. If it is high, the model can be used to interpret and rationalise experimental findings. It can also be employed to make predictions about situations that evade experimental scrutiny. If the model’s predictive capacity is low, one must tune its rules to improve its functioning. The process of detecting which aspects of the model are responsible for a given outcome is, by itself, very enlightening and constitutes another way by which the combination of “modelling plus simulation” enhances our understanding.

Anyone with a bachelor’s degree in natural sciences has dealt with models at some point. For example, Lewis diagrams constitute rudimentary models for molecular structure. Their ability to qualitatively predict the geometry of most molecules is remarkable, especially considering how simple their rules are compared to the laws of Quantum Mechanics that actually dictate the molecule’s properties. Bohr’s atom is also a model. It proposes that the electrons orbit the nucleus as the planets orbit the sun. Despite being physically wrong, the model can quantitatively predict the absorption spectrum of the hydrogen atom (although it fails for atoms with more electrons).
Finally, the ideal gas is also a model whose contributions to the understanding of the gas phase behaviour can hardly be overemphasised. The examples provided above demonstrate that a model can afford valuable predictions even if it employs rules different from those that govern the system. As summarised in the quote of the British statistician George E. P. Box, “All models are wrong, but some are useful” [1]. In any case, the limitations established on the model’s rules will eventually afford erroneous simulation results. Therefore, one must be aware of the model’s flaws and evaluate the consistency and correctness of its results whenever possible. Incredibly simple representations (models) have also been of great help for the elucidation of the structure of biological macromolecules. Pauling employed wooden-made models, consisting of balls and sticks, to guess the existence of α-helices and β-sheets as the primary components of proteins [2]. The main ingredients of his models were the molecular geometries of the constituent amino acids and the assumption of a restrained geometry for the H-bonds

Dynamics of sRNA/mRNA

209

Fig. 1 Functional form of the AMBER and CHARMM force fields that allow computing the MM potential energy for each configuration of the system. k r i and r i,eq are the harmonic constants and equilibrium distances, respectively, of the covalent bonds. k θi and θi,eq are the harmonic constants and equilibrium angles, respectively, for adjacent covalent bonds. V i,n are the coefficients of the Fourier expansions that describe the potential energy for dihedral angles. The last two summations are the Lennard-Jones and Coulombic interactions, respectively. The pairwise parameters of the Lennard-Jones terms, Ai ,j and B i,j , are computed from parameters assigned to each atom. q i and q j are the partial charges of atoms i and j, respectively

formed between the carbonyl oxygen and amide nitrogen. Watson and Crick used a similar strategy to unveil the double-helix structure of DNA [3]. The relevance of these investigations, which were awarded the Nobel Prize, pinpoint to what extent models have contributed to the development of molecular biology. The balls-and-sticks models mentioned in the previous paragraph are the predecessors of the Molecular Mechanics (MM) force fields used nowadays. These are just analytical functions of the nuclei coordinates that mimic the interactions between the system’s particles (see Fig. 1). In the transition from wooden-made to MM models, the hard spheres representing the atom’s core were replaced by soft spheres, while harmonic springs took the place of the rigid bonds and angles. MM force fields also incorporate terms accounting for the attractive interactions between non-bonded atoms. We will provide a more precise description of all these terms in the next section. Here, it suffices to notice that, using simple rules, the force field attributes a value of potential energy to each molecular configuration. The set of energies corresponding to all possible configurations of a system constitutes its potential energy surface (PES). The PES


Agustín Ormazábal et al.

is intimately connected to the free energy landscape (FEL), but the two should not be confused. The PES is a mechanical property of a single system. The FEL, instead, is a statistical-mechanical property and therefore corresponds to a large set of identical systems in thermodynamic equilibrium (the so-called molecular ensemble). The PES only considers the potential energy of the configurations, while the FEL involves the internal energy (or enthalpy), the entropy and, fundamentally, the temperature.

As can be guessed, the MM force field embodies the core of the molecular model. In pre-computer days, these MM models were employed to characterise the most stable structures of small molecules. However, the calculations were limited by the lack of computational power. The advent of modern computers was a real game changer: it not only allowed the stable structures of larger molecules to be characterised but also paved the way for simulating their motions over time. Simulations of this kind, which determine the motions of atoms in molecules by propagating Newton’s equations on a given MM force field, are called all-atom classical molecular dynamics (MD) simulations. In this chapter, we will briefly present their fundamentals, as well as their applications to the study of sRNAs.

In summary, a classical all-atom MD calculation is a computational simulation that aims to describe the dynamics of a molecular system by applying the classical equations of motion with a simplified version of the inter-atomic interactions. It can be considered a single-molecule (computational) experiment in which the temporal evolution of the system is observed at atomic resolution. In addition, with the assistance of statistical mechanics theory, MD simulations allow thermodynamic and kinetic properties to be computed. Movie I presents an example of what can be seen in these simulations.
It shows an sRNA interacting with a protein that has two identical binding sites, one on each side. At the beginning of the trajectory, one of the GGA binding motifs of the sRNA attaches to one side of the protein. This initial binding then triggers a conformational change in the sRNA that positions another GGA motif close to the unoccupied binding site, located on the other side.

Molecular dynamics simulations should not be confused with other computational techniques used in biology that belong to the field of bioinformatics. These include methods for sequence or structural alignment, sequence finding, phylogenetic inference, structure or function prediction, identification of protein-protein interactions, and docking. They share with MD the use of computers to fulfil their aims but differ in their fundamentals. These bioinformatics approaches will not be discussed in this chapter.

Dynamics of sRNA/mRNA

1.1 Theoretical Framework


Molecular dynamics simulations can be carried out in many different ways. Each alternative implementation aims to better fit the characteristics of the system under study. In the following paragraphs, we will introduce the basics of the so-called all-atom classical MD simulations. This is the type usually employed to study RNAs. We will present this theoretical framework by describing the series of approximations that go from an accurate ab-initio calculation to an all-atom classical simulation. These approximations turn a theoretically ideal but hopeless calculation into a feasible and useful one. The presentation also highlights the situations in which simulations may fail because their assumptions are no longer valid. The strategies one can follow in such circumstances will be mentioned. Figure 2 shows a scheme of the simplifications leading from a computation entirely based on first principles to an all-atom classical MD simulation.

Fig. 2 Alternative computational approaches to study the dynamics of molecular systems. (a) The accurate solution requires simultaneously solving the Schrödinger equation for nuclei and electrons. Since this alternative is typically unaffordable, the Born-Oppenheimer approximation is invoked to separate the electronic problem from that of the nuclei. (b) The solution of the electronic Schrödinger equation provides a potential energy surface (PES) that governs the dynamics of the nuclei. This PES describes bond-breaking and bond-forming processes. Since this approach is computationally expensive, MD simulations approximate the PES with an empirical MM force field that cannot simulate chemical reactions. (c) Solving the nuclear Schrödinger equation on a given PES, either accurate or approximated, provides a quantum description of nuclear dynamics. This naturally incorporates quantum effects such as the zero-point energy of vibrations and tunnelling. (d) Finally, since that alternative is also very costly, nuclear dynamics is determined using the classical equations of motion
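The last step of this scheme, propagating the classical equations of motion on a given potential, can be illustrated with a minimal sketch. The text does not name an integration scheme; the velocity Verlet algorithm used here is an assumption chosen because it is a common choice in MD codes. The one-dimensional harmonic "bond", the reduced units, and all names are hypothetical.

```python
def harmonic_force(x, k=1.0, x_eq=0.0):
    """Force derived from the potential 0.5 * k * (x - x_eq)^2."""
    return -k * (x - x_eq)


def total_energy(x, v, mass=1.0, k=1.0, x_eq=0.0):
    """Kinetic plus potential energy of the one-dimensional oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * (x - x_eq) ** 2


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate Newton's equation m * a = F(x) with velocity Verlet.

    Returns the list of (position, velocity) pairs, initial state included.
    """
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


# Hypothetical usage in reduced units: unit mass, unit force constant.
trajectory = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                             mass=1.0, dt=0.01, n_steps=1000)
```

In a real MD engine, the force would be the negative gradient of the full MM force field with respect to all atomic coordinates, but the update cycle has the same structure.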


Any molecular system consists of nuclei and electrons whose time evolution is governed by the time-dependent Schrödinger equation. Computing this evolution is a formidable task that is only affordable for very small systems. For this reason, most MD simulations invoke the Born-Oppenheimer (BO) approximation. According to BO, the electronic Schrödinger equation can be decoupled from that of the nuclei. Therefore, one must first solve the electronic equation assuming fixed nuclei. This calculation affords an energy value, a point on the PES that rules the nuclear motion. Since one needs the complete PES, not just a point, the electronic calculations have to be repeated for many alternative nuclear positions. After completing this job and fitting the resulting points, one has the global PES and can finally attempt to solve the nuclear equation.

The problem with this strategy is that solving either Schrödinger equation, electronic or nuclear, is also an intractable task. Therefore, further approximations are invoked to make the calculations possible. Thus, instead of repeatedly solving the electronic Schrödinger equation for different nuclear positions, one computes the PES using an empirical MM force field. Also, instead of solving the nuclear Schrödinger equation for the given PES, one employs Newton’s equations to describe the nuclear movements.

The BO approximation is correct for processes occurring on a single electronic state. Thus, simulations based on it can describe biomolecules undergoing conformational changes, binding to one another, and even going through most chemical reactions. However, they are unable to describe charge transfer reactions or processes involving excited electronic states that eventually relax towards the ground state. On the other hand, the use of the classical equations of motion to propagate the state of the nuclei precludes the analysis of any nuclear quantum effect.
These mainly include tunnelling and the zero-point energy of the vibrational motions. None of these limitations is significant when analysing the behaviour of macromolecules, unless one is interested in studying a proton transfer reaction. In those cases, one must resort to more advanced techniques that use quantum mechanics to deal with the dynamics of a small subset of “quantum” nuclei. Finally, we note that the limitations imposed by using an MM force field can only be discussed after providing the analytical expressions it employs to compute the potential energy of nuclear configurations. That is the subject of the next section.

1.2 The MM Force Fields

As stated in the previous sections, most MD simulations employ MM force fields to evaluate the potential energy corresponding to each configuration of the nuclei. They have two basic ingredients: the functional form and the parameters. The functions are based on intuitive grounds, aiming to balance the calculations’ quality and computational cost. The parameters are adjusted to reproduce


relevant properties of a few selected systems. Thus, the procedure relies upon the belief that the parameters are transferable: those determined from the selected set of molecules are considered appropriate for molecules outside the set. This condition is not always fulfilled, but it works suitably for molecules of the same kind. Thus, there exist parameterisations for proteins, DNAs, RNAs, lipids, carbohydrates, and also for water molecules and ions [4–8].

New experiments or simulations sometimes reveal the shortcomings of a given force field. When this happens and the failure is severe, the set of parameters has to be modified. The functional form can also be changed, but that occurs less frequently. Thus, one can expect force fields to evolve so that the newer ones reproduce the properties of the systems better than the older ones. Accordingly, before undertaking an investigation based on MD simulations of a given system, one has to thoroughly evaluate the available force fields, always looking for criticisms and revised versions. For the case of RNA simulations, there are thorough discussions dating from 2017 [9] and 2018 [10]. Those articles are an adequate starting point for those who want to learn the main characteristics of RNA force fields and the challenges faced in their development. Most of those discussions will remain valid over time. However, force fields have improved since those reviews were written and will continue improving. Therefore, before initiating a project based on MD simulations of RNAs, one should look for the updates published since then.

The force fields most commonly used to deal with biomolecules are those of the AMBER and CHARMM families. We use the word “family” because, as stated above, there exist alternative versions of each of them that correspond to different stages of their evolution.
Both force fields are typically combined with the TIP3P model for the water molecule [6]. In the case of CHARMM, the combination is mandatory because the TIP3P model is involved in its parameterisation procedure. In the case of AMBER, the choice rests on the fact that the model affords good results at the lowest computational cost. However, there have been some reports that simulations of RNAs performed with the AMBER force field depended on the model employed for the water molecule [11]. Thus, some trial computations to evaluate whether this dependence is observed in the particular system of interest are advisable.

The AMBER and CHARMM force fields employ the same functional form, which is shown in Fig. 1, but they differ in the strategy implemented to determine the parameters. The first summation in the equation of Fig. 1 runs over all covalent bonds (1–2 interactions), which are treated as harmonic springs. Harmonic potentials are also assigned to the angles between adjacent covalent bonds (1–3 interactions). Their contribution appears in the


second summation. The third summation runs over all torsional angles (1–4 interactions), which are accounted for with a Fourier expansion. Depending on the relevance of the torsion, its parameterisation may require just one term, or up to three. There are also “improper torsions”, which are employed to enforce the planarity of the three bonds stemming from a central sp² atom. The fourth summation in the equation of Fig. 1 assigns a Lennard-Jones potential to each pair of atoms. Its repulsive part accounts for the steric repulsion, which originates in Pauli’s exclusion principle. The attractive part corresponds to forces deriving from induced dipoles (dipole/induced-dipole and dispersion forces). Finally, the fifth summation accounts for the dipole–dipole, ion–dipole, and ion–ion interactions. Their contribution is evaluated by assigning a fixed electric charge to each nucleus. These charges interact with each other via the Coulombic terms of the fifth summation.

Interactions appearing in the fourth and fifth summations are called “non-bonded” interactions. In spite of their name, they are calculated not only for all pairs of atoms that belong to different molecules but also for atoms of the same molecule if they are separated by more than three bonds (that is, beyond 1–4). Moreover, in the AMBER force field, they are also applied to 1–4 interactions but with a scaling factor that reduces their contribution.

The functional form of the force field of Fig. 1 represents a significant improvement over the wooden balls-and-sticks models initially employed to unveil the basic structural features of proteins and DNA. However, one should keep in mind that it is just an approximation that relies on parameters that can neither be unambiguously derived from first principles nor determined from experiments. In particular, these considerations apply to the torsional constants and partial charges, which are of paramount importance in determining the conformations of biomolecules in water solution.
Instead, equilibrium distances and angles can be fitted to reproduce information from X-ray structures, while the harmonic constants can be derived from IR vibrational frequencies. Finally, the Lennard-Jones parameters are assigned so that MD simulations of selected neutral compounds are able to reproduce properties of their liquid state. The most frequently considered properties are the density and the heat of vaporisation.

The determination of the partial charges is challenging because they are just an artefact devised to account for the electrostatic interactions between molecules. Actually, the point charges of the nuclei plus the electronic cloud surrounding them create an electric field around each molecule. This field, which varies with the molecule’s conformation, is felt by other molecules and ions that come close enough. However, when the two fragments start to interact, their electronic clouds distort, modifying the original charge distributions. As a result, the interaction between the fragments cannot be explained by the charge distributions they had in


isolation. Force fields like the one of Fig. 1, which implement invariant point charges regardless of the proximity and relative orientation of the interacting fragments, cannot account for this effect (polarisation). We also note that, since the partial charges are an artefact, there is no “correct” way of determining them. In fact, the AMBER and CHARMM force fields use different strategies. CHARMM’s charges were optimised to reproduce the quantum mechanics (QM) interaction energies between model compounds and a TIP3P water molecule located at relevant points. The charges of the AMBER force field were assigned to reproduce the QM electrostatic potential around a selected set of molecules. Thus, both approaches are physically sound, and neither can be regarded as the correct one.

The parameters for the torsional terms are the last ones to be determined. The simplest strategy to achieve this goal consists of using the formula of Fig. 1 to fit the QM potential energy profiles for selected torsional angles. In other words, one evaluates the difference between the QM profile and that computed with the formula of Fig. 1 with the coefficients of the torsions ($V_{i,n}$) set to zero. Then, the parameters of the torsions are chosen so that the torsional terms reproduce this difference as closely as possible. Unfortunately, MD simulations using parameters determined in this way are typically unable to predict the structural characteristics of biological molecules in solution. For this reason, the current trend is to fine-tune the torsional parameters so that they can reproduce experimental observations, typically acquired via NMR. This aim has proven to be very challenging because the steady increase in simulation times and the improvements in sampling techniques are pushing MM force fields to their very limit. It should be noted that the total potential energy as a function of a torsional angle has noticeable contributions from the electrostatic and Lennard-Jones terms of the atoms involved.
Thus, the torsional parameters are tightly coupled to those of the non-bonded interactions. For this reason, they cannot be transferred from one force field to another. The general advice is that a force field has to be taken as a whole. One should never mix parameters from different force fields. It is also not advisable to modify one or a few parameters to reproduce some property of interest more accurately. This apparently small change can cause unforeseen failures in the global performance of the force field. One should also be careful with the parameters employed for the water molecule and use a water model designed to work with the force field of the biological molecule. Otherwise, the crucial balance between solute–solute, solvent–solvent, and solute–solvent interactions can be lost, yielding completely non-physical simulations. For example, the TIP3P water model is more polar than the actual water molecule. But in protein simulations that use the AMBER force field, this “error” is compensated by the enlarged protein


dipoles. Thus, using a water model with the correct polarity reduces solvent–solvent and solute–solvent interactions, which can no longer balance the artificially large solute–solute interactions, and yields useless simulations. Regarding the CHARMM force field, we already mentioned that it uses the TIP3P model to set the partial charges on the nuclei of biological macromolecules. Therefore, the use of this water model is mandatory when working with the CHARMM protein force field.

1.3 Free Energy Landscapes and Conformational Ensembles

The concept of free energy landscape has been very helpful in theoretical research on protein behaviour and can equally be applied to other biological molecules. The same holds for the concept of “conformational ensemble”. In fact, the two ideas are tightly bound. But what do we mean when we talk about the FEL of a molecule? How can we estimate it? In what way does it determine the molecule’s functioning?

According to classical statistical thermodynamics, once we fix the system’s temperature (and volume or pressure), each configuration $q$ has a certain probability of being observed, $P(q)$. Here the “system” is our biological molecule and the water solution around it. We can then define a free energy for that particular configuration, $F(q)$, which relates to the probability $P(q)$ according to

$$F(q) = -k_B T \ln P(q) + C. \tag{1}$$

Here $k_B$ is the Boltzmann constant and $C$ is a constant that depends on temperature and is of no importance for the present discussion. Because of the negative logarithmic relationship of Eq. 1, regions of high probability correspond to low free energy and vice versa. Thus, if we have a system in a configuration of high $F(q)$ (low probability), we will likely observe it moving to regions of low $F(q)$ (high probability). This property is reminiscent of the Gibbs or Helmholtz free energies studied in physical chemistry courses, since macroscopic systems evolve from situations of high free energy to those of low free energy. In spite of the similarity, we note that $F(q)$ is not the same as any of the macroscopic free energies, which for a given temperature and pressure, or temperature and volume, are independent of the system’s coordinates.

From Eq. 1, the intuitive image of a free energy landscape arises naturally when one thinks of a minimalist system whose configuration is defined by two coordinates. In that case, a plot of the free energy as a function of the coordinates produces a picture like that of Fig. 3, which is reminiscent of a landscape with hills and valleys. However, the configuration of a biomolecule cannot be fully specified by only two coordinates but requires hundreds or thousands. This fact, in principle, complicates the image because a function depending on so many variables is difficult to deal with. Fortunately, the main structural and dynamical characteristics of


Fig. 3 Schematic representation of the free energy landscape (FEL) of an RNA molecule. Typically, the surface is plotted as a function of a reduced set of collective coordinates that describe the system’s conformation. The minima of the FEL correspond to the structures that constitute the conformational ensemble
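A one-dimensional version of such a landscape can be estimated from sampled values of a single collective coordinate by applying Eq. 1 to a histogram. The sketch below is illustrative only: the function name, the number of bins, and the temperature default are assumptions, and real applications usually need reweighting or enhanced sampling to populate high free energy regions.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def free_energy_profile(samples, temperature=310.0, bins=50):
    """Estimate F = -k_B T ln P along one coordinate from sampled values.

    Returns the centres of the populated bins and the free energy (kcal/mol)
    shifted so that its minimum is zero, which fixes the constant C of Eq. 1.
    """
    counts, edges = np.histogram(samples, bins=bins)
    centres = 0.5 * (edges[:-1] + edges[1:])
    populated = counts > 0                  # empty bins have undefined ln P
    probability = counts[populated] / counts.sum()
    free_energy = -K_B * temperature * np.log(probability)
    return centres[populated], free_energy - free_energy.min()


# Hypothetical usage with synthetic data standing in for an MD trajectory.
rng = np.random.default_rng(0)
coordinate = rng.normal(loc=0.0, scale=1.0, size=100_000)
centres, profile = free_energy_profile(coordinate)
```

The same idea extends to two coordinates with a two-dimensional histogram, which gives surfaces of the kind sketched in Fig. 3.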

biological molecules can be described by a few collective variables (CVs). Thus, the FELs are built by integrating the probabilities $P(q)$ over all coordinates except the few selected CVs. Typically, one or two CVs are employed, but for some applications a few more may be required. A discussion of the methods available for determining appropriate CVs for a biomolecule is beyond the scope of this chapter. Here, it suffices to say that they are routinely used in computational biophysics. A more in-depth discussion of the topic can be found in [12]. When analysing a molecule’s functioning, the free energy difference between alternative configurations is more important than the absolute value for any of them, since it measures the ratio of the configurations’ probabilities,

$$F(q_B) - F(q_A) = -k_B T \ln\left(\frac{P(q_B)}{P(q_A)}\right). \tag{2}$$
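Equation 2 can be inverted to obtain the probability ratio that corresponds to a given free energy difference. The short sketch below does this in kcal/mol; the function name and the temperature default are illustrative choices.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def probability_ratio(delta_f, temperature=310.0):
    """Return P(q_A) / P(q_B) for delta_f = F(q_B) - F(q_A) in kcal/mol.

    This is Eq. 2 solved for the ratio of probabilities, so a positive
    delta_f favours configuration A, the one with the lower free energy.
    """
    return math.exp(delta_f / (K_B * temperature))


k_b_t = K_B * 310.0                        # about 0.62 kcal/mol at 310 K
ratio_one_kt = probability_ratio(k_b_t)    # equals e, about 2.7
ratio_ten_kt = probability_ratio(10.0 * k_b_t)  # equals e^10, above 20,000
```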

Thus, a free energy difference of $k_B T$ (0.62 kcal/mol at 310 K) implies a probability ratio of 2.7 in favour of the state with the lowest free energy. However, if the free energy difference is $10\,k_B T$ (6.2 kcal/mol at 310 K), the chance of seeing the lowest free energy state is more than 20,000 times larger than that of the higher state. The picture that emerges from these considerations is that each well in the FEL corresponds to a metastable conformation. By metastable, we mean a conformation that lasts for some time, since it is unlikely that the system moves up the hill to escape


from it. In fact, the rate of interconversion between alternative conformations is dictated by the free energy barriers that separate their wells, following an Arrhenius-like functional dependence.

With the concepts of the previous paragraph in mind, we can now imagine what we would see if we could follow the movement of a single system (biomolecule in solution) as it walks on its FEL. If, in the beginning, the system is within a free energy well, it will remain there for a while, hovering around the minimum. But if we wait long enough, it will escape from the well, since unlikely events also occur (they just do not happen often). After moving out, the system will sooner or later fall into another free energy well and stay there until a new breakout. This process will continue indefinitely so that, if we wait for a sufficiently long time, the system will visit all the wells in the FEL. Moreover, it will pass through the vicinity of each point, and the time spent around each point will be proportional to the probability $P(q)$ that comes from the laws of statistical thermodynamics.

One may now ask what relationship exists between the behaviour of an isolated system walking on its free energy landscape and a typical experimental situation in which we have a water solution of the biomolecule at a given temperature. The answer is easy if the solution is dilute, so that each individual biomolecule is far apart from the others. In that case, the solution is formed by an ensemble of identical systems, analogous to the one described in the previous paragraph, and they all share the same FEL. At any single moment, each system will be located at a different point of the FEL. But if we look at the ensemble, the systems will populate the FEL following the probabilities $P(q)$ indicated by the laws of statistical thermodynamics. Of course, any system of the ensemble will be moving on the FEL like the single system of the previous paragraph.
However, this drift does not change the chances of finding any of the systems at a given location. The equilibrium probabilities $P(q)$ are such that they remain invariant in the ensemble. If some systems leave a given conformation, others will enter it to keep the balance. Thus, when we have an aqueous solution of a biomolecule in equilibrium at a given temperature, the biomolecule shows an ensemble of conformations. This is the “conformational ensemble”. This ensemble is dynamic, in the sense that all the biomolecules are moving on their FEL and changing their conformation. However, they do so in such a way that the probabilities $P(q)$ do not change.

Any property of the solution that depends on the conformation of the biomolecule will be an ensemble average. As stated above, we can compute the same average from a sufficiently long trajectory of an individual biomolecule. This is what MD simulations aim to do. The equivalence between ensemble averages and time averages, when the latter are computed over infinitely long trajectories, is known as the ergodic hypothesis and is at the core of statistical thermodynamics.
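The equivalence between time and ensemble averages can be illustrated on a toy landscape. The sketch below is not an MD simulation: it uses a Metropolis Monte Carlo walk, swapped in here because it samples the equilibrium probabilities of a one-dimensional model exactly and needs no force field. The tilted double-well function, its parameters, and all names are hypothetical. The fraction of "time" the single walker spends in the left well is compared with the population obtained by integrating the Boltzmann weights directly.

```python
import math

import numpy as np


def free_energy(x, barrier=2.0, tilt=0.5):
    """Toy one-dimensional landscape in units of k_B T: a tilted double well."""
    return barrier * (x * x - 1.0) ** 2 + tilt * x


def metropolis_trajectory(n_steps=300_000, max_step=0.5, x0=-1.0, seed=0):
    """Single-system walk on the landscape using the Metropolis criterion."""
    rng = np.random.default_rng(seed)
    moves = rng.uniform(-max_step, max_step, size=n_steps)
    coins = rng.random(n_steps)
    trajectory = np.empty(n_steps)
    x, f = x0, free_energy(x0)
    for i in range(n_steps):
        x_new = x + moves[i]
        f_new = free_energy(x_new)
        # Downhill moves are always accepted; uphill ones with exp(-dF).
        if f_new <= f or coins[i] < math.exp(f - f_new):
            x, f = x_new, f_new
        trajectory[i] = x
    return trajectory


def time_left_population(trajectory):
    """Fraction of the trajectory spent in the left well (x < 0)."""
    return float(np.mean(trajectory < 0.0))


def ensemble_left_population(n_grid=20001):
    """Population of the left well from the Boltzmann weights exp(-F)."""
    grid = np.linspace(-3.0, 3.0, n_grid)
    weights = np.exp(-free_energy(grid))
    return float(weights[grid < 0.0].sum() / weights.sum())


walk = metropolis_trajectory()
time_average = time_left_population(walk)
ensemble_average = ensemble_left_population()
```

For a trajectory that is too short to cross the barrier many times, the two numbers disagree, which is the practical face of the sampling problem discussed throughout this chapter.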


1.4 Challenges for RNA MD Simulations


As stated above, there are force fields for each of the major types of biological macromolecules. Unfortunately, these are not equally good. Therefore, the predictive power of the MD simulations carried out with them is not the same, either. In that sense, the differences between proteins and RNA are striking. More than 20 years ago, MD simulations were able to predict the folded structure of the Trp-cage protein starting from a random coil [13]. The final Root Mean Square Deviation (RMSD) between that early model’s prediction and the experimental structure was < 1.0 Å. More recently, Shaw and co-workers were able to repeatedly simulate the folding process of twelve proteins belonging to the three main structural classes: α-helical, β-sheet, and mixed α/β [14]. The magnitude of these achievements (produced several years ago) does not imply that there are no challenges in protein simulations. In fact, the difficulties found when simulating intrinsically disordered proteins have shown that there is still room for improvement. However, it demonstrates that the field is mature and that the predictive power of protein simulations is high if they are proficiently performed.

In contrast, early MD simulations of RNA demonstrated the inadequacy of the available force fields [15, 16]. This realisation triggered the search for improvements that led to numerous refinements in recent years [17]. However, this process is still under development. Therefore, there will surely be new advances in the coming years. Compared to proteins, RNA molecules in solution have a significantly broader spectrum of possible conformations. Therefore, they pose a real challenge to both experimental and theoretical studies. Since the force fields are parameterised, in part, on the basis of experimental determinations, one can expect that the difficulties faced in the structural characterisation of RNA will limit the reliability of computational simulations.
On the other hand, force field parameterisation also depends on quantum mechanics calculations, and in this regard RNA molecules also pose a challenge. The main difficulties come from electron–electron correlations in the π–π stacking of aromatic bases, which are elusive to calculate. Finally, the lack of polarisable electrostatics in the current force fields is particularly problematic in RNA simulations because of the high polarisability of the phosphate groups as well as the high electrostatic charge of Mg²⁺ ions. Currently, many research efforts are devoted to overcoming these limitations. Both AMBER [18] and CHARMM [19] have polarisable versions of their RNA force fields that claim to perform better than their non-polarisable counterparts. However, the implementation of any of these versions is much more computationally demanding. Therefore, they struggle to reach the long simulation times required for a fair sampling of the RNA’s conformational space.
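The RMSD used above to compare a predicted structure with an experimental one can be computed after optimal superposition of the two sets of coordinates. The sketch below uses the Kabsch algorithm, a standard choice that the text does not name explicitly, so it should be read as an assumption about how such a number might be obtained. The function name and the synthetic coordinates are hypothetical.

```python
import numpy as np


def kabsch_rmsd(mobile, reference):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Both sets are centred on their centroids and `mobile` is rotated onto
    `reference` with the rotation given by the Kabsch algorithm.
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mobile_c = mobile - mobile.mean(axis=0)
    reference_c = reference - reference.mean(axis=0)

    covariance = mobile_c.T @ reference_c
    u, _, vt = np.linalg.svd(covariance)
    # Correct for a possible reflection so that the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    rotated = mobile_c @ rotation.T
    return float(np.sqrt(np.mean(np.sum((rotated - reference_c) ** 2, axis=1))))


# Hypothetical usage: a rotated and translated copy should give RMSD near 0.
rng = np.random.default_rng(1)
reference_xyz = rng.normal(size=(12, 3))
angle = 0.7
rotation_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                       [np.sin(angle), np.cos(angle), 0.0],
                       [0.0, 0.0, 1.0]])
moved_xyz = reference_xyz @ rotation_z.T + np.array([1.0, -2.0, 0.5])
rmsd_same_shape = kabsch_rmsd(moved_xyz, reference_xyz)
```

In practice, the atom selection (for example backbone atoms only) strongly affects the value, so RMSD figures are comparable only when computed on the same set of atoms.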

2 Dynamical Characteristics of RNAs

In this section, we describe the main characteristics of the free energy landscapes of RNAs. They determine the dynamic behaviour of these molecules and, therefore, what can and cannot be seen during an MD simulation.

2.1 Ensemble Modularity in RNA

RNAs consist of a limited number of secondary and tertiary motifs. Typically, the folding of every motif is almost independent of the structure of the remaining ones. This characteristic has been named “ensemble modularity” [20], and it implies that the final 3D structure is formed by the assembly of the individual modules. Secondary structure modularity was demonstrated by the fusion of various types of RNA aptamers to other functional RNA elements (such as siRNAs and miRNAs) capable of selectively targeting specific cell types or allosterically responding to environmental metabolites [21–23]. The interactions between the independently folded motifs constitute the tertiary structure of RNAs, and the ensemble modularity concept also applies at this level. Ensemble modularity occurs because the existence of a given motif in the FEL of an RNA is dictated by its internal properties more than by its structural context. Thus, the interactions between the different parts of the molecule do not alter the possible structures adopted by the individual RNA motifs. Instead, they redistribute their populations within the conformational ensemble by applying geometric constraints that stabilise some conformations over others. The concept of ensemble modularity suggests that the dynamic behaviour of given RNA motifs, or even of partial structures of the molecule, may play biological roles independently of the rest of the molecule.

2.2 Hierarchy of RNA Free Energy Landscapes

As discussed in Sect. 1.3, the FEL provides a framework for describing macromolecular conformational ensembles, as it specifies the populations of every metastable structure and the rate of exchange between them. Experimental evidence demonstrates that RNA FELs are composed of basins hierarchically organised in tiers. The main difference between these tiers is the energetic cost of the interconversion between their alternative basins. Figure 4 depicts the tiers of RNA landscapes, highlighting the structural modifications at each of them with their related timescales.

Fig. 4 Free energy landscapes of RNAs are composed of three tiers that differ in the rate of interconversion between their alternative basins. Note that each basin of the first tier is composed of several basins of the second tier which, in turn, are formed by several basins of the third one. The main interactions related to each conformational change and their characteristic exchange rates are indicated

The first level of the hierarchy corresponds to stable structural motifs, mainly stabilised by stacking and complementarity between bases. These interactions determine the secondary structure which is, in turn, conditioned by the tertiary structure. The motifs in this tier vary from each other in both their secondary and tertiary rearrangements [24]. Some examples of slow processes linked to this tier are base-pair melting, reshuffling, isomerisation, and long-amplitude tertiary interactions. These processes often expose alternative functional groups that participate in the recognition of other molecules [25]. The structural modules of this level correspond to deep minima of the FEL, since alternating between them requires the disruption of several base pairs. Consequently, the interconversions are slow, with transition times ranging from ms to seconds or even hours [24, 25]. Since these times often exceed biologically relevant timescales, protein chaperones are generally required to assist in these conformational changes [26, 27]. Moreover, standard MD simulations currently reach the μs to ms range, so many of these events are still beyond the scope of the technique.

Each basin of the first level of the hierarchy can be further subdivided into shallower ones corresponding to conformations of the next level. This second tier thus refers to structures with minor changes at their secondary level, whose interconversions require only the breaking of single base pairs, without the assistance of protein chaperones [28]. Base-pair switches in hairpins, shifts inside non-canonical motifs, and short-amplitude tertiary dynamics are some examples of transitions within this tier. The structural

Agustín Ormazábal et al.

characteristics of each basin are relevant for stabilising the global structure and occluding/exposing specific nucleotides involved in the interaction with other molecules. These tertiary dynamics can also play a role in transitions between active and inactive forms, thereby tuning RNA biological activity. This second tier of organisation is typically composed of quasi-isoenergetic conformations with exchange times ranging from μs to ms [24]. Therefore, events in this tier can plausibly be examined by MD simulations. Finally, each basin of the second level can be divided into even shallower basins that constitute the last level of the structural organisation. The motions at this level are influenced more by the secondary structure than by the sequence [29, 30]. The third level of the hierarchy corresponds to the fastest transitions, with energetic barriers significantly lower than in the previous ones, thus enabling RNA structures to readily settle into specific conformations. In this third tier, we have jittering dynamics ranging from ps to μs, such as the flipping in and out of unpaired bulges and internal loop residues, sugar re-puckering, phosphate-backbone reorientations, and collective motions of helical domains [25]. Variable-amplitude inter-helical activity is also included at this level of the hierarchy. These helical domains undergo large collective motions at ns/μs timescales [25]. Only in exceptional cases, such as four-way junctions, do these movements occur in slower regimes [31]. Processes that involve changes within the basins of this tier can feasibly be characterised by MD techniques [32–34]. Another relevant feature of this level is the base pairing between residues from single-stranded regions at the top of hairpin loops. This interaction, also known as kissing loops, involves flexible motions and has relatively short associated timescales. Remarkably, loop dynamics often expose key residues for ligand/protein recognition to the solvent on relatively short timescales [25, 35].
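The link between barrier height and exchange timescale that separates the three tiers can be illustrated with a rough estimate. The sketch below assumes the simplest form of transition-state theory (the Eyring expression with a transmission coefficient of 1), which is not part of the original text and tends to overestimate rates for biomolecular rearrangements. The function name `eyring_timescale` and the barrier values are hypothetical and chosen only for illustration.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J*s
R = 1.987204e-3         # gas constant, kcal/(mol*K)


def eyring_timescale(barrier_kcal_mol, temperature_k=300.0):
    """Return the mean waiting time (s) to cross a free energy barrier.

    Assumption: Eyring expression with a transmission coefficient
    of 1, so the result is a lower bound on the real timescale.
    """
    rate = (K_B * temperature_k / H) * math.exp(
        -barrier_kcal_mol / (R * temperature_k)
    )
    return 1.0 / rate


if __name__ == "__main__":
    # Hypothetical barrier heights, chosen only to span the three tiers.
    for barrier in (3.0, 10.0, 18.0):
        print(f"{barrier:5.1f} kcal/mol -> {eyring_timescale(barrier):.2e} s")
```

Under these assumptions, a barrier of a few kcal/mol is crossed within picoseconds, one of about 10 kcal/mol within microseconds, and one approaching 20 kcal/mol only after seconds, which mirrors the ordering of the tiers described above.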

3 MD Simulations Protocols

In this section, we describe the main steps required to carry out an MD simulation of an RNA in solution. Important features to pay attention to at each stage will be highlighted. However, providing all the details needed to actually implement a simulation is out of the scope of this chapter because many variations exist. We, therefore, refer the interested reader to the tutorials available at the websites of the MD developers mentioned below.

3.1 Packages to Perform MD Simulations

Several research groups have developed software suites to perform MD simulations. CHARMM [36], GROMACS [37], NAMD [38], and AMBER [39] are the most popular. All of them are free of charge for academics. Besides, they are constantly being upgraded, with bugs eliminated and new tools incorporated. Their alternative versions can be consulted and downloaded from the web pages listed in Table 1. Also, in Table 1, we provide the sites of molecular visualisation programs, which constitute an essential tool in any MD study, and programs used to predict RNA 3D structures, which are sometimes required to build the computational models.

Table 1 Most popular molecular dynamics/visualisation programs

MD Code                  Website
CHARMM                   https://academiccharmm.org/
GROMACS                  https://www.gromacs.org/
AMBER                    https://ambermd.org/index.php
NAMD                     http://www.ks.uiuc.edu/Research/namd/

Visualisation Programs   Website
VMD                      http://www.ks.uiuc.edu/Research/vmd/
PyMOL                    https://pymol.org/2/
Chimera                  https://www.cgl.ucsf.edu/chimera/index.html

Modelling Servers        Website
I-TASSER                 https://zhanggroup.org/I-TASSER/
Rosetta                  https://yanglab.nankai.edu.cn/trRosetta/
DeepFoldRNA              https://zhanggroup.org/DeepFoldRNA/
ModeRNA                  https://iimcb.genesilico.pl/modernaserver/
AlphaFold                https://alphafold2.biodesign.ac.cn/
ColabFold                https://github.com/sokrypton/ColabFold
Swiss-Model              https://swissmodel.expasy.org/

3.2 Setting Up the Model

In MD studies, the quality of the initial structure is of prime importance because simulations often cannot repair irregularities that confine the system in unfavourable conformations. This is of particular relevance for RNA, as discussed in [40]. The most common practice (although this is changing nowadays) consists of downloading the structural file from the RCSB website (https://www.rcsb.org/). This repository contained approximately 200,000 entries in January 2023, 97.75% of which corresponded to proteins or proteins in complex with other molecules. These numbers show that experimentally determined RNA structures represent just a tiny portion of the available information. The figures are even more striking if we consider that, for humans, the number of RNAs surpasses that of proteins by an order of magnitude [41, 42]. Other databases are specific for either nucleic acids


(http://ndbserver.rutgers.edu/) or just RNAs (http://rna.bgsu.edu/rna3dhub/pdb and https://rnasolo.cs.put.poznan.pl/). There, the 3D structures are cleaned of non-RNA data and clustered, highlighting specific RNA characteristics.

If there is no experimentally determined information about the 3D structure of the RNA under study, one has to resort to modelling. Some servers that produce 3D models based on protein or nucleic acid sequences are listed in Table 1. These programs employ different inference algorithms and often yield divergent outputs. Therefore, one has to treat the structures they provide with caution, ideally comparing the results obtained with more than one server. Until recently, something similar occurred with the prediction of protein structures. However, artificial intelligence (AI) has recently produced an impressive breakthrough in this area when DeepMind Technologies presented AlphaFold [43–45]. This tool has proved to be able to predict over 200 million protein structures with experimental accuracy [43–45]. Unfortunately, RNA structural predictions via AI lag far behind those of proteins, as occurs with other aspects of their computational modelling. There are two major reasons for this. On the one hand, the experimental information available, which constitutes the input used to train the networks of AI algorithms, is much scarcer than that of proteins, as indicated above. On the other hand, RNAs are considerably more flexible than folded proteins and therefore the link between sequence and 3D structure is not so straightforward. We recommend the review of Ref. [46] to readers looking for a deeper comprehension of the challenges faced by RNA 3D structural predictions.

Once we have an initial RNA structure, the model construction for an all-atom MD simulation requires solvating it in a box of water molecules and ions. For the latter, one has to consider the intended salt concentration plus the ions required to neutralise the negative charge of the RNA molecule. Typically, Na⁺ ions are incorporated for neutralisation and then equal amounts of Na⁺ and Cl⁻ ions are added on top, to reach the desired concentration. The inclusion of Mg²⁺ ions poses a problem for modelling due to their high charge and relatively low concentration. Both factors hinder the balance between the free Mg²⁺ ions in the solution and those bound to the RNA backbone. A simulation can only provide information about their equilibrium behaviour if the bound and free ions have exchanged several times. However, this requires simulation times in the order of milliseconds, or even longer, which are difficult to achieve in practice. Therefore, if one knows that a given Mg²⁺ ion plays a critical structural role attached to a binding site in the RNA molecule, one has to add it by hand. Thus, it will play its relevant structural role despite being out of equilibrium.
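The ion bookkeeping described above (neutralising Na⁺ plus equal numbers of Na⁺ and Cl⁻ for the target concentration) can be sketched as follows. This is a minimal illustration, not the procedure of any particular MD package: the function name `count_ions` and its arguments are hypothetical, and the sketch assumes that the whole box volume counts as solvent, which slightly overestimates the number of ion pairs.

```python
AVOGADRO = 6.02214076e23   # particles per mole
LITRES_PER_NM3 = 1.0e-24   # 1 nm^3 expressed in litres


def count_ions(rna_net_charge, box_volume_nm3, salt_molar):
    """Return (n_na, n_cl) for neutralisation plus added NaCl.

    rna_net_charge : net charge of the solute in units of e (negative for RNA)
    box_volume_nm3 : volume of the simulation box in nm^3
    salt_molar     : intended NaCl concentration in mol/L

    Assumption: the whole box volume is treated as solvent volume.
    """
    if rna_net_charge > 0:
        raise ValueError("this sketch only handles neutral or anionic solutes")
    n_neutralising = -rna_net_charge
    n_pairs = round(salt_molar * AVOGADRO * box_volume_nm3 * LITRES_PER_NM3)
    return n_neutralising + n_pairs, n_pairs


if __name__ == "__main__":
    # Hypothetical system: net charge -20 e in a cubic box of 8 nm side.
    print(count_ions(-20, 8.0 ** 3, 0.15))
```

For the hypothetical system in the usage lines, 0.15 M corresponds to about 46 ion pairs, so the model would contain 66 Na⁺ and 46 Cl⁻ ions. Structurally relevant Mg²⁺ ions placed by hand would have to be subtracted from the neutralising count.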

3.3 Minimisation, Heating, and Equilibration

Computational models built as indicated in the previous section typically contain clashes, as well as strained bonds and angles. This occurs because the programs used to create the models add missing atoms and residues according to internal templates, without detailed consideration of their immediate environment. All these tensions must be eliminated before running the simulation. Otherwise, the system will initially be on a steep slope of the potential energy surface. In such a situation, the forces acting on some particles will be enormous and can cause the system to “explode” or to adopt non-physical configurations when Newton’s equations are propagated. Consequently, a minimisation stage is always carried out before the propagation to reduce the artificially high forces generated during the model construction process. The configuration attained at the minimum is of absolutely no importance. In fact, a typical RNA computational model may have tens of thousands of local minima [47]. Any of them is equally good for initiating the propagation.

The model system prepared so far has initial coordinates and occupies a given volume but does not have a temperature. Temperature is a macroscopic property that relates to the distribution of the particles’ kinetic energies. This distribution is known as Maxwell’s distribution and is a property of paramount importance to simulate activated processes (i.e., processes that occur when the system moves over a free energy barrier). Thus, to run useful MD simulations, we have to provide the system’s particles with velocities taken from a Maxwellian distribution, and we should also implement an algorithm to ensure this distribution is maintained despite the fact that particles are constantly changing their velocities. Algorithms that fulfil this role receive the name of “thermostats”. Among them, Langevin’s and Nosé-Hoover’s are the most widely used.
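The assignment of velocities from a Maxwellian distribution can be sketched in a few lines. The example below is only an illustration under simple assumptions: it draws each Cartesian velocity component from a Gaussian of standard deviation sqrt(k_B T / m), does not remove the centre-of-mass motion, ignores constraints, and does not implement a thermostat. The function names and the toy list of masses are hypothetical.

```python
import math
import random

K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg


def maxwell_velocities(masses_amu, temperature_k, rng):
    """Draw one 3D velocity (m/s) per atom from the Maxwell distribution.

    Each Cartesian component is Gaussian with zero mean and
    standard deviation sqrt(k_B * T / m).
    """
    velocities = []
    for mass in masses_amu:
        sigma = math.sqrt(K_B * temperature_k / (mass * AMU))
        velocities.append(tuple(rng.gauss(0.0, sigma) for _ in range(3)))
    return velocities


def instantaneous_temperature(masses_amu, velocities):
    """Temperature (K) computed from the kinetic energy, ignoring constraints."""
    twice_kinetic = sum(
        mass * AMU * sum(component ** 2 for component in velocity)
        for mass, velocity in zip(masses_amu, velocities)
    )
    return twice_kinetic / (3.0 * len(masses_amu) * K_B)


if __name__ == "__main__":
    rng = random.Random(0)
    masses = [12.011] * 5000 + [15.999] * 5000   # hypothetical C and O atoms
    vel = maxwell_velocities(masses, 300.0, rng)
    print(f"T = {instantaneous_temperature(masses, vel):.1f} K")
```

For a finite number of atoms, the temperature recovered from the sampled velocities fluctuates around the target value, which is one reason why a thermostat is needed to maintain the distribution during the run.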
Regarding the initial velocities, one could be tempted to just assign them according to the Maxwellian distribution corresponding to the target temperature (for example, 300 K). However, if we do this on the model obtained from the minimisation stage, which is at 0 K, the change is so abrupt that the attempt will probably end with a (virtual) “explosion”. For that reason, the heating is typically carried out in stages that smoothly take the system from zero to the desired temperature. We note that, once an MD simulation has run for a while at the target temperature, it can be stopped and re-initiated keeping the atomic positions but re-assigning the velocities from the Maxwellian distribution. This is a trick frequently employed to improve the sampling of the conformational space. The last adjustment we have to make to create an MD simulation that resembles the experimental conditions is relaxing the system volume so that its pressure is the intended one (typically 1.0 atm). This is carried out by applying an algorithm called “barostat” that


scales the system’s volume to attain the target pressure. Several barostats are available in simulation packages. This latter adjustment also changes the system’s density, which should fluctuate around 1.0 g/mL for simulations in aqueous solutions. We note that the barostat only works properly if the system has a relatively high temperature (typically at least 170 K). For that reason, one carries out part or all of the heating stage at constant volume and then changes to constant pressure. If all the previous steps were properly performed, all the parameters that characterise the macroscopic state of the system should oscillate around a constant value. Therefore, a check for the system’s equilibration consists of plotting the temperature, density, and total energy as a function of time. If all of them present just small variations around the desired constant value, we can consider that the system is at equilibrium and can start the production stage.

3.4 Production

During this stage, the system’s configurations are collected so that they can afterwards be analysed to provide insights into the atomistic functioning of the investigated biomolecule. Ideally, the snapshots taken from the simulation should represent the configurational ensemble assessed in wet-lab experiments. When this aim is fulfilled, we say that the simulation is converged. However, producing converged simulations is a very demanding task [48, 49] that turns out to be much harder for RNAs than for proteins because of their larger flexibility. The period spanned by the biological process under examination also plays a role in the sampling requirements (see Sect. 2.2). In the early days of MD studies of biomolecules, the length of the trajectories was in the order of ps [50]. Thus, only events that occur within this time scale could be directly observed. Over the years, the conjunction of computational and methodological developments has allowed the modelling of more extensive systems and increased the simulation times by several orders of magnitude. Today, systems that include up to tens of millions of atoms [51–53] and processes in the sub-millisecond regime can be investigated by standard methods [54–57]. Moreover, for processes beyond this time scale, enhanced-sampling techniques can be used (see below).

3.5 Enhanced Sampling

As described in Sect. 2.2, biomolecules in solution spend most of their time hovering around the minima of the free energy basins of their FELs. Of course, the same happens with the MD simulations that aim to emulate the biomolecules’ behaviour. This poses a problem when one wants to study, via MD simulations, processes that can only occur if the system visits high free energy regions. Among them, we can mention the unbinding of a substrate, which typically implies moving upward on the FEL by more than 25 kcal/mol, or conformational changes that take the system from one basin to another but require surmounting free energy


barriers larger than ≈ 10 kcal/mol. An even more challenging situation appears when one needs to characterise the conformational ensemble of a biomolecule, since this requires a simulation that visits all the free energy basins in proportion to their probability. To cope with situations like these, several enhanced-sampling techniques have been devised. Here, we will discuss the basics of those most widely employed and provide some examples of their applications to the study of RNAs. Since the sampling problem arises because the system gets trapped within free energy wells, there are two basic remedies: one can either raise the system’s kinetic energy by increasing the temperature or reduce the potential energy barriers by adding new terms to the force field. There exist methods based on both strategies. The simplest scheme that employs ad hoc increased kinetic energies is called simulated annealing. It consists of taking samples from a trajectory performed at a high temperature and gradually cooling them down to the temperature intended for the simulation. Ideally, the structures taken from the higher temperature will correspond to alternative basins, which provides a more extensive sampling. The main problem with this approach is that the results heavily depend on the number of samples taken at the higher temperature and on the cooling speed. The difficulties encountered in establishing general and robust protocols for simulated annealing led to the development of the method known as Temperature-Replica Exchange MD (T-REMD) [58]. In T-REMD, one runs several simulations of the same model at different temperatures (these are the “replicas”). Periodically, the T-REMD algorithm evaluates the probability of exchanging the coordinates between neighbouring replicas. Then, it holds a draw to determine whether the exchange will actually occur and proceeds according to the result.
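The exchange step just described can be sketched as follows. The text does not give the acceptance rule, so the example assumes the standard Metropolis criterion commonly used for temperature replica exchange, p = min{1, exp[(β_i − β_j)(E_i − E_j)]} with β = 1/(k_B T). The function names and the energies in the usage lines are hypothetical.

```python
import math
import random

K_B = 1.987204e-3  # Boltzmann constant, kcal/(mol*K)


def exchange_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis probability of swapping the coordinates of two replicas.

    energy_i, energy_j : potential energies (kcal/mol) of the
                         configurations currently at temp_i and temp_j
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def attempt_exchange(energy_i, energy_j, temp_i, temp_j, rng):
    """Hold the 'draw': return True if the swap is accepted."""
    return rng.random() < exchange_probability(energy_i, energy_j, temp_i, temp_j)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical energies of two neighbouring replicas at 300 K and 310 K.
    print(exchange_probability(-110.0, -100.0, 300.0, 310.0))
    print(attempt_exchange(-110.0, -100.0, 300.0, 310.0, rng))
```

With this rule, a swap that hands the lower-energy configuration to the colder replica is always accepted, whereas the opposite swap is accepted only with a probability that decays with the energy and temperature gaps. This is why the spacing between neighbouring temperatures is one of the parameters that must be tuned.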
Thus, at the end of the process, one has a set of trajectories with segments run at the alternative temperatures (as if we had suddenly adjusted the thermostat several times along each simulation). All the subsequent analysis is then carried out on the trajectory fragments run at the intended temperature. Many studies have focused on determining the most appropriate parameters for T-REMD simulations and evaluating the quality of the sampling achieved with them. Among these parameters, we have the number of replicas and the temperature range they span. Regarding this range, we highlight that the higher temperatures have no physical meaning. They are just a device to get the system out of the basins of the free energy landscape. Several more alternatives exist to enhance the sampling by reducing the potential energy barriers. Among the most commonly used methods, we can mention Umbrella Sampling (US), Metadynamics, Accelerated MD (aMD), and H-REMD. In US, a bias potential is applied to confine the system to oscillate around a


selected value of a carefully chosen “reaction coordinate”. For example, to simulate the unbinding of a substrate from the hairpin of an RNA, we can define the reaction coordinate as the distance between the centre of mass of the substrate and that of the hairpin. Several simulations that differ in the centre of the biasing potential must be carried out, affording a set of biased probability distributions of the reaction coordinate. Then, an algorithm that removes the biases is applied. Finally, the unbiased probabilities so produced are introduced in Eq. 1 to compute the free energy along the selected path (also known as the Potential of Mean Force). In aMD, a boosting potential is added to that computed from the force field when the latter is below a given threshold [59]. Typically, this potential is added to all or some of the torsional terms to facilitate conformational changes. Diverse functional forms have been proposed, but the most commonly used nowadays is a quadratic function of the difference between the actual potential and the threshold. Thus, regions of lower energy receive a larger boost. The result is that the potential becomes flatter, facilitating the sampling. Since the samples obtained by this procedure do not correspond to the actual potential, they cannot be directly used to compute averages over the conformational ensemble. Instead, a prior reweighting process is required to establish the statistical significance of each sample. Metadynamics also aims to enhance the sampling by producing a flatter potential energy surface (PES) but employs a strategy different from aMD. Its philosophy consists of biasing the course of the simulation by adding a history-dependent potential to that computed with the force field [60, 61]. Specifically, it imposes a potential energy penalty around points that were already visited by the system. The penalty is applied along the course of the simulation (it is said to be applied “on-the-fly”).
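The history-dependent penalty of Metadynamics can be illustrated with a one-dimensional toy model. The sketch below assumes the common choice of Gaussian-shaped penalties deposited at previously visited values of a single coordinate. The double-well potential, the Gaussian height and width, and the function names are all hypothetical choices made for illustration, not parameters taken from the text.

```python
import math


def double_well(s):
    """Hypothetical 1D potential with minima at s = -1 and s = +1."""
    return (s * s - 1.0) ** 2


def bias(s, centers, height, width):
    """History-dependent penalty: one Gaussian per previously visited point."""
    return sum(
        height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2))
        for c in centers
    )


def biased_potential(s, centers, height=0.1, width=0.2):
    """Potential actually used to propagate the dynamics."""
    return double_well(s) + bias(s, centers, height, width)


if __name__ == "__main__":
    # Pretend the system has been sitting in the left basin for a while.
    visited = [-1.0] * 12
    for s in (-1.0, 0.0, 1.0):
        print(s, round(double_well(s), 3), round(biased_potential(s, visited), 3))
```

In this toy case, twelve deposits at the bottom of the left well lift it above the barrier top at s = 0, while the unvisited right well is left essentially unchanged.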
Therefore, after the system has been in a given basin of the actual PES for a while, the modified potential used to propagate the equations of motion is raised enough to allow the system to escape. We note that the locations already visited by the system are not defined as points of the whole configurational space, but of a reduced-dimensionality space spanned by a few collective variables (CVs). The necessity of finding proper CVs to apply the potential penalty is probably the main drawback of Metadynamics. Finally, H-REMD is an alternative version of T-REMD in which the replicas have the same temperature but slightly different boosting potentials. Thus, each of them runs on a different Hamiltonian, and that is the origin of the method’s name. All these enhanced-sampling methods have been employed to study the behaviour of RNAs via MD simulations [62]. Metadynamics was used to gain insight into the FELs of tetraloops [63] and to characterise RNA aptamer complexes [64]. Also, it was applied to simulate the binding process between RNAs and different ligands [65] or other nucleic acid molecules [66, 67]. There are also several studies of


RNAs performed with REMD [68]. Some of these applications were devoted to enhancing the predictive character of the RNA force fields [69, 70], as well as to exploring the conformational ensembles of small RNAs [71, 72]. Besides, the alternative REMD variants were used to model RNA folding [73, 74] and the dynamic behaviour of hairpin loops [75, 76]. Even more challenging applications of REMD include examining RNA characteristics in viral contexts [68] and gene regulation by riboswitches [77]. Regarding aMD, challenging applications include studying the functioning of riboswitches [78] and RNA-protein complexes [79]. To conclude, applications of US include the study of RNA hairpin stability [80], conformational transitions [81], as well as the study of complexes of RNAs with proteins and other nucleic acids [82, 83].
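As a complement to the description of aMD given earlier in this section, the quadratic boost and the subsequent reweighting can be sketched as below. The text does not state the formula, so the example assumes the harmonic form ΔV = ½k(E − V)² for V < E (and zero otherwise) used in Gaussian accelerated MD, together with exact exponential reweighting, in which each sample is weighted by exp(ΔV / k_B T). The function names, the threshold, and the force constant are hypothetical.

```python
import math

K_B = 1.987204e-3  # Boltzmann constant, kcal/(mol*K)


def boost(potential, threshold, k):
    """Quadratic boost added only when the potential is below the threshold."""
    if potential >= threshold:
        return 0.0
    return 0.5 * k * (threshold - potential) ** 2


def reweighting_factors(boosts, temperature_k=300.0):
    """Normalised weights exp(dV / k_B T) used to recover unbiased averages."""
    beta = 1.0 / (K_B * temperature_k)
    largest = max(boosts)   # subtracted to avoid overflow in exp()
    raw = [math.exp(beta * (dv - largest)) for dv in boosts]
    total = sum(raw)
    return [w / total for w in raw]


if __name__ == "__main__":
    # Hypothetical potential energies (kcal/mol) of three snapshots.
    energies = [-12.0, -10.0, -4.0]
    boosts = [boost(v, threshold=-5.0, k=0.1) for v in energies]
    print(boosts)
    print(reweighting_factors(boosts))
```

Snapshots that received a larger boost were sampled on a more strongly modified potential and therefore carry a larger weight when averages over the unbiased ensemble are reconstructed.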

4 Advances in RNA Simulations

The literature on MD simulations of RNA systems covers multiple aspects including intrinsic flexibility, the impact of specific substitutions, molecular evolution, catalysis, the role of ions and water molecules, RNA/protein complex formation, antibiotic modes of action, structure prediction, and ribosome/spliceosome mechanisms, among others. Here we will briefly present a group of studies selected from a broader spectrum, with the aim of introducing the reader to the scope of this methodology and how it complements structural/biochemical experimental techniques. Due to the complexity of the RNA molecular processes, the synergy reached by means of experimental and computational approaches is encouraging for a full understanding. In the following sections, we first highlight benchmark MD studies on RNAs in general, and then we describe the specific literature associated with sRNAs involved in translational regulatory processes.

4.1 Benchmark Applications

Since the first MD simulation of an RNA molecule was performed [84], the structural and dynamical features of small RNA models have been widely examined in order to test force-field performance [16, 85–88] and evaluate sampling convergence [89, 90]. The models studied for these purposes included mainly tetranucleotides and hairpins, internal or multi-helix junction loops [91–98]. Slightly larger models such as the rRNA Loop E motif, sarcin–ricin loops, Kink-turns and kissing-loop complexes were also inspected by MD simulations [99–103]. The impact of base-pair substitution, (un)folding processes, inner stability, hydration, flexibility, and ion-binding were some of the inspected features. The study of the above-mentioned models proved highly valuable not only for understanding their own dynamical behaviour but also for making inferences about the role of the higher-order RNA structures they are part of.


Viral RNA elements such as the trans-activation response (TAR) element or the dimerisation initiation sequence (DIS) of HIV have also been extensively examined [103–105]. MD simulations have also been applied to study the reaction mechanisms of ribozymes. In this kind of study, biochemical features such as catalytic strategies, the most probable mechanism and the influence of ions, water and specific nucleotides were inspected [106–108]. At this point, it is important to notice that bond-breaking/formation events cannot be simulated using conventional force fields. Instead, quantum calculations are required. The ribosomal translational machinery represents an ideal system for MD simulation studies. Different models including part of the ribosome were implemented to understand key aspects of the decoding, accommodation, peptidyl transfer and translocation processes [109–112]. Insights about antibiotic binding were also gained using these approaches [113–115]. The advances in computational power have enabled the analysis of even larger systems than those mentioned above. In this century, we have witnessed simulations encompassing millions of atoms. Those that examined the whole ribosome are just an example. Sanbonmatsu et al. suggested that a motion of tRNA in a corridor acts as a gate to the peptidyl transferase centre [116]. The rates and driving forces for tRNA translocation were examined by Bock et al. [117]. Besides, a multiple-pathway mechanism for tRNA accommodation was proposed based on synergistic results of MD and smFRET assays [118]. Finally, Trabuco et al. determined mobile structures implicated in tRNA release from the ribosome [119].

4.2 Applications to Regulatory sRNAs

Small RNAs (sRNAs) have a critical role in translational regulation [120, 121] and in most, if not all, cases they are influenced by the interaction with specific proteins. MD studies of this sort of system have only emerged in the last decade. Therefore, there is still a lot to be learned, but the field is fertile for original contributions. Hfq is an RNA-binding protein that can interact with both regulatory sRNAs and mRNAs, influencing the translation process in most known bacteria [122, 123]. In particular, it binds untranslated regions of mRNAs and can also facilitate base-pair formation between the sRNAs and the target mRNAs. OxyS is one of those sRNAs that regulate mRNA translation aided by Hfq. Its expression is augmented under stress conditions. Knowing the molecular details of the Hfq-OxyS interaction mechanism is critical for understanding how these sRNAs regulate translation. Li et al. computed the Hfq-OxyS binding free energy in agreement with experimental results. By means of per-residue energy decomposition, they found that Tyr203 and Lys209 play a key role in the interaction process and could explain the increased affinity of Hfq when the N48A mutation is considered. In Pseudomonas, Hfq and catabolite


repression control (Crc) protein build up, together with target mRNAs, a translation–repression complex. Protein Crc by itself binds neither RNAs nor Hfq, but it attaches to Hfq–RNA complexes and strengthens Hfq interactions with target mRNAs [124]. Krepl et al., by means of MD simulations, investigated the RNA-Hfq interface and its implications on Crc binding. They found that there exists a dynamic equilibrium between the anti and syn nucleotide conformations in the RNA-Hfq complex and that, upon binding, Crc shifts the free energy balance towards the anti conformation. A new binding pocket of Hfq was also detected, and they could predict a possible assembly pathway for the RNA-Hfq-Crc complex [125]. Lazar et al. [126] also investigated the RNA-Hfq system, characterising the influence of ionic strength. They noticed that the cellular salt concentration was crucial for stacking stabilisation between amino acids and nucleotides. Besides, they also detected that the RNA interaction diminishes as the ionic strength is increased. CsrA, like Hfq, represents a group of proteins (Pfam code PF02599) involved in translational modulation [127, 128]. However, they act specifically on GGA motifs of the 5′ untranslated region of mRNAs. Regulatory sRNAs such as RsmX, RsmZ or CsrC bind CsrA by means of a mimicry strategy (i.e., they contain many GGA motifs), inhibiting its interaction with target mRNAs. It has been observed that, although CsrA can interact with GGA motifs located in either loops or single-stranded regions, the binding affinity is larger in the former case [129]. We particularly examined the molecular mechanism of (un)binding of the CsrA orthologue of Pseudomonas protegens, RsmE, to a single-stranded GGA motif of RsmZ. We found that the main motion of this RNA region is to adopt a loop-like structure, which is necessary for RsmE docking [82]. Moreover, using in-silico mutations, we determined that the flanking nucleotides of the GGA motif are needed to confer on this region the required flexibility. On the other hand, and based on the recently disclosed structure of the complex formed by RsmZ and three RsmE molecules [130], we performed a series of MD simulations considering RsmZ bound to two, one, and no RsmE molecules. For the latter system, we observed that the most probable “available” GGA motifs were those that interact with the first RsmE molecule. The binding of this first molecule then causes a conformational change in RsmZ that unveils other GGA motifs that were not “accessible” in the free form of the RNA. The same pattern was observed upon the binding of a second RsmE molecule [131]. Thus, the analysis shed light on the sequential binding mechanism of RsmE on RsmZ and predicted that occluded GGA motifs of the free RNA molecule become available as it binds more RsmE proteins.


Finally, a recent experimental study on rice plants investigated the reasons for their high yield and nitrogen use efficiency. The authors found that these traits were influenced by the levels of a couple of proteins whose translation was modulated in a temperature-dependent way by the levels of sRNAs. They also performed MD simulations of these sRNAs to examine the influence of the temperature on their structure, finding changes in the hairpin loops that would enable their binding to the target mRNAs [132].

5 Conclusions

MD simulations of RNAs have provided insights into details of their functioning in many cases. However, there still exist challenges that limit the scope of this approach. Among them, we can mention the drawbacks of the force fields and the difficulties of sampling the conformational space. Both areas are currently the focus of investigation by numerous groups. Thus, one can foresee important progress in this regard in the coming years. We believe that wisely designed experimental/computational studies will offer key opportunities to broaden our knowledge about these complex systems.

6 Movie Legend

Movie I shows how the small RNA RsmZ captures protein RsmE, according to an MD simulation of the two molecules in aqueous NaCl solution. We initiated the simulation with one of the GGA motifs of RsmZ located near a binding site of RsmE and observed that they rapidly attach to each other. This initial attachment then triggers a conformational change in RsmZ that takes another GGA motif close to the unoccupied binding site of RsmE. In this way, the protein can be grasped from both sides.

References

1. Box GEP (1976) Science and statistics. J Am Stat Assoc 71(356):791–799
2. Pauling L, Corey RB, Branson HR (1951) The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain. Proc Natl Acad Sci 37(4):205–211
3. Watson J, Crick FHC (1953) Molecular structure of nucleic acids: a structure for deoxyribose nucleic acid. Nature 171(4356):737–738
4. Izadi S, Anandakrishnan R, Onufriev AV (2014) Building water models: a different approach. J Phys Chem Lett 5(21):3863–3871
5. Zielkiewicz J (2005) Structural properties of water: comparison of the SPC, SPCE, TIP4P, and TIP5P models of water. J Chem Phys 123(10):104501
6. Price DJ, Brooks III CL (2004) A modified TIP3P water potential for simulation with Ewald summation. J Chem Phys 121(20):10096–10103
7. Sengupta A, Li Z, Song LF, Li P, Merz Jr KM (2021) Parameterization of monovalent ions for the OPC3, OPC, TIP3P-FB, and TIP4P-FB water models. J Chem Inform Model 61(2):869–880
8. Li Z, Song LF, Li P, Merz Jr KM (2020) Systematic parametrization of divalent metal ions for the OPC3, OPC, TIP3P-FB, and TIP4P-FB water models. J Chem Theory Comput 16(7):4429–4442
9. Vangaveti S, Ranganathan SV, Chen AA (2017) Advances in RNA molecular dynamics: a simulator’s guide to RNA force fields. WIREs RNA 8(2):e1396
10. Sponer J, Bussi G, Krepl M, Banáš P, Bottaro S, Cunha RA, Gil-Ley A, Pinamonti G, Poblete S, Jurecka P, Walter NG, Otyepka M (2018) RNA structural dynamics as captured by molecular simulations: a comprehensive overview. Chem Rev 118(8):4177–4338
11. Bergonzo C, Cheatham III TE (2015) Improved force field parameters lead to a better description of RNA structure. J Chem Theory Comput 11(9):3969–3972
12. Palma J, Pierdominici-Sottile G (2023) On the uses of PCA to characterise molecular dynamics simulations of biological macromolecules: basics and tips for an effective use. ChemPhysChem 24(2):e202200491
13. Simmerling C, Strockbine B, Roitberg AE (2002) All-atom structure prediction and folding simulations of a stable protein. J Am Chem Soc 124(38):11258–11259
14. Lindorff-Larsen K, Piana S, Dror RO, Shaw DE (2011) How fast-folding proteins fold. Science 334(6055):517–520
15. Yildirim I, Stern HA, Tubbs JD, Kennedy SD, Turner DH (2011) Benchmarking amber force fields for RNA: comparisons to NMR spectra for single-stranded r(GACC) are improved by revised χ torsions. J Phys Chem B 115(29):9261–9270
16. Condon DE, Kennedy SD, Mort BC, Kierzek R, Yildirim I, Turner DH (2015) Stacking in RNA: NMR of four tetramers benchmark molecular dynamics. J Chem Theory Comput 11(6):2729–2742
17. Sponer J, Krepl M, Banáš P, Kührová P, Zgarbová M, Jurecka P, Havrila M, Otyepka M (2017) How to understand atomistic molecular dynamics simulations of RNA and protein–RNA complexes? Wiley Interdiscip Rev: RNA 8(3):e1405
18. Zhang C, Lu C, Jing Z, Wu C, Piquemal JP, Ponder JW, Ren P (2018) AMOEBA polarizable atomic multipole force field for nucleic acids. J Chem Theory Comput 14(4):2084–2108
19. Lemkul JA, MacKerell Jr AD (2018) Polarizable force field for RNA based on the classical Drude oscillator. J Comput Chem 39(32):2624–2646
20. Wade G, Luc J (2013) RNA modularity for synthetic biology. F1000Prime Rep 45(5)
21. Beisel CL, Chen YY, Culler SJ, Hoff KG, Smolke CD (2010) Design of small molecule-responsive microRNAs based on structural requirements for Drosha processing. Nucleic Acids Res 39(7):2981–2994
22. McNamara J, Andrechek E, Wang Y et al (2006) Cell type–specific delivery of siRNAs with aptamer-siRNA chimeras. Nat Biotechnol 24:1005–1015
23. Vinkenborg J, Karnowski N, Famulok M (2011) Aptamers for allosteric regulation. Nat Chem Biol 7:519–527
24. Ganser LR, Kelly ML, Herschlag D, Al-Hashimi HM (2019) The roles of structural dynamics in the cellular functions of RNAs. Nat Rev Mol Cell Biol 20(8):474–489
25. Mustoe AM, Brooks CL, Al-Hashimi HM (2014) Hierarchy of RNA functional dynamics. Ann Rev Biochem 83(1):441–466
26. Herschlag D (1995) RNA chaperones and the RNA folding problem. J Biol Chem 270(36):20871–20874
27. Rist MJ, Marino JP (2002) Mechanism of nucleocapsid protein catalyzed structural isomerization of the dimerization initiation site of HIV-1. Biochemistry 41(50):14762–14770
28. Butcher SE, Pyle AM (2011) The molecular interactions that stabilize RNA tertiary structure: RNA motifs, patterns, and networks. Acc Chem Res 44(12):1302–1311
29. Denny SK, Bisaria N, Yesselman JD, Das R, Herschlag D, Greenleaf WJ (2018) High-throughput investigation of diverse junction elements in RNA tertiary folding. Cell 174(2):377–390.e20
30. Bailor MH, Sun X, Al-Hashimi HM (2010) Topology links RNA secondary structure with global conformation, dynamics, and adaptation. Science 327(5962):202–206
31.
Hohng S, Wilson TJ, Tan E, Clegg RM, Lilley DM, Ha T (2004) Conformational flexibility of four-way junctions in RNA. J Mol Biol 336(1):69–79 32. Zhang Q, Sun X, Watt ED, Al-Hashimi HM (2006) Resolving the motional modes that

234

Agustı´n Ormaza´bal et al.

code for RNA adaptation. Science 311(5761):653–656 33. Zhang Q, Stelzer AC, Fisher CK, Al-Hashimi HM (2007) Visualizing spatially correlated dynamics that directs RNA conformational transitions. Nature 450(7173):1263–1267 34. Salmon L, Bascom G, Andricioaei I, Al-Hashimi HM (2013) A general method for constructing atomic-resolution RNA ensembles using NMR residual dipolar couplings: the basis for interhelical motions revealed. J Am Chem Soc 135(14): 5457–5466 35. Tan D, Marzluff WF, Dominski Z, Tong L (2013) Structure of histone mRNA stemloop, human stem-loop binding protein, and 3 ′ hExo ternary complex. Science 339(6117): 318–321 36. Brooks BR, Brooks III CL, Mackerell Jr AD, Nilsson L, Petrella RJ, Roux B, Won Y, Archontis G, Bartels C, Boresch S et al (2009) CHARMM: the biomolecular simulation program. J Comput Chem 30(10): 1545–1614 37. Hess B, Kutzner C, Van Der Spoel D, Lindahl E (2008) Gromacs 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. J Chem Theory Comput 4(3): 435–447 38. Nelson MT, Humphrey W, Gursoy A, Dalke A, Kale´ LV, Skeel RD, Schulten K (1996) NAMD: a parallel, object-oriented molecular dynamics program. Int J Supercomput Appl High Perform Comput 10(4): 251–268 39. Case DA, Darden TA, Cheatham TE (2018) AMBER 22. University of California, San Francisco 40. Hashem Y, Auffinger P (2009) A short guide for molecular dynamics simulations of RNA systems. Methods 47(3):187–197 41. Salzberg SL (2018) Open questions: how many genes do we have? BMC Biol 16(1):1–3 42. Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, Vollger MR, Altemose N, Uralsky L, Gershman A et al (2022) The complete sequence of a human genome. Science 376(6588):44–53 43. Jumper JM et al (2021) Highly accurate protein structure prediction with AlphaFold. Nature 596:583–589 44. Varadi M et al (2021) AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. 
Nucleic Acids Res 50(D1):D439–D444

45. Stahl B. https://www.alphafold.ebi.ac.uk/ (2023) 46. Ou X, Zhang Y, Xiong Y, Xiao Y (2022) Advances in RNA 3d structure prediction. J Chem Inform Model 47. Chakraborty D, Collepardo-Guevara R, Wales DJ (2014) Energy landscapes, folding mechanisms, and kinetics of RNA tetraloop hairpins. J Am Chem Soc 136(52): 18,052–18,061 48. Sawle L, Ghosh K (2016) Convergence of molecular dynamics simulation of protein native states: feasibility vs self-consistency dilemma. J Chem Theory Comput 12(2): 861–869 49. Nemec M, Hoffmann D (2017) Quantitative assessment of molecular dynamics sampling for flexible systems. J Chem Theory Comput 13(2):400–414 50. McCammon JA, Gelin BR, Karplus M (1977) Dynamics of folded proteins. Nature 267(5612):585–590 51. Perilla JR, Goh BC, Cassidy CK, Liu B, Bernardi RC, Rudack T, Yu H, Wu Z, Schulten K (2015) Molecular dynamics simulations of large macromolecular complexes. Curr Opin Struct Biol 31:64–74 52. Tarasova E, Nerukh D (2018) All-atom molecular dynamics simulations of whole viruses. J Phys Chem Lett 9(19):5805–5809 53. Casalino L, Dommer AC, Gaieb Z, Barros EP, Sztain T, Ahn SH, Trifan A, Brace A, Bogetti AT, Clyde A et al (2021) AI-driven multiscale simulations illuminate mechanisms of SARSCoV-2 spike dynamics. Int J High Perform Comput Appl 35(5):432–451 54. Salomon-Ferrer R, Gotz AW, Poole D, Le Grand S, Walker RC (2013) Routine microsecond molecular dynamics simulations with amber on GPUs. 2. explicit solvent particle mesh Ewald. J Chem Theory Comput 9(9): 3878–3888 55. Shaw DE, Dror RO, Salmon JK, Grossman J, Mackenzie KM, Bank JA, Young C, Deneroff MM, Batson B, Bowers KJ et al (2009) Proceedings of the conference on high performance computing networking, storage and analysis, pp 1–11 56. Harvey MJ, Giupponi G, Fabritiis GD (2009) ACEMD: accelerating biomolecular dynamics in the microsecond time scale. J Chem Theory Comput 5(6):1632–1639 57. 
Shaw DE, Maragakis P, Lindorff-Larsen K, Piana S, Dror RO, Eastwood MP, Bank JA, Jumper JM, Salmon JK, Shan Y et al (2010) Atomic-level characterization of the structural

Dynamics of sRNA/mRNA dynamics of proteins. Science 330(6002): 341–346 58. Bernardi RC, Melo MC, Schulten K (2015) Enhanced sampling techniques in molecular dynamics simulations of biological systems. Biochimica et Biophysica Acta (BBA) – Gen Subj 1850(5):872–877. Recent developments of molecular dynamics 59. Wang J, Arantes PR, Bhattarai A, Hsu RV, Pawnikar S, Huang YmM, Palermo G, Miao Y (2021) Gaussian accelerated molecular dynamics: principles and applications. WIREs Comput Mol Sci 11(5):e1521 60. Bussi G, Laio A (2020) Using metadynamics to explore complex free-energy landscapes. Nat Rev Phys 2:200–212 61. Barducci A, Bonomi M, Parrinello M (2011) Metadynamics. WIREs Comput Mol Sci 1(5): 826–843 62. Mly´nsky´ V, Bussi G (2018) Exploring RNA structure and dynamics through enhanced sampling simulations. Curr Opin Struct Biol 49:63–71 63. Bottaro S, Bana´sˇ P, Sˇponer J, Bussi G (2016) Free energy landscape of GAGA and UUCG RNA tetraloops. J Phys Chem Lett 7(20): 4032–4038 64. Tanida Y, Matsuura A (2020) Alchemical free energy calculations via metadynamics: application to the theophylline-RNA aptamer complex. J Comput Chem 41(20):1804–1819 65. Dandekar BR, Sinha S, Mondal J (2021) Role of molecular dynamics in optimising ligand discovery: case study with novel inhibitor search for peptidyl t-RNA hydrolase. Chem Phys Impact 3:100,048 66. Verona M, Verdolino V, Palazzesi F et al (2017) Focus on PNA flexibility and RNA binding using molecular dynamics and metadynamics. Sci Rep 7:42,799 67. Zhu L, Jiang H, Cao S et al (2021) Critical role of backbone coordination in the mRNA recognition by RNA induced silencing complex. Commun Biol 4:1345 68. Davidson RB, Hendrix J, Geiss BJ, McCullagh M (2020) RNA-dependent structures of the RNA-binding loop in the flavivirus NS3 helicase. J Phys Chem B 124(12):2371–2381 69. Li Z, Mu J, Chen J, Chen HF (2022) Basespecific RNA force field improving the dynamics conformation of nucleotide. Int J Biol Macromol 222:680–690 70. 
Mly´nsky´ V, Ku¨hrova´ P, Ku¨hr T, Otyepka M, Bussi G, Bana´sˇ P, Sˇponer J (2020) Finetuning of the amber RNA force field with a new term adjusting interactions of terminal

235

nucleotides. J Chem Theory Comput 16(6): 3936–3946 71. Bottaro S, Bussi G, Kennedy SD, Turner DH, Lindorff-Larsen K (2018) Conformational ensembles of RNA oligonucleotides from integrating NMR and molecular simulations. Sci Adv 4(5):eaar8521 72. Girard N, Dagenais P, Lacroix-Labonte´ J, Legault P (2019) A multi-axial RNA joint with a large range of motion promotes sampling of an active ribozyme conformation. Nucleic Acids Res 47(7):3739–3751 73. Fox DM, MacDermaid CM, Schreij AMA, Zwierzyna M, Walker RC (2022) RNA folding using quantum computers. PLOS Comput Biol 18:1–17 74. Mly´nsky´ V, Janec¸ek M, Ku¨hrova´ P, Fro¨hlking T, Otyepka M, Bussi G, Bana´sˇ P, Sˇponer J (2022) Toward convergence in folding simulations of RNA tetraloops: comparison of enhanced sampling techniques and effects of force field modifications. J Chem Theory Comput 18(4):2642–2656 75. Lam K, Kasavajhala K, Gunasekera S, Simmerling C (2022) Accelerating the ensemble convergence of RNA hairpin simulations with a replica exchange structure reservoir. J Chem Theory Comput 18(6):3930–3947 76. Swadling JB, Ishii K, Tahara T, Kitao A (2018) Origins of biological function in DNA and RNA hairpin loop motifs from replica exchange molecular dynamics simulation. Phys Chem Chem Phys 20:2990–3001 77. Cheng L, White EN, Brandt NL, Yu AM, Chen AA, Lucks J (2022) Cotranscriptional RNA strand exchange underlies the gene regulation mechanism in a purine-sensing transcriptional riboswitch. Nucleic Acids Res 50(21):12,001–12,018 78. Chen J, Zeng Q, Wang W, Sun H, Hu G (2022) Decoding the identification mechanism of an SAM-III riboswitch on ligands through multiple independent Gaussianaccelerated molecular dynamics simulations. J Chem Inform Model 62(23):6118–6132 79. Roy R, Mishra A, Poddar S, Nayak D, Kar P (2022) Investigating the mechanism of recognition and structural dynamics of nucleoprotein-RNA complex from Peste des petits ruminants virus via Gaussian accelerated molecular dynamics simulations. 
J Biomol Struct Dyn 40(5):2302–2315 80. Smith LG, Tan Z, Spasic A, Dutta D, SalasEstrada LA, Grossfield A, Mathews DH (2018) Chemically accurate relative folding stability of RNA hairpins from molecular

236

Agustı´n Ormaza´bal et al.

simulations. J Chem Theory Comput 14(12): 6598–6612 81. Barthel A, Zacharias M (2006) Conformational transitions in RNA single uridine and adenosine bulge structures: a molecular dynamics free energy simulation study. Biophys J 90(7):2450–2462 82. Ormaza´bal A, Pierdominici-Sottile G, Palma J (2022) Recognition and binding of RsmE to an AGGAC motif of RsmZ: insights from molecular dynamics simulations. J Chem Inform Model 83. Basu S, Alagar S, Bahadur RP (2021) Unusual RNA binding of FUS RRM studied by molecular dynamics simulation and enhanced sampling method. Biophys J 120(9):1765–1776 84. Harvey SC, Prabhakaran M, Mao B, McCammon JA (1984) Phenylalanine transfer RNA: molecular dynamics simulation. Science 223(4641):1189–1191 85. Bana´s P, Hollas D, Zgarbova´ M, Jurecka P, Orozco M, Cheatham III TE, Sponer J, Otyepka M (2010) Performance of molecular mechanics force fields for RNA simulations: stability of UUCG and GRNA hairpins. J Chem Theory Comput 6(12):3836–3849 86. Yildirim I, Stern HA, Tubbs JD, Kennedy SD, Turner DH (2011) Benchmarking amber force fields for RNA: comparisons to NMR spectra for single-stranded r (GACC) are improved by revised χ torsions. J Phys Chem B 115(29):9261–9270 87. Kuhrova P, Best RB, Bottaro S, Bussi G, Sponer J, Otyepka M, Banas P (2016) Computer folding of RNA tetraloops: identification of key force field deficiencies. J Chem Theory Comput 12(9):4534–4548 88. Villa A, Stock G (2006) What NMR relaxation can tell us about the internal motion of an RNA hairpin: a molecular dynamics simulation study. J Chem Theory Comput 2(5): 1228–1236 89. Roe DR, Bergonzo C, Cheatham III TE (2014) Evaluation of enhanced sampling provided by accelerated molecular dynamics with Hamiltonian replica exchange methods. J Phys Chem B 118(13):3543–3552 90. Bergonzo C, Henriksen NM, Roe DR, Cheatham TE (2015) Highly sampled tetranucleotide and tetraloop motifs enable evaluation of common RNA force fields. RNA 21(9): 1578–1590 91. 
Sorin EJ, Rhee YM, Pande VS (2005) Does water play a structural role in the folding of small nucleic acids? Biophys J 88(4): 2516–2524

92. Sorin EJ, Rhee YM, Nakatani BJ, Pande VS (2003) Insights into nucleic acid conformational dynamics from massively parallel stochastic simulations. Biophys J 85(2):790–803 93. Zhang W, Chen SJ (2002) RNA hairpinfolding kinetics. Proc Natl Acad Sci 99(4): 1931–1936 94. Xu X, Yu T, Chen SJ (2016) Understanding the kinetic mechanism of RNA single base pair formation. Proc Natl Acad Sci 113(1): 116–121 95. Garcia AE, Paschek D (2008) Simulation of the pressure and temperature folding/unfolding equilibrium of a small RNA hairpin. J Am Chem Soc 130(3):815–817 96. Villa A, Widjajakusuma E, Stock G (2008) Molecular dynamics simulation of the structure, dynamics, and thermostability of the RNA hairpins UCACGG and CUUCGG. J Phys Chem B 112(1):134–142 97. Chen AA, Garcı´a AE (2013) High-resolution reversible folding of hyperstable RNA tetraloops using molecular dynamics simulations. Proc Natl Acad Sci 110(42):16,820–16,825 98. Kollman PA, Massova I, Reyes C, Kuhn B, Huo S, Chong L, Lee M, Lee T, Duan Y, Wang W et al (2000) Calculating structures and free energies of complex molecules: combining molecular mechanics and continuum models. Acc Chem Res 33(12):889–897 99. Havrila M, Re´blova´ K, Zirbel CL, Leontis NB, Sponer J (2013) Isosteric and non-isosteric base pairs in RNA motifs: molecular dynamics and bioinformatics study of the sarcin-ricin internal loop. J Phys Chem B 117(46):14,302–14,319 100. Re´blova´ K, Sˇpacˇkova´ N, Sˇtefl R, Csaszar K, Kocˇa J, Leontis NB, Sˇponer J (2003) NonWatson-crick basepairing and hydration in RNA motifs: molecular dynamics of 5s rRNA loop e. Biophys J 84(6):3564–3582 101. Auffinger P, Bielecki L, Westhof E (2004) Symmetric K+ and Mg2+ ion-binding sites in the 5 S rRNA loop E inferred from molecular dynamics simulations. J Mol Biol 335(2): 555–571 102. Ra´zga F, Kocˇa J, Sˇponer J, Leontis NB (2005) Hinge-like motions in RNA kink-turns: the role of the second a-minor motif and nominally unpaired bases. Biophys J 88(5): 3466–3485 103. 
Re´blova´ K, Spackova´ N, Sponer JE, Koca J, Sponer J (2003) Molecular dynamics simulations of RNA kissing–loop motifs reveal structural dynamics and formation of cationbinding pockets. Nucleic Acids Res 31(23): 6942–6952

Dynamics of sRNA/mRNA 104. Kulinski T, Olejniczak M, Huthoff H, Bielecki L, Pachulska-Wieczorek K, Das AT, Berkhout B, Adamiak RW, (2003) The apical loop of the HIV-1 TAR RNA hairpin is stabilized by a cross-loop base pair. J Biol Chem 278(40):38,892–38,901 105. Dethoff EA, Hansen AL, Musselman C, Watt ED, Andricioaei I, Al-Hashimi HM (2008) Characterizing complex dynamics in the transactivation response element apical loop and motional correlations with the bulge by NMR, molecular dynamics, and mutagenesis. Biophys J 95(8):3906–3915 106. Gregersen BA, Lopez X, York DM (2003) Hybrid QM/MM study of thio effects in transphosphorylation reactions. J Am Chem Soc 125(24):7178–7179 107. Krasovska MV, Sefcikova J, Re´blova´ K, Schneider B, Walter NG, Sˇponer J (2006) Cations and hydration in catalytic RNA: molecular dynamics of the hepatitis delta virus ribozyme. Biophys J 91(2):626–638 108. Hermann T, Auffinger P, Westhof E (1998) Molecular dynamics investigations of hammerhead ribozyme RNA. Eur Biophys J 27: 153–165 109. Trobro S, Åqvist J (2006) Analysis of predictions for the catalytic mechanism of ribosomal peptidyl transfer. Biochemistry 45(23): 7049–7056 110. Zeng X, Chugh J, Casiano-Negroni A, Al-Hashimi HM, Brooks III CL (2014) Flipping of the ribosomal a-site adenines provides a basis for tRNA selection. J Mol Biol 426(19):3201–3213 111. Satpati P, Åqvist J (2014) Why base tautomerization does not cause errors in mRNA decoding on the ribosome. Nucleic Acids Res 42(20):12,876–12,884 112. Trobro S, Åqvist J (2005) Mechanism of peptide bond synthesis on the ribosome. Proc Natl Acad Sci 102(35):12,395–12,400 113. Vaiana AC, Sanbonmatsu KY (2009) Stochastic gating and drug-ribosome interactions. J Mol Biol 386(3):648–661 114. Romanowska J, Reuter N, Trylska J (2013) Comparing aminoglycoside binding sites in bacterial ribosomal RNA and aminoglycoside modifying enzymes. Proteins Struct Funct Bioinf 81(1):63–80 115. 
Vaiana A, Westhof E, Auffinger P (2006) A molecular dynamics simulation study of an aminoglycoside/a-site RNA complex: conformational and hydration patterns. Biochimie 88(8):1061–1073 116. Sanbonmatsu KY, Joseph S, Tung CS (2005) Simulating movement of tRNA into the

237

ribosome during decoding. Proc Natl Acad Sci 102(44):15,854–15,859 117. Bock LV, Blau C, Schro¨der GF, Davydov II, Fischer N, Stark H, Rodnina MV, Vaiana AC, Grubmu¨ller H (2013) Energy barriers and driving forces in tRNA translocation through the ribosome. Nat Struct Mol Biol 20(12): 1390–1396 118. Whitford PC, Geggier P, Altman RB, Blanchard SC, Onuchic JN, Sanbonmatsu KY (2010) Accommodation of aminoacyl-tRNA into the ribosome involves reversible excursions along multiple pathways. RNA 16: 1196–1204 119. Trabuco LG, Schreiner E, Eargle J, Cornish P, Ha T, Luthey-Schulten Z, Schulten K (2010) The role of L1 stalk–tRNA interaction in the ribosome elongation cycle. J Mol Biol 402(4):741–760 120. Wagner EGH, Romby P (2015) Academic Press, pp 133–208 121. Storz G, Vogel J, Wassarman K (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol Cell 43(6):880–891 122. Vogel J, Luisi BF (2011) Hfq and its constellation of RNA. Nat Rev Microbiol 9(8): 578–589 123. Møller T, Franch T, Højrup P, Keene DR, B€achinger HP, Brennan RG, Valentin-Hansen P (2002) Hfq: a bacterial SM-like protein that mediates RNA-RNA interaction. Mol Cell 9(1):23–30 124. Sonnleitner E, Bl€asi U (2014) Regulation of Hfq by the RNA CrcZ in pseudomonas aeruginosa carbon catabolite repression. PLoS Genet 10(6):e1004,440 125. Krepl M, Dendooven T, Luisi BF, Sponer J (2021) MD simulations reveal the basis for dynamic assembly of Hfq–RNA complexes. J Biol Chem 296 126. Lazar P, Lee YO, Kim SM, Chandrasekaran M, Lee KW (2010) Molecular dynamics simulation study for ionic strength dependence of RNA-host factor interaction in staphylococcus aureus Hfq. Bull Korean Chem Soc 31(6):1519–1526 127. Romeo T, Babitzke P (2018) Global regulation by CsrA and its RNA antagonists. Microbiol Spectrum 6(2):6–2 128. Sobrero PM, Valverde C (2020) Comparative genomics and evolutionary analysis of RNA-binding proteins of the CsrA family in the genus pseudomonas. Front Mol Biosci 7: 127 129. 
Duss O, Michel E, Diarra dit Konte´ N, Schubert M, Allain FHT (2014) Molecular basis for the wide range of affinity found in

238

Agustı´n Ormaza´bal et al.

Csr/Rsm protein–RNA recognition. Nucleic Acids Res 42(8):5332–5346 130. Duss O, Michel E, Yulikov M, Schubert M, Jeschke G, Allain FHT (2014) Structural basis of the non-coding RNA RsmZ acting as a protein sponge. Nature 509(7502):588–592 131. Ormaza´bal A, Palma J, Pierdominici-Sottile G (2021) Molecular dynamics simulations unveil the basis of the sequential binding of

RsmE to the noncoding RNA RsmZ. J Phys Chem B 125(12):3045–3056 132. Zhang Y, Tateishi-Karimata H, Endoh T, Jin Q, Li K, Fan X, Ma Y, Gao L, Lu H, Wang Z, Cho AE, Yao X, Liu C, Sugimoto N, Guo S, Fu X, Shen Q, Xu G, Herrera-Estrella LR, Fan X (2022) Hightemperature adaptation of an OsNRT2.3 allele is thermoregulated by small RNAs. Sci Adv 8(47):eadc9785

Chapter 13

Analysis of sRNAs and Their mRNA Targets in Sinorhizobium meliloti: Focus on Half-Life Determination

Robina Scheuer, Jennifer Kothe, Jan Wähling, and Elena Evguenieva-Hackenberg

Abstract

Regulation of gene expression at the level of RNA and/or by regulatory RNA is an integral part of the regulatory circuits in all living cells. In bacteria, transcription and translation can be coupled, enabling regulation by transcriptional attenuation, a mechanism based on mutually exclusive structures in nascent mRNA. Transcriptional attenuation gives rise to small RNAs that are well suited to act in trans by either base pairing or ligand binding. Examples of 5′-UTR-derived sRNAs in the alpha-proteobacterium Sinorhizobium meliloti are the sRNA rnTrpL of the tryptophan attenuator and SAM-II riboswitch sRNAs. Analyses addressing RNA-based gene regulation often include measurements of steady-state levels and of half-lives of specific sRNAs and mRNAs. Using such measurements, we have recently shown that the tryptophan attenuator responds to translation inhibition by tetracycline and that SAM-II riboswitches stabilize RNA. Here we discuss our experience in using alternative RNA purification methods for analysis of sRNA and mRNA of S. meliloti. Additionally, we show that other translational inhibitors (besides tetracycline) also cause attenuation giving rise to the rnTrpL sRNA. Furthermore, we discuss the importance of considering RNA stability changes under different conditions and describe in detail a robust and fast method for mRNA half-life determination. This method includes rifampicin treatment, RNA isolation using commercially available columns, and mRNA analysis by reverse transcription followed by quantitative PCR (RT-qPCR). The RT-qPCR can be performed as a one-step procedure or in a strand-specific manner using the same commercial kit and a spike-in transcript as a reference.
Key words sRNA, mRNA stability, RNA isolation method, Spike-in transcript, Transcriptional attenuator, Sinorhizobium, egfp reporter

1 Introduction

RNA-based regulation is an important layer of the control of gene expression. Among other processes, transcriptional progress, mRNA translation, and RNA processing and degradation can be regulated by RNA. The underlying mechanisms depend on cis-acting elements in the RNA (alternative secondary structures, regulatory small ORFs, small RNA-binding sites, protein-binding sites, and

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_13, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024


ribonuclease cleavage sites) and on trans-acting factors (respective base-pairing sRNAs, RNA-binding proteins, low molecular weight ligands, and translating ribosomes). As a consequence of RNA regulation, changes in the steady-state level and stability of both the riboregulators and their mRNA targets can be observed. The methods used to detect such changes must take into account the length and abundance of the transcripts of interest. Furthermore, posttranscriptional mechanisms are often studied with reporter constructs, which also need careful validation at the level of RNA. Here, we discuss our experience in RNA analysis of the soil-dwelling alfalfa symbiont Sinorhizobium meliloti, which is based mostly on our studies of the tryptophan attenuator and SAM-II riboswitches.

1.1 Translation Inhibition as an Independent Signal for the Tryptophan Attenuator

The transcriptional attenuator of the tryptophan (Trp) biosynthesis gene trpE(G) in S. meliloti contains a small ORF (sORF), trpL, harboring consecutive Trp codons [1]. As in other bacteria including E. coli, the attenuator RNA harbors the so-called regions 1, 2, 3, and 4, which are capable of alternative base pairing [2, 3]. The structure adopted by the nascent RNA depends on the availability of Trp-charged tRNA. When enough Trp is available, the pioneering ribosome swiftly translates trpL and reaches its stop codon, so that region 2 is blocked and cannot base pair with region 3 (the antiterminator structure is prevented). Therefore, regions 3 and 4 can base pair to form a transcriptional terminator, and transcription of trpE(G) is abolished. Thereby, a small attenuator RNA, rnTrpL, is generated, which has its own functions in destabilizing mRNAs in trans [4]. When Trp is scarce, the ribosome transiently pauses at the Trp codons of trpL, in region 1. In the nascent RNA, this leads to base pairing of regions 2 and 3 (the antiterminator is formed), which prevents the formation of the transcriptional terminator, and trpE(G) is transcribed. Interestingly, regions 1 and 2 can also base pair and form a so-called anti-antiterminator, an event that ensures terminator formation. The base pairing of regions 1 and 2 is conserved in Gamma- and Alpha-proteobacteria, although it is not needed for trp gene regulation in response to Trp availability [3]. Indeed, the anti-antiterminator can be formed only when translation is abolished, which explains the super-attenuation observed in enteric bacteria when the ribosome-binding site of trpL is mutated [5]. This suggested that under conditions of translation inhibition, transcription attenuation takes place independently of the cellular Trp level. We tested this recently, showing that in S. meliloti, the trp attenuator responds to tetracycline (Tc) exposure: Under conditions of Trp starvation, when trpL and trpE(G) are mostly co-transcribed, the attenuator sRNA rnTrpL accumulated 10 min after the addition of a subinhibitory amount of Tc, indicating attenuation [6]. This Tc-induced attenuation was demonstrated using a ΔtrpC mutant of S. meliloti 2011, which was grown under Trp starvation


Fig. 1 Northern blot hybridization shows antibiotic-induced transcription attenuation at the trp attenuator of Sinorhizobium meliloti, which leads to the accumulation of the sRNA rnTrpL. The pre-culture of strain 2011 ΔtrpC p was grown overnight in rich TY medium [6]. It was diluted to an OD (at 600 nm) of 0.3 in minimal GMS medium [6] supplemented with 20 μg/ml L-Trp and grown for 2.5 h to an OD of 0.5. After centrifugation, the cells were resuspended in GMS medium containing 2 μg/ml L-Trp and incubated with shaking at 30 °C for 4 h. Then one of the following antibiotics was added (the final, subinhibitory concentration is given): tetracycline (Tc, 1.5 μg/ml), chloramphenicol (Cl, 9 μg/ml), kanamycin (Km, 45 μg/ml), or erythromycin (Em, 27 μg/ml). To the control culture (C), 70% ethanol was added in a volume corresponding to the highest volume used to add an antibiotic dissolved in ethanol (80 μl 70% ethanol per 30 ml culture). Ten minutes after the addition of antibiotic or ethanol, the cultures were harvested, RNA was purified using the adapted TRIzol method (see the text), and 30 μg RNA per lane was separated in a 10% polyacrylamide–urea gel, blotted, and hybridized with a radioactively labeled oligonucleotide. The membranes were re-hybridized with a 5S rRNA-detecting probe (loading control). Shown are the hybridization from one of the three independent biological experiments used for quantification and the quantification results from all three experiments (mean fold change in comparison to the control and standard deviation)
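The quantification summarized in the legend of Fig. 1 (signal normalized to the 5S rRNA loading control, expressed as fold change relative to the control lane, then averaged over three independent experiments) can be sketched as follows. This is only an illustrative sketch: the function names and all band intensities are hypothetical, and the chapter does not state which software or exact normalization procedure was used.

```python
from statistics import mean, stdev

def fold_changes(signal, loading, control="C"):
    """Normalize each lane to its loading control (e.g., 5S rRNA),
    then express it relative to the normalized control lane."""
    normalized = {lane: signal[lane] / loading[lane] for lane in signal}
    return {lane: value / normalized[control] for lane, value in normalized.items()}

def summarize(replicates, control="C"):
    """Mean and sample standard deviation of the fold change per lane
    across independent experiments."""
    per_experiment = [fold_changes(sig, load, control) for sig, load in replicates]
    lanes = list(per_experiment[0])
    return {
        lane: (mean([fc[lane] for fc in per_experiment]),
               stdev([fc[lane] for fc in per_experiment]))
        for lane in lanes
    }

# Hypothetical band intensities in arbitrary units (NOT data from the chapter).
# Each tuple holds (rnTrpL signal per lane, 5S rRNA signal per lane).
replicates = [
    ({"C": 100.0, "Tc": 400.0}, {"C": 1000.0, "Tc": 1000.0}),
    ({"C": 120.0, "Tc": 600.0}, {"C": 1200.0, "Tc": 1000.0}),
    ({"C": 90.0, "Tc": 450.0}, {"C": 900.0, "Tc": 900.0}),
]

summary = summarize(replicates)
for lane, (avg, sd) in summary.items():
    print(f"{lane}: fold change {avg:.2f} +/- {sd:.2f}")
```

With these made-up intensities, the Tc lane gives a mean fold change of about 5 with a standard deviation of about 1, while the control lane is 1 by construction. Normalizing within each experiment before averaging keeps blot-to-blot differences in exposure from inflating the standard deviation.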

conditions, in a minimal medium with 2 μg/ml Trp [1, 6]. Figure 1 shows that under these growth conditions, strong accumulation of rnTrpL (and thus increased attenuation) takes place upon exposure of S. meliloti 2011 ΔtrpC cultures to chloramphenicol, while kanamycin and erythromycin had a weaker but statistically significant effect. Furthermore, Fig. 1 shows that Tc-induced accumulation of rnTrpL was also detected in the prototrophic parental strain 2011, supporting the view that the trp attenuator responds to translation inhibition as an independent signal for downregulation of the downstream gene and sRNA induction.

1.2 Considerations of Methods for RNA Isolation and RNA Stability Determination

We use three different methods to isolate RNA from S. meliloti: (1) TRIzol (our adapted method includes an additional hot phenol extraction), (2) hot phenol, and (3) RNeasy Mini Kit (Qiagen). With the limitations discussed below, the three methods allow for comparable results in downstream applications. The decision of


which method to use depends on the length and abundance of the transcript of interest, the method for further analysis, the number of samples to be processed, and considerations on how to harvest and lyse the cells to avoid RNA degradation artifacts. From 1 ml culture at an OD600 nm of 0.5, we usually obtain approximately 8 μg RNA using the RNeasy Mini columns, while approximately 30–50 μg RNA is obtained from 15 ml culture using either TRIzol or hot phenol.

Bacterial RNA is highly unstable, and its stability changes with changing environmental conditions, including the oxygen tension [7, 8]. Thus, when studying RNA, it is very important to keep the bacterial culture constantly under the chosen experimental conditions. For example, when shaking of the culture is stopped to take a sample for OD measurement, to add an inducing agent, or to harvest a sample, the oxygen tension rapidly drops, and this could change the stability and abundance of certain transcripts. Therefore, we either withdraw 0.5 ml or 1 ml samples from shaking cultures and directly mix them with RNAprotect Bacteria Reagent (Qiagen) to preserve RNA, or quickly pour the whole bacterial culture (usually 30 ml at an OD600 nm of 0.5) into a 50 ml centrifuge tube filled with approximately 20 g ice rocks. This immediate cooling efficiently stops RNA degradation. The time at 4 °C between starting the harvest by pouring the culture onto ice rocks and resuspending the cells in TRIzol or applying hot phenol should be minimized (max. 30 min). We do not use a self-made stop solution containing ethanol and phenol, because the reusable centrifuge tubes would then need to be decontaminated.

To analyze small RNAs, we use RNA purified by TRIzol, because with this method large RNAs are isolated with lower efficiency, while short transcripts are enriched (Fig. 2) [9, 10].
Without this enrichment, it would be difficult to analyze the sRNA rnTrpL under Trp starvation conditions: In the example shown in Fig. 1, 30 μg TRIzol-isolated RNA per lane was used for the Northern blot hybridization with a radioactively labeled oligonucleotide, and some of the signals were at the limit of detection. The detection sensitivity can be increased by using internally labeled antisense RNA as a probe, but oligonucleotide probes are easier and cheaper to use. Routinely, we use 1 ml TRIzol to isolate RNA from 15 ml of S. meliloti culture (OD600 nm = 0.5). If the cells were frozen in liquid nitrogen after harvesting and stored at -80 °C (for a maximum of 3 days), we resuspend the frozen cells directly in TRIzol. The RNA obtained is not free of ribonucleases and is therefore additionally purified using hot phenol (see below). For this, each RNA sample is mixed with phenol immediately after the RNA is dissolved in cold water.

Hot phenol purification is still the best method to quantitatively isolate cellular RNA. Among other applications, this is important for the


Fig. 2 Ethidium bromide-stained RNA of Sinorhizobium meliloti 2011, separated in a 10% polyacrylamide–urea gel. Loaded amounts are indicated. HP: RNA isolated by the hot phenol method. TRI: RNA isolated using TRIzol (and an additional hot phenol treatment). rRNA bands are marked on the right side. Rhizobial 23S rRNA is naturally processed into a 5.8S rRNA-like 5′-fragment and a 2.6 kb 3′-fragment [17]. Shown is an inverse image
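The approximate yields quoted in Sect. 1.2 for the isolation methods can be put on a common per-milliliter basis to estimate how much culture is needed for a given experiment. The sketch below is plain arithmetic on the values given in the text (about 8 μg from 1 ml with RNeasy Mini columns, about 30–50 μg from 15 ml with TRIzol or hot phenol, 30 μg per Northern blot lane). The function names are hypothetical, and actual yields will vary with the strain, growth phase, and handling.

```python
def yield_per_ml(total_ug, culture_ml):
    """RNA yield per ml of culture (µg/ml), for cultures at OD600 of about 0.5."""
    return total_ug / culture_ml

def culture_needed_ml(target_ug, ug_per_ml):
    """Culture volume (ml) needed to obtain target_ug of RNA."""
    return target_ug / ug_per_ml

# Approximate values quoted in the text
rneasy = yield_per_ml(8.0, 1.0)            # RNeasy Mini column, 1 ml culture
trizol_low = yield_per_ml(30.0, 15.0)      # TRIzol or hot phenol, lower bound
trizol_high = yield_per_ml(50.0, 15.0)     # TRIzol or hot phenol, upper bound

# Culture volume needed for one 30 µg Northern blot lane (as in Fig. 1)
lane_ug = 30.0
volume_max = culture_needed_ml(lane_ug, trizol_low)
volume_min = culture_needed_ml(lane_ug, trizol_high)

print(f"RNeasy: {rneasy:.1f} µg/ml")
print(f"TRIzol/hot phenol: {trizol_low:.1f}-{trizol_high:.1f} µg/ml")
print(f"Culture per 30 µg lane: {volume_min:.0f}-{volume_max:.0f} ml")
```

On these numbers, a single 30 μg lane of TRIzol-isolated RNA corresponds to roughly 9–15 ml of culture, which is consistent with the routine use of 15 ml culture per TRIzol preparation. The per-milliliter figures are not a like-for-like efficiency comparison, since the methods differ in the size classes of RNA they recover.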

analysis of RNAs with markedly different lengths, which share a common transcription start site (TSS), such as a small attenuator RNA and the corresponding read-through mRNA (for example, the SAM-II riboswitch containing sRNAs and the corresponding mRNAs metA and metZ in S. meliloti) [11]. However, because of the high proportion of large ribosomal RNA in total RNA purified with this method (Fig. 2; rRNA often constitutes approximately 90% of the cellular RNA), low-abundance sRNAs and mRNAs often cannot be detected by Northern blot hybridization. Nonetheless, Northern blot hybridization is the best way to delineate the two transcripts and ultimately the way to obtain reliable data for the small RNA, while the read-through mRNA can be analyzed by RT-qPCR with primers directed to the coding sequence. To exclude that the chosen primer pair detects an abundant degradation product instead of the full-length transcript, several primer pairs can be used in parallel. Finally, when mRNA half-lives are to be determined using reverse transcription followed by real-time (quantitative) PCR (RT-qPCR), RNA purification with RNeasy Mini columns saves a lot of time. This method enriches RNAs larger than 200 nt and is not suitable for sRNA co-purification without protocol adjustment. However, the sRNA depletion (including tRNA) increases the relative abundance of typical mRNAs, which is advantageous in this case. Determination of mRNA stability is an important aspect in the functional analysis of base-pairing sRNAs that downregulate gene expression, since the question often arises whether an sRNA destabilizes an mRNA and whether sRNA and target are co-degraded [12–14]. Since the steady-state level of RNA results


Robina Scheuer et al.

Fig. 3 Half-life determination of the reporter mRNA egfp revealed a difference in the PsinI promoter strength in Sinorhizobium meliloti cultures grown in rich or in minimal media, although the reporter gene expression suggested no difference. (a) Scheme of the promoter fusion construct on plasmid pPsinI–egfp. The PsinI promoter was fused to an egfp reporter gene with S. meliloti codon usage [18]. The hatched box represents a synthetic 5′-UTR containing a typical Shine–Dalgarno sequence [11]. (b) S. meliloti 2011 pPsinI–egfp cultures were grown to an OD600 nm of 0.5 in rich (TY) or GMS minimal medium (MM) and fluorescence was measured. After normalization to the autofluorescence of an empty vector control, no significant difference was detected between the fluorescence in the two media. (c) From the cultures described in (b), total RNA was isolated and analyzed by RT-qPCR using egfp-specific primers and a spike-in control. Essentially no difference was detected between the steady-state amounts of the reporter mRNA in cultures grown under the two different conditions. (d) Stability determination of the reporter mRNA in the cultures described in (c) revealed a difference: The half-life of the egfp mRNA was 3.5 min in TY cultures and 1.3 min in MM cultures (the red lines show the time point after rifampicin addition at which the mRNA amount reached 50% of the original value). We conclude that the similar steady-state egfp mRNA level under both growth conditions originates from opposite differences in promoter activities and mRNA stabilities: in TY cultures, the egfp mRNA stability is higher and the promoter activity lower than in MM cultures. This prevents the use of the sinI promoter as a constitutively acting promoter in reporter fusions aiming to compare gene expression in TY and MM cultures

from its transcription and its degradation, an unchanged RNA abundance under different conditions can result from compensatory changes in transcription and decay (Fig. 3). Thus, without RNA stability analysis, important aspects in gene regulation can be missed, leading to wrong conclusions and unsuitable experimental strategies. For sRNA and/or mRNA half-life determination by RT-qPCR, hot phenol-purified RNA can be used, while TRIzol-purified RNA should be used instead for sRNA half-life determination by Northern blot hybridization. We are using a robust and fast method for mRNA half-life determination that includes rifampicin treatment, RNA isolation using RNeasy Mini columns, and RT-qPCR analysis with a commercial one-step reaction kit. As a reference in the RT-qPCR, we are using a spike-in transcript [11]. The mRNA half-life determination method is described below.
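The compensation argument illustrated in Fig. 3 can be made concrete with a small calculation. The sketch below assumes a simple first-order model (constant synthesis, exponential decay, dilution by growth ignored), which is our simplification and not part of the protocol; the function names are hypothetical, and the two half-lives are those given in Fig. 3d.

```python
import math


def decay_constant(half_life_min):
    """First-order decay constant (per min) for a given half-life in min."""
    return math.log(2) / half_life_min


def steady_state_level(synthesis_rate, half_life_min):
    """Steady-state level for constant synthesis and first-order decay."""
    return synthesis_rate / decay_constant(half_life_min)


def synthesis_ratio_for_equal_levels(half_life_a, half_life_b):
    """Fold difference in synthesis rate (b over a) giving equal steady-state levels."""
    return half_life_a / half_life_b


# Half-lives of the egfp mRNA from Fig. 3d: 3.5 min (TY) and 1.3 min (MM).
ratio_mm_over_ty = synthesis_ratio_for_equal_levels(3.5, 1.3)
print(f"Synthesis rate in MM relative to TY: {ratio_mm_over_ty:.2f}-fold")
```

Under these assumptions, equal steady-state levels with a 3.5 min versus a 1.3 min half-life imply a roughly 2.7-fold higher synthesis rate in MM cultures, which is the kind of difference that steady-state measurements alone cannot reveal.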

2 Materials

2.1 Cultivation and Harvest

1. S. meliloti culture in an Erlenmeyer flask with a (very) wide neck (see Note 1).
2. Rifampicin stock solution (600 mg/ml) in methanol, freshly prepared (see Note 2).
3. 1.5-ml polypropylene reaction tubes (Eppendorf tubes), filled with 1 ml RNAprotect Bacteria Reagent (Qiagen).

2.2 RNA Isolation with Spike-In Transcript, Using RNeasy Mini Columns

1. RNeasy Mini Kit (Qiagen).
2. TE-buffer with lysozyme: 10 mM Tris–HCl, 1 mM EDTA, 1 mg/ml lysozyme, pH 8.0. Lysozyme is added to the TE buffer directly before use.
3. Spike-in transcript (1 ng/μl, approximately 700 nt long; see Note 3).
4. β-mercaptoethanol.
5. 96% ethanol.
6. Ultrapure, nuclease-free water (hereafter: ddH2O; see Note 4).

2.3 DNase Treatment

1. TURBO DNA-free™ Kit (Thermo Fisher Scientific).
2. ddH2O (see Note 4).
3. 50 mM EDTA, pH 8.0.
4. 3 M sodium acetate, pH 5.2.
5. 96% ethanol, stored at -20 °C.

2.4 RT-qPCR

1. Brilliant III Ultra-Fast SYBR® Green RT-QPCR Master Mix kit (Agilent).
2. Forward and reverse primers (each 10 pmol/μl, dissolved in ddH2O) for detection of the mRNA of interest and of the spike-in transcript (see Note 5).
3. ddH2O (see Note 4).
4. Four DNA-free RNA samples (20 ng/μl) that correspond to the time points of the analysis (see Fig. 4).
5. Eighteen 0.2-ml reaction tubes and corresponding caps (low-profile thin-walled 8-tube strips and optically clear flat 8-cap strips, Thermo Fisher Scientific).
6. Real-time PCR machine, e.g., CFX Connect (BioRad).

Fig. 4 Scheme of the experimental setup for half-life determination of mRNA by RT-qPCR. (a) At OD600 nm of 0.5, rifampicin is added to stop transcription. Culture samples are withdrawn at the indicated time points. A spike-in transcript is added at the beginning of the RNA purification procedure. After DNase treatment, the RNA concentration is adjusted to 20 ng/μl. (b) Two Reaction Master Mixes (RMMixes) are prepared. As indicated, they differ in the primers used for the RT-qPCR analysis. (c) Each RMMix is distributed to 0.2 ml reaction tubes, and then RNA is added as indicated. Reactions are performed in technical duplicates. Negative controls (NTC) containing water instead of RNA are always included
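The tube count and the master mix volumes follow directly from the layout in Fig. 4 and the per-reaction composition given in Subheading 3.4. The sketch below is only a bookkeeping aid with hypothetical names; the per-reaction volumes are those stated in Subheading 3.4, and RNA (2 μl) is added to each tube separately.

```python
import math

# Per-reaction volumes in microliters, as listed in Subheading 3.4 (RNA excluded).
PER_REACTION_UL = {
    "ddH2O": 0.1,
    "DTT (100 mM)": 0.1,
    "reverse primer (10 pmol/ul)": 1.0,
    "forward primer (10 pmol/ul)": 1.0,
    "reference dye (diluted 1:50)": 0.3,
    "RT/RNase block": 0.5,
    "2x SYBR Green QPCR Master Mix": 5.0,
}
RNA_UL = 2.0  # added to each tube after distributing the master mix


def master_mix(n_reactions):
    """Scale the per-reaction volumes to a master mix for n_reactions."""
    return {name: round(vol * n_reactions, 2) for name, vol in PER_REACTION_UL.items()}


def tubes_needed(n_time_points=4, n_replicates=2, n_primer_pairs=2, n_ntc=1):
    """Number of reaction tubes: samples plus no-template controls per primer pair."""
    return n_primer_pairs * (n_time_points * n_replicates + n_ntc)


mix = master_mix(10)
for name, vol in mix.items():
    print(f"{name}: {vol} ul")
print(f"total master mix: {sum(mix.values())} ul")
print(f"tubes needed: {tubes_needed()}")
```

Preparing each mix for ten reactions while only nine are used (eight samples plus one NTC) leaves a small surplus, which the protocol discards.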

3 Methods

3.1 Cultivation and Harvest

1. Dilute an S. meliloti overnight culture to an OD600 nm of 0.2 in pre-warmed fresh medium (see Note 6) and cultivate the diluted culture at 30 °C and 140 rpm in an incubation room (see Note 7). Use an Erlenmeyer flask with a wide neck (see Note 1).
2. Prepare 1.5-ml reaction tubes with 1 ml RNAprotect Bacteria Reagent (Qiagen) for each time point of the analysis (time point 0 and at least three time points after the addition of rifampicin; see Note 8).
3. Measure the OD600 nm of the culture 60 min after inoculation and then after an appropriate time. At OD600 nm of 0.45–0.5 (exponential growth phase), withdraw time point 0 samples (before rifampicin addition; see Note 9): Add 0.5 ml of the culture to the prepared 1.5-ml reaction tube containing 1 ml RNAprotect Bacteria Reagent and mix thoroughly. Withdraw the 0.5 ml culture sample without switching off the shaker and work fast (see Note 1).
4. Add rifampicin to a final concentration of 0.6 mg/ml to the shaking culture (e.g., add 30 μl rifampicin stock solution to 30 ml culture) (see Note 1).
5. Withdraw 0.5 ml culture samples at three time points after rifampicin addition (e.g., 1 min, 2 min, and 4 min), add each sample immediately to the prepared 1.5-ml reaction tubes containing 1 ml RNAprotect Bacteria Reagent and mix thoroughly. Withdraw the culture samples without switching off the shaker (see Note 1).
6. Mix well by vortexing and incubate for 5 min at room temperature.
7. Centrifuge for 10 min at 5000 × g and 20 °C.
8. Decant the supernatant carefully and remove the remaining liquid with a pipette. The pellet might not be visible. Pellets can be stored at -80 °C for up to 4 weeks.

3.2 RNA Isolation (According to the RNeasy Mini Kit Protocol) and Spike-In

1. Resuspend the bacterial pellet in 100 μl TE-buffer containing lysozyme (see Note 10). Incubate for 10 min and vortex every 2 min for 10 s.
2. Add 350 μl RLT-buffer (provided in the RNeasy kit and supplemented with 10 μl/ml β-mercaptoethanol) and vortex.
3. Add 1 ng spike-in transcript to the sample and vortex (1 ng/μl in vitro transcript; see Note 3).
4. Add 250 μl 96% ethanol and mix by pipetting.
5. Load 700 μl to an RNeasy Mini spin column placed in a 2 ml collection tube (provided in the RNeasy kit). Centrifuge for 15 s at 8000 × g. Discard the flow-through.
6. Add 700 μl RW1 buffer (provided in the RNeasy kit) to the spin column. Centrifuge for 15 s at 8000 × g. Discard the flow-through.
7. Add 500 μl RPE-buffer (provided in the RNeasy kit as a concentrate; must be diluted with ethanol) to the spin column. Centrifuge for 15 s at 8000 × g. Discard the flow-through.
8. Add 500 μl RPE-buffer to the spin column. Centrifuge for 2 min at 8000 × g. Discard the flow-through.
9. Dry the column by centrifugation for another 1 min at 17,000 × g (full speed of the tabletop centrifuge).


10. Place the spin column in a new 1.5 ml reaction tube. Add 30 μl ddH2O (see Note 4) directly to the spin column membrane and incubate for 1 min at room temperature. Centrifuge at 8000 × g for 1 min.
11. Apply 20 μl ddH2O to the column while reusing the 1.5 ml reaction tube (to collect the elution samples) and centrifuge at 8000 × g for 1 min. Repeat this step.
12. Analyze the concentration, purity, and integrity of the eluted RNA. We use 1.5 μl of the RNA sample to measure, on a NanoDrop instrument, the absorbance at 260 nm (an absorbance of 1 corresponds to 40 ng/μl RNA) and at 280 nm (an absorbance ratio 260 nm/280 nm of 2.0 indicates pure RNA). Two micrograms of the purified RNA can be used to check the integrity of stable RNAs (ribosomal RNA) in a 1% agarose–formaldehyde gel.
13. Aliquot the RNA sample to avoid contamination by repeated opening of one tube and store at -20 °C.

3.3 DNase Treatment

To remove residual DNA, maximally 10 μg RNA is treated with 1 μl TURBO™ DNase (2 U/μl) in a reaction volume of 50 μl.

1. Calculate the needed volumes of RNA sample and water, considering 5 μl 10× TURBO™ DNase buffer (provided in the TURBO™ DNase kit) and 1 μl enzyme.
2. In a sterile reaction tube, pipette in the following order: (1) ddH2O, (2) 10× buffer, (3) RNA sample (mix by pipetting up and down 2–3 times), (4) DNase. With a pipette adjusted to 30 μl, mix by pipetting up and down 2–3 times to ensure even distribution of the enzyme. Avoid bubbles and spin down in the centrifuge if necessary. If several RNA samples are treated in parallel, prepare a master mix containing the water, the 10× buffer, and the enzyme, mix well, distribute to the reaction tubes and, to each of them, add the same volume of RNA sample containing maximally 10 μg RNA (see Note 11).
3. Incubate for 30–60 min at 37 °C.
4. Spin down the condensate by short centrifugation.
5. Stop the reaction by adding 15 μl 50 mM EDTA (final concentration 11.5 mM) and incubate for 10 min at 75 °C.
6. Fill up the sample to 200 μl with ddH2O. Add 1/10 volume 3 M sodium acetate (20 μl) and mix by pipetting up and down. Add 2.5 volumes of 96% ethanol (500 μl) to precipitate the RNA.
7. Precipitate the RNA overnight at -20 °C, centrifuge at 17,000 × g for 30 min at 4 °C, remove the liquid with a 1-ml automatic pipette, add 1 ml cold 75% ethanol to wash the RNA pellet, centrifuge again at 17,000 × g for 5 min at 4 °C, remove the liquid with the 1-ml pipette, spin down again and remove the residual liquid with a 200-μl pipette, dry the RNA pellet in opened tubes under the hood for 20 min, dissolve in 50 μl ddH2O, and measure the concentration of RNA. Store at -20 °C.
8. Check the success of the DNase treatment by PCR; include a negative control containing water instead of RNA and a positive control containing a DNA template.
9. Adjust the concentration of the DNA-free RNA sample to 20 ng/μl, aliquot and store at -20 °C.

3.4 RT-qPCR

An example of a typical RT-qPCR experiment is given below (see Fig. 4). We are using the SYBR® Green RT-QPCR kit from Agilent. Each RT-qPCR reaction sample with a final volume of 10 μl contains 0.1 μl ddH2O (see Note 4), 0.1 μl DTT (100 mM stock solution provided in the RT-QPCR kit), 1 μl forward primer, 1 μl reverse primer (each 10 pmol/μl), 0.3 μl reference dye (provided in the RT-QPCR kit and freshly diluted 1:50 in ddH2O), 0.5 μl solution containing reverse transcriptase and RNase inhibitor (RT/RNase block, provided in the RT-QPCR kit), 5 μl 2× SYBR Green QPCR Master Mix (provided in the RT-QPCR kit; contains polymerase and dNTPs), and 2 μl RNA (DNA-free, 20 ng/μl). The reactions are performed in technical duplicates.

To determine the half-life of a specific mRNA under given conditions, two Reaction Master Mixes (RMMixes) are prepared. In the example below, each RMMix corresponds to ten 10 μl reaction samples and contains a primer pair (a forward and a reverse primer) but no RNA. The primer pair in RMMix 1 detects the mRNA of interest; the primer pair in RMMix 2 detects the spike-in reference transcript (Fig. 4).

1. Thaw a ddH2O aliquot (see Note 4), the primer solutions (10 pmol/μl), RNA samples (20 ng/μl), DTT stock solution (100 mM), and 2× SYBR Green QPCR Master Mix (provided in the RT-QPCR kit) on ice (see Note 12).
2. Dilute the reference dye provided in the RT-QPCR kit 1:50 in ddH2O and keep it on ice (see Note 12).
3. Assemble on ice an RMMix for 10 × 10 μl reaction samples by pipetting in the following order: (1) 1 μl ddH2O, (2) 1 μl DTT, (3) 10 μl reverse primer, (4) 10 μl forward primer (see Note 13), (5) 3 μl diluted reference dye, (6) 5 μl RT/RNase block, and (7) 50 μl 2× SYBR Green QPCR Master Mix. Mix well by carefully pipetting up and down, avoiding bubbles.
4. Distribute 8 μl of the RMMix to eight RT-qPCR tubes placed on ice (for the four time points of the analysis, each in technical duplicates; see Note 14). Pipette 2 μl RNA to the upper part of each tube (Fig. 4). Carefully attach the cap and spin down in a mini centrifuge with a PCR-strip rotor.
5. Use 8 μl of the remaining RMMix for a negative control (NTC) containing ddH2O instead of RNA (Fig. 4). The residual RMMix is discarded.
6. Switch on the RT-qPCR machine and start the program (Table 1). As soon as the lid of the RT-qPCR machine reaches 70 °C (approximately 2 min), place the tubes inside to start the one-step RT-qPCR: reverse transcriptase reaction for 10 min at 50 °C, followed by the qPCR (Table 1; see Note 13).

Table 1 One-step RT-qPCR program

Step  Temperature  Time    Remark
1.    50 °C        10 min  Reverse transcriptase (RT) reaction
2.    95 °C        3 min   Inactivation of the RT enzyme
3.    94 °C        5 s     Start of the qPCR
4.    56 °C        5 s
5.    60 °C        5 s     Plate reading
6.    Steps 3–5, 39 cycles
7.    95 °C        10 s
8.    Melting curve from 65 to 95 °C, increase of 0.5 °C every 5 s and plate reading

Constant lid temperature 105 °C to avoid condensation during reaction.

3.5 Cq Determination

The quantification cycle (Cq) value is the point where the amplification crosses the threshold. The threshold is in the logarithmic phase of the amplification, at a position where all amplification curves are in parallel. In general, the threshold is set automatically (by the used program in the real-time PCR detection system) at the maximum fluorescence intensity in the logarithmic phase.
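In practice, the Cq is reported by the software of the real-time PCR detection system. Purely to illustrate what "crossing the threshold" means, the following sketch interpolates the fractional cycle at which a baseline-corrected amplification curve reaches a chosen threshold. The function name, the interpolation on log-transformed signal, and the synthetic curves are our assumptions, not the instrument's algorithm.

```python
import math


def cq_from_curve(fluorescence, threshold):
    """Fractional cycle at which the signal first reaches the threshold.

    fluorescence[i] is the baseline-corrected signal after cycle i + 1.
    Interpolation is linear on log10(signal) between the flanking cycles.
    Returns None if no crossing is found within the data.
    """
    for i in range(1, len(fluorescence)):
        lo, hi = fluorescence[i - 1], fluorescence[i]
        if 0 < lo < threshold <= hi:
            frac = (math.log10(threshold) - math.log10(lo)) / (
                math.log10(hi) - math.log10(lo)
            )
            # Index i - 1 corresponds to cycle i, so the crossing lies at i + frac.
            return i + frac
    return None


# Synthetic curves with perfect doubling per cycle (hypothetical data).
curve_1x = [1e-6 * 2 ** cycle for cycle in range(1, 41)]
curve_2x = [2e-6 * 2 ** cycle for cycle in range(1, 41)]

cq_1x = cq_from_curve(curve_1x, threshold=0.01)
cq_2x = cq_from_curve(curve_2x, threshold=0.01)
print(round(cq_1x, 2), round(cq_2x, 2), round(cq_1x - cq_2x, 2))
```

With perfect doubling, twice the template amount crosses the same threshold exactly one cycle earlier, which is the relationship used for the primer efficiency evaluation.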

3.6 Primer Efficiency Evaluation

We are using RT-qPCR reactions with serial RNA dilutions (160, 80, 40, 20, and 10 ng/μl) for primer efficiency determination. Primer pair efficiency is two (or 100%) if a twofold difference in the RNA amount results in a difference of 1 Cq (all amplification curves should be in parallel; see also https://toptipbio.com/calculate-primer-efficiencies/). The amplification efficiencies can also be evaluated directly from the amplification curves by the software of the real-time PCR detection system.
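The efficiency can be derived from the dilution series by regressing Cq on log2 of the RNA amount: a slope of -1 corresponds to an efficiency of 2. The sketch below implements this standard calculation with a plain least-squares fit; the function name and the Cq values are hypothetical and serve only as an example.

```python
import math


def primer_efficiency(amounts, cq_values):
    """Return (efficiency, slope) from a dilution series.

    The slope is that of Cq versus log2(amount); efficiency = 2 ** (-1 / slope),
    so that a slope of -1 gives an efficiency of 2 (100%).
    """
    x = [math.log2(a) for a in amounts]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq_values))
    slope = sxy / sxx
    return 2 ** (-1 / slope), slope


# RNA concentrations of the dilution series (ng/ul) and hypothetical mean Cq values.
amounts = [160, 80, 40, 20, 10]
cq_ideal = [18.0, 19.0, 20.0, 21.0, 22.0]

efficiency, slope = primer_efficiency(amounts, cq_ideal)
print(f"slope = {slope:.2f}, efficiency = {efficiency:.2f}")
```

An efficiency below 2 shows up as a Cq spacing of more than one cycle per twofold dilution; the resulting value can be entered as E in the Pfaffl formula.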

3.7 Evaluation of mRNA Half-Life

1. Calculate the fold changes of the mRNA amounts compared to the time point 0 s by using the Pfaffl formula and a spike-in as a reference:

$$\text{fold change} = \frac{E_{\text{target}}^{\,\Delta Cq_{\text{target}}\,(0\,\text{s} - x\,\text{s})}}{E_{\text{reference}}^{\,\Delta Cq_{\text{reference}}\,(0\,\text{s} - x\,\text{s})}}$$

Pfaffl formula to calculate the fold changes in the RNA amount in relation to the time point 0 s. E is the primer efficiency. Target is the RNA of interest and reference is the spike-in transcript; x s is the time point after rifampicin addition [15].
2. The mRNA amount at the time point 0 s (before adding rifampicin) is defined as 100%. Calculate the relative amounts of mRNA at the time points after rifampicin addition in %.
3. The mRNA half-life is determined by using a semi-logarithmic diagram (RNA amount in relation to time): the time point at which 50% of the RNA amount is reached corresponds to the half-life (Fig. 5).

Fig. 5 Example of mRNA half-life determination. Shown is data for metA mRNA in S. meliloti 2011 grown in a minimal medium [11]. (a) Cq values for the mRNA and the spike-in control (determined in technical duplicates for each time point), and the calculated fold changes in relation to the time point 0. (b) Graphical determination of the mRNA half-life. The y-axis, showing the relative mRNA amount in %, is in a logarithmic scale; the x-axis, showing the time, is linear. Red lines show that 50% of the metA mRNA amount was reached approximately 35 s after rifampicin addition. That is, considering the four time points, the half-life of this mRNA was 35 s
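The evaluation can also be scripted. The sketch below applies the Pfaffl formula to mean Cq values and then estimates the half-life from a least-squares line through ln(relative amount) versus time; this fit is our substitute for the graphical read-off from the semi-logarithmic diagram described above. All names and Cq values are hypothetical and are not data from the chapter.

```python
import math


def pfaffl_fold_change(e_target, cq_target_0, cq_target_x,
                       e_reference, cq_reference_0, cq_reference_x):
    """Fold change of the target at time x relative to time 0 (Pfaffl formula)."""
    numerator = e_target ** (cq_target_0 - cq_target_x)
    denominator = e_reference ** (cq_reference_0 - cq_reference_x)
    return numerator / denominator


def fit_half_life(times_s, relative_amounts):
    """Half-life (s) from a least-squares line through ln(amount) versus time."""
    y = [math.log(a) for a in relative_amounts]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(y) / n
    sxy = sum((t - mean_t) * (v - mean_y) for t, v in zip(times_s, y))
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    slope = sxy / sxx  # negative for a decaying RNA
    return -math.log(2) / slope


# Hypothetical mean Cq values of technical duplicates at each time point.
times_s = [0, 60, 120, 240]
cq_target = [20.0, 21.0, 22.0, 24.0]
cq_spike = [18.0, 18.0, 18.0, 18.0]

fold_changes = [
    pfaffl_fold_change(2.0, cq_target[0], ct, 2.0, cq_spike[0], cs)
    for ct, cs in zip(cq_target, cq_spike)
]
percent = [100.0 * f for f in fold_changes]  # time point 0 is defined as 100%
half_life_s = fit_half_life(times_s, percent)
print(percent)
print(round(half_life_s, 1))
```

A fit over all time points assumes single-exponential decay throughout the time course; if later points deviate (see Note 8), restricting the fit to the early time points or reading the 50% value from the plot may be more appropriate.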

4 Notes

1. The wide neck of the Erlenmeyer flask allows for withdrawing culture samples while the culture is still shaking. Interrupted shaking can affect mRNA decay because of a change in the oxygen tension.
2. For 100 ml culture, weigh 60 mg rifampicin and dissolve it in 100 μl methanol. Shake up the antibiotic powder before adding the methanol and vortex immediately until no clumps are visible. Adding a drop of 5 M NaOH helps to dissolve it. Keep at 4 °C in the dark until use. Prepare the solution approximately 1 h before use.
3. We are using a 718 nt in vitro transcript corresponding to a part of the Rhodobacter sphaeroides crtA gene [11]. The crtA gene was chosen because it is involved in photosynthesis and has no homolog in S. meliloti, while the GC content of R. sphaeroides and S. meliloti is similarly high. The crtA amplicon of the RT-qPCR has a length of 174 nt.
4. Aliquot purchased ultrapure, nuclease-free water (ddH2O) at the first opening of the flask into Eppendorf tubes under sterile conditions, store the aliquots at -20 °C, and avoid repeated opening. We recommend using sterile, filter-containing pipette tips.
5. The reverse primer ("right" primer) is complementary to the analyzed transcript and is used in the reverse transcription (RT) step to synthesize cDNA. The forward primer ("left" primer) is complementary to the cDNA. Both primers (a primer pair) are used in the qPCR reaction. For primer design, we are using Primer3 (https://primer3.ut.ee/) with the default parameters; the preferred amplicon size is 150–250 bp.
6. If the bacteria experience cold stress, the lag phase of the diluted culture will be prolonged.
7. Using an incubation room (in contrast to an incubator in a colder room) helps to maintain the temperature constant during the handling of the culture.
8. Preferably, time points up to 10 min after rifampicin addition should be used. S. meliloti has powerful efflux pumps [16] and adapts quickly to the presence of rifampicin.
9. Because of the generally short half-life of bacterial RNA, in the time between adding rifampicin and "immediately" taking a sample for the time point 0, RNA decay already takes place. For very unstable RNAs this time is enough to significantly diminish their amount. Therefore, we define time point 0 as the time point before rifampicin addition, which reflects the steady-state level of the RNA under the chosen experimental conditions.
10. This applies to bacteria treated with the RNAprotect Bacteria Reagent (Qiagen). If the bacteria were not treated, unstable RNA (mRNA and some sRNAs) will be degraded during this and the following steps, which are performed at room temperature. EDTA does not inhibit the activity of some RNases.
11. We recommend using pipette tips containing filters.
12. The provided 2× SYBR Green QPCR Master Mix and the reference dye are light-sensitive and should be protected from light whenever possible. Pipette in a shaded room.
13. To distinguish between sense and antisense RNA, single-strand-specific RT-qPCR analysis can be performed. To this end, assemble the RMMix without the forward (left) primer, distribute it to the reaction tubes, add RNA and start the RT reaction (Table 1). After the RT reaction is completed (10 min at 50 °C) and the RT enzyme is deactivated (3 min at 95 °C), cool down the 0.2-ml tubes containing the reaction samples to 4 °C and add 1 μl forward (left) primer (10 pmol/μl) to each reaction sample. Then apply the qPCR program, starting with 1 min incubation at 95 °C.
14. To minimize volume differences during the distribution of 8 μl RMMix portions, we set the 2-μl pipette to 8.2 μl. After pipetting the first RMMix portion into the tip, it is transferred to the first 0.2 ml reaction tube by depressing the plunger to the first stop. The residual liquid is not pushed out by depressing the plunger to the second stop. Instead, the second RMMix portion is pipetted into the same tip while allowing the plunger to return to its resting position. The second RMMix portion is also distributed by depressing the plunger to the first stop. The remaining RMMix portions are distributed in the same manner, still using the same pipette tip and without using the second stop of the pipette.
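The rifampicin amounts in Note 2 and in step 4 of Subheading 3.1 follow from simple dilution arithmetic, which can be checked with the sketch below. The function names are hypothetical; the exact calculation accounts for the small volume added to the culture, which is why it returns slightly more than the rounded 30 μl used in the protocol.

```python
def stock_volume_ul(culture_ml, final_mg_per_ml=0.6, stock_mg_per_ml=600.0):
    """Stock volume (ul) to add to a culture to reach the final concentration.

    Solves stock * v / (culture + v) = final for v, i.e. it accounts for the
    volume added with the stock solution.
    """
    added_ml = culture_ml * final_mg_per_ml / (stock_mg_per_ml - final_mg_per_ml)
    return 1000.0 * added_ml


def rifampicin_mass_mg(stock_ul, stock_mg_per_ml=600.0):
    """Mass of rifampicin (mg) needed to prepare a given stock volume."""
    return stock_mg_per_ml * stock_ul / 1000.0


print(f"{stock_volume_ul(30):.2f} ul stock for 30 ml culture")
print(f"{rifampicin_mass_mg(100):.0f} mg rifampicin for 100 ul stock")
```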

Acknowledgments

This work was supported by the German Research Foundation (DFG) grant Ev42/6-2 and GRK2355 project number 325443116.

References

1. Bae YM, Crawford IP (1990) The Rhizobium meliloti trpE(G) gene is regulated by attenuation, and its product, anthranilate synthase, is regulated by feedback inhibition. J Bacteriol 172:3318–3327
2. Merino E, Jensen RA, Yanofsky C (2008) Evolution of bacterial trp operons and their regulation. Curr Opin Microbiol 11:78–86
3. Evguenieva-Hackenberg E (2022) Riboregulation in bacteria: from general principles to novel mechanisms of the trp attenuator and its sRNA and peptide products. Wiley Interdiscip Rev RNA 13:e1696
4. Melior H, Li S, Madhugiri R, Stötzel M et al (2019) Transcription attenuation-derived small RNA rnTrpL regulates tryptophan biosynthesis gene expression in trans. Nucleic Acids Res 47:6396–6410
5. Stroynowski I, van Cleemput M, Yanofsky C (1982) Superattenuation in the tryptophan operon of Serratia marcescens. Nature 298:38–41
6. Melior H, Li S, Stötzel M et al (2021) Reprograming of sRNA target specificity by the leader peptide peTrpL in response to antibiotic exposure. Nucleic Acids Res 49:2894–2915
7. Bernstein JA, Khodursky AB, Lin PH et al (2002) Global analysis of mRNA decay and abundance in Escherichia coli at single-gene resolution using two-color fluorescent DNA microarrays. Proc Natl Acad Sci U S A 99:9697–9702
8. Klug G (1991) Endonucleolytic degradation of puf mRNA in Rhodobacter capsulatus is influenced by oxygen. Proc Natl Acad Sci U S A 88:1765–1769
9. Stead MB, Agrawal A, Bowden KE et al (2012) RNAsnap™: a rapid, quantitative and inexpensive, method for isolating total RNA from bacteria. Nucleic Acids Res 40:e156
10. Damm K, Bach S, Müller KM et al (2015) Impact of RNA isolation protocols on RNA detection by Northern blotting. Methods Mol Biol 1296:29–38
11. Scheuer R, Dietz T, Kretz J et al (2022) Incoherent dual regulation by a SAM-II riboswitch controlling translation at a distance. RNA Biol 19:980–995
12. Massé E, Escorcia FE, Gottesman S (2003) Coupled degradation of a small regulatory RNA and its mRNA targets in Escherichia coli. Genes Dev 17:2374–2383
13. Schu DJ, Zhang A, Gottesman S, Storz G (2015) Alternative Hfq-sRNA interaction modes dictate alternative mRNA recognition. EMBO J 34:2557–2573
14. Baumgardt K, Charoenpanich P, McIntosh M et al (2014) RNase E affects the expression of the acyl-homoserine lactone synthase gene sinI in Sinorhizobium meliloti. J Bacteriol 196:1435–1447
15. Pfaffl MW (2001) A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Res 29:e45
16. Eda S, Mitsui H, Minamisawa K (2011) Involvement of the smeAB multidrug efflux pump in resistance to plant antimicrobials and contribution to nodulation competitiveness in Sinorhizobium meliloti. Appl Environ Microbiol 77:2855–2862
17. Evguenieva-Hackenberg E (2005) Bacterial ribosomal RNA in pieces. Mol Microbiol 57:318–325
18. McIntosh M, Krol E, Becker A (2008) Competitive and cooperative effects in quorum-sensing-regulated galactoglucan biosynthesis in Sinorhizobium meliloti. J Bacteriol 190:5308–5317

Chapter 14

Evaluation of 5′-End Phosphorylation for Small RNA Stability and Target Regulation In Vivo

Alexandra Schilder, Yvonne Göpel, Muna Ayesha Khan, and Boris Görke

Abstract

Bacterial small RNAs (sRNAs) can be equipped at the 5′ end with triphosphate (5′PPP) or monophosphate (5′P) groups, depending on whether they are primary transcripts, undergo dephosphorylation or originate via processing. Often, 5′ groups hallmark RNAs for rapid decay, but whether this also applies to sRNAs is little explored. Moreover, the sRNA 5′P group could activate endoribonuclease RNase E to cleave the base-paired target RNA, but a tool for investigation in vivo was lacking. Here, we describe a two-plasmid system suitable for the generation of 5′ monophosphorylated RNAs on demand inside the cell. The sRNA gene of interest is fused to the 3′ end of a fragment of sRNA GlmZ and transcribed from a plasmid in an IPTG-inducible manner. The fusion RNA gets cleaved upon arabinose-controlled expression of rapZ, provided on a compatible plasmid. Adaptor protein RapZ binds the GlmZ aptamer and directs RNase E to release the sRNA of choice with 5′P ends. An isogenic plasmid generating the same sRNA with a 5′PPP end allows for direct comparison. The fates of the sRNA variants and target RNA(s) are monitored by Northern blotting. This tool is applicable to E. coli and likely other enteric bacteria.

Key words 5′ end phosphorylation, Small RNA (sRNA), 5′ monophosphate, 5′ triphosphate, RNA decay, Endoribonuclease RNase E, RNA processing, RapZ, GlmZ

1 Introduction

Throughout all domains of life, the modification state of the 5′ end has a major role in the fate of an RNA molecule. In bacteria, primary transcripts are mostly produced with a 5′ triphosphate (5′PPP) group [1]. Conversion of the primary transcripts to species with a 5′ monophosphate (5′P) is often key for initiation of their degradation [2, 3]. In enteric bacteria, endoribonuclease RNase E catalyzes the processing of RNAs by cleaving in single-stranded regions at AU-rich sites, which often initiates RNA decay [4]. RNase E prefers substrates with 5′P as the latter group can interact with the sensor pocket in RNase E and allosterically boost cleavage in the catalytic center. Cleavage generates 3′ fragments with 5′P groups that are amenable to further RNase E attack.

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_14, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024


Alexandra Schilder et al.

Another group of transcripts is recognized by RNase E through different means and cleaved regardless of the 5′ phosphorylation state [5, 6]. Bacterial small RNAs (sRNAs) regulate gene expression predominantly through base-pairing with target RNAs, influencing their translation, termination, or degradation. In enteric bacteria, a group of sRNAs originates with 5′PPP, reflecting their autonomous transcription, whereas others are released from 3′ regions of mRNAs by RNase E processing, providing these sRNAs ab initio with 5′P groups [7]. Several sRNAs are known for their ability to promote the decay of their targets through recruitment of RNase E to the RNA duplex, but the precise mechanism is not understood. Based on biochemical experiments, it was proposed that the sRNA 5′P may bind the sensor pocket in RNase E to stimulate cleavage of the base-paired target in the catalytic site [6, 8]. Processing-derived sRNAs are predestined to possess this activity intrinsically [9, 10]. However, the role of the 5′ end phosphorylation state for the activities, as well as the stabilities of sRNAs in vivo, remained unclear due to lack of suitable tools for analysis. Recently, we developed a genetic tool that allows us to generate sRNAs of choice in a completely 5′ monophosphorylated state in an inducible manner inside the cell and to compare them with the features of the 5′ triphosphorylated variants [11]. This tool employs the adaptor protein RapZ that binds sRNA GlmZ at its first two stem-loops and presents the sRNA for cleavage by RNase E (Fig. 1, [12]). RapZ directs RNase E to cleave GlmZ 6–7 nt downstream of stem-loop 2. Cleavage occurs at a fixed position regardless of sequence context at or downstream of the cleavage site [13]. This releases the first two stem-loops of GlmZ (designated GlmZ′) and the 3′ RNA fragment provided with a 5′P. Thus, any RNA that is fused to GlmZ′ can be produced in 5′ monophosphorylated form (Fig. 1a). 
The tool consists of plasmids pYG205 and pYG215, the triple mutant strain Z956, and its corresponding wild-type S4197 [11]. The endogenous copies of rapZ, glmZ, and glmY, which encodes an sRNA decoy for RapZ [14], are deleted in Z956. To avoid interference, the experimenter needs to construct a derivative of Z956 lacking the endogenous copy of the sRNA under investigation (subsequently designated sRNAx). Plasmid pYG205 carries rapZ under control of the arabinose-inducible PBAD promoter on a ColEI-type plasmid (Fig. 2a). The low copy plasmid pYG215 is compatible with pYG205 and carries a gfp variant (gfp-mut3*; [15]) with its 5′UTR fused to the 3′ end of glmZ′ (Fig. 2b). The glmZ′-gfp fusion gene is transcribed from the LacI-controlled and thus IPTG-inducible PLlacO-1 promoter (Fig. 2c; [16]), which allows RNA transcription to start at the authentic +1 position [17]. We previously demonstrated that expression of gfp from this fusion is strongly suppressed by RapZ (Fig. 1c; [13]). Plasmid

Role of Small RNA 5′ End Phosphorylation In Vivo


Fig. 1 Outline for evaluation of the role of sRNA 5′ end phosphorylation in vivo. (a) A fusion RNA consisting of the sRNA of interest (sRNAx; in pink) fused to the 3′ end of stem-loops I and II (position 1–146) of sRNA GlmZ (GlmZ′; in blue) is transcribed from a plasmid. A compatible plasmid produces the tetrameric adaptor protein RapZ (in yellow). RapZ binds the stem-loops of GlmZ and presents the RNA to RNase E (green scissors), which cleaves the fusion 7 nt downstream of GlmZ′s second stem-loop, thereby releasing sRNAx with its natural 5′ end provided with a 5′P. The properties of sRNAx, such as stability and regulatory effects on targets, can be studied subsequently through Northern blotting. (b) A transformant that expresses sRNAx as a 5′ triphosphorylated primary transcript but is otherwise isogenic allows for direct comparison. (c) A transformant producing a GlmZ′-gfp control fusion RNA that is cleaved by RapZ/RNase E allows assessment of target RNA levels in the absence of sRNAx and correction for possible indirect effects of the two-plasmid system on these targets

pYG215 has a dual role. First, it provides a negative control for the assessment of target RNA levels in the absence of sRNAx (Fig. 1c). Second, it is used for the construction of two derivatives, which produce sRNAx either with 5′PPP or 5′P groups (Fig. 1a, b). In the first derivative, the AatII-XbaI fragment comprising the glmZ′-gfp fusion is replaced with sRNAx. In the second derivative, only the AgeI-XbaI fragment encompassing gfp is substituted, which creates a specific glmZ′-sRNAx fusion (Fig. 2b, c). The resulting plasmids and pYG215 (control) are subsequently introduced into the newly constructed strain Z956ΔsRNAx, each together with plasmid pYG205 producing RapZ. The three doubly transformed strains are grown in the absence or presence of inducers for the expression of rapZ and sRNAx, respectively. Samples are harvested and total RNAs are extracted for analysis of sRNAx/target RNA steady-state levels or RNA half-life. The RNA species are visualized by Northern blotting (see Fig. 3 for an exemplary


Alexandra Schilder et al.

Fig. 2 Two-plasmid system for generation of 5′P- or 5′PPP-sRNAs in an inducible manner in the bacterial cell. (a) Plasmid pYG205 carries a ColEI origin of replication and transcribes rapZ from the arabinose-inducible PBAD promoter, which is positively controlled by the divergently encoded AraC protein. To allow for high-level expression, rapZ is provided with the Shine-Dalgarno sequence of gene sacB from Bacillus subtilis. Two antibiotic resistance genes, bla and cat, allow for selection by ampicillin or chloramphenicol, respectively. The plasmid is also suitable for integration into the E. coli genome through site-specific recombination with the phage λattP site, but pilot experiments indicated insufficient cleavage of the fusion RNA when rapZ is in a single copy. (b) Plasmid pYG215 has a low copy pSC101-origin of replication and is thus compatible with


result) and signals are quantified. Comparison of these data may reveal whether the 5′P group has any impact on the levels of sRNAx and/or its targets (Fig. 1).
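The two pYG215 derivatives described above differ only in the forward primer used to amplify sRNAx. As a minimal sketch of the primer rules detailed in Subheading 3.1, step 2, the following Python snippet assembles the primer tails from the restriction sites named there (AatII GACGTC, AgeI ACCGGT, XbaI TCTAGA). The function names, the 18-nt annealing length, the example spacer, and the placeholder sRNA sequence are illustrative assumptions and not part of the published protocol; extra 5′ clamp nucleotides that restriction enzymes often need for efficient cutting are not included.

```python
# Illustrative primer assembly for the two pYG215 derivatives (sketch only).
AATII, AGEI, XBAI = "GACGTC", "ACCGGT", "TCTAGA"

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def forward_primer_ppp(srna, anneal=18):
    """AatII site directly followed by the sRNA 5' end (primary transcript, 5'PPP)."""
    return AATII + srna[:anneal].upper()

def forward_primer_p(srna, spacer, anneal=18):
    """AgeI site, 4-nt spacer, then the sRNA 5' end (glmZ'-sRNAx fusion, 5'P)."""
    if len(spacer) != 4:
        raise ValueError("spacer must be exactly 4 nt")
    return AGEI + spacer.upper() + srna[:anneal].upper()

def reverse_primer(srna, anneal=18):
    """XbaI site at the 5' end, then the reverse complement of the sRNA 3' end."""
    return XBAI + revcomp(srna[-anneal:])

# Hypothetical placeholder sequence, not a real sRNA
srna_x = "ACGTTGCAAGCTTAGGCATCGATCCGTTAGCAAGGCTTACGGATTCCGAT"

fwd_ppp = forward_primer_ppp(srna_x)
fwd_p = forward_primer_p(srna_x, spacer="ttca")  # hypothetical 4-nt spacer
rev = reverse_primer(srna_x)
```

The same reverse primer serves both amplifications; only the forward primer determines whether sRNAx is transcribed as a primary transcript or released from the GlmZ′ fusion.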

2 Materials

2.1 Strain and Plasmid Constructions

1. E. coli-K12 strains (available on request at https://www.addgene.org/): S4197 (wild-type = MG1655 rph+ ilvG+ ΔlacZ), Z956 (= S4197 ΔglmY ΔglmZ ΔrapZ).
2. Recombinant plasmids (Fig. 2; available on request at https://www.addgene.org/): pYG205 (araC PBAD::rapZ λattP cat bla ori-ColEI), pYG215 (lacIq PLlacO-1::glmZ′-gfpmut3* tet cat ori-pSC101).

2.2 Cultures for Determination of (s)RNA Steady-State Levels

1. Temperature-controlled agitating water bath or incubator.
2. Spectrophotometer.
3. Cooling centrifuge accommodating 50 ml conical tubes.
4. Heating block.
5. Cooling benchtop centrifuge.
6. 37 °C incubator.
7. Freezers (-20 °C and -80 °C).
8. Sterile 100 ml glass flasks.
9. Sterile glass tubes (16 × 1.5 cm).
10. Sterile 50 ml conical plastic tubes.
11. LB-Lennox medium: 10 g tryptone, 5 g yeast extract, 5 g NaCl, fill up to 1 l with H2O, autoclave; to solidify the medium, include 15 g agar.
12. 0.1 M CaCl2, autoclave.

Fig. 2 (continued) pYG205. It carries tetracycline (tet) and chloramphenicol (cat) resistance genes and transcribes a glmZ′-gfp fusion RNA from the PLlacO-1-promoter (in blue) that is controlled by upstream encoded LacI and is thus IPTG-inducible. Plasmid pYG215 can be used to construct derivatives that transcribe an sRNA either as primary transcript or fused to GlmZ′ (in green). To this end, the AatII-XbaI and the AgeI-XbaI fragments are replaced with the sRNA sequence of choice, respectively. (c) The nucleotide sequence displays the region between the end of lacI and the gfp start (orange). The PLlacO-1 promoter (in blue) with overlapping lacO1 operator sites, the sequence of glmZ′ (green), as well as the location of the AatII and AgeI sites (underlined) are indicated. When cloning into the AatII site, the sRNA sequence must directly follow the restriction site to start transcription at the authentic +1 position. In contrast, 4 nt must be inserted between the AgeI-site and the sRNA sequence to liberate the sRNA with its authentic 5′ end from the GlmZ′ aptamer


Fig. 3 Northern blot comparing regulation of ompD mRNA by 5′PPP- and 5′P-MicC sRNA variants. Example for analysis of an sRNA using the two-plasmid cleavage system. Steady-state levels of ompD mRNA were evaluated in the presence of plasmids transcribing either the glmZ′-gfp control fusion (pYG215; lanes 4–7), micC (pYG314; lanes 8–11) or the glmZ′-micC fusion (pYG313; lanes 12–15). The latter plasmids were introduced into strain Z1129, which additionally carried plasmid pYG205 as a source for RapZ. For construction of pYG314 and pYG313, the respective sequence in ancestor plasmid pYG215 was exchanged with the micC sequence via AatII/XbaI or AgeI/XbaI restriction sites (Fig. 2; [11]). IPTG (1 mM) and arabinose (0.2%) were added to induce expression of the sRNA constructs and of rapZ, respectively, as indicated. The empty ancestor strains S4197 (wild-type), Z976 (= Z956 ΔmicC) and Z1129 (= Z956 ΔmicC; attB::Ptac-ompD) are shown in lanes 1–3. To allow for analysis in E. coli, strain Z1129 carried the target mRNA ompD from Salmonella Typhimurium in the λattB-site on the chromosome [11]. Total RNAs (1.5 μg) were separated on two denaturing 5% PAA gels, respectively. The resulting blots were either


13. TEN buffer: 20 mM Tris/HCl pH 8.0, 50 mM NaCl, 1 mM EDTA pH 8.0, autoclave.
14. Ampicillin stock solution: 50 mg/ml ampicillin in 70% ethanol, store at -20 °C.
15. Tetracycline stock solution: 12.5 mg/ml tetracycline in 70% ethanol, store at -20 °C.
16. 20% L-arabinose in H2O, sterilize by 0.20 μm filtration, and store at RT.
17. 1 M isopropyl-β-D-thiogalactopyranoside (IPTG) in H2O, sterilize by 0.20 μm filtration, store at -20 °C.
18. Liquid nitrogen.

2.3 Cultures for (s)RNA Half-Life Determinations

1. Same materials as described in Subheading 2.2.
2. Rifampicin stock solution: 50 mg/ml in dimethyl sulfoxide, prepare freshly, and keep at RT.

2.4 Total RNA Extraction

1. Commercial RNA extraction kit, e.g., ReliaPrep™ RNA Cell Miniprep System (Promega).
2. β-mercaptoethanol.
3. TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0, autoclave, store at RT) containing 400 μg/ml lysozyme (add directly before use and keep on ice).

2.5 Denaturing Urea-Polyacrylamide Gel Electrophoresis for Separation of Total RNA

1. Vertical gel electrophoresis system, e.g., Bio-Rad Mini-PROTEAN Tetra Vertical Electrophoresis System for hand-cast gels (1.0 mm) of size 8.3 × 7.3 cm.
2. 10× TBE: 890 mM Tris, 890 mM boric acid, 20 mM EDTA, adjust to pH 8.0 using HCl, autoclave, and store at RT.
3. Solution for denaturing polyacrylamide (PAA) gels: 7 M urea, 5–8% acrylamide/bis (29:1), 1× TBE, 0.02% N,N,N′,N′-tetramethylethylenediamine (TEMED), 0.1% ammonium persulfate (APS).
4. 2× RNA loading dye: 95% formamide, 0.025% SDS, 0.025% bromophenol blue, 0.025% xylene cyanol, 0.5 mM EDTA pH 8, store in aliquots at -20 °C.

Fig. 3 (continued) consecutively hybridized with probes detecting ompD and GlmZ (first and second panel from top) or probed for MicC (fourth panel from top). Membranes were reprobed for 5S rRNA to obtain loading controls. Relevant RNA species are indicated with arrows. Note that 5′P-MicC is highly unstable [11] and, consequently, only trace amounts are detectable, although its release from the GlmZ′-MicC fusion is clearly indicated by accumulation of the 5′ processing product GlmZ′ (lane 15)

2.6 Northern Blotting

1. Semi-dry electroblotter, e.g., Trans-Blot Turbo Transfer System (Bio-Rad).
2. UVP Crosslinker.
3. Hybridization tubes (15 × 3.5 cm).
4. Hybridizer oven.
5. Chemiluminescence imaging system, e.g., ChemiDoc imaging system (Bio-Rad).
6. Shaker.
7. Plastic tray 9.5 × 7.0 cm.
8. Nylon transfer membrane (Hybond™-N+; Cytiva).
9. Blotting paper Whatman® 3MM Chr.
10. Transparent sheets.
11. 20× SSC: 3 M NaCl, 0.3 M sodium citrate, adjust to pH 7.0 with HCl, autoclave. Heat to dissolve, but cool down to RT prior to pH adjustment.
12. 5× Buffer I: 0.5 M maleic acid, 0.75 M NaCl, adjust to pH 7.5 with NaOH, autoclave (see Note 1).
13. Blocking solution: 10% blocking reagent (e.g. Roche) in 1× Buffer I, autoclave (see Note 2).
14. 10% N-laurylsarcosine, autoclave.
15. 10% SDS, autoclave.
16. Pre-hybridization buffer: 50% formamide, 5× SSC, 20% blocking solution, 0.1% N-laurylsarcosine, 0.02% SDS. Keep at 4 °C.
17. Wash buffer I: 2× SSC, 0.1% SDS.
18. Wash buffer II: 0.1× SSC, 0.1% SDS.
19. Buffer II: 1% blocking reagent in 1× Buffer I. Prepare freshly and keep at 4 °C.
20. Buffer III: 0.1 M Tris, 0.1 M NaCl, adjust to pH 9.5 with HCl, autoclave.
21. Anti-DIG alkaline-phosphatase conjugated antibody (e.g. Roche).
22. CDP-Star, ready-to-use (Roche).

2.7 Stripping the Nylon Membrane for Probe Removal

1. Plastic tray.
2. Shaker (at 37 °C).
3. Stripping solution: 0.2 M NaOH, 0.1% SDS.
4. 2× SSC (see Subheading 2.6, item 11).

2.8 Generation of DIG-Labeled RNA Probes

1. In vitro transcription: 10× RNAPol Reaction Buffer (NEB), DIG labeling mix (Roche), RiboLock RNase Inhibitor (Thermo Scientific™), T7 RNA polymerase (NEB).
2. Precipitation of RNA: 4 M LiCl (autoclaved), 96% ethanol p.a., 70% ethanol p.a., RNase-free H2O.
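Several of the stock solutions listed above are specified by molarity only. As a convenience, the following sketch converts a molar recipe into grams per batch. The molecular weights are standard values for the forms named in the dictionary; sodium citrate is assumed here to be the trisodium salt dihydrate, so use the value printed on the reagent bottle if a different form is at hand. The function and variable names are illustrative.

```python
# Grams of each component needed for a stock solution of a given volume (sketch).
MW = {  # molecular weights in g/mol
    "NaCl": 58.44,
    "trisodium citrate dihydrate": 294.10,  # assumed form of sodium citrate
    "maleic acid": 116.07,
}

def grams_needed(recipe_molar, volume_l=1.0):
    """recipe_molar maps component name -> mol/l; returns component -> grams."""
    return {name: round(conc * MW[name] * volume_l, 3)
            for name, conc in recipe_molar.items()}

# 20x SSC: 3 M NaCl, 0.3 M sodium citrate
ssc_20x = grams_needed({"NaCl": 3.0, "trisodium citrate dihydrate": 0.3})

# 5x Buffer I: 0.5 M maleic acid, 0.75 M NaCl
buffer_i_5x = grams_needed({"maleic acid": 0.5, "NaCl": 0.75})
```

For 1 l of 5× Buffer I, the result agrees within rounding with the amounts given in Note 1 (58.035 g maleic acid and 43.83 g NaCl).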

3 Methods

3.1 Construction of the Test Strain and Recombinant Plasmids for Release of sRNAx

1. Construct the test strain Z956ΔsRNAx: Deletion of endogenous sRNAx is most easily achieved by moving a previously established deletion tagged with an antibiotic resistance marker (avoid ampicillin and tetracycline resistances) into strain Z956 using generalized phage transduction (e.g. [18, 19]). Alternatively, the λ red recombination system may be used [20, 21]. The antibiotic resistance gene used for the replacement of sRNAx should be removed if polar effects are expected. A temperature-sensitive plasmid that produces FLP recombinase, such as pCP20, may be used for this purpose [20].
2. Design primers for construction of recombinant plasmids: Two different DNA fragments comprising sRNAx must be generated by PCR. These fragments carry AatII/XbaI or AgeI/XbaI restriction sites at their 5′/3′ extremities, respectively, and are used to replace the corresponding fragments in plasmid pYG215. For both amplifications, the same reverse primer carrying an XbaI-site (TCTAGA) at the 5′ extremity can be used. In the forward primer carrying the AatII-site (GACGTC), the sequence of the 5′ end of sRNAx must directly follow the sequence of the restriction site. In the forward primer carrying the AgeI-site (ACCGGT), 4 additional nt need to be inserted between the restriction site and the sRNAx sequence (Fig. 2c). The 4 nt found naturally upstream of sRNAx may be used. This construction places sRNAx 7 nt downstream of the second stem-loop of GlmZ′ (nt 1–146) in the fusion RNA, which is the distance required to release sRNAx with its genuine 5′ end (Fig. 1a).
3. Amplify the sRNAx fragments by PCR, carry out restriction digests, isolate vector DNA fragments, and ligate: Typically, cloning of the required plasmids starts with 2–5 μg of plasmid pYG215. Restriction digest of pYG215 generates 5479 bp/899 bp fragments in the case of the AatII+XbaI digest and 5624 bp/754 bp fragments in the case of AgeI+XbaI. Preparative 0.8% agarose gel electrophoresis allows for separation and isolation of the longer vector backbones. The inserts for ligation are generated by standard PCR using the primers discussed in Subheading 3.1, step 2. For details on PCR and cloning, refer to [22].
4. Select recombinants: For the preparation of competent cells, we recommend the protocol in [23], which is fast and sufficiently efficient for standard cloning procedures (see Note 3). Competent cells of an E. coli-K12 work-horse strain, such as Top10, are transformed with the ligation reactions, and colonies are selected on LB plates containing appropriate antibiotics (Fig. 2).
5. Verify recombinant plasmids by colony PCR and DNA sequencing: Use 1 μl of a colony suspension in 100 μl H2O to inoculate a standard PCR with appropriate diagnostic primers, e.g. forward primer (5′-CGCGTTGGCCGATTCATTAATGC; anneals 146 bp upstream of the AatII site) and reverse primer (5′-CCCGCAAGAGGCCCGG; anneals 199 bp downstream of XbaI). "Positive" colonies generating PCR products of the correct length are inoculated and grown for plasmid extraction. Isolated plasmids are Sanger sequenced using the primers above (see Note 4).

3.2 Cultivation of Bacteria for Determination of (s)RNA Steady-State Levels

1. Transform strain Z956ΔsRNAx with 20–50 ng of the newly constructed sRNA expression plasmids or pYG215 (control) together with 50 ng pYG205 in 50 μl TEN, respectively. Co-transform or transform consecutively as described [23].
2. Plate the doubly transformed bacteria on LB-agar plates containing 100 μg/ml ampicillin and 12.5 μg/ml tetracycline and incubate for 14–18 h at 37 °C.
3. Inoculate single colonies in glass tubes containing 5 ml LB supplemented with ampicillin (50 μg/ml), tetracycline (12.5 μg/ml), and 0.2% L-arabinose when applicable (see Note 5). Include cultures of strain S4197 and plasmid-less Z956ΔsRNAx as additional controls (no antibiotics!). These controls allow for comparison with endogenous (s)RNA levels and for assessing the specificity of the probes used for subsequent detection (cf. Fig. 3, lanes 1 and 2).
4. Grow the cultures under shaking (~165 rpm) at 37 °C for ~16 h. If the following steps cannot be carried out on the same day, cryo-conservation of transformants is recommended (see Note 6).
5. Determine the optical densities (OD600) of the cultures using a spectrophotometer.
6. Inoculate fresh cultures to an OD600 = 0.1 in 100 ml flasks containing 10 ml fresh medium with the same supplements.
7. Grow the cultures at 37 °C under shaking (~165 rpm) and periodically check OD600.
8. Add IPTG in the required concentration (see Note 7) when cultures reach an OD600 = 0.3, usually 1.5–2 h after inoculation.
9. Continue growth until cultures reach an OD600 = 0.5–0.6.
10. Harvest 2 × 2 ml each and place samples immediately on ice. The second sample may serve as a technical replicate if needed.
11. Pellet cells in a pre-cooled benchtop centrifuge (3 min, 12,000–13,000 × g, 4 °C).
12. Discard supernatants and flash-freeze pellets in liquid nitrogen.
13. Store frozen bacteria at -80 °C until RNA extraction.

3.3 Cultivation of Bacteria for Determination of (s)RNA Decay Rates

1. To assess the half-life of sRNAx and target RNAs, carry out steps 1–9 as described in Subheading 3.2, except that main cultures are grown in 25 ml medium.
2. Harvest 1 ml of each culture (t = 0) when it attains an OD600 = 0.5–0.6. Immediately add rifampicin to the remaining culture to a final concentration of 500 μg/ml.
3. Continue growth and harvest additional 1 ml samples in a suitable time course, e.g., 1, 2, 4, 8, and 16 min after rifampicin addition.
4. Upon harvest, place samples immediately on ice and proceed with steps 11–13 as described in Subheading 3.2.
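Once the Northern signals of such a time course have been quantified, they are commonly converted into a half-life by fitting first-order decay. The protocol does not prescribe a fitting procedure, so the following is only a sketch under that assumption: signals are log-transformed, a straight line is fitted by least squares, and the half-life is ln 2 divided by the decay constant. The signal values are invented for illustration and the function name is hypothetical.

```python
import math

def half_life(times_min, signals):
    """Least-squares fit of ln(signal) = ln(S0) - k*t; returns ln(2)/k in minutes."""
    pts = [(t, math.log(s)) for t, s in zip(times_min, signals) if s > 0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    k = -sxy / sxx  # decay constant per minute
    return math.log(2) / k

# Time points after rifampicin addition (min), as suggested in step 3
times = [0, 1, 2, 4, 8, 16]
# Invented signals that follow an ideal decay with a 4-min half-life
signals = [100 * 0.5 ** (t / 4) for t in times]

t_half = half_life(times, signals)
```

With real data, late time points close to background are usually excluded before fitting, because their logarithms are dominated by noise.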

3.4 Extraction of Total RNA from the Bacterial Pellets

A commercial kit for the isolation of total RNA from the harvested samples may be used. We use the ReliaPrep™ RNA Cell Miniprep System (Promega) with a few deviations from the manufacturer's protocol to account for the characteristics of bacterial cells, as follows.

1. When ready for RNA extraction, thaw the bacterial pellets from Subheading 3.2 or Subheading 3.3 on ice and resuspend the cells in 100 μl TE + lysozyme. Incubate for 5 min at RT and vortex every 2 min.
2. Add 350 μl of the BL-TG buffer containing 1% β-mercaptoethanol and vortex the lysate several times to shear the DNA.
3. Add 250 μl 96% ice-cold ethanol and mix gently by pipetting the solution.
4. Lysate loading onto the Minicolumn, washing steps, and RNA elution are performed as described in the manufacturer's protocol. Note that the DNase I incubation step can be omitted when the RNA is used for Northern analysis only.
5. Determine the RNA concentration using a NanoDrop microvolume spectrophotometer. Typically, RNA concentrations of ~1 μg/μl are obtained when RNA is eluted in 30 μl nuclease-free H2O.
6. Store the purified RNA at -80 °C until use.

3.5 Denaturing Urea Polyacrylamide Gel Electrophoresis

8% denaturing PAA gels are recommended for separating sRNA species (see Note 8). A separate PAA gel of lower concentration, e.g., 5%, is suitable for the separation of target RNAs up to 1200 nt. Even longer RNAs should be separated on denaturing 1% agarose gels and subsequently transferred to the nylon membrane by vacuum blotting. For the latter procedure, refer to [11, 24, 25].

1. Prepare denaturing PAA gels as described in Subheading 2.5, item 3 (see Note 9).
2. Pre-run the gels (without combs) for 20 min at 100 V prior to loading. Running buffer is 0.5× TBE.
3. While gels are pre-running, mix 1.5–3 μg of the total RNA with 2× RNA loading dye and nuclease-free H2O to obtain a 1× concentration of the RNA loading dye in the desired volume (see step 6).
4. Incubate samples for 15 min at 65 °C for denaturation of the RNA.
5. Cool samples on ice for 2–3 min. Subsequently, spin down the samples with a brief centrifugation step and put them back on ice until loading.
6. Load the gel: clear the gel pockets with 0.5× TBE using a pipette shortly before loading! To avoid sample overflow, load a maximum of 8 μl/slot with 15-well combs and a maximum of 12 μl/slot with 10-well combs (see Note 10).
7. Run the gels at 100 V. Electrophoresis time depends on the RNA under investigation and the PAA concentration. Electrophoresis is usually stopped when the bromophenol blue dye enters the buffer chamber, which takes ~90 min. Signals for RNA species of 150–200 nt length can be expected in the middle of the gel when 8% PAA gels are used.
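The sample mix in step 3 is simple arithmetic that is easy to get wrong at the bench. The helper below is a minimal sketch: it assumes that half of the final volume is 2× loading dye (giving 1× dye) and that the remainder is made up of RNA and nuclease-free H2O. The function name and the example values are illustrative.

```python
def loading_mix(rna_ug, rna_conc_ug_per_ul, final_volume_ul):
    """Volumes (µl) of RNA, 2x loading dye, and water for one lane."""
    rna_ul = rna_ug / rna_conc_ug_per_ul
    dye_ul = final_volume_ul / 2  # 2x dye diluted to 1x
    water_ul = final_volume_ul - rna_ul - dye_ul
    if water_ul < 0:
        raise ValueError("RNA too dilute for this final volume")
    return {"rna": rna_ul, "dye_2x": dye_ul, "water": water_ul}

# 1.5 µg RNA at ~1 µg/µl in the 8 µl maximum of a 15-well comb
mix = loading_mix(rna_ug=1.5, rna_conc_ug_per_ul=1.0, final_volume_ul=8)
```

The error raised for a negative water volume flags samples that would have to be concentrated or loaded with a 10-well comb instead.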

3.6 Northern Blotting

For analysis of sRNAx, Northern blotting is the method of choice. Other methods make it difficult to account for the fractions of cleaved and uncleaved GlmZ′-sRNAx precursors and for somewhat longer sRNA species initiated upstream of the PLlacO-1 promoter (cf. Fig. 3; [11]). Northern blot protocols for the detection of bacterial sRNAs are available, e.g., [26]. While the latter protocol relies on radioactively labeled probes, digoxigenin (DIG)-labeled RNA probes provide a convenient alternative (see Subheading 3.8). In the following, we describe the Northern blotting protocol used in our original study [11].

1. Assemble the blotting sandwich in the semi-dry electroblotting apparatus in the following order: (anode) three sheets of 3MM Chr Whatman® paper, nylon transfer membrane, gel, three sheets of 3MM Chr Whatman® paper (cathode). Soak paper and membrane in 0.5× TBE prior to assembly. Equilibrate the gel in 0.5× TBE before placing it onto the membrane. Avoid touching the membrane (use forceps) and avoid air bubbles throughout the sandwich.
2. Transfer the RNA to the membrane at 120 mA (constant) and 15 V for 1 h.
3. Place the membrane between transparent sheets and cross-link the RNA to the membrane using a UV crosslinker (254 nm UV light, 2–3 min) (see Note 11).
4. Place the membrane in a hybridization tube (see Note 9) and incubate under rotation with 7 ml pre-warmed pre-hybridization buffer for 30–120 min at 68 °C in a hybridization oven.
5. Add 0.5 ml pre-hybridization buffer containing ~25–100 ng of the DIG-labeled RNA probe to the hybridization tube and continue incubation at 68 °C overnight.
6. Wash the membrane 2× 5 min with 10 ml Wash buffer I in the unheated rotating hybridization oven or by hand at RT.
7. Wash the membrane 2× 15 min with 10 ml pre-warmed Wash buffer II in the rotating hybridization oven at 68 °C.
8. Rinse the membrane with Milli-Q water or dH2O to remove SDS. Transfer the membrane into a plastic tray.
9. Incubate the membrane for 30 min in 15 ml Buffer II on a shaker at RT.
10. Dilute the anti-DIG antibody conjugated with alkaline phosphatase 1:10,000–1:20,000 in 15 ml Buffer II and incubate the membrane with the antibody dilution for at least 30 min.
11. Incubate 2× 15 min in 15 ml Buffer I.
12. Incubate 3–5 min in 15 ml Buffer III.
13. Dilute the CDP-Star substrate 1:2 or 1:3 in Buffer III to a final volume of 500 μl. Put the membrane between transparent sheets, add the CDP-Star solution onto the membrane, and incubate for 5 min in the dark.
14. Detect the RNA:RNA hybrids on the blot using a chemiluminescence imaging system for 1 min up to 60 min, depending on signal strength (see Note 11).


3.7 Stripping the Nylon Membrane for Detection of Distinct RNA Species

Following the detection of sRNAx and its target RNA, the RNA probes must be stripped off the membranes to allow for the detection of additional RNAs with distinct probes. As it is in the same size range, GlmZ′ can be detected on the same membrane on which sRNAx was detected before. In the last step, 5S rRNA is detected to provide loading controls for signal normalization. All steps are performed in a plastic tray on a shaker.

1. Rinse the membrane with Milli-Q water to remove the CDP-Star substrate.
2. Wash the membrane 2× 15 min at 37 °C in 50 ml stripping solution under agitation on a shaker.
3. Incubate on a shaker in 50 ml 2× SSC for 5 min at RT (see Note 11).
4. Perform the Northern blotting protocol starting with Subheading 3.6, step 4.
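Signal normalization against the 5S rRNA loading control can be sketched as follows. The approach shown, dividing each band intensity by the 5S rRNA signal of the same lane and expressing the result relative to a reference lane, is a common convention and an assumption here; the protocol itself does not specify the calculation. The densitometry values are invented and the function name is hypothetical.

```python
def normalize(target, loading, reference_lane=0):
    """Target signal per lane divided by its 5S rRNA signal, relative to a reference lane."""
    if len(target) != len(loading):
        raise ValueError("one loading-control value is needed per lane")
    ratios = [t / l for t, l in zip(target, loading)]
    ref = ratios[reference_lane]
    return [r / ref for r in ratios]

# Invented densitometry values for three lanes
target_signal = [1000.0, 500.0, 300.0]
rrna_5s_signal = [2000.0, 2000.0, 1500.0]

relative = normalize(target_signal, rrna_5s_signal)
```

In this invented example the third lane carries less total RNA, so its normalized value is higher than the raw band intensity alone would suggest.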

3.8 Generation of DIG-Labeled RNA Probes by In Vitro Transcription

DIG-labeled RNA probes are typically complementary to sRNAx over its entire length. In the case of the longer mRNAs, ~500 nt long RNA probes are used. Probes are generated through in vitro transcription using T7 RNA polymerase and a PCR fragment as template. The template contains a T7 RNA polymerase promoter at the 5′ end of the opposite strand, which is incorporated by the reverse primer during PCR (cf. Supplementary Table III in [11] for the recognition sequence).

1. For in vitro transcription, mix in 20 μl: 1000 ng purified PCR fragment, 2 μl 10× RNAPol Reaction Buffer (NEB), 2 μl DIG labeling mix (Roche), 40 units RiboLock RNase Inhibitor (Thermo Scientific™), 50 units T7 RNA polymerase (NEB). Incubate the reaction for 16 h at 37 °C.
2. Stop the reaction by adding 1 μl 0.5 M EDTA, pH 8.0.
3. Precipitate the RNA by adding 0.1 volume 4 M LiCl and 3 volumes ice-cold 96% EtOH p.a. Incubate for 3–4 h at -80 °C or overnight at -20 °C.
4. Pellet the RNA in a pre-cooled benchtop centrifuge (30 min, max. speed, 4 °C).
5. Remove the supernatant, wash with 500 μl ice-cold 70% EtOH p.a., and centrifuge again (15 min, max. speed, 4 °C).
6. Remove the supernatant, air-dry the RNA, and dissolve the RNA probe in 50 μl RNase-free H2O under shaking (~600 rpm) for 5 min at 65 °C. Store at -20 °C (see Note 12).
7. For determination of labeling efficiency, prepare serial dilutions of the probe (10⁻¹ to 10⁻⁶) in 10 μl dH2O, spot 5 μl of each onto a nylon membrane, and let dry.
8. Detect the probe as described in Subheading 3.6, steps 9–14, but use half of the described volumes (i.e. enough to cover the membrane). Labeling is efficient if at least the first four probe dilutions are detectable.
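The precipitation volumes in step 3 and the efficiency criterion in step 8 can be written down as a short sketch. It assumes that both "0.1 volume" and "3 volumes" refer to the volume of the stopped reaction (20 μl reaction plus 1 μl EDTA); if your laboratory calculates the ethanol volume after salt addition, adjust accordingly. The function names are illustrative.

```python
def precipitation_volumes(sample_ul):
    """Return (µl 4 M LiCl, µl 96% ethanol) for 0.1 and 3 volumes of the sample."""
    return 0.1 * sample_ul, 3 * sample_ul

def labeling_efficient(detected):
    """detected: booleans for the 10^-1 ... 10^-6 spots, in order of dilution."""
    return len(detected) >= 4 and all(detected[:4])

# 20 µl transcription reaction + 1 µl 0.5 M EDTA
licl_ul, etoh_ul = precipitation_volumes(20 + 1)

# Example spot-test readout: the first four dilutions give a signal
probe_ok = labeling_efficient([True, True, True, True, False, False])
```

A probe that fails the check (fewer than four detectable dilutions) is better re-synthesized than used at a higher amount, since background tends to rise with probe input.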

4 Notes

1. To prepare 1 l 5× Buffer I, weigh 58.035 g maleic acid and 43.83 g NaCl and transfer them into a bottle containing a magnetic stir bar. Weigh ~36 g NaOH and add it together with dH2O up to 800–900 ml. Stir until all material is dissolved (exothermic reaction!). Subsequently, adjust the pH exactly at RT and fill up to 1 l.
2. The blocking reagent only dissolves during autoclaving. To prepare 100 ml blocking solution, use 100 ml dH2O to mark the filling level on a bottle. Remove the H2O, weigh 10 g blocking reagent directly into the bottle, fill with 1× Buffer I up to the mark, add a magnetic stir bar, and autoclave. Store at 4 °C. The solution can be used for ~1 month.
3. Deviating from [23], half of the described solution volumes may be used. This generates 10 aliquots of 100 μl competent cells, which is sufficient. The competent cells can be stored on ice for several hours without any major impact on transformation efficiency. The resuspension step in 0.1 M MgCl2, as described in [23], can be omitted. Following the final incubation at 37 °C for expression of the antibiotic resistance gene, plate 100 μl of the cell suspension directly. The remaining cells are pelleted by centrifugation (5 min, 6000–7000 × g, RT), resuspended in 100 μl of the supernatant, and spread on a second plate.
4. Take care that plasmids do not acquire mutations affecting the copy number. Such mutations may lead to erroneous conclusions when comparing the two variants of sRNAx. Occasionally, we observed a Glu83Gly substitution (GAA → GGA) in the repA gene, resulting in a higher plasmid concentration (≥100 ng/μl) after purification with a commercial plasmid mini-prep kit. Usually, concentrations of ≤50 ng/μl are obtained from 8 ml bacterial culture for derivatives of pYG215.
5. Make sure to include 0.2% L-arabinose for induction of rapZ expression already in the pre-cultures. Pilot experiments indicated more efficient cleavage of the GlmZ′-MicC fusion RNA [11] by RapZ/RNase E in a Δhfq mutant (unpublished data), suggesting that binding of Hfq to MicC blocks access of RNase E. Similar observations were also made for full-length GlmZ, reflecting overlapping Hfq-binding and RNase E cleavage sites [13]. Higher RapZ levels carried over from the pre-cultures may outcompete Hfq and allow for better cleavage of the fusion RNA.
6. For long-term cryo-storage, 1 ml cell suspension is mixed with 80 μl dimethyl sulfoxide in a 2 ml cryo tube and subsequently deposited in a -80 °C freezer. Fresh cultures can be inoculated at any time by scraping off frozen material using a sterile needle or tip and transferring it into fresh medium or onto a plate.
7. To achieve a meaningful gradation in transcription strength of the sRNAs, inductions with various IPTG concentrations [11] are recommended. Accordingly, prepare 100 mM, 10 mM, 5 mM, and 1 mM IPTG solutions through serial dilution of the 1 M stock solution. This allows the desired final concentrations to be reached by pipetting the same volume into every culture (100 μl, or less if samples were taken for OD600 determination).
8. When working with RNA, try to create an RNase-free environment and use distilled Milli-Q H2O for the preparation of all solutions and buffers (see also [26] for tips).
9. Prior to starting, rinse glass plates, combs, hybridization tubes, and trays thoroughly with water, Milli-Q water, and 70% ethanol, in that order. For preparation of the gel solution, dissolve urea in acrylamide, TBE, and dH2O under shaking at 37 °C in a sterile 100 ml flask. Then, add TEMED and APS and pour the gels immediately. 8% gels need around 30 min to polymerize; lower-percentage gels need longer.
10. Best results are obtained when samples are loaded while the current is running. To this end, the Bio-Rad Mini-PROTEAN electrophoresis system is equipped with only two gels at a time. Connect the electrodes of the gel running module directly with the power supply using a separate cable. Clean the pockets and load samples quickly in order to obtain sharp bands.
11. If it is impossible to continue the protocol on the same day, place the membranes in a transparent sheet and store them at -20 °C. This is possible after the UV-crosslinking step, after signal detection, or after stripping off the probe.
12. When stored at -20 °C, DIG-labeled RNA probes can be used for several months.
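The IPTG series in Note 7 can be checked with a short calculation. The sketch below assumes that 100 μl of each diluted stock is added to a 10 ml main culture (Subheading 3.2, step 6) and takes the added volume into account; for the 25 ml cultures used for half-life determination, the added volume must be scaled accordingly. The function name is illustrative.

```python
def final_conc_mM(stock_mM, added_ul, culture_ml):
    """Final concentration after adding added_ul of stock to culture_ml of culture."""
    added_ml = added_ul / 1000
    return stock_mM * added_ml / (culture_ml + added_ml)

# Diluted IPTG stocks from Note 7 (mM)
stocks_mM = [100, 10, 5, 1]
finals = {s: final_conc_mM(s, added_ul=100, culture_ml=10) for s in stocks_mM}
```

Adding 100 μl to 10 ml is a 1:101 dilution, so the 100 mM stock yields slightly less than 1 mM final concentration; the deviation of about 1% is negligible for induction purposes.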

Acknowledgments

This work was supported by the Austrian Science Fund through a stand-alone grant [P32410 to B.G.] and the Doktoratskolleg RNA biology [W1207-B09]. We thank Birte Reichenbach, who established non-radioactive Northern blotting in the Görke laboratory, and all group members for discussion.


References

1. Schauerte M, Pozhydaieva N, Höfer K (2021) Shaping the bacterial epitranscriptome-5′-terminal and internal RNA modifications. Adv Biol (Weinh) 5(8):e2100834. https://doi.org/10.1002/adbi.202100834
2. Laalami S, Zig L, Putzer H (2014) Initiation of mRNA decay in bacteria. Cell Mol Life Sci 71(10):1799–1828. https://doi.org/10.1007/s00018-013-1472-4
3. Luciano DJ, Vasilyev N, Richards J, Serganov A, Belasco JG (2017) A novel RNA phosphorylation state enables 5′ end-dependent degradation in Escherichia coli. Mol Cell 67(1):44–54.e46. https://doi.org/10.1016/j.molcel.2017.05.035
4. Bandyra KJ, Luisi BF (2018) RNase E and the high-fidelity orchestration of RNA metabolism. Microbiol Spectr 6(2):6.2.23. https://doi.org/10.1128/microbiolspec.RWR-0008-2017
5. Clarke JE, Sabharwal K, Kime L, McDowall KJ (2023) The recognition of structured elements by a conserved groove distant from domains associated with catalysis is an essential determinant of RNase E. Nucleic Acids Res 51(1):365–379. https://doi.org/10.1093/nar/gkac1228
6. Bandyra KJ, Wandzik JM, Luisi BF (2018) Substrate recognition and autoinhibition in the central ribonuclease RNase E. Mol Cell 72(2):275–285.e274. https://doi.org/10.1016/j.molcel.2018.08.039
7. Ponath F, Hör J, Vogel J (2022) An overview of gene regulation in bacteria by small RNAs derived from mRNA 3′ ends. FEMS Microbiol Rev 46(5). https://doi.org/10.1093/femsre/fuac017
8. Bandyra KJ, Said N, Pfeiffer V, Gorna MW, Vogel J, Luisi BF (2012) The seed region of a small RNA drives the controlled destruction of the target mRNA by the endoribonuclease RNase E. Mol Cell 47:943–953
9. Belasco JG (2017) Ribonuclease E: chopping knife and sculpting tool. Mol Cell 65(1):3–4. https://doi.org/10.1016/j.molcel.2016.12.015
10. Chao Y, Li L, Girodat D, Forstner KU, Said N, Corcoran C, Smiga M, Papenfort K, Reinhardt R, Wieden HJ, Luisi BF, Vogel J (2017) In vivo cleavage map illuminates the central role of RNase E in coding and non-coding RNA pathways. Mol Cell 65(1):39–51. https://doi.org/10.1016/j.molcel.2016.11.002
11. Schilder A, Görke B (2023) Role of the 5′ end phosphorylation state for small RNA stability and target RNA regulation in bacteria. Nucleic Acids Res. https://doi.org/10.1093/nar/gkad226
12. Islam MS, Hardwick SW, Quell L, Durica-Mitic S, Chirgadze DY, Görke B, Luisi BF (2023) Structure of a bacterial ribonucleoprotein complex central to the control of cell envelope biogenesis. EMBO J 42(2):e112574. https://doi.org/10.15252/embj.2022112574
13. Göpel Y, Khan MA, Görke B (2016) Domain swapping between homologous bacterial small RNAs dissects processing and Hfq binding determinants and uncovers an aptamer for conditional RNase E cleavage. Nucleic Acids Res 44(2):824–837. https://doi.org/10.1093/nar/gkv1161
14. Göpel Y, Papenfort K, Reichenbach B, Vogel J, Görke B (2013) Targeted decay of a regulatory small RNA by an adaptor protein for RNase E and counteraction by an anti-adaptor RNA. Genes Dev 27(5):552–564. https://doi.org/10.1101/gad.210112.112
15. Andersen JB, Sternberg C, Poulsen LK, Bjorn SP, Givskov M, Molin S (1998) New unstable variants of green fluorescent protein for studies of transient gene expression in bacteria. Appl Environ Microbiol 64(6):2240–2246
16. Lutz R, Bujard H (1997) Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Res 25(6):1203–1210. https://doi.org/10.1093/nar/25.6.1203
17. Guillier M, Gottesman S (2006) Remodelling of the Escherichia coli outer membrane by two small regulatory RNAs. Mol Microbiol 59(1):231–247
18. Thomason LC, Costantino N, Court DL (2007) E. coli genome manipulation by P1 transduction. Curr Protoc Mol Biol Chapter 1:1.17.11–11.17.18. https://doi.org/10.1002/0471142727.mb0117s79
19. Wilson GG, Young KY, Edlin GJ, Konigsberg W (1979) High-frequency generalised transduction by bacteriophage T4. Nature 280(5717):80–82
20. Datsenko KA, Wanner BL (2000) One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc Natl Acad Sci U S A 97(12):6640–6645
21. Bryant JA, Lee DJ (2017) Homologous recombineering to generate chromosomal deletions in Escherichia coli. Methods Mol Biol 1624:3–16. https://doi.org/10.1007/978-1-4939-7098-8_1
22. Green MR, Sambrook J (2012) Molecular cloning: a laboratory manual, 4th edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York
23. Lederberg EM, Cohen SN (1974) Transformation of Salmonella typhimurium by plasmid deoxyribonucleic acid. J Bacteriol 119(3):1072–1074
24. Zhao Y, Du L, Zhang N (2013) Sensitivity of prestaining RNA with ethidium bromide before electrophoresis and performance of subsequent northern blots using heterologous DNA probes. Mol Biotechnol 54(2):204–210. https://doi.org/10.1007/s12033-012-9553-4
25. Amersham-Biosciences (2003) VacuGene XL Vacuum blotting System (manual), pp 1–31
26. Wagner EGH, Vogel J (2005) Approaches to identify novel non-messenger RNAs in bacteria and to investigate their biological functions: functional analysis of identified non-mRNAs. In: Hartmann RK, Bindereif A, Schön A, Westhof E (eds) Handbook of RNA biochemistry, vol 2. Wiley-VCH, Weinheim, pp 614–642

Chapter 15

In-Gel Cyanoethylation for Pseudouridines Mass Spectrometry Detection of Bacterial Regulatory RNA

Antony Lechner and Philippe Wolff

Abstract

Regulatory RNAs, like many other RNA families, contain chemically modified nucleotides, including pseudouridines (ψ). To map nucleotide modifications, approaches based on enzymatic digestion of RNA followed by nano liquid chromatography-tandem mass spectrometry (nanoLC-MS/MS) analysis were implemented several years ago. However, detection of ψ by mass spectrometry (MS) is challenging because ψ has the same mass as uridine. A chemical labeling strategy using acrylonitrile was therefore developed to detect this mass-silent modification. Acrylonitrile reacts specifically with ψ to form 1-cyanoethylpseudouridine (Ceψ), resulting in a mass shift of ψ detectable by MS. Here, we propose a protocol detailing the steps from the purification of RNA by polyacrylamide gel electrophoresis, including in-gel labeling of ψ, to MS data interpretation to map ψ and other modifications. To demonstrate its efficiency, the protocol was applied to bacterial regulatory RNAs from E. coli: 6S RNA and transfer-messenger RNA (tmRNA, also known as 10Sa RNA). Ribonuclease P (RNase P) was also mapped using this approach. This method enabled the detection of several ψ at single-nucleotide resolution.

Key words nanoLC-MS/MS, Pseudouridine, Modifications mapping

1 Introduction

MS coupled to liquid chromatography is a powerful tool that has demonstrated its efficiency for both identification and sequence placement of RNA modifications [1]. ψ is a ubiquitous RNA modification obtained by isomerization of U. As an isomer of U, ψ is therefore mass-silent and not detectable by direct MS analysis. To overcome this difficulty, RNAs can be subjected to chemical treatment to specifically label ψ [2, 3]. Among the reactants, acrylonitrile specifically reacts under mildly alkaline conditions with the N1 position of ψ, through a type of Michael addition, resulting in an increased mass of +53.0 Da (corresponding to Ceψ), easily detectable by MS [3, 4] (see Fig. 1). This way, it is possible to distinguish ψ from U and thus to locate ψ on the RNA sequence. RNAs are purified by denaturing polyacrylamide gel electrophoresis

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_15, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024


Fig. 1 Chemical reaction showing the specific labeling of pseudouridine with acrylonitrile. Acrylonitrile reacts specifically at position N1 of pseudouridine in mildly alkaline conditions

(PAGE), and acrylonitrile derivatization of ψ is carried out in-gel, followed by in-gel ribonuclease (RNase) hydrolysis to obtain products amenable to MS analysis. After digestion, hydrolysates are separated by nano ion-pair reversed-phase high-performance liquid chromatography (nano-IP-RP-HPLC) [5, 6]. Oligonucleotides are detected by MS and subjected to collision-induced dissociation (CID) to obtain sequencing information [7]. Sequencing results are compared to the genomic sequence to identify the RNA and to map RNA modifications, including ψ, in the sequence.

In this chapter, a detailed protocol for mapping ψ and other modifications by nanoLC-MS/MS is presented. Modification maps of two bacterial regulatory RNAs, 6S RNA and tmRNA, as well as of RNase P, are shown. 6S RNA is a small RNA regulator of RNA polymerase [8, 9]. tmRNA is involved in trans-translation, the major and ubiquitous ribosome rescue system in bacteria [10]. tmRNA consists of two domains: a tRNA-like domain and an mRNA-like domain. tmRNA is a rare example of a bacterial regulatory RNA in which modifications have been identified. This RNA contains two types of modified nucleosides, 5-methyluridine and pseudouridine. Both modifications are in the tRNA-like domain, located in a loop that mimics the tRNA Tψ-loop [11]. RNase P is an endonuclease essential for protein synthesis, catalyzing the 5′-end maturation of transfer RNAs (tRNAs) [12].

To obtain the largest sequence coverage, three different nucleotide-specific RNases are used: RNase T1 (cleaves at the 3′ end of G), RNase A (cleaves at the 3′ end of U and C), and RNase U2 (cleaves at the 3′ end of purines, preferably at the 3′ end of A) [13]. The RNA modifications mapping and data analysis workflow is shown in Fig. 2. This protocol could be applied to different RNA types (transfer RNA, ribosomal RNA, etc.) and organisms (bacteria, archaea, and eukaryotes).
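The cleavage rules of the three RNases can be illustrated with a short in silico digestion. The following Python sketch is not part of the protocol: the function name `digest`, the dictionary `CLEAVE_AFTER`, and the test sequence are hypothetical, and RNase U2 is simplified to cut after every purine (its preference for A is not modeled).

```python
import re

# Residues after which each RNase cleaves (3' side), as stated in the text.
# Simplifying assumption: RNase U2 cuts after every purine (A or G).
CLEAVE_AFTER = {"T1": "G", "A": "CU", "U2": "AG"}

def digest(seq, rnase):
    """Return the fragments of a complete digestion of seq by one RNase."""
    sites = CLEAVE_AFTER[rnase]
    # Each fragment ends with a cleavable residue; a trailing piece
    # without one (the 3' end of the RNA) is kept as the last fragment.
    return re.findall(f"[^{sites}]*[{sites}]|[^{sites}]+$", seq)

# Hypothetical 13-nt sequence used only for illustration.
example = "GAUCGGAAUCCAG"
for enzyme in ("T1", "A", "U2"):
    print(enzyme, digest(example, enzyme))
```

Comparing the three fragment lists shows why individual digestions are combined: regions that yield fragments too short to be placed with one enzyme are often covered by longer fragments from another.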


Fig. 2 Workflow for ψ detection by MS. First, RNAs are separated by PAGE, and bands containing the RNA of interest are excised. Then, ψ is derivatized in-gel with acrylonitrile and the RNA is digested by individual RNases. Next, oligonucleotides are analyzed by nanoLC-MS/MS. Finally, data from LC-MS/MS enable oligonucleotide sequencing and modifications mapping, including ψ

2 Materials

2.1 Instruments and Equipment

1. Polyacrylamide gel electrophoresis (PAGE) apparatus for 50 cm glass plates, 1 mm spacers.
2. Electrophoresis power supply.
3. UV transilluminator and UV-safety glasses.
4. Sterile surgical blades.
5. SpeedVac vacuum concentrator.
6. NanoAcquity UPLC system (Waters, Manchester, UK).
7. Synapt G2-S mass spectrometer equipped with NanoLockSpray ionization source (Waters, Manchester, UK).
8. MassLynx mass spectrometry software with MaxEnt3 module (Waters, Manchester, UK).
9. Acquity UPLC peptide BEH C18 column (130 Å, 1.7 μm, 75 μm × 200 mm) (Waters, Manchester, UK).
10. Basic laboratory materials (vortex, thermoblock, micropipettes, etc.).
11. Personal protective equipment (lab coat, gloves, etc.).

2.2 Gel Electrophoresis

1. Total RNA extract from E. coli K12 MG1655 obtained by TRI-Reagent extraction [14].
2. 12.5% acrylamide/bis-acrylamide (19:1), 8 M urea, 1× Tris-borate-EDTA (TBE).
3. 20% Ammonium persulfate.
4. N,N,N′,N′-tetramethylethane-1,2-diamine (TEMED).
5. 0.5 μg/mL Ethidium bromide (EtBr).

2.3 In-Gel Cyanoethylation of ψ and RNase Digestion

1. 41% Ethanol (EtOH), 59% 1.1 M Triethylammonium acetate (TEAA) pH 8.6 (v/v).
2. ≥99% Acrylonitrile.
3. 1 U/μL RNase T1 in 100 mM Ammonium acetate (NH4AcO) (pH is not adjusted).
4. 0.01 μg/μL RNase A in 100 mM NH4AcO (pH is not adjusted).
5. 0.1 μg/μL RNase U2 in 200 mM NH4AcO (pH 5.3) (RNase U2 prepared in-house as described [15]).
6. 100 mM NH4AcO (pH is not adjusted).
7. 50% Acetonitrile.
8. Milli-Q water.
9. ZipTip C18 (Millipore).

2.4 Nano Liquid Chromatography

1. Mobile phase A: 200 mM 1,1,1,3,3,3-hexafluoropropan-2-ol (HFIP), 7 mM triethylamine (TEA), 7.5 mM triethylammonium acetate (TEAA) pH 7.5.
2. Mobile phase B: 100% methanol LC/MS grade.

3 Methods

3.1 RNA Purification by Gel Electrophoresis

1. Cast the gel (8% acrylamide/bis-acrylamide, 8 M urea).
2. Pre-run the gel in 1× TBE buffer for 30 min at 30 W (constant power).
3. Heat the sample at 90 °C for 2 min and load 20 μg per well of E. coli total RNA extract in a minimum volume (less than 10 μL).
4. Run the gel for 4 h at 16 W (constant power).
5. Stain the gel with an EtBr solution (10 μg/L) for 10 min.
6. Visualize the bands containing RNAs under UV light (302 nm). Wear appropriate protection when handling gels containing EtBr and when working under UV light.
7. Excise the gel bands containing RNAs with a clean razor blade.
8. Dry the bands in a clean tube under vacuum without heating.

3.2 In-Gel Cyanoethylation and RNase Digestion

1. Rehydrate the bands by adding 34 μL of 41% EtOH, 59% 1.1 M TEAA pH 8.6 (v/v) plus 4 μL of acrylonitrile and heat the mixture at 70 °C for 2 h. Acrylonitrile is hazardous and must be handled under a fume hood.
2. Remove the supernatant and wash the bands at least three times with 100 mM NH4AcO.
3. Dry the bands under vacuum without heating.
4. Perform individual digestions by rehydrating gel bands with 20 μL of RNase T1 (1 U/μL) solution or 20 μL of RNase A (0.01 μg/μL) solution in 100 mM NH4AcO (pH is not adjusted) and incubate at 55 °C for 2 h.
5. For RNase U2 digestion, rehydrate gel fragments with 50 μL of RNase U2 at 5 ng/μL in 100 mM NH4AcO (pH adjusted to 5.3) and incubate at 55 °C for 30 min.
6. Digested samples can be stored at −20 °C.

3.3 RNA Digest Products Desalting

1. Prepare the ZipTip C18 with 50% acetonitrile in water (v/v) (2 × 10 μL).
2. Wash the ZipTip with Milli-Q water (2 × 10 μL).
3. Equilibrate the ZipTip for sample binding with 200 mM NH4AcO (2 × 10 μL).
4. Bind the digested sample (aspirate and dispense the sample 10 times).
5. Wash oligonucleotides with 200 mM NH4AcO (5 × 10 μL).
6. Wash oligonucleotides with Milli-Q water (5 × 10 μL).
7. Elute oligonucleotides with 10 μL of 50% acetonitrile in water (v/v) into a sample injection vial.
8. Dry under vacuum for 15 min without heating. The sample can be stored at −20 °C until analysis.
9. Resuspend dried samples in 3 μL of Milli-Q water before nanoLC injection.

3.4 nanoLC-MS/MS

3.4.1 Nano Liquid Chromatography

NanoLC analyses are performed with an injection volume of 3 μL and a flow rate of 0.3 μL/min for 1 h. Column oven temperature is kept at 65 °C. The column is equilibrated in 85% mobile phase A (see Subheading 2.4). RNase digestion products are eluted using a gradient from 15% to 35% B in 2 min, followed by an increase of B to 50% in 20 min and then returning to 15% B in 25 min. After analysis, the nanoLC system is thoroughly washed to avoid RNase contamination.

3.4.2 Mass Spectrometry Analysis

All experiments are performed in negative mode. The digests are transferred into the mass spectrometer via a NanoLockSpray source. Capillary voltage is set to 2.6 kV and the sample cone to 30 V. The source is heated to 130 °C. For MS, a mass range from 550 to 1600 (m/z) is used, followed by CID fragmentation of the most intense signals using Fast Data Directed Acquisition mode (FastDDA) with an m/z detection range of fragments from 50 to 2000. A collision energy ramp (18 V to 28 V at m/z 500 and 28 V to 38 V at m/z 1500) is applied in the trap collision cell to provide the maximum signal for CID fragment ions.

3.5 Data Analysis

All the fragment spectra are manually sequenced (see Note 1).

1. Deconvolute CID spectra using the MaxEnt3 software (see Note 2).
2. Reconstruct the sequence by following the y and/or c series (see Note 3). For ψ assignment, use Table 1 and see Notes 4–8. Three sequencing examples are shown in Fig. 3.
3. When a digested RNA fragment is sequenced, the masses of the precursor ion and of the fragments can be checked using the Mongo Oligo mass calculator (https://mods.rna.albany.edu/masspec/Mongo-Oligo). A maximum deviation of 0.05 Da between the measured and calculated mass is routinely tolerated.
4. Compare the resulting sequences with genomic sequences (http://gtrnadb.ucsc.edu) [16] to identify the RNA.
5. Map modifications, including ψ, on the RNA sequence. Figure 4 and Table 2 list all digested RNA fragments sequenced by MS/MS after individual RNase digestions of RNase P, 6S RNA, and tmRNA (see Notes 9 and 10).
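The mass check in step 3 can be sketched in a few lines of Python as a complement to the online calculator. This is only an illustration: the function names are hypothetical, the residue masses for A, C, and G are standard monoisotopic values assumed here (those for U and Ceψ are the DeltaM values of Table 1), and only fragments with a 5′-hydroxyl and a 3′-phosphate (linear or 2′,3′-cyclic) are handled.

```python
# Monoisotopic residue masses (Da) of nucleotides within a chain.
# A, C, G: assumed standard values; U and Ceψ: DeltaM values from Table 1.
RESIDUE = {"A": 329.0525, "C": 305.0413, "G": 345.0474,
           "U": 306.0253, "Ceψ": 359.0519}
WATER = 18.0106
PROTON = 1.00728

def oligo_mass(residues, cyclic=False):
    """Neutral mass of a 5'-OH fragment ending in a 3'-phosphate (p)
    or, if cyclic is True, in a 2',3'-cyclic phosphate (>p)."""
    mass = sum(RESIDUE[r] for r in residues)
    return mass if cyclic else mass + WATER

def mz_negative(mass, charge):
    """m/z of the ion carrying `charge` negative charges (negative mode)."""
    n = abs(charge)
    return (mass - n * PROTON) / n

def within_tolerance(measured, calculated, tol=0.05):
    """True if the deviation does not exceed the 0.05 Da limit of step 3."""
    return abs(measured - calculated) <= tol

# AAGGG[Ceψ]Up from 6S RNA (Fig. 3a): reported m/z = 1187.16, z = -2
fragment = ["A", "A", "G", "G", "G", "Ceψ", "U"]
mass = oligo_mass(fragment)
print(round(mass, 2), round(mz_negative(mass, -2), 2))
print(within_tolerance(1187.16, mz_negative(mass, -2)))
```

The same functions with `cyclic=True` apply to RNase U2 products such as U[Ceψ]UCGGA>p, which lack the water added on opening of the cyclic phosphate.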

Table 1 List of modified nucleotides labeled by acrylonitrile with the position of the cyanoethyl group on the nucleotide and their mass for CID sequencing

Short name | Full name | DeltaM (a) | DeltaM (a) after cyanoethylation | Location of labeling | Comments
Y | pseudouridine | 306.0253 | 359.0519 | Nucleobase position N1 [4] |
Ym | 2′-O-methylpseudouridine | 320.041 | 373.0676 | Nucleobase position N1 [3] |
m3Y | 3-methylpseudouridine | 320.041 | 373.0676 | Nucleobase position N1 [3] |
s4U | 4-thiouridine | 322.0025 | 375.0291 | Nucleobase position S4 [26] |
I | inosine | 330.0366 | 383.0632 | Nucleobase position N1 [25, 26] | Becomes RNase T1 resistant
mnm5U | 5-methylaminomethyluridine | 349.0675 | 402.0941 | Modification |
mnm5s2U | 5-methylaminomethyl-2-thiouridine | 365.0447 | 418.0713 | Modification |
m1acp3Y | 1-methyl-3-(3-amino-3-carboxypropyl)pseudouridine | 421.0887 | 474.1153 | Modification |
tm5U | 5-taurinomethyluridine | 443.0400 | 496.0666 | Modification [23] |
tm5s2U | 5-taurinomethyl-2-thiouridine | 459.0172 | 512.0438 | Modification [23] |
acp3Y | 3-(3-amino-3-carboxypropyl)pseudouridine | 407.073 | 513.1262 | Nucleobase position N1 + modification | Two labels
Q | queuosine | 471.1155 | 524.1421 | Modification [23] | Neutral loss of 168

(a) DeltaM corresponds to the mass difference between RNA product ions of 1 nucleotide length

Fig. 3 Examples of different MS/MS spectra obtained after in-gel cyanoethylation and digestion. (a) MS/MS sequencing spectrum of AAGGG[Ceψ]Up (m/z = 1187.16, z = -2) from 6S RNA after RNase A digestion. In this case, Ceψ was not cleaved by RNase A. (b) MS/MS sequencing spectrum of U[Ceψ]UCGGA>p (m/z = 1146.65, z = -2) from tmRNA after RNase U2 digestion. (c) MS/MS sequencing spectrum of G[m5U][Ceψ]CAA>p (m/z = 992.65, z = -2) from tmRNA after RNase U2 digestion

4 Notes

1. Automated interpretation and annotation of MS/MS data can be achieved with software (Ariadne [17], RoboOligo [18], NASE [19], Pytheas [20]); nevertheless, to obtain unambiguous modification placements, manual inspection of MS/MS spectra is recommended.
2. Electrospray ionization produces multiply charged ions. Typically, RNase digestion products carry two or three charges. Spectrum deconvolution simplifies spectrum reading by transforming the multiply charged spectrum (m/z) into a single-charge spectrum (mass).
3. Generally, the most intense ion series in CID experiments on RNA are y and c [21]. However, the other series can be helpful for checking or completing fragment sequences. The Mongo Oligo online calculator (http://rna.rega.kuleuven.be/masspec/mongo.htm) can be helpful to find and/or check ion series.
4. For MS detection of ψ, the cyanoethyl group adds a mass of 53.026 Da. The mass of labeled ψ is thus very close to that of a methylated G, and ψ placement may be ambiguous. The mass difference between product ions corresponding to Ceψ is slightly smaller than that corresponding to Gm (Δ = 0.0112 Da). Discrimination between methylated G and cyanoethylated ψ is possible using an MS instrument with a high resolving power (>35,000) and a measurement accuracy of 10 ppm or better. Another way to identify the nucleotide is simply to compare the sequences obtained with the genomic sequences. In addition, ψ sites may be predicted from known modified RNA sequences. The Modomics RNA database (https://iimcb.genesilico.pl/modomics/sequences/) [22] provides a large collection of modified RNA sequences.
5. Acrylonitrile may react with other RNA modifications (see Table 1). Such a modification:
(a) Is detected with an increased mass of +n × 53.0 Da (where n is the number of reactive sites).
(b) May become RNase resistant. For instance, inosine (I) is hydrolyzed by RNase T1, but after cyanoethylation, I is no longer cleaved by this enzyme (data not shown).
6. Although cyanoethylation is very specific to ψ, the reaction is not complete and non-labeled fragments are observable. In addition, sites may be only partially modified. Detection of ψ at these sites is therefore extremely difficult [4, 23–25].
7. According to the literature [4, 26], the cyanoethylation reaction is very specific to ψ. However, cyanoethylation of uridines can occur to a lesser extent [24]. Other methods able to detect ψ can be useful to confirm ψ sites, for example, CMC-primer extension [11, 23], HydraPsiSeq [27], or the use of other reactants [28]. See also review [29].
8. RNase A hydrolyzes pyrimidine residues, including ψ. However, when ψ is cyanoethylated, the nucleotide becomes slightly resistant to RNase A. Thus, fragments containing Ceψ may be observed when this enzyme is used; for example, in the 6S analysis, AAAACp and [Ceψ]AAAACp were detected (see Table 2).

Fig. 4 Sequence coverages obtained by LC-MS/MS for tmRNA, 6S RNA, and RNase P. For tmRNA only, all experimental RNase fragments (RNase T1, A, and U2) are shown. For 6S and RNase P, all RNase fragments (RNase T1, A, and U2) are combined and indicated in red

Table 2 List of digestion products observed and identified by LC-MS/MS for 6S RNA, tmRNA, and RNase P from E. coli after individual enzymatic hydrolysis using RNases T1, A, and U2. The cyanoethylated pseudouridines are denoted by the symbol [Ceψ]

RNA | RNase | Sequence | m/z | z | Calculated mass (Da)
6S | T1 | [Ceψ]UACAGp | 994.59 | -2 | 1991.18
6S | T1 | ACACAU[Ceψ]CACCUUGp | 1492.79 | -3 | 4481.37
6S | T1 | ACACAUUCACCUUGp | 1475.11 | -3 | 4428.33
6S | T1 | CCUUAAAACUGp | 1169.77 | -3 | 3512.31
6S | T1 | AACCAAGp | 1143.63 | -2 | 2289.26
6S | T1 | AUAUUUCAUACCACAAGp | 1353.17 | -4 | 5416.68
6S | T1 | pAUUUCUCUGp | 1455.13 | -2 | 2912.26
6S | T1 | AAUGp | 1326.14 | -1 | 1327.14
6S | T1 | CUCGp | 791.08 | -2 | 1584.16
6S | T1 | UUCAAGp | 968.1 | -2 | 1938.2
6S | T1 | CAUCUCGp | 1108.61 | -2 | 2219.22
6S | T1 | UCCCCUGp | 1096.59 | -2 | 2195.18
6S | A | AAGAAUp | 991.63 | -2 | 1985.26
6S | A | GAGAAGCp | 1171.67 | -2 | 2345.34
6S | A | AAGGG[Ceψ]Up | 1187.16 | -2 | 2376.32
6S | A | GGAGAUp | 1007.62 | -2 | 2017.24
6S | A | GAGAUp | 835.11 | -2 | 1672.22
6S | A | AAAACp | 818.62 | -2 | 1639.24
6S | A | [Ceψ]AAAACp | 998.15 | -2 | 1998.3
6S | A | AAGAA[Ceψ]GUp | 1343.68 | -2 | 2689.36
6S | U2 | UUUCAUA>p | 1092.61 | -2 | 2187.22
6S | U2 | CCUUGAA>p | 1111.63 | -2 | 2225.26
6S | U2 | UACCACAA>p | 1267.69 | -2 | 2537.38
6S | U2 | pUUCUCUGA>p | 1293.11 | -2 | 2588.22
6S | U2 | UUUCAUACCA>p | 1041.13 | -3 | 3126.39
6S | U2 | pUUUCUCUGAp | 1446.13 | -2 | 2894.26
6S | U2 | pUUUCUCUGAGAp | 1188.44 | -3 | 3568.32
6S | U2 | UUUCAUACCACAA>p | 1362.16 | -3 | 4089.48
6S | U2 | UUUCA>p | 775.07 | -2 | 1552.14
6S | U2 | CCUUGA>p | 947.11 | -2 | 1896.22
6S | U2 | U[Ceψ]UCA>p | 801.59 | -2 | 1605.18
6S | U2 | U[Ceψ]UCAUA>p | 1119.12 | -2 | 2240.24
6S | U2 | CCACAA>p | 950.14 | -2 | 1902.28
6S | U2 | CAAGA>p | 817.61 | -2 | 1637.22
6S | U2 | UCCCCUGA>p | 1252.14 | -2 | 2506.28
6S | U2 | GUCCCCUGA>p | 1424.65 | -2 | 2851.3
6S | U2 | CCUUAAAA>p | 1268.15 | -2 | 2538.3
6S | U2 | UUCAA>p | 786.59 | -2 | 1575.18
6S | U2 | CCUUGA>p | 947.1 | -2 | 1896.2
6S | U2 | CCUUAA>p | 939.12 | -2 | 1880.24
6S | U2 | UCUCGGA>p | 1119.63 | -2 | 2241.26
6S | U2 | CUCCGCGG>p | 1279.64 | -2 | 2561.28
tmRNA | T1 | CUCUUAGp | 1109.12 | -2 | 2220.24
tmRNA | T1 | AAUUUCGp | 1121.13 | -2 | 2244.26
tmRNA | T1 | ACUAAGp | 979.62 | -2 | 1961.24
tmRNA | T1 | CAAAAAAUAGp | 1091.49 | -3 | 3277.47
tmRNA | T1 | CUUAAUAACCUGp | 1271.81 | -3 | 3818.43
tmRNA | T1 | UCAAACCCAAAAGp | 1396.51 | -3 | 4192.53
tmRNA | T1 | CCCUCUCUCCCUAGp | 1451.15 | -3 | 4356.45
tmRNA | T1 | AUUCUGp | 956.59 | -2 | 1915.18
tmRNA | T1 | UAAAGp | 827.1 | -2 | 1656.2
tmRNA | T1 | CUUUAGp | 956.59 | -2 | 1915.18
tmRNA | T1 | AAUUUCG>p | 1112.14 | -2 | 2226.28
tmRNA | T1 | CUCCACCA | 1212.67 | -2 | 2427.34
tmRNA | T1 | AAUGp | 1326.17 | -1 | 1327.17
tmRNA | T1 | UAAAAAGp | 1156.15 | -2 | 2314.3
tmRNA | A | AAAAAAUp | 1148.16 | -2 | 2298.32
tmRNA | A | GAAGCp | 834.61 | -2 | 1671.22
tmRNA | A | GAAACp | 826.62 | -2 | 1655.24
tmRNA | A | GGGGAUp | 1015.62 | -2 | 2033.24
tmRNA | A | GGAAGCp | 1007.14 | -2 | 2016.28
tmRNA | A | GAGGAUp | 1007.64 | -2 | 2017.28
tmRNA | A | AAAACp | 818.62 | -2 | 1639.24
tmRNA | A | AGGAAUp | 999.63 | -2 | 2001.26
tmRNA | A | GAAAACp | 991.15 | -2 | 1984.3
tmRNA | A | AAAGACp | 991.15 | -2 | 1984.3
tmRNA | A | AAAAAGCp | 1155.66 | -2 | 2313.32
tmRNA | U2 | UCAGGCUA>p | 1284.14 | -2 | 2570.28
tmRNA | U2 | UGUAAA>p | 971.14 | -2 | 1944.28
tmRNA | U2 | CCCUCUCUCCCUA>p | 1330.15 | -3 | 3993.45
tmRNA | U2 | CUCCCGCCA>p | 1404.2 | -2 | 2810.4
tmRNA | U2 | UUUCGGA>p | 1120.14 | -2 | 2242.28
tmRNA | U2 | G[m5U][Ceψ]CAA>p | 992.65 | -2 | 1987.3
tmRNA | U2 | U[Ceψ]UCGGA>p | 1146.65 | -2 | 2295.3
tmRNA | U2 | CCUCCGCUCUUA>p | 1241.84 | -3 | 3728.52
tmRNA | U2 | CCCUCUCUCCCUAG>p | 1445.17 | -3 | 4338.51
tmRNA | U2 | CCCUCUC[Ceψ]CCCUA>p | 1347.84 | -3 | 4046.52
RNase P | T1 | AAAGp | 1349.16 | -1 | 1350.16
RNase P | T1 | CAACAGp | 979.6 | -2 | 1961.2
RNase P | T1 | UCCACGp | 955.59 | -2 | 1913.18
RNase P | T1 | CUCCAUAGp | 1273.12 | -2 | 2548.24
RNase P | T1 | UUCAUAAGp | 1285.63 | -2 | 2573.26
RNase P | T1 | CAAACCGp | 1131.63 | -2 | 2265.26
RNase P | T1 | CUUAUCGp | 1109.09 | -2 | 2220.18
RNase P | T1 | CCAAAUAGp | 1296.65 | -2 | 2595.3
RNase P | T1 | AAACCCACGp | 1448.66 | -2 | 2899.32
RNase P | T1 | UAAACUCCACCCGp | 1372.79 | -3 | 4121.37
RNase P | A | AGGGGUp | 1015.62 | -2 | 2033.24
RNase P | A | AGAACp | 826.62 | -2 | 1655.24
RNase P | A | AAGGGUp | 1007.64 | -2 | 2017.28
RNase P | A | AGAGAGCp | 1171.65 | -2 | 2345.3
RNase P | A | AAGAGCp | 999.14 | -2 | 2000.28
RNase P | A | AGAUp | 1326.2 | -1 | 1327.2
RNase P | A | AAGGCp | 834.61 | -2 | 1671.22
RNase P | A | AAGGUp | 835.11 | -2 | 1672.22
RNase P | A | AAAUp | 1310.19 | -1 | 1311.19
RNase P | A | GAAAGGGUp | 1344.68 | -2 | 2691.36
RNase P | A | AAACp | 1309.19 | -1 | 1310.19
RNase P | U2 | UCCGGG>p | 974.62 | -2 | 1951.24
RNase P | U2 | UUUCA>p | 774.09 | -2 | 1550.18
RNase P | U2 | CUGAA>p | 806.1 | -2 | 1614.2
RNase P | U2 | CCCGUA>p | 946.61 | -2 | 1895.22
RNase P | U2 | CCCGGA>p | 966.13 | -2 | 1934.26
RNase P | U2 | CCUGGGG>p | 1147.12 | -2 | 2296.24
RNase P | U2 | CCGCCGA>p | 1118.65 | -2 | 2239.3
RNase P | U2 | UUUCA>p | 775.07 | -2 | 1552.14
RNase P | U2 | CUUGA>p | 794.58 | -2 | 1591.16
RNase P | U2 | CUUCG>p | 782.58 | -2 | 1567.16
RNase P | U2 | CCGCG>p | 801.59 | -2 | 1605.18
RNase P | U2 | GCUGA>p | 814.09 | -2 | 1630.18
RNase P | U2 | CUGAA>p | 806.1 | -2 | 1614.2
RNase P | U2 | UCCGGG>p | 974.62 | -2 | 1951.24
RNase P | U2 | CCCGGG>p | 974.12 | -2 | 1950.24
RNase P | U2 | UCCUCUUCG>p | 1393.64 | -2 | 2789.28


9. As in proteomics strategies, RNase digestion generates a set of oligonucleotides specific to a unique RNA. RNA identification is possible by using a genomic RNA database.
10. RNase digestions generate fragments that can be too short to be uniquely assigned in the RNA sequence or too long to be correctly sequenced by LC-MS/MS. Hence, some ψ sites may not be covered using the LC-MS/MS technique if the fragments containing this modification are not detectable. To obtain the maximum sequence coverage, the use of different RNases [30] or even partial RNA digestions [31] is recommended.
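The ladder reading described in Subheading 3.5 and the Ceψ/Gm ambiguity discussed in Note 4 can be illustrated with a small Python sketch. It is not part of the protocol: the function names and the ion masses are hypothetical, the A, C, and G values are assumed standard monoisotopic residue masses, and the Gm value is derived from the 0.0112 Da difference given in Note 4.

```python
# Mass differences (Da) between consecutive product ions of one series.
# U and Ceψ: Table 1; Gm: Ceψ + 0.0112 Da (Note 4); A, C, G: assumed values.
DELTA = {"A": 329.0525, "C": 305.0413, "G": 345.0474,
         "U": 306.0253, "Ceψ": 359.0519, "Gm": 359.0631}

def candidates(delta_m, tol_da):
    """Residues whose DeltaM matches delta_m within tol_da."""
    return sorted(n for n, m in DELTA.items() if abs(m - delta_m) <= tol_da)

def read_ladder(ion_masses, tol_da):
    """Assign candidate residues to each step of a deconvoluted ion series."""
    ions = sorted(ion_masses)
    return [candidates(b - a, tol_da) for a, b in zip(ions, ions[1:])]

# Hypothetical deconvoluted masses of four consecutive ions.
ions = [500.0, 806.0253, 1165.0772, 1510.1246]
print(read_ladder(ions, tol_da=0.05))   # middle step matches Ceψ and Gm
print(read_ladder(ions, tol_da=0.005))  # middle step matches Ceψ only
```

With the routine 0.05 Da tolerance the middle step cannot be assigned between Ceψ and Gm, which is why Note 4 recommends high mass accuracy or comparison with the genomic sequence.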

Acknowledgments

The project was supported by the French National Program Investissement d’Avenir (Labex NetRNA) administered by the Agence Nationale de la Recherche (ANR-10-LABX-0036_NETRNA). This work of the Interdisciplinary Thematic Institute IMCBio, as part of the ITI 2021-2028 program of the University of Strasbourg, CNRS and Inserm, was supported by IdEx Unistra (ANR-10-IDEX-0002), and by SFRI-STRAT’US project (ANR-20-SFRI-0012) and EUR IMCBio (ANR-17-EURE-0023) under the framework of the French Investments for the Future Program. Antony Lechner was supported by Fonds Régional de Coopération pour la Recherche (Region Grand Est, EpiRNA).

References

1. Thakur P, Abernathy S, Limbach PA, Addepalli B (2021) Chapter One – locating chemical modifications in RNA sequences through ribonucleases and LC-MS based analysis. In: Jackman JE (ed) Methods in enzymology. Academic Press, San Diego, pp 1–24
2. Heiss M, Kellner S (2016) Detection of nucleic acid modifications by chemical reagents. RNA Biol 14:1166–1174. https://doi.org/10.1080/15476286.2016.1261788
3. Durairaj A, Limbach PA (2008) Mass spectrometry of the fifth nucleoside: a review of the identification of pseudouridine in nucleic acids. Anal Chim Acta 623:117–125. https://doi.org/10.1016/j.aca.2008.06.027
4. Mengel-Jørgensen J, Kirpekar F (2002) Detection of pseudouridine and other modifications in tRNA by cyanoethylation and MALDI mass spectrometry. Nucleic Acids Res 30:e135. https://doi.org/10.1093/nar/gnf135
5. Apffel A, Chakel JA, Fischer S, Lichtenwalter K, Hancock WS (1997) Analysis of oligonucleotides by HPLC-electrospray ionization mass spectrometry. Anal Chem 69:1320–1325. https://doi.org/10.1021/ac960916h
6. Sutton JM, Guimaraes GJ, Annavarapu V, van Dongen WD, Bartlett MG (2020) Current state of oligonucleotide characterization using liquid chromatography–mass spectrometry: insight into critical issues. J Am Soc Mass Spectrom 31:1775–1782. https://doi.org/10.1021/jasms.0c00179
7. McLuckey SA, Van Berkel GJ, Glish GL (1992) Tandem mass spectrometry of small, multiply charged oligonucleotides. J Am Soc Mass Spectrom 3:60–70. https://doi.org/10.1016/1044-0305(92)85019-G
8. Wassarman KM, Storz G (2000) 6S RNA regulates E. coli RNA polymerase activity. Cell 101:613–623. https://doi.org/10.1016/S0092-8674(00)80873-9
9. Wassarman KM (2018) 6S RNA, a global regulator of transcription. Microbiol Spectr 6:6.3.06. https://doi.org/10.1128/microbiolspec.RWR-0019-2018
10. Müller C, Crowe-McAuliffe C, Wilson DN (2021) Ribosome rescue pathways in bacteria. Front Microbiol 12:652980
11. Felden B, Hanawa K, Atkins JF, Himeno H, Muto A, Gesteland RF, McCloskey JA, Crain PF (1998) Presence and location of modified nucleotides in Escherichia coli tmRNA: structural mimicry with tRNA acceptor branches. EMBO J 17:3188–3196. https://doi.org/10.1093/emboj/17.11.3188
12. Kazantsev AV, Pace NR (2006) Bacterial RNase P: a new view of an ancient enzyme. Nat Rev Microbiol 4:729–740. https://doi.org/10.1038/nrmicro1491
13. Wolff P, Villette C, Zumsteg J, Heintz D, Antoine L, Chane-Woon-Ming B, Droogmans L, Grosjean H, Westhof E (2020) Comparative patterns of modified nucleotides in individual tRNA species from a mesophilic and two thermophilic archaea. RNA. https://doi.org/10.1261/rna.077537.120
14. Rio DC, Ares M, Hannon GJ, Nilsen TW (2010) Purification of RNA using TRIzol (TRI reagent). Cold Spring Harb Protoc 2010:pdb.prot5439. https://doi.org/10.1101/pdb.prot5439
15. Houser WM, Butterer A, Addepalli B, Limbach PA (2015) Combining recombinant ribonuclease U2 and protein phosphatase for RNA modification mapping by liquid chromatography–mass spectrometry. Anal Biochem 478:52–58. https://doi.org/10.1016/j.ab.2015.03.016
16. Chan PP, Lowe TM (2016) GtRNAdb 2.0: an expanded database of transfer RNA genes identified in complete and draft genomes. Nucleic Acids Res 44:D184–D189. https://doi.org/10.1093/nar/gkv1309
17. Nakayama H, Akiyama M, Taoka M, Yamauchi Y, Nobe Y, Ishikawa H, Takahashi N, Isobe T (2009) Ariadne: a database search engine for identification and chemical analysis of RNA using tandem mass spectrometry data. Nucleic Acids Res 37:e47. https://doi.org/10.1093/nar/gkp099
18. Sample PJ, Gaston KW, Alfonzo JD, Limbach PA (2015) RoboOligo: software for mass spectrometry data to support manual and de novo sequencing of post-transcriptionally modified ribonucleic acids. Nucleic Acids Res 43:e64. https://doi.org/10.1093/nar/gkv145
19. Wein S, Andrews B, Sachsenberg T, Santos-Rosa H, Kohlbacher O, Kouzarides T, Garcia BA, Weisser H (2020) A computational platform for high-throughput analysis of RNA sequences and modifications by mass spectrometry. Nat Commun 11:926. https://doi.org/10.1038/s41467-020-14665-7
20. D’Ascenzo L, Popova AM, Abernathy S, Sheng K, Limbach PA, Williamson JR (2022) Pytheas: a software package for the automated analysis of RNA sequences and modifications via tandem mass spectrometry. Nat Commun 13:2424. https://doi.org/10.1038/s41467-022-30057-5
21. Andersen TE, Kirpekar F, Haselmann KF (2006) RNA fragmentation in MALDI mass spectrometry studied by H/D-exchange: mechanisms of general applicability to nucleic acids. J Am Soc Mass Spectrom 17:1353–1368. https://doi.org/10.1016/j.jasms.2006.05.018
22. Boccaletto P, Stefaniak F, Ray A, Cappannini A, Mukherjee S, Purta E, Kurkowska M, Shirvanizadeh N, Destefanis E, Groza P, Avşar G, Romitelli A, Pir P, Dassi E, Conticello SG, Aguilo F, Bujnicki JM (2022) MODOMICS: a database of RNA modification pathways. 2021 update. Nucleic Acids Res 50:D231–D235. https://doi.org/10.1093/nar/gkab1083
23. Suzuki T, Yashiro Y, Kikuchi I, Ishigami Y, Saito H, Matsuzawa I, Okada S, Mito M, Iwasaki S, Ma D, Zhao X, Asano K, Lin H, Kirino Y, Sakaguchi Y, Suzuki T (2020) Complete chemical structures of human mitochondrial tRNAs. Nat Commun 11:4269. https://doi.org/10.1038/s41467-020-18068-6
24. Taoka M, Nobe Y, Hori M, Takeuchi A, Masaki S, Yamauchi Y, Nakayama H, Takahashi N, Isobe T (2015) A mass spectrometry-based method for comprehensive quantitative determination of posttranscriptional RNA modifications: the complete chemical structure of Schizosaccharomyces pombe ribosomal RNAs. Nucleic Acids Res 43:e115. https://doi.org/10.1093/nar/gkv560
25. Sakurai M, Yano T, Kawabata H, Ueda H, Suzuki T (2010) Inosine cyanoethylation identifies A-to-I RNA editing sites in the human transcriptome. Nat Chem Biol 6:733–740. https://doi.org/10.1038/nchembio.434
26. Ofengand J (1967) The function of pseudouridylic acid in transfer ribonucleic acid: I. The specific cyanoethylation of pseudouridine, inosine, and 4-thiouridine by acrylonitrile. J Biol Chem 242:5034–5045. https://doi.org/10.1016/S0021-9258(18)99473-1
27. Marchand V, Pichot F, Neybecker P, Ayadi L, Bourguignon-Igel V, Wacheul L, Lafontaine DLJ, Pinzano A, Helm M, Motorin Y (2020) HydraPsiSeq: a method for systematic and quantitative mapping of pseudouridines in RNA. Nucleic Acids Res 48:e110. https://doi.org/10.1093/nar/gkaa769
28. Helm M, Schmidt-Dengler MC, Weber M, Motorin Y (2021) General principles for the detection of modified nucleotides in RNA by specific reagents. Adv Biol 5:2100866. https://doi.org/10.1002/adbi.202100866
29. Yoluç Y, Ammann G, Barraud P, Jora M, Limbach PA, Motorin Y, Marchand V, Tisné C, Borland K, Kellner S (2021) Instrumental analysis of RNA modifications. Crit Rev Biochem Mol Biol 56:178–204. https://doi.org/10.1080/10409238.2021.1887807
30. Thakur P, Estevez M, Lobue PA, Limbach PA, Addepalli B (2020) Improved RNA modification mapping of cellular non-coding RNAs using C- and U-specific RNases. Analyst 145:816–827. https://doi.org/10.1039/C9AN02111F
31. Vanhinsbergh CJ, Criscuolo A, Sutton JN, Murphy K, Williamson AJK, Cook K, Dickman MJ (2022) Characterization and sequence mapping of large RNA and mRNA therapeutics using mass spectrometry. Anal Chem 94:7339–7349. https://doi.org/10.1021/acs.analchem.2c00765

Part III sRNA Interactome

Chapter 16

Directed Screening for sRNA Targets in E. coli Using a Plasmid Library

Xing Luo and Nadim Majdalani

Abstract

A large number of bacterial small regulatory RNAs (sRNAs) modulate gene expression by base pairing to a target mRNA, affecting its translation or stability. This posttranscriptional regulation has been shown to be essential and critical for bacterial physiology. One of the challenges of studying sRNA signaling is identifying the sRNA regulators of specific genes. Here, we describe a protocol for making an sRNA expression library and using this library to screen for sRNA regulators of genes of interest in E. coli. This library can be easily expanded and adapted for use in other bacteria.

Key words sRNA, lacZ fusion, sRNA library screen

1 Introduction

The past two decades have witnessed an incredible number of discoveries and characterizations of regulatory small RNAs (sRNAs) that modulate gene expression at the posttranscriptional level in bacteria. While the emphasis had been on E. coli and Salmonella as model organisms, recent global searches and discoveries have been made in a variety of other bacteria. However, regardless of the organism and the source of the sRNA itself (stand-alone, 5′ UTR, 3′ UTR, or processed transcript), the role of each of these sRNAs has to be determined by more directed approaches to elucidate its physiological role and the target(s) it is regulating, particularly if regulation is via direct pairing. Global approaches can help narrow down the potential targets of a given small RNA [1–3], as do in silico predictions of pairing. However, the final measure of functionality remains a direct demonstration of pairing between the sRNA and its mRNA target coupled with some measure of the effect of the pairing. In E. coli, the most straightforward and powerful method to verify these interactions has been through the use of mRNA translational fusions to a reporter and the

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_16, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024


expression of a plasmid-borne sRNA. Expressing sRNAs from an inducible plasmid allows the study of toxic sRNAs and sRNAs that are poorly expressed from their native promoters. mRNA translational fusions to a colorimetric or fluorescent reporter make it convenient to follow the effect of the sRNAs on target expression. Moreover, this setup allows the easy generation of mutations in the sRNA and compensatory mutations in the mRNA to establish that they pair. Conversely, a potential mRNA target, monitored with the fusion, can be used in a screen with an sRNA library to determine which sRNAs regulate it. Furthermore, the sRNA library can be screened for a change in phenotype, independently of a reporter fusion [4]. With the constantly expanding repertoire of sRNAs cloned in our pBR-based expression vector and the streamlining of the search process, these searches have become practical and relatively simple to pursue.

A fluorescence-based approach using a multicopy plasmid reporter has been described by Urban and Vogel [5]. The approach we describe here is based on Mandin and Gottesman [6] and is a colorimetric approach that uses a chromosomal lacZ reporter gene. Expression of the chromosomal fusion is driven by the PBAD promoter, while expression of the plasmid-borne sRNAs is driven by the PLac promoter. This system can also be used in a heterologous fashion: a potential target from an organism other than E. coli is introduced into the chromosome of strain PM1805, and an sRNA library from that organism is cloned into the pBR-plac or pNM46 vector for screening (assuming that no RNA chaperone is required for pairing or that E. coli has it). Another variation would be to use a constitutive promoter instead of PBAD, which would ensure more uniform expression of the target in all cells; however, expression could then not be adjusted if the levels were too low or too high [7].

Validation of potential sRNAs isolated in this screen can be done via Northern blotting and/or Western blotting, the latter if an antibody against the native protein is available or if the protein can be tagged to allow detection.

2 Materials

2.1 Molecular Cloning of sRNA Genes

1. pBR-plac or pNM46 plasmid.
2. PCR primers.
3. PCR kit (DNA polymerase, reaction buffer, dNTPs, MgCl2).
4. Agarose gel electrophoresis system (agarose, 1× TAE buffer, 6× loading buffer, DNA ladder, electrophoresis tank).
5. DNA gel extraction kit.
6. Restriction enzymes: AatII, EcoRI, and reaction buffer.
7. T4 DNA ligase and 10× ligation reaction buffer.


8. DH5α competent cells.
9. LB (Lennox) + ampicillin (Amp) plates: 10 g tryptone, 5 g yeast extract, 5 g sodium chloride, 15 g agar in 1 L, autoclaved, with ampicillin (50 μg/mL).

2.2 mRNA Translational Fusion Construction

1. Bacterial strain PM1805 [MG1655, mini-λ::tet, malP::lacIQ, lacI::PBAD-cat-sacB-lacZ, araC+, ΔaraBAD].
2. LB (Lennox): 10 g tryptone, 5 g yeast extract, 5 g sodium chloride in 1 L of pure water, autoclaved.
3. LB (Lennox) plates: 10 g tryptone, 5 g yeast extract, 5 g sodium chloride, 15 g agar in 1 L, autoclaved.
4. LB (Lennox) + chloramphenicol (Cm; 10 μg/mL) plates.
5. M63 (2×): 6 g KH2PO4, 14 g K2HPO4, 0.49 g MgSO4·7H2O, 0.001 g FeSO4, 4 g (NH4)2SO4 in 1 L of pure water, autoclaved.
6. M63 + 0.2% glycerol + 5% sucrose agar plates.
7. SOC medium: 0.5% yeast extract, 2% tryptone, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4, 20 mM glucose.
8. One shaking water bath.
9. Refrigerated centrifuges (for both small and large volumes).
10. Spectrophotometer (UV/Vis range).
11. Centrifuge tubes (1.5 mL, 15 mL, and 50 mL).
12. Electroporator.
13. Electroporation cuvettes.
14. Culture flasks.
15. Primers.
16. PCR kit with high-fidelity DNA polymerase.
17. PCR purification kits (if possible).
18. Nanodrop or DeNovix (for UV range).

2.3 Library Screening

1. LB (Lennox).
2. TSS buffer: 5% DMSO, 50 mM MgSO4, 10% PEG-8000, dissolved in LB medium.
3. Permeabilization buffer: 100 mM Tris–HCl, pH 7.8, 32 mM Na2HPO4, 8 mM DTT, 8 mM EDTA, 4% Triton X-100 (see Note 1).
4. Polymyxin B: 20 mg/mL.
5. Multichannel pipettor.
6. 96-well microplates.
7. Microplate reader shaker.
8. Spectrophotometer.
9. LB (Lennox) + ampicillin (50 μg/mL) plates.
10. IPTG stock: 100 mM.
11. ONPG solution: 100 mL M63 salts, 4 mg/mL ONPG, 2 mM sodium citrate, 70 μL β-mercaptoethanol.
12. Arabinose 20% solution (dilute as needed).
13. Microplate replicator tool.
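The stock solutions listed above (100 mM IPTG, 20% arabinose) are later diluted to the working concentrations used in the β-gal assay medium (100 μM IPTG, 0.2–0.0002% arabinose). The following is a minimal sketch of that C1V1 = C2V2 arithmetic; the function name and the 10 mL batch size are illustrative assumptions, not part of the published protocol.

```python
def stock_volume_ul(stock_conc, final_conc, final_volume_ul):
    """Return the stock volume (in µL) needed for a dilution, from C1*V1 = C2*V2.

    stock_conc and final_conc must share a unit (both mM, both %, ...).
    """
    if stock_conc <= 0 or final_conc < 0 or final_volume_ul <= 0:
        raise ValueError("stock and volume must be positive, final_conc non-negative")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed the stock")
    return final_conc * final_volume_ul / stock_conc


if __name__ == "__main__":
    medium_ul = 10_000  # assumed batch size: 10 mL of assay medium

    # 100 mM IPTG stock diluted to 100 µM (= 0.1 mM)
    print(f"IPTG: {stock_volume_ul(100.0, 0.1, medium_ul):.1f} µL")

    # 20% arabinose stock diluted across the range given in the protocol
    for final_pct in (0.2, 0.02, 0.002, 0.0002):
        vol = stock_volume_ul(20.0, final_pct, medium_ul)
        print(f"arabinose {final_pct}%: {vol:.2f} µL of 20% stock")
```

For the lowest arabinose concentrations the computed volume falls below what can be pipetted reliably, which is consistent with the instruction to dilute the 20% stock as needed before use.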

3 Methods

3.1 Molecular Cloning of sRNA Genes

In our screening system, sRNAs are expressed from a multicopy plasmid under the control of the PLac promoter [6]. The plasmid we primarily use, pBR-plac, was constructed by inserting a PLacO-1 promoter into the pBR322 plasmid [8], and sRNAs are cloned into the region downstream of the promoter using the AatII and EcoRI restriction enzymes (Fig. 1a). This variant of the lac promoter was chosen because there are no operator sites downstream of the start of transcription, allowing sRNAs to be cloned without any additional 5′ sequences that might affect activity. Recently, we found it difficult to clone toxic sRNAs into this plasmid because endogenous levels of the lac repressor (LacI) are not high enough to completely repress the promoter, so the cells show leaky sRNA expression. To solve this problem, we tried two strategies to improve repression by raising the intracellular level of LacI. One strategy is to express LacI from the same plasmid. This is achieved by replacing the tetracycline resistance gene of the pBR-plac plasmid with the lacI gene under its native promoter (pNM46 in Fig. 1a). The new pBR-plac-lacI plasmid, named pNM46 [11], decreased the leaky expression of the sRNA significantly, but not completely (Fig. 1b, compare lane 3 to lane 2). The other strategy is to make a single C → T change at the -35 region of the promoter of the chromosomal lacI. This mutation, called the lacIQ mutation, causes a tenfold increase in LacI production [12]. This strategy worked very well: no leaky expression of DsrA was detected either from the pNM46-dsrA plasmid (Fig. 1b, lane 6) or from the pBR-plac-dsrA plasmid (Fig. 1c, lane 1), and the sRNA was successfully induced by the addition of IPTG (Fig. 1b, lane 7 and Fig. 1c, lanes 2 and 3). Thus, we highly recommend using the plasmid-borne sRNAs in a lacIQ strain to prevent leaky expression.
We have also tried making a derivative of pBR-plac expressing lacIQ, pNM56, but this plasmid turned out to be uninducible (Fig. 1d). In this section, we describe how to design and clone sRNA genes into the pBR-plac plasmid using restriction enzymes.

1. Design primers to amplify the desired sRNA genes, using AatII (upstream) and EcoRI (downstream) as the flanking restriction sites. The forward primer should contain the AatII site (GACGT^C) followed by the first 25–30 nt of the 5′ end sequence of the sRNA. The reverse primer should contain the EcoRI site (G^AATTC) followed by the last 25–30 nt of the 3′ end sequence (reverse complemented) of the sRNA (see Notes 2 and 3).

Fig. 1 sRNA cloning plasmids and the expression of the sRNA DsrA under different conditions. (a) Schematic maps of plasmids pBR322, pBR-plac, and pNM46. ampR: ampicillin resistance gene, tetR: tetracycline resistance gene, bom: basis of mobility, ori: origin of replication, rop: repressor of primer, involved in the control of copy number. (b) DsrA expression in different plasmids. Strains with a ΔlacI (DJ480 [9]) or lacIQ (DJ624 [10]) background containing the indicated plasmids were cultured in LB + Amp (100 μg/mL) medium at 37 °C with agitation. When OD600 reached 0.4, the indicated amounts of IPTG were added to the culture and samples were collected for RNA extraction after a 10-min induction. DsrA and SsrA (loading control) were probed by Northern blotting. (c) DsrA induction from the pBR-plac plasmid in a lacIQ strain (DJ624). DJ624 was transformed with the pBR-plac-dsrA plasmid and cultured in LB + Amp (100 μg/mL) medium at 37 °C with agitation. Different concentrations of IPTG (0 μM, 100 μM, or 1000 μM) were added for sRNA induction. Samples were collected and analyzed as described in (b). (d) DsrA induction from the pNM56 plasmid in a ΔlacI strain (DJ480). DJ480 was transformed with the pNM56-dsrA or pBR-plac-dsrA plasmid and cultured in LB + Amp (100 μg/mL) medium at 37 °C with agitation. Samples were collected and analyzed as described in (b)


Fig. 2 Steps for cloning sRNAs into the pBR-plac plasmid. The sRNA fragment and the plasmid were digested with restriction enzymes AatII and EcoRI to generate the same sticky ends. Digested sRNA and plasmid were mixed and joined by the DNA ligase

2. Prepare the insert DNA (Fig. 2, Step 1). Amplify the sRNA gene by PCR, using genomic DNA as the template, and purify the PCR product.
3. Digest 1 μg of pBR-plac plasmid and the PCR product with AatII and EcoRI for 1.5 h at 37 °C (Fig. 2, Step 2).
4. Run the samples on an agarose gel. Cut out the DNA bands of the correct size and extract the DNA from the gel using a gel extraction kit.
5. Ligate the linearized plasmid and the sRNA DNA in a 10 μL reaction (1 μL plasmid DNA, 3 μL sRNA DNA, 1 μL 10× ligase buffer, 1 μL T4 DNA ligase, and 4 μL H2O) (Fig. 2, Step 3). Incubate at 16 °C for 2 h or at 4 °C overnight (follow the manufacturer's instructions) (see Note 4).
6. Use the ligation mix to transform DH5α competent cells (2 μL of mix with 50 μL of competent cells) and select on LB + Amp (50 μg/mL) plates.
7. Confirm the insertion in the plasmid by colony PCR (see Note 5).
8. Purify the plasmid (mini-prep) and sequence the insert region. After confirmation, store the plasmid at -20 °C for future use.
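The primer design rule in step 1 (restriction site followed by 25–30 nt of sRNA sequence, reverse complemented for the reverse primer) can be sketched as follows. This is an illustrative helper, not the authors' tool: the function names are hypothetical, and the 6-nt 5′ extensions (GGCCAA and GCAGCA) are simply those that precede the restriction sites in the DsrA example primers of Note 3, used here as default padding on the assumption that they serve to help the enzymes cut near the end of the PCR product.

```python
AATII_SITE = "GACGTC"  # AatII recognition site (GACGT^C)
ECORI_SITE = "GAATTC"  # EcoRI recognition site (G^AATTC)

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_cloning_primers(srna_dna, anneal_len=25,
                           fwd_pad="GGCCAA", rev_pad="GCAGCA"):
    """Build forward/reverse primers for cloning an sRNA gene into pBR-plac.

    srna_dna   : sRNA gene as DNA (5'->3', from the TSS to the 3' end)
    anneal_len : number of sRNA nucleotides included in each primer (25-30)
    fwd_pad, rev_pad : 5' extensions (assumed, copied from the Note 3 primers)
    """
    if len(srna_dna) < anneal_len:
        raise ValueError("sRNA sequence is shorter than the annealing length")
    forward = fwd_pad + AATII_SITE + srna_dna[:anneal_len]
    reverse = rev_pad + ECORI_SITE + reverse_complement(srna_dna[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    toy_srna = "ACGT" * 10  # placeholder sequence, not a real sRNA
    fwd, rev = design_cloning_primers(toy_srna)
    print("forward:", fwd)
    print("reverse:", rev)
```

Because the sRNA sequence starts immediately after the AatII site, the cloned transcript carries no extra 5′ nucleotides, which is the property of pBR-plac emphasized above.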


3.2 mRNA Translational Fusion Construction

3.2.1 Insertion DNA Preparation


Typically in E. coli, a great number of negatively regulating sRNAs pair with their target at or near the ribosome binding site. As a result, the design of the mRNA target sequence has often been restricted to a region extending from 80 nt upstream of the start codon to the tenth codon of the ORF. However, the positive regulation of RpoS occurs far upstream of the start codon, and recent reports indicate that pairing can occur deep within an ORF. Therefore, the design of the target mRNA should take into consideration a larger part of both the UTR and the ORF.

The mRNA translational fusions are constructed by inserting the candidate target sequence into the E. coli chromosome, fused in frame to the lacZ ORF. This is achieved by replacing the counterselectable cat-sacB marker of the PBAD-cat-sacB-lacZ cassette via lambda Red recombineering. The parental strain, containing the cassette, is sucrose sensitive and chloramphenicol resistant. A properly recombineered strain becomes able to grow on 5% sucrose and chloramphenicol sensitive (CmS). Thus, the recipient strain needs to contain both mini-λ::tet (to express the lambda recombination proteins) and the PBAD-cat-sacB-lacZ cassette, both of which are found in PM1805 [13]. Ideally, the strain should also carry the constitutively expressed araE arabinose transporter to ensure uniform induction of all cells. In this section, we describe how to design and construct the mRNA translational fusion using PM1805.

1. Primer design: the primers to amplify your mRNA of interest should contain 40 nt of homology to the PBAD promoter (forward) or the lacZ ORF (reverse), plus 20 nt or more of sequence from your gene, as shown for the GOI (gene of interest)-for/rev primers in Fig. 3a.
2. PCR-amplify your insertion DNA using a high-fidelity DNA polymerase, check the size of the PCR product on an agarose gel, and then purify the product, preferably using a PCR purification kit.
3. Measure the DNA concentration (A260) and purity (A260/A280 ratio) using a Nanodrop, DeNovix, or any UV-capable spectrophotometer.
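The target window and primer layout described in this section can be sketched in a few lines. The sketch below makes several assumptions: the function names are hypothetical, the homology arms are supplied by the user as top-strand sequences (the actual 40-nt PBAD and lacZ arms are not reproduced here), and the default window is the classic -80 nt to tenth-codon region, which should be widened when pairing is expected elsewhere.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def target_window(region, start_codon_index, upstream_nt=80, codons=10):
    """Slice the fusion insert: upstream_nt before the start codon through
    codon number `codons` (the start codon counts as codon 1).

    Ending on a complete codon keeps the insert in frame with lacZ,
    provided the lacZ homology arm itself begins on a codon boundary.
    """
    begin = start_codon_index - upstream_nt
    end = start_codon_index + 3 * codons
    if begin < 0 or end > len(region):
        raise ValueError("requested window extends beyond the supplied region")
    return region[begin:end]


def recombineering_primers(insert, pbad_arm, lacz_arm, anneal_len=20):
    """Return (forward, reverse) primers: 40-nt homology arm + gene-specific part.

    pbad_arm : top-strand homology immediately upstream of the insertion point
    lacz_arm : top-strand homology immediately downstream (start of lacZ ORF)
    """
    if len(insert) < anneal_len:
        raise ValueError("insert is shorter than the annealing length")
    forward = pbad_arm + insert[:anneal_len]
    # The reverse primer is the reverse complement of (insert 3' end + lacZ arm).
    reverse = reverse_complement(insert[-anneal_len:] + lacz_arm)
    return forward, reverse


if __name__ == "__main__":
    # Placeholder sequences only; real arms and gene sequence must be supplied.
    region = "A" * 80 + "ATG" + "C" * 27 + "G" * 10
    insert = target_window(region, start_codon_index=80)
    fwd, rev = recombineering_primers(insert, "N" * 40, "N" * 40)
    print(len(insert), fwd, rev, sep="\n")
```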

3.2.2 Recombineering (Fig. 3b)

1. Grow PM1805 on an LB plate or in liquid LB overnight at 32 °C.
2. From a single colony, or a liquid culture diluted 1/1000, start a culture at 32 °C with shaking (see Note 6).
3. At an OD600 between 0.6 and 0.8, transfer the culture flask to the prewarmed 42 °C shaking water bath (an air shaker will not work) to induce the lambda-Red genes (see Note 7).
4. After incubating for 15 min at 42 °C (see Note 8), move the flask quickly into a bucket of ice-water slurry and swirl it continuously for about 2 min (see Note 9).


Fig. 3 Protocol for constructing the mRNA translational lacZ fusion. (a) Constructing the mRNA translational fusion by replacing the cat-sacB marker of the PBAD-cat-sacB-lacZ cassette. The gene of interest was amplified with primers containing 40 nt of sequence homologous to the regions flanking the cat-sacB genes and used for homologous recombination. (b) Steps of lambda Red recombineering. The mini-λ::tet cassette was induced at 42 °C to express the lambda recombination proteins. The homology-containing insertion DNA fragment was introduced into competent cells by electroporation and inserted in the chromosome via homologous recombination. Sucrose-resistant and chloramphenicol-sensitive strains were then selected and purified


5. Keep the flask in the slurry, swirling it for 30 s at a time, for a total of about 10 min (see Note 10).
6. Collect the cells by centrifuging at 4 °C for 10 min at 2700 × g (see Note 11).
7. Carefully remove the LB by gently decanting or by aspiration.
8. Resuspend the pellet in the same volume of prechilled pure water.
9. Collect the cells by centrifuging at 4 °C for 10 min at 2700 × g (the first wash).
10. Carefully remove the supernatant by gently decanting or by aspiration.
11. Resuspend the pellet in 1.5 mL of prechilled pure water and transfer the suspension to a prechilled 1.5 mL microcentrifuge tube.
12. Centrifuge at 4 °C for 2 min at 20,000 × g (the second wash).
13. Remove the supernatant.
14. Repeat steps 11 and 12 for a total of four washes.
15. Resuspend the pellet in 50 μL of ice-cold water for each 5 mL of starting culture (100-fold concentration).
16. Transfer the cells to prechilled electroporation cuvettes, 50 μL per cuvette.
17. Add 100 ng of insertion DNA to one cuvette. For the control cuvette, add the same volume of pure water instead.
18. Electroporate at 1.8 kV for 0.1 cm gap cuvettes or 2.5 kV for 0.2 cm gap cuvettes.
19. Add 950 μL of LB or, preferably, SOC medium to the cuvettes and mix by pipetting up and down.
20. Recover the cells in the cuvette at 37 °C for at least 1 h.
21. Keep the cuvette on the bench (22–25 °C) overnight.
22. Spread 10 and 200 μL of cells on M63-glycerol-sucrose plates and incubate at 37 °C for 1–3 days until colonies appear (see Note 12).
23. Patch colonies on M63-glycerol-sucrose and LB + Cm plates and incubate at 37 °C.
24. Pick 4–6 colonies that grew on M63-glycerol-sucrose but not on LB + Cm and purify them once on M63-glycerol-sucrose and twice on LB plates.
25. After purification, check the insertion by colony PCR using primers lacI (GAAGGCGAAGCGGCATGCATT) and deeplac (CCGTAATGGGATAGGTCACG).
26. Purify the PCR products and sequence them.
27. Strains with the correct insertion can be saved for sRNA library screening.
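The quantities in the recombineering steps scale with the number of constructs, and the final selection is a simple two-plate test. As a rough planning aid, the sketch below encodes the numbers given in the protocol (5 mL of culture per electroporation per Note 6, 50 μL of water per 5 mL of culture, 100 ng of DNA per cuvette, one no-DNA control) together with the sucrose-resistant, Cm-sensitive selection rule. All function names and the example data are hypothetical.

```python
CULTURE_ML_PER_CUVETTE = 5   # Note 6: 5 mL of culture per electroporation
WATER_UL_PER_CUVETTE = 50    # step 15: 50 µL of water per 5 mL of culture
DNA_NG_PER_CUVETTE = 100     # step 17: 100 ng of insertion DNA


def plan_electroporations(n_constructs):
    """Scale culture and resuspension volumes, adding one no-DNA control."""
    if n_constructs < 1:
        raise ValueError("need at least one construct")
    n_cuvettes = n_constructs + 1
    return {
        "cuvettes": n_cuvettes,
        "culture_ml": CULTURE_ML_PER_CUVETTE * n_cuvettes,
        "final_water_ul": WATER_UL_PER_CUVETTE * n_cuvettes,
    }


def dna_volume_ul(conc_ng_per_ul, mass_ng=DNA_NG_PER_CUVETTE):
    """Volume of purified PCR product that contains mass_ng of DNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("DNA concentration must be positive")
    return mass_ng / conc_ng_per_ul


def recombinant_candidates(patches):
    """Keep colonies that grow on sucrose but not on chloramphenicol.

    patches maps colony name -> (grows_on_sucrose, grows_on_cm).
    """
    return [name for name, (sucrose, cm) in patches.items() if sucrose and not cm]


if __name__ == "__main__":
    print(plan_electroporations(3))
    print(dna_volume_ul(50.0), "µL of a 50 ng/µL PCR product")
    patches = {"c1": (True, False), "c2": (True, True), "c3": (True, False)}
    print(recombinant_candidates(patches))
```

Colonies that grow on both plates most likely carry a mutation in sacB rather than a replacement of the cassette, which is why the Cm patch is scored alongside the sucrose plate.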


3.3 Library Screening

3.3.1 TSS Transformation (Fig. 4)

Thus far, we have cloned 26 E. coli sRNAs (Table 1) in our plasmid library. To identify the sRNA regulators of your gene of interest, the translational fusion strain is transformed with each of the individual plasmids and the β-galactosidase activity is measured under sRNA overexpression conditions. β-Galactosidase activity can be evaluated qualitatively, by a change in colony color (red/white on MacConkey agar plates or blue/white on X-gal-containing agar plates), or quantitatively and more accurately, by a liquid β-gal assay as described by Miller [39]. In this section, we describe a modified β-gal assay protocol using 96-well microplates.

1. Grow the translational fusion strain on an LB plate or in liquid LB overnight at 37 °C.
2. From a single colony, or a liquid culture diluted 1/100, start a culture in 30 mL of LB at 37 °C with shaking (see Note 13).
3. When OD600 reaches 0.5, collect the cells by centrifuging at 4 °C for 10 min at 2700 × g.
4. Carefully remove the supernatant by gently decanting or by aspiration.
5. Resuspend the pellet in 3 mL of cold TSS buffer and keep on ice.
6. Prepare a 96-well microplate and dispense 1 μL of plasmid at 10 ng/μL into each well.
7. Add 100 μL of the TSS-resuspended cells to each well.
8. Incubate the microplate on ice for 30 min.
9. Incubate the microplate at 37 °C for 1 h.
10. Using a multichannel pipettor, spot 4 μL of transformed cells on an LB + Amp (50 μg/mL) agar plate, following the same pattern as the microplate.
11. Incubate the agar plate at 37 °C overnight.
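Because the agar plate is spotted in the same pattern as the microplate, it helps to record which plasmid sits in which well before starting. The following sketch assigns plasmids to wells in row-major order and scales the culture and TSS volumes from the numbers above (100 μL of cells per well; 30 mL of culture concentrated into 3 mL of TSS). The row-major order, the function names, and the placeholder plasmid names are assumptions for illustration only.

```python
import string

ROWS = string.ascii_uppercase[:8]   # A-H
COLUMNS = range(1, 13)              # 1-12


def assign_wells(plasmids):
    """Map 96-well positions (A1, A2, ..., H12) to plasmid names, row by row."""
    wells = [f"{row}{col}" for row in ROWS for col in COLUMNS]
    if len(plasmids) > len(wells):
        raise ValueError("more plasmids than wells on one 96-well plate")
    return dict(zip(wells, plasmids))


def tss_requirements(n_wells, cells_ul_per_well=100, concentration_factor=10):
    """Return (TSS volume in µL, culture volume in mL) for n_wells.

    The culture is concentrated tenfold into TSS buffer (30 mL -> 3 mL).
    """
    tss_ul = n_wells * cells_ul_per_well
    culture_ml = tss_ul * concentration_factor / 1000
    return tss_ul, culture_ml


if __name__ == "__main__":
    # Placeholder names: 26 library plasmids plus the empty vector control.
    plasmids = ["vector"] + [f"sRNA-{i:02d}" for i in range(1, 27)]
    layout = assign_wells(plasmids)
    for well, name in layout.items():
        print(well, name)
    print(tss_requirements(len(plasmids)))
```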

3.3.2 β-Gal Assay Using 96-Well Microplates (Fig. 4)

To obtain high reproducibility, eight parallel repeats are recommended for each strain. For 27 strains (26 sRNA plasmids and 1 empty plasmid control), we usually use four microplates in total (two replicas per microplate).

1. Add 100 μL of LB + Amp (100 μg/mL) + IPTG (100 μM) + arabinose (0.2–0.0002%) (see Note 14) to the first six rows of the microplates.
2. Using the microplate replicator tool, inoculate one colony of freshly transformed cells into each well, with two replicas per microplate. Make sure to leave at least one well uninoculated as a control.
3. Incubate the microplates at 37 °C with agitation for 6–8 h.


Table 1 sRNA library of the lab collection

RNA name | Alternate names | Left end | Right end | Flanking gene upstream | Flanking gene downstream | Strand | Size (nt) | Reference
ArcZ | PsrA16, SraH, RyhA | 3,350,577 | 3,350,697 | elbB | arcB | | |
ChiX | SroB, RybC, MicM | 507,204 | 507,287 | | | | |
CyaR | RyeE | | | | | | |
 | | | | | | | 53 | [16]
DsrA | | 2,025,223 | 2,025,313 | yodD | yedP | | 91 | [17]
GadY | IS183 | 3,664,864 | 3,664,969 | mdtF | gadA | > < < | 106 | [18]
GcvB | PsrA11, IS145 | 2,942,696 | 2,942,901 | gcvA | ygdI | | |
 | | | | | | | 106 | [27]
RseX | IS096 | 2,033,649 | 2,033,739 | yedR | yedS | | 91 | [28]
RybB | p25 | 887,979 | | rcdA | ybjL | | |
 | | | | | | | 65 | [30]
RyhB | PsrA18, IS176, SraI | 3,580,922 | 3,581,016 | yhhX | yhhY | | |
 | | | | | | | 227 | [33]
 | | 4,049,899 | 4,050,009 | polA | yihA | | |
 | | | | | | | 122 | [35]
RyeG^a | IS118 | 2,470,472 | 2,470,665 | yfdL | tfaS | | 194 | [36]

^a RyeG is likely to be a bifunctional RNA as it also contains a small ORF encoding the toxic protein YodE [37, 38]

4. Measure the OD600 of each well in a plate reader and export the results.
5. Add 50 μL of permeabilization buffer containing 200 μg/mL polymyxin B to each well and incubate at room temperature for 15 min (see Note 15).
6. Set the microplate reader to 28 °C and program it to read OD420 kinetically for 2 h at 1-min intervals.
7. Add 50 μL of ONPG solution to each well.
8. Place the plate in the reader and start the kinetic run.
9. At the end of the run, calculate the Vmax of all the samples using 30 points in the exponential part of the curve.
10. Export the results.
11. Data analysis: for each well, divide the Vmax by the corresponding OD600 measurement. This is the specific activity.
12. Set the specific activity of the vector control to 1 and express all the results relative to that value to obtain a bar graph similar to the one in Fig. 5.
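The data analysis in steps 9–12 can be sketched as follows. Plate reader software normally computes Vmax itself; here, as an assumption, Vmax is taken as the steepest least-squares slope over any 30 consecutive OD420 reads. The function names, the "vector" label, and the twofold classification threshold (borrowed from the highlighting used in Fig. 5) are illustrative choices, not the authors' implementation.

```python
from statistics import mean


def _slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def vmax(times_min, od420, window=30):
    """Steepest slope (OD420 per min) over any `window` consecutive reads."""
    if len(times_min) != len(od420):
        raise ValueError("times and OD420 reads must have the same length")
    if len(times_min) < window:
        raise ValueError("not enough reads for the requested window")
    return max(
        _slope(times_min[i:i + window], od420[i:i + window])
        for i in range(len(times_min) - window + 1)
    )


def specific_activity(vmax_value, od600):
    """Step 11: Vmax divided by the OD600 of the same well."""
    return vmax_value / od600


def relative_to_vector(activities, vector="vector"):
    """Step 12: average replicate wells and normalize to the vector control.

    activities maps strain name -> list of specific activities (replicates).
    """
    reference = mean(activities[vector])
    return {strain: mean(values) / reference for strain, values in activities.items()}


def classify(fold, threshold=2.0):
    """Label a fold change using a twofold cutoff, as highlighted in Fig. 5."""
    if fold > threshold:
        return "up"
    if fold < 1 / threshold:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # Synthetic example data, not measurements.
    times = list(range(121))                    # 2 h at 1-min intervals
    reads = [0.05 + 0.002 * t for t in times]   # perfectly linear toy curve
    print(specific_activity(vmax(times, reads), od600=0.5))
    folds = relative_to_vector({"vector": [0.004, 0.004], "sRNA-X": [0.010, 0.011]})
    print({strain: classify(fold) for strain, fold in folds.items()})
```

Averaging the eight replicate wells before normalization keeps a single aberrant well from dominating the fold change; reporting the spread alongside the mean is advisable when preparing the bar graph.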

4 Notes

1. Add DTT at the last minute.
2. For E. coli genes, the TSS and 3′ ends are available in the dRNA-Seq [40] and Term-Seq [11] datasets. Otherwise, the 5′ and 3′ ends of sRNAs can be determined by RACE mapping.
3. The primers for amplifying the DsrA sRNA are shown here as a reference: AatII-DsrA-for: GGCCAAGACGTCAACACATCAGATTTCCTGGTGTAACG. EcoRI-DsrA-rev: GCAGCAGAATTCAAAAAAATCCCGACCCTGAGGG. In each primer, the restriction site (GACGTC for AatII, GAATTC for EcoRI) is followed directly by the sRNA sequence, starting at the TSS nucleotide (forward primer) or the 3′ end nucleotide (reverse primer).


Fig. 4 Protocol for screening sRNA regulators with the sRNA library. Individual plasmids were introduced into the translational lacZ fusion strain via TSS transformation. The β-galactosidase activity was then measured in microplates under sRNA overexpression conditions


Fig. 5 The application of this sRNA library in screening sRNA regulators of RpoS. The results were adapted from the work of Mandin in [6]. sRNAs that increased rpoS-lacZ expression by more than twofold are highlighted in red while those that caused a more than twofold decrease are highlighted in green

4. Include negative controls: plasmid only, insert only, and no ligase.
5. If no colonies appear, it may indicate that the sRNA is toxic. Try repeating the transformation with DH5α lacIQ competent cells such as NEB 5α F′Iq (without IPTG).
6. The rule of thumb is 5 mL of culture per electroporation/construct; include one electroporation with no DNA as a control.
7. A 42 °C shaking water bath is essential to induce the expression of the lambda-Red genes. Heat transfer from water to the flask is more efficient than in an air shaker.
8. Use longer incubation times if the volume is larger than 30 mL.
9. This step rapidly cools the culture, because the lambda proteins are very labile at high temperatures.
10. The cells can be kept in the slurry for up to 40 min without any problem.
11. The cells should be kept cold until after the electroporation step. Thus, use refrigerated centrifuges and prechilled solutions, tubes, and cuvettes.
12. There should be many more colonies in the samples with DNA than in the control without DNA.
13. This volume is sufficient for the transformation of 30 plasmids.
14. The arabinose concentration is adjusted based on the basal activity of the translational fusion.
15. Dilute the polymyxin B stock 1:100 in the permeabilization buffer fresh before use.


Acknowledgments

We thank Dr. Susan Gottesman for discussions and critical comments on this chapter. This work was supported by funding from the Intramural Research Program, National Institutes of Health, National Cancer Institute, Center for Cancer Research.

References

1. Melamed S, Adams PP, Zhang A, Zhang H, Storz G (2020) RNA-RNA interactomes of ProQ and Hfq reveal overlapping and competing roles. Mol Cell 77(2):411–425.e417. https://doi.org/10.1016/j.molcel.2019.10.022
2. Melamed S, Faigenbaum-Romm R, Peer A, Reiss N, Shechter O, Bar A, Altuvia Y, Argaman L, Margalit H (2018) Mapping the small RNA interactome in bacteria using RIL-seq. Nat Protoc 13(1):1–33. https://doi.org/10.1038/nprot.2017.115
3. Han K, Tjaden B, Lory S (2016) GRIL-seq provides a method for identifying direct targets of bacterial small regulatory RNA by in vivo proximity ligation. Nat Microbiol 2(3):16239. https://doi.org/10.1038/nmicrobiol.2016.239
4. De Lay N, Gottesman S (2012) A complex network of small non-coding RNAs regulate motility in Escherichia coli. Mol Microbiol 86(3):524–538. https://doi.org/10.1111/j.1365-2958.2012.08209.x
5. Urban JH, Vogel J (2007) Translational control and target recognition by Escherichia coli small RNAs in vivo. Nucleic Acids Res 35(3):1018–1037
6. Mandin P, Gottesman S (2010) Integrating anaerobic/aerobic sensing and the general stress response through the ArcZ small RNA. EMBO J 29(18):3094–3107
7. Chen J, To L, de Mets F, Luo X, Majdalani N, Tai CH, Gottesman S (2021) A fluorescence-based genetic screen reveals diverse mechanisms silencing small RNA signaling in E. coli. Proc Natl Acad Sci U S A 118(27):e2106964118. https://doi.org/10.1073/pnas.2106964118
8. Guillier M, Gottesman S (2006) Remodelling of the Escherichia coli outer membrane by two small regulatory RNAs. Mol Microbiol 59(1):231–247
9. Cabrera JE, Jin DJ (2001) Growth phase and growth rate regulation of the rapA gene, encoding the RNA polymerase-associated protein RapA in Escherichia coli. J Bacteriol 183(20):6126–6134. https://doi.org/10.1128/jb.183.20.6126-6134.2001
10. De Lay N, Gottesman S (2009) The Crp-activated small noncoding regulatory RNA CyaR (RyeE) links nutritional status to group behavior. J Bacteriol 191(2):461–476. https://doi.org/10.1128/jb.01157-08
11. Adams PP, Baniulyte G, Esnault C, Chegireddy K, Singh N, Monge M, Dale RK, Storz G, Wade JT (2021) Regulatory roles of Escherichia coli 5′ UTR and ORF-internal RNAs detected by 3′ end mapping. eLife 10:e62438. https://doi.org/10.7554/eLife.62438
12. Müller-Hill B, Crapo L, Gilbert W (1968) Mutants that make more lac repressor. Proc Natl Acad Sci U S A 59(4):1259–1264
13. Lee H-J, Gottesman S (2016) sRNA roles in regulating transcriptional regulators: Lrp and SoxS regulation by sRNAs. Nucleic Acids Res 44(14):6907–6923. https://doi.org/10.1093/nar/gkw358
14. Papenfort K, Said N, Welsink T, Lucchini S, Hinton JC, Vogel J (2009) Specific and pleiotropic patterns of mRNA regulation by ArcZ, a conserved, Hfq-dependent small RNA. Mol Microbiol 74(1):139–158
15. Mandin P, Gottesman S (2009) A genetic approach for finding small RNAs regulators of genes of interest identifies RybC as regulating the DpiA/DpiB two-component system. Mol Microbiol 72(3):551–565
16. Bouché F, Bouché JP (1989) Genetic evidence that DicF, a second division inhibitor encoded by the Escherichia coli dicB operon, is probably RNA. Mol Microbiol 3(7):991–994
17. Sledjeski DD, Gupta A, Gottesman S (1996) The small RNA, DsrA, is essential for the low temperature expression of RpoS during exponential growth in Escherichia coli. EMBO J 15(15):3993–4000
18. Opdyke JA, Kang J-G, Storz G (2004) GadY, a small-RNA regulator of acid response genes in Escherichia coli. J Bacteriol 186(20):6698–6705
19. Urbanowski ML, Stauffer LT, Stauffer GV (2000) The gcvB gene encodes a small untranslated RNA involved in expression of the dipeptide and oligopeptide transport systems in Escherichia coli. Mol Microbiol 37(4):856–868
20. Urban JH, Vogel J (2008) Two seemingly homologous noncoding RNAs act hierarchically to activate glmS mRNA translation. PLoS Biol 6(3):e64
21. Moon K, Gottesman S (2009) A PhoQ/P-regulated small RNA regulates sensitivity of Escherichia coli to antimicrobial peptides. Mol Microbiol 74(6):1314–1330
22. Udekwu KI, Darfeuille F, Vogel J, Reimegård J, Holmqvist E, Wagner EGH (2005) Hfq-dependent regulation of OmpA synthesis is mediated by an antisense RNA. Genes Dev 19(19):2355–2366
23. Coornaert A, Lu A, Mandin P, Springer M, Gottesman S, Guillier M (2010) MicA sRNA links the PhoP regulon to cell envelope stress. Mol Microbiol 76(2):467–479
24. Chen S, Zhang A, Blyn LB, Storz G (2004) MicC, a second small-RNA regulator of Omp protein expression in Escherichia coli. J Bacteriol 186(20):6689–6697
25. Mizuno T, Chou M-Y, Inouye M (1984) A unique mechanism regulating gene expression: translational inhibition by a complementary RNA transcript (micRNA). Proc Natl Acad Sci U S A 81(7):1966–1970
26. Altuvia S, Weinstein-Fischer D, Zhang A, Postow L, Storz G (1997) A small, stable RNA induced by oxidative stress: role as a pleiotropic regulator and antimutator. Cell 90(1):43–53
27. Majdalani N, Chen S, Murrow J, St John K, Gottesman S (2001) Regulation of RpoS by a novel small RNA: the characterization of RprA. Mol Microbiol 39(5):1382–1394
28. Douchin V, Bohn C, Bouloc P (2006) Down-regulation of porins by a small RNA bypasses the essentiality of the regulated intramembrane proteolysis protease RseP in Escherichia coli. J Biol Chem 281(18):12253–12259
29. Thompson KM, Rhodius VA, Gottesman S (2007) σE regulates and is regulated by a small RNA in Escherichia coli. J Bacteriol 189(11):4243–4256
30. Antal M, Bordeau V, Douchin V, Felden B (2005) A small bacterial RNA regulates a putative ABC transporter. J Biol Chem 280(9):7901–7908
31. Massé E, Gottesman S (2002) A small RNA regulates the expression of genes involved in iron metabolism in Escherichia coli. Proc Natl Acad Sci U S A 99(7):4620–4625
32. Vogel J, Bartels V, Tang TH, Churakov G, Slagter-Jäger JG, Hüttenhofer A, Wagner EGH (2003) RNomics in Escherichia coli detects new sRNA species and indicates parallel transcriptional output in bacteria. Nucleic Acids Res 31(22):6435–6443
33. Vanderpool CK, Gottesman S (2004) Involvement of a novel transcriptional activator and small RNA in post-transcriptional regulation of the glucose phosphoenolpyruvate phosphotransferase system. Mol Microbiol 54(4):1076–1089
34. Møller T, Franch T, Udesen C, Gerdes K, Valentin-Hansen P (2002) Spot 42 RNA mediates discoordinate expression of the E. coli galactose operon. Genes Dev 16(13):1696–1706
35. Durand S, Storz G (2010) Reprogramming of anaerobic metabolism by the FnrS small RNA. Mol Microbiol 75(5):1215–1231
36. Zhang A, Wassarman KM, Rosenow C, Tjaden BC, Storz G, Gottesman S (2003) Global analysis of small RNA and mRNA targets of Hfq. Mol Microbiol 50(4):1111–1124
37. Weaver J, Mohammad F, Buskirk AR, Storz G (2019) Identifying small proteins by ribosome profiling with stalled initiation complexes. mBio 10(2):e02819-18. https://doi.org/10.1128/mBio.02819-18
38. Hör J, Di Giorgio S, Gerovac M, Venturini E, Förstner KU, Vogel J (2020) Grad-seq shines light on unrecognized RNA and protein complexes in the model bacterium Escherichia coli. Nucleic Acids Res 48(16):9301–9319. https://doi.org/10.1093/nar/gkaa676
39. Miller J (1992) A laboratory manual and handbook for Escherichia coli and related bacteria. A short course in bacterial genetics. Cold Spring Harbor Laboratory Press, Plainview
40. Thomason MK, Bischler T, Eisenbart SK, Förstner KU, Zhang A, Herbig A, Nieselt K, Sharma CM, Storz G (2015) Global transcriptional start site mapping using differential RNA sequencing reveals novel antisense RNAs in Escherichia coli. J Bacteriol 197(1):18–28. https://doi.org/10.1128/jb.02096-14

Chapter 17 Defining Bacterial RNA-RNA Interactomes Using CLASH

Sofia Esteban-Serna, Liang-Cui Chu, Mehak Chauhan, Pujitha Raja, and Sander Granneman

Abstract Methicillin-resistant Staphylococcus aureus (MRSA) is a bacterial pathogen accounting for high mortality rates among infected patients. Transcriptomic regulation by small RNAs (sRNAs) has been shown to regulate networks promoting antibiotic resistance and virulence in S. aureus. Yet, the biological role of most sRNAs during MRSA host infection remains unknown. To fill this gap, in collaboration with the lab of Jai Tree, we performed comprehensive RNA-RNA interactome analyses in MRSA using CLASH under conditions that mimic the host environment. Here we present a detailed version of this optimized CLASH (cross-linking, ligation, and sequencing of hybrids) protocol we recently developed, which has been tailored to explore the RNA interactome in S. aureus as well as other Gram-positive bacteria. Alongside, we introduce a compilation of helpful Python functions for analyzing folding energies of putative RNA-RNA interactions and streamlining sRNA and mRNA seed discovery in CLASH data. In the accompanying computational demonstration, we aim to establish a standardized strategy to evaluate the likelihood that observed chimeras arise from true RNA-RNA interactions.

Key words RNA-RNA interaction, UV cross-linking, Quality control analyses

1 Introduction

RNAs spend most of their molecular lifespan in temporary association with RNA-binding proteins (RBPs) that determine their cellular location and biological role. When exposed to short-wavelength UV radiation, RBPs can become permanently bound to their interacting target transcripts by the formation of a covalent bond. This process is referred to as UV cross-linking and has served as the theoretical basis for many of the available procedures to study RNA–protein interactions. Techniques such as CLIP (cross-linking and immunoprecipitation) immunopurify covalently bound ribonucleoproteins (RNPs) and subsequently sequence the RNA species that were cross-linked to the RBP of interest. Resulting RNA-binding footprints have shed light upon the biological function of numerous RBPs.

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_17, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024


RNA-RNA interactions lie at the heart of many transcriptional and posttranscriptional regulatory processes. For instance, in bacteria, small RNAs (sRNAs) base-pair to a set of target mRNAs thereby reprogramming the transcriptome in response to metabolic adaptation, environmental stresses, or infection processes [1]. Unsurprisingly, RNA-RNA interactions are often chaperoned by RBPs. Since UV exposure will result in cross-linking of both RNA species to the RBP hosting their interaction, these proteins can be used as baits to map interacting RNAs in the transcriptome. This is the strategy followed by the cross-linking, ligation, and sequencing of hybrids (CLASH) procedure to capture RNA-RNA interactions (Fig. 1) [2]. Following tandem affinity purification of adducted RNPs, bound transcripts undergo partial digestion and an intermolecular ligation step. Upon protein removal by proteinase K and reverse transcription of linked RNAs, a fraction (1–5%) of the resulting cDNA library comprises chimeric cDNAs composed of sequences encoding RNAs which are likely to interact in vivo (Fig. 1). After its development in yeast [2], CLASH has been successfully adapted to retrieve RNA-RNA associations from RBPs in humans and bacteria [3–7]. Concurrently, RIL-seq, a related approach applying immunoprecipitation of the bait RBP under native conditions, was developed in Escherichia coli and, since then, has been applied to study sRNA-target interactions in Salmonella, enteropathogenic E. coli, Vibrio cholerae, and Clostridioides difficile [8–13]. To date, most efforts have predominantly focused on Hfq and ProQ-chaperoned RNA-RNA duplexes in Gram-negative bacteria. Nevertheless, while available evidence has not ascertained the role of Hfq in mediating sRNA-mRNA interactions in S. aureus, there is no homolog of ProQ in Gram-positive bacteria. Consequently, to recover sRNA-mRNA duplexes in S. aureus, we needed to select an RBP that had been defined as a hub of sRNA-mRNA base-pairing. 
One such protein is endoribonuclease III (RNase III), which mediates sRNA-guided mRNA degradation and has been found to bind numerous sRNAs [14, 15]. Having established a suitable bait for our purposes, we customized CLASH to map the RNA-RNA interactions mediated by the endonuclease RNase III in S. aureus. Our findings uncovered a nutrient-sensing sRNA-mediated regulation of toxin production in methicillin-resistant S. aureus and a 3′ untranslated region of a protein-coding transcript, which orchestrates a regulatory network promoting antibiotic resistance [5, 6]. Our CLASH protocol maximizes signal-to-noise ratios by pulling down the bait RBP under chaotropic conditions. Advantageously, the stringent purification designed in the original protocol has tolerated the addition of lysostaphin, a potent bacteriolytic endopeptidase to disrupt the cell wall, and the addition of Triton X-100, a strong detergent that we used to (partially) solubilize membrane-bound degradosome factors that we are currently investigating. Furthermore, we have supplemented the lysis buffer with ethylenediaminetetraacetic acid disodium (EDTA), a metal chelator which sequesters magnesium and calcium ions present in S. aureus extracellular polymeric substance. The benefits of adding EDTA during cell lysis are twofold: on the one hand, it facilitates biofilm disruption; on the other hand, it inactivates ribonucleases and proteases that rely on metal co-factors for their enzymatic activity. Overall, the rigorous conditions under which tandem affinity purification is performed substantially reduce background sequences. However, statistical filtering of the sequencing output is necessary to exclude random chimeric fragments which may derive from incidental cross-linking to proximate molecules of the RBP under study. Several powerful bioinformatic pipelines have been developed for processing and/or the statistical analysis of CLASH and RIL-seq data [16–19]. Despite imposing different selection criteria to deem chimeric enrichments significant, available software generates output files with similar formats and information. In this protocol, we describe a set of functions which can provide a simplified approach for seed identification, motif enrichment, and folding energy analyses of CLASH data. We propose the inclusion of these evaluations as a standardized quality control check for CLASH/RIL-seq outputs.

Fig. 1 Schematic overview of the CLASH protocol. After exposing the cells to UV, RBPs become covalently bound to their cognate transcripts. Following immunoprecipitation of the bait protein, the cross-linked RNA species are partially digested. Subsequently, base-pairing RNA species are ligated and indexed with sequencing adapters. Upon RNP purification, hybrid RNA fragments can be precipitated and reverse transcribed. The resulting cDNA can then be amplified and sequenced. Libraries derived from CLASH experiments will contain chimeric cDNAs comprising the sequence of both interacting RNA species. Accordingly, each half of a chimeric read is mapped to its respective locus in the genome
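To make the idea of seed identification concrete, the following minimal Python sketch finds the longest gap-free stretch of antiparallel base pairs between the two halves of a chimera. It is a toy illustration only and is not one of the functions described in this chapter: the function name, the decision to allow G-U wobble pairs, and the example sequences are all assumptions made for the example, and no folding energy is computed.

```python
# Toy seed finder: hypothetical helper, not part of the chapter's code.
# Watson-Crick pairs plus G-U wobble pairs (allowing wobble is an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def longest_paired_stretch(srna, mrna):
    """Return (length, srna_start, mrna_start) of the longest gap-free
    antiparallel stretch of paired bases between two RNA fragments.

    Both sequences are given 5'->3'; start positions are 0-based indices
    into the sequences as given.
    """
    srna, mrna = srna.upper(), mrna.upper()
    rev = mrna[::-1]  # read the mRNA 3'->5' so pairing runs along diagonals
    best = (0, 0, 0)
    prev = [0] * (len(rev) + 1)
    for i, a in enumerate(srna, start=1):
        cur = [0] * (len(rev) + 1)
        for j, b in enumerate(rev, start=1):
            if (a, b) in PAIRS:
                cur[j] = prev[j - 1] + 1  # extend the stretch on this diagonal
                if cur[j] > best[0]:
                    length = cur[j]
                    best = (length, i - length, len(mrna) - j)
        prev = cur
    return best


# Made-up fragments: the sRNA motif AGGAGG pairs with CCUCCU in the mRNA.
length, s_start, m_start = longest_paired_stretch("CCCCAGGAGGCCCC", "AAAACCUCCUAAAA")
print(length, s_start, m_start)
```

In practice a stretch found this way would only be a starting point; the folding energy analyses mentioned above are needed to judge whether an interaction is plausible.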

2 Materials

2.1 Strains

Isolation of the ribonucleoprotein (RNP) harboring the RNA-RNA interactions under study requires epitope tagging of the bait RBP. In our original CLASH protocol, the proteins of interest were fused to an HTP tag, which was composed of a His6 tag, a TEV protease cleavage site, and protein A [2]. However, our refined version of the experimental and bioinformatic CLASH procedure has been performed with RBPs carrying a His6-TEV-FLAG (HTF) tag [4–6]. Specifically, in Staphylococcus aureus, this procedure has been applied to immunoprecipitated RNaseIII-HTF from JKD6009 rnc::HTF and USA300 rnc::HTF strains. Independently of the tag used for tandem affinity purification, a sample derived from an untagged parental strain must always be included as a negative control.

2.2 Growth Medium

This adjusted version of the CLASH protocol has been successfully employed to characterize RNA-RNA interactions of S. aureus strains grown in a wide variety of media [5, 6]. To ease experimental manipulation and facilitate cell growth, bacterial cells are grown in a rich medium such as tryptic soy broth (TSB). Aiming to study the dynamic changes in the RNA interactome of S. aureus during host infection, earlier work has employed growth conditions simulating the environmental insults to which S. aureus is exposed at different stages of infection. For example, exposing S. aureus to RPMI 1640, a phosphate-rich (5.63 mM Na2HPO4) medium used to culture human leukemic cells, elicits a transcriptomic reprogramming similar to that which the pathogen experiences in human plasma (Table 1) [20]. On the other hand, low-phosphate, low-magnesium (LPM) medium at pH 7.6 has been formulated to recapitulate the salt concentrations of the cytoplasm of a human cell (Table 2) [21]. On this basis, acidic LPM (pH 5.4) has been prepared to examine the response of S. aureus to the stresses which it undergoes in phagolysosomes.

Table 1 Composition of RPMI 1640 medium

Components                                      Concentration
Glycine                                         0.133 mM
L-arginine                                      1.149 mM
L-asparagine                                    0.379 mM
L-aspartic acid                                 0.150 mM
L-cystine·2HCl                                  0.208 mM
L-glutamic acid                                 0.136 mM
L-glutamine                                     2.055 mM
L-histidine                                     0.097 mM
L-hydroxyproline                                0.153 mM
L-isoleucine                                    0.382 mM
L-leucine                                       0.382 mM
L-lysine hydrochloride                          0.219 mM
L-methionine                                    0.101 mM
L-phenylalanine                                 0.091 mM
L-proline                                       0.174 mM
L-serine                                        0.286 mM
L-threonine                                     0.168 mM
L-tryptophan                                    0.025 mM
L-tyrosine disodium salt dihydrate              0.111 mM
L-valine                                        0.171 mM
Biotin                                          0.820 μM
Choline chloride                                21.429 μM
D-calcium pantothenate                          0.524 μM
Folic acid                                      2.268 μM
Niacinamide                                     8.197 μM
Para-aminobenzoic acid                          7.299 μM
Pyridoxine hydrochloride                        4.854 μM
Riboflavin                                      0.532 μM
Thiamine hydrochloride                          2.967 μM
Vitamin B12                                     0.004 μM
i-inositol                                      194.444 μM
Calcium nitrate (Ca(NO3)2·4H2O)                 0.42 mM
Magnesium sulfate anhydrous (MgSO4)             0.41 mM
Potassium chloride (KCl)                        5.33 mM
Sodium bicarbonate (NaHCO3)                     23.81 mM
Sodium chloride (NaCl)                          103.45 mM
Sodium phosphate dibasic anhydrous (Na2HPO4)    5.63 mM
D-glucose (dextrose)                            11.11 mM
Glutathione (reduced)                           3.257 μM
Phenol red                                      13.284 μM

Table 2 Composition of LPM medium

Components                                      Concentration
Casamino acids                                  0.1% (w/v)
Ammonium sulfate ((NH4)2SO4)                    7.5 mM
Phosphate (PO4 3-)                              337 μM
Potassium chloride (KCl)                        5 mM
Potassium sulfate (K2SO4)                       0.5 mM
Magnesium chloride (MgCl2)                      8 μM
Tris–HCl (for titration to LPM pH 7.6)          100 mM
MES (for titration to LPM pH 5.4)               80 mM
Glycerol, 0.3% (v/v)                            38 mM


To examine the transcriptional reprogramming occurring during adaptation to the different extracellular and intracellular contexts, bacteria were grown in TSB medium to OD600 ~ 3 and subsequently transferred to one of the infection-mimicking media. Afterwards, samples were harvested 15 min after the shift to investigate RNA-RNA interactions during the early stages of the stress response. Moreover, to control for cellular stresses resulting from a medium switch, CLASH has also been performed in bacteria grown in TSB and subsequently transferred to the same medium after filtering the cells.

2.3 Buffers

1. 1× Phosphate-buffered saline (PBS).
2. TN150-lysostaphin buffer: 50 mM Tris–HCl pH 7.8, 150 mM NaCl, 100 μg/mL lysostaphin, 0.1% (v/v) NP-40, and 0.5% (v/v) Triton X-100.
3. TN150 α-peptidase buffer: 50 mM Tris–HCl pH 7.8, 150 mM NaCl, 1 mini complete protease inhibitor (Roche) per 10 mL, 0.1% (v/v) NP-40, 0.5% (v/v) Triton X-100, and 10 mM EDTA.
4. TN1000: 50 mM Tris–HCl pH 7.8, 1 M NaCl, 0.1% (v/v) NP-40, and 0.5% (v/v) Triton X-100.
5. 10× core: 0.5 M Tris–HCl pH 7.8, 1% (v/v) NP-40, 50 mM β-mercaptoethanol, and 5% (v/v) Triton X-100.
6. Wash buffer 1: 50 mM Tris–HCl pH 7.8, 0.1% (v/v) NP-40, 5 mM β-mercaptoethanol, 0.5% (v/v) Triton X-100, 300 mM NaCl, 10 mM imidazole, and 6 M GuHCl.
7. NP-PNK buffer: 50 mM Tris–HCl pH 7.8, 0.1% (v/v) NP-40, 5 mM β-mercaptoethanol, 0.5% (v/v) Triton X-100, and 10 mM MgCl2.
8. 5× PNK buffer: 250 mM Tris–HCl pH 7.8, 50 mM MgCl2, 50 mM β-mercaptoethanol, and 0.5% (v/v) Triton X-100.
9. Wash buffer 2: 50 mM Tris–HCl pH 7.8, 10 mM β-mercaptoethanol, 0.1% (v/v) NP-40, 0.5% (v/v) Triton X-100, 50 mM NaCl, and 10 mM imidazole.
10. LDS sample buffer.
11. Bis-tris running buffer.
12. Elution buffer: 50 mM Tris–HCl pH 7.8, 10 mM β-mercaptoethanol, 0.1% (v/v) NP-40, 0.5% (v/v) Triton X-100, 50 mM NaCl, and 250 mM imidazole.
13. GTC phenol buffer: 4 M guanidinium thiocyanate (GTC), 50 mM Tris–HCl pH 8, 10 mM EDTA, 100 mM β-mercaptoethanol, 2% (w/v) sarcosyl, 100 mM sodium acetate pH 5.2, and 50% acidic phenol pH 4.3. Do not use this reagent if it acquires a pink color, as this is indicative of phenol oxidation.
14. Extraction buffer: 50 mM Tris–HCl pH 7.8, 0.1% (v/v) NP-40, 5 mM β-mercaptoethanol, 1% (w/v) SDS, 5 mM EDTA, 50 mM NaCl, and 60 mg/mL proteinase K.
15. 5× SuperScript™ IV RT reaction buffer (Invitrogen).
16. 10× Tris–borate–EDTA buffer.

2.4 Solutions

1. 1 M Tris–HCl pH 7.5–7.8.
2. 5 M and 1 M NaCl.
3. 2 M β-mercaptoethanol.
4. 10% (v/v) NP-40.
5. 20% (v/v) Triton X-100.
6. 1 M MgCl2.
7. 0.5 M ethylenediaminetetraacetic acid disodium salt dihydrate (EDTA).
8. 50% (w/v) PEG 8000.
9. 2.5 M imidazole pH 7.5.
10. ≥99.0% (v/v) trichloroacetic acid (TCA).
11. Acetone.
12. 20 mg/mL glycogen.
13. 3 M sodium acetate (NaOAc) pH 5.2.
14. 25:24:1 phenol:chloroform:isoamyl alcohol (v/v).
15. 24:1 chloroform:isoamyl alcohol (v/v).
16. Absolute ethanol.
17. 70% (v/v) ethanol.
18. Sterile milli-Q water.
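The buffers in Subheading 2.3 can be assembled from the stock solutions listed above using the dilution relation C1 × V1 = C2 × V2. The sketch below is a hypothetical helper, not part of the published protocol; it works through TN1000 as an example. The 50 mL batch size and the function and variable names are assumptions chosen for illustration.

```python
def stock_volumes(final_volume_ml, recipe, stocks):
    """Return the volume (mL) of each stock needed for one buffer batch.

    recipe and stocks map component names to concentrations; each component
    must use the same unit in both dictionaries (e.g. mM or % v/v).
    Uses C1 * V1 = C2 * V2 and fills the remainder with water.
    """
    volumes = {
        name: final_volume_ml * final_conc / stocks[name]
        for name, final_conc in recipe.items()
    }
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for this final volume")
    volumes["water"] = water
    return volumes


# TN1000 as listed in Subheading 2.3 (mM for salts and buffer, % v/v for detergents).
tn1000 = {"Tris-HCl pH 7.8": 50, "NaCl": 1000, "NP-40": 0.1, "Triton X-100": 0.5}
# Stock solutions as listed in Subheading 2.4, in the same units.
stocks = {"Tris-HCl pH 7.8": 1000, "NaCl": 5000, "NP-40": 10, "Triton X-100": 20}

for component, volume in stock_volumes(50, tn1000, stocks).items():
    print(f"{component}: {volume:.2f} mL")
```

The same function can be reused for the other Tris-based buffers by changing the recipe dictionary. Solid components such as GuHCl are not covered by this volume-only calculation.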

2.5 Enzymes and Reagents (See Notes 1 and 2)

1. 10 mg/mL lysostaphin.
2. 1 U/μL RQ1 RNase-free DNase.
3. 20 U/μL Superase-In.
4. Protease inhibitor cocktail.
5. 10 μL GST-TEV protease.
6. RNace-IT.
7. Guanidine HCl.
8. 1 U/μL FastAP thermosensitive alkaline phosphatase.
9. 40 U/μL RNasin® ribonuclease inhibitor.
10. 30 U/μL T4 RNA ligase 1 (ssRNA ligase).
11. 200 U/μL T4 RNA ligase 2 truncated K227Q.
12. 5′ adapter (100 μM).
13. ATP (100 and 10 mM).
14. App-PE 3′ adapter (100 μM).
15. 10 U/μL T4 polynucleotide kinase (T4 PNK).
16. 10 μCi/μL 32P-γATP (Perkin Elmer).
17. dNTP mix.
18. 100 mM DTT.
19. 200 U/μL SuperScript™ IV reverse transcriptase.
20. 5 U/μL RNase H.
21. RNAClean XP.
22. P5 forward amplification primer (100 μM).
23. P3 reverse amplification primer (100 μM).
24. Pfu DNA polymerase (2–3 U/μL).
25. Exonuclease I (20 U/μL, New England BioLabs).
26. DNA gel stain.

2.6 Consumables

1. DNAZap.
2. RNaseZAP.
3. 0.45 μm membrane filters.
4. 50 mL conical centrifuge tubes.
5. 1.5 mL microcentrifuge tubes.
6. 2 mL microcentrifuge tubes.
7. 5 mL centrifuge tubes.
8. Zirconia beads 0.1 mm.
9. Magnetic anti-FLAG M2 beads (Sigma Aldrich).
10. HisPur™ Ni-NTA agarose resin (Thermo Fisher Scientific). In our experience, the procedure does not work well with Ni Sepharose resin, so avoid this resin.
11. Spin column.
12. 4–12%, Bis-Tris, 1.0–1.5 mm, precast protein gels.
13. TBE-urea gels, 6%.
14. Whatman grade GF/D glass microfiber prefilters, 10 mm circle.
15. 0.22 μm centrifuge tube filters.
16. Fluorometric assay kit.
17. Bioanalyzer high-sensitivity DNA analysis assay kit.

2.7 Adapters and Primers

Oligonucleotide sequences used as 5′ and 3′ sequencing linkers as well as reverse transcription and amplification primers can be commercially purchased. A full list of the purpose, name, and sequence of each adapter and primer is provided in Table 3 (see Note 3). The 5′ adapters contain a sequencing index for sample identification (Table 3) and a randomized prefix (Table 3) which allows the identification of PCR duplicates during sequence analysis. These linkers are DNA-RNA hybrids displaying an inverted dideoxythymidine (invddT) at the 5′ end to prevent ligation to other adapters or RNA molecules (Table 3). Conversely, the DNA oligos used as 3′ end adapters have a 3′ end blocked with a dideoxycytidine (ddC) and an activated adenosine at the 5′ end. Thus, the 3′ linker can be ligated to the cross-linked RNA substrates in the absence of ATP. The primer used to reverse transcribe pooled RNAs into a cDNA library is complementary to the sequence of the 3′ sequencing adapter (Table 3). The forward and reverse amplification primers insert index sequences (Table 3) that enable the binding of the amplicons in the resulting cDNA library to Illumina’s sequencing flow cell. All oligonucleotides were dissolved in TE buffer at a concentration of 100 μM and stored at -80 °C.

2.8 Equipment

1. Incubator with orbital shaker.
2. Vari-X-linker UV cross-linker (UVO3) [22, 23].
3. Refrigerated centrifuge with rotor for 50 mL Falcon tubes.
4. Vortex.
5. Refrigerated microcentrifuge with rotor for 5 and 1.5 mL microcentrifuge tubes.
6. Thermoblock with an orbital shaking motion. The temperature control must cover the range from 16 to 65 °C.
7. Access to a hot lab (i.e., a room with suitable protective equipment and authorization to perform work with radioactive material).
8. Geiger counter.
9. Perspex protective screens.
10. Horizontal gel electrophoresis system and compatible power supply.
11. Phosphor imaging cassette.
12. Film developer.
13. Printer.
14. Thermal cycler.
15. Fluorometer.
16. Bioanalyzer.

Table 3 Adapters and primers used during the CLASH procedure

Oligonucleotide type: 5′ adapter for sequencing
Name    Sequence
L5Aa    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrUrArArGrCrN-OH-3′
L5Ab    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrArUrUrArGrCrN-OH-3′
L5Ac    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrGrCrGrCrArGrCrN-OH-3′
L5Ad    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrCrGrCrUrUrArGrCrN-OH-3′
L5Ba    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrArGrArGrCrN-OH-3′
L5Bb    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrGrUrGrArGrCrN-OH-3′
L5Bc    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrCrArCrUrArGrCrN-OH-3′
L5Bd    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrUrCrUrCrUrArGrCrN-OH-3′
L5Ca    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrCrUrArGrCrN-OH-3′
L5Cb    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrUrGrGrArGrCrN-OH-3′
L5Cc    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrArCrUrCrArGrCrN-OH-3′
L5Cd    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrGrArCrUrUrArGrCrN-OH-3′
L5Da    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrCrGrUrGrArUrN-OH-3′
L5Db    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrGrCrArCrUrArN-OH-3′
L5Dc    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrUrArGrUrGrCrN-OH-3′
L5Dd    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrArUrCrArCrGrN-OH-3′
L5Ea    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrCrArCrUrGrUrN-OH-3′
L5Eb    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrGrUrGrArCrArN-OH-3′
L5Ec    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrUrGrUrCrArCrN-OH-3′
L5Ed    5′-invddT-ACACrGrArCrGrCrUrCrUrUrCrCrGrArUrCrUrNrNrNrArCrArGrUrGrN-OH-3′

Oligonucleotide type: 3′ adapter for sequencing
Name    Sequence
App_PE  5′-App-NAGATCGGAAGAGCACACGTCTG-ddC-3′

Oligonucleotide type: Primer for reverse transcription
Sequence
5′-CAGACGTGTGCTCTTCCGATCT-3′

Oligonucleotide type: Primers for library amplification
Name      Sequence
P5        5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3′
P3 BC1    5′-CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3′
P3 BC3    5′-CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3′
P3 BC4    5′-CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3′
P3 BC5    5′-CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3′
P3 BC6    5′-CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3′
PC3 BC13  5′-CAAGCAGAAGACGGCATACGAGATCTTGTAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3′
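The 5′ adapter layout in Table 3 (three random nucleotides, a sample barcode, then one further random nucleotide) is what makes demultiplexing and PCR-duplicate removal possible. The following sketch illustrates the principle only; it is not the pipeline used by the authors. It assumes that each read starts at the first random nucleotide of the 5′ adapter and is given as a DNA string, it treats the four random positions together as one molecular tag, and it includes just a handful of the barcodes from Table 3. Function names are hypothetical.

```python
# Subset of Table 3 barcodes written as DNA (illustrative selection).
BARCODES = {
    "L5Aa": "TAAGC",
    "L5Ab": "ATTAGC",
    "L5Ba": "AGAGC",
    "L5Bc": "CACTAGC",
    "L5Ca": "CTAGC",
}


def split_read(read, barcodes=BARCODES, n_random=3):
    """Return (sample, random_tag, insert) for one read, or None if no barcode matches.

    Assumed read layout: NNN + barcode + N + insert.
    """
    prefix, rest = read[:n_random], read[n_random:]
    # Try longer barcodes first so a short barcode cannot shadow a longer one.
    for name, bc in sorted(barcodes.items(), key=lambda kv: -len(kv[1])):
        if rest.startswith(bc) and len(rest) > len(bc):
            tag = prefix + rest[len(bc)]       # 3 leading + 1 trailing random nt
            insert = rest[len(bc) + 1:]
            return name, tag, insert
    return None


def collapse_duplicates(reads):
    """Keep one (sample, tag, insert) combination per putative PCR duplicate."""
    seen = set()
    for read in reads:
        parsed = split_read(read)
        if parsed is not None:
            seen.add(parsed)
    return sorted(seen)


reads = [
    "ACGTAAGCTGGATCCAAA",   # L5Aa
    "ACGTAAGCTGGATCCAAA",   # exact copy: treated as a PCR duplicate
    "TTTCACTAGCAGGG",       # L5Bc
    "AAAGGGGGGGG",          # no known barcode
]
for record in collapse_duplicates(reads):
    print(record)
```

Because identical inserts with different random tags are kept, independent ligation events are not collapsed, which is the purpose of the randomized positions described in Subheading 2.7.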

3 Method

3.1 Experimental Procedure

Buffers can be prepared in advance and stored at 4 °C for long periods. Since all solutions will be filter-sterilized, it is not essential to use sterile milli-Q water to prepare the buffers required in this procedure. However, to reduce the risk of contaminating the stock solutions, we advise aliquoting appropriate volumes prior to each CLASH experiment. β-mercaptoethanol, EDTA, and protease inhibitors should be added to these aliquots before the start of each experiment (see Note 4). It is ideal to use enzymes from fresh stocks and not carry them over between experiments to avoid cross-contamination (see Note 5). Ideally, one would make ~10 μL aliquots of each enzyme in 1.5 mL tubes to prevent this from happening. Where applicable, use dNTPs and glycogen from aliquots to avoid freeze-thawing and potential inactivation. Use refrigerated centrifuges throughout the process whenever RNA is being processed.

While RNases will degrade cross-linked transcripts, DNase contamination can jeopardize library amplification. To avoid enzymatic degradation of RNA and DNA samples, all steps must be performed wearing disposable gloves and using filter tips. Furthermore, in preparation for the procedure, we recommend wiping the work surfaces and pipettes with DNAZap and RNaseZAP as per manufacturer’s instructions. Additional precautions include irradiating surfaces, pipettes, and tubes with UV ahead of the start of the experiment.

It is crucial to include appropriate negative controls (untagged strain or non-cross-linked sample) alongside HTF-tagged strains when performing CLASH experiments. This helps differentiate true binding sites from the background signal. A minimum of four biological repeats is recommended as, due to the stochastic nature of UV cross-linking, there will be variability between repeats. Additionally, performing more repeats will enable sampling of a larger number of interactions.

3.1.1 Cell Growth

1. Prepare and autoclave 100 mL of medium for each strain.
2. Streak out all the strains required from their glycerol stocks and incubate overnight at 37 °C.
3. Inoculate 5 mL of medium with a single colony and incubate overnight with shaking at 37 °C and 180 rpm.
4. The following day, inoculate a secondary culture to a starting OD600 of 0.05–0.1 and let the cells grow until the culture reaches OD600 = 3.
5. Take TSB samples at their respective OD and cross-link 200 mL of cells directly in the medium at 254 nm and 1000 mJ/cm2, or at any optimized UV dose, using the Vari-X-linker (UVO3) [22, 23].
6. Prepare a Dewar containing an appropriate volume of liquid nitrogen to flash-freeze the cells.
7. Harvest the cells by vacuum filtering them through a 0.45 μm membrane filter.
8. Fold the membrane filter and place it inside a 50 mL Falcon tube. Drop the tube inside the liquid nitrogen container to flash-freeze the cells.

Ensure that all the steps from this stage onward are performed on ice to minimize protein and RNA degradation. It is suggested to retain a small amount of sample from the end of the lysis stage (crude cell lysate) and from the TEV eluate stage to check for any issues later. A western blot on these samples should be performed to analyze the amount of protein in all samples.

3.1.2 Cell Lysis

1. Wash the frozen cells off the filters by adding 25 mL of ice-cold PBS. Transfer the resuspended cells to a new 50 mL centrifuge tube.
2. Centrifuge cells at 4 °C and 4000 × g for 10 min.
3. Weigh cell pellets and resuspend each of them in 2 volumes/cell weight of TN150-lysostaphin buffer. Vortex and transfer samples to 5 mL centrifuge tubes.
4. Add 60 μL of RQ1 RNase-free DNase (Promega) and 10 μL SUPERase·In (Invitrogen). Then, incubate at 20 °C for 30 min.
5. Add 3 volumes/cell weight of Zirconia beads (0.1 mm) to the cell pellet. Vortex vigorously five times for 1 min with 1-min incubations on ice between each step.
6. Add 2 volumes/cell weight of cold TN150 α-peptidase buffer.
7. Centrifuge at 4 °C and 10,000 × g for 30 min.
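Because the lysis reagents are dosed relative to the pellet weight, it can help to compute the volumes for each sample before starting. The short sketch below does this under one explicit assumption that the protocol does not state: that "1 volume/cell weight" corresponds to 1 mL per gram of wet pellet. The function name and dictionary keys are hypothetical.

```python
def lysis_volumes(pellet_g, ml_per_g=1.0):
    """Return reagent volumes (mL) for one pellet.

    ml_per_g is the assumed conversion between pellet weight and "1 volume";
    the default of 1 mL/g is an assumption, not a value from the protocol.
    """
    if pellet_g <= 0:
        raise ValueError("pellet weight must be positive")
    one_volume = pellet_g * ml_per_g
    return {
        "TN150_lysostaphin_buffer_ml": 2 * one_volume,   # step 3
        "zirconia_beads_ml": 3 * one_volume,             # step 5
        "TN150_alpha_peptidase_buffer_ml": 2 * one_volume,  # step 6
    }


# Example: a 0.5 g pellet.
for reagent, volume in lysis_volumes(0.5).items():
    print(f"{reagent}: {volume:.2f}")
```

The fixed additions in step 4 (60 μL of DNase and 10 μL of SUPERase·In) do not depend on pellet weight and are therefore left out of the calculation.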

3.1.3 RNP-HTF Immunoprecipitation with Anti-FLAG Magnetic Beads

1. Thaw anti-FLAG® M2 magnetic beads on ice.
2. Using a wide bore pipette tip, transfer 75 μL of magnetic bead slurry to separate 1.5 mL microcentrifuge tubes. This will contain approximately 37.5 μL of beads.
3. Place the tubes in a magnetic rack and wash the beads three times using 1 mL of TN150.
4. After the last wash, remove the supernatant and resuspend beads in 50 μL of TN150.
5. Add an equal amount of beads to each cleared lysate.
6. Incubate beads with rotation at 4 °C for 2 h.
7. Meanwhile, prepare the 10× core and TN1000 buffers. Make sure to prepare the 10× core buffer at least 1 h before proceeding to the next step, as the Triton X-100 takes time to dissolve.

3.1.4 TEV Digestion of HTF Epitope

1. Resuspend beads in 1 mL of TN1000 and rotate at 4 °C for 10 min. Alternatively, resuspend beads in 2 mL of TN1000 and rotate at room temperature for 5 min. Perform this step three times.
2. Rinse beads three times with 2 mL of TN150. After the last wash, remove the supernatant, resuspend beads in 250 μL of TN150, and transfer to a new 1.5 mL microcentrifuge tube.
3. Add 10 μL of TEV to each sample and incubate at room temperature for 2 h (see Note 6).
4. Prepare wash buffer 1.
5. For each sample, pipette 100 μL of Ni-NTA agarose bead slurry. Spin down the Ni-NTA agarose beads and remove the storage ethanol.
6. Wash Ni-NTA agarose beads three times using 1 mL of wash buffer 1 (see Note 7).
7. In a rack, prepare a set of three 1.5 mL microcentrifuge tubes per sample. Arrange the tubes as specified in Table 4.
8. Set a thermoblock for 1.5 mL tubes at 22 °C.

3.1.5 RNase Trimming of Transcripts Cross-Linked to the RBP

This step is critical and requires a precise digestion time. Commercial RNace-IT is highly concentrated; hence, we prepare diluted working stocks (1:100) to perform experiments and avoid discrepancies between technical and biological replicates. The working stock is stored at -20 °C until further usage. The amount of RNase will need to be optimized for each bait protein, and we recommend testing a wide range of RNase concentrations to ensure that the fragments are long enough to be unambiguously mapped to the genome.

1. Collect TEV eluates and pipette them into their corresponding collection tube (containing 350 μL of TN150).
2. Transfer 550 μL from the first sample into the RNase treatment tube containing RNace-IT. Place the tube in the thermoblock and start the timer as soon as the incubation of the first sample begins.
3. Repeat the transfer and incubation start procedure for the remaining samples. There is no need to set a timer for each of them.
4. Incubate the first sample at 22 °C for 7 min. After 7 min, remove the first sample from the incubator.
5. Transfer 500 μL from the RNase treatment tube of the first sample into its Ni-NTA binding tube (containing 0.4 g of GuHCl, 3 μL of imidazole (2.5 M), and 27 μL of NaCl (5 M)).

Table 4 Disposition of tubes for retrieval of TEV eluate, RNase digestion, and start of Ni-NTA immunopurification

Row  Purpose                 Tube content
1    TEV eluate collection   250 μL of TEV eluate in 350 μL of TN150
2    RNase treatment         1 μL of a 1:100 dilution of RNace-IT (Agilent)
3    Ni-NTA binding          0.4 g of GuHCl, 3 μL of imidazole (2.5 M), and 27 μL of NaCl (5 M)

6. Repeat steps 4 and 5 for the remaining samples following the order in which they were first placed in the thermoblock. The lag between samples will compensate for the different start times for each of the samples.
7. Vortex all Ni-NTA binding tubes to dissolve the GuHCl.
8. To each sample, add equal volumes of the previously washed Ni-NTA agarose beads and incubate samples while rotating at 4 °C overnight.

3.1.6 Ni-NTA Affinity Purification of RNP-His6 and 3′ Dephosphorylation of Covalently Bound RNAs

RNase trimming creates 2′–3′ cyclic phosphates (cPs) in RNAs that need to be removed to enable 3′ adapter ligation. Given that these groups are chemically unstable, most will be converted to 2′ and 3′ phosphates during the aforementioned overnight incubation. Ultimately, enzymatic removal of cPs guarantees that these will not interfere with the ligation of the 3′ linker. In this protocol, we use a thermosensitive alkaline phosphatase. Nonetheless, the phosphatase activity of the T4 polynucleotide kinase could potentially also be exploited to this end [24].

1. Prepare several sets of 2 mL microcentrifuge tubes to use as flow-through waste containers. Place a spin column in each of the 2 mL tubes.
2. Transfer the Ni-NTA agarose beads of each sample onto a separate column and let the supernatant run through by gravity. It is advised to close the lid of each tube between bead transfers to avoid cross-contamination.
3. In the subsequent steps, discard the collection tubes rather than emptying the flow-through. If the liquid does not pass through easily, close the lid and open it again, repeating as needed.
4. Wash the Ni-NTA agarose beads three times with 500 μL of wash buffer 1.
5. Wash three times with 500 μL of NP-PNK buffer.
6. Prepare a FastAP treatment mix, scaling up the volumes provided in Table 5 according to the number of samples.

Table 5 Composition of phosphatase treatment mix. Volumes provided are for 1 reaction

Reagent                  Volume for 1 sample (μL)
5× PNK buffer            12
FastAP                   4
RNasin                   2
Sterile milli-Q water    42
Total                    60

7. Replace the 2 mL tubes, close the lids, and spin down the Ni-NTA agarose beads by centrifugation at room temperature and 4000 rpm for 5 s (see Note 4).
8. Open the lid of each column and attach a rubber stopper to the bottom.
9. Add 60 μL of FastAP treatment solution and tap the tubes to mix the reagents. Incubate the Ni-NTA agarose beads at 20 °C for 1 h.
10. After the incubation, open the cap, remove the rubber stopper, and place the column in a 1.5 mL tube.
11. Wash the Ni-NTA agarose beads with 500 μL of wash buffer 1. This will inactivate FastAP.
12. Wash the samples three times with 500 μL of 1× NP-PNK buffer.
13. Place the lid on the columns and transfer them to new 2 mL tubes.
14. Pack the Ni-NTA agarose resin by centrifugation at room temperature and 4000 rpm for 5 s. Do not extend the centrifugation time to avoid over-drying the beads.
15. Open the lid of each column first and then place the stopper at the bottom.

3.1.7 Phosphorylation of the 5′ Ends of Cross-Linked RNAs with Radioactive 32P

All the parts of the protocol in this section must be performed in a laboratory room which is authorized and equipped for the safe handling and disposal of radioactive materials. The 5′ ends of the RNAs bound to the tagged protein are first phosphorylated with radioactive ATP, which provides a detectable signal from the sample, and then with nonlabeled ATP. Phosphorylation of all the 5′ ends facilitates efficient ligation of the 5′ end adapters. The 5′ adapters have unique barcode sequences to distinguish between samples if they are multiplexed. Also, the random nucleotides help to remove potential PCR duplicates. This makes it very important to choose different linkers for each sample.

Table 6 Composition of radioactive phosphorylation mix. Volumes provided are for 1 reaction

Reagent                                Volume for 1 sample (μL)
5× PNK buffer                          16
32P-γATP (10 μCi/μL, Perkin Elmer)     3
T4 PNK                                 3
Sterile milli-Q water                  58
Total                                  80

1. Close the lid of the columns and transfer them to a new 2 mL tube. Spin down the Ni-NTA agarose resin by centrifugation at room temperature and 4000 rpm for 5 s. It is not advisable to centrifuge for longer periods as this can over-dry and damage the Ni-NTA agarose resin.
2. Prepare a radioactive phosphorylation mix, scaling up the volumes provided in Table 6 according to the number of samples.
3. Open the lid of each column first and then place the stopper at the bottom.
4. Add 80 μL of the radioactive phosphorylation mix to each sample and incubate the reaction at 20 °C for 100 min.
5. After 100 min, supplement ongoing reactions with 1 μL of 100 mM ATP and let the reaction proceed for 50 min. Additional ATP will fuel T4 PNK phosphorylation of 5′ ends and maximize the number of sequences to which a 5′ adapter could be added (see Note 8).
6. After incubation, first open the tube lids and only then remove the stopper from the column. Directly removing the stopper will build up pressure and increase the risk of radioactively contaminating the user.
7. Wash Ni-NTA agarose beads three times with 500 μL of wash buffer 1.
8. Wash Ni-NTA agarose beads three times with 500 μL of NP-PNK buffer. Ensure that the column is washed thoroughly to avoid carrying any remnants of wash buffer 1 onto the next step.
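Scaling the per-reaction volumes of a master mix to the number of samples is simple arithmetic, and the sketch below shows one way to do it for the radioactive phosphorylation mix in Table 6. It is an illustration rather than part of the protocol: the function name is hypothetical, and the optional pipetting overage (10% in the example) is an assumption based on common laboratory practice, not a value given in this chapter.

```python
# Per-reaction volumes (μL) copied from Table 6.
TABLE_6 = {
    "5x PNK buffer": 16,
    "32P-gamma-ATP (10 uCi/uL)": 3,
    "T4 PNK": 3,
    "Sterile milli-Q water": 58,
}


def scale_mix(per_reaction_ul, n_samples, overage=0.0):
    """Scale a master mix to n_samples, optionally adding a fractional overage.

    overage=0.1 prepares 10% extra to cover pipetting losses (an assumed
    convention, not a requirement of the protocol).
    """
    if n_samples < 1:
        raise ValueError("need at least one sample")
    factor = n_samples * (1 + overage)
    return {reagent: round(vol * factor, 1) for reagent, vol in per_reaction_ul.items()}


# Four samples without overage, then the same with 10% extra.
exact = scale_mix(TABLE_6, 4)
with_extra = scale_mix(TABLE_6, 4, overage=0.1)
for reagent in TABLE_6:
    print(f"{reagent}: {exact[reagent]} uL (with overage: {with_extra[reagent]} uL)")
```

The same function applies unchanged to the phosphatase mix in Table 5 by passing its per-reaction volumes instead. When working with 32P, any overage also increases the amount of radioactive waste, so the extra volume should be kept as small as practical.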

Defining Bacterial RNA-RNA Interactomes Using CLASH

3.1.8 On-Bead Ligation of the 5′ Linker to the Cross-Linked RNAs

1. Prepare a 5′ adapter ligation mix, scaling the volumes provided in Table 7 to the number of samples.
2. Centrifuge the samples to remove residual NP-PNK buffer from the previous washing step. Add 2 μL of 5′ adapter and 4 μL of T4 RNA ligase 1 to each tube.
3. Incubate for 16 h at 16 °C.

3.1.9 On-Bead Ligation of the App-PE Linker to the 3′ End of the RNAs

1. Wash Ni-NTA agarose beads three times with 500 μL of wash buffer 1.
2. Wash Ni-NTA agarose beads three times with 500 μL of NP-PNK buffer.
3. Centrifuge the samples to remove residual NP-PNK buffer from the previous washing step.
4. Prepare a 3′ adapter ligation mix, scaling the volumes provided in Table 8 to the number of samples.
5. Incubate the reaction at 25 °C for 6 h.
6. Wash Ni-NTA agarose resin with 500 μL of wash buffer 1.
7. Wash Ni-NTA agarose resin three times with 500 μL of wash buffer 2.

3.1.10 Elution of RBP and Cross-Linked RNAs from Ni-NTA Agarose Resin

1. For every Ni-NTA column, prepare two 1.5 mL microcentrifuge tubes.
2. Add 2 μL of glycogen (20 mg/mL) to the second tube of each sample set.
3. Remove the remaining wash buffer 2 by centrifuging the samples at room temperature and 4000 rpm for 5 s.
4. Add 200 μL of elution buffer to each sample and let the column stand for 2 min.
5. After 2 min, force elution by closing and opening the tube cap.
6. Transfer the column to the second 1.5 mL microcentrifuge tube (i.e., the one containing glycogen).
7. Add another 200 μL of elution buffer to each sample and let the column stand for 2 min.
8. Force elution buffer through the column by opening and closing the cap.
9. Centrifuge the samples to achieve complete flow-through of the full volume of elution buffer.
10. Pool the eluates for each sample by transferring the fraction collected in the first tube into the second tube. Check for a radioactive signal from the eluate using the Geiger counter to ensure that the sample has flowed through.


Table 7 Composition of 5′ adapter ligation mix. Volumes provided are for 1 reaction

Reagent                  Volume for 1 sample (μL)
5× PNK buffer            16
ATP (100 mM)             8
RNasin                   2
Sterile milli-Q water    48
Total                    74

Table 8 Composition of 3′ adapter ligation mix. Volumes provided are for 1 reaction

Reagent                            Volume for 1 sample (μL)
5× PNK buffer                      12
App-PE adapter (100 μM)            0.6
T4 RNA ligase 2 truncated K227Q    3
RNasin                             1.5
50% (w/v) PEG 8000                 12
MQ                                 30.9
Total                              60

3.1.11 Trichloroacetic Acid (TCA) Precipitation of RNPs

The eluates from the nickel purification will very likely contain a lot of free 32P-ATP and uncross-linked or contaminating RNAs. To remove these, we recommend performing a TCA precipitation step, as this more selectively enriches for cross-linked proteins (see Note 9). If a very large white pellet is visible after precipitating the cross-linked RNP with TCA, it most likely consists of precipitated salts (see Note 10). Next, centrifuge the sample again for about 20 min in a microcentrifuge at full speed.

1. Add 100 μL of 100% TCA to each 400 μL eluate. If the volume of the eluted fraction is approximately 800 μL, add 200 μL of TCA.
2. Vortex the samples vigorously and incubate them on ice in the fume hood for 20 min.
3. Centrifuge the samples at 4 °C and 13,400 rpm for 30 min.
4. Resuspend each pellet in 800 μL of acetone.


5. Centrifuge the samples at 4 °C and 13,400 rpm for 15 min.
6. Remove acetone completely and air-dry the samples for a few minutes by keeping the lid of each tube open inside the fume hood.
7. Resuspend pellets in 20 μL 1× LDS sample buffer. At this point, it is possible to freeze samples at -20 °C for overnight storage.
8. Heat samples at 65 °C for 5 min and make sure to vortex the sample a few times to ensure that the pellet dissolves in the sample buffer.
9. Load samples on a 1 mm 4–12% Bis-tris gel and let it run in 1× running buffer at 140 V for 1.5 h.
10. After 1.5 h, wrap the gel in cling film, and place it inside a phosphor imaging cassette. In a dark room, cover the gel with a film, close the cassette, and store it at -80 °C.
11. Expose the film to the gel for 3 h or overnight (see Note 11).

3.1.12 Proteinase K Digestion of the Purified RBP

1. Ensure that the gel has fully thawed at room temperature before removing the cling film.
2. Cut the gel band of interest and transfer it to a 1.5 mL tube.
3. Fragment it using a 1 mL filter tip and store the tip inside the tube until 600 μL of extraction buffer is added to the tube. Ensure that the tip is rinsed adequately to prevent loss of gel pieces adhered to the tip.
4. Add 8 μL of 20 mg/mL proteinase K.
5. Incubate at 55 °C with 800–900 rpm orbital shaking motion for 2 h.

3.1.13 Extraction of Cross-Linked RNAs

1. Centrifuge sample at room temperature for 1 min.
2. Transfer the supernatant (i.e., 450 μL) into a new 1.5 mL microcentrifuge tube.
3. Add 1 volume of 25:24:1 phenol:chloroform:isoamyl alcohol and vortex vigorously.
4. Centrifuge samples at 4 °C and 13,400 rpm for 5 min.
5. Transfer the top layer (300 μL) from each sample into a separate new tube.
6. Add 1 volume of 24:1 chloroform:isoamyl alcohol to each sample and vortex vigorously.
7. Centrifuge tubes at 4 °C and 13,400 rpm for 5 min.
8. Whilst centrifuging the samples, prepare tubes containing 2 μL of glycogen (20 mg/mL) and an amount equivalent to a tenth of the sample volume of 3 M NaOAc pH 5.2.


9. Add 2.2 volumes of cold 96% (v/v) ethanol.
10. Incubate samples at -80 °C for at least 30 min and up to 1 h. Alternatively, at this stage, samples can be stored at -20 °C overnight.
11. Centrifuge samples at 4 °C and 13,400 rpm for 30 min.
12. Wash RNA with 700 μL of refrigerated 70% (v/v) EtOH. Centrifuge at 4 °C and 13,400 rpm for 5 min.
13. Remove ethanol and air-dry pellets at room temperature for 5 min inside the hood.
14. Resuspend the RNA pellets in 10 μL of sterile milli-Q water.

3.1.14 Reverse Transcription of Purified RNAs

1. To reverse-transcribe the RNA sequences in a 10 μL sample, add to each tube the sterile milli-Q water, primer, and dNTP volumes specified in Table 9. A negative control which will not be reverse transcribed must be included at this step.
2. Incubate tubes at 85 °C for 3 min and transfer quickly to ice.
3. Leave samples on ice for 5 min.
4. Prepare a reverse transcription mix, scaling the volumes provided in Table 10 to the number of samples.
5. Incubate samples at 50 °C for 3 min. Afterward, add 1 μL of Superscript IV and incubate the reaction at 50 °C for 1 h.
6. Inactivate Superscript IV by incubating the samples at 65 °C for 15 min.
7. Incubate the samples at 37 °C for 3 min.
8. Add 2 μL of RNase H and incubate at 37 °C for 30 min.

3.1.15 Purification of cDNA Library

1. Transfer resulting cDNA libraries (22 μL) to a PCR tube.
2. Add the equivalent of 2.5 times the volume of the sample (100 μL) of RNAClean XP beads and mix by pipetting 15 times.
3. Incubate at room temperature for 15 min.
4. Place tubes on a magnetic rack and incubate at room temperature for 5 min. The solution should be clear by the end of the incubation.
5. Discard the supernatant and wash the beads two times with refrigerated 70% (v/v) ethanol.
6. Air-dry libraries for up to 10 min.
7. Add 11 μL sterile milli-Q water to each sample to elute the cDNA.
8. After 2 min, collect 10 μL of cDNA and transfer it to a new tube.
9. Store cDNA and negative control tubes at -20 °C.


Table 9 Reagents needed for reverse-transcription of the samples and preparation of the negative control for the reaction. Volumes provided are for 1 reaction

Reagent                  Volume for 1 sample (μL)    Volume for 1 negative control (μL)
Sterile milli-Q water    -                           10
RT primer (10 μM)        1                           1
dNTP mix (5 mM)          2                           2
Total                    13                          13

Table 10 Composition of the reverse transcription mix. Volumes provided are for 1 reaction

Reagent             Volume for 1 sample (μL)
5× SSIV buffer      4
DTT (100 mM)        1
RNasin (40 U/μL)    1
Total               6

3.1.16 Amplification of cDNA Library

1. Prepare a cDNA library amplification mix using the reagents and volumes provided in Table 11.
2. Amplify the library using the PCR protocol outlined in Table 12.
3. Add 1 μL of Exonuclease I to each sample and incubate at 37 °C for 1 h.

3.1.17 Purification of Amplified cDNA Library

1. To each sample, add 84 μL of RNAClean XP beads and pipet to mix well.
2. Incubate amplified libraries at room temperature for 15 min.
3. Place tubes on a magnetic rack and incubate at room temperature for 5 min. The solution should be clear by the end of the incubation.
4. Discard the supernatant and wash the beads two times with refrigerated 70% (v/v) ethanol.
5. Air-dry libraries for up to 10 min.
6. Add 21 μL sterile milli-Q water to each sample to elute the cDNA.
7. After 2 min, collect 20 μL of cDNA and transfer it to a new tube.
8. Store cDNA and negative control samples at -20 °C.


Table 11 Composition of the cDNA library amplification mix

Reagent                  Volume (μL)
10× PFU buffer           5
P5 primer (10 μM)        1
BC primer (10 μM)        1
dNTPs (5 mM)             2.5
Pfu DNA polymerase       1
cDNA                     5
Sterile milli-Q water    34.5
Total volume             50

Table 12 PCR protocol for amplification of the cDNA library

Step                    Temperature (°C)    Duration    Cycles
Initial denaturation    95                  2 min       1
Denaturation            98                  20 s        24
Annealing               52                  30 s        24
Extension               72                  1 min       24
Final extension         72                  5 min       1
Holding                 4                   -           1

3.1.18 Size-Based Selection of cDNA Libraries

1. To each DNA sample, add 4 μL of 6× gel loading dye blue.
2. Into two separate wells of a 6% TBE-urea gel, load 3 μL of 50 bp ladder and 6 μL of 1 kb ladder. Leave an empty well between each cDNA sample.
3. Let the gel run in 1× TBE buffer at a maximum of 120 V for 1 h. Do not run the gel at higher voltages as this will reduce the resolution.
4. Cast the gel and soak it in TBE buffer with 5 μL of DNA gel stain for 10 min.
5. Wash the gel with sterile milli-Q water.
6. Image the gel using a fluorescence gel scanner and print the resulting image at the original size resolution.
7. Tape the image to the gel on the top.


8. Use a sterile blade to accurately cut the region of the gel containing the band corresponding to the cDNA library of interest. The size of this band is approximately 200–400 bp.
9. Place the cut band into a 1.5 mL microcentrifuge tube. Shred the band using a 1 mL filter tip and store the tip inside the tube until 400 μL of sterile milli-Q water is added to the tube. Ensure that the tip is rinsed adequately to prevent loss of gel pieces adhered to the tip.
10. Incubate the fragment at 37 °C with 1100 rpm orbital shaking motion for 60 min.
11. Incubate tube in dry ice for 10 min.
12. Again, incubate the samples at 37 °C with 1100 rpm orbital shaking motion for 60 min.
13. Spin tubes at room temperature and 13,400 rpm for 1 min.
14. Stack two pieces of Whatman grade GF/D glass microfiber prefilters (10 mm, circle) into a 0.22 μm centrifuge tube filter.
15. Spin the samples at 13,000 rpm for 1 min and transfer the flowthrough (400 μL) into a new 1.5 mL microcentrifuge tube. This volume contains the cDNA.

3.1.19 Ethanol Precipitation of cDNAs

1. To each cDNA sample, add 1 μL of glycogen (20 μg/μL), 40 μL of 3 M sodium acetate, and 1 mL of cold 96% (v/v) ethanol.
2. Incubate samples at -80 °C for 30 min or store overnight at -20 °C.
3. Centrifuge tubes at 4 °C and 13,400 rpm for 30 min.
4. Wash DNA with 1 mL of 70% (v/v) ethanol (stored at -20 °C).
5. Spin samples at 4 °C and 13,000 rpm for 10 min.
6. Air dry pellets and resuspend them in 15 μL of sterile milli-Q water.
7. Measure DNA concentration using a Qubit™ dsDNA HS and BR assay kit. Follow the instructions provided by the manufacturer.
8. Assess cDNA quality using a bioanalyzer high sensitivity DNA analysis assay kit (or equivalent). Follow the instructions provided by the manufacturer.
9. Store DNA library at -20 °C until the library is sent out for sequencing.

Raw sequencing outputs can be processed using previously developed software pipelines such as hyb or the RILseq packages [8, 16]. Briefly, these pipelines demultiplex barcoded reads and

[Fig. 2 flowchart. Panels: sequencing output (*.fa or *.fq) processed with hyb or RILseq into called chimeras (*.txt); Notebook 1: Excluding Duplicates and Folding Chimeras (output: *p_values_with_gene_names_significant_seqs_folded_NO_sequenceoverlap.txt); Notebook 2: Comparing Folding Energies of Chimeras and Control Datasets (frequency against MFE for chimeras, control shuffled over genes, and control shuffled over genomic features); Notebook 3: Identifying Seed Sequences in the sRNA (RNA targets against nucleotide position); Notebook 4: Defining sRNA Seed Sequences for the sRNA Using MEME, and confirming complementarity between the seed sequence and mRNA targets of the sRNA using MAST]

Fig. 2 Flowchart describing the analytic quality control processes that CLASH outputs undergo in our package. Using previously developed software, chimeras can be recognized and mapped (i.e., called). Upon selection of significantly enriched hybrids, the first notebook in our repository will discard non-unique chimeras and

3.2 Computational Analysis

3.2.1 Bioinformatic Analysis of CLASH Sequencing Output

remove adapter sequences. The resulting reads undergo two quality control steps. Firstly, sequences are filtered based on their length and nucleotide quality values (i.e., PHRED scores). Secondly, since PCR artifacts will display identical random prefixes, they can easily be pinpointed and collapsed, which excludes them from downstream analysis. The remaining unique reads can subsequently be aligned to the appropriate regions of the corresponding reference transcriptome or genome. Having mapped raw sequences to their genomic region, these bioinformatic approaches will reject non-chimeric reads (i.e., fragments that do not contain sequences from different genomic regions). Some chimeric reads, however, may have formed spuriously due to ligation of transcripts which were incidentally proximal to the same RBP during UV cross-linking or stochastic base-pairing between cycling of cognate sRNAs. Thus, identifying true chimeras emerging from the ligation of base-paired RNA species recruited by the bait RBP is crucial to draw accurate biological conclusions. The statistical filtering applied to determine whether a given sRNA-mRNA hybrid is significantly overrepresented among sequenced chimeric reads remains specific to each pipeline. Regardless, resulting output files typically comprise columns indicating the unique sequence index for a read, its sequence, the name of the transcript as well as its coordinates and mapping score in the first and the second half of the chimera. Exploiting this uniformity, we have developed a set of functions which streamline the interpretation of the outputs generated by previously described processing and probabilistic pipelines of CLASH and RIL-seq-derived data [16–19].
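As a brief illustration of the duplicate-collapsing step described above, the following Python sketch groups reads by their random prefix and insert sequence, so that reads sharing both are counted once. It is a simplified sketch under stated assumptions and not the implementation used by hyb, pyCRAC, or the RILseq package: the function name collapse_duplicates, the 3-nt prefix length, and the toy reads are all hypothetical.

```python
from collections import Counter


def collapse_duplicates(reads, prefix_len):
    """Collapse reads that share both random prefix and insert sequence.

    reads      : iterable of read sequences (5' random prefix + insert)
    prefix_len : number of random nucleotides at the 5' end; a hypothetical
                 value here, since it depends on the adapter design
    Returns a dict mapping (prefix, insert) to the number of reads collapsed.
    """
    counts = Counter()
    for read in reads:
        prefix, insert = read[:prefix_len], read[prefix_len:]
        counts[(prefix, insert)] += 1
    return dict(counts)


# Toy reads (illustrative only), each with an assumed 3-nt random prefix.
reads = [
    "ACGCTGAAAAACATAACCC",  # prefix ACG
    "ACGCTGAAAAACATAACCC",  # same prefix and insert: treated as a PCR duplicate
    "TTACTGAAAAACATAACCC",  # same insert, different prefix: kept as independent
]
unique = collapse_duplicates(reads, prefix_len=3)
print(len(unique), "unique reads after collapsing")
```

Keeping the count per collapsed read, rather than discarding it, makes it possible to report how many raw reads supported each unique sequence.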
These scripts aim to provide a computational tool for (i) comparing the folding energies of the RNA-RNA interactions against those of an in silico-generated control, (ii) clustering targets with common seed sequences, and (iii) extracting the sequence motifs underlying sRNA target recognition (Fig. 2). Altogether, we propose that these analyses will be useful to evaluate the quality of CLASH and RIL-seq RNA-RNA interaction datasets. The full package has been made available at https://git.ecdf.ed.ac.uk/sgrannem/clash-data-processing. In this repository, we provide all described code (see python_code), sample notebooks (see notebooks) tested on the statistically processed CLASH data (see test_data) from S. aureus’ RNase III

Fig. 2 (continued) compute their folding energies. Since RNA-RNA hybrids will tend to assemble into conformations with minimum free energy (MFE), the second notebook will check whether, as expected, the folding energy of the recovered duplexes is lower than that of in silico-generated control datasets. Given that sequence complementarity underlies RNA-RNA interactions, sRNAs typically bind to their targets through one or more regions referred to as seed sequences. The third notebook searches for these sequences across input sRNAs and clusters the targets based on the seed region to which they base-pair. Groups of targets can be fed to the fourth notebook to identify the motif by which the interacting sRNA recognizes them.


and E. coli’s Hfq [4, 6] as well as the annotation files required for the analysis of these two datasets (see annotation_files). E. coli Hfq CLASH data were included because, for several sRNAs, we detected a larger number of interactions with mRNAs, thereby enabling meaningful statistical analyses. Prior to applying our functions to their own data, users will need to provide a list of statistically enriched chimeras. For these scripts to work, the user will need to install all required dependencies (see https://git.ecdf.ed.ac.uk/sgrannem/clash-chimera-statistical-analyses for details). It should be noted that our hyb_code applies previously developed tools [16, 19], and can be substituted by other bioinformatic approaches [8]. Due to the multiple software alternatives for statistical analysis of CLASH outputs, we have not included a detailed protocol for each of these in this chapter. Nevertheless, a succinct overview of one of these options is provided in the next section to contextualize the generation of the input file that must be fed to our package.

3.2.2 Preparing a Minimum Input File for the Data Analysis

For the Python code described here to work, the user should first produce a table containing the chimeras that the user wishes to analyze. This table must contain the following columns: ID, hyb_count, rna_class, chromo, name, start, end, strand, ID.1, hyb_count.1, rna_class.1, chromo.1, name.1, start.1, end.1, strand.1, p_value, bh_adj_p_value, total_hybrids.
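Before running the notebooks, it can be useful to confirm that an input table carries every required column. The Python sketch below does this with the standard library only. It assumes a tab-delimited file with a single header row, which may not match the user's own export; the helper name missing_columns and the toy header are hypothetical and do not belong to the package.

```python
import csv
import io

# Column names required by the analysis, as listed in the text above.
REQUIRED_COLUMNS = [
    "ID", "hyb_count", "rna_class", "chromo", "name", "start", "end", "strand",
    "ID.1", "hyb_count.1", "rna_class.1", "chromo.1", "name.1", "start.1",
    "end.1", "strand.1", "p_value", "bh_adj_p_value", "total_hybrids",
]


def missing_columns(handle, delimiter="\t"):
    """Return the required column names absent from the table header.

    `handle` is any open text file or file-like object; the tab delimiter
    is an assumption of this sketch.
    """
    header = next(csv.reader(handle, delimiter=delimiter), [])
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Toy header lacking the two p-value columns (illustrative only).
toy = io.StringIO(
    "\t".join(c for c in REQUIRED_COLUMNS if "p_value" not in c) + "\n"
)
print(missing_columns(toy))
```

An empty list means the header is complete; any names returned can then be corrected in the table before the analysis is started.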

Prior to running the sample code stored in notebooks, ensure that all required dependencies are installed. All dependencies are listed in our repository and can be installed using pip or conda.

3.2.3 Removal of Duplicate Hybrids and Folding Unique Chimera

The first step in the analysis is to remove any chimeras whose fragments have largely overlapping sequences. Some hybrids are still adventitiously annotated as intermolecular interactions because the individual halves of the chimeras are mapped to repetitive genomic regions or genes. For instance, hybrids consisting of two tRNA sequences usually stem from the same tRNA gene. Yet, they are sometimes mapped to different gene names because the tRNA is present in several copies in the genome. Thus, these are unlikely to be bona fide chimeras, and this step is needed to remove such false positives.

1. From the genome_annotation_files directory, load the reference genome file (i.e., Staphylococcus_aureus_usa300_FPR3757_corrected.fa or Escherichia_coli_K12_Chromosome.fa) and its corresponding annotations (i.e., Staphylococcus_aureus_usa300_FPR3757_1.5.fa or Escherichia_coli_str_k_12_substr_mg1655.GCA_000005845.2.22_1.1_extended_3UTRs.gtf).

Table 13 Information encoded by each feature in the input table

Attribute    Description
ID           Unique sequence identifier provided by the pre-processing pipeline.
hyb_count    Number of supporting chimeric reads collapsed to this sequence.
rna_class    Type of RNA (e.g., “ncRNA,” “protein-coding,” “rRNA”).
chromo       Chromosome name.
name         Gene symbol according to the GTF annotation file that you used to determine to which genes each half of the chimera maps.
start        Starting genomic coordinate of the attribute.
end          End genomic coordinate of the attribute.
strand       Strand in which the attribute is found (e.g., “+,” “-,” or “.,” which denotes no strand information).

Attributes with the suffix “.1” refer to the properties of the second half of the chimera. When this suffix is absent, the information pertains to the first half of the chimera.

Reference genome files encompass a representative example of the sequences encoding all genes of a given species. In other words, they are consensus standard versions of a genome against which experimental outputs are compared. Reference genomes are stored in FASTA format, which encloses entries comprising an identifier and the nucleotide sequence to which it refers. Attributions of a sequence to a specific genomic feature are made based on reference annotation files for a given genome. These files are frequently in a GTF format which contains columns displaying the sequence identifier, the project or database from which this information was obtained, the gene, the start and end coordinates, the strand, and attributes such as “mRNA” or “protein-coding.” The specific terminology employed to define discrete features depends on the preferences of the curating source. The user must examine the annotation file and be aware of its inherent notation system to be able to subset the interactions of interest from an input dataset (see Note 12).
2. Import and read the TXT file containing a list with the information for the chimeras.
3. Ensure that the format of the table is identical to the one described in Table 13. If there is any discordance in the column names, correct it using Pandas’ rename() function.


4. Obtain the nucleotide sequence of the two base-pairing fragments constituting each chimera. This can be done with the extractInteractionSequences() function, which will append the sequence corresponding to each half of the hybrid under the columns firstseq and secondseq.
5. Identify duplicate chimeras from the list. To do this, the findSequenceOverlap() function acquires ten nucleotides from the middle of a given sequence and finds out whether it is also present in another chimera.
6. Calculate the percentage overlap between duplicated sequences by applying the calculateOverlapPercentage() tool.
7. Remove entries with any overlapping sequences.
8. Run the foldSequences() tool to compute the minimum folding energy (MFE) of each interaction and compile its secondary structure based on predictions made by the RNAduplex program of the ViennaRNA package (version 2.5.0) [25]. From the resulting dot-bracket notation, the foldSequences() function extracts the double-stranded structural motifs of the hybrid and identifies them as the intermolecular base-pairing sequence. This information is appended to the original table and will be essential for future analyses.
9. Save the information of filtered chimeric fragments as a CSV file for the next steps.

3.2.4 Comparison of Folding Energies of Experimentally Defined Chimeras and Artificially Generated Control Interactions

In their native context, RNA molecules fold into the structures with minimum free energies (MFE). Thus, intermolecular interactions that were recovered by CLASH or RIL-seq are expected to display lower MFE values than spurious chimeras. To assess the quality of the data generated by a CLASH/RIL-seq experiment, we normally compare the cumulative MFE distribution of the experimental dataset with one which has been produced artificially by random sequence rearrangements of interacting RNA sequences or by arbitrarily selecting genes within the same class. This can be done by applying the shuffler.shuffleIntervals() function as specified below and in the Python notebook.

1. Import the reference genome FASTA file, its corresponding GTF annotation file, and the CSV file containing the information and statistics for unique chimeras.
2. Separate the hybrids into inter- and intramolecular interactions using the separateHybrids() tool. In the working example provided in the repository, only intermolecular hybrids were retained.
3. CLASH and RIL-seq will identify many types of interactions other than sRNA-mRNA interactions, such as mRNA-mRNA interactions. Nonetheless, to simplify the analyses, in this example we will focus on the sRNA-mRNA interactions identified in the E. coli and MRSA CLASH data. For this, we use the extractInteractionTypesFromStats() function to filter the data according to the rna_class feature under which they were grouped based on the GTF annotation file (Table 13). To select the interactions under study, the user must be acquainted with the nomenclature by which interactions have been described in their GTF annotation file. Different sources curate annotation files using specific terminologies. For example, while small RNAs will be contained within the sRNA feature in the E. coli K-12 reference, the same group of transcripts will be classified as ncRNA in the S. aureus USA300 annotation file. Similarly, whereas messenger RNAs will be described as protein-coding in the latter, throughout the former reference file they will be spread among the protein-coding attribute, if they belong to the coding sequence of the mRNA, or the 3UTR or 5UTR features, if they were mapped within its untranslated region. Considering these differences is essential to shortlist the relevant type of interaction. The user can list the annotations used throughout the file by typing set(intermolecular.rna_class). In both sample notebooks, we were interested in retrieving sRNA-mRNA interactions. Hence, we selected ncRNA:protein_coding, ncRNA:3UTR and ncRNA:5UTR hybrids across E. coli intermolecular interactions and sRNA:protein_coding chimeras for the S. aureus data.
4. Since, in this example, we specifically wanted to focus on sRNA and mRNA fragments that were mapped to the gene on the same strand, we removed any chimeras that contained sequences mapped anti-sense to the sRNA or mRNA targets. For this, we used the removeFeatureTypeFromStats() function and specified that all anti-sense interactions should be removed.
5. Apply the correctPositions() function to amend the start and end positions to solely consider base-paired regions. In other words, only nucleotide positions involved in intermolecular base-pairing should be considered further. Do not discard strand information, as it will be needed for subsequent steps.
6. Run the ShuffleIntervalCoordinates() function to exclusively consider one type of hybrid (e.g., ncRNA:protein-coding). In the example script, we focused on interactions between sRNAs and protein-coding genes only.
7. Execute the shuffler.shuffleIntervals() function to generate a control dataset of simulated interactions between random fragments stemming from rearrangements of the sequence of each selected hybrid or that of the genes belonging to the RNA classes stipulated in the previous step. Include the shuffletype="overgenes" tag to redistribute the sRNA and mRNA sequences for each chimera. In this case, for each sRNA-mRNA hybrid, the function will randomly grab a sequence of the same length as the sRNA half from the same sRNA gene and calculate the MFE of the structures that these fictitious fragments would form with artificial sequences arising from an analogous sequential rearrangement within the genomic coordinates of its target mRNA. The function will perform this for each sRNA-mRNA pair in the experimental dataset. Alternatively, pass the shuffletype="overfeatures" parameter to fix the class of genes which will be reshuffled as a control. In this instance, the function will retrieve and re-arrange the sequences for all the genes displaying the features specified in step 6 of this subheading. More specifically, in the test notebook, we selected genes under the ncRNA and protein_coding categories. Therefore, the sample output file will contain the MFE values for interactions between random sequential redistributions of sRNAs with equivalent fragments extracted from their protein-coding counterparts (see Note 13).
8. Compare the MFE of the chimeric structures in the experimental and control datasets. Cumulative distributions can be plotted using the Python Seaborn kdeplot() function and statistically compared by performing a Kolmogorov-Smirnov test using the SciPy package (Fig. 2). In this last step, low p-values (≤0.01) indicate that experimental chimeras are significantly more likely to fold into structured RNAs than in silico-shuffled hybrids.

3.2.5 Clustering of Targets with Common Seed Sequences

Small RNAs are known to use one or multiple sequences to basepair with mRNA targets. Following the terminology established in eukaryotic micro RNAs [26], these complementary regions are referred to as seed sequences [27]. If the recovered putative interactions underlie true sRNA-mRNA base-pairing, target mRNAs should display at least one motif which is complementary to that of the sRNA seed region. As a result, if all mRNA targets for a given sRNA were aligned to the sequences through which they interact with their cognate ncRNA, we expect to observe the formation of a cluster for every seed sequence in the sRNA. For this purpose, the next steps will (i) identify overrepresented sequences within the sRNA fragments of several chimeras, (ii) align target transcripts around the region through which they interact with the seed sequence, and (iii) cluster targets with overlapping interaction

Defining Bacterial RNA-RNA Interactomes Using CLASH

339

regions for downstream motif analysis. Notably, a considerable number of interactions is needed to statistically substantiate the existence of various seed regions within an sRNA as well as to identify complementary motifs among target species. Hence, to illustrate the application of the remaining functions, we have used E. coli Hfq CLASH data as input [4], as this dataset contained a large number of mRNA interactions for several sRNAs.

1. Load the reference genome FASTA file, its corresponding GTF annotation file, and a TXT file specifying the chromosome length for the same genome (e.g., Staphylococcus_aureus_usa300_FPR3757_corrected.txt or Escherichia_coli_K12_chromosome_length.txt).
2. Import the TXT file containing the information and statistics of the chimeras without overlapping sequences.
3. Define the set of chimeras for which the seed sequence needs to be identified. For instance, to select chimeras comprising at least one sRNA, the test notebook subsets all hybrids for which at least half of the chimera is classified as a ncRNA. This is done by picking the pairs showing ncRNA values in the column rna_class, which designates the feature of the first fragment of the chimera, or in the column rna_class.1, which designates the feature of the second fragment of the same pair.
4. Save the extracted hybrids into a GTF file to visualize them in a genome browser afterward.
5. To identify the seed region by which the sRNA under study recognizes its targets, start by running the geneInteractionTable() function. This will create a Pandas data frame showing the coordinates within the gene coding for the sRNA of interest that interact with other transcripts. The table will hold this information for each interaction involving the ncRNA of interest.
6. Format the resulting data frame as a binary output. In other words, nucleotides interacting with a target transcript will display a 1 for their coordinate in the row corresponding to that chimera. If, on the contrary, a nucleotide does not contribute to base-pairing between the sRNA and its target, the value for the coordinate assigned to that nucleotide will be 0 in the row corresponding to that interaction.
7. Find sequential clusters among the base-pairing nucleotides of all the interactions involving the sRNA of interest (Fig. 2).
8. Sort the chimeras containing the identified clusters and save the output as a CSV file. This table will have an identical format to that of the input file but will solely contain the hybrids with seed sequences.
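The binary encoding and cluster search of steps 6 and 7 can be illustrated with a minimal pandas sketch. This is not the notebook's geneInteractionTable() implementation: the function names, the toy coordinates, and the `min_support` threshold are hypothetical, and the sketch assumes, for simplicity, that every nucleotide inside the sRNA fragment of a chimera counts as interacting.

```python
import numpy as np
import pandas as pd


def binary_interaction_table(srna_start, srna_end, fragments):
    """Return one row per chimera and one column per sRNA coordinate.

    A cell holds 1 if the coordinate lies inside the sRNA fragment of that
    chimera and 0 otherwise (simplifying assumption, see lead-in).
    """
    coords = np.arange(srna_start, srna_end + 1)
    ids = list(fragments)
    matrix = np.zeros((len(ids), coords.size), dtype=int)
    for row, chimera_id in enumerate(ids):
        frag_start, frag_end = fragments[chimera_id]
        matrix[row] = (coords >= frag_start) & (coords <= frag_end)
    return pd.DataFrame(matrix, index=ids, columns=coords)


def contiguous_clusters(table, min_support=2):
    """Return (start, end) runs of coordinates covered by >= min_support chimeras."""
    support = table.sum(axis=0)
    clusters = []
    run_start = None
    previous = None
    for coord, count in support.items():
        if count >= min_support:
            if run_start is None:
                run_start = int(coord)
            previous = int(coord)
        elif run_start is not None:
            clusters.append((run_start, previous))
            run_start = None
    if run_start is not None:
        clusters.append((run_start, previous))
    return clusters


# Hypothetical sRNA spanning coordinates 100-140 and four toy chimeras.
toy_fragments = {
    "chimera_1": (105, 112),
    "chimera_2": (107, 115),
    "chimera_3": (130, 136),
    "chimera_4": (131, 138),
}
toy_table = binary_interaction_table(100, 140, toy_fragments)
print(contiguous_clusters(toy_table, min_support=2))
```

In this toy case two separate runs of coordinates are supported by at least two chimeras each, which is the kind of pattern one would interpret as two candidate seed regions.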


Sofia Esteban-Serna et al.

9. Locate the clusters within the sequence of chimeras containing seed regions and export these in FASTA and GTF format. These files will be needed for motif enrichment analyses using the MEME suite and the pyCRAC pipeline [19, 28].
10. Visualize the clusters of mRNAs. The sRNA under study hybridizes to each set of targets using a distinct seed region. Export the resulting heatmap in PDF format.

3.2.6 Uncovering Enriched Motifs within Seed Sequences

Given that sRNAs base-pair to their cognate mRNAs through a recognition sequence, bona fide sRNAs should contain an enriched set of motifs (i.e., the seed sequences) through which they hybridize to their mRNA targets. To test this, this script will fetch the hybrid clusters established in the previous steps, extract the chimeric fragments corresponding to the sRNA half of the hybrid, and search for overrepresented sequence patterns. Furthermore, this analysis will compute the complementarity between the defined sRNA seed regions and the mRNA targets grouped together during the preceding step.

1. Define the sRNA(s) in which to search for enriched motifs.
2. Retrieve the FASTA files generated after seed clustering and concatenate them.
3. Feed the merged FASTA file to the MEME-ChIP motif discovery program [28]. In the example provided, this software was set to search for oligomers between 4 (-minw) and 8 (-maxw) nucleotides long (Fig. 1b). Moreover, the script ran in parallel on 8 processors (-meme-p) using a higher-order background model (-meme-mod). Altogether, these parameters were found to yield the fastest and most accurate motif detection process among the sample target transcripts (-rna). However, users are encouraged to adjust these specifications to their needs.
4. An alternative approach to uncover sequence enrichments is to use pyMotif.py from the pyCRAC package [19]. Unlike MEME-ChIP, this program requires input files to be in the GTF format. This tool extracts motifs that are at least tetramers but no longer than octamers from input sequences that have been previously mapped to a genomic feature. Then, it generates a control dataset by random rearrangement of the sequences within intervals of the same genomic feature. Finally, the function compares the abundance of a given motif in both datasets and calculates its statistical significance, which is given as a Z-score (Fig. 2).
5. Resulting sequences can be filtered according to their statistical significance (e.g., Z-score > 2) or based on additional criteria. In this notebook, pentamers have been shortlisted for individual examination.
6. Verify complementarity between the seed sequences of the sRNA of interest and its interacting mRNAs using the MAST tool of the MEME suite [28]. In principle, identical sRNA recognition motifs should be recovered from the complementary sequences of the target mRNAs. The reverse complements of the targets of a given sRNA can be obtained by applying the reverse_complement() tool to the FASTA files of all mRNA clusters. Provided an input file containing a set of motifs (i.e., the sRNA seed regions outlined in the TXT files produced by MEME in 4.6.3), MAST will search for those sequences within the reverse complements. Subsequently, this program will rank the sequences in the latter file by best combined match to the group of motifs that it was fed. After considering motifs that had a MAST p-value equal to or lower than 0.05, the user should be able to reproduce the seed sequences found by motif clustering (Fig. 2).
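The logic behind step 6 can be illustrated with a short, dependency-free Python sketch: reverse-complement each target sequence and look for the sRNA seed in the result. The function names and toy sequences below are hypothetical, the helper is a plain stand-in rather than the reverse_complement() tool used in the notebooks, and exact substring matching is only a crude proxy for the statistical motif scoring that MAST performs.

```python
# Output uses the DNA alphabet; U in the input is treated like T.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def revcomp_sketch(sequence):
    """Return the reverse complement of a DNA or RNA sequence (DNA alphabet)."""
    return sequence.translate(_COMPLEMENT)[::-1]


def targets_matching_seed(seed, target_sequences):
    """Map target name -> position of the seed in the target's reverse complement.

    Targets whose reverse complement lacks an exact copy of the seed are omitted.
    """
    seed = seed.upper().replace("U", "T")
    hits = {}
    for name, sequence in target_sequences.items():
        position = revcomp_sketch(sequence).upper().find(seed)
        if position != -1:
            hits[name] = position
    return hits


# Hypothetical seed and target fragments, for illustration only.
toy_targets = {
    "mRNA_a": "ATGGAAGCTTAA",
    "mRNA_b": "ATGCCCGGGTAA",
}
print(targets_matching_seed("GCUUC", toy_targets))
```

Only targets carrying a stretch complementary to the seed are reported, which mirrors the expectation that the sRNA recognition motif is recovered from the reverse complements of genuine targets.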

4 Notes

1. We recommend purchasing the enzymes and radioactively labeled ATP from the suppliers mentioned in the protocol to avoid compromising the experimental output. When we developed the original CRAC protocol, we found that the quality of the enzymes provided by different manufacturers varied considerably. Should some of the listed reagents no longer be available, it is imperative that you check whether the replacement enzymes are active in the reaction buffer (e.g., 1× PNK) that we use for most enzymatic reactions.
2. For our CLASH experiments, we always use 32P-ATP supplied by Perkin Elmer. We have purchased 32P-ATP from Hartmann Analytic in the past and tested it with our CRAC/CLASH protocols. While 32P-ATP from Hartmann works well for labeling DNA and/or RNA oligonucleotides, for reasons we do not fully understand, it does not reliably work for labeling cross-linked RNAs under our reaction conditions. Thus, we recommend using Perkin Elmer as a supplier.
3. All the adapter sequences that we list here were purchased from Integrated DNA Technologies (IDT). We strongly recommend having these oligonucleotides HPLC-purified using RNase-free reagents after synthesis. This not only removes impurities but also enriches for full-length oligonucleotides.


4. We recommend always aliquoting the enzymes that you need for the CLASH experiments to reduce the chances of cross-contaminating samples.
5. Cross-contamination is a very common issue when simultaneously handling multiple samples in CLIP/CRAC/CLASH protocols. It is, hence, essential to take all possible measures to prevent this, for example, by opening lids very carefully and by regularly cleaning work surfaces and gel tanks. Ideally, one would do experiments with different baits or conditions on different days. Make sure to also regularly UV irradiate the pipets that you use for your experiments to remove lingering cDNA sequences.
6. If you need to use an enzyme from a different supplier for your experiments (for example, some groups buy recombinant TEV protease), please make sure that it does not have a His6 tag, as this enzyme will then compete with your bait protein for binding to the Ni-NTA beads.
7. The affinity purification beads we use (Ni-NTA agarose and anti-FLAG magnetic beads) are not compatible with high concentrations of reducing agents, such as DTT and β-mercaptoethanol. This is why we never use more than 5 mM β-mercaptoethanol in our buffers. If the concentration of the reducing agent is too high, your Ni-NTA beads will turn grey, and you have probably accidentally eluted your protein-RNA complex.
8. Usually, when a CLASH experiment fails, it is because the ATP stock that was used was of poor quality, was not stored appropriately, or was too old. We recommend buying a new tube of ATP every 6 months or so and making 5–10 μL aliquots of 100 mM ATP. Store these aliquots in the freezer at -20 °C and do not re-freeze them once thawed.
9. Sample handling during TCA and acetone precipitation steps must be carried out inside a fume hood, as inhalation of or skin contact with these reagents can cause eye and respiratory irritation.
10. After TCA precipitation, if you see a very large white pellet, then it is extremely important to resuspend this pellet in 100% (v/v) acetone and to centrifuge the sample for another 20 min at 4 °C at full speed in a microcentrifuge. These large pellets are likely salts that have precipitated during the incubation with TCA.
11. After partial RNase digestion and radiolabeling of the cross-linked RNA, we always run the purified protein-RNA complexes on bis-tris gels to get a rough idea of how efficient the cross-linking was and whether sufficient cross-linked RNA was purified. After running the gel, we use autoradiography to visualize the radiolabeled cross-linked RNA. If very long exposures of a film or phosphorimager screen are needed to see a reasonably strong radioactive signal, we do not recommend proceeding to the library preparation step, as it is very unlikely that the experiment will yield high-complexity cDNA libraries. In these cases, we advise optimizing the immunoprecipitation and cross-linking steps to increase the yield of cross-linked protein. We often perform growth curves and analyze the levels of the bait proteins at various growth stages to select a growth phase at which the protein is expressed at sufficiently high levels to perform CLASH. With most baits, and when we use fresh 32P-ATP, 1- to 4-h exposures to a film are usually enough to see a strong signal.
12. Bioinformatics analyses: If you are using a GTF-formatted annotation file for your analyses, it is pivotal to confirm that this file is compatible with the pyCRAC package, which we used for processing the hybrid reads in our notebooks. All GTF annotation files need to be checked with the pyCheckGTF() script before you start the analyses. If you are unsure what such an annotation file should look like, examine the GTF files provided in the genome_annotation directory in the repository.
13. Bioinformatics analyses: The shuffleIntervals() function may return warnings if it is unable to figure out whether a fragment from a chimera originates from an sRNA or an mRNA. For example:

Please make sure each row has a single feature. More than one feature (protein_coding,sRNA) found for 508259 and 508279 Not sure which one to use!

This will happen if the coordinates for a given region in the annotation file are simultaneously attributed to an sRNA and a protein_coding gene. The user can change the entries for this fragment in the rna_class or rna_class.1 columns to either sRNA or mRNA. By default, these chimeras will be removed.

References

1. Waters LS, Storz G (2009) Regulatory RNAs in bacteria. Cell 136:615–628. https://doi.org/10.1016/J.CELL.2009.01.043
2. Kudla G, Granneman S, Hahn D et al (2011) Cross-linking, ligation, and sequencing of hybrids reveals RNA-RNA interactions in yeast. Proc Natl Acad Sci 108:10010–10015. https://doi.org/10.1073/PNAS.1017386108/SUPPL_FILE/SD01.XLS
3. Helwak A, Kudla G, Dudnakova T, Tollervey D (2013) Mapping the human miRNA interactome by CLASH reveals frequent noncanonical binding. Cell 153:654–665. https://doi.org/10.1016/j.cell.2013.03.043
4. Iosub IA, van Nues RW, McKellar SW et al (2020) Hfq CLASH uncovers sRNA-target interaction networks linked to nutrient availability adaptation. Elife 9:1–33. https://doi.org/10.7554/ELIFE.54655
5. Mediati DG, Wong JL, Gao W et al (2022) RNase III-CLASH of multi-drug resistant Staphylococcus aureus reveals a regulatory mRNA 3′UTR required for intermediate vancomycin resistance. Nat Commun 13:3558. https://doi.org/10.1038/s41467-022-31177-8
6. McKellar SW, Ivanova I, Arede P et al (2022) RNase III CLASH in MRSA uncovers sRNA regulatory networks coupling metabolism to toxin expression. Nat Commun 13:3560. https://doi.org/10.1038/s41467-022-31173-y
7. Tree JJ, Granneman S, McAteer SP et al (2014) Identification of bacteriophage-encoded anti-sRNAs in pathogenic Escherichia coli. Mol Cell 55:199–213. https://doi.org/10.1016/J.MOLCEL.2014.05.006
8. Melamed S, Peer A, Faigenbaum-Romm R et al (2016) Global mapping of small RNA-target interactions in bacteria. Mol Cell 63:884–897. https://doi.org/10.1016/J.MOLCEL.2016.07.026
9. Melamed S, Adams PP, Zhang A et al (2020) RNA-RNA interactomes of ProQ and Hfq reveal overlapping and competing roles. Mol Cell 77:411–425.e7. https://doi.org/10.1016/J.MOLCEL.2019.10.022
10. Matera G, Altuvia Y, Gerovac M et al (2022) Global RNA interactome of Salmonella discovers a 5′ UTR sponge for the MicF small RNA that connects membrane permeability to transport capacity. Mol Cell 82:629–644.e4. https://doi.org/10.1016/j.molcel.2021.12.030
11. Mizrahi SP, Elbaz N, Argaman L et al (2021) The impact of Hfq-mediated sRNA-mRNA interactome on the virulence of enteropathogenic Escherichia coli. Sci Adv 7:eabi8228. https://doi.org/10.1126/SCIADV.ABI8228
12. Huber M, Lippegaus A, Melamed S et al (2022) An RNA sponge controls quorum sensing dynamics and biofilm formation in Vibrio cholerae. Nat Commun 13:1–14. https://doi.org/10.1038/s41467-022-35261-x
13. Fuchs M, Lamm-Schmidt V, Lenče T et al (2022) A network of small RNAs regulates sporulation initiation in C. difficile. bioRxiv. https://doi.org/10.1101/2022.10.17.512509
14. Boisset S, Geissmann T, Huntzinger E et al (2007) Staphylococcus aureus RNAIII coordinately represses the synthesis of virulence factors and the transcription regulator Rot by an antisense mechanism. Genes Dev 21(11):1353–1366. https://doi.org/10.1101/gad.423507
15. Lioliou E, Sharma CM, Altuvia Y et al (2013) In vivo mapping of RNA–RNA interactions in Staphylococcus aureus using the endoribonuclease III. Methods 63:135–143. https://doi.org/10.1016/J.YMETH.2013.06.033
16. Travis AJ, Moody J, Helwak A et al (2014) Hyb: a bioinformatics pipeline for the analysis of CLASH (crosslinking, ligation and sequencing of hybrids) data. Methods 65:263–273. https://doi.org/10.1016/J.YMETH.2013.10.015
17. Sharma E, Sterne-Weiler T, O’Hanlon D, Blencowe BJ (2016) Global mapping of human RNA-RNA interactions. Mol Cell 62:618–626. https://doi.org/10.1016/j.molcel.2016.04.030
18. Waters SA, McAteer SP, Kudla G et al (2017) Small RNA interactome of pathogenic E. coli revealed through crosslinking of RNase E. EMBO J 36:374–387. https://doi.org/10.15252/EMBJ.201694639
19. Webb S, Hector RD, Kudla G, Granneman S (2014) PAR-CLIP data indicate that Nrd1-Nab3-dependent transcription termination regulates expression of hundreds of protein coding genes in yeast. Genome Biol 15:R8. https://doi.org/10.1186/gb-2014-15-1-r8
20. Mäder U, Nicolas P, Depke M et al (2016) Staphylococcus aureus transcriptome architecture: from laboratory to infection-mimicking conditions. PLoS Genet 12:1–32. https://doi.org/10.1371/journal.pgen.1005962
21. Coombes BK, Brown NF, Valdez Y et al (2004) Expression and secretion of Salmonella pathogenicity island-2 virulence genes in response to acidification exhibit differential requirements of a functional type III secretion apparatus and SsaL. J Biol Chem 279:49804–49815. https://doi.org/10.1074/jbc.M404299200
22. van Nues R, Schweikert G, de Leau E et al (2017) Kinetic CRAC uncovers a role for Nab3 in determining gene expression profiles during stress. Nat Commun 8:12. https://doi.org/10.1038/s41467-017-00025-5
23. McKellar SW, Ivanova I, Van Nues RW et al (2020) Monitoring protein-RNA interaction dynamics in vivo at high temporal resolution using χCRAC. J Vis Exp (159):e61027
24. Das U, Shuman S (2013) Mechanism of RNA 2′,3′-cyclic phosphate end healing by T4 polynucleotide kinase-phosphatase. Nucleic Acids Res 41:355–365. https://doi.org/10.1093/nar/gks977
25. Lorenz R, Bernhart SH, Höner zu Siederdissen C et al (2011) ViennaRNA Package 2.0. Algorithms Mol Biol 6:1–14. https://doi.org/10.1186/1748-7188-6-26/TABLES/2
26. Bartel DP (2009) MicroRNAs: target recognition and regulatory functions. Cell 136:215–233. https://doi.org/10.1016/J.CELL.2009.01.002
27. Bandyra KJ, Said N, Pfeiffer V et al (2012) The seed region of a small RNA drives the controlled destruction of the target mRNA by the endoribonuclease RNase E. Mol Cell 47:943–953. https://doi.org/10.1016/J.MOLCEL.2012.07.015
28. Bailey TL, Boden M, Buske FA et al (2009) MEME suite: tools for motif discovery and searching. Nucleic Acids Res 37:W202–W208. https://doi.org/10.1093/NAR/GKP335

Chapter 18

Global Identification of RNA-Binding Proteins in Bacteria

Thomas Søndergaard Stenum and Erik Holmqvist

Abstract

RNA-binding proteins (RBPs) are at the heart of many biological processes and are therefore essential for cellular life. Following identification of single RBPs by classical genetics and molecular biology methods, approaches for RBP discovery on a systems level have recently emerged. For instance, RNA interactome capture (RIC) enables the global purification of RBPs cross-linked to polyadenylated RNA using oligo(dT) probes. RIC was originally developed for eukaryotic organisms but was recently established for capturing RBPs in bacteria. In this chapter, we provide a detailed step-by-step protocol for performing RIC in bacteria. The protocol is based on its application to Escherichia coli but should be amenable for charting other genetically tractable bacterial species.

Key words RNA, RNA-binding, RBP, Interactome, RIC, UV crosslinking, Protein-RNA, RNA–protein, Pull-down

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_18, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

Interactions between protein and RNA are vital for all living organisms. Macromolecular machines such as the ribosome and the spliceosome are large ribonuclear particles that function through a myriad of RNA–protein contacts [1–3]. RNA-binding proteins (RBPs) are also important factors in gene regulation by controlling transcription, translation, and mRNA stability [4, 5]. In bacteria, regulatory RBPs control transcription termination, translation initiation rates, and ribonuclease (RNase)-mediated RNA decay [6]. Most of the well-characterized RBPs in bacteria were discovered serendipitously; investigations into the basis for phenotypes associated with mutations in a particular gene led to the discovery of RNA-binding activity of the corresponding protein [7–10]. These discoveries, and subsequent insights into their molecular mechanisms, have had tremendous impacts on our understanding of RBP-mediated gene regulation. Consequently, this has inspired the development of methods specifically aimed at identifying RBPs. This includes methods for the identification of RBPs associated with a particular RNA, for instance, by co-purification of RBPs associated with MS2 aptamer-tagged RNA ligands [11], and methods aiming at global identification of all cellular proteins that interact with RNA.

Reaching the latter aim requires a solution to the intricate problem of how to separate proteins that interact with RNA from all other cellular proteins. This has been achieved by the general isolation of cellular RNA and associated proteins, either by selective binding of RNA to silica beads [12, 13] or by fishing out polyadenylated RNA using immobilized oligo(dT) probes, the latter often referred to as RNA interactome capture (RIC) [14, 15]. Another strategy is based on the differential phase separation of RNA–protein complexes, as opposed to free protein or RNA, in phenol-based extraction procedures [16–18]. Common to these strategies is that RNA–protein complexes are stabilized by cross-linking, often achieved by ultraviolet (UV) irradiation, before RNA–protein complex isolation, which allows the use of stringent wash buffers during the purification procedure. While the development of global RBP identification methods has been pioneered by research on eukaryotic cells, these strategies have only recently been applied to bacteria.

In this chapter, we provide a step-by-step protocol for RIC in bacteria, based on a recent proof-of-concept article presenting successful oligo(dT)-based global RBP identification in Escherichia coli [19]. The protocol is based on the original RIC protocol by Castello et al. developed for eukaryotic cells [14, 20] but has been modified for its application in bacteria (Fig. 1). The most important difference from the original protocol is that bacterial RNA, in contrast to eukaryotic mRNA, lacks extensive poly(A) tails. Thus, the bait required for purification with oligo(dT) is generally lacking.
To circumvent this, a short pulse of poly(A) polymerase I (PAPI) overexpression is applied to induce transcriptome-wide polyadenylation before UV cross-linking [19]. After cell lysis, the polyadenylated RNA is captured on oligo(dT) probes attached to magnetic beads, and non-cross-linked proteins are removed by washes in high-salt buffers. After elution from the beads, the purified RNA is degraded by the addition of RNases, thereby releasing the co-purified RBPs, which can be analyzed by Western blot, Coomassie staining, and mass spectrometry. This protocol includes all materials and experimental steps for the isolation of RBPs using RIC in bacteria. The protocol does not cover mass spectrometry experiments and bioinformatic analysis, which are covered elsewhere [19].

Mapping RBPs in Bacteria


Fig. 1 Schematic overview of the major steps in the RIC protocol for bacteria. Cellular RNA is polyadenylated by the pulse expression of PAPI. RNA–protein interactions are stabilized by UV light irradiation of intact cells. Following cell lysis, polyadenylated RNA and cross-linked proteins are captured on oligo(dT) probes attached to magnetic beads. Washes in high-salt buffers remove proteins that are unbound or bound but not cross-linked. RNA–protein complexes are eluted by temperature-dependent detachment from oligo(dT) probes, and the RNA moieties of the eluted complexes are subsequently degraded by RNase treatment. The success and stringency of RBP purification can be analyzed by Western blot (for specific known RBPs and non-RBP proteins) and Coomassie staining (total protein). The identity of the total protein content of the eluate is determined by mass spectrometry.

2 Materials

2.1 Bacterial Culturing

All media and reagents should be sterile.

1. Escherichia coli MG1655 strain EHS-2288 carrying the plasmid pPAPI for arabinose-inducible overexpression of poly(A) polymerase I (PAPI) from Salmonella typhimurium (see Note 1).
2. 10× M9 salts solution: 422.7 mM Na2HPO4, 220.4 mM KH2PO4, 85.6 mM NaCl, 187 mM NH4Cl (see Note 2).
3. M9 minimal media: 1× M9 salts, 2 mM MgSO4, 0.1 mM CaCl2, 0.4% (v/v) glycerol, 0.1% (w/v) casamino acids, 100 μg/mL ampicillin (see Notes 3–5).
4. 100 mL Erlenmeyer flasks.
5. 5 L Erlenmeyer flasks.

6. Incubators for 100 mL and 5 L Erlenmeyer flasks at 37 °C and 180 rpm.
7. 10% arabinose (see Note 3).
8. Spectrophotometer for measuring the OD600 of culture samples.
9. Liquid nitrogen for harvesting cultures to prepare total RNA (see Note 6).

2.2 Buffers and Reagents

All buffers should be sterile and RNase free.

1. PBS pH 7.4: 137 mM NaCl, 2.7 mM KCl, 8 mM Na2HPO4, 2 mM KH2PO4.
2. Stop solution: 95% (v/v) ethanol, 5% (v/v) phenol for RNA purification (see Note 7).
3. 4× Laemmli buffer: 250 mM Tris–HCl pH 6.8, 40% (v/v) glycerol, 4% (w/v) lithium dodecyl sulfate (LiDS), 0.02% (w/v) bromophenol blue, 10% (v/v) 2-mercaptoethanol.
4. Oligo(dT)25 magnetic beads (New England Biolabs cat. no. S1419S). Store at 4 °C.
5. Lysis buffer: 20 mM Tris–HCl (pH 7.5), 500 mM LiCl, 0.5% (w/v) LiDS, 1 mM EDTA, 5 mM DTT. Store at 4 °C (see Notes 8 and 9).
6. Wash buffer 1: 20 mM Tris–HCl (pH 7.5), 500 mM LiCl, 0.1% (w/v) LiDS, 1 mM EDTA, 5 mM DTT. Store at 4 °C (see Notes 8 and 9).
7. Wash buffer 2: 20 mM Tris–HCl (pH 7.5), 500 mM LiCl, 1 mM EDTA, 5 mM DTT. Store at 4 °C (see Notes 8 and 9).
8. Wash buffer 3: 20 mM Tris–HCl (pH 7.5), 200 mM LiCl, 1 mM EDTA, 5 mM DTT. Store at 4 °C (see Notes 8 and 9).
9. Storage buffer for oligo(dT)25 magnetic beads: 1× PBS with 0.05% (v/v) Tween 20 and 0.02% (w/v) NaN3.
10. Ribonuclease A/T1 mix (RNase A/T1, Thermo Scientific cat. no. EN0551).
11. QC Colloidal Coomassie Stain (BioRad cat. no. 1610803).
12. Antibody against a known RBP for Western blotting (see Note 10).
13. Strip buffer for Western blot: 0.1 M glycine, 20 mM magnesium acetate (MgAc), 50 mM KCl, adjusted to pH 2.2 using HCl.
14. Labeled d(T)25 oligo probe for Northern blotting.
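Preparing the lysis and wash buffers from concentrated stocks is a matter of C1V1 = C2V2 arithmetic, which the following sketch makes explicit. The final concentrations are those of the lysis buffer listed above; the stock concentrations (1 M Tris–HCl, 8 M LiCl, 10% LiDS, 0.5 M EDTA, 1 M DTT), the 100 mL batch size, and the function name are assumptions chosen for illustration and should be replaced by the stocks actually available in the laboratory.

```python
def stock_volumes(final_volume_ml, recipe, stocks):
    """Return the volume (mL) of each stock needed, plus water to volume.

    recipe and stocks map component -> concentration; for each component
    both values must be expressed in the same unit (e.g., mM or % w/v).
    """
    volumes = {}
    for component, final_conc in recipe.items():
        volumes[component] = final_conc / stocks[component] * final_volume_ml
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


# Final concentrations of the lysis buffer (item 5 in the list above).
lysis_buffer = {
    "Tris-HCl pH 7.5 (mM)": 20,
    "LiCl (mM)": 500,
    "LiDS (% w/v)": 0.5,
    "EDTA (mM)": 1,
    "DTT (mM)": 5,
}

# Hypothetical stock solutions; adjust to what is on the shelf.
assumed_stocks = {
    "Tris-HCl pH 7.5 (mM)": 1000,
    "LiCl (mM)": 8000,
    "LiDS (% w/v)": 10,
    "EDTA (mM)": 500,
    "DTT (mM)": 1000,
}

for component, volume in stock_volumes(100, lysis_buffer, assumed_stocks).items():
    print(f"{component}: {volume:.2f} mL")
```

The same function can be reused for wash buffers 1 to 3 by changing the recipe dictionary; whether DTT is added fresh before use is governed by the chapter's Notes, not by this calculation.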

2.3 Equipment

1. UV radiation device equipped with a 254 nm bulb, for cross-linking of RNA–protein complexes.
2. Plastic tray (30 × 30 cm) for UV cross-linking of cultures.
3. Centrifuge bottles 500 mL.
4. Centrifuge for 500 mL bottles, capable of centrifuging at 4600 × g and 4 °C (see Note 11).
5. FastPrep-24 for cell lysis (see Note 12).
6. 2 mL screw cap tubes for the FastPrep.
7. Zirconium or silica beads with a diameter of 100 μm.
8. 50 mL centrifuge tubes.
9. Centrifuge for 50 mL tubes capable of centrifuging at 3500 × g at 4 °C.
10. 2 mL microcentrifuge tubes.
11. Centrifuge for 2 mL microcentrifuge tubes, capable of centrifuging at 17,000 × g at 4 °C.
12. Rotation device for 50 mL centrifuge tubes, capable of rotating at 15 rpm. For incubation and washing of cell lysates with oligo(dT)25 magnetic beads.
13. Magnetic separation racks for capturing magnetic beads in both 50 and 2 mL tubes. For example, cat. no. S1507S and S1509S from New England Biolabs.
14. NanoDrop spectrophotometer for measuring RNA concentration.
15. Standard Western and Northern blot equipment.
16. SpeedVac vacuum concentrator.

3 Methods

3.1 Verification of Polyadenylation of RNA

Before conducting the RNA capture, it is important to verify the successful polyadenylation of cellular transcripts. This is done by Northern blotting and can be complemented by RNA-seq. At the same time, to keep cellular conditions as close to natural as possible, it is recommended to harvest cultures before the overexpression of PAPI negatively affects the growth rate of the bacteria. This is assessed by simultaneously measuring growth rate and polyadenylation levels at several different timepoints after PAPI induction.

1. The day before the experiment, prepare two pre-cultures each of EHS-2288 (pPAPI) and EHS-2289 (vector control) by inoculating 2 mL of M9 minimal media from independent single colonies (see Notes 1, 5, and 13).
2. Incubate pre-cultures overnight at 37 °C with shaking at 180 rpm.

3. The next day, dilute the overnight cultures 100-fold by adding 0.25 mL of pre-culture to 25 mL pre-heated M9 minimal media in 200 mL Erlenmeyer flasks. Incubate the cultures at 37 °C with shaking at 180 rpm.
4. Measure the OD600 of the cultures approximately every hour to determine the pre-induction growth rate.
5. While the cultures are growing, prepare 2 mL microcentrifuge tubes for RNA harvest by adding 400 μL stop solution to the number of tubes needed. Place the tubes on ice.
6. At an OD600 of 0.4, induce PAPI overexpression by adding arabinose to a final concentration of 0.2%. From this point, measure OD600 every 5 min to get a detailed picture of the growth burden caused by the PAPI overexpression.
7. At time intervals of 15 min after addition of arabinose (every third OD600 measurement), harvest culture aliquots by moving 1.6 mL of culture to a 2 mL microcentrifuge tube containing stop solution. Immediately vortex the tube and flash freeze in liquid nitrogen (see Note 6).
8. Continue measuring OD600 and harvesting culture for RNA purification until the growth rate is clearly negatively affected.
9. The samples for RNA harvest can now be stored at -80 °C or processed directly.
10. Plot the measured OD600 values of the cultures against time as a semi-log plot to evaluate the growth rate of each culture before and after induction of PAPI.

3.2 RNA Harvest and Northern Blotting

Thaw the culture samples harvested into stop solution on ice and then pellet the cells by centrifugation at 4 °C. Use conventional hot phenol extraction to purify RNA from all the samples. Size-separate 15 μg of total RNA from each sample in a 6% polyacrylamide 8 M urea gel by classical PAGE. After electrophoresis, transfer RNA from the gel to a nylon membrane and continue by probing the membrane for polyadenylated transcripts using a labeled d(T)25 DNA oligo probe. As this probe is specific for poly(A) tails, it can recognize all transcripts with a sufficiently long poly(A) tail. This, in combination with differences in the length of the poly(A) tails between individual transcripts, results in a smear consisting of a large number of RNAs with different lengths and usually no defined bands on the Northern blot [19]. However, the intensity of the combined signal in each lane represents the level of polyadenylation and should increase with increased time of PAPI expression.
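The semi-log growth analysis from Subheading 3.1, step 10, amounts to fitting a straight line to ln(OD600) against time and comparing the slope before and after induction. The sketch below shows one way to do this with NumPy; the function names and all OD600 readings are hypothetical, and a simple linear fit over each window is an assumption that holds only while the culture grows exponentially.

```python
import numpy as np


def growth_rate_per_min(times_min, od600):
    """Slope of ln(OD600) versus time (per minute) from a linear fit."""
    times = np.asarray(times_min, dtype=float)
    log_od = np.log(np.asarray(od600, dtype=float))
    slope, _intercept = np.polyfit(times, log_od, 1)
    return float(slope)


def doubling_time_min(times_min, od600):
    """Doubling time in minutes derived from the fitted growth rate."""
    return float(np.log(2) / growth_rate_per_min(times_min, od600))


# Hypothetical readings: hourly before induction, every 15 min afterwards.
pre_times, pre_od = [0, 60, 120, 180], [0.05, 0.10, 0.20, 0.40]
post_times, post_od = [0, 15, 30, 45], [0.40, 0.47, 0.55, 0.62]

print(f"pre-induction doubling time:  {doubling_time_min(pre_times, pre_od):.1f} min")
print(f"post-induction doubling time: {doubling_time_min(post_times, post_od):.1f} min")
```

A post-induction doubling time that is clearly longer than the pre-induction value would indicate that PAPI overexpression has started to burden growth; this can then be weighed against the lane intensities on the Northern blot for the same timepoints.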

3.3 Determining the Optimal Duration of PAPI Induction

For the method to identify biologically relevant RNA–protein interactions, it is important to keep the cellular condition as close to natural as possible, meaning that the stress posed by PAPI overexpression should be limited as much as possible. On the other hand, successful capture of the RNA–protein complexes by oligo(dT) beads requires a substantial amount of polyadenylation of transcripts. In this respect, the growth of the cultures after induction of PAPI should be compared to the levels of polyadenylation as detected by Northern blotting (see Note 14). We recommend choosing an induction period long enough to drastically increase the level of polyadenylation but short enough not to significantly affect the growth rate of the culture. In our experiments with E. coli, we have harvested 30 min after the addition of arabinose [19]. However, the optimal length of induction will depend on the culture conditions and the growth rate of the bacteria.

3.4 UV Cross-Linking and Harvesting of Bacterial Cultures

1. Prepare pre-cultures the day before the harvest of bacterial cultures. For each of the pre-cultures, add 10 mL of M9 minimal media to a 100 mL Erlenmeyer flask (see Note 5).
2. Inoculate each pre-culture flask with a single colony from a plate with the strain EHS-2288 (see Notes 1 and 13), carrying the plasmid pPAPI for overexpression of PAPI.
3. Incubate overnight at 37 °C with shaking at 180 rpm.
4. The next day, dilute the overnight cultures 100-fold by adding the 10 mL pre-culture to 1000 mL pre-heated M9 minimal media in 5000 mL Erlenmeyer flasks (see Note 15).
5. Incubate the cultures at 37 °C with shaking at 180 rpm.
6. At an OD600 of 0.35, add arabinose to a final concentration of 0.2% (w/v) to induce the expression of PAPI.
7. Let the cultures grow with induction for the period determined in Subheading 3.3 (see Note 16).
8. Harvest cultures by moving the flasks to an ice water bath and shaking continuously for 5 min to rapidly cool down the cultures.
9. Centrifuge the cultures at 4600 × g at 4 °C for 30 min.
10. Pour off the supernatant and resuspend each cell pellet in 100 mL of ice-cold PBS.
11. Divide each cell suspension into two 50 mL aliquots in 50 mL centrifuge tubes.
12. Pour one 50 mL sample into a plastic tray and irradiate with 254 nm UV light at 1 J/cm2. Keep the other 50 mL sample at room temperature during the UV light treatment to keep conditions as equal as possible between the two samples.
13. After UV light treatment, move the treated cells to a clean 50 mL centrifuge tube and transfer both 50 mL tubes to ice.
14. Pellet cells by centrifugation at 3500 × g at 4 °C for 15 min.
15. Discard the supernatant and store pellets at -80 °C for later processing. Each sample set now consists of one UV light-treated sample and one non-treated sample.
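Two small calculations recur in the steps above: the volume of arabinose stock that gives 0.2% in the culture (step 6) and, for UV devices that are set by time rather than by energy, the exposure needed to deliver 1 J/cm2 (step 12). The sketch below is illustrative only; the function names are hypothetical, the 10% stock is the one listed in Subheading 2.1, and the lamp irradiance of 5 mW/cm2 is an assumed value that must be replaced by a measurement for the device in use.

```python
def stock_volume_to_add(stock_percent, final_percent, culture_volume_ml):
    """Volume of stock (mL) to add so the final mixture reaches final_percent.

    Accounts for the volume increase caused by adding the stock itself.
    """
    return final_percent * culture_volume_ml / (stock_percent - final_percent)


def uv_exposure_seconds(dose_j_per_cm2, irradiance_mw_per_cm2):
    """Exposure time (s) needed to deliver a UV dose at a constant irradiance."""
    return dose_j_per_cm2 * 1000.0 / irradiance_mw_per_cm2


# 10 mL pre-culture plus 1000 mL medium, induced with 10% arabinose stock.
arabinose_ml = stock_volume_to_add(10.0, 0.2, 1010.0)
print(f"arabinose stock to add: {arabinose_ml:.1f} mL")

# Assumed irradiance of 5 mW/cm2 at the sample surface (hypothetical value).
print(f"UV exposure for 1 J/cm2: {uv_exposure_seconds(1.0, 5.0):.0f} s")
```

Many cross-linkers accept the energy setting directly, in which case the second function is unnecessary; it is included only to show how dose and irradiance relate.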

3.5 Cell Lysis

1. Prepare 2 mL screw cap tubes by adding 1.0 g of zirconium beads and place them at -20 °C for at least 30 min. Prepare eight tubes for each sample set.
2. Thaw cell pellets on ice.
3. Resuspend each pellet in 3 mL of ice-cold lysis buffer.
4. Split each sample in four by moving 750 μL of the cell suspension into each of four 2 mL screw cap tubes containing zirconium beads (see Note 17).
5. Lyse cells with the FastPrep-24 for 20 s at 4.0 m/s.
6. Centrifuge cell lysates for 10 min at 17,000 × g at 4 °C.
7. Transfer the supernatants into two new 2 mL tubes (pool the content of two tubes from the same sample into one new tube).
8. Centrifuge lysates for 10 min at 17,000 × g at 4 °C for additional removal of insoluble material.
9. Transfer lysates from the same sample to 50 mL Falcon tubes and fill up to 20 mL with ice-cold lysis buffer.
10. From the lysate, withdraw 250 μL of sample, mix with 50 μL stop solution (95% EtOH, 5% phenol), and store at -80 °C for RNA extraction. These samples are referred to as RNA lysate samples.
11. Mix an additional 25 μL lysate with 9 μL of 4× Laemmli buffer for protein analysis and store at -20 °C. These samples are referred to as protein lysate samples.
12. Continue to RNA–protein pulldown.

3.6 RNA–Protein Pulldown

1. Equilibrate 4 mL of oligo(dT)25 magnetic beads per sample set (see Notes 18 and 19). Wash the required volume of beads for all samples three times in three volumes of lysis buffer and then resuspend in one volume of lysis buffer.
2. Add 2 mL of equilibrated beads to each of the cell lysates and incubate for 1 h at 4 °C with gentle rotation at 15 rpm.
3. Place the tubes on a magnetic separation rack at 4 °C and wait until the beads are completely captured (10–30 min).
4. Remove 250 μL of supernatant, add 50 μL of stop solution, and store at -80 °C for later RNA extraction. This sample is the unbound RNA.
5. Remove 25 μL of supernatant, add 9 μL of 4× Laemmli buffer, and store at -20 °C for protein analysis. This sample is the unbound protein.
6. Remove the remaining supernatant using a serological pipette.

Mapping RBPs in Bacteria


7. Wash the beads by resuspending them in 20 mL of ice-cold lysis buffer. Incubate the beads for 5 min at 4 °C with gentle rotation (15 rpm). Then magnetize the beads and remove and discard the supernatant. Repeat the wash step with different buffers as outlined below.
8. Wash the beads twice with wash buffer 1 (5 min at 4 °C and 15 rpm).
9. Wash the beads twice with wash buffer 2 (5 min at 4 °C and 15 rpm).
10. Wash the beads twice with wash buffer 3 (5 min at 4 °C and 15 rpm).
11. Remove the supernatant and resuspend each sample in 1 mL of wash buffer 3.
12. Transfer each sample from the 50 mL tube into a 2 mL microcentrifuge tube.
13. Place all samples on the magnetic separation rack and remove all the buffer.
14. Add 270 μL of elution buffer, preheated to 55 °C, to each tube.
15. Elute the RNA–protein complexes by incubating the tubes at 55 °C with shaking at 1000 rpm for 3 min.
16. Place the tubes in the magnetic separation rack and move the supernatant to new microcentrifuge tubes.
17. Place the tubes with beads on ice for later recycling.
18. Place the tubes containing the eluates in the magnetic rack and move the supernatant to new tubes. This prevents the carryover of small amounts of remaining beads in the sample.
19. Measure the concentration of eluted RNA with a NanoDrop (see Note 20).
20. Save 20 μL of each sample for RNA analysis (see Note 14).
21. Add 30 μL of 10× RNase buffer and 6 μL of RNase A/T1 mix (~20 μg) to each of the eluates.
22. Incubate the samples for 1.5 h at 37 °C.
23. Store the samples at -20 °C.

3.7 Recycling of Beads

After use, the beads can be cleaned and reused in another round of RNA capture. Label the used beads with either XL (cross-link) or noXL and reuse the beads with similar samples.

1. Add 2 mL of 0.1 M NaOH to the beads and incubate at room temperature with rotation at 15 rpm for 5 min. This will remove any bound RNA.
2. Wash the beads three times with 2 mL of sterile RNase-free H2O.


3. Equilibrate the beads by washing three times with 2 mL of PBS containing 0.05% Tween 20.
4. Store in PBS containing 0.05% Tween 20 and 0.02% NaN3.

3.8 Protein Analysis by SDS-PAGE

The protein samples, collected before and after the RNA pulldown, are used to verify the successful co-capture of RNA-binding proteins by Western blotting and Coomassie staining of protein gels (Fig. 2).

1. For Western blotting, mix 7.5 μL of the eluate with 2.5 μL of 4× Laemmli buffer for both UV light-treated and non-treated samples.
2. Size-separate 10 μL each of the lysate and eluate samples by SDS-PAGE.
3. Transfer the proteins to a PVDF membrane and probe with an antibody against a known RBP.
4. After imaging, strip the Western blot by applying a stripping buffer.
5. Incubate with agitation for 10 min and discard the buffer.
6. Add fresh stripping buffer and incubate for another 10 min with agitation.
7. Wash the membrane three times with TBS-T and re-probe using an antibody against a non-RBP. In a successful experiment, the probed RBP should be detected in both the lysate and the eluate samples, and there should be a clear difference in intensity between cross-linked and non-cross-linked eluate samples. The non-RBP should not be detectable in the eluate samples. An example of a successful experiment is shown in Fig. 2.
8. For Coomassie staining, concentrate the eluate sample using a SpeedVac: centrifuge 30 μL of each eluate sample in the SpeedVac until the volume has been reduced to ~5 μL.
9. Adjust the volume of the sample to 7.5 μL by adding H2O and then add 2.5 μL of 4× Laemmli buffer.
10. Size-separate 10 μL each of the eluate and lysate samples by SDS-PAGE.
11. Stain the protein in the gel using QC Colloidal Coomassie Stain. In a successful experiment, there should be visible bands in the lanes with cross-linked eluate samples, while the non-treated eluate lanes should be clear (see Note 21). Note that the prominent band with a size of approximately 15 kD that is seen in all eluate samples arises from the added RNases A and T1 (see Note 22). An example of a successful experiment is shown in Fig. 2.

Information about mass spectrometry and sample preparation is available elsewhere [19].
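The acceptance criteria for the Western blot described above can be written down as a small check. This is only an illustrative sketch: the function name `assess_western`, the use of numeric band intensities, and the `min_fold` threshold of 3 are assumptions, since the protocol itself only asks for a clear difference between cross-linked and non-cross-linked eluates.

```python
def assess_western(rbp, non_rbp, min_fold=3.0, background=0.0):
    """Return a list of problems; an empty list means the expected pattern is seen.

    rbp and non_rbp map (fraction, cross_linked) to a band intensity, where
    fraction is "lysate" or "eluate" and cross_linked is True or False.
    min_fold and background are illustrative thresholds, not protocol values.
    """
    problems = []
    for xl in (True, False):
        # The known RBP must be present in the input material.
        if rbp[("lysate", xl)] <= background:
            problems.append(f"RBP not detected in lysate (cross-linked={xl})")
        # The non-RBP control must not co-purify with the RNA.
        if non_rbp[("eluate", xl)] > background:
            problems.append(f"non-RBP detected in eluate (cross-linked={xl})")
    xl_signal = rbp[("eluate", True)]
    no_xl_signal = rbp[("eluate", False)]
    if xl_signal <= background:
        problems.append("RBP not detected in cross-linked eluate")
    elif xl_signal < min_fold * no_xl_signal:
        problems.append("RBP enrichment in cross-linked eluate is below min_fold")
    return problems


# Made-up intensities that follow the pattern described for a successful experiment
rbp_bands = {("lysate", True): 1.0, ("lysate", False): 1.0,
             ("eluate", True): 0.6, ("eluate", False): 0.05}
control_bands = {("eluate", True): 0.0, ("eluate", False): 0.0}
print(assess_western(rbp_bands, control_bands) or "pattern as expected")
```

An empty result corresponds to the situation shown in Fig. 2; any returned message points to the troubleshooting options discussed in Note 21.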


Fig. 2 Validation of RIC performance by Western blot analysis and total protein staining. (a) RIC was performed on E. coli MG1655 carrying a chromosomal hfq-3xflag allele. Western blot was carried out on lysate, unbound fraction, and eluate fractions, from cross-linked and non-cross-linked samples from the same bacterial culture. The well-characterized RBP Hfq was detected using an anti-FLAG antibody. Notably, Hfq is detected at the same level in lysates and unbound fractions irrespective of cross-linking. In contrast, the detection of Hfq in eluate fractions is only apparent in cross-linked samples, but not in the non-cross-linked control sample. An antibody against the protein chaperone GroEL was used as a non-RBP control protein. In contrast to Hfq, GroEL is only detected in lysate and unbound fractions, but not in the eluate, irrespective of cross-linking. (b) Staining of total protein by Coomassie following SDS-PAGE. Note that visible bands are seen in lysates, irrespective of cross-linking, but only in the eluate of the cross-linked sample. The smear in the cross-linked eluate sample likely reflects proteins with different lengths of remaining cross-linked RNA moieties. The highly intense bands between 10 and 15 kD in the eluate fractions are RNases A and T1 used for degrading RNA in the eluates

4 Notes

1. Although we have only applied this protocol to E. coli and Salmonella sp., we believe it should be applicable to other genetically tractable bacteria in which overexpression of PAPI can be achieved. However, this might require optimization of the expression vector and of the nucleotide sequence of the PAPI gene for optimal expression in each individual species.
2. To prepare 10× M9 salts, mix per liter: 60 g Na2HPO4 (anhydrous, dibasic) or 70 g Na2HPO4·7H2O (heptahydrate), 30 g KH2PO4 (monobasic), 5 g NaCl, and 10 g NH4Cl, and autoclave.
3. Do not autoclave the glycerol, casamino acid, ampicillin, or arabinose stocks; instead, sterile filter them through a 0.45 μm filter and store at 4 °C or, for ampicillin, at -20 °C.
4. To prepare M9 medium, mix per liter: 100 mL of 10× M9 salts with 800 mL sterile deionized H2O, then add 2 mL 1 M MgSO4, 100 μL 1 M CaCl2, 10 mL 10% (v/v) casamino


acids and 8 mL 50% (v/v) glycerol. Then add H2O to 1 L. It is important to add most of the water before adding the CaCl2 to avoid precipitation.
5. Other media can be used; however, it is worth considering that different media absorb very different amounts of UV light. A highly absorbing medium will result in lower cross-linking efficiency and accordingly a lower yield of co-purifying proteins.
6. Liquid nitrogen can cause frost burns upon contact with skin. Furthermore, its evaporation can reduce the oxygen content of the surrounding air. Thus, care should be taken when handling liquid nitrogen.
7. Danger: phenol is highly corrosive to the skin and readily absorbed through it. Exposure through the skin or through inhalation of phenol fumes can affect the central nervous system and cause damage to the liver and kidneys. It is also a mutagen and may be a reproductive hazard. Thus, phenol should be handled with great care and should be disposed of according to local regulations.
8. To prepare lysis and wash buffers, mix per liter: 700 mL deionized H2O, 20 mL 1 M Tris–HCl (pH 7.5), 2 mL 0.5 M EDTA, and enough powdered LiCl to reach the desired final concentration. Autoclave and then add DTT, LiDS, and sterile deionized H2O to 1 L. Sterile filter and store at 4 °C.
9. Be aware of precipitation in the buffers. Buffers with precipitate should not be used as they are. Instead, slowly heat the buffer until the precipitate has redissolved, and then cool the buffer down again. We find that precipitation problems can increase with the age of the buffer. It is therefore recommended not to store these buffers for long periods.
10. Use either a specific antibody against a known RBP or an antibody against an epitope tag, provided that a strain expressing an epitope-tagged version of a known RBP is used.
11. If no such centrifuge is available, it is possible to use a larger number of smaller bottles/tubes. For instance, divide the culture into 20 × 50 mL tubes, centrifuge, discard the supernatant, resuspend in a smaller volume, and pool everything in one tube that is centrifuged again.
12. It should be possible to use alternative mechanical lysis methods such as a French press. If the protocol is used with bacteria other than E. coli, it is advisable to use cell lysis methods optimized for the specific organism.
13. It is important to consider the choice of strain. The EHS-2288 strain is a wild-type MG1655 E. coli strain carrying the pPAPI plasmid. Using a strain with an epitope tag on a known RNA-binding protein (e.g., Hfq) can simplify the confirmation of co-purification of RBPs by Western blot detection.


14. To get a more detailed picture of the consequences of PAPI overexpression on the transcriptome of the cells, RNA-seq can be applied in addition to Northern blotting. In this case, we recommend including RNA samples from before PAPI induction, after induction, from the lysate, as well as from the eluted RNA after purification on poly(T) beads. This will not only be informative in terms of the consequences of PAPI overexpression but also allow for an analysis of how well the composition of the pulled-down RNA reflects the composition of the RNA in the undisturbed cells.
15. If several independent cultures are harvested on the same day, we highly recommend staggering their inoculation, with at least 1 h between cultures. In this way, the cultures can be harvested and processed one at a time.
16. We have used an induction period of 30 min for E. coli.
17. The container should not be more than half full of beads and bacterial sample.
18. To add the same amount of beads to each sample, it is important that the beads are fully resuspended before pipetting.
19. The beads can be recycled and used in another pulldown. By doing so, it is possible to reduce the overall amount of beads needed to produce replicate experiments.
20. We usually purify 150–200 μg of RNA per sample. This is slightly more than suggested by the manufacturer (New England Biolabs), who states that “1 mg of Oligo d(T)25 beads will bind 10 μg of polyadenylated RNA”. The yield in weight will, however, also depend on the length of the captured RNA. If significantly lower amounts of RNA are obtained, it can be an indication that the method did not perform optimally. One reason for this could be insufficient polyadenylation of transcripts. Another reason could be a failure of the transcripts to bind to the poly(T) beads. Both of these possibilities can be checked by Northern blotting. If the level of polyadenylated RNA in the unbound fraction is much lower than in the lysate, it is an indication that the beads were not saturated with RNA due to insufficient polyadenylation. This can be solved by increasing the PAPI induction time or, if that is not possible, by increasing the overall amount of RNA (more cells). If the amount of polyadenylated RNA is similar between the lysate samples and the unbound fraction, it is an indication that the polyadenylated RNA either did not efficiently bind the beads or did not efficiently elute from the beads. To check this, do another elution with the beads, this time at 95 °C. This should elute all RNA but might damage the beads, making them unsuitable for reuse. If the problem is inefficient binding between RNA and


beads, make sure that the beads are properly washed before addition to the lysates. Additionally, it might be helpful to increase the incubation time for binding of RNA to the beads. If the beads are not fully recovered between washes, this could also reduce the yield of RNA. Make sure that all beads are completely captured by the magnet before removing the wash buffer. Yet another possibility is that the RNA is degraded during the pulldown. If this is the case, it can be helpful to add an RNase inhibitor to the wash buffers.
21. If no protein bands are visible, it could indicate a problem with the efficiency of the experiment. If the protein was successfully detected by Western blotting, it can help to simply increase the amount of material loaded on the gel for Coomassie staining. If no protein is detected by Western blotting, it could indicate that the UV cross-linking was not successful. The cross-linking efficiency of RBPs to RNA can be tested by the PNK assay; see details elsewhere [19]. The UV cross-linking can be optimized by increasing the dose (in joules) of the light used. Another possibility is that the protein has been degraded by proteases during the experiment. To solve this, add a protease inhibitor to the wash buffers.
22. The actual sequence of the two commercially available RNases is a trade secret of the supplier.

References

1. Staley JP, Woolford JL (2009) Assembly of ribosomes and spliceosomes: complex ribonucleoprotein machines. Curr Opin Cell Biol 21:109–118
2. Lafontaine DLJ (2015) Noncoding RNAs in eukaryotic ribosome biogenesis and function. Nat Struct Mol Biol 22:11–19
3. Matera AG, Wang Z (2014) A day in the life of the spliceosome. Nat Rev Mol Cell Biol 15:108–121
4. Ray D, Kazan H, Cook KB, Weirauch MT, Najafabadi HS, Li X, Gueroussov S, Albu M, Zheng H, Yang A et al (2013) A compendium of RNA-binding motifs for decoding gene regulation. Nature 499:172–177
5. Gerstberger S, Hafner M, Tuschl T (2014) A census of human RNA-binding proteins. Nat Rev Genet 15:829–845
6. Holmqvist E, Vogel J (2018) RNA-binding proteins in bacteria. Nat Rev Microbiol 16:601–615
7. Kajitani M, Ishihama A (1991) Identification and sequence determination of the host factor gene for bacteriophage Qβ. Nucleic Acids Res 19:1063–1066
8. Møller T, Franch T, Højrup P, Keene DR, Bächinger HP, Brennan RG, Valentin-Hansen P (2002) Hfq: a bacterial Sm-like protein that mediates RNA-RNA interaction. Mol Cell 9:23–30
9. Romeo T, Gong M, Liu MY, Brun-Zinkernagel AM (1993) Identification and molecular characterization of csrA, a pleiotropic gene from Escherichia coli that affects glycogen biosynthesis, gluconeogenesis, cell size, and surface properties. J Bacteriol 175:4744–4755
10. Liu MY, Romeo T (1997) The global regulator CsrA of Escherichia coli is a specific mRNA-binding protein. J Bacteriol 179:4639–4642
11. Said N, Rieder R, Hurwitz R, Deckert J, Urlaub H, Vogel J (2009) In vivo expression and purification of aptamer-tagged small RNA regulators. Nucleic Acids Res 37:e133
12. Chu L-C, Arede P, Li W, Urdaneta EC, Ivanova I, McKellar SW, Wills JC, Fröhlich T, von Kriegsheim A, Beckmann BM et al (2022) The RNA-bound proteome of MRSA reveals post-transcriptional roles for helix-turn-helix DNA-binding and Rossmann-fold proteins. Nat Commun 13:2883
13. Shchepachev V, Bresson S, Spanos C, Petfalski E, Fischer L, Rappsilber J, Tollervey D (2019) Defining the RNA interactome by total RNA-associated protein purification. Mol Syst Biol 15:e8689
14. Castello A, Fischer B, Eichelbaum K, Horos R, Beckmann BM, Strein C, Davey NE, Humphreys DT, Preiss T, Steinmetz LM et al (2012) Insights into RNA biology from an atlas of mammalian mRNA-binding proteins. Cell 149:1393–1406
15. Kwon SC, Yi H, Eichelbaum K, Föhr S, Fischer B, You KT, Castello A, Krijgsveld J, Hentze MW, Kim VN (2013) The RNA-binding protein repertoire of embryonic stem cells. Nat Struct Mol Biol 20:1122–1130
16. Queiroz RML, Smith T, Villanueva E, Marti-Solano M, Monti M, Pizzinga M, Mirea D-M, Ramakrishna M, Harvey RF, Dezi V et al (2019) Comprehensive identification of RNA–protein interactions in any organism using orthogonal organic phase separation (OOPS). Nat Biotechnol 37:169–178
17. Urdaneta EC, Vieira-Vieira CH, Hick T, Wessels H-H, Figini D, Moschall R, Medenbach J, Ohler U, Granneman S, Selbach M et al (2019) Purification of cross-linked RNA-protein complexes by phenol-toluol extraction. Nat Commun 10:990
18. Smith T, Villanueva E, Queiroz RML, Dawson CS, Elzek M, Urdaneta EC, Willis AE, Beckmann BM, Krijgsveld J, Lilley KS (2020) Organic phase separation opens up new opportunities to interrogate the RNA-binding proteome. Curr Opin Chem Biol 54:70–75
19. Stenum TS, Kumar AD, Sandbaumhüter FA, Kjellin J, Jerlström-Hultqvist J, Andrén PE, Koskiniemi S, Jansson ET, Holmqvist E (2023) RNA interactome capture in Escherichia coli globally identifies RNA-binding proteins. Nucleic Acids Res 51:4572–4587
20. Castello A, Horos R, Strein C, Fischer B, Eichelbaum K, Steinmetz LM, Krijgsveld J, Hentze MW (2013) System-wide identification of RNA-binding proteins by interactome capture. Nat Protoc 8:491–500

Chapter 19

An Integrated Affinity Chromatography-Based Approach to Unravel the sRNA Interactome in Nitrogen-Fixing Rhizobia

Natalia Isabel García-Tomsig, Antonio Lagares Jr., Anke Becker, Claudio Valverde, and José Ignacio Jiménez-Zurdo

Abstract

The activity mechanism and function of bacterial base-pairing small non-coding RNA regulators (sRNAs) are largely shaped by their main interacting cellular partners, i.e., proteins and mRNAs. We describe here an MS2 affinity chromatography–based procedure adapted to unravel the sRNA interactome in nitrogen-fixing legume endosymbiotic bacteria. The method consists of tagging the bait sRNA at its 5′-end with the MS2 aptamer, followed by pulse overexpression and immobilization of the chimeric transcript from cell lysates by an MS2-MBP fusion protein conjugated to an amylose resin. The sRNA-binding proteins and target mRNAs are further profiled by mass spectrometry and RNAseq, respectively.

Key words Plant symbionts, Sinorhizobium meliloti, trans-sRNA, MS2, RNA-binding proteins, Riboregulation

1 Introduction

Bacteria express large and heterogeneous populations of non-protein-coding small RNAs (sRNAs) that are expected to play major roles in the posttranscriptional regulation of gene expression underlying environmental adaptation [1–3]. The challenge now is to uncover the function of this plethora of sRNAs in the ecological specialization of widely diverse bacterial species. Most sRNAs functionally characterized to date rely on protein-assisted limited base-pairing interactions to fine-tune translation and abundance of trans-encoded target mRNAs [4, 5]. Hfq and ProQ are well-characterized RNA-binding proteins (RBPs) that act as RNA chaperones protecting sRNAs from degradation and facilitating base-pairing to their target mRNAs [6–8]. As reported for Hfq in enterobacteria, RBPs may also recruit cellular ribonucleases (e.g., RNase E) to the sRNA-mRNA interplay as components of higher order protein complexes involved in target decay upon base-pairing (e.g., the degradosome) [8]. However, neither Hfq nor ProQ is ubiquitous in bacteria, which suggests that other RBPs with similar roles in riboregulation remain undiscovered [9–11]. In bacteria, a single trans-sRNA typically targets multiple mRNAs, which places RBPs at the hub of large and complex regulatory RNA networks. Therefore, RBPs and the set of target mRNAs together (i.e., the sRNA interactome) provide key mechanistic and functional hallmarks of trans-acting sRNAs.

It is well known that sRNAs regulate major bacterial traits operating during the interaction with eukaryotic hosts (e.g., virulence) [12–16]. However, the role of riboregulation in beneficial plant–microbe interactions remains understudied. One such association is the mutualistic endosymbiosis between nitrogen fixers of the α and β classes of proteobacteria, collectively termed rhizobia, and legume plants [17]. Here, we describe an affinity chromatography–based procedure to capture the in vivo assembled trans-sRNA interactome in Sinorhizobium meliloti (Fig. 1), a genetically tractable α-proteobacterium that fixes nitrogen in symbiosis with alfalfa (Medicago sativa L.) and other Medicago plants. The toolkit and method are suited to investigate riboregulation in related α-rhizobia.

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_19, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

Fig. 1 Overview of the procedure. PsRNA, native promoter of the sRNA under study; Plac, lacZ promoter; PsinI, sinI promoter. Further details are provided throughout the text

Rhizobial sRNA Interactome

2 Materials

The following protocols require special attention to the cleanliness of glassware and equipment. It is recommended to clean work surfaces and gloves (powder-free) with an RNase-free agent or decontamination solution (e.g., RNaseZap™, Invitrogen) and to reserve a set of pipettes for RNA work only. Commercial RNase-free and sterile plastic tubes, pipette tips, and boxes should be used directly, without autoclaving, because of the possibility of DNA contamination from autoclave water. Working solutions should be prepared in ultra-filtered sterile water, and commercial RNase-free water is recommended for the preparation of RNA-containing reactions (e.g., the cDNA synthesis reaction).

2.1 MS2 Aptamer Tagging of the sRNA

1. pSRKKm derivative vectors: pSKiMS2-sRNA and pSKi-sRNA.
2. DNA oligonucleotides to construct plasmids that express the wild-type or tagged sRNAs.
3. High-fidelity DNA polymerase kit and dNTPs for PCR amplification.
4. Agarose.
5. 10× TAE buffer [0.4 M Tris, 17.4 M acetic acid, 0.02 M ethylenediaminetetraacetic acid (EDTA) pH: 8.2].
6. 6× DNA loading dye [50% glycerol (v/v), 0.01 M EDTA pH: 8, 0.5% Orange G (w/v)].
7. Horizontal electrophoresis system for DNA electrophoresis.
8. Dyes for DNA staining.
9. Purification kits for PCR fragments and plasmid DNA.
10. Restriction enzymes with their appropriate buffers.
11. T4 DNA ligase with its buffer.
12. Thermocycler and block heater.

2.2 Culture, Harvest of Bacteria, and Cell Lysis

1. Sinorhizobium meliloti strain 2019 [18], which is a Sm2B3001 ΔsinRsinI-derivative with an additional deletion of the sRNA locus.
2. Rubidium chloride competent Escherichia coli DH5α and S17.1 cells for cloning and conjugation by biparental mating, respectively [19, 20].
3. Luria-Bertani (LB) medium: 5 g/L NaCl, 10 g/L tryptone, 5 g/L yeast extract [21].
4. Tryptone Yeast (TY) medium: 0.9 g/L CaCl2·2H2O, 5 g/L tryptone, 3 g/L yeast extract [22].


5. Minimal Medium (MM): 1.1 g/L potassium glutamate, 10 g/L mannitol, 0.3 g/L K2HPO4, 3 g/L KH2PO4, 0.15 g/L MgSO4·7H2O, 0.05 g/L CaCl2·2H2O, 0.05 g/L NaCl, 0.006 g/L FeCl3·6H2O, 0.2 mg/L biotin, 0.1 mg/L calcium pantothenate, pH: 6.8 [23].
6. Isopropyl β-D-1-thiogalactopyranoside (IPTG) stock solution at an appropriate concentration to achieve 1 mM by dilution in growing cultures.
7. Antibiotics at the following final concentrations (μg/mL): streptomycin (Sm) 480, kanamycin (Km) 50 for E. coli and 180 for rhizobia (see Note 1).
8. Sarkosyl solution: 0.1% (w/v) N-lauroylsarcosine sodium salt, 0.01 M Tris-HCl pH: 8, 0.001 M EDTA pH: 8; for washing cells before harvesting.
9. Buffer A: 0.02 M Tris-HCl pH: 8, 0.15 M KCl, 0.001 M MgCl2, supplemented with 1 mM dithiothreitol (DTT).
10. cOmplete™ Protease Inhibitor Cocktail (Roche).
11. RNase inhibitor (New England Biolabs) for affinity chromatography.
12. Lysis solution: 1.4% (w/v) SDS, 0.004 M EDTA pH: 8, supplemented with 50 μg of proteinase K for total RNA extraction.
13. Incubator.
14. Refrigerated centrifuge.
15. French Press or sonicator system.

2.3 Total RNA Extraction from Bacterial Lysates

1. 5 M NaCl.
2. Ethanol absolute.
3. RNase-free DNase I.
4. RNase inhibitor for total RNA isolation.
5. Agarose.
6. 4× MOPS buffer: 0.08 M 3-(N-morpholino)propanesulfonic acid (MOPS), 0.02 M sodium acetate, 0.004 M EDTA pH: 7.
7. Formaldehyde.
8. 6× DNA loading dye: 50% glycerol (v/v), 0.01 M EDTA pH: 8, 0.5% Orange G (w/v).
9. Horizontal electrophoresis system for RNA electrophoresis.
10. Dyes for nucleic acid staining.
11. Qubit™ 3 Fluorometer (Thermo Fisher Scientific).
12. Qubit™ RNA High Sensitivity assay kit (Invitrogen) for RNA quantification (see Note 2).

2.4 Affinity Chromatography

1. Buffer A.
2. cOmplete™ Protease Inhibitor Cocktail.
3. RNase inhibitor.
4. Amylose resin (New England Biolabs).
5. SigmaPrep™ spin column (Sigma-Aldrich).
6. MS2-Maltose Binding Protein (MS2-MBP) (see Note 3).
7. Maltose for addition to Buffer A when required.
8. TURBO™ DNase I (Invitrogen), RNase-free, for the elimination of residual DNA.
9. Phenol:chloroform:isoamyl alcohol pH: 4.5.
10. Ethanol absolute.
11. Acetone.
12. 3 M sodium acetate.
13. RNA-grade glycogen for RNA and protein extraction and precipitation.
14. DNA oligonucleotides for PCR amplification of the sRNA.

2.5 RT-qPCR Analysis

1. RNeasy® Plus Mini kit (QIAGEN) including genomic DNA eliminator columns.
2. PrimeScript™ RT Master Mix (Perfect Real Time, Takara) for reverse transcription.
3. TB Green® Premix Ex Taq™ II (Tli RNaseH Plus, Takara).
4. Specific primers to amplify the sRNAs or target mRNAs under study for real-time PCR (qPCR).
5. QuantStudio 3 real-time PCR system.

2.6 Bioinformatics Analysis

1. RNAfold web server (http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi) and VARNA version 3.9 for the visualization of RNA secondary structure.
2. Rbowtie2 (https://bioconductor.org/packages/Rbowtie2), Rsamtools (https://bioconductor.org/packages/Rsamtools), and Rsubread (https://bioconductor.org/packages/Rsubread) utility packages, which are available in R software (version 4.0.2 or later is recommended), for the analysis of RNAseq data.
3. Samtools software package (version 1.10) for PCR or optical duplicate labeling [24].
4. IntaRNA web server (http://rna.informatik.uni-freiburg.de/IntaRNA/) for the prediction of RNA-RNA base-pairing interactions [25].
5. Rockhopper (version 2.0.3) and the Integrative Genomics Viewer (IGV) software (version 2.16.0) for RNA-seq data visualization [26, 27].
6. topGO utility package (https://doi.org/10.18129/B9.bioc.topGO), which is also available in R software, for the functional analysis of RNAseq or proteomic data.

3 Methods

The following protocols have been successfully implemented for the characterization of the interactome of the homologous trans-sRNAs AbcR1 and AbcR2 in S. meliloti [28–30]. For that, the MS2 aptamer was fused to the 5′-end of the sRNAs, and the tagged sRNAs were then subjected to MS2-Affinity Purification coupled with RNA Sequencing (MAPS) and mass spectrometry for profiling of RNA and protein partners, respectively. In the latter case, the tagged sRNAs were constitutively overexpressed [28, 29]. In the method described here, the expression of the wild-type and tagged sRNAs is induced by IPTG from the module Plac-sinR-PsinI cloned in a pSRK vector. The sinRI genes control quorum sensing in S. meliloti, and their combination with the lacZ promoter in pSRK enables controlled pulse overexpression of the sRNA under study, thereby overcoming problems with either weak expression levels or pleiotropic effects of constitutive expression [18]. Note that oligonucleotide and other DNA sequences must be adapted to the sRNA under study in each case.

3.1 Aptamer Tagging of the sRNAs

Before proceeding with the tagging, determination of the 5′ and 3′ ends of the sRNA under study is important for cloning design. Rapid amplification of cDNA ends (RACE) is typically the approach of choice to determine the full-length sequence of the transcript [31, 32]. RNAfold can be used for in silico prediction of the tagged and untagged sRNA secondary structures. Tagging should not alter the predicted functional structure of the wild-type sRNA (Fig. 2a). Otherwise, another cloning design can be considered to fuse the MS2 aptamer to the 5′-end of the sRNA (e.g., DNA synthesis services or the Modular Cloning Strategy (MoClo), which uses Type IIS restriction enzymes).

Fig. 2 Tagging of the AbcR1 sRNA with the MS2 aptamer. (a) Predicted folding of the chimeric MS2AbcR1 (left) and wild-type AbcR1 (right) transcripts. Note that the three predicted hairpins and the unpaired anti-Shine Dalgarno (aSD) targeting motifs (aSD1 and aSD2) of the wild-type sRNA are not likely affected by tagging. (b) Maps of the genetic constructs used for pulse (over)expression of MS2AbcR1 (left) and AbcR1 (right). Relevant restriction sites and names of the oligonucleotides used for the generation of the appropriate DNA fragments are indicated. Red lines indicate complementary regions between oligonucleotides

MS2 tagging of AbcR1 was as follows:

1. For pSKiMS2 construction, the region sinR-PsinI is PCR amplified using S. meliloti 1021 genomic DNA as the template with the primer pair sinR_NdeIF/TSS3_28bp_b_sinIR (GCCACATATGGCTAATCAACAGGCTGTC/GTAGCGATGCTGTCAGGCTC), and the MS2 aptamer is amplified from pSRKMS2 using the MS2FusTSSI/HindIIIvec primer pair (GAGCCTGACAGCATCGCTACCGTACACCATCAGGGTAC/CGAGGTCGACGGTATCGATAAGCTTCGCC). These two PCR products overlap and are jointly used as templates for amplification with sinR_NdeIF/HindIIIvec, and the resulting PCR product is restricted with NdeI and XbaI and inserted into pSRKKm [33] to yield pSKiMS2. For pSKiMS2AbcR1 construction (Fig. 2b; left), AbcR1 is cloned downstream of the MS2 aptamer in this plasmid using the XbaI site.
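The overlap between the two first-round PCR products in step 1 is encoded in the primers and can be checked in silico before ordering oligonucleotides. The sketch below uses the primer sequences given in step 1; the helper names `reverse_complement` and `overlap_length` are introduced here for illustration only.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def overlap_length(upstream_reverse_primer: str, downstream_forward_primer: str) -> int:
    """Length of the overlap between two PCR products destined for fusion PCR.

    The top strand of the upstream product ends with the reverse complement of
    its reverse primer; the top strand of the downstream product starts with its
    forward primer. The overlap is the longest suffix of the former that is
    also a prefix of the latter.
    """
    upstream_end = reverse_complement(upstream_reverse_primer)
    downstream_start = downstream_forward_primer.upper()
    for k in range(min(len(upstream_end), len(downstream_start)), 0, -1):
        if upstream_end[-k:] == downstream_start[:k]:
            return k
    return 0


# Primer sequences as listed in step 1
sinR_NdeIF = "GCCACATATGGCTAATCAACAGGCTGTC"
TSS3_28bp_b_sinIR = "GTAGCGATGCTGTCAGGCTC"
MS2FusTSSI = "GAGCCTGACAGCATCGCTACCGTACACCATCAGGGTAC"

print("overlap (nt):", overlap_length(TSS3_28bp_b_sinIR, MS2FusTSSI))
print("NdeI site (CATATG) in sinR_NdeIF:", "CATATG" in sinR_NdeIF)
```

For the AbcR1 primers, the first 20 nt of MS2FusTSSI are the reverse complement of TSS3_28bp_b_sinIR, which is what allows the two products to prime each other in the second PCR. The same check can be applied when the primers are redesigned for another sRNA.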


2. For the construction of pSKiAbcR1 as a control (Fig. 2b; right), the module sinR-PsinI and abcR1 are independently amplified from genomic DNA using primer pairs that generate overlapping fragments. These two PCR fragments are used as templates for a second round of PCR. The resulting PCR product is restricted with NdeI and XbaI and inserted into pSRKKm to yield pSKiAbcR1.
3. E. coli DH5α competent cells are transformed with the ligation reaction and plated on kanamycin-containing LB plates for incubation at 37 °C. Blue/white selection of transformants with IPTG and X-gal is possible in pSRKKm but not in pSKiMS2.
4. Successful integration of the insert is verified by colony PCR using PCR2 (TTAGCTCACTCATTAGG) and PCR1 (CGGGCCTCTTCGCTATT) as primers flanking the insert in pSRKKm.
5. Plasmid DNA from colonies yielding a PCR product of the correct size is purified and further checked by sequencing with PCR2 as the primer.
6. E. coli S17.1 cells are transformed with the correct constructs for biparental mating with the rhizobial strain Sm2019 (ΔsinRsinI) lacking AbcR1.

3.2 Cell Growth and Pulse Expression of the MS2-sRNA

Expression of the tagged sRNA is induced in culture conditions that naturally promote endogenous upregulation of the wild-type transcript.

1. A Sm2019-derivative strain in which the abcR1 locus is deleted (termed Sm2020) is used for the IPTG-induced expression of control and tagged AbcR1 from the corresponding plasmids. For that, rhizobial strains are grown (30 °C and 170 rpm) to the desired optical density (OD600 0.5 for AbcR1) in TY or MM broth. When the sRNA accumulates strongly under certain stress conditions (e.g., salt, heat, or cold shock), the stress is imposed on log-phase cultures and the bacteria are grown for a further 1 h before the addition of IPTG.
2. IPTG is added to the cultures to 1 mM, and the cultures are then further incubated at 30 °C before harvesting. The induction time for the subsequent affinity chromatography should yield accumulation levels of the transcripts similar to those resulting from endogenous regulation. This time was set to 15 min for AbcR1.
3. Cells are harvested by centrifugation (10 min at 3500 × g and 4 °C). Bacterial pellets are washed once with 20 mL of sarkosyl solution and then frozen in liquid nitrogen before storage at -80 °C to facilitate subsequent cell disruption (see Note 4).

3.3 Quality Check of the Tagging Strategy

The following assays are typically used to assess the functional expression and stability of the tagged sRNA. First, a Northern blot analysis of the time-course induction of the sRNA and MS2-sRNA species in bacteria carrying pSKisRNA or pSKiMS2sRNA, respectively, is recommended. It allows visualization of the full-length transcripts and a check for the absence of processed forms derived from induced (over)expression. Second, the functionality of the MS2-sRNA should be assessed, i.e., its ability to complement the sRNA knock-out phenotype, if any, or to regulate a known mRNA target using, for example, a double-plasmid reporter assay. Nonetheless, a tagged sRNA that does not retain full regulatory ability may still be able to base pair with the target, so it can be used as bait in the chromatography [30]. If the tagged sRNA is as functional as the wild type, its expression from the pABC vectors can be considered (Fig. 1). pABCs are a family of single-copy vectors in S. meliloti that are also applicable in related bacteria [34]. Type B pABC derivatives, which enable conjugal transfer, could be used to clone the endogenous region of the sRNA locus, including its own promoter, and to generate the version carrying the tagged sRNA. Thus, the accumulation pattern of the tagged sRNA is expected to mimic that of the endogenous sRNA (see Note 5).

3.3.1 RNA Purification from Harvested Cells

1. Bacterial pellets containing cells equivalent to OD600 ~20–30 (e.g., 30 mL culture of OD600 0.8), expressing either the wild-type or tagged sRNA, are gently resuspended in 1 mL of pre-heated lysis solution and incubated for 10 min at 65 °C, with vortex mixing every 2 min (see Note 6).
2. Lysates are chilled on ice and 500 μL of 5 M NaCl is added. After 10 min on ice, samples are centrifuged (15 min, 16,000 × g, 4 °C).
3. The supernatant is transferred to a new tube with 4.5 mL of cold 100% ethanol. Tubes are mixed by inversion and stored at -80 °C for at least 1 h before centrifugation (30 min, 16,000 × g, 4 °C).
4. Ethanol is completely removed, and pellets are resuspended in 225 μL of water and pooled together for DNase I treatment according to the manufacturer’s instructions (the final reaction volume should be 250 μL).
5. After incubation, 1× vol of cold phenol:chloroform:isoamyl alcohol (pH: 4.5) is added. Samples are mixed by vortex (15–20 s) and the organic and aqueous phases are separated by centrifugation (15 min, 16,000 × g, 4 °C).
6. The aqueous (upper) phase is transferred to a new microtube containing 14 μL of 3 M NaAc (pH: 5.2) and 4× vol ethanol


(1 mL), mixed by inversion, and stored at -80 °C for at least 1 h before centrifugation (30 min, 16,000 × g, 4 °C).
7. Ethanol is removed, avoiding pipetting or vortexing, and the RNA pellet is washed with 200 μL of cold 70% ethanol. Precipitated RNA is pelleted by centrifugation (10 min, 16,000 × g, 4 °C) and the supernatant is carefully removed, again avoiding pipetting.
8. Samples are air-dried at room temperature with open lids for 10 min, or alternatively, they are dried using a vacuum concentrator.
9. RNA is resuspended in 50 μL of RNase-free water and stored at -20 °C (or at -80 °C for long-term storage).

3.3.2 Assessment of Quality and Quantity of RNA in Samples

1. 1 μL of purified RNA is loaded on a MOPS agarose gel to test RNA integrity. The 23S, 16S, and 5S bands should be clearly detected.
2. The RNA concentration in each sample is measured using the Qubit™ RNA High Sensitivity assay kit. This RNA can be used for Northern blot hybridization as described previously [32] (see Note 7).
3. RNA samples are additionally treated with TURBO™ DNase I for 1 h at 37 °C and further cleaned up with the RNeasy Mini Kit following the manufacturer’s guidelines (see Note 8).
4. cDNA is synthesized with the Takara PrimeScript RT Master Mix (Perfect Real Time) using 500 ng of total RNA according to the manufacturer’s instructions. This cDNA can be used for RT-PCR or RT-qPCR.
5. RT-qPCR is carried out in a QuantStudio 3 (Thermo Fisher Scientific) with the Takara TB Green Premix Ex Taq II (Tli RNaseH Plus) using 0.5 μL of cDNA following the manufacturer’s instructions. The ratios of transcript abundance are calculated by the ΔΔCT method as the mean of three replicates on three independent RNA extracts [35]. The constitutively expressed gene SMc01852 is used to normalize gene expression [36]. Control reactions without reverse transcriptase (RT) are performed in parallel on the RNA samples to confirm the absence of DNA contamination.
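The ΔΔCT calculation referred to in step 5 can be sketched as follows. This is a minimal illustration of the 2^-ΔΔCT formula of Livak and Schmittgen [35], assuming equal amplification efficiency for target and reference amplicons; the function name and all CT values are hypothetical and are not data from this protocol.

```python
from statistics import mean

def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript by the 2^-ddCT formula.

    Each argument is a list of replicate CT values. The reference gene
    (e.g., SMc01852 in this protocol) normalizes for RNA input.
    """
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)

# Hypothetical CT values (three replicates each), for illustration only
fold = relative_expression(
    ct_target_sample=[22.0, 22.1, 21.9],
    ct_ref_sample=[18.0, 18.1, 17.9],
    ct_target_control=[25.0, 25.1, 24.9],
    ct_ref_control=[18.0, 18.0, 18.0],
)
print(f"fold change relative to control: {fold:.2f}")
```

In practice the calculation would be repeated for each independent RNA extract and the resulting ratios averaged, as described in step 5.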

3.4 Affinity Chromatography

1. Cells equivalent to 100–200 OD600 (e.g., 250 mL culture of OD600 0.8) are harvested by centrifugation for 10 min at 3500 × g and 4 °C.
2. Pelleted cells are then washed once with 20 mL of sarkosyl solution, centrifuged for 5 min at 3500 × g (4 °C), and frozen in liquid nitrogen.


3. Then, pellets are resuspended in 4 mL of Buffer A supplemented with cOmplete™ Protease Inhibitor Cocktail and RNase inhibitor. Cell lysis is performed by three consecutive passes through a French press at 1000 psi. It is important to keep the samples cold throughout (see Note 9).
4. The lysate is centrifuged (15 min, 16,000 × g, 4 °C) and the supernatant is incubated with 400 pmol of MS2-MBP bait protein for 30 min at 4 °C in a Belly Dancer Shaker. The SigmaPrep™ spin column is prepared for affinity purification by three washes with 800 μL Buffer A and loading of 100 μL amylose resin previously washed twice in Buffer A.
5. The mixture of cell lysate and bait protein is then loaded onto the amylose column, whose resin interacts non-covalently with the MBP moiety. The column system should be kept on ice during the process (see Note 10).
6. Unspecific binding is removed by five column washes with 800 μL Buffer A.
7. The cap is placed on the column and 500 μL Buffer A containing 15 mM maltose is added. After 10 min, the cap is removed and the sample is centrifuged for 1 min at 8000 × g and 4 °C to elute MS2-tagged sRNA-RNA/protein complexes. This step is repeated once.
8. Eluates are stored at -20 °C (-80 °C for long-term storage).
9. Aliquots of 2% (v/v) of the lysate, flow-through, last wash, and eluted fractions are collected for monitoring the purification process by Northern blotting. For purification of these RNA fractions, the steps described in Subheading 3.3.1 are followed starting with the addition of NaCl (scaling the volumes). The eluted fraction does not require precipitation. RNA preparations are probed with appropriate labeled oligonucleotides upon Northern blotting (Fig. 3a).

3.5 RNA Purification from the Eluates and Processing for RNAseq

1. RNA and proteins contained in the eluted fractions are separated with 1× vol of cold phenol:chloroform:isoamyl alcohol (pH: 4.5). Samples are mixed by vortex (15–20 s) and the organic and aqueous phases are separated by centrifugation (15 min, 16,000 × g, 4 °C).
2. The aqueous (upper) phase is transferred to a new microtube containing 14 μL of 3 M NaAc (pH: 5.2), 20 μg of glycogen, and 4× vol ethanol (1 mL), mixed by inversion, and stored at -80 °C for at least 1 h before centrifugation (30 min, 16,000 × g, 4 °C). The intermediate and lower phases can be used for proteomic analysis as described below.
3. Ethanol is removed, avoiding pipetting or vortexing, and the RNA pellet is washed with 200 μL of cold 70% ethanol. Upon centrifugation (10 min, 16,000 × g, 4 °C), the supernatant is carefully removed, again avoiding pipetting.
4. Samples are air-dried at room temperature with open lids for 10 min. Alternatively, a vacuum concentrator can be used for drying. Then, samples are resuspended in ultrapure water (see Note 11).
5. The RNA concentration is measured for each sample using the Qubit™ RNA High Sensitivity assay kit.
6. 500 ng of each sample is used for RNA-seq. Strand-specific cDNA libraries are generated from the RNA and sequenced on the Illumina NextSeq Mid 150 platform.

Fig. 3 Monitoring of affinity chromatography at the RNA level. (a) Northern blot analysis of RNA from the input and output chromatography samples upon IPTG-induced expression of AbcR1 and MS2AbcR1. (b) Mapping of sequencing reads to the Sinorhizobium meliloti abcR1-abcR2 genomic region visualized with the IGV tool

3.6 Protein Purification from the Eluates and Preparation for Proteomics

1. 4× vol acetone (1 mL) is added to the organic phase (i.e., the intermediate and lower phases from RNA extraction), mixed by inversion, and stored at -20 °C for at least 16 h before centrifugation (30 min, 16,000 × g, 4 °C).
2. Acetone is removed, avoiding pipetting or vortexing, and the protein pellet is washed with 200 μL of cold acetone. Upon centrifugation (10 min, 16,000 × g, 4 °C), the supernatant is carefully removed, again avoiding pipetting.
3. Samples are air-dried at room temperature with open lids for 10 min. Alternatively, a vacuum concentrator can be used for drying. Then, samples are resuspended in ultrapure water (see Note 12).
4. Protein samples to be analyzed are loaded on a conventional SDS-PAGE gel and run for about 10 min at 6 mA until the entire sample has entered the 5% polyacrylamide stacking gel [28]. A lane with 0.5 μg of bovine serum albumin (BSA) is also loaded as a reference for protein concentration. The gel is stained with Coomassie Blue and each gel lane corresponding to the different samples is cut from the gel (see Note 13).
5. Each collected gel slice is subjected to nano-scale liquid chromatographic tandem mass spectrometry (nLC-MS/MS) for further analysis by the corresponding Proteomics unit.

3.7 Data Analysis (RNA-Seq)

1. Demultiplexed sequencing reads are mapped with Bowtie2 (version 2.2.3) using standard parameters to the S. meliloti Sm1021 reference sequence downloaded from the Rhizogate portal (https://www.cebitec.uni-bielefeld.de/CeBiTec/rhizogate/) [37].
2. PCR duplicates are removed with Samtools (version 1.10).
3. Uniquely mapped reads are assigned to protein-coding genes or non-coding RNAs (sRNAs, asRNAs, sense RNAs, and mRNA leaders) with Rsubread (version 3.12). The minimum mapping quality score (minMQS) is 20 and reads marked as duplicates are ignored. Mapped reads are also assigned to virtual 5′-UTRs (from 50 nt upstream of the start codon through the first 100 nt of the CDS) and to virtual 3′-UTRs (from the last 50 nt of the CDS to 30 nt downstream of the stop codon).
4. Read counts for each genomic feature are normalized by coverage and the resulting RPKM values are used for fold change calculations. A cut-off is set for a minimum number of reads (e.g., 100 reads in the AbcR1 experiment) and for the log fold change of enrichment in the MS2-sRNA sample. If biological replicates are available, the DESeq2 utility package (https://bioconductor.org/packages/DESeq2) [38] could be used to assist in the quantitative analysis of RNAseq or proteomic data (see Note 14).
5. Rockhopper (version 2.0.3) and the Integrative Genomics Viewer (IGV) software (version 2.13.0) are used for data visualization (Figs. 3b and 4a).
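The normalization and filtering in step 4 can be sketched as follows. This is an illustrative outline only: RPKM is computed with its standard definition, while the log2 fold-change threshold, the pseudocount used to avoid division by zero, the function names, and the example counts are all assumptions introduced here and are not taken from the protocol.

```python
import math

def rpkm(read_count, feature_length_nt, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (feature_length_nt * total_mapped_reads)

def enriched_features(features, total_tagged, total_control,
                      min_reads=100, min_log2fc=1.0, pseudo_rpkm=1.0):
    """Return (name, log2 fold change) for features enriched in the
    MS2-sRNA sample relative to the untagged control.

    features: iterable of (name, length_nt, reads_tagged, reads_control).
    min_log2fc and pseudo_rpkm are hypothetical values chosen for this sketch.
    """
    hits = []
    for name, length, n_tagged, n_control in features:
        if n_tagged < min_reads:
            continue  # below the minimum read cut-off
        r_tagged = rpkm(n_tagged, length, total_tagged)
        r_control = rpkm(n_control, length, total_control)
        log2fc = math.log2((r_tagged + pseudo_rpkm) / (r_control + pseudo_rpkm))
        if log2fc >= min_log2fc:
            hits.append((name, round(log2fc, 2)))
    return hits

# Made-up counts for three hypothetical features
example = [
    ("geneA", 1000, 1000, 100),  # enriched in the tagged sample
    ("geneB", 1000, 50, 0),      # too few reads
    ("geneC", 1000, 500, 500),   # not enriched
]
hits = enriched_features(example, total_tagged=1_000_000, total_control=1_000_000)
print(hits)
```

With biological replicates, a dedicated package such as DESeq2 (mentioned in step 4) would replace this simple ratio with a statistical model of the raw counts.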

3.8 Data Analysis (Proteomics)

1. High-scoring proteins that co-purify with the tagged sRNA but not with the wild-type control sRNA are considered RNA-binding protein candidates, and further in vitro analysis should be performed to confirm the interaction (Fig. 4b) [28].
2. The topGO utility package (version 3.16) is used for the identification of significantly enriched Gene Ontology (GO) terms for the proteins associated with the sRNA. The S. meliloti 1021 strain, containing 4627 genes with GO terms assigned in the Uniprot database, is considered as the background set.

Fig. 4 Recovery of mRNA and protein partners of AbcR1. (a) Mapping of sequencing reads to the SMc01820 and fixK1 genomic regions visualized with IGV. Note enrichment of both mRNAs in samples derived from the tagged AbcR1 with respect to the controls. The predicted interactions of AbcR1 with SMc01820 and fixK1 are indicated. IR, interaction region. Numbers denote the sequence stretch in both mRNAs that base pairs to AbcR1 aSD1. Nucleotide positions are relative to the respective start codons. E, predicted hybridization energy. (b) Capture of Hfq by MS2AbcR1. Western blot analysis with anti-FLAG antibodies of chromatography fractions from bacteria expressing MS2AbcR1 or the MS2 aptamer (negative control) in an hfqFLAG background. Below are the mass spectrometry indices of Hfq in the captured proteomes. #peptides, number of identified peptides; SC (%), sequence coverage; Score, protein score in Mascot
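The idea behind the GO-term enrichment in step 2 can be illustrated with a plain one-sided hypergeometric test against the 4627-gene background. This is only a sketch of the underlying statistic: topGO itself is an R package that additionally offers algorithms accounting for the GO graph structure, which are not reproduced here, and the function name and counts below are hypothetical.

```python
from math import comb

def go_enrichment_p(k, n, K, N=4627):
    """One-sided hypergeometric p-value P(X >= k).

    N: genes in the background set (4627 annotated S. meliloti 1021 genes)
    K: background genes annotated with the GO term
    n: proteins in the sRNA-associated set
    k: proteins in that set annotated with the GO term
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Hypothetical counts: 6 of 40 captured proteins carry a term
# that annotates 50 of the 4627 background genes.
p = go_enrichment_p(k=6, n=40, K=50)
print(f"p = {p:.3g}")
```

When many GO terms are tested, the resulting p-values would normally be corrected for multiple testing before interpretation.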

4 Notes

1. Neomycin at 120 μg/mL could be used as an alternative to kanamycin.


2. Total RNA can be used for quality check by Northern blot hybridization. The materials and procedure have been described elsewhere [32, 38].
3. The protocol for purification of MS2-MBP over amylose and heparin columns by FPLC has been described previously [39]. This is a recombinant protein composed of the MS2 coat protein N-terminally fused to maltose-binding protein. Protein purity can be checked by Coomassie gel staining and protein concentration can be determined by the Bradford assay.
4. Freezing of bacterial pellets in liquid nitrogen is recommended before the cell lysis step.
5. A copy of the small RNA mmgR gene [40] located in the single-copy pABCa plasmid fully complemented the mmgR mutant phenotype at the physiologic and proteomic level (Lagares Jr. et al., unpublished results). However, the intracellular accumulation level and the capacity for phenotypic complementation of the tagged sRNA must be evaluated. MS2-sRNA may not exhibit the same stability or regulatory ability as the wild-type sRNA. In this case, the short pulse of overexpression from the pSKi vector is recommended.
6. Sm2019 is an expR- strain [18]. If an expR+ rhizobial strain is used, bacterial pellets should be resuspended in 2 mL of lysis solution and incubated for 20 min at 65 °C.
7. DNA probes should be 25 to 30 nt long, and probes for both the sRNA and the MS2 aptamer should be tested. The 5S rRNA (TACTCTCCCGCGTCTTAAGACGAA) probe is a good control for the amount of total RNA loaded in gels.
8. To improve the retention of sRNAs, RNA samples are loaded onto the columns mixed with seven volumes of 100% ethanol. Moreover, the first wash buffer is not used. The RNeasy® Plus Mini kit (Qiagen) also contains columns that optimize the elimination of genomic DNA. Nevertheless, the absence of DNA should be checked by routine PCR using any primer pair.
9. Alternatively, cell lysis is performed by disruption using a Branson Sonifier sonicator in three cycles of 10 s bursts at 32 W with a microprobe. It is also important to avoid foaming.
10. If the mixture flows too fast through the column, place the cap on for a few minutes; if the sample does not flow, spin for 2–3 s.
11. Alternatively, RNA contained in the eluted fractions is concentrated by clean-up using the RNeasy® Plus Mini kit.
12. If only proteomic analysis is performed, the phenolization step is not required and protein contained in the eluted fractions can be directly concentrated using Amicon® Ultra 0.5 mL Centrifugal Filters (Ultracel® 3K) to be loaded in an


SDS-PAGE gel. Our experience also supports that the eluates of the affinity chromatography from at least three biological replicates could be processed directly for quantitative proteomic profiling using state-of-the-art MS devices (i.e., tryptic digestion of the proteins that are present in the eluates, reduction, and alkylation of peptides, followed by their LC-MS/MS analysis without the need of prior purification steps).
13. SDS-PAGE can be used for Western blot hybridization to monitor the presence of specific proteins in the purification fractions. Materials and protocols are described elsewhere [28].
14. IntaRNA [25] is used to predict the interactions of the candidate target RNA with the MS2 aptamer, wild-type sRNA, and tagged sRNA. RNAs co-purified with the tagged sRNA for which an interaction with the MS2 aptamer, but not with the sRNA under study, is predicted are not considered further.
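The exclusion rule in Note 14 can be written as a simple filter. The sketch below assumes that the IntaRNA predictions have already been summarized by the user as two yes/no flags per candidate; it does not call IntaRNA, and the candidate names and flags are hypothetical.

```python
def keep_candidate(predicted_with_aptamer, predicted_with_srna):
    """Apply the rule of Note 14: discard a co-purified RNA only when an
    interaction is predicted with the MS2 aptamer but not with the sRNA."""
    return not (predicted_with_aptamer and not predicted_with_srna)

# Hypothetical prediction summary: name -> (aptamer hit, sRNA hit)
predictions = {
    "candidate_1": (False, True),   # pairs with the sRNA only: kept
    "candidate_2": (True, True),    # pairs with both: kept
    "candidate_3": (True, False),   # pairs with the aptamer only: discarded
}
retained = [name for name, (apt, srna) in predictions.items()
            if keep_candidate(apt, srna)]
print(retained)
```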

Acknowledgments

Work at J.I.J.-Z. laboratory is currently supported by the grants PID2020-114782GB-I00 funded by MCIN/AEI/10.13039/501100011033, and P20_00185 funded by Junta de Andalucía PAIDI/FEDER/EU. Work at the C.V. laboratory is currently supported by the grants PICT-2018-2495 funded by MCTI, and PPUNQ 2295/22 from Universidad Nacional de Quilmes. A.L.Jr. is a fellow of the Alexander von Humboldt Foundation and was supported by the grant PICT-2016-1120 funded by MCTI. Work at A.B. laboratory was funded by Deutsche Forschungsgemeinschaft through Collaborative Research Centre SFB 987.

References

1. Storz G (2002) An expanding universe of noncoding RNAs. Science 296:1260–1263. https://doi.org/10.1126/science.1072249
2. Waters LS, Storz G (2009) Regulatory RNAs in bacteria. Cell 136:615–628. https://doi.org/10.1016/j.cell.2009.01.043
3. Robledo M, García-Tomsig NI, Jiménez-Zurdo JI (2020) Riboregulation in nitrogen-fixing endosymbiotic bacteria. Microorganisms 8:384. https://doi.org/10.3390/microorganisms8030384
4. Storz G, Opdyke JA, Zhang A (2004) Controlling mRNA stability and translation with small, noncoding RNAs. Curr Opin Microbiol 7:140–144. https://doi.org/10.1016/j.mib.2004.02.015
5. Gottesman S, Storz G, Rosenow C, Majdalani N, Repoila F, Wassarman KM (2001) Small RNA regulators of translation: mechanisms of action and approaches for identifying new small RNAs. Cold Spring Harb Symp Quant Biol 66:353–362. https://doi.org/10.1101/sqb.2001.66.353
6. Vogel J, Luisi BF (2011) Hfq and its constellation of RNA. Nat Rev Microbiol 9:578–589. https://doi.org/10.1038/nrmicro2615
7. Smirnov A, Förstner KU, Holmqvist E, Otto A, Günster R, Becher D, Reinhart R, Vogel J (2016) Grad-seq guides the discovery of ProQ as a major small RNA-binding protein. Proc Natl Acad Sci U S A 113:11591–11596. https://doi.org/10.1073/pnas.1609981113
8. Sobrero P, Valverde C (2012) The bacterial protein Hfq: much more than a mere RNA-binding factor. Crit Rev Microbiol 38:276–299. https://doi.org/10.3109/1040841X.2012.664540
9. Holmqvist E, Vogel J (2018) RNA-binding proteins in bacteria. Nat Rev Microbiol 16:601–615. https://doi.org/10.1038/s41579-018-0049-5
10. van Assche E, van Puyvelde S, Vanderleyden J, Steenackers HP (2015) RNA-binding proteins involved in post-transcriptional regulation in bacteria. Front Microbiol 6:141. https://doi.org/10.3389/fmicb.2015.00141
11. Torres-Quesada O, Reinkensmeier J, Schlüter JP, Robledo M, Peregrina A, Giegerich R, Toro N, Becker A, Jiménez-Zurdo JI (2014) Genome-wide profiling of Hfq-binding RNAs uncovers extensive post-transcriptional rewiring of major stress response and symbiotic regulons in Sinorhizobium meliloti. RNA Biol 11:563. https://doi.org/10.4161/RNA.28239
12. Bobrovskyy M, Vanderpool CK (2013) Regulation of bacterial metabolism by small RNAs using diverse mechanisms. Annu Rev Genet 47:209–232. https://doi.org/10.1146/annurev-genet-111212-133445
13. Westermann AJ, Förstner KU, Amman F, Barquist L, Chao Y, Schulte LN, Müller L, Reinhardt R, Stadler PF, Vogel J (2016) Dual RNA-seq unveils noncoding RNA functions in host–pathogen interactions. Nature 529:496–501. https://doi.org/10.1038/nature16547
14. Hoe C-H, Raabe CA, Rozhdestvensky TS, Tang T-H (2013) Bacterial sRNAs: regulation in stress. Int J Med Microbiol 303:217–229. https://doi.org/10.1016/j.ijmm.2013.04.002
15. Djapgne L, Oglesby AG (2021) Impacts of small RNAs and their chaperones on bacterial pathogenicity. Front Cell Infect Microbiol 11:604511. https://doi.org/10.3389/fcimb.2021.604511
16. Fu H, Elena RC, Marquez PH (2019) The roles of small RNAs: insights from bacterial quorum sensing. ExRNA 1:1–8. https://doi.org/10.1186/s41544-019-0027-8
17. Jones KM, Kobayashi H, Davies BW, Taga ME, Walker GC (2007) How rhizobial symbionts invade plants: the Sinorhizobium-Medicago model. Nat Rev Microbiol 5:619–633. https://doi.org/10.1038/nrmicro1705
18. Robledo M, Frage B, Wright PR, Becker A (2015) A stress-induced small RNA modulates alpha-rhizobial cell cycle progression. PLoS Genet 11:e1005153. https://doi.org/10.1371/journal.pgen.1005153
19. Grant SGN, Jessee J, Bloom FR, Hanahan D (1990) Differential plasmid rescue from transgenic mouse DNAs into Escherichia coli methylation-restriction mutants. Proc Natl Acad Sci U S A 87:4645–4649. https://doi.org/10.1073/PNAS.87.12.4645
20. Simon R, Priefer U, Puhler A (1983) A broad host range mobilization system for in vivo genetic engineering: transposon mutagenesis in gram negative bacteria. Nat Biotechnol 1:784–791
21. Sambrook J, Fritsch EF, Maniatis T (1989) Molecular cloning. A laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor
22. Beringer JE (1974) R factor transfer in Rhizobium leguminosarum. J Gen Microbiol 84:188–198
23. Robertsen BK, Aman P, Darvill AG, McNeil M, Albersheim P (1981) Host-symbiont interactions: V. The structure of acidic extracellular polysaccharides secreted by Rhizobium leguminosarum and Rhizobium trifolii. Plant Physiol 67:389–400. https://doi.org/10.1104/pp.67.3.389
24. Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H (2021) Twelve years of SAMtools and BCFtools. Gigascience 10(2):giab008. https://doi.org/10.1093/GIGASCIENCE/GIAB008
25. Mann M, Wright PR, Backofen R (2017) IntaRNA 2.0: enhanced and customizable prediction of RNA-RNA interactions. Nucleic Acids Res 45:W435–W439. https://doi.org/10.1093/nar/gkx279
26. Tjaden B (2020) A computational system for identifying operons based on RNA-seq data. Methods 176:62. https://doi.org/10.1016/J.YMETH.2019.03.026
27. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP (2011) Integrative genomics viewer. Nat Biotechnol 29:24–26. https://doi.org/10.1038/nbt.1754
28. Robledo M, Matia-González AM, García-Tomsig NI, Jiménez-Zurdo JI (2018) Identification of small RNA–protein partners in plant symbiotic bacteria. Methods Mol Biol 1737:351–370. https://doi.org/10.1007/978-1-4939-7634-8_20
29. Robledo M, García-Tomsig NI, Matia-González AM, García-Rodríguez FM, Jiménez-Zurdo JI (2021) Synthetase of the methyl donor S-adenosylmethionine from nitrogen-fixing α-rhizobia can bind functionally diverse RNA species. RNA Biol 18:1111–1123. https://doi.org/10.1080/15476286.2020.1829365
30. García-Tomsig NI, Robledo M, diCenzo GC, Mengoni A, Millán V, Peregrina A, Uceta A, Jiménez-Zurdo JI (2022) Pervasive RNA regulation of metabolism enhances the root colonization ability of nitrogen-fixing symbiotic α-rhizobia. mBio 13(1):e0357621. https://doi.org/10.1128/MBIO.03576-21
31. Frohman MA, Dush MK, Martin GR (1988) Rapid production of full-length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer (polymerase chain reaction/5′ and 3′ cDNA ends/cDNA cloning/low-abundance mRNAs/int2 gene). Proc Natl Acad Sci U S A 85:8998–9002
32. del Val C, Rivas E, Torres-Quesada O, Toro N, Jiménez-Zurdo JI (2007) Identification of differentially expressed small non-coding RNAs in the legume endosymbiont Sinorhizobium meliloti by comparative genomics. Mol Microbiol 66:1080–1091. https://doi.org/10.1111/j.1365-2958.2007.05978.x
33. Khan SR, Gaines J, Roop RM, Farrand SK (2008) Broad-host-range expression vectors with tightly regulated promoters and their use to examine the influence of TraR and TraM expression on Ti plasmid quorum sensing. Appl Environ Microbiol 74:5053–5062. https://doi.org/10.1128/AEM.01098-08
34. Döhlemann J, Wagner M, Happel C, Carrillo M, Sobetzko P, Erb TJ, Thanbichler M, Becker A (2017) A family of single copy repABC-type shuttle vectors stably maintained in the alpha-proteobacterium Sinorhizobium meliloti. ACS Synth Biol 6:968–984. https://doi.org/10.1021/acssynbio.6b00320
35. Livak KJ, Schmittgen TD (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2^-ΔΔCT method. Methods 25:402–408. https://doi.org/10.1006/METH.2001.1262
36. Becker A, Bergés H, Krol E, Bruand C, Rüberg S, Capela D, Lauber E, Meilhoc E, Ampe F, de Brujin FJ, Fourment J, Francez-Charlot A, Kahn D, Küster H, Liebe C, Pühler A, Weidner S, Batut J (2004) Global changes in gene expression in Sinorhizobium meliloti 1021 under microoxic and symbiotic conditions. Mol Plant-Microbe Interact 17:292–303. https://doi.org/10.1094/MPMI.2004.17.3.292
37. Becker A, Barnett MJ, Capela D, Dondrup M, Kamp P-B, Krol E, Linke B, Rüberg S, Runte K, Schroeder BK, Weidner S, Yurgel SN, Batut J, Long SR, Pühler A, Goesmann A (2009) A portal for rhizobial genomes: RhizoGATE integrates a Sinorhizobium meliloti genome annotation update with postgenome data. J Biotechnol 140:45–50. https://doi.org/10.1016/j.jbiotec.2008.11.006
38. Robledo M, García-Tomsig NI, Jiménez-Zurdo JI (2018) Primary characterization of small RNAs in symbiotic nitrogen-fixing bacteria. In: Medina C, López-Baena FJ (eds) Host-pathogen interactions: methods and protocols. Springer, New York, pp 277–295. https://doi.org/10.1007/978-1-4939-7604-1_22
39. Jurica MS, Licklider LJ, Gygi SP, Grigorieff N, Moore MJ (2002) Purification and characterization of native spliceosomes suitable for three-dimensional structural analysis. RNA 8(4):426–439. https://doi.org/10.1017/s1355838202021088
40. Lagares A Jr, Ceizel Borella G, Linne U, Becker A, Valverde C (2017) Regulation of polyhydroxybutyrate accumulation in Sinorhizobium meliloti by the trans-encoded small RNA MmgR. J Bacteriol 199:e00776-16. https://doi.org/10.1128/JB.00776-16

Part IV sRNA Structure

Chapter 20 sRNA Structural Modeling Based on NMR Data

Pengzhi Wu and Lingna Yang

Abstract

Small non-coding RNAs (sRNAs) play vital roles in gene expression regulation and RNA interference. To comprehend their molecular mechanisms and develop therapeutic approaches, determining the accurate three-dimensional structure of sRNAs is crucial. Although nuclear magnetic resonance (NMR) spectroscopy is a powerful tool for structural biology, obtaining high-resolution structures of sRNAs using NMR data alone can be challenging. In such cases, structural modeling can provide additional details about RNA structures. In this context, we present a protocol for the structural modeling of sRNA using the SimRNA method based on sparse NMR constraints. To demonstrate the efficacy of our method, we provide selected examples of NMR spectra and RNA structures, specifically for the second stem-loop of DsrA sRNA.

Key words RNA structure, Imino proton, Structural modeling, SimRNA, NMR spectroscopy

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_20, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

Small non-coding RNAs (sRNAs) are a diverse group of RNA molecules that typically consist of fewer than 200 nucleotides and do not encode proteins [1]. Despite their small size, sRNAs play vital roles in various regulatory functions such as controlling gene expression, RNA splicing, and translation [2]. For example, microRNAs (miRNAs) are short RNA molecules 19–25 nucleotides (nt) in size that repress gene expression by promoting mRNA degradation or inhibiting translation [3], and bacterial regulatory sRNAs are crucial in bacterial adaptation to changing environments, stress responses, virulence, and pathogenicity [4].

Like proteins, the three-dimensional (3D) structure of RNA is essential for its biological function and interaction with other molecules in the cell [5–7]. Several methods are available to determine the 3D structure of RNA, including nuclear magnetic resonance (NMR) spectroscopy [8], X-ray crystallography [9], and cryo-electron microscopy (cryo-EM) [10]. X-ray crystallography involves crystallizing the RNA molecule and using X-rays to determine the positions of the


atoms in the crystal structure [10], whereas cryo-EM involves freezing the RNA molecule and using an electron microscope to image the molecule from different angles to reconstruct the 3D structure [11, 12]. X-ray crystallography and cryo-EM require the macromolecule to be in a specific form, either as a crystal or as a particle with a particular symmetry [13], whereas NMR can determine the 3D structure of RNA in solution, allowing for the characterization of the dynamic properties of the molecule [14].

RNA has three types of exchangeable protons, namely, imino (-NH), amino (-NH2), and hydroxyl protons (-OH) [15]. The rate of exchange of these protons can provide information about the local structure and dynamics of the RNA molecule [16]. For instance, regions of RNA involved in base pairing and secondary structure formation tend to have slower exchange rates, while regions that are more flexible and exposed to solvent tend to have faster exchange rates [17]. The chemical shifts of imino protons can be used to identify base pairing and stacking interactions between nucleotides, which are critical for RNA folding and stability [18]. Therefore, they are essential for RNA structural modeling as they provide valuable information about the secondary and tertiary structure of RNA molecules [19].

Although NMR spectroscopy is a useful tool for determining the solution structure of sRNAs, it can be difficult to obtain high-resolution structures using NMR data alone, particularly for large and complex RNAs. On the other hand, computational methods can be used to model the structure of RNA molecules based on NMR data, chemical modification data, and cryo-electron microscopy data [20]. There are several computational methods that can be used to predict the 3D structure of RNA, including comparative modeling (homology modeling), de novo modeling, molecular dynamics simulations, hybrid methods, and ab initio folding [21]. Of all the ab initio folding methods, SimRNA stands out for its ability to incorporate experimental data, including NMR restraints, in the prediction of RNA’s 3D structure. SimRNA uses a coarse-grained representation of a nucleotide chain, a knowledge-based energy function, and a Monte Carlo scheme for sampling the conformational space. It employs a statistical potential to approximate the energy and identify conformations that correspond to biologically relevant structures [22–24].

The aim of this chapter is to provide a protocol for using the SimRNA method to model the structure of sRNA based on sparse NMR constraints. To achieve this, we use the 25-nucleotide second stem-loop (SL) of DsrA (DsrA-SL2), which has two distinct conformations (named Conformation 1 and Conformation 2, Fig. 1), as a model system [25]. The major procedures in the protocol comprise RNA in vitro transcription for NMR studies, experimental determination of sRNA secondary structure, and structural modeling of sRNA using computational methods.
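The notion of Monte Carlo sampling of conformational space mentioned above can be illustrated with the generic Metropolis acceptance rule. This sketch is not SimRNA's implementation: its move set, energy function, and sampling details are not reproduced, and the function name, energy units, and temperature parameter are assumptions made for illustration.

```python
import math
import random

def metropolis_accept(delta_energy, temperature, rng=random):
    """Generic Metropolis criterion for a trial conformational move.

    Moves that lower the energy are always accepted; moves that raise it
    are accepted with probability exp(-delta_energy / temperature).
    """
    if delta_energy <= 0:
        return True
    return rng.random() < math.exp(-delta_energy / temperature)

# A downhill move is always accepted; a very unfavourable one is rejected.
print(metropolis_accept(-2.0, temperature=1.0))
print(metropolis_accept(1e6, temperature=1.0))
```

Occasionally accepting uphill moves is what allows such a scheme to escape local energy minima while still favouring low-energy conformations.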

[Figure: secondary-structure diagrams of DsrA-SL2 in two panels, Conformation 1 and Conformation 2, with nucleotides numbered 38–60]

Fig. 1 Schematic of the bistable secondary structure of DsrA-SL2, comprising nucleotides 38–60 of DsrA sRNA. To enhance the RNA transcription yield, two guanines are incorporated at the 5′-end of the RNA. The nucleotides representing Conformation 1 are highlighted in green, whereas those of Conformation 2 are denoted in black

2 Materials

2.1 Enzymes, Buffers, and Chemicals

To prepare RNA for NMR experiments, the following items are needed (see Note 1):
1. T7 RNA polymerase (RNAP): 3 mg/mL in 25 mM sodium phosphate, pH 7.0, 150 mM NaCl, 50% (v/v) glycerol, 10 mM DTT. Store at -80 °C.
2. 10× T7 RNAP transcription buffer: 400 mM Tris–HCl, pH 8.1, 10 mM spermidine, 0.1% (v/v) Triton X-100.
3. T7 promoter DNA (60 μM in water): a T7 RNA polymerase promoter DNA oligonucleotide, stored at -20 °C (see Note 2).
4. DNA template (60 μM in water): a complementary DNA oligonucleotide template for RNA transcription, stored at -20 °C (see Fig. 2 for sequence parameters).
5. Solutions of 50 mM nucleotide-5′-triphosphate (ATP, CTP, GTP, and UTP) prepared in pH 7.0 water and stored in 1 mL aliquots at -20 °C.
6. Solution of 500 mM magnesium chloride (MgCl2) prepared in DEPC-treated ddH2O.
7. Solution of 100 mM dithiothreitol (DTT) prepared in DEPC-treated ddH2O.
8. DEPC-treated ddH2O.
9. 0.5 M EDTA stock solution: pH 8.0, prepared in DEPC-treated ddH2O.
10. 100% ethanol.
11. Denaturing acrylamide gel solution: 16% acrylamide/bisacrylamide (19:1), 1× TBE, and 7 M urea.

DNA template
5′-GAAAT TAATACGACTCACTATA G-3′
3′-CTTTA ATTATGCTGAGTGATAT CCACGAAGAACGAATTCGTTCAAAG-5′
(the double-stranded region carries the T7 promoter sequence)

RNA synthesized
5′-GGUGCUUCUUGCUUAAGCAAGUUUC-3′

Fig. 2 Sequence of the single-stranded DNA template for the in vitro transcription of DsrA-SL2 RNA
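When designing a template for a different sRNA, it can be useful to confirm that the overhang of the template strand really encodes the intended transcript. The sketch below does this for the sequences of Fig. 2. The helper names are illustrative, and the code assumes, as is standard for this promoter, that transcription starts at the G at the 3′-end of the promoter strand.

```python
# Sequences copied from Fig. 2 (spaces removed); names are illustrative only.
PROMOTER_STRAND = "GAAATTAATACGACTCACTATAG"  # written 5'->3'
TEMPLATE_STRAND = "CTTTAATTATGCTGAGTGATATCCACGAAGAACGAATTCGTTCAAAG"  # written 3'->5'

DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def promoter_is_complementary(promoter, template):
    """True if the promoter strand pairs with the start of the template strand."""
    return all(DNA_PAIR[p] == t for p, t in zip(promoter, template))


def transcript(promoter, template):
    """RNA (5'->3') read off the template, starting opposite the last promoter base."""
    start = len(promoter) - 1          # +1 position = final G of the promoter strand
    return "".join(DNA_TO_RNA[base] for base in template[start:])


if __name__ == "__main__":
    print(promoter_is_complementary(PROMOTER_STRAND, TEMPLATE_STRAND))
    print(transcript(PROMOTER_STRAND, TEMPLATE_STRAND))
```

For the sequences above, the transcript returned is the 25-nucleotide DsrA-SL2 RNA shown in Fig. 2, beginning with the two additional guanines.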

12. Denaturing loading buffer: 95% (v/v) formamide, 2% (v/v) 0.5 M EDTA (pH 8.0), 0.02% (w/v) bromophenol blue.
13. 10× TBE running buffer: Dissolve 108 g of Tris base, 55 g of boric acid, and 7.5 g of disodium EDTA salt in 1 L of water.
14. Ammonium persulfate solution (APS): 10% (w/v) solution in DEPC-treated ddH2O.
15. N,N,N′,N′-Tetramethylethylenediamine (TEMED).
16. GelRed (10,000×, Biotium).
17. NMR buffer: 10 mM sodium phosphate, pH 6.5, 50 mM NaCl, 90% H2O/10% D2O.

2.2 Equipment

1. NMR spectrometer equipped with an HCN cryoprobe.
2. Standard and Shigemi NMR tubes.
3. Elutrap electroelution system (GE Healthcare Life Sciences).
4. Water bath.
5. Centrifuge with appropriate rotors.
6. PAGE gel apparatus.
7. UV illuminator (254 nm).
8. Fluor-coated TLC plates (fluoresce in 254 nm UV light).

2.3 Software

1. NMRFAM-Sparky: a program for visualizing and analyzing NMR data (https://nmrfam.wisc.edu/nmrfam-sparky-distribution/).
2. Topspin: software for processing, displaying, and analyzing NMR spectra (https://www.bruker.com/en/products-and-solutions/mr/nmr-software/topspin.html).
3. SimRNA: a software package for simulating RNA structures and folding pathways (https://genesilico.pl/software/stand-alone/simrna).
4. SimRNAweb: a web-based interface for the SimRNA software package (https://genesilico.pl/SimRNAweb).
5. PyMOL: a program for high-quality 3D visualizations of macromolecules (https://pymol.org/2/).

3 Methods

3.1 RNA In Vitro Transcription for NMR Studies

1. Before the preparative-scale transcription, set up small-scale test reactions to determine the optimal concentration of MgCl2. For each 50 μL transcription reaction, mix the following components in a 1.5-mL microcentrifuge tube: 40 mM Tris–HCl (pH 8.1), 0.3 μM DNA template, 0.3 μM T7 promoter DNA, 0.1 mg/mL T7 RNA polymerase, 5 mM of each unlabeled NTP, 1 mM spermidine, 10 mM DTT, 0.01% (v/v) Triton X-100, and 20–50 mM MgCl2 (see Note 3).
2. Incubate the mixture for 4 h at 37 °C in a water bath, and then use denaturing PAGE to determine the magnesium concentration that gives the highest RNA yield.
3. Scale up the in vitro transcription mixture from 50 μL to 10 mL. For the 2D 1H-15N HSQC NMR experiment, uniformly 15N-labeled NTPs should be used.
4. Incubate the reaction solution for 4 h in a water bath at 37 °C.
5. Quench the reaction by adding 1/10 volume of 0.5 M EDTA (pH 8.0).
6. Add 1/10 volume of 5 M NaCl and 3 volumes of cold 100% ethanol to the transcription solution.
7. Incubate the solution at -20 °C overnight to precipitate the RNA.
8. Centrifuge the solution at 14,000 rpm at 4 °C for 30 min to pellet the RNA. Decant the supernatant and gently wash the pellet with 10 mL of cold 70% ethanol.
9. Resuspend the RNA pellet in 1.5 mL of DEPC-treated water and then add 1.5 mL of RNA denaturing loading buffer.
10. Heat the sample at 95 °C for 5 min and then incubate it on ice for 15 min.
11. Prepare a 16% acrylamide/bis 19:1 denaturing gel containing 7 M urea. Pre-run the denaturing gel at 120 W for 30–60 min to equilibrate and preheat the gel (see Note 4).
12. Run the RNA sample at 120 W. Use the mobility of the tracking dye to decide when to stop the run.
13. Visualize the target RNA band by UV shadowing at 254 nm and cut out the band of interest using a clean razor blade.

14. Elute the RNA using the Elutrap Electroelution System in 0.5× TBE buffer at 4 °C overnight.
15. Recover the eluted RNA from the collection cup and wash it with 2 M NaCl three times. Then, wash the RNA with DEPC-treated water three more times.
16. Concentrate the RNA sample to approximately 1 mL, transfer it to a 50 mL tube, and add 30 mL of NMR buffer.
17. Anneal the RNA to form the native structure by incubating it at 95 °C for 5 min and then immediately placing it on ice for 20 min (see Note 5).
18. Concentrate the RNA to the desired concentration. Generally, a concentration range of 0.5–1.0 mM is recommended for NMR studies.
19. The RNA is now ready for NMR studies and should be stored at -80 °C.

3.2 Experimental Determination of sRNA Secondary Structure

NMR spectroscopy is a powerful technique for studying the structure and dynamics of biological macromolecules, including RNA. In RNA, the imino protons are the protons bonded to ring nitrogen atoms of the bases (H1 of guanine and H3 of uracil), and their chemical shifts provide crucial information about RNA structure and function. Assigning the imino proton resonances of an RNA by NMR typically involves several steps, which can vary depending on the complexity of the sRNA molecule and the quality of the NMR data. A general overview of the steps is given below:
1. To begin the NMR analysis of an sRNA, prepare a low concentration of RNA (0.1–0.2 mM) in 90% H2O/10% D2O and record a 1D 1H NMR spectrum (Fig. 3a). The secondary structure of the sRNA can be predicted initially by computational methods such as Mfold, which uses thermodynamic calculations to predict the most stable secondary structure of an RNA from its nucleotide sequence. However, these predictions are not always accurate and can be affected by factors such as the presence of ions, pH, and temperature. The 1D 1H NMR spectrum is a reliable way to validate the predicted secondary structure. Each Watson-Crick base pair (i.e., A-U and G-C) has one imino proton that gives rise to a characteristic signal in the 1D 1H NMR spectrum. The G-U wobble base pair and the U-U mismatched base pair, on the other hand, have two imino protons, resulting in two distinct signals in the spectrum. By counting the number of imino signals in the 1D 1H NMR spectrum, it is possible to determine whether the RNA molecule has enough base pairs to form the secondary structure predicted by Mfold. If there are considerably more imino signals detected than

[Fig. 3 panels: (A) 1D 1H spectrum with assigned imino peaks; (B) 2D 1H-1H NOESY, both axes 1H (ppm), 11–14 ppm, with labeled imino-imino crosspeaks; (C) 2D 1H-15N HSQC, 1H (ppm) 11–14 versus 15N (ppm) 145–165, with labeled G and U imino peaks]

Fig. 3 Assignment of imino protons of DsrA-SL2 RNA. (a) 1D 1H spectrum, (b) 2D 1H-1H NOESY, and (c) 2D 1H-15N HSQC


predicted, it suggests that the RNA molecule may adopt more than one conformation and secondary structure.
2. To obtain a better NMR spectrum of the sRNA, it is crucial to optimize conditions such as temperature, salt, pH, and Mg2+ concentration. The imino protons in RNA exchange rapidly with solvent molecules, which makes the detection of imino signals in the NMR spectrum challenging. One way to improve their detection is to perform the NMR experiments at lower temperatures, typically from 278 to 283 K. Lower temperatures reduce the mobility of the RNA molecule, slow down the exchange of imino protons with solvent molecules, and reduce the line broadening of the signals (see Note 6).
3. Following the optimization of the NMR conditions, prepare a more concentrated RNA sample (0.5–1 mM) in 90% H2O/10% D2O and perform a 2D 1H-1H NOESY experiment (Fig. 3b). RNA helices commonly adopt the A-form, in which the imino protons of neighboring base pairs are approximately 4 Å apart and give rise to moderate NOE crosspeaks at longer mixing times. A helix can therefore be walked sequentially, with each imino resonance assigned from its crosspeaks to the imino protons of the adjacent base pairs. Watson-Crick G-C base pairs can be identified from the chemical shifts of their imino protons, which appear at approximately 12–13.5 ppm. In contrast, imino protons from A-U base pairs resonate further downfield, at around 13–15 ppm. An advantage of the 2D 1H-1H NOESY spectrum is that non-canonical G-U and U-U base pairs are readily identified by the intense crosspeak between the two imino protons within the pair (guanosine H1 and uridine H3), which have chemical shifts of 10–11.5 ppm for guanosine and 10.5–12.5 ppm for uridine.
4. In addition to the imino protons, several other protons in RNA molecules can be assigned from their NOE crosspeaks to imino protons. For instance, the amino protons of cytidine, with chemical shifts of about 6.0–7.5 ppm, are typically well resolved and can be identified via pairs of NOEs to the guanine imino proton in G-C base pairs. Similarly, a strong crosspeak from the uridine imino proton to the adenine H2 can be detected in A-U base pairs. The adenine H2 typically resonates in the range of 7.0–8.5 ppm.
5. The use of 15N-labeled RNA and a 2D 1H-15N HSQC or HMQC spectrum can greatly aid the assignment of imino proton resonances (Fig. 3c). Since the 15N chemical shifts of guanine and uridine imino nitrogens differ by about 10 ppm, the signals from guanine and uridine imino protons are easily distinguished in the 1H-15N HSQC spectrum. This helps with identification and assignment, particularly where the 2D 1H-1H NOESY spectrum is crowded or overlapped.
6. Once the imino protons are assigned, the hydrogen bonding pattern between the nucleotides can be analyzed to determine the secondary structure of the RNA. For example, sequential imino proton signals indicate a helix, whereas the absence of sequential signals indicates a loop or single-stranded region.

3.3 Structural Modeling of sRNA Using Computational Methods

After determining the secondary structure of an sRNA, it is often desirable to obtain a 3D model of the molecule. SimRNA is a software package for RNA structure modeling that is run from the command line. The steps below outline how SimRNA is used to predict the 3D structure of an sRNA from its sequence, with additional structural information supplied as secondary structure restraints. These restraints are derived from the imino proton assignments obtained by NMR spectroscopy.

1. Visit the official website for SimRNA and download the software. Next, prepare the input files listed below (see Note 7):
• Input sequence file (input_file_sequence) in ASCII format, named SL2.seq in our case:
  GGUGCUUCUUGCUUAAGCAAGUUUC
• Secondary structure restraints file (secondary_struc_restraints_file), named Secondary_struc1, for Conformation 1 of DsrA-SL2:
  ..((((((((....))).)))))..
• Secondary structure restraints file (secondary_struc_restraints_file), named Secondary_struc2, for Conformation 2 of DsrA-SL2:
  .....(((((((....)))))))..
• Configuration file (configSA.dat) containing the basic simulation parameters:
  NUMBER_OF_ITERATIONS 16000000
  TRA_WRITE_IN_EVERY_N_ITERATIONS 16000
  INIT_TEMP 1.35
  FINAL_TEMP 0.90
  BONDS_WEIGHT 1.0
  ANGLES_WEIGHT 1.0
  TORS_ANGLES_WEIGHT 0.0
  ETA_THETA_WEIGHT 0.40

2. Perform the simulation using the following syntax:
  $ ./SimRNA -s input_file_sequence -S secondary_struc_restraints_file -c configSA.dat -o output_files_basename >& results.log &
Where:
• “output_files_basename” is the basename of the output files, which contain the trajectories sampled in the simulation (.trafl), the bonds (.bonds), and the detected secondary structures (.ss_detected).
• “results.log” keeps a record of the messages generated during the simulation.
For Conformation 1, run the following command:
  $ ./SimRNA -s SL2.seq -S Secondary_struc1 -c configSA.dat -o Conformation1 >& Conformation1.log &
For Conformation 2, run the following command:
  $ ./SimRNA -s SL2.seq -S Secondary_struc2 -c configSA.dat -o Conformation2 >& Conformation2.log &

3. Cluster the best-scored conformations using the following syntax:
  $ ./clustering trajectory.trafl fraction_of_lowE_frames_to_cluster rmsd_thrs >& results_clust.log
For Conformation 1, run the following command:
  $ ./clustering Conformation1.trafl 0.01 3.5 >& Conformation1_clust.log
For Conformation 2, run the following command:
  $ ./clustering Conformation2.trafl 0.01 3.5 >& Conformation2_clust.log
In these commands:
• Conformation1.trafl and Conformation2.trafl are the input trafl files.
• The value “0.01” indicates that only the 1% of frames with the lowest energy in the input trafl file are subjected to clustering; the remaining frames are ignored.
• The value “3.5” is the RMSD threshold set for the first pass of clustering.


4. Extract the lowest energy frame from the cluster by executing the following command line:
  $ python trafl_extract_lowestE_frame.py trajectory.trafl
For Conformation 1, run the following command:
  $ python trafl_extract_lowestE_frame.py Conformation1_thrs3.50A_clust01.trafl
For Conformation 2, run the following command:
  $ python trafl_extract_lowestE_frame.py Conformation2_thrs3.50A_clust01.trafl
5. Generate all-atom models of the representatives of the largest clusters using the following syntax:
  $ ./SimRNA_trafl2pdbs Structure.pdb trajectory.trafl 1 AA
Where:
• “1” specifies that the first frame of the input trajectory file (trafl) will be converted into a PDB file.
• The option “AA” initiates an all-atom reconstruction of the corresponding frame.
For Conformation 1, run the following command:
  $ ./SimRNA_trafl2pdbs Conformation1-000001.pdb Conformation1_thrs3.50A_clust01_minE.trafl 1 AA
For Conformation 2, run the following command:
  $ ./SimRNA_trafl2pdbs Conformation2-000001.pdb Conformation2_thrs3.50A_clust01_minE.trafl 1 AA
6. Analyze the 3D structure of the RNA molecule using software such as PyMOL, VMD, or MolProbity (Fig. 4).
7. To simplify the steps of the stand-alone package, SimRNAweb, the web service of SimRNA, can be used instead.

In conclusion, understanding the 3D structure of an sRNA is crucial for comprehending its function and its interactions with other biomolecules. Computational methods such as SimRNA, together with experimental data from techniques such as NMR spectroscopy, can be used to model the 3D structure of an sRNA. The accuracy and completeness of the experimental data are critical factors that can significantly affect the reliability of the resulting models. NMR spectroscopy provides valuable data on various structural features of sRNA molecules in solution. These data can be used as restraints to generate more accurate and reliable models of sRNA structures, including regions that may not be well defined by the NMR spectrum alone.
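Because steps 2–5 of Subheading 3.3 are repeated for each conformation, they can be gathered in a small driver script. The sketch below only assembles the command lines given in those steps and, optionally, runs them in order from the SimRNA installation directory (see Note 7). The function names are hypothetical, and the output file naming (for example `_thrs3.50A_clust01`) is inferred from the examples in this chapter rather than taken from the SimRNA documentation, so it should be checked against the files actually produced.

```python
import os
import subprocess


def simrna_workflow(name, restraints_file, seq_file="SL2.seq",
                    config="configSA.dat", fraction=0.01, rmsd=3.5):
    """Return (argv, log_file) pairs for steps 2-5 for one conformation."""
    # File name pattern inferred from the examples in the text (assumption).
    cluster = f"{name}_thrs{rmsd:.2f}A_clust01"
    return [
        (["./SimRNA", "-s", seq_file, "-S", restraints_file,
          "-c", config, "-o", name], f"{name}.log"),
        (["./clustering", f"{name}.trafl", str(fraction), str(rmsd)],
         f"{name}_clust.log"),
        (["python", "trafl_extract_lowestE_frame.py", f"{cluster}.trafl"], None),
        (["./SimRNA_trafl2pdbs", f"{name}-000001.pdb",
          f"{cluster}_minE.trafl", "1", "AA"], None),
    ]


def run_workflow(steps, simrna_dir="."):
    """Run the steps one after another, stopping at the first failure."""
    for argv, log_file in steps:
        if log_file is None:
            subprocess.run(argv, cwd=simrna_dir, check=True)
        else:
            # Write stdout and stderr to the log, like '>& file' in the shell.
            with open(os.path.join(simrna_dir, log_file), "w") as log:
                subprocess.run(argv, cwd=simrna_dir, stdout=log,
                               stderr=subprocess.STDOUT, check=True)


if __name__ == "__main__":
    for argv, _ in simrna_workflow("Conformation1", "Secondary_struc1"):
        print(" ".join(argv))
```

Unlike the shell commands in the text, which send the simulation to the background, this sketch runs each step to completion before starting the next, since every step reads the output of the previous one.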

[Fig. 4 image: 3D models of (A) Conformation 1 and (B) Conformation 2, with nucleotides U38, U45, A50, and C60 and the 5′ and 3′ ends labeled]

Fig. 4 Predicted 3D structure of DsrA-SL2, generated using SimRNA. (a) Conformation 1. (b) Conformation 2. Nucleotides are color-coded as blue (guanosine), green (cytosine), orange (adenine), and red (uracil)

Moreover, the combination of NMR spectroscopy and structural modeling using SimRNA can also help resolve ambiguities present in the NMR spectrum. Therefore, a comprehensive approach that combines experimental and computational techniques can enhance our understanding of the structural and functional properties of sRNA, enabling the development of RNA-based therapeutics and other applications.
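One practical way to connect the NMR analysis of Subheading 3.2 with the restraint files of Subheading 3.3 is to list the base pairs encoded by each dot-bracket string and count the imino signals they would be expected to give. The sketch below is illustrative only. The function names and the numbering offset are assumptions made here (transcript position 3 is taken as U38, following Fig. 1), and the rule of one signal per Watson-Crick pair and two per G-U or U-U pair is the simplified counting scheme described in Subheading 3.2, step 1.

```python
SEQUENCE = "GGUGCUUCUUGCUUAAGCAAGUUUC"
RESTRAINTS = {
    "Conformation 1": "..((((((((....))).)))))..",
    "Conformation 2": ".....(((((((....)))))))..",
}
DSRA_OFFSET = 35  # transcript position 3 corresponds to U38 (assumed from Fig. 1)

# Expected imino signals per base pair type (Subheading 3.2, step 1).
IMINO_SIGNALS = {
    frozenset("AU"): 1,
    frozenset("GC"): 1,
    frozenset("GU"): 2,
    frozenset("UU"): 2,
}


def base_pairs(dot_bracket):
    """Return sorted 1-based (i, j) pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for position, symbol in enumerate(dot_bracket, start=1):
        if symbol == "(":
            stack.append(position)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % position)
            pairs.append((stack.pop(), position))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def label(position):
    """Nucleotide label in DsrA numbering; the two added 5' guanines are flagged."""
    base = SEQUENCE[position - 1]
    if position <= 2:
        return f"{base}(extra {position})"
    return f"{base}{position + DSRA_OFFSET}"


def describe(dot_bracket):
    """Return (list of pair labels, expected number of imino signals)."""
    if len(dot_bracket) != len(SEQUENCE):
        raise ValueError("restraint string and sequence differ in length")
    labels, total = [], 0
    for i, j in base_pairs(dot_bracket):
        kind = frozenset(SEQUENCE[i - 1] + SEQUENCE[j - 1])
        if kind not in IMINO_SIGNALS:
            raise ValueError(f"unexpected pair {label(i)}-{label(j)}")
        total += IMINO_SIGNALS[kind]
        labels.append(f"{label(i)}-{label(j)}")
    return labels, total


if __name__ == "__main__":
    for name, restraint in RESTRAINTS.items():
        pairs, n_imino = describe(restraint)
        print(name, n_imino, ", ".join(pairs))
```

For the two restraint strings used in this chapter, the count comes to 10 (Conformation 1) and 9 (Conformation 2). Terminal or solvent-exposed pairs may exchange too quickly to be observed, so the number of signals actually measured can be lower.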

4 Notes

1. All solutions must be RNase-free. RNases are enzymes that cleave and degrade RNA molecules. They are ubiquitous: they are present in the laboratory environment, on surfaces, and in many biological samples. If RNases are present in the solutions used for RNA in vitro transcription, they can degrade the RNA during or after transcription, resulting in incomplete transcripts.
2. To increase the yield of product RNA and increase the stability of the double-stranded DNA, five nucleotides (5′-GAAAT-3′) are added to the 5′-end of the T7 promoter sequence.
3. Other parameters can also be optimized during RNA in vitro transcription to improve the yield, quality, and specificity of the RNA transcripts. These include the concentrations of NTPs, template DNA, and T7 RNA polymerase; the pH and buffer composition of the transcription reaction; and the choice of DNA template and promoter sequence.
4. The percentage of the acrylamide/bis 19:1 denaturing gel used for RNA purification depends on the size of the RNA of interest. Generally, the higher the percentage of acrylamide, the better the separation of smaller RNAs, while lower percentages are more suitable for larger RNA molecules. For example, a 10% acrylamide/bis 19:1 denaturing gel is commonly used to separate RNA molecules in the range of 50–500 nucleotides, while a 15% acrylamide/bis 19:1 gel is more suitable for RNA molecules in the range of 20–150 nucleotides.
5. The annealing conditions can vary depending on the specific RNA and its intended application, and are best optimized for each RNA sample. Some RNAs anneal better when heated to the optimal annealing temperature and then cooled slowly to room temperature over a defined period of time.
6. The optimal salt concentration for NMR experiments depends on the specific RNA sequence and structure. Optimizing the salt concentration can improve NMR signal quality and reduce sample aggregation and precipitation.
7. To use SimRNA, open a command line interface and navigate to the SimRNA installation directory. All SimRNA commands must be executed within this directory, which includes the essential “data” directory containing the energy function files required for SimRNA operation.
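When testing several MgCl2 concentrations (Subheading 3.1, step 1, and Note 3), the pipetting volumes follow from the stock and final concentrations through C1V1 = C2V2. The sketch below works this out for the stocks listed in Subheading 2.1. It is a convenience calculation rather than part of the published protocol: the dictionary layout and function name are illustrative, and the small amount of DTT carried over with the polymerase stock is ignored.

```python
# (stock concentration, final concentration), same unit within each pair.
COMPONENTS = {
    "10x transcription buffer": (10.0, 1.0),   # x
    "DNA template": (60.0, 0.3),               # uM
    "T7 promoter DNA": (60.0, 0.3),            # uM
    "T7 RNA polymerase": (3.0, 0.1),           # mg/mL
    "ATP": (50.0, 5.0),                        # mM
    "CTP": (50.0, 5.0),                        # mM
    "GTP": (50.0, 5.0),                        # mM
    "UTP": (50.0, 5.0),                        # mM
    "DTT": (100.0, 10.0),                      # mM
}
MGCL2_STOCK_MM = 500.0


def reaction_volumes(total_ul, mgcl2_mm):
    """Volumes (in uL) of each stock for one reaction; water makes up the rest."""
    volumes = {name: total_ul * final / stock
               for name, (stock, final) in COMPONENTS.items()}
    volumes["MgCl2"] = total_ul * mgcl2_mm / MGCL2_STOCK_MM
    water = total_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    volumes["DEPC-treated water"] = water
    return volumes


if __name__ == "__main__":
    for mg in (20, 30, 40, 50):                # mM MgCl2 test series
        mix = reaction_volumes(50.0, mg)
        print(mg, {name: round(vol, 2) for name, vol in mix.items()})
```

The same function can be called with `total_ul=10000` to scale the chosen condition up to the 10 mL preparative reaction.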

Acknowledgments
This work was supported by the Hefei National Laboratory for Physical Sciences at Microscale and the School of Life Science, University of Science and Technology of China.

References
1. Storz G, Vogel J, Wassarman KM (2011) Regulation by small RNAs in bacteria: expanding frontiers. Mol Cell 43(6):880–891. https://doi.org/10.1016/j.molcel.2011.08.022
2. Waters LS, Storz G (2009) Regulatory RNAs in bacteria. Cell 136(4):615–628. https://doi.org/10.1016/j.cell.2009.01.043
3. Lewis BP, Burge CB, Bartel DP (2005) Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120(1):15–20. https://doi.org/10.1016/j.cell.2004.12.035
4. Papenfort K, Vogel J (2010) Regulatory RNA in bacterial pathogens. Cell Host Microbe 8(1):116–127. https://doi.org/10.1016/j.chom.2010.06.008
5. Cech TR, Steitz JA (2014) The noncoding RNA revolution-trashing old rules to forge new ones. Cell 157(1):77–94. https://doi.org/10.1016/j.cell.2014.03.008
6. Shi H, Rangadurai A, Abou Assi H, Roy R, Case DA, Herschlag D, Yesselman JD, Al-Hashimi HM (2020) Rapid and accurate determination of atomistic RNA dynamic ensemble models using NMR and structure prediction. Nat Commun 11(1):5531. https://doi.org/10.1038/s41467-020-19371-y
7. Bailor MH, Sun X, Al-Hashimi HM (2010) Topology links RNA secondary structure with global conformation, dynamics, and adaptation. Science 327(5962):202–206. https://doi.org/10.1126/science.1181085
8. Lu K, Heng X, Garyu L, Monti S, Garcia EL, Kharytonchyk S, Dorjsuren B, Kulandaivel G, Jones S, Hiremath A, Divakaruni SS, LaCotti C, Barton S, Tummillo D, Hosic A, Edme K, Albrecht S, Telesnitsky A, Summers MF (2011) NMR detection of structures in the HIV-1 5′-leader RNA that regulate genome packaging. Science 334(6053):242–245. https://doi.org/10.1126/science.1210460
9. Ban N, Nissen P, Hansen J, Moore PB, Steitz TA (2000) The complete atomic structure of the large ribosomal subunit at 2.4 A resolution. Science 289(5481):905–920. https://doi.org/10.1126/science.289.5481.905
10. Chung K, Xu L, Chai P, Peng J, Devarkar SC, Pyle AM (2022) Structures of a mobile intron retroelement poised to attack its structured DNA target. Science 378(6620):627–634. https://doi.org/10.1126/science.abq2844
11. Bai XC, Rajendra E, Yang G, Shi Y, Scheres SH (2015) Sampling the conformational space of the catalytic subunit of human gamma-secretase. elife 4. https://doi.org/10.7554/eLife.11182
12. Scheres SH (2016) Processing of structurally heterogeneous Cryo-EM data in RELION. Methods Enzymol 579:125–157. https://doi.org/10.1016/bs.mie.2016.04.012
13. Hendrickson WA, Teeter MM (1981) Structure of the hydrophobic protein crambin determined directly from the anomalous scattering of sulphur. Nature 290(5802):107–113. https://doi.org/10.1038/290107a0
14. Barnwal RP, Yang F, Varani G (2017) Applications of NMR to structure determination of RNAs large and small. Arch Biochem Biophys 628:42–56. https://doi.org/10.1016/j.abb.2017.06.003
15. Jiang F, Kumar RA, Jones RA, Patel DJ (1996) Structural basis of RNA folding and recognition in an AMP-RNA aptamer complex. Nature 382(6587):183–186. https://doi.org/10.1038/382183a0
16. Pardi A, Morden KM, Patel DJ, Tinoco I Jr (1982) Kinetics for exchange of imino protons in the d(C-G-C-G-A-A-T-T-C-G-C-G) double helix and in two similar helices that contain a G·T base pair, d(C-G-T-G-A-A-T-T-C-G-C-G), and an extra adenine, d(C-G-C-A-G-A-A-T-T-C-G-C-G). Biochemistry 21(25):6567–6574. https://doi.org/10.1021/bi00268a038
17. Wacker A, Weigand JE, Akabayov SR, Altincekic N, Bains JK, Banijamali E, Binas O, Castillo-Martinez J, Cetiner E, Ceylan B, Chiu LY, Davila-Calderon J, Dhamotharan K, Duchardt-Ferner E, Ferner J, Frydman L, Furtig B, Gallego J, Grun JT, Hacker C, Haddad C, Hahnke M, Hengesbach M, Hiller F, Hohmann KF, Hymon D, de Jesus V, Jonker H, Keller H, Knezic B, Landgraf T, Lohr F, Luo L, Mertinkus KR, Muhs C, Novakovic M, Oxenfarth A, Palomino-Schatzlein M, Petzold K, Peter SA, Pyper DJ, Qureshi NS, Riad M, Richter C, Saxena K, Schamber T, Scherf T, Schlagnitweit J, Schlundt A, Schnieders R, Schwalbe H, Simba-Lahuasi A, Sreeramulu S, Stirnal E, Sudakov A, Tants JN, Tolbert BS, Vogele J, Weiss L, Wirmer-Bartoschek J, Wirtz Martin MA, Wohnert J, Zetzsche H (2020) Secondary structure determination of conserved SARS-CoV-2 RNA elements by NMR spectroscopy. Nucleic Acids Res 48(22):12415–12435. https://doi.org/10.1093/nar/gkaa1013
18. Wang Y, Han G, Jiang X, Yuwen T, Xue Y (2021) Chemical shift prediction of RNA imino groups: application toward characterizing RNA excited states. Nat Commun 12(1):1595. https://doi.org/10.1038/s41467-021-21840-x
19. Bahrami A, Clos LJ 2nd, Markley JL, Butcher SE, Eghbalnia HR (2012) RNA-PAIRS: RNA probabilistic assignment of imino resonance shifts. J Biomol NMR 52(4):289–302. https://doi.org/10.1007/s10858-012-9603-z
20. Ponce-Salvatierra A, Astha MK, Nithin C, Ghosh P, Mukherjee S, Bujnicki JM (2019) Computational modeling of RNA 3D structure based on experimental data. Biosci Rep 39(2). https://doi.org/10.1042/BSR20180430
21. Li B, Cao Y, Westhof E, Miao Z (2020) Advances in RNA 3D structure modeling using experimental data. Front Genet 11:574485. https://doi.org/10.3389/fgene.2020.574485
22. Boniecki MJ, Lach G, Dawson WK, Tomala K, Lukasz P, Soltysinski T, Rother KM, Bujnicki JM (2016) SimRNA: a coarse-grained method for RNA folding simulations and 3D structure prediction. Nucleic Acids Res 44(7):e63. https://doi.org/10.1093/nar/gkv1479
23. Piatkowski P, Kasprzak JM, Kumar D, Magnus M, Chojnowski G, Bujnicki JM (2016) RNA 3D structure modeling by combination of template-based method ModeRNA, template-free folding with SimRNA, and refinement with QRNAS. Methods Mol Biol 1490:217–235. https://doi.org/10.1007/978-1-4939-6433-8_14
24. Wirecki TK, Nithin C, Mukherjee S, Bujnicki JM, Boniecki MJ (2020) Modeling of three-dimensional RNA structures using SimRNA. Methods Mol Biol 2165:103–125. https://doi.org/10.1007/978-1-0716-0708-4_6
25. Wu P, Liu X, Yang L, Sun Y, Gong Q, Wu J, Shi Y (2017) The important conformational plasticity of DsrA sRNA for adapting multiple target regulation. Nucleic Acids Res 45(16):9625–9639. https://doi.org/10.1093/nar/gkx570

Chapter 21
Circular and Linear Dichroism for the Analysis of Small Noncoding RNA Properties
Florian Turbant, Kevin Mosca, Florent Busi, Véronique Arluison, and Frank Wien

Abstract
Useful structural information about the conformation of nucleic acids can be quickly acquired by circular and linear dichroism (CD/LD) spectroscopy. These techniques rely on the differential absorption of polarised light and are extremely sensitive to subtle changes in the structure of chiral biomolecules. Many CD analyses of DNA or DNA:protein complexes have been conducted, with substantial data acquisitions. Conversely, CD analyses of RNA are still scarce, despite the wide range of cellular functions that RNA fulfils. This chapter seeks to introduce the reader to the use of circular and linear dichroism and, in particular, to the use of synchrotron radiation for such samples. The use of these techniques on small noncoding RNA (sRNA) will be exemplified by analyzing changes in base stacking and/or helical parameters to understand sRNA structure and function, especially by following the dynamics of RNA:RNA annealing, but also to assess RNA stability or RNA:RNA alignment. The effect of RNA remodeling proteins will also be addressed. These analyses are especially useful to decipher the mechanisms by which an sRNA adopts the proper conformation, thanks to the action of proteins such as Hfq or ProQ, in the regulation of the expression of its target mRNAs.

Key words Chiro-optical spectroscopy, Circular/Linear Dichroism, Orientated Circular Dichroism, DsrA noncoding RNA, Hfq, RNA secondary and tertiary structure, Post-transcriptional regulation, RNA annealing

Florian Turbant and Kevin Mosca contributed equally with all other contributors.
Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0_21, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024

1 Introduction

Among light absorption spectroscopy techniques, circular dichroism (CD) appears particularly interesting for studying the structure of nucleic acids (NA) and proteins. This technique differs from the more widely used isotropic light absorption in the sense that it is based on measurements of differences in absorption of (left and

right) circularly polarized light resulting from absorption phenomena at the electronic level (Fig. 1a top). In practice, the CD technique is particularly sensitive to the most subtle conformational changes observed in chiral biomolecules. As far as nucleic acids (NA) are concerned, n–π* or π–π* transitions in the chiral structure are studied by CD in the far ultraviolet (UV) range between 320 and 170 nm. The CD signal depends mainly on the spatial arrangement of the nucleotides forming the mostly helical NA, and precisely on the absorption of the bases linked by pentoses (ribose or deoxyribose), which gives rise to asymmetry in the molecule. Although CD experiments can be conducted with bench-top spectrophotometers using conventional light sources (halogen or mercury), the extended spectral band from near UV to vacuum UV (VUV < 200 nm) accessible with synchrotron radiation at high photon fluxes (above 10^10 photons/mm^2/s/1 nm) allows for greater spectral detail and ultimately more structural information. Thus, while conventional CD experiments allow the identification of NA secondary structures (double helix B, A, and Z structures), or the presence of triple helices or quadruplexes [1, 2], Synchrotron Radiation Circular Dichroism (SRCD) extends the spectral band down to 170 nm for samples in aqueous solution while keeping the bandwidth constant at 1 nm. Access to the 170–220 nm window allows observation of electronic transitions such as the intermolecular charge transfer, which is crucial for fine local structural determinations of NA [3, 4]. Hence, SRCD is very useful to study NA alone, the effect of the local environment on NA structure, or interactions between NA and proteins [5–7]. In an effort to gather and centralize NA CD data, a publicly accessible databank has recently been created. It provides calibrated and normalized spectral information, experimental conditions, and NA three-dimensional structures (when known), with the associated bibliographic references. This nucleic acid circular dichroism database (NACDDB) is accessible at the following URL: https://genesilico.pl/nacddb/ and already includes about 50 RNAs [8]. Complementary to CD, linear dichroism (LD) is used to investigate the orientation of anisotropic samples, such as orientated NA biopolymers [9]. Phase modulation using a piezo-electric modulator creates the circularly or linearly polarized light. At each wavelength step, an electromagnetic quarter-wavelength shift (λ/4) is applied for CD and, traditionally, a half-wavelength shift (λ/2) for LD. For both phase-modulated beams, a photomultiplier (detector) measures the remaining photon intensity (absorption). The optically active sample (RNA) differentially absorbs the circularly (left and right) or linearly (parallel and perpendicular) polarized light, depending on the chirality and orientation of the sample, respectively (see Figs. 1a and 2).


Fig. 1 (a) Whereas circular dichroism (CD) investigates the interaction of circularly polarized light with RNA samples (top), linear dichroism (LD) investigates the interaction of linearly polarized light, which oscillates in one plane with a sinusoidal waveshape (bottom). In the latter case, light passes through an aligned, anisotropic sample such as an orientated biopolymer (here a single-stranded nucleic acid, in blue). Aligned samples may be obtained using stretched films of the absorbing molecule between two spectroscopic windows (b left) or with flow using a Couette cell (b right). (b) Demountable SRCD/LD circular CaF2 cells. The chiral sample is deposited on the CaF2 cell with an adapted optical path length (i). A few μL are enough to fill the cell. The cell is then placed in the sample holder, which has a diaphragm to control the size of the beam (ii). Here, the sample holder shown is for SRLD measurements. The cell is tightened using Teflon spacers (iii) and a screw-lid (iv). The sample holder is then mounted in the experimental chamber purged with nitrogen. Couette cells consist of a capillary, which is the moving or rotating part (v), and a fixed rod that is inserted in the capillary (vi)

Florian Turbant et al.

Fig. 2 Linear dichroism investigates the interaction of linearly polarized light with an aligned, optically active (anisotropic) sample. In this case, a horizontally polarized wave (red) passes through the sample (orientated RNA in blue), whereas the vertically polarized wave (blue) is attenuated or completely absorbed. By measuring the absorption of parallel and perpendicular polarized light at different orientations of the sample (rotation), it is possible to identify its orientation in the cell. In the figure, the cell is turned (e.g., every 90° from blue to orange, red, and green) to measure the linear absorbance of the aligned sample in different orientations

An animation showing CD, LD, and the effect of an absorbing medium is available on the EMANIM website (https://emanim.szialab.org) [10]. LD thus allows measuring the differential absorption of linearly polarized light by the orientated molecules (RNA) relative to the incoming beam. Therefore, LD provides spectral information on orientated samples [9]. Samples orientate themselves either upon loading of the optical cells or by being forced to align within a Couette cell, which typically rotates a tubular cell around a centered rod (Fig. 1b, right) [11]. If the observed sample is isotropic (randomly distributed), LD = 0, whereas as soon as a sample begins to be orientated, the LD signal increases. Among orientated biological samples, nucleic acids (DNA/RNA) and polymeric proteins (e.g., amyloids, cytoskeleton, and membrane proteins) are the most commonly analyzed by LD [12, 13]. For this reason, LD may be used to address the question of NA orientation during some biological processes, such as recombination or replication [14, 15]. Note that, as for SRCD, synchrotron radiation linear dichroism (SRLD) allows collecting spectra down to 170 nm with adapted cells and buffers (see Note 1).

Through this chapter, our objective is to detail how accurately structural information related to RNA secondary, tertiary, and quaternary structures can be acquired with SRCD and SRLD spectroscopies, including RNA dynamics, as well as thermal stability analysis.

2 Materials

2.1 Nucleic Acids, Proteins, Buffers, and Chemicals

1. When RNA manipulations are carried out, RNase-free grade solutions/reagents are mandatory at all steps. RNA resuspension is thus carried out in double-distilled, endotoxin-free sterile water or buffer (see Note 2).
2. RNA of interest: usually, 5–10 μL of a solution at 20 mM (expressed in nucleotides) are needed. The transcripts are usually analyzed after relaxation: the RNA is heated and slowly cooled down to 20 °C (see Note 3) [16].
3. Protein of interest (POI) with RNA-binding capability that is the object of the study (e.g., Hfq, ProQ, or others [17, 18]).
4. Optimal buffer or, if not optimized, a buffer that has already been tested and that allows measuring the annealing activity of the POI (Table 1).

To clean optical cells:

5. dd-water (not sterile).
6. 70% ethanol.
7. Hellmanex 2% (v/v) liquid alkaline solution.
8. RNaseZAP™ (Sigma-Aldrich) or equivalent RNase decontamination solution, used to clean the bench/working space.
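Because the RNA concentration above is expressed in nucleotides, it can help to convert it to a strand concentration, a mass concentration, and a total amount before preparing a sample. The following is a minimal sketch of that arithmetic, not part of the original protocol: the function names are hypothetical, and the average nucleotide mass of about 330 g/mol is an approximation assumed here for illustration.

```python
# Assumed average molar mass of one RNA nucleotide (approximate, for illustration only).
AVG_NT_MASS_G_PER_MOL = 330.0


def strand_concentration_mM(nucleotide_mM, length_nt):
    """Strand (molecule) concentration from a nucleotide concentration."""
    return nucleotide_mM / length_nt


def mass_concentration_mg_per_mL(nucleotide_mM, nt_mass=AVG_NT_MASS_G_PER_MOL):
    """Approximate mass concentration: mM x g/mol = mg/L, then /1000 for mg/mL."""
    return nucleotide_mM * nt_mass / 1000.0


def nucleotide_amount_nmol(nucleotide_mM, volume_uL):
    """Amount of nucleotides in the loaded volume: mM x uL = nmol."""
    return nucleotide_mM * volume_uL


if __name__ == "__main__":
    # Example: 20 mM (in nucleotides) of a hypothetical 30-nt RNA, 10 uL available.
    print(strand_concentration_mM(20.0, 30))      # strand concentration in mM
    print(mass_concentration_mg_per_mL(20.0))     # approximate mg/mL
    print(nucleotide_amount_nmol(20.0, 10.0))     # nmol of nucleotides
```

The 30-nucleotide length is only an example; any transcript length can be substituted.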

2.2 Materials for Synchrotron Radiation Circular Dichroism (SRCD) Spectroscopy

1. Examples of facilities include DISCO at Synchrotron SOLEIL (France), AU-CD at ASTRID2, Aarhus University (Denmark), B23 at Diamond (UK), HISOR (Japan), and CEDRO at SIRIUS (Brazil) [20–25].
2. Removable circular cells [25] (Fig. 1b, left). Because far-UV transparent devices are required for SRCD experiments, CaF2 optical cells should be used (see Note 1). Moreover, these cells offer optical pathlengths ranging from 1 to 50 μm and consume only μL volumes of sample. Cells should first be thoroughly cleaned using, successively, Hellmanex, alcohol, and dd-water. CaF2 cells require careful handling as they break easily.
3. (+)-Camphor-10-sulphonic acid (CSA) at 6 mg/mL in dd-water, used as calibration standard.

Table 1 Recommended buffers for SRCD analysis

Buffer or salt   pH/pKA                                        Transparent (nm)   Absorbance at 200 nm   Absorbance at 180 nm
Na2HPO4          6–7.2                                         210                0.05                   >0.5
NaH2PO4          6–7.2                                         195                0                      0.15
Acetate          3.6–5.6, pKA = 4.76                           220                0.17                   >0.5
Glycine-HCl      2.2–3.6, pKA = 2.35                           220                0.1                    >0.5
Glycine-NaOH     8.6–10.6, pKA = 9.78                          220                0.1                    >0.5
Tris–HCl         6.8–8.5, pKA = 8.06, strongly T° dependent    220                0.13                   >0.5
Hepes            6.8–8.2, pKA = 7.55                           230                0.5                    >0.5
Pipes            6.1–7.5, pKA = 6.8                            230                0.5                    >0.5
Mops             6.5–7.9, pKA = 7.2                            230                0.34                   >0.5
Mes              5.2–7.1, pKA = 6.15                           230                0.3                    >0.5
Cacodylate       5–7.4, pKA = 6.25                             210                0.2                    >0.5
NaCl             –                                             205                0.02                   >0.5

Adapted from [19]. Useful information for buffer preparation may also be found at https://www.aatbio.com/resources/buffer-preparations-and-recipes

4. CDtoolX software is used for averaging, smoothing, baseline subtraction, CSA calibration, and normalization of acquired CD spectra [26]. Absorbance spectra are obtained with A205.exe, included in the same software download.
5. Any software such as MATLAB, IGOR, KaleidaGraph (Synergy Software), or SigmaPlot (Systat Software) can be used to perform kinetic and thermodynamic analysis.

2.3 Materials for Synchrotron Radiation Linear Dichroism (SRLD) Spectroscopy

1. The setup for SRLD requires a rotation chamber as well as access to the parameters of the phase modulation and data acquisition. The rotation chamber is preferably automated and thermo-controlled by a Peltier heating device. Different mounting devices allow holding and rotating the cells in the centered beam. The piezo-electric modulator allows changing the phase required to obtain circular (λ/4) or linear (~λ/2) dichroism. The signal read-out frequency, controlled by the acquisition amplifier (lock-in amplifier), needs to be doubled for linear dichroism compared to circular dichroism.
2. A second way of orientating samples, using the Couette cell (Fig. 1b, right), is based on flow orientation. The sample solution (60 μL in a 100 μm pathlength) spins in the flow of the turning cell at speeds between 3000 and 6000 rpm.
3. Examples of facilities providing SRLD experiments are the same as for the SRCD stations mentioned above. When applying for beamtime, the choice of experimental setup (SRCD or SRLD) should be discussed with the beamline management.
4. Cells of short pathlengths should be used in combination with a moderate concentration of NA or NA:protein complexes.
5. Standard measurements for the rotation chamber are carried out using a drop of naphthalene placed on a clingfilm, which is then stretched and put between two sample windows. The aromatic rings of naphthalene align along the stretching axis.
6. For data treatment, CDtoolX is used as mentioned for SRCD above.

3 Methods

3.1 Acquisition and Treatment of SRCD Spectra

3.1.1 Sample Preparation and Loading

1. Wear clean gloves at all steps during sample/optical cell handling to avoid contamination with RNases. The working area should also be thoroughly cleaned with RNaseZAP™ (or equivalent solution). Hellmanex, 70% ethanol, and sterile dd-water are used successively to wash the optical cell.
2. As a working NA solution, ~20 mM in terms of nucleotide concentration is a good starting point for a 5 μm pathlength. The concentration needs to be optimized according to the RNA secondary structure, and signals are expected to be difficult to acquire with poorly structured RNAs (see Note 4).
3. When signals are monitored for RNA interacting with proteins, note that (1) the spectrum of the protein needs to be subtracted, (2) the concentration of the protein should be optimized to a certain RNA:protein ratio (see Notes 4 and 5), and (3) the concentration of the protein should be kept as low as possible.

Fig. 3 Example of an RNA SRCD spectrum, here a fragment of rpoS mRNA, whose expression is regulated by DsrA sRNA [27]. Three raw data scans are shown for the RNA (red thin lines) and its buffer (blue thin line). They must align between 300 and 320 nm, indicating a good realignment of the CaF2 cells for the two loadings (see Note 6). After averaging, the baseline spectrum is subtracted from the sample spectrum (blue thick line)

4. Sample (with RNA of interest) and reference (buffer) spectra should be acquired at least three times with the optical cell loaded in exactly the same orientation with respect to the beam (see Notes 6 and 7). The baseline (obtained with buffer) must display nearly zero ellipticity as long as non-chiral buffers are used (Table 1). It should also display the same values as the NA spectrum in the 300–320 nm region, where no ellipticity is observed. Finally, each experiment should be reproduced three times for each experimental condition (including baseline).

3.1.2 SRCD Data Acquisition

1. Ellipticity Θ is inferred from the absorption difference ΔA = A_left − A_right within the 170–320 nm acquisition window (see Note 8), applying the conversion Θ (mdeg) ≈ 32,980 × ΔA (equivalently, for molar quantities, [Θ] ≈ 3298 × Δε).
2. Both sample and baseline (buffer) spectra are measured at least three times between 320 and 170 nm with a 1 nm step and 1.2 s integration time.
3. The baseline spectrum is subtracted from the sample spectrum to obtain the baseline-subtracted sample spectrum (see Fig. 3).

4. A 6 mg/mL (+)-camphor sulfonic acid (CSA) solution is used to calibrate spectral amplitudes and verify the wavelength position. The spectrum of CSA in a 100 μm cell, after subtraction of a water baseline, should show a maximum ellipticity of 21 mdeg at 290.5 nm and a minimum of −42 mdeg at 192.5 nm, i.e., twice the 290.5 nm value in absolute terms. This twofold ratio used for calibration is especially useful when comparing amplitudes of spectra acquired on different sources [28].
5. To compare CD spectra of RNAs with different sizes (number of nucleotides) and/or concentrations, spectra are normalized using the sample concentration (in mol/L), the pathlength (in cm), and the length of the RNA (in nucleotides). The absorbance and the CD amplitude are adjusted by adapting the RNA concentration and the pathlength following the Beer–Lambert law.

3.1.3 Spectral Data Treatment
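Two of these operations, the CSA amplitude check and the normalization to a per-nucleotide Δε (see Subheading 3.1.2), can be sketched as follows. This is an illustrative sketch under stated assumptions and not the CDtoolX implementation: the function names are hypothetical, the amplitude scale factor is simply the ratio of the expected to the measured 290.5 nm value, and the conversion uses the standard relation Θ (mdeg) = 32,982 × ΔA.

```python
# Expected CSA ellipticity quoted in the text (6 mg/mL CSA, 100 um cell).
CSA_EXPECTED_MDEG_290_5 = 21.0
# Standard conversion between ellipticity and differential absorbance:
# theta (mdeg) = 32982 x delta A.
MDEG_PER_DELTA_A = 32982.0


def csa_check(theta_290_5, theta_192_5, tolerance=0.1):
    """Return (|192.5/290.5| ratio, amplitude scale factor, ratio_ok).

    The scale factor is a hypothetical multiplicative correction that would
    bring the measured 290.5 nm amplitude to the expected value.
    """
    ratio = abs(theta_192_5 / theta_290_5)
    scale = CSA_EXPECTED_MDEG_290_5 / theta_290_5
    return ratio, scale, abs(ratio - 2.0) <= tolerance


def delta_epsilon_per_nucleotide(theta_mdeg, strand_conc_M, pathlength_cm, length_nt):
    """Normalize ellipticity to delta epsilon (M^-1 cm^-1) per nucleotide."""
    delta_a = theta_mdeg / MDEG_PER_DELTA_A
    return delta_a / (strand_conc_M * length_nt * pathlength_cm)


if __name__ == "__main__":
    print(csa_check(20.0, -41.0))
    # Hypothetical 30-nt RNA at 20 mM nucleotides in a 5 um (5e-4 cm) cell.
    print(delta_epsilon_per_nucleotide(3.2982, 0.02 / 30, 5e-4, 30))
```

Normalizing per nucleotide rather than per strand makes spectra of RNAs of different lengths directly comparable, which is the purpose of the normalization step.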

1. SRCD spectra are analyzed using CDToolX [26, 29].
2. Sample (RNA) and reference (buffer) spectra are expected to coincide at the highest wavelengths (over a range of at least 5 nm, Fig. 3).
3. The Savitzky–Golay algorithm (using a 5–15 factor) may be used for smoothing after averaging. This gives high-quality, low-noise results (~0.2 mdeg) while preventing peak wavelength shifts or amplitude reduction for broad peaks.
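The treatment steps above (averaging of repeated scans, checking that sample and baseline coincide at high wavelengths, baseline subtraction, and smoothing) can be sketched in a few lines. This is a minimal illustration and not the CDToolX code: the function names are hypothetical, and the smoother is a fixed 5-point quadratic Savitzky–Golay filter chosen here as the simplest case of the 5–15 range mentioned in the text.

```python
def average_scans(scans):
    """Point-by-point mean of repeated scans recorded on the same wavelength grid."""
    n = len(scans)
    return [sum(values) / n for values in zip(*scans)]


def window_offset(wavelengths, sample, baseline, lo=300.0, hi=320.0):
    """Mean sample-minus-baseline difference where no ellipticity is expected."""
    diffs = [s - b for w, s, b in zip(wavelengths, sample, baseline) if lo <= w <= hi]
    return sum(diffs) / len(diffs)


def subtract_baseline(sample, baseline):
    """Baseline-subtracted spectrum."""
    return [s - b for s, b in zip(sample, baseline)]


# 5-point quadratic Savitzky-Golay weights (they sum to 35).
SG5 = (-3.0, 12.0, 17.0, 12.0, -3.0)


def savgol5(values):
    """Smooth with a 5-point Savitzky-Golay filter; the two end points on each side are kept as is."""
    out = list(values)
    for i in range(2, len(values) - 2):
        out[i] = sum(c * values[i + k - 2] for k, c in enumerate(SG5)) / 35.0
    return out


if __name__ == "__main__":
    wl = [320.0, 310.0, 300.0, 290.0, 280.0, 270.0, 260.0]
    rna = average_scans([[0.1, 0.1, 0.2, 1.0, 4.0, 9.0, 7.0],
                         [0.1, 0.3, 0.0, 1.2, 4.2, 9.2, 7.2]])
    buf = average_scans([[0.1, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1]] * 3)
    print(window_offset(wl, rna, buf))          # should be close to zero
    print(savgol5(subtract_baseline(rna, buf)))
```

A large value returned by `window_offset` would indicate that the two cell loadings were not realigned well enough for a reliable subtraction.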

3.1.4 RNA Spectra Analysis

As far as RNA molecules are concerned, special attention should be paid to the 320–170 nm region because several electronic transitions are observable in this part of the spectrum. Both base pairing and stacking (in an intra- and/or intermolecular structure) affect the 260–270 nm positive CD signal (Fig. 3) [2, 30, 31], a region belonging to a larger spectral band linked to n–π* transitions due to the sugar–phosphate backbone or to interactions between π and π* oscillators in RNA bases [32, 33]. Right-handed RNA molecules (A- or B-forms) can be recognized by a spectral signature with a negative band around 200–210 nm and a positive band near 185 nm, while left-handed RNA molecules (analogous to DNA Z-form [34]) present a negative band at 190 nm and a positive band at 180 nm [8].

[Fig. 4: panel (a), CD (mdeg) versus wavelength (nm); panel (b), CD (mdeg) versus temperature (°C)]

Fig. 4 (a) Thermal scan of an sRNA fragment (here a DsrA fragment) carried out every 3 °C from 10 °C (blue) to 95 °C (red). Note the isosbestic points, indicating the existence of an equilibrium between two coexisting species. (b) Melting curves obtained from the thermal scan at three wavelengths: 182, 208, and 265 nm. Tm values are obtained from data fits with a Boltzmann sigmoidal equation. The sigmoid at 182 nm (blue curve) reveals a melting point at 45.5 ± 1 °C, that at 208 nm (purple curve) a Tm at 38.2 ± 1 °C, and that at 265 nm (pink curve) a Tm at 54.1 ± 1 °C. Note that the differences in melting temperatures are due to structural changes occurring at different temperatures for the unwinding of the right-handed helix (208 and 182 nm) and the disruption of base pairing (265 nm). Note also the important shift from 265 to 275 nm, confirming the change in the base environment when temperature increases

3.1.5 Thermal Scans Acquisition
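The Boltzmann sigmoidal fit mentioned in the Fig. 4 legend can be illustrated with a small, dependency-free sketch. This is not the fitting routine used by the authors or by any of the software packages listed in Subheading 2.2: it is a simple grid search over Tm and slope (hypothetical grid values), exploiting the fact that for fixed Tm and slope the model is linear in its bottom and top plateaus.

```python
import math


def boltzmann(x, bottom, top, tm, slope):
    """Boltzmann sigmoid: y = bottom + (top - bottom) / (1 + exp((tm - x) / slope))."""
    return bottom + (top - bottom) / (1.0 + math.exp((tm - x) / slope))


def _plateaus_for(temps, signal, tm, slope):
    """For fixed tm and slope, solve the 2x2 least-squares problem for bottom and top."""
    s = [1.0 / (1.0 + math.exp((tm - t) / slope)) for t in temps]
    a11 = sum((1.0 - si) ** 2 for si in s)
    a12 = sum((1.0 - si) * si for si in s)
    a22 = sum(si * si for si in s)
    b1 = sum((1.0 - si) * y for si, y in zip(s, signal))
    b2 = sum(si * y for si, y in zip(s, signal))
    det = a11 * a22 - a12 * a12
    if det <= 1e-12:  # degenerate case: the transition lies outside the data
        return None
    bottom = (b1 * a22 - b2 * a12) / det
    top = (a11 * b2 - a12 * b1) / det
    sse = sum((bottom * (1.0 - si) + top * si - y) ** 2 for si, y in zip(s, signal))
    return bottom, top, sse


def fit_tm(temps, signal, slopes=(1, 2, 3, 4, 5, 6, 8, 10), step=0.1):
    """Grid search for Tm (in steps of `step` degrees) and slope; returns the best parameters."""
    t_min, t_max = min(temps), max(temps)
    n_steps = int(round((t_max - t_min) / step))
    best = None
    for i in range(n_steps + 1):
        tm = t_min + i * step
        for slope in slopes:
            solved = _plateaus_for(temps, signal, tm, slope)
            if solved is None:
                continue
            bottom, top, sse = solved
            if best is None or sse < best[0]:
                best = (sse, bottom, top, tm, slope)
    _, bottom, top, tm, slope = best
    return {"bottom": bottom, "top": top, "Tm": tm, "slope": slope}


if __name__ == "__main__":
    temps = list(range(10, 98, 3))  # 10 to 97 degrees C, every 3 degrees
    cd = [boltzmann(t, 30.0, 8.0, 45.5, 4.0) for t in temps]  # synthetic melting curve
    print(fit_tm(temps, cd))
```

For real data, a nonlinear least-squares routine from a dedicated package would normally replace the grid search; the sketch only shows how Tm enters the model at one wavelength, and the same procedure would be repeated at each wavelength of interest.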

1. Perform measurements using a Peltier-controlled sample holder (Fig. 4a) to obtain melting curves of RNA (see Note 9). Typically, scans are carried out from 10 to 97 °C (every 3 °C). Following the decrease of maxima and minima at a fixed or variable wavelength provides the transition midpoint Tm. Several transitions and Tm values may be observable, indicating more complex RNA structures comprising different sub-structures. The observation of an isosbestic point (Fig. 4a) usually indicates the existence of an equilibrium between two states when RNA unfolds.
2. Perform a final acquisition at the initial temperature after melting to verify the reversibility of the process.

3. Tm is very sensitive to buffer conditions and to the presence of divalent ions, and is strongly dependent on RNA concentration for intermolecular base pairing [35].
4. Keep in mind that hyperchromicity occurs during denaturation and that the high tension (HT) will increase. Ensure that the HT of the CD detector, which is proportional to the absorption of the sample, remains within the limit that defines the spectral cut-off (see Note 4).
5. Melting curves can be fitted using a Boltzmann sigmoidal equation, y = Bottom + (Top − Bottom)/(1 + exp((Tm − x)/Slope)) (Fig. 4b), but alternative equations with more parameters may also be used when needed. A reduction of the amplitude of the ~270 nm peak is related to the disruption of base stacking, while a reduction in amplitude of the ~180 nm peak is related to helix melting (here A-type helix).
6. UV absorption hypochromicity measurements at 260 nm (i.e., comparing the absorbance of the non-denatured and denatured NA) may be performed in parallel to estimate Tm, but will only give access to an averaged Tm. As shown in Fig. 4b, SRCD melting curve analysis allows the measurement of different Tm values, corresponding to A-helix melting and base stacking disruption at different temperatures. Note that major differences may be observed in some cases (Δ up to 20 °C) between the different maxima and minima.

Measuring the high tension applied to the photomultiplier tube during the wavelength stepping of a spectral scan allows calculating the pseudo-absorbance. The pseudo-absorbance is obtained from the difference between the logarithms of the high tension recorded for the baseline (buffer) and for the sample. A correction factor is applied to correlate the pseudo-absorbance with the absorbance measured on an absorption spectrometer. This allows following the hyperchromicity of double-stranded nucleic acids as they unfold.

3.1.6 Specific Case of Proteins Allowing RNA Annealing

Some proteins greatly influence RNA annealing [17, 18]. This process is usually analyzed using Electrophoretic Mobility Shift Assay (EMSA) or fluorescence (e.g., Förster Resonance Energy Transfer, FRET) experiments [36, 37]. However, RNA:RNA annealing can also be analyzed using SRCD (Fig. 5). Usually, identical concentrations of both RNAs should be used, while for protein:RNA complexes the protein concentration is kept limiting to avoid attenuation of the RNA spectrum (see Note 10). It is important to determine accurate kinetic constants of annealing in order to subtract the signal due to spontaneous annealing in the absence of the protein (if relevant) from that in the presence of the protein (Fig. 5b). In any case, the signal in the absence of protein must be kept minimal (see Note 11).
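The subtraction of spontaneous annealing and the estimate of an initial rate can be sketched as follows. This is an illustration only, with hypothetical function names and made-up numbers: ΔCD is taken as the CD at time t minus the CD at t = 0 at a single wavelength, and the initial rate is the least-squares slope over the first few time points, whose number (`n_points`) is an arbitrary choice that must be adapted to the actual kinetics.

```python
def delta_cd(cd_values):
    """Delta CD(t) = CD(t) - CD(t = 0) at one wavelength."""
    return [v - cd_values[0] for v in cd_values]


def subtract_spontaneous(with_protein, without_protein):
    """Remove the signal change due to spontaneous (protein-free) annealing."""
    return [p - s for p, s in zip(with_protein, without_protein)]


def initial_rate(times_min, signal_mdeg, n_points=4):
    """Least-squares slope (mdeg/min) over the first n_points of the kinetics."""
    t = times_min[:n_points]
    y = signal_mdeg[:n_points]
    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den


if __name__ == "__main__":
    times = [0.0, 10.0, 20.0, 30.0, 60.0]      # min (made-up values)
    cd_protein = [5.0, 6.0, 7.0, 8.0, 9.5]     # mdeg, RNAs with protein
    cd_alone = [5.0, 5.2, 5.4, 5.6, 6.0]       # mdeg, RNAs without protein
    net = subtract_spontaneous(delta_cd(cd_protein), delta_cd(cd_alone))
    print(initial_rate(times, net))            # mdeg/min
```

The slope is expressed in mdeg/min; relating it to the protein concentration is left to the analysis described in the Fig. 5 legend.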

[Fig. 5: panel (a), SRCD spectra; panel (b), ΔCD (mdeg) versus time (min)]

Fig. 5 (a) SRCD spectra showing sRNA:mRNA annealing (here annealing of fragments of DsrA sRNA to rpoS mRNA by Hfq [27]). The 270 and 180 nm CD-positive bands increase during the kinetics. The increase at 270 nm reveals the formation of base pairing, while the increase of the 180 nm CD band indicates that the helical structure of the phosphate backbone is also affected. Both types of signal changes indicate RNA secondary structure formation induced by the protein and allow measuring the kinetics of annealing (b). ΔCD refers to the CD at a given time minus the CD at t = 0. Furthermore, an apparent kinetic constant kcatapp can be measured under initial rate conditions. This constant is expressed in mdeg·M⁻¹·min⁻¹ and depends on protein concentration (CD units cannot be normalized to protein concentration as they are not expressed in the same unit, mdeg vs. mol·L⁻¹). At 180 nm, the apparent catalytic kinetic constant kcatapp was 0.083 mdeg·M⁻¹·min⁻¹

In the case of the presented spectra (Fig. 5), we can clearly identify a change in the RNA spectrum corresponding to RNA annealing. Changes in amplitudes at ~180 nm and ~220 nm reveal changes in the sugar–phosphate backbone structure, while changes in amplitudes at ~270 nm reveal changes in base pairing and stacking.

3.2 SRLD Spectra Acquisition and Treatment

The acquisition and data treatment for SRLD are similar to those for SRCD. Nevertheless, to observe alignment of NA molecules in short pathlengths or in orientated flow (Couette cell), the cell needs to be rotated either perpendicularly to the incoming centered beam or horizontally to the beam, respectively.

3.2.1 Sample Loading

Use of Circular Demountable Cells

A total of 4 μL of sample is loaded into the same CaF2 circular cell used for SRCD. The cell is mounted in a round copper sample holder containing the sample deposited on a CaF2 window covered by a lid (Fig. 1b). The sample holder is then mounted in a rotation chamber (Fig. 1b), which allows the rotation of the cell around the beam center. A stepping motor allows the automated rotation of the cell holder at given rotation angles. An air-pressurized clutch system ensures contact of the sample holder with the chamber to maintain the desired temperature, which is controlled by a Peltier.

1. For sample and optical cell handling, proceed with the same precautions as for SRCD (see above).
2. SRLD sample orientation is obtained by loading moderate concentrations of sample between optical cells with short pathlengths (Fig. 1b). Typically, ~1 mM of a single strand of ~30 nucleotides is squeezed in an optical path of less than 10 μm at volumes of 2–4 μL. This concentration may need to be adapted depending on the RNA structure.

Use of Couette Cell

1. For the Couette cell, in contrast, loading is around 60 μL in an optical path of 100 μm at concentrations of several tenths of mg/mL of RNA (~0.1 mM for a single strand of ~30 nucleotides). This concentration may need to be adapted depending on the RNA structure.
2. RNA:protein ratios should be adapted to the above concentrations for the circular cell or the Couette cell, respectively, keeping the protein concentration as low as possible (see Notes 4 and 5).
3. Sample (RNA) and baseline (buffer or dialysate) spectra should be acquired at least twice for each angle of rotation, and in triplicate for Couette cell data acquisition.

3.2.2 SRLD Data Acquisition and Treatment

In this section, only the analysis on thin film using a demountable circular cell (Fig. 1b) is presented. Detailed protocols using flow alignment and a Couette cell can be found in ref. 38. For SRLD measurements, the settings of the modulator (0.608 × λ) and acquisition (2 × f) are adapted as described previously [39]. Spectra are recorded in duplicate and averaged every 90° over 360°, which results in five spectra, including a repetition of the 0/360° angle to verify reproducibility (Fig. 2). LD results in strong spectral amplitudes, which change depending on the angle of the cell. The magnitude of LD is typically higher than the CD of the same molecule or complex (see Note 12). This allows the alignment of RNA/RNA or protein/RNA complexes to be observed [12]. The spectral acquisition is carried out over the spectral band for every rotation angle chosen. In addition, the Peltier control allows ramping the temperature to follow denaturation.

1. SRLD spectra are analyzed using CDToolX [29].
2. Nine spectra of the RNA sample are taken every 45° over 360° (or five spectra every 90°); the last spectrum ensures the reproducibility of the measurement at 0° (first spectrum). Spectra of the RNA sample and of the baseline (buffer or dialysate) are averaged separately for each angle. The corresponding baseline spectra are then subtracted from the sample spectra.
3. If orientation occurs, the maxima and minima of the signal should oscillate in opposite ways above and below the zero line (Fig. 2).
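The per-angle treatment described above can be sketched as follows. This is a minimal illustration with hypothetical function names and a made-up two-point spectrum, not the CDToolX procedure: spectra are averaged and baseline-subtracted for each rotation angle, the 0° and 360° spectra are compared for reproducibility, and the sign of the LD signal at one wavelength index is checked for alternation between successive 90° steps, as expected for an orientated sample.

```python
def mean_spectrum(scans):
    """Point-by-point mean of repeated scans."""
    n = len(scans)
    return [sum(values) / n for values in zip(*scans)]


def process_angles(sample_scans, baseline_scans):
    """Return {angle: baseline-subtracted mean LD spectrum} for every rotation angle."""
    result = {}
    for angle, scans in sample_scans.items():
        sample = mean_spectrum(scans)
        baseline = mean_spectrum(baseline_scans[angle])
        result[angle] = [s - b for s, b in zip(sample, baseline)]
    return result


def max_abs_difference(spec_a, spec_b):
    """Largest point-by-point deviation between two spectra (e.g., 0 and 360 degrees)."""
    return max(abs(a - b) for a, b in zip(spec_a, spec_b))


def signs_alternate(ld_by_angle, index, angles=(0, 90, 180, 270)):
    """True if LD at one wavelength index changes sign at each successive angle."""
    values = [ld_by_angle[a][index] for a in angles]
    return all(v1 * v2 < 0 for v1, v2 in zip(values, values[1:]))


if __name__ == "__main__":
    # Made-up duplicate scans (two wavelength points) for five angles.
    sample = {
        0: [[2.0, -1.0], [2.2, -1.2]],
        90: [[-2.0, 1.0], [-2.2, 1.2]],
        180: [[2.1, -1.1], [2.1, -1.1]],
        270: [[-2.1, 1.1], [-2.1, 1.1]],
        360: [[2.0, -1.0], [2.2, -1.2]],
    }
    buffer = {angle: [[0.0, 0.0], [0.0, 0.0]] for angle in sample}
    ld = process_angles(sample, buffer)
    print(signs_alternate(ld, 0), max_abs_difference(ld[0], ld[360]))
```

With spectra taken every 45° instead of every 90°, the `angles` argument would need to be adapted, since the sign is only expected to invert after a 90° rotation.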

4 Notes

1. For SRCD analysis, quartz cells (low-OH cells or high-performance quartz QS cells) can also be used. Nevertheless, for quartz cells with a 0.1 mm pathlength, the material has a transmission range from the visible to the UV with a cut-off of around 176 nm. When CaF2 cells are used, their transmission range includes the UV down to 125 nm.
2. To avoid interference with CD spectra, which are very sensitive to any absorbing compound, we recommend using water instead of buffers such as the common TE buffer.
3. The transcripts are usually analyzed after relaxation (i.e., the RNA is heated and slowly cooled down to 20 °C). Note that the structure and stability of the RNA can be greatly affected by the heating or cooling rate. Indeed, performing experiments at different heating rates can be used to develop kinetic models, which provide access to the activation energies involved and allow predicting the evolution of reactions under any given time or temperature conditions. This approach is called the isoconversional approach [40].
4. The high tension (HT) of the CD detector, which is proportional to the absorption of the sample, is used to determine the spectral cut-off. In general, the mid-height of the exponential HT increase at the end of the spectra (at the lower wavelengths) is used to determine the cut-off. In addition, the HT or absorption is an additional source of information on the sample.
5. In non-homogeneous samples, the probing wavelength may coincide with the size of aggregated samples. Light scattering occurs in those cases, which results in absorbance or solvent flattening (decrease of HT) as well as CD signal changes.

Absorbance flattening and scattering can therefore affect the measured CD amplitude in this region. As a rule of thumb, at the far-UV end of the spectrum, the CD signal (maximum or minimum) should be at least 5 nm away from the spectral cut-off. Otherwise, a redshift of the CD signal is inevitable.
6. Many buffers and salts absorb strongly in the far-UV region and so should be avoided. Because chloride absorbs in the far-UV, if used, the typical NaCl concentration should be 10–15 mM and no more than 50 mM (when using a 20 μm pathlength cell, to be able to reach wavelengths down to 185 nm). Alternatively, chloride can be replaced with fluoride to increase the salt concentration (up to twice the amount of chloride to maintain ionic strength). In addition, divalent cations such as Mg2+ can be used to facilitate RNA folding. MgSO4 is preferred to MgCl2.
7. Due to the high sensitivity of the method, the baseline spectra have to be recorded using the same cell, loading, and mounting procedure as for the sample. Ideally, the baseline and sample are loaded by the same experimentalist within an acceptable time frame to avoid spectral changes of the baseline due to optical or technical effects over time, such as heating–cooling of the optics or power failures. For screening of sample conditions and pathlength choice, starting with the sample will save time. Initial buffer baselines may be recorded before the sample data collection, which allows immediate absorbance measurements based on the HT.
8. The effects of far-UV light, especially between 190 and 170 nm, upon RNA molecules have to be investigated in view of possible radiation damage. In this region, UVC is known to cause alterations in the NA structure, leading to single-strand and even double-strand breakage. The impact of radiation damage is avoided by illuminating a rather large sample surface of 2 × 2 mm, which is considered viable for light sources with photon fluxes of less than 10^12 photons/s/mm². Verification by means of an action spectrum is recommended [41].
9. We recommend using phosphate buffer when performing thermodynamic analyses, as it is more temperature stable and deep-UV transparent than Tris-based buffers.
10. As both the protein and RNA spectra may change when the complex forms, which limits the possibility of calculating an accurate difference spectrum [31], we recommend keeping the protein concentration as low as possible in order to focus on the RNA conformational change induced by the protein, without observing the protein structural change.

11. In some cases, sRNA:mRNA spontaneous annealing can occur in the absence of any protein. To reduce this spontaneous annealing, the temperature can be lowered (e.g., to 15 °C).
12. LD signals are stronger than CD signals of the same sample because, in general, the absorbance of the aligned molecules, owing to their number and their higher linear absorbance coefficient, outweighs that of the molecules randomly distributed in solution that are probed by circular dichroism.

In conclusion, we recommend that readers of this chapter consider SRCD and SRLD as valuable tools for refining regulatory RNA-based interactions, with other RNAs or proteins, as well as for probing the local structural environments of the sRNA. While synchrotron radiation facilities may initially seem out of reach to many researchers in the life sciences, valuable guidance can be obtained from the user community and beamline scientists. Such collaborative efforts offer a remarkable opportunity to open up the field of structural deciphering in addressing significant biological puzzles. After reading this chapter, readers should be able to evaluate the feasibility and accessibility of these powerful techniques for their own unresolved molecular problems pertaining to the structural aspects of RNA.

Acknowledgments

We thank J.S. Bolduc (SANOFI) for the critical reading of the manuscript. This work was supported by Synchrotron SOLEIL, CNRS, and CEA. SRCD measurements were performed on DISCO beamline at the SOLEIL Synchrotron (proposals 99220186, 20220013, 20210819, and 20201013). This study contributes to the IdEx Université Paris Cité ANR-18-IDEX-0001 (VA, FB). This work was also supported by a public grant overseen by the French National Research Agency (ANR) as part of the “Investissements d’Avenir” program, through the “ADI 2021” project funded by the IDEX Paris-Saclay, ANR-11-IDEX-0003-02 (FT).

References

1. Gray DM, Ratliff RL, Vaughan MR (1992) Circular dichroism spectroscopy of DNA. Methods Enzymol 211:389–406
2. Holm AIS, Nielsen LM, Hoffmann SV, Nielsen SB (2010) Vacuum-ultraviolet circular dichroism spectroscopy of DNA: a valuable tool to elucidate topology and electronic coupling in DNA. Phys Chem Chem Phys 12(33):9581–9596

3. Johnson WC (1990) Electronic circular dichroism spectroscopy (CD) spectroscopic of nucleic acids. In: Biophysics, vol 1. Springer, Berlin/Heidelberg, pp 2275–2280
4. Wallace BA (2000) Synchrotron radiation circular-dichroism spectroscopy as a tool for investigating protein structures. J Synchrotron Radiat 7(Pt 5):289–295

5. Berova N, Nakanishi K, Woody RW (2000) Protein characterisation by synchrotron radiation circular dichroism spectroscopy. In: Circular dichroism principles and applications, vol 4. Wiley, New York, pp 317–370. https://doi.org/10.1017/S003358351000003X
6. Fasman GD (1996) Circular dichroism and the conformational analysis of biomolecules. Springer, New York, pp 317–370. https://doi.org/10.1017/S003358351000003X
7. Nordén B, Rodger A, Dafforn T (2010) Linear dichroism and circular dichroism: a textbook on polarized-light spectroscopy. RSC Publishing, Cambridge, pp 317–370. https://doi.org/10.1017/S003358351000003X
8. Cappannini A, Mosca K, Mukherjee S, Moafinejad SN, Sinden RR, Arluison V, Bujnicki J, Wien F (2022) NACDDB: nucleic acid circular dichroism database. Nucleic Acids Res. https://doi.org/10.1093/nar/gkac829
9. Rodger A, Dorrington G, Ang D (2016) Linear dichroism as a probe of molecular structure and interactions. Analyst 141:6490–6498
10. Szilágyi A (2019) EMANIM: Interactive visualization of electromagnetic waves. Web application available at https://emanim.szialab.org
11. Wu X, Martin B, Kellay H, Goldburg WI (1995) Hydrodynamic convection in a two-dimensional Couette cell. Phys Rev Lett 75(2):236–239. https://doi.org/10.1103/PhysRevLett.75.236
12. Sutherland JC (2017) Linear dichroism of DNA: characterization of the orientation distribution function caused by hydrodynamic shear. Anal Biochem 523:24–31. https://doi.org/10.1016/j.ab.2017.01.016
13. Rocha S, Kumar R, Norden B, Wittung-Stafshede P (2021) Orientation of alpha-synuclein at negatively charged lipid vesicles: linear dichroism reveals time-dependent changes in helix binding mode. J Am Chem Soc 143(45):18899–18906. https://doi.org/10.1021/jacs.1c05344
14. Kubiak K, Wien F, Yadav I, Jones NC, Vrønning Hoffmann S, Le Cam E, Cossa A, Geinguenaud F, van der Maarel JRC, Węgrzyn G, Arluison V (2022) Amyloid-like Hfq interaction with single-stranded DNA: involvement in recombination and replication in Escherichia coli. QRB Discov 3:e15. https://doi.org/10.1017/qrd.2022.1015
15. Wien F, Kubiak K, Turbant F, Mosca K, Arluison V (2022) Synchrotron radiation circular dichroism, a new tool to probe interactions between nucleic acids involved in the control of ColE1-type plasmid replication. Appl Sci 12:2639

16. Sun X, Li JM, Wartell RM (2007) Conversion of stable RNA hairpin to a metastable dimer in frozen solution. RNA 13(12):2277–2286
17. Vogel J, Luisi BF (2011) Hfq and its constellation of RNA. Nat Rev Microbiol 9(8):578–589
18. Smirnov A, Wang C, Drewry LL, Vogel J (2017) Molecular mechanism of mRNA repression in trans by a ProQ-dependent small RNA. EMBO J 36(8):1029–1045. https://doi.org/10.15252/embj.201696127
19. Schmid F (1989) Protein structure – a practical approach. IRL Press, Oxford
20. Refregiers M, Wien F, Ta HP, Premvardhan L, Bac S, Jamme F, Rouam V, Lagarde B, Polack F, Giorgetta JL, Ricaud JP, Bordessoule M, Giuliani A (2012) DISCO synchrotron-radiation circular-dichroism endstation at SOLEIL. J Synchrotron Radiat 19(Pt 5):831–835. https://doi.org/10.1107/S0909049512030002
21. Giuliani A, Jamme F, Rouam V, Wien F, Giorgetta JL, Lagarde B, Chubar O, Bac S, Yao I, Rey S, Herbeaux C, Marlats JL, Zerbib D, Polack F, Refregiers M (2009) DISCO: a low-energy multipurpose beamline at synchrotron SOLEIL. J Synchrotron Radiat 16(Pt 6):835–841. https://doi.org/10.1107/S0909049509034049
22. Miles AJ, Hoffmann SV, Tao Y, Janes RW, Wallace BA (2007) Synchrotron radiation circular dichroism (SRCD) spectroscopy: new beamlines and new applications in biology. Spectroscopy 21:245–255
23. Wallace BA, Gekko K, Hoffmann SV, Lin Y-H, Sutherland JC, Tao Y, Wien F, Janes RW (2011) Synchrotron radiation circular dichroism (SRCD) spectroscopy: an emerging method in structural biology for examining protein conformations and protein interactions. Parkinsonism Relat Disord A649:177–178
24. Wallace BA (2009) Protein characterisation by synchrotron radiation circular dichroism spectroscopy. Q Rev Biophys 42(4):317–370. https://doi.org/10.1017/S003358351000003X
25. Wien F, Wallace BA (2005) Calcium fluoride micro cells for synchrotron radiation circular dichroism spectroscopy. Appl Spectrosc 59(9):1109–1113. https://doi.org/10.1366/0003702055012546
26. Lees JG, Smith BR, Wien F, Miles AJ, Wallace BA (2004) CDtool-an integrated software package for circular dichroism spectroscopic data processing, analysis, and archiving. Anal Biochem 332(2):285–289


Florian Turbant et al.

27. Turbant F, Wu P, Wien F, Arluison V (2021) The amyloid region of Hfq riboregulator promotes DsrA:rpoS RNAs annealing. Biology (Basel) 10(9). https://doi.org/10.3390/biology10090900
28. Johnson WC (1990) CD spectra for nucleic acid monomers. In: Spectroscopic and kinetic data. Physical data I; Landolt-Börnstein – Group VII Biophysics, vol 1C. Springer, Berlin, Heidelberg
29. Miles AJ, Wallace BA (2018) CDtoolX, a downloadable software package for processing and analyses of circular dichroism spectroscopic data. Protein Sci 27(9):1717–1722. https://doi.org/10.1002/pro.3474
30. Moore DS, Williams AL Jr (1986) CD of nucleic acids: III. Calculated CD of RNAs from new A, U, G, and C transition-moment parameters. Biopolymers 25(8):1461–1491
31. Wien F, Geinguenaud F, Grange W, Arluison V (2021) SRCD and FTIR spectroscopies to monitor protein-induced nucleic acid remodeling. Methods Mol Biol RNA Remodeling Proteins 2209:87–108
32. Cech CL, Tinoco I Jr (1977) Circular dichroism calculations for double-stranded polynucleotides of repeating sequence. Biopolymers 16(1):43–65
33. Rizzo V, Schellman JA (1984) Matrix-method calculation of linear and circular dichroism spectra of nucleic acids and polynucleotides. Biopolymers 23(3):435–470. https://doi.org/10.1002/bip.360230305
34. Herbert A (2019) Z-DNA and Z-RNA in human disease. Commun Biol 2:7

35. Cayrol B, Geinguenaud F, Lacoste J, Busi F, Le Derout J, Pietrement O, Le Cam E, Regnier P, Lavelle C, Arluison V (2009) Auto-assembly of E. coli DsrA small noncoding RNA: molecular characteristics and functional consequences. RNA Biol 6(4):434–445
36. Arluison V, Hohng S, Roy R, Pellegrini O, Regnier P, Ha T (2007) Spectroscopic observation of RNA chaperone activities of Hfq in post-transcriptional regulation by a small non-coding RNA. Nucleic Acids Res 35(3):999–1006
37. Hwang W, Arluison V, Hohng S (2011) Dynamic competition of DsrA and rpoS fragments for the proximal binding site of Hfq as a means for efficient annealing. Nucleic Acids Res 39(12):5131–5139
38. Marrington R, Dafforn TR, Halsall DJ, MacDonald JI, Hicks M, Rodger A (2005) Validation of new microvolume Couette flow linear dichroism cells. Analyst 130(12):1608–1616
39. Wien F, Paternostre M, Gobeaux F, Artzner F, Refregiers M (2013) Calibration and quality assurance procedures at the far UV linear and circular dichroism experimental station DISCO. J Phys Conf Ser 425:122014
40. Sbirrazzuoli N (2020) Interpretation and physical meaning of kinetic parameters obtained from isoconversional kinetic analysis of polymers. Polymers 12:1280. https://doi.org/10.3390/polym12061280
41. Wien F, Miles AJ, Lees JG, Vronning Hoffmann S, Wallace BA (2005) VUV irradiation effects on proteins in high-flux synchrotron radiation circular dichroism spectroscopy. J Synchrotron Radiat 12(Pt 4):517–523

INDEX

A
ANNOgesic ..... 35–68

B
Bacteriophage ..... 25–33
Bacteroides ..... 101–103, 109, 115
Base pairing ..... 61, 74, 128–129, 141, 146, 147, 175, 183, 195, 196, 198, 199, 204, 222, 240, 243, 256, 308, 309, 333, 336, 339, 363, 364, 367, 407–410, 3884

C
Cas12a ..... 105
Chiro-optical spectroscopy ..... 399
Circular dichroism (CD) ..... 399–414
Compensatory mutations ..... 119, 128–129, 131, 141, 292
Computational predictions ..... 93
Conformational ensemble ..... 216–218, 220, 228
CRISPR interference (CRISPRi) ..... 101–115
Cross-linking, ligation, and sequencing of hybrids (CLASH) ..... 307–343

D
Differential Radial Capillary Action of Ligand Assay (DRaCALA) ..... 148, 151, 165
DsrA ..... 294, 295, 301, 302, 384, 385, 406, 408, 410
dsRNA hybridization ..... 175

E
eGFP reporter ..... 244
Electrophoretic Mobility Shift Assay (EMSA) ..... 120, 139, 141, 145–150, 152, 155–158, 166–167, 169, 170, 172, 175, 195–205, 409
Endoribonuclease ..... 8, 255
5’ End phosphorylation ..... 255–270
Enhanced sampling ..... 226–229

F
Fluorescence correlation spectroscopy (FCS) ..... 175–181
Free energy landscape ..... 210, 216–218, 220, 221, 226
Functional genomics ..... 102, 114

G
Genome annotation ..... 48, 49
GlmZ ..... 256, 257, 259–261, 263, 266, 268, 269, 301
Gram-negative bacteria ..... 3, 4, 11–13, 147, 184, 185, 308
Gram-positive bacteria ..... 3–9, 147, 184, 308

H
Hfq ..... 147, 165, 230, 231, 269, 270, 308, 334, 339, 357, 358, 363, 364, 376, 403, 410

I
Imino proton ..... 384, 388–391
Interactome ..... v, 307–343, 348, 363–378
In vitro-RNA synthesis ..... 149, 152–155, 157

L
lacZ fusion ..... 118–120, 129–131, 138, 298, 303
Linear Dichroism (LD) ..... 399–414
Long-flanking homology-PCR (LFH PCR) ..... 119, 127–129, 138

M
Membrane vesicle (MV) ..... 3–9, 11–22, 183–191
Microbiota ..... 101, 102
Modifications mapping ..... 274, 275
Molecular dynamics ..... 207, 210–215, 218–232
5’ Monophosphate (5’P) ..... 255
mRNA stability ..... 243, 244, 347
MS2 ..... 348, 365, 368, 369, 376, 377

Véronique Arluison and Claudio Valverde (eds.), Bacterial Regulatory RNA: Methods and Protocols, Methods in Molecular Biology, vol. 2741, https://doi.org/10.1007/978-1-0716-3565-0, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2024


BACTERIAL REGULATORY RNA: METHODS AND PROTOCOLS

N
Nano liquid chromatography-tandem mass spectrometry (nanoLC-MS/MS) ..... 274, 275, 278
NMR spectroscopy ..... 383, 384, 388, 391, 393, 394
Noncoding RNA ..... 26, 399–414
Northern blotting ..... 120, 123, 135–137, 139, 140, 142, 153, 257, 262, 266–268, 292, 295, 350–353, 359, 373

O
OMVs-derived RNA extraction ..... 186, 188
OMVs extraction and purification ..... 17–18
Orientated circular dichroism ..... 400
Outer membrane vesicles (OMVs) ..... 3, 4, 11–23, 183–191

P
Phage regulatory RNAs ..... 25–33
Plasmid vectors ..... 121, 125, 126, 139
Post-transcriptional regulation ..... 75, 101
Prokaryotic RNAs ..... 25–33
Pseudouridine ..... 273–285
Pull-down ..... 354, 356, 359, 360

Q
qRT-PCR ..... 12, 21, 118, 120, 121, 125, 137, 139, 140, 143
Quality control analyses ..... 310

R
RapZ ..... 256–258, 260, 269, 270
Reporter gene fusions ..... 117–122, 125–131
Retapamulin-assisted ribosome profiling (Ribo-RET) ..... 75, 77, 80, 89–95, 97
Riboregulation ..... 364
Ribo-seq ..... 74–77, 89–91, 93, 94, 97
Ribosome profiling ..... 73–97
RNA annealing ..... 409–410
RNA-binding proteins (RBPs) ..... 230, 240, 307–310, 321, 325, 327, 333, 347–360, 363–375
RNA chaperones ..... 120, 121, 125, 138–139, 147–148, 151–152, 165–169, 292
RNA crosstalk ..... 183–191
RNA decay ..... 252, 255, 265, 347
RNA extraction ..... 3–9, 14, 79, 80, 82, 86, 186, 188, 261, 265, 295, 354, 366, 374
RNA Interactome Capture (RIC) ..... 348, 349, 357
RNA processing ..... 239
RNA-protein interaction ..... 307, 349, 352
RNA-RNA and RNA-protein EMSA ..... 166–167
RNA-RNA interaction ..... 146, 149, 155–158, 167, 173, 175, 195–205, 308, 310, 313, 333
RNA secondary and tertiary structure ..... 220
RNA secondary structure probing ..... 150, 152, 168
RNase E ..... 255–257, 269, 363–364
RNA-seq ..... 12, 21, 26, 29, 32, 35–68, 80, 84, 89–91, 186, 189, 190, 351, 359, 368, 374, 375
RNA structure ..... 150, 168–169, 172, 224, 388, 391, 411
RyhB ..... 76, 301

S
Sequencing library ..... 25–33, 112, 113
SimRNA ..... 384, 386, 391–395
Sinorhizobium ..... 239–253, 364, 365, 374
Sinorhizobium meliloti ..... 239–253, 364, 365, 374
Small RNAs (sRNAs) ..... 11, 25, 35, 74, 101, 117, 145, 183, 195, 207, 240, 256, 291, 308, 363, 383, 406
Spike-in transcript ..... 245–247
sRNA library screen ..... 293, 300
Staphylococcus ..... 4, 73–97, 307, 310
Staphylococcus aureus ..... 4, 73–97, 307, 310
Start codons ..... 91–95, 297, 375, 376
Structural modeling ..... 383–395

T
Transcriptional attenuator ..... 240
Translation initiation ..... 75, 92–94
Translation regulation ..... 73–98
Trans-sRNA ..... 364
5’ Triphosphate ..... 255, 385

U
UV cross-linking ..... 307, 319, 333, 348, 351, 353, 360