Plant Circular RNAs: Methods and Protocols 1071616447, 9781071616444

​This volume presents readers with up-to-date protocols, bioinformatics toolkits, and reference material for understandi


English Pages 206 [198] Year 2021


Table of contents :
Preface
Contents
Contributors
Chapter 1: Identification and Characterization of Plant Circular RNAs
1 Introduction
2 Materials
2.1 Total RNA Isolation
2.2 qRT-PCR for circRNA Identification and Detection
2.3 RNase R Treatment
2.4 Northern Blotting
2.5 Separation of Nuclear and Cytoplasmic Fractions
3 Methods
3.1 Total RNA Isolation from Plant Tissues
3.1.1 RNA Isolation Using Manual RNA Zol Reagent
3.1.2 RNA Isolation Using Commercialized TRIzol
3.1.3 RNA Isolation from Plant Sample Rich in Polysaccharides and Polyphenols Using CTAB-LiCl
3.2 CircRNA Identification
3.2.1 Design of Outward-Facing Primers for circRNA Identification and Detection
3.2.2 RNase R Treatment
3.2.3 Identification of the Components of circRNA by PCR
3.2.4 Northern Blotting
3.2.5 Detect Location of circRNA
4 Notes
References
Chapter 2: Experimental Strategies for Studying the Function of Plant CircRNAs
1 Introduction
2 Materials
2.1 Knockdown of CircRNAs
2.2 Overexpression of CircRNAs
2.3 Investigation of the Full-Length and BackSplice Sites of CircRNAs and Quantification of Their Relative Expression Levels
2.4 Pull-Down of the CircRNA-Interacting Proteins
3 Methods
3.1 Knockdown of CircRNAs Using Artificial microRNAs Targeting Their Junction Sites
3.2 Overexpression of CircRNAs in Plants
3.3 Investigation of the Full-Length and BackSplice Sites of CircRNAs
3.4 Quantification of CircRNAs by RT-qPCR
3.5 Identification of CircRNA Binding Proteins by Pull-down Assay
4 Notes
References
Chapter 3: Generation of Transgenic Rice Expressing CircRNA and Its Functional Characterization
1 Introduction
2 Materials
2.1 Buffers
2.2 Bacterial Culture Reagents
2.3 Plant Material and Plant Tissue Culture Reagents
2.4 Enzymes
2.5 Nucleic Acid Reagents
2.6 Chemical Reagents
2.7 Antibiotic Stocks
2.8 Kits
2.9 Equipment
2.10 Miscellaneous
3 Methods
3.1 Plant Genomic DNA Isolation
3.2 Plant Total RNA Isolation
3.3 Enriching CircRNA
3.4 RNA Sequencing and Data Analysis for CircRNA Identification and Flanking Sequence Determination
3.5 CircRNA Validation Using Divergent PCR
3.6 Vector Construction
3.6.1 Amplification and Cloning of the Full-Length CircRNA
3.6.2 Cloning of the Flanking Intron Downstream of the CircRNA in Clone-A1
3.6.3 Cloning of the Flanking Intron Upstream of the CircRNA in Clone-A2
3.6.4 Cloning the CircRNA Expression Cassette into pCAMBIA1305.1
3.7 Triparental Mating
3.8 Agrobacterium DNA Isolation
3.9 Confirmation of Transconjugants
3.10 Transformation into Plants
3.11 Confirmation of Intact T-DNA Integration by Southern Hybridization
3.11.1 Interpretation
3.12 Determination of T-DNA Integration Site in Plant Genome
3.12.1 Construction of Genome Walker Library
3.12.2 Designing the Gene Specific Primers
3.12.3 Primary PCR
3.12.4 Secondary/Nested PCR
3.12.5 Cloning the Nested PCR Product and Sequencing
3.13 Expression Analysis of CircRNA in Transgenic Plants
3.13.1 Semiquantitative RT-PCR
3.13.2 Quantitative Reverse Transcription PCR (RT-qPCR)
3.13.3 Northern Hybridization
3.14 Functional Validation of Overexpressed CircRNA
3.14.1 CircRNA-Mediated Transcriptional Regulation of Parental Gene
3.14.2 Predicting CircRNA-miRNA-mRNA Network
3.14.3 Analysis of Phenotypic Changes in Transgenic Plants
3.14.4 Role of circRNA During Stress Conditions
3.14.5 CircRNA Mediated RBP Sequestration
4 Notes
References
Chapter 4: Identification of Circular RNAs by Multiple Displacement Amplification and Their Involvement in Plant Development
1 Introduction
2 Materials
2.1 Seed Sterilization, Germination, Growth, and Sample Collection
2.2 RNA Extraction and CircRNA Enrichment
2.3 RT and MDA
2.4 Next Generation Sequencing (NGS) and Data Analysis
2.5 Validation and Expression of CircRNA, miRNA, and mRNA
3 Methods
3.1 Seed Sterilization, Germination, Growth, and Sample Collection
3.2 Plant Total RNA Isolation
3.3 CircRNA Enrichment
3.4 cDNA Synthesis
3.5 Multiple Displacement Amplification (MDA)
3.6 NGS of MDA Products and its Data Analysis
3.7 Predicting Differentially Expressed (DE)-CircRNAs
3.8 Validation of Selected Potential DE-CircRNAs
3.9 CircRNA Expression at Different Growth Stages
3.9.1 Divergent Quantitative Reverse Transcription PCR (RT-qPCR)
3.9.2 Northern Hybridization
3.10 CircRNA Parental Gene Expression Studies
3.11 Identification and Comparative Expression Analysis of CircRNA Interacting miRNA(s)
3.12 Validation of CircRNA Interacting miRNA(s)
3.13 Identification and Expression Analysis of miRNA Interacting mRNA(s)
4 Notes
References
Chapter 5: Identification of Intronic Lariat-Derived Circular RNAs in Arabidopsis by RNA Deep Sequencing
1 Introduction
2 Materials
2.1 Plant Material
2.2 Extraction of Total RNAs
2.3 Preparation of the Circular RNA-seq Libraries
2.4 Instrument or Equipment
2.5 Software
3 Methods
3.1 Extraction of Total RNAs
3.2 Preparation of the Circular RNA-seq Libraries
3.3 Collection of the Published RNA-seq Data
3.4 Computational Analysis of the RNA-seq Profiles
3.5 Identification of Stable Lariat RNAs
4 Notes
References
Chapter 6: Identification and Functional Characterization of Viroid Circular RNAs
1 Introduction
2 Materials
3 Methods
3.1 RNA Extraction
3.2 Cellulose Chromatography
3.3 Polysaccharide Removal with 2-Methoxyethanol
3.4 Viroid Inoculation
4 Notes
References
Chapter 7: Circular RNA Databases
1 Introduction
2 CircRNA Databases
3 Plant circRNA Databases
4 Discussion
References
Chapter 8: NGS Methodologies and Computational Algorithms for the Prediction and Analysis of Plant Circular RNAs
1 Introduction
2 Current Studies in the Field of Plant CircRNAs
2.1 Methods to Identify and Characterize CircRNAs in Plants
2.1.1 RNA Isolation, Library Preparation and Sequencing
2.1.2 CircRNA Identification Tools
2.2 Bioinformatic Resources for Plant CircRNAs
2.2.1 Differential Expression
2.2.2 Functional Analyses
2.2.3 Regulation Networks
2.2.4 Public Repositories
3 Validation
4 Conclusions
References
Chapter 9: Computational Analysis of Transposable Elements and CircRNAs in Plants
1 Introduction
2 Methods and Databases for CircRNA and TEs
2.1 CircRNA-Related Computational Tools
2.2 CircRNA-Related Databases
2.3 Transposable Elements Classification Tools
2.4 Transposable Elements-Related Databases
3 Exploratory Analysis on CircRNA and TE Data
3.1 Methodology of the Analysis
3.2 Sequence Length Analysis
3.3 Dinucleotide Analysis
3.4 GC Content Analysis
4 Identification of CircRNA-TE Associations
4.1 Methodology
4.2 Results and Discussion
5 Conclusion
6 Notes
References
Chapter 10: Constructing CircRNA-miRNA-mRNA Regulatory Networks by Using GreenCircRNA Database
1 Introduction
2 Materials
2.1 Hardware and Software Requirements
3 Methods
3.1 Data Collection
3.2 Identification of CircRNA
3.3 Removal of Redundant CircRNA
3.4 Construction of CircRNA-miRNA-mRNA Regulatory Networks
3.5 A Case Study
4 Notes
References
Chapter 11: Methods for Predicting CircRNA-miRNA-mRNA Regulatory Networks: GreenCircRNA and PlantCircNet Databases as Study Ca...
1 Introduction
2 Materials
2.1 CircFunBase
2.2 CropCircDB
2.3 GreenCircRNA
2.4 PlantcircBase
2.5 PlantCircNet
3 Methods
3.1 GreenCircRNA
3.2 PlantCircNet
3.3 Comparing Results
4 Notes
References
Index

Methods in Molecular Biology 2362

Luis María Vaschetto Editor

Plant Circular RNAs Methods and Protocols

METHODS IN MOLECULAR BIOLOGY

Series Editor John M. Walker School of Life and Medical Sciences University of Hertfordshire Hatfield, Hertfordshire, UK

For further volumes: http://www.springer.com/series/7651

For over 35 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in readily-reproducible step-by-step fashion, opening with an introductory overview, a list of the materials and reagents needed to complete the experiment, and followed by a detailed procedure that is supported with a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice. These hallmark features were introduced by series editor Dr. John Walker and constitute the key ingredient in each and every volume of the Methods in Molecular Biology series. Tested and trusted, comprehensive and reliable, all protocols from the series are indexed in PubMed.


Edited by

Luis María Vaschetto Oncativo, Argentina


ISSN 1064-3745 ISSN 1940-6029 (electronic) Methods in Molecular Biology ISBN 978-1-0716-1644-4 ISBN 978-1-0716-1645-1 (eBook) https://doi.org/10.1007/978-1-0716-1645-1 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature. The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.

Preface

Circular RNA (circRNA) is one of the most recently recognized members of the family of noncoding regulatory RNAs. Due to their circular structure, circRNAs are more stable than linear RNAs and resistant to degradation by exonucleases. Interestingly, it has been shown that circRNAs control gene expression at different levels. These transcripts exert their functional roles by acting as miRNA sponges, competing endogenous RNAs, regulators of parental gene expression, modifiers of mRNA splicing, etc. In plants, circRNAs have recently emerged as important players in gene regulation both in development and in response to environmental stimuli. However, research into plant circRNAs is still in its infancy. They have the ability to open new avenues both in basic and applied research, and it is therefore imperative to increase our understanding of this unique class of noncoding RNAs.

This volume of Methods in Molecular Biology aims to present a set of the latest protocols, bioinformatic toolkits, and reference material for understanding circRNAs in plants. For this book, some of the world’s top scientists in RNA biology have summarized basic concepts and methodologies used for prediction/identification, validation, and analysis of circRNAs and their targets in plants. State-of-the-art methods, detailed protocols, and descriptive chapters are highlighted as follows: procedures for circRNA identification and characterization in plants, strategies for circRNA generation, databases, next generation sequencing (NGS) technologies in circRNA research, functional relationships between plant circRNAs and transposable elements (TEs), and, finally, bioinformatic tools and pipelines for the analysis of circRNA-miRNA-mRNA regulatory networks. Each chapter is a unique piece and will serve as an important reference for readers so they can explore and apply the latest developments made in this field.

Luis María Vaschetto
Oncativo, Argentina

Contents

Preface
Contributors
1 Identification and Characterization of Plant Circular RNAs
Yan-Fei Zhou, Yu-Meng Sun, and Yue-Qin Chen
2 Experimental Strategies for Studying the Function of Plant CircRNAs
Yan-Zhao Feng and Yang Yu
3 Generation of Transgenic Rice Expressing CircRNA and Its Functional Characterization
Priyanka Sharma, Ashirbad Guria, Sankar Natesan, and Gopal Pandi
4 Identification of Circular RNAs by Multiple Displacement Amplification and Their Involvement in Plant Development
Ashirbad Guria, Priyanka Sharma, Sankar Natesan, and Gopal Pandi
5 Identification of Intronic Lariat-Derived Circular RNAs in Arabidopsis by RNA Deep Sequencing
Taiyun Wang, Xiaotuo Zhang, and Binglian Zheng
6 Identification and Functional Characterization of Viroid Circular RNAs
José-Antonio Daròs
7 Circular RNA Databases
Peijing Zhang and Ming Chen
8 NGS Methodologies and Computational Algorithms for the Prediction and Analysis of Plant Circular RNAs
Laura Carmen Terrón-Camero and Eduardo Andrés-León
9 Computational Analysis of Transposable Elements and CircRNAs in Plants
Liliane Santana Oliveira, Andressa Caroline Patera, Douglas Silva Domingues, Danilo Sipoli Sanches, Fabricio Martins Lopes, Pedro Henrique Bugatti, Priscila Tiemi Maeda Saito, Vinicius Maracaja-Coutinho, Alan Mitchell Durham, and Alexandre Rossi Paschoal
10 Constructing CircRNA–miRNA–mRNA Regulatory Networks by Using GreenCircRNA Database
Jingjing Zhang, Ruiqi Liu, and Guanglin Li
11 Methods for Predicting CircRNA–miRNA–mRNA Regulatory Networks: GreenCircRNA and PlantCircNet Databases as Study Cases
Nureyev F. Rodrigues and Rogerio Margis
Index

Contributors

EDUARDO ANDRÉS-LEÓN • Bioinformatics Unit, Instituto de Parasitología y Biomedicina “Lopez-Neyra”, Consejo Superior de Investigaciones Científicas (IPBLN-CSIC), Armilla, Granada, Spain
PEDRO HENRIQUE BUGATTI • Department of Computer Science, Federal University of Technology—Paraná (UTFPR), Cornélio Procopio, PR, Brazil
MING CHEN • Department of Bioinformatics, State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, China; Zhejiang Laboratory for Systems & Precision Medicine, Zhejiang University Medical Center, Hangzhou, China; James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou, China
YUE-QIN CHEN • Guangdong Provincial Key Laboratory of Plant Resources, State Key Laboratory for Biocontrol, School of Life Science, Sun Yat-Sen University, Guangzhou, P. R. China
JOSÉ-ANTONIO DARÒS • Instituto de Biología Molecular y Celular de Plantas (Consejo Superior de Investigaciones Científicas-Universitat Politècnica de València), Valencia, Spain
DOUGLAS SILVA DOMINGUES • Department of Computer Science, Federal University of Technology—Paraná (UTFPR), Cornélio Procopio, PR, Brazil; Group of Genomics and Transcriptomes in Plants, Instituto de Biociências de Rio Claro, Universidade Estadual Paulista (UNESP), Rio Claro, SP, Brazil
ALAN MITCHELL DURHAM • Department of Computer Science, Instituto de Matemática e Estatística, Universidade de São Paulo (USP), Cidade Universitária, SP, Brazil
YAN-ZHAO FENG • Agro-Biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, China
ASHIRBAD GURIA • Department of Plant Biotechnology, School of Biotechnology, Madurai Kamaraj University, Madurai, India
GUANGLIN LI • Key Laboratory of Ministry of Education for Medicinal Plant Resource and Natural Pharmaceutical Chemistry, College of Life Sciences, Shaanxi Normal University, Xi’an, China
RUIQI LIU • Key Laboratory of Ministry of Education for Medicinal Plant Resource and Natural Pharmaceutical Chemistry, College of Life Sciences, Shaanxi Normal University, Xi’an, China
FABRICIO MARTINS LOPES • Department of Computer Science, Federal University of Technology—Paraná (UTFPR), Cornélio Procopio, PR, Brazil
VINICIUS MARACAJA-COUTINHO • Centro de Modelamiento Molecular, Biofísica y Bioinformática—CM2B2, Facultad de Ciencias Quimicas y Farmaceuticas, Universidad de Chile, Santiago, Chile
ROGERIO MARGIS • Laboratory of Genome and Plant Populations, Department of Biophysics, Universidade Federal do Rio Grande do Sul—UFRGS, Porto Alegre, RS, Brazil; PPGBCM, Center of Biotechnology, Universidade Federal do Rio Grande do Sul—UFRGS, Porto Alegre, RS, Brazil; Department of Biophysics, Universidade Federal do Rio Grande do Sul—UFRGS, Porto Alegre, RS, Brazil
SANKAR NATESAN • Department of Genetic Engineering, School of Biotechnology, Madurai Kamaraj University, Madurai, India
LILIANE SANTANA OLIVEIRA • Department of Computer Science, Federal University of Technology—Paraná (UTFPR), Cornélio Procopio, PR, Brazil; Embrapa Soja, Londrina, Paraná, Brazil
GOPAL PANDI • Department of Plant Biotechnology, School of Biotechnology, Madurai Kamaraj University, Madurai, India
ALEXANDRE ROSSI PASCHOAL • Department of Computer Science, Federal University of Technology—Paraná (UTFPR), Cornélio Procopio, PR, Brazil
ANDRESSA CAROLINE PATERA • Department of Computer Science, Federal University of Technology—Paraná (UTFPR), Cornélio Procopio, PR, Brazil
NUREYEV F. RODRIGUES • Laboratory of Genome and Plant Populations, Department of Biophysics, Universidade Federal do Rio Grande do Sul—UFRGS, Porto Alegre, RS, Brazil; PPGBCM, Center of Biotechnology, Universidade Federal do Rio Grande do Sul—UFRGS, Porto Alegre, RS, Brazil
PRISCILA TIEMI MAEDA SAITO • Department of Computer Science, Federal University of Technology—Paraná (UTFPR), Cornélio Procopio, PR, Brazil
DANILO SIPOLI SANCHES • Department of Computer Science, Federal University of Technology—Paraná (UTFPR), Cornélio Procopio, PR, Brazil
PRIYANKA SHARMA • Department of Genetic Engineering, School of Biotechnology, Madurai Kamaraj University, Madurai, India
YU-MENG SUN • Guangdong Provincial Key Laboratory of Plant Resources, State Key Laboratory for Biocontrol, School of Life Science, Sun Yat-Sen University, Guangzhou, P. R. China
LAURA CARMEN TERRÓN-CAMERO • Bioinformatics Unit, Instituto de Parasitología y Biomedicina “Lopez-Neyra”, Consejo Superior de Investigaciones Científicas (IPBLN-CSIC), Armilla, Granada, Spain
TAIYUN WANG • State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Collaborative Innovation Center of Genetics and Development, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai, China
YANG YU • Agro-Biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, China
JINGJING ZHANG • Key Laboratory of Ministry of Education for Medicinal Plant Resource and Natural Pharmaceutical Chemistry, College of Life Sciences, Shaanxi Normal University, Xi’an, China
PEIJING ZHANG • Department of Bioinformatics, State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, China; Zhejiang Laboratory for Systems & Precision Medicine, Zhejiang University Medical Center, Hangzhou, China
XIAOTUO ZHANG • State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Collaborative Innovation Center of Genetics and Development, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai, China
BINGLIAN ZHENG • State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Collaborative Innovation Center of Genetics and Development, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai, China
YAN-FEI ZHOU • Guangdong Provincial Key Laboratory of Plant Resources, State Key Laboratory for Biocontrol, School of Life Science, Sun Yat-Sen University, Guangzhou, P. R. China

Chapter 1
Identification and Characterization of Plant Circular RNAs
Yan-Fei Zhou, Yu-Meng Sun, and Yue-Qin Chen

Abstract
Recent studies have reported that circular RNAs (circRNAs) are a newly discovered type of ubiquitous, abundant, and stable noncoding RNAs (ncRNAs) that play important roles in various biological processes in eukaryotic organisms. However, the biological functions of circRNAs in plants remain largely unknown and need further study. Identifying plant circRNAs from plant circRNA databases or from sequencing analysis is a first step toward investigating their functions. Here, we provide a series of protocols for circRNA identification, covering circular form, composition features, and location, that can be used even in plant tissues that are rich in polysaccharides and polyphenols and from which RNA is difficult to extract.

Key words Plant circRNA, CircRNA identification, RNA isolation, RNase R, Northern blotting, CircRNA location

Luis María Vaschetto (ed.), Plant Circular RNAs: Methods and Protocols, Methods in Molecular Biology, vol. 2362, https://doi.org/10.1007/978-1-0716-1645-1_1, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021

1 Introduction

Circular RNAs (circRNAs) are a novel class of abundant and ubiquitous noncoding RNAs (ncRNAs) that arise when pre-RNA is spliced in reversed order, so that the 3′ and 5′ ends are covalently joined [1]. CircRNAs can be derived from any genomic location, including exonic, intronic, and intergenic regions [2]. Unlike linear transcripts, a circRNA is highly stable because its circular form lacks open ends and therefore resists degradation by RNase R, a strong 3′–5′ exoribonuclease that degrades linear transcripts [3, 4]. Recent studies have shown that some circRNAs perform important functions in various biological processes rather than being transcriptional noise [2, 5, 6]. For example, an Arabidopsis circRNA derived from the sixth exon of SEPALLATA3 (SEP3) regulates splicing of the mRNA of its parental gene (SEP3) by forming an R-loop (an RNA:DNA hybrid), which influences floral phenotypes [6]. A recent study reported that an intronic circRNA generated from AT5G37720 regulates gene expression and Arabidopsis development [5].

However, compared with the comprehensive investigation of circRNAs in animals, circRNA research in plants is still in its infancy. Genome-wide identification of circRNAs, performed recently in several plant species, reveals that circRNAs are also ubiquitously expressed and abundant in plants [7, 8]. Their expression often depends on cell type, tissue, and developmental stage, suggesting potentially specific functions of circRNAs in different biological processes in plants [6, 9]. However, unlike animal tissues, some plant tissues are rich in polysaccharides and polyphenols, which interfere with RNA extraction and hamper the subsequent identification and investigation of circRNAs. Accurate experimental identification and detection of circRNAs are important for further studies of circRNA function. Moreover, complete composition features and locations may provide useful information for mechanistic studies of circRNAs. In this chapter, we provide a series of protocols for circRNA identification, covering circular form, composition features, and location. These protocols will help users to identify and study circRNAs even in plant tissues with high polysaccharide and polyphenol contents.

2 Materials

2.1 Total RNA Isolation

1. Liquid nitrogen.
2. RNase-free water: Mix 3d H2O with 1/1000 volume of DEPC. Protect from light with aluminum foil and stir overnight. Sterilize by autoclaving.
3. 1 M sodium citrate (pH 7.0): Mix 29.4 g trisodium citrate dihydrate with 80 ml RNase-free water and adjust pH with HCl (see Note 1). Make up to 100 ml with RNase-free water. Store at 4 °C.
4. 10% sarkosyl: Mix 5 g sarkosyl with 45 ml RNase-free water and mix well. Make up to 50 ml with RNase-free water. Store at 4 °C.
5. RNA Zol reagent: Mix 47.2 g guanidinium isothiocyanate, 2.5 ml 1 M sodium citrate (pH 7.0), and 5 ml 10% sarkosyl with 75 ml RNase-free water and mix well. Make up to 100 ml with RNase-free water. Add 720 μl beta-mercaptoethanol and mix well. Store at 4 °C.
6. Phenol (water-saturated). Store at 4 °C.
7. 2 M sodium acetate (NaAc) solution (pH 4.0): Mix 27.2 g sodium acetate trihydrate and 50 ml glacial acetic acid with 10 ml RNase-free water and mix well. Adjust pH with glacial acetic acid to pH 4.0. Make up to 100 ml with RNase-free water. Store at −20 °C.
8. 3 M sodium acetate (NaAc) solution (pH 5.2): Mix 20.4 g sodium acetate trihydrate with 20 ml RNase-free water and mix well. Adjust pH with glacial acetic acid to pH 5.2. Make up to 50 ml with RNase-free water. Store at −20 °C.
9. Chloroform.
10. Isoamyl alcohol.
11. Ethanol.
12. 75% ethanol: Mix 75 ml ethanol with 25 ml RNase-free water. Store at 4 °C.
13. TRIzol. Store at 4 °C.
14. CTAB extraction buffer: 2% CTAB (hexadecyltrimethylammonium bromide) (w/v), 2% PVP (polyvinylpyrrolidone) (w/v), 100 mM Tris–HCl (pH 8.0), 2 M NaCl, 25 mM EDTA (pH 8.0), and 2% beta-mercaptoethanol (add beta-mercaptoethanol before use).
15. 10 M lithium chloride (LiCl) solution: Mix 8.478 g LiCl with 15 ml RNase-free water. Make up to 20 ml with RNase-free water.

2.2 qRT-PCR for circRNA Identification and Detection

1. Reverse transcription kit.
2. SYBR Green PCR kit.
3. Gel purification kit.
4. E. coli strain DH5α.

2.3 RNase R Treatment

1. 10× RNase R buffer: 200 mM Tris–HCl (pH 8.0), 1 mM MgCl2, 1 M KCl. Store at −20 °C.
2. RNase R enzyme.
3. 80% ethanol: Mix 80 ml ethanol with 20 ml RNase-free water. Store at 4 °C.
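The 10× RNase R buffer listed above is used at 1× in the reaction. As a minimal sketch of that arithmetic, the snippet below derives the 1× working concentrations and the volume of 10× buffer for a reaction; the function names and the 20 μl reaction volume are illustrative assumptions, not values taken from this protocol.

```python
# 10x RNase R buffer composition from the materials list, in mM
TEN_X_RNASE_R_BUFFER_MM = {
    "Tris-HCl (pH 8.0)": 200.0,
    "MgCl2": 1.0,
    "KCl": 1000.0,  # 1 M
}


def working_concentrations(stock_mm, fold=10.0):
    """Return concentrations (mM) after diluting a stock `fold` times."""
    return {name: conc / fold for name, conc in stock_mm.items()}


def stock_buffer_volume_ul(reaction_volume_ul, fold=10.0):
    """Volume of concentrated buffer (ul) needed for one reaction."""
    return reaction_volume_ul / fold


if __name__ == "__main__":
    for name, conc in working_concentrations(TEN_X_RNASE_R_BUFFER_MM).items():
        print(f"{name}: {conc:g} mM at 1x")
    # 20 ul is a hypothetical reaction volume chosen only for illustration
    print(f"10x buffer per 20 ul reaction: {stock_buffer_volume_ul(20.0):g} ul")
```

The same two helpers apply to the other concentrated stocks in this section (10× MOPS, 20× SSC, 20× SSPE) by changing the `fold` argument.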

2.4 Northern Blotting

1. T7 high yield RNA synthesis kit.
2. Biotin-16-UTP. Store at −20 °C.
3. Agarose powder.
4. Hybond-N+ membrane.
5. 3 mm paper (Whatman paper).
6. Thick blot papers (Bio-Rad).
7. Whatman™ TurboBlotter Transfer System.
8. 10× MOPS buffer (pH 7.0): Mix 4.18 g of MOPS powder with 70 ml 3d H2O and adjust pH with NaOH (see Note 2). Add 0.66 ml 3 M NaAc (pH 5.2) and 2 ml 0.5 M EDTA, mix well. Make up to 100 ml with 3d H2O. Filter using 0.22 μm syringe filters. Store at room temperature.
9. Denaturing RNA buffer: 125 μl 10× MOPS, 225 μl 37% formaldehyde, 625 μl deionized formamide, 50 μl of 0.5 mg/ml ethidium bromide (EB). Store at −20 °C in a 1.5 ml conical tube, protected from light with aluminum foil.
10. 20× SSC solution (pH 7.2): 3 M NaCl and 0.3 M Na3C6H5O7. Adjust the pH to 7.2.
11. Prehybridization buffer: 5× SSC, 5× Denhardt, 50 mM PBS (pH 6.8), 0.4% SDS, 0.5 mg/ml salmon sperm DNA, 12.5 mg/ml dextran sulfate, 50% formamide. Mix all reagents except formamide and dissolve at 50 °C. Then add formamide and mix.
12. Hybridization buffer: 5× SSC, 1× Denhardt, 20 mM PBS (pH 6.8), 2% SDS, 0.2 mg/ml salmon sperm DNA, 0.5 g/ml dextran sulfate, 50% formamide. Mix all reagents except formamide and dissolve at 50 °C. Then add formamide and mix.
13. 20× SSPE solution (pH 7.4): 3 M NaCl, 0.2 M NaH2PO4, and 0.02 M EDTA. Adjust pH with NaOH.
14. Wash buffer 1: 2× SSPE solution containing 0.5% SDS.
15. Wash buffer 2: 0.2× SSPE solution containing 0.2% SDS.
16. Chemiluminescent EMSA kit (Beyotime Biotechnology, GS009).

2.5 Separation of Nuclear and Cytoplasmic Fractions

1. 0.5 M Tris–HCl (pH 7.4): Mix 3.0285 g Tris with 40 ml RNase-free water and adjust pH with HCl. Make up to 50 ml with RNase-free water. Store at 4  C. 2. 0.5 M Tris–HCl (pH 8.0): Mix 3.0285 g Tris with 40 ml RNase-free water and adjust pH with HCl. Make up to 50 ml with RNase-free water. Store at 4  C. 3. 4 M KCl: Mix 5.964 g KCl with 15 ml RNase-free water. Make up to 20 ml with RNase-free water. Store at 4  C. 4. 1 M MgCl2: Mix 10.15 g MgCl2·6H2O with 40 ml RNase-free water. Make up to 50 ml with RNase-free water. Store at 4  C. 5. 0.5 M EDTA (pH 8.0): Add 9.306 g EDTA·2H2O into 40 ml RNase-free water and blend. Adjust pH with NaOH. Make up to 50 ml with RNase-free water. Store at 4  C. 6. 50% glycerol (v/v): Mix 100 ml sterilized glycerol and 100 ml RNase-free water. Store at 4  C. 7. Sucrose. 8. 1 M DTT (1,4-Dithiothreitol) solution: Mix 0.775 g DTT with 2 ml RNase-free water. Make up to 5 ml with RNasefree water (see Note 3). Store at 20  C. 9. Protease inhibitor: 100  plant Cocktail.

Plant circRNA Identification

5

10. Triton X-100.
11. Beta-mercaptoethanol.
12. RNase inhibitor.
13. Miracloth.
14. Cell lysis buffer: 20 mM Tris–HCl (pH 7.4), 20 mM KCl, 2.5 mM MgCl2, 2 mM EDTA, 25% glycerol, 250 mM sucrose, 0.5% 1 M DTT, 1× cocktail, and 100 U/ml RNase inhibitor. Mix 2 ml of 0.5 M Tris–HCl (pH 7.4), 250 μl of 4 M KCl, 125 μl of 1 M MgCl2, 200 μl of 0.5 M EDTA (pH 8.0), 25 ml of 50% glycerol (v/v), and 4.2787 g sucrose with 18 ml RNase-free water (see Note 4). Make up to 50 ml with RNase-free water. Add DTT, cocktail, and RNase inhibitor before use.
15. Resuspension buffer: 20 mM Tris–HCl (pH 7.4), 2.5 mM MgCl2, 25% glycerol, 0.2% Triton X-100, 0.5% 1 M DTT, and 1 U/ml RNase inhibitor. Mix 2 ml of 0.5 M Tris–HCl (pH 7.4), 125 μl of 1 M MgCl2, 25 ml of 50% glycerol (v/v), and 100 μl Triton X-100 with 20 ml RNase-free water (see Note 4). Make up to 50 ml with RNase-free water. Add DTT and RNase inhibitor before use.
16. RNA precipitation solution (RPS): Mix 0.5 ml 3 M NaAc (pH 5.2) with 9.5 ml ethanol. Store at −20 °C.
17. Gradient buffer 1: 10 mM Tris–HCl (pH 8.0), 10 mM MgCl2, 250 mM sucrose, 1% Triton X-100, 0.5% 1 M beta-mercaptoethanol, 1× cocktail, and 10 U/ml RNase inhibitor. Mix 1 ml of 0.5 M Tris–HCl (pH 8.0), 500 μl of 1 M MgCl2, 4.2787 g sucrose, and 500 μl Triton X-100 with 40 ml RNase-free water. Make up to 50 ml with RNase-free water. Add beta-mercaptoethanol, cocktail, and RNase inhibitor before use.
18. Gradient buffer 2: 10 mM Tris–HCl (pH 8.0), 2 mM MgCl2, 1700 mM sucrose, 0.15% Triton X-100, 0.5% 1 M beta-mercaptoethanol, 1× cocktail, and 10 U/ml RNase inhibitor. Mix 1 ml of 0.5 M Tris–HCl (pH 8.0), 100 μl of 1 M MgCl2, 29.099 g sucrose, and 75 μl Triton X-100 with 35 ml RNase-free water. Make up to 50 ml with RNase-free water. Add beta-mercaptoethanol, cocktail, and RNase inhibitor before use.
19. Protein wash buffer: 95% ethanol with 0.3 M guanidine hydrochloride (GuHCl). Mix 2.8659 g guanidine hydrochloride with 100 ml of 95% ethanol.
20. 1× Blue Loading Buffer: 62.5 mM Tris–HCl (pH 6.8), 2% (w/v) SDS, 10% glycerol, 0.01% (w/v) bromophenol blue, and 0.1 M DTT.
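The stock volumes listed for the cell lysis buffer (item 14) follow from C1V1 = C2V2, and the sucrose mass from molarity × volume × molecular weight. The sketch below is a minimal illustration of that arithmetic and is not part of the original protocol; the function names and the sucrose molecular weight of 342.3 g/mol are assumptions introduced here.

```python
# Sketch only: recompute the cell lysis buffer recipe (item 14) for 50 ml.
# Function names and the sucrose molecular weight are assumptions, not from the protocol.

SUCROSE_MW = 342.3  # g/mol (assumed value)


def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Stock volume from C_stock * V_stock = C_final * V_final (same units for both concentrations)."""
    return final_conc * final_volume_ml / stock_conc


def solid_mass_g(final_molar, final_volume_ml, mol_weight):
    """Mass of a solid needed for a given molarity: mol/l * l * g/mol."""
    return final_molar * (final_volume_ml / 1000.0) * mol_weight


# (final concentration, stock concentration); mM for salts, % (v/v) for glycerol
CELL_LYSIS_LIQUIDS = {
    "0.5 M Tris-HCl (pH 7.4)": (20.0, 500.0),
    "4 M KCl": (20.0, 4000.0),
    "1 M MgCl2": (2.5, 1000.0),
    "0.5 M EDTA (pH 8.0)": (2.0, 500.0),
    "50% glycerol": (25.0, 50.0),
}


def recipe(liquids, final_volume_ml):
    """Return the stock volume (ml) of every liquid component."""
    return {name: stock_volume_ml(final, stock, final_volume_ml)
            for name, (final, stock) in liquids.items()}


volumes = recipe(CELL_LYSIS_LIQUIDS, 50.0)
for name, vol in volumes.items():
    print(f"{name}: {vol * 1000:.0f} ul")
print(f"sucrose: {solid_mass_g(0.250, 50.0, SUCROSE_MW):.4f} g")
```

The same two functions can be reused for the resuspension and gradient buffers by swapping in their final concentrations.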

3 Methods

Follow these precautions in all RNA-associated experiments to avoid ribonuclease (RNase) contamination and RNA degradation.
1. Wear gloves when handling reagents and RNA samples, because skin is a common source of RNases. Change gloves frequently.
2. Use RNase-free tubes and pipette tips.
3. Use RNase-free water.
4. Use appropriate reagents to remove RNase contamination from nondisposable items (such as the pH detector system) and work surfaces.
5. Keep all reagents tightly sealed when not in use. Close tube caps and tip-box lids immediately after use.

3.1 Total RNA Isolation from Plant Tissues

Choosing a suitable plant RNA isolation method is important for obtaining high-quality RNA for circRNA identification. The organic reagent method, using isothiocyanate–phenol–chloroform or commercial TRIzol, is a universal way to extract plant total RNA, including circRNA. However, this method does not work well when plant tissues are rich in polysaccharides and polyphenols. Here, we provide alternative ways to obtain high-quality total RNA from different plant tissues.

3.1.1 RNA Isolation Using Manual RNA Zol Reagent

1. Pour liquid nitrogen into the mortar and pestle to chill them. Continue chilling until the liquid nitrogen stops boiling.
2. Transfer less than 100 mg of plant tissue into the chilled mortar, which should contain enough liquid nitrogen to cover the sample. Grind the sample into a fine powder for 15–30 min using the chilled pestle and liquid nitrogen. Keep the sample frozen by adding liquid nitrogen until the RNA Zol reagent is added.
3. Pour an appropriate amount of liquid nitrogen onto the sample and add RNA Zol reagent at a ratio of 1 ml RNA Zol per 100 mg plant tissue. Grind the frozen RNA Zol reagent into powder with the pestle and blend it thoroughly with the plant sample powder.
4. Add water-saturated phenol and 2 M NaAc solution (pH 4.0) to the mortar (1 ml phenol and 100 μl NaAc solution per 1 ml RNA Zol) without adding extra liquid nitrogen or grinding. Leave the mortar in a chemical fume hood at room temperature until the sample has thawed entirely.


5. Collect the sample and transfer it into 1.5 ml conical tubes, 1 ml per tube. Add 200 μl chloroform–isoamyl alcohol (49:1) per ml of sample. Vortex for 1 min. Incubate for 5 min on ice and centrifuge at 12,000 × g for 10–15 min at 4 °C.
6. Carefully transfer the upper aqueous phase to a new tube, avoiding the interphase. Add an equal volume of phenol–chloroform–isoamyl alcohol (50:49:1) to the aqueous phase. Vortex for 1 min. Incubate for 5 min on ice and centrifuge at 12,000 × g for 10–15 min at 4 °C. Repeat this step 2–3 times to remove as much protein as possible.
7. Carefully transfer the upper aqueous phase to a new tube, avoiding the interphase. Add an equal volume of isopropyl alcohol and mix gently by inverting the tube. Incubate for 30–60 min on ice and centrifuge at 12,000 × g for 20 min at 4 °C.
8. Discard the supernatant carefully and air-dry the pellet (see Note 5). Resuspend in an appropriate volume of RNase-free water (approximately 70 μl per 100 mg sample), add one-tenth volume of 3 M NaAc solution (pH 5.2) and three volumes of ethanol, mix, and incubate overnight at −20 °C.
9. Centrifuge at 12,000 × g for 20 min at 4 °C and discard the supernatant carefully. Wash the pellet twice with 75% ethanol. Centrifuge at 7500 × g for 5 min at 4 °C and discard the supernatant carefully. Air-dry the pellet or dry it in a concentrator.
10. Resuspend the pellet in an appropriate volume of RNase-free water. The sample may be stored at −80 °C or used immediately in downstream RNA detection methods.

3.1.2 RNA Isolation Using Commercialized TRIzol

1. Grind the plant sample into a fine powder with liquid nitrogen as described in Subheading 3.1.1. Pour an appropriate amount of liquid nitrogen onto the powder and add 1 ml TRIzol per 100 mg sample. Grind the frozen TRIzol into powder and blend it thoroughly with the plant sample powder. Leave the mortar in a chemical fume hood at room temperature until the sample has thawed entirely.
2. Collect the sample and transfer it into 1.5 ml conical tubes, 1 ml per tube. Incubate for at least 5 min at room temperature for sufficient lysis.
3. Add 200 μl chloroform per ml TRIzol and shake vigorously by hand or vortex for at least 30 s. Incubate for 10 min at room temperature and centrifuge at 12,000 × g for 15 min at 4 °C.
4. Carefully transfer the upper aqueous phase to a new tube, avoiding the interphase. Add an equal volume of isopropyl alcohol and mix gently by inverting the tube. Incubate for 10 min at room temperature and centrifuge at 12,000 × g for 10 min at 4 °C.
5. Discard the supernatant carefully and as thoroughly as possible (see Note 5). Wash the pellet with 75% ethanol, centrifuge at 7500 × g for 5 min at 4 °C, and discard the supernatant carefully.
6. Wash the pellet with 100% ethanol, centrifuge at 7500 × g for 5 min at 4 °C, and discard the supernatant carefully. Air-dry the pellet or dry it in a concentrator.
7. Resuspend the pellet in an appropriate volume of RNase-free water. The sample may be stored at −80 °C or used immediately in downstream RNA detection methods.

3.1.3 RNA Isolation from Plant Sample Rich in Polysaccharides and Polyphenols Using CTAB-LiCl

1. Grind 100–200 mg of plant sample into a fine powder with liquid nitrogen as described in Subheading 3.1.1. Keep the sample frozen by adding liquid nitrogen until the extraction buffer is added.
2. Prewarm the extraction buffer to 65 °C before use. Quickly transfer the frozen powder to a 2 ml tube containing 700 μl prewarmed extraction buffer. Vortex for 30 s to mix the sample and incubate at 65 °C for 5–30 min, depending on the plant species. Vortex several times during incubation.
3. Cool to room temperature. Add an equal volume of chloroform–isoamyl alcohol (24:1) and shake vigorously by hand or vortex for 1 min. Incubate for 10 min on ice and centrifuge at 12,000 × g for 10 min at 4 °C.
4. Carefully transfer the upper aqueous phase to a new 2 ml tube, avoiding the interphase. Add an equal volume of chloroform–isoamyl alcohol (24:1) again and shake vigorously by hand or vortex for 1 min. Incubate for 10 min on ice and centrifuge at 12,000 × g for 10 min at 4 °C.
5. Carefully transfer the upper aqueous phase to a new 1.5 ml conical tube, avoiding the interphase. Add one-third volume of 10 M LiCl, mix gently, and store overnight at −20 °C.
6. Centrifuge at 12,000 × g for 20 min at 4 °C and discard the supernatant carefully. Air-dry the pellet and resuspend it in 1 ml TRIzol. Add 200 μl chloroform and shake vigorously by hand or vortex for 1 min. Incubate for 10 min at room temperature and centrifuge at 12,000 × g for 15 min at 4 °C.
7. Carefully transfer the upper aqueous phase to a new tube, avoiding the interphase. Add an equal volume of isopropyl alcohol and mix gently by inverting the tube. Incubate for 10 min at room temperature and centrifuge at 12,000 × g for 10 min at 4 °C.
8. Discard the supernatant carefully and as thoroughly as possible. Wash the pellet with 75% ethanol, centrifuge at 7500 × g for 5 min at 4 °C, and discard the supernatant carefully.
9. Wash the pellet with 100% ethanol, centrifuge at 7500 × g for 5 min at 4 °C, and discard the supernatant carefully. Air-dry the pellet or dry it in a concentrator.
10. Resuspend the pellet in an appropriate volume of RNase-free water. The sample may be stored at −80 °C or used immediately in downstream RNA detection methods.

3.2 CircRNA Identification

Candidate circRNAs obtained from sequencing analysis or computational prediction need to be validated by several experimental approaches. Polymerase chain reaction (PCR) using outward-facing primers followed by Sanger sequencing is the simplest method to amplify the predicted backspliced junction of a circRNA. Stability tests using RNase R, an exonuclease that degrades RNA in the 3′ to 5′ direction, are another important way to confirm the circular form of a circRNA. PCR and Northern blotting are effective methods to identify the composition and variants of circRNAs. In addition, determining circRNA localization by separating nuclear and cytoplasmic fractions provides useful information for further investigation of circRNA functions.

3.2.1 Design of Outward-Facing Primers for circRNA Identification and Detection

1. Obtain the backspliced junction (BSJ) sequences of the circRNAs of interest from a plant circRNA database or from sequencing analysis based on the specific plant genome and transcriptome.
2. Analyze the circRNA sequences, including appropriate flanking sequences around the BSJ, using BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi) or Primer Premier 5 (https://primer-premier-5.software.informer.com/).
3. Design several pairs of outward-facing primers that specifically amplify the BSJ sequence of the target circRNA without nonspecific products. Make the forward primer (F primer) correspond to the sequence upstream of the BSJ site (close to the 3′ end of the linear transcript), and make the reverse primer (R primer) correspond to the sequence downstream of the BSJ site (close to the 5′ end of the linear transcript) (see Notes 6 and 7) (Fig. 1).
4. Synthesize complementary DNA (cDNA) from total RNA for real-time quantitative PCR (qRT-PCR) analyses using M-MLV Reverse Transcriptase or a commercial reverse transcription kit according to the manufacturer's instructions.
5. Perform quantitative PCR with melt curve analysis using a SYBR Green method. Use the following cycling conditions: 95 °C for 30 s; 40 amplification cycles of 95 °C for 10 s followed by 60 °C for 30 s; and a final melting cycle of 95 °C for 10 s, 65 °C for 1 min, and 97 °C for 1 s. Choose the primer pairs whose products give a single, normal melt curve.
6. Further analyze the PCR products generated by the specific primers using gel electrophoresis. Choose the primer pairs whose products give a single band of the correct size.
7. Recover the bands using a gel purification kit. Ligate the specific products into cloning vectors and transform them into E. coli strain DH5α.
8. Detect positive clones by PCR and analyze them further by Sanger sequencing.
9. Choose the primer pairs that specifically amplify the BSJ sequences of the target circRNAs without any nonspecific products.

Fig. 1 Schematic depicting the principle of outward-facing primers, which specifically amplify the BSJ of circRNAs rather than the linear transcripts generated from the circRNA parental genes

3.2.2 RNase R Treatment

1. Prepare two 1.5 ml conical tubes labeled with “Test” and “NC.”


2. Add 2 μl 10× RNase R buffer, 4 μg total RNA, and RNase-free water to each tube. Add 20 U RNase R enzyme to the “Test” tube and an equal volume of water to the “NC” tube in place of the enzyme as a control. The final volume in each tube is 20 μl.
3. Vortex gently and incubate for 15 min at 37 °C (see Note 8).
4. Add 380 μl RNase-free water to bring the volume to 400 μl.
5. Add 400 μl of phenol–chloroform–isoamyl alcohol (25:24:1) to each tube. Vortex for 30 s and centrifuge at 12,000 × g for 10 min at room temperature to separate the phases.
6. Carefully transfer ~350 μl of the aqueous phase to a new 1.5 ml conical tube, avoiding the interphase.
7. Add 400 μl of chloroform. Vortex for 30 s and centrifuge at 12,000 × g for 10 min at room temperature to separate the phases.
8. Carefully transfer ~300 μl of the aqueous phase to a new 1.5 ml conical tube, avoiding the interphase. Add one-tenth volume of 3 M NaAc solution (pH 5.2) and three volumes of ethanol, mix, and keep at −20 °C overnight to precipitate the RNA.
9. Centrifuge at 12,000 × g for 30 min at 4 °C and discard the supernatant carefully. Wash the pellet once with 80% ethanol. Centrifuge at 12,000 × g for 15 min at 4 °C and discard the supernatant carefully. Air-dry the pellet.
10. Resuspend the pellet in 20 μl of RNase-free water. The sample may be stored at −80 °C or used immediately in downstream RNA detection methods.
11. Synthesize complementary DNA (cDNA) from equal volumes of the treated RNA samples for real-time quantitative PCR (qRT-PCR) analyses using M-MLV Reverse Transcriptase or a commercial reverse transcription kit.
12. Perform quantitative PCR with the specific outward-facing primers, using a SYBR Green method, to measure the change in circRNA abundance. In parallel, measure the changes in linear transcripts from the circRNA parental genes or from housekeeping genes as controls.

3.2.3 Identification of the Components of circRNA by PCR

1. Design a pair of outward-facing primers that correspond to sequences within the same exon. Make the F primer correspond to the sequence upstream of the BSJ site and the R primer correspond to the sequence downstream of the BSJ site (see Note 9) (Fig. 2).
2. Synthesize complementary DNA (cDNA) from total RNA for PCR analyses using M-MLV Reverse Transcriptase or a commercial reverse transcription kit according to the manufacturer's instructions.
3. Perform PCR according to the relevant instructions and analyze the PCR products by gel electrophoresis. Recover the bands of the predicted sizes using a gel purification kit.
4. Ligate the specific products into cloning vectors and transform them into E. coli strain DH5α.
5. Detect positive clones by PCR and analyze them further by Sanger sequencing to identify the composition of the circRNA.

Fig. 2 Schematic depicting the design of primers that can specifically identify the composition features of circRNA by PCR

3.2.4 Northern Blotting

1. Design antisense probes that are complementary to the sequences spanning the BSJ of the circRNAs of interest.
(1) Generate biotin-labeled antisense probes by in vitro transcription using T7 RNA polymerase and Biotin-16-UTP according to the manufacturer's instructions. Incubate for 3–8 h at 37 °C. Add an appropriate amount of DNase I and incubate for a further 15 min at 37 °C.
(2) Add RNase-free water to bring the volume to 400 μl. Purify the probes by phenol–chloroform extraction as described in Subheading 3.2.2, steps 5–10. Aliquot at 50–100 pmol per tube and store at −80 °C (see Note 10).
(3) Prepare a denaturing agarose gel containing formaldehyde. Mix 0.3 g of agarose powder with 22 ml 3d H2O in a microwavable flask. Microwave for 1–3 min until the agarose is completely dissolved (see Note 11). Let the agarose solution cool to about 60 °C (about when you can keep your hand on the flask for up to 10 s). Add 2.5 ml 10× MOPS and 750 μl 37% formaldehyde. Swirl the flask to mix the solution. Pour the agarose into a gel tray with the well comb in place (see Note 12). Leave at room temperature for 30 min until it has completely solidified.
(4) Add 1 μl of probes (0.5–2 μg) to 10 μl denaturing RNA buffer and mix. Prepare a molecular weight ladder for RNA in a new 1.5 ml tube. Incubate both for 10 min at 65 °C. Cool for 3 min on ice.
(5) Add 1.2 μl 10× loading buffer to the probe mix. Place the denaturing gel into the electrophoresis box with 1× MOPS. Carefully load the RNA ladder and probe samples into the wells of the gel.
(6) Run the gel at 80–100 V, 16 h) (see Note 16).
11. After the transfer is complete, place the Hybond-N+ membrane on two new sheets of 3 mm paper soaked with 10× SSC buffer, with the RNA-binding side of the membrane facing up. Place the membrane in a UV crosslinker on the automatic setting. UV-crosslink each side of the membrane with 254 nm UV light for 2 min (see Note 17).
12. Rinse the membrane with sterile Milli Q H2O to remove residual gel. Dry the membrane (see Note 18).
13. Place the membrane into a hybridization tube with the RNA-binding side facing up.
14. Prehybridization (blocking): Warm the prehybridization buffer to 42 °C. Add an appropriate amount of prehybridization buffer (0.1 ml per 1 cm2 of membrane) to the hybridization tube. Incubate at 42 °C in a hybridization oven for 5 h.
15. Hybridization: Incubate the biotin-labeled probes for 10 min at 99 °C, then place them on ice for 10 min. Warm the hybridization buffer to 42 °C, add the probes, and mix well. Discard the prehybridization buffer from the blot and add the hybridization buffer containing the labeled probes. Incubate overnight (16–24 h) at 42 °C.
16. Warm aliquots of wash buffer 1 to 42 °C and 65 °C. Discard the hybridization buffer from the blot and rinse for 15 min with wash buffer 1 at 42 °C. Rinse again for 15 min with wash buffer 1 at 65 °C.
17. Warm wash buffer 2 to 65 °C. Rinse for 15 min with wash buffer 2.
18. Detect the probe signal using reagents from a Chemiluminescent EMSA kit: Warm the blocking solution and wash buffer at 37–50 °C until they are completely dissolved. Prepare a clean box and add an appropriate amount of blocking solution. Transfer the membrane into the blocking solution. Incubate for 15 min at room temperature on a shaker. Discard the blocking solution, add another 20 ml of blocking solution containing 6.5 μl Streptavidin-HRP Conjugate, and incubate for 15 min at room temperature on a shaker. Transfer the membrane to a new clean box containing 1× wash buffer. Rinse the membrane five times with 1× wash buffer, 5 min each time. Transfer the membrane into a new box containing Equilibration buffer and incubate for 5 min at room temperature on a shaker. Visualize the membrane with a high-sensitivity chemiluminescence detection system (a mix of 500 μl BeyoECL Moon A and 500 μl BeyoECL Moon B) in a darkroom (see Note 19).

3.2.5 Detect Location of circRNA

1. Chill the mortar and pestle with liquid nitrogen. Transfer 0.5 g of plant tissue into the chilled mortar, which should contain enough liquid nitrogen to cover the sample. Grind the sample into a fine powder using the chilled pestle and liquid nitrogen. Grind 5–9 times, depending on the plant species (see Note 20).
2. Transfer the frozen powder, together with liquid nitrogen, into an RNase-free 50 ml tube, and store the tube at −80 °C without screwing on the cap so that the liquid nitrogen can evaporate (see Note 21).
3. Add 3 ml complete cell lysis buffer and suspend the frozen plant tissue powder until it has thawed completely. Incubate on ice for 15 min.
4. Filter the solution through two layers of Miracloth into a 50 ml tube on ice. Centrifuge the filtered solution at 4 °C for 10 min at 1500–2500 × g, depending on the plant species.
5. Collect the cytoplasmic fraction: Carefully transfer the supernatant (cytoplasmic fraction) into a new 15 ml tube, avoiding the pellet (see Note 22). Divide the cytoplasmic fraction into 1.5 ml conical tubes, 380 μl per tube. Add 1 ml RPS per 380 μl cytoplasmic fraction. Mix well and incubate at −20 °C for 1 h to overnight.
6. Collect the nuclear fraction: Gently resuspend the pellet in 2 ml cell lysis buffer (see Note 23) and centrifuge at 4 °C for 10 min at 1500–2500 × g, depending on the plant species.
7. Discard the supernatant. Wash the pellet three times with 1 ml resuspension buffer. Centrifuge at 4 °C for 10 min at 1500–2500 × g, depending on the plant species.
8. Discard the supernatant. Resuspend the pellet in 500 μl gradient buffer 1 and place on ice.
9. Add 500 μl gradient buffer 2 to a 2 ml round-bottom tube. Carefully layer the sample in gradient buffer 1 on top of gradient buffer 2. Centrifuge at 4 °C for 10 min at 13,500 × g.
10. Discard the supernatant thoroughly. Add 1 ml TRIzol or CTAB extraction buffer to lyse the nuclei (see Note 24).
11. Vortex the cytoplasmic fraction solution for 30 s. Centrifuge at 18,000 × g for 15 min at 4 °C.
12. Discard the supernatant and air-dry the pellet. Add 1 ml TRIzol or CTAB extraction buffer to lyse the pellet (see Note 24).
13. Use an appropriate method to extract RNA from the nuclear and cytoplasmic fractions. Synthesize complementary DNA (cDNA) from equal volumes of nuclear and cytoplasmic RNA using M-MLV Reverse Transcriptase or a commercial reverse transcription kit according to the manufacturer's instructions. Perform qPCR according to the relevant instructions. Detect RNAs known to be located in the nucleus or the cytoplasm to assess the efficiency of fraction separation.


14. Extract protein from the nuclear and cytoplasmic fractions with TRIzol to assess the efficiency of fraction separation: Add chloroform to the TRIzol for phase separation. After removing the upper aqueous phase, add 0.3 ml ethanol per 1 ml TRIzol to the interphase and organic phase. Mix well and incubate for 3 min at room temperature.
15. Centrifuge at less than 2000 × g for 5 min at 4 °C.
16. Transfer the supernatant into a new 2 ml round-bottom tube. Add 1 ml isopropyl alcohol per 1 ml TRIzol to the supernatant to precipitate the proteins. Mix well and incubate for 10 min at room temperature.
17. Centrifuge at 15,000 × g for 10 min at 4 °C.
18. Rinse the pellet with protein wash buffer (1 ml per 1 ml TRIzol). Incubate for 20 min at room temperature and centrifuge at 7500 × g for 5 min at 4 °C.
19. Repeat step 18 twice.
20. Rinse the pellet with 1 ml ethanol and centrifuge at 7500 × g for 5 min at 4 °C.
21. Dry the pellet in a concentrator for 5–10 min.
22. Add 50 μl 1× Blue Loading Buffer to resuspend the pellet. Vortex and incubate for 10 min at 95 °C until it has dissolved completely.
23. Perform a western blot to detect proteins specific to the nucleus or the cytoplasm, such as H3, Tubulin, and GAPDH, to assess the efficiency of fraction separation.
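One way to summarize the qPCR readout of step 13 is to convert the Ct values of the two fractions into a nuclear share of the signal. The sketch below is an illustration only and is not part of the original protocol: it assumes roughly 100% amplification efficiency and cDNA made from equal proportions of each fraction, and the marker names and Ct values are hypothetical.

```python
# Sketch only: summarize nuclear/cytoplasmic distribution from qPCR Ct values.
# Assumes ~100% amplification efficiency and equal input proportions for both fractions.


def relative_amount(ct):
    """Relative template amount; one Ct unit corresponds to a twofold difference."""
    return 2.0 ** (-ct)


def nuclear_fraction(ct_nuclear, ct_cytoplasmic):
    """Share of the signal found in the nuclear fraction (0 to 1)."""
    nuc = relative_amount(ct_nuclear)
    cyt = relative_amount(ct_cytoplasmic)
    return nuc / (nuc + cyt)


# Hypothetical Ct values (nuclear, cytoplasmic) for illustration only; not measured data.
example_cts = {
    "nuclear_marker_RNA": (22.0, 26.0),
    "cytoplasmic_marker_RNA": (27.0, 23.0),
    "circRNA_of_interest": (24.0, 26.0),
}

for name, (ct_nuc, ct_cyt) in example_cts.items():
    print(f"{name}: {nuclear_fraction(ct_nuc, ct_cyt):.0%} nuclear")
```

A clean separation would show the nuclear marker close to 100% and the cytoplasmic marker close to 0% before the value for the circRNA is interpreted.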

4 Notes

1. Bring the solution to room temperature before adjusting the pH. When the pH is close to the required value, it is better to use HCl of relatively low ionic strength to avoid a sudden drop below the required pH.
2. When the pH is close to the required value, it is better to use NaOH of relatively low ionic strength to avoid a sudden rise above the required pH.
3. Avoid repeated freezing and thawing, which inactivates DTT.
4. It is better to adjust the pH again to the exact value.
5. During RNA isolation, the supernatant containing isopropyl alcohol must be removed thoroughly from the pellet after RNA precipitation in order to obtain high-quality RNA: centrifuge again briefly and remove the residual liquid with a pipette tip.

6. The annealing temperature of the primers should be around 60 °C to suit the qRT-PCR procedure. Product sizes should be 80–400 bp.
7. If the circRNA of interest is too small to allow ideal primers to be designed, a primer can be designed to correspond to the sequence containing the BSJ site. Ensure that a sufficient length at the 5′ end of the primer lies across the BSJ site, to avoid amplifying linear transcripts of the circRNA parental gene.
8. For the optimum temperature and time of the enzyme reaction, refer to the relevant RNase R instructions.
9. The two sites corresponding to the outward-facing primers (F and R primers) should be as close together as possible so that the complete circRNA sequence is characterized.
10. Avoid repeated freezing and thawing.
11. Do not overboil the solution. Microwave in pulses and swirl the flask occasionally as the solution heats up, to avoid evaporation that would alter the final percentage of agarose in the gel.
12. Pour slowly to avoid bubbles. Any bubbles can be pushed away from the well comb with a pipette tip.
13. A high concentration of total RNA is needed for Northern blotting (>4 μg) because the capacity of the gel wells is limited. RNA isolation with manual RNA Zol reagent can be chosen to extract a large amount of total RNA from a large quantity of plant tissue. Resuspend the RNA pellet in a small volume of RNase-free water to obtain RNA at high concentration.
14. Avoid overheating the 1× MOPS buffer, which can melt the gel. Replace the 1× MOPS buffer whenever it overheats.
15. Do not place any other weight on top of the wick cover during transfer. It may inhibit transfer by crushing the pore structure of the agarose gel.
16. Change the thick blot papers under the Hybond-N+ membrane when they become wet. Change them 2–3 times.
17. Warm up the UV crosslinker for 5 min. The crosslinking time can be adjusted to increase efficiency; however, prolonged UV crosslinking may damage the RNA.
18. If the following steps are not performed immediately, the membrane can be stored at 4 °C wrapped in plastic wrap for a short time.
19. Prepare the chemiluminescence detection system fresh just before use.
20. Grind moderately to preserve the integrity of the nuclei. A grinding aid of appropriate size can be used to protect the nuclei.
21. Do not screw on the tube cap until the liquid nitrogen has evaporated completely; otherwise the tube may burst.
22. It is not necessary to collect all of the supernatant (cytoplasmic fraction). Leave a little supernatant behind to avoid contamination from the nuclear fraction.
23. Resuspend the pellet gently with a pipette tip to avoid breaking the nuclei.
24. Choose an appropriate RNA extraction buffer according to the plant species, to remove interference from polysaccharides and polyphenols.
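The primer geometry of Subheading 3.2.1 and the size range in Note 6 can be checked in silico before ordering primers. The sketch below is a minimal illustration under stated assumptions, not the authors' procedure: the circRNA is given as a DNA string whose last base is joined to its first base at the BSJ, primers must match exactly, and all sequences are hypothetical toy sequences that do not correspond to any real circRNA.

```python
# Sketch only: check that a primer pair is outward-facing and spans the BSJ.
# The toy sequences below are hypothetical and do not correspond to any real circRNA.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def bsj_amplicon(circ, forward, reverse):
    """Return the product expected from a circular template, or None.

    circ    : circRNA sequence (as DNA, 5'->3'); its last base joins its first base at the BSJ
    forward : primer identical to the sense strand, upstream of the BSJ (near the 3' end)
    reverse : primer complementary to the sense strand, downstream of the BSJ (near the 5' end)
    """
    circ = circ.upper()
    f_start = circ.find(forward.upper())
    r_site = reverse_complement(reverse)
    r_start = circ.find(r_site)
    if f_start < 0 or r_start < 0:
        return None  # a primer does not match the sequence
    r_end = r_start + len(r_site)
    if f_start < r_end:
        return None  # primers converge on the linear sequence: not outward-facing
    # Extension from the forward primer runs through the BSJ to the reverse primer site.
    return circ[f_start:] + circ[:r_end]


def size_ok(amplicon, min_bp=80, max_bp=400):
    """Apply the 80-400 bp product size range given in Note 6."""
    return amplicon is not None and min_bp <= len(amplicon) <= max_bp


FILLER = "GATTACA"
R_SITE = "ATGGCTAGCAAGGAGCTGTT"  # near the 5' end (downstream of the BSJ)
F_SITE = "CCTGAAGTTCATCTGCACCA"  # near the 3' end (upstream of the BSJ)
CIRC = FILLER * 4 + R_SITE + FILLER * 12 + F_SITE + FILLER * 4

forward_primer = F_SITE
reverse_primer = reverse_complement(R_SITE)

amplicon = bsj_amplicon(CIRC, forward_primer, reverse_primer)
junction = CIRC[-10:] + CIRC[:10]
print(len(amplicon), size_ok(amplicon), junction in amplicon)
```

Swapping the two binding sites gives a convergent pair, for which the function returns None because such primers would amplify the linear transcript. The case described in Note 7, where one primer itself spans the BSJ, is not handled by this sketch.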

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 91640202, 91940301) and the grant from Guangdong Province (No. 2019JC05N394).

References

1. Jeck WR, Sorrentino JA, Wang K et al (2013) Circular RNAs are abundant, conserved, and associated with ALU repeats. RNA 19:141–157
2. Chen LL (2016) The biogenesis and emerging roles of circular RNAs. Nat Rev Mol Cell Biol 17:205–211
3. Chen LL, Yang L (2015) Regulation of circRNA biogenesis. RNA Biol 12:381–388
4. Suzuki H, Tsukahara T (2014) A view of pre-mRNA splicing from RNase R resistant RNAs. Int J Mol Sci 15:9331–9342
5. Cheng JP, Zhang Y, Li ZW et al (2018) A lariat-derived circular RNA is required for plant development in Arabidopsis. Sci China Life Sci 61:204–213
6. Conn VM, Hugouvieux V, Nayak A et al (2017) A circRNA from SEPALLATA3 regulates splicing of its cognate mRNA through R-loop formation. Nat Plants 3:17053. https://doi.org/10.1038/nplants.2017.53
7. Lu TT, Cui LL, Zhou Y et al (2015) Transcriptome-wide investigation of circular RNAs in rice. RNA 21:2076–2087
8. Ye CY, Chen L, Liu C et al (2015) Widespread noncoding circular RNAs in plants. New Phytol 208:88–95
9. Lai XL, Bazin J, Webb S et al (2018) CircRNAs in plants. Adv Exp Med Biol 1087:329–343

Chapter 2

Experimental Strategies for Studying the Function of Plant CircRNAs

Yan-Zhao Feng and Yang Yu

Abstract

Circular RNAs (circRNAs) represent a novel group of noncoding RNAs whose functions in plants are little known. Genetic manipulation is necessary for studying the biological function of a specific circRNA. Here, we describe strategies to study the function of plant circRNAs, including artificial microRNA-mediated circRNA knockdown, gain-of-function study, full-length circRNA identification, and circRNA-protein interaction. These methods can be applied to the functional characterization of circRNAs in plants and should facilitate research on plant circRNAs.

Key words Circular RNA, Knock down, Overexpression, Pull-down, Quantitative PCR

1 Introduction

CircRNAs are a class of noncoding RNAs produced by backsplicing of pre-mRNA, and were long considered byproducts of mRNA splicing [1]. With the rapid development of sequencing techniques and of analysis tools for circRNAs, numerous circRNAs have been identified in eukaryotes [2–5]. Given the vast differences in architecture and lifestyle between animals and plants, it is interesting to study how circRNAs fit functionally into the physiological processes of plants. Loss-of-function mutants of circRNAs are hard to obtain because circRNAs are derived from their host genes, and common gene knockout methods such as T-DNA insertion disturb the transcription not only of the circRNAs but also of the host genes themselves. Moreover, the transcription and backsplicing of circRNAs are distinct from those of linear RNAs, so the gain-of-function analysis strategy also differs from that used for mRNAs. Thus, it is necessary to provide guidance for the genetic study of plant circRNAs. Here, we propose methods that can be used to study the biological roles of circRNAs, including circRNA identification, expression analysis, genetic studies, and molecular interactions.

Supplementary Information The online version of this chapter (https://doi.org/10.1007/978-1-0716-1645-1_2) contains supplementary material, which is available to authorized users. Luis María Vaschetto (ed.), Plant Circular RNAs: Methods and Protocols, Methods in Molecular Biology, vol. 2362, https://doi.org/10.1007/978-1-0716-1645-1_2, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021

2 Materials

2.1 Knockdown of CircRNAs

1. Plant material: Arabidopsis thaliana (Col-0).
2. Leaf mold soil.
3. Multiple-well tray.
4. pH meter.
5. Autoclave.
6. MS medium: Add hygromycin to 25 mg/L when screening circRNA overexpression transformants.
7. Bacterial strains: Agrobacterium strain GV3101, E. coli strain DH5α.
8. Plasmids: pHBT and pCB302 [6].
9. Enzymes: DNA polymerase (KOD FX, including KOD FX buffer), restriction endonucleases (EcoRI, XbaI, BamHI, PstI), T4 DNA ligase.
10. FastDigest buffer: provided by the supplier with the restriction enzymes.
11. Antibiotics: ampicillin, kanamycin, rifampicin.
12. Transformation solution: 5% (w/v) sucrose solution. Dissolve 5 g sucrose in distilled water to a final volume of 100 mL (freshly made). Add 0.02% Silwet L-77 before use and stir well.
13. Basta solution: 10% stock solution. Dilute with ddH2O to a working concentration of 0.005% (v/v).
14. TAE buffer: Dissolve 242 g Tris and 37.2 g Na2EDTA·2H2O in 600 mL of distilled water, add 57.1 mL of acetic acid, and add distilled water to a total volume of 1 L. The working concentration is 1/50 of that of the stock solution.
15. LB medium: 10 g/L of peptone, 5 g/L of yeast extract, 10 g/L of NaCl. Add 15 g/L of agar for solid medium. Autoclave at 121 °C for 20 min.
16. The primer pair depends on the circRNA to be targeted; primer design is described in step 3 of Subheading 3.1. Taking ath_circ_023978 as an example, the primers are as follows:
Forward primer: CGAGAATTCATGTTTTAGGAATATATATGTAGA TGCTCTTGGTGGATCATGAGTTCACAGGTCGTGATAT GATTC (The underlined sequence comes from the capitalized letters in Oligo III).
Reverse primer: CGATCTAGAAAATTGGAATACAAAAGAGAGA TGTTCTTGGTGGAACATGAGA TCAAAGAGAATCAAT GATCCA (The underlined sequence comes from the capitalized letters in Oligo II).

2.2 Overexpression of CircRNAs

1. Restriction endonuclease (BamHI, SacI, KpnI, SpeI). 2. DNA polymerase. 3. One step cloning kit. 4. Bacterial strains: Agrobacterium strain GV3101, E. coli strain DH5α. 5. LB medium (see step 15 in Subheading 2.1). 6. Plasmid: pRHV [7].

2.3 Investigation of the Full-Length and BackSplice Sites of CircRNAs and Quantification of Their Relative Expression Levels

1. DNA polymerase.
2. Simple Blunt cloning kit.
3. DH5α competent cells.
4. TAE buffer (see step 14 in Subheading 2.1).
5. Reverse transcription kit (with DNase).
6. qPCR Taq mix.

2.4 Pull-Down of the CircRNA-Interacting Proteins

1. DNA polymerase (see step 9 in Subheading 2.1).
2. T7 in vitro transcription kit.
3. RNA purification kit.
4. Liquid nitrogen.
5. Protease inhibitor.
6. Lysis buffer: 50 mM Tris–HCl (pH 7.4), 150 mM NaCl, 2 mM MgCl2, 20% glycerol, 0.1% NP-40, 5 mM DTT, 1% protease inhibitor.
7. Magnetic RNA-protein pull-down kit.
8. RNA capture buffer: 20 mM Tris (pH 7.5), 1 M NaCl, 1 mM EDTA.
9. Protein-RNA binding buffer: 0.2 M Tris (pH 7.5), 0.5 M NaCl, 20 mM MgCl2, 1% Tween 20. The working concentration is 1/10 of that of the stock solution.
10. Wash buffer: 20 mM Tris (pH 7.5), 10 mM NaCl, 0.1% Tween-20 detergent.


Yan-Zhao Feng and Yang Yu

11. Protein loading buffer: 10% SDS, 500 mM DTT, 50% glycerol, 500 mM Tris–HCl, 0.05% bromophenol blue. The working concentration is 1/5 of that of the stock solution.
12. Biotin Elution Buffer: provided with the Magnetic RNA-protein pull-down kit.
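Several of the reagents listed above are kept as concentrated stocks and diluted before use (for example the 10% Basta stock, the TAE stock used at 1/50, and the protein loading buffer used at 1/5). The dilution arithmetic (C1·V1 = C2·V2) can be sketched as below. This is only an illustration: the function names are hypothetical and are not part of the original protocol.

```python
def stock_volume(stock_conc, working_conc, final_volume):
    """Volume of stock needed so that final_volume ends up at working_conc.

    Uses C1*V1 = C2*V2. Both concentrations must share one unit
    (%, x-fold, mg/mL ...); the result has the unit of final_volume.
    """
    if stock_conc <= 0 or working_conc <= 0:
        raise ValueError("concentrations must be positive")
    if working_conc > stock_conc:
        raise ValueError("working concentration cannot exceed the stock")
    return working_conc * final_volume / stock_conc


def concentrate_to_add(sample_volume, fold):
    """Volume of an n-fold concentrate to ADD to a sample so the mixture is 1x."""
    if fold <= 1:
        raise ValueError("fold must be greater than 1")
    return sample_volume / (fold - 1)


# 10% Basta stock -> 0.005% working solution in 200 mL final volume
basta_ml = stock_volume(10.0, 0.005, 200.0)   # about 0.1 mL, i.e., 100 uL
# TAE stock used at 1/50 (a 50x stock) -> 1 L of working buffer
tae_ml = stock_volume(50.0, 1.0, 1000.0)      # about 20 mL
# 5x loading buffer added to 50 uL of eluate
loading_ul = concentrate_to_add(50.0, 5)      # 12.5 uL
print(basta_ml, tae_ml, loading_ul)
```

The Basta result agrees with the 100 μL of stock per 200 mL described in Note 10, if the small volume contributed by the stock itself is neglected.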

3 Methods

3.1 Knockdown of CircRNAs Using Artificial microRNAs Targeting Their Junction Sites

To avoid interfering with the expression of the linear mRNA, an artificial miRNA (amiRNA) is designed to span and target the backspliced junction site of the circRNA. A target BLAST search is necessary to reduce off-target events before designing the primers used to construct the amiRNA expression vector. Transgenic plants are generated by Agrobacterium tumefaciens-mediated transformation, and the knockdown efficacy is determined by qPCR.

1. Download the circRNA sequences from PlantcircBase [8] (see Note 1). Take ath_circ_023978, derived from AT5G04090, as an example; it remains highly expressed throughout all life stages. Click the "search" button on the webpage, enter the PlantcircBase ID in the "Keyword Search" field, and click the "Submit" button. The result page shows the splice junction sequence, with the junction site located between the capital and lowercase letters (Fig. 1). Reverse-complement the sequence and choose 21 nt spanning the junction site, with the junction as close to the center of the 21 nt as possible. Substitute the first nucleotide with "T." For example, we designed the following oligo to target ath_circ_023978; the underlined sequence is chosen to target the circRNA (Fig. 2).
2. Submit the sequence to the Target Search program of Web MicroRNA Designer (see Note 2) (Fig. 3). A candidate with no off-target sites, or whose off-targets are only poorly matched (flagged with the warning "More than one mismatch at positions 2–12") (Fig. 4), can be chosen to design the amiRNA.
3. Submit the candidate sequence to the "Oligo" program of Web MicroRNA Designer and view the result page. Design the forward primer and the reverse primer. The primer sequences are shown in step 16 in Subheading 2.1. Note that readers should replace the underlined sequence of the forward primer with the capitalized letters from Oligo III, and that of the reverse primer with the capitalized letters from Oligo II, according to their own results.
4. Amplify with the forward and reverse primers using the existing amiRNA plasmid. Add 1.5 μL each of the forward and reverse primers, 1.5 μL of 2 mM dNTPs, 25 μL of 2× KOD FX


Fig. 1 Example showing the splice junction sequence of circRNA. The triangle indicates the splice junction site

Fig. 2 The reverse-complementary sequence of circRNA. Letters underlined indicate that attempted seed sequence selected for expressing amiRNA

Fig. 3 The interface of Target Search program in Web MicroRNA Designer

Fig. 4 An example showing imperfect matching candidate sequence

buffer, 1 μL of KOD FX, and 0.1 μL of the existing amiRNA plasmid as template. The amplicon will be as specified in the Supplementary Material 1.
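The selection of the 21-nt amiRNA sequence in step 1 (reverse-complement the junction sequence, take 21 nt centered on the junction, and set the first nucleotide to T) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the `offset` parameter, and the toy flanking sequences are hypothetical and do not represent ath_circ_023978. Off-target screening and oligo design still have to be done in Web MicroRNA Designer as described in steps 2 and 3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    return seq.translate(COMPLEMENT)[::-1]


def amirna_candidate(upstream, downstream, length=21, offset=0):
    """Pick a junction-spanning amiRNA candidate.

    upstream / downstream: sense-strand sequence on either side of the
    backsplice junction (hypothetical inputs for this sketch).
    offset: shifts the window by a few bases, e.g., to try another
    candidate when off-targets are reported.
    """
    rc = reverse_complement(upstream + downstream)
    junction = len(downstream)            # junction position within rc
    start = junction - length // 2 + offset
    end = start + length
    if start < 0 or end > len(rc):
        raise ValueError("flanking sequence too short for this window")
    if not start < junction < end:
        raise ValueError("window no longer spans the junction")
    # Replace the first nucleotide of the window with T
    return "T" + rc[start + 1:end]


# Toy example with made-up flanks (15 nt on each side of the junction)
candidate = amirna_candidate("GATTACAGATTACAG", "CCTGAAGTCCTGAAG")
print(candidate, len(candidate))
```

Calling the function again with a small nonzero `offset` mimics the advice in Note 2 to shift the candidate a few bases upstream or downstream.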


5. Digest 2 μg of the pHBT plasmid and the PCR product with 1 μL of EcoRI, 1 μL of XbaI, and 2 μL of 10× FastDigest buffer, made up to 20 μL with distilled water, at 37 °C for 30 min.
6. Resolve the reaction products on a 1.5% (w/v) agarose gel (see Note 3), and purify the 236 bp DNA product and the digested pHBT plasmid using a gel purification kit.
7. Mix 3 μL of the purified PCR product, 1 μL of the digested pHBT plasmid, 2 μL of 5× T4 DNA ligase buffer, 1 μL of T4 DNA ligase, and 3 μL of distilled water, and incubate the tube at 25 °C for 2 h.
8. Thaw 50 μL of competent cells of E. coli strain DH5α taken from −80 °C (see Note 4).
9. Add the ligated product to the competent cells. Keep on ice for 20 min.
10. Incubate in a water bath at 42 °C for 90 s, and immediately place on ice for 2 min.
11. Add 500 μL of LB liquid medium to the tube, transfer it to a shaker, and incubate at 37 °C, 180 rpm.
12. Plate the cells on LB solid medium containing 100 mg/L ampicillin (see Note 5), and incubate the plate at 37 °C overnight.
13. Identify positive colonies by PCR and isolate the plasmid.
14. Digest 2 μg of the amiR-pHBT plasmid and 2 μg of the pCB302 plasmid, each with 1 μL of BamHI, 1 μL of PstI, and 2 μL of 10× FastDigest buffer in a volume of 20 μL, at 37 °C for 30 min. The digested product is as specified in the Supplementary Material 1.
15. Resolve the reaction products on a 1.5% (w/v) agarose gel, and purify the 429 bp DNA product and the digested pCB302 vector using a gel purification kit.
16. Mix 3 μL of the purified 429 bp product, 1 μL of the digested pCB302 plasmid, 2 μL of 5× T4 DNA ligase buffer, 1 μL of T4 DNA ligase, and 3 μL of distilled water, and incubate the tube at 25 °C for 2 h.
17. Transform DH5α competent cells as in steps 8–12.
18. Isolate the plasmids of the correct amiRcirc and overexpression constructs, add 500 ng of plasmid to 50 μL of freshly thawed competent cells of Agrobacterium strain GV3101 (see Note 6), and keep on ice for 30 min.
19. Put the tube into liquid nitrogen for 1 min and then transfer it to a water bath at 37 °C.


20. Keep the competent cells on ice for 2 min to cool them rapidly, then add 0.5 mL of LB liquid medium and culture at 220 rpm at 28 °C for 3 h.
21. Plate the cells on LB solid medium containing 50 mg/L kanamycin (see Note 7), and incubate the plate at 28 °C for 2 days.
22. Identify single colonies by PCR, pick a positive colony to inoculate 5 mL of LB liquid medium containing 50 mg/L of kanamycin and 50 mg/L of rifampicin (see Note 8), and incubate at 28 °C for 2 days.
23. Inoculate 1 mL of the culture from the previous step into 100 mL of LB liquid medium containing 50 mg/L each of kanamycin and rifampicin, and incubate at 220 rpm at 28 °C for 16 h.
24. Pellet the bacteria by centrifugation at 4000 × g for 8 min in 50 mL Falcon tubes and discard the supernatant thoroughly.
25. Resuspend the pellet in 1 volume of 5% (w/v) sucrose solution, and add Silwet L-77 to 0.02% [9].
26. Select healthy Arabidopsis wild-type plants at the flowering stage. Remove the opened flowers and leave the unopened flower buds. Submerge the inflorescences in the suspension and sway them gently for a few seconds. Shake off the excess liquid from the inflorescences, wrap the plants with plastic wrap, and keep them in the dark overnight.
27. Remove the plastic wrap and continue regular management until the seeds mature, then harvest them.
28. Stratify the seeds first (see Note 9). To screen for circRNA knockdown lines, scatter the seeds on wet soil and cover them with plastic wrap. Spray 0.005% (v/v) Basta solution (see Note 10) on day 4 and day 6. Positive transformants will remain green and grow true leaves.

3.2 Overexpression of CircRNAs in Plants

The circRNA overexpression construct is built by inserting reverse complementary sequences on either side of the exonic circRNA and its flanking introns [10].

1. Amplify the first 60 bp of the fifth intron (see Note 11) of AtSEP3 [11] by PCR using genomic DNA as template, with a forward primer harboring a BamHI recognition site and a reverse primer harboring a SacI recognition site. Check the PCR products on a 3% agarose gel.
2. Digest the PCR products with BamHI and SacI at 37 °C, and recover the digested product with a gel purification kit.
3. Digest the pRHV plasmid with BamHI and SacI at 37 °C, resolve the linear plasmid on a 1% agarose gel, and recover it with a gel purification kit.


4. Ligate the 60 bp intron fragment and the linear pRHV plasmid with T4 DNA ligase, and transform the ligated product into E. coli DH5α competent cells.
5. Culture the transformed cells in LB liquid medium with 50 mg/L kanamycin at 37 °C, 180 rpm. Isolate the plasmid with a plasmid mini kit.
6. Prepare the reverse complementary intron sequence by replacing the forward and reverse primers with primers harboring SpeI and KpnI recognition sites, respectively, and amplify the above sequence again. Check the PCR products on a 3% agarose gel.
7. Digest the PCR products with KpnI and SpeI at 37 °C, and recover the digested product with a gel purification kit.
8. Digest the pRHV plasmid containing the 60 bp of the fifth intron of AtSEP3 with KpnI and SpeI at 37 °C, resolve the linear plasmid on a 1% agarose gel, and recover it with a gel purification kit.
9. Ligate the reverse complementary sequence of the intron and the linear pRHV plasmid containing the forward sequence of the intron with T4 DNA ligase, and transform the ligated product into E. coli DH5α competent cells.
10. Culture the transformed cells in LB liquid medium with 50 mg/L kanamycin at 37 °C, 180 rpm. Isolate the plasmid with a plasmid mini kit. The plasmid is named pRHV-CR (Fig. 5).
11. For a single-exon circRNA, use genomic DNA as the template and amplify the exon from ~200 bp upstream in the intron to ~200 bp downstream in the intron. The upstream and downstream primers are designed to harbor a 15 bp overlap with the 3′ end of the inverse repeat sequence on the pRHV-CR plasmid, but without the SacI and KpnI recognition sites. For a multiple-exon circRNA, three amplicons should be prepared. The first amplicon is amplified from genomic DNA, from ~200 bp upstream in the intron to the end of the exon. Its forward primer is designed to overlap with the 3′ end of the inverse repeat sequence on the pRHV-CR plasmid but without the SacI recognition site. The second amplicon is amplified from cDNA, from the first exon to the last exon. Its forward primer should overlap with part of the sequence at the end of the first exon, while its reverse primer should overlap with part of the starting sequence of the last exon. The third amplicon is amplified from genomic DNA, starting from the part of the overlapped sequences at the end of the

Fig. 5 Partial map of the plasmid pRHV-CR. CircRNAs containing their flanking sequences are inserted between SacI and KpnI recognition sites inside the inverse repeats


Fig. 6 Schematic diagram of primer design for amplifying circRNA

second amplicon to ~200 bp of the downstream intron. Its reverse primer is designed to overlap with the 3′ end of the inverse repeat sequence on the pRHV-CR plasmid but without the KpnI recognition site. Check the products on a 1.5% agarose gel and recover them with a gel purification kit.
12. Digest pRHV-CR with SacI and KpnI, resolve the linear plasmid on a 1% agarose gel, and recover it with a gel purification kit.
13. Recombine the PCR product and the linear plasmid using the one-step cloning kit.
14. Transform E. coli DH5α competent cells and check for positive clones by PCR.
15. Isolate the plasmid from a positive colony and carry out Agrobacterium-mediated transformation of Arabidopsis as described in steps 18–27 in Subheading 3.1. To screen for circRNA overexpression lines, scatter the sterilized seeds on MS medium containing 25 mg/L hygromycin. Positive transformants will grow true leaves and their roots will keep elongating.

3.3 Investigation of the Full-Length and BackSplice Sites of CircRNAs

1. Design two pairs of divergent primers on the circRNA, both of which span the backspliced junction site, in different orientations (Fig. 6) (see Note 12).
2. Amplify the circRNA with these two pairs of divergent primers by PCR with a high-fidelity DNA polymerase, using cDNA from total RNA as the template. Resolve the PCR products on a 1.5% agarose gel in 1× TAE buffer and recover them with a gel purification kit.


Fig. 7 The mapping diagram of the linear circRNA Sanger sequencing result to the gene locus

3. Ligate the products amplified with the two primer pairs into the Simple Blunt vector, and transform E. coli DH5α competent cells.
4. Check for positive clones by PCR and send them for Sanger sequencing.
5. Compare the sequencing results of the two PCR products and fuse the two fragments through their homologous sequence to obtain the full length of the circRNA.
6. Map the sequence to the genomic sequence, and analyze the backsplice sites of the circRNA (Fig. 7).

3.4 Quantification of CircRNAs by RT-qPCR

1. When the plants have grown larger, sample a small piece of leaf and perform PCR to confirm that the knockdown or overexpression construct is integrated into the plant genome. Design divergent primers astride the backspliced junction site of the circRNA, with Tm values of about 60–63 °C.
2. Isolate total RNA from the transformants, using the tissues in which the circRNA is expressed. Reverse transcribe the total RNA with random hexamers using the reverse transcription kit, including DNase treatment.
3. Amplify the circRNA with qPCR mix, and also amplify Ubiquitin as the endogenous control [12]. Test each sample in three replicates.
4. Confirm the specificity of the amplicon by checking the melting curve and sequencing the PCR product.
5. Calculate the relative expression level of the circRNA normalized to Ubiquitin using the $2^{-\Delta\Delta C_T}$ method (see Note 13).
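The calculation in step 5 can be sketched as below. This is a minimal illustration, not the authors' analysis script: the function names and the CT values are hypothetical, replicate CT values are simply averaged, and the method assumes roughly 100% amplification efficiency for both the circRNA and Ubiquitin primer pairs.

```python
from statistics import mean


def delta_ct(ct_circ, ct_ubiquitin):
    """Delta CT = mean CT(circRNA) - mean CT(Ubiquitin) for one sample."""
    return mean(ct_circ) - mean(ct_ubiquitin)


def fold_change(ct_circ_tg, ct_ubi_tg, ct_circ_wt, ct_ubi_wt):
    """Relative expression of transgenic vs. wild type: 2 ** -(ddCT)."""
    ddct = delta_ct(ct_circ_tg, ct_ubi_tg) - delta_ct(ct_circ_wt, ct_ubi_wt)
    return 2.0 ** (-ddct)


# Hypothetical triplicate CT values for an overexpression line and WT
tg_circ, tg_ubi = [24.0, 24.5, 23.5], [18.0, 18.5, 17.5]
wt_circ, wt_ubi = [26.0, 26.5, 25.5], [18.0, 18.5, 17.5]
print(fold_change(tg_circ, tg_ubi, wt_circ, wt_ubi))
```

A value above 1 indicates higher circRNA abundance in the transgenic line than in the wild type, and a value below 1 indicates knockdown.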

3.5 Identification of CircRNA Binding Proteins by Pull-down Assay

For the pull-down, the circRNA is linearized so that the backspliced site is exposed, and it is fused to an RNA aptamer [13] (Fig. 8).

1. Linearize the circRNA at the position opposite the backspliced site and fuse it to the T7 promoter and tRSA at the 5′ end by overlap PCR using KOD FX. Meanwhile, amplify tRSA alone as the negative control (see Note 14).
2. Check the PCR product on a 1.5% agarose gel and recover the DNA with a PCR purification kit. Transcribe the tRSA-linear circRNA and tRSA with the T7 in vitro transcription kit.


Fig. 8 Flowchart of circRNA pull-down with RNA aptamer

3. Purify the tRSA-linear circRNA and tRSA with the RNA purification kit.
4. Grind the plant tissue in which the circRNA is naturally expressed into a fine powder, and dissolve it in 3 volumes of lysis buffer with 1% protease inhibitor and 1 U/μL RNase inhibitor. Store at −80 °C overnight.
5. Thaw the protein extracts, and centrifuge at 4 °C, 13,000 × g for 15 min. Transfer the supernatant to a new Eppendorf tube and keep it on ice until use.
6. Heat the RNA at 85 °C for 5 min, and cool it down to 4 °C by decreasing the temperature at 1 °C per 30 s.
7. Add 50 μL of streptavidin-coated beads to a 1.5 mL Eppendorf tube on the magnetic stand and leave for 1 min.
8. Remove the supernatant, add 50 μL of 20 mM Tris–HCl (pH 7.5), and gently resuspend the beads by pipetting. Leave on the magnetic stand for 1 min and remove the supernatant. Repeat this step once.
9. Add 50 μL of 1× RNA capture buffer, and gently resuspend the beads by pipetting.
10. Add 50 pmol of tRSA-linear circRNA or tRSA, and mix well by pipetting. Incubate the RNA and the beads with rotation for 30 min at room temperature.
11. Transfer the tube to the magnetic stand, and wash the beads twice by adding 50 μL of 20 mM Tris–HCl (pH 7.5), mixing gently, and removing the supernatant.


12. Wash the beads with 100 μL of 1× Protein-RNA binding buffer, and remove the supernatant on the magnetic stand.
13. Prepare the Master Mix of the RNA-Protein Binding Reaction as follows: 130 μL of protein extract, 60 μL of 50% glycerol, and 20 μL of 10× Protein-RNA binding buffer. Mix well, and add 200 μL of the Master Mix to the beads.
14. Incubate the beads on the rotator for 30–60 min at 4 °C.
15. Put the tube on the magnetic stand and discard the supernatant.
16. Wash the beads with 200 μL of 1× Wash buffer, put the tube on the magnetic stand, and discard the supernatant. Repeat this step twice.
17. Add 50 μL of Biotin Elution Buffer, mix well with the beads, and incubate at 37 °C with agitation for 30 min.
18. Place the tube on the magnetic stand, and transfer the supernatant to a new tube. Add 12.5 μL of 5× loading buffer and heat at 95 °C for 5 min.
19. The proteins can then be analyzed by Western blotting or mass spectrometry.
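Step 10 specifies the RNA input in pmol, whereas in vitro transcripts are usually quantified by mass. A rough conversion can be sketched as follows. It relies on a commonly used approximation for the molecular weight of single-stranded RNA, (number of nucleotides × 320.5) + 159.0 g/mol, which is an assumption brought in here and not part of the original protocol. The function name and the 500-nt transcript length are likewise hypothetical.

```python
def rna_mass_ng(pmol, length_nt):
    """Approximate mass (ng) of `pmol` of a single-stranded RNA of length_nt.

    Uses the average-nucleotide approximation
    MW ~= length_nt * 320.5 + 159.0 g/mol (an assumption of this sketch).
    pmol * g/mol gives picograms, so divide by 1000 for nanograms.
    """
    if pmol < 0 or length_nt <= 0:
        raise ValueError("pmol must be >= 0 and length_nt must be > 0")
    molecular_weight = length_nt * 320.5 + 159.0
    return pmol * molecular_weight / 1000.0


# Hypothetical example: 50 pmol of a 500-nt tRSA-fused transcript
mass = rna_mass_ng(50, 500)
print(f"{mass:.1f} ng, i.e., about {mass / 1000:.2f} ug")
```

The actual transcript length is the tRSA tag (Note 14) plus the linearized circRNA, so the value has to be recalculated for each construct.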

4 Notes

1. The website of PlantcircBase is: http://ibi.zju.edu.cn/plantcircbase/index.php
2. The website of WMD3 is: http://wmd3.weigelworld.org/cgi-bin/webapp.cgi?page=Oligo;project=stdwmd. If there are possible off-target sites, another candidate can be chosen by shifting a few bases upstream or downstream.
3. Prepare an appropriate amount of agarose gel as required. For fewer than eight samples, weigh 0.3 g of agarose, add it to 20 mL of 1× TAE solution, and boil to dissolve.
4. It is best to add the ligation product to competent cells that have just thawed.
5. Ampicillin is prepared at 100 mg/mL as a 1000× stock solution, sterilized by filtering through a 0.22 μm filter membrane, and stored at −20 °C.
6. Thawing on ice takes a long time. Alternatively, the process can be accelerated by holding the tube in the hand.
7. Kanamycin is prepared at 50 mg/mL as a 1000× stock solution, sterilized by filtering through a 0.22 μm filter membrane, and stored at −20 °C.


8. Rifampicin is prepared at 50 mg/mL in DMSO as a 1000× stock solution, sterilized by filtering through a 0.22 μm filter membrane, and stored at −20 °C.
9. The seeds are soaked in water and kept at 4 °C for 2 days.
10. Pipette 100 μL of 10% (v/v) Basta stock solution into 200 mL of distilled water and mix well by thorough stirring.
11. The sequence is GTAAATAAAGAAACACTCATTCTCCTCTCTAAATTCCTCATCTAAAAGTAATGTAACCAA.
12. Primers designed in this way distinguish linear from circular transcripts.
13. Fold change of the relative expression level $= 2^{-[\Delta C_T(\text{transgenic plants}) - \Delta C_T(\text{WT})]}$, where $\Delta C_T = C_T(\text{circRNA}) - C_T(\text{Ubiquitin})$.
14. The sequence of tRSA is GCGAATTGAAGCTGCCCTTAAAAAAAAAAAAAGCCCGGATAGCTCAGTCGGTAGAGCAGCGGCCTCGACCAGAATCATGCAAGTGCGTAAGATAGTCGCGGGTCGAGGCCGCGTCCAGGGTTCAAGTCCCTGTTCGGGCGCCACTGCAGAAAAAAAAAAAA.

References

1. Zhao W, Chu S, Jiao Y (2019) Present scenario of circular RNAs (circRNAs) in plants. Front Plant Sci 10:379. https://doi.org/10.3389/fpls.2019.00379
2. Ye CY, Chen L, Liu C et al (2015) Widespread noncoding circular RNAs in plants. New Phytol 208(1):88–95
3. Lai X, Bazin J, Webb S et al (2018) CircRNAs in plants. In: Circular RNAs. Springer, Berlin, pp 329–343
4. Westholm JO, Miura P, Olson S et al (2014) Genome-wide analysis of drosophila circular RNAs reveals their structural and sequence properties and age-dependent neural accumulation. Cell Rep 9(5):1966–1980
5. Rybak-Wolf A, Stottmeister C, Glažar P et al (2015) Circular RNAs in the mammalian brain are highly abundant, conserved, and dynamically expressed. Mol Cell 58(5):870–885
6. Zhang N, Zhang D, Chen SL et al (2018) Engineering artificial microRNAs for multiplex gene silencing and simplified transgenic screen. Plant Physiol 178(3):989–1001
7. He F, Zhang F, Sun W et al (2018) A versatile vector toolkit for functional analysis of rice genes. Rice 11(1):1–10
8. Chu Q, Zhang X, Zhu X et al (2017) PlantcircBase: a database for plant circular RNAs. Mol Plant 10(8):1126–1128
9. Zhang X, Henriques R, Lin S-S et al (2006) Agrobacterium-mediated transformation of Arabidopsis thaliana using the floral dip method. Nat Protoc 1(2):641
10. Liu D, Conn V, Goodall GJ et al (2018) A highly efficient strategy for overexpressing circRNAs. In: Circular RNAs. Springer, Berlin, pp 97–105
11. Conn VM, Hugouvieux V, Nayak A et al (2017) A circRNA from SEPALLATA3 regulates splicing of its cognate mRNA through R-loop formation. Nat Plants 3(5):1–5
12. Fan J, Quan W, Li G-B et al (2020) circRNAs are involved in the Rice-Magnaporthe oryzae interaction. Plant Physiol 182(1):272–286. https://doi.org/10.1104/pp.19.00716
13. Sun Y-M, Wang W-T, Zeng Z-C et al (2019) circMYBL2, a circRNA from MYBL2, regulates FLT3 translation by recruiting PTBP1 to promote FLT3-ITD AML progression. Blood 134(18):1533–1546

Chapter 3 Generation of Transgenic Rice Expressing CircRNA and Its Functional Characterization

Priyanka Sharma, Ashirbad Guria, Sankar Natesan, and Gopal Pandi

Abstract

Circular RNA (circRNA) is yet another vital addition to the noncoding RNA family. CircRNAs are mainly derived by fusion of a downstream 3′ splice donor with an upstream 5′ splice acceptor through a noncanonical form of alternative splicing called backsplicing. An array of functional aspects of circRNAs has been reported in animal systems. However, functional investigation of circRNAs in plants is very limited. In this chapter, we describe a methodological outline to study circRNA biogenesis and to characterize circRNA function(s). The sequence of a newly identified Oryza sativa Indica circRNA, flanked by complementary repeat sequences of a rice intron, was assembled to yield a circRNA expression cassette. This cassette can be cloned into any plant expression vector that has a suitable promoter (CaMV 35S or ubiquitin promoter) and terminator, and can be used for any circRNA-mediated functional study. Subsequent agroinfection of rice calli with this cassette yielded circRNA-expressing transgenic plants. These transgenic plants were used to establish a correlation between the expressed circRNA, the parental gene, and interacting miRNAs. Moreover, the effect of circRNA overexpression on plant phenotype under various stress conditions can be studied using these transgenic plants. An RNA pull-down assay can also be performed to identify the circRNA-interacting proteins, and the expression of these RBPs can likewise be studied in these transgenic plants.

Key words CircRNA, CircRNA expression cassette, Binary vector, Plant tissue culture, Divergent RT-PCR

1 Introduction

Circular RNA (circRNA) has gained wide attention in the ongoing decade. Continuing research has led to the discovery of innumerable circRNAs in the organisms tested so far, from fungi [1, 2] to mammals [3, 4]. The search for cost-effective laboratory protocols, together with the development of newer computational tools, has made the identification, validation, and functional characterization of circRNAs a never-ending task. Some of the identified functions

Priyanka Sharma and Ashirbad Guria contributed equally to this work.
Luis María Vaschetto (ed.), Plant Circular RNAs: Methods and Protocols, Methods in Molecular Biology, vol. 2362, https://doi.org/10.1007/978-1-0716-1645-1_3, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021


Priyanka Sharma et al.

of circRNAs include protein-coding ability [5, 6], sponging of miRNAs [7–10], regulation of transcription [11, 12], and translation of the parental gene [13]. However, most functional studies of circRNAs have been carried out in animal cell lines or tissues, which heightens the need for functional characterization of circRNAs identified from plants [14]. The functional significance of any newly identified circRNA can be studied by overexpression [15, 16], knockout [9, 17–19], or knockdown [20–23] studies. Here, we discuss a methodological layout for circRNA biogenesis and for determining circRNA function(s) by overexpression, which requires construction of a circRNA expression cassette and its subsequent transfer into plants. To facilitate the biogenesis of the circRNA from the designed expression cassette, flanking sequences are required both upstream and downstream of the sequence to be circularized. These can either be the original flanking sequences surrounding the circRNA in the genome (which vary for each circRNA) or comprise conserved repeat sequences such as miniature inverted-repeat transposable elements (MITEs) in rice [15], and LINE1-like elements (LLEs) and their Reverse Complementary Pairs (LLERCPs) in maize [24]. In addition, verified complementary repeat sequences of a validated circRNA can be used for constructing a different circRNA expression cassette [15]. The use of complementary repeat sequences for overexpression is well documented [15]. The selected 478 bp complementary sequence of a rice intron was amplified and cloned [15] both upstream and downstream of the circRNA-expressing sequence. This orientation results in complementarity of the upstream and downstream flanking introns and thereby facilitates circRNA biogenesis.

The circRNA expression cassette is integrated between the CaMV 35S promoter and the NOS terminator by replacing the GUSPlus gene present within the T-DNA of the plant binary vector pCAMBIA1305.1 [25]. Transfer of the T-DNA from Agrobacterium into plants is achieved here by agroinfection of O. sativa Indica calli. However, depending on whether transient or stable expression is desired, it can be carried out in several ways, such as agroinfiltration into leaves [26], agroinfection of seedlings [27] or callus [28], agrodrenching of the plant rhizosphere [29, 30], vacuum infiltration [31], and the floral dip method [32]. Herein, we describe the methodological concept of circRNA overexpression using plasmid constructs followed by Agrobacterium-mediated transformation for the detection of its potential function in plants. Divergent RT-qPCR and northern hybridization were performed to estimate the expression levels of circRNA in transgenic plants. This methodology facilitates the study of functions such as miRNA sponging, parental gene regulation, response to stress conditions, and roles in plant growth and development. In addition, circRNA-interacting proteins can be identified and characterized in the same transgenic plants using an RNA pull-down assay.

CircRNA Biogenesis and Functions


Overall, this chapter compiles the stepwise methodology for developing a circRNA-expressing transgenic rice plant using a well-designed strategy. The circRNA expression cassette was assembled and successfully transformed into a rice plant. The T-DNA insertion and circRNA expression can be confirmed by hybridization or PCR-based methods. This will help to understand and compare various biological functions such as regulation of parental gene expression, circRNA–RBP interaction, phenotypic changes, and effects of exposure to stress conditions.
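The cassette layout described in this Introduction (the circRNA-producing sequence placed between an upstream flank and its reverse complement downstream) can be illustrated with a short sketch. It is only a toy model under stated assumptions: the function names and the short sequences are hypothetical stand-ins for the 478 bp rice intron repeat and the circRNA sequence, and restriction sites, promoter, and terminator are left out.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_cassette(upstream_flank, circ_sequence):
    """Upstream flank + circRNA-producing sequence + inverted copy of the flank."""
    upstream_flank = upstream_flank.upper()
    return upstream_flank + circ_sequence.upper() + reverse_complement(upstream_flank)


def flanks_can_pair(cassette, flank_length):
    """True if the two ends of the cassette are reverse complements of each other."""
    return cassette[:flank_length] == reverse_complement(cassette[-flank_length:])


# Toy sequences only; real flanks would be the cloned intron repeats
cassette = build_cassette("GATTACA", "CCCGGGAAA")
print(cassette, flanks_can_pair(cassette, 7))
```

Such a check only confirms that the designed flanks are complementary in sequence. Whether they promote backsplicing in planta still has to be verified experimentally, for example by divergent RT-PCR.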

2 Materials

2.1 Buffers

1. Cetyl trimethylammonium bromide (CTAB) extraction buffer: 2.0% CTAB, 100 mM Tris (pH 8.0), 1.4 M NaCl, 20 mM EDTA, 0.3% β-mercaptoethanol, 1.0% PVP, 0.1× Tris–EDTA buffer.
2. 10× MOPS buffer (pH 7.0): 0.2 M MOPS, 0.05 M sodium acetate, 0.01 M EDTA sodium salt.
3. 25× Tris–acetate–EDTA (TAE) buffer: 1 M Tris base, 0.05 M EDTA sodium salt (pH 8.0), 0.5 M acetic acid.
4. 10× Tris–EDTA (TE) buffer: 0.01 M Tris–HCl (pH 8.0), 0.1 M EDTA (pH 8.0).
5. 1× TNE running buffer (pH 7.5): 40 mM Tris, 20 mM sodium acetate, 2 mM EDTA (adjust pH with glacial acetic acid).
6. 20× SSC buffer (pH 7.0): 3 M NaCl, 0.3 M sodium citrate.
7. Denaturation solution: 1 M NaCl, 0.5 M NaOH.
8. Neutralization solution (pH 7.0): 1.5 M NaCl, 0.5 M Tris.
9. Southern hybridization buffer: 0.5 M NaCl, 4.0% Blocking reagent (Amersham).
10. Primary wash buffer: 2 M urea, 0.1% SDS, 50 mM sodium phosphate buffer (pH 7.0), 150 mM NaCl, 10 mM MgCl2, 0.2% (w/v) Blocking reagent.
11. 20× Secondary wash buffer (pH 10.0): 1 M Tris base, 2 M NaCl.
12. Northern hybridization buffer (freshly prepared): 5× SSC, 0.1% (w/v) N-lauryl sarcosine Na salt, 0.02% SDS, 1.0% (w/v) Blocking reagent.
13. 10% Blocking solution: 10% (w/v) Blocking reagent in maleic acid buffer (65 °C).
14. 1.0% Blocking solution: Dilute 10% Blocking solution 1:10 in maleic acid buffer.
15. Washing buffer I: 2× SSC, 0.1% SDS.


16. Washing buffer II: 0.5× SSC, 0.1% SDS.
17. Maleic acid buffer (pH 7.5): 0.1 M maleic acid, 0.15 M NaCl.
18. Washing buffer: Add 300 μL of Tween 20 to 100 mL of maleic acid buffer (use cut tips).
19. Detection buffer (pH 9.5): 0.1 M Tris–HCl, 0.1 M NaCl.
20. Cell wash solution: 50 mM Tris–HCl (pH 8.0), 20 mM EDTA, 0.5 M NaCl, 0.05% Sarkosyl.
21. Lysis buffer: 1.0% SDS, 200 pg/mL proteinase K in 10× TE buffer.

2.2 Bacterial Culture Reagents

1. Chemically competent E. coli DH5α cells.
2. Luria–Bertani (LB) liquid medium (1 L, pH 7.2): 10 g tryptone, 10 g NaCl, 5 g yeast extract.
3. LB agar: LB medium + 1.5% agar.
4. Yeast extract peptone (YEP) medium (1 L): 10 g peptone, 5 g NaCl, 10 g yeast extract.
5. YEP agar: YEP medium + 1.5% agar.
6. Recipient strain: Agrobacterium tumefaciens strain LBA4404.
7. Donor strain: E. coli DH5α harboring plasmid clone-A4.
8. Helper strain: E. coli DH5α harboring plasmid pRK2013.

2.3 Plant Material and Plant Tissue Culture Reagents

1. Seeds of Oryza sativa Indica, cultivar Pusa Basmati 1.
2. 20× AB salt (100 mL): 2% (w/v) NH4Cl, 0.6% (w/v) MgSO4·7H2O, 0.3% (w/v) KCl, 0.31% (w/v) CaCl2·2H2O, 0.005% (w/v) FeSO4·7H2O.
3. 20× AB buffer (100 mL): 0.6% (w/v) K2HPO4, 2.6% (w/v) NaH2PO4·2H2O.
4. AB agar minimal medium (100 mL): 1× AB salt, 1× AB buffer, 0.5% (w/v) glucose, 2% (w/v) agar.
5. 1 M acetosyringone in DMSO.
6. Germination medium: Murashige and Skoog (MS) salts, 3.0% (w/v) sucrose, and 0.9% (w/v) agar.
7. Callus induction medium (pH 5.8): MS medium supplemented with 300 mg/L casein hydrolysate, 500 mg/L proline, 2.5 mg/L 2,4-dichlorophenoxyacetic acid, 30 g/L sucrose, 2.25 g/L Phytagel, 0.5 mg/L 6-benzylaminopurine.
8. Cocultivation medium: Callus induction medium supplemented with 10 g/L glucose, 3 g/L Phytagel, 100 μM acetosyringone.
9. Selection medium: Callus induction medium supplemented with 4 mg/L Phytagel, 50 mg/L hygromycin, and 250 mg/L cefotaxime.


10. Regeneration medium: MS medium supplemented with 3 mg/L kinetin, 1.5 mg/L naphthaleneacetic acid (NAA), 6 mg/L Phytagel, 40 mg/L hygromycin, 250 mg/L cefotaxime.

2.4 Enzymes

1. 2 units/μL TURBO DNase (Invitrogen).
2. 20 units/μL RNase R (Lucigen).
3. 5 units/μL Taq DNA polymerase (Thermo Scientific).
4. T4 DNA Ligase (New England Biolabs).
5. Appropriate restriction enzymes.
6. Lysozyme (Thermo Scientific).
7. RNase A (Thermo Scientific).

2.5 Nucleic Acid Reagents

1. 10 mM dNTPs.
2. pBluescript KS().
3. pGEMT-Easy.
4. pCAMBIA1305.1.
5. JetSeq Beads (Bioline).
6. Illumina Universal Adapter.
7. Specific index adapter and index sequence (Illumina).
8. 2× FastStart Universal SYBR Green Master (Rox) (Roche).

2.6 Chemical Reagents

1. 100% isopropanol. 2. 100%, 75%, and 70% ethanol. 3. 100% chloroform. 4. Phenol, pH 8.0. 5. 24:1 chloroform–isoamyl alcohol. 6. TRIzol. 7. 10 mg/mL ethidium bromide (EtBr). 8. Diethyl pyrocarbonate (DEPC). 9. CDP-star detection reagent (Amersham). 10. Ready-to-use Kodak Rapid Access Developer solution. 11. Ready-to-use Kodak Rapid Access Fixer solution. 12. Autoclaved DEPC-treated water. 13. Double distilled water.

2.7 Antibiotic Stocks

1. Hygromycin (Hyg): 50 mg/mL in water.
2. Ampicillin (Amp): 100 mg/mL in water.
3. Kanamycin (Kan): 50 mg/mL in water.
4. Rifampicin (Rif): 10 mg/mL in DMSO.
5. Cefotaxime (Cef): 250 mg/mL in water.

2.8 Kits

1. RiboMinus Plant kit for RNA-Seq (Thermo Scientific): 10 mg/mL RiboMinus Magnetic Beads, 15 pmol/μL RiboMinus Plant Probe, Hybridization Buffer.
2. NEBNext Ultra II Directional RNA Library Prep Kit (New England Biolabs): NEBNext Ligation Enhancer, NEBNext First Strand Synthesis Enzyme Mix, NEBNext First Strand Synthesis Reaction Buffer, Random Primers, NEBNext Second Strand Synthesis Enzyme Mix, NEBNext Second Strand Synthesis Reaction Buffer with dUTP Mix, NEBNext USER Enzyme, NEBNext Ultra II End Prep Enzyme Mix, NEBNext Ultra II End Prep Reaction Buffer, NEBNext Ultra II Ligation Master Mix, NEBNext Ultra II Q5 Master Mix, NEBNext Adaptor Dilution Buffer, 0.1× TE Buffer, nuclease-free water, NEBNext Strand Specificity Reagent.
3. RevertAid First Strand cDNA synthesis kit (Thermo Scientific): 200 units/μL RevertAid RT, 20 units/μL RiboLock RNase Inhibitor, 5× Reaction Buffer, 10 mM dNTP Mix, 100 μM Random Hexamer Primer, 100 μM Oligo(dT)18 Primer.
4. GeneJET Gel Extraction Kit (Thermo Scientific): Binding Buffer, Wash Buffer, Elution Buffer, GeneJET Purification Columns.
5. QIAprep Spin Miniprep Kit (Qiagen): QIAprep 2.0 Spin Columns, Buffer P1, Buffer P2, Buffer N3, Buffer PB, Buffer PE, Buffer EB, LyseBlue, Collection tubes.
6. Universal Genome Walker 2.0 kit (Clontech): Universal Genome Walker Components (10 units/μL DraI, 10× DraI Restriction Buffer, 10 units/μL EcoRV, 10× EcoRV Restriction Buffer, 10 units/μL PvuII, 10× PvuII Restriction Buffer, 10 units/μL StuI, 10× StuI Restriction Buffer, 0.1 μg/μL Control Human Genomic DNA, 6 units/μL T4 DNA Ligase, 10× Ligation Buffer, 25 μM Genome Walker Adaptor, 10 μM Adaptor Primer 1 (AP1), 10 μM Nested Adaptor Primer 2 (AP2), Positive Control Genome Walker Human Library, 10 μM Positive Control tPA Primer (PCP1), 10 μM Positive Control tPA Nested Primer (PCP2)).
7. NucleoSpin Gel and PCR Clean-Up Kit (Macherey-Nagel): Buffer NT1, Buffer NT3, Buffer NE, NucleoSpin Gel and PCR Clean-Up Column and Collection tubes.
8. Advantage 2 PCR Kit (Clontech): 50× Advantage 2 Polymerase Mix, 10× Advantage 2 PCR Buffer, 10 mM each dNTP Mix, 100 ng/μL Control DNA Template, 10 μM each Control Primer Mix.
9. Amersham AlkPhos direct labeling and detection system (GE Healthcare): Labeling Reagent, Cross-linker Solution, Reaction Buffer, Hybridization Buffer, Blocking reagent.

CircRNA Biogenesis and Functions

10. DIG DNA Labeling and Detection Kit (Roche): 10× Hexanucleotide Mix, 10× dNTP Labeling Mixture, 2 units/μL Klenow Enzyme, 750 units/mL Anti-DIG-AP Conjugated Antibody.
11. CDP-Star, ready-to-use (Roche).

2.9 Equipment

1. Cooling centrifuge.
2. −70 °C freezer.
3. Vacuum dryer.
4. Heating block.
5. NanoDrop.
6. Water bath.
7. Qubit fluorometer.
8. Illumina HiSeq 4000.
9. High-capacity workstation.
10. Thermal cycler.
11. Agilent 2200 TapeStation.
12. ABI 3500 DX series genetic analyzer.
13. UV cross-linker.
14. GelDoc imager.
15. Real-time PCR system.
16. Magnetic separation stand.
17. Vortex machine.
18. Incubator.
19. Shaker.
20. Hybridization oven.
21. Horizontal agarose gel electrophoresis apparatus.
22. Greenhouse.

2.10 Miscellaneous

1. Scalpel blade.
2. Sterile microcentrifuge and Falcon tubes.
3. 200 μL sterile PCR tubes.
4. 96-well real-time PCR plates.
5. Chilled mortar and pestle.
6. Glassware and plasticware.
7. Nylon membrane.
8. Whatman paper No. 1 and No. 3.
9. Crude filter papers.
10. Glass rod.

11. Plastic sheet.
12. X-ray films.
13. X-ray film cassette.
14. Clay pots and clay soil.

3 Methods

3.1 Plant Genomic DNA Isolation

1. Weigh 1 g of Oryza sativa Indica leaves and crush into a fine powder in liquid nitrogen (LN2) using a prechilled mortar and pestle.
2. Add 3–4 mL of preheated (65 °C) CTAB extraction buffer and grind again.
3. Transfer ~700 μL of extract to 1.5 mL sterile microcentrifuge tubes.
4. Incubate the tubes at 65 °C for 15–20 min.
5. Add an equal volume of prechilled chloroform–isoamyl alcohol (24:1) to each tube, mix well, and centrifuge for 10 min at 10,621 × g at 4 °C.
6. Transfer the supernatant to a fresh 1.5 mL sterile microcentrifuge tube and repeat the above step one more time.
7. Add an equal volume of isopropanol to the supernatant, mix gently, and incubate overnight at −70 °C.
8. Next day, centrifuge the tubes at 10,621 × g for 10 min at 4 °C, followed by a 70% ethanol wash for 5 min at 10,621 × g at 4 °C.
9. Vacuum dry the sample for ~20 min until the alcohol is completely removed. Suspend the pellet in 20–25 μL of 0.1× TE buffer, and measure the concentration and quality of the DNA using NanoDrop followed by 0.8% TAE-agarose gel electrophoresis.
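
The centrifugation steps in this chapter are given as relative centrifugal force (× g), whereas many benchtop centrifuges are set in rpm. The sketch below converts between the two with the standard relation RCF = 1.118 × 10^-5 × r × rpm^2 (r in cm). The 9.5 cm rotor radius is an assumption made only for illustration; use the radius stated for your own rotor. With that assumed radius, the two forces used in this chapter work out to about 10,000 and 12,000 rpm.

```python
import math

# Standard relation between relative centrifugal force (x g), rotor radius
# (cm) and rotor speed (rpm): RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rcf_to_rpm(rcf: float, radius_cm: float) -> float:
    """Return the rotor speed (rpm) that produces `rcf` (x g) at `radius_cm`."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rpm_to_rcf(rpm: float, radius_cm: float) -> float:
    """Return the relative centrifugal force (x g) at `rpm` and `radius_cm`."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


if __name__ == "__main__":
    radius = 9.5  # cm; hypothetical rotor radius, not taken from the protocol
    for rcf in (10_621, 15_294):
        print(f"{rcf} x g at {radius} cm -> {rcf_to_rpm(rcf, radius):.0f} rpm")
```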

3.2 Plant Total RNA Isolation

1. Grind 300 mg of Oryza sativa Indica leaves to a fine powder in LN2 using a prechilled mortar and pestle.
2. Add 3 mL of TRIzol (1 mL for each 100 mg of sample), mix gently, and incubate for 5 min at room temperature (RT).
3. Transfer the mix to sterile 1.5 mL microcentrifuge tubes (1 mL/tube), add 0.2 mL of chloroform to each tube, mix by inverting, and incubate for 2–3 min at RT.
4. Centrifuge at 15,294 × g for 10 min at 4 °C and transfer the aqueous phase to a fresh sterile 1.5 mL microcentrifuge tube. Add 0.5 mL of isopropanol to each tube, mix well by inverting, and incubate again at RT for 10 min.
5. Centrifuge at 15,294 × g for 10 min at 4 °C. Wash the pellet with 1 mL of 75% ethanol at 15,294 × g for 5 min at 4 °C.

6. Air-dry the RNA pellet and resuspend in ~20–40 μL of prewarmed (55 °C) sterile DEPC-treated water (see Note 1).
7. Store the total RNA at −80 °C until further use.
8. Measure the concentration and quality using NanoDrop followed by 1.5% MOPS-agarose gel electrophoresis.
9. Treat 10 μg of total RNA with 1 μL of DNase (2 units/μL) in 1× DNase buffer. Incubate the mix at 37 °C for 30 min. Inactivate the enzyme using 0.01 M EDTA (pH 8.0) by incubating at 70–75 °C for 10 min.
10. Measure the quantity using NanoDrop and assess the quality by 1.5% MOPS-agarose gel electrophoresis.

3.3 Enriching CircRNA

1. Use the RiboMinus Plant Kit for RNA-Seq, which comes with 10 mg/mL magnetic beads, 15 pmol/μL plant RNA probe, hybridization buffer, and sterile DEPC-treated water.
2. Resuspend the magnetic beads by vortexing or vigorous tapping. Take 750 μL of magnetic beads in a 1.5 mL sterile microcentrifuge tube.
3. Using the magnetic separation stand, remove the solution from the beads and wash the magnetic beads twice with sterile DEPC-treated water. Separate the beads using the magnetic separation stand. Resuspend the washed beads in 750 μL of hybridization buffer, mix gently, and split the beads into 250 μL and 500 μL volumes in sterile 1.5 mL microcentrifuge tubes.
4. Concentrate the 500 μL of magnetic beads by separating and resuspending them in 200 μL of hybridization buffer. Keep the beads in a 37 °C water bath until used.
5. Hybridize the RNA with the RiboMinus probe by mixing 10 μg of DNase-treated total RNA, 10 μL of 15 pmol/μL probe, and 100 μL of hybridization buffer. Incubate the mix at 70–75 °C in a heat block for 5 min, and cool the tube gradually to 37 °C.
6. Transfer the mix to the tube containing the 200 μL of prepared magnetic beads. Mix by gentle vortexing and incubate at 37 °C for 15 min.
7. After incubation, separate the mix for 1 min using the magnetic separation stand, transfer the supernatant to the other 250 μL of aspirated magnetic beads, and incubate at 37 °C for 15 min.
8. Separate and aspirate the supernatant using the magnetic separation stand.
9. To the supernatant, add 4 μL of 5 μg/μL glycogen, 1/10th sample volume of 3 M sodium acetate, and 2.5 sample volumes of 100% ethanol, and mix well.

10. Incubate the mix at −80 °C for 30 min followed by centrifugation for 15 min at 15,294 × g at 4 °C.
11. Wash the pellet with 70% ethanol twice and air-dry for 5 min. Dissolve the rRNA-depleted RNA in 10–30 μL of sterile DEPC-treated water. Quantify the sample using NanoDrop and preserve at −80 °C until further use (see Note 2).
12. Use 1–2 μg of RiboMinus-treated RNA for RNase R treatment using 5–10 units of RNase R in 1× RNase R buffer and incubate the mix for 20 min at 37 °C.

3.4 RNA Sequencing and Data Analysis for CircRNA Identification and Flanking Sequence Determination
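
The read-quality check described in this subheading (Phred score and GC ratio per read) can be illustrated with a few lines of Python. This is only a minimal sketch and not a replacement for Trimmomatic or FastQC: it assumes four-line FASTQ records with Phred+33 quality encoding, and the function names and the toy reads are hypothetical. The GC percentage is only reported, not used as a filter.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    record: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record.append(line)
        if len(record) == 4:
            header, seq, _plus, qual = record
            yield header[1:], seq, qual  # drop the leading "@"
            record = []


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a read."""
    sequence = sequence.upper()
    return 100.0 * (sequence.count("G") + sequence.count("C")) / len(sequence)


def summarize(lines: Iterable[str], min_phred: float = 30.0):
    """Return (read_id, mean Phred, GC %, passes Phred cutoff) per read."""
    rows = []
    for read_id, seq, qual in parse_fastq(lines):
        score = mean_phred(qual)
        rows.append((read_id, score, gc_percent(seq), score > min_phred))
    return rows


if __name__ == "__main__":
    toy_reads = ["@read1", "GGCCAATT", "+", "IIIIIIII",
                 "@read2", "ATATATGC", "+", "55555555"]
    for row in summarize(toy_reads):
        print(row)
```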

1. Using the Qubit fluorometer and NanoDrop, check the quality and integrity of the processed RNA from step 12 of Subheading 3.3.
2. Prepare the library for RNA sequencing using the Illumina-compatible NEBNext Ultra II Directional RNA Library Prep Kit.
3. Use 100 ng of RNA for fragmentation and priming, followed by first and second strand cDNA synthesis.
4. Subject the synthesized cDNA to end-repair, adenylation, and ligation with the Illumina multiplex barcode adapter. Excise the second strand using USER enzyme at 37 °C for 15 min.
5. Purify the adapter-ligated cDNA using JetSeq beads. Perform PCR with an initial denaturation at 98 °C for 30 s followed by 7 cycles of denaturation at 98 °C for 10 s, annealing and extension at 65 °C for 75 s, followed by a final extension at 65 °C for 5 min (see Note 3).
6. Purify the enriched PCR product followed by a library quality control check.
7. Quantify the library and analyze its fragment size distribution on the Agilent 2200 TapeStation, followed by sequencing on an Illumina HiSeq 4000 platform. Remove the adapter(s) and overrepresented sequence(s) from the raw sequence using Trimmomatic (licensed under GPL V3 and available at http://www.usadellab.org/cms/index.php?page=trimmomatic) on a Linux platform.
8. Check the quality of the library at different Phred scores (>30 is preferable) in order to retain the maximum number of reads of equal read length with >50% GC ratio (see Note 4).
9. Map the processed reads to the linear Oryza sativa genome database.
10. Eliminate the mapped reads and analyze the quality of the final processed unmapped reads by FastQC (see Note 5).
11. Feed the reads into any circRNA computational pipeline (CIRCexplorer [33], CIRI [34], pCircRNAfinder [35], DCC [36], CIRI2 [37], etc.) for circRNA identification, either by using default parameters or by changing the set parameters (see Note 6).

12. The linearized circRNA sequence can be mapped directly to the plant genome in order to manually scan for any already annotated species-specific repeat sequences both upstream and downstream of the intervening circRNA sequence. This approach can provide a preliminary idea about the repeat sequence responsible for circularization of the intervening sequence, but it is laborious. In this chapter, we have selected a 478 bp rice intron [15] to flank the circRNA sequence in order to facilitate its biogenesis (see Note 7).

3.5 CircRNA Validation Using Divergent PCR

Synthesizing cDNA using oligo-dT primers reverse transcribes mainly the linear mRNA molecules, thereby minimizing or excluding cDNA generated from circRNAs. Random primers, on the other hand, yield cDNA from all types of RNA molecules, including circRNAs. Oligo-dT-primed cDNA therefore serves as a negative control against random-primed cDNA template when validating circRNAs by divergent RT-PCR.

1. To validate the identified circRNAs, design a specific divergent primer pair, D1 and D2 (Fig. 1a), to amplify the backsplice junctions using cDNA templates.
2. Use 2 μg of DNase-treated RNA for single-stranded (ss) cDNA conversion using oligo-dT and random primers in separate reactions. To the RNA, add random primer or oligo-dT primer and incubate at 65 °C for 5 min, followed by addition of 1 μL of 10 mM dNTPs, 1 μL of 20 units/μL RNase inhibitor, and 1 μL of 200 units/μL reverse transcriptase in its compatible buffer.
3. Incubate the random primer sample at 25 °C for 5 min and 42 °C for 1 h, followed by inactivation at 70 °C for 5 min. Follow the same for the oligo-dT primer sample without the initial step at 25 °C for 5 min.
4. Perform divergent PCR with cDNA using 10 pM of each primer, 1 μL of 10 mM dNTPs, and 0.6 μL of 5 units/μL Taq DNA polymerase in its compatible buffer at the circRNA-specific PCR condition.
5. Check for the expected amplified product size by 1.5% TAE-agarose gel electrophoresis and further perform Sanger sequencing for circRNA confirmation (see Note 8).
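
The logic of divergent primers can be sketched in code: on the linearized circRNA sequence the forward primer lies downstream of the reverse primer, so a product forms only when the 3′ end of the sequence is joined back to its 5′ end. The example below is an illustration under simplifying assumptions (exact primer matches, a single binding site per primer); the toy sequence, primers, and function names are hypothetical and do not correspond to the circRNA or the D1/D2 primers used in this chapter.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def junction_template(circ_seq: str, flank: int = 20) -> str:
    """Sequence spanning the backsplice junction: 3' end joined to 5' end."""
    return circ_seq[-flank:] + circ_seq[:flank]


def divergent_amplicon(circ_seq: str, fwd_primer: str, rev_primer: str) -> Optional[str]:
    """Return the product expected from a circular template, or None.

    A pair is treated as divergent only if, on the linearized sequence,
    the forward primer site starts at or after the end of the reverse
    primer site; the product then has to cross the backsplice junction.
    """
    fwd_start = circ_seq.find(fwd_primer)
    rev_start = circ_seq.find(revcomp(rev_primer))
    if fwd_start < 0 or rev_start < 0:
        return None  # a primer does not bind this sequence
    rev_end = rev_start + len(rev_primer)
    if fwd_start < rev_end:
        return None  # convergent or overlapping pair, not divergent
    return circ_seq[fwd_start:] + circ_seq[:rev_end]


if __name__ == "__main__":
    circ = "ACGTTGCAAGGCTTAACCGGATCCTAGGAT"  # toy linearized circRNA
    product = divergent_amplicon(circ, "ATCCTA", "CTTGCA")
    print(product, junction_template(circ, flank=5) in product)
```

Sanger sequencing of the real product, as in step 5, remains the confirmation; the sketch only helps to predict the expected product size and to check that a primer pair actually spans the junction.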

3.6 Vector Construction

Upon identification and validation of circRNAs, their potential functions can be studied by overexpression using an appropriate expression vector. Here we describe the strategy used for circRNA biogenesis using the binary vector pCAMBIA1305.1 [25] and a partial single exonic circRNA (circ 1:4789318-4789486) identified by the MDA-NGS method [10]. To begin with, a full-length PCR-amplified circRNA sequence was cloned in a suitable vector. Subsequent subcloning of the 478 bp rice complementary intronic sequences [15] upstream and downstream of the circRNA resulted in a circRNA expression cassette. This assembled circRNA cassette was further subcloned into pCAMBIA1305.1 for overexpression in rice upon transformation (Fig. 2). We chose pBluescript KS(−) to assemble the circRNA expression cassette using appropriate restriction sites (SacI/PstI, PstI/EcoRI, EcoRI/KpnI). The cassette replaced the GUSPlus gene in the plant binary vector pCAMBIA1305.1 at the NcoI and BstEII restriction sites.

Fig. 1 Representation showing the region and strategy for designing different primers and probes. (a) CircRNA can be validated by divergent PCR using primer pair, D1 and D2 (red arrow) to detect the backsplice junction (black line) whereas its corresponding linear sequence (negative control) can be amplified by convergent PCR using primers, C1 and C2 (blue arrow). (b) Primer pair, S1–S2 is used to amplify a part of hygromycin gene (Left Border probe) and S5–S6 to amplify sequence comprising a fragment of circRNA cassette, NOS terminator and RB portion (Right Border probe). Using specific primers or any amplified product as a probe in Southern hybridization, Agrobacterium transconjugants can be confirmed. Also, intactness and copy number of the T-DNA integrated into the rice genome can be determined using specific probes (LB and RB probe) for respective T-DNA borders

3.6.1 Amplification and Cloning of the Full-Length CircRNA

1. Design the circRNA-specific forward (P1) and reverse (P2) primers flanked with PstI and EcoRI restriction sites, respectively, placing extra nucleotides at the 5′ ends to facilitate digestion of the PCR product (Fig. 2a).

Fig. 2 Representation of the strategy adopted for designing a circRNA overexpression construct (a) to develop a transgenic rice plant upon its transformation. To facilitate circRNA biogenesis, a 478 bp rice intron (b) (blue arrow) present at a different locus from that of a selected circRNA, is designated to flank upstream and downstream of the selected exonic circRNA sequence (yellow box) while assembling the circRNA-expression cassette. Primers P1 (with PstI restriction site) and P2 (with EcoRI restriction site) are designed to amplify the sequence of the selected circRNA (circRNA ID) along with additional ~200 bp sequence from both its upstream and downstream (orange box) (d). The amplified product upon digestion with PstI/EcoRI is ligated in MCS of pBluescript KS(−) (e). Primer P3 (with EcoRI restriction site) and P4 (with BstEII–KpnI restriction site) are designed to amplify the previously mentioned 478 bp intron (c) so as to subclone it downstream of circRNA sequence in pBluescript KS(−) (e). Similarly, primers P5 (with SacI–NcoI restriction sites) and P6 (with PstI restriction site) are used to PCR amplify the same 478 bp rice intron (blue arrow) but to subclone it upstream of circRNA sequence in pBluescript KS(−) (e). The entire circRNA expression cassette so formed in pBluescript KS(−) (e) is digested by NcoI and BstEII and ligated at respective sites in pCAMBIA1305.1 to get the circRNA overexpression construct (f) which will be used for rice transformation by Agrobacterium-mediated transformation

2. Using genomic DNA (Oryza sativa Indica leaf sample) as a template, perform PCR to amplify the full length of the selected circRNA (circ 1:4789318-4789486) with an additional ~200 bp of flanking sequence upstream and downstream so as to retain the splice site (Fig. 2d) [39, 40].
3. Digest pBluescript KS(−) and the PCR product amplified with P1-P2 primers with PstI and EcoRI for 3 h at 37 °C (see Note 9).
4. Resolve the digested products by 1.0% TAE-agarose gel electrophoresis and elute the digested pBluescript KS(−) vector and the PCR product using a gel elution kit.
5. Ligate the digested PCR product with linearized pBluescript KS(−) using T4 DNA ligase in 1× T4 ligase buffer at 16 °C overnight (Fig. 2e).
6. Transform the ligated product into 30 μL of competent E. coli (DH5α) and plate on LBAmp/X-gal agar plates for blue–white screening. Culture a transformed white colony in LBAmp broth and incubate at 37 °C overnight at 180 rpm.
7. Isolate plasmid from exponentially growing cells using a miniprep kit (see Note 10).
8. Confirm the clone by digesting the isolated plasmids with PstI/EcoRI and resolving on a 1.0% TAE-agarose gel. Further, perform Sanger sequencing to confirm the sequence. This recombinant construct is termed clone-A1.

3.6.2 Cloning of the Flanking Intron Downstream of the CircRNA in Clone-A1

1. Design the forward primer P3 with an EcoRI restriction site and the reverse primer P4 with KpnI followed by BstEII restriction sites to amplify the 478 bp rice intron sequence from the genomic DNA (Fig. 2c).
2. Digest the amplified 478 bp fragment and clone-A1 with EcoRI and KpnI for 3 h at 37 °C (see Note 9).
3. Resolve the digested clone-A1 and the PCR-amplified 478 bp fragment on a 1.0% TAE-agarose gel, followed by gel elution.
4. Ligate the digested products using T4 DNA ligase in 1× ligase buffer overnight at 16 °C (Fig. 2e).
5. Transform the ligated product and screen the colonies for positive clones as described in steps 6–8 of Subheading 3.6.1.
6. The resulting recombinant construct is termed clone-A2.

3.6.3 Cloning of the Flanking Intron Upstream of the CircRNA in Clone-A2

1. As described in Subheading 3.6.2, PCR amplify the 478 bp rice intron sequence using the forward primer P5, carrying SacI followed by NcoI restriction sites, and the reverse primer P6, carrying a PstI restriction site (Fig. 2b) (see Note 9).
2. Digest the amplified 478 bp fragment and clone-A2 with SacI and PstI for 3 h at 37 °C.

3. Resolve the digested products on a 1.0% TAE-agarose gel, followed by gel elution.
4. Ligate the digested clone-A2 and the PCR-amplified 478 bp fragment using T4 DNA ligase in 1× ligase buffer overnight at 16 °C (Fig. 2e).
5. Transform the ligated product and screen the positive clones as described in steps 6–8 of Subheading 3.6.1. The resulting recombinant clone is termed clone-A3.

3.6.4 Cloning the CircRNA Expression Cassette into pCAMBIA1305.1

1. To replace the GUSPlus gene present in pCAMBIA1305.1 with the assembled circRNA-expression cassette, digest both pCAMBIA1305.1 and clone-A3 using NcoI/BstEII.
2. Resolve the digested products on a 1.0% TAE-agarose gel followed by gel elution of the expected fragments.
3. Ligate the eluted cassette into the linearized pCAMBIA1305.1 and incubate overnight at 16 °C (Fig. 2f), followed by transformation and plating on LBKan agar plates.
4. Screen for the positive clones as described earlier and confirm the clones by restriction digestion and PCR amplification followed by Sanger sequencing.
5. The resulting recombinant construct is termed clone-A4, which will be used as the donor strain during triparental mating.
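
Because the cassette is assembled and transferred with SacI, NcoI, PstI, EcoRI, KpnI, and BstEII, any of these sites occurring inside the circRNA fragment or the intron would cut the insert during cloning. The sketch below scans a sequence for the recognition sequences of these six enzymes. It is a minimal illustration: the function names and the toy sequence are hypothetical, and only one strand is scanned, which is sufficient here because all six sites are palindromic.

```python
import re
from typing import Dict, List

# Recognition sequences of the enzymes used to build the cassette
# (N = any base). All six sites are palindromic.
CLONING_SITES: Dict[str, str] = {
    "SacI": "GAGCTC",
    "NcoI": "CCATGG",
    "PstI": "CTGCAG",
    "EcoRI": "GAATTC",
    "KpnI": "GGTACC",
    "BstEII": "GGTNACC",
}


def find_sites(seq: str, enzymes: Dict[str, str] = CLONING_SITES) -> Dict[str, List[int]]:
    """Return 0-based start positions of each recognition site in `seq`."""
    seq = seq.upper()
    hits: Dict[str, List[int]] = {}
    for name, site in enzymes.items():
        # A lookahead pattern reports overlapping matches as well.
        pattern = re.compile("(?=" + site.replace("N", "[ACGT]") + ")")
        hits[name] = [match.start() for match in pattern.finditer(seq)]
    return hits


def internal_cutters(seq: str) -> List[str]:
    """Names of the cloning enzymes that would cut inside `seq`."""
    return sorted(name for name, positions in find_sites(seq).items() if positions)


if __name__ == "__main__":
    toy_insert = "AAACTGCAGTTTGGTCACCAAA"  # hypothetical sequence
    print(find_sites(toy_insert))
    print(internal_cutters(toy_insert))
```

Run on the primer-free insert sequence, an empty list from `internal_cutters` suggests that the chosen enzyme combination will not fragment the insert.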

3.7 Triparental Mating

1. Plate-1: Streak one positive E. coli DH5α transformant of clone-A4 (donor strain) on an LBKan agar plate and incubate at 37 °C overnight.
2. Plate-2: Streak E. coli pRK2013 (helper strain) on an LBKan agar plate and incubate at 37 °C overnight.
3. Plate-3: Streak A. tumefaciens strain LBA4404 (recipient strain) on an ABRif agar plate and incubate at 28 °C for 2–5 days.
4. Plate-4: Pick a single colony each from Plate-1, Plate-2, and Plate-3, and patch them very close to each other on a fresh YEP plate (without antibiotics). Mix all the patches together using a sterile loop and incubate the plate at 28 °C for 18 h.
5. Scrape a scoop of the grown bacteria and resuspend in 1 mL of 0.8% saline for the subsequent serial dilution.
6. Make a ten-fold dilution of 100 μL of culture by adding 0.9 mL of saline, followed by six more ten-fold dilutions. Spread-plate 100 μL of each dilution on ABKan+Rif agar plates and incubate for 2–3 days at 28 °C.
7. Check the plates for transconjugants. For further confirmation, select a single colony and patch it on AB/YEPKan+Rif agar plates. Here, the positive A. tumefaciens LBA4404 strain harboring the clone-A4 plasmid is called the transconjugant.
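
The dilution series in step 6 (100 μL carried into 0.9 mL of saline, seven times in total, with 100 μL plated from each tube) can be written out as a short calculation. This is an illustrative sketch only; the function names and the colony count in the example are hypothetical.

```python
from typing import List


def dilution_series(steps: int = 7, transfer_ul: float = 100.0,
                    diluent_ul: float = 900.0) -> List[float]:
    """Cumulative dilution of each tube relative to the starting suspension."""
    per_step = transfer_ul / (transfer_ul + diluent_ul)
    return [per_step ** n for n in range(1, steps + 1)]


def cfu_per_ml(colonies: int, dilution: float, plated_ul: float = 100.0) -> float:
    """Estimate colony-forming units per mL of the undiluted suspension."""
    return colonies / (dilution * plated_ul / 1000.0)


if __name__ == "__main__":
    for tube, dilution in enumerate(dilution_series(), start=1):
        print(f"tube {tube}: {dilution:.0e}")
    # Hypothetical count: 42 colonies on the plate from the fifth tube
    print(f"{cfu_per_ml(42, dilution_series()[4]):.2e} CFU/mL")
```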

3.8 Agrobacterium DNA Isolation [41]

1. Take a patch from the Agrobacterium transconjugants (LBA4404 harboring clone-A4) and inoculate it in 4 mL of YEPKan medium. Incubate the culture at 28 °C for 18 h at 250 rpm in a slanting position (see Note 11).
2. Pellet the culture in a sterile 1.5 mL microcentrifuge tube by spinning at 15,294 × g for 5 min. Discard the supernatant, and repeat the step to pellet the remaining culture in the same tube.
3. Wash the pellet with 700 μL of cell wash solution.
4. Resuspend the cells in 100 μL of 10× TE buffer and add 20 μL of 50 mg/mL lysozyme solution and 2 μL of 20 mg/mL RNase A, followed by incubation for 15–30 min at 37 °C.
5. Add 450 μL of lysis buffer and mix gently by inverting the tube. Incubate for 1 h at 50 °C.
6. Add 500 μL of phenol (pH 8.0) to the tube and mix thoroughly by inverting repeatedly for 3–5 min. Extract the solution by centrifuging at 15,294 × g for 5 min at RT (see Note 12).
7. Transfer the upper aqueous phase to a sterile 1.5 mL microcentrifuge tube and add an equal volume of a 1:1 mixture of phenol and chloroform–isoamyl alcohol (24:1). Mix well by inverting the tube, and centrifuge for 5 min at 15,294 × g at RT (see Note 13).
8. Transfer the aqueous phase to a sterile 1.5 mL microcentrifuge tube (see Note 14). Mix gently, centrifuge for 5 min, and remove the upper layer to a clean tube.
9. Add 2 volumes of prechilled absolute ethanol or isopropanol to the tube. Mix well and incubate for 1 h at −70 °C (see Note 15).
10. Centrifuge at 15,294 × g for 10 min at 4 °C and discard the supernatant.
11. Wash the pellet with 70% ethanol.
12. Air-dry the pellet at RT for ~20 min or until the alcohol smell vanishes completely. Resuspend the DNA in 25–100 μL of 1× TE buffer.
13. Quantify the isolated DNA using NanoDrop, followed by running on a 1.0% TAE-agarose gel.

3.9 Confirmation of Transconjugants

Confirming that the Agrobacterium transconjugants carry the modified T-DNA (with the sequence of interest) in the binary vector is an important step to ensure successful plant transformation. It can be done either by PCR or by Southern hybridization.

1. Use primer pairs S1-S2 and S5-S6 to confirm the presence of intact T-DNA by amplifying the Left Border (LB) (hygromycin gene) and Right Border (RB) (sequence spanning a portion of the downstream 478 bp intron, NOS terminator, and T-DNA RB) regions, respectively, from isolated DNA of Agrobacterium transconjugants (Fig. 1b).
2. Use the P5-P4 primer pair to amplify the complete assembled circRNA-expression cassette from isolated DNA of Agrobacterium transconjugants (Fig. 2b).
3. Resolve the PCR products on a 1.0% TAE-agarose gel. The confirmed transconjugant was used for plant transformation to generate clone-A4 transgenic plants overexpressing circRNA (see Note 16).

3.10 Transformation into Plants

The successful Agrobacterium transconjugant obtained in Subheading 3.7 is further transformed into Oryza sativa Indica (Pusa Basmati 1) seed-derived calli using agroinfection [42, 28].

1. Dehusk the seeds and sterilize with 70% ethanol for 2 min followed by washing with sterile water.
2. Wash the seeds with 4.0% sodium hypochlorite containing a single drop of Tween 20 for 15 min followed by washing with sterile water (see Note 17).
3. Wash the seeds with 0.1% mercuric chloride (HgCl2) followed by washing with sterile water five times each for 15 min.
4. Grow the sterilized seeds on callus induction medium for 21 days in the dark at 25 °C (see Note 18).
5. Subculture the scutellum-derived calli again on callus induction medium for 4 days in the dark at 25 °C. Divide the scutellum-derived calli into two batches; process one batch for agroinfection with the clone-A4 bacterial suspension (explained in the following steps) and keep the other without any infection as a negative control.
6. Inoculate the successfully confirmed LBA4404 clone-A4 transformant in 100 mL of ABKan+Hyg medium and grow at 28 °C overnight at 220 rpm until the optical density (OD600) reaches 1.
7. Centrifuge the culture at 10,621 × g for 10 min. Add 100 mL of AB medium supplemented with 50–100 Mm acetosyringone to prepare the LBA4404 clone-A4 bacterial suspension for agroinfection.
8. Soak one batch of the induced calli in the LBA4404 clone-A4 transconjugant suspension for 15 min by gently swirling the flask. LBA4404 alone or untransformed calli can be used as a control.
9. Dry the infected calli and keep them on a separate Whatman No. 1 paper placed on co-cultivation medium for 3 days in the dark.

10. Wash the calli three times with liquid selection medium supplemented with 50 mg/L hygromycin and 250 mg/L cefotaxime.
11. Place the agroinfected and uninfected calli on solid selection medium supplemented with 50 mg/L hygromycin and 250 mg/L cefotaxime and incubate for 14 days in the dark (see Note 19).
12. Subculture the transformed calli on selection medium with 50 mg/L hygromycin and 250 mg/L cefotaxime three more times, each cycle lasting 21 days in the dark.
13. Transfer the selected calli to regeneration medium and incubate for 14 days in the dark.
14. Subculture the calli on regeneration medium and incubate for 7 days in light for shoots to appear.
15. Subculture the calli with developed shoots on regeneration medium for root development twice, each time for 14 days in light.
16. Transfer all the successful plantlets to clay soil for acclimatization in a plant growth chamber (28–30 °C, 75–85% humidity) for 2 weeks.
17. Shift all the acclimatized plantlets to the greenhouse and maintain at 30 °C.
18. Newly grown transgenic plants can be confirmed by molecular analysis for the copy number, intactness, and site of T-DNA integration into the rice genome.

3.11 Confirmation of Intact T-DNA Integration by Southern Hybridization

Southern hybridization is employed to confirm the integrity of the T-DNA integrated into the rice genome upon transformation. Copy number and intactness of the integrated transgene can be confirmed by choosing a unique enzyme site in the T-DNA followed by probing the LB and RB using specific probes.

DNA probe preparation: A hygromycin gene-specific probe (PCR amplified using primers S1-S2 from the clone-A4 plasmid) can be used for confirming the T-DNA LB (Fig. 1b) (see Note 20). The T-DNA RB-specific probe is generated with primers S5-S6 by PCR amplifying sequences partially spanning the downstream rice intron together with the NOS terminator and T-DNA RB sequences (Fig. 1b) (see Note 21).

1. Dilute 100 ng of each PCR-amplified product from above to a final concentration of 10 ng/μL and denature the dsDNA for each probe in boiling water for 5 min.
2. Prepare the labeling mix by adding 10 μL of reaction buffer, 2 μL of labeling reagent, and 10 μL of cross-linker working solution to the ssDNA of each probe. Spin the mix briefly and incubate at 37 °C for ~30 min.

3. Finally, use the labeled ssDNA probes for hybridization with the template DNA (see Note 22).
4. Isolate genomic DNA from control and transgenic rice plants as described earlier in Subheading 3.1.
5. Digest the isolated rice genomic DNA with NcoI (see Note 23) and resolve on a long 1.0% TNE-agarose gel at 50 V until the tracking dye (Bromophenol Blue, BPB) reaches the bottom (see Note 24). Image the gel, then wash it in denaturation solution for 30–45 min on a rocker at slow speed.
6. Drain the denaturation solution and rinse the gel with sterile water. Repeat the washing three more times, then immerse the gel in neutralization solution for 45 min on a rocker.
7. Again, rinse the gel in sterile water.
8. Place the gel casting tray upside down as a platform in a glass tray containing 20× SSC buffer. Wrap the platform with a wet (dipped in 20× SSC) Whatman No. 3 paper in such a way that its ends touch the solution.
9. Remove air bubbles by gently rolling a glass rod over the platform (see Note 25).
10. Place the gel upside down on the platform. Remove air bubbles, if any (see Note 26).
11. Cut the nylon membrane to the size of the gel and dip it in sterile water followed by 20× SSC. Place it over the gel and remove air bubbles, if any (see Note 27). Similarly, place two wet Whatman No. 3 papers over the membrane followed by a dry Whatman No. 3 paper.
12. Place crude filter papers of the same size on top of the Whatman No. 3 papers up to a height of 8–10 cm. Place ~1 kg of weight on top and leave overnight (see Note 28).
13. Next day, carefully remove the membrane, wash with 2× SSC, dry on Whatman No. 1 paper, and cross-link the DNA to the membrane inside a UV cross-linker for 1.5 min with the sample facing upward (see Note 29).
14. Place the membrane in a hybridization bottle and incubate with prehybridization solution for 30 min at 65 °C in a hybridization oven (see Note 30).
15. Discard the prehybridization solution. Mix the denatured labeled probe (explained previously) with the hybridization solution and add it to the center of the hybridization bottle containing the blot. Carry out hybridization at 65 °C for 12–18 h (see Note 31).
16. Discard the hybridization buffer containing the probe and wash the blot twice with primary wash buffer at 65 °C for 10 min each.

17. Discard the primary wash buffer and wash with secondary wash buffer at RT for 10 min by slowly rolling the tube manually. Repeat the process one more time (see Note 32).
18. Discard the secondary wash buffer and blot the membrane on Whatman No. 1 paper.
19. Place the blot on a polythene sheet/Saran wrap. Add CDP-Star detection reagent drop by drop to completely cover the blot surface without any bubbles and wrap it. Place the blot inside the cassette. Expose the blot to an X-ray film (see Note 33).
20. Remove the X-ray film from the cassette by holding it at one corner. Develop the X-ray film in the dark by washing for 2 min in developer and 1 min in water, followed by 2 min in fixer solution. Finally, rinse extensively under running tap water and observe under normal light (see Note 34).

3.11.1 Interpretation

A single band should appear with both probes for transgenic homozygous plants, confirming the integration of a single, complete T-DNA into the plant. In practice, however, the presence of circRNA sequence in the LB or RB probe may also give a signal with the endogenous circRNA sequence present in both transformed and untransformed plants. Therefore, if needed, the intensity and length of the signals from the transformed and untransformed control plants can be compared in order to determine the efficacy of transformation. The presence of multiple bands with either or both probes corresponds to the number of independent T-DNA integrations (copy number) in the genome. A blot showing an unequal number of bands between the two probes indicates the possibility of both complete and incomplete T-DNA integration, which requires further investigation.
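
The interpretation rules above can be summarized as a small decision function. This is a sketch of the reasoning only: it assumes that the band counts passed in are the transgene-specific bands, that is, bands absent from the untransformed control lane, and the function name and return strings are hypothetical.

```python
def interpret_southern(lb_bands: int, rb_bands: int) -> str:
    """Interpret transgene-specific band counts from the LB and RB probes.

    Both counts should exclude bands that also appear in the
    untransformed control lane (for example, endogenous signal).
    """
    if lb_bands < 0 or rb_bands < 0:
        raise ValueError("band counts cannot be negative")
    if lb_bands == 0 and rb_bands == 0:
        return "no transgene-specific signal"
    if lb_bands != rb_bands:
        return "unequal bands: possible incomplete T-DNA integration"
    if lb_bands == 1:
        return "single complete T-DNA copy"
    return f"{lb_bands} independent T-DNA integrations"


if __name__ == "__main__":
    for lb, rb in [(1, 1), (3, 3), (2, 1), (0, 0)]:
        print(lb, rb, "->", interpret_southern(lb, rb))
```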

3.12 Determination of T-DNA Integration Site in Plant Genome

T-DNA integration into the genome occurs randomly by illegitimate recombination and is a complex process known to cause localized disturbances at the target sites. It is therefore important to localize the site of T-DNA integration upon transformation, as it may disrupt a functional gene. Although several methods for determining the site of T-DNA integration have been reported elsewhere, we followed the stepwise Genome Walking methodology depicted in Fig. 3a, using the Universal Genome Walker 2.0 Kit.

3.12.1 Construction of Genome Walker Library

1. Isolate genomic DNA from transgenic and nontransgenic plants and estimate the concentration using NanoDrop. Dilute the sample to 0.1 μg/μL to proceed further. Perform the following steps simultaneously for both transgenic and nontransgenic plants (see Note 35).
2. Label four 1.5 mL sterile microcentrifuge tubes as DL-1, DL-2, DL-3, and DL-4 for DNA library (DL) preparation by blunt-end restriction digestion of genomic DNA with DraI, EcoRV, PvuII, and StuI, respectively.

Fig. 3 Representation of the strategy adopted for locating the T-DNA integration into plant genome. (a) Flowchart highlighting the steps involved in finding the location of T-DNA integration in plant genome by Genome walking. (b) T-DNA LB integration in plant chromosome can be found by performing primary PCR with outer Adapter Primer 1 (AP1) (provided with the kit) and LB-GSP1 primer followed by secondary PCR with nested Adapter Primer 2 (AP2) (provided with the kit) and LB-GSP2 primer. The same methodology can be followed for T-DNA RB integration with RB-GSP1 and RB-GSP2 primers

3. Separately in a reaction of 100μL, digest 2.5μg of 0.1μg/μL diluted genomic DNA with 80 units of each 10 units/μL DraI, EcoRV, PvuII, and StuI enzymes in compatible 1 buffer. 4. Mix the tubes gently and incubate for 2 h at 37  C. Confirm the completion of digestion by resolving 5μL from the digested reaction mix along with 0.5μL of undigested diluted genomic DNA on 0.6% 1 TAE-agarose gel (see Note 36). 5. Purify the remaining ~90μL reaction from each tube using the NucleoSpinGel and PCR Clean-Up kit by following the manufacturers instruction carefully. Mix ~90μL of reaction with


Priyanka Sharma et al.

200 μL of Buffer NT1 and transfer the mix into a column placed inside a collection tube. Centrifuge for 30 s at 11,000 × g and discard the flow-through. Wash the membrane-bound DNA with 700 μL of Buffer NT3 by centrifuging for 30 s at 11,000 × g, and again discard the flow-through. Spin once more to ensure complete removal of the wash buffer. Elute the purified DNA from the column using 20 μL of Buffer NE. Estimate the concentration using a NanoDrop and confirm the quality by resolving 50–100 ng of both the eluted DNA and the control (reaction before elution) on a 0.6% 1× TAE-agarose gel.
6. Transfer 4.8 μL of the digested, purified DNA from each tube to a fresh sterile 0.5 mL microcentrifuge tube and ligate it with 1.9 μL of 25 μM Genome Walker adaptor using 0.5 μL of 6 units/μL T4 DNA ligase in 1× ligation buffer in a total volume of 8 μL. Incubate each reaction at 16 °C overnight, then inactivate the enzyme at 70 °C for 5 min (see Note 37).
7. Dilute the ligated mix by adding 32 μL of 1× TE buffer. Mix gently by vortexing for 10–15 s (see Note 38).

3.12.2 Designing the Gene Specific Primers

1. Two sets of gene specific primers are required to analyze separately the junction points of the T-DNA LB and RB integration into the plant genome (see Note 39). To trace the LB insertion site, design two hygromycin gene specific reverse primers (GSP), one for primary PCR (LB-GSP1) and another for secondary/nested PCR (LB-GSP2) (Fig. 3b).
2. Similarly, to trace the RB insertion site, design two reverse primers (RB-GSP1 for primary PCR and RB-GSP2 for nested PCR) from the junction of the circRNA sequence and the downstream intron sequence of the assembled circRNA expression cassette (see Note 40).
3. Genome walking is then performed by primary and secondary (nested) PCRs with both sets of primers to identify the site of T-DNA integration in the plant genome.
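The design limits given in Note 40 (26–30 nt length, 40–60% GC content) can be checked quickly before ordering primers. The following is only a minimal sketch under those stated limits: the function names and the example sequences are hypothetical placeholders, not the primers used in this chapter, and the check does not replace a proper evaluation of melting temperature or secondary structure.

```python
from typing import List


def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def check_gsp(seq: str, min_len: int = 26, max_len: int = 30,
              min_gc: float = 40.0, max_gc: float = 60.0) -> List[str]:
    """Return a list of problems; an empty list means the primer passes.

    Default limits follow Note 40 (26-30 nt, 40-60% GC).
    """
    problems = []
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        problems.append("contains characters other than A, C, G, T")
    if not min_len <= len(seq) <= max_len:
        problems.append(f"length {len(seq)} nt is outside {min_len}-{max_len} nt")
    gc = gc_content(seq)
    if not min_gc <= gc <= max_gc:
        problems.append(f"GC content {gc:.1f}% is outside {min_gc:.0f}-{max_gc:.0f}%")
    return problems


# Hypothetical placeholder sequences, for illustration only.
example_ok = "ATGCATGCATGCATGCATGCATGCATGC"   # 28 nt, 50% GC
example_bad = "ATATATATAT"                    # 10 nt, 0% GC

print(check_gsp(example_ok))
print(check_gsp(example_bad))
```

Running the check on both GSP1 and GSP2 for each border keeps the four primers within the range recommended for annealing and extension at 67 °C.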

3.12.3 Primary PCR

1. Label four sterile 0.2 mL PCR tubes as LB-DL1, LB-DL2, LB-DL3, and LB-DL4 for primary PCR with primer LB-GSP1 and outer adapter primer 1 (AP1) on the respective DNA libraries. Similarly, label four sterile 0.2 mL PCR tubes as RB-DL1, RB-DL2, RB-DL3, and RB-DL4 for primary PCR with primers RB-GSP1 and AP1.
2. Prepare the PCR mix using 1 μL of template from each DNA library prepared in Subheading 3.12.1. To this, add 0.5 μL of 50× Advantage 2 polymerase mix, 0.5 μL of 10 mM dNTPs, and 0.5 μL of each respective 10 μM primer in 1× Advantage 2 PCR buffer (see Note 41).


3. For the positive control, perform the PCR using primers AP1 and PCP1 with the Preconstructed Human Control Library as template.
4. Perform PCR using the following conditions: 7 cycles of 94 °C for 25 s and 72 °C for 3 min, followed by 32 cycles of 94 °C for 25 s and combined annealing and extension at 67 °C for 3 min, followed by a final incubation at 67 °C for 7 min.
5. Upon completion of the PCR cycles, resolve the reaction mix on a 1.0% TAE-agarose gel.
6. Analyze the product from each PCR reaction and proceed to secondary PCR if any band(s) or smear is obtained (see Note 42).

3.12.4 Secondary/Nested PCR

1. Dilute 1 μL of each primary PCR product by adding 49 μL of sterile water.
2. Label sterile 0.2 mL PCR tubes for the secondary PCR reactions.
3. Use 1 μL of each diluted primary PCR product as template and add 0.5 μL of 50× Advantage 2 polymerase mix, 0.5 μL of 10 mM dNTPs, and 0.5 μL of each respective 10 μM primer (for LB tubes add nested adaptor primer 2 (AP2) and LB-GSP2, and for RB tubes add AP2 and RB-GSP2) in 1× Advantage 2 PCR buffer. For the positive control, set up the PCR reaction using primers AP2 and PCP2 with the Preconstructed Human Control Library as template.
4. Perform PCR using the following conditions: 5 cycles of 94 °C for 25 s and 72 °C for 3 min, followed by 20 cycles of 94 °C for 25 s and combined annealing and extension at 67 °C for 3 min, and a final incubation at 67 °C for 7 min. Upon completion of the PCR cycles, resolve the reaction mix on a 1.0% TAE-agarose gel.
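The primary and nested programs share the same two-step structure and differ only in cycle numbers, which makes them easy to write down as data before entering them into a thermal cycler. The sketch below is illustrative only: the function and variable names are hypothetical, and the computed time is the sum of the programmed hold times, so the real run will be longer once ramping is included.

```python
# Each step is (temperature in deg C, hold time in seconds).
DENATURE = (94, 25)


def two_step_program(high_cycles, low_cycles, high_temp=72, low_temp=67,
                     step_s=180, final_s=420):
    """Build the program as a list of (cycles, [steps]) stages.

    Stage 1: denature + 72 deg C annealing/extension (high stringency).
    Stage 2: denature + 67 deg C annealing/extension.
    Stage 3: one final incubation at 67 deg C.
    """
    return [
        (high_cycles, [DENATURE, (high_temp, step_s)]),
        (low_cycles, [DENATURE, (low_temp, step_s)]),
        (1, [(low_temp, final_s)]),
    ]


def hold_time_minutes(program):
    """Sum of programmed hold times in minutes (ramp times not included)."""
    total_s = sum(cycles * sum(sec for _, sec in steps)
                  for cycles, steps in program)
    return total_s / 60


PRIMARY = two_step_program(7, 32)     # Subheading 3.12.3, step 4
SECONDARY = two_step_program(5, 20)   # Subheading 3.12.4, step 4

print(f"Primary PCR hold time:   {hold_time_minutes(PRIMARY):.1f} min")
print(f"Secondary PCR hold time: {hold_time_minutes(SECONDARY):.1f} min")
```

With the cycle numbers given above, the programmed hold times add up to roughly 140 min for the primary PCR and 92 min for the nested PCR, which helps when scheduling both rounds in one day.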

3.12.5 Cloning the Nested PCR Product and Sequencing

Perform TA cloning of the nested PCR products in the pGEM-T Easy vector, screen the colonies by plasmid isolation, and confirm the inserts by Sanger sequencing using the universal pGEM-T Easy primers (see Note 43).

3.13 Expression Analysis of CircRNA in Transgenic Plants

1. Isolate total RNA from leaves of transgenic and control plants followed by cDNA conversion (see Note 44).

3.13.1 Semiquantitative RT-PCR

2. Using divergent primers D1 and D2 (Fig. 1a), validate the circRNA expression by semiquantitative RT-PCR.
3. Resolve the PCR products on a 1.5% TAE-agarose gel to check for the expected fragment size, then confirm the sequence by Sanger sequencing.
4. As a preliminary analysis, measure the intensity of the amplified product by densitometry to compare the expression level in transgenic plants with that in the control.


3.13.2 Quantitative Reverse Transcription PCR (RT-qPCR)

1. Design divergent primers for quantitative reverse transcription PCR (RT-qPCR). Use 100 ng of cDNA template with gene specific primers and SYBR Green master mix to analyze the expression level of the circRNA, normalized to rice actin (see Note 45).
2. Carry out the reactions in triplicate with the following PCR conditions: 50 °C for 2 min (preheating), 95 °C for 10 min (initial denaturation), then 40 cycles of 95 °C for 10 s and 60 °C for 1 min. Set the dissociation reaction from 60 °C onward (see Note 46).
3. Determine the mean Ct value of each sample and calculate the expression of the circRNA by the 2^−ΔΔCt method [43].
4. The level of circRNA expression is expected to be higher in transgenic plants than in control plants, consistent with the semiquantitative RT-PCR result.
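The 2^−ΔΔCt calculation in step 3 can be written out explicitly. In this method, ΔCt = Ct(circRNA) − Ct(actin) is computed for each plant, ΔΔCt = ΔCt(transgenic) − ΔCt(control), and the relative expression is 2^−ΔΔCt, which assumes close to 100% amplification efficiency for both primer pairs. The sketch below uses hypothetical Ct values purely for illustration; the function name is likewise an assumption.

```python
from statistics import mean


def fold_change(target_ct_sample, ref_ct_sample,
                target_ct_control, ref_ct_control):
    """Relative expression by the 2^-ddCt method.

    Each argument is a list of replicate Ct values (e.g. triplicates).
    target = circRNA (divergent primers), ref = rice actin.
    """
    d_ct_sample = mean(target_ct_sample) - mean(ref_ct_sample)
    d_ct_control = mean(target_ct_control) - mean(ref_ct_control)
    dd_ct = d_ct_sample - d_ct_control
    return 2 ** (-dd_ct)


# Hypothetical triplicate Ct values, for illustration only.
circ_transgenic = [22.0, 22.5, 21.5]   # mean 22.0
actin_transgenic = [18.0, 18.0, 18.0]  # mean 18.0
circ_control = [25.0, 25.5, 24.5]      # mean 25.0
actin_control = [18.0, 18.0, 18.0]     # mean 18.0

fc = fold_change(circ_transgenic, actin_transgenic,
                 circ_control, actin_control)
print(f"circRNA fold change (transgenic vs control): {fc:.1f}")
```

Here ΔCt is 4 in the transgenic plant and 7 in the control, so ΔΔCt = −3 and the circRNA level in the transgenic plant is 2^3 = 8-fold that of the control.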

3.13.3 Northern Hybridization

1. Generate a probe complementary to the backsplice junction of the circRNA using primers D1 and D2 (Fig. 1a) to analyze its expression level by northern blot hybridization.
2. Resolve the PCR product on a 1.0% TAE-agarose gel, elute the expected fragment, and estimate its concentration.
3. Bring 100 ng of eluted PCR product to 15 μL with RNase-free water, denature in a boiling water bath for 5 min, and immediately transfer to ice.
4. Add 2 μL hexanucleotide mix, 2 μL dNTP labeling mixture, and 1 μL Klenow enzyme to the denatured PCR sample. Mix gently by pipetting and spin briefly. Incubate the mix at 37 °C for 18–20 h (see Note 22).
5. Following the protocol mentioned in Subheading 3.2, isolate total RNA from transgenic (T) and nontransgenic (C) plants.
6. Split the total RNA from each plant into two tubes and subject one tube to RNase R treatment as described earlier.
7. Label the RNase R–treated tubes as TR(+) for transgenic plants and CR(+) for nontransgenic plants. Label the untreated tubes as TR(−) and CR(−) for the respective plants.
8. Denature 10 μg of RNA from each tube at 80 °C for 10 min, then place on ice immediately.
9. Resolve the total RNA in a 1.0% MOPS-agarose gel until the tracking dye (BPB) reaches the bottom (see Note 24).
10. Follow the steps described in Subheading 3.11 for denaturation, neutralization, and transfer of the RNA onto the nylon membrane, followed by UV cross-linking (see Note 47).
11. Perform prehybridization for 2 h and hybridization with the denatured probe overnight at 68 °C, following the steps described in Subheading 3.11.


12. Discard the hybridization buffer with probe and wash the blot twice with washing buffer-I at 15–25 °C for 5–15 min inside the oven.
13. Discard washing buffer-I and incubate the blot twice with washing buffer-II for 15–30 min each time at 68 °C with constant agitation.
14. Discard washing buffer-II and rinse with washing buffer for 5–10 min at RT. Incubate the membrane with 20–40 mL of 1.0% blocking solution for 1 h at 37 °C with gentle agitation.
15. Discard the blocking solution and incubate the membrane with anti-DIG-AP diluted 1:10,000 in 1.0% blocking solution for 30 min at RT (see Note 48).
16. Discard the anti-DIG-AP/blocking solution and rinse with washing buffer for 15 min at 37 °C.
17. Discard the washing buffer and equilibrate the blot in 20 mL detection buffer (40 mL/100 cm²) for 5 min at RT.
18. Remove the membrane and place it between two plastic sheets. Apply 0.5 mL per 100 cm² of CDP-STAR substrate diluted 1:100 (10 μL in 1 mL detection buffer) and incubate at RT for 5 min.
19. Tape the blot inside a cassette and follow the steps mentioned in Subheading 3.11 for developing the X-ray film. Expression is expected to be higher in clone-A4 transgenic plants than in the control plants. The RNase R–treated samples are also expected to give clearer and more specific signals than the untreated samples.

3.14 Functional Validation of Overexpressed CircRNA

Multiple functional aspects of a circRNA can be studied using circRNA-overexpressing transgenic plants. For example, circRNAs can potentially regulate gene expression (mRNA turnover) by influencing transcription. Using the transgenic plants, the correlation between the expression of a circRNA and that of its linear transcript can be studied by RT-qPCR as explained in Subheading 3.13.2. Similarly, the impact of miRNA sponging by the circRNA on gene regulation can be investigated as elaborated in Subheading 3.14.2. Moreover, the effect of circRNA overexpression on plant development and stress tolerance can be analyzed using these transgenic plants (Subheadings 3.14.3 and 3.14.4).

3.14.1 CircRNA-Mediated Transcriptional Regulation of Parental Gene

1. Study the correlation between the overexpressed circRNA and its linear transcript by performing RT-qPCR on the transgenic and control plants.
2. Design convergent primers C1-C2 specific to the mRNA from gene BGIOSGA002246 and divergent primers D1-D2 specific to the circRNA (Fig. 1a).


3. Perform RT-qPCR as described in Subheading 3.13.2 with both convergent and divergent primers, using 100 ng of cDNA from transgenic and control plants as template (in triplicate) and rice actin as the internal control.
4. Determine the expression level of the circRNA and the linear transcript in both transgenic and control plants using the 2^−ΔΔCt calculation.
5. The outcome of this study indicates whether there is any direct or indirect correlation between the circRNA and its parental gene, and thereby helps to unravel possible circRNA-mediated transcriptional regulation.

3.14.2 Predicting CircRNA–miRNA–mRNA Network

1. Submit the sequence of the selected circRNA to psRNATarget and scan for possible interacting miRNA(s) against the available list of O. sativa miRNAs [44]. For example, upon submitting the circ1:4789318-4789486 () sequence to psRNATarget, three interacting miRNAs, namely osa-miR1872, osa-miR160e-3p, and osa-miR5498, were predicted (Fig. 4a).
2. Select the potential miRNA(s) after checking the function (predicted/confirmed/annotated) in miRBase, and the binding properties and multiplicity of action in the psRNATarget output file. For example, we selected osa-miR160e-3p to predict its downstream mRNA target(s) (Fig. 4a).
3. Submit the sequence of the selected miRNA to psRNATarget and scan for its putative downstream mRNA target(s). For example, osa-miR160e-3p showed 109 downstream mRNA targets (Fig. 4b).
4. Choose a downstream mRNA target of interest after checking its function (predicted/confirmed/annotated) in a database such as Ensembl Plants.
5. Design stem-loop primers specific to the selected miRNA and convergent primers specific to the mRNA target, and perform RT-qPCR for expression studies in both transgenic and control plants.
6. Compare and correlate (negatively/positively) the expression of the circRNA, miRNA, and mRNA target in both transgenic and control plants to construct a possible circRNA–miRNA–mRNA network axis.
7. Compare the correlation with visible phenotypic changes, if any, between transgenic and control plants.
8. Narrow down the function of the selected circRNA by analyzing all the expression data, their correlation, and the phenotypic changes, if any, between transgenic and control plants.
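The comparison in step 6 amounts to asking whether each pair of transcripts changes in the same or in opposite directions across the plants analyzed. One simple way to summarize this, offered here only as a sketch, is a Pearson correlation on the RT-qPCR expression values. The expression values, the function names, and the 0.5 cut-off below are all hypothetical; with only a few plants a correlation is merely suggestive and is no evidence of a direct interaction.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two lists of equal length with >= 3 values")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant values")
    return sxy / sqrt(sxx * syy)


def direction(r, threshold=0.5):
    """Label a correlation; the threshold is an arbitrary illustration."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "none"


# Hypothetical log2 expression values (control plus three transgenic lines).
circ = [0.0, 1.0, 2.0, 3.0]
mirna = [0.0, -0.5, -1.0, -1.5]
mrna = [0.0, 0.4, 0.8, 1.2]

for name, values in (("miRNA", mirna), ("mRNA target", mrna)):
    r = pearson(circ, values)
    print(f"circRNA vs {name}: r = {r:.2f} ({direction(r)})")
```

In this made-up example the circRNA correlates negatively with the miRNA and positively with the mRNA target, the kind of pattern that would then be compared with the phenotype in steps 7 and 8.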


Fig. 4 Predicting circRNA-interacting miRNAs and their mRNA targets with psRNATarget. (a) Snapshot showing the list of miRNAs (highlighted inside the red box) that are predicted to bind to our circRNA of interest, circ1:4789318-4789486(). (b) Snapshot showing the total number of possible downstream mRNA targets (highlighted in the red box) of the circRNA-interacting miRNA osa-miR160e-3p

3.14.3 Analysis of Phenotypic Changes in Transgenic Plants

1. Before studying the phenotypic changes that result from circRNA overexpression in transgenic plants, rule out any effect of a functional gene disrupted by T-DNA integration by following the steps in Subheading 3.12.
2. Observe the transgenic plants for phenotypic changes, if any, such as flowering time, stature, leaf size and shape, root development (adventitious), panicle size, and flower morphology (whorl, orientation, etc.), and compare them with control plants at the same time point under the same conditions.

3.14.4 Role of circRNA During Stress Conditions

Stress-regulated differential expression of circRNAs has been shown in plants such as tomato, wheat, pear, and cucumber, highlighting a potential role of circRNAs under stress conditions such as chilling injury, dehydration, drought, temperature, and salinity [45–49].

1. Grow the transgenic and control plants under normal and stress conditions.


2. Perform experiments as in Subheading 3.13 to observe and compare the difference in circRNA expression between transgenic and control plants.
3. Observe any visible phenotypic changes between transgenic and control plants.

3.14.5 CircRNA Mediated RBP Sequestration

CircRNAs have recently been found to interact with RNA binding proteins (RBPs), thereby regulating various biological processes [50]. A circRNA-overexpressing transgenic plant is ideal for identifying circRNA-interacting proteins, because it expresses more copies of the circRNA, with which more protein molecules can interact. Thus, the transgenic circRNA-expressing plants are suitable for identifying and characterizing circRNA-interacting proteins by RNA pull-down assay followed by mass spectrometry or immunoblotting (if specific antibodies are available). If antibodies are available for the RBP, the interaction can also be studied by RNA immunoprecipitation (RIP) assay.

4 Notes

1. Do not dry the RNA pellet completely, as this will greatly decrease RNA solubility.
2. Upon quantification, the yield obtained will be ~1/10th of the starting material, which will be processed further for RNase R treatment. Therefore, to ensure that the required amount is available for RNase R treatment, take the starting sample in duplicate.
3. The number of PCR cycles is decided based on the number of reads required for sequence analysis. Here, we opted for ten million paired end (PE) reads.
4. The higher the Phred score threshold, the fewer the processed reads, but the better the accuracy, reliability, and stringency of the data analysis.
5. Reads that map to the genome database in linear orientation are uncertain evidence of circRNA biogenesis.
6. Depending on the software programs used, different properties of circRNAs can be determined, such as the total number identified, their length, type, and location on the chromosome, and their corresponding genes.
7. The circRNA sequence can also be uploaded into the RepeatMasker program (v.2.1; species: rice; http://www.repeatmasker.org/) [38] to check for repetitive sequences present in the flanking regions [24].


8. Sometimes even oligo-dT primed cDNA may amplify the expected band with divergent primers, but the intensity will be many fold lower than with random primed cDNA as template. Therefore, one can still choose the band and proceed further after confirming it by Sanger sequencing.
9. Alternatively, the PCR product can be cloned in a T-tailed vector (pGEM-T) and verified by sequencing. The verified fragment can be released by digestion with appropriate restriction enzymes and ligated into pBluescript KS().
10. An OD600 between 0.6 and 1.0 is preferred for plasmid isolation.
11. Agrobacterium has a tendency to aggregate while growing, thereby forming thread-like structures. This can be avoided by intense vortexing after inoculation and by keeping the tube in a slanting position.
12. Mix the cell solution and phenol thoroughly by inverting the tubes repeatedly for 3–5 min for efficient extraction of proteins and lipids. Also, wear a lab coat and perform these steps inside a fume hood, as phenol, chloroform, and their fumes are toxic.
13. The presence of chloroform and isoamyl alcohol in the phenol extraction facilitates separation of the aqueous phase, which contains the DNA.
14. Repeat the extraction until the interface between the two layers is clean, followed by a final extraction of the aqueous phase with the 24:1 mixture of chloroform–isoamyl alcohol to remove traces of phenol.
15. Precipitated DNA appears translucent under light, but a small amount of DNA can be difficult to see and to precipitate. A low concentration of DNA can be precipitated efficiently by adding 4 μL of 5 μg/μL glycogen, 2.5 μL of 3 M sodium acetate, and 2 volumes of ethanol, followed by incubation at −70 °C for a few hours and then centrifugation at high speed.
16. Agrobacterium transconjugants can also be confirmed by Southern hybridization using probes as in Fig. 1b, or by PCR with primers designed specifically for the circRNA expression cassette.
17. Silwet L-77 can also be used in place of Tween 20 as the surfactant.
18. Germinate the seeds on germination medium in a tissue culture room set to a 16 h light–8 h dark photoperiod at 25 °C for 4–5 days. Excise the endosperm from the germinated seedlings and carefully separate the scutellum from the tissues of the root, shoot, and embryonic axis.


19. Only the transformed calli will retain their friable structure; untransformed calli will turn blackish-brown and die in the presence of the antibiotics in the selection medium. Instead of cefotaxime, carbenicillin or a mixture of both can be used to suppress Agrobacterium growth in the medium. Either one should be supplemented for as long as the plants are maintained in the medium.
20. The hygromycin gene in pCAMBIA1305.1 is located near the LB; therefore, LB integration can be confirmed by targeting the hygromycin gene.
21. The 478 bp rice intron positioned downstream of the circRNA sequence in clone-A4 lies in close proximity to the T-DNA RB.
22. Start preparing the probe while the membrane is incubating for prehybridization, because the prepared probe should be used within 2 h of labeling.
23. Choose the enzyme for digestion based on the availability of restriction sites in the transgene T-DNA segment.
24. Running a long gel at slow speed gives better resolution of the bands on the gel, which facilitates a crisp transfer of the bands onto the membrane.
25. Pour extra solution over the Whatman paper using a Pasteur pipette. Make sure that there are no air bubbles while preparing the sandwich. Repeat 2–3 times whenever placing a Whatman paper, gel, or membrane on the platform.
26. Placing the gel upside down brings the bands closer to the membrane, and hence the transfer will be easier and faster.
27. Cut the nylon membrane ~1 mm larger than the gel and mark the gel-facing side of the membrane at one corner with a ballpoint pen to identify the order of the samples loaded.
28. Too thick a stack or too many layers of blotting paper can squeeze the gel and membrane excessively and thus distort the bands. Too thin a stack or too few blotting papers may result in incomplete and/or nonuniform transfer due to lack of full contact between the membrane and the gel.
29. Before discarding the gel, stain it with EtBr to check that the DNA was transferred successfully onto the membrane.
30. Do not handle the membrane directly; use forceps to hold the blot by one corner to avoid unwanted background noise.
31. Store the probe on ice and use it within 2 h of its preparation.
32. Freshly prepare the secondary wash buffer by diluting the 20× stock 1:20 and supplementing with 2 mM magnesium (2 mL/L of 1 M MgCl2).


33. Optimize the exposure time until clear bands are obtained on the X-ray film.
34. Freshly prepare the developer and fixer solutions and leave them overnight to settle before use.
35. The DNA to be used should be of high purity.
36. Save ~5 μL from this reaction to resolve on the gel in the next step to ensure proper elution.
37. Incubation in a thermal cycler provides a more constant temperature than a water bath.
38. Avoid vigorous vortexing to prevent shearing of the DNA library.
39. Generally, gene specific primers (GSP) should be derived from sequences close to the end of the known sequence. Here we designed GSPs from the hygromycin gene to determine the T-DNA LB insertion site, and from the junction of the 478 bp downstream intron and the NOS terminator to determine the T-DNA RB insertion site.
40. The secondary/nested gene specific primers (LB-GSP2, RB-GSP2) should anneal to sequences beyond the 3′ end of the primary gene specific primers (LB-GSP1, RB-GSP1). The GSPs should be 26–30 nt in length with a GC content of 40–60% to ensure effective annealing at 67 °C (the recommended annealing and extension temperature for PCR-based Genome Walking).
41. Conventional PCR with a single polymerase does not usually work in Genome Walker experiments. Thus, it is recommended to perform PCR using the 50× polymerase mix provided with the kit.
42. Even if there is no visible product on the gel, proceed to the next PCR, without diluting the PCR product from the first PCR.
43. The secondary PCR product can also be sequenced directly if its size is >150 bp.
44. The plants used as controls are grown from sterilized seeds in clay pots in a controlled greenhouse.
45. Primers D1 and D2 (Fig. 1a) can be designed so that they are suitable for both RT-PCR and RT-qPCR, or separate primers can be designed for each.
46. Observe the dissociation curve (melting curve) of each reaction to check the specificity of the primers.
47. Use sterile DEPC-treated water to prepare all the buffers required for northern blot hybridization.
48. For a 100 cm² blot, 1 μL of anti-DIG-AP was diluted in 10 mL of 1.0% blocking solution.


Acknowledgments

We acknowledge the funds granted by the Science and Engineering Research Board (SERB) (Ref. No. EEQ/2018/000067, SB/EMEQ-070/2013 to GP and EMR/2016/000945 to SN). PS is a recipient of the Lady Tata Memorial Trust (LTMT) Junior Research Scholarship (2019-2020). Financial assistance was provided to AG by the Department of Biotechnology (DBT) (Ref. No. BT/PR23641/BPA/118/309/2017, BT/PR2061/AGR/36/707/2011) and grants from BT/PR6466/COE/34/16/2012. The equipment grants from the Department of Science and Technology, Promotion of University Research and Scientific Excellence (DST-PURSE), and the University Grants Commission, Special Assistance Programme (UGC-SAP), are gratefully acknowledged.

References

1. Yuan J, Wang Z, Xing J et al (2018) Genome-wide identification and characterization of circular RNAs in the rice blast fungus Magnaporthe oryzae. Sci Rep 8(1):6757. https://doi.org/10.1038/s41598-018-25242-w
2. Shao J, Wang L, Liu X et al (2019) Identification and characterization of circular RNAs in Ganoderma lucidum. Sci Rep 9:16522. https://doi.org/10.1038/s41598-019-52932-w
3. Salzman J, Chen RE, Olsen MN et al (2013) Cell-type specific features of circular RNA expression. PLoS Genet 9:e1003777. https://doi.org/10.1371/journal.pgen.1003777
4. Yu C, Kuo H (2019) The emerging roles and functions of circular RNAs and their generation. J Biomed Sci 26:29. https://doi.org/10.1186/s12929-019-0523-z
5. Legnini I, Di Timoteo G, Rossi F et al (2017) Circ-ZNF609 is a circular RNA that can be translated and functions in myogenesis. Mol Cell 66:22–37.e9. https://doi.org/10.1016/j.molcel.2017.02.017
6. Pamudurti NR, Bartok O, Jens M et al (2017) Translation of circRNAs. Mol Cell 66:9–21.e7. https://doi.org/10.1016/j.molcel.2017.02.021
7. Hansen TB, Wiklund ED, Bramsen JB et al (2011) miRNA-dependent gene silencing involving Ago2-mediated cleavage of a circular antisense RNA. EMBO J 30(21):4414–4422. https://doi.org/10.1038/emboj.2011.359
8. Hansen TB, Jensen TI, Clausen BH et al (2013) Natural RNA circles function as efficient microRNA sponges. Nature 495(7441):384–388. https://doi.org/10.1038/nature11993

9. Memczak S, Jens M, Elefsinioti A et al (2013) Circular RNAs are a large class of animal RNAs with regulatory potency. Nature 495(7441):333–338. https://doi.org/10.1038/nature11928
10. Guria A, Kumar KVV, Srikakulum N et al (2019) Circular RNA profiling by Illumina sequencing via template-dependent multiple displacement amplification. Biomed Res Int 2019:2756516. https://doi.org/10.1155/2019/2756516
11. Li Z, Huang C, Bao C et al (2015) Exon-intron circular RNAs regulate transcription in the nucleus. Nat Struct Mol Biol 22(3):256–264. https://doi.org/10.1038/nsmb.2959
12. Conn VM, Hugouvieux V, Nayak A et al (2017) A circRNA from SEPALLATA3 regulates splicing of its cognate mRNA through R-loop formation. Nat Plants 3:17053. https://doi.org/10.1038/nplants.2017.53
13. Ashwal-Fluss R, Meyer M, Pamudurti NR et al (2014) circRNA biogenesis competes with pre-mRNA splicing. Mol Cell 56(1):55–66. https://doi.org/10.1016/j.molcel.2014.08.019
14. Guria A, Sharma P, Natesan S et al (2020) Circular RNAs-the road less traveled. Front Mol Biosci 6:146. https://doi.org/10.3389/fmolb.2019.00146
15. Lu T, Cui L, Zhou Y et al (2015) Transcriptome-wide investigation of circular RNAs in rice. RNA 21(12):2076–2087. https://doi.org/10.1261/rna.052282.115
16. Tan J, Zhou Z, Niu Y et al (2017) Identification and functional characterization of tomato CircRNAs derived from genes involved in fruit pigment accumulation. Sci Rep 7(1):8594. https://doi.org/10.1038/s41598-017-08806-0
17. Piwecka M, Glažar P, Hernandez-Miranda LR et al (2017) Loss of a mammalian circular RNA locus causes miRNA deregulation and affects brain function. Science 357(6357):eaam8526. https://doi.org/10.1126/science.aam8526
18. Sekar S, Liang WS (2019) Circular RNA expression and function in the brain. Non coding RNA Res 4:23–29. https://doi.org/10.1016/j.ncrna.2019.01.001
19. Yang Y, Liu S, Lei Z et al (2019) Circular RNA profile in liver tissue of EpCAM knockout mice. Int J Mol Med 44(3):1063–1077. https://doi.org/10.3892/ijmm.2019.4270
20. Guarnerio J, Bezzi M, Jeong JC et al (2016) Oncogenic role of fusion-circRNAs derived from cancer-associated chromosomal translocations. Cell 165(2):289–302. https://doi.org/10.1016/j.cell.2016.03.020
21. Jamal M, Song T, Chen B et al (2019) Recent progress on circular RNA research in acute myeloid leukemia. Front Oncol 9:1108. https://doi.org/10.3389/fonc.2019.01108
22. Sun YM, Wang WT, Zeng ZC et al (2019) circMYBL2, a circRNA from MYBL2, regulates FLT3 translation by recruiting PTBP1 to promote FLT3-ITD AML progression. Blood 134(18):1533–1546. https://doi.org/10.1182/blood.2019000802
23. Zhang J, Chen S, Yang J et al (2020) Accurate quantification of circular RNAs identifies extensive circular isoform switching events. Nat Commun 11:90. https://doi.org/10.1038/s41467-019-13840-9
24. Chen L, Zhang P, Fan Y et al (2018) Circular RNAs mediated by transposons are associated with transcriptomic and phenotypic variation in maize. New Phytol 217(3):1292–1306. https://doi.org/10.1111/nph.14901
25. Mei-rong X, Zhi-hui X, Wen-xue Z et al (2008) Construction of double right-border binary vector carrying non-host gene Rxo1 resistant to bacterial leaf streak of rice. Rice Sci 15(3):243–246. https://doi.org/10.1016/S1672-6308(08)60048-7
26. Babu KSD, Guria A, Karanthamalai J et al (2018) DNA methylation suppression by Bhendi Yellow Vein Mosaic Virus. Epigenomes 2(2):7. https://doi.org/10.3390/epigenomes2020007
27. Dale PJ, Marks MS, Brown MM et al (1989) Agroinfection of wheat: inoculation of in vitro grown seedlings and embryos. Plant Sci 63:237–245. https://doi.org/10.1016/0168-9452(89)90249-5


28. Sridevi G, Sabapathi N, Meena P et al (2003) Transgenic indica rice variety Pusa Basmati 1 constitutively expressing a rice chitinase gene exhibits enhanced resistance to Rhizoctonia solani. J Plant Biochem Biotechnol 12:93–101. https://doi.org/10.1007/BF03263168
29. Ryu C-M, Anand A, Kang L et al (2004) Agrodrench: a novel and effective agroinoculation method for virus-induced gene silencing in roots and diverse Solanaceous species. Plant J 40(2):322–331. https://doi.org/10.1111/j.1365-313X.2004.02211.x
30. Kirigia D, Runo S, Alakonya A (2014) A virus-induced gene silencing (VIGS) system for functional genomics in the parasitic plant Striga hermonthica. Plant Methods 10:16. https://doi.org/10.1186/1746-4811-10-16
31. Hu D, Bent AF, Hou X et al (2019) Agrobacterium-mediated vacuum infiltration and floral dip transformation of rapid-cycling Brassica rapa. BMC Plant Biol 19:246. https://doi.org/10.1186/s12870-019-1843-6
32. Zhang X, Henriques R, Lin SS et al (2006) Agrobacterium-mediated transformation of Arabidopsis thaliana using the floral dip method. Nat Protoc 1(2):641–646. https://doi.org/10.1038/nprot.2006.97
33. Zhang XO, Wang HB, Zhang Y et al (2014) Complementary sequence-mediated exon circularization. Cell 159:134–147. https://doi.org/10.1016/j.cell.2014.09.001
34. Gao Y, Wang J, Zhao F (2015) CIRI: an efficient and unbiased algorithm for de novo circular RNA identification. Genome Biol 16:4. https://doi.org/10.1186/s13059-014-0571-3
35. Chen L, Yu Y, Zhang X et al (2016) PcircRNA_finder: a software for circRNA prediction in plants. Bioinformatics 32:3528–3529. https://doi.org/10.1093/bioinformatics/btw496
36. Cheng J, Metge F, Dieterich C (2016) Specific identification and quantification of circular RNAs from sequencing data. Bioinformatics 32:1094–1096. https://doi.org/10.1093/bioinformatics/btv656
37. Gao Y, Zhang J, Zhao F (2018) Circular RNA identification based on multiple seed matching. Brief Bioinform 19(5):803–810. https://doi.org/10.1093/bib/bbx014
38. Bailly-Bechet M, Haudry A, Lerat E (2014) "One code to find them all": a perl tool to conveniently parse RepeatMasker output files. Mob DNA 5:13. https://doi.org/10.1186/1759-8753-5-13


39. Liu D, Conn V, Goodall GJ et al (2018) A highly efficient strategy for overexpressing circRNAs. In: Dieterich C, Papantonis A (eds) Circular RNAs. Methods in molecular biology, vol 1724. Humana Press, New York, pp 97–105. https://doi.org/10.1007/978-1-4939-7562-4_8
40. Gao Z, Li J, Luo M, Li H et al (2019) Characterization and cloning of grape circular RNAs identified the cold resistance-related Vv-circATS1. Plant Physiol 180(2):966–985. https://doi.org/10.1104/pp.18.01331
41. Wise AA, Liu Z, Binns AN (2006) Three methods for the introduction of foreign DNA into Agrobacterium. In: Wang K (ed) Methods in molecular biology, vol 343. Humana Press, Totowa, pp 43–53. https://doi.org/10.1385/1-59745-130-4:43
42. Vijayachandra K, Palanichelvam K, Veluthambi K (1995) Rice scutellum induces Agrobacterium tumefaciens vir genes and T-strand generation. Plant Mol Biol 29:125–133. https://doi.org/10.1007/BF00019124
43. Livak KJ, Schmittgen TD (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) method. Methods 25(4):402–408. https://doi.org/10.1006/meth.2001.1262
44. Dai X, Zhuang Z, Zhao PX (2018) psRNATarget: a plant small RNA target analysis server (2017 release). Nucleic Acids Res 46(W1):W49–W54. https://doi.org/10.1093/nar/gky316

45. Zuo J, Wang Q, Zhu B et al (2016) Deciphering the roles of circRNAs on chilling injury in tomato. Biochem Biophys Res Commun 479(2):132–138. https://doi.org/10.1016/j.bbrc.2016.07.032
46. Wang Y, Yang M, Wei S et al (2017) Identification of circular RNAs and their targets in leaves of Triticum aestivum L. under dehydration stress. Front Plant Sci 7:2024. https://doi.org/10.3389/fpls.2016.02024
47. Wang J, Lin J, Wang H et al (2018) Identification and characterization of circRNAs in Pyrus betulifolia Bunge under drought stress. PLoS One 13(7):e0200692. https://doi.org/10.1371/journal.pone.0200692
48. He X, Guo S, Wang Y et al (2019) Systematic identification and analysis of heat-stress-responsive lncRNAs, circRNAs and miRNAs with associated co-expression and ceRNA networks in cucumber (Cucumis sativus L.). Physiol Plant 168(3):736–754. https://doi.org/10.1111/ppl.12997
49. Zhu Y, Jia J, Yang L et al (2019) Identification of cucumber circular RNAs responsive to salt stress. BMC Plant Biol 19:164. https://doi.org/10.1186/s12870-019-1712-3
50. Huang A, Zheng H, Wu Z et al (2020) Circular RNA-protein interactions: functions, mechanisms, and identification. Theranostics 10(8):3503–3517. https://doi.org/10.7150/thno.42174

Chapter 4

Identification of Circular RNAs by Multiple Displacement Amplification and Their Involvement in Plant Development

Ashirbad Guria, Priyanka Sharma, Sankar Natesan, and Gopal Pandi

Abstract

With advances in knowledge and in bioinformatics tools for the identification and characterization of noncoding RNAs, circular RNA (circRNA) has been added as a new member of the noncoding RNA family. CircRNA enrichment by rRNA depletion/RNase R or poly-A removal/RNase R treatment followed by NGS analysis is the most frequently adopted method for circular RNA identification and characterization. In this chapter, we describe multiple displacement amplification (MDA) as a convenient method to improve the identification of even very lowly expressed circular RNAs at low sequencing depth. Total RNA, extracted at three different developmental stages of rice, is subjected to RiboMinus and RNase R treatment to deplete the linear RNAs. The enriched circular RNAs are reverse transcribed with random hexamers. The resulting cDNA is subjected to phi29 DNA polymerase amplification using exo-resistant random pentamers to yield a high-molecular-weight dsDNA product, followed by Illumina sequencing at ten million paired-end reads per sample. The sequence analysis yields a promising number of circRNAs, with an appreciable inclusion of differentially regulated and minimally expressed circRNAs, at a comparatively reduced cost.

Key words CircRNA, MDA, Exo-resistant pentamer, Divergent RT-qPCR, Northern hybridization, Plant development

1 Introduction

Circular RNAs (circRNAs), which are formed by joining a 3′ downstream donor site to a 5′ upstream acceptor site in an alternative splicing mechanism called backsplicing, are found to play a pivotal role in different aspects of plant development. circRNAs co-originate with the linear transcripts from protein-coding precursors, and therefore circRNA biogenesis is bound to influence the transcriptional turnover of its parental gene. Transcriptional regulation by circRNAs is a critical parameter in studying plant

Ashirbad Guria and Priyanka Sharma contributed equally to this work. Luis María Vaschetto (ed.), Plant Circular RNAs: Methods and Protocols, Methods in Molecular Biology, vol. 2362, https://doi.org/10.1007/978-1-0716-1645-1_4, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021


development and detailed molecular analysis of transgenic plants derived from circRNA overexpression construct(s) can be used to characterise circRNA function in relation to distinct phenotypic changes. Numerous lines of evidence have highlighted that circRNAs regulate their parental genes directly [1–3] and indirectly through miRNA sponging [4–7] and protein sequestration [8], thereby having an impact on plant development. For example, biogenesis of an exon 6 circRNA from the alternatively skipped exon 6 during splicing of the SEP3 gene into the SEP3.3 isoform has been shown to shorten the vegetative phase, alter floral organ number, and bring on flowering by 4 weeks in Arabidopsis upon exposure to high temperature [9]. Similarly, a laciRNA (lariat circRNA) derived from the first intron of gene AT5g37720 in Arabidopsis caused late flowering, reduced fertility, and produced curly and clustered leaves by regulating an array of genes [10]. In another case, differentially expressed (DE)-circRNAs were detected at different developmental stages, such as pollen mother cell formation, meiosis, and microspore formation, in the young rice panicles of both sterile and fertile plants [11]. A few of these DE-circRNAs were also found to be involved in miRNA sponging. GO annotation of the identified parental genes and miRNA-regulated genes suggests that these circRNAs regulate fertility transition through hormone signaling and metabolic pathways [11]. DE-circRNA(s) obtained after various stress treatments [2, 4, 6, 7, 12] have narrowed down the genes participating in molecular pathways involved in plant development. All the above-cited circRNAs were identified after analyzing a large number of cDNA sequences from Illumina sequencing reads on various computational platforms. Since exonic circRNA constitutes ~1.0% of the poly-A RNA population, it requires ~300,000–300,000,000 reads to obtain one circRNA [13].
Therefore, a large number of sequencing reads is unavoidable in order to identify even the minimally expressed circRNAs regulating plant development. However, we have shown that enriching circRNAs using Multiple Displacement Amplification (MDA) can identify thousands of circRNAs from far fewer sequencing reads and is highly cost-effective [14]. Therefore, in this chapter, template-dependent (td)-MDA, which improves the sensitivity of circRNA detection through continuous amplification, is highlighted as an alternative method for identifying even low-copy circRNAs that might be missed during traditional RNA sequencing. Rice is one of the most important commercial crops and has been studied extensively in order to manipulate and enhance its quality and yield to meet the demand of an increasing population. This prompted us to uncover the possible role of circRNA in the life cycle of rice during the vegetative, reproductive, and ripening stages. In this chapter, we describe the detailed procedure for profiling rice circRNAs by employing MDA coupled with


Illumina sequencing and CIRI2 as a computational pipeline for identification. Enriched RNA (devoid of linear RNA and rRNA) from each of the three growth stages was used for cDNA conversion, and MDA was performed using phi29 DNA polymerase and exo-resistant random pentamer to yield high-molecular-weight dsDNA. The final dsDNA was Illumina sequenced at ten million paired-end (PE) reads and analyzed on the CIRI2 platform to obtain circRNA profiles. A DE-circRNA was identified based on the difference in its fragments per kilobase per million mapped reads (FPKM) value between different growth stages. The top three DE-circRNAs were selected and further validated by divergent PCR and northern hybridization. Similarly, differential expression of circRNAs, parental genes, circRNA-interacting miRNA(s) and their downstream mRNA target(s) was studied. These studies will aid in developing circRNA-miRNA-mRNA networks to understand their roles in regulating plant development. A more elaborate analysis of circRNA-mediated regulation of plant development by developing a specific circRNA-overexpressing transgenic plant is described in our previous chapter. In this chapter, we stress the importance of MDA followed by NGS to identify lowly expressed and differentially regulated circRNAs with comparatively fewer sequencing reads between different developmental stages of rice. Parental gene regulation and circRNA-interacting miRNAs were also studied to establish a possible network with DE-circRNAs in order to understand their combined role during these growth stages. Validation and expression analysis were carried out by divergent RT-qPCR, northern hybridization, and stem-loop RT-qPCR. Moreover, this method also provides the advantage of identifying circRNAs from organism(s) without a known genome sequence.

2 Materials

2.1 Seed Sterilization, Germination, Growth, and Sample Collection

1. Oryza sativa Indica, cultivar Pusa Basmati-1 (PB-1) seeds.
2. 70% ethanol.
3. 4.0% sodium hypochlorite.
4. 100% Tween 20.
5. 0.1% mercuric chloride.
6. Liquid nitrogen (LN2).
7. Whatman No 1 paper.
8. Petri dishes.
9. Soil, clay pots, plant growth chamber, greenhouse.


2.2 RNA Extraction and CircRNA Enrichment

1. Leaf sample.
2. LN2.
3. TRIzol.
4. 100% chloroform.
5. 100% isopropanol.
6. 75% ethanol.
7. 0.5 M EDTA (pH 8.0).
8. Diethylpyrocarbonate (DEPC).
9. Sterile DEPC-treated water.
10. 2 units/μl TURBO DNase (Invitrogen).
11. 20 units/μl RNase R (Lucigen).
12. Agarose.
13. 10× MOPS buffer (pH 7.0): 0.2 M MOPS, 0.05 M sodium acetate, 0.01 M EDTA sodium salt.
14. RiboMinus Plant Kit for RNA-Seq (Thermo Scientific): 10 mg/ml RiboMinus Magnetic Beads, 15 pmol/μl RiboMinus Plant Probe, Hybridization Buffer.
15. Gel electrophoresis unit.
16. Magnetic separation stand.
17. Water bath.
18. NanoDrop.
19. Heating block.
20. Mortar and pestle.
21. Microcentrifuge tubes.

2.3 RT and MDA

1. 10 pM/μl miRNA-specific stem-loop primers.
2. 500 μM exo-resistant random pentamer.
3. 10 mM dNTPs.
4. 10 units/μl phi29 DNA polymerase (Thermo Scientific).
5. 0.1 units/μl pyrophosphatase (Thermo Scientific).
6. Nuclease-free water.
7. Agarose.
8. RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific): 200 units/μl RevertAid RT, 20 units/μl RiboLock RNase Inhibitor, 5× Reaction Buffer, 10 mM dNTP Mix, 100 μM Random Hexamer Primer, 100 μM Oligo(dT)18 Primer.
9. 25× Tris–acetate–EDTA (TAE) buffer: 1 M Tris base, 0.05 M EDTA sodium salt (pH 8.0).
10. 0.5 M acetic acid.


11. Thermal cycler.
12. Heating block.
13. NanoDrop.
14. Gel electrophoresis unit.

2.4 Next Generation Sequencing (NGS) and Data Analysis

1. JetSeq cleanup Beads (Bioline).
2. Illumina multiplex barcode adapters.
3. NEXTflex Rapid DNA sequencing bundle kit (PerkinElmer): NEXTFLEX End-Repair & Adenylation Buffer Mix, NEXTFLEX End-Repair & Adenylation Enzyme Mix, NEXTFLEX Ligase Enzyme Mix, NEXTFLEX PCR Master Mix, NEXTFLEX Sizing Solution, NEXTFLEX Resuspension Buffer.
4. Illumina HiSeq 4000.
5. High-capacity workstation.
6. Qubit fluorometer.
7. Covaris S220 Sonicator.
8. Agilent 2200 TapeStation.
9. CircRNA computational pipeline(s).

2.5 Validation and Expression of CircRNA, miRNA, and mRNA

1. CircRNA-specific divergent primers.
2. miRNA- and mRNA-specific convergent primers.
3. CircRNA-, miRNA-, and mRNA-specific real-time PCR primers.
4. 10 mM dNTPs.
5. 5 units/μl Taq DNA polymerase (Thermo Scientific).
6. 2× FastStart Universal SYBR Green Master (Rox) (Roche).
7. Agarose.
8. Ready-to-use Kodak Rapid Access Developer solution.
9. Ready-to-use Kodak Rapid Access Fixer solution.
10. 125:24:1 acid-phenol–chloroform–IAA.
11. 10× MOPS buffer (pH 7.0): 0.2 M MOPS, 0.05 M sodium acetate, 0.01 M EDTA sodium salt.
12. 1.0% Blocking solution: prepare 10× Blocking solution as 10% (w/v) Blocking reagent in maleic acid buffer (65 °C). Dilute the 10× Blocking solution 1:10 in maleic acid buffer to obtain 1.0% Blocking solution.
13. Denaturation solution: 1 M NaCl, 0.5 M NaOH.
14. Neutralization solution (pH 7.0): 1.5 M NaCl, 0.5 M Tris.
15. 20× SSC buffer (pH 7.0): 3 M NaCl, 0.3 M sodium citrate.
16. Hybridization buffer: 5× SSC, 0.1% (w/v) N-lauryl sarcosine sodium salt, 0.02% SDS, 1.0% (w/v) blocking reagent, sterile DEPC-treated water.


17. Washing buffer I: 2× SSC, 0.1% SDS.
18. Washing buffer II: 0.5× SSC, 0.1% SDS.
19. Maleic acid buffer (pH 7.5): 0.1 M maleic acid, 0.15 M NaCl.
20. Washing buffer: add 300 μl of 100% Tween 20 to 100 ml maleic acid buffer (use cut tips).
21. Detection buffer (pH 9.5): 0.1 M Tris–HCl, 0.1 M NaCl.
22. DIG DNA Labeling and Detection Kit (Roche): 10× Hexanucleotide Mix, 10× dNTP Labeling Mixture, 2 units/μl Klenow Enzyme, 750 units/ml Anti-DIG-AP conjugated Antibody.
23. CDP-Star, ready-to-use (Roche).
24. mirVana miRNA Isolation Kit (Thermo Scientific): miRNA Wash Solution 1, Wash Solution 2/3, Collection tubes, Filter Cartridges, Lysis/Binding buffer, miRNA Homogenate Additive, Elution Solution.
25. PCR thermocycler.
26. Real-time PCR thermocycler.
27. Gel electrophoresis unit.
28. Hybridization oven.
29. UV cross-linker.
30. Microcentrifuge.
31. Hybridization bottle.
32. Forceps.
33. Nylon membrane.
34. Whatman No 1 and No 3 paper.
35. Crude filter paper.
36. Gel casting tray.
37. X-ray films and exposure cassette.
38. Dark room.
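
Several working solutions listed in this subheading are made by diluting concentrated stocks (for example, 2× and 0.5× SSC from 20× SSC, or 1.0% blocking solution from the 10% stock). The following is a minimal Python sketch of that arithmetic, assuming a simple C1·V1 = C2·V2 dilution. The helper name `stock_volume` and the final volumes (500 ml and 40 ml) are illustrative assumptions, not values prescribed by the protocol.

```python
def stock_volume(stock_conc, final_conc, final_volume):
    """Return (stock, diluent) volumes for a C1*V1 = C2*V2 dilution.

    Concentrations must share one unit (e.g. "x" or %), and the returned
    volumes are in the same unit as final_volume.
    """
    if stock_conc <= 0 or final_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if final_conc > stock_conc:
        raise ValueError("cannot dilute to a higher concentration")
    stock = final_conc * final_volume / stock_conc
    return stock, final_volume - stock


# Final volumes below are assumed for illustration only.
ssc_2x = stock_volume(20, 2, 500)      # 2x SSC for washing buffer I, from 20x SSC
ssc_05x = stock_volume(20, 0.5, 500)   # 0.5x SSC for washing buffer II
blocking = stock_volume(10, 1, 40)     # 40 ml of 1.0% blocking solution from 10% stock

for name, (stock, diluent) in [("2x SSC", ssc_2x),
                               ("0.5x SSC", ssc_05x),
                               ("1.0% blocking", blocking)]:
    print(f"{name}: {stock} ml stock + {diluent} ml diluent")
```

SDS and other additives still have to be added separately; the sketch only covers the dilution of a single stock.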

3 Methods

The overall methodology for identifying DE-circRNAs and their target(s) at different growth stages is outlined in Fig. 1.

3.1 Seed Sterilization, Germination, Growth, and Sample Collection

The seed sterilization protocol follows the method described in [15].
1. Dehusk Pusa Basmati-1 (PB-1) seeds and sterilize with 70% ethanol for 2 min followed by washing with sterile water.


Fig. 1 Flowchart depicting the methodological outline to study the DE-circRNA(s) at different growth stages

2. Wash seeds with 4.0% sodium hypochlorite and a single drop of Tween 20 for 15 min followed by washing with sterile water (see Note 1).
3. Wash seeds with 0.1% mercuric chloride (HgCl2) followed by washing with sterile water five times each for 15 min.
4. Place ~10–15 seeds on Whatman No 1 paper in an autoclaved petri plate. Add a little water to soak the seeds. Keep in the dark for 3–5 days at 26 °C.
5. Check for the emergence of radicle and plumule from the germinating seedlings (see Note 2).
6. Puddle the soil before placing the seedlings inside (see Note 3).
7. Place each germinated seedling into a separate puddled soil pot and keep in a plant growth chamber for 2 weeks at 16:8 h light–dark condition with 75–80% relative humidity (RH) (see Note 4).
8. Transfer the plantlets to bigger puddled soil pots and maintain inside a greenhouse at 30–32 °C (see Note 5).


9. Observe the plant as it passes through the different growth stages.
10. Collect 100 mg of O. sativa leaf sample at day 60–70 (for vegetative stage), day 100–105 (for reproductive stage) and day 135–140 (for ripening stage).
11. Snap-freeze the collected sample in liquid nitrogen (LN2) and store at −70 °C until further use.

3.2 Plant Total RNA Isolation

1. Grind 100 mg of Oryza sativa Indica, PB-1, leaves collected from each time point to a fine powder in LN2 using a prechilled mortar and pestle. Add 1 ml of TRIzol (1 ml for each 100 mg sample), gently mix, and incubate for 5 min at room temperature (RT).
2. Transfer the mix to sterile 1.5 ml microcentrifuge tubes (1 ml/tube), add 0.2 ml chloroform to each tube, mix by inverting, and incubate for 2–3 min at RT.
3. Centrifuge the tubes at 15,294 × g for 10 min at 4 °C. Transfer the upper aqueous phase to a fresh tube, add 0.5 ml isopropanol to each tube, mix well by inverting and again incubate at RT for 10 min.
4. Centrifuge the tubes at 15,294 × g for 10 min at 4 °C. Wash the pellet with 1 ml of 75% ethanol by centrifuging at 15,294 × g for 5 min at 4 °C.
5. Air-dry the RNA pellet and resuspend in ~20–40 μl of prewarmed (55 °C) sterile DEPC-treated water (see Note 6).
6. Store the total RNA at −80 °C until further use.
7. Measure the concentration using NanoDrop and check the quality by resolving the sample in a 1.5% MOPS-agarose gel.
8. Treat 10 μg of total RNA with 1 μl of 2 units/μl DNase in 1× DNase buffer. Incubate the mix at 37 °C for 30 min. Inactivate the enzyme using 0.01 M EDTA (pH 8.0) by incubation at 70–75 °C for 10 min.
9. Measure the quantity using NanoDrop and assess the quality by resolving the sample in a 1.5% MOPS-agarose gel.

3.3 CircRNA Enrichment

1. Use the RiboMinus Plant Kit for RNA-Seq, which comes with 10 mg/ml magnetic beads, 15 pmol/μl plant RNA probe, hybridization buffer, and sterile DEPC-treated water.
2. Resuspend the magnetic beads by vortexing or by vigorous tapping. Take 750 μl of magnetic beads in a 1.5 ml sterile microcentrifuge tube.
3. Using the magnetic separation stand, remove the solution from the beads and wash the magnetic beads twice with sterile DEPC-treated water. Resuspend the washed beads in 750 μl


of hybridization buffer, mix gently and split the beads into two sterile 1.5 ml microcentrifuge tubes in 250 μl and 500 μl volumes.
4. Concentrate the 500 μl magnetic beads by separating and again resuspending in 200 μl of hybridization buffer. Incubate the beads in a 37 °C water bath until used.
5. Hybridize the RNA with the RiboMinus probe by mixing 10 μg of DNase-treated total RNA, 10 μl of 15 pmol/μl Probe, and 100 μl of hybridization buffer. Incubate the mix at 70–75 °C in a heat block for 5 min, and cool the tube gradually to 37 °C.
6. Transfer the mix to the 200 μl of prepared magnetic beads. Mix by gentle vortexing and incubate at 37 °C for 15 min.
7. Separate the mix for 1 min using the magnetic separation stand, transfer the supernatant to the other 250 μl aspirated magnetic beads, and incubate at 37 °C for 15 min.
8. Separate and aspirate the supernatant using the magnetic separation stand.
9. To the supernatant, add 1 μl of 20 μg/μl glycogen, 1/10th sample volume of 3 M sodium acetate and 2.5 sample volumes of 100% ethanol and mix well.
10. Incubate the mix at −80 °C for 30 min followed by centrifugation for 15 min at 15,294 × g at 4 °C.
11. Wash the pellet with 70% ethanol twice. Air-dry the pellet for 5 min. Dissolve the rRNA-depleted RNA in 10–30 μl of sterile DEPC-treated water. Quantify the sample using NanoDrop and preserve at −80 °C until further use (see Note 7).
12. Use 1–2 μg of RiboMinus-treated RNA for RNase R treatment using 5–10 units of RNase R in the presence of 1× RNase R buffer. Incubate the mix for 20 min at 37 °C (see Note 8).

3.4 cDNA Synthesis

1. Use 2 μg of DNase-treated RNA for cDNA conversion using oligo-dT and random primers in separate reactions with 10 mM dNTPs, 1 μl of 20 units/μl RNase inhibitor, and 1 μl of 200 units/μl reverse transcriptase in its compatible buffer.
2. Set the random primer sample for reaction at 25 °C for 5 min, 42 °C for 1 h followed by inactivation at 70 °C for 5 min. Follow the same for the oligo-dT primer sample without the initial 25 °C for 5 min step.

3.5 Multiple Displacement Amplification (MDA)

1. Use 50–100 ng of random-primed cDNA (test sample), equal concentration of double stranded (ds) circular plasmid (positive control), and nuclease-free water (negative control) as template for performing MDA (see Note 9).


2. Design the exo-resistant random pentamer to contain a carbon 18 (C18) spacer at the 5′ end of the primer and two phosphorothioate bonds at the fourth and fifth nucleotides [16].
3. To the template, add 2 μl of 10× phi29 DNA polymerase buffer, 2 μl of 500 μM exo-resistant random pentamer, 2 μl of 10 mM dNTPs and make up the volume to 17.4 μl with autoclaved nuclease-free water.
4. Gently mix and denature at 94 °C for 3 min followed by cooling gradually in a thermocycler (see Note 10).
5. To the mix, add 0.6 μl of 10 units/μl phi29 DNA polymerase followed by 2 μl of diluted pyrophosphatase (1 μl of 0.1 units/μl pyrophosphatase + 9 μl of dilution buffer) to make a 20 μl reaction.
6. Gently mix the reaction, incubate at 28 °C for 21 h, and deactivate the enzyme at 65 °C for 10 min.
7. Resolve 1–3 μl of amplified product in a 0.8% TAE-agarose gel.
8. Check for the presence of the MDA-amplified high-molecular-weight dsDNA product (~20 kb) in the test and positive control samples only, as shown in Fig. 2 (see Note 11).

3.6 NGS of MDA Products and its Data Analysis

1. Check the quality and integrity of the DNA obtained in Subheading 3.5 using Qubit fluorometer and NanoDrop.
2. Prepare a whole genome sequencing (WGS) library using the Illumina-compatible NEXTflex Rapid DNA sequencing bundle.
3. Shear 250 ng of MDA-derived DNA using a Covaris S220 sonicator to generate ~200–300 bp fragments.
4. End-repair, adenylate, and ligate the purified fragments to Illumina multiplex barcode adapters as per the NEXTflex Rapid DNA sequencing bundle kit protocol.
5. Amplify the purified adapter-ligated DNA in four cycles of PCR using the Illumina-compatible primers provided in the NEXTflex Rapid DNA sequencing bundle (see Note 12).
6. Purify the PCR products (sequencing libraries) followed by a library quality control check.
7. Quantify the sequencing libraries by Qubit fluorometer and analyze their size distribution on an Agilent 2200 TapeStation followed by sequencing on an Illumina HiSeq 4000 platform.
8. Remove the adapter(s) and over-represented sequence(s) from the ten million paired-end raw read sequences using Trimmomatic on a Linux platform.
9. Check the quality of the library at different Phred score thresholds (>30 is preferable) in order to retain the maximum number of reads of equal read length with >50% GC content (see Note 13).


Fig. 2 Multiple displacement amplification (MDA). Phi29 DNA polymerase with exo-resistant random pentamer yields high molecular dsDNA product with cDNA and circular pBKS(+) plasmid and no amplification with water. λHindIII digest is used as the DNA ladder

10. Map the processed reads to linear Oryza sativa genome database. 11. Eliminate the mapped reads and analyze the quality of the final processed unmapped reads by FastQC (see Note 14). 12. Feed the reads into CIRI2 [17] computational pipeline for identification of circRNAs and their properties such as length, types, and location on chromosome and their corresponding genes (see Note 15).
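
The read-level criteria mentioned in step 9 (Phred score above 30, GC content) can be illustrated with a short Python sketch. It assumes plain-text, four-line FASTQ records with Phred+33 quality encoding. It is not a substitute for Trimmomatic or FastQC, and the function names and toy reads are hypothetical.

```python
from io import StringIO


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def mean_phred(quality, offset=33):
    """Mean Phred score of one read, assuming Phred+33 encoding."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def gc_fraction(seq):
    """Fraction of G or C bases in a read."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passing_reads(handle, min_phred=30.0):
    """Keep (name, sequence) for reads whose mean Phred score is >= min_phred."""
    return [(name, seq) for name, seq, qual in read_fastq(handle)
            if mean_phred(qual) >= min_phred]


# Toy records (made up): under Phred+33, 'I' encodes Q40 and '#' encodes Q2.
example = StringIO("@read1\nGGCCAT\n+\nIIIIII\n@read2\nATATAT\n+\n######\n")
kept = passing_reads(example)
for name, seq in kept:
    print(name, seq, round(gc_fraction(seq), 2))
```

Filtering on the mean score per read is only one possible design; window-based trimming, as done by dedicated tools, treats low-quality read ends differently.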


3.7 Predicting Differentially Expressed (DE)-CircRNAs

1. Calculate the fragments per kilobase per million mapped reads (FPKM) value to obtain the expression data of the individual circRNAs identified at each of the growth stages (see Note 16).
2. Calculate the FPKM (https://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/) [22] of an individual circRNA as follows:
(a) Per million scaling factor = total WGS reads obtained in a sample by Illumina sequencing/1,000,000.
(b) Fragments per million mapped reads (FPM) of an individual circRNA = read count of circRNA/per million scaling factor.
(c) FPKM of an individual circRNA = FPM/length of circRNA in kilobases (kb).
3. Select circRNA(s) showing a minimum difference of ~50% (or two-fold difference) in FPKM values between samples of all the developmental stages as potential DE-circRNAs. We selected the top three circRNAs with maximum differential expression between the vegetative, reproductive, and ripening stages with a consistent pattern of expression in triplicates.
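
The FPKM calculation in step 2 and the two-fold filter in step 3 can be sketched in Python as follows. The function names, read counts, and circRNA length are hypothetical. Comparing the highest and lowest stage values is one possible reading of "two-fold difference between stages" and is an assumption of this sketch.

```python
def fpkm(read_count, length_bp, total_reads):
    """FPKM of one circRNA following steps (a)-(c) above."""
    per_million = total_reads / 1_000_000    # (a) per million scaling factor
    fpm = read_count / per_million           # (b) fragments per million reads
    return fpm / (length_bp / 1000)          # (c) divide by length in kb


def is_potential_de(fpkm_by_stage, min_fold=2.0):
    """True if the highest and lowest stage FPKM differ by at least min_fold."""
    values = list(fpkm_by_stage.values())
    low, high = min(values), max(values)
    if low == 0:
        return high > 0  # detected in some stages but absent in another
    return high / low >= min_fold


# Made-up example: a 500-nt circRNA in three libraries of ten million reads each.
stages = {
    "vegetative": fpkm(read_count=50, length_bp=500, total_reads=10_000_000),
    "reproductive": fpkm(read_count=20, length_bp=500, total_reads=10_000_000),
    "ripening": fpkm(read_count=60, length_bp=500, total_reads=10_000_000),
}
print(stages, is_potential_de(stages))
```

In practice the same check would be repeated per replicate, since step 3 also asks for a consistent expression pattern in triplicates.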

3.8 Validation of Selected Potential DE-CircRNAs

1. Perform divergent RT-PCR to validate the potential DE-circRNAs selected in Subheading 3.7. cDNA primed with oligo-dT is derived predominantly from mRNA molecules, whereas cDNA primed with random primers also represents a significant population of circRNAs. Therefore, use both oligo-dT and random-primed cDNAs as templates to compare and validate circRNAs using divergent primers complementary to the backsplice junction.
2. Design a specific divergent primer pair complementary to the backsplice junction of the selected circRNA.
3. Perform divergent PCR with the cDNA synthesized in Subheading 3.4 using 10 pM of each primer, 10 mM dNTPs, and 0.6 μl of 5 units/μl Taq DNA polymerase in its compatible buffer at the specific PCR condition for the circRNA.
4. Resolve the amplified product in a 1.5% TAE-agarose gel to confirm the expected amplicon size followed by Sanger sequencing to confirm the circRNA(s) sequence (see Notes 17 and 18).
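
The logic behind divergent primers can be illustrated with a small Python sketch: primers that point away from each other on the linear sequence converge across the backsplice junction once the ends are joined. The sketch assumes the circRNA is written as a linear string running from the first nucleotide after the junction to the last nucleotide before it. The function names, the 24-nt toy sequence, and the primers are hypothetical and far shorter than real primers.

```python
def junction_template(circ_seq, flank=100):
    """Return the sequence spanning the backsplice junction.

    The result is the last `flank` nt followed by the first `flank` nt of
    circ_seq, so the junction lies at index `flank` of the returned string.
    """
    if flank <= 0 or 2 * flank > len(circ_seq):
        raise ValueError("flank must be positive and at most half the circRNA length")
    return circ_seq[-flank:] + circ_seq[:flank]


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def amplicon_spans_junction(template, forward, reverse, junction_index):
    """True if the amplicon defined by the primer pair contains the junction."""
    start = template.find(forward)
    rev_site = template.find(reverse_complement(reverse))
    if start == -1 or rev_site == -1:
        return False
    return start < junction_index < rev_site + len(reverse)


circ = "ACGTTGCAAGGCTTAACCGGTATC"      # toy circRNA sequence (made up)
template = junction_template(circ, flank=8)
forward, reverse = "CCGGTA", "TGCAAC"  # toy divergent primer pair

# On the linear sequence the forward primer lies downstream of the reverse
# primer site, so the pair is divergent and cannot give a product.
divergent_on_linear = circ.find(forward) > circ.find(reverse_complement(reverse))
spans = amplicon_spans_junction(template, forward, reverse, junction_index=8)
print(template, divergent_on_linear, spans)
```

A real design would add checks for primer length, melting temperature, and specificity, which are outside the scope of this sketch.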

3.9 CircRNA Expression at Different Growth Stages

The difference in the expression of the potential circRNAs, selected based on the difference in the FPKM values between vegetative, reproductive and ripening stages, can be studied by performing real-time PCR (qRT-PCR) and northern hybridization as described in the following subheadings.

3.9.1 Divergent Quantitative Reverse Transcription PCR (RT-qPCR)

1. Design divergent primers and perform quantitative Reverse Transcription PCR (RT-qPCR) with 100 ng of random-primed cDNA template from each of the three growth stages using SYBR Green master mix, with rice actin as an internal control, to analyze the expression levels of circRNAs.
2. Design RT-qPCR-specific primers to amplify a segment of 80–150 bp length with an annealing temperature (TA) of ~60 °C for both primers (see Note 19).
3. Carry out all the reactions in triplicate at the following PCR condition: 50 °C for 2 min (preheating), 95 °C for 10 min (initial denaturation), 40 cycles of 95 °C for 10 s and 60 °C for 1 min, and set the dissociation reaction from 60 °C onward.
4. Observe the dissociation curve (melting curve) of each reaction to check for the absence of any nonspecific amplicon.
5. Calculate the differential expression of circRNAs by the 2^(−ΔΔCt) method [23] (see Note 20).
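
The 2^(−ΔΔCt) calculation in step 5 can be sketched in Python as follows. The Ct values are made up, the function name is hypothetical, and using the vegetative stage as the calibrator is an assumption made only for this example.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct,
                        calibrator_target_ct, calibrator_reference_ct):
    """Fold change by the 2^(-ddCt) method from replicate Ct values."""
    d_ct_sample = mean(target_ct) - mean(reference_ct)
    d_ct_calibrator = mean(calibrator_target_ct) - mean(calibrator_reference_ct)
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2 ** (-dd_ct)


# Triplicate Ct values (made up); actin is the internal control.
vegetative_circ = [26.0, 26.5, 25.5]
vegetative_actin = [18.0, 18.25, 17.75]
reproductive_circ = [24.0, 24.5, 23.5]
reproductive_actin = [18.0, 18.0, 18.0]

fold = relative_expression(reproductive_circ, reproductive_actin,
                           vegetative_circ, vegetative_actin)
print(f"circRNA level in reproductive vs vegetative stage: {fold:.1f}-fold")
```

The method assumes comparable amplification efficiencies for the target and the internal control, which should be verified for each primer pair.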

3.9.2 Northern Hybridization

1. Northern blot probe preparation: To study the differential expression of circRNAs, use a probe complementary to the backsplice junction. The probe can be either amplified using circRNA backsplice junction-specific primers or commercially synthesized and annealed.
2. Take about 100 ng of probe, suspend in 15 μl RNase-free water, denature in a boiling water bath for 5 min and immediately transfer to ice.
3. Add 2 μl hexanucleotide mix, 2 μl dNTP labeling mixture, and 1 μl Klenow enzyme to the denatured probe, mix gently by pipetting and collect by brief centrifugation. Incubate the reaction mix at 37 °C for 18–20 h (see Note 21).
4. Split the total RNA isolated in Subheading 3.2; treat one set with RNase R as described in Subheading 3.3, step 12, and keep the other set as the untreated control.
5. Denature 10 μg RNA at 80 °C for 10 min and immediately transfer to ice. Resolve the RNA in a 1.0% MOPS-agarose gel until the tracking dye (BPB) reaches the bottom (see Note 22).
6. Wash the gel in denaturation solution for 30–45 min on a rocker.
7. Take the gel out of the denaturation solution and rinse with sterile DEPC-treated water four times. After that, immerse in neutralization solution for 45 min on a rocker.
8. Rinse the gel once in sterile DEPC-treated water.
9. Place the gel casting tray upside down as a platform in a glass tray containing 20× SSC buffer. Wrap the platform with a wet (dipped in 20× SSC) Whatman No 3 paper in such a way that its ends touch the SSC buffer.


10. Remove air bubbles by gently rolling a glass rod over the wet Whatman paper (see Note 23).
11. Place the gel upside down on the platform. Remove air bubbles if any (see Note 24).
12. Cut the nylon membrane to the size of the gel, dip in sterile DEPC-treated water followed by 20× SSC. Place it over the gel and remove air bubbles if any (see Note 25).
13. Place crude filter paper of the same size on top of dry Whatman No 3 paper up to a height of 8–10 cm. Place a ~1 kg weight on top and leave overnight (see Note 26).
14. Next day, carefully remove the membrane, wash with 2× SSC, dry on Whatman No 1 paper, and cross-link the RNA to the membrane inside a UV cross-linker for 1.5 min with the sample facing upward (see Note 27).
15. Roll the membrane inside a hybridization bottle; carry out prehybridization with prewarmed (68 °C) hybridization buffer at 68 °C in a hybridization oven for 2 h with gentle agitation (see Note 28).
16. Discard the prehybridization solution. Mix the denatured probe with ~10 ml of prewarmed (68 °C) hybridization buffer and add to the center of the hybridization bottle. Carry out hybridization at 68 °C overnight with gentle agitation in a hybridization oven (see Note 29).
17. Discard the hybridization buffer with probe and wash the blot with washing buffer I twice at 15–25 °C for 5–15 min inside the oven.
18. Discard washing buffer I and incubate the blot twice with washing buffer II for 15–30 min each time at 68 °C with constant agitation.
19. Discard washing buffer II and rinse with washing buffer for 5–10 min at RT.
20. Incubate the membrane with 20–40 ml of 1.0% blocking solution for 1 h at 37 °C with gentle agitation.
21. Discard the blocking solution and incubate the membrane with anti-DIG-AP solution diluted 1:10 in 1.0% blocking solution for 30 min at RT (see Note 30).
22. Discard the anti-DIG-AP/blocking solution and rinse the membrane with washing buffer for 15 min at 37 °C.
23. Discard the washing buffer and equilibrate the blot in 20 ml detection buffer (40 ml/100 cm²) for 5 min at RT.
24. Remove the membrane and place it between two plastic sheets, then incubate with 0.5 ml/100 cm² of CDP-Star substrate diluted 1:100 (10 μl in 1 ml detection buffer) at RT for 5 min.


25. Place the blot inside the exposure cassette using cellophane tape.
26. Expose the blot to X-ray film inside a dark room at RT (see Note 31).
27. Remove the X-ray film from the cassette by holding it at one corner. Develop the X-ray film by washing for 2 min in developer, 1 min in water followed by 2 min in fixer solution (see Note 32). Finally, rinse the X-ray film under running tap water and observe under normal light.
28. Confirm the presence of the expected circRNA, compare and analyze the expression of the selected circRNAs between all the growth stages and check for its consistency with the real-time data (see Note 33).

3.10 CircRNA Parental Gene Expression Studies

Perform RT-qPCR to determine whether a correlation (positive or negative) exists between the expression of circRNAs and their linear transcripts.
1. Identify the parental gene(s) of the selected circRNA(s) from the sequencing data and their location in the genome using the Ensembl Plants database (sp. O. sativa Indica) [24].
2. Design RT-qPCR-specific primers following the criteria mentioned in Subheading 3.9.1 to amplify the parental gene(s) and rice actin.
3. Optimize the PCR condition using the designed primers with 50–100 ng of random-primed cDNA as template as mentioned in Subheading 3.8.
4. Resolve the PCR product(s) in a 1.0% TAE-agarose gel to confirm the presence of the expected amplicons.
5. Gel-elute the PCR product, clone it in a T-tail vector, and confirm the amplified parental gene segment by Sanger sequencing.
6. Perform convergent RT-qPCR with 50–100 ng cDNA derived from DNase-treated RNA and actin (housekeeping gene) using SYBR Green mix for 40 cycles as described above in Subheading 3.9.1.
7. Calculate parental gene expression at each developmental stage using the 2^(−ΔΔCt) method [23].
8. Analyze the RT-qPCR results for the circRNA (Subheading 3.9.1) and its corresponding gene to establish a positive or negative correlation.
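
One simple way to express the positive or negative relationship described in step 8 is a Pearson correlation coefficient between the relative expression values of a circRNA and its parental gene across the growth stages. The following Python sketch uses made-up fold-change values and a hypothetical function name. With only three stages the coefficient is descriptive and should not be read as a statistical test.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant values")
    return cov / sqrt(var_x * var_y)


# Relative expression (fold change) at vegetative, reproductive, ripening stages.
circ_expression = [1.0, 4.0, 2.0]       # made-up circRNA values
parent_expression = [1.0, 0.25, 0.5]    # made-up parental gene values

r = pearson_r(circ_expression, parent_expression)
direction = "positive" if r > 0 else "negative"
print(f"r = {r:.2f} ({direction} correlation)")
```

Using the individual replicate values instead of stage means would give more data points, at the cost of mixing biological and technical variation.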

3.11 Identification and Comparative Expression Analysis of CircRNA Interacting miRNA(s)

1. Feed the sequence of the selected DE-circRNA in psRNATarget [25].
2. Scan for the possible interacting miRNA(s) by running against the O. sativa miRNAs database available in psRNATarget.


3. Identify the total number of miRNA(s) and their respective binding sites on the corresponding DE-circRNA(s).
4. Select the most stringent miRNA(s), i.e., those with the lowest E-value. Find out the functional role of the selected miRNA using miRBase [26] (see Note 34).

3.12 Validation of CircRNA Interacting miRNA(s)

Many experimental methods have been reported for the identification and quantification of miRNAs; however, each method has its own limitations. Northern hybridization is less sensitive in detecting sparsely expressed miRNAs [27], even with a very high concentration of RNA and autoradiographic exposure for many days [28]. Microarrays have also been shown to be less sensitive [29, 30] and require costly instrumentation [28]. Similarly, bead arrays are not sensitive enough and require PCR amplification followed by hybridization techniques [31]. The loop-mediated isothermal amplification (LAMP) method requires a minimum of four specially designed primers to amplify the miRNA from at least six small RNA targets. The major disadvantage of this method is that the amplified fragment cannot be cloned for sequence verification [30]. Conventional quantitative reverse-transcription PCR (RT-qPCR) is also not suitable for miRNAs because they are only 18–24 nt long, so the length of the primers outweighs the length of the miRNA [28, 30]. We therefore use stem-loop quantitative PCR [28, 29] for the quantification of miRNAs, after enriching miRNAs with the mirVana miRNA Isolation Kit following the manufacturer's instructions.
1. Homogenize 100 mg of leaves with 1 ml of prechilled Lysis/Binding Buffer in a prechilled mortar and pestle.
2. Add 1 volume of acid-phenol–chloroform. Vortex for 30 s.
3. Add 1/10 volume of miRNA Homogenate Additive and mix well by vortexing. Incubate the mixture on ice for 10 min.
4. Centrifuge for 5 min at 15,294 × g at RT and transfer the aqueous phase to a new 1.5 ml sterile microcentrifuge tube. Note the volume removed.
5. Add 1/3 volume of 100% ethanol to the aqueous phase and mix thoroughly by vortexing.
6. Pass it through a filter cartridge placed into the collection tube. Collect the filtrate by centrifuging for 15 s at 10,621 × g.
7. Add 2/3 volume of 100% ethanol to the filtrate (i.e., flow-through) and pass through a fresh filter cartridge by centrifuging at 10,621 × g for 15 s.
8. Discard the flow-through and wash the filter cartridge using 700 μl miRNA Wash Solution 1 by centrifuging at 10,621 × g for 10 s.

MDA to Identify Plant circRNAs


9. Discard the flow-through, wash the filter cartridge with 500 μl of Wash Solution-2/3, and centrifuge again at 10,621 × g for 10 s. Repeat this washing step once more.
10. Discard the flow-through and remove residual fluid from the filter cartridge by centrifuging at 10,621 × g for 30 s.
11. Place the filter cartridge into a fresh collection tube and add 50 μl of preheated (60 °C) nuclease-free water to the center of the filter. Incubate for 5 min at RT.
12. Centrifuge for 30 s at maximum speed to recover the small RNAs. Check the quality and quantity of the sample using a NanoDrop and store at −70 °C until further use (see Note 35).
13. Design a stem-loop primer (SLP), preferably 40–60 nt long, carrying an additional 6–8 nt complementary to the 3′ end of the selected miRNA (Fig. 3b). Once hybridized to the 3′ end of the selected miRNA, this SLP primes reverse transcription into cDNA, which then serves as the template for real-time qRT-PCR (Fig. 3c).

Fig. 3 Mechanism of the stem-loop PCR used to quantify miRNAs. The mature miRNA and the stem-loop primer (SLP) (a) hybridize during RT (b) to form a miRNA-SLP-cDNA complex (c). Forward and reverse primers are designed from the 5′ end of the miRNA and the 3′ stem region of the SLP, respectively (d), and are used in stem-loop PCR to amplify the target miRNA sequences (e)


Ashirbad Guria et al.

14. Perform a standard reverse transcription reaction with 1 μg of DNase-treated RNA, 1 μl of 1–10 μM SLP [27, 32], 2 μl of 10 mM dNTPs, 4 μl of 5× RT buffer, 1 μl of 20 units/μl RiboLock RNase inhibitor (RI), and 1 μl of 200 units/μl RevertAid RT at 25 °C for 5 min, followed by 42 °C for 1 h and enzyme inactivation at 70 °C for 5 min [32] (see Notes 36 and 37).
15. Design the forward primer (FP) from the first 13–15 nt at the 5′ end of the miRNA (to maximize the miRNA coverage of the FP) and adjust the annealing temperature (Ta) to 60 °C by adding G/C nt stepwise at the 5′ end of the FP (Fig. 3d).
16. Design a universal reverse primer (RP) from 12–14 nt at the 3′ end of the stem region of the SLP and adjust the Ta to 60 °C by adding G/C nt stepwise at the 5′ end of the RP. Reverse complement this sequence to obtain the RP (Fig. 3d).
17. Carry out RT-qPCR in triplicate with 1 μl of 50–100 ng SLP-derived cDNA as template, 1 μl of 10 pM FP, 1 μl of 10 pM RP, and 10 μl of 2× SYBR Green master mix in a 20 μl reaction at 50 °C for 2 min (preheating) and 95 °C for 10 min (initial denaturation), followed by 40 cycles of 95 °C for 10 s and 60 °C for 1 min, and a dissociation step from 60 °C to 95 °C.
18. Inspect the dissociation curve (melting curve) of each reaction to confirm the absence of nonspecific amplicons.
19. In parallel, run RT-qPCR in triplicate under the same conditions with random hexamer-derived cDNA (from the same RNA used for the SLP reaction) as template and primers specific for the internal control, the U6 gene.
20. Record the Ct value of the miRNA, normalize it against the internal control (U6 gene), and calculate its expression between the developmental stages by the 2^−ΔΔCt method [23].

3.13 Identification and Expression Analysis of miRNA Interacting mRNA(s)

1. Submit the sequence of the selected DE-circRNA-interacting miRNA, identified in Subheading 3.11, to psRNATarget and scan for its possible downstream mRNA target(s) against the available list of Oryza sativa mRNAs.
2. Select the most stringent target mRNA(s), that is, those with the lowest Expectation (E)-value (for example, E-value ≤ 2.0 to 3.0) (see Note 38).
3. Design RT-qPCR-specific primers for the selected mRNA target(s), following the criteria in Subheading 3.9.1, for expression studies at different developmental periods.
4. Optimize the designed primers by RT-PCR, then elute and clone the expected product and confirm the amplified target mRNA by Sanger sequencing.


5. Perform RT-qPCR as mentioned in Subheading 3.10 and analyze the data.
6. Analyze the differential expression of the target mRNAs between the developmental stages.
7. Compare the expression data of the mRNA target with the results for the miRNA (from Subheading 3.11) and the selected DE-circRNA (from Subheading 3.9) at each growth stage.
8. Develop a circRNA-miRNA-mRNA network axis for the three growth stages.

To further authenticate the function of the selected and studied DE-circRNA in plant development, circRNA-overexpressing transgenic plants can be constructed as described in our previous chapter.
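The relative quantification used in Subheadings 3.12 and 3.13 follows the 2^−ΔΔCt method [23]. The arithmetic can be sketched as below. This is a minimal illustration, not part of the published protocol: the function names are hypothetical, the Ct values are made-up placeholders rather than measured data, and technical triplicates are assumed to be averaged before normalization against U6.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean Ct of the target minus mean Ct of the internal control (e.g., U6)."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(sample_target, sample_ref, calibrator_target, calibrator_ref):
    """Expression of the sample relative to the calibrator, computed as 2^-ddCt."""
    dd_ct = (delta_ct(sample_target, sample_ref)
             - delta_ct(calibrator_target, calibrator_ref))
    return 2.0 ** (-dd_ct)


# Hypothetical triplicate Ct values for two developmental stages (placeholders).
stage1_mirna, stage1_u6 = [26.0, 26.5, 25.5], [18.0, 18.5, 17.5]
stage2_mirna, stage2_u6 = [24.0, 24.5, 23.5], [18.0, 18.5, 17.5]

# Stage 1 is taken as the calibrator, so its own fold change is 1 by construction.
print(fold_change(stage2_mirna, stage2_u6, stage1_mirna, stage1_u6))
```

With these placeholder values, ΔCt is 8 at stage 1 and 6 at stage 2, so ΔΔCt is −2 and the miRNA is four times as abundant at stage 2. The same function applies to the target mRNAs, provided the same internal control and calibrator stage are used throughout.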

4 Notes

1. Silwet L-77 can also be used in place of Tween 20 as a surfactant.
2. Add a little more water if the seeds dry out or take longer to germinate.
3. Puddling increases aeration and helps to remove weeds and conserve water.
4. This is an optional step that lets the plants acclimatize; otherwise, seedlings can be transferred directly to the greenhouse.
5. Do not wet the plants while they are still small, that is, until they reach the five-leaf stage.
6. Do not dry the RNA pellet completely, as this greatly decreases RNA solubility.
7. Upon quantification, the yield obtained will be ~1/10th of the starting material, which will be processed further for RNase R treatment. Therefore, to ensure that enough material is available for RNase R treatment, process the starting sample in duplicate.
8. Perform a gel check for the sample if the concentration of the processed RNA is adequate.
9. Template-independent amplification (TIA) is common in MDA because of self-priming and polymerase jumping, which depend on factors such as primer design and concentration, water purity, and the MDA reagents [16]. Optimize all of these parameters to avoid TIA before performing MDA on the sample of interest.
10. Slow cooling helps the primer anneal efficiently to the template.


11. The band in the positive control will be more intense than that in the test sample because phi29 DNA polymerase binds its templates with different efficiencies.
12. Four cycles of PCR are enough to reach the target of ten million paired-end reads for this experiment.
13. The higher the Phred score cutoff, the fewer reads are retained after processing, but the better the accuracy, reliability, and stringency of the data analysis.
14. Reads that map collinearly to the genome database are unlikely to derive from circRNA biogenesis because of their linear orientation.
15. Other software programs such as CIRCexplorer [18], CIRI [19], PcircRNA_finder [20], and DCC [21] can also be used.
16. The FPKM value is calculated for paired-end (PE) reads.
17. Oligo-dT-primed cDNA may also amplify the expected band with divergent primers, but the intensity will be many fold lower than with random-primed cDNA as template. Therefore, one can select the band and proceed further after confirming it by Sanger sequencing.
18. Sequence the product directly if it is larger than ~150 bp; otherwise, clone the product before sequencing.
19. Primers D1 and D2 (Fig. 3a) can be designed so that they serve for both RT-PCR and RT-qPCR, or separate primers can be designed for each.
20. RT-qPCR products can also be loaded on a 1.0% TAE-agarose gel to visualize specific and nonspecific amplicons and cross-check the analyzed data.
21. Start preparing the probe while the membrane blot is incubating for prehybridization, because the probe should be used within 2 h of labelling.
22. Running a long gel at low speed gives better resolution and a better result.
23. Pour extra solution over the Whatman paper with a Pasteur pipette and make sure that there are no air bubbles while preparing the sandwich. Repeat 2–3 times whenever a Whatman paper, gel, or membrane is placed on the platform.
24. Placing the gel upside down brings the bands closer to the membrane, so the transfer is easier and faster.
25. Cut the nylon membrane ~1 mm larger than the gel and mark the gel-facing side of the membrane at one corner with a ballpoint pen to identify the order of the samples loaded. Similarly, place two wet Whatman No. 3 papers on the membrane, followed by a dry Whatman No. 3 paper.


26. Blotting paper that is too thick or has too many layers can squeeze the gel and membrane excessively and distort the bands. Blotting paper that is too thin or has too few layers may cause incomplete and/or nonuniform transfer because the membrane and the gel are not in full contact.
27. Before discarding the gel, stain it with EtBr to check that the RNA was transferred successfully to the membrane.
28. Do not handle the membrane directly; hold the blot by one corner with forceps to avoid unwanted background noise.
29. Use the probe within 2 h of its preparation and keep it on ice until use.
30. For a 100 cm² blot, 1 μl of anti-DIG-AP was diluted in 10 ml of 1.0% blocking solution.
31. Optimize the exposure time until clear bands are visible on the X-ray film.
32. Prepare the Developer and Fixer solutions freshly and leave them overnight to settle before use.
33. Additional bands, if present, suggest alternative isoforms of the selected circRNA that share the same backsplice junction.
34. Run psRNATarget first with default parameters to obtain the list of all possible circRNA-interacting miRNA(s), and then adjust the E-value to select the most significant miRNA stringently.
35. Alternatively, small RNAs can be extracted directly with the TRIzol method by incubating the sample in isopropanol overnight at −80 °C.
36. Alternatively, incubate the same reaction mixture at 16 °C for 30 min followed by 42 °C for 60 min and enzyme inactivation at 70 °C for 10 min [30].
37. miRNA enriched with the mirVana miRNA isolation kit can also be used as the template for stem-loop RT-PCR.
38. Run psRNATarget first with default parameters to obtain the list of all possible target mRNA(s), and then adjust the E-value to select the most significant downstream miRNA target stringently.
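Note 13 relies on the Phred quality scale, in which a score Q and the base-calling error probability P are related by Q = −10 × log10(P). The conversion can be sketched as follows; the function names are hypothetical and are used here only for illustration.

```python
import math


def phred_to_error_prob(q):
    """Base-calling error probability implied by a Phred quality score."""
    return 10.0 ** (-q / 10.0)


def error_prob_to_phred(p):
    """Phred quality score implied by a base-calling error probability."""
    if not 0.0 < p <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(p)


# Error probability at a few commonly used quality cutoffs.
for q in (10, 20, 30):
    print(q, phred_to_error_prob(q))
```

A cutoff of Q20 therefore tolerates about one wrong base call in 100, and Q30 about one in 1000, which is why a stricter cutoff discards more reads but leaves more reliable data.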

Acknowledgments

We acknowledge the funds granted by the Science and Engineering Research Board (SERB) (Ref. No. EEQ/2018/000067, SB/EMEQ-070/2013 to GP and EMR/2016/000945 to SN). PS is a recipient of the Lady Tata Memorial Trust (LTMT) Junior Research Scholarship (2019-2020). Financial assistance was provided to AG by the Department of Biotechnology (DBT) (Ref. No. BT/PR23641/BPA/118/309/2017, BT/PR2061/AGR/36/707/2011) and by grants from BT/PR6466/COE/34/16/2012. The equipment grants from the Department of Science and Technology, Promotion of University Research and Scientific Excellence (DST-PURSE) and the University Grants Commission, Special Assistance Programme (UGC-SAP) are gratefully acknowledged.

References

1. Lu T, Cui L, Zhou Y et al (2015) Transcriptome-wide investigation of circular RNAs in rice. RNA 21(12):2076–2087. https://doi.org/10.1261/rna.052282.115
2. Darbani B, Noeparvar S, Borg S (2016) Identification of circular RNAs from the parental genes involved in multiple aspects of cellular metabolism in barley. Front Plant Sci 7:776. https://doi.org/10.3389/fpls.2016.00776
3. Tan J, Zhou Z, Niu Y et al (2017) Identification and functional characterization of tomato CircRNAs derived from genes involved in fruit pigment accumulation. Sci Rep 7(1):8594. https://doi.org/10.1038/s41598-017-08806-0
4. Wang Y, Yang M, Wei S et al (2017) Identification of circular RNAs and their targets in leaves of Triticum aestivum L. under dehydration stress. Front Plant Sci 7:2024. https://doi.org/10.3389/fpls.2016.02024
5. Wang Y, Wang Q, Gao L et al (2017) Integrative analysis of circRNAs acting as ceRNAs involved in ethylene pathway in tomato. Physiol Plant 161(3):311–321. https://doi.org/10.1111/ppl.12600
6. Wang J, Lin J, Wang H et al (2018) Identification and characterization of circRNAs in Pyrus betulifolia Bunge under drought stress. PLoS One 13(7):e0200692. https://doi.org/10.1371/journal.pone.0200692
7. Zhu Y, Jia J, Yang L et al (2019) Identification of cucumber circular RNAs responsive to salt stress. BMC Plant Biol 19:164. https://doi.org/10.1186/s12870-019-1712-3
8. Ashwal-Fluss R, Meyer M, Pamudurti NR et al (2014) circRNA biogenesis competes with pre-mRNA splicing. Mol Cell 56(1):55–66. https://doi.org/10.1016/j.molcel.2014.08.019
9. Conn VM, Hugouvieux V, Nayak A et al (2017) A circRNA from SEPALLATA3 regulates splicing of its cognate mRNA through R-loop formation. Nat Plants 3:17053. https://doi.org/10.1038/nplants.2017.53
10. Cheng J, Zhang Y, Li Z et al (2018) A lariat-derived circular RNA is required for plant development in Arabidopsis. Sci China Life Sci 61:204–213. https://doi.org/10.1007/s11427-017-9182-3
11. Wang Y, Xiong Z, Li Q et al (2019) Circular RNA profiling of the rice photo-thermosensitive genic male sterile line Wuxiang S reveals circRNA involved in the fertility transition. BMC Plant Biol 19:340. https://doi.org/10.1186/s12870-019-1944-2
12. Ye CY, Chen L, Liu C et al (2015) Widespread noncoding circular RNAs in plants. New Phytol 208:88–95. https://doi.org/10.1111/nph.13585
13. Jeck WR, Sharpless NE (2014) Detecting and characterising circular RNAs. Nat Biotechnol 32:453–461. https://doi.org/10.1038/nbt.2890
14. Guria A, Kumar KVV, Srikakulum N et al (2019) Circular RNA profiling by Illumina sequencing via template-dependent multiple displacement amplification. Biomed Res Int 2019:2756516. https://doi.org/10.1155/2019/2756516
15. Vijayachandra K, Palanichelvam K, Veluthambi K (1995) Rice scutellum induces Agrobacterium tumefaciens vir genes and T-strand generation. Plant Mol Biol 29:125–133. https://doi.org/10.1007/BF00019124
16. Wang W, Ren Y, Lu Y et al (2017) Template-dependent multiple displacement amplification for profiling human circulating RNA. BioTechniques 63(1):21–27. https://doi.org/10.2144/000114566
17. Gao Y, Zhang J, Zhao F (2018) Circular RNA identification based on multiple seed matching. Brief Bioinform 19(5):803–810. https://doi.org/10.1093/bib/bbx014
18. Zhang XO, Wang HB, Zhang Y et al (2014) Complementary sequence-mediated exon circularization. Cell 159:134–147. https://doi.org/10.1016/j.cell.2014.09.001
19. Gao Y, Wang J, Zhao F (2015) CIRI: an efficient and unbiased algorithm for de novo circular RNA identification. Genome Biol 16:4. https://doi.org/10.1186/s13059-014-0571-3
20. Chen L, Yu Y, Zhang X et al (2016) PcircRNA_finder: a software for circRNA prediction in plants. Bioinformatics 32:3528–3529. https://doi.org/10.1093/bioinformatics/btw496
21. Cheng J, Metge F, Dieterich C (2016) Specific identification and quantification of circular RNAs from sequencing data. Bioinformatics 32:1094–1096. https://doi.org/10.1093/bioinformatics/btv656
22. https://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/. Accessed 29 July 2020
23. Livak KJ, Schmittgen TD (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) method. Methods 25(4):402–408. https://doi.org/10.1006/meth.2001.1262
24. Bolser D, Staines DM, Pritchard E et al (2016) Ensembl plants: integrating tools for visualizing, mining, and analyzing plant genomics data. In: Edwards D (ed) Plant bioinformatics. Methods in molecular biology, vol 1374. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3167-5_6
25. Dai X, Zhuang Z, Zhao PX (2018) psRNATarget: a plant small RNA target analysis server (2017 release). Nucleic Acids Res 46(W1):W49–W54. https://doi.org/10.1093/nar/gky316
26. Kozomara A, Birgaoanu M, Griffiths-Jones S (2019) miRBase: from microRNA sequences to function. Nucleic Acids Res 47(D1):D155–D162. https://doi.org/10.1093/nar/gky1141
27. Varkonyi-Gasic E, Wu R, Wood M et al (2007) Protocol: a highly sensitive RT-PCR method for detection and quantification of microRNAs. Plant Methods 3:12. https://doi.org/10.1186/1746-4811-3-12
28. Kramer MF (2011) Stem-loop RT-qPCR for miRNAs. Curr Protoc Mol Biol Chapter 15:Unit 15.10. https://doi.org/10.1002/0471142727.mb1510s95
29. Chen C, Ridzon DA, Broomer AJ et al (2005) Real-time quantification of microRNAs by stem-loop RT-PCR. Nucleic Acids Res 33(20):e179. https://doi.org/10.1093/nar/gni178
30. Marcial-Quino J, Gómez-Manzo S, Fierro F et al (2016) Stem-loop RT-qPCR as an efficient tool for the detection and quantification of small RNAs in Giardia lamblia. Genes 7(12):131. https://doi.org/10.3390/genes7120131
31. Chen J, Lozach J, Garcia EW et al (2008) Highly sensitive and specific microRNA expression profiling using BeadArray technology. Nucleic Acids Res 36(14):e87. https://doi.org/10.1093/nar/gkn387
32. Yang LH, Wang SL, Tang LL et al (2014) Universal stem-loop primer method for screening and quantification of microRNA. PLoS One 9(12):e115293. https://doi.org/10.1371/journal.pone.0115293

Chapter 5

Identification of Intronic Lariat-Derived Circular RNAs in Arabidopsis by RNA Deep Sequencing

Taiyun Wang, Xiaotuo Zhang, and Binglian Zheng

Abstract

Lariat RNAs are well-known by-products of pre-mRNA splicing in eukaryotes. They are produced from excised introns when the 5′ splice site (5′ ss) joins with the branchpoint (BP) during splicing. In general, most lariat RNAs are linearized by RNA debranching enzyme 1 (DBR1) and then degraded as part of intron turnover. However, high-throughput RNA sequencing and bioinformatics methods have provided increasing evidence that many lariat RNAs accumulate stably under physiological conditions in both animals and plants. Here, we describe a large-scale analysis that uses RNA-sequencing data to systematically identify lariat RNAs (i.e., intronic circular RNAs) in Arabidopsis.

Key words RNA splicing, Lariat RNA, Intron, Circular RNA, RNA-seq, Bioinformatics

Luis María Vaschetto (ed.), Plant Circular RNAs: Methods and Protocols, Methods in Molecular Biology, vol. 2362, https://doi.org/10.1007/978-1-0716-1645-1_5, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021

1 Introduction

Eukaryotic genes frequently contain intervening sequences that are removed from precursor mRNAs (pre-mRNAs) by splicing [1]. Pre-mRNA splicing is a key posttranscriptional process in which noncoding introns are excised from transcripts and coding exons are ligated together to generate an mRNA [2]. Splicing proceeds through a two-step transesterification reaction, in which the 5′ splice site (5′ ss), the branchpoint (BP), and the 3′ splice site (3′ ss) are the substrates for splicing catalysis [3–5]. In the first step, the 2′ OH of the BP attacks the 5′ ss, generating a 5′ exon and a lariat intermediate RNA that is attached to the 3′ exon. These intermediates then undergo the second step of the reaction, in which the 3′ ss is attacked, releasing a ligated 5′ exon–3′ exon product and the excised intron lariat [6]. The excised intron lariat, termed lariat RNA, is traditionally thought to be debranched rapidly by debranching enzyme 1 (DBR1) and then degraded by exonucleases [7, 8]. Nevertheless, some lariat RNAs escape debranching and instead accumulate as circular intronic RNAs (ciRNAs) trimmed at the 3′ end by an exonuclease [9–12] or as stable intronic sequence RNAs (sisRNAs) [13–16].

In earlier studies, the detection of lariat RNAs was usually based on RT-PCR, which exploits the ability of reverse transcriptase to read through the BP [17]. However, with improved high-throughput sequencing technologies and optimized bioinformatics analyses, studies in animals and plants, including Xenopus tropicalis, Drosophila melanogaster, mouse (Mus musculus), chicken (Gallus gallus domesticus), zebrafish (Danio rerio), human, and Arabidopsis thaliana, have shown the genome-wide accumulation of lariat RNAs in circular form under normal physiological conditions [9–16, 18, 19], implying that the escape of lariat RNAs from debranching is evolutionarily conserved in eukaryotes.

Although the branchpoint (BP) is very important for lariat formation, the identification, selection, and regulation of BPs remain largely unexplored, especially in plants, in large part because of the technical challenges of identifying BPs [20]. Treatment with the highly processive 3′–5′ exonuclease Ribonuclease R (RNase R), which specifically degrades linear RNAs [21, 22], enriches lariat RNAs that traverse the BP nucleotide for circular RNA-seq analysis [23, 24]. Such analyses show that BP selection is highly conserved from plants to humans. First, the BP nucleotide is strictly constrained in its distance from the 3′ ss and is usually located 15–50 nt upstream of it [10, 23]. Second, the BP nucleotide exhibits a strong preference for adenine. Third, the sequences flanking the BP are uracil-rich. Fourth, uracil is preferred as the second nucleotide upstream of the BP [10, 23–26].

In this chapter, we present a detailed protocol for the genome-wide analysis of lariat RNAs by circular RNA high-throughput sequencing (circular RNA-seq). We describe the preparation of circular RNA libraries, the computational analysis of the circular RNA-seq profiles, and the identification of stable lariat RNAs (Fig. 1).

2 Materials

2.1 Plant Material

Arabidopsis thaliana plants were grown from seed in a growth room at 22 °C under a 16 h light/8 h dark cycle. Inflorescences or seedlings were collected for the extraction of total RNAs (see Note 1).


Fig. 1 Flowchart of the pipeline for lariat RNA identification

2.2 Extraction of Total RNAs

1. TRIzol™ Reagent (Invitrogen).
2. NanoDrop spectrophotometer (Thermo Fisher Scientific).
3. Diethyl pyrocarbonate (DEPC)-treated H2O.

2.3 Preparation of the Circular RNA-seq Libraries

1. RQ1 RNase-Free DNase (Promega).
2. RNase R (Lucigen).
3. 10× RNase R reaction buffer (Lucigen).
4. Ribonuclease Inhibitor (RRI, Takara).
5. miRNeasy Kit (QIAGEN).
6. DEPC-treated H2O.
7. Ribo-Zero kit (Epicentre).
8. Illumina TruSeq Stranded Total RNA High Throughput Sample Prep Kit.
9. Qubit DNA HS100 assay kit (Thermo Fisher Scientific).

2.4 Instrument or Equipment

1. Illumina HiSeq 2500 sequencer.
2. Qubit fluorometer (Thermo Fisher Scientific).
3. A computer (see Note 2).

2.5 Software

1. Trimmomatic (v0.38): http://www.usadellab.org/cms/?page=trimmomatic.
2. FastQC (v0.11.9): http://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
3. HISAT2 (v2.1.0): http://ccb.jhu.edu/software/hisat/index.shtml.
4. FeatureCounts (v2.0.1): http://subread.sourceforge.net/.

3 Methods

3.1 Extraction of Total RNAs

1. Use TRIzol™ Reagent to extract total RNAs from inflorescences or seedlings of Arabidopsis thaliana following the manufacturer's protocol.
2. Resuspend the total RNAs in DEPC-treated H2O and measure their concentration with a NanoDrop spectrophotometer.

3.2 Preparation of the Circular RNA-seq Libraries

1. Prepare a 50 μL combined DNase I and RNase R digestion reaction in a 1.5 mL nuclease-free microcentrifuge tube containing 5 μg of total RNAs, 5 units of RQ1 RNase-Free DNase, 20 units of RNase R, and 20 units of Ribonuclease Inhibitor in 1× RNase R reaction buffer (diluted from the 10× stock) (see Note 3).
2. Incubate the reaction at 37 °C for 30–60 min and proceed immediately to RNA extraction (see Note 4).
3. Purify the RNAs with the miRNeasy Kit following the manufacturer's protocol and elute them in 20 μL of nuclease-free water.
4. Treat the purified RNAs with a Ribo-Zero kit to obtain ribosomal RNA-depleted RNA (ribo− RNA).
5. Use the ribo− RNA to create an RNA library with the Illumina TruSeq Stranded Total RNA High Throughput Sample Prep Kit following the manufacturer's protocol.
6. Measure the library concentration by loading 1 μL of the purified library DNA in a Qubit fluorometer following the Qubit DNA HS Assay kit protocol (see Note 5).
7. Sequence the libraries, including the control sample, on the Illumina HiSeq 2500 sequencer. Sequence at least two replicates for each sample (see Note 6).

3.3 Collection of the Published RNA-seq Data

1. Search the NCBI website (National Center for Biotechnology Information, https://www.ncbi.nlm.nih.gov/) with key words for the plant and the sequencing type, for example, “Arabidopsis thaliana circRNA seq”.
2. Download the search results according to your needs.

3.4 Computational Analysis of the RNA-seq Profiles

1. Remove low-quality sequencing reads and adapter sequences using Trimmomatic (version 0.38) [16]. Set the parameters of Trimmomatic to “trimmomatic PE -phred33 LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:20 ILLUMINACLIP:TruSeq3-PE.fa:2:30:10” (see Note 7).
2. Use FastQC to evaluate the quality of the filtered data. An example of the output is shown in Fig. 2 (see Note 8). If the base-calling error rate is ≤1% (see Note 9), that is, the Q value is ≥20, the data are considered qualified (see Note 10). If the FastQC result is unqualified, repeat the removal of adapters and low-quality reads until it is qualified.
3. Map the filtered sequencing reads to the genome of Arabidopsis thaliana (version TAIR10) using HISAT2 [27] (see Note 11). This yields a SAM file containing the genomic location of each mapped read.
4. Because the BP nucleotide is strictly constrained in its distance from the 3′ ss and is usually located 15–50 nt upstream of it, we define the nucleotide located 20 nt upstream of the 3′ ss as the branchpoint (BP) of the intron (see Note 12). The intronic sequence from the 5′ ss to the BP is called the lariat-RNA-producing region (Region L), and the region from the BP to the 3′ ss is called the 3′ tail (Region T). Then, use FeatureCounts [28] to quantify the reads in the SAM file that map to Region L and Region T, respectively. When quantifying the reads mapped to Region L, set the parameter “--minOverlap” to 75% (see Note 13) to eliminate interference from reads flanking the 5′ ss (in featureCounts, “--minOverlap” takes a number of bases, so this corresponds to 75% of the read length; “--fracOverlap 0.75” expresses the same requirement as a fraction).
5. Normalize the intronic reads of Region L and Region T into RPK (Reads Per Kilobase), respectively (see Note 14):

\[ \mathrm{RPK} = \frac{\text{MappedReads} \times 10^{3}}{\text{RegionLength}} \]

Fig. 2 Per base sequence quality of FastQC. The numbers on the X axis represent the position of each base in the sequencing reads; the numbers on the Y axis represent the quality score Q (−10 × log10(P value)). The green, yellow, and red areas indicate excellent, good, and bad sequencing quality, respectively

3.5 Identification of Stable Lariat RNAs

RNase R treatment effectively degrades linear RNAs and the 3′ linear tail of lariat RNAs (i.e., Region T) [17, 21, 22], thereby enriching the circular part of lariat RNAs (i.e., Region L). Stable lariat RNAs are identified on this basis as follows.

1. For each intron, calculate the ratio of the RPK of Region L to the RPK of Region T, which represents the enrichment coefficient of the lariat-RNA-producing region relative to its linear transcripts.
2. Define introns with an enrichment coefficient ≥ 5 as producing stable lariat RNAs in Arabidopsis.
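The region definitions, the RPK normalization, and the enrichment coefficient described in Subheadings 3.4 and 3.5 can be sketched as below. This is a minimal illustration under stated assumptions rather than the authors' pipeline: coordinates are taken to be 1-based and inclusive, the BP nucleotide itself is assigned to Region L, an intron with zero reads in Region T is given an infinite coefficient when Region L has reads, and all function names and the example read counts are hypothetical.

```python
def split_intron(start, end, strand, bp_offset=20):
    """Split an intron into Region L (5' ss to BP) and Region T (BP to 3' ss).

    start/end are 1-based inclusive genomic coordinates (assumption).
    The BP is placed bp_offset nt upstream of the 3' ss and assigned to Region L.
    Returns (region_l, region_t) as (start, end) tuples.
    """
    if end - start + 1 <= bp_offset:
        raise ValueError("intron too short for the chosen BP offset")
    if strand == "+":            # 5' ss at start, 3' ss at end
        bp = end - bp_offset
        return (start, bp), (bp + 1, end)
    if strand == "-":            # 5' ss at end, 3' ss at start
        bp = start + bp_offset
        return (bp, end), (start, bp - 1)
    raise ValueError("strand must be '+' or '-'")


def region_length(region):
    """Length in bp of a 1-based inclusive (start, end) region."""
    return region[1] - region[0] + 1


def rpk(mapped_reads, length_bp):
    """Reads Per Kilobase: MappedReads x 10^3 / RegionLength."""
    return mapped_reads * 1e3 / length_bp


def enrichment(reads_l, reads_t, region_l, region_t):
    """RPK of Region L divided by RPK of Region T."""
    rpk_l = rpk(reads_l, region_length(region_l))
    rpk_t = rpk(reads_t, region_length(region_t))
    if rpk_t == 0:               # assumption: no tail reads at all
        return float("inf") if rpk_l > 0 else 0.0
    return rpk_l / rpk_t


def is_stable_lariat(coefficient, threshold=5.0):
    """Apply the enrichment coefficient cutoff (>= 5 in this chapter)."""
    return coefficient >= threshold


# Hypothetical 200-nt intron on the plus strand with made-up read counts.
region_l, region_t = split_intron(1001, 1200, "+")
coefficient = enrichment(90, 1, region_l, region_t)
print(region_l, region_t, coefficient, is_stable_lariat(coefficient))
```

In this example Region L spans 180 nt and Region T 20 nt, so 90 and 1 reads give RPK values of 500 and 50 and an enrichment coefficient of 10, which passes the cutoff. In practice the two region sets would be written to an annotation file for FeatureCounts, and the read counts would come from its output.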

4 Notes

1. Plant tissue at other growth stages can also be collected for circular RNA-seq.
2. Either a laptop or a desktop computer can be used; a Linux system is recommended.
3. RNase R digestion can be verified by 1% agarose gel or 8% polyacrylamide gel electrophoresis. The 28S/18S/5S rRNA bands should become fainter or invisible after RNase R treatment.
4. A long incubation time is not recommended, as it will degrade circular RNAs.
5. Do not quantify the library with a spectrophotometer, as this leads to unevenness among samples.
6. Paired-end sequencing is highly recommended, as it reflects gene expression abundance more accurately.
7. “PE” indicates paired-end sequencing data; “-phred33” indicates the phred33 quality value system; “LEADING:20” removes bases with quality lower than 20 from the beginning of the reads; “TRAILING:20” removes bases with quality lower than 20 from the end of the reads; “SLIDINGWINDOW:4:20” scans each read from the 5′ end with a 4-base window and cuts the read once the average base quality within the window falls below 20; “MINLEN:20” removes sequences shorter than 20 nt after the low-quality bases and adaptor sequences have been trimmed; “ILLUMINACLIP:TruSeq3-PE.fa:2:30:10” indicates that the adaptor sequences are taken from the file “TruSeq3-PE.fa”, allowing two mismatches in the seed sequence (30 and 10 are the palindrome and simple clip thresholds).
8. Each box plot in the figure represents the distribution of the quality (Q) at a single base position, summarizing the quality of all sequences at that site. The upper line, box top, middle line, box bottom, and bottom line represent the 90/75/50/25/10% quantiles at that base position.
9. A quality score of 20 corresponds to a 1% base-calling error rate.
10. The higher the value of Q (maximum 40), the better the quality.
11. TopHat2 [19] or other software with similar functions can also be used.
12. The definition of the BP can be adjusted according to your needs.
13. “--minOverlap” indicates that at least 75% of a sequencing read must map within the region for the read to be counted.
14. “MappedReads” represents the number of reads in Region L or Region T quantified by FeatureCounts; “RegionLength” represents the length of Region L or Region T (bp).

References

1. Konarska MM, Grabowski PJ, Padgett RA et al (1985) Characterization of the branch site in lariat RNAs produced by splicing of mRNA precursors. Nature 313:552–557. https://doi.org/10.1038/313552a0
2. Lee Y, Rio DC (2015) Mechanisms and regulation of alternative pre-mRNA splicing. Annu Rev Biochem 84:1–33. https://doi.org/10.1146/annurev-biochem-060614-034316
3. Reed R, Maniatis T (1985) Intron sequences involved in lariat formation during pre-mRNA splicing. Cell 41:95–105. https://doi.org/10.1016/0092-8674(85)90064-9
4. Frendewey D, Keller W (1985) Stepwise assembly of a pre-mRNA splicing complex requires U-snRNPs and specific intron sequences. Cell 42:355–367. https://doi.org/10.1016/s0092-8674(85)80131-8
5. Aebi M, Hornig H, Padgett RA et al (1986) Sequence requirements for splicing of higher eukaryotic nuclear pre-mRNA. Cell 47:555–565. https://doi.org/10.1016/0092-8674(86)90620-3
6. Ruskin B, Krainer AR, Maniatis T et al (1984) Excision of an intact intron as a novel lariat structure during pre-mRNA splicing in vitro. Cell 38:317–331. https://doi.org/10.1016/0092-8674(84)90553-1
7. Ruskin B, Green M (1985) An RNA processing activity that debranches RNA lariats. Science 229:135–140. https://doi.org/10.1126/science.2990042
8. Nam K, Lee G, Trambley J et al (1997) Severe growth defect in a Schizosaccharomyces pombe mutant defective in intron lariat degradation. Mol Cell Biol 17:809–818. https://doi.org/10.1128/mcb.17.2.809
9. Talhouarne GJS, Gall JG (2018) Lariat intronic RNAs in the cytoplasm of vertebrate cells. Proc Natl Acad Sci U S A 115:201808816. https://doi.org/10.1073/pnas.1808816115
10. Zhang X, Zhang Y, Wang T et al (2019) A comprehensive map of intron branchpoints and lariat RNAs in plants. Plant Cell 31:956–973. https://doi.org/10.1105/tpc.18.00711
11. Cheng J, Zhang Y, Li Z et al (2018) A lariat-derived circular RNA is required for plant development in Arabidopsis. Sci China Life Sci 61:204–213. https://doi.org/10.1007/s11427-017-9182-3
12. Li Z, Wang S, Cheng J et al (2016) Intron lariat RNA inhibits microRNA biogenesis by sequestering the dicing complex in Arabidopsis. PLoS Genet 12:e1006422. https://doi.org/10.1371/journal.pgen.1006422
13. Tay ML-I, Pek JW (2017) Maternally inherited stable intronic sequence RNA triggers a self-reinforcing feedback loop during development. Curr Biol 27:1062–1067. https://doi.org/10.1016/j.cub.2017.02.040
14. Osman I, Pek JW (2018) A sisRNA/miRNA axis prevents loss of germline stem cells during starvation in Drosophila. Stem Cell Rep 11:4–12. https://doi.org/10.1016/j.stemcr.2018.06.002
15. Wong JT, Akhbar F, Ng AYE et al (2017) DIP1 modulates stem cell homeostasis in Drosophila through regulation of sisR-1. Nat Commun 8:759. https://doi.org/10.1038/s41467-017-00684-4
16. Ng SSJ, Zheng RT, Osman I et al (2018) Generation of Drosophila sisRNAs by independent transcription from cognate introns. iScience 4:68–75. https://doi.org/10.1016/j.isci.2018.05.010
17. Suzuki H, Zuo Y, Wang J et al (2006) Characterization of RNase R-digested cellular RNA source that consists of lariat and circular RNAs from pre-mRNA splicing. Nucleic Acids Res 34:e63. https://doi.org/10.1093/nar/gkl151
18. Zhang Y, Zhang X-O, Chen T et al (2013) Circular intronic long noncoding RNAs. Mol Cell 51:792–806. https://doi.org/10.1016/j.molcel.2013.08.017
19. Talhouarne GJS, Gall JG (2014) Lariat intronic RNAs in the cytoplasm of Xenopus tropicalis oocytes. RNA 20:1476–1487. https://doi.org/10.1261/rna.045781.114
20. Neil CR, Fairbrother WG (2019) Intronic RNA: Ad'junk' mediator of posttranscriptional gene regulation. Biochim Biophys Acta Gene Regul Mech 1862:194439. https://doi.org/10.1016/j.bbagrm.2019.194439
21. Jeck WR, Sorrentino JA, Wang K et al (2013) Circular RNAs are abundant, conserved, and associated with ALU repeats. RNA 19:141–157. https://doi.org/10.1261/rna.035667.112
22. Xiao M-S, Wilusz JE (2019) An improved method for circular RNA purification using RNase R that efficiently removes linear RNAs containing G-quadruplexes or structured 3′ ends. Nucleic Acids Res 47:8755–8769. https://doi.org/10.1093/nar/gkz576
23. Taggart AJ, Lin C-L, Shrestha B et al (2017) Large-scale analysis of branchpoint usage across species and cell lines. Genome Res 27:639–649. https://doi.org/10.1101/gr.202820.115
24. Mercer TR, Clark MB, Andersen SB et al (2015) Genome-wide discovery of human splicing branchpoints. Genome Res 25:290–303. https://doi.org/10.1101/gr.182899.114
25. Bitton DA, Rallis C, Jeffares DC et al (2014) LaSSO, a strategy for genome-wide mapping of intronic lariats and branch points using RNA-seq. Genome Res 24:1169–1179. https://doi.org/10.1101/gr.166819.113
26. Pineda JMB, Bradley RK (2018) Most human introns are recognized via multiple and tissue-specific branchpoints. Genes Dev 32:577–591. https://doi.org/10.1101/gad.312058.118
27. Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12:357–360. https://doi.org/10.1038/nmeth.3317
28. Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30:923–930. https://doi.org/10.1093/bioinformatics/btt656

Chapter 6
Identification and Functional Characterization of Viroid Circular RNAs
José-Antonio Daròs

Abstract
Viroids are relatively small, noncoding, plant circular RNAs. In contrast to other plant circular RNAs of endogenous origin, viroids are infectious agents able to replicate autonomously in the appropriate host. Because of their highly base-paired structures, they can be purified from infected tissue extracts using nonionic CF11 chromatography. Depending on the host plant species, viroid RNA preparation may also require polysaccharide removal by extraction with 2-methoxyethanol followed by precipitation with cetyltrimethylammonium bromide. Electrophoretic analyses of such preparations frequently show differential bands corresponding to the viroid circular molecules, which are absent from preparations of healthy plants. These RNA preparations can also be used for viroid transmission to new plants by mechanical inoculation.

Key words Viroid, Plant circular RNA, Infectious RNA, Noncoding RNA, RNA replication, RNA circularization, Ribozyme

1 Introduction
Viroids are infectious agents of plants that consist exclusively of a relatively small, highly self-complementary, noncoding, circular RNA that, in the species currently known, ranges from 246 to 434 nucleotides (nt). While we now recognize a myriad of circular RNAs in plants and other organisms, playing important roles in health and disease [1], for many decades viroids were the only well-characterized plant circular RNAs [2, 3]. The more than thirty viroid species known to date are classified into two families [4]. Notably, plants also host viroid-like satellite RNAs, which share many properties with viroids, including circularity, but require a helper virus for infection [5]. Most viroids, such as potato spindle tuber viroid (PSTVd), belong to the family Pospiviroidae.

Luis María Vaschetto (ed.), Plant Circular RNAs: Methods and Protocols, Methods in Molecular Biology, vol. 2362, https://doi.org/10.1007/978-1-0716-1645-1_6, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021


These viroids contain a distinctive central conserved region (CCR) in the center of their molecules and replicate and accumulate in the nucleus of infected cells. In contrast, five viroids, such as avocado sunblotch viroid (ASBVd), lack a CCR but contain hammerhead ribozymes in the strands of both polarities; they belong to the family Avsunviroidae and replicate and accumulate in the chloroplasts of infected cells.

Regardless of family, viroids replicate through an RNA-to-RNA rolling-circle mechanism in which the viroid circular RNAs serve as templates for host RNA polymerases to produce longer-than-unit oligomeric RNAs of complementary (or minus) polarity [6]. In the family Pospiviroidae, whose members follow an asymmetric pathway of the rolling-circle mechanism, the oligomeric intermediates of minus polarity are used directly as templates to copy plus oligomeric RNAs, which are cleaved by host type-III RNases [7] to monomeric-length viroid RNAs that are ultimately circularized by the host DNA ligase I [8]. In contrast, in members of the family Avsunviroidae, which follow a symmetric pathway, the oligomeric RNA intermediates of minus polarity, which contain hammerhead ribozymes, self-cleave to monomeric-length RNAs that are circularized by the host tRNA ligase [9]. In this symmetric version of the rolling-circle mechanism, monomeric circular RNAs of minus polarity serve as templates in a second rolling circle for the production of oligomeric RNAs of plus polarity that, again, are self-cleaved by the embedded hammerhead ribozymes to monomeric-length RNAs, which are finally circularized by the host tRNA ligase to produce the viroid progeny.

Several properties distinguish viroids from other plant circular RNAs. Chief among them is infectivity. Viroids are infectious agents able to replicate autonomously in the appropriate host plants, although they do so by recruiting host enzymes and structures. This means that they frequently induce symptoms of infection, although not always, since some so-called latent viroids replicate and move systemically without inducing apparent symptoms in their host plants [10]. Self-replication also means that viroids frequently accumulate to substantial levels in infected plants and can then be easily detected by electrophoretic analysis as differential RNA species when preparations from infected and noninfected plants are compared. Again, this is not always true, since some viroids replicate to low levels in their hosts and can only be revealed by northern-blot hybridization, bioassay, deep sequencing, or other sensitive techniques [11].

Here, I describe materials and methods to extract and purify viroid circular RNA molecules from infected plant tissues. First, total RNAs are extracted by homogenizing the tissue in a mix of phenol and an aqueous buffer. Then, highly self-complementary


RNAs, such as viroids, are enriched in the extract using cellulose CF11 chromatography [12]. These RNA preparations may be directly subjected to electrophoretic analyses to reveal the viroid circular forms [13]. However, some plant species typically yield RNA preparations that are highly contaminated with polysaccharides, which preclude optimum electrophoretic analyses. In these cases, an extraction with 2-methoxyethanol followed by precipitation with cetyltrimethylammonium bromide (CTAB) [14] is recommended. Electrophoretic analysis of RNA preparations obtained using these methods from noninfected and ASBVd-infected avocado leaves clearly shows the differential band corresponding to the circular ASBVd RNA of 247 nt (Fig. 1). These RNA preparations can also be used to transmit the viroid to new plants by mechanical inoculation.
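As an aside on the rolling-circle mechanism outlined earlier in this Introduction, the following is a minimal string-based sketch in Python of the symmetric pathway (circular template, complementary oligomer, cleavage to monomers, circularization, second round). It is only a toy model: the 12-nt sequence, the cleavage positions, and all function names are hypothetical choices made for illustration, and no enzyme, ribozyme, or real viroid sequence is modeled.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Complementary strand of `rna`, written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def rolling_circle_transcript(circle, rounds):
    """Linear oligomer obtained by copying a circular template `rounds` times.

    `circle` is the circular template written as a linear 5'->3' string whose
    two ends are joined; the product has the complementary polarity.
    """
    return reverse_complement(circle) * rounds


def cleave_to_monomers(oligomer, unit_length, site):
    """Cut at position `site` of every unit and return the unit-length pieces."""
    cuts = range(site, len(oligomer) + 1, unit_length)
    return [oligomer[a:b] for a, b in zip(cuts, cuts[1:])]


def same_circle(a, b):
    """True if two circular molecules are identical up to rotation."""
    return len(a) == len(b) and a in b + b


# Hypothetical 12-nt "plus" circle (not a viroid sequence).
plus_circle = "GGCAUCAAGUCC"
unit = len(plus_circle)

# First rolling circle: plus circle -> minus oligomer -> minus monomers.
minus_oligomer = rolling_circle_transcript(plus_circle, rounds=3)
minus_circle = cleave_to_monomers(minus_oligomer, unit, site=4)[0]

# Second rolling circle (symmetric pathway): minus circle -> plus oligomer.
plus_oligomer = rolling_circle_transcript(minus_circle, rounds=3)
progeny = cleave_to_monomers(plus_oligomer, unit, site=7)[0]

# Asymmetric pathway: the minus oligomer itself is copied, with no minus circle.
asymmetric_plus_oligomer = reverse_complement(minus_oligomer)

print(same_circle(progeny, plus_circle))
```

In this sketch the cleaved monomers are rotations of the unit sequence, which is why the comparison is made on circles rather than on linear strings.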

Fig. 1 Electrophoretic separation of RNA preparations from leaves of healthy (lane 1) and ASBVd-infected (lanes 2 to 4) avocado trees. RNA aliquots were separated by PAGE in a denaturing polyacrylamide gel containing 8 M urea. The gel was stained with ethidium bromide. A distinctive band (indicated by an arrow) corresponding to the circular ASBVd of 247 nt can be observed in the lanes of the infected trees

2 Materials
Prepare all solutions using ultrapure water and analytical or biochemistry grade reagents. Follow all waste disposal regulations.
1. Extraction buffer: 125 mM Tris–HCl, pH 9.0, 0.75% sodium dodecyl sulfate (SDS), 15 mM ethylenediaminetetraacetic acid (EDTA), 1 M 2-mercaptoethanol. Before use, mix 12.5 mL 1 M Tris–HCl, pH 9.0, 7.5 mL 10% SDS, 3 mL 0.5 M EDTA, pH 8.0, and 7.1 mL 2-mercaptoethanol, and bring to 100 mL with water.
2. Water-saturated phenol (pH 8.0): mix 400 mL 90% phenol and 120 mL water and store at 4 °C in a dark bottle. When needed, take the aliquot of water-saturated phenol to be used and, while stirring, add 10 M NaOH to bring the pH to approximately 8.0. Measure pH with indicator paper.
3. Tissue homogenizer (e.g., Polytron, Kinematica).
4. Nonionic cellulose CF11 (Whatman).
5. Salt–Tris–EDTA (STE): 50 mM Tris–HCl, pH 7.0, 100 mM NaCl, and 1 mM EDTA. Prepare a 10× stock solution by weighing 60.6 g Tris, 58.5 g NaCl, and 3.72 g Na2EDTA·2H2O. Adjust to pH 7.0 with HCl and bring to 1 L with water. Autoclave and store at room temperature.
6. 35% ethanol–STE solution: mix 36.5 mL 96% ethanol and 10 mL 10× STE, and bring to 100 mL with water.
7. 2.5 M potassium phosphate, pH 8.0: weigh 57.05 g K2HPO4 (·3H2O), dissolve in water, and bring to 100 mL; weigh 34.02 g KH2PO4, dissolve in water, and bring to 100 mL. Mix the right amounts of both solutions to reach pH 8.0. Autoclave and store at room temperature.
8. 3 M sodium acetate, pH 5.5: weigh 40.8 g sodium acetate (·3H2O), add water to approximately 85 mL, and dissolve. Adjust pH to 5.5 with acetic acid and bring to 100 mL with water. Autoclave and store at room temperature.
9. 1% CTAB: weigh 1 g CTAB and dissolve in water. Bring to 100 mL and store at 4 °C.
10. 1 M NaCl: weigh 5.84 g NaCl and dissolve in water. Bring to 100 mL, autoclave, and store at room temperature.
11. 0.5 M K2HPO4: weigh 8.71 g K2HPO4 and dissolve in water. Bring to 100 mL, autoclave, and store at room temperature.
12. 10% carborundum in 50 mM K2HPO4: weigh 100 mg carborundum in an Eppendorf tube. Add 0.9 mL water and 0.1 mL 0.5 M K2HPO4. Vortex exhaustively and mix before use.
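The stock volumes in several of the recipes above follow from the dilution relation C1 × V1 = C2 × V2. The short Python sketch below reproduces that arithmetic for the extraction buffer and the 35% ethanol–STE solution; the function name `stock_volume_ml` and the dictionary layout are illustrative assumptions, not part of the protocol.

```python
def stock_volume_ml(stock_conc, final_conc, final_volume_ml):
    """Stock volume (mL) needed to reach final_conc in final_volume_ml.

    Uses C1 * V1 = C2 * V2; both concentrations must share the same unit.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * final_volume_ml / stock_conc


# Extraction buffer (item 1), 100 mL final volume.
extraction_buffer = {
    "1 M Tris-HCl, pH 9.0": stock_volume_ml(1000, 125, 100),  # concentrations in mM
    "10% SDS": stock_volume_ml(10, 0.75, 100),                # concentrations in %
    "0.5 M EDTA, pH 8.0": stock_volume_ml(500, 15, 100),      # concentrations in mM
}

# 35% ethanol-STE solution (item 6), 100 mL final volume.
ethanol_96_ml = stock_volume_ml(96, 35, 100)
ste_10x_ml = stock_volume_ml(10, 1, 100)

for name, volume in extraction_buffer.items():
    print(f"{name}: {volume:.1f} mL")
print(f"96% ethanol: {ethanol_96_ml:.1f} mL; 10x STE: {ste_10x_ml:.1f} mL")
```

The computed values agree with the volumes stated in items 1 and 6 (12.5, 7.5, and 3 mL; about 36.5 and 10 mL).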

3 Methods
For viroid identification and characterization, total RNA is first extracted from the plant tissue and then viroids are enriched by chromatography. Polysaccharides may be removed if required. Finally, viroids are characterized after PAGE separation (Fig. 1). RNA preparations can also be used for mechanical inoculation of the viroid into new host plants.

3.1 RNA Extraction
1. Add 40 mL of water-saturated phenol (pH 8.0) and 16 mL extraction buffer to a beaker.
2. Cut 10 g of plant tissue in pieces (approximately 1 cm²) and add to the beaker (see Note 1).
3. Homogenize with a Polytron or similar homogenization device.
4. Transfer the extract to centrifuge tubes and clarify by centrifuging for 15 min at 7500 × g.
5. Recover the aqueous phase and reextract with half a volume of water-saturated phenol (pH 8.0).
6. Centrifuge again for 15 min at 7500 × g and recover the aqueous phase.

3.2 Cellulose Chromatography
1. In a centrifuge tube, adjust the extract volume to 20 mL with water.
2. Add 3.7 mL of 10× STE, 13.4 mL of 96% ethanol, and 1.25 g nonionic cellulose.
3. Rotate the mix for 1 h.
4. Sediment the cellulose by centrifuging for 3 min at 1000 × g and discard the supernatant.
5. Wash the cellulose three times with 30 mL 35% ethanol–STE solution, recovering the cellulose each time by centrifugation for 3 min at 1000 × g.
6. Elute the RNA in three steps with 3.33 mL STE. After adding each STE aliquot, vortex exhaustively, centrifuge for 3 min at 1000 × g, and recover the supernatant.
7. Combine the three supernatants and sediment cellulose remains by centrifugation for 5 min at 7500 × g. Recover the supernatant (approximately 10 mL).
8. Precipitate RNAs by adding 25 mL cold ethanol (−20 °C). Mix and incubate for a minimum of 2 h at −20 °C (see Note 2).
9. Sediment the RNA by centrifuging for 15 min at 7500 × g at 4 °C. Gently discard the supernatant and carefully wash the pellet with cold (−20 °C) 70% ethanol.
10. Air-dry the RNA sediment and finally resuspend in 250 μL water (see Note 3).
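The volumes in steps 1 and 2 can be checked against the wash solution used in step 5. The following Python sketch computes the approximate composition of the binding mix; it assumes additive volumes, treats the water-adjusted extract as contributing neither STE nor ethanol, and ignores the volume of the cellulose. The function name and dictionary keys are illustrative only.

```python
def mix_concentrations(components):
    """Final concentrations after mixing, assuming additive volumes.

    components: list of (volume_ml, {solute_name: concentration}) pairs.
    Returns (total_volume_ml, {solute_name: final_concentration}).
    """
    total_ml = sum(volume for volume, _ in components)
    final = {}
    for volume, solutes in components:
        for name, conc in solutes.items():
            final[name] = final.get(name, 0.0) + conc * volume / total_ml
    return total_ml, final


# Binding mix from steps 1 and 2 of Subheading 3.2.
binding_mix = [
    (20.0, {}),                     # extract adjusted with water
    (3.7, {"STE (x)": 10.0}),       # 10x STE
    (13.4, {"ethanol (%)": 96.0}),  # 96% ethanol
]

total_ml, final = mix_concentrations(binding_mix)
print(f"Total volume: {total_ml:.1f} mL")
for name, conc in final.items():
    print(f"{name}: {conc:.2f}")
```

Under these assumptions the mix comes out close to 1× STE and 35% ethanol, that is, essentially the same conditions as the 35% ethanol–STE wash solution.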


3.3 Polysaccharide Removal with 2-Methoxyethanol
1. Mix 2 mL RNA extract, 2 mL 2.5 M potassium phosphate, pH 8.0, and 2 mL 2-methoxyethanol, vortex exhaustively, and incubate on ice for 5 min.
2. Centrifuge for 3 min at 1000 × g and recover the aqueous phase (around 4.8 mL).
3. Add 0.05 volumes 3 M sodium acetate, pH 5.5, and 0.5 volumes 1% CTAB.
4. Vortex and keep on ice for 5 min.
5. Centrifuge for 15 min at 12,000 × g and discard the supernatant.
6. Air-dry the sediment and resuspend in 2 mL 1 M NaCl.
7. Add 3 volumes cold 96% ethanol (−20 °C) and incubate at least 2 h at −20 °C.
8. Centrifuge for 15 min at 12,000 × g at 4 °C and discard the supernatant.
9. Wash the sediment with cold 70% ethanol (−20 °C) and air-dry.
10. Resuspend in an appropriate amount of water according to the downstream application.

3.4 Viroid Inoculation
1. Prepare the right amount of inoculum, considering 5 μL per inoculated leaf.
2. Mix on ice the RNA preparation, 0.1 volumes of 0.5 M K2HPO4, and water to the desired final volume.
3. On the adaxial side of the leaf to be inoculated, deposit 5 μL 10% carborundum dispersion in 50 mM K2HPO4 (see Note 4) and 5 μL viroid preparation. Using a glass rod, mix both drops and gently distribute them over the leaf surface.
4. Cultivate plants regularly in the greenhouse or growth chamber for about 1 month.
5. Monitor disease symptoms or analyze the presence of the viroid circular RNA in the upper noninoculated tissue.
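The inoculum volumes in steps 1 and 2 can be worked out as in the Python sketch below. It assumes that "0.1 volumes" refers to the final inoculum volume, which gives 50 mM K2HPO4, the same concentration as in the carborundum dispersion; that reading, the function name, and the example numbers (12 leaves, 20 μL of RNA preparation) are assumptions made for illustration.

```python
def inoculum_volumes_ul(n_leaves, rna_ul, per_leaf_ul=5.0, buffer_fraction=0.1):
    """Volumes (in microliters) for an inoculum of n_leaves * per_leaf_ul.

    buffer_fraction is the fraction of the final volume supplied as
    0.5 M K2HPO4 (0.1 volumes gives 50 mM final concentration).
    """
    final_ul = n_leaves * per_leaf_ul
    buffer_ul = buffer_fraction * final_ul
    water_ul = final_ul - buffer_ul - rna_ul
    if water_ul < 0:
        raise ValueError("RNA volume plus buffer exceeds the final inoculum volume")
    return {
        "final_ul": final_ul,
        "k2hpo4_0.5M_ul": buffer_ul,
        "rna_ul": rna_ul,
        "water_ul": water_ul,
    }


volumes = inoculum_volumes_ul(n_leaves=12, rna_ul=20.0)
final_k2hpo4_mM = 500.0 * volumes["k2hpo4_0.5M_ul"] / volumes["final_ul"]

for name, value in volumes.items():
    print(f"{name}: {value:.1f}")
print(f"Final K2HPO4: {final_k2hpo4_mM:.0f} mM")
```

In practice a small excess over the calculated final volume may be convenient to allow for pipetting losses; the sketch does not add one.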

4 Notes
1. The amount of tissue and extraction buffer can be scaled up and down.
2. Incubation can be prolonged overnight or longer.
3. Although 250 μL is recommended for a concentrated preparation, adjust the amount of water as needed to properly dissolve the precipitate.
4. Mix the 10% carborundum dispersion before pipetting.


Acknowledgments
This work was supported by grants BIO2017-83184-R and BIO2017-91865-EXP from the Ministerio de Ciencia e Innovación (Spain) through the Agencia Estatal de Investigación, and cofinanced by the European Regional Development Fund (European Commission).

References
1. Haddad G, Lorenzen JM (2019) Biogenesis and function of circular RNAs in health and in disease. Front Pharmacol 10:428. https://doi.org/10.3389/fphar.2019.00428
2. Diener TO (1971) Potato spindle tuber "virus" IV. A replicating, low molecular weight RNA. Virology 45:411–428
3. Sänger HL, Klotz G, Riesner D et al (1976) Viroids are single-stranded covalently closed circular RNA molecules existing as highly base-paired rod-like structures. Proc Natl Acad Sci U S A 73:3852–3856. https://doi.org/10.1073/pnas.73.11.3852
4. Di Serio F, Flores R, Verhoeven JT et al (2014) Current status of viroid taxonomy. Arch Virol 159:3467–3478. https://doi.org/10.1007/s00705-014-2200-6
5. Elena SF, Dopazo J, de la Peña M et al (2001) Phylogenetic analysis of viroid and viroid-like satellite RNAs from plants: a reassessment. J Mol Evol 53:155–159
6. Branch AD, Robertson HD (1984) A replication cycle for viroids and other small infectious RNAs. Science 223:450–455
7. Gas M-E, Molina-Serrano D, Hernández C et al (2008) Monomeric linear RNA of citrus Exocortis viroid resulting from processing in vivo has 5′-phosphomonoester and 3′-hydroxyl termini: implications for the RNase and RNA ligase involved in replication. J Virol 82:10321–10325. https://doi.org/10.1128/JVI.01229-08
8. Nohales MÁ, Flores R, Daròs JA (2012) Viroid RNA redirects host DNA ligase 1 to act as an RNA ligase. Proc Natl Acad Sci U S A 109:13805–13810. https://doi.org/10.1073/pnas.1206187109
9. Nohales M-A, Molina-Serrano D, Flores R et al (2012) Involvement of the Chloroplastic isoform of tRNA ligase in the replication of Viroids belonging to the family Avsunviroidae. J Virol 86:8269–8276. https://doi.org/10.1128/jvi.00629-12
10. Daròs JA (2016) Eggplant latent viroid: a friendly experimental system in the family Avsunviroidae. Mol Plant Pathol 17:1170–1177
11. Navarro B, Flores R (1997) Chrysanthemum chlorotic mottle viroid: unusual structural properties of a subgroup of self-cleaving viroids with hammerhead ribozymes. Proc Natl Acad Sci U S A 94:11262–11267
12. Franklin RM (1966) Purification and properties of replicative intermediate of RNA bacteriophage R17. Proc Natl Acad Sci U S A 55:1504
13. Daròs JA (2021) Two-dimensional polyacrylamide-gel electrophoresis analysis of viroid RNAs. In: Vidalakis G et al (eds) Viroids: methods and protocols. Methods in molecular biology, vol 2316. Springer US, New York
14. Bellamy AR, Ralph RK (1968) Recovery and purification of nucleic acids by means of cetyltrimethylammonium bromide. Methods Enzymol 12:156–160. https://doi.org/10.1016/0076-6879(67)12125-3

Chapter 7
Circular RNA Databases
Peijing Zhang and Ming Chen

Abstract
Circular RNAs (circRNAs) are a class of endogenous noncoding RNAs (ncRNAs) with covalently closed-loop structures, lacking 5′ caps and 3′ tails. These novel ncRNAs are ubiquitously expressed in eukaryotes and exhibit expression patterns specific to cell types, tissues, or developmental stages. CircRNAs have been reported to play important roles in various biological processes, such as regulating gene expression at the transcriptional or post-transcriptional level, modulating alternative splicing, and interacting with miRNAs or proteins. With the increasing amount of circRNA data, several databases have been established to organize and manage this information, such as circBase, CIRCpedia, CircAtlas, circRNADb, PlantCircNet, and CircFunBase. These diverse databases will help to characterize circRNAs and to further investigate circRNA functions. In this chapter, we give a brief overview of the existing circRNA databases and focus on plant circRNA databases, introducing their key features.

Key words circRNA, Database, Resource, Plant, miRNA, Function

1 Introduction
Circular transcripts have been observed for decades, and with the development of high-throughput sequencing technology and bioinformatic methods, circular RNAs (circRNAs) have been discovered and identified in diverse species. Experiments show that circRNAs are ubiquitous and abundant in all eukaryotes [1–3]. CircRNAs have emerged as an important part of the endogenous noncoding RNA (ncRNA) family. It has been shown that most circRNAs are generated from the exons of protein-coding genes and are normally produced in the nucleus and then transported to the cytoplasm [4]. Although generated from the same precursors, circRNAs form closed-loop structures through backsplicing, while linear RNA transcripts originate from canonical forward splicing [5]. Backsplicing covalently joins a downstream splice donor to an upstream splice acceptor, forming the distinct circular structure.

Luis María Vaschetto (ed.), Plant Circular RNAs: Methods and Protocols, Methods in Molecular Biology, vol. 2362, https://doi.org/10.1007/978-1-0716-1645-1_7, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021


Therefore, circRNAs are resistant to degradation by ribonuclease R (RNase R) and are more stable than linear RNA transcripts. Generally, the expression level of circRNAs is lower than that of their linear cognates, but they accumulate in specific cell types or tissues in a temporally regulated manner [6, 7]. Previous evidence has indicated that circRNAs can act as microRNA (miRNA) sponges or RNA-binding protein (RBP) sponges, as key regulators of transcription and splicing, or as potential templates for translation [8]. It has been reported that circRNAs are associated with a variety of diseases, including various types of cancer [9]. Moreover, the stability and specific expression patterns of circRNAs make them easy to detect in biological samples such as saliva and blood. Consequently, circRNAs are becoming promising biomarkers for diagnosis, therapy, and prognosis.

Online resources are primary tools for researchers to collect data and search for information. The rapid discovery of new circRNAs and their critical functions has driven the establishment of circRNA databases. More than 20 databases have now been developed, and most of them contain human circRNAs, such as circBase [10], circRNADb [11], CircNet [12], circAtlas [13, 14], and CircR2Disease [15]. These databases provide many basic functions: circRNA annotations, such as genomic location, circRNA names, parental gene names, and junction-spanning reads; circRNA queries by genomic position and parental gene name; and data download services. In addition, each database has its own unique aspects and advantages. Experimentally validated, tissue-specific, or disease-associated circRNAs can be found in the corresponding database, as well as interactions between circRNAs and miRNAs/RBPs, circRNAs with coding potential, and circRNAs conserved across species [5]. For example, circBase covers circRNA information for six species, including human, mouse, C. elegans, and Latimeria; circRNADb provides annotations of human exonic circRNAs, including their protein-coding ability; Circ2Traits [16] is a knowledgebase for disease-associated circRNAs in human; and CircNet provides visualized circRNA–miRNA–gene regulatory networks.

However, circRNA research has paid less attention to plants than to animals, and there are fewer circRNA databases for plants than for animals. Previous research has suggested that the mechanism of plant circRNA biogenesis is slightly different from that of animal circRNAs, and that plant circRNAs show stress-induced expression patterns in response to biotic and abiotic stresses. Therefore, more circRNA database resources for plants are needed, which would greatly facilitate research into plant circRNAs.

In this chapter, we briefly review the current circRNA databases (Table 1). We focus on plant circRNA databases and summarize their content and features. Then we discuss current issues

regarding circRNA databases, which may limit further exploration in this field. Finally, we offer some suggestions for dealing with these limitations.

Table 1 CircRNA databases

Name             Species                 Published year     Last updated   References
Circ2Traits      Human                   2013.11            Unavailable    [16]
starBase v2.0    23 species              2013.11            2019.10        [21]
circBase         6 species               2014.09            2017.07        [10]
CIRCpedia        6 species               2014.09, 2018.08   /              [17, 18]
CircNet          Human                   2015.10            Unavailable    [12]
deepbase v2.0    19 species              2015.11            /              [24]
CircInteractome  Human                   2016.02            2020.01        [22]
TSCD             Human and mouse         2016.08            /              [19]
AtCircDB         Arabidopsis             2016.09, 2017.07   /              [29]
circRNADb        Human                   2016.10            /              [11]
PlantcircBase    19 plant species        2017.03, 2018.11   2020.04        [26, 27]
CSCD             Human                   2017.09            /              [20]
PlantCircNet     8 plant species         2017.11            /              [28]
CircR2Disease    Human, mouse, and rat   2018.05            /              [15]
ASmiR            11 plant species        2019.01            /              [34]
MiOncoCirc       Human                   2019.02            /              [9]
CircFunBase      15 species (7 plant)    2019.02            /              [33]
circAtlas        6 species               2019.03, 2020.04   /              [13, 14]
CropCircDB       Rice and maize          2019.05            /              [31]
PncStress        114 plant species       2020.03            /              [35]
VirusCircBase    26 viral species        2020.04            2020.08        [23]
GreenCircRNA     69 plant species        2020.06            /              [32]
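The basic annotation fields and query functions mentioned in this Introduction (genomic location, circRNA name, parental gene, junction-spanning reads; queries by genomic position and by parental gene) can be sketched in Python as follows. This is only an illustration of the idea: the record layout, the function names, and the three example entries are hypothetical and do not reproduce the schema or content of any database listed in Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CircRNARecord:
    name: str
    chrom: str
    start: int           # 1-based, inclusive
    end: int             # 1-based, inclusive
    strand: str
    parental_gene: str
    junction_reads: int  # reads spanning the back-splice junction


def query_by_position(records, chrom, start, end):
    """Records overlapping the closed interval [start, end] on chrom."""
    return [r for r in records
            if r.chrom == chrom and r.start <= end and r.end >= start]


def query_by_gene(records, gene):
    """Records whose parental gene matches `gene`."""
    return [r for r in records if r.parental_gene == gene]


# Hypothetical entries, invented for this example.
records = [
    CircRNARecord("circ_demo_1", "chr1", 1200, 1850, "+", "GENE_A", 14),
    CircRNARecord("circ_demo_2", "chr1", 1700, 2400, "+", "GENE_A", 3),
    CircRNARecord("circ_demo_3", "chr2", 500, 900, "-", "GENE_B", 27),
]

by_position = [r.name for r in query_by_position(records, "chr1", 1800, 1900)]
by_gene = [r.name for r in query_by_gene(records, "GENE_B")]
print(by_position, by_gene)
```

A linear scan is sufficient for a sketch of this size; a resource holding hundreds of thousands of circRNAs would more plausibly rely on an indexed database or an interval index.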

2 CircRNA Databases
CircRNA databases can be divided into three categories, each representing one technical direction. The first category comprises comprehensive circRNA resources that provide downloadable circRNA annotations for different species, including human, mouse, plant, Drosophila, zebrafish, worm, and even


virus. circBase, circRNADb, CIRCpedia [17, 18], and circAtlas all fall into this category. The other two categories focus on the features and functions of human circRNAs. The second category of circRNA databases pays more attention to tissue-specific and disease-associated circRNAs and includes Circ2Traits, CircR2Disease, TSCD [19], CSCD [20], and MiOncoCirc [9]. The third category is more concerned with interactions between circRNAs and other molecules, and with potential circRNA-associated networks. Databases such as CircNet, Circ2Traits, starBase v2.0 [21], and CircInteractome [22] provide miRNA–circRNA interaction predictions. Furthermore, some databases predict the protein-coding ability of circRNAs.

As one of the earliest circRNA databases, circBase (http://www.circbase.org/) merges and unifies datasets from six species. It provides access to and download of circRNA information and expression profiles [10]. A genome browser, sequence-based search, and diverse export options are also provided as basic functions. For further investigation, circBase provides scripts to annotate known and novel circRNAs from input RNA sequences in FASTA format.

circAtlas (http://circatlas.biols.ac.cn/) is a comprehensive resource for vertebrate circRNAs, with millions of highly reliable circRNAs from 1,070 transcriptomes of human, mouse, macaque, rat, chicken, and pig [13, 14]. This large-scale dataset contains 1,007,087 circRNAs collected from 19 normal tissues, and over 80% of them have been assembled into full-length circRNAs. circAtlas includes several notable types of information: expression profiles across tissues; conservation across species; functional annotation generated by integrating coexpression profiles with interaction data; prediction of potential open reading frames; prediction of internal ribosome entry sites (IRES); and prediction of RBP binding sites. Moreover, the database can convert circRNA names from different circRNA databases or assemblies to circAtlas IDs, which helps to integrate circRNA information.

VirusCircBase (http://www.computationalbiology.cn/ViruscircBase/home.html) is the first virus circRNA database, with 46,440 viral circRNAs from 26 viruses [23]. These circRNAs are generated not only from double-stranded DNA viruses but also from single-stranded RNA viruses and retro-transcribing viruses, including Human immunodeficiency virus 1, Influenza A virus, Zika virus, and Zaire ebolavirus.

TSCD (http://gb.whu.edu.cn/TSCD) is a resource for tissue-specific circRNAs of human and mouse [19]. In total, 302,853 tissue-specific circRNAs have been identified, some of which have been confirmed by reverse transcription-polymerase chain reaction (RT-PCR). CircRNA information such as tissue annotation, conservation across species, and visualized exon structures is


provided in the database. The database allows circRNAs to be compared among different tissues, which helps in exploring circRNA functions in organ development and disorders.

CircR2Disease (http://bioinfo.snnu.edu.cn/CircR2Disease/) provides manually curated, experimentally supported associations between circRNAs and diseases [15]. Currently, there are 739 associations between 661 circRNAs and 100 diseases in this database. The circRNA annotation, expression pattern, experimental technique, circRNA–disease relationship, and PubMed ID can be browsed, searched, and downloaded.

CSCD (http://gb.whu.edu.cn/CSCD) is a resource for cancer-specific circRNAs. By comparison with circRNAs from normal cell lines, 272,152 cancer-specific circRNAs have been recognized from 87 cancer cell line samples across 19 cancer types [20]. In addition to a schematic diagram of circRNA formation from a gene, the database provides potential miRNA response element sites, RBP sites, and open reading frames (ORFs) of circRNAs, which greatly helps in exploring the functional regulation and translational potential of circRNAs.

MiOncoCirc (https://mioncocirc.github.io/) is the first resource for cancer-associated circRNAs detected directly in clinical tumor tissues, including primary tumors, metastases, and rare cancer types [9]. This database offers several types of data for download: (1) circRNA information; (2) expression files for the parental genes of circRNAs, sample annotation, and sequencing statistics; (3) raw data; and (4) unfiltered data. It is a valuable resource for exploring circRNA function in disease, especially for the discovery of diagnostic or therapeutic cancer biomarkers.

CircNet (http://circnet.mbc.nctu.edu.tw/) is the first database to explore tissue-specific circRNA expression patterns and to visualize regulatory networks of circRNAs, miRNAs, and genes [12]. The database provides genomic annotations, expression profiles, and sequences of circRNA isoforms, which helps to illustrate the regulatory function of circRNAs. Unfortunately, CircNet has been unavailable for a long time.

CircInteractome (http://circinteractome.nia.nih.gov) offers circRNAs and their relationships with binding factors, mainly miRNAs and RBPs [22]. This database allows users to query circRNA information and binding partners. In addition, users can identify potential RBP sponges and circRNA IRES. For further investigation, the database provides services for designing junction-spanning primers for circRNA detection and siRNAs for circRNA silencing, which are needed for experimental validation of circRNA functions.

starBase v2.0 (http://starbase.sysu.edu.cn/) and deepbase v2.0 (http://biocenter.sysu.edu.cn/deepBase/) are comprehensive databases for diverse ncRNAs and for regulatory interaction networks among diverse classes of RNAs, especially for circRNAs,


miRNA–circRNA interactions, and circRNA-associated regulatory networks [21, 24]. deepbase v2.0 annotates diverse types of ncRNAs across 19 species, including miRNA, snRNA, lncRNA, and circRNA. It also constructs expression profiles and evolutionary patterns of these ncRNAs. starBase v2.0 provides several types of data: (1) regulatory miRNA–mRNA, miRNA–ncRNA, miRNA–pseudogene, and protein–RNA interactions; (2) ceRNA regulatory networks; (3) functional annotation of miRNA-mediated networks; and (4) CLIP-supported miRNA target sites for interacting circRNAs. These two databases were developed by the same group, and they are not available outside service hours. deepbase v3.0 (http://rna.sysu.edu.cn/deepbase3/index.html) is now online, integrating more than 67,000 data sets from normal and cancer tissues [25], which will facilitate analysis and exploration of the functions and mechanisms of various types of ncRNAs.
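Several of the resources described in this section work with the back-splice junction, for example when counting junction-spanning reads or when designing junction-spanning primers and siRNAs. As a minimal Python sketch of the underlying idea, the sequence across the junction can be obtained by joining the 3′ end of the circularized sequence to its 5′ end. The function names and the 30-nt example sequence are hypothetical, and the sketch does not reproduce the design procedure of any particular database.

```python
def backsplice_junction(circ_seq, flank=10):
    """Sequence spanning the back-splice junction of a circRNA.

    circ_seq is the circularized sequence written 5'->3' as a linear string,
    starting at the upstream splice acceptor and ending at the downstream
    splice donor. The junction joins its last base to its first base.
    """
    if flank <= 0 or len(circ_seq) < 2 * flank:
        raise ValueError("flank must be positive and at most half the sequence length")
    return circ_seq[-flank:] + circ_seq[:flank]


def is_junction_specific(window, circ_seq):
    """True if `window` is found in the circle but not in the same sequence read linearly."""
    return window in circ_seq + circ_seq and window not in circ_seq


# Hypothetical 30-nt circularized sequence, invented for this example.
circ_seq = "AUGGCUAGCUUACGGAUCCGAUUGCAAGCU"

junction = backsplice_junction(circ_seq, flank=6)
internal = circ_seq[5:17]

print(junction)
print(is_junction_specific(junction, circ_seq), is_junction_specific(internal, circ_seq))
```

The check against the linear string is only a necessary condition: a real design would also have to compare the junction window with the full linear transcripts of the parental gene and with the rest of the transcriptome.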

3 Plant circRNA Databases
Compared with animal circRNA databases, plant circRNA databases are fewer in number and offer somewhat fewer unique features. Currently, there are eight databases containing plant circRNAs, including several functional databases for Arabidopsis and for multiple species.

PlantcircBase (http://ibi.zju.edu.cn/plantcircbase/) is a comprehensive resource for plant circRNAs and is regularly updated with newly released records [26, 27]. PlantcircBase release 5 is now online and covers 121,971 circRNAs from 19 plant species, most of which are model plants and crops. CircRNA information can be retrieved, visualized, downloaded, and searched in the database, including annotations of alternative splicing, exon boundaries, splicing signals, and exons covered. Genomic location, sequence, and gene interpretation can be used to search for the corresponding circRNAs. In the database, circRNAs can be visualized in a circular format with genomic annotations, sequence components, and colored backsplicing sites. Furthermore, PlantcircBase analyzes the detection methods and characteristics of circRNAs for each species, including (1) the numbers of circRNAs identified by one or more bioinformatic tools; (2) the different types and length distributions of circRNAs; and (3) the distribution of circRNA splicing signals. These statistics give a quick and comprehensive insight into plant circRNAs. Besides the annotations and sequences of circRNAs, supporting evidence for their expression is available in the database.

PlantCircNet (http://bis.zju.edu.cn/plantcircnet/) is the first integrated circRNA database for plants [28]. CircRNA information for eight plant species can be retrieved, searched, and downloaded, including annotations, full-length sequences, and isoforms. There are four search options in the database: (1) genomic locus and (2) parental gene names can be used to locate the related


circRNAs; (3) sequences can be used to find potentially conserved circRNAs; (4) Gene Ontology (GO) terms are available for searching functional circRNAs. Considering the regulatory roles of circRNAs, PlantCircNet predicts circRNAs acting as miRNA sponges and provides visualized circRNA–miRNA–mRNA networks. CircRNA annotations and isoforms generated from alternative backsplicing are available, and the interaction relationships in circRNA-associated networks can be downloaded. Furthermore, enrichment analysis can be performed to find significantly overrepresented functional circRNAs in the network.

AtCircDB (http://deepbiology.cn/circRNA/) is the first dedicated database for tissue-specific circRNAs in Arabidopsis thaliana [29]. Currently, version 2 of the database hosts 84,685 circRNAs from 10 single tissues and mixed tissues, of which 30,648 are tissue-specific, and stores 3,486 miRNA–circRNA interactions [30]. CircRNA annotations and their interactions can be retrieved, searched, and downloaded in the database. In addition to basic circRNA information such as genomic locus, strand, parental gene, and tissue, AtCircDB provides start and end annotation, antisense information, and experimental evidence. Two novel metrics are also proposed: the detection score shows the probability that a circRNA can be detected, and the regions hosting enriched circRNAs are defined as super circRNA regions, which are highly related to alternative splicing and the chloroplast.

CropCircDB (http://deepbiology.cn/crop/) is another plant circRNA database developed by the AtCircDB team, focusing on crops and their responses to abiotic stress [31]. In total, 63,048 and 38,785 circRNAs are annotated in the database for rice and maize, respectively, including circRNAs identified from 148 stress-related rice samples (cold, drought, and salt) and 111 stress-related maize samples (salt and drought).
The database provides detailed information on circRNAs, visualized circRNA structures with exons, potential miRNA interactions, and predicted proteins. Moreover, CropCircDB provides dynamic expression profiles of circRNAs in different conditions. Besides the detection score and super circRNA region, another score named the stress detection score is introduced in the database, which evaluates the possibility of detecting a circRNA in stress-related samples.

GreenCircRNA (http://greencirc.cn/) contains 213,494 circRNAs from 69 plant species, identified using more than 4,000 transcriptome datasets; 38 of these species have miRNA-related information in the miRNA databases [32]. Currently, this database covers abundant circRNA information for different species and is valuable for exploring the characteristics of plant circRNAs.

CircFunBase (http://bis.zju.edu.cn/CircFunBase/) is a comprehensive resource for functionally annotated circRNAs. It documents 7,059 circRNA entries from 15 organisms, including 7 plant species [33]. In addition to basic information, each entry contains


circRNA expression patterns, a function description, the reference PubMed ID, and circRNA-associated miRNAs. The functional information contains manually collected, experimentally validated annotations and computationally predicted functional annotations. CircFunBase provides an API for researchers that returns detailed circRNA-related information in JSON format, such as circRNA information or circRNA–miRNA interactions. It also invites users to upload novel functional circRNAs through the “Submit” page.

ASmiR (http://forestry.fafu.edu.cn/bioinfor/db/ASmiR/) is a database and web server for storing and identifying miRNA target sites in alternative splicing regions, including mRNAs from linear alternative splicing and circRNAs from alternative backsplicing [34]. The database currently collects sequences of eleven species generated by PacBio sequencing and Illumina sequencing. Among them, 114,574 alternative splicing events have been identified from circRNAs in five species, and 38,913 circRNAs overlap with their own parent isoforms.

PncStress (http://bis.zju.edu.cn/pncstress/) is a manually curated database of experimentally validated stress-responsive ncRNAs in plants, including miRNAs, lncRNAs, and circRNAs [35]. The current version contains 4,227 entries from 114 plants, most of which are miRNAs, covering 48 biotic and 91 abiotic stresses. Fifty-two circRNAs are stored in the database and show differential expression patterns under different stresses such as cold, dehydration, nitrogen deficiency, drought, heat, maize Iranian mosaic virus, and Verticillium wilt.
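The enrichment analysis offered by PlantCircNet is described above only at a high level. One common way to frame over-representation is a one-sided hypergeometric test; the sketch below illustrates that idea under this assumption (the source does not state which test PlantCircNet uses). The function name and all counts in the usage lines are hypothetical and serve only as illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: circRNAs in the background set
    annotated:  background circRNAs carrying the functional term
    drawn:      circRNAs in the network of interest
    hits:       network circRNAs carrying the term
    """
    if not (0 <= annotated <= population and 0 <= drawn <= population):
        raise ValueError("counts must satisfy 0 <= annotated, drawn <= population")
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(max(hits, 0), upper + 1)
    )
    return tail / total


# Hypothetical counts: 1,000 background circRNAs, 50 annotated with a term,
# 40 in the network, 8 of which carry the term.
p_value = hypergeom_enrichment_p(1000, 50, 40, 8)
print(f"p = {p_value:.3g}")
```

In practice such p-values would also need correction for multiple testing across all terms examined.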

4 Discussion

With the development of sequencing technology and the progress of circRNA research, the number of known circRNAs is very likely to keep increasing. CircRNA research is currently one of the key areas in biological research and is still at an early stage. The generation, degradation, and biological function of circRNAs remain of interest to researchers. Large numbers of circRNAs have been identified and stored in various circRNA databases. However, several issues with the current databases hinder further research. Varying sequencing methods and detection pipelines result in inconsistent standards for circRNA annotations. In addition, it is difficult to compare the names and splicing sites of circRNAs across different databases. The miRNA and RBP binding sites, protein-coding ability, and circRNA-associated regulatory networks are annotated in databases, but these annotations are usually based on presumed circRNA sequences. Furthermore, several databases have not been updated since


published, and some of the databases contain broken links. These problems hamper progress in the circRNA field; as Vromman et al. suggested [36], a solid circRNA nomenclature is needed.
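The difficulty of comparing circRNA names and splicing sites across databases can be illustrated with a small sketch. It assumes that each database record can be reduced to chromosome, start, end, and strand, and that the only differences are a "chr" prefix and 0-based versus 1-based start coordinates; real databases may differ in further ways (genome assembly, strand conventions). The record names and coordinates are hypothetical.

```python
from typing import NamedTuple


class BackspliceSite(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    strand: str


def normalize(chrom, start, end, strand, zero_based_start=False):
    """Return a database-independent key for a back-splice junction."""
    chrom = chrom.strip()
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    start, end = int(start), int(end)
    if zero_based_start:
        start += 1  # convert a 0-based start to 1-based
    if start > end:
        start, end = end, start
    return BackspliceSite(chrom, start, end, strand)


def index_by_site(records):
    """Map each normalized site to the database-specific names that use it."""
    index = {}
    for name, (chrom, start, end, strand, zero_based) in records.items():
        key = normalize(chrom, start, end, strand, zero_based)
        index.setdefault(key, []).append(name)
    return index


def cross_reference(records_a, records_b):
    """Pair up names from two databases that describe the same junction."""
    index_a, index_b = index_by_site(records_a), index_by_site(records_b)
    return {site: (index_a[site], index_b[site])
            for site in index_a.keys() & index_b.keys()}


# Hypothetical records: name -> (chrom, start, end, strand, zero_based_start)
db_a = {"hypo_circ_001": ("Chr1", 1000, 2500, "+", False)}
db_b = {"circ-X": ("1", 999, 2500, "+", True),
        "circ-Y": ("1", 3000, 4000, "-", True)}
matches = cross_reference(db_a, db_b)
```

A coordinate-based key of this kind sidesteps naming differences but does not replace an agreed nomenclature, since coordinates change between genome assemblies.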

Acknowledgments

We would like to thank Sida Li (Zhejiang University) for his assistance with linguistic issues.

References

1. Danan M, Schwartz S, Edelheit S et al (2012) Transcriptome-wide discovery of circular RNAs in archaea. Nucleic Acids Res 40(7):3131–3142. https://doi.org/10.1093/nar/gkr1009
2. Jeck WR, Sorrentino JA, Wang K et al (2013) Circular RNAs are abundant, conserved, and associated with ALU repeats. RNA 19(2):141–157. https://doi.org/10.1261/rna.035667.112
3. Wang PL, Bao Y, Yee MC et al (2014) Circular RNA is expressed across the eukaryotic tree of life. PLoS One 9(3):e90859. https://doi.org/10.1371/journal.pone.0090859
4. Salzman J, Gawad C, Wang PL et al (2012) Circular RNAs are the predominant transcript isoform from hundreds of human genes in diverse cell types. PLoS One 7(2):e30733. https://doi.org/10.1371/journal.pone.0030733
5. Kristensen LS, Andersen MS, Stagsted LVW et al (2019) The biogenesis, biology and characterization of circular RNAs. Nat Rev Genet 20(11):675–691. https://doi.org/10.1038/s41576-019-0158-7
6. Rybak-Wolf A, Stottmeister C, Glazar P et al (2015) Circular RNAs in the mammalian brain are highly abundant, conserved, and dynamically expressed. Mol Cell 58(5):870–885. https://doi.org/10.1016/j.molcel.2015.03.027
7. Veno MT, Hansen TB, Veno ST et al (2015) Spatio-temporal regulation of circular RNA expression during porcine embryonic brain development. Genome Biol 16:245. https://doi.org/10.1186/S13059-015-0801-3
8. Li X, Yang L, Chen LL (2018) The biogenesis, functions, and challenges of circular RNAs. Mol Cell 71(3):428–442. https://doi.org/10.1016/j.molcel.2018.06.034
9. Vo JN, Cieslik M, Zhang Y et al (2019) The landscape of circular RNA in cancer. Cell 176(4):869–881. https://doi.org/10.1016/j.cell.2018.12.021
10. Glazar P, Papavasileiou P, Rajewsky N (2014) circBase: a database for circular RNAs. RNA 20(11):1666–1670. https://doi.org/10.1261/rna.043687.113
11. Chen X, Han P, Zhou T et al (2016) circRNADb: a comprehensive database for human circular RNAs with protein-coding annotations. Sci Rep 6:34985. https://doi.org/10.1038/srep34985
12. Liu YC, Li JR, Sun CH et al (2016) CircNet: a database of circular RNAs derived from transcriptome sequencing data. Nucleic Acids Res 44(D1):D209–D215. https://doi.org/10.1093/nar/gkv940
13. Wu W, Ji P, Zhao F (2020) CircAtlas: an integrated resource of one million highly accurate circular RNAs from 1070 vertebrate transcriptomes. Genome Biol 21(1):101. https://doi.org/10.1186/s13059-020-02018-y
14. Ji P, Wu W, Chen S et al (2019) Expanded expression landscape and prioritization of circular RNAs in mammals. Cell Rep 26(12):3444–3460. https://doi.org/10.1016/j.celrep.2019.02.078
15. Fan C, Lei X, Fang Z et al (2018) CircR2Disease: a manually curated database for experimentally supported circular RNAs associated with various diseases. Database (Oxford) 2018:bay044. https://doi.org/10.1093/database/bay044
16. Ghosal S, Das S, Sen R et al (2013) Circ2Traits: a comprehensive database for circular RNA potentially associated with disease and traits. Front Genet 4:283. https://doi.org/10.3389/fgene.2013.00283
17. Dong R, Ma XK, Li GW et al (2018) CIRCpedia v2: an updated database for comprehensive circular RNA annotation and expression comparison. Genomics Proteomics Bioinformatics 16(4):226–233. https://doi.org/10.1016/j.gpb.2018.08.001
18. Zhang XO, Wang HB, Zhang Y et al (2014) Complementary sequence-mediated exon circularization. Cell 159(1):134–147. https://doi.org/10.1016/j.cell.2014.09.001
19. Xia SY, Feng J, Lei LJ et al (2017) Comprehensive characterization of tissue-specific circular RNAs in the human and mouse genomes. Brief Bioinform 18(6):984–992. https://doi.org/10.1093/bib/bbw081
20. Xia S, Feng J, Chen K et al (2018) CSCD: a database for cancer-specific circular RNAs. Nucleic Acids Res 46(D1):D925–D929. https://doi.org/10.1093/nar/gkx863
21. Li JH, Liu S, Zhou H et al (2014) starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res 42(Database issue):D92–D97. https://doi.org/10.1093/nar/gkt1248
22. Dudekula DB, Panda AC, Grammatikakis I et al (2016) CircInteractome: a web tool for exploring circular RNAs and their interacting proteins and microRNAs. RNA Biol 13(1):34–42. https://doi.org/10.1080/15476286.2015.1128065
23. Cai Z, Fan Y, Zhang Z et al (2020) VirusCircBase: a database of virus circular RNAs. Brief Bioinform 22(2):2182–2190. https://doi.org/10.1093/bib/bbaa052
24. Zheng LL, Li JH, Wu J et al (2016) deepBase v2.0: identification, expression, evolution and function of small RNAs, LncRNAs and circular RNAs from deep-sequencing data. Nucleic Acids Res 44(D1):D196–D202. https://doi.org/10.1093/nar/gkv1273
25. Xie F, Liu S, Wang J et al (2021) deepBase v3.0: expression atlas and interactive analysis of ncRNAs from thousands of deep-sequencing data. Nucleic Acids Res 49(D1):D877–D883. https://doi.org/10.1093/nar/gkaa1039
26. Chu Q, Bai P, Zhu X et al (2018) Characteristics of plant circular RNAs. Brief Bioinform 21(1):135–143. https://doi.org/10.1093/bib/bby111
27. Chu Q, Zhang X, Zhu X et al (2017) PlantcircBase: a database for plant circular RNAs. Mol Plant 10(8):1126–1128. https://doi.org/10.1016/j.molp.2017.03.003
28. Zhang P, Meng X, Chen H et al (2017) PlantCircNet: a database for plant circRNA-miRNA-mRNA regulatory networks. Database (Oxford) 2017:bax089. https://doi.org/10.1093/database/bax089
29. Sun X, Wang L, Ding J et al (2016) Integrative analysis of Arabidopsis thaliana transcriptomics reveals intuitive splicing mechanism for circular RNA. FEBS Lett 590(20):3510–3516. https://doi.org/10.1002/1873-3468.12440
30. Ye J, Wang L, Li S et al (2019) AtCircDB: a tissue-specific database for Arabidopsis circular RNAs. Brief Bioinform 20(1):58–65. https://doi.org/10.1093/bib/bbx089
31. Wang K, Wang C, Guo B et al (2019) CropCircDB: a comprehensive circular RNA resource for crops in response to abiotic stress. Database (Oxford) 2019:baz053. https://doi.org/10.1093/database/baz053
32. Zhang J, Hao Z, Yin S et al (2020) GreenCircRNA: a database for plant circRNAs that act as miRNA decoys. Database (Oxford) 2020:baaa039. https://doi.org/10.1093/database/baaa039
33. Meng X, Hu D, Zhang P et al (2019) CircFunBase: a database for functional circular RNAs. Database (Oxford) 2019:baz003. https://doi.org/10.1093/database/baz003
34. Wang HY, Wang HH, Zhang HX et al (2019) The interplay between microRNA and alternative splicing of linear and circular RNAs in eleven plant species. Bioinformatics 35(17):3119–3126. https://doi.org/10.1093/bioinformatics/btz038
35. Wu W, Wu Y, Hu D et al (2020) PncStress: a manually curated database of experimentally validated stress-responsive non-coding RNAs in plants. Database (Oxford) 2020. https://doi.org/10.1093/database/baaa001
36. Vromman M, Vandesompele J, Volders PJ (2020) Closing the circle: current state and perspectives of circular RNA databases. Brief Bioinform 22:288–297. https://doi.org/10.1093/bib/bbz175

Chapter 8

NGS Methodologies and Computational Algorithms for the Prediction and Analysis of Plant Circular RNAs

Laura Carmen Terrón-Camero and Eduardo Andrés-León

Abstract

Circular RNAs (circRNAs) are a class of single-stranded RNAs derived from exonic, intronic, and intergenic regions of precursor messenger RNAs (pre-mRNAs), where a noncanonical back-splicing event occurs in which the 5′ and 3′ ends are joined by a covalent bond. CircRNAs participate in the regulation of gene expression at the transcriptional and posttranscriptional level, primarily as miRNA and RNA-binding protein (RBP) sponges, but they are also involved in the regulation of alternative RNA splicing and transcription. CircRNAs are widespread and abundant in plants, where they have been implicated in stress responses and development. Through the analysis of all publications in this field in the last five years, we can summarize that these molecules are identified through next-generation sequencing studies, in which samples are first treated to eliminate DNA, rRNA, and linear RNAs as a means to enrich circRNAs. Once libraries are prepared, they are sequenced and subsequently studied from a bioinformatics point of view. Among the different tools for identifying circRNAs, we can highlight CIRI as the most used (in 60% of the published studies), followed by CIRCexplorer (20%) and find_circ (20%). Although it is recommended to use more than one program in combination, preferably programs developed specifically for plant samples, this is not always the case. It should also be noted that, after identifying these circular RNAs, most authors validate their findings in the laboratory in order to obtain bona fide results.

Key words Plant circRNAs, Plant circular RNAs, ceRNAs

1 Introduction

Circular RNAs (circRNAs) are a class of single-stranded RNAs, closed as a loop due to the covalent binding of the 5′ and 3′ ends. They are members of the endogenous noncoding RNA (ncRNA) family and have been identified in all eukaryotes, protists, and even prokaryotic archaea [1–3]. CircRNAs derive from exonic, intronic, and intergenic regions of precursor messenger RNAs (pre-mRNAs), where a noncanonical back-splicing event, in which the 5′ and 3′ ends are joined by a covalent bond, produces a head-to-tail splicing junction [4–8]. Because of their circular structure,

Luis María Vaschetto (ed.), Plant Circular RNAs: Methods and Protocols, Methods in Molecular Biology, vol. 2362, https://doi.org/10.1007/978-1-0716-1645-1_8, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021


circRNAs are resistant to degradation by ribonuclease R (RNase R) [9], which has revealed that a huge number of circRNAs are highly and stably expressed in cells and tissues [10] and are regulated over time and developmental stage [5, 11, 12]. With the advent of high-throughput sequencing technologies, in particular the RiboMinus RNA-Seq technique combined with linear-RNA-depleted samples, and of new bioinformatics tools, successive reports discovered that the expression of circRNAs is strictly regulated, tissue- and cell-type-specific, and present in diverse species, being conserved across evolution [5, 10, 13–16]. These techniques exploit the biochemical properties of circRNAs to improve library preparation by enriching for circular RNA species. Following this strategy, samples are treated with RNase R before library preparation. An example of the relevant results obtained using this methodology is cANRIL, the circular RNA from the ANRIL gene [4]. A common feature among animals and plants is the biogenesis of these circular RNAs, since they derive from RNA pol II-mediated transcription and a backsplicing event of pre-mRNAs [17]. Furthermore, in both cases, circRNAs are produced in the nucleus and generally transported to the cytoplasm [10], although several authors confirm that circRNAs generated from intronic regions are preferentially located inside the cell nucleus [18]. However, numerous other studies have shown that the mechanisms of circularization in plants differ from those in animals. For example, plant circRNAs from coding regions contain fewer repetitive and reverse complementary sequences in the flanking introns than animal circRNAs [1, 19–21]. In that sense, the proportion of reverse complementary sequences was only 6.2, 2.7, and 0.3% in intronic sequences next to exonic circRNAs in Oryza sativa, Glycine max, and Arabidopsis thaliana, respectively [1, 20, 22, 23].
This fact suggests that intron-pairing-driven circularization may not be the main mechanism of plant circRNA biogenesis [13]. Furthermore, although most eukaryotes contain the GT (5′) and AG (3′) terminal dinucleotides required by the spliceosome to perform splicing, noncanonical splicing signals such as GC/CG, CT/GC, and GC/GT have been found in organisms such as O. sativa and cucumber and in the chloroplast of Arabidopsis thaliana [21, 24, 25]. Other studies have reported that back-splice signals within the circRNA sequence are flexible and that the splicing event is widespread [26–28]. In maize in particular, mobile elements also play an important role in the formation of circRNAs, as LINE1-like elements (LLEs) and their reverse complementary pairs (LLERCPs) are significantly enriched in the flanking regions of circRNAs [29]. Overall, these results suggest that plants have specific mechanisms for the production and regulation of circRNAs in comparison with other organisms [30, 31].
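The distinction between canonical GT/AG and noncanonical signals can be made concrete with a short sketch. It assumes 1-based inclusive circRNA coordinates on a chromosome sequence held in memory and reads the two intronic bases on each side of the circularized region; in a back-splice, the donor lies downstream of the circRNA end and the acceptor upstream of its start (on the transcribed strand). The function names and toy sequences are hypothetical and not taken from any of the tools discussed in this chapter.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def backsplice_signal(chrom_seq, start, end, strand):
    """Return (donor, acceptor) dinucleotides flanking a back-splice junction.

    start and end are 1-based inclusive genomic coordinates of the circRNA.
    """
    if start < 3 or end + 2 > len(chrom_seq) or start > end:
        raise ValueError("coordinates leave no room for flanking dinucleotides")
    upstream = chrom_seq[start - 3:start - 1].upper()  # two bases before start
    downstream = chrom_seq[end:end + 2].upper()        # two bases after end
    if strand == "+":
        return downstream, upstream
    if strand == "-":
        # On the minus strand the roles swap and bases are complemented.
        return reverse_complement(upstream), reverse_complement(downstream)
    raise ValueError("strand must be '+' or '-'")


def is_canonical(donor, acceptor):
    """True for the canonical GT/AG signal."""
    return (donor, acceptor) == ("GT", "AG")


# Toy sequences: an 8-nt "circRNA" at positions 5..12 with two flank variants.
toy_canonical = "TTAG" + "ACGTACGT" + "GTCC"
toy_noncanonical = "TTCG" + "ACGTACGT" + "GCCC"
print(backsplice_signal(toy_canonical, 5, 12, "+"))
print(backsplice_signal(toy_noncanonical, 5, 12, "+"))
```

Tallying these pairs over all predicted junctions gives the kind of splicing-signal distribution that plant circRNA databases report per species.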


The functions of circRNAs have been widely reported in animals [6, 10, 32, 33], but this is not the case in plants. CircRNAs participate in the regulation of gene expression at the transcriptional and posttranscriptional level. Acting as competing endogenous RNAs (ceRNAs), circRNAs are principal regulators of miRNA actions. In animals, circRNAs are known to function primarily as miRNA [34] and RNA-binding protein (RBP) sponges. Their sequences therefore contain a high number of binding regions for both miRNAs and RBPs, which reduces the amount of these molecules available in the cellular environment and prevents them from exerting their function. Nonetheless, plant circRNAs have fewer miRNA-binding sites than animal circRNAs, so the sponge function is somewhat attenuated. In addition, compared to animals, fewer plant circRNAs possess these binding regions and thus act as sponges [1, 35]. CircRNAs are also involved in the regulation of alternative RNA splicing and transcription [36]. Moreover, it has been posited that exonic circularization competes with the canonical splicing machinery, as both act on the same splice sites. As a result, some circRNAs are expressed more abundantly than their linear counterparts, which supports the idea that circularization can negatively regulate gene expression by decreasing the canonical splicing of linear RNAs [37]. In plants specifically, diverse studies suggest that circRNAs are also able to regulate gene expression under some circumstances. In that sense, they play an important role in a variety of biological processes, including development, plant growth, and the response to biotic and abiotic stresses, where expression depends on the cell type, tissue, and developmental stage [3, 31]. It has been demonstrated that plant circRNAs are able to negatively regulate the expression of their parental genes [22, 29].
Conn and collaborators also described that floral homeotic phenotypes are driven by the overexpression of an exon-skipped circRNA missing exon 6 [38]. There is another essential difference between the behavior and function of animal circRNAs and those of plants. In animals, the translation of circRNAs into protein has been demonstrated. Pamudurti et al. identified a set of circRNAs that were translated in vivo in Drosophila melanogaster in a cap-independent manner [39]. Later, Legnini et al. also determined that Circ-ZNF609 is translated in a splicing-dependent and cap-independent manner [40]. After that, Yang et al. discovered a consensus m6A motif enriched in circRNAs and verified that a single m6A site is able to initiate translation in human cells [41]. To date, circRNAs have been identified in more than 30 plant species, where they are abundant and ubiquitously expressed [1, 3, 35, 42, 43]. Most of the circRNAs identified in plants are stored in specialized databases such as PlantCircNet [44] and PlantcircBase


[45] that collect more than 200,000 circRNAs from experiments conducted by different researchers. To date, none has been verified to be translated into a protein.

In order to summarize the goals and the types of techniques available in the field of plant circRNAs, we performed a manually curated selection of all publications from PubMed using the keywords “circRNA” and “plants.” From those results we selected all circRNA studies performed in the last 5 years. In total, we examined 62 publications with the goal of analyzing the most studied plant species, the tissues where these circRNAs are expressed, the aim of each study, the sample preparation protocol, the bioinformatic analysis, the validation technique, and finally the number of identified circRNAs. In some cases, we also include the possible circRNA function and the metabolic pathway involved. All this information is collected in Tables 1, 2, 3, and 4. Based on these publications, we discuss the advantages and disadvantages of the techniques applied and the bioinformatic analyses conducted. Finally, we briefly summarize the bioinformatics tools currently available for the identification of these RNAs and other useful resources.
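As noted in the abstract, using more than one detection program in combination is recommended. One simple way to combine tools is to keep only the back-splice junctions reported by at least two of them. The sketch below assumes that each tool's output has already been parsed into (chromosome, start, end, strand) tuples with harmonized coordinates; the parsing step and the example calls are hypothetical and do not reflect the actual output formats of CIRI, CIRCexplorer, or find_circ.

```python
from collections import Counter


def consensus_calls(calls_by_tool, min_tools=2):
    """Return junctions supported by at least `min_tools` tools.

    calls_by_tool maps a tool name to an iterable of junction tuples.
    The result maps each retained junction to its number of supporting tools.
    """
    support = Counter()
    for sites in calls_by_tool.values():
        support.update(set(sites))  # count each tool once per junction
    return {site: n for site, n in support.items() if n >= min_tools}


# Hypothetical, already-harmonized calls from three tools.
calls = {
    "CIRI": [("1", 1000, 2500, "+"), ("2", 300, 900, "-")],
    "CIRCexplorer": [("1", 1000, 2500, "+")],
    "find_circ": [("1", 1000, 2500, "+"), ("2", 300, 900, "-"),
                  ("3", 50, 400, "+")],
}
supported = consensus_calls(calls)
```

Raising `min_tools` trades sensitivity for specificity; junctions found by a single tool are not necessarily false, so experimental validation remains the deciding step.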

2 Current Studies in the Field of Plant CircRNAs

In the last 5 years, the study of circRNAs in plants has grown rapidly, and these circular molecules have been identified in a total of 29 different species. Fifteen percent of the studies have been carried out on the model organism Arabidopsis thaliana, and practically all the remaining studies have been conducted on species of agri-food interest such as Solanum lycopersicum, Zea mays, and Oryza sativa. For the remaining species, 1 or 2 articles per species have been published. These studies usually cover one or more plant tissues; the most studied tissue has been the leaf (55%), followed by the fruit, root, flower/ovule, stem, and seed. There is a special interest in the characterization of circRNAs in response to stress (42%), both biotic (15%, Table 3) and abiotic (27%, Table 4). CircRNAs involved in plant development have been characterized in 30% of the studies (Table 2). The rest of the studies mainly focus on the identification of circRNAs in the whole plant or specific tissues in species where circRNAs had not been previously characterized (Table 1). Only 3 of the 62 studies used mutants as a research tool [46–48]. In other cases, the comparison of different cultivars, for example with different levels of stress resistance, has been very useful for identifying the circRNAs possibly involved in the stress response. Regarding research on biotic stresses, circRNAs have been characterized in response to fungi, viruses, and bacteria in different species.

NEBNext

Spectrum Plant Total RNA Kit Depl.

Gossypium spp. O. L.

Conserved circRNA

mRNA-Seq TRIzol, Spect, sample BA2100, Depl, preparation RNase R kit

No kit specified (see in study)

Kapa strand RNA-Seq library synthesis

G. max L. R. S.

mirVana, DNase, BA2100, Spect, Depl. Qubit

No kit specified

MiniBEST Plant RNA RNaseOUT, BA2100, Qubit, Depl,Depl, AGE

Mutants; cbp80, c2h2, and flk

A. thaliana L.

TRIzol, Depl. RNase R+ 5 independent studies

A. thaliana R. S. L. F. S. P.

Canonical and noncanonical circRNA

A. thaliana F. Fl. L. R. P.

HS 2000

HS 2500

HS 2500 &X

G.Analyzer IIx

HS 2000

HS 4000

Seq

GEO, accession number GSM989339— GSM989346 and GSM989350—GSM989352

mRNA-Seq sample preparation kit

Library

A. thaliana F.

RNA isolation

TRIzol BA2100, Spect, Depl

Develop.

A. hypogaea S.

Species

Others

DESeq2

SplicingTypesAnno; Biostrings; GenomicAlignments; BLAST

psRNATarget tool; DESeq package

BWA; CIRI

LQR and A; TopHat2; TopHat-Fusion; CIRI; CIRI-AS

BLAST; GOStats

DEseq; miRanda46; Targetscans; miRBase21.0; Cytoscape GOseq R and Kobas

NGS QC Toolkit; Bedtools; BLAST; BOWTIE2 TopHat2; BLAST2GO; find_circ TargetFinder3; Cytoscape

LQR and A BWA MEM; CIRI2 find_circ; SortMeRNA; HISAT2 StringTie

TopHat; SAMtools; BWA MEM

LQR and A; CirComPara pipeline (used FindCirc, TestRealign and CircExplorer2; three different aligners (Segemehl, STAR, and TopHat); Aligning Bowtie

BED tools and TopHat/ LQR and A; TopHat; Cufflinks; GO; Cufflinks; Cuffmerge; Cufflinks; Cuffdiff; TopHat-Fusion; SPSS Statistics CIRCexplorer; CIRI

Identification

Table 1 Identification and characterization of plant circRNAs published in the last 5 years

27,812

347

CircRNA

5372

5861

30,923

[19]

[20]

[29]

[47]

[17]

[106]

[105]

References


20 divergent primers 1041, 1478, 1311, PCR with RNA and 499 and gDNA

10 D PCR in RNA, Sanger

3 D&C qPCR with RNA and gDNA; Sanger Expression of 10 by RT-qPCR

No validation specified

803 x divergent PCR primers PCR with RNA and gDNA; Sanger

D PCR with RNA; Sanger

15 D&C RT-qPCR, Sanger

Validation


P. edulis P.

Diverse non-GT/AG splicing

O. sativa P.

RNAprep Pure Plant Kit, AGE, Spect

RNeasy Mini Kit, DNase, Depl, RNase R

TRIzol, Spect, BA2100, Depl

Fertility

O. sativa P.

RPAD method (Enrichment of circular RNAs)

DCC software; PlantcircBase

LQR and A; BWA-MEM 0.7.1; CIRI v1.2 CLC

Identification

MinION minimap2 platform

10 + 10 D PCR in RNA and gDNA; sequencing or digestion with restriction enzymes

26 D&C qPCR amplicon-size analysis on Agilent Bioanalyzer

Validation

EpiNano tool; BiNGO in -1 D qPCR with Cytoscape; CNCI; RNA CPC; Swiss-Prot; + +/control IRESfinder tool; Transdecoder; BLASTP using NR

5 D PCR with RNA and gDNA, Sanger

16 D&C PCR; DESeq; 16 RT-qPCR MultiExperiment RNA and gDNA Viewer; topGO; KOBAS; KEGG and GO; Pfam KOG/ COG; Nr; Swiss-Prot; miRBase; Target Finder; Cytoscape

NCBI-BLASTN; psRNATarget

Others

fastx toolkit; circseq_cup Annotation of circRNAs pipeline; Aligning TopHat-Fusion, STAR-Fusion CIRCexplorer

HS Xten LQR and A; HISAT2; platform find_circ

NextSeq 500

HS 2000

Seq

RNA sequencing HS 3000 libraries

NEBNextR

NEXTflex TRIzol DNase, RNase R, AGE, tdMDA, Spect, RTPCR

Truseq technology

Library

O. sativa and N. benthamiana L.

RNA isolation

Truseq technology

Develop.

H. vulgare L. S.

Species

Table 1 (continued)

470

2806

[76]

[25]

[49]

[101]

O. sativa 1875 Total putative N. benthamiana 9242

9994; 186 DE

[107]

References

62

CircRNA


Intraspecific variation

Z. mays L. X-ten

HS 3000

HS 2500

HS 2500

FastUniq; CIRI; CIRCexplorer2

RNase R X D&C RT-qPCR with RNA

BLAST; EdgeR; Blast2GO; KOBAS3.0; GSTAr.pl (from miRBase)

2804; 1690

3715

318; 282 DE

895

DESeq; GO; TO; KEGG; 12 D&C qPCR with 1500 DE TopHat RNA and gDNA, Sanger

15 D PCR in RNA and gDNA; Sanger

8 D&C PCR with RNA and gDNA, Sanger

miRanda; TargetFinder; No validation specified CIRI tools; DESeq R; Nr; Pfam; KOG/COG; SwissProt; KEGG; topGO GO

LQR and A, KNIFE; - psRNATarget BLATCIRCexplorer2 CIRI; TopHat2

find_circ; Bowtie2

LQR and A TopHat2 (version 2.0.10) CIRI

edgeR package; WEGO; LQR and A; TopHat; BLAST2GO; CIRCexplorer InParanoid; Splicing program; TAIR;MSU Related Genes Genome Annotation (ASRG); BINGO; Project Database Cufflinks; rMATS.3.2.2; BLAST; WGCNA

[110]

[60]

[109]

[48]

[108]

L Leaf, R Root, S Stem, P Whole plant, F Fruit, Fl Flower, O Ovule, Depl. Depletion, Spect Spectrophotometer techniques, HS High Sequence, D&C Divergent and convergent primers, BA2100 Agilent 2100 Bioanalyzer, u Unique, LQR and A Remove adapter and low quality, DE Differential expression, g Genome, Seq Sequencer

mRNA-seq Sample Preparation Kit ()

TRIpure, DNase, NEBNext RNA clean, AGE, RNase R

Phenotypic variance

Z. mays L.

TRIzol, DNase, BA2100, RNase R

5 resources; including 21 samples.

NEB Next

strand-specific RNA-Seq libraries

Z. mays L.

Depl, RNase R

S. lycopersicum F.

LeERF1

RNAprep Pure Plant Kit, Depl.

P. edulis S.


L. bud to young TRIzol, DNase, L. BA2100, Depl.

Lint and fuzz fibers

Fl. and pollen

C. sinensis L.

G. hirsutum O.

G. max Fl

RNAprep pure Plant Kit, DNase, RNase R, Depl. Qubit,BA2100, qPCR

guanidine thiocyanate, Depl.

TRIzol. BA2100, Spect

NEBNext

Oligotex mRNA kit

mRNA-seq sample preparation kit

NEBNext

HS 2500

HS

HS 4000

HS

Hi-Seq

Ripening

No kit specified (see papers)

C. annuum F.

HS. X-ten.

TRIzol kit Spect, Qubit, BA2100, Depl. RNase R

NEBNext

Anther

RNAprep Pure Plant Kit, Depl. BA2100, AGE, PCR, Qubit, Spect

Seq

B. campestris F.

Methyl jasmonate

A. thaliana P.

GEO accession GSE43616

Library

NCBI GEO database GSE43616 GEO database (Triticum aestivum, GSE58805; Glycine max, GSE69469; Zea mays, GSE71046; Oryza sativa Indica, GSE74465)

Life span of leaves

A. thaliana L.

RNA isolation

A. thaliana/ Others L.

Develop.

Species

Table 2 CircRNA studies in plant development published in the last 5 years

DESeq, TargetFinder; Blast

TargetFinder, psRNATarget, agriGO, GOSemSim

DESeq2, psRobot, Cytoscape, LQR and A, agriGO, KOBAS, Predict TopHat, Bowtie IRES, ORFs and coding v2.0.6 products find_circ.

Gene expression using FPKM

12 D&C qPCR, Sanger

5 D&C RT-qPCR with RNA

24 D&C qPCR with RNA and gDNA

No validation specified

4 D qPCR with RNA and gDNA

No validation specified

circMeta R OmicShare, GSTAr.pl 6 D&C PCR with Phytozome12, Cytoscape RNA and gDNA, Sanger

TopHat2, TopHat- DESeq; PlantcircBase, BLASTN, Fusion, BWA, PacBio RS II, GO, KEGG, CIRI Targetscans, miRanda, CIRCexplorer Cytoscape LQR and A. Cuffcompare, TopHat2 Cufflinks; Cuffmerge; CIRI

Validation

psRNA target TAPIR AtmiRNET 4 D rt. qPCR Cytoscape DAVID (GO, KEGG)

Others

LQR and A, StringTie GOseq, KOBAS StringTie, CIRI

LQR and A, find_circ, miRBase, miRDeep2

find_circ, circRNA_finder CIRI2

LQR and A, BWA, CIRI, CIRCexplorer find_circ, CD-HIT-EST

MapSplice

Identification

2867, 1009 DE

2262

3175, 828 DE in bud, 1594 DE in L.

125 DE

1443, 758 DE fertile, 584 DE sterile

11,490

8588, 385 DE

168

CircRNA

[75]

[113]

[100]

[68]

[112]

[57]

[111]

[58]

References

126 Laura Carmen Terro´n-Camero and Eduardo Andre´s-Leo´n

Morphogenesis

Flowering

Pigment

Ripening

Coloration

Ripening

P. euphratica L.

P. trifoliata F.

S. lycopersicum F.

S. lycopersicum F.

S. lycopersicum F.

S. lycopersicum F.

NEBNext

TruSeq Stranded

Strand-specific sequencing

NEBNext

TRIzol Spet, Depl.

RNA Extraction Kit, BA2100. RNA-Seq sample preparation kit

NEBNex

NEBNext RNAprep Pure Plant Kit, Depl, RNase R, BA2100, qPCR

RNA isolation, Depl, BA2100

TRIzol Depl. RNase R TruSeq RNA LT

Plant RNAiso Plus kit, Depl. DNase, RNase R

mirVana miRNA Isolation Kit BA2100, Depl.

mirVanamiRNA Isolation Kit, BA2100

HS 2500

HS

HS X-Ten

HS 4000

HS 2500

HS 4000

HSTM 2500

HS 2500

Bowtie2; find_circ

StringTie, CIRI tools

Find-circ

LQR and A; TopHat2; CIRI

Trimmomatic; Segemehl; CIRI

LQR and A, TopHat, CIRI

CIRI

LQR and A, bowtie2 and eXpress, CIRI Find –circ

DEseq. psRNATarget

DESeq R; Pfam; KOG/COG; Swiss-Prot; KEGG; GOseq R; KOBAS; hypergeometric test

DESeq R, GO

CIRI tools, Pfam, KOG/COG, Swiss-Prot, NR, topGO, KEGG, TargetFinder, Cytoscape

DESeq R, psRNATarget, Blast2go, KEGG

DESeq, miRBase, psRNATarget, NCBI (functional predictions)

DEseq2, miRbase, mirdeep230 psRNATarget, GO, KEGG

DESeq2, GOseq R, KOBAS, BLASTN TargetFinder, psRNATarget, Cytoscape

LQR and A, CIRI, BLAST2GO CIRIAS tool, TopHat2

HS 2500 LQR and A, BWA platform Cufflinks, CIRI2

HS 2500

5 D qPCR in RNA

No validation specified

No validation specified

11 D qPCR with ARN.

6 D&C qPCR in RNA and gDNA

22 D qPCR ARN and gDNA; 28 qPCR

1 D qPCR in RNA

10 D&C qPCR, Sanger, 3 qPCR

6 D&C qPCR with ARN and gDNA, Sanger

No validation specified

45, 12 DE (YN29XN979), 23 DE (YN29-LH9)

65 DE

3796, 273 and 89 DE

705, 340 DE

796

558 potential, 176 DE

2 circRNAs

1149, 163DE in Bo, 226 DE in Ov, 223 DE in La, 383 DE in Li

2616

[122]

[121]

[120]

[119]

[50]

[118]

[117]

[116]

[115]

3819 in R and 2295 in Y [114]

L Leaf, R Root, S Stem, P Whole plant, F Fruit, O Ovule, Depl. Depletion, Spect Spectrophotometer techniques, HS High Sequence, D&C divergent and convergent primers, BA2100 Agilent 2100 Bioanalyzer, u unique, Remove adapter and low quality LQR and A, DE Differential Expression, G Genome, Seq Sequencer

T. aestivum R.

Heteromorphic L

P. euphratica L.

TRIzol, Depl. TruSeq RNA BA2100,Spet, AGE Library Prep

H. rhamnoides F. F.

high strandspecificity of the libraries

TRIzol, Qubit, Spect, Depl

H. rhamnoides Mature stage F.

Table 3 Plant circRNA studies in response to biotic stress published in the last 5 years

Columns: Species; Stress; RNA isolation; Library; Seq; Identification; Others; Validation; CircRNA; References

Species (stress): A. deliciosa (bacterial canker), C. lanatus (CGMMV), S. tuberosum (P. carotovorum), Gossypium spp. (Verticillium spp.), O. sativa (M. oryzae), S. lycopersicum (TYLCV), Z. mays (nucleorhabdovirus), S. pimpinellifolium (P. infestans)

References: [67], [123]–[129]

L Leaf, R Root, S Stem, Depl. Depletion, Spect Spectrophotometer techniques, HS High Sequence, D&C divergent and convergent primers, BA2100 Agilent 2100 Bioanalyzer, u unique, LQR and A Remove adapter and low quality, DE Differential Expression, G Genome, Seq Sequencer

Table 4 Plant circRNA studies in response to abiotic stress published in the last 5 years

Columns: Species; Stress; RNA isolation; Library; Seq; Identification; Others; Validation; CircRNA; References

Species: A. thaliana, B. rapa, C. annuum, C. junos, C. sativus, G. max, L. sativa, N. tabacum, P. betulifolia, R. sativus, S. lycopersicum, T. aestivum, V. vinifera, Z. mays

Stress: heat, salinity, low temperature, low P, high/low light, topping, Cu, low Ca, chilling, drought, low N, cold temperature and mutant, salinity and drought, dehydration

References: [21], [23], [35], [42], [46], [64], [84], [102], [130]–[139]

L Leaf, R Root, P Whole plant, F Fruit, S Seeds, Fl Flower, Tmp Temperature, N Nitrogen, Ca Calcium, Cu Copper, P Phosphorus, Depl. Depletion, Spect Spectrophotometer techniques, HS High Sequence, D&C divergent and convergent primers, BA2100 Agilent 2100 Bioanalyzer, u unique, LQR and A Remove adapter and low quality, DE Differential Expression, g Genome, Seq Sequencer

On the other hand, the most studied abiotic stresses have been those related to temperature changes. The responses to drought, nutrient deficiency, salinity, changes in light intensity, and heavy metal stress have been characterized in a smaller number of publications. In plant development research, articles have mainly focused on fruit ripening (12%) and leaf morphogenesis (10%). The remaining studies focused on aspects associated with plant fertility [49] and root growth.

2.1 Methods to Identify and Characterize CircRNAs in Plants

CircRNAs are recently discovered players in RNA-mediated gene regulation at the transcriptional and posttranscriptional levels, acting in many biological processes. In plants, the function of circRNAs is poorly characterized, although they are widespread and abundant, as shown by the first report in Arabidopsis thaliana [14]. Since that work, various studies have been carried out which, in addition to identifying circular RNAs, have also sought to ascertain their involvement in plant stress responses (Tables 3 and 4) and development (Table 2). Identifying and characterizing circRNAs requires a high-throughput sequencing approach followed by a bioinformatics analysis. In detail, this involves obtaining the sample (normally from the tissue under study, such as leaf or fruit), extracting the cellular RNA, preparing the library, and sequencing it. This process produces a huge number of short sequences, which are subjected to a quality assessment before existing circRNAs are identified and characterized using computational tools. Finally, although not all authors agree, it is important to validate at least part of the identified circRNAs in the laboratory, as evidence of the overall validity of the results obtained (Fig. 1).

2.1.1 RNA Isolation, Library Preparation and Sequencing

In recent studies, two strategies have frequently been followed to isolate RNA from the samples of interest: commercial kits, or TRIzol reagent with standardized protocols. Of the 62 publications from the last 5 years, all treat the samples with DNase and ~75% perform a ribosomal RNA depletion using the Ribo-Zero rRNA Removal kit (Epicentre). In addition, approximately 50% of the studies incorporate RNase R to eliminate linear RNAs, since this treatment helps to improve sensitivity and reduce the number of false positives [4, 20, 45, 50]. Before library preparation, NanoDrop (NanoDrop) and the Agilent 2100 Bioanalyzer, along with Qubit technology, have been used to evaluate the quality and concentration of the RNA, although classic techniques such as agarose gels are still in use. The most used library kits are TruSeq Stranded Total RNA (Illumina) and NEBNext Ultra Directional RNA Library (NEB), as they are specifically designed to work with Illumina sequencers such as the HiSeq 2000–4000.


Fig. 1 Schematic workflow for sample preparation, bioinformatics analysis, and validation

After sequencing, adapters and low-quality data must be removed from the raw reads. The tools most often used for this purpose were Trimmomatic, Cutadapt, FastQC, NGS-QC, and FASTX-Toolkit, among others.

2.1.2 CircRNA Identification Tools
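
The read cleaning that precedes identification (removal of adapters and low-quality bases, as described above) can be pictured with a minimal Python sketch. It is not the algorithm of Trimmomatic or Cutadapt: the function names, the exact-match adapter search, the Phred+33 quality encoding, and the thresholds (quality 20, minimum length 20) are all assumptions chosen for illustration.

```python
def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def trim_low_quality(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def clean_read(seq, qual, adapter, min_q=20, min_len=20):
    """Return the cleaned (seq, qual) pair, or None if the read is too short."""
    seq, qual = trim_adapter(seq, qual, adapter)
    seq, qual = trim_low_quality(seq, qual, min_q)
    if len(seq) < min_len:
        return None
    return seq, qual


# Hypothetical read: 24 nt of insert followed by an illustrative adapter.
ADAPTER = "AGATCGGAAGAGC"
read_seq = "ACGT" * 6 + ADAPTER
read_qual = "I" * 22 + "##" + "I" * 13  # '#' encodes Phred 2, 'I' encodes Phred 40
print(clean_read(read_seq, read_qual, ADAPTER))
```

Dedicated trimmers additionally handle partial and mismatched adapter occurrences, paired-end reads, and sliding-window quality filters, so a sketch like this is only useful for understanding what the step does.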

Multiple bioinformatic algorithms have been developed to identify circRNAs, such as circRNAFinder [32], CIRCexplorer [51], TestRealign [52], CIRI [53], CIRI2 [54], find_circ [5], MapSplice [55], PcircRNA-finder [26], and KNIFE [28]. These algorithms fall into two groups according to the strategy used to find circRNAs [56]. The first, the “pseudo-reference” strategy, builds a putative circRNA reference from a gene annotation database and then identifies junction-spanning reads. The second, the “segmented-based” approach, relies on the identification of back-splicing junctions from aligned sequences. Of these, the most used tool to identify plant circRNAs has been CIRI, which is used in more than 60% of the publications, followed by find_circ (20%) and CIRCexplorer (20%). Other tools have been used only once: CircRNA-finder was used by Meng and collaborators to study circRNA networks in Arabidopsis thaliana leaf development, identifying 11,490 circRNAs [57], and MapSplice was used by Liu et al. to characterize circRNAs involved in the life span of Arabidopsis thaliana leaves, identifying a total of 168 circRNAs [58]. Chen and collaborators used KNIFE, as well as CIRCexplorer2 and CIRI, to identify circRNAs in leaves of several varieties. In addition, BLAT [59] was used to identify conserved maize circRNAs in Oryza sativa and Arabidopsis thaliana by specifically aligning the junction sequences of circRNAs [60]. Ye and collaborators used circseq-cup to characterize circRNAs with diverse non-GT/AG splicing, detecting a total of 2806 circRNAs, including only 206 flanked by the canonical GT/AG (CT/AC) splicing signals and 1543, of which 1057 were flanked by noncanonical splicing signals. A further 208 circRNAs could not be assigned to any splicing signal due to the shuffled mapping on the rice genome, but it was clear that the canonical splicing signals were not identified [25].

Nevertheless, these bioinformatics tools differ in sensitivity and precision when identifying circRNAs from RNA-Seq data, and they also vary in computational complexity. Moreover, as previously mentioned, the identification of circRNAs in plants is complicated by the fact that most algorithms are designed to work with animal samples. CIRI and find_circ, two of the three most widely used tools, only consider canonical GT/AG splicing [13]; because of this, there is a modified version of the CIRI algorithm able to search for plant-specific dinucleotides [13]. In order to select the tool or tools to be used, it is necessary to compare their performance on identical datasets. In that sense, Zeng and collaborators evaluated 11 circRNA detection tools on four datasets: a positive dataset (14,689 circRNA species), a background dataset (simulated paired-end RNA-Seq data), a mixed dataset (combining the background and positive datasets), and a real dataset.
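
To make the comparison criteria concrete, the following Python sketch shows how precision and sensitivity could be computed for one tool against a known positive set, and how a simple consensus between tools could be formed. It is an illustration only, not the benchmarking procedure of any cited study: the tool names, the representation of a circRNA as a (chromosome, start, end) tuple, and all coordinates are hypothetical.

```python
from collections import Counter


def precision_sensitivity(predicted, truth):
    """Precision = TP / predicted; sensitivity (recall) = TP / truth."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    sensitivity = true_positives / len(truth) if truth else 0.0
    return precision, sensitivity


def consensus(calls_by_tool, min_tools=2):
    """Keep back-splice junctions reported by at least min_tools tools."""
    counts = Counter(j for calls in calls_by_tool.values() for j in set(calls))
    return {junction for junction, n in counts.items() if n >= min_tools}


# Hypothetical positive set and hypothetical calls from two tools.
truth = {("chr1", 100, 500), ("chr1", 900, 1500), ("chr2", 50, 400)}
calls = {
    "tool_a": {("chr1", 100, 500), ("chr1", 900, 1500), ("chr3", 10, 90)},
    "tool_b": {("chr1", 100, 500), ("chr2", 50, 400), ("chr3", 700, 950)},
}

for name, predicted in calls.items():
    print(name, precision_sensitivity(predicted, truth))
print("consensus", precision_sensitivity(consensus(calls), truth))
```

In this toy example the consensus set contains no false positives but recovers fewer true circRNAs than either tool alone, which is the usual trade-off when tools are combined by intersection.
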
Zeng and collaborators concluded that CIRI [53], CIRCexplorer [51], and KNIFE [28] offer a better balance between precision and sensitivity than the other tools, while NCLscan and MapSplice were conservative methods with similar precision but lower sensitivity [61]. A subsequent study compared five algorithms using samples treated with RNase R (which eliminates linear RNA, thus enriching the fraction of circular RNA) and untreated samples, with the aim of evaluating the number of true-positive and false-positive circRNAs identified by each tool. Its final conclusion advocates combining different tools to reduce the number of false positives [62, 63]. Therefore, we recommend the use of tools with a high degree of precision and recall that are also capable of identifying all types of splicing signals. As mentioned, the best way to do this is through the integrated use of various prediction tools. In addition, whenever possible, plant-specific tools such as PcircRNA-finder should be used along with other tools. One example is the work by Cheng and collaborators: after using find_circ and CIRI2 to detect topping-responsive circRNAs in N. tabacum roots, they also used PcircRNA-finder to detect additional back-splice junction sites [64]. According to our analysis of the articles published in this field over the last five years, only 25% of the publications have used more than two tools.

2.2 Bioinformatic Resources for Plant CircRNAs

Depending on the initial approach and hypothesis of a particular project, the identification of circRNAs in a new species, or in a new tissue, allows sufficient conclusions to be drawn for the publication of a scientific paper. However, other works include different scenarios that allow more complete studies to be performed. For example, in stress studies, whether biotic or abiotic, the basal condition is used as the control group whereas other samples are subjected to a certain type of stress. This allows the identification of circRNAs that are new and specific to each group, associated with baseline or stress situations. At the same time, through differential expression studies, it is possible to determine whether some circRNAs shift their expression levels between the different conditions. A further step is to study the circRNA host genes through functional enrichment analyses, in order to understand in general terms the affected metabolic or developmental pathways. Finally, since circRNAs reduce the amount of miRNAs in the cytoplasm and these in turn negatively regulate various genes, the resulting ceRNA network makes it possible to extend the study of host gene expression to other types of genes that may also be affected by this complicated network.
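
As a rough illustration of the first two steps described above, the Python sketch below separates circRNAs that are specific to the control or to the stress group from those shared by both, and computes a log2 fold change for the shared ones. The circRNA identifiers, the expression values, and the pseudocount of 1 are hypothetical; dedicated differential expression packages additionally model replicate variability and report a statistical value, which this sketch does not attempt.

```python
import math


def group_specific(control, treatment):
    """Split identifiers into control-only, treatment-only, and shared sets."""
    control, treatment = set(control), set(treatment)
    return control - treatment, treatment - control, control & treatment


def log2_fold_change(control_value, treatment_value, pseudocount=1.0):
    """log2(treatment / control); the pseudocount avoids division by zero."""
    return math.log2((treatment_value + pseudocount) / (control_value + pseudocount))


# Hypothetical normalized expression values per circRNA and condition.
control = {"circA": 7.0, "circB": 3.0}
treatment = {"circB": 15.0, "circC": 4.0}

only_control, only_treatment, shared = group_specific(control, treatment)
print("control-specific:", only_control)
print("treatment-specific:", only_treatment)
for circ in sorted(shared):
    print(circ, log2_fold_change(control[circ], treatment[circ]))
```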

2.2.1 Differential Expression

As mentioned above, differential expression studies enable us to determine which circRNAs are expressed under different conditions, as well as to statistically compare the changes in expression that these RNAs undergo between the various study situations. The fold change measures the difference in expression of a circRNA and, together with a statistical value, usually a false discovery rate, provides the information needed to infer the possible regulatory patterns of these RNAs. In more than 70% of the articles published in the last five years, a differential expression (DE) analysis among several types of samples has been carried out. These analyses have helped to study circRNAs involved in different stages of development and in the plant stress response. The most widely used tools for this estimation were DESeq [65] and the edgeR package [66]. Differentially expressed circRNAs (DECs) are established by applying a threshold value, mainly 1 or 2, to the absolute value of the log2 fold change, followed by a P value […]

In Silico Analysis of Mobile Elements and circRNAs in Plants

Liliane Santana Oliveira et al.

[…] 4000 bp [base pair]). Next, using Support Vector Machines (SVMs) with a Gaussian kernel, the authors build a sequence of binary classification processes. A hierarchical order guides each binary classification, which either produces an output or proceeds to the next level. For example, they compare DNA transposons versus retrotransposons, then LTRs versus non-LTRs, and so on. Finally, the tool's performance was evaluated, reaching accuracies of 90.9%, 94.3%, 74.1%, and 84.2% for Class II, LTR, LINE, and SINE sequences, respectively.

PASTEC (Pseudo Agent System for Transposable Element Classification) [49] classifies TE sequences into two classes (Class I and II) and 12 orders (LTR, DIRS, PLE, LARD, TRIM, LINE, SINE, TIR, MITE, Crypton, Maverick, and Helitron). It is based on a Hidden Markov Model (HMM) that considers features such as sequence length, LTR, TIR, Simple Sequence Repeats (SSRs), polyA tail, and Open Reading Frame (ORF). PASTEC performs similarity-based searches through BLASTx, tBLASTx, and BLASTn to search for TEs in the Repbase Update database [61]. Next, PASTEC uses profile HMMs [52] to search for TEs and protein domains. Each of the similarity searches can be turned off; however, the results can be affected, because PASTEC depends mostly on similarity search.

The software REPCLASS [50] consists of three modules that classify sequences into every known TE category. All three modules help to define a consensus TE sequence. The first module, similarly to PASTEC, performs similarity searches (BLASTx) against known TEs in the RepBase database. The second module searches for TE structural features such as terminal inverted repeats (TIR_search), LTRs (LTR_search), tRNA-like sequences (tRNAscan-SE), or polyA/SSRs (polyA/SSR_search). Finally, the third module searches for duplication sites and flanking regions in the genome using similarity search (BLASTn).

All the methods listed above (TEclass, PASTEC, REPCLASS) are based on filtering or alignment-based steps. However, handcrafted features are intrinsically linked to contextual knowledge about the problem, and a specialist is rarely available to provide such information. These methods also use similarity-based search strategies to classify TE sequences, which greatly impairs their efficiency (i.e., computational cost) and scalability, because databases with large volumes of sequences are required to obtain an adequate search space. To mitigate these problems, some more recent methods propose the use of deep Convolutional Neural Networks (CNNs) [53–55] to automatically describe and classify TE sequences. With these methods, one needs neither to define which kind of feature will be extracted nor to have a priori knowledge about the problem. Once trained, these methods take only sequences as input. Moreover, they are well suited to other omics classification problems [51, 56–58]. Nakano and collaborators [59] used deep neural networks (DNNs) to classify TEs using the PGSB [60] and RepBase [61] databases. However, they did not consider superfamily or order classification. Recently, Cruz and collaborators [51] proposed a method called Transposable Elements Representation Learner (TERL), which transforms omics data into a 2D representation (i.e., image-like data of the sequences) and uses it as input to CNNs. Unlike other methods, it predicts any hierarchical level of the Wicker classification for TEs. According to the authors, TERL is almost 10% more accurate than TEclass and close to PASTEC in performance, while being about 20 times faster than TEclass and PASTEC.

Table 5 Overview of some classifiers of transposable elements. All tools described here are standalone

Name | URL | TE type | Year | References
TEclass | http://www.compgen.uni-muenster.de/tools/teclass/index.hbi? | DNA transposons, LTR, LINE, and SINE | 2009 | Abrusán et al., 2009 [48]
PASTEC | https://urgi.versailles.inra.fr/Tools/PASTEClassifier/PASTEClassifiertuto | Class I and class II | 2014 | Hoede et al., 2014 [49]
REPCLASS | http://sourceforge.net/projects/repclass/11:16 | Any known TE type | 2014 | Feschotte et al., 2009 [50]
TERL | https://github.com/muriloHoracio/TERL | Class I, class II, and unknown (order and superfamily) | 2020 | da Cruz et al., 2020 [51]

2.4 Transposable Elements-Related Databases

When we consider the availability of information on transposable elements for plants, there are more databases available for public access than there are for circular RNAs. One of the first published databases we could find in our research, the TIGR Plant Repeat Database [62], is unfortunately no longer available, having been discontinued due to lack of funding. Currently, we could find 12 public databases with available information on transposable elements: DPTEdb [63], GyDB [64], MASiVEdb [65], MnTEdb [71], PlaNC-TE [17], PlanTE-MIR DB [66], PGSB [60], P-Mite [67], RepBase [61], REPETdb [68], SINEBase [69], and SoyTEdb [70]. As a rule, each database employs its own algorithm for de novo in silico detection of transposable elements, and they all store consensus sequences for each repeat. They vary widely in scope (from a single organism, as in SoyTEdb, to 51 different species in the Gypsy database, GyDB) and in the number of available sequences (from around 2200 in GyDB to more than 2,000,000 in P-Mite) but, unfortunately, there is no explicit information linking them to circRNAs. In total, we found 197 species covered by TE repositories, with 39 of them shared with circRNA databases and 85 available exclusively in TE data banks. The species available in both TE and circRNA databases can be integrated in further analyses in order to obtain TE-associated circRNAs. The TE repository covering the highest number of plant species is GyDB [64] with 51 species (15 monocots, 29 eudicots, 1 gymnosperm, and 6 algae: 3 Chlorophyta, 2 diatoms, 1 Rhodophyta), followed by P-Mite [67] with 41 species and RepBase [61] with 31 species.
The remaining 9 databases cover a total of 70 species, including specimens from different evolutionary branches. Only two databases are exclusive to one species: MnTEdb [71], which was designed to store information from the flowering eudicot Morus notabilis (mulberry), and SoyTEdb [70], with data related to another eudicot, Glycine max (soy). The PlaNC-TE database delivers 14,350 overlaps between nine noncoding RNAs (tRNA, rRNA, pre_miRNA, snRNA, snoRNA, antisense_RNA, sense_intronic, SRP_RNA, RNase_MRP_RNA) related to five transposable element orders in 40 plant genomes. Similarly to PlaNC-TE, PlanTE-MIR DB is dedicated only to miRNA–TE associations, in 10 plant genomes.

Including all databases, we found plants spread across different evolutionary clades: 70 eudicots and 35 monocots; three gymnosperms (Picea spp., Picea abies, Pinus radiata); three nonvascular land plants, namely Adiantum capillus-veneris (a Pteridophyta member), Marchantia polymorpha (a Marchantiophyta member), and Physcomitrella patens (a Bryophyta member); and 8 algae species, including four green algae (Chlamydomonas reinhardtii, Chlorella variabilis, Coccomyxa subellipsoidea, and Volvox carteri; Chlorophyta), one red alga (Porphyra yezoensis; Rhodophyta), two diatoms (Thalassiosira pseudonana and Porphyra yezoensis), and one Cryptophyta member (Rhodomonas salina). Despite this broad coverage of specimens from different evolutionary branches, only 25 eudicots, 10 monocots, 1 Chlorophyta, and 1 Bryophyta are present in both TE and circRNA databases. This information is summarized in a table available on GitHub (https://github.com/liliane-sntn/plant_circRNAs_TEs).

For the transposable element databases, we generated a list of 21 terms present in the twelve databases. In general, the majority of these terms are related to the classification and taxonomy of transposable elements, such as “Class,” “Subclass,” “Family,” and “Superfamily.” However, some terms can provide interesting information regarding the TE, such as the “centromere distance,” the “Envelope-like gene,” and the estimated “age” in millions of years for a particular element, which can be retrieved from MASiVEdb. This information is summarized in Table 6.

Table 6 Information content stored in transposable elements–related public databases

Databases (columns): MASiVEdb, MnTEdb, PMite, PlaNC-TE, PlanTE-MIR DB, SINEbase, REPETdb, SoyTEdb, RepBase, PGSB, GyDB

Information fields (rows): Superfamily; Family; Sequence; Genomic location; Id; Class; Subclass; Strand; Order; Bib. reference; Taxonomy; Copy number; Domain; Classification (WICKER); Phylogeny; Domains; Literature article; Centromere distance; Envelope-like gene; Age; Non-coding

The transposable element databases offer a variety of ways to search for information. With the exception of MASiVEdb, PlaNC-TE, PlanTE-MIR DB, REPETdb, and RepBase, the databases offer some type of similarity search. BLAST is the most popular search method; only SINEBase performs its similarity searches with the FASTA algorithm [72], which is slower than BLAST but can be an interesting option, as it is more sensitive to alignments between less similar sequences. When looking for matches against a consensus, however, probably the best alternative is an HMM search, which guarantees that the alignment is more tolerant of variations in less conserved positions. HMM search is available in DPTEdb and GyDB. MnTEdb also advertises HMM search but, at the time of writing of this survey, the option is not available. Keyword searches are popular, and keywords can include organism (PGSB), host gene (REPETdb), family (DPTEdb, MnTEdb, P-Mite), and superfamily (DPTEdb, MnTEdb, REPETdb). Two databases offer a free keyword search, where the contents of all the records are searched for the specific word (RepBase and the Gypsy database). In four databases we can directly search a table of contents: DPTEdb, MnTEdb, P-Mite, and SINEBase. Finally, DPTEdb, MASiVEdb, PlaNC-TE, PlanTE-MIR DB, and REPETdb offer a variety of drop-down menu searches.

Most of the transposable element databases make their sequences available for bulk download; the exception is the Gypsy database, for which we could not find any download option. Also, MnTEdb offers only download of individual sequences. Similarly, all databases, with the exception of Gypsy and DPTEdb, offer bulk download of annotation information on the sequences. Table 4 summarizes the bulk download options available and the number of sequences/consensus of each database.
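
Once TE annotations and circRNA coordinates have been downloaded for a species present in both kinds of database, one possible first step toward TE-associated circRNAs is a simple coordinate overlap. The Python sketch below is only an illustration under several assumptions: both annotations refer to the same genome assembly, coordinates are closed intervals given as (chromosome, start, end), and every identifier and coordinate is hypothetical. None of the databases described here is claimed to provide data in this exact form.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def te_associated(circrnas, tes):
    """Map each circRNA id to the sorted ids of TEs it overlaps on its chromosome."""
    hits = {}
    for circ_id, (c_chrom, c_start, c_end) in circrnas.items():
        matched = [
            te_id
            for te_id, (t_chrom, t_start, t_end) in tes.items()
            if t_chrom == c_chrom and overlaps(c_start, c_end, t_start, t_end)
        ]
        if matched:
            hits[circ_id] = sorted(matched)
    return hits


# Hypothetical annotations on the same assembly.
circrnas = {
    "circ1": ("chr1", 100, 500),
    "circ2": ("chr1", 2000, 2600),
    "circ3": ("chr2", 100, 500),
}
tes = {
    "TE_a": ("chr1", 450, 800),
    "TE_b": ("chr2", 600, 900),
    "TE_c": ("chr1", 2500, 2550),
}
print(te_associated(circrnas, tes))
```

The nested loop is quadratic in the number of features; for genome-scale annotations an interval tree or a sorted sweep would be a more suitable design.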

3 Exploratory Analysis on CircRNA and TE Data

In this section, we describe an exploratory analysis based on sequence length, dinucleotide frequency, and GC content of circRNA sequences from four model species (A. thaliana, G. max, O. sativa, and Z. mays) available in three circRNA databases (GreenCircRNA [37], PlantcircBase [31], and PlantCircNet [36]). For TEs, we have already carried out these analyses in another book chapter [73].

3.1 Methodology of the Analysis

The circRNA sequences from A. thaliana, G. max, O. sativa, and Z. mays were obtained from the Download section of the three abovementioned databases: (1) PlantcircBase, from the back-splicing sequences file; (2) PlantCircNet, from the “Sequence” item; and (3) GreenCircRNA, from the circ_full-length sequence file. From each file, we removed circRNAs with a length equal to or larger than 2000 nt. As a result, approximately 10% and 15% of circRNAs were excluded from the sets obtained from the GreenCircRNA and PlantCircNet databases, respectively. The PlantcircBase database is composed of back-splicing circRNA sequences; because of that, its sequences are up to 200 nt in size, and none of them was removed from the analysis.
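
The filtering step and the three measures used in this analysis (length, GC content, and dinucleotide frequency) can be sketched in a few lines of Python. This is not the authors' code: the function names are hypothetical, sequences are assumed to be already loaded into a dictionary of name to sequence, and counting overlapping dinucleotides is an assumption about how the frequency is defined.

```python
from collections import Counter


def filter_by_length(seqs, max_len=2000):
    """Keep sequences shorter than max_len (removes lengths >= max_len)."""
    return {name: s for name, s in seqs.items() if len(s) < max_len}


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def dinucleotide_freq(seq):
    """Relative frequency of each overlapping dinucleotide in the sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {di: n / total for di, n in counts.items()}


# Hypothetical sequences; "long" has exactly 2000 nt and is therefore removed.
seqs = {"short": "ACGT" * 10, "long": "A" * 2000}
kept = filter_by_length(seqs)
for name, s in kept.items():
    print(name, len(s), gc_content(s), dinucleotide_freq(s))
```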

3.2 Sequence Length Analysis

As a first analysis, we calculated the length of all sequences present in the GreenCircRNA, PlantcircBase, and PlantCircNet databases. For each database, we constructed boxplots for A. thaliana, G. max, O. sativa, and Z. mays to represent the variation in circRNA length (Figs. 1, 2, and 3). The length of most circRNAs in the GreenCircRNA database ranges from 200 to 500 nt, with a small variability noted for O. sativa (Fig. 1). The median length in all species was lower than 500 nt. The circRNAs in GreenCircRNA are shorter than those in the other two databases. One explanation could be that GreenCircRNA is composed of circRNAs that act as miRNA decoys [37]. The PlantCircNet database is composed of circRNAs involved in regulatory triplets (circRNA–miRNA–mRNA). In this database, the median circRNA length is close to 500 nt for A. thaliana and G. max (Fig. 2). This is not the case for O. sativa and Z. mays, which differ from the other two databases. Finally, in the PlantcircBase database, the median value for all species is 200 nt, which is also the maximum size of the sequences (Fig. 3). This is because the PlantcircBase file is composed of back-splicing sequences; that is, the sequences of this database were already processed and are shorter than the circRNA sequences of PlantCircNet and GreenCircRNA. Overall, our results are similar to those previously described in the literature, in which circRNAs are mainly between 200 and 600 bp long and only a small fraction is >2 kb [13, 74, 75].
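The per-species summaries behind such boxplots can be computed with the Python standard library. The sketch below is illustrative only: the lengths are hypothetical toy values, not data from the three databases, and the quartile method ("inclusive") is an assumption, since plotting libraries differ in how they compute quartiles and whiskers.

```python
import statistics

def length_summary(lengths):
    """Return the five-number summary underlying a boxplot."""
    q1, median, q3 = statistics.quantiles(lengths, n=4, method="inclusive")
    return {"min": min(lengths), "q1": q1, "median": median,
            "q3": q3, "max": max(lengths)}

# Hypothetical lengths (nt), for illustration only; not values from the databases.
toy_lengths = {
    "A. thaliana": [200, 300, 400, 500, 600],
    "O. sativa": [250, 300, 350, 400, 450],
}
summaries = {species: length_summary(v) for species, v in toy_lengths.items()}
for species, summary in summaries.items():
    print(species, summary)
```

Comparing the `q1` to `q3` span between species gives a numeric counterpart to the box heights discussed in the text.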

Fig. 1 Boxplots of the circRNA length of four species present in GreenCircRNA database

Fig. 2 Boxplots of the circRNA length of four species present in PlantCircNet database

Fig. 3 Boxplots of the circRNA length of four species present in PlantcircBase database

Fig. 4 Frequency of dinucleotides from circRNAs of four species present in GreenCircRNA database

Fig. 5 Frequency of dinucleotides from circRNAs of four species present in PlantCircNet database

3.3 Dinucleotide Analysis

Another analysis that we performed on the circRNAs of the three databases was the calculation of dinucleotide frequencies for A. thaliana, G. max, O. sativa, and Z. mays (Figs. 4, 5 and 6). The frequencies of the AA, AT, TA, and TT dinucleotides were higher for G. max (green bar) in all databases. For A. thaliana (pink bar), the AA and TT frequencies are also high. For Z. mays (purple bar), the frequencies of all dinucleotides are more uniform than in the other species. In all databases, the CG dinucleotide shows the lowest frequency in all species, and it is lowest for G. max. The frequency of the CC dinucleotide is similar for A. thaliana and G. max in all databases. Finally, the frequency of the AC dinucleotide is almost the same for all species in all databases. The dinucleotide profile could be used to train a model to identify or classify novel circRNAs in these or other species.
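A dinucleotide profile of this kind can be computed as follows. This is a minimal sketch under assumptions that the text does not specify: dinucleotides are counted in overlapping windows along the linear sequence (the back-splice junction is not wrapped around), counts are pooled over all sequences of a species, and pairs containing ambiguous bases such as N are ignored. The function name is hypothetical.

```python
from collections import Counter
from itertools import product

# The 16 dinucleotides over the DNA alphabet, in a fixed order.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]

def dinucleotide_frequencies(sequences):
    """Relative frequency of each dinucleotide, pooled over all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper().replace("U", "T")
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    # Only the 16 unambiguous dinucleotides contribute to the total.
    total = sum(counts[d] for d in DINUCLEOTIDES)
    if total == 0:
        return {d: 0.0 for d in DINUCLEOTIDES}
    return {d: counts[d] / total for d in DINUCLEOTIDES}

# Toy sequences: "AACG" gives AA, AC, CG; "ttnt" gives TT plus two pairs with N.
freqs = dinucleotide_frequencies(["AACG", "ttnt"])
print({d: f for d, f in freqs.items() if f > 0})
```

The resulting 16-value vector per species is the kind of feature profile that could be fed to a classifier, as suggested in the text.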

3.4 GC Content Analysis

As a final analysis, we investigated the GC content for A. thaliana, G. max, O. sativa, and Z. mays in all databases (Figs. 7, 8 and 9). The results found in all databases were quite similar. In A. thaliana (pink boxplot), the GC content seems to be more conserved than in the other species, since the interquartile range in the boxplot is smaller for this species. As A. thaliana is considered a model organism, this species may be better annotated in the databases, which may explain the smaller variation in GC content shown in the graph.
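The GC content of each sequence and the interquartile range used to compare species can be sketched as below. The sequences are toy examples, the function names are hypothetical, and the choice to compute GC content over unambiguous bases only is an assumption.

```python
import statistics

def gc_content(seq):
    """GC percentage of one sequence, computed over unambiguous bases only."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    return 100.0 * gc / (gc + at) if gc + at else 0.0

def interquartile_range(values):
    """Q3 - Q1, the height of the box in a boxplot."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Toy sequences, for illustration only.
toy_sequences = ["GGCC", "ATGC", "ATAT", "GGCA", "ATGA"]
gc_values = [gc_content(s) for s in toy_sequences]
print(gc_values, interquartile_range(gc_values))
```

A smaller interquartile range for one species than for another corresponds to the narrower box described for A. thaliana.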

Fig. 6 Frequency of dinucleotides from circRNAs of four species present in PlantcircBase database

Fig. 7 Boxplots of GC content of four species present in GreenCircRNA database

Fig. 8 Boxplots of GC content of four species present in PlantCircNet database

Fig. 9 Boxplots of GC content of four species present in PlantcircBase database

4 Identification of CircRNA–TE Associations

To better understand the association between circRNAs and TEs in plants, we performed some analyses using TE and circRNA data of A. thaliana obtained from publicly available databases. We selected A. thaliana because it is a model organism and is possibly better annotated in the databases than other species. In this section, we describe these analyses and their results, which may provide a clue about the lack of information on circRNA–TE associations.

4.1 Methodology

For this analysis, the TE coordinates of A. thaliana were obtained from the PlaNC-TE database, from the FTP file arabidopsis_thaliana_tes_records.gff3 (http://planc-te.cp.utfpr.edu.br/releases/38/genomes/arabidopsis_thaliana/). The coordinates of the circRNAs of the same species were downloaded from the PlantCircNet (ath.csv file), PlantcircBase (All entries file: v5_ath_circ_info_2020.txt.gz), and GreenCircRNA (Athaliana.csv file) databases. All circRNA files were converted to gff3 files using an in-house script. To identify the overlap between TEs and circRNAs, we used the intersect program from the BedTools toolset (version 2.18) [76].

4.2 Results and Discussion

We explored the circRNA–TE relationship based on three circRNA databases (PlantCircNet, PlantcircBase, and GreenCircRNA) and the PlaNC-TE database (Table 7, Fig. 10). We noticed a high number of LTR elements overlapping circRNAs. This is as expected, considering that plants contain large amounts of LTR elements in their genomes [77]. Another aspect is that the number of overlapping TEs exceeds the number of overlapping circRNAs for the PlantcircBase and GreenCircRNA databases, but not for PlantCircNet. One explanation could be the isoforms described in the PlantCircNet database. The "Other" category covers any other TE class (e.g., Helitron) or an unknown element. Finally, GreenCircRNA shows the fewest overlaps. Again, the reason could be that this database is dedicated to a specific topic (circRNAs acting as miRNA decoys).

Table 7 Distribution of circRNA and TE overlaps in the data used. A total of 43,442 TEs are described for A. thaliana in the PlaNC-TE database

circRNA database   Total number   Number of overlapping   Number of overlapping   SINE   LINE   DNA   LTR    Other
                   of circRNAs    circRNAs                TEs
PlantcircBase      30,311         3186                    5917                    84     147    619   4735   332
GreenCircRNA       10,707         842                     1069                    8      11     111   908    31
PlantCircNet       96,135         8506                    7367                    107    166    817   5880   397

Fig. 10 Distribution of TEs that overlap with circRNAs in the PlantcircBase, GreenCircRNA, and PlantCircNet databases

5 Conclusion

There is a large amount of information available on both transposable elements and circular RNAs, but no web database associates the two types of data, even though there is a relationship between them. A few of the databases include associations between transposable elements and miRNAs, but not with circRNAs. The same is true for tools: we did not find any tool dedicated to analyzing the TE–circRNA relationship. Currently, the only solution for an interested researcher is to download the information from databases of each category and discover the associations locally. This, in fact, opens a window of opportunity for the development of a database to close this gap. Currently, there are 39 organisms for which information on both circRNAs and transposable elements is available (https://github.com/lilianesntn/plant_circRNAs_TEs). This section should provide guidelines both to individual researchers interested in the topic and to database developers who will tackle the task of developing a more permanent solution.
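For readers who want to discover such associations locally, the overlap step can be sketched in plain Python. This is not the BedTools implementation and not the authors' in-house script. It is a simple illustration that assumes 1-based inclusive coordinates as in GFF3, ignores strand, counts any overlap of at least 1 bp, and uses hypothetical identifiers and toy intervals.

```python
from collections import Counter, defaultdict

def intersect(circrnas, tes):
    """
    circrnas: iterable of (chrom, start, end, circ_id)
    tes:      iterable of (chrom, start, end, te_id, te_class)
    Returns the set of overlapping circRNA ids, a dict of overlapping
    TE ids to TE class, and a per-class tally of overlapping TEs.
    """
    tes_by_chrom = defaultdict(list)
    for te in tes:
        tes_by_chrom[te[0]].append(te)

    hit_circs, hit_tes = set(), {}
    for chrom, c_start, c_end, circ_id in circrnas:
        for _, t_start, t_end, te_id, te_class in tes_by_chrom.get(chrom, []):
            # Two closed intervals overlap when each starts before the other ends.
            if c_start <= t_end and t_start <= c_end:
                hit_circs.add(circ_id)
                hit_tes[te_id] = te_class
    return hit_circs, hit_tes, Counter(hit_tes.values())

# Toy intervals (hypothetical), for illustration only.
toy_circs = [("Chr1", 100, 500, "circA"), ("Chr1", 1000, 1200, "circB"),
             ("Chr2", 50, 80, "circC")]
toy_tes = [("Chr1", 450, 600, "te1", "LTR"), ("Chr1", 480, 490, "te2", "DNA"),
           ("Chr1", 700, 900, "te3", "LTR"), ("Chr2", 81, 100, "te4", "SINE")]
circ_hits, te_hits, class_counts = intersect(toy_circs, toy_tes)
print(sorted(circ_hits), dict(class_counts))
```

Because one circRNA can overlap several TEs and vice versa, the number of overlapping TEs may be larger or smaller than the number of overlapping circRNAs. The nested loop is quadratic per chromosome; for genome-scale files, a sorted sweep or a dedicated tool would be preferable.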

6 Notes

1. This number of plant circRNAs was obtained from a search in the publicly available databases.

2. The number of TEs in plants is much higher than that of circRNAs, especially since these elements were identified earlier in plants.

References

1. Miller WJ, Capy P (2004) Mobile genetic elements as natural tools for genome evolution. Mobile Genetic Elements 260:001–020. https://doi.org/10.1385/1-59259-755-6:001
2. Makałowski W, Gotea V, Pande A et al (2019) Transposable elements: classification, identification, and their use as a tool for comparative genomics. Methods Mol Biol 1910:177–207
3. Wicker T, Sabot F, Hua-Van A et al (2007) A unified classification system for eukaryotic transposable elements. Nat Rev Genet 8:973–982
4. Jurka J, Kapitonov VV, Kohany O et al (2007) Repetitive sequences in complex genomes: structure and evolution. Annu Rev Genomics Hum Genet 8:241–259. https://doi.org/10.1146/annurev.genom.8.080706.092416
5. Bourque G, Burns KH, Gehring M et al (2018) Ten things you should know about transposable elements. Genome Biol 19:199
6. Hadjiargyrou M, Delihas N (2013) The intertwining of transposable elements and non-coding RNAs. Int J Mol Sci 14:13307–13328
7. Chen L, Zhang P, Fan Y et al (2018) Circular RNAs mediated by transposons are associated with transcriptomic and phenotypic variation in maize. New Phytol:1292–1306. https://doi.org/10.1111/nph.14901
8. Hou J, Lu D, Mason AS et al (2019) Non-coding RNAs and transposable elements in plant genomes: emergence, regulatory mechanisms and roles in plant development and stress responses. Planta 250:23–40
9. Salzman J, Gawad C, Wang PL et al (2012) Circular RNAs are the predominant transcript isoform from hundreds of human genes in diverse cell types. PLoS One 7:e30733
10. Jeck WR, Sorrentino JA, Wang K et al (2013) Circular RNAs are abundant, conserved, and associated with ALU repeats. RNA 19:141–157
11. Fan X, Zhang X, Wu X et al (2015) Single-cell RNA-seq transcriptome analysis of linear and circular RNAs in mouse preimplantation embryos. Genome Biol 16:148
12. Lai X, Bazin J, Webb S et al (2018) CircRNAs in plants. Adv Exp Med Biol 1087:329–343
13. Zhao W, Chu S, Jiao Y (2019) Present scenario of circular RNAs (circRNAs) in plants. Front Plant Sci 10:379
14. Robic A, Kühn C (2020) Beyond Back splicing, a still poorly explored world: non-canonical circular RNAs. Genes 11. https://doi.org/10.3390/genes11091111
15. Ye C, Chen L, Liu C et al (2015) Widespread noncoding circular RNAs in plants. New Phytol 208:88–95
16. Lu T, Cui L, Zhou Y et al (2015) Transcriptome-wide investigation of circular RNAs in rice. RNA 21:2076–2087
17. Pedro DLF, Lorenzetti APR, Domingues DS et al (2018) PlaNC-TE: a comprehensive knowledgebase of non-coding RNAs and transposable elements in plants. Database 2018:1–7
18. Chen L, Yu Y, Zhang X et al (2016) PcircRNA_finder: a software for circRNA prediction in plants. Bioinformatics 32:3528–3529
19. Memczak S, Jens M, Elefsinioti A et al (2013) Circular RNAs are a large class of animal RNAs with regulatory potency. Nature 495:333–338
20. Zhang X, Wang H, Zhang Y et al (2014) Complementary sequence-mediated exon circularization. Cell 159:134–147
21. Sun P, Li G (2019) CircCode: a powerful tool for identifying circRNA coding ability. Front Genet 10:981
22. Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. https://doi.org/10.1093/bioinformatics/btu170
23. Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359
24. Dobin A, Davis CA, Schlesinger F et al (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29:15–21
25. Ito EA, Katahira I, da Rocha Vicente FF et al (2018) BASiNET-BiologicAl Sequences NETwork: a case study on coding and non-coding RNAs identification. Nucleic Acids Res 46:e96
26. Sun P, Wang H, Li G (2020) Rcirc: an R package for circRNA analyses and visualization. Front Genet 11:548
27. Gao Y, Zhang J, Zhao F (2018) Circular RNA identification based on multiple seed matching. Brief Bioinform 19:803–810
28. Langmead B, Trapnell C, Pop M, Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25
29. Gu Z, Gu L, Eils R et al (2014) Circlize implements and enhances circular visualization in R. Bioinformatics 30:2811–2812. https://doi.org/10.1093/bioinformatics/btu393
30. Wilkinson L (2011) ggplot2: Elegant Graphics for Data Analysis by WICKHAM, H. Biometrics 67:678–679. https://doi.org/10.1111/j.1541-0420.2011.01616.x
31. Chu Q, Zhang X, Zhu X et al (2017) PlantcircBase: a database for plant circular RNAs. Mol Plant 10:1126–1128
32. Gao Y, Wang J, Zhao F (2015) CIRI: an efficient and unbiased algorithm for de novo circular RNA identification. Genome Biol 16:4. https://doi.org/10.1186/s13059-014-0571-3
33. Zhang X-O, Dong R, Zhang Y et al (2016) Diverse alternative back-splicing and alternative splicing landscape of circular RNAs. Genome Res 26:1277–1287
34. Ye J, Wang L, Li S, Zhang Q et al (2019) AtCircDB: a tissue-specific database for Arabidopsis circular RNAs. Brief Bioinform 20:58–65
35. Wang K, Wang C, Guo B et al (2019) CropCircDB: a comprehensive circular RNA resource for crops in response to abiotic stress. Database 2019:baz053. https://doi.org/10.1093/database/baz053
36. Zhang P, Meng X, Chen H et al (2017) PlantCircNet: a database for plant circRNA-miRNA-mRNA regulatory networks. Database 2017:bax089. https://doi.org/10.1093/database/bax089
37. Zhang J, Hao Z, Yin S et al (2020) GreenCircRNA: a database for plant circRNAs that act as miRNA decoys. Database 2020. https://doi.org/10.1093/database/baaa039
38. Ghosal S, Das S, Sen R et al (2013) Circ2Traits: a comprehensive database for circular RNA potentially associated with disease and traits. Front Genet 4:283
39. Meng X, Hu D, Zhang P et al (2019) CircFunBase: a database for functional circular RNAs. Database 2019:baz003. https://doi.org/10.1093/database/baz003
40. Paschoal AR, Maracaja-Coutinho V, Setubal JC et al (2012) Non-coding transcription characterization and annotation: a guide and web resource for non-coding RNA databases. RNA Biol 9:274–282
41. Maracaja-Coutinho V, Paschoal AR, Caris-Maldonado JC et al (2019) Noncoding RNAs databases: current status and trends. Methods Mol Biol 1912:251–285
42. Sun X, Zuo F, Ru Y et al (2015) SplicingTypesAnno: annotating and quantifying alternative splicing events for RNA-Seq data. Comput Methods Prog Biomed 119:53–62
43. Pages H, Aboyoun P, Gentleman R et al (2016) Biostrings: string objects representing biological sequences, and matching algorithms. R package version 2:10–18129
44. Lawrence M, Huber W, Pagès H et al (2013) Software for computing and annotating genomic ranges. PLoS Comput Biol 9:e1003118
45. Wang K, Singh D, Zeng Z et al (2010) MapSplice: accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Res 38:e178
46. Fu X, Liu R (2014) CircRNAFinder: a tool for identifying circular RNAs using RNA-Seq data. Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB. Available: https://www.researchgate.net/profile/Xing_Fu3/publication/280068964_Circrnafinder_A_tool_for_identifying_circular_RNAs_using_RNA-Seq_data/links/55a65a5b08aebe1d24699e14
47. Song X, Zhang N, Han P et al (2016) Circular RNA profile in gliomas revealed by identification tool UROBORUS. Nucleic Acids Res 44:e87
48. Abrusán G, Grundmann N, DeMester L et al (2009) TEclass--a tool for automated classification of unknown eukaryotic transposable elements. Bioinformatics 25:1329–1330
49. Hoede C, Arnoux S, Moisset M et al (2014) PASTEC: an automatic transposable element classification tool. PLoS One 9:e91929
50. Feschotte C, Keswani U, Ranganathan N et al (2009) Exploring repetitive DNA landscapes using REPCLASS, a tool that automates the classification of transposable elements in eukaryotic genomes. Genome Biol Evol 1:205–220
51. da Cruz MHP, Domingues DS, Saito PTM et al (2020) TERL: classification of transposable elements by convolutional neural networks. Brief Bioinform. https://doi.org/10.1093/bib/bbaa185
52. Eddy SR (1998) Profile hidden Markov models. Bioinformatics 14:755–763
53. Goodfellow I, Bengio Y, Courville A (2016) Deep Learning. MIT Press, Cambridge, Massachusetts
54. Zou J, Huss M, Abid A et al (2019) A primer on deep learning in genomics. Nat Genet 51:12–18
55. Min S, Lee B, Yoon S (2016) Deep learning in bioinformatics. Briefings in Bioinformatics: bbw068. https://doi.org/10.1093/bib/bbw068
56. Kelley DR, Snoek J, Rinn JL (2016) Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res 26:990–999
57. Quang D, Xie X (2016) DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences. Nucleic Acids Res 44:e107
58. Zeng H, Edwards MD, Liu G et al (2016) Convolutional neural network architectures for predicting DNA-protein binding. Bioinformatics 32:i121–i127
59. Nakano FK, Mastelini SM, Barbon S et al (2018) Improving Hierarchical Classification of Transposable Elements using Deep Neural Networks. 2018 International Joint Conference on Neural Networks (IJCNN). https://doi.org/10.1109/ijcnn.2018.8489461
60. Spannagl M, Nussbaumer T, Bader KC et al (2016) PGSB PlantsDB: updates to the database framework for comparative plant genome research. Nucleic Acids Res 44:D1141–D1147
61. Bao W, Kojima KK, Kohany O (2015) Repbase update, a database of repetitive elements in eukaryotic genomes. Mob DNA 6:11
62. Ouyang S, Buell CR (2004) The TIGR plant repeat databases: a collective resource for the identification of repetitive sequences in plants. Nucleic Acids Res 32:D360–D363
63. Li S-F, Zhang G-J, Zhang X-J et al (2016) DPTEdb, an integrative database of transposable elements in dioecious plants. Database 2016:baw078. https://doi.org/10.1093/database/baw078
64. Llorens C, Futami R, Covelli L et al (2011) The gypsy database (GyDB) of mobile genetic elements: release 2.0. Nucleic Acids Res 39:D70–D74
65. Bousios A, Minga E, Kalitsou N et al (2012) MASiVEdb: the Sirevirus plant retrotransposon database. BMC Genomics 13:158
66. Lorenzetti APR, de Antonio GYA, Paschoal AR, Domingues DS (2016) PlanTE-MIR DB: a database for transposable element-related microRNAs in plant genomes. Funct Integr Genomics 16:235–242
67. Chen J, Hu Q, Zhang Y et al (2014) P-MITE: a database for plant miniature inverted-repeat transposable elements. Nucleic Acids Res 42:D1176–D1181
68. Amselem J, Cornut G, Choisne N et al (2019) RepetDB: a unified resource for transposable element references. Mob DNA 10:6
69. Vassetzky NS, Kramerov DA (2013) SINEBase: a database and tool for SINE analysis. Nucleic Acids Res 41:D83–D89
70. Du J, Grant D, Tian Z et al (2010) SoyTEdb: a comprehensive database of transposable elements in the soybean genome. BMC Genomics 11:113
71. Ma B, Li T, Xiang Z, He N (2015) MnTEdb, a collective resource for mulberry transposable elements. Database 2015. https://doi.org/10.1093/database/bav004
72. Pearson WR, Lipman DJ (1988) Improved tools for biological sequence comparison. Proc Natl Acad Sci U S A 85:2444–2448. https://doi.org/10.1073/pnas.85.8.2444
73. Oliveira LS, Amorim TS, Pedro DLF et al A practical guide on computational tools and databases for transposable elements in plants. Springer, New York
74. Ye C-Y, Zhang X, Chu Q et al (2017) Full-length sequence assembly reveals circular RNAs with diverse non-GT/AG splicing signals in rice. RNA Biol 3:1055–1063. https://doi.org/10.1080/15476286.2016.1245268
75. Zhang P, Li S, Chen M (2020) Characterization and function of circular RNAs in plants. Front Mol Biosci 7:91
76. Quinlan AR, Hall IM (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841–842
77. Galindo-González L, Mhiri C, Deyholos MK et al (2017) LTR-retrotransposons in plants: Engines of evolution. Gene 626:14–25. https://doi.org/10.1016/j.gene.2017.04.051

Chapter 10

Constructing CircRNA–miRNA–mRNA Regulatory Networks by Using GreenCircRNA Database

Jingjing Zhang, Ruiqi Liu, and Guanglin Li

Abstract

Circular RNA (circRNA) is a special class of endogenous RNA with a continuous loop structure, which plays important roles in metabolic processes. With the development of sequencing technology and of algorithms for circRNA detection, more and more circRNAs are being identified in plants. However, the function of most plant circRNAs remains unclear. To investigate the function of plant circRNAs, we first built a plant circRNA database, named GreenCircRNA. We then construct circRNA–miRNA–mRNA regulatory networks to infer the function of plant circRNAs. Here, taking Arabidopsis thaliana as an example, we first show the method for identification of circRNAs and construction of circRNA–miRNA–mRNA networks based on the GreenCircRNA database, and then perform a case study identifying circRNAs and inferring their function in A. thaliana. Our method can be easily extended and applied to other plants.

Key words Circular RNA, miRNA, Regulatory network

1 Introduction

CircRNA is a type of RNA with a covalently closed continuous loop structure; it is highly resistant to exonucleases and has a longer half-life [1]. With the development of high-throughput sequencing and computer technology, more and more circRNAs, which used to be overlooked, are receiving attention. Unlike that of linear RNA, the identification of circRNA is mainly based on its special splicing mode, named back-splicing. After sequence reads are aligned to the genome sequence, unmapped reads are used to detect circRNAs and mapped reads to detect linear RNAs (Fig. 1). Several algorithms to identify circRNAs have been developed, such as CIRI [2], CIRCexplorer [3], find_circ [4], PcircRNA_finder [5], and circPlant [6]. These tools usually use different features and algorithms to identify circRNAs and may produce different results, so using at least two methods to identify plant circRNAs is a good choice.

Luis María Vaschetto (ed.), Plant Circular RNAs: Methods and Protocols, Methods in Molecular Biology, vol. 2362, https://doi.org/10.1007/978-1-0716-1645-1_10, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021

Fig. 1 Prediction of circRNA

CircRNAs play important roles in a wide range of biological processes, such as regulating the expression of their host genes, acting as miRNA (microRNA, about 20–24 nucleotides in length) or RBP (RNA binding protein) sponges, and being translated [7–10]. Plant circRNAs also play important roles in development and in stress-specific biological processes [11–13]. However, the function of most plant circRNAs remains unclear. To investigate the function of plant circRNAs, we first built a plant circRNA database, named GreenCircRNA [14]. We then construct circRNA–miRNA–mRNA regulatory networks to infer the function of plant circRNAs. Here, taking Arabidopsis thaliana as an example, we first show the method for identification of circRNAs and construction of circRNA–miRNA–mRNA networks based on the GreenCircRNA database, and then perform a case study identifying circRNAs and inferring their function in A. thaliana. Our method can be easily extended and applied to other plants.

2 Materials

2.1 Hardware and Software Requirements

1. Personal computer with Linux system.
2. BWA (v0.7.17).
3. CIRI (v2.0.6).
4. CIRCexplorer (v2.3.3).
5. CD-HIT-EST (v4.6).
6. Cytoscape (v3.7.2).
7. GSTAr.pl (https://github.com/MikeAxtell/GSTAr).
8. Perl (v5.26.2).
9. Script target.pl and decoy.pl from our previous method [15].

3 Methods

3.1 Data Collection

1. circRNAs in the GreenCircRNA database are identified from transcriptome data. Taking SRR1004832 as an example, paired-end transcriptome data (accession number: SRR1004832) can be downloaded from the Sequence Read Archive (SRA): read_1.fq and read_2.fq.
2. Genome (A. thaliana Araport11) and annotation information (A. thaliana Araport11) are downloaded from Phytozome 12: genome.fa and gene.gtf.
3. miRNA sequences are downloaded from miRBase (http://www.mirbase.org/): miRNA.

3.2 Identification of CircRNA

Usually, it is recommended that two or three algorithms be used to identify circRNAs at the same time, especially for plant circRNAs.

Step 1: Establishment of genome index files. Before alignment, an FM-index must be built for the genome file. This step generates five files: genome.fa.amb, genome.fa.ann, genome.fa.bwt, genome.fa.pac, and genome.fa.sa (see Note 1).

$ bwa index -a bwtsw genome.fa

Step 2: Aligning the reads to the reference genome. This step aligns short sequencing reads against the reference sequence using BWA. BWA contains three alignment algorithms: backtrack, sw, and mem. Here mem is chosen for the alignment.

$ bwa mem -T 19 -t 8 genome.fa read_1.fq read_2.fq > align.sam

Step 3: Identification of circRNA from sequencing data. Two algorithms are used to identify circRNAs from the alignment results (align.sam) of the previous step. CIRI2 can identify circRNAs directly, and -A is an optional parameter. CIRCexplorer2 needs two steps to identify circRNAs, and the output files CIRCexplorer2_parse.log and CIRCexplorer2_annotate.log record the details of the identification (see Notes 2 and 3).

① CIRI2

$ perl CIRI2.pl -I align.sam -O CIRI_result.txt -F genome.fa -A gene.gtf -T thread

② CIRCexplorer2

$ CIRCexplorer2 parse -t BWA align.sam > CIRCexplorer2_parse.log
$ CIRCexplorer2 annotate -r ref.txt -g genome.fa -b back_spliced_junction.bed -o CIRCexplorer_result.txt > CIRCexplorer2_annotate.log
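Because the two tools may report different circRNAs, it can be useful to check how far their calls agree. The sketch below is only an illustration: it assumes that each tool's output has already been reduced to (chromosome, start, end) tuples in the same coordinate convention. Parsing of the CIRI2 and CIRCexplorer2 result files is deliberately not shown, since no column layout is assumed here, and the function name and toy coordinates are hypothetical.

```python
def compare_calls(calls_a, calls_b):
    """Compare two collections of (chrom, start, end) back-splice calls."""
    a, b = set(calls_a), set(calls_b)
    shared = a & b
    return {
        "shared": shared,
        "only_a": a - b,
        "only_b": b - a,
        # Share of each tool's calls that the other tool also reports.
        "pct_of_a": 100.0 * len(shared) / len(a) if a else 0.0,
        "pct_of_b": 100.0 * len(shared) / len(b) if b else 0.0,
    }

# Toy calls (hypothetical coordinates), for illustration only.
tool_a = [("Chr1", 100, 500), ("Chr1", 900, 1500), ("Chr2", 10, 300), ("Chr3", 5, 95)]
tool_b = [("Chr1", 100, 500), ("Chr2", 10, 300), ("Chr4", 40, 400),
          ("Chr4", 700, 990), ("Chr5", 1, 250)]
result = compare_calls(tool_a, tool_b)
print(len(result["shared"]), result["pct_of_a"], result["pct_of_b"])
```

Exact coordinate matching is a strict criterion; tools that differ by one base in their start convention would need to be normalized first.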

3.3 Removal of Redundant CircRNA

CD-HIT-EST is used to merge the identified circRNAs. The input file (merge_result) is a FASTA file that includes all circRNAs and their sequences, and the final set of circRNAs is saved in the output file (output).

$ cd-hit-est -i merge_result -o output -AL 10 -AS 10 -c 0.95

3.4 Construction of CircRNA–miRNA–mRNA Regulatory Networks

Step 1: The relationship between miRNA and circRNA/mRNA is established by using GSTAr, an alignment tool. The alignment result between miRNA and circRNA is saved in the file circ-miRNA.txt, and the result between miRNA and mRNA in m-miRNA.txt.

$ perl GSTAr.pl miRNA circRNA > circ-miRNA.txt
$ perl GSTAr.pl miRNA mRNA > m-miRNA.txt

Step 2: Prediction of circRNA acting as miRNA decoy using decoy.pl. This method defines the decoys of miRNA as follows: no more than six mismatched or inserted bases are present between the 9th and 20th nucleotides from the miRNA 5′ end, the 2nd to 8th bases from the miRNA 5′ end must be matched perfectly, and there are no more than four mismatches or indels in other regions. The result for circRNA acting as miRNA decoy is saved in the file circ-mi-decoy.txt.

$ perl decoy.pl circ-miRNA.txt circ-mi-decoy.txt

Step 3: Prediction of mRNA acting as miRNA target using target.pl. This method defines the targets of miRNA as follows: at most one mismatch or indel is allowed between the 9th and 12th positions from the 5′ end of the miRNA sequence, the total number of bulges or mismatches in the other regions is not allowed to exceed 4 nt, and no continuous mismatches are allowed. The result for mRNA acting as miRNA target is saved in the file m-mi-target.txt.

$ perl target.pl m-miRNA.txt m-mi-target.txt

Step 4: Construction of circRNA–miRNA–mRNA regulatory networks. The previous prediction results (circ-mi-decoy.txt and m-mi-target.txt) are screened by an identifier (see Notes 4 and 5), and then circRNA–miRNA–mRNA regulatory networks are constructed and visualized with Cytoscape.

$ grep 'identifier' circ-mi-decoy.txt | cut -f 1,2 > decoy.txt
$ grep 'identifier' m-mi-target.txt | cut -f 1,2 > target.txt
$ cat decoy.txt target.txt > network_result
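The decoy and target definitions in Steps 2 and 3 can be restated as two small predicates. This is a sketch of the rules as worded above, not the code of decoy.pl or target.pl. It assumes a hypothetical encoding in which each alignment is a string with one character per miRNA position, read from the 5′ end: `|` for a match, `x` for a mismatch, and `-` for an indel or bulge. It also assumes that "other regions" means every position outside the windows named in each rule, and that "continuous mismatches" means two adjacent `x` characters anywhere in the alignment.

```python
def count_non_matches(pairing, first, last):
    """Count non-match characters at 1-based miRNA positions first..last."""
    return sum(1 for c in pairing[first - 1:last] if c != "|")

def is_decoy(pairing):
    """Decoy rule: perfect 2-8, <= 6 non-matches in 9-20, <= 4 elsewhere."""
    seed_ok = count_non_matches(pairing, 2, 8) == 0
    central_ok = count_non_matches(pairing, 9, 20) <= 6
    rest = (count_non_matches(pairing, 1, 1)
            + count_non_matches(pairing, 21, len(pairing)))
    return seed_ok and central_ok and rest <= 4

def is_target(pairing):
    """Target rule: <= 1 non-match in 9-12, <= 4 elsewhere, no adjacent mismatches."""
    central_ok = count_non_matches(pairing, 9, 12) <= 1
    rest = (count_non_matches(pairing, 1, 8)
            + count_non_matches(pairing, 13, len(pairing)))
    return central_ok and rest <= 4 and "xx" not in pairing

# Toy alignments for a 21-nt miRNA (hypothetical).
perfect = "|" * 21
central_bulge = "|" * 8 + "x-x" + "|" * 10   # non-matches at positions 9-11
seed_mismatch = "|x" + "|" * 19              # mismatch at position 2
print(is_decoy(central_bulge), is_target(central_bulge))
```

The contrast between the two rules is the central region: a decoy tolerates disruption around positions 9 to 20, whereas a target must pair almost perfectly at positions 9 to 12.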

3.5 A Case Study

1. Identification of circRNA. Identification of circRNAs in A. thaliana is performed with CIRI2 and CIRCexplorer2 at the same time. We obtain 1385 circRNAs with CIRI2 and 7380 circRNAs with CIRCexplorer2. A total of 1260 circRNAs are identified by both algorithms (Fig. 2a), accounting for 90.97% of the CIRI2 results and 17.07% of the CIRCexplorer2 results. Consistent with previous studies, the length of most circRNAs ranges from 300 to 800 bp (Fig. 2b). Moreover, most circRNAs are supported by few junction reads and are derived from exonic regions of the genome.

2. Functional analysis of circRNA based on circRNA–miRNA–mRNA networks. To investigate the function of circRNAs in A. thaliana, the circRNA–miRNA–mRNA networks are first constructed (Fig. 3). In the network, circRNAs are considered miRNA decoys and mRNAs are considered miRNA targets. In our result, the circRNA regulatory network consists of 140 circRNAs, 83 miRNAs, and 2798 mRNAs. According to the ceRNA hypothesis, a circRNA acting as a miRNA decoy can regulate mRNA levels when the mRNA is targeted by the same miRNA. We can therefore infer circRNA function indirectly from the mRNAs, in a miRNA-mediated manner. By GO enrichment analysis of these mRNAs, we find that the circRNAs may function in response to stimulus (GO:0050896), metabolic process (GO:0008152), catalytic activity (GO:0003824), and protein-containing complex (GO:0032991).

Fig. 2 circRNAs are identified and characterized. (a) circRNAs are identified by two algorithms (CIRI and CIRCexplorer). (b) The length distribution of circRNAs

Fig. 3 circRNA–miRNA–mRNA regulatory networks. Blue nodes: miRNA. Red nodes: circRNA that might be miRNA decoy. Green nodes: mRNA that might be miRNA target
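The miRNA-mediated inference described in the case study amounts to joining the decoy and target predictions on the shared miRNA. The sketch below illustrates that join under assumptions: the inputs are taken to be (miRNA, circRNA) and (miRNA, mRNA) pairs, which is a hypothetical layout and not necessarily the column order of decoy.txt and target.txt, and all identifiers are invented toy names.

```python
from collections import defaultdict

def build_triplets(decoy_pairs, target_pairs):
    """Join (miRNA, circRNA) and (miRNA, mRNA) pairs on the shared miRNA."""
    circ_by_mirna, mrna_by_mirna = defaultdict(set), defaultdict(set)
    for mirna, circ in decoy_pairs:
        circ_by_mirna[mirna].add(circ)
    for mirna, mrna in target_pairs:
        mrna_by_mirna[mirna].add(mrna)
    # Only miRNAs with both a decoy and a target yield triplets.
    shared_mirnas = circ_by_mirna.keys() & mrna_by_mirna.keys()
    return sorted((circ, mirna, mrna)
                  for mirna in shared_mirnas
                  for circ in circ_by_mirna[mirna]
                  for mrna in mrna_by_mirna[mirna])

def mrnas_per_circrna(triplets):
    """mRNAs linked to each circRNA; these are the genes used for enrichment."""
    linked = defaultdict(set)
    for circ, _, mrna in triplets:
        linked[circ].add(mrna)
    return dict(linked)

# Toy predictions (hypothetical identifiers).
toy_decoys = [("miR-a", "circ1"), ("miR-b", "circ1"), ("miR-c", "circ2")]
toy_targets = [("miR-a", "gene1"), ("miR-a", "gene2"),
               ("miR-b", "gene2"), ("miR-d", "gene3")]
triplets = build_triplets(toy_decoys, toy_targets)
print(triplets)
print(mrnas_per_circrna(triplets))
```

In this toy example, circ2 drops out because its miRNA has no predicted target, and gene3 drops out because its miRNA has no predicted decoy.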

4 Notes

1. During establishment of the genome index files, the parameter “-a” specifies the BWT construction algorithm, and “bwtsw” applies to long genomes.

2. For parsing with CIRCexplorer, the program outputs a file in bed format (back_spliced_junction.bed) for the next step.

3. For annotation with CIRCexplorer, the parameter “-r” specifies the gene annotation; this file should be in the format of Gene Predictions and RefSeq Genes with Gene Names (ref.txt), not gff3 or gtf format.

4. circRNA, miRNA, and mRNA files should be in fasta format when building the regulatory network.

5. The identifier in the screening command is a special string that distinguishes the lines containing miRNA IDs from the lines representing binding sequences. For example, “ath” is an identifier for miRNA IDs of A. thaliana.

References

1. Memczak S, Jens M, Elefsinioti A, Torti F, Krueger J, Rybak A et al (2013) Circular RNAs are a large class of animal RNAs with regulatory potency. Nature 495:333–338
2. Gao Y, Zhao F (2018) Computational strategies for exploring circular RNAs. Trends Genet 34:389–400
3. Gao Y, Wang J, Zhao F (2015) CIRI: an efficient and unbiased algorithm for de novo circular RNA identification. Genome Biol 16:4
4. Zhang XO, Dong R, Zhang Y, Zhang JL, Luo Z, Zhang J et al (2016) Diverse alternative back-splicing and alternative splicing landscape of circular RNAs. Genome Res 26:1277–1287
5. Chen L, Yu YY, Zhang XC, Liu C, Ye CY, Fan LJ (2016) PcircRNA_finder: a software for circRNA prediction in plants. Bioinformatics 32:3528–3529
6. Zhang P, Liu Y, Chen H, Meng X, Xue J, Chen K, Chen M (2020) CircPlant: an integrated tool for circRNA detection and functional prediction in plants. Genomics Proteomics Bioinformatics 18(3):352–358
7. Ashwal-Fluss R, Meyer M, Pamudurti NR, Ivanov A, Bartok O, Hanan M et al (2014) circRNA biogenesis competes with pre-mRNA splicing. Mol Cell 56:55–66
8. Hansen TB, Jensen TI, Clausen BH, Bramsen JB, Finsen B, Damgaard CK et al (2013) Natural RNA circles function as efficient microRNA sponges. Nature 495:384–388
9. Conn SJ, Pillman KA, Toubia J, Conn VM, Salmanidis M, Phillips CA et al (2015) The RNA binding protein quaking regulates formation of circRNAs. Cell 160:1125–1134
10. Sun P, Li G (2019) CircCode: a powerful tool for identifying circRNA coding ability. Front Genet 10:981
11. Zhao W, Chu S, Jiao Y (2019) Present scenario of circular RNAs (circRNAs) in plants. Front Plant Sci 10:379
12. Wu Z, Huang W, Qin E, Liu S, Liu H, Grennan AK et al (2020) Comprehensive identification and expression profiling of circular RNAs during nodule development in Phaseolus vulgaris. Front Plant Sci 11:587185
13. Zhang J, Liu R, Zhu Y, Gong J, Yin S, Sun P et al (2020) Identification and characterization of circRNAs responsive to methyl jasmonate in Arabidopsis thaliana. Int J Mol Sci 21:792
14. Zhang J, Hao Z, Yin S, Li G (2020) GreenCircRNA: a database for plant circRNAs that act as miRNA decoys. Database 2020:baaa039
15. Li G, Hao Z, Fan C, Wu X (2017) Genome-wide function analysis of lincRNAs as miRNA targets or decoys in plant. Plant Epigenetics:149–162

Chapter 11

Methods for Predicting CircRNA–miRNA–mRNA Regulatory Networks: GreenCircRNA and PlantCircNet Databases as Study Cases

Nureyev F. Rodrigues and Rogerio Margis

Abstract

Circular RNAs are molecules formed by 3′–5′ ligation in a splicing reaction, the so-called backsplicing. Although circRNAs are well described in other groups, especially in humans, studies that include their prediction and validation in plants are recent. It has already been shown that circRNAs can interact with microRNAs, acting as sponges and adding a new layer of complexity in regulating eukaryotic transcription. Here, we cover two up-to-date databases that allow users to analyze circRNA–miRNA–mRNA interactions in plants. We chose two databases to demonstrate their functions and compare their approaches in order to obtain a more robust and reliable interaction network.

Key words Circular RNA, miRNA sponge, In silico prediction, Interaction network, Databases

1 Introduction

RNA has a diverse and complex role in eukaryotic life. These features are reflected in the breadth of RNA classes, broadly divided into coding and noncoding RNAs (ncRNAs), which encompass different RNA types and relationships. While coding RNA (messenger RNA, mRNA) carries the information to produce proteins, ncRNAs act, in general, to regulate other RNAs. Among ncRNAs, microRNAs (miRNAs), a subclass of small noncoding RNAs of approximately 19–24 nucleotides, act as posttranscriptional mRNA regulators through the RNA Induced Silencing Complex (RISC) [1]. The interaction between the miRNA and the target mRNA at the active site of the Argonaute protein (AGO) leads to repression of gene expression either by mRNA cleavage or by translational repression [2, 3]. In plants, posttranscriptional regulation of gene expression

Luis María Vaschetto (ed.), Plant Circular RNAs: Methods and Protocols, Methods in Molecular Biology, vol. 2362, https://doi.org/10.1007/978-1-0716-1645-1_11, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2021


by miRNAs plays critical roles in many developmental processes and in abiotic and biotic stress responses [4–6]. Described in both animals and plants, competing endogenous RNAs (ceRNAs) are transcripts of coding or noncoding genes that carry miRNA response elements (MREs). MREs are sequences partially complementary to a miRNA; transcripts carrying them can compete with mRNA targets for miRNA binding and act as miRNA sponges [7, 8]. Some ceRNA transcripts, such as human ciRS-7 (a sponge for miR-7), were described as circular RNAs (circRNAs), increasing interest in how these molecules can contribute to fine-tuning gene expression across eukaryotes [8]. Unlike linear ceRNAs, circRNAs are abundant, highly stable, and contain more binding sites for miRNAs [9]. Circular RNAs are formed by 3′–5′ ligation in a splicing reaction (the so-called backsplicing) of a single RNA molecule. Because of this origin, circRNA biogenesis is a feature of eukaryotic gene expression and has been reported in some species of fungi, protists, plants, and metazoans [10]. Although well known in other groups, plant circRNAs from model species were only recently demonstrated and validated [11, 12]. With the advance of molecular techniques (such as linear RNA degradation and ribosomal RNA depletion) and the development of high-throughput sequencing technology, circRNA identification has become easy and accessible. In addition, in silico tools for circRNAs have been developed or have had their algorithms improved in pursuit of sensitive and accurate prediction. Prediction programs such as CirComPara [13] use several algorithms to predict circRNAs and allow the user to choose those shared among the algorithms. To date, these kinds of tools run only on local machines; online tools are not available for this purpose. At the validation level, amplification using divergent primer PCR followed by Sanger sequencing has performed well and produced consistent results [14].

Thus, using these tools and approaches, circRNAs have been identified in several plant species. In addition, the development of in silico approaches for the analysis of other RNA types, such as miRNAs or long noncoding RNAs (lncRNAs), has allowed these RNAs to be integrated and the relationships between them to be predicted. Several programs have been used to predict miRNA binding sites in circRNAs and to estimate the interactions between circRNAs, miRNAs, mRNAs (miRNA targets), and mRNAs from circRNA parental genes [15–17]. For instance, by using these bioinformatics tools and molecular approaches, our group demonstrated the existence of a ternary complex formed by AGO:miRNA:circRNA, using AGO-immunoprecipitated libraries of flowers from Arabidopsis thaliana [18]. In this way, some interaction networks can be predicted and evaluated, bringing to light new targets for gene expression regulation.


Table 1 List of plant circRNA databases

Database (a)    # circRNAs   # species   Last update (b)   Short description
CircFunBase     1158         7           2019              Manually curated circRNA database, including experimentally validated and computationally predicted functions
CropCircDB      101,833      2           2018              Database for circRNAs in maize and rice response to abiotic stress
GreenCircRNA    213,494      69          2019              Database with circRNA information obtained from over 4100 RNA-seq public datasets across 69 different species
PlantcircBase   121,971      19          2020              Database of collected publicly available information of plant circRNAs
PlantCircNet    96,135       8           2017              Database of plant circRNA–miRNA–gene regulatory networks

(a) Databases that were accessible online. Other reported databases are not listed because they were inaccessible
(b) Information obtained from the database itself, not related to publication date

Given the increase in studies on circRNAs and, consequently, in the availability of data, plant circular RNA databases have been released to store, organize, and integrate different kinds of circRNA-related information. To date, several plant circRNA databases offering predictions of circRNA–miRNA–mRNA interaction networks are available (Table 1). Using the databases described here and their respective tools, we will show some predicted relationships in addition to the typical circRNA identification process.

2 Materials

The related databases provide a wealth of information about circRNAs, such as genome position, parental gene name, isoform name, strand, identification method, and circRNA classification (exonic, intronic, or intergenic). We highlight some of the differences among the databases below. To follow the in silico analyses in the next steps, we recommend using the Firefox or Chrome browser.

2.1 CircFunBase

Although it also covers nine animal species, CircFunBase [19] (http://bis.zju.edu.cn/CircFunBase/index.php) provides more than 1150 manually curated functional circRNA entries from seven plant species. Users can search the database by circRNA name, circRNA location, gene symbol, or keywords. In addition to the standard information, the search results present the detection tool, validation method, parental gene description and GO annotation, tissue, expression level of the circRNA, and circRNA–miRNA interaction networks. CircFunBase also provides a BLAST module for searching species-specific plant nucleotide databases and a submission interface for novel functional circRNAs that are not yet documented.

2.2 CropCircDB

Unlike CircFunBase, the other databases analyzed public RNA-seq samples using different approaches and provide the results within their respective databases. CropCircDB [20] (http://deepbiology.cn/crop/index.php/Home/Index/Index) is a database dedicated to circRNAs from crop species (currently maize and rice) and includes circRNAs found in ~250 stress-related samples (drought, salt, and cold). Searches can be performed by entering a circRNA name, circRNA ID (a unique identifier that includes chromosome, start, end, and strand information), gene name or gene isoform, stress, miRNA (a candidate that can interact with the circRNA), or tissue. An option for crop-specific searches is also available. Besides the basic information, CropCircDB’s search output reports two score values, “detection score” and “stress detection score,” which measure the robustness of circRNA detection in general and in the stressed samples. The search output also includes the stress related to the circRNA; the miRNAs predicted to interact with the circRNA; three web links (to the JBrowse view of the genome, the circRNA structure, and the respective expression profile); the tissue information; and three sequences (the full sequence, the spliced sequence, and an amino acid sequence derived from a spliced sequence that does not have a stop codon). The entire database can be downloaded in CSV format, either as a whole or separated by species. New validated circRNAs can be submitted to CropCircDB through the e-mail address given on the database page.

2.3 GreenCircRNA

GreenCircRNA [21] (http://greencirc.cn/) is the most comprehensive database, containing more than 200,000 circRNAs. As in CropCircDB, the GreenCircRNA information was obtained by identifying circRNAs in public RNA-seq datasets. Beyond the basic information, the database reports the junction read number and junction read IDs of each circRNA; the relative expression of the circRNA; the parental gene and its respective GO number (for circRNAs that are not intergenic); the SRA accession number (where the circRNA was identified); a visualization of the circRNA and parental gene; the full-length sequence and genomic sequence of the circRNA; and circRNA interaction network information. GreenCircRNA allows searching for circRNAs by host gene, miRNA ID, circRNA ID, and SRA ID. In addition, a subset search is available to search circRNAs by plant species, chromosome, and junction reads ratio. Besides the information mentioned above, the search output presents KOG, KEGG, EC, KO, and GO annotations; the name, symbol, and definition of the best BLAST hit in Arabidopsis and/or Oryza sativa; and the circRNA interaction network information, containing the miRNAs and/or genes that interact with the circRNA.

2.4 PlantcircBase

The Plant Circular RNA Database (PlantcircBase) [22] (http://ibi.zju.edu.cn/plantcircbase/index.php) stores publicly available backsplice junctions and the respective full-length sequences of circRNAs identified in plants by different research groups. Thus, unlike in CropCircDB and GreenCircRNA, the circRNAs may have been predicted by various bioinformatics tools, which are listed on the “Statistics” data page for each species. The same page also shows the number of circRNAs identified by multiple software tools, the number of circRNAs of each type, their length, and the distribution of circRNA splicing signals. PlantcircBase has eight modules: Statistics, Browse, Search, Visualize, BLASTcirc, Jbrowse, Network, and Download. The search module presents four options: keyword search (PlantcircBase ID, parental gene, or miRNA name), batch search (a list of keyword search entries), subset search (search by organism, chromosome, validation, or circRNA type), and BLAST search against circRNA sequences. In addition to the basic information, the search output presents further details, for example, genomic sequence, splice junction sequence, potential amino acid sequence, number of exons covered, supporting reads, and tissues. The miRNAs predicted to bind the circRNA (sponge miRNAs) and the circRNA–miRNA–mRNA network are also shown in the search output, the latter as a link to the Network module result. All data and information can be downloaded for each species as a tab-separated file from the Download page. New circRNAs can be submitted via the e-mail address available on the database page.

2.5 PlantCircNet

PlantCircNet [23] (http://bis.zju.edu.cn/plantcircnet/index.php) is the first database focused on plant circRNA–miRNA–gene regulatory networks. The database is composed of plant back-spliced junction sites reported in several studies and from publicly available RNA-seq samples collected from the Sequence Read Archive (SRA). To date, eight species are available in the database. PlantCircNet has three modules: Search, Browse, and BLAST. A download page allows the user to choose between circRNA data (gtf or csv format), sequence data, and interaction data, the latter separated into circRNA–miRNA and miRNA–mRNA files. The search module has four search options: a basic search (by gene ID or miRNA name), a sequence search (by a sequence in FASTA format), a GO search (by GO ID or GO term), and a name search (by circRNA ID or name). For all options, one particular species may be selected for the search. The search output comprises the standard basic information, unique reads (backsplicing reads), expression level, SRA information (such as the Bioproject and the respective samples used to find the circRNA), and a genome browser. In the “Name” field, a link to the circRNA–miRNA–mRNA network is available.
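The two interaction files offered on the PlantCircNet download page (circRNA–miRNA and miRNA–mRNA) can be combined into circRNA–miRNA–mRNA triples. The following Python sketch illustrates the idea only: the layout of the downloaded files is not described here, so the two-column pair lists below are an assumption, and `build_triples` is a hypothetical helper rather than part of any database tool. The identifiers come from the ath-miR161 example used in this chapter.

```python
from collections import defaultdict

# ASSUMPTION: each interaction file can be reduced to simple (source, target)
# pairs. The real PlantCircNet download layout may differ.
circ_mirna_pairs = [
    ("ATH_circ09854", "ath-miR161"),
    ("ATH_circ09855", "ath-miR161"),
]
mirna_mrna_pairs = [
    ("ath-miR161", "AT1G62590"),
    ("ath-miR161", "AT1G62910"),
]


def build_triples(circ_mirna, mirna_mrna):
    """Join circRNA-miRNA and miRNA-mRNA pairs on the shared miRNA name."""
    # Index the miRNA targets so each miRNA is looked up only once.
    targets_by_mirna = defaultdict(list)
    for mirna, gene in mirna_mrna:
        targets_by_mirna[mirna].append(gene)

    triples = []
    for circ, mirna in circ_mirna:
        for gene in targets_by_mirna.get(mirna, []):
            triples.append((circ, mirna, gene))
    return triples


triples = build_triples(circ_mirna_pairs, mirna_mrna_pairs)
for circ, mirna, gene in triples:
    print(circ, mirna, gene, sep="\t")
```

Because miRNA names may be written with different capitalization across resources (for example, ath-mir161 and ath-miR161), it may be necessary to normalize the names before joining.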

3 Methods

Because there are many options for predicting circRNA–miRNA–mRNA networks, we selected the GreenCircRNA and PlantCircNet databases to present below, with notes indicating differences and alternatives. The walkthrough is based on the network of one particular miRNA (see Notes 1 and 2) in order to present all database functions.

3.1 GreenCircRNA

1. Navigate to http://greencirc.cn/.
2. Select the Search module. On the Search page, choose one of the search options; in our case, type ath-mir161 in the single microRNA ID option and click “Submit”.
3. A result page will be displayed, with a table containing the columns circRNA ID, miRNA, and gene ID (Fig. 1). The circRNA ID column presents each circRNA as a link to its circRNA page. All genes predicted to be related to the microRNA are listed in the gene ID column (see Note 3). In this case, the predicted network contains ath-miR161, the circRNA Athal_Chr4:17281016|17282607, and 18 genes.
4. Click on the circRNA link. The browser will be redirected to the “CircRNA Information” page, where the circRNA, parental gene, and circRNA interaction network information are available. In addition to the basic circRNA information, this page shows the number of junction reads (2), the junction reads ratio (0.09), the junction read IDs, and the sample ID (SRR2079796) (Fig. 2a). The circRNA–miRNA–mRNA network is represented at the bottom of the page (Fig. 2b). The network can be downloaded as a separate file (Fig. 3). In addition, all predicted networks are available on the “Download” page, separated by species.
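The GreenCircRNA circRNA ID shown above (Athal_Chr4:17281016|17282607) appears to encode a species prefix, a chromosome, and two genomic coordinates. A minimal Python sketch for splitting such an ID is given below. The pattern is inferred from this single example and is an assumption, not a documented specification; `CircLocus` and `parse_greencirc_id` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple


class CircLocus(NamedTuple):
    """Genomic position extracted from a GreenCircRNA-style circRNA ID."""
    species: str
    chrom: str
    start: int
    end: int


# ASSUMPTION: IDs follow "<species>_<chromosome>:<start>|<end>", as inferred
# from the one ID shown in the text. Other IDs may not match this pattern.
_ID_PATTERN = re.compile(
    r"^(?P<species>[^_]+)_(?P<chrom>[^:]+):(?P<start>\d+)\|(?P<end>\d+)$"
)


def parse_greencirc_id(circ_id: str) -> CircLocus:
    """Split a circRNA ID into species prefix, chromosome, start, and end."""
    match = _ID_PATTERN.match(circ_id)
    if match is None:
        raise ValueError(f"Unrecognized circRNA ID format: {circ_id!r}")
    return CircLocus(
        species=match["species"],
        chrom=match["chrom"],
        start=int(match["start"]),
        end=int(match["end"]),
    )


locus = parse_greencirc_id("Athal_Chr4:17281016|17282607")
print(locus)
```

Extracting the coordinates in this way makes it possible to look up the same locus in another database by genomic position rather than by name.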

3.2 PlantCircNet

1. Navigate to http://bis.zju.edu.cn/plantcircnet/index.php.
2. Go to the Search module. In the MicroRNA Name field, type ath-mir161 and click “Search”.
3. The result page will display the circRNA–miRNA–mRNA network and, when available, a circRNA list (Fig. 4). The network visualizer lets users select network components (nodes) to move them or view details. A zoom tool allows the network to be viewed at different scales, making it a good option for very dense networks. Clicking on a node displays the circRNA–miRNA–mRNA interaction list specific to that node. Three layout options (force-directed, circle, and tree) and two export formats (png and jpg) are available.


Fig. 1 GreenCircRNA’s result page from the search for ath-miR161. The circRNA_ID field contains the link to the circRNA predicted to interact with the selected microRNA. The gene_id field was resized to improve visualization of the target list

Fig. 2 The circRNA information page from GreenCircRNA; (a) circRNA and parental gene information; a circRNA representation is available below circRNA parental gene information field. In our case, given the intergenic region origin of circRNA, no exon boundaries are represented. (b) circRNA interaction network information and scheme


Fig. 3 CircRNA interaction network scheme. In the center, in blue, the ath-miR161; in pink, the circRNA; the green dots are miRNA targets. The miRNA targets are not labeled

4. This result page covers all circRNAs predicted to interact with ath-miR161; to access the network of a specific circRNA, use the individual links provided in the circRNA list. Some circRNA information, such as type, chromosome, start and end position, strand, and unique reads, can be obtained from this list. Clicking on one of the circRNA links in the list opens a new page with detailed information, the expression level, and a genome browser for the selected circRNA. Among the data, the Name field presents a link to the specific network of this circRNA (Fig. 5).
5. Still on the result page, at the top right corner (Fig. 4), it is possible to perform a GO enrichment analysis of the miRNA targets. The link leads to a new page with the GO terms of the genes (see Note 4) (Fig. 6). Each GO term is linked to the AmiGO 2 database. A p-value, which depends on the number of genes mapped to the term, is assigned to estimate whether the term is significant.

3.3 Comparing Results

To demonstrate differences between the databases, we compared the results from both; they are summarized in Table 2. Three circRNAs were retrieved from PlantCircNet (ATH_circ09854, ATH_circ09855, and ATH_circ09856) and one from GreenCircRNA (Athal_Chr4:17281016|17282607). All circRNAs from PlantCircNet are from the same genomic region of chromosome 1 (Fig. 7 and Note 5), whereas the GreenCircRNA circRNA is from


Fig. 4 PlantCircNet’s result page from the search for the ath-miR161 network. At the top of the figure, the network in force-directed layout is composed of circRNAs (red), the miRNA (yellow), and miRNA targets (blue). Nodes can be rearranged by the user. The legend of node and edge types is presented on the right side and the zoom tool on the left side of the network. When exported, the legend is not available. At the upper left side of the page, the user can choose the network layout. At the upper right side of the page, the enrichment analysis link and an export option are available. At the bottom of the page, the circRNA list presents information and links to circRNA details

chromosome 4, and they are therefore different circRNAs. Regarding the miRNA targets, out of a total of 33 genes from both databases, eight miRNA target genes are shared between them: AT1G62590, AT1G62910, AT1G62914, AT1G62930, AT1G63070, AT1G63130, AT1G63330, and AT1G63400. Even among unique miRNA targets, PlantCircNet predicted more targets than


Fig. 5 PlantCircNet’s detailed information page for ATH_circ09854. At the top, the circRNA information, as well as a link to the circRNA network in the “Name” field. In the middle, the expression level information, containing Bioproject and Biosample links and information on the libraries used to predict the circRNA. At the bottom (not shown), a genome browser with the circRNA annotation is available (see Fig. 7)

GreenCircRNA (Table 2). Thus, collecting the predictions of diverse databases makes it possible to identify more robust interaction networks, decrease potential false positives, and choose more promising targets for validation.
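The comparison above amounts to simple set operations on the two target lists. The Python sketch below reproduces the counts reported in Table 2. Only the eight shared gene IDs are listed in the text, so the database-specific targets are represented by placeholder IDs, which are hypothetical and stand in for real gene identifiers; `compare_targets` is likewise a hypothetical helper.

```python
# The eight shared ath-miR161 target genes, as listed in the text.
SHARED = {
    "AT1G62590", "AT1G62910", "AT1G62914", "AT1G62930",
    "AT1G63070", "AT1G63130", "AT1G63330", "AT1G63400",
}


def placeholders(prefix, n):
    """Return n HYPOTHETICAL gene IDs standing in for targets not named in the text."""
    return {f"{prefix}_{i:02d}" for i in range(1, n + 1)}


# Table 2: GreenCircRNA has 18 targets (10 unique); PlantCircNet has 23 (15 unique).
greencirc_targets = SHARED | placeholders("GREENCIRC_ONLY", 10)
plantcircnet_targets = SHARED | placeholders("PLANTCIRCNET_ONLY", 15)


def compare_targets(a, b):
    """Summarize the overlap between two sets of predicted miRNA targets."""
    return {
        "shared": a & b,   # predicted by both databases
        "only_a": a - b,   # predicted only by the first database
        "only_b": b - a,   # predicted only by the second database
        "union": a | b,    # all distinct targets
    }


summary = compare_targets(greencirc_targets, plantcircnet_targets)
for name, genes in summary.items():
    print(name, len(genes))
```

With real target lists pasted in place of the placeholders, the "shared" set gives the candidates supported by both pipelines. The total of 33 genes follows from 18 + 23 - 8.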

4 Notes

1. Because circRNA nomenclature differs among prediction tools and databases, ath-miR161 was selected for the circRNA–miRNA–mRNA network analysis instead of a specific circRNA. If you want to look for the same circRNA in different databases, please use other search entries, such as sequence or position in the genome, instead of the circRNA name or circRNA ID.


Fig. 6 The PlantCircNet’s GO enrichment analysis result page from ath-miR161 targets

Table 2 Comparison of predicted ath-miR161 network information among evaluated databases

Database       # circRNAs   # unique circRNAs   # genes (a)   # unique genes (a)
GreenCircRNA   1            1                   18            10
PlantCircNet   3            3                   23            15

(a) Predicted miRNA targets

2. Some databases allow you to search using only part of the miRNA name, while others require the full name. The former can therefore be used to retrieve the networks for all predicted miRNAs, using the “ath-mir” query.
3. Each database uses a different pipeline to predict the circRNA–miRNA–mRNA network, so the results may differ between databases; comparing several databases and selecting the shared results can improve the prediction.
4. The GO enrichment analysis of PlantCircNet is based on the ontology from the GO Consortium (http://geneontology.org/) and the GO annotation from agriGO (http://bioinfo.cau.edu.cn/agriGO/index.php) [23, 24]. In addition to the information in the AmiGO 2 database, we recommend using the GO term to search the databases mentioned above and gather more information about the miRNA targets.


Fig. 7 Genome browser with the three circRNAs predicted to interact with ath-miR161. The upper “Region overview” panel shows a 40.04 kbp slice of chromosome 1 from Arabidopsis thaliana. The orange boxes are circRNA; the arrow shows the circRNA strand. The three circRNAs, ATH_circ09854, ATH_circ09855, and ATH_circ09856 are tagged in a, b, and c, respectively

5. Although they come from the same chromosomal region, these circRNAs are considered unique because the junction reads found for each of them indicate different backsplice sites.

References

1. Xie M, Zhang S, Yu B (2015) microRNA biogenesis, degradation and activity in plants. Cell Mol Life Sci 72:87–99
2. Pasquinelli AE (2012) MicroRNAs and their targets: recognition, regulation and an emerging reciprocal relationship. Nat Rev Genet 13:271–282
3. Lanet E, Delannoy E, Sormani R et al (2009) Biochemical evidence for translational repression by Arabidopsis MicroRNAs. Plant Cell 21:1762–1768
4. Sunkar R, Li Y-F, Jagadeeswaran G (2012) Functions of microRNAs in plant stress responses. Trends Plant Sci 17:196–203
5. Khraiwesh B, Zhu JK, Zhu J (2012) Role of miRNAs and siRNAs in biotic and abiotic stress responses of plants. Biochim Biophys Acta 1819:137–148
6. Chen X (2012) Small RNAs in development – insights from plants. Curr Opin Genet Dev 22:361–367
7. Franco-Zorrilla JM, Valli A, Todesco M et al (2007) Target mimicry provides a new mechanism for regulation of microRNA activity. Nat Genet 39:1033–1037
8. Hansen TB, Jensen TI, Clausen BH et al (2013) Natural RNA circles function as efficient microRNA sponges. Nature 495:384–388
9. Liu L, Wang J, Khanabdali R et al (2017) Circular RNAs: isolation, characterization and their potential role in diseases. RNA Biol 14:1715–1721
10. Wang PL, Bao Y, Yee MC et al (2014) Circular RNA is expressed across the eukaryotic tree of life. PLoS One 9:e90859
11. Ye CY, Chen L, Liu C et al (2015) Widespread non-coding circular RNAs in plants. New Phytol 208:88–95
12. Lu T, Cui L, Zhou Y et al (2015) Transcriptome-wide investigation of circular RNAs in rice. RNA 21:2076–2087
13. Gaffo E, Bonizzato A, te Kronnie G et al (2017) CirComPara: a multi-method comparative bioinformatics pipeline to detect and study circRNAs from RNA-seq data. Noncoding RNA 3:8
14. Jeck WR, Sharpless NE (2014) Detecting and characterizing circular RNAs. Nat Biotechnol 32:453–461
15. Zhao W, Cheng Y, Zhang C et al (2017) Genome-wide identification and characterization of circular RNAs by high throughput sequencing in soybean. Sci Rep 7:1–11
16. Yin J, Liu M, Ma D et al (2018) Identification of circular RNAs and their targets during tomato fruit ripening. Postharvest Biol Technol 136:90–98
17. Chen G, Cui J, Wang L et al (2017) Genome-wide identification of circular RNAs in Arabidopsis thaliana. Front Plant Sci 8:1–12
18. Frydrych Capelari É, da Fonseca GC, Guzman F et al (2019) Circular and micro RNAs from Arabidopsis thaliana flowers are simultaneously isolated from AGO-IP libraries. Plants 8:302
19. Meng X, Hu D, Zhang P et al (2019) CircFunBase: a database for functional circular RNAs. Database 2019:1–6
20. Wang K, Wang C, Guo B et al (2019) CropCircDB: a comprehensive circular RNA resource for crops in response to abiotic stress. Database 2019:1–7
21. Zhang J, Hao Z, Yin S et al (2020) GreenCircRNA: a database for plant circRNAs that act as miRNA decoys. Database (Oxford) 2020:1–9
22. Chu Q, Zhang X, Zhu X et al (2017) PlantcircBase: a database for plant circular RNAs. Mol Plant 10:1126–1128
23. Zhang P, Meng X, Chen H et al (2017) PlantCircNet: a database for plant circRNA-miRNA-mRNA regulatory networks. Database (Oxford) 2017:1–8
24. Tian T, Liu Y, Yan H et al (2017) AgriGO v2.0: a GO analysis toolkit for the agricultural community, 2017 update. Nucleic Acids Res 45:W122–W129
