English Pages 311 [300] Year 2022
Methods in Molecular Biology 2379
Matias D. Zurbriggen Editor
Plant Synthetic Biology Methods and Protocols
METHODS IN MOLECULAR BIOLOGY
Series Editor John M. Walker School of Life and Medical Sciences University of Hertfordshire Hatfield, Hertfordshire, UK
For further volumes: http://www.springer.com/series/7651
For over 35 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in readily-reproducible step-by-step fashion, opening with an introductory overview, a list of the materials and reagents needed to complete the experiment, and followed by a detailed procedure that is supported with a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice. These hallmark features were introduced by series editor Dr. John Walker and constitute the key ingredient in each and every volume of the Methods in Molecular Biology series. Tested and trusted, comprehensive and reliable, all protocols from the series are indexed in PubMed.
Plant Synthetic Biology Edited by
Matias D. Zurbriggen Institute of Synthetic Biology and CEPLAS, University of Düsseldorf, Düsseldorf, Germany
Editor Matias D. Zurbriggen Institute of Synthetic Biology and CEPLAS University of Düsseldorf Düsseldorf, Germany
ISSN 1064-3745 ISSN 1940-6029 (electronic) Methods in Molecular Biology ISBN 978-1-0716-1790-8 ISBN 978-1-0716-1791-5 (eBook) https://doi.org/10.1007/978-1-0716-1791-5 © Springer Science+Business Media, LLC, part of Springer Nature 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature. The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.
Preface
Synthetic biology is a fast developing transdisciplinary field bridging engineering and life sciences. It provides novel theoretical-experimental strategies for the development of tools and techniques to study, control, and engineer biological processes with improved and new functionalities. These disrupting approaches are revolutionizing fundamental and translational research in prokaryotic and eukaryotic systems including yeasts and animals, but have not yet fully taken root in the plant field. In this book, we aim to share a thorough overview on currently developed molecular tools, techniques, and approaches to facilitate the broad implementation of synthetic biology strategies by the plant community in the hopes to advance the field further.
The work covers different aspects within the topic, and with applications in plants, algae, and photosynthetic bacteria. Synthetic construct design and multiplex cloning are covered. Information is provided on advanced cloning methods, vectors, and transformation approaches as well as the required informatics tools supporting the engineering of complex constructs required in plant synthetic biology applications. It also gives an introduction to the efforts being made to standardize and facilitate the exchange of materials between plant labs worldwide. Gene expression control switches and CRISPR/Cas9-based tools and techniques for precise genome engineering are discussed in this context. In addition, there are chapters covering multiple aspects of synthetic metabolic and photosynthetic systems, including the contribution synthetic biology is providing to (a) the study of metabolic and signaling pathways, (b) the advanced engineering of metabolic networks for the production of high-value metabolites in plants and green microorganisms, and (c) the isolation of organelles and coculture of microorganisms, which will be needed for future development of synthetic organelle and cellular systems.
Methods for the on-command manipulation of the relative stability of proteins, quantitative monitoring, and study of hormone signaling networks and for the in vitro reconstitution of enzymatic cascades illustrate the potential of new techniques toward those goals. Finally, the current state of the art in advanced imaging techniques and mathematical model approaches is discussed, pinpointing the key contribution for a quantitative and spatiotemporally resolved description of the systems. This is essential in the design of semisynthetic regulatory networks in plants for the understanding of complex signaling networks and the development of biotechnological applications. This work aims to be a useful resource for both researchers starting to explore novel experimental avenues as well as for experts willing to expand their portfolio of tools and strategies. As already experienced in other biological systems, this will open up novel perspectives for a faster progress of plant fundamental research and the development of traits of agricultural relevance and biotechnological applications.
Düsseldorf, Germany
Matias D. Zurbriggen
Acknowledgments
We highly appreciate the help of Dennis Dienst with discussion and technical advice for the initial pSHDY vector design. We would also like to thank Dr. Vladislav Zinchenko (Dept. of Genetics, Moscow State University) for his permission to use the pVZ vector as a backbone for our newly designed vector named pSHDY and its derivatives.
Contents
Preface
Contributors
1 Cas9-Mediated Targeted Mutagenesis in Plants
  Quentin M. Dudley, Oleg Raitskin, and Nicola J. Patron
2 Design of Multiplexing CRISPR/Cas9 Constructs for Plant Genome Engineering Using the GoldenBraid DNA Assembly Standard
  M. Vazquez-Vilar, P. Juarez, J. M. Bernabé-Orts, and D. Orzaez
3 Gene Editing in Green Alga Chlamydomonas reinhardtii via CRISPR-Cas9 Ribonucleoproteins
  Simon Kelterborn, Francisca Boehning, Irina Sizova, Olga Baidukova, Heide Evers, and Peter Hegemann
4 pSHDY: A New Tool for Genetic Engineering of Cyanobacteria
  Anna Behle and Ilka M. Axmann
5 Plant X-tender Toolbox for the Assembly and Expression of Multiple Transcriptional Units in Plants
  Tjaša Lukan, Kristina Gruden, and Anna Coll
6 Exploiting the Gal4/UAS System as Plant Orthogonal Molecular Toolbox to Control Reporter Expression in Arabidopsis Protoplasts
  Sergio Iacopino, Francesco Licausi, and Beatrice Giuntoli
7 AGROBEST: A Highly Efficient Agrobacterium-Mediated Transient Expression System in Arabidopsis Seedlings
  Hung-Yi Wu and Erh-Min Lai
8 Heterologous Production of Plant Terpenes in the Photosynthetic Bacterium Rhodobacter capsulatus
  Katrin Troost, Anita Loeschcke, Jennifer Hage-Hülsmann, Vera Wewer, Karl-Erich Jaeger, and Thomas Drepper
9 Coexpression and Reconstitution of Enzymatic Cascades in Bacteria Using UbiGate
  Kathrin Kowarschik and Marco Trujillo
10 Engineering Destabilizing N-Termini in Plastids
  Lioba Inken Winckler and Nico Dissmeyer
11 Genetically Encoded Biosensors for the Quantitative Analysis of Auxin Dynamics in Plant Cells
  Jennifer Andres and Matias D. Zurbriggen
12 Homo-FRET Imaging to Study Protein–Protein Interaction and Complex Formation in Plants
  Stefanie Weidtkamp-Peters, Stephanie Rehwald, and Yvonne Stahl
13 Mathematical Modelling in Plant Synthetic Biology
  Anna Deneer and Christian Fleck
14 In Vivo Epitope Tagging of Plant Mitochondria
  Franziska Kuhnert and Andreas P. M. Weber
15 Trichome Transcripts as Efficiency Control for Synthetic Biology and Molecular Farming
  Richard Becker, Christian Görner, Pavel Reichman, and Nico Dissmeyer
16 Generation of Stable, Light-Driven Co-cultures of Cyanobacteria with Heterotrophic Microbes
  Amit K. Singh and Daniel C. Ducat
Index
Contributors
JENNIFER ANDRES • Institute of Synthetic Biology and CEPLAS, University of Düsseldorf, Düsseldorf, Germany
ILKA M. AXMANN • Department of Biology, Institute for Synthetic Microbiology, Heinrich Heine University, Düsseldorf, Germany
OLGA BAIDUKOVA • Institute of Biology, Experimental Biophysics, Humboldt University of Berlin, Berlin, Germany
RICHARD BECKER • Leibniz Institute of Plant Biochemistry (IPB), Halle (Saale), Germany; ScienceCampus Halle – Plant-Based Bioeconomy, Halle (Saale), Germany
ANNA BEHLE • Department of Biology, Institute for Synthetic Microbiology, Heinrich Heine University, Düsseldorf, Germany
J. M. BERNABÉ-ORTS • Instituto de Biología Molecular y Celular de Plantas (IBMCP), Consejo Superior de Investigaciones Científicas (CSIC), Universidad Politécnica de Valencia, Valencia, Spain
FRANCISCA BOEHNING • Institute of Biology, Experimental Biophysics, Humboldt University of Berlin, Berlin, Germany
ANNA COLL • National Institute of Biology, Ljubljana, Slovenia
ANNA DENEER • Biometris, Department of Mathematical and Statistical Methods, Wageningen University, Wageningen, The Netherlands
NICO DISSMEYER • Protein Metabolism Lab, Department of Plant Physiology, University of Osnabrück, Osnabrück, Germany; CellNanOs – Center of Cellular Nanoanalytics, University of Osnabrück, Osnabrück, Germany; Faculty of Biology, University of Osnabrück, Osnabrück, Germany; Leibniz Institute of Plant Biochemistry (IPB), Halle (Saale), Germany; ScienceCampus Halle – Plant-Based Bioeconomy, Halle (Saale), Germany
THOMAS DREPPER • Institute of Molecular Enzyme Technology, Heinrich Heine University Düsseldorf, Forschungszentrum Jülich, Jülich, Germany
DANIEL C. DUCAT • MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, MI, USA; Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI, USA
QUENTIN M. DUDLEY • Earlham Institute, Norwich Research Park, Norwich, Norfolk, UK; John Innes Centre, Norwich Research Park, Norwich, Norfolk, UK
HEIDE EVERS • Institute of Biology, Experimental Biophysics, Humboldt University of Berlin, Berlin, Germany
CHRISTIAN FLECK • ETH Zurich, Department of Biosystems Science and Engineering, Basel, Switzerland; Freiburg Institute for Data Analysis and Mathematical Modelling, University of Freiburg, Freiburg im Breisgau, Germany
BEATRICE GIUNTOLI • University of Pisa, Pisa, Italy; Sant’Anna School of Advanced Studies, Pisa, Italy
CHRISTIAN GÖRNER • Leibniz Institute of Plant Biochemistry (IPB), Halle (Saale), Germany; Department of Plant Physiology and Protein Metabolism Lab, University of Osnabrück, Osnabrück, Germany; Fraunhofer Institute for Cell Therapy and Immunology (IZI), Leipzig, Germany
KRISTINA GRUDEN • National Institute of Biology, Ljubljana, Slovenia
JENNIFER HAGE-HÜLSMANN • Institute of Molecular Enzyme Technology, Heinrich Heine University Düsseldorf, Forschungszentrum Jülich, Jülich, Germany
PETER HEGEMANN • Institute of Biology, Experimental Biophysics, Humboldt University of Berlin, Berlin, Germany
SERGIO IACOPINO • University of Pisa, Pisa, Italy
KARL-ERICH JAEGER • Institute of Molecular Enzyme Technology, Heinrich Heine University Düsseldorf, Forschungszentrum Jülich, Jülich, Germany
P. JUAREZ • Instituto de Biología Molecular y Celular de Plantas (IBMCP), Consejo Superior de Investigaciones Científicas (CSIC), Universidad Politécnica de Valencia, Valencia, Spain
SIMON KELTERBORN • Institute of Biology, Experimental Biophysics, Humboldt University of Berlin, Berlin, Germany; Charité – Universitätsmedizin Berlin, Institute of Translational Physiology, Charitéplatz 1, Berlin, Germany
KATHRIN KOWARSCHIK • Leibniz Institute of Plant Biochemistry, Halle (Saale), Germany
FRANZISKA KUHNERT • Institute of Plant Biochemistry, Cluster of Excellence on Plant Science (CEPLAS), Heinrich Heine University, Universitätsstrasse 1, Düsseldorf, Germany
ERH-MIN LAI • Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
FRANCESCO LICAUSI • University of Pisa, Pisa, Italy; Sant’Anna School of Advanced Studies, Pisa, Italy
ANITA LOESCHCKE • Institute of Molecular Enzyme Technology, Heinrich Heine University Düsseldorf, Forschungszentrum Jülich, Jülich, Germany
TJAŠA LUKAN • National Institute of Biology, Ljubljana, Slovenia
D. ORZAEZ • Instituto de Biología Molecular y Celular de Plantas (IBMCP), Consejo Superior de Investigaciones Científicas (CSIC), Universidad Politécnica de Valencia, Valencia, Spain
NICOLA J. PATRON • Earlham Institute, Norwich Research Park, Norwich, Norfolk, UK
OLEG RAITSKIN • Earlham Institute, Norwich Research Park, Norwich, Norfolk, UK
STEPHANIE REHWALD • University Library, University Duisburg-Essen, Duisburg, Germany
PAVEL REICHMAN • Leibniz Institute of Plant Biochemistry (IPB), Halle (Saale), Germany; ScienceCampus Halle – Plant-Based Bioeconomy, Halle (Saale), Germany; Department of Plant Physiology and Protein Metabolism Lab, University of Osnabrück, Osnabrück, Germany
AMIT K. SINGH • MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, MI, USA
IRINA SIZOVA • Institute of Biology, Experimental Biophysics, Humboldt University of Berlin, Berlin, Germany; Petersburg Nuclear Physics Institute - NRC Kurchatov Institute, St. Petersburg, Russia; Kurchatov Genome Center – PNPI, Gatchina, Russia
YVONNE STAHL • Institute for Developmental Genetics, Heinrich-Heine University, Düsseldorf, Germany
KATRIN TROOST • Institute of Molecular Enzyme Technology, Heinrich Heine University Düsseldorf, Forschungszentrum Jülich, Jülich, Germany
MARCO TRUJILLO • Institute for Biology II, Albert-Ludwigs-University Freiburg, Freiburg, Germany
M. VAZQUEZ-VILAR • Instituto de Biología Molecular y Celular de Plantas (IBMCP), Consejo Superior de Investigaciones Científicas (CSIC), Universidad Politécnica de Valencia, Valencia, Spain
ANDREAS P. M. WEBER • Institute of Plant Biochemistry, Cluster of Excellence on Plant Science (CEPLAS), Heinrich Heine University, Universitätsstrasse 1, Düsseldorf, Germany
STEFANIE WEIDTKAMP-PETERS • Center for Advanced Imaging (CAi), Heinrich-Heine University, Düsseldorf, Germany
VERA WEWER • MS Platform, Department of Biology, University of Cologne, Cologne, Germany
LIOBA INKEN WINCKLER • Protein Metabolism Lab, Department of Plant Physiology, University of Osnabrück, Osnabrück, Germany; CellNanOs – Center of Cellular Nanoanalytics, University of Osnabrück, Osnabrück, Germany; Faculty of Biology, University of Osnabrück, Osnabrück, Germany
HUNG-YI WU • Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan; Department of Plant Pathology and Microbiology, National Taiwan University, Taipei, Taiwan
MATIAS D. ZURBRIGGEN • Institute of Synthetic Biology and CEPLAS, University of Düsseldorf, Düsseldorf, Germany
Chapter 1
Cas9-Mediated Targeted Mutagenesis in Plants
Quentin M. Dudley, Oleg Raitskin, and Nicola J. Patron

Abstract
Genome engineering technologies enable targeted mutations to be induced at almost any location in plant genomes. In particular, Cas9 nucleases use easily recoded RNA guides to target user-defined sequences and generate double-stranded breaks (DSBs) that are then repaired by the cell’s endogenous repair mechanisms. Incorrect repair results in mutations at the target. When the targets are in coding sequences, this often results in loss-of-function mutations. In this chapter, we describe a method to rapidly design and assemble RNA-guided Cas9 constructs for plants and test their ability to induce mutations at their intended targets in rapid assays using both Agrobacterium-mediated transient expression and PEG-mediated DNA delivery to protoplasts, the latter of which can be adapted to a wide range of plant species. We describe a PCR-based method for detecting mutagenesis and outline the steps required to segregate the Cas9 transgene from the targeted mutation to enable the production of transgene-free mutated plants. These techniques are amenable to a range of plant species and should accelerate the application of Cas9-mediated genome engineering for basic plant science as well as crop development.

Key words CRISPR, Cas9, sgRNA, Protoplast, Mutagenesis, Nicotiana benthamiana, Arabidopsis thaliana, Genome engineering, Genome editing
Matias D. Zurbriggen (ed.), Plant Synthetic Biology, Methods in Molecular Biology, vol. 2379, https://doi.org/10.1007/978-1-0716-1791-5_1, © Springer Science+Business Media, LLC, part of Springer Nature 2022

1 Introduction
Genome engineering enables the introduction of modifications to specific sequences in plant genomes, offering the potential to accelerate the development of crops with improved yield, disease resistance, increased nutrient use efficiency, and other desirable traits [1]. Genome engineering technologies deploy nucleases that induce double-stranded breaks (DSBs) at specific targets in the genome. These are subsequently repaired by the cell’s endogenous machinery resulting in mutations or specific edits to the target sequence. Earlier generations of genome engineering tools fused the nuclease domain of a bacterial Type IIS restriction endonuclease, FokI, to a DNA binding domain (a Zinc Finger (ZF) protein or Transcription Activator-Like Effector (TALE)) recoded to recognize a specific sequence of DNA. These tools are somewhat laborious to work with as the protein sequence must be mutated at multiple sites to enable recognition of each new DNA target.
The development of tools based on CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR associated proteins) systems substantially accelerated genome editing [2]. Found in bacteria and archaea, these systems act as an acquired immune system against foreign DNA where the “memory” of previously encountered DNA is stored by integrating short sequences known as “spacers” between palindromic repeats within the CRISPR array. These spacers are then used to direct nucleases to defend against subsequent viral attacks, initially scanning DNA for the presence of short motifs known as protospacer adjacent motifs (PAMs) before cleaving invading DNA at sequences corresponding to the spacer. The first CRISPR nuclease adapted for genome engineering was Cas9 [3], a monomeric nuclease directed to its target by an RNA moiety known as the guide RNA (gRNA) that mediates interaction with the target DNA via Watson–Crick base pairing with the spacer. In bacteria, the RNA moiety is comprised of two RNA molecules, the CRISPR RNA (crRNA) that contains the spacer and a trans-activating RNA (tracrRNA). To ease the construction of genome engineering tools, these were fused into a single guide RNA (sgRNA) [4].
RNA-guided Cas9 was first used to induce targeted mutations in plant genomes in 2013 and has since been used effectively in dozens of species [5]. Although there have been several reports [6–9], it is comparatively more difficult to precisely recode plant genomic DNA to a desired sequence using homology directed repair (HDR).
This is because the non-homologous end joining (NHEJ) pathway predominates in the somatic cells from which new plants can be regenerated, and because it is challenging to co-deliver sufficient quantities of a DNA repair template concurrent with the nuclease using the existing delivery protocols [10]. In this chapter, we describe a method to rapidly design, assemble, and test RNA-guided Cas9 constructs capable of inducing mutations at specific targets within plant genomes. Our protocol is exemplified using the model plants Arabidopsis thaliana and Nicotiana benthamiana, but the method is adaptable to many plant species. These constructs will induce DSBs at one or more targets, and DSBs will be rapidly repaired via NHEJ. In some instances, substitutions, insertions, and deletions will occur during repair, resulting in mutations. When these mutations disrupt a reading frame, or delete an essential sequence or functional motif, this is likely to lead to inactivation of the target gene, resulting in a genetic knockout. The modular nature of the DNA assembly method allows the user to easily add additional sgRNAs to enable simultaneous multiplexed mutagenesis of different targets and also to substitute alternative regulatory elements for specific plant
species of interest. Further, Cas9 can also be replaced with alternative Cas nucleases (e.g., Cas12a) [11] or inactivated Cas proteins fused to protein domains (e.g., transcriptional activators [12] for gene regulation or cytidine deaminase domains [13, 14] for base editing). We describe the use of individual transcriptional units for each sgRNA. We have found such constructs to be robust at inducing insertions or deletions (indels) in multiple targets. However, we have not tested constructs with more than eight sgRNAs. A number of publications have utilized alternative ways to express multiple sgRNAs from a single promoter [15–17]; these are not described in this method. In addition, it has also been shown to be possible to assemble the ribonuclease complex in vitro and deliver it to protoplasts [18, 19] or to regenerable tissues via biolistic delivery [20]. This approach avoids the integration of a transgene; however, the reported editing efficiencies (i.e., the number of recovered plants with mutations at the intended targets) published to date are lower compared to delivery of a DNA construct to express Cas9 and sgRNAs in vivo. Additionally, these methods are only applicable to species with robust and highly efficient protoplast/bombardment-mediated transformation and regeneration protocols.
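The Introduction notes that mutations which disrupt a reading frame are likely to inactivate the target gene. As a minimal sketch of that idea (not part of the published protocol), the function below classifies an indel in a coding sequence by its net length change: a change that is not a multiple of three shifts the reading frame. The function name and its arguments are hypothetical, chosen only for illustration.

```python
def indel_effect(ref_len: int, alt_len: int) -> str:
    """Classify an indel in a coding sequence by its net length change.

    ref_len: length (bp) of the reference allele at the edited site
    alt_len: length (bp) of the mutated allele at the same site
    """
    delta = alt_len - ref_len
    if delta == 0:
        return "no length change"  # e.g., a substitution
    # A net change divisible by three keeps the downstream codons in frame.
    return "in-frame" if delta % 3 == 0 else "frameshift"


# A 1-bp deletion shifts the frame; a 3-bp deletion removes one codon.
print(indel_effect(10, 9))
print(indel_effect(10, 7))
```

An in-frame indel can still abolish function if it removes an essential residue or motif, so this check only flags the most likely knockouts.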
2 Materials
2.1 Selection of sgRNAs and Design of Plasmid Constructs
1. Sequence analysis software (see Note 1).
2.2 Assembly of Plasmid Constructs
1. Plasmid acceptors for the assembly of standardized DNA parts for plants [21]. This protocol utilizes several plasmid acceptors (pICH47732, pICH47811, pICH47751, pICH41766, pAGM4723) from the Golden Gate MoClo Toolkit [22] (Addgene Kit # 1000000044); however, other Type IIS-based cloning systems such as the GoldenBraid [23] or Loop toolkits [24] can also be used.
2. Standard DNA parts encoding plant regulatory sequences. This protocol utilizes several parts from the MoClo Plant Parts Kit (Addgene Kit # 1000000047) including promoters (nos, pICH8763 and 35S, pICH51277) and terminators (ocs, pICH41432 and nos, pICH41421) that function in Arabidopsis, Nicotiana, and other dicotyledonous plants. However, these could be substituted with equivalent regulatory sequence parts suited for controlling expression in other plant species.
3. A plant selectable marker cassette, e.g., nos:nptII:ocs (pICSL70004, Addgene # 50034) assembled into a plasmid acceptor compatible with the selected DNA assembly system.
Alternatively, the nptII coding sequence part (pICLS80037, Addgene #68260) or hptII coding sequence part pICLS80036 (Addgene #68259) could be assembled with promoter and terminator parts known to function in the plant species of interest.
4. A standard DNA part encoding the Cas9 coding sequence (e.g., pEPOR0SP0013, Addgene #117521).
5. A standard DNA part encoding a U6 small nuclear RNA promoter (e.g., Arabidopsis U6-26, pICSL90002, Addgene #68261). This part is known to function in Arabidopsis and Nicotiana, but an equivalent part may be required for some species of interest, particularly monocotyledonous species.
6. A plasmid encoding an sgRNA scaffold corresponding to the selected Cas9 coding sequence (e.g., pSLQ1661-sgMUC4-E3 (Addgene #51025)).
7. Restriction endonuclease, BsaI (Eco31I) (see Note 2).
8. Restriction endonuclease, BpiI (BbsI).
9. T4 DNA-ligase.
10. Bovine serum albumin (BSA), molecular biology grade.
11. Proofreading polymerase and PCR reagents (e.g., Q5® polymerase, New England Biolabs).
12. Competent cells of a cloning strain of E. coli (e.g., DH5α).
13. Reagents and equipment for culturing bacteria including LB broth, LB-agar, and antibiotics (spectinomycin, carbenicillin, kanamycin, chloramphenicol).
14. Thermocycler.
15. PCR purification kit (e.g., Qiagen QIAquick® PCR Purification Kit).
16. Plasmid DNA purification kit (e.g., Qiagen PlasmidPlus® Midi Kit).

2.3 Assessing the Mutagenesis Activity of Cas9-sgRNA Constructs Using Transient Expression
1. Infiltration medium: 10 mM MgCl2, 10 mM MES pH 5.7, 200 μM acetosyringone.
2. Digestion solution: 1.5% (w/v) cellulase, 0.3% (w/v) macerozyme R-10, 400 mM mannitol, 20 mM KCl, 20 mM MES pH 5.6, 10 mM CaCl2, 0.1% BSA. The buffer can be prepared in advance without the cellulase, macerozyme, and BSA; these should be added immediately before use.
3. Wash solution: 154 mM NaCl, 125 mM CaCl2, 5 mM KCl, 2 mM MES pH 5.6.
4. MMg solution: 400 mM mannitol, 15 mM CaCl2, 4 mM MES pH 5.6.
5. Sucrose solution: 21% sucrose in double-distilled water.
6. PEG solution: Mix 2 g of PEG (poly(ethylene glycol), MW 4000 Da), 2 mL of 500 mM mannitol, and 0.5 mL of 1 M CaCl2. The resulting volume is ~4.5 mL and produces a ~44% PEG (w/v) solution (see Note 3).
7. Sieve filter or cell strainer (e.g., PluriStrainer 70 μm).
8. Bench-top centrifuge capable of 2200 × g for pelleting Agrobacterium or 100 × g for protoplast preparation.
9. Plant growth chamber (or greenhouse).
10. Reagents or kit for purification of genomic DNA from leaf tissue.
11. PCR purification kit (e.g., Qiagen QIAquick® PCR Purification Kit).
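The arithmetic behind the PEG solution in item 6 can be checked with a short sketch. It takes the amounts and the ~4.5 mL final volume stated above and returns the PEG percentage (w/v); the final mannitol and CaCl2 concentrations it reports are derived here by simple dilution and are not stated in the protocol. The function name and parameter names are hypothetical.

```python
def peg_solution(peg_g=2.0, mannitol_ml=2.0, mannitol_mM=500.0,
                 cacl2_ml=0.5, cacl2_mM=1000.0, final_ml=4.5):
    """Return final concentrations for the PEG transformation solution.

    final_ml is the measured final volume (~4.5 mL per the recipe);
    it exceeds 2.5 mL because dissolved PEG adds volume.
    """
    return {
        # % (w/v) = grams of solute per 100 mL of solution
        "peg_percent_w_v": 100.0 * peg_g / final_ml,
        # C1 * V1 = C2 * V2 for the two stock solutions
        "mannitol_mM": mannitol_mM * mannitol_ml / final_ml,
        "cacl2_mM": cacl2_mM * cacl2_ml / final_ml,
    }


for name, value in peg_solution().items():
    print(f"{name}: {value:.1f}")
```

With the default amounts this gives roughly 44% PEG (w/v), in line with the recipe; scaling every input by the same factor leaves the concentrations unchanged.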
3 Methods
3.1 Selection of sgRNAs and Design of Plasmid Constructs
1. Select a target genomic region (e.g., a gene) for mutagenesis. The ribonuclease complex (sgRNA + Cas9) will introduce a double-stranded break (DSB) within the target sequence (Fig. 1a). In the somatic cells of most plant species, the majority of DSBs will be rapidly repaired by the Non-Homologous End Joining (NHEJ) mechanism. This will likely introduce small insertions or deletions (indels) that, in coding sequences, can cause a frameshift which prevents correct translation. It is therefore advantageous to target the N-terminal region (or active domain) of the target protein to maximize the likelihood of that the region of the protein responsible for activity will not be translated. Indels in the 50 or 30 untranslated regions or regulatory elements are less likely result in functional knockouts; however, they may alter expression levels. 2. Select an appropriate Cas nuclease. This method utilizes the well-characterized Cas9 nuclease from Streptococcus pyogenes (SpCas9) that requires a Protospacer Adjacent Motif (PAM) sequence of NGG adjacent to the 30 end of the 20 bp target site (see step 4) [3]. Codon optimization often improves protein expression levels; here, we recommend the use of a plant codon-optimized sequence containing a Nuclear Localization Signal (NLS) [28] (pEPOR0SP0013; Addgene #117521). Alternative nucleases functional in plants [29] include Cas9 from Staphylococcus aureus [30] or Cas12a (previously, Cpf1) from Lachnospiraceae bacterium ND2006 [11, 19, 31]. Note that each of these Cas nucleases recognizes a different PAM. 3. Select an appropriate guide RNA scaffold (Fig. 1). The single guide RNA (sgRNA) scaffold used in this method has sequence modifications which extend the stem of the secondary structure (sgRNA-ES) [26] designed to improve the stability of the
6
Quentin M. Dudley et al.
Fig. 1 (a) Recognition of a cognate target by an sgRNA/Cas9 nuclease complex (adapted from [25]). The sgRNA scaffold sequence is denoted by blue/green text; the green letters in the sgRNA scaffold denotes additional/modified bases for the stem extension reported by Chen et al. [26]. Blue triangles denote the expected locations of the Cas9-induced breaks. (b) Primer design for incorporation of target information into sgRNAs. Primer sequences are denoted by underlined text
complex at ambient temperatures. Use of alternate Cas nucleases will require a compatible sgRNA (Cas9) or crRNA (Cas12a) scaffold. 4. Select spacer sequences within the target region. This can be done simply by searching the sequence of interest for instances of the cognate PAM (NGG for SpCas9) and selecting the adjacent ~20 base pairs. For SpCas9, the DSB will be induced three bases 50 of the PAM (Fig. 1a). Alternatively, CRISPR design software can be employed (see Note 1 for examples). Consider selecting spacer sequences amenable to PCR-based assay (see Note 4 and Subheading 3.4). 5. The NGG PAM is likely to be found in many locations within any given target gene; thus, it is often useful to rank and prioritize testing of spacers likely to perform best. It may be desirable to discard spacers that are not unique within the genome or that differ at only a single base from another PAM-adjacent target in the genome; “off-target” mutagenesis
Cas9-Mediated Targeted Mutagenesis in Plants
7
Fig. 2 Modular assembly of multigene constructed for RNA-guided targeted mutagenesis in plants using the Golden Gate MoClo Toolbox for plants. Additional sgRNAs can be assembled into Level 1 plasmids designated for position 4, 5, or 6
may occur at these sites [27, 32, 33]. The CRISPR design software tools listed in step 4 can be used to assess the likely functionality (efficiency of induction of DSBs) at the target and some will provide information on potential off-targets if a full genome sequence is available. Factors that may affect the functionality include GC content or pyrimidine content of the spacer [34]; however, at the time of writing, this has yet to be systematically assessed in plants. 6. Select appropriate regulatory elements. Figure 2 depicts the use of regulatory elements suitable for Nicotiana species; however, these should be substituted with elements suitable to the species of interest. Expression of Cas nucleases in plants is typically
Quentin M. Dudley et al.
driven by a strong constitutive promoter (e.g., 35S from Cauliflower Mosaic Virus (CaMV 35S) in dicots or Ubiquitin from Zea mays (ZmUbi1) in monocots [35]). However, in Arabidopsis, the use of the floral-dip transformation technique means that expression of the nuclease in early embryo development is critical to success. The use of promoters that express strongly in these tissues has been found to be advantageous [36, 37]. These promoters are dependent on RNA polymerase II-mediated transcription and are coupled with a 3′ UTR and terminator. In contrast, RNA polymerase III (RNAPol-III)-dependent promoters are used to transcribe sgRNAs due to their precise transcriptional start site. To ensure good expression, it is desirable to obtain a U6 promoter from the species of interest; however, the U6-26 promoter from Arabidopsis thaliana (AtU6-26) has been used in several dicot species [35]. For many U6 genes, including AtU6-26, the start of transcription is a “G.” For many DNA targets, it will be possible to start the spacer of the sgRNA with a “G”; however, this is not a requirement. The transcriptional start site must nevertheless be preserved; therefore, if the selected spacer does not start with a “G,” add a non-pairing “G” at the 5′ end (Fig. 1a). We are routinely successful at inducing mutations at targets using an additional non-pairing “G.”
3.2 Assembly of Plasmid Constructs
The assembly of the final construct for delivery to plant cells requires combining transcriptional units encoding a selectable marker, Cas9, and one or more sgRNAs. While many assembly methods are available, we recommend the Golden Gate Modular Cloning (MoClo) toolbox for plants [22], a Golden Gate assembly standard based on the activity of Type IIS restriction endonucleases BsaI and BpiI which cleave DNA at a defined distance from their non-palindromic asymmetric recognition sites. Each plasmid acceptor (Level 1 and 2) contains left and right border sequences which enable delivery to plant cells via Agrobacterium tumefaciens. Many of the standardized, Level 0 DNA parts used here are compatible with other Golden Gate assembly standards such as GoldenBraid [23, 38], Loop [24], and others. When using the plant MoClo toolkit, we typically use the orientation depicted in Fig. 2 with a selectable marker in position 1, Cas9 in the reverse orientation in position 2, and sgRNAs in positions 3–6 [39]. If additional sgRNAs are required, we utilize the Level M and P acceptors, which enable any number of transcriptional units to be assembled iteratively [40]. 1. Construct selectable marker and Cas9 expression cassettes using a one-step digestion-ligation assembly of Level 0 parts into a Level 1 acceptor. The standard 20 μL digestion/ligation assembly reaction used in this method contains 1 nM acceptor
plasmid, 2 nM of each insert plasmid or PCR product, 1× T4 DNA ligase buffer, 0.2 mg/mL BSA, plus 1 μL of T4 ligase (400 U/μL), and 1 μL of Eco31I (BsaI) (10 U/μL). Use a thermocycler to incubate at the following temperatures: initial 37 °C for 20 s, 26 cycles of alternating 37 °C for 3 min and 16 °C for 4 min, with final incubation at 50 °C for 5 min, and then 80 °C for 5 min (see Note 2). Figure 2 depicts sample plasmids containing regulatory and coding sequences for the selectable marker and Cas9 cassettes. These components function in several dicot species, including Nicotiana. Advice on selection of alternative regulatory elements is given in Subheading 3.1 (step 6) (also see Note 5). 2. Use PCR to integrate the selected spacer sequence into the sgRNA. The primer sequences and plasmid template (pSLQ1661-sgMUC4-E3(F + E)) for amplification are shown in Fig. 1b (see Note 6). Any proofreading polymerase can be used for PCR. However, we have optimized this PCR using Q5® polymerase and the following protocol: a standard 50 μL reaction containing 1× Q5® reaction buffer, 200 μM dNTPs, 0.5 μM primers, 5 ng plasmid DNA template, and 0.02 U/μL polymerase, cycled 38 times (denaturing at 98 °C for 7 s, annealing at 60 °C for 20 s, and extension at 72 °C for 5 s) (see Note 7). 3. Purify the PCR product and quantify the resulting DNA for use in a standard digestion/ligation reaction with a Level 0 part containing an RNA polymerase III-dependent promoter, e.g., AtU6-26 (pICSL90002), and the desired Level 1 acceptor to produce a complete transcriptional unit (as described in step 1). Note that the sgRNA template contains a functional minimal terminator, and it is not necessary to include an additional terminator in the reaction. 4. Transform the completed Level 1 assembly reactions into competent E. coli cells (e.g., DH5α). 5. Select on LB agar plates containing 100 μg/mL carbenicillin. 
Level 1 acceptors contain a lacZα cassette which enables “blue/white” screening if plated on LB agar containing 20 μg/mL X-Gal, 500 μM IPTG, and 100 μg/mL carbenicillin. “Incorrect” plasmids (e.g., the original acceptor is intact and does not contain the desired insert) will be colored blue due to the galactosidase activity of lacZα (encoded by plasmid) and lacZΩ (present in E. coli) (see Note 8). 6. Select 1–3 white colonies and check for correct assembly using standard methods (restriction analysis and sequencing). 7. Digestion/ligation assembly reaction of Level 1 parts into a Level 2 acceptor. Combine plasmid DNA of the assembled selectable marker cassette (position 1), the Cas9 expression
cassette (position 2), the sgRNA expression cassettes (positions 3+), and an appropriate end-linker compatible with the number of cassettes being assembled with a Level 2 acceptor (e.g., pAGM4723). Use the same digestion/ligation protocol described in step 1, except replace the BsaI enzyme with 1 μL of BpiI (BbsI) (10 U/μL) (see Note 9 on additional sgRNAs). 8. Transform the completed Level 2 assembly reaction into competent E. coli cells (e.g., DH5α). 9. Select on LB agar plates containing 50 μg/mL kanamycin. The Level 2 acceptor pAGM4723 contains the CRed selectable marker for cloning (CRed is an artificial bacterial operon responsible for canthaxanthin biosynthesis). “Incorrect” plasmids (e.g., the original acceptor is intact and does not contain the desired insert) will be colored red-orange due to accumulation of canthaxanthin. 10. Select 1–3 white colonies and check for correct assembly using standard methods (restriction analysis and sequencing). 11. Transform into competent Agrobacterium tumefaciens by mixing 200 ng of plasmid DNA with 50 μL of electrocompetent GV3101 Agrobacterium cells. Electroporate at 2.5 kV in a 2 mm gap cuvette (E = 12.5 kV/cm). Recover in 200 μL LB for 3 h at 28 °C. Plate all cells on sterile LB plates containing 100 μg/mL rifampicin and 20 μg/mL gentamicin plus 50 μg/mL kanamycin (substitute 100 μg/mL carbenicillin for kanamycin if electroporating a Level 1 plasmid) (see Note 10). Incubate for 48–72 h at 28 °C. 12. Alternatively, prepare 10 μg of the Level 2 plasmid for transfection of protoplasts. This can typically be achieved using a plasmid midiprep kit.
3.3 Assessing the Mutagenesis Activity of Cas9-sgRNA Constructs Using Transient Expression
Stable plant transformation can take several weeks to months. It is therefore desirable to screen constructs for their ability to induce mutations at the desired target in advance. Below we describe two methods (Agroinfiltration and protoplast delivery) for transient expression using Nicotiana and Arabidopsis as exemplars.
3.3.1 Transient Expression in N. benthamiana Leaves via Agroinfiltration
1. Grow 10 mL of GV3101 Agrobacterium cultures containing the plasmid to be expressed in appropriate selection medium for 36 h at 28 °C to OD600 above 1.0 (see Note 11). GV3101 carries resistance to rifampicin and gentamicin (select with 100 μg/mL and 20 μg/mL, respectively). Standard Level 2 plasmids can be retained with 50 μg/mL kanamycin. 2. Transfer cultures to 15-mL conical centrifuge tubes. Centrifuge at 2200 × g for 20–30 min at room temperature. Discard supernatant.
3. Resuspend the pelleted cells in 10 mL infiltration medium. Incubate at room temperature with low rotation speed for at least 3 h (or overnight) (see Note 12). 4. Measure the OD600 and dilute to 0.8 (see Note 13). 5. Infiltrate leaves of 2- to 4-week-old N. benthamiana with the Agrobacterium cultures [41] (see Note 14). 6. Allow N. benthamiana to grow for 3–5 days before assaying mutagenesis activity (see Note 15). 7. After 5 days, collect leaf discs by punching out leaf tissue using a 6–8 mm borer. Leaf tissue can be frozen at −80 °C. Genomic DNA can be purified and analyzed as described in Subheading 3.4 (step 1).
3.3.2 Transient Expression in Protoplasts
Agroinfiltration is a robust and simple method for transient expression but is not widely successful in non-solanaceous species. Protoplasts are more difficult to work with; however, they offer several trade-offs compared to agroinfiltration.
• Advantages: Preparation for delivery is faster (1 day to prepare midiprep plasmid DNA compared to 2–3 days for the growth of Agrobacterium); cells can be screened after 24 h (compared to 3–5 days for Agrobacterium); mesophyll protoplasts can be prepared from many diverse plant species using very similar protocols (reagents and time for digestion will need to be optimized).
• Disadvantages: Requires large quantities of transfection-grade DNA (e.g., midiprep); fewer cells are available for genomic DNA purification.
1. Prepare plasmid DNA using a plasmid purification kit. 10 μg of DNA × 3 will be required to test a single DNA construct in triplicate (see Note 16). 2. Alternative PCR-derived preparation of DNA for transformation: As an alternative to using plasmids which encode the Cas9 protein and the sgRNA(s), linear DNA encoding either or both components can be generated by PCR using a high-fidelity polymerase (such as Q5®). If there are multiple DNA fragments, mix all DNA thoroughly before delivery to protoplasts (see Note 17). 3. Preparation of protoplasts. Steps 3–15 were adapted from a protocol developed by the Sheen laboratory [42] and optimized for Arabidopsis thaliana and Nicotiana benthamiana. For use in other species, several experimental parameters should be optimized including digestion time, concentration of digestion enzymes, alternate enzymes and plant growth
conditions. This protocol will prepare sufficient protoplasts for approximately 20 transfection reactions in triplicate (60 reactions total); however, it is prudent to prepare an excess. This protocol is visualized in Fig. 3. 4. Grow plants in light intensity of 50–100 μmol/m²/s. Use 4- to 5-week-old N. benthamiana and 3- to 5-week-old Arabidopsis (see Note 18). 5. Prepare 25 mL digestion solution. 6. Cut two to three green, healthy leaves from the plant and rinse with double-distilled water twice. For N. benthamiana, infiltrate leaf tissue with digestion solution using a needleless syringe in the same manner described in Subheading 3.3.1 (step 5). Repeat until the entire leaf is infiltrated (see Note 14b). Cut the leaf into pieces with lab scissors and remove the midrib and the most conspicuous leaf veins. For Arabidopsis, cut the leaves (keeping them fully immersed in digestion solution) into small strips each ~4 mm wide. 7. Transfer the leaf pieces into a polystyrene dish (petri dish) and immerse in the remaining digestion solution. 8. Wrap polystyrene dish(es) in foil to protect from light and incubate at 22 °C for 3 h (N. benthamiana) or 4 h (A. thaliana). Gently rock plates every hour (approximately) to mix (see Note 19). 9. After enzymatic digestion of cell walls, pour leaf tissue onto a sieve filter or cell strainer. The filter should allow cells and liquid to easily pass through but should have a small enough mesh to prevent transfer of most undigested debris (~70 μm mesh size). Add 2 mL of W5 buffer to the top of the filter and tilt gently to release additional protoplasts remaining in the tissue. 10. Centrifuge at 100 × g for 2 min at room temperature. 11. Aspirate supernatant quickly with a serological pipette and avoid disturbing the pellet. 12. Gently resuspend protoplasts in 2 mL W5 buffer and incubate on ice for 30 min until the protoplasts settle to the bottom of the tube by gravity. 13. Carefully aspirate supernatant with a serological pipette and avoid disturbing the pellet. Add about 1.5 mL of MMg buffer (the concentration of protoplasts should be about 10⁴ to 10⁵ cells/mL, but the exact concentration is not critical at this point). 14. Sucrose cushion: Load a 15-mL centrifuge tube with 5 mL of 21% sucrose. Layer 2–2.5 mL of protoplasts in MMg buffer onto the top of the cushion using a P1000 pipette. To do this, place the tip of the pipette very close to the liquid surface and
Fig. 3 Preparation of protoplasts from N. benthamiana leaves by digesting the cell wall and filtering the resulting protoplasts using gravity and a sucrose cushion
then dispense liquid very slowly and gently; the green liquid should sit on top of the clear sucrose layer and not mix. Centrifuge at 90 × g for 10 min at room temperature. After centrifugation, healthy protoplasts will sit at the interface while debris will pellet at the bottom (Fig. 3). Next, remove the clear upper layer. Then carefully pipette the protoplasts (the green band) to a new 50-mL conical centrifuge tube (see Note 20). 15. Measure the concentration of protoplasts using a hemocytometer (typically ranging from 1 × 10⁴ to 2 × 10⁵). Dilute the purified protoplasts using MMg buffer to about 1 × 10⁴ cells/mL or less (see Note 21). Protoplasts can sit for several hours at room temperature and still be viable and amenable to transfection. 16. Transfection of protoplasts (steps 16–23). Add 20 μg of purified DNA to a 2-mL microcentrifuge tube containing 200 μL of protoplasts (~1–2 × 10⁴ cells/mL). To minimize the volume of liquid added, DNA should be highly concentrated (>1 μg/μL for plasmids, similar for PCR products). Reaction size can be adjusted depending on the application (e.g., 10 μg of purified DNA for 100 μL of protoplasts). 17. Mix well but gently. Invert the tube but avoid pipetting. 18. Slowly add 200 + x μL PEG solution (x = volume of the DNA added; for example, add 210 μL of PEG solution if 10 μL of DNA was added) to the tube wall and mix well by tilting (see Note 22). 19. Incubate at room temperature for 2–5 min (optional). At this time, prepare a 0.1% BSA solution. Add 300 μL to each well of a 12-well cell culture plate. Incubate for at least 5 min before use in step 23. Discard the BSA solution from the multiwell plate after incubation. 20. Add 1 mL of W5 solution to the protoplasts and gently mix by slowly inverting the tube. 21. Centrifuge at 100 × g for 2 min at room temperature. 22. Pipette out the supernatant very carefully. Try not to remove any protoplasts. 23. Resuspend protoplasts in 300 μL of W5 buffer/reaction and pipette into the BSA-coated 12-well culture plate (prepared in step 19). Incubate in a growth chamber at 22 °C for 18 h at 70–100 μmol/m²/s light. 24. Protein expression (e.g., fluorescence of a reporter or editing activity of Cas9) can be detected after approximately 18 h. In a good experiment with healthy protoplasts and efficient transfection, fluorescence can be detected for an additional 24 h.
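The volumes used in steps 15, 16, and 18 follow from simple proportions, and a short script can help avoid arithmetic slips at the bench. The Python sketch below is illustrative only: the function names are hypothetical, and scaling the PEG volume with the protoplast volume for reactions other than 200 μL is an assumption made here, not a stated part of the protocol.

```python
def buffer_to_add_ml(stock_cells_per_ml, stock_volume_ml, target_cells_per_ml):
    """MMg buffer (mL) to add so a protoplast stock reaches the target
    concentration (step 15), using C1 * V1 = C2 * V2."""
    if target_cells_per_ml <= 0 or stock_cells_per_ml < target_cells_per_ml:
        raise ValueError("target must be positive and no higher than the stock")
    final_volume_ml = stock_cells_per_ml * stock_volume_ml / target_cells_per_ml
    return final_volume_ml - stock_volume_ml


def transfection_volumes(dna_conc_ug_per_ul, protoplast_ul=200.0):
    """Return (DNA mass in ug, DNA volume in uL, PEG volume in uL).

    Step 16: 20 ug of DNA per 200 uL of protoplasts, scaled linearly.
    Step 18: PEG volume = 200 + x uL, where x is the DNA volume.
    Assumption: for other reaction sizes the '200' is replaced by the
    protoplast volume.
    """
    if dna_conc_ug_per_ul <= 0:
        raise ValueError("DNA concentration must be positive")
    dna_ug = 20.0 * protoplast_ul / 200.0
    dna_ul = dna_ug / dna_conc_ug_per_ul
    peg_ul = protoplast_ul + dna_ul
    return dna_ug, dna_ul, peg_ul


if __name__ == "__main__":
    # DNA at 2 ug/uL: 10 uL of DNA and 210 uL of PEG, as in the step 18 example.
    print(transfection_volumes(2.0))
    # 2 mL of protoplasts counted at 2e5 cells/mL, diluted to 1e4 cells/mL.
    print(buffer_to_add_ml(2e5, 2.0, 1e4))
```

With DNA at 2 μg/μL the script reproduces the worked example in step 18 (10 μL of DNA, 210 μL of PEG solution).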
3.4 Assessment of Targeted Mutagenesis Following Transient Transfection
There are many methods to assess mutations. It is important to recall that samples from transient expression experiments (i.e., leaf discs or protoplasts) will contain a high percentage of cells which are not mutated (i.e., many wild-type cells). The goal of these assays is to determine if a given Cas9/sgRNA unit induces DSBs at the target, which can be assessed by the presence of mutations. Below, we describe PCR-based approaches for detecting (1) indels induced by a single sgRNA or (2) “deletion” of a fragment of DNA by two paired sgRNAs. We find these methods to be rapid and robust. However, there are a number of other methods described elsewhere [43] including T7 cleavage assay (more ambiguous and less sensitive than PCR/RE) [44], high-resolution melting analysis [45, 46], and PCR followed by digestion using purified ribonucleoprotein complex of Cas9/sgRNA [47]. Additionally, next-generation sequencing (NGS) can be used to more accurately quantify editing efficiency, screen for off-target mutations, and characterize complex structural rearrangements that are difficult to detect by PCR.
3.4.1 Detection of Indels Induced Using a Single sgRNA by PCR and Restriction Endonuclease Digestion (PCR-REN) and Sanger Sequencing
1. Extract genomic DNA from the regions of infiltrated leaves or protoplasts. There are many methods and kits for this. No specific variations are required for protoplasts; however, the small sample size may exclude some column-based protocols as the yield will be low. We use a variation of a CTAB-based method [48] with both sample types. 2. Amplify the region of interest using primers that flank the target. Primers should bind ~200–400 nucleotides away from the expected site of the induced DSB (see Note 23). 3. Purify DNA from PCR using a column-based PCR purification method or kit. 4. Use a restriction endonuclease (REN) to digest the PCR DNA. The REN recognition site should overlap the expected cut site (three base pairs before the PAM) (Fig. 4a). Use buffer/conditions appropriate for the restriction endonuclease. 
Load undigested and digested amplicons on an agarose gel and separate DNA by electrophoresis. 5. Image the gel. Indels caused by CRISPR/Cas9 will add/remove/substitute bases near the REN recognition site and prevent digestion. The presence of a WT-length band in the digested sample (digestion mutation band (*) in Fig. 4a) indicates mutagenesis at the locus (see Note 24). To confirm the presence of targeted mutations, the cleaned-up PCR product from step 3 should be submitted for Sanger sequencing. If mutagenesis occurred, then the sequencing trace should become messy and incoherent at the expected site of Cas9 cleavage. Alternatively, the band resistant to digestion from step 4 can be purified and sequenced (see Note 25).
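Before ordering primers, it can help to check in silico that the chosen REN site really spans the expected break and to predict the band sizes for the gel. The Python sketch below is a hedged illustration, not part of the published method: all function names are hypothetical, it assumes SpCas9 with an NGG PAM and a blunt break three bases 5′ of the PAM, and it assumes a palindromic REN site whose top-strand cut offset (`site_cut_offset`, counted from the first base of the site) is supplied by the user.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def cas9_cut_position(amplicon, spacer):
    """Number of amplicon bases to the left of the expected SpCas9 break,
    or None if the spacer with an adjacent NGG PAM is not found."""
    amplicon, spacer = amplicon.upper(), spacer.upper()
    n = len(spacer)
    # Target on the top strand: protospacer followed by NGG.
    i = amplicon.find(spacer)
    while i != -1:
        if amplicon[i + n + 1:i + n + 3] == "GG":
            return i + n - 3
        i = amplicon.find(spacer, i + 1)
    # Target on the bottom strand: CCN followed by the reverse complement.
    rc = revcomp(spacer)
    j = amplicon.find(rc)
    while j != -1:
        if j >= 3 and amplicon[j - 3:j - 1] == "CC":
            return j + 3
        j = amplicon.find(rc, j + 1)
    return None


def ren_sites(amplicon, site):
    """Sorted start indices of a recognition site on either strand."""
    amplicon, site = amplicon.upper(), site.upper()
    hits = set()
    for motif in {site, revcomp(site)}:
        k = amplicon.find(motif)
        while k != -1:
            hits.add(k)
            k = amplicon.find(motif, k + 1)
    return sorted(hits)


def predict_pcr_ren(amplicon, spacer, site, site_cut_offset):
    """Predict PCR-REN band sizes for a wild-type amplicon."""
    cut = cas9_cut_position(amplicon, spacer)
    if cut is None:
        raise ValueError("spacer with NGG PAM not found in amplicon")
    starts = ren_sites(amplicon, site)
    # The site is useful if the break falls within (or at the edge of) it.
    overlaps = any(k <= cut <= k + len(site) for k in starts)
    bounds = [0] + [k + site_cut_offset for k in starts] + [len(amplicon)]
    fragments = [b - a for a, b in zip(bounds, bounds[1:])]
    return {
        "cas9_cut": cut,
        "site_overlaps_cut": overlaps,
        "wild_type_fragments": fragments,   # bands expected from unedited DNA
        "resistant_band": len(amplicon),    # band expected if the site is lost
    }


if __name__ == "__main__":
    # Made-up sequences for illustration only; EcoRI (G^AATTC) has offset 1.
    spacer = "ACGTACGTACGTACGAATTC"
    amplicon = "ACTG" * 50 + spacer + "TGG" + "CATC" * 50
    print(predict_pcr_ren(amplicon, spacer, "GAATTC", 1))
```

If `site_overlaps_cut` is false, or if the enzyme cuts the amplicon at additional sites, a different spacer, enzyme, or primer pair is probably a better choice for this assay.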
Fig. 4 (a) Assessment of induced mutagenesis by a construct encoding an sgRNA-guided Cas9 that targets a recognition site for a restriction endonuclease (REN). Following amplification with flanking primers the product is digested with the cognate REN. The band resistant to restriction (marked with an asterisk) is sequenced to confirm mutagenesis. As the double-stranded break is repaired independently in each cell, numerous products are present. In the Sanger sequencing read, these different genotypes can be seen as multiple peaks following the expected break (vertical blue line). (b) Assessment of deletions induced by the delivery of a construct encoding Cas9 and two sgRNAs by amplification with flanking primers. The amplification product is expected to be smaller than the wild-type product. The Cas9-induced deletion is confirmed by sequencing this product
3.4.2 Detection of Indels Induced Using Paired sgRNAs, PCR, and Sanger Sequencing
1. Extract genomic DNA as in Subheading 3.4.1 (step 1) and amplify the region of interest using primers that flank the target. Primers should bind ~200–400 nucleotides away from the expected sites of the pair of induced DSBs (see Note 23). 2. Load the PCR products onto an agarose gel and separate DNA via electrophoresis. 3. Image the gel. If both sgRNAs are active, a fraction of NHEJ repairs will ligate the broken ends together, causing a “deletion” of the sequence between the sgRNAs (Fig. 4b). PCR products amplified from templates containing a deletion will be smaller than the wild-type product.
4. Excise, purify, and sequence the “deletion band” from the gel in step 3. This will provide a definitive check that the “deletion band” is indeed the expected sequence and not an artefact of nonspecific PCR amplification (Fig. 4b, see Note 26).
3.5 Production of Stable Lines with Targeted Mutations
1. Deliver DNA constructs using an established transformation and regeneration method for the plant species of interest. This process is highly species-specific, and the details are beyond the scope of this chapter. Aim to recover at least 10–20 independent transgenic events for each construct. 2. Assess the genotype at the intended mutagenesis target for each regenerated T0 plant (T1 for Arabidopsis) using the same methods described in Subheading 3.4. In some cases, the transgenic plants will have regenerated from a cell in which a mutation was previously induced. In this case, all cells in the plants are likely to be of the same genotype (see Note 27). Alternatively, transgenic plants may produce different genotypes across different cell or tissue samples because the DSB and repair did not occur until after the regeneration process had begun. In these cases, because the induced breaks may be repaired differently in each cell, the plants will be genetically chimeric. For these chimeric plants, the PCR-based mutagenesis assays described in Subheading 3.4 will give similar results to the mixed populations of cells obtained in transient assays. If several of the recovered T0 plants have a stable mutated genotype across all tissues, plants with chimeric genotypes can be discarded. If possible, determine the copy number of the Cas9 transgene (e.g., by qPCR) and discard events with multiple insertions as segregating the transgene from the mutated locus will be prohibitively complex. 3. Regardless of whether T0 plants are chimeric at the target locus or show the same mutated genotype across all tissues, grow selected lines to sexual maturity and self-fertilize (assuming the species is self-compatible). Collect seeds and grow T1 plants from each T0 line. 4. Genotype the T1 progeny, testing both the genotype at the target locus (using the same method as step 2) and for the presence of the transgene. 
Since all T0 lines will be hemizygous for the Cas9 transgene insertions, the transgene will segregate according to Mendel’s laws. Therefore, one quarter of T1 daughter progeny of a single copy line will not have inherited the transgene (sister-nulls) (Fig. 5). If these sister-nulls contain the same mutation as detected in the T0 parent, then the mutation can be confirmed to be heritable and stable. If all T1 plants are genetically chimeric at the target locus and the mutations can only be detected in plants that still contain the transgene, it is likely that different repair events are taking place in
Fig. 5 Segregation of a Cas9/sgRNA single copy transgene from a mutagenized locus and assessment of mutagenesis at the target locus
somatic cells. It is possible that either the transgenes are not expressing in the germline cells or the expression levels or activity of sgRNAs are insufficient. 5. The genotype at the target locus of stably mutagenized plants will be heterozygous, biallelic, or homozygous. In heterozygous plants, only one of the two alleles at the target locus has been mutated; the other retains the wild-type sequence. Self-fertilization of these lines will allow the recovery of homozygous plants (unless the mutation compromises fitness). Biallelic plants carry a different mutation at each allele. This is common and occurs because the DSB occurred and was repaired separately at each allele. It is possible (but less likely) that the DSB at each allele has been repaired identically, resulting in a homozygous plant. Biallelic plants can also be self-fertilized to produce a homozygous plant. Figure 5 depicts the possible genotypes of the T1 progeny of a T0 plant. 6. In subsequent generations, continue to genotype the locus to ensure the mutation is the same as the previous generation and is heritable in the absence of the transgene.
7. Assess for off-target mutations. A number of papers have found evidence of unintended editing at genomic locations elsewhere in the plant genome [27, 32, 33]. There are two general approaches to assessing off-target activity. First, potential off-targets can be identified due to their similarity to on-target sequence. These loci can be screened using the methods described above or PCR amplified, barcoded, and pooled for multiplexed next-generation sequencing (NGS) [29, 49]. Alternatively, off-targets can be identified by whole genome sequencing of individual plants [50, 51]. If off-target mutations are detected, they can be removed by back-crossing the mutagenized plant with the parental (wild-type) line. This may also remove any additional genetic or epigenetic mutations that resulted from tissue culture and regeneration. 8. Analyze your stably engineered lines for your phenotype of interest. The negative control should be a null-sibling from which the mutation has been segregated rather than the parental wild-type line. This is because plants often acquire genetic and epigenetic changes during the transformation and regeneration process. This is particularly important when analyzing complex or quantitative traits. If the mutated lines are backcrossed to a parental wild type, a null-sibling in which the targeted mutations have been segregated should be used in a similar cross to generate an appropriate negative control.
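The segregation ratios in steps 4 and 5 can be used to estimate how many T1 plants to genotype. The Python sketch below is a rough planning aid under stated assumptions, not part of the published method: it assumes a single-copy transgene that is unlinked to the target locus, a non-chimeric T0 plant whose mutations are transmitted through the germline, and equal viability of all genotypes. The names `fraction_desired` and `plants_to_screen` are hypothetical.

```python
import math
from fractions import Fraction

# A hemizygous single-copy transgene is absent from 1/4 of selfed progeny.
P_TRANSGENE_FREE = Fraction(1, 4)

# Fraction of selfed progeny that are homozygous for a mutant allele,
# by T0 genotype at the target locus.
P_HOMOZYGOUS_MUTANT = {
    "heterozygous": Fraction(1, 4),  # mut/WT -> 1/4 mut/mut
    "biallelic": Fraction(1, 2),     # m1/m2  -> 1/4 m1/m1 + 1/4 m2/m2
    "homozygous": Fraction(1, 1),    # mut/mut -> all mut/mut
}


def fraction_desired(t0_genotype):
    """Expected fraction of T1 plants that are transgene-free and
    homozygous for a mutant allele (independent assortment assumed)."""
    return P_TRANSGENE_FREE * P_HOMOZYGOUS_MUTANT[t0_genotype]


def plants_to_screen(p, confidence=0.95):
    """Smallest n such that at least one plant of frequency p is expected
    among n plants with the given confidence: 1 - (1 - p)**n >= confidence."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if p == 1:
        return 1
    return math.ceil(math.log(1 - confidence) / math.log(1 - float(p)))


if __name__ == "__main__":
    for genotype in P_HOMOZYGOUS_MUTANT:
        p = fraction_desired(genotype)
        print(genotype, p, plants_to_screen(p))
```

Under these assumptions, a heterozygous T0 line is expected to yield transgene-free homozygous mutants at a frequency of 1/16, so considerably more T1 plants need to be genotyped than for a biallelic or homozygous T0 line.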
4 Notes
1. CRISPR-PLANT (https://www.genome.arizona.edu/crispr/) is recommended and has an interface optimized for plant genomes. Other options include: CRISPRko (https://portals.broadinstitute.org/gpp/public/analysis-tools/sgrna-design), Benchling (gRNA design function) (https://benchling.com), Desktop Genetics (DESKGEN cloud) (https://www.deskgen.com/landing/cloud.html), and CRISPR-MIT (http://crispr.mit.edu/). 2. Providers may have more than one version of BsaI (e.g., BsaI, BsaI-HF). Some versions are less suitable for one-step digestion/ligation reactions. Check the description of the product to ensure suitability. At the time of writing, we obtain the best results with Eco31I (BsaI) (ER0291 Thermo Fisher Scientific) or BsaI-HF v2 (New England BioLabs). Note that we have repeatedly found that BsaI-HF (New England BioLabs) does not function in one-step digestion/ligation reactions. 3. It is important for PEG to completely dissolve before use. This requires preparation in advance (at least 1 h). Using your hand to warm the tube can accelerate dissolving but avoid exposure
to temperatures above 37 °C. The quality and brand of the PEG is critical; this protocol is optimized to use Sigma-Aldrich 81240 “Poly(ethylene glycol), average Mn 4000, platelets.” 4. If possible, design spacer sequences with a restriction endonuclease (REN) recognition site near the location of the double-stranded break (NNNNNNNNNNNNNNNNN | NNN-NGG). The REN recognition site need not be centered on the site of cleavage but should be close enough so that mutations (indels) will prevent recognition of the sequence by the enzyme. This will be useful if PCR/RE or Sanger sequencing methods will be used to assay the mutagenesis activity (Subheading 3.4). 5. Standard Level 0 parts must not contain recognition sequences for Type IIS enzymes BsaI and BpiI as these will be used in DNA assembly. If present, these should be mutated. A protocol for this so-called domestication of new Level 0 parts (i.e., removing unwanted BsaI/BpiI sites) is described in [21]. 6. A forward primer must be ordered for each unique spacer sequence; the reverse primer need be ordered only once. The nonstandard overlap ATTG encoded by the forward primer (Fig. 1b) is derived from the native 3′ end of the U6-26 promoter from Arabidopsis thaliana. For alternate promoters, a different overlap may be required to maintain integrity of the promoter; when making sgRNAs, simply modify the forward primer to encode an appropriate overlap. 7. Since a similar PCR amplification from the same template will be performed each time a new spacer is integrated into the sgRNA, it is worth the time to optimize the reaction from the outset. 8. (A) To use blue/white screening, the competent E. coli must have a partial deletion in the lacZ gene (lacZΔM15); this is common in many cloning strains (DH5α, DH10B, XL-10 Gold, XL1-Blue, etc.). (B) The stock concentration of X-Gal is 20 mg/mL and is typically dissolved in DMSO or DMF (N,N-dimethylformamide). DMF is more toxic than DMSO, so use caution when handling and especially when making stock solutions. X-Gal is light sensitive and should be stored in the dark (if the reagent has a yellow tinge, it should be discarded). (C) Plates that have already been poured with LB-agar and antibiotic can be made “blue-white compatible” by adding 20 μL of 500 mM IPTG and 20 μL of 20 mg/mL X-Gal to the surface of the LB plate. Spread the liquid and allow to dry before plating the cells. 9. If you wish to assemble more than four sgRNAs (resulting in more than six transcriptional units), do not use a Level 2 acceptor. Instead, select appropriate Level M and P acceptors [40]
which will allow assembly of >6 transcriptional units. If using Level M and P acceptors, the entire cloning strategy will need to be planned in advance to ensure that the correct Level 1 acceptors are used for each transcriptional unit. 10. GV3101 is an appropriate Agrobacterium tumefaciens strain for the pAGM4723 Level 2 backbone, which is kanamycin resistant. If you usually use a different Agrobacterium strain, remember to check for compatibility of the origin of replication as well as antibiotic resistance. 11. An additional strain of A. tumefaciens harboring a plasmid encoding a suppressor of silencing cassette (e.g., P19 from Tomato Bushy Stunt Virus available in plasmid pDGB3alpha2_35S:P19:Tnos (GB1203); Addgene #68214) can be co-infiltrated with the Agrobacterium strain harboring the Cas9-sgRNA construct. Expression of a viral repressor of silencing is known to increase transient expression during Agroinfiltration; however, we routinely detect targeted mutagenesis without co-expression of P19. 12. Optionally, some protocols suggest an additional centrifugation step after incubation: centrifuge at 2200 × g for 15–20 min at room temperature. Discard supernatant. Resuspend cells in 5–10 mL of fresh infiltration medium. Then proceed with step 4. We have found both versions of the protocol to work. 13. If multiple strains of Agrobacterium are being mixed, use a 1:1 ratio of each strain and adjust concentrations so the combined OD600 remains at 0.8. If using a strain containing a plasmid expressing a P19 suppressor of silencing, the ratio of overexpression (Cas9/sgRNA) strains to P19 strains should be 4:1. 14. (A) The rate of plant growth and optimal age for infiltration will depend on growth conditions. Plants should have about three or four fully expanded true leaves but no more. Older plants (>5 weeks) will become more resistant to Agrobacterium resulting in low levels of expression [52]. 
Prior to infiltration, the plant should be well-watered (soil damp but not wet) so that the stomata are open. Plants showing any symptoms of stress from temperature, drought, pests, disease, etc. will not yield good results. Depending on size, each leaf can be infiltrated with four to six separate spots. (B) For infiltration, puncture the leaf with a needle. Next, press the needleless tip of a 1-mL syringe containing the Agrobacterium to the underside of the leaf over the puncture site. Use a gloved finger to push against the syringe tip from the top side of the leaf. Slowly push the solution into the internal space of the leaf. A detailed method with videos describing the technical approach is available from JoVE [41].
Quentin M. Dudley et al.
15. Growth conditions do not need to be standardized during this period. The plant can be kept in the laboratory (e.g., on a windowsill) if temperatures are not extreme. The soil should be kept damp but not wet.
16. Particularly when optimizing delivery to a new species, consider the use of a reporter-tagged Cas9 to confirm delivery and expression. To construct the expression cassette, use a Level 0 Cas9 part without a stop codon (e.g., Addgene #117522) together with a C-terminal reporter such as YFP (e.g., Addgene #117536). The purity of the DNA will affect the transformation efficiency of protoplasts. We find that transformations performed with DNA purified using column-based midi- or maxi-prep kits (such as the Qiagen PlasmidPlus® Midi Kit #12943) are superior to those using mini-prep kits (such as the QIAprep® Kit #27104). The purity of the DNA can be further improved by performing two additional column washes prior to elution of the plasmid DNA. In addition, yield can be maximized by leaving the elution buffer (or water) on the column for 10 min at room temperature before the final centrifugation.
17. For a typical experiment, prepare five to ten 50 μL PCRs and run a PCR cleanup using a commercial kit (such as the Qiagen QIAquick® PCR Purification Kit #28106). For protoplast transformation, the DNA must be highly concentrated. Thus, instead of using a single column per PCR, sequentially pass multiple PCRs mixed with binding buffer over the same column (for example, four to five reactions per column). Proceed with the protocol and use three washes of Buffer PE, which contains ethanol. Add 20 μL of water to the column and elute. Then, take the eluted DNA and pass it a second time over the column to elute additional DNA. The process should yield 20 μL of DNA at a concentration at or exceeding 1 μg/μL. Delivered DNA can range from several hundred bases to several kilobases in length.
18. Do not use biological control pesticides such as nematodes, etc. Do not use any chemicals or grow in conditions that may cause stress to the plant. If Arabidopsis is cultivated using long-day conditions (16 h light, 8 h dark), grow for at least 2 days using short-day conditions (8 h light, 16 h dark) prior to preparation of protoplasts.
19. The speed of degradation of cell walls can be checked using a brightfield microscope.
20. A 15-mL centrifuge tube works better than a 50-mL centrifuge tube for the sucrose cushion. Also, when pipetting protoplasts, it is not absolutely necessary to use wide-orifice or cut tips to minimize shear forces, though this may be a good idea. However, users should pipette more slowly and gently than normal.
Cas9-Mediated Targeted Mutagenesis in Plants
21. Cells can be quantified using a hemocytometer. An experienced user will be able to estimate the concentration by eye. Due to the difficulty of accurately counting cells, we recommend that each experimenter find an optimal protoplast concentration that works in their laboratory. In general, we find that lower concentrations improve the efficiency of transfection (and thus mutagenesis). However, low numbers of cells may make purification of genomic DNA more difficult.
22. A 2-mL microcentrifuge tube seems to mix a little better than a 1.5-mL microcentrifuge tube.
23. Important: To maximize the clarity and sensitivity of the assay, the primers should produce a single band when PCR amplifying from the wild-type plant genome. If multiple bands appear, obtain new primers or optimize the PCR conditions to minimize nonspecific binding. Effort spent optimizing primer design and PCR conditions (even before selecting binding sites for sgRNAs) can save time and energy by ensuring that any mutagenesis can be clearly and accurately detected. Additionally, while proof-reading polymerases are not required for this application, we find that high-fidelity polymerases (e.g., Phusion®, Q5®) tend to produce cleaner gels with less nonspecific amplification.
24. Hypothetically, it is possible to try the reverse strategy. Instead of PCR followed by restriction endonuclease digestion, one could first digest the genomic template to cleave any wild-type sequence and then amplify. The presence of a band should indicate mutagenesis and the absence would indicate wild type. In practice, however, we have found that the restriction digest of genomic DNA is almost always incomplete (likely due to chromatin, methylation, or impurities) and the approach produces many false-positive results that need to be verified by sequencing.
25. The percentage of editing can be estimated by comparing the wild-type and mutated sequencing traces [53, 54] using the web-based software tools TIDE (Desktop Genetics, https://tide.deskgen.com/), Synthego (https://ice.synthego.com/#/analyze/results/ryhtyh6x/2), or DSDecode [55] (http://skl.scau.edu.cn/dsdecode/).
26. Often a second PCR amplification step will be required to get enough template for a Sanger sequencing reaction. Simply use the excised band as template with the same primers.
27. Test multiple leaves from the same plant to ensure the genotype is consistent across the plant.
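The mixing rule in Note 13 is simple arithmetic, and a minimal sketch of it follows. It assumes that OD600 scales linearly with dilution and that the stated ratios (1:1 or 4:1) refer to each strain's share of the combined OD600; both are assumptions of this sketch, not statements from the protocol. The function name `mixing_volumes` and its parameters are hypothetical.

```python
def mixing_volumes(stock_od, parts, final_od=0.8, final_volume_ml=10.0):
    """Return the mL of each culture and of infiltration buffer to combine.

    stock_od: strain name -> measured OD600 of the resuspended culture
    parts:    strain name -> relative share of the combined OD600 (e.g. 4 and 1)
    Assumption: OD600 is proportional to cell density over the range used.
    """
    if set(stock_od) != set(parts):
        raise ValueError("stock_od and parts must name the same strains")
    total_parts = sum(parts.values())
    volumes = {}
    for strain, od in stock_od.items():
        # OD600 that this strain should contribute to the final mixture
        target_od = final_od * parts[strain] / total_parts
        volumes[strain] = target_od * final_volume_ml / od
    buffer_ml = final_volume_ml - sum(volumes.values())
    if buffer_ml < -1e-9:
        raise ValueError("cultures are too dilute to reach the requested OD600")
    volumes["buffer"] = max(buffer_ml, 0.0)
    return volumes


# Example: Cas9/sgRNA strain and P19 strain, both at OD600 1.6, mixed 4:1
print(mixing_volumes({"cas9_sgRNA": 1.6, "p19": 1.6}, {"cas9_sgRNA": 4, "p19": 1}))
```

With two cultures at the same density, the 4:1 case reduces to four volumes of the Cas9/sgRNA strain per volume of the P19 strain, topped up with buffer to the final volume.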
References
1. Baltes NJ, Voytas DF (2015) Enabling plant synthetic biology through genome engineering. Trends Biotechnol 33(2):120–131
2. Bortesi L, Fischer R (2015) The CRISPR/Cas9 system for plant genome editing and beyond. Biotechnol Adv 33(1):41–52
3. Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E (2012) A programmable dual-RNA–guided DNA endonuclease in adaptive bacterial immunity. Science 337(6096):816–821
4. Mali P, Yang L, Esvelt KM, Aach J, Guell M, DiCarlo JE, Norville JE, Church GM (2013) RNA-guided human genome engineering via Cas9. Science 339(6121):823–826
5. Weeks DP, Spalding MH, Yang B (2016) Use of designer nucleases for targeted gene and genome editing in plants. Plant Biotechnol J 14(2):483–495
6. Baltes NJ, Gil-Humanes J, Cermak T, Atkins PA, Voytas DF (2014) DNA replicons for plant genome engineering. Plant Cell 26(1):151–163
7. Li J-F, Norville JE, Aach J, McCormack M, Zhang D, Bush J, Church GM, Sheen J (2013) Multiplex and homologous recombination–mediated genome editing in Arabidopsis and Nicotiana benthamiana using guide RNA and Cas9. Nat Biotechnol 31(8):688
8. Hayut SF, Bessudo CM, Levy AA (2017) Targeted recombination between homologous chromosomes for precise breeding in tomato. Nat Commun 8:15605
9. Wang M, Lu Y, Botella JR, Mao Y, Hua K, Zhu J-k (2017) Gene targeting by homology-directed repair in rice using a geminivirus-based CRISPR/Cas9 system. Mol Plant 10(7):1007–1010
10. Puchta H, Fauser F (2013) Gene targeting in plants: 25 years later. Int J Dev Biol 57(6–8):629–637
11. Zetsche B, Gootenberg JS, Abudayyeh OO, Slaymaker IM, Makarova KS, Essletzbichler P, Volz SE, Joung J, van der Oost J, Regev A (2015) Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell 163(3):759–771
12. Piatek A, Ali Z, Baazim H, Li L, Abulfaraj A, Al-Shareef S, Aouida M, Mahfouz MM (2015) RNA-guided transcriptional regulation in planta via synthetic dCas9-based transcription factors. Plant Biotechnol J 13(4):578–589
13. Komor AC, Kim YB, Packer MS, Zuris JA, Liu DR (2016) Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533(7603):420
14. Zong Y, Wang Y, Li C, Zhang R, Chen K, Ran Y, Qiu J-L, Wang D, Gao C (2017) Precise base editing in rice, wheat and maize with a Cas9-cytidine deaminase fusion. Nat Biotechnol 35(5):438
15. Gao Y, Zhao Y (2014) Self-processing of ribozyme-flanked RNAs into guide RNAs in vitro and in vivo for CRISPR-mediated genome editing. J Integr Plant Biol 56(4):343–349
16. Lee RTH, Ng ASM, Ingham PW (2016) Ribozyme mediated gRNA generation for in vitro and in vivo CRISPR/Cas9 mutagenesis. PLoS One 11(11):e0166020
17. Čermák T, Curtin SJ, Gil-Humanes J, Čegan R, Kono TJ, Konečná E, Belanto JJ, Starker CG, Mathre JW, Greenstein RL (2017) A multi-purpose toolkit to enable advanced genome engineering in plants. Plant Cell 29(6):1196–1217
18. Woo JW, Kim J, Kwon SI, Corvalán C, Cho SW, Kim H, Kim S-G, Kim S-T, Choe S, Kim J-S (2015) DNA-free genome editing in plants with preassembled CRISPR-Cas9 ribonucleoproteins. Nat Biotechnol 33(11):1162
19. Kim H, Kim S-T, Ryu J, Kang B-C, Kim J-S, Kim S-G (2017) CRISPR/Cpf1-mediated DNA-free plant genome editing. Nat Commun 8:14406
20. Liang Z, Chen K, Li T, Zhang Y, Wang Y, Zhao Q, Liu J, Zhang H, Liu C, Ran Y (2017) Efficient DNA-free genome editing of bread wheat using CRISPR/Cas9 ribonucleoprotein complexes. Nat Commun 8:14261
21. Patron NJ, Orzaez D, Marillonnet S, Warzecha H, Matthewman C, Youles M, Raitskin O, Leveau A, Farré G, Rogers C et al (2015) Standards for plant synthetic biology: a common syntax for exchange of DNA parts. New Phytol 208(1):13–19
22. Engler C, Youles M, Gruetzner R, Ehnert T-M, Werner S, Jones JD, Patron NJ, Marillonnet S (2014) A golden gate modular cloning toolbox for plants. ACS Synth Biol 3(11):839–843
23. Sarrion-Perdigones A, Vazquez-Vilar M, Palací J, Castelijns B, Forment J, Ziarsolo P, Blanca J, Granell A, Orzaez D (2013) GoldenBraid 2.0: a comprehensive DNA assembly framework for plant synthetic biology. Plant Physiol 162:1618–1634
24. Pollak B, Cerda A, Delmans M, Álamos S, Moyano T, West A, Gutiérrez RA, Patron N, Federici F, Haseloff J (2018) Loop assembly: a simple and open system for recursive fabrication of DNA circuits. bioRxiv:247593
25. Belhaj K, Chaparro-Garcia A, Kamoun S, Nekrasov V (2013) Plant genome editing made easy: targeted mutagenesis in model and crop plants using the CRISPR/Cas system. Plant Methods 9(1):39
26. Chen B, Gilbert LA, Cimini BA, Schnitzbauer J, Zhang W, Li G-W, Park J, Blackburn EH, Weissman JS, Qi LS (2013) Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system. Cell 155(7):1479–1491
27. Lawrenson T, Shorinola O, Stacey N, Li C, Østergaard L, Patron N, Uauy C, Harwood W (2015) Induction of targeted, heritable mutations in barley and Brassica oleracea using RNA-guided Cas9 nuclease. Genome Biol 16(1):258
28. Fauser F, Schiml S, Puchta H (2014) Both CRISPR/Cas-based nucleases and nickases can be used efficiently for genome engineering in Arabidopsis thaliana. Plant J 79(2):348–359
29. Raitskin O, Schudoma C, West A, Patron NJ (2019) Comparison of efficiency and specificity of CRISPR-associated (Cas) nucleases in plants: an expanded toolkit for precision genome engineering. PLoS One 14(2):e0211598
30. Steinert J, Schiml S, Fauser F, Puchta H (2015) Highly efficient heritable plant genome engineering using Cas9 orthologues from Streptococcus thermophilus and Staphylococcus aureus. Plant J 84(6):1295–1305
31. Tang X, Lowder LG, Zhang T, Malzahn AA, Zheng X, Voytas DF, Zhong Z, Chen Y, Ren Q, Li Q (2017) A CRISPR–Cpf1 system for efficient genome editing and transcriptional repression in plants. Nat Plants 3(3):17018
32. Zhang Q, Xing H-L, Wang Z-P, Zhang H-Y, Yang F, Wang X-C, Chen Q-J (2018) Potential high-frequency off-target mutagenesis induced by CRISPR/Cas9 in Arabidopsis and its prevention. Plant Mol Biol 96(4–5):445–456
33. Li Z, Liu Z-B, Xing A, Moon BP, Koellhoffer JP, Huang L, Ward RT, Clifton E, Falco SC, Cigan AM (2015) Cas9-guide RNA directed genome editing in soybean. Plant Physiol 169(2):960–970
34. Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, Smith I, Tothova Z, Wilen C, Orchard R (2016) Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat Biotechnol 34(2):184
35. e Silva NV, Patron NJ (2017) CRISPR-based tools for plant genome engineering. Emerg Top Life Sci 1(2):135–149
36. Yan L, Wei S, Wu Y, Hu R, Li H, Yang W, Xie Q (2015) High-efficiency genome editing in Arabidopsis using YAO promoter-driven CRISPR/Cas9 system. Mol Plant 8(12):1820–1823
37. Wang Z-P, Xing H-L, Dong L, Zhang H-Y, Han C-Y, Wang X-C, Chen Q-J (2015) Egg cell-specific promoter-controlled CRISPR/Cas9 efficiently generates homozygous mutants for multiple target genes in Arabidopsis in a single generation. Genome Biol 16(1):144
38. Vazquez-Vilar M, Bernabé-Orts JM, Fernandez-del-Carmen A, Ziarsolo P, Blanca J, Granell A, Orzaez D (2016) A modular toolbox for gRNA–Cas9 genome engineering in plants based on the GoldenBraid standard. Plant Methods 12:10
39. Castel B, Tomlinson L, Locci F, Yang Y, Jones JD (2019) Optimization of T-DNA architecture for Cas9-mediated mutagenesis in Arabidopsis. PLoS One 14(1):e0204778
40. Werner S, Engler C, Weber E, Gruetzner R, Marillonnet S (2012) Fast track assembly of multigene constructs using Golden Gate cloning and the MoClo system. Bioengineered 3(1):38–43
41. Leuzinger K, Dent M, Hurtado J, Stahnke J, Lai H, Zhou X, Chen Q (2013) Efficient agroinfiltration of plants for high-level transient expression of recombinant proteins. J Vis Exp (77)
42. Yoo S-D, Cho Y-H, Sheen J (2007) Arabidopsis mesophyll protoplasts: a versatile cell system for transient gene expression analysis. Nat Protoc 2(7):1565
43. Zischewski J, Fischer R, Bortesi L (2017) Detection of on-target and off-target mutations generated by CRISPR/Cas9 and other sequence-specific nucleases. Biotechnol Adv 35(1):95–104
44. Vouillot L, Thélie A, Pollet N (2015) Comparison of T7E1 and surveyor mismatch cleavage assays to detect mutations triggered by engineered nucleases. G3 5(3):407–415
45. Distefano G, Caruso M, La Malfa S, Gentile A, Wu S-B (2012) High resolution melting analysis is a more sensitive and effective alternative to gel-based platforms in analysis of SSR – an example in citrus. PLoS One 7(8):e44202
46. Bassett AR, Tibbit C, Ponting CP, Liu J-L (2013) Highly efficient targeted mutagenesis of Drosophila with the CRISPR/Cas9 system. Cell Rep 4(1):220–228
47. Liang Z, Chen K, Yan Y, Zhang Y, Gao C (2018) Genotyping genome-edited mutations in plants using CRISPR ribonucleoprotein complexes. Plant Biotechnol J 16(12):2053–2062
48. Allen G, Flores-Vergara M, Krasynanski S, Kumar S, Thompson W (2006) A modified protocol for rapid DNA isolation from plant tissues using cetyltrimethylammonium bromide. Nat Protoc 1(5):2320
49. Chari R, Mali P, Moosburner M, Church GM (2015) Unraveling CRISPR-Cas9 genome engineering parameters via a library-on-library approach. Nat Methods 12(9):823
50. Nekrasov V, Wang C, Win J, Lanz C, Weigel D, Kamoun S (2017) Rapid generation of a transgene-free powdery mildew resistant tomato by genome deletion. Sci Rep 7(1):482
51. Peterson BA, Haak DC, Nishimura MT, Teixeira PJ, James SR, Dangl JL, Nimchuk ZL (2016) Genome-wide assessment of efficiency and specificity in CRISPR/Cas9 mediated multiple site targeting in Arabidopsis. PLoS One 11(9):e0162169
52. Saur IM, Kadota Y, Sklenar J, Holton NJ, Smakowska E, Belkhadir Y, Zipfel C, Rathjen JP (2016) NbCSPR underlies age-dependent immune responses to bacterial cold shock protein in Nicotiana benthamiana. Proc Natl Acad Sci U S A 113(12):3389–3394
53. Brinkman EK, Chen T, Amendola M, van Steensel B (2014) Easy quantitative assessment of genome editing by sequence trace decomposition. Nucleic Acids Res 42(22):e168
54. Yin H, Song C-Q, Suresh S, Kwan S-Y, Wu Q, Walsh S, Ding J, Bogorad RL, Zhu LJ, Wolfe SA (2018) Partial DNA-guided Cas9 enables genome editing with reduced off-target activity. Nat Chem Biol 14(3):311
55. Liu W, Xie X, Ma X, Li J, Chen J, Liu Y-G (2015) DSDecode: a web-based tool for decoding of sequencing chromatograms for genotyping of targeted mutations. Mol Plant 8(9):1431–1433
Chapter 2
Design of Multiplexing CRISPR/Cas9 Constructs for Plant Genome Engineering Using the GoldenBraid DNA Assembly Standard
M. Vazquez-Vilar, P. Juarez, J. M. Bernabé-Orts, and D. Orzaez
Abstract
Due to the huge potential of CRISPR/Cas9 for synthetic biology and genome engineering, many plant researchers are adopting this technology in their laboratories. CRISPR/Cas9 allows multiplexing of guide RNAs (gRNAs), therefore targeting several loci in the genome simultaneously. However, making DNA constructs for this purpose is not always straightforward for first-time users. Here we show how to make multiplex CRISPR/Cas9 constructs using the GoldenBraid (GB) DNA assembly system. As an example, we create a polycistronic gRNA construct that guides a dead version of Cas9 to three different positions of the nopaline synthase promoter, leading to transcriptional repression. After a description of the reagents, the protocol describes step-by-step the considerations for DNA target selection and the molecular cloning process of the final T-DNA construct as well as its testing by transient expression in Nicotiana benthamiana leaves along with a reporter construct for luciferase expression.
Key words Multiplex CRISPR/Cas9, GoldenBraid DNA assembly, Transcriptional repression, Transient expression
1 Introduction
CRISPR/Cas9 is a powerful system for genome engineering in plants [1–3]. The exchange of a 20-nucleotide sequence of the guide RNA (gRNA) is enough for reprogramming the Cas9 to target an alternative locus. Large genome engineering projects require the delivery of multiple gRNAs in the same T-DNA to enable, e.g., the generation of multiple gene knockouts or the creation of artificial allelic series. Furthermore, it has been shown that gRNA multiplexing using a dead Cas9 version fused to transcriptional regulators leads to strong activation/repression of target genes, and the extension of this strategy could enable the fine-tuned regulation of entire metabolic pathways and/or regulatory cascades [4–6].
Matias D. Zurbriggen (ed.), Plant Synthetic Biology, Methods in Molecular Biology, vol. 2379, https://doi.org/10.1007/978-1-0716-1791-5_2, © Springer Science+Business Media, LLC, part of Springer Nature 2022
M. Vazquez-Vilar et al.
Multiplex genome editing can be achieved in its simplest way by cloning several RNAPolIII promoter-gRNA expression units in the same T-DNA. Alternatively, other strategies involve the construction of polycistronic gRNAs that undergo post-transcriptional processing to yield functional gRNAs, therefore expanding the multiplexing capabilities of CRISPR/Cas9. These strategies require flanking each gRNA by processing signals, such as Csy4 sites [7], self-cleaving ribozymes [8], or ribonuclease recognition sites [9], which end up generating individual functional gRNAs from the polycistronic transcript. Polycistronic strategies minimize the repetition of the same RNAPolIII promoter for the expression of each gRNA, reducing the risk of triggering transcriptional silencing. Not all DNA assembly methods used in multigene engineering are equally suitable for polycistronic gRNA assembly. For instance, recombination-based methods, such as Gibson, are not appropriate for the assembly of repetitive DNA parts [10]. Therefore, the majority of the assembly strategies currently employed for creating gRNA multiplex constructs rely on Golden Gate, a cloning strategy based on type IIS restriction enzymes [11, 12]. Here, we describe the assembly of a T-DNA vector for the expression of a polycistronic gRNA along with the dead Cas9 (dCas9) using the GoldenBraid (GB) approach, an iterative Golden Gate strategy specially designed for multigene engineering in plants [13–15]. GB-assembled polycistronic gRNAs are processed by RNase P and RNase Z, releasing individual gRNAs and avoiding promoter repetition for the expression of each gRNA. Furthermore, the iterative GB schema allows multiple combinations of transcriptional units comprising polycistronic gRNA units and/or single gRNA cassettes, next to transcriptional units encoding an active Cas9 nuclease or dCas9-based programmable transcription factors in multiple versions [16].
Here we present a detailed protocol for first-time users on how to assemble a T-DNA multigene construct encoding a CRISPR/Cas9-based programmable transcriptional repressor. This construct comprises a three-gRNA polycistronic cassette next to the dCas9 expression unit driven by the CaMV35S promoter. In this example, we chose to target the dCas9 to the nopaline synthase (nos) promoter at three different positions, and we tested the ability of the designed nucleoprotein complexes to downregulate luciferase levels driven by Pnos in transient expression experiments carried out in N. benthamiana leaves. GB is a hierarchical assembly method that follows the previously described phytobrick standard for Plant Synthetic Biology [17]. Phytobricks are basic DNA elements (also known as level 0 parts) cloned in a standard entry vector. Upon BsaI cleavage, fragments are released flanked by four-nucleotide overhangs and assembled together in multipartite Golden Gate reactions. The sequence of the flanking overhangs defines the position of each
Multiplexing CRISPR/Cas9 Constructs using GoldenBraid
DNA part in the final assembly. Only the overhangs for level 0 parts are defined in the phytobrick standard, but level 0 parts can themselves be created by assembling two or more lower hierarchy (level -1) nonstandard parts. Conversely, the assembly of several level 0 parts produces higher hierarchy constructs (level 1 parts, i.e., transcriptional units). From this point, GB follows an iterative strategy to produce higher level multigenic constructs employing only Golden Gate reactions. The CRISPR/Cas GB workflow is depicted in Fig. 1. It starts with the annealing of the two complementary oligonucleotides containing the user-selected target sequence. The annealed oligonucleotides are assembled together with the “tRNA” and the “scaffold” sequences stored as level -1 parts, creating a “tRNA-protospacer-scaffold” (“tps”) level 0 part stored in the pUPD2
Fig. 1 General workflow of the multiplexing gRNA cloning strategy with GoldenBraid. Targets are adapted to the GB standard by overhang addition. The oligonucleotide heteroduplex is combined with standard level -1 parts to create individual tRNA-protospacer-scaffold plasmids that are combined in level 1 polycistronic tRNA-gRNA expression cassettes. The binary combination of two polycistrons incorporates a 2D multiplexing step in the CRISPR cloning workflow. Additionally, polycistronic tRNA-gRNAs can be combined with Cas9 transcriptional units
standard entry plasmid. The “tRNA” and the “scaffold” are both provided by the GB kit (available at https://www.addgene.org/kits/orzaez-goldenbraid2/). These level -1 parts are both flanked by BsmBI sites and therefore a Golden Gate reaction with this enzyme followed by blue/white screening is needed for the selection of correctly assembled “tps” level 0 parts. In a 3× multiplex strategy, three of these level 0 assemblies need to be created, one for each position in the polycistronic transcript. Once three compatible “tps” level 0 parts are created, they can be combined together with the Arabidopsis thaliana RNAPolIII promoter of choice (U6-26 or U6-1) in a standard binary GB level α destination vector (e.g., pDGB3α2, available with the GB kit), yielding the final 3× polycistronic gRNA expression cassette. All level 0 parts are flanked by BsaI sites, and therefore, this Golden Gate reaction requires the use of BsaI. Finally, the gRNA expression cassette cloned into pDGB3α2 can be combined with the dCas9 transcriptional unit previously assembled in a compatible binary GB level α plasmid (e.g., pDGB3α1, provided) to form a final T-DNA construct ready for transcriptional repression assays.
2 Materials
1. A selection of GoldenBraid plasmids: The GB DNA assembly collection includes a set of plasmids for Cas9-mediated genome engineering. This set of plasmids is described in [16], and some of them will be used in this example. All plasmids listed in this section are available from Addgene at https://www.addgene.org/kits/orzaez-goldenbraid2/.
(a) Level -1 tRNA-scaffold (“ts”) plasmids for polycistronic gRNA assemblies. GB1205, GB1206, and GB1207 are the plasmids providing the level -1 parts for the assembly of gRNAs for positions 1, 2, and 3 of a 3× polycistronic gRNA transcript, respectively. All three are ampicillin-resistant plasmids and contain, in a single entry vector, both the “tRNA” and the “scaffold” level -1 parts, all flanked by BsmBI restriction sites. Conversely, GB1207 (position 2) and GB1208 (position 1) provide the necessary level -1 parts for creating 2× gRNA polycistrons. A detailed description of the “ts” plasmids can be found in Table 1.
(b) RNA polymerase III promoters as level 0 GB parts. GB1001 and GB1204 are phytobricks that contain the A. thaliana U6-26 and U6-1 RNAPolIII promoters, respectively.
Table 1 GB level -1 tRNA-scaffold “ts” plasmids for the assembly of polycistronic gRNAs
GB vector | Name | Description
GB1205 | tRNA-scaffold [D1-2] | tRNA and scaffold for the assembly of “tRNA-protospacer-scaffold” for position 1 (position [D1_2]) of a 2× or 3× polycistronic gRNA regulated by the A. thaliana U6-26 or U6-1 promoter
GB1206 | tRNA-scaffold [2-(n-1)] | tRNA and scaffold for the assembly of “tRNA-protospacer-scaffold” for position 2 (position [2_n-1]) of a 3× polycistronic gRNA
GB1207 | tRNA-scaffold [n] | tRNA and scaffold for the assembly of “tRNA-protospacer-scaffold” for the last position (position [n]) of a 3× or 2× polycistronic gRNA
GB1208 | tRNA-scaffold [D1-(n-1)] | tRNA and scaffold for the assembly of “tRNA-protospacer-scaffold” for position 1 (position [D1_n-1]) of a 2× polycistronic gRNA regulated by the A. thaliana U6-26 or U6-1 promoter
(c) Cas9/dCas9 transcriptional units. GB0639 is a GB plasmid that contains a transcriptional unit (TU) for the expression of the human codon-optimized Cas9 endonuclease driven by the CaMV35S promoter; GB1191 contains an equivalent TU for the expression of its catalytically inactive version (dCas9) (see Note 1).
(d) GB destination plasmids. The Universal Part Domestication plasmid (pUPD2) is the GB destination vector for level 0 “tps” parts assembly. The α level vectors (pDGB3α1 and α2) are the destination vectors for transcriptional units and multiplex gRNA expression cassettes, and Ω level vectors (pDGB3Ω1 and Ω2) are destination vectors for combinations of multiple TUs. All pDGBs are binary vectors ready for plant transformation.
2. DNA sequence editor software for plasmid design (Benchling and www.gbcloning.org).
3. gRNA oligonucleotides listed in Table 2.
4. Restriction enzymes: BsmBI and BsaI (NEB) for GoldenBraid reactions and BsmBI and SpeI for gene editing efficiency test.
5. T4 DNA ligase and 10× T4 ligase buffer (Promega).
6. A thermocycler for the GoldenBraid reactions.
7. Competent DH5α Escherichia coli cells (prepared with the Mix&Go Kit from Zymo Research). Chemically competent or electrocompetent cells can also be used.
8. A 37 °C incubator and a 37 °C shaking incubator.
9. Sterile SOC liquid medium: 2% tryptone, 0.5% yeast extract, 10 mM sodium chloride, 2.5 mM potassium chloride, 10 mM magnesium chloride, 10 mM magnesium sulfate, 20 mM glucose.
Table 2 Oligonucleotides for Pnos “tps” plasmid assembly with the GB-multiplexing strategy
gRNA | Oligonucleotide sequence (5′→3′)
gRNA1-forward | 5′-GTGCaAGACTCTAATTGGATACCG-3′
gRNA1-reverse | 5′-AAACCGGTATCCAATTAGAGTCTt-3′
gRNA2-forward | 5′-GTGCaACGTTCCATAAATTCCCCT-3′
gRNA2-reverse | 5′-AAACAGGGGAATTTATGGAACGTt-3′
gRNA3-forward | 5′-GTGCaACTTTTGAACGCGCAATAA-3′
gRNA3-reverse | 5′-AAACTTATTGCGCGTTCAAAAGTt-3′
10. Sterile LB broth: 1% tryptone, 0.5% yeast extract, 1% sodium chloride.
11. LB agar plates: 1% tryptone, 0.5% yeast extract, 1% sodium chloride, 1.5% agar.
12. Antibiotic stocks: 10 mg/mL chloramphenicol in 100% ethanol (use at 10 μg/mL), 50 mg/mL kanamycin (use at 50 μg/mL), 50 mg/mL spectinomycin (use at 50 μg/mL), and 50 mg/mL rifampicin (use at 50 μg/mL).
13. 0.5 M IPTG stock (use at 0.5 mM), and 20 mg/mL X-Gal in DMSO (use at 40 μg/mL) for blue/white selection.
14. E.Z.N.A. Plasmid Mini Kit (Omega Bio-tek) or equivalent.
15. Electrocompetent GV3101 Agrobacterium tumefaciens cells, BTX™-Harvard Apparatus ECM™ 399 electroporator, and 1 mm gap EP-101 electroporation cuvettes (Cell Projects) or equivalent.
16. Agarose gel equipment and supplies.
17. NanoDrop for DNA quantification.
18. Agro-infiltration buffer: 10 mM MES pH 5.6, 10 mM magnesium chloride, 200 μM acetosyringone.
19. Rolling mixer.
20. Spectrophotometer set at a wavelength of 600 nm to measure absorbance and plastic cuvettes.
21. Sterile 1-mL Plastikpak syringes without needle.
22. 4- to 5-week-old N. benthamiana plants (growing conditions: 24 °C day/20 °C night in a 16 h light/8 h dark cycle).
23. Dual-Glo® Luciferase Assay System (Promega).
24. Mixer Mill M400 (Retsch).
25. GloMax 96 Microplate Luminometer (Promega).
26. BioSpec 3.2 mm stainless steel beads or equivalent.
27. CTAB buffer: 2% (w/v) CTAB (cetyl trimethyl-ammonium bromide), 1.4 M NaCl, 20 mM EDTA pH 8.0, 100 mM Tris–HCl pH 8.0, 2% (v/v) β-mercaptoethanol, 0.0115% (w/v) RNaseA.
28. Water bath at 65 °C.
29. Chloroform:isoamyl alcohol (24:1).
30. Isopropanol.
31. 80% ethanol (ice-cold).
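The gRNA oligonucleotides of item 3 (Table 2) all follow one construction, which Subheading 3.3 describes: the forward oligonucleotide is GTGC, a lowercase a, and the target sequence; the reverse oligonucleotide is AAAC, the reverse complement of the target, and a lowercase t. The following is a minimal sketch of that rule. The function names are hypothetical, and the function deliberately does not enforce a target length; the `target` argument is the sequence that follows the lowercase a in the forward oligonucleotide.

```python
# Complement table for unambiguous DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_grna_oligos(target):
    """Build the forward and reverse oligonucleotides (5'->3') for one target.

    The lowercase a/t marks the extra base pair added next to the overhang,
    matching the notation used in Table 2.
    """
    target = target.upper()
    if not target or set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, and T")
    forward = "GTGC" + "a" + target
    reverse = "AAAC" + reverse_complement(target) + "t"
    return forward, reverse


# Example: the first Pnos target from Table 2
fwd, rev = design_grna_oligos("AGACTCTAATTGGATACCG")
print(fwd)  # GTGCaAGACTCTAATTGGATACCG
print(rev)  # AAACCGGTATCCAATTAGAGTCTt
```

After annealing, the two oligonucleotides leave the four-nucleotide GTGC and AAAC overhangs single-stranded, which is why only the target portion is reverse-complemented.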
3 Methods
Here we offer guidance for untrained users by providing a detailed protocol for simultaneously targeting three positions on the Pnos for dCas9-based transcriptional repression.
3.1 Choice of the Gene(s) of Interest for Editing, Activation, or Repression
In the example shown here, we targeted three positions on the nopaline synthase gene (nos) promoter for transcriptional regulation studies. The nos promoter region (Pnos) is widely used as a promoter in selection marker cassettes for plant transformation. Pnos confers intermediate levels of transcriptional activity, between 10 and 15 times lower than the strong CaMV35S promoter [15]. Moderate and weak promoters like Pnos are required in several genetic engineering applications, e.g., for the expression of specific ratios of enzymes of certain pathways. Rather than a fixed transcription rate, it is sometimes useful to modulate promoter expression using artificial Cas9-based transcription regulators. By targeting dCas9 to different sites of the Pnos promoter and measuring the changes in its transcriptional activity with a luciferase reporter, we aim to understand the requirements for dCas9-mediated transcriptional regulation.
3.2 Considerations Concerning the Selection of Target Sites
Target sites for Cas9 action can be found near most genomic loci. However, the choice of the intended target site and the selection criteria of gRNAs require careful consideration and fall beyond the scope of this methods paper. Therefore, here we will only provide some basic guidelines. Refer to recent reviews [18, 19] for more detailed guidelines.
General considerations:
1. Considerations regarding PAM: A trinucleotide PAM (5′-NGG-3′) is required next to the 20-nucleotide target site to ensure that it will be recognized by Cas9. Several tools (such as Benchling or CRISPR-P) are available for gRNA design. They scan the gene of interest for putative target sites having a PAM motif. 5′-CGG-3′ motifs were shown to be more efficient than the general 5′-NGG-3′ motif [19].
2. Considerations regarding guide scores: Select guides with the highest on-target score and lowest off-target score, as calculated with ad hoc algorithms. Software tools such as Benchling or CRISPR-P provide on-target scores. Off-target scores can only be calculated if the genomic sequence of the target plant is available.
3. Considerations regarding the presence of a G in the 5′ position of the gRNA: This is often a requirement imposed by the transcription start site (TSS) of RNAPolIII promoters. In the multiplexing approach described here, the 20-nucleotide guide sequences are not located next to a TSS: they are processed instead using a tRNA cleavage strategy. Therefore, there is no need for a G as the starting nucleotide of the 20-nucleotide guide sequence, and the PAM (NGG) is the only requirement for the target site to be recognized by Cas9.
4. Considerations regarding the region of the gene to be targeted: For knockouts, a general rule is to target the first exon, paying attention to downstream in-frame ATGs which could function as alternative translation start sites. Double gRNA constructs are often used to increase knockout efficiency. For transcriptional activation, target sites are selected in the promoter area, immediately upstream (100–250 bp) of the TSS. For transcriptional repression, it is generally advised to target the 5′-UTR or even the start of the coding region, although the experimental evidence supporting these considerations is scarce in plants and mainly extrapolated from experience in non-plant systems.
3.3 Guide RNA Design for Multiplexing with GoldenBraid
After target selection according to the criteria explained in Subheading 3.2, the target nucleotide sequence needs to be adapted by adding the proper nucleotide overhangs in order to assemble the “tps” level 0 vectors (Fig. 2a). In the GB multiplexing scheme, protospacer sequences are assembled together with the tRNA and the Cas9 scaffold as oligonucleotide heteroduplexes. The overhangs of the heteroduplex anneal to the BsmBI sticky ends of the tRNA and the Cas9 scaffold.
1. Add the prefix GTGC followed by an A to the 5′ end of the selected target sequence (see Notes 2 and 3), resulting in the following sequence: 5′-GTGCaN1N2N3N4N5N6N7N8N9N10N11N12N13N14N15N16N17N18N19N20-3′, where N1 to N20 are the nucleotides of the target sequence. This will constitute the gRNA forward primer. The forward gRNA oligonucleotides for the Pnos selected targets are listed in Table 2.
2. Add AAAC to the 5′ end of the reverse complement of the target sequence and a T to the 3′ end (see Notes 2 and 4)
Multiplexing CRISPR/Cas9 Constructs using GoldenBraid
Fig. 2 Schematic representation of primer design and cloning steps. (a) Primer design for the generation of gRNAs. In green and blue, the standard overhangs that need to be incorporated into the forward and reverse primers, respectively, for their later annealing. (b) Cloning steps for the assembly of multiplexing constructs. Reaction 1 is performed in level 1 and consists of the multipartite assembly of the tRNA, the protospacer, and the scaffold. A total of three parallel BsmBI/T4 ligase reactions take place, each of them involving the pUPD2 vector of the GB collection containing a LacZ cassette flanked by the overhangs A and D, one of the three previously annealed gRNAs designed with the overhangs B and C, and one of the three different vectors corresponding to the three different gRNA positions: GB1205 for position 1, GB1206 for position 2, and GB1207 for position 3. Each vector contains the tRNA flanked by the overhangs A and B and the scaffold flanked by the overhangs C and D. The outcome of these reactions is three different vectors for the three different gRNA positions, each containing tRNA + gRNA + scaffold. All three vectors are incorporated in the following reaction. Reaction 2: This reaction takes place in level 0. The three different vectors generated in Reaction 1, namely GB1315 [flanked by specific overhangs for position 1 (1 and 2)], GB1626 [flanked by specific overhangs for
resulting in the following sequence: 5′-AAACN20N19N18N17N16N15N14N13N12N11N10N9N8N7N6N5N4N3N2N1t-3′, where N20 to N1 represent the reverse complement of the target sequence. This will be the gRNA reverse primer. The reverse gRNA oligonucleotides for the Pnos selected targets are listed in Table 2.
3.4 Cloning of Level 0 tRNA-Protospacer-Scaffold “tps” Parts
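Before moving on to the cloning steps, the oligonucleotide design rules of Subheading 3.3 can be sketched in code. This is a minimal illustration, not part of the GB software tools: the function names are hypothetical, the example target is arbitrary, and the site check assumes that the top strand of the ligated insert reads GTGCA + target + GTTT (GTTT being the reverse complement of the AAAC overhang). Sites created further into the flanking tRNA or scaffold sequence are not covered.

```python
# Sketch of the gRNA oligo design rules (Subheading 3.3) and of the
# illegal-site check from Note 2. All helper names are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sequences of the two type IIS enzymes, in both orientations.
FORBIDDEN_SITES = {
    "BsaI": ("GGTCTC", "GAGACC"),
    "BsmBI": ("CGTCTC", "GAGACG"),
}


def _validate(target: str) -> str:
    """Upper-case the target and make sure it is 20 nt of A/C/G/T."""
    target = target.upper()
    if len(target) != 20 or set(target) - set("ACGT"):
        raise ValueError("target must be exactly 20 nt of A/C/G/T (PAM excluded)")
    return target


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_grna_oligos(target: str) -> tuple:
    """Return (forward, reverse) oligos, both written 5'->3'."""
    target = _validate(target)
    forward = "GTGC" + "A" + target                      # 5'-GTGCa-N1..N20-3'
    reverse = "AAAC" + reverse_complement(target) + "T"  # 5'-AAAC-N20..N1-t-3'
    return forward, reverse


def illegal_sites(target: str) -> list:
    """List enzymes whose site appears in the assumed top strand of the insert."""
    top_strand = "GTGCA" + _validate(target) + "GTTT"
    return [
        enzyme
        for enzyme, sites in FORBIDDEN_SITES.items()
        if any(site in top_strand for site in sites)
    ]


if __name__ == "__main__":
    fwd, rev = design_grna_oligos("GATTACAGATTACAGATTAC")  # arbitrary example target
    print("forward:", fwd)
    print("reverse:", rev)
    print("illegal sites:", illegal_sites("GATTACAGATTACAGATTAC"))
```

If `illegal_sites` returns a non-empty list, an alternative target sequence should be chosen, as stated in Note 2.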
The first assembly step is to clone the individual “tps” cassettes in pUPD2, the GB level 0 entry plasmid, by a restriction–ligation reaction (Fig. 2b). This is a single-pot reaction that requires a level 1 tRNA-scaffold “ts” plasmid containing both the tRNA and the Cas9 scaffold flanked by BsmBI sites, the pUPD2 destination vector, and the oligonucleotide pair designed in the previous section, which serves as the protospacer for the gRNA. Different “ts” level 1 vectors are available for the assembly of individual gRNAs for the different positions in the polycistronic RNA (see Table 1).
1. Dilute the oligonucleotides designed in the previous step (see Subheading 3.3) to 1 μM in MilliQ water.
2. Mix 5 μL of the forward oligonucleotide (gRNA1-forward in this example) and 5 μL of the reverse oligonucleotide (gRNA1-reverse in this example) and let them anneal at room temperature for 30 min.
3. Set up the restriction–ligation reaction by mixing 20 femtomoles of the “ts” level 1 plasmid for position 1, GB1205, 20 femtomoles of pUPD2, 60 femtomoles of the annealed oligonucleotides from step 2, 8 U of BsmBI, 3 U of T4 ligase, and 1.5 μL of 10× T4 ligase buffer in a 15 μL reaction.
4. Incubate the BsmBI restriction–ligation reaction in a thermocycler with the following program: 37 °C 10 min + 25 × (37 °C 3 min + 16 °C 4 min) + 50 °C 10 min + 80 °C 10 min.
5. Mix 1 μL of the reaction with 50 μL of Mix&Go cells previously thawed on ice. Let them stand on ice for 2 min. Recover the cells by adding 500 μL of SOC, transfer the cells to a 1.5 mL sterile tube, and shake for 1 h at 37 °C. Spread two volumes
Fig. 2 (continued) position 2 (2 and 3)], GB1623 [flanked by specific overhangs for position 3 (3 and 4)] are incorporated in a BsaI/T4 ligase reaction together with the vector GB1001 containing the U6-26 promoter flanked by the overhangs 0 and 1 and the pDGB3α1 vector containing a LacZ cassette flanked by the overhangs 0 and 4. The outcome of this reaction is a single construct (multiplexing module) containing the U6-26 promoter + 3 gRNAs. Reaction 3: This reaction takes place in level 1. The vector GB1602, which contains the multiplexing module flanked by the overhangs 5 and 6 and was generated in Reaction 2, is now incorporated in a new BsmBI/T4 ligase reaction together with the vector GB1191 containing 35S + dCas9 + Tnos flanked by the overhangs 6 and 7, to be assembled in the GB vector pDGB3Ω1
(50 and 500 μL) of cells on LB/chloramphenicol/IPTG/X-Gal Petri dishes. Incubate the plates overnight at 37 °C.
6. Pick two white colonies into 2 mL of LB/chloramphenicol and grow the cultures overnight in a shaker at 37 °C.
7. Isolate the plasmid from the cultures following the manufacturer’s instructions and perform a restriction analysis of the purified plasmids.
8. Sequence the plasmid showing the expected restriction bands with a primer that anneals to pUPD2 (see Note 5).
9. Repeat the process for the assembly of each individual “tps” plasmid (see Note 6).
3.5 Level 1 Polycistronic gRNA Cassette Assembly
The next cloning step involves the assembly of the polycistronic gRNA in a GB level 1 destination vector. Individual “tps” level 0 plasmids assembled in compatible vectors can be combined with an RNA Pol III promoter for the assembly of the 3× polycistronic gRNA expression cassette (Fig. 2b).
1. Set up the restriction–ligation reaction by mixing 20 femtomoles of pUPD_U6-26 (GB1001), 20 femtomoles of each pUPD2_tps plasmid (GB1626, GB1623, and GB1315 in this example) (see Note 7), 20 femtomoles of the pDGB3α2 destination plasmid, 8 U of BsaI, 3 U of T4 ligase, and 1.5 μL of 10× T4 ligase buffer in a 15 μL reaction.
2. Incubate the BsaI restriction–ligation reaction in a thermocycler with the following program: 37 °C 10 min + 25 × (37 °C 3 min + 16 °C 4 min) + 50 °C 10 min + 80 °C 10 min.
3. Mix 1 μL of the reaction with 50 μL of Mix&Go cells previously thawed on ice. Let them stand on ice for 2 min. Recover the cells by adding 500 μL of SOC, transfer the cells to a 1.5 mL sterile tube, and shake for 1 h at 37 °C. Spread two volumes (50 and 500 μL) of cells on LB/kanamycin/IPTG/X-Gal Petri dishes. Incubate the plates overnight at 37 °C.
4. Pick two white colonies into 2 mL of LB/kanamycin and grow the cultures overnight in a shaker at 37 °C.
5. Isolate the plasmids from the cultures following the manufacturer’s instructions and perform a restriction analysis of the extracted plasmids to verify correct assembly (see Note 8).
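The restriction–ligation reactions above are specified in femtomoles, whereas plasmid preparations are usually quantified in ng/μL. The sketch below converts between the two, assuming an average mass of about 650 g/mol per base pair of double-stranded DNA. The plasmid length and concentration in the example are made-up placeholders, not values for the GB plasmids, and the function names are hypothetical.

```python
# Convert the femtomole amounts used in the restriction-ligation setups
# into nanograms and pipetting volumes. Assumption: ~650 g/mol per bp of dsDNA.

AVG_BP_MASS = 650.0  # g/mol per base pair (approximation)


def fmol_to_ng(fmol: float, length_bp: int) -> float:
    """Mass in ng of `fmol` femtomoles of a dsDNA molecule of `length_bp` bp."""
    # 1 fmol x 1 g/mol = 1e-15 g = 1e-6 ng
    return fmol * length_bp * AVG_BP_MASS * 1e-6


def volume_ul(fmol: float, length_bp: int, conc_ng_per_ul: float) -> float:
    """Volume in microliters to pipette from a stock of known concentration."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return fmol_to_ng(fmol, length_bp) / conc_ng_per_ul


if __name__ == "__main__":
    # Hypothetical 3,000 bp plasmid at 50 ng/uL; 20 fmol are needed.
    print(round(fmol_to_ng(20, 3000), 1), "ng")
    print(round(volume_ul(20, 3000, 50.0), 2), "uL")
```

Summing the volumes of all parts, enzymes, and buffer is a quick way to check that the reaction still fits in the 15 μL total.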
3.6 Final T-DNA Expression Vector Assembly by Combining the Polycistronic gRNA with the dCas9 TU
The last assembly step is the combination of the 3× polycistronic gRNA expression cassette with the dCas9 TU in a GB binary reaction using any pDGB3Ω as destination vector (Fig. 2c).
1. Set up the restriction–ligation reaction by mixing 20 femtomoles of pEGB3α1_35s:dCas9:Tnos (GB1191) (see Note 9),
20 femtomoles of pEGB3α2_U6-26:gRNA1:gRNA2:gRNA3 (GB1602), 20 femtomoles of the pDGB3Ω1 destination plasmid, 8 U of BsmBI, 3 U of T4 ligase, and 1.5 μL of 10× T4 ligase buffer in a 15 μL reaction.
2. Incubate the BsmBI restriction–ligation reaction in a thermocycler with the following program: 37 °C 10 min + 25 × (37 °C 3 min + 16 °C 4 min) + 50 °C 10 min + 80 °C 10 min.
3. Mix 1 μL of the reaction with 50 μL of Mix&Go cells previously thawed on ice. Let them stand on ice for 2 min. Recover the cells by adding 500 μL of SOC, transfer the cells to a 1.5 mL sterile tube, and shake for 1 h at 37 °C. Spread two volumes (50 and 500 μL) of cells on LB/spectinomycin/IPTG/X-Gal Petri dishes. Incubate the plates overnight at 37 °C.
4. Pick two white colonies into 2 mL of LB/spectinomycin and grow the cultures overnight in a shaker at 37 °C.
5. Isolate the plasmids from the cultures following the manufacturer’s instructions and perform a restriction analysis of the extracted plasmids to verify correct assembly (see Note 10).
3.7 Transient Expression in N. benthamiana Leaves
CRISPR/Cas9-mediated repression of Pnos was assessed in N. benthamiana transient experiments by co-delivering a Luciferase reporter along with the multiplexing gRNAs-Cas9 construct (GB1610). The reporter module (GB1116) consisted of three TUs: (1) the Firefly Luciferase (FLuc) driven by the Pnos; (2) the Renilla Luciferase (RLuc) driven by the 35S promoter, used to normalize the expression levels; and (3) the P19 gene silencing suppressor.
1. Transform 10 ng of GB1610 (see Note 11) into 50 μL of homemade electrocompetent A. tumefaciens cells by electroporation at 1440 V; collect the cells from the cuvette with 500 μL of LB and grow them on a shaker at 28 °C for 2 h.
2. Spread two cell volumes (20 and 100 μL) on LB/spectinomycin/rifampicin plates. Incubate the plates at 28 °C for 2 days.
3. Pick two colonies and inoculate them in 5 mL of liquid LB containing spectinomycin and rifampicin. Grow them for 2 days in a shaker at 28 °C.
4. Isolate the plasmids from the cultures and check the colonies by restriction analysis.
5. Subculture (1/100 dilution) into a new tube (5 mL final volume) and grow overnight at 28 °C.
6. Pellet the cells by centrifugation (10 min, 2000 × g).
7. Resuspend the cells in agro-infiltration buffer and incubate them for 2 h at room temperature on a horizontal rolling mixer in the dark.
8. Dilute the cell suspension with agro-infiltration buffer to a final optical density of 0.1 at 600 nm.
9. Mix the cell suspension of GB1610 with the cell suspension of the reporter (GB1116) in a 1:1 ratio.
10. Infiltrate the intercellular spaces of three leaves of a 4- to 5-week-old N. benthamiana plant with a needle-free syringe containing the cell suspension, applied to the abaxial surface of the leaf while exerting counter-pressure with the index finger from the other side.
3.8 Functional Characterization of the Generated Plasmid for the nos Promoter Regulation
Three proximal positions of the nos promoter, ranging from −85 to +7 (Fig. 3a), were covered with the generated plasmid, which carries a T-DNA including a 3× polycistronic gRNA expression cassette and the dCas9 transcriptional unit (GB1610). To determine the effect of targeting the dCas9 to the Pnos with three gRNAs, follow these steps:
1. Collect one leaf disk from each of the agroinfiltrated leaves at 5 days post infiltration (dpi) using a 0.8 cm cork borer (approximately 20 mg of tissue) in a 1.5 mL Eppendorf tube (see Note 12).
2. Add one stainless steel bead per Eppendorf tube. Freeze the tubes immediately in liquid nitrogen.
3. Grind the leaf disks using a Mixer Mill M400 (Retsch) for 30 s at 30 Hz (see Note 13).
4. Add 150 μL of 1× Passive Lysis Buffer (1× PLB) to the ground tissue and vortex gently.
5. Centrifuge at 12,000 × g for 15 min at 4 °C.
6. Take 24 μL of the supernatant and dilute it with 36 μL of 1× PLB (2 parts extract to 3 parts buffer).
7. Place 10 μL of the diluted extract in a white polystyrene 96-well plate and add 40 μL of LARII. Incubate for 10 min, and then determine the FLuc activity using a GloMax 96 Microplate Luminometer (Promega) with a 2 s delay and a 10 s measurement.
8. To measure the RLuc activity, add 40 μL of Stop&Glo reagent and measure again after a 10 min incubation under the same conditions. Determine the FLuc/RLuc ratios as the mean value of three samples from three independent agroinfiltrated leaves of the same plant, and normalize them to the FLuc/RLuc ratio obtained for a reference sample consisting of the reporter (GB1116) co-infiltrated with a construct carrying three gRNAs unrelated to Pnos and the dCas9 TU (GB2209).
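The normalization described in step 8 can be sketched as follows. This is an illustrative calculation only: the function names are hypothetical and the luminescence readings are invented numbers, not measured data. Each leaf gives one FLuc/RLuc ratio, and the ratios of the test construct are divided by the mean ratio of the reference sample.

```python
# Sketch of the FLuc/RLuc normalization in step 8 (hypothetical helper names,
# invented readings). One FLuc and one RLuc value per agroinfiltrated leaf.
from statistics import mean, stdev


def fluc_rluc_ratios(fluc, rluc):
    """Per-leaf FLuc/RLuc ratios."""
    if len(fluc) != len(rluc):
        raise ValueError("need one FLuc and one RLuc reading per leaf")
    return [f / r for f, r in zip(fluc, rluc)]


def relative_activity(sample_ratios, reference_ratios):
    """Mean and standard deviation of sample ratios relative to the reference mean."""
    reference_mean = mean(reference_ratios)
    normalized = [ratio / reference_mean for ratio in sample_ratios]
    return mean(normalized), stdev(normalized)


if __name__ == "__main__":
    # Invented readings for three leaves each.
    targeted = fluc_rluc_ratios([100, 120, 80], [1000, 1000, 1000])    # e.g. GB1610 + GB1116
    reference = fluc_rluc_ratios([200, 200, 200], [1000, 1000, 1000])  # e.g. GB2209 + GB1116
    rel_mean, rel_sd = relative_activity(targeted, reference)
    print(f"relative Pnos activity: {rel_mean:.2f} +/- {rel_sd:.2f}")
```

A relative activity below 1 indicates repression of Pnos compared with the unrelated-gRNA control.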
Fig. 3 Targeted transcriptional repression of the nos promoter with dCas9 and a 3× polycistronic gRNA. (a) Schematic representation of the nos promoter with the three targeted positions (−85, −33, and +7 relative to the transcription start site, TSS). (b) dCas9 in combination with a 3× polycistronic gRNA targeting the Pnos (GB1610) effectively decreases the transcriptional activity of Pnos when compared to the repression rates of dCas9 in combination with unrelated gRNAs (GB2209). Values were normalized to the FLuc/RLuc ratios of a reference sample (GB1116) set as 1. Bars represent average values of three samples ± standard deviations. P-value < 0.05
3.9 Functional Validation of the Generated Plasmid for XT Gene Editing
Following the steps described in Subheadings 3.3–3.5, a 2× polycistronic gRNA for gene editing was assembled (see Note 7). The functionality of the designed gRNAs targeting two loci of Xylosyltransferase (XT) (Fig. 4a) was assessed by transient expression in N. benthamiana leaves following the procedure described in Subheading 3.7. Genomic DNA (gDNA) was extracted at 5 dpi from leaves co-infiltrated at a 1:1 ratio with GV3101 cells carrying the Cas9 TU (GB0639) and a 2× polycistronic gRNA expression cassette (GB1217). Cas9-mediated editing efficiency was estimated based on the loss of a restriction enzyme (RE) site overlapping the Cas9 cutting site (Fig. 4b) (see Note 14). Proceed with the
Fig. 4 Targeted mutagenesis of the N. benthamiana xylosyltransferase gene with CRISPR/Cas9 and a 2× polycistronic gRNA. (a) Schematic representation of the exon-intron gene structure for Niben101Scf04205Ctg025 (XT1) and Niben101Scf04551Ctg021 (XT2) (exons are squares and introns are lines). The targeted sequences for the two tested guide RNAs are depicted. Diagnostic restriction sites are underlined and the PAM sequence is shown in bold. (b) PCR/RE assay for detection of simultaneous targeted mutations in XT1 and XT2 with a 2× polycistronic gRNA. Red arrows show BsmBI- and SpeI-resistant PCR products amplified from the genomic DNA of agroinfiltrated N. benthamiana leaves, indicative of CRISPR/Cas9 activity inducing the loss of the RE recognition site. The mutation efficiency was estimated at 6% for XT1 and 8% for XT2 based on the intensity of the undigested bands marked with a red arrow relative to the undigested DNA present in the negative control
following steps to determine the gRNA editing efficiency with the PCR/RE site-loss method:
1. Collect one leaf disk from each of the agroinfiltrated leaves at 5 dpi using a 0.8 cm cork borer (approximately 20 mg of tissue) in a 1.5 mL Eppendorf tube (see Note 12).
2. Add one stainless steel bead per Eppendorf tube. Freeze the tubes immediately in liquid N2.
3. Grind the leaf disks using a Mixer Mill M400 (Retsch) for 30 s at 30 Hz (see Note 13).
4. Proceed with the gDNA extraction using the CTAB protocol. First prepare a sufficient volume of CTAB buffer.
5. Add 600 μL of CTAB buffer with 2% β-mercaptoethanol to each tube with frozen powder from step 3. Vortex. Alternatively, tissue may be ground (fresh or frozen) in CTAB buffer using a small pestle.
6. Incubate for 45 min at 65 °C. Mix by inversion every 5–15 min to keep the plant material in suspension.
7. Add 600 μL of chloroform:isoamyl alcohol (24:1). Vortex.
8. Spin for 15 min at 12,000 × g.
9. Transfer the upper phase to a clean Eppendorf tube, taking care not to take any of the interphase or chloroform phase.
10. If the upper phase is not clear, repeat steps 5–7.
11. Add 1 volume of isopropanol. Mix by inversion. Precipitate on ice for 5 min.
12. Spin for 10 min at 12,000 × g at 4 °C. Remove the supernatant and wash the pellet with 600 μL of ice-cold 80% ethanol. Vortex.
13. Spin for 5 min at 12,000 × g at 4 °C. Remove the supernatant and let the pellet dry.
14. Resuspend in sterile H2O (or TE buffer). Let it stand at RT for about 30 min. Mix well with a pipette and leave at RT for an additional 30 min.
15. PCR amplify the gDNA region containing the target site using Phusion polymerase according to the manufacturer’s instructions. Use primers XT1_Fwd 5′-AACCACTTTTCCTCGTCGGAAA-3′ and XT1_Rev 5′-TAACTATTCAACTAAAGCTTCAAACAG-3′ for the XT1 locus amplification, and XT2_Fwd 5′-AACCACTTTTCCTTGTCGGAAA-3′ and XT2_Rev 5′-GGAATGAAATTAACCACTTCAGG-3′ for the XT2 locus amplification.
16. Purify the PCR products using the QIAquick PCR purification kit (QIAGEN) following the manufacturer’s protocol.
17. Set up a restriction reaction with 500 ng of the purified PCR product and the corresponding restriction enzyme: BsmBI (Fermentas) for XT1 and SpeI (Fermentas) for XT2.
18. Run the digested product on a 1% agarose gel.
19. Estimate band intensities using ImageJ as described in [20].
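Once band intensities have been measured, the editing efficiency can be estimated along the lines given in the legend of Fig. 4. The sketch below follows one reading of that description and should be treated as an assumption: the intensity of the RE-resistant band in the digested sample is divided by the intensity of the undigested band in the negative control, with equal DNA loading in both lanes and an optional background subtraction. The function name and the intensity values are hypothetical.

```python
# Sketch of an editing-efficiency estimate from gel band intensities
# (PCR/RE site-loss assay). Assumes equal loading of sample and control lanes.


def editing_efficiency(resistant_band: float,
                       undigested_control: float,
                       background: float = 0.0) -> float:
    """Fraction of RE-resistant (putatively edited) PCR product, clamped to 0..1."""
    control = undigested_control - background
    if control <= 0:
        raise ValueError("control band must be brighter than the background")
    fraction = (resistant_band - background) / control
    return max(0.0, min(1.0, fraction))


if __name__ == "__main__":
    # Invented ImageJ integrated densities, for illustration only.
    xt1 = editing_efficiency(resistant_band=600, undigested_control=10000)
    print(f"estimated editing efficiency: {xt1:.1%}")
```

Because the estimate depends on complete digestion of the unedited product, a digested wild-type control lane is useful to confirm that no resistant band appears without Cas9 activity.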
4 Notes
1. The GB kit includes TUs for the expression of the catalytically active Cas9 and of the inactive dCas9 fused to different activation and repression domains. This set of TUs allows the application of multiplexing for both gene editing and gene regulation.
2. Adding the overhangs must not generate illegal BsaI or BsmBI sites in the target. If it does, an alternative target sequence must be chosen.
3. The nucleotides to add to the target for the creation of the forward primer are the same regardless of the selected level 1 tRNA-gRNA vector.
4. The nucleotides to add to the target for the creation of the reverse primer are the same regardless of the selected level 1 tRNA-gRNA vector.
5. Use the 5′-GCTTTCGCTAAGGATGATTTCTGG-3′ primer for sequencing in order to confirm the correct tRNA-spacer-gRNA assembly before proceeding with the Golden Gate assembly reaction.
6. Repeat the assembly process to generate the individual tRNA-gRNA plasmids for each position in the polycistronic gRNA expression cassette: GB1206 is used for the assembly of the annealed oligonucleotides gRNA3-forward and gRNA3-reverse, and GB1207 is used for the assembly of the annealed oligonucleotides gRNA2-forward and gRNA2-reverse.
7. For a 2× polycistronic gRNA assembly, two pUPD2_tps vectors are used instead of three. For editing the two loci of the N. benthamiana XT gene, a two-gRNA polycistronic gRNA cassette was generated following the same approach as for the Pnos. Two independent “tps” plasmids (GB1214 and GB1215) were generated by cloning the target sequences depicted in Fig. 4a in pUPD2, making use of the level 1 plasmids GB1208 and GB1207, respectively. GB1214 and GB1215 were assembled together with GB1001 in pDGB3α2, creating GB1217.
8. HindIII can be used to verify correct assembly. The expected bands in this restriction analysis are 6345 bp and 798 bp.
9. If the active Cas9 TU (GB0639) is used instead of the dCas9 TU (GB1119), the resulting construct will be ready for gene editing. Alternatively, other dCas9-based transcription factor TUs can be used in the final assembly.
10. BamHI can be used to verify the final assembly. The expected bands for GB1610 are 6674, 3824, and 2646 bp.
11. Also transform the GB1116 and GB2209 plasmids into GV3101 in order to have the reporter construct for testing gRNAs targeting the Pnos via co-infiltration (GB1116) and a construct with unrelated gRNAs to be used as negative control (GB2209).
12. Avoid the leaf veins, which are a source of variability, when collecting the samples.
13. Alternatively, small blue pestles can be used to grind the sample.
14. When gRNAs do not contain a restriction enzyme (RE) site overlapping the Cas9 cutting site, the RE site loss method cannot be used for validation of CRISPR-Cas9 editing.
Alternative methods for editing efficiency estimation are the T7 Endonuclease I mismatch detection assay or the Tracking of Indels by Decomposition (TIDE) analysis [21].
References
1. Schindele P, Wolter F, Puchta H (2018) Transforming plant biology and breeding with CRISPR/Cas9, Cas12 and Cas13. FEBS Lett 592(12):1954–1967
2. Belhaj K, Chaparro-Garcia A, Kamoun S et al (2015) Editing plant genomes with CRISPR/Cas9. Curr Opin Biotechnol 32:76–84
3. Bortesi L, Fischer R (2015) The CRISPR/Cas9 system for plant genome editing and beyond. Biotechnol Adv 33(1):41–52
4. Lowder LG, Paul JW, Qi Y (2017) Multiplexed transcriptional activation or repression in plants using CRISPR-dCas9-based systems. Methods Mol Biol 1629:167–184
5. Piatek A, Ali Z, Baazim H et al (2015) RNA-guided transcriptional regulation in planta via synthetic dCas9-based transcription factors. Plant Biotechnol J 13(4):578–589. https://doi.org/10.1111/pbi.12284
6. Minkenberg B, Wheatley M, Yang Y (2017) CRISPR/Cas9-enabled multiplex genome editing and its application. Prog Mol Biol Transl Sci 149:111–132
7. Tsai SQ, Wyvekens N, Khayter C et al (2014) Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing. Nat Biotechnol 32:569–576. https://doi.org/10.1038/nbt.2908
8. Gao Y, Zhao Y (2014) Self-processing of ribozyme-flanked RNAs into guide RNAs in vitro and in vivo for CRISPR-mediated genome editing. J Integr Plant Biol 56(4):343–349. https://doi.org/10.1111/jipb.12152
9. Xie K, Minkenberg B, Yang Y (2015) Boosting CRISPR/Cas9 multiplex editing capability with the endogenous tRNA-processing system. Proc Natl Acad Sci 112(11):3570–3575. https://doi.org/10.1073/pnas.1420294112
10. Patron NJ (2014) DNA assembly for plant biology: techniques and tools. Curr Opin Plant Biol 19:14–19
11. Engler C, Gruetzner R, Kandzia R, Marillonnet S (2009) Golden gate shuffling: a one-pot DNA shuffling method based on type IIs restriction enzymes. PLoS One 4(5):e5553. https://doi.org/10.1371/journal.pone.0005553
12. Engler C, Marillonnet S (2014) Golden Gate cloning. Methods Mol Biol 1116:119–131. https://doi.org/10.1007/978-1-62703-764-8_9
13. Sarrion-Perdigones A, Falconi EE, Zandalinas SI et al (2011) GoldenBraid: an iterative cloning system for standardized assembly of reusable genetic modules. PLoS One 6(7):e21622. https://doi.org/10.1371/journal.pone.0021622
14. Sarrion-Perdigones A, Vazquez-Vilar M, Palací J et al (2013) GoldenBraid 2.0: a comprehensive DNA assembly framework for plant synthetic biology. Plant Physiol 162(3):1618–1631. https://doi.org/10.1104/pp.113.217661
15. Vazquez-Vilar M, Quijano-Rubio A, Fernandez-Del-Carmen A et al (2017) GB3.0: a platform for plant bio-design that connects functional DNA elements with associated biological data. Nucleic Acids Res 45:2196–2209. https://doi.org/10.1093/nar/gkw1326
16. Vazquez-Vilar M, Bernabé-Orts JM, Fernandez-del-Carmen A et al (2016) A modular toolbox for gRNA-Cas9 genome engineering in plants based on the GoldenBraid standard. Plant Methods 12:1–12. https://doi.org/10.1186/s13007-016-0101-2
17. Patron N et al (2015) Standards for plant synthetic biology: a common syntax for exchange of DNA parts. New Phytol 208:13–19. https://doi.org/10.1111/nph.13532
18. Graham DB, Root DE (2015) Resources for the design of CRISPR gene editing experiments. Genome Biol 16:260
19. Doench JG, Hartenian E, Graham DB et al (2014) Rational design of highly active sgRNAs for CRISPR-Cas9-mediated gene inactivation. Nat Biotechnol 32(12):1262–1267. https://doi.org/10.1038/nbt.3026
20. Guschin DY, Waite AJ, Katibah GE et al (2010) A rapid and general assay for monitoring endogenous gene modification. Methods Mol Biol 649:247–256. https://doi.org/10.1007/978-1-60761-753-2_15
21. Brinkman EK, Chen T, Amendola M, Van Steensel B (2014) Easy quantitative assessment of genome editing by sequence trace decomposition. Nucleic Acids Res 42(22):e168. https://doi.org/10.1093/nar/gku936
Chapter 3
Gene Editing in Green Alga Chlamydomonas reinhardtii via CRISPR-Cas9 Ribonucleoproteins
Simon Kelterborn, Francisca Boehning, Irina Sizova, Olga Baidukova, Heide Evers, and Peter Hegemann
Abstract
With the establishment of the CRISPR-Cas9 molecular tool as a DNA editing system in 2012, gene editing experiments became much easier to carry out, pushing reverse genetics approaches forward in many organisms. These new gene editing technologies also drastically increased the possibilities for design-driven synthetic biology. Here, we describe a protocol for gene editing in the green alga Chlamydomonas reinhardtii using preassembled CRISPR-Cas9 ribonucleoproteins. The three sections of the protocol guide the reader through a complete gene editing experiment. The first part covers the experimental design, the choice of suitable CRISPR target sites, and a Cas9 in vitro test digestion. The second part covers the transformation of algal cells with Cas9 RNPs using electroporation. The last part explains the PCR-based screening for mutants and the isolation of clones.
Key words Chlamydomonas reinhardtii, Green algae, Microalgae, CRISPR/Cas9, S. pyogenes, SpCas9, Gene editing, Gene targeting, Synthetic biology, Ribonucleoproteins, RNP, Homologous recombination (HR), Homology-directed repair (HDR)
1 Introduction
Gene editing is an important method for basic research to study the function of genes of interest and for biotechnological experiments that include strain engineering. Since single- and double-stranded DNA breaks facilitate nucleotide excision and recombination with homologous template DNA, programmable site-directed nucleases are the tool of choice to perform site-specific DNA modification. The great advantage of nucleases found in the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) system like Cas9 compared to previous designer nucleases such as zinc-finger nucleases (ZFNs) or transcription activator-like effector nucleases (TALENs) is the flexibility to adapt the system to new target sites. The binding specificity of Cas9 is directed by an RNA that base-
Matias D. Zurbriggen (ed.), Plant Synthetic Biology, Methods in Molecular Biology, vol. 2379, https://doi.org/10.1007/978-1-0716-1791-5_3, © Springer Science+Business Media, LLC, part of Springer Nature 2022
Simon Kelterborn et al.
pairs 1:1 to the target DNA. So once the system is established for an organism, it is very easy to use it for multiple genes [1]. In Chlamydomonas reinhardtii, the expression of foreign genes is often inefficient, particularly if they encode large proteins like Cas9 [2, 3]. Therefore, we describe a method using recombinant S. pyogenes Cas9 protein preassembled with synthetic CRISPR RNA to ribonucleoproteins (RNPs) that are directly transformed using electroporation. This circumvents the need for Cas9 expression by the host cell and makes the protocol applicable to different strains and even to other species. Moreover, by using RNPs, the risk of generating off-target mutations is strongly reduced, as prolonged Cas9 expression or the random integration of the Cas9 plasmids into the genome can lead to uncontrollable side effects. In this protocol, we give detailed instructions to inactivate nuclear genes in C. reinhardtii with Cas9 RNPs. We describe a protocol where the RNP is co-transformed with an antibiotic resistance marker plasmid to select for cells that were successfully electroporated [3, 4]. Two other labs described protocols without the use of a selection marker. This is an interesting alternative for DNA-free transformations. However, without pre-selection, mutation rates are lower and more cells need to be screened, especially when targeting genes with unknown phenotype [5, 6]. The method described here was also used for gene insertions and single point mutations, but in favor of clarity, these variations were omitted here (see Note 1 for some details). The protocol is divided into three steps: (1) Cas9 target and primer design and in vitro Cas9 test digestion, (2) transformation of Cas9 RNPs into algal cells via electroporation, and (3) mutant identification via PCR-based screenings.
2 Materials
2.1 Algal Transformation
1. Chlamydomonas reinhardtii cells: We successfully used cell wall-containing strains (CC-125, CC-1826, SAG11-32b, SAG-73.72) and cell wall-deficient strains (CC-3403, CC-4350, CC-4533) obtained from the Chlamydomonas Resource Center (https://www.chlamycollection.org).
2. Strains are grown in Tris–acetate–phosphate (TAP) medium [7] at 110 rpm, either in continuous light (40–60 μE/m2/s) or in light–dark cycles of 14 h at 25 °C in light (40–60 μE/m2/s) and 10 h at 18 °C in darkness. The medium is supplemented with 100 mg/mL L-arginine for the strains CC-1826, CC-3403, and CC-4350.
3. DUPLEX Buffer (IDT-DNA, #11-01-03-01): 100 mM potassium acetate, 30 mM HEPES, pH 7.5, sterile and nuclease-free.
C. reinhardtii CRISPR-Cas9 Protocol
Table 1 FLAGv3 oligo sequences. Sequences are given in 5′ → 3′ orientation. * = phosphorothioate (PTO) bond for stabilization

Name                              Sequence
FLAGv3 (stop-EMX1-1-stop) FW      T*T*A*GCTAAGCCTCCCCAAAGCCTGGCCAGGGTC*T*A*G
FLAGv3 (stop-EMX1-1-stop) RV      C*T*A*GACCCTGGCCAGGCTTTGGGGAGGCTTAGC*T*A*A
4. 10× Buffer O (ThermoFisher Scientific, #BO5): 0.5 M Tris–HCl (pH 7.5 at 37 °C), 100 mM MgCl2, 1 M NaCl, 1 mg/mL BSA, sterile and nuclease-free.
5. Cas9 protein: Recombinant S. pyogenes Cas9 protein is commercially available or can be produced in-house. We use the plasmid pET-28b-Cas9-His (Addgene #47327, https://www.addgene.org/47327/) for expression in E. coli strain Rosetta2(DE3)pLysS at 20 °C followed by Ni-NTA purification according to published protocols [8]. After purification, the concentration of Cas9 protein is adjusted to 1.6 mg/mL (10 μM) in 1× Buffer O, and the protein is sterile-filtered and stored in 20 μL aliquots at −80 °C.
6. CRISPR-RNA: We order CRISPR RNAs commercially as two-component RNAs (gene-specific crRNA and constant tracrRNA), but they can also be synthesized in vitro as single guide RNAs [6].
7. FLAG insertion oligos: Two complementary oligos with phosphorothioate (PTO) bonds for protection. Sequences are listed in Table 1 (see Note 2 for why this sequence was chosen).
8. Marker plasmid: Antibiotic resistance marker plasmid, e.g., pAPHVIII (pPH075) conferring resistance to paromomycin (selection at 10–20 μg/mL) or pAPHVII (pPH360) conferring resistance to hygromycin B (selection at 10–20 μg/mL), and pARG7 (pHR11) for recovery of arg7 mutant auxotrophy. All plasmids are available from the Chlamydomonas Resource Center (www.chlamy.de/plasmids).
9. Electroporation buffer (ME-Suc): MAX Efficiency™ Transformation Reagent for Algae (ThermoFisher Scientific, #A24229) supplemented with sterile 40 mM sucrose; freshly prepared.
10. NEPA21 Electroporator (Nepagene). Electroporators from other suppliers might work as well, but the electroporation conditions need to be adjusted.
11. Electroporation cuvettes with 2 mm gap and a working volume of 40–400 μL, with transfer pipettes (sterile).
12. 24-well cell culture plates (sterile).
13. Tris–acetate–phosphate (TAP) medium [7] and selective TAP agar plates. We use large 12 cm square plates. For paromomycin selection via APHVIII, plates contained 12 μg/mL paromomycin; for hygromycin B selection via APHVII, plates contained 10 μg/mL of hygromycin B; and for ARG7 selection, arginine-free agar plates were used.
2.2 PCR-Based Mutant Screening
1. 96-well PCR equipment: 96-well PCR cycler, 96-well PCR plates, foils, multichannel pipettes.
2. Phire Plant Direct PCR Master Mix (ThermoFisher Scientific, #F160S).
3. 5 M Betaine.
4. 96-well cell culture plates (sterile).
5. 96-well V-bottom plates.
6. Toothpicks (sterile).
7. Equipment for DNA agarose gel electrophoresis: Tris–borate–EDTA (TBE) buffer, agarose, DNA stain (e.g., Midori Green Advance), 6× DNA loading dye, imaging device.
8. PCR/gel cleanup kit.
9. DNA sequencing analysis software.
10. Optional: 30 mM 5-bromo-4-chloro-3-indolyl sulfate (X-SO4). Prepare a working solution with 3 mM X-SO4 in 0.1 M Tris–HCl (pH 7.5).
3 Methods
3.1 Experimental Design and Pretest for Cas9 Activity
For illustration, the gene SNRK2.2 (Cre12.g499500) will be used (Fig. 1), but this method can be adapted for any nuclear gene of interest. Mutants with a disrupted SNRK2.2 gene can be phenotypically screened by a color test (see Note 3). This gene can be used as a positive control or for establishing the method.
1. Identify Cas9 target sites in your gene of interest using search tools such as CRISPR-P (http://crispr.hzau.edu.cn/cgi-bin/CRISPR2/CRISPR, [9]) or CRISPOR (http://crispor.tefor.net/, [10]). Choose target sequences in an exon in the first third of the coding sequence with an on-score >0.5 and as few off-target sites as possible ( 0.9), the light rays no longer travel on parallel paths; rather, they converge at a focal spot [7, 10]. To correct for this effect, correction factors l1 and l2 can be incorporated into the formula for the steady-state anisotropy as follows [5, 10]:

$$ r = \frac{G I_{\parallel} - I_{\perp}}{(1 - 3 l_1)\, G I_{\parallel} + (2 - 3 l_2)\, I_{\perp}} $$
In principle, the correction factors should be determined for each individual objective lens used for anisotropy measurements. However, these values are very small, and the calibration measurements are quite complex [10]. Therefore, for practical reasons, we suggest either ignoring this correction step or taking the values from the literature [10].
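As a minimal illustration of the formula above, the following Python sketch computes the steady-state anisotropy from a pair of intensities. The function name, argument names, and all numbers are illustrative assumptions and not part of the original protocol; the placement of the G-factor and of l1 and l2 follows the formula as read from the text, and setting l1 = l2 = 0 corresponds to ignoring the correction step.

```python
def steady_state_anisotropy(i_par, i_perp, g=1.0, l1=0.0, l2=0.0):
    """Steady-state anisotropy r, following the formula as read from the text.

    i_par, i_perp : parallel / perpendicular fluorescence intensities
    g             : G-factor (to be determined on each measuring day)
    l1, l2        : objective correction factors; 0.0 ignores the correction
    """
    numerator = g * i_par - i_perp
    denominator = (1.0 - 3.0 * l1) * g * i_par + (2.0 - 3.0 * l2) * i_perp
    if denominator == 0.0:
        raise ValueError("zero total intensity: anisotropy is undefined")
    return numerator / denominator


# Hypothetical intensities (arbitrary units) and hypothetical correction factors.
r_plain = steady_state_anisotropy(3.0, 1.0)
r_corrected = steady_state_anisotropy(3.0, 1.0, l1=0.01, l2=0.02)
print(round(r_plain, 4), round(r_corrected, 4))
```

Because the correction factors only shrink the denominator slightly, the corrected and uncorrected values stay close, which is consistent with the suggestion that the correction can often be ignored in practice.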
Stefanie Weidtkamp-Peters et al.
3. Make sure to determine the G-factor on each measuring day, as it can vary from one day to the next, e.g., due to temperature or the alignment of the beam path.
4. Free EGFP is mixed with some Triton X-100 to improve solubility and to prevent attachment to the glass surfaces.
5. Depending on your system setup, the following software tools are available for calculating fluorescence anisotropy: SymphoTime by PicoQuant (www.picoquant.com), SPCImage by Becker & Hickl (www.becker-hickl.de), and AnI (www.mpc.hhu.de) [11].
6. We suggest using 1–2 μW of a 488 nm laser at the objective front lens.
Acknowledgments
We acknowledge the Deutsche Forschungsgemeinschaft (DFG) for financial support to Y.S. by grant STA12/12 1-1 and S.W.P. by grant WE53/43 1-1. We are thankful to Jelle Ludolf Postma and Sebastian Hänsch for critical reading of the manuscript.

References
1. Förster T (1948) Zwischenmolekulare Energiewanderung und Fluoreszenz. Ann Phys 437:55
2. Vogel SS, Thaler C, Blank PS et al (eds) (2009) FLIM microscopy in biology and medicine: Chapter 10: Time-resolved fluorescence anisotropy. Taylor & Francis, Boca Raton, FL
3. Lakowicz JR (2006) Principles of fluorescence spectroscopy, 3rd edn. Springer, New York, NY
4. Bader AN, Hoetzl S, Hofman EG et al (2011) Homo-FRET imaging as a tool to quantify protein and lipid clustering. ChemPhysChem 12(3):475–483. https://doi.org/10.1002/cphc.201000801
5. Stahl Y, Grabowski S, Bleckmann A et al (2013) Moderation of Arabidopsis root stemness by CLAVATA1 and ARABIDOPSIS CRINKLY4 receptor kinase complexes. Curr Biol 23(5):362–371. https://doi.org/10.1016/j.cub.2013.01.045
6. Grabowski S (2013) Multiparameter-Fluoreszenz-Image-Spektroskopie von Proteinkomplexen in planta. Dissertation
7. Thaler C, Koushik SV, Puhl HL et al (2009) Structural rearrangement of CaMKIIalpha catalytic domains encodes activation. PNAS 106:6369–6374
8. Bleckmann A, Weidtkamp-Peters S, Seidel CAM et al (2010) Stem cell signaling in Arabidopsis requires CRN to localize CLV2 to the plasma membrane. Plant Physiol 152(1):166–176. https://doi.org/10.1104/pp.109.149930
9. Ameloot M, Vande Ven M, Acuña AU et al (2013) Fluorescence anisotropy measurements in solution: methods and reference materials (IUPAC technical report). Pure Appl Chem 85(3):589–608. https://doi.org/10.1351/PAC-REP-11-11-12
10. Koshioka M, Sasaki K, Masuhara H (1995) Time-dependent fluorescence depolarization analysis in three-dimensional microspectroscopy. Appl Spectrosc 49(2):224
11. Weidtkamp-Peters SS, Felekyan A, Bleckmann R et al (2009) Multiparameter fluorescence image spectroscopy to study molecular interactions. Photochem Photobiol Sci 8(4):470–480
Chapter 13

Mathematical Modelling in Plant Synthetic Biology

Anna Deneer and Christian Fleck

Abstract
Mathematical modelling techniques are integral to current research in plant synthetic biology. Modelling approaches can provide mechanistic understanding of a system, allowing predictions of behaviour and thus providing a tool to help design and analyse biological circuits. In this chapter, we provide an overview of mathematical modelling methods and their significance for plant synthetic biology. Starting with the basics of dynamics, we describe the process of constructing a model over both temporal and spatial scales and highlight crucial approaches, such as stochastic modelling and model-based design. Next, we focus on the model parameters and the techniques required in parameter analysis. We then describe the process of selecting a model based on tests and criteria and proceed to methods that allow closer analysis of the system’s behaviour. Finally, we highlight the importance of uncertainty in modelling approaches and how to deal with a lack of knowledge, noisy data, and biological variability; all aspects that play a crucial role in the cooperation between the experimental and modelling components. Overall, this chapter aims to illustrate the importance of mathematical modelling in plant synthetic biology, providing an introduction for those researchers who are working with or working on modelling techniques.

Key words Mathematical modelling, Descriptive and predictive modelling, Model-based design, Spatio-temporal scales, Parameter analysis, Stochastic modelling, Uncertainty quantification, Theoretical–experimental plant synthetic biology approach
1 Introduction

The aim of synthetic biology is to design and engineer new functionality of biological systems. To this end, synthetic biology joins forces with systems biology, which as a true interdisciplinary science brings together biologists, physicists, mathematicians, computer scientists, and engineers to deal with the masses of experimental data currently being produced and to aid in understanding phenomena that emerge from biological systems [3, 30, 55, 65, 90, 95]. Mathematics and mathematical modelling are crucial for designing synthetic systems: they integrate knowledge, make it possible to formulate concepts, help to analyse systems, enable in silico experiments, and help to plan experiments via experimental design. The aim of mathematical modelling is to arrive at statements about the system
Matias D. Zurbriggen (ed.), Plant Synthetic Biology, Methods in Molecular Biology, vol. 2379, https://doi.org/10.1007/978-1-0716-1791-5_13, © Springer Science+Business Media, LLC, part of Springer Nature 2022
under inspection, either conceptual statements or predictions about the consequences of certain manipulations or perturbations of the system. Mathematics can help to formulate concepts and ideas that turn a heap of data into apprehensible and meaningful scientific results. Although mathematical modelling is a powerful tool, there is no clear-cut way from the biological question or design problem to the mathematical model. At the start of any project, one has to clarify what are the hypotheses one wishes to test or what new knowledge about the biological system one wishes to gain. The behaviour of complex systems is often only understandable if one finds the correct level of description, and it is often the case that a rich yet structured picture emerges only if one asks the correct questions by allowing for controlled errors. In many aspects, mathematical modelling is an art: it requires insight into the biological system at hand and a solid understanding of the mathematical tools to be able to reduce the confusing diversity of real biological systems, such that the essentials are captured by the model but it remains feasible at the same time. In the majority of cases, a mathematical model consists of formalised relationships between quantities of interest (usually called variables) and other input quantities. As a simple example, consider the relationship between the abundance of a transcription factor and a protein of interest. One important input quantity is, e.g., how much protein is produced per unit time in dependence on the transcription factor abundance. Parameters relate variables to each other, or variables to processes, such as transcription factor concentration to protein production. In systems and synthetic biology, most mathematical models have at least one parameter, although typically biological models depend on many more. This leads to the fundamental problem of mathematical models, which is that most parameters are unknown. 
If the mathematical model is sufficiently simple, one can derive an analytical solution, i.e., one is able to find a set of expressions in terms of mathematical functions that express the sought-for relation between variables. In the abovementioned example of the relationship between the transcription factor abundance and protein abundance, the process of protein production could be expressed in terms of an ordinary differential equation that can be solved exactly:

$$ \frac{dP(t)}{dt} = \alpha T(t) - \lambda P(t), $$

$$ P(t) = P_0\, e^{-\lambda (t - t_0)} + \alpha \int_{t_0}^{t} T(t')\, e^{-\lambda (t - t')}\, dt'. $$
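The closed-form solution above can be evaluated for any given input T(t) by numerical quadrature. The following Python sketch does this with the trapezoidal rule; the function name, the rate values, and the pulse-shaped input are illustrative assumptions. For a constant input the integral can be done by hand, which provides a simple check.

```python
import math


def protein_level(t, tf, alpha, lam, p0=0.0, t0=0.0, n=2000):
    """Evaluate P(t) = P0*exp(-lam*(t-t0)) + alpha * int_{t0}^{t} T(s)*exp(-lam*(t-s)) ds.

    tf is a function returning the transcription factor concentration T(s);
    the integral is approximated with the composite trapezoidal rule on n intervals.
    """
    h = (t - t0) / n
    integral = 0.0
    for i in range(n + 1):
        s = t0 + i * h
        weight = 0.5 if i in (0, n) else 1.0  # end points count half
        integral += weight * tf(s) * math.exp(-lam * (t - s))
    return p0 * math.exp(-lam * (t - t0)) + alpha * h * integral


alpha, lam = 2.0, 0.5  # hypothetical production and degradation rates

# Constant input T = 1: by hand, P(t) = alpha*T/lam * (1 - exp(-lam*t)).
p_const = protein_level(5.0, lambda s: 1.0, alpha, lam)
p_exact = alpha * 1.0 / lam * (1.0 - math.exp(-lam * 5.0))

# Pulsed input: T is present only during the first time unit, so P decays afterwards.
pulse = lambda s: 1.0 if s <= 1.0 else 0.0
p_after_pulse = [protein_level(t, pulse, alpha, lam) for t in (1.0, 2.0, 4.0)]
print(p_const, p_exact, p_after_pulse)
```

The pulsed example illustrates the qualitative statement that the protein level follows the input with a delay set by the degradation rate: after the input is switched off, the protein level relaxes exponentially.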
P(t) and T(t) denote the protein and transcription factor concentration at time t, respectively. The behaviour of P(t) can be studied qualitatively using the above result. However, for any quantitative statement, the values of the initial concentration P0, the
concentration T(t) of the transcription factor, the production rate α, and the degradation rate λ need to be known. In Subheading 2, we provide some more details about typical mathematical approaches employed in systems and synthetic biology. Biological systems are often highly coupled non-linear systems, and as such the corresponding mathematical models are difficult to analyse analytically. Therefore, one has to rely on computer simulations or numerical analysis. In both cases, the model parameters need to be specified. In order to increase the knowledge about the system and to estimate the parameters, specific experiments can be performed. The model-based planning of experiments with the aim to maximise the information gain is called experimental design. This leads to the problem of non-identifiability of the parameters. It can happen that due to limitations in the data or due to structural properties of the model some of the model parameters cannot be estimated. It is important to realise whenever this happens and to distinguish structural problems from problems resulting due to data limitations. We provide in Subheading 3 some brief introduction into parameter estimation, identifiability, and experimental design. The experimental data obtained is then used to infer the parameter values and to decide whether a mathematical model sufficiently captures the biological processes under investigation. A frequent situation is that one does not only have a single model at hand that can describe the data, but one has two or more models and wishes to decide which model to choose. In Subheading 4, we state some of the fundamental ideas of statistical model selection procedures. Unfortunately, there will always be some uncertainties about the parameters, due to experimental errors, biological variability, or lack of information. 
Even if all parameters of a mathematical model are in principle identifiable, it can be very costly in terms of time, manpower, and required equipment to perform all the experiments needed to obtain the necessary data, resulting in a practically infeasible study. To make progress in this case, one needs to find a way to handle the inevitable uncertainties and arrive at statements about the investigated system despite incomplete information. This leads to the field of uncertainty quantification, and in Subheading 6, we provide some information about the basic ideas. One way to circumvent the problem of parameter uncertainty is to use Boolean models [121]. Sometimes the behaviour of a system can be reduced to only two levels: 1 or 0. In the case of a gene regulatory network, the genes can be either on or off, and the simplest assumption for such a network is the Boolean approximation. Such models have discrete variables and discrete update rules. The advantage of such a simplification is that the model is very tractable, and all possible states are enumerable, which allows for a global characterisation of the system. Boolean models can be very useful, but their application needs to be handled with care. To our
knowledge, there exists no mathematical proof of the mapping from a Boolean model to the full dynamic model, i.e., of which features of the more complex dynamic model are captured by the Boolean model, nor any systematic comparison. The simplicity of Boolean models, in which components and their links can only be on or off, precludes any useful application in studies that aim to explore emergent behaviours arising from dynamic interactions within a well-defined network. Of course, it is possible to apply pseudo-temporal characteristics to Boolean models, but the outputs are inevitably disconnected from any meaningful comparison with physiology. In this chapter, we will not discuss Boolean models any further and focus on parametric dynamic models. Moreover, because we focus on dynamical systems, we do not discuss the modelling of metabolic networks. In contrast to dynamic models, metabolic models are usually large, consist of many hundreds of equations, and can be time-consuming to analyse numerically, in addition to the problem of having many unknown parameters. Therefore, researchers assume that enzymatic reactions occur at a much faster time scale than observable changes in the phenotype of an organism (e.g., growth rate) and treat the equations in steady state. This results in a set of algebraic equations, and many methods have been derived to find the possible solutions of these equations [2, 76]. The aim of this chapter is to provide an overview for the novice of the mathematical approaches employed in synthetic biology. We intend to give an idea of the use and the challenges of mathematical modelling and do not intend to provide an in-depth explanation or to be fully comprehensive. However, we provide references for those who wish to dive deeper into the realm of mathematical modelling.
There are many published books about mathematical modelling in biology that have summarised different aspects and show how mathematics can be implemented to solve biological problems [3, 54, 59, 60, 120]. For an overview of mathematical models describing a wide range of temporal and spatial biological phenomena, we direct interested readers to the excellent books by L. Edelstein-Keshet, J. D. Murray, R. Phillips et al. that go into greater depths of mathematical analysis than the books listed above [32, 74, 82]. Of course, those who really aim to obtain a deeper understanding about the mathematical methods employed in systems and synthetic biology need to refer to the wealth of the mathematical literature.
2 Mathematical Approaches

In synthetic biology, modelling can be used to design a synthetic network and study its potential behaviour. Furthermore, it can help to analyse the data produced by a synthetic system and can provide mechanistic understanding of the system under study. Typically, a
model is kept as simple as possible where the main criterion is the ability to reproduce the observed behaviour. When one has two models that can reproduce the experimental observations, the model with fewer unknown parameters allows for an easier and more intuitive analysis. It is therefore important to understand which aspects the model needs to capture. For example, in models of metabolic networks, gene regulatory networks, or signalling pathways, one of the most common assumptions is that molecular components are uniformly distributed within a cell. Under this assumption, intra-cellular concentration gradients are ignored. Of course, in some cases, intra-cellular variation in concentrations can play an important role, and therefore, even the most commonly used assumptions should always be validated [82]. Another assumption that is often used in enzyme kinetics or transcriptional regulation is that some of the reactions are considered to be in steady state [60, 74]. Such an assumption can reduce and simplify the model without changing the observed behaviour. When setting up a mathematical model for biological systems, the first questions to be answered are: what is the system to be studied, what new understanding does one aim to achieve, and which hypotheses does one wish to test? The next step is to decide upon the level of granularity of the model, how many details are necessary to include, and what is the correct time and/or spatial scale of the model. This will depend strongly on the investigated questions, and the level of complexity should be carefully considered. Due to practical constraints, a model should not contain all the details of the system; often it is more informative to employ a coarse-grained description. As such, it is easier to pinpoint the relevant behaviours and test hypotheses or construct predictions.
For example, is it enough to consider protein binding on an abstract level where all interactions are subsumed into a rate constant, or is it important to consider molecular forces? Such considerations will influence the extent of what a model can capture. The model and the questions asked will also be influenced by the knowledge and data that is available. For many cellular processes, there is only a qualitative understanding available, and quantitative characterisations will depend more strongly on the data. With the relevant data and hypothesis at hand, a model can be formulated, which consists of a set of equations capturing the relevant behaviour of the system. Quantitative descriptions typically involve physical or chemical laws, making the models mechanistic. In contrast to this, descriptive models aim only to summarise data and, compared to mechanistic models, provide limited insight into system behaviour. Once a model has been formulated, it can serve as a description of the system, and different mathematical tests can be performed to evaluate hypotheses or make predictions. An advantage of modelling is that a large range of behaviours can be simulated, some of which are not possible to achieve in experiments. This allows insight into
mechanisms that are difficult to study otherwise. Furthermore, it can provide insight into the link between the network structure that is being modelled and the resulting behaviour that can be observed at every time point. Different environmental conditions or perturbations to the system can easily be captured and analysed. Although simulations cannot serve as a proof of a hypothesis, they can help to underpin a mathematical theory; they can serve as a guideline for experimental design or for revealing discrepancies between the current understanding of the system and experimental observations. Models can be investigated in a controlled way, which can help to discover inconsistencies in a hypothesis; these in turn lead to a refinement of the underlying assumptions, either by reasonable changes in the network or by new experiments. This iterative cycle of experiments and modelling is typical of systems biology approaches. In synthetic biology, models are also used to guide the design of biological networks. In such approaches, a wide range of different designs are tested for certain behaviour. Rather than testing a range of possible combinations of components experimentally, which can cost a prohibitively large amount of time, a search of different model networks can reveal which combinations show the appropriate response.

2.1 Dynamics
The temporal evolution of non-spatial dynamical systems is often described by ordinary differential equations (ODEs). In synthetic biology, this typically refers to a change in molecular concentrations within a single cell or a population of cells. ODEs are used in cases where the concentration of a certain molecular species is measured across a population of cells, for example, as in qPCR. Chemical reaction systems can be described by ODEs whenever spatial inhomogeneities are negligible, i.e., the system is homogeneous or well-stirred [18]. ODEs are defined as

$$ \frac{dq_1}{dt} = f_1(q_1, q_2, \ldots, \theta_1, \ldots, \theta_n, t) \tag{1} $$

$$ \frac{dq_2}{dt} = f_2(q_1, q_2, \ldots, \theta_1, \ldots, \theta_n, t) \tag{2} $$

$$ \vdots $$

where q1, q2, . . . are the state variables (typically these are molecular concentrations), and f1, f2, . . . describe the relationships or interaction between the variables (e.g., the molecular interactions) and depend in general on the variables, a set of parameters (θ1, . . ., θn), and time t. In vectorial notation, with $\vec{q} = (q_1, q_2, \ldots)$, $\vec{f} = (f_1, f_2, \ldots)$, and $\vec{\theta} = (\theta_1, \ldots, \theta_n)$:

$$ \frac{d\vec{q}}{dt} = \vec{f}(\vec{q}, \vec{\theta}, t). \tag{3} $$
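When no analytical solution is available, a system in the vector form of Eq. (3) is integrated numerically. The sketch below shows one common choice, the classical fourth-order Runge–Kutta scheme, written against the signature f(q, θ, t). The integrator and the two-variable cascade used to exercise it are illustrative assumptions, not a model from this chapter; in practice an adaptive solver from a numerical library would usually be preferred.

```python
import math


def rk4_integrate(f, q0, theta, t0, t1, n):
    """Integrate dq/dt = f(q, theta, t) from t0 to t1 with n classical Runge-Kutta steps."""
    h = (t1 - t0) / n
    q, t = list(q0), t0
    for _ in range(n):
        k1 = f(q, theta, t)
        k2 = f([qi + 0.5 * h * ki for qi, ki in zip(q, k1)], theta, t + 0.5 * h)
        k3 = f([qi + 0.5 * h * ki for qi, ki in zip(q, k2)], theta, t + 0.5 * h)
        k4 = f([qi + h * ki for qi, ki in zip(q, k3)], theta, t + h)
        q = [qi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for qi, a, b, c, d in zip(q, k1, k2, k3, k4)]
        t += h
    return q


def cascade(q, theta, t):
    """Hypothetical two-variable example: q1 is converted into q2, which is degraded."""
    q1, q2 = q
    th1, th2 = theta
    return [-th1 * q1, th1 * q1 - th2 * q2]


theta = (1.0, 0.5)  # illustrative rate constants
q_end = rk4_integrate(cascade, [1.0, 0.0], theta, 0.0, 3.0, 300)

# Closed-form solution of this linear system, used only as a check.
q1_exact = math.exp(-theta[0] * 3.0)
q2_exact = theta[0] / (theta[1] - theta[0]) * (
    math.exp(-theta[0] * 3.0) - math.exp(-theta[1] * 3.0))
print(q_end, (q1_exact, q2_exact))
```

Keeping the parameters in a separate argument mirrors the notation of Eq. (3) and makes it straightforward to rerun the same model for different parameter sets.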
As a first simple example, we turn back to the one-dimensional model of protein production and degradation shown in the introduction:

$$ \frac{dP}{dt} = f(P, \vec{\theta}) = \alpha T(t) - \lambda P(t). \tag{4} $$

The variable is P(t) (often the time dependence is suppressed to simplify the notation, i.e., P(t) = P). The parameter vector is given by $\vec{\theta} = (T(t), \alpha, \lambda)$. Because it is time dependent and not constant, one also refers to T(t) as an input function. The solution for P for time-dependent T is given in the introduction. The solution simplifies if the abundance of the transcription factor T is constant and the amount of P at t = t0 = 0 is zero:

$$ P(t) = \frac{\alpha T}{\lambda}\left(1 - e^{-\lambda t}\right). \tag{5} $$

It is instructive to notice that by measuring the abundance of P over time, one cannot make any statements about the transcription factor concentration T, unless the expression rate α is determined by an independent experiment. The steady-state solution is given by $\lim_{t \to \infty} P(t) = \alpha T/\lambda$, which can also be obtained by setting dP/dt = 0 in Eq. (4).

As a slightly more complex example system, we will consider the phytochrome-PIF light signalling network found in plants. Phytochromes are molecules that are responsive to light. Under certain light conditions, these molecules switch between two conformers. The inactive, red-light absorbing form (Pr) is converted into the active, far red-light absorbing form (Pfr) when absorbing red light. Pfr can be converted back into the inactive form under far red light. Additionally, the active state reverts to the inactive Pr form due to spontaneous, thermal relaxation; the rate at which this happens is called the thermal relaxation rate or dark reversion rate. Note that the term dark reversion is a misnomer, as it always happens, irrespective of whether light is shed on the phytochromes or not [99]. The active form Pfr binds to Phytochrome Interacting Factors (PIFs) [84]. In this way, the information of light signals is translated through PhyB, and gene expression is consequently regulated by PIF. These reactions are illustrated in Fig. 1. The PhyB–PIF interaction has been used to control gene expression in yeast, Escherichia coli, and mammalian cells [67, 73, 97].
These optogenetic systems are synthetic tools with high spatio-temporal resolution for inducing gene expression. The mathematical model for the PhyB–PIF system reads [73]

$$ \frac{d[P_r]}{dt} = (k_2 + k_d)[P_{fr}] - (k_1 + \gamma_1)[P_r], \tag{6} $$

$$ \frac{d[P_{fr}]}{dt} = k_1 [P_r] - (k_2 + k_d + \gamma_2 + k_3 [PIF])[P_{fr}], \tag{7} $$

$$ \frac{d[PP]}{dt} = k_3 [P_{fr}][PIF] - \gamma_3 [PP], \tag{8} $$

$$ \frac{d[mRNA]}{dt} = k_5 + \frac{k_4 [PP]}{K_M + [PP]} - \gamma_4 [mRNA], \tag{9} $$

$$ \frac{d[X]}{dt} = k_6 [mRNA] - \gamma_5 [X]. \tag{10} $$

Fig. 1 Reaction network of PhyB and PIF. Under red light, inactive Pr is converted into the active form Pfr. In darkness or far red light, Pfr is converted into Pr. When PhyB is in the Pfr form, it can bind to PIF giving the PhyB–PIF complex PP. The PP complex can induce the production of mRNA, which is translated into protein X

Fig. 2 Predicted production of protein X using the PhyB–PIF model. The left panel shows the light input sequence of red and far red light. The middle panel shows the Pr–Pfr conversion kinetics, and the right panel shows the resulting concentration of protein X. In this example, protein X is assumed to be very stable, i.e., γ5 = 0
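A minimal sketch of how Eqs. (6)–(10) could be simulated for a sequence of light pulses is given below. Several points are assumptions made for illustration only: all parameter values are hypothetical; red light is modelled by switching k1 on and k2 off, and far red light by the reverse; the conversion terms are written so that whatever leaves Pr enters Pfr and vice versa (k1: Pr to Pfr; k2 and kd: Pfr to Pr); and free PIF is taken as total PIF minus the PIF bound in the complex PP.

```python
def phyb_pif_rhs(state, p, red_on):
    """Right-hand side of Eqs. (6)-(10); red_on selects red (True) or far red (False)."""
    pr, pfr, pp, mrna, x = state
    k1 = p["k1"] if red_on else 0.0   # Pr -> Pfr, driven by red light
    k2 = 0.0 if red_on else p["k2"]   # Pfr -> Pr, driven by far red light
    pif = p["pif_tot"] - pp           # free PIF = total PIF - complexed PIF
    d_pr = (k2 + p["kd"]) * pfr - (k1 + p["g1"]) * pr
    d_pfr = k1 * pr - (k2 + p["kd"] + p["g2"] + p["k3"] * pif) * pfr
    d_pp = p["k3"] * pfr * pif - p["g3"] * pp
    d_mrna = p["k5"] + p["k4"] * pp / (p["KM"] + pp) - p["g4"] * mrna
    d_x = p["k6"] * mrna - p["g5"] * x
    return [d_pr, d_pfr, d_pp, d_mrna, d_x]


def rk4_step(state, p, red_on, h):
    """One classical Runge-Kutta step with the light condition held fixed."""
    def shifted(k, c):
        return [s + c * h * ki for s, ki in zip(state, k)]
    k1 = phyb_pif_rhs(state, p, red_on)
    k2 = phyb_pif_rhs(shifted(k1, 0.5), p, red_on)
    k3 = phyb_pif_rhs(shifted(k2, 0.5), p, red_on)
    k4 = phyb_pif_rhs(shifted(k3, 1.0), p, red_on)
    return [s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate(p, pulses, h=0.01, start=(1.0, 0.0, 0.0, 0.0, 0.0)):
    """pulses is a list of (duration, red_on); returns the state after every step."""
    state = list(start)  # order: Pr, Pfr, PP, mRNA, X
    trajectory = [state]
    for duration, red_on in pulses:
        for _ in range(round(duration / h)):
            state = rk4_step(state, p, red_on, h)
            trajectory.append(state)
    return trajectory


# Hypothetical parameter values in arbitrary units; g5 = 0 makes protein X stable.
params = {"k1": 1.0, "k2": 1.0, "kd": 0.01, "k3": 0.2, "k4": 1.0, "k5": 0.01,
          "k6": 1.0, "KM": 0.5, "g1": 0.01, "g2": 0.01, "g3": 0.1, "g4": 0.2,
          "g5": 0.0, "pif_tot": 1.0}
pulses = [(5.0, True), (5.0, False), (5.0, True)]  # red, far red, red
traj = simulate(params, pulses)
x_curve = [s[4] for s in traj]
print(traj[-1])
```

With a stable protein (γ5 = 0), the protein curve can only increase, so the switching between Pr and Pfr shows up as changes in slope rather than as visible oscillations.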
The production of mRNA by the PhyB–PIF complex (PP) is described with Michaelis–Menten kinetics, with v_max = k4. To account for a leaky promoter, the parameter k5 is included for the mRNA concentration. The total amount of PIF in the system is given by [PIF]tot = [PIF] + [PP]; if we assume that PIF is available in abundance, we can write [PIF](t) = [PIF]tot − [PP](t). Parameters like k1, k2, and kd can be found in the literature [99], whereas other parameters will have to be estimated from the experimental data. The model can also be used to design and predict experiments, as shown in Fig. 2, where a certain sequence of red and far red-light pulses is applied to the system and the model predicts the
corresponding Pfr–Pr kinetics and protein production. Note that due to the delay between mRNA and protein production, the Pfr and Pr switching kinetics are not directly visible in the protein concentration curve.

2.2 Dynamics Across Space
Spatially homogeneous dynamic processes can be described using ordinary differential equations. However, biological processes generally depend on time and space. These are described by partial differential equations (PDEs) that are considerably more difficult to solve [32, 74]. Examples include bacterial colonies, pattern formation, such as skin coat patterning, and the generation and development of tissues. Pattern formation relies on spatially heterogeneous cell behaviour. As usually all cells in a developing tissue have the same genome, the fundamental question arises: how to make cells different? A conceptually simple possibility is by using boundary layer information. A morphogen is produced at the boundary of the tissue, and due to finite stability and diffusion of the morphogen, a gradient is established. Depending on the distance from the boundary, cells experience disparate concentrations of the morphogen and through a threshold mechanism differentiate into different states [123]. The challenge lies in the explanation of the threshold mechanism. Another way is that the cells exchange information via secretion of diffusive molecules and adapt their internal states accordingly. The class of PDE models describing these systems are called reaction–diffusion models [62]. In a seminal paper from 1952, Alan Turing revealed that through a diffusion-driven instability a pattern can occur [115]. This idea was taken up about 20 years later by others and spawned a whole field of mathematical research [42, 43, 77, 78]. Using Fick’s laws, the flux $\vec{J}$ of a concentration $\vec{q}$ across spatial domains is related to the diffusion coefficient by

$$ \vec{J}(x, t) = -D \frac{\partial \vec{q}}{\partial x}, \tag{11} $$

$$ \frac{\partial \vec{q}}{\partial t} = -\frac{\partial \vec{J}(x, t)}{\partial x} = \frac{\partial (D\, \partial \vec{q}/\partial x)}{\partial x} = \nabla (D \nabla \vec{q}), \tag{12} $$

where $\nabla = \partial/\partial x$ (or in three dimensions $\nabla = (\partial/\partial x, \partial/\partial y, \partial/\partial z)$) tells us how the concentration of q changes across space. D is the diffusion coefficient that in general depends on time and space. Adding this diffusion term to a chemical reaction system as described in the previous section results in a reaction–diffusion equation:

$$ \frac{\partial \vec{q}}{\partial t} = \vec{f}(\vec{q}) + \nabla (D \nabla \vec{q}), \tag{13} $$

where f describes the reaction kinetics. Without this term, the solution $\vec{q}$ is determined by the boundary condition (how $\vec{q}$ should behave at the boundary of the domain, whether the system is closed
or not, etc.). The interaction of the reaction term $\vec{f}(\vec{q})$ with the diffusion term $\nabla (D \nabla \vec{q})$ results in a rich patterning behaviour, as first discovered by Turing [115, 124]. Because the reactions are typically non-linear, reaction–diffusion systems are mostly non-linear PDEs that need to be solved numerically [85]. A type of reaction–diffusion model that is widely used to study pattern formation is the so-called activator–inhibitor model [61, 62, 74]. Self-enhancement is necessary to supply the field of cells with activators, where any relatively small dominance of activator is amplified. This in turn induces the production of inhibitors, where induction is strongest in activator peaks. These inhibitors then exert their effect by diffusing to neighbouring cells, thereby fulfilling the role of long-range inhibition. The activator–inhibitor model is described by

$$ \frac{\partial A}{\partial t} = \rho_A \left( \frac{A^2}{1 + K_I I} - A \right) + D_A \nabla^2 A, \tag{14} $$

$$ \frac{\partial I}{\partial t} = \rho_I (A^2 - I) + D_I \nabla^2 I, \tag{15} $$

where A and I represent the activator and inhibitor concentrations, respectively; ρA, ρI are the cross-reaction coefficients; and DA, DI are the diffusion constants. As one can show mathematically, a necessary requirement for pattern formation of a two-component system is that the diffusion rates have to differ such that $D_A \ll D_I$ [74]. This is the reason for the notion of local self-enhancement and long-range inhibition as necessary ingredients to produce patterns as are seen in biology [61, 62]. For a system of two interacting and diffusing chemicals, one can show that exactly two classes of pattern-forming networks exist: activator–inhibitor and substrate depletion [74]. Many examples exist in the literature, and interested readers can find further details in the textbooks by Murray and Edelstein-Keshet [32, 74]. In plant biology, it is often justified to assume that intra-cellular gradients are small, i.e., to assume that intra-cellular diffusion is fast enough to ensure a homogeneous protein distribution inside the cells on the time scale relevant for pattern formation (usually the time scale of cell differentiation). In this case, space is discretised with the cell as the fundamental discretisation unit. This turns the above-described partial differential equation into a coupled system of ordinary differential equations, as we will explain in the next section using an example from plant systems biology.

2.2.1 Trichome Patterning Model
An example of an application of the activator–inhibitor model is the pattern of trichomes that are formed on Arabidopsis thaliana leaves [52]. Through genetic analysis, genes were identified that fulfilled the roles of “activator” or “inhibitor”, and through modelling approaches, their interactions were tested in the context of
Fig. 3 The trichome patterning model is formulated using coupled ordinary differential equations on a grid. The cells with a high concentration of the activating complex are taken to be precursors of trichome cells
patterning [23, 29, 81]. Engineering biological pattern formation is one of the most challenging tasks of synthetic biology [92, 111, 119], and the trichome system may help to understand how to engineer epidermal patterns in plants. As an example, we briefly discuss a model that consists only of GLABRA1 (GL1), GLABRA3 (GL3), and TRIPTYCHON (TRY). The activator GL1 and the inhibitor TRY compete for binding to GL3. The complex GL1–GL3 is denoted as the activating complex AC, as it activates the downstream gene GLABRA2 (GL2) that triggers trichome formation. The mathematical model based on GL1, GL3, and TRY was tested against experimentally observed phenotypes [29]. It focuses on the early phases of development and takes into account only the establishment of a spatial protein concentration pattern (see Fig. 3), which triggers cell differentiation [52, 81]. The model network is shown in Fig. 4a. Because intra-cellular gradients can be neglected, the model is formulated on a hexagonal grid of Ni by Nj cells (Fig. 3). The diffusive spatial coupling on the hexagonal grid is described by $\hat{D}$:

$$ \hat{D} A_{i,j} = A_{i-1,j} + A_{i+1,j} + A_{i,j-1} + A_{i,j+1} + A_{i+1,j-1} + A_{i-1,j+1} - 6 A_{i,j}. \tag{16} $$
The following system of coupled ODEs describes the change in concentrations of GL1, GL3, TRY, and the GL1–GL3 active complex (AC) as shown in Fig. 4a:

$$\partial_t [GL1]_{i,j} = \sigma_1 + \alpha_1 [AC]_{i,j} - [GL1]_{i,j}\left(\rho_1 + \beta_1 [GL3]_{i,j}\right) \tag{17}$$

$$\partial_t [GL3]_{i,j} = \sigma_2 + \alpha_2 [AC]_{i,j} - [GL3]_{i,j}\left(\rho_2 + \beta_1 [GL1]_{i,j} + \beta_2 [TRY]_{i,j}\right) \tag{18}$$

$$\partial_t [TRY]_{i,j} = \alpha_3 [AC]^2_{i,j} - [TRY]_{i,j}\left(\rho_3 + \beta_2 [GL3]_{i,j}\right) + \gamma_1 \hat{D}[TRY]_{i,j} \tag{19}$$

$$\partial_t [AC]_{i,j} = \beta_1 [GL3]_{i,j} [GL1]_{i,j} - \rho_4 [AC]_{i,j}. \tag{20}$$

Fig. 4 Example of a trichome patterning model. (a) Interaction scheme for the model. AC is the active complex formed by GL1 and GL3. IC is the inactive complex formed by TRY and GL3. Note that these inactive complexes are not modelled explicitly, as they do not feed back into the system. (b) A simulation of a wild-type parameter set, showing the concentration pattern for AC. (c) The model predicts increasing trichome densities for increasing GL3 basal production, showing two example simulations for different GL3 overexpression levels
These equations have to be solved simultaneously for every cell $i, j$ on the grid. On a 30 × 30 grid, this results in 4 × 30 × 30 = 3600 coupled equations that need to be integrated until the steady state is reached. In Fig. 4b, we show the result of a numerical simulation. The green dots denote the peaks of the pattern of the active complex AC. The concentration of this complex is used as an indicator for cells that differentiate towards the trichome cell fate. Once the model is set up and calibrated, it can be used to design a different pattern by studying the effects of different parameters on the resulting pattern. One feature of the trichome pattern that is relatively easy to manipulate is the trichome density. This was seen in mutant studies where GL3 overexpression (35S:GL3) resulted in an increase in the number of trichomes [29]. One can use the model to design a trichome pattern of a particular, desired density, as shown in Fig. 4c, where an increase in the parameter for GL3 basal production ($\sigma_2$) of up to 10 times its wild-type value leads to an up to three-fold increase in trichome density.
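The spatial coupling of Eq. (16) is the only term that links neighbouring cells, so it is worth seeing in isolation. The following is a minimal sketch in plain Python; the function name `hex_coupling` and the periodic boundary conditions are assumptions made here for illustration and are not taken from the published model.

```python
def hex_coupling(A):
    """Apply the coupling operator of Eq. (16) to a grid of concentrations.

    A is a list of rows (Ni x Nj). Periodic boundaries are assumed here.
    """
    ni, nj = len(A), len(A[0])
    out = [[0.0] * nj for _ in range(ni)]
    for i in range(ni):
        for j in range(nj):
            up, down = (i - 1) % ni, (i + 1) % ni
            left, right = (j - 1) % nj, (j + 1) % nj
            # six neighbours of cell (i, j) on the hexagonal grid, as in Eq. (16)
            out[i][j] = (A[up][j] + A[down][j] + A[i][left] + A[i][right]
                         + A[down][left] + A[up][right] - 6.0 * A[i][j])
    return out


# A uniform field has no net exchange; a single peak loses material to six neighbours.
uniform = [[1.0] * 5 for _ in range(5)]
peak = [[0.0] * 5 for _ in range(5)]
peak[2][2] = 1.0
coupled_uniform = hex_coupling(uniform)
coupled_peak = hex_coupling(peak)
```

In a full simulation, this operator would be evaluated for [TRY] at every time step and multiplied by $\gamma_1$, as in Eq. (19).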
The trichome model is deterministic, i.e., running the model with the same parameters and initial conditions will always result in the same trichome pattern. However, the pattern formation process is triggered by small initial deviations of the protein concentrations between the cells due to random fluctuations [74]. These initial random fluctuations result from stochastic gene expression [8, 35], which needs to be described mathematically by stochastic models.

2.3 Stochastic Models
When the copy number of some of the relevant molecules involved in a chemical reaction is small, we need to resort to stochastic processes. Due to the stochasticity of the reaction, it is impossible to know the exact state of the system at any time (besides, maybe, the initial state). One can only make statements about the probability of finding the system in a given state at a given time. In contrast to a deterministic description, the time development of the system has to be described by an equation for the time development of this probability. The equation for the probability is a balance equation; at each time point it accounts for the gain and the loss in the probability of finding the system in a given state. As an example, let us consider the phytochrome photoreceptors, which in a simplified description can be in two states, inactive and active. The transitions between these two states are governed by the rates $k_1$ and $k_2$, which depend on the light conditions, and $k_d$, which is the rate of the light-independent spontaneous reaction from the active to the inactive state [99]. To simplify matters, we first analyse the situation for one phytochrome only ($N = 1$). We examine the reaction between state 0 (inactive) and state 1 (active):

$$0 \underset{k_2 + k_d}{\overset{k_1}{\rightleftharpoons}} 1.$$

The probability $P(1, t + \Delta t)$ of finding the phytochrome in the active state at time $t + \Delta t$ is based on the probability at time $t$ and the transition probabilities to move either from state 0 to 1 (gain) or vice versa (loss):

$$P(1, t + \Delta t) = P(1, t) + \underbrace{k_1 \Delta t \, P(0, t)}_{\text{gain}} - \underbrace{(k_2 + k_d) \Delta t \, P(1, t)}_{\text{loss}}.$$
Dividing by $\Delta t$ and taking the limit $\Delta t \to 0$ yields a differential equation for $P(1, t)$:

$$\frac{dP(1, t)}{dt} = k_1 P(0, t) - (k_2 + k_d) P(1, t).$$

This type of gain–loss equation for the probability is called the master equation. In steady state, i.e., when $t \to \infty$, we find

$$P(0, t \to \infty) = \frac{k_2 + k_d}{k_1 + k_2 + k_d}, \qquad P(1, t \to \infty) = \frac{k_1}{k_1 + k_2 + k_d}.$$
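The steady-state probability can be checked against a direct stochastic simulation of a single molecule. The sketch below uses a standard Gillespie-type scheme (exponentially distributed waiting times) and measures the fraction of time spent in the active state; the function name, the rate values, and the simulation length are illustrative assumptions, not values from the chapter.

```python
import random


def simulate_two_state(k1, k2, kd, t_end, seed=0):
    """Simulate one molecule switching between state 0 and state 1.

    Returns the fraction of the time interval [0, t_end] spent in state 1.
    """
    rng = random.Random(seed)
    state, t, time_active = 0, 0.0, 0.0
    while t < t_end:
        rate = k1 if state == 0 else (k2 + kd)   # exit rate of the current state
        dwell = rng.expovariate(rate)            # exponential waiting time
        dwell = min(dwell, t_end - t)            # clip the final interval
        if state == 1:
            time_active += dwell
        t += dwell
        state = 1 - state                        # switch to the other state
    return time_active / t_end


k1, k2, kd = 2.0, 0.5, 0.5                       # illustrative rates only
p1_exact = k1 / (k1 + k2 + kd)                   # steady-state P(1) from the master equation
p1_sim = simulate_two_state(k1, k2, kd, t_end=20000.0)
```

For a long trajectory, `p1_sim` should lie close to `p1_exact`; the remaining difference reflects the finite simulation time.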
We now consider an ensemble of $N$ phytochromes exposed to light. Each of the molecules constantly cycles between the inactive and active states. We aim to find the steady-state probability of finding $n$ phytochromes in the active state. Denoting by 0, 1, 2, etc. the number of phytochromes in the active state, we can write for the reaction scheme

$$0 \underset{k_2 + k_d}{\overset{k_1}{\rightleftharpoons}} 1 \underset{k_2 + k_d}{\overset{k_1}{\rightleftharpoons}} 2 \cdots (n-1) \underset{k_2 + k_d}{\overset{k_1}{\rightleftharpoons}} n \cdots (N-1) \underset{k_2 + k_d}{\overset{k_1}{\rightleftharpoons}} N.$$

The corresponding master equation reads

$$\frac{dP(n, t)}{dt} = \underbrace{k_1 (N - n + 1) P(n - 1, t)}_{\text{gain: from inactive states}} + \underbrace{(k_2 + k_d)(n + 1) P(n + 1, t)}_{\text{gain: from active states}} - \underbrace{k_1 (N - n) P(n, t)}_{\text{loss: to active states}} - \underbrace{(k_2 + k_d)\, n\, P(n, t)}_{\text{loss: to inactive states}}.$$
This equation describes the time development of the probability of finding $n$ molecules in the active state at time $t$. It needs to be solved such that it obeys the initial condition $P(n, t_0) = P_0(n)$ and the boundary conditions $P(-1, t) = P(N + 1, t) = 0$, $\forall t \geq t_0$. The solution in steady state is given by the binomial distribution:

$$P(n, t \to \infty) = \frac{N!}{(N - n)!\, n!} \left(\frac{k_1}{k_1 + k_2 + k_d}\right)^{n} \left(\frac{k_2 + k_d}{k_1 + k_2 + k_d}\right)^{N - n}.$$

Figure 5 shows examples of this solution for different ensemble sizes. As expected, the probability of finding a large number of phytochromes in the active state is smaller under far-red light than under red light. We can now calculate, e.g., the mean and the variance in steady state:

$$\mu = \langle n \rangle = \frac{k_1}{k_1 + k_2 + k_d}\, N, \qquad \sigma^2 = \langle n^2 \rangle - \langle n \rangle^2 = \frac{k_1 (k_2 + k_d)}{(k_1 + k_2 + k_d)^2}\, N.$$
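These steady-state results can be verified numerically. The sketch below evaluates the binomial distribution, inserts it into the right-hand side of the master equation for the ensemble (with the boundary conditions $P(-1) = P(N + 1) = 0$), and computes the mean and variance by direct summation. The function names and all rate values are assumptions chosen for illustration.

```python
from math import comb


def steady_state_distribution(N, k1, k2, kd):
    """Binomial steady-state probabilities P(n) for n = 0..N."""
    p = k1 / (k1 + k2 + kd)          # single-molecule probability of being active
    return [comb(N, n) * p**n * (1.0 - p)**(N - n) for n in range(N + 1)]


def master_equation_rhs(P, N, k1, k2, kd):
    """Right-hand side dP(n)/dt of the ensemble master equation for every n."""
    def prob(n):                      # boundary conditions P(-1) = P(N+1) = 0
        return P[n] if 0 <= n <= N else 0.0
    return [k1 * (N - n + 1) * prob(n - 1) + (k2 + kd) * (n + 1) * prob(n + 1)
            - k1 * (N - n) * prob(n) - (k2 + kd) * n * prob(n)
            for n in range(N + 1)]


N, k1, k2, kd = 100, 6.7, 1.0, 0.1   # illustrative values only
P = steady_state_distribution(N, k1, k2, kd)
mean = sum(n * Pn for n, Pn in enumerate(P))
variance = sum(n * n * Pn for n, Pn in enumerate(P)) - mean**2
residual = max(abs(r) for r in master_equation_rhs(P, N, k1, k2, kd))
```

A vanishing `residual` (up to rounding) confirms that the binomial distribution is stationary, and `mean` and `variance` can be compared with the closed-form expressions for $\mu$ and $\sigma^2$.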
A measure of how well the system is described by the mean is given by the coefficient of variation $\mathrm{CV} = \sigma/\mu$:

$$\mathrm{CV} = \sqrt{\frac{k_2 + k_d}{N k_1}}.$$

We find the well-known result $\mathrm{CV} \sim 1/\sqrt{N}$.

Fig. 5 Probability of finding $n$ phytochromes in the active state at steady state for two different conditions. (a) An ensemble of $N = 10$ phytochromes at different wavelengths. In far-red light ($\lambda = 740$ nm, corresponding to $k_1/k_2 = 0.17$), the probability of finding active phytochromes is much smaller than in red light ($\lambda = 660$ nm, corresponding to $k_1/k_2 = 6.7$). (b) Increased ensemble size of $N = 100$, again for far-red and red-light conditions

To find an equation for the mean or average number of molecules $\langle n \rangle$, one multiplies the master equation by $n$ and sums over all possible values of $n$. This gives rise to

$$\frac{d\langle n \rangle}{dt} = k_1 \left(N - \langle n \rangle\right) - (k_2 + k_d) \langle n \rangle.$$
This equation for the average of the stochastic process is identical to the deterministic equation one obtains using mass–action kinetics [60]. This means that the average obtained from the master equation and the result from mass–action kinetics agree. However, this is not true for non-linear reactions, such as protein–protein binding. In this case, the equation for the mean obtained from the master equation differs from the equation derived from mass–action kinetics. Additional assumptions, e.g., high molecule abundance, are required for correspondence between the stochastic and deterministic descriptions [40]. For a general chemical reaction system, the chemical master equation (CME) describes the time evolution of the probability distribution over the chemical composition of the system. However, although it is an exact description of the probability of finding a given chemical composition of the biological system at a given time, analytical solutions are very difficult to obtain. This is due, in part, to biological networks breaking detailed balance through synthesis and degradation reactions. Sample trajectories of the CME can be generated numerically with the Gillespie algorithm, also known as the Stochastic Simulation Algorithm (SSA), or one of its variants (see ref. [45] for a review). However, to obtain information about the probability distribution, several thousand such simulations need to be performed. Since the CME is difficult to solve and computationally intensive to simulate for larger systems, several approximate schemes have been developed. Among these, the linear-noise approximation and the Chemical Langevin Equation are widely used [40, 44, 47, 93, 113, 118]. As a rule of thumb, a stochastic description is necessary whenever small numbers of molecules are involved in the relevant biochemical reactions. One has to be very careful when making statements about the behaviour of single cells based on measurements using bulk readouts such as the reporter gene secreted alkaline phosphatase (SEAP) [35].

2.4 Model-Based Design
Mathematics can be used to aid the design of synthetic systems. In fact, it is the mathematical formulation of the biological system that allows for a systematic approach. A simple-minded trial-and-error experimental strategy is in many cases infeasible because of the often enormous number of possibilities and the time needed for the experiments. However, a mathematical investigation of the biological system also requires knowledge of the interaction network one aims to explore, which is not always clear a priori. What should be clear, nevertheless, is the design goal one has in mind, for example, sustained oscillations, toggle switches, or a certain spatial pattern [98]. This design goal can be cast into a fitness function that serves as a metric to measure the distance between the behaviour of a particular system and the desired behaviour [68]. Sometimes it is possible to design the desired functionality based on fundamental design principles [4, 7, 17, 61, 68, 75, 92, 94, 95, 98, 111, 116, 117, 119]. But very often knowing the design principle is only the first step, and one needs to add further requirements and aim for synthetic networks that are optimal in a multi-objective sense [79]. For example, when building a synthetic oscillator, one may not only look for oscillating networks but also require that they are robust against molecular perturbations [34, 75, 107]. One needs to decide upon the required mathematical modelling approach: are ordinary, partial, or stochastic differential equations necessary? Once this is decided, a computational method to explore the parameter and network space is needed. We mention two possibilities. The first is an exhaustive search in network space combined with a Monte Carlo search strategy for the parameters [51, 68]. This strategy provides an unbiased exploration of the network space but is only feasible for small networks, i.e., one must clearly delimit the space to be searched by coarse-graining the features of the network. This could be done by fixing features such as the number of nodes or the types of nodes used in the network. One can evaluate each network architecture, e.g., by its robustness, defined as the fraction of the sampled parameters for which it can perform the target function above some threshold score. The second search strategy is based on evolutionary algorithms (EAs). EAs mimic the way biological systems evolve in nature by introducing random mutations and selection based on model performance at each step [68, 89, 100, 102, 103]. After many rounds of mutation and selection (often hundreds of cycles), convergence to particular network structures can often be observed. Because EAs follow individual network evolution trajectories, they have the advantage of highlighting networks that are “evolvable”. A further advantage of EAs is that they make it possible to explore large network and parameter spaces. At the same time, however, they can become trapped in local fitness maxima.

Once a network structure that gives the required functionality is known, it may be desirable to realise the network in vivo. However, matching model predictions and the constructed biological systems can be challenging due to, for example, mismatching expression levels between parts, unexpected interactions between the system and the host, or toxicity. Ideally, the parts for the design are well characterised and the model parameters are well defined. It is therefore necessary to know the values of the real network parameters, such as protein–protein binding rates and protein degradation rates, as this will further constrain the possible networks. To this end, experiments need to be designed and performed, and parameters inferred from the generated data. In the next section, we give a brief overview of the standard techniques for these purposes.
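To make the evolutionary search strategy described in this section concrete, the following toy sketch evolves the three rates of the two-state phytochrome model towards a hypothetical design goal. The target values, the fitness function, the population size, and the mutation strength are all assumptions made for illustration; the network structure is kept fixed, and only parameters are mutated, whereas the cited approaches also mutate the network itself.

```python
import math
import random


def fitness(params, target_fraction=0.8, target_rate=5.0):
    """Distance between model behaviour and a hypothetical design goal (lower is better)."""
    k1, k2, kd = params
    total = k1 + k2 + kd
    fraction = k1 / total                      # steady-state active fraction
    return (fraction - target_fraction) ** 2 + ((total - target_rate) / target_rate) ** 2


def evolve(n_generations=200, pop_size=20, n_keep=5, sigma=0.2, seed=1):
    rng = random.Random(seed)
    # random initial rates, log-uniform between 0.01 and 100
    population = [[10 ** rng.uniform(-2, 2) for _ in range(3)] for _ in range(pop_size)]
    history = []
    for _ in range(n_generations):
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        parents = population[:n_keep]          # selection: keep the best individuals
        children = []
        while len(parents) + len(children) < pop_size:
            parent = rng.choice(parents)
            # mutation: multiplicative log-normal noise keeps all rates positive
            children.append([p * math.exp(rng.gauss(0.0, sigma)) for p in parent])
        population = parents + children
    population.sort(key=fitness)
    history.append(fitness(population[0]))
    return population[0], history


best, history = evolve()
```

Because the best individuals are carried over unchanged, the best fitness in `history` cannot get worse from one generation to the next; whether the search reaches the global optimum still depends on the fitness landscape.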
3 Parameter Inference

Dynamic mathematical models depend on a set of parameters that describe processes like synthesis, degradation, binding rates, reaction rates, etc. These rates often cannot be measured directly but need to be inferred from data by means of a mathematical model. Every biological dataset will contain variation stemming from intrinsic stochastic noise inherent to biochemical reactions and variations due to environmental noise or measurement errors [39, 114]. Highly variable data negatively impact the chances of finding a single optimal parameter set. As changing experimental conditions can lead to alterations in biological networks, data for parameter estimation would ideally come from a single set of experimental conditions. Mixing of datasets across different experimental setups can lead to further variation in parameter estimates, and the researcher cannot guarantee that the underlying system of equations does not need altering between different experimental conditions. These variations in the data affect the parameter estimates as well as their precision, imposing a certain level of data quality needed to determine the parameters. Given the available dataset, a method is needed to match the model simulation to the experimental observations:
$$\dot{\vec{x}}(t) = \vec{f}(\vec{x}(t), \vec{\theta}), \tag{21}$$

where $\vec{x}$ are the state variables and $\vec{\theta}$ are the model parameters. Often the state variables of the system cannot be observed directly, but only a function of $\vec{x}$ can be measured. Each possible measurement is mathematically represented by a functional mapping

$$\vec{y}(t) = \vec{g}(\vec{x}(t), \vec{\lambda}) + \vec{\varepsilon}(t) \tag{22}$$

that might include additional parameters $\vec{\lambda}$, such as scaling parameters for relative data obtained from fluorescence measurements. The measurement noise is denoted by $\vec{\varepsilon}(t)$. A usual assumption is that $\vec{\varepsilon}(t)$ is normally distributed. The dimensionality of $\vec{y}$ is often smaller than the dimensionality of $\vec{x}$. A wide range of parameter estimation methods is available; they share a common principle, namely the minimisation of a cost function that measures the discrepancy between model output and data [9, 14, 72, 88]. One of the simplest and most widely used cost functions is the sum of squared residuals, which is equivalent to a maximum likelihood approach based on the assumption that the measurement noise is Gaussian:
$$C(\vec{\theta}) = \sum_{i=1}^{n} \frac{\left(d_i - y_i(\vec{\theta})\right)^2}{\sigma_i^2}, \tag{23}$$

where $d_i$ is the measured data point at time $t_i$, $\sigma_i^2$ is the variance of the data point $d_i$, and $y_i$ is the corresponding model output at $t_i$. For notational simplicity, we have absorbed the additional parameters $\vec{\lambda}$ into $\vec{\theta}$, i.e., we defined $\tilde{\vec{\theta}} = (\vec{\theta}, \vec{\lambda})$ and dropped the tilde. $C$ can be interpreted as a score that becomes smaller as the model output matches the data more closely, i.e., if $\vec{y} = (y_1, \ldots, y_n) \to \vec{d} = (d_1, \ldots, d_n)$, then $C \to 0$. The parameter set for which $C \approx 0$ is potentially the correct set of biological rates for the system that is being modelled. This cost function can also be extended to encompass multiple experiments by

$$\hat{C}(\vec{\theta}) = \sum_{j=1}^{m} C_j \tag{24}$$

$$\hat{C}(\vec{\theta}) = \sum_{j=1}^{m} \sum_{i=1}^{n_j} \frac{\left(d_{i,j} - y_{i,j}(\vec{\theta})\right)^2}{\sigma_{i,j}^2}, \tag{25}$$
where $m$ is the number of experiments and $n_j$ indicates the number of data points in experiment $j$. Finding the best-fitting parameter set involves searching the parameter space, which is often high-dimensional (as many dimensions as parameters). For the high-dimensional cases, it becomes computationally very costly to test all possible parameter combinations. Several methods exist for efficient searching of the parameter space, such as linear and non-linear least-squares fitting, simulated annealing, genetic algorithms, and evolutionary algorithms [9, 10, 12–14, 19, 22, 25, 28, 39, 69, 72, 83, 122]. These optimisation methods search the parameter space and minimise a cost function such as Eq. (23). Depending on the system under study, these methods may be computationally very expensive or perform poorly when the data are very noisy. It may happen that, despite the information contained in the data, not all parameters can be identified with sufficient precision. Instead, several parameter sets or a range of values give an equally good fit. In practice, this is quite common, and in these cases, the parameters are referred to as non-identifiable.

3.1 Parameter Identifiability
Identifying a parameter of a mathematical model usually means obtaining information about the parameter value with a certain level of confidence. The confidence with which one knows a parameter value is expressed in terms of a confidence interval [31]. Increasing the knowledge about a parameter thus results in a reduction of the confidence interval. However, it can happen that parameters are not identifiable in the sense that the confidence interval is either large or even infinite. One distinguishes two types of non-identifiability:

1. Practical non-identifiability: In this case, the amount and/or quality of the data are insufficient to get a good estimate (i.e., a small confidence interval) of the parameter.
2. Structural non-identifiability: There is a functional relationship between two parameters that does not allow the identification of the parameter in question, e.g., the model depends only on the ratio of two parameters. This can be resolved by a qualitative change in the data (e.g., measurement of a different species or a change in experimental conditions) or by changing the model.

There are several approaches for identifiability analysis [26, 88]. A sufficient way to detect structural non-identifiability is to determine the rank of the extended sensitivity matrix $S$ for the observables $\vec{y} = (y_1, \ldots, y_m)$ [104, 105]:
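$$S = \begin{pmatrix}
\frac{\partial y_1}{\partial \theta_1}(t_1) & \cdots & \frac{\partial y_1}{\partial \theta_P}(t_1) \\
\vdots & \ddots & \vdots \\
\frac{\partial y_1}{\partial \theta_1}(t_N) & \cdots & \frac{\partial y_1}{\partial \theta_P}(t_N) \\
\frac{\partial y_2}{\partial \theta_1}(t_1) & \cdots & \frac{\partial y_2}{\partial \theta_P}(t_1) \\
\vdots & \ddots & \vdots \\
\frac{\partial y_2}{\partial \theta_1}(t_N) & \cdots & \frac{\partial y_2}{\partial \theta_P}(t_N) \\
\vdots & \ddots & \vdots \\
\frac{\partial y_m}{\partial \theta_1}(t_1) & \cdots & \frac{\partial y_m}{\partial \theta_P}(t_1) \\
\vdots & \ddots & \vdots \\
\frac{\partial y_m}{\partial \theta_1}(t_N) & \cdots & \frac{\partial y_m}{\partial \theta_P}(t_N)
\end{pmatrix} \tag{26}$$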
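The rank test can be illustrated with a toy observable $y(t) = a\,b\,e^{-ct}$, in which the parameters $a$ and $b$ enter only as a product. This model, the helper names, and the parameter values are assumptions made for illustration. The sketch builds the sensitivity matrix analytically and determines its rank by Gaussian elimination; in practice, a linear algebra library would normally be used for the rank computation.

```python
import math


def sensitivity_matrix(theta, times):
    """Rows: time points; columns: dy/da, dy/db, dy/dc for y = a*b*exp(-c*t)."""
    a, b, c = theta
    return [[b * math.exp(-c * t), a * math.exp(-c * t), -a * b * t * math.exp(-c * t)]
            for t in times]


def sensitivity_matrix_reparam(kappa, c, times):
    """Same observable written with the single product parameter kappa = a*b."""
    return [[math.exp(-c * t), -kappa * t * math.exp(-c * t)] for t in times]


def matrix_rank(rows, tol=1e-10):
    """Rank of a small dense matrix by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in rows]
    n_rows, n_cols, rank = len(m), len(m[0]), 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue                           # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for k in range(col, n_cols):
                m[r][k] -= factor * m[rank][k]
        rank += 1
    return rank


times = [0.0, 1.0, 2.0, 3.0, 4.0]
rank_full_model = matrix_rank(sensitivity_matrix((2.0, 3.0, 0.5), times))
rank_reparam = matrix_rank(sensitivity_matrix_reparam(6.0, 0.5, times))
```

The three-parameter version yields a rank smaller than the number of parameters, because the columns for $a$ and $b$ are proportional; after rewriting the model in terms of the product, the matrix has full column rank.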
If and only if the matrix $S$ is full rank, the model parameters are identifiable. Full rank means for the matrix $S$ that only the zero vector is mapped to the zero vector (the dimension of the kernel of $S$ is zero). If in doubt, testing the rank of the sensitivity matrix $S$ is always a good idea. However, in case $S$ is not full rank, one needs to go into further detail to figure out which parameters are related, why the model is structurally non-identifiable, and how to change the model to make it structurally identifiable [48, 57, 106]. Some data-based approaches also allow for conclusions about practical non-identifiability that is caused by the limited quality of experimental data [88]. One interesting way to evaluate parameter identifiability is by calculating the profile likelihood, following Kreutz et al. [64] and Raue et al. [87, 88]:

$$\mathrm{PL}(\theta_i) = \min_{\vec{\theta} \,\mid\, \theta_i\ \text{fixed}} \hat{C}(\vec{\theta}). \tag{27}$$
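As a minimal sketch of Eq. (27), consider the toy model $y(t) = \theta_1 e^{-\theta_2 t}$ with noise-free synthetic data. The model, the data, the grid, and the function names are assumptions for illustration. The profile for $\theta_2$ is obtained by fixing $\theta_2$ on a grid and minimising the cost over $\theta_1$; because this toy model is linear in $\theta_1$, the inner minimisation has a closed form, whereas a general model would require a numerical optimiser at this step.

```python
import math


def cost(theta1, theta2, times, data, sigma):
    """Weighted sum of squared residuals for y(t) = theta1 * exp(-theta2 * t)."""
    return sum(((d - theta1 * math.exp(-theta2 * t)) / sigma) ** 2
               for t, d in zip(times, data))


def profile_likelihood_theta2(theta2, times, data, sigma):
    """Minimise the cost over theta1 while theta2 is kept fixed."""
    basis = [math.exp(-theta2 * t) for t in times]
    # linear least squares: optimal theta1 for the fixed theta2 (constant sigma cancels)
    theta1_best = sum(d * e for d, e in zip(data, basis)) / sum(e * e for e in basis)
    return cost(theta1_best, theta2, times, data, sigma), theta1_best


times = [0.0, 0.5, 1.0, 2.0, 4.0]
true_theta = (2.0, 0.7)
data = [true_theta[0] * math.exp(-true_theta[1] * t) for t in times]  # noise-free
sigma = 0.1
grid = [0.1 * i for i in range(1, 21)]            # theta2 from 0.1 to 2.0
profile = [profile_likelihood_theta2(th, times, data, sigma)[0] for th in grid]
theta2_hat = grid[profile.index(min(profile))]
```

Plotting `profile` against `grid` gives a curve of the kind sketched in Fig. 6; comparing it with a chosen threshold then yields a confidence interval for $\theta_2$.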
Fig. 6 Different profile likelihood examples for a certain parameter $\theta_i$. (a) The parameter $\theta_i$ is structurally non-identifiable due to the flat profile likelihood. (b) The parameter $\theta_i$ is practically non-identifiable. $\Delta_\alpha$ indicates the different confidence thresholds; $\alpha = 0.05$ corresponds to a confidence of 95% and $\alpha = 0.32$ to 68%. (c) The parameter $\theta_i$ is identifiable
The profile likelihood is a minimisation of the sum of squared residuals in Eq. (25) with respect to all parameters while keeping the parameter of interest fixed. In Fig. 6, we show the three possible cases for the profile likelihood. In the case of a structural non-identifiability, the parameter has a flat profile likelihood (Fig. 6a). For practical non-identifiabilities, the profile likelihood shows a minimum, but for increasing and/or decreasing values of $\theta_i$ it does not exceed a confidence threshold $\Delta_\alpha$ at confidence level $\alpha$ (Fig. 6b). A parameter that is identifiable has a minimum and passes a threshold $\Delta_\alpha$ for both increasing and decreasing values of $\theta_i$ (Fig. 6c). In the case of a structural non-identifiability, one needs to rewrite the model, while in the case of a practical non-identifiability, it is necessary to obtain better (less variable) or different data. This is also the case if the model is identifiable, but the confidence intervals are large and one aims to reduce the uncertainty about the estimated parameters. Which type of data is required, and what to measure, how, and when, are questions that are answered by optimal experimental design.

3.2 Optimal Experimental Design
The process of optimal experimental design (OED) can help to identify the most informative experimental setup in terms of statistical inference of the model parameters. Typically, this involves choosing at which time points to measure or which set of stimuli or inputs should be applied to the system under study. For every possible design, the amount and quality of the information contained in the experiment can be quantified, with the aim of achieving the highest precision in parameter estimates. The quantification of information involves a specified optimality or design criterion, often based on the Fisher Information Matrix (FIM) [14, 63] given by:

$$F = S^{T} Q S. \tag{28}$$
The matrix $S$ is the sensitivity matrix given by Eq. (26), $S^{T}$ its transpose, and $Q$ is a positive definite symmetric weighting matrix [11]. The matrix $F$ measures the amount of information contained in the data, and one can show that no unbiased estimator for a set of parameters can have a smaller covariance matrix than the inverse of the FIM [31, 38, 70]. Optimal experimental design (OED) based on the Fisher Information Matrix involves maximisation (or minimisation) of a scalar measure derived from the FIM by varying the system’s inputs, sampling times, experiment duration, and/or initial conditions. There are several design criteria [58, 63], of which the most commonly used are D-optimality, (modified) E-optimality, and A-optimality. Each of these optimality criteria is a functional $\phi$ of the eigenvalues of the FIM:
• D-optimal: $\max \phi = \det(F)$
• E-optimal: $\max \phi = \lambda_{\min}(F)$
• Modified E-optimal: $\min \phi = \lambda_{\max}(F) / \lambda_{\min}(F)$
• A-optimal: $\max \phi = \mathrm{tr}(F)$
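The criteria listed above can be evaluated directly once the sensitivity matrix is available. The following sketch does so for the toy observable $y(t) = a\,e^{-ct}$ and compares two hypothetical sets of sampling times. The model, the nominal parameter values, the sampling times, and the choice $Q = I/\sigma^2$ are assumptions made for illustration; for two parameters, the eigenvalues of $F$ follow in closed form from its trace and determinant.

```python
import math


def sensitivity_rows(theta, times):
    """Rows of S for y(t) = a*exp(-c*t): columns dy/da and dy/dc."""
    a, c = theta
    return [[math.exp(-c * t), -a * t * math.exp(-c * t)] for t in times]


def fisher_information(S, sigma=1.0):
    """F = S^T Q S with Q = I / sigma^2, returned as (F11, F12, F22)."""
    w = 1.0 / sigma ** 2
    f11 = w * sum(r[0] * r[0] for r in S)
    f12 = w * sum(r[0] * r[1] for r in S)
    f22 = w * sum(r[1] * r[1] for r in S)
    return f11, f12, f22


def criteria(F):
    """Design criteria as defined in the list above, for a symmetric 2x2 FIM."""
    f11, f12, f22 = F
    trace, det = f11 + f22, f11 * f22 - f12 * f12
    root = math.sqrt(max(trace * trace / 4.0 - det, 0.0))
    lam_min, lam_max = trace / 2.0 - root, trace / 2.0 + root
    return {"D": det, "E": lam_min, "modified_E": lam_max / lam_min, "A": trace}


nominal = (2.0, 0.7)                                  # nominal parameter set (assumed)
early = criteria(fisher_information(sensitivity_rows(nominal, [0.0, 0.1, 0.2, 0.3])))
spread = criteria(fisher_information(sensitivity_rows(nominal, [0.0, 1.0, 2.0, 4.0])))
```

For these assumed values, the design with sampling times spread over the decay has the larger determinant and the larger smallest eigenvalue, so it would be preferred under the D- and E-criteria.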
The D-optimality criterion maximises the determinant of the FIM, which corresponds to a minimisation of the volume of the confidence ellipsoid. An E-optimal design maximises the smallest eigenvalue, i.e., a minimisation of the largest confidence interval. The modified E-optimal design minimises the ratio between the largest and smallest eigenvalues. Finally, the A-optimality criterion maximises the trace of the FIM. How does OED work in practice? We describe here only the basic steps in broad terms. After definition of the model, one needs a first estimate for the parameters. This first parameter set is called the nominal parameter set. Then the sensitivity matrix S is calculated. In case the dynamic model equations are non-linear and can only be solved numerically, one obtains the sensitivities in Eq. (26) by solving numerically an extended set of equations [86]. Once S is constructed, it is necessary to check for structural non-identifiability of the model by determining the rank of S. In case S is not full-ranked, the model needs to be rewritten [57]. If the parameters are identifiable, the matrix F is to be constructed. If one has information about the variances of the data used to obtain the nominal parameter set, the matrix Q can be constructed using these variances; otherwise one approach is to set Q to be the identity matrix [11]. Using one of the abovementioned optimality criteria, plus possible additional constraints, one needs to find the optimal experimental setup. The space of possible experimental perturbations and measurements is usually very high-dimensional, making it necessary to limit the possibilities appropriately. For example, the perturbation could be the transient induction of a gene, such that the question may be when to induce and for how long and when to measure what. An additional criterion could be to achieve the best results with the lowest number of measurements. 
Once this is decided and the experiments are performed, the parameters need to be estimated and the confidence intervals need to be determined, e.g., by using the profile likelihood approach [64]. In case all parameters can be identified with sufficient accuracy (small confidence intervals), the procedure can be terminated. Otherwise, the scheme starts again, now based on the estimated parameter set as the new nominal parameter set. More elaborate schemes can be found in [33, 37, 39, 72]. One may encounter the situation that there are two or more models that can equally well explain the data, are consistent with what is known about the system, and for which the parameters can be estimated. In fact, this is a typical situation in systems and
synthetic biology, which may arise, e.g., because one is interested in the minimal model, sufficient for the system and questions at hand, or one is unsure about the effect of a certain component. We show in the next section how to tackle these situations.
4 Model Selection

Given a set of parametrised models, the question arises how to select from this set the model that offers the best description of the biological system under study. Deciding whether a model is actually a good fit to the system is far from a trivial problem. A number of approaches have been developed to quantify the quality of a model to help choose the best model from a set of candidates. A guiding rule among these approaches is to aim for the lowest model complexity; a more complicated model should only be chosen if it improves significantly over the simpler version. This long-established principle is known as Ockham’s razor or the law of parsimony [41, 101]. It should be noted that this is a heuristic principle and, although it is plausible, it could be misleading. It could be the case that the larger model is correct and the smaller model is wrong, although the smaller model gives an equally good explanation (fit) of the data. Nevertheless, unless one has good reasons to depart from the principle of parsimony, it provides good guidance in the search for the relevant model for a system. The question arises, however, what exactly an equally good fit for two models means and whether there are any quantitative measures on which one can base the decision which model to prefer. In the following, we provide some basic examples of statistical model selection procedures.
4.1 Likelihood Tests
The simplest test that can be performed is the likelihood ratio test. In this statistical test, the goodness of fit of two models, A and B, is compared based on the ratio of their likelihoods. The test can only be applied if the models are nested [5, 31]. Nested means that model A is a special case of the larger model B, e.g., model B has one reaction more, which is ignored in model A by setting the corresponding parameters to zero. It can be shown that the logarithm of the ratio asymptotically follows a chi-squared distribution [49]:

$$-2 \ln \frac{L(\hat{\vec{\theta}}_A \mid \vec{d})}{L(\hat{\vec{\theta}}_B \mid \vec{d})} \sim \chi^2_{n_B - n_A} \tag{29}$$

with $n_B - n_A$ degrees of freedom, where $n_A$ and $n_B$ are the numbers of free parameters, and $\hat{\vec{\theta}}_A$, $\hat{\vec{\theta}}_B$ are the maximum likelihood parameter estimates for the two models. p-values can be calculated under the null hypothesis that both models explain the data equally well. If the test statistic in Eq. (29) is significantly large, the null hypothesis is rejected. For normally distributed measurement noise, the logarithmic likelihood $\ln L(\vec{\theta} \mid \vec{d})$ is related to the sum of squared residuals given by Eq. (23):

$$-2 \ln \frac{L(\hat{\vec{\theta}}_A \mid \vec{d})}{L(\hat{\vec{\theta}}_B \mid \vec{d})} = C(\hat{\vec{\theta}}_A) - C(\hat{\vec{\theta}}_B) \sim \chi^2_{n_B - n_A}. \tag{30}$$

The likelihood ratio test is only valid for nested models, and for that reason other tests were developed.

4.2 Information Criteria
Akaike developed a model selection criterion based on information theoretic measures that can be applied to nested as well as non-nested models [1, 5, 24, 71]. The Akaike Information Criterion (AIC) is given by

$$\mathrm{AIC} = -2 \ln L(\hat{\vec{\theta}} \mid \vec{d}) + 2k, \tag{31}$$

where $k$ is the number of parameters. Interestingly, this criterion enforces parsimony or Ockham’s razor, but only as a by-product of the derivation [5]. In the case of small sample sizes, the corrected AIC (AICc) is given by [5, 24]:

$$\mathrm{AICc} = -2 \ln L(\hat{\vec{\theta}} \mid \vec{d}) + 2k + \frac{2k(k + 1)}{n - k - 1}, \tag{32}$$

where $n$ is the number of data points. The use of AICc is highly recommended in practice. Calculating the AICc for each model allows a comparison of the relative fit of a set of models to the observed data. Practically, one calculates the AICc for all models and chooses the one with the lowest AICc. This comparison does not evaluate how well the best model fits or explains the data; it only indicates that, among the available choices, the model with the lowest AICc should be preferred. Another criterion is the Bayesian Information Criterion (BIC). The BIC is based on the idea that if the amount of data goes to infinity, the true model should be selected. It remains debatable what the true model is and whether it exists [5]:

$$\mathrm{BIC} = -2 \ln L(\hat{\vec{\theta}} \mid \vec{d}) + k \ln n. \tag{33}$$
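Equations (31)–(33) are straightforward to evaluate once the maximised log-likelihood is known. The sketch below assumes Gaussian measurement noise with a known, constant standard deviation and uses invented residuals for two hypothetical models A and B; the residuals, parameter counts, and function names are placeholders, not results from any real fit.

```python
import math


def gaussian_log_likelihood(residuals, sigma):
    """ln L for independent Gaussian noise with known standard deviation sigma."""
    n = len(residuals)
    return (-0.5 * sum((r / sigma) ** 2 for r in residuals)
            - n * math.log(sigma * math.sqrt(2.0 * math.pi)))


def aic(log_l, k):
    return -2.0 * log_l + 2.0 * k


def aicc(log_l, k, n):
    return aic(log_l, k) + 2.0 * k * (k + 1) / (n - k - 1)


def bic(log_l, k, n):
    return -2.0 * log_l + k * math.log(n)


sigma = 0.1
# hypothetical residuals (data minus model output) at the best-fit parameters
residuals_a = [0.1, -0.2, 0.15, -0.05, 0.1, -0.1, 0.2, -0.15, 0.05, -0.1]
residuals_b = [0.95 * r for r in residuals_a]      # larger model, slightly better fit
n = len(residuals_a)
k_a, k_b = 2, 3                                    # assumed numbers of parameters

log_l_a = gaussian_log_likelihood(residuals_a, sigma)
log_l_b = gaussian_log_likelihood(residuals_b, sigma)
aicc_a, aicc_b = aicc(log_l_a, k_a, n), aicc(log_l_b, k_b, n)
bic_a, bic_b = bic(log_l_a, k_a, n), bic(log_l_b, k_b, n)
```

In this invented example, the slightly better fit of model B does not outweigh the penalty for its additional parameter, so model A has the lower AICc.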
The BIC penalises free parameters more strongly than the AIC. As n goes to infinity, the probability of selecting the true model will go to 1, i.e., the BIC is consistent. However, BIC requires very large sample sizes to achieve consistency; and typically for small data sizes the AICc-criterion is more efficient [5, 24]. The AIC and BIC are the most widely used selection criteria, but other information criteria exist for specific cases. For example, the Takeuchi Information Criterion (TIC) is appropriate for models that are very far from the true model, and the Widely Applicable Information Criterion
(WAIC) is suited for models in which parameter identifiability is a problem [5]. Once the models are formulated, the parameters are estimated, and the most suitable model is selected, one usually aims to analyse the implications of the best model: how does it behave under certain conditions, can one make any particularly interesting predictions, what are the specific contributions of its components, etc.? In the next section, we present some approaches to analysing a dynamic mathematical model.
5 Model Analysis

A mathematical model is more than a function that can reproduce the measured data. It contains condensed information that can be discovered by investigating the implications of the model. There are many ways to analyse a mathematical model and to derive information from it. Because in the majority of cases in biology the models are non-linear and encompass several variables, a direct way to investigate the model is by in silico experiments. Mathematical models allow one to test the effect of manipulations, including those that are difficult to perform experimentally. The challenge when doing numerical experiments, however, is that one often has to explore a high-dimensional space, for which it is difficult to obtain an overview of where in parameter space the model behaves in which manner. Alternatively, a model can be analysed using theoretical instruments, giving a more general insight into its potential behaviours. Using mathematical techniques, these model analysis approaches provide insights into system behaviour that are difficult to obtain using simulations. The difference between in silico experiments and mathematical analysis is in the outcome: simulations reveal how a system behaves, and model analysis reveals why it behaves as it does. Model analysis can reveal the non-intuitive link between model structure and system behaviour. Often the system is too complex to be tackled by one method only; a combination of simulations and mathematical analysis may yield the required insight.
5.1 Phase Plane Analysis
Typically, dynamic behaviour is represented by showing concentration profiles over time. An alternative way to visualise the system is to plot the concentrations of different species against each other [108, 109]. To give the reader an idea of how phase plane analysis works, we discuss a simple example, shown in Fig. 7a [54]. This network has five reaction rates, which can be written as
$$r_1 = \frac{k_1}{1 + (s_2/K)^n}, \quad r_2 = k_2, \quad r_3 = k_3 s_1, \quad r_4 = k_4 s_2, \quad r_5 = k_5 s_1,$$
Anna Deneer and Christian Fleck
Fig. 7 Analysis of the example model of allosteric inhibition. (a) Reaction network of the example used in this section. Production of $s_1$ is inhibited by $s_2$. $r_1$ to $r_5$ indicate the reaction rates. (b) Concentrations plotted over time, reaching steady state. (c) Phase plane showing concentration of $s_1$ versus $s_2$. (d) Phase portrait of $s_1$ versus $s_2$ using different initial conditions. All trajectories converge to the same steady state, which is where the nullclines for $s_1$ and $s_2$ intersect. Parameters used in this figure: $k_1 = 20$, $k_2 = 5$, $K = 1$, $k_3 = 5$, $k_4 = 5$, $k_5 = 2$, and $n = 4$
where $r_1$ is written with Hill-type (generalised Michaelis–Menten) kinetics to model the inhibition of the production of $s_1$ by the binding of $n$ molecules of $s_2$. The system of ODEs is then written as
$$\begin{aligned} \dot{s}_1 &= \frac{k_1}{1 + (s_2/K)^n} - k_3 s_1 - k_5 s_1, \\ \dot{s}_2 &= k_2 + k_5 s_1 - k_4 s_2. \end{aligned} \tag{34}$$
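As an illustration of how trajectories of Eq. (34) can be generated, the following minimal sketch integrates the system with a classical fourth-order Runge–Kutta scheme. The rate constants are those quoted in the caption of Fig. 7; the integrator, step size, end time, initial conditions, and all function names are our own assumptions and are not taken from the chapter.

```python
# Pure-Python sketch: integrate Eq. (34) with a classical Runge-Kutta (RK4) scheme.
# Parameter values are those quoted in the caption of Fig. 7; the step size,
# end time, and initial conditions are illustrative assumptions.

K1, K2, K3, K4, K5 = 20.0, 5.0, 5.0, 5.0, 2.0
K, N_HILL = 1.0, 4


def rhs(s1, s2):
    """Right-hand side of Eq. (34): returns (ds1/dt, ds2/dt)."""
    ds1 = K1 / (1.0 + (s2 / K) ** N_HILL) - K3 * s1 - K5 * s1
    ds2 = K2 + K5 * s1 - K4 * s2
    return ds1, ds2


def rk4_step(s1, s2, dt):
    """Advance the state by one RK4 step of size dt."""
    a1, a2 = rhs(s1, s2)
    b1, b2 = rhs(s1 + 0.5 * dt * a1, s2 + 0.5 * dt * a2)
    c1, c2 = rhs(s1 + 0.5 * dt * b1, s2 + 0.5 * dt * b2)
    d1, d2 = rhs(s1 + dt * c1, s2 + dt * c2)
    s1 += dt * (a1 + 2 * b1 + 2 * c1 + d1) / 6.0
    s2 += dt * (a2 + 2 * b2 + 2 * c2 + d2) / 6.0
    return s1, s2


def trajectory(s1_0, s2_0, t_end=20.0, dt=0.01):
    """Return the list of (s1, s2) points visited from the initial condition."""
    points = [(s1_0, s2_0)]
    s1, s2 = s1_0, s2_0
    for _ in range(int(round(t_end / dt))):
        s1, s2 = rk4_step(s1, s2, dt)
        points.append((s1, s2))
    return points


if __name__ == "__main__":
    # Different starting points should end at the same steady state.
    for start in [(0.0, 0.0), (3.0, 0.5), (0.5, 3.0)]:
        end = trajectory(*start)[-1]
        print(start, "->", tuple(round(v, 4) for v in end))
```

Plotting the first against the second component of each returned point gives a phase portrait, whereas plotting either component against the step index gives the usual time course.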
Mathematical Modelling in Plant Synthetic Biology
In this system, the production of $s_1$ is inhibited by $s_2$, while at the same time $s_1$ is consumed in the production of $s_2$ at rate $k_5$. In Fig. 7b, the concentrations of $s_1$ and $s_2$ are plotted over time. Figure 7c shows the phase portrait, in which the concentration of $s_1$ is plotted versus the concentration of $s_2$. From this figure, we can see the initial conditions $\vec{s}(t_0) = (s_1(t_0), s_2(t_0))$ and the trajectory towards the steady state $\vec{s}_\infty = \lim_{t \to \infty} \vec{s}(t)$. Note that, in contrast to Fig. 7b, time is implicit, as it parametrises the curve from the initial point towards the steady state. The phase portrait can provide a unified visualisation of the dynamic behaviour of the system. In Fig. 7d, different initial conditions show convergence to the same steady state, revealing the general system behaviour. The points at which the trajectory in the phase portrait changes direction are the points at which one of the variables reaches a local maximum or minimum, and they can be determined directly from the model. These turning points can be identified by analysing the system's nullclines, which are given by setting $\dot{s}_1 = f(s_1, s_2) = 0$ and $\dot{s}_2 = g(s_1, s_2) = 0$. The intersection of the nullclines (where $\dot{s}_1 = \dot{s}_2 = 0$) immediately reveals the steady states (fixed points) of the system. The nullclines can be determined directly from the model, without the need for simulation, by solving a set of algebraic equations. We consider two limiting cases: (i) no negative feedback from $s_2$ to $s_1$ ($K \to \infty$) and (ii) maximal negative feedback from $s_2$ to $s_1$ ($K \to 0$). In both cases, we find an analytic solution: $K \to \infty$ yields
$$s_1 = \frac{k_1}{k_3 + k_5}, \tag{35}$$
$$s_2 = \frac{k_5}{k_4} \frac{k_1}{k_3 + k_5} + \frac{k_2}{k_4}, \tag{36}$$
and for vanishing $K$, the solution reads
$$s_1 = 0, \tag{37}$$
$$s_2 = \frac{k_2}{k_4}. \tag{38}$$
Fig. 8 Steady state of $s_1$ and $s_2$ for varying values of $K$ and $n$. Other parameters are $k_1 = 40$, $k_2 = 20$, $k_3 = 1$, $k_4 = 3$, $k_5 = 3$. For very low or high values of $K$, the steady states of $s_1$ and $s_2$ converge to the solutions given in the text
A general solution for finite $K$ and $n > 2$ can only be found numerically. However, for
$$K = \frac{1}{2} \left( \frac{k_5}{k_4} \frac{k_1}{k_3 + k_5} + 2 \frac{k_2}{k_4} \right), \tag{39}$$
the solution is independent of $n$, because $s_2 = K$ gives $(s_2/K)^n = 1$ for every $n$, and is given by
$$s_1 = \frac{k_4}{k_5} K - \frac{k_2}{k_5}, \tag{40}$$
$$s_2 = K. \tag{41}$$
This means, in particular, that the steady-state solutions for $s_1$ and $s_2$, respectively, which are in general different for different $n$, coincide for this value of $K$. In Fig. 8, we show the steady states of $s_1$ and $s_2$ for varying values of $K$ and $n$.
5.2 Stability Analysis
In many cases the steady states of a system are of special interest; therefore, the long-time or asymptotic behaviour of dynamic models is often analysed. For stable genetic and biochemical networks, two different behaviours can be expected [32, 54, 74, 108]: (i) convergence to a steady state or (ii) sustained periodic oscillation, called limit cycle oscillation. Fixed points correspond to an intersection of the nullclines, i.e., they can be determined by setting the left-hand side of the differential equations (the time derivatives) to zero and solving the resulting algebraic equations for the variables. Limit cycles are far more difficult to determine [108], and we therefore confine the discussion to fixed points. A model has in general several fixed points, and, broadly speaking, fixed points can be stable or unstable [108]. Each stable fixed point is surrounded by a region called the basin of attraction. If started within this region, the trajectory eventually ends at the corresponding fixed point; thus, which of the stable fixed points is finally reached depends on the initial conditions. A stable fixed point attracts nearby trajectories, i.e., after a small perturbation away from the steady state the system relaxes back to it. The curves that separate two basins of attraction are called separatrices. Fixed points that do not attract trajectories are called unstable. For a precise definition of the stability of a fixed point, see [32, 54, 108]. The stability of the fixed points is of utmost importance for the behaviour of a system, but solving the algebraic equations for the fixed points does not reveal which fixed point is stable and which is unstable. Mathematical methods are needed to further analyse the system. We briefly discuss a simple one-dimensional example. Consider a protein that forms a dimer and positively feeds back on its own production. Ignoring any basal protein expression, the system can be described mathematically by
$$\dot{x} = \frac{\alpha x^2}{1 + (x/K)^2} - \lambda x. \tag{42}$$
Fig. 9 Stability analysis for the model of protein dimerisation and self-activation. (a) Phase plot for $\mu = 2$; three different fixed points can be identified from the plot. (b) Two examples of the time evolution of the concentration of protein $x$ for different values of $\mu$. For $\mu > 1$, the solution reaches the steady state at $x_s = \mu + \sqrt{\mu^2 - 1}$. For $\mu < 1$, the solution converges to $x_s = 0$
We assumed that the dimerisation step is much faster than protein expression and degradation. $\alpha$ denotes the expression rate, and $K$ is the concentration at which half of the maximal expression speed, given by $\alpha K^2$, is reached. After an appropriate rescaling of the system, we arrive at
$$\dot{x} = \frac{2 \mu x^2}{1 + x^2} - x, \tag{43}$$
where $\mu = \alpha K / (2 \lambda)$ is a non-dimensional parameter controlling the behaviour of the system. The fixed points (Fig. 9a) can be calculated by setting $\dot{x} = 0$ and are given by
$$x_s = 0, \tag{44}$$
$$x_\pm = \mu \pm \sqrt{\mu^2 - 1}, \quad \mu \geq 1. \tag{45}$$
Whether a fixed point is stable or not can be determined by looking at the slope of $\dot{x}$ at the fixed points:
$$f(x) = \frac{2 \mu x^2}{1 + x^2} - x, \tag{46}$$
$$\frac{df}{dx} = \frac{4 \mu x}{(1 + x^2)^2} - 1. \tag{47}$$
If the slope is negative, the fixed point is stable; otherwise it is unstable:
$$\left. \frac{df}{dx} \right|_{x = x_s} = -1 < 0 \;\Rightarrow\; \text{fixed point is stable}, \tag{48}$$
$$\left. \frac{df}{dx} \right|_{x = x_+} = -\frac{\sqrt{\mu^2 - 1}}{\mu} < 0 \;\Rightarrow\; \text{fixed point is stable}, \tag{49}$$
$$\left. \frac{df}{dx} \right|_{x = x_-} = \frac{\sqrt{\mu^2 - 1}}{\mu} > 0 \;\Rightarrow\; \text{fixed point is unstable}. \tag{50}$$
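The slope criterion of Eqs. (48)–(50) can be checked numerically. The following minimal sketch computes the fixed points of Eq. (43), classifies them using Eq. (47), and integrates the equation from either side of the unstable fixed point. The value $\mu = 2$ matches Fig. 9a; the forward-Euler integrator, step size, end time, start values, and function names are our own assumptions.

```python
import math

# Sketch: fixed points of Eq. (43), their stability from the slope in Eq. (47),
# and a forward-Euler check of which fixed point a trajectory reaches.
# mu = 2 matches Fig. 9a; step size, end time, and start values are assumptions.


def f(x, mu):
    """Right-hand side of Eq. (43)."""
    return 2.0 * mu * x * x / (1.0 + x * x) - x


def slope(x, mu):
    """df/dx from Eq. (47)."""
    return 4.0 * mu * x / (1.0 + x * x) ** 2 - 1.0


def fixed_points(mu):
    """Return the fixed points of Eq. (43) in ascending order."""
    points = [0.0]
    if mu >= 1.0:
        root = math.sqrt(mu * mu - 1.0)
        points += [mu - root, mu + root]
    return points


def classify(x, mu):
    """Linear stability: a negative slope means the fixed point is stable."""
    return "stable" if slope(x, mu) < 0.0 else "unstable"


def simulate(x0, mu, t_end=50.0, dt=0.001):
    """Integrate Eq. (43) with forward Euler and return x(t_end)."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * f(x, mu)
    return x


if __name__ == "__main__":
    mu = 2.0
    for x_fp in fixed_points(mu):
        print(f"x = {x_fp:.4f}: {classify(x_fp, mu)}")
    x_minus = fixed_points(mu)[1]
    print("start below x_-:", round(simulate(0.9 * x_minus, mu), 4))
    print("start above x_-:", round(simulate(1.1 * x_minus, mu), 4))
```

In this one-dimensional example the unstable fixed point $x_-$ separates the two basins of attraction: starting below it, the protein is lost; starting above it, the high-expression state is reached.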
Thus, for $\mu > 1$ the system has two stable fixed points. Figure 9a shows the three fixed points for $\mu = 2$, and Fig. 9b shows the evolution towards the stable fixed points for $\mu < 1$ and $\mu > 1$. Systems with two distinct stable steady states are called bistable. Which of the steady states is eventually reached depends on the past conditions, providing the system with a kind of memory. The type of analysis presented above is called linear stability analysis [32]. In higher dimensions, the principle of the analysis remains the same, but concepts from linear algebra, such as eigenvalues and eigenvectors, are required, which go beyond this overview chapter. A very good introduction can be found in [32, 108]. In the example discussed above, the system has three fixed points (two stable, one unstable) for $\mu > 1$ and one stable fixed point for $\mu < 1$. Figure 9b shows the difference in behaviour for the two cases. Two fixed points are created when increasing $\mu$ from $\mu < 1$ to $\mu > 1$. This leads to the important concept of bifurcations.
5.3 Bifurcation Analysis
The creation or destruction of fixed points and/or the change of their type (stable/unstable) when one or more parameters of a system are changed is called a bifurcation. There are many different types of bifurcations; e.g., the system in the previous section exhibits a saddle-node bifurcation at $\mu = 1$, where a stable and an unstable fixed point are created together [96, 108]. Parameter values at which these changes occur are called bifurcation points. We consider as another example a model for glycolysis [108]:
$$\begin{aligned} \dot{x} &= -x + a y + x^2 y, \\ \dot{y} &= b - a y - x^2 y, \end{aligned} \tag{51}$$
where $x$ represents ADP, $y$ fructose-6-phosphate, and $a$, $b$ are kinetic parameters. Depending on the values of $a$ and $b$, the system has either a stable limit cycle or a stable fixed point. In Fig. 10a, $b$ is fixed to $b = 0.5$ and $a$ is varied, showing the changes in the concentrations $x$, $y$ over time. The different long-term behaviours can also be analysed with phase portraits, as in Fig. 10b. Here $a \approx 0.12$ is the bifurcation point. The system behaviour for a range of values of both $a$ and $b$ can be visualised by plotting the line separating the two types of behaviour in parameter space, as shown in Fig. 10c. Analysing in which region of parameter space the system exhibits which behaviour helps to understand a system conceptually and can provide insight into its robustness.
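The bifurcation point quoted for $b = 0.5$ can be located with a few lines of code. The sketch below assumes the standard two-dimensional linear stability criterion, which is not derived in this chapter: the fixed point of Eq. (51) is $x^* = b$, $y^* = b/(a + b^2)$, the determinant of the Jacobian there equals $a + b^2 > 0$, and the fixed point therefore loses stability where the trace of the Jacobian changes sign. The bracketing interval for $a$ and the function names are illustrative assumptions.

```python
# Sketch: locate the bifurcation point of Eq. (51) in a for fixed b by finding
# where the trace of the Jacobian at the fixed point changes sign. The trace
# criterion comes from standard linear stability analysis in two dimensions and
# is our addition; the bracketing interval for a is an illustrative assumption.


def fixed_point(a, b):
    """Fixed point of Eq. (51): x* = b, y* = b / (a + b^2)."""
    return b, b / (a + b * b)


def jacobian_trace(a, b):
    """Trace of the Jacobian of Eq. (51) evaluated at the fixed point."""
    x, y = fixed_point(a, b)
    dfdx = -1.0 + 2.0 * x * y  # derivative of the x equation with respect to x
    dgdy = -(a + x * x)        # derivative of the y equation with respect to y
    return dfdx + dgdy


def bifurcation_point(b, a_lo=0.01, a_hi=0.5, tol=1e-10):
    """Bisection on a for the sign change of the trace (positive: unstable)."""
    assert jacobian_trace(a_lo, b) > 0.0 > jacobian_trace(a_hi, b)
    while a_hi - a_lo > tol:
        a_mid = 0.5 * (a_lo + a_hi)
        if jacobian_trace(a_mid, b) > 0.0:
            a_lo = a_mid
        else:
            a_hi = a_mid
    return 0.5 * (a_lo + a_hi)


if __name__ == "__main__":
    print("bifurcation point for b = 0.5:", round(bifurcation_point(0.5), 3))
```

Repeating the calculation for a range of $b$ values traces out a boundary curve in the $(a, b)$ plane of the kind described for Fig. 10c.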
Fig. 10 Bifurcation analysis of the glycolytic oscillator. (a) Concentrations of $x$, $y$ over time for different values of $a$; $b$ is fixed to $b = 0.5$. (b) Phase portrait for different values of $a$ and fixed $b$. (c) Overview of the regions in parameter space for which the system exhibits oscillations or a stable fixed point. The curve indicates the bifurcation points for $(a, b)$. Combinations of $(a, b)$ on the right side of the curve result in a stable fixed point, whereas parameters on the left side show a stable limit cycle
A system is called robust when it shows little or no response to perturbations. When the parameters of a system are slightly perturbed far from a bifurcation point, the system will not show a qualitative change in behaviour, whereas perturbations close to a bifurcation point can result in drastic changes.
5.4 Sensitivity Analysis
While bifurcation analysis yields an understanding of the qualitative changes of a system (e.g., stable → oscillating), sensitivity analysis aims to quantify how much a function changes when a parameter (or a set of parameters) is changed slightly; this is called parametric sensitivity. In Subheading 3.1, we used the sensitivities to determine whether a system is structurally identifiable. The parametric sensitivity of the solution of a differential equation is a local property, i.e., it depends on the point in parameter space at which it is calculated [54]. We illustrate sensitivity analysis with a simple example, which has the advantage that everything can be calculated analytically. The dynamics of the abundance $x$ of a protein that is expressed with a basal expression rate $\sigma$ and degraded with rate $\lambda$ are described by the equation
$$\dot{x} = \sigma - \lambda x. \tag{52}$$
Fig. 11 Sensitivity of $x(t)$ to the parameters $\sigma$ and $\lambda$. At earlier time points the sensitivities $s_\sigma$ and $s_\lambda$ are smaller than at the steady state of $x(t)$. Parameters used in this figure are $x_0 = 1$, $\sigma = 3$, $\lambda = 1.5$
Obviously, the solution $x(t)$ depends on $\sigma$ and $\lambda$, but are the dependencies constant over time or do they change? A quantitative answer to this question is given by sensitivity analysis. The sensitivities with respect to the parameters $\sigma$ and $\lambda$ are defined by
$$s_\sigma(t) = \frac{\partial x(t)}{\partial \sigma}, \tag{53}$$
$$s_\lambda(t) = \frac{\partial x(t)}{\partial \lambda}. \tag{54}$$
Taking the derivative with respect to time on both sides of these equations results in (given that we can interchange the order of differentiation)
$$\dot{s}_\sigma(t) = \frac{\partial \dot{x}(t)}{\partial \sigma} = 1 - \lambda s_\sigma(t), \tag{55}$$
$$\dot{s}_\lambda(t) = \frac{\partial \dot{x}(t)}{\partial \lambda} = -x(t) - \lambda s_\lambda(t). \tag{56}$$
We arrive at the extended set of equations for $x$ and the sensitivities:
$$\dot{x} = \sigma - \lambda x, \tag{57}$$
$$\dot{s}_\sigma = 1 - \lambda s_\sigma, \tag{58}$$
$$\dot{s}_\lambda = -x - \lambda s_\lambda. \tag{59}$$
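The solutions for $x(t = 0) = x_0$ and $s_\sigma(t = 0) = s_\lambda(t = 0) = 0$ read
$$x(t) = x_0 e^{-\lambda t} + \frac{\sigma}{\lambda} \left( 1 - e^{-\lambda t} \right), \tag{60}$$
$$s_\sigma(t) = \frac{1}{\lambda} \left( 1 - e^{-\lambda t} \right), \tag{61}$$
$$s_\lambda(t) = t \left( \frac{\sigma}{\lambda} - x_0 \right) e^{-\lambda t} - \frac{\sigma}{\lambda^2} \left( 1 - e^{-\lambda t} \right). \tag{62}$$
Note that we present this example to explain the principle; we could of course have calculated the sensitivities directly from the solution $x$. However, when an analytical solution for $x$ cannot be determined, the derivatives for the sensitivities would need to be calculated numerically. Instead, it is generally advantageous to solve numerically an extended system of equations [53, 91]:
$$\dot{\vec{x}} = \vec{f}(\vec{x}, \vec{\theta}), \tag{63}$$
$$\dot{S} = F + J S, \tag{64}$$
together with
$$\vec{y} = \vec{g}(\vec{x}, \vec{\theta}), \tag{65}$$
$$Y = G + \hat{J} S. \tag{66}$$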
The matrices are defined by $S_{ij} = \partial x_i / \partial \theta_j$ (sensitivities of the state variables), $F_{ij} = \partial f_i / \partial \theta_j$ (explicit derivative), $J_{ij} = \partial f_i / \partial x_j$ (Jacobian), $Y_{ij} = \partial y_i / \partial \theta_j$ (sensitivities of the observables), $G_{ij} = \partial g_i / \partial \theta_j$ (explicit derivative), and $\hat{J}_{ij} = \partial g_i / \partial x_j$. The first two equations are solved simultaneously, and the results for $\vec{x}$ and $S$ are inserted into the remaining two equations. It can be informative to put the sensitivities into relation to the variable and the parameter, i.e., to use the rescaled, dimensionless sensitivities instead of the absolute sensitivities:
$$\tilde{s}_{ij} = \frac{\theta_j}{x_i} \frac{\partial x_i}{\partial \theta_j}. \tag{67}$$
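For the one-dimensional example, the extended system can be solved numerically and compared with the closed-form sensitivities. The minimal sketch below integrates Eqs. (57)–(59) with a fourth-order Runge–Kutta scheme, evaluates Eqs. (60)–(62), and computes the rescaled sensitivities of Eq. (67). The parameter values follow the caption of Fig. 11; the integrator, step size, and function names are our own assumptions.

```python
import math

# Sketch: integrate the extended system (57)-(59) with RK4 and compare with
# the analytic sensitivities (61)-(62). Parameter values follow the caption of
# Fig. 11; the step size and the evaluation times are illustrative assumptions.

X0, SIGMA, LAM = 1.0, 3.0, 1.5


def rhs(state, sigma, lam):
    """Right-hand sides of Eqs. (57)-(59) for state = (x, s_sigma, s_lambda)."""
    x, s_sig, s_lam = state
    return (sigma - lam * x, 1.0 - lam * s_sig, -x - lam * s_lam)


def integrate(t_end, sigma=SIGMA, lam=LAM, x0=X0, dt=0.001):
    """RK4 integration from t = 0 with zero initial sensitivities."""
    state = (x0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state, sigma, lam)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), sigma, lam)
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), sigma, lam)
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), sigma, lam)
        state = tuple(
            s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


def analytic(t, sigma=SIGMA, lam=LAM, x0=X0):
    """Closed-form x, s_sigma, s_lambda from Eqs. (60)-(62)."""
    decay = math.exp(-lam * t)
    x = x0 * decay + sigma / lam * (1.0 - decay)
    s_sig = (1.0 - decay) / lam
    s_lam = t * (sigma / lam - x0) * decay - sigma / lam**2 * (1.0 - decay)
    return x, s_sig, s_lam


def rescaled(state, sigma=SIGMA, lam=LAM):
    """Dimensionless sensitivities of Eq. (67) for sigma and lambda."""
    x, s_sig, s_lam = state
    return sigma / x * s_sig, lam / x * s_lam


if __name__ == "__main__":
    for t in (0.5, 2.0, 10.0):
        print(t, [round(v, 6) for v in integrate(t)],
              [round(v, 6) for v in analytic(t)])
```

At long times the rescaled sensitivities approach $+1$ for $\sigma$ and $-1$ for $\lambda$, which reflects that the steady state $\sigma/\lambda$ is proportional to $\sigma$ and inversely proportional to $\lambda$.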
In synthetic biology, it is important to know the sensitivities in order to determine which reactions most significantly affect the system output. There are several reasons for this. For systems in which an output should be controlled by an input (such as in optogenetic circuits), it is important to identify the parts of the system with which one should interfere to best control the output.
Further, in biology, parameters are usually subject to variability; therefore, the design of systems that function robustly despite parameter variations is desirable. When the engineered system does not function as expected, the most sensitive reactions are likely to be the best targets for modification [6]. Sensitivity analysis serves all these aims, but it has an important limitation: it is a local analysis, and as such it depends on the point in parameter space at which the sensitivities are calculated. Global sensitivity analysis aims to overcome this limitation by providing a global picture of which parts of the system are sensitive to variations and which are not [53, 91]. In global sensitivity analysis, statistical approaches are usually used to sample the parameter space and to determine the corresponding model behaviour. Sampling the whole parameter space is in most cases infeasible and inefficient. It is therefore important to analyse the uncertainties one may have about the parameters of a mathematical model, which restricts the parameter space to a sub-space and thus makes the subsequent analysis easier. Even when all parameters of a biological system are structurally identifiable, and even when the available data are sufficient to identify the parameters in practice, uncertainties will remain, and methods are needed to address these.
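To illustrate why a single local sensitivity can be misleading, the following sketch samples $\sigma$ and $\lambda$ over two orders of magnitude and records the rescaled sensitivities of $x(t)$ from Eq. (60) at each sampled point. This is only a sampling-based look at the spread of local sensitivities, not a formal global sensitivity method such as variance-based indices. The sampling ranges, sample size, evaluation time, finite-difference step, seed, and function names are all illustrative assumptions.

```python
import math
import random

# Sketch: how the rescaled sensitivities of x(t) (Eq. 67) vary when sigma and
# lambda are sampled instead of fixed. Sampling ranges, sample size,
# evaluation time, and the random seed are illustrative assumptions.


def x_of_t(t, sigma, lam, x0=1.0):
    """Analytic solution of Eq. (52), see Eq. (60)."""
    decay = math.exp(-lam * t)
    return x0 * decay + sigma / lam * (1.0 - decay)


def relative_sensitivity(t, sigma, lam, which, rel_step=1e-6):
    """Central finite-difference estimate of (theta / x) * dx/dtheta."""
    if which == "sigma":
        h = rel_step * sigma
        dx = x_of_t(t, sigma + h, lam) - x_of_t(t, sigma - h, lam)
        return sigma / x_of_t(t, sigma, lam) * dx / (2.0 * h)
    h = rel_step * lam
    dx = x_of_t(t, sigma, lam + h) - x_of_t(t, sigma, lam - h)
    return lam / x_of_t(t, sigma, lam) * dx / (2.0 * h)


def sample_sensitivities(n_samples=2000, t=1.0, seed=1):
    """Sample (sigma, lambda) log-uniformly and collect both sensitivities."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        sigma = 10.0 ** rng.uniform(-1.0, 1.0)
        lam = 10.0 ** rng.uniform(-1.0, 1.0)
        results.append(
            (
                relative_sensitivity(t, sigma, lam, "sigma"),
                relative_sensitivity(t, sigma, lam, "lambda"),
            )
        )
    return results


if __name__ == "__main__":
    res = sample_sensitivities()
    for name, col in (("sigma", 0), ("lambda", 1)):
        values = sorted(r[col] for r in res)
        print(name, "min", round(values[0], 3),
              "median", round(values[len(values) // 2], 3),
              "max", round(values[-1], 3))
```

The spread between the minimum and maximum values shows how strongly the local result depends on where in parameter space it is evaluated.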
6 Handling Uncertainty
Designing and engineering biological systems despite incomplete information is a challenging problem in synthetic biology. Incomplete information gives rise to uncertainties about the system. These uncertainties can result from a lack of knowledge or from a lack of appropriate data. Even when a parameter is in theory identifiable, the experiments necessary to obtain the required data may be expensive, very time-consuming, or at the current stage impossible to perform. Even when a parameter has been identified, the confidence intervals may be large. Additionally, the natural variability of biological systems due to intrinsic noise, e.g., due to stochastic gene expression, leads to uncertainties [8, 20, 35, 112, 114]. One usually distinguishes two classes of uncertainty. Aleatoric uncertainty refers to situations in which there is natural variability in the phenomenon under consideration. This means that by gathering more experimental data the distribution of the parameter can be refined, but the uncertainty about the specific realisation will not be reduced. The second class is epistemic uncertainty, which refers to a lack of knowledge. In contrast to aleatoric uncertainty, epistemic uncertainty can be reduced by performing appropriate experiments. We consider epistemic uncertainty to be the typical case for complex biological systems [110]. The most common source of epistemic uncertainty is parameters derived from imperfect datasets that contain noisy, incoherent, or missing data points.
Additionally, a lack of knowledge of the underlying mechanism or incomplete coverage can contribute to epistemic uncertainty. To quantify the uncertainty, a probabilistic framework can be employed. In such a framework, the uncertainty in the input data is represented by random variables that describe the input data. Here, we focus on two methods of uncertainty quantification: (i) the popular and easy-to-implement Monte Carlo methods and (ii) spectral methods, which aim to reconstruct the functional dependence of the model response on the random parameters.
6.1 Monte Carlo Methods
Statistical methods based on (pseudo-)random numbers were used (e.g., to obtain an estimate for $\pi$) long before the name “Monte Carlo” was coined for these types of approaches, inspired by the Monte Carlo Casino [56]. Monte Carlo (MC) methods are used in physics, chemistry, statistics, and computer science, to name just a few areas [15]. Fundamentally, MC methods rely on pseudo-random sampling of the unknown parameters in order to construct a set of input realisations. For each of these realisations, there exists a corresponding solution of the model, and all of these solutions together give the sample solution set. Using this set, one can estimate certain statistics of the model output; for example, the expectation of the model solution $s$ can be estimated by
$$\langle s \rangle = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} s_i w_i, \qquad \sum_{i=1}^{N} w_i = N, \tag{68}$$
where $N$ is the number of realisations and $w_i$ is the weight associated with realisation $i$. In an unbiased sampling approach, all $w_i = 1$. MC methods can be applied to any parametrised model, either stochastic or deterministic. MC methods are flexible and robust in that they do not depend on the model form or on any regularity in the model solutions. MC methods will always provide approximations of the quantity of interest. Convergence of that approximation can be assessed based on indicators that are related to the computed solutions. The slow convergence of MC methods is their main limitation: the error of the estimate decreases with the number of realisations only as $N^{-1/2}$. Several sampling methods have been proposed to improve on this, such as quasi-Monte Carlo (using low-discrepancy sequences), Latin hypercube sampling, importance sampling, and variance reduction [27, 46, 50]. However, the large number of computations that are needed remains a bottleneck for a substantial number of uncertainty problems. Overall, Monte Carlo methods are widely applicable and very useful when dealing with a large number of uncertainties, either in the model parameters or in other inputs. The main disadvantage is the computational power required for the large number of repeated simulations needed to make the quantity approximated by MC simulations sufficiently accurate. For some models, a single simulation can already be computationally very costly; in these cases, MC approaches are not feasible. In synthetic biology, MC methods have been used to design and analyse circuits [16, 36, 125]. MC can be employed in global sensitivity analysis (as discussed in Subheading 5.4). Estimating the sensitivity of certain circuit properties to parameters (e.g., rate constants) can guide the design of optimal mutations, thus reducing the experimental effort [36]. When the parameter values are not precisely known, MC can still be used to analyse sensitivities. In this case, different values for the parameters are sampled. This sampling can also be used, for example, to determine specific input–output characteristics [16].
6.2 Spectral Methods
MC methods are straightforward and easy to use. However, they have the significant disadvantage that they require many evaluations of the model under investigation, which is a problem in particular when the model is computationally demanding to solve. An interesting alternative worth mentioning here is spectral methods. The function $x(t, \theta)$ is approximated by a series expansion, or spectral expansion (SE), in terms of orthogonal functions $\Psi_k$ of the uncertain parameter $\theta$ [66, 110]:
$$x(t, \theta) \approx \sum_{k=0}^{N} x_k(t) \Psi_k(\theta), \tag{69}$$
where $x_k$ are the (usually) time-dependent expansion coefficients and $N$ is the order of the expansion. This type of expansion resembles the well-known Fourier expansion used, e.g., in signal analysis. The advantage of this representation is that one obtains an approximation of $x$ for all values of $\theta$. For a precise definition of the expansion and how to generalise it to many uncertain parameters, see refs. [66, 110]. This form allows immediate evaluation of statistics of $x$, either analytically or through sampling of $\theta$. This advantage comes at the cost that one has to determine the expansion coefficients $x_k$. For this, two classes of spectral methods can be considered. The model's governing equations can be reformulated such that each quantity is represented by a spectral expansion. This results in a system of differential equations for the expansion coefficients $x_k$. This approach is called intrusive spectral projection. In contrast, in non-intrusive spectral projection, the expansion coefficients $x_k$ are determined using the model without changing the original model equations. The advantage of this non-intrusive method is that it requires only straightforward deterministic model solutions and does not depend on any reformulation of the model. Furthermore, because the model is treated as a black box, the method can be applied even to very complex models (e.g., highly non-linear models). Just like MC methods, SE can become computationally expensive as more deterministic model solutions are needed or if the model is expensive to solve. In particular when a higher order of expansion is needed or when facing high-dimensional problems, the SE can become prohibitively expensive to solve. Methods such as adaptive sparse grids have been developed to reduce the complexity of the non-intrusive methods [21]. Spectral methods are increasingly popular in the field of uncertainty quantification and have proven very effective for physical and mechanical models (e.g., fluid dynamics). For these models, SE outperforms MC methods in terms of computational efficiency. For biological models, spectral methods are so far uncommon [80]; however, given their success in models that share the same mathematical principles and fundamentally similar sources of uncertainty, they seem a promising approach for biological problems as well. To showcase an example of non-intrusive spectral projection in the context of a biological system, we turn to the trichome model that was introduced earlier (Subheading 2.2.1). As mentioned before, the parameters of this model are unknown, but for this example we consider only one of the parameters as an uncertain variable. Since nothing is known about the distribution of this parameter, we assume it to follow a uniform distribution. An example is shown in Fig. 12, where the binding parameter between proteins is chosen as the uncertain variable. The histogram of the average trichome density is constructed either by MC methods or by spectral expansion. Both methods show the same results. The main difference is in the computational time: the MC method required 2.6 h for 1000 samples, whereas the spectral expansion required less than 4 min. Although this is a very simple case (only one uncertain variable), it shows the advantage of using a spectral expansion instead of MC methods.
Fig. 12 Histogram of the average trichome density for one random variable. Trichome density was determined either by MC methods or SE. The solid grey histogram indicates the trichome densities for 100 MC samples and the dashed line those for an SE of $N = 20$, where 100 samples of the SE basis have been used for the construction of the distribution. Note that the SE exactly overlaps with the MC results; however, the SE approach was performed in a much shorter time than the MC sampling
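The trichome model is too large to reproduce here, but the principle of the comparison can be sketched with a toy problem. The code below uses $x(t)$ from Eq. (60) with a uniformly distributed degradation rate as a stand-in model, estimates its mean by plain Monte Carlo (Eq. 68 with all weights equal to one), and builds a non-intrusive spectral expansion (Eq. 69). We assume Legendre polynomials as the orthogonal basis, a common choice for a uniform distribution, and compute the coefficients by five-point Gauss–Legendre quadrature. The range of $\lambda$, the time point, the expansion order, the sample size, the seed, and all function names are illustrative assumptions and do not reflect the authors' implementation.

```python
import math
import random

# Sketch: Monte Carlo estimate (Eq. 68 with all weights equal to one) versus a
# non-intrusive spectral projection (Eq. 69) onto Legendre polynomials for one
# uniformly distributed parameter. The toy model is x(t) from Eq. (60) with an
# uncertain degradation rate; the range of lambda, the time point, the
# expansion order, the sample size, and the seed are illustrative assumptions.

X0, SIGMA, T = 1.0, 3.0, 1.0
LAM_MIN, LAM_MAX = 1.0, 2.0

# 5-point Gauss-Legendre nodes and weights on [-1, 1]
NODES = (-0.9061798459386640, -0.5384693101056831, 0.0,
         0.5384693101056831, 0.9061798459386640)
WEIGHTS = (0.2369268850561891, 0.4786286704993665, 0.5688888888888889,
           0.4786286704993665, 0.2369268850561891)


def model(theta):
    """Model output x(T) for theta in [-1, 1], mapped linearly to lambda."""
    lam = 0.5 * (LAM_MIN + LAM_MAX) + 0.5 * (LAM_MAX - LAM_MIN) * theta
    decay = math.exp(-lam * T)
    return X0 * decay + SIGMA / lam * (1.0 - decay)


def legendre(k, theta):
    """Legendre polynomial P_k(theta) via the three-term recurrence."""
    p_prev, p = 1.0, theta
    if k == 0:
        return p_prev
    for j in range(1, k):
        p_prev, p = p, ((2 * j + 1) * theta * p - j * p_prev) / (j + 1)
    return p


def spectral_coefficients(order=3):
    """x_k = (2k + 1)/2 * integral of model * P_k, by Gauss-Legendre quadrature."""
    return [
        0.5 * (2 * k + 1)
        * sum(w * model(th) * legendre(k, th) for th, w in zip(NODES, WEIGHTS))
        for k in range(order + 1)
    ]


def spectral_mean_and_variance(coeffs):
    """Mean is x_0; variance is the sum of x_k^2 / (2k + 1) for k >= 1."""
    variance = sum(c * c / (2 * k + 1) for k, c in enumerate(coeffs) if k > 0)
    return coeffs[0], variance


def monte_carlo_mean(n_samples=20000, seed=0):
    """Plain Monte Carlo mean with theta drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)
    return sum(model(rng.uniform(-1.0, 1.0)) for _ in range(n_samples)) / n_samples


if __name__ == "__main__":
    mean_se, var_se = spectral_mean_and_variance(spectral_coefficients())
    print("spectral expansion mean:", round(mean_se, 5),
          "from", len(NODES), "model evaluations")
    print("Monte Carlo mean:", round(monte_carlo_mean(), 5),
          "from 20000 model evaluations")
```

In this toy case the spectral estimate needs five model evaluations, whereas the MC estimate uses many thousands. The example illustrates the mechanism only and says nothing about the timings reported for the trichome model.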
7 Summary
In this chapter, we gave a brief overview of the mathematical methods used for dynamic models in synthetic biology, from how to derive a model to how to analyse it. We necessarily had to neglect many important aspects of mathematical modelling in biology but hope to have provided the reader with the basic ideas of some of the most important mathematical concepts used in systems and synthetic biology. There is certainly much more to say about mathematical modelling, but given the nature of an overview chapter we refer the interested reader to the references given in the text. For us, the guiding principle when writing this chapter was to convey how mathematical modelling is done in practice and what the basic steps are.
Acknowledgements
We would like to apologise to any readers who feel that we have neglected important references. Given the size of the field of mathematical modelling of biological systems, it is almost impossible, and also beyond the scope of this chapter, to give all contributions the space and credit they deserve. The cited references are those that the authors believe provide a useful introduction for interested readers. CF acknowledges funding by the European Commission, Research Executive Agency (H2020 Future and Emerging Technologies (FET-Open), Project ID 801041 CyGenTiG).
References
1. Akaike H (1981) Likelihood of a model and information criteria. J Econ 16(1):3–14
2. Allen DK, Libourel IGL, Shachar-Hill Y (2009) Metabolic flux analysis in plants: coping with complexity. Plant Cell Environ 32(9):1241–1257. https://doi.org/10.1111/j.1365-3040.2009.01992.x
3. Alon U (2007) An introduction to systems biology: design principles of biological circuits. Chapman and Hall, London
4. Alon U (2007) Network motifs: theory and experimental approaches. Nat Rev Genet 8(6):450–461
5. Anderson DR (2008) Model based inference in the life sciences: a primer on evidence. Springer, New York
6. Andrianantoandro E, Basu S, Karig DK, Weiss R (2006) Synthetic biology: new engineering rules for an emerging discipline. Mol Syst Biol 2(1):2006.0028. https://doi.org/10.1038/msb4100073
7. Aoki SK, Lillacci G, Gupta A, Baumschlager A, Schweingruber D, Khammash M (2019) A universal biomolecular integral feedback controller for robust perfect adaptation. Nature 570(7762):533–537. https://doi.org/10.1038/s41586-019-1321-1
8. Araújo IS, Pietsch JM, Keizer EM, Greese B, Balkunde R, Fleck C, Hülskamp M (2017) Stochastic gene expression in Arabidopsis thaliana. Nat Commun 8(1):420–429. https://doi.org/10.1038/s41467-017-02285-7
9. Ashyraliyev M, Fomekong-Nanfack Y, Kaandorp JA, Blom JG (2009) Systems biology: parameter estimation for biochemical models. FEBS J 276(4):886–902. https://doi.org/10.1111/j.1742-4658.2008.06844.x
10. Balsa-Canto E, Banga JR (2011) AMIGO, a toolbox for advanced model identification in systems biology using global optimization. Bioinformatics (Oxford, England) 27(16):2311–2313. https://doi.org/10.1093/bioinformatics/btr370
11. Balsa-Canto E, Banga JR, Alonso AA (2008) Computational procedures for optimal experimental design in biological systems. IET Syst Biol 2(4):163–172. https://doi.org/10.1049/iet-syb:20070069
12. Balsa-Canto E, Peifer M, Banga J, Timmer J, Fleck C (2008) Hybrid optimization method with general switching strategy for parameter estimation. BMC Syst Biol 2(1):1–9
13. Banga JR (2008) Optimization in computational systems biology. BMC Syst Biol 2(1):47. https://doi.org/10.1186/1752-0509-2-47
14. Banga JR, Balsa-Canto E (2008) Parameter estimation and optimal experimental design. Essays Biochem 45:195–209. https://doi.org/10.1042/BSE0450195
15. Barbu A, Zhu SC (2020) Monte Carlo methods. Springer, Singapore. https://doi.org/10.1007/978-981-13-2971-5
16. Barnes CP, Silk D, Sheng X, Stumpf MPH (2011) Bayesian design of synthetic biological systems. Proc Natl Acad Sci 108(37):15190–15195. https://doi.org/10.1073/pnas.1017972108
17. Bartley BA, Kim K, Medley JK, Sauro HM (2017) Synthetic biology: engineering living systems from biophysical principles. Biophys J 112(6):1050–1058. https://doi.org/10.1016/j.bpj.2017.02.013
18. Bellouquid A, Delitala M (2006) Mathematical modeling of complex biological systems: a kinetic theory approach. In: Modeling and Simulation in Science, Engineering and Technology. Birkhauser, Boston
19. Beyer HG, Sendhoff B (2007) Robust optimization – A comprehensive survey. Comput Methods Appl Mech Eng 196(33–34):3190–3218. https://doi.org/10.1016/j.cma.2007.03.003
20. Blake WJ, Kaern M, Cantor CR, Collins JJ (2003) Noise in eukaryotic gene expression. Nature 422(6932):633–637. https://doi.org/10.1038/nature01546
21. Blatman G, Sudret B (2011) Adaptive sparse polynomial chaos expansion based on least angle regression. J Comput Phys 230(6):2345–2367
22. Blum C (2005) Ant colony optimization: introduction and recent trends. Phys Life Rev 2(4):353–373. https://doi.org/10.1016/j.plrev.2005.10.001
23. Bouyer D, Geier F, Kragler F, Schnittger A, Pesch M, Wester K, Balkunde R, Timmer J, Fleck C, Hülskamp M (2008) Two-dimensional patterning by a trapping/depletion mechanism: the role of TTG1 and GL3 in Arabidopsis trichome formation. PLoS Biol 6(6):e141. https://doi.org/10.1371/journal.pbio.0060141
24. Burnham KP, Anderson DR (2002) Model selection and multimodel inference: a practical information-theoretic approach, 2nd edn. Springer, New York
25. Chen BS, Hsu CY, Liou JJ (2011) Robust design of biological circuits: evolutionary systems biology approach. J Biomed Biotechnol 2011:304236. https://doi.org/10.1155/2011/304236
26. Chis OT, Banga JR, Balsa-Canto E (2011) Structural identifiability of systems biology models: a critical comparison of methods. PLoS One 6(11):e27755. https://doi.org/10.1371/journal.pone.0027755
27. Dalal IL, Processors DS (2008) Low discrepancy sequences for Monte Carlo simulations on reconfigurable platforms. In: Proceedings of the 2008 International Conference on Application-Specific Systems, Architectures and Processors. https://doi.org/10.1109/ASAP.2008.4580163
28. Das S, Maity S, Qu BY, Suganthan PN (2011) Real-parameter evolutionary multimodal optimization – A survey of the state-of-the-art. Swarm Evol Comput 1(2):71–88. https://doi.org/10.1016/j.swevo.2011.05.005
29. Digiuni S, Schellmann S, Geier F, Greese B, Pesch M, Wester K, Dartan B, Mach V, Srinivas BP, Timmer J, Fleck C, Hülskamp M (2008) A competitive complex formation mechanism underlies trichome patterning on Arabidopsis leaves. Mol Syst Biol 4:217. https://doi.org/10.1038/msb.2008.54
30. Duschak VG (2016) Synthetic biology: computational modeling bridging the gap between in vitro and in vivo reactions. Current Synthetic Syst Biol. https://doi.org/10.4172/2332-0737.10001027
31. Eadie W, James F (2006) Statistical methods in experimental physics. World Scientific Publishing, Singapore
32. Edelstein-Keshet L (1988) Mathematical models in biology, vol 46. SIAM, Philadelphia
33. Eisenkolb I, Jensch A, Eisenkolb K, Kramer A, Buchholz PCF, Pleiss J, Spiess A, Radde NE (2019) Modeling of biocatalytic reactions: a workflow for model calibration, selection, and validation using Bayesian statistics. AIChE J 66(4):e16866. https://doi.org/10.1002/aic.16866
34. Elowitz MB, Leibler S (2000) A synthetic oscillatory network of transcriptional regulators. Nature 403(6767):335–338. https://doi.org/10.1038/35002125
35. Elowitz MB, Levine AJ, Siggia ED, Swain PS (2002) Stochastic gene expression in a single cell. Science (New York, NY) 297(5584):1183–1186. https://doi.org/10.1126/science.1070919
36. Feng XJ, Hooshangi S, Chen D, Li G, Weiss R, Rabitz H (2004) Optimizing genetic circuits by global sensitivity analysis. Biophys J 87(4):2195–2202. https://doi.org/10.1529/biophysj.104.044131
37. Franceschini G, Macchietto S (2008) Model-based design of experiments for parameter precision: state of the art. Chem Eng Sci 63(19):4846–4872. https://doi.org/10.1016/j.ces.2007.11.034
38. Frieden R, Gatenby RA (2010) Exploratory data analysis using Fisher information. Springer, London. https://doi.org/10.1007/978-1-84628-777-0
39. Gábor A, Banga JR (2015) Robust and efficient parameter estimation in dynamic models of biological systems. BMC Syst Biol 9(1):74. https://doi.org/10.1186/s12918-015-0219-2
40. Gardiner CW (2004) Handbook of stochastic methods for physics, chemistry, and the natural sciences. Springer, Berlin
41. Gauch HG (2003) Scientific method in practice. Cambridge University Press, New York
42. Gierer A, Meinhardt H (1972) A theory of biological pattern formation. Kybernetik 12(1):30–39
43. Gierer A, Meinhardt H (1974) Applications of a theory of biological pattern formation based on lateral inhibition. J Cell Sci 15:321–376
44. Gillespie DT (2000) The chemical Langevin equation. J Chem Phys 113(1):297. https://doi.org/10.1063/1.481811
45. Gillespie DT (2007) Stochastic simulation of chemical kinetics. Annu Rev Phys Chem 58(1):35–55. https://doi.org/10.1146/annurev.physchem.58.032806.104637
46. Glynn PW, Iglehart DL, Fishman AGS (1989) Importance sampling for stochastic simulations. Manag Sci 35(11):1367–1392
47. Grima R, Thomas P, Straube AV (2011) How accurate are the nonlinear chemical Fokker-Planck and chemical Langevin equations? J Chem Phys 135(8):084103. https://doi.org/10.1063/1.3625958
48. Guillaume JHA, Jakeman JD, Marsili-Libelli S, Asher M, Brunner P, Croke B, Hill MC, Jakeman AJ, Keesman KJ, Razavi S, Stigter JD (2019) Introductory overview of identifiability analysis: a guide to evaluating whether you have the right type of data for your modeling purpose. Environ Model Softw 119:418–432. https://doi.org/10.1016/j.envsoft.2019.07.007
49. Held L, Sabanés Bové D (2014) Applied statistical inference. Springer, Berlin. https://doi.org/10.1007/978-3-642-37887-4
50. Helton J, Davis F (2003) Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliab Eng Syst Saf 81(1):23–69. https://doi.org/10.1016/S0951-8320(03)00058-9
51.
Hornung G, Barkai N (2008) Noise propagation and signaling sensitivity in biological networks: a role for positive feedback. PLoS Comput Biol 4(1):e8. https://doi.org/10. 1371/journal.pcbi.0040008
Mathematical Modelling in Plant Synthetic Biology 52. Hu¨lskamp M (2004) Plant trichomes: a model for cell differentiation. Nat Rev Mol Cell Biol 5(6):471–480. https://doi.org/10. 1038/nrm1404 53. Ingalls B (2008) Sensitivity analysis: from model parameters to system behaviour. Essays Biochem 45:177–194. https://doi.org/10. 1042/bse0450177 54. Ingalls BP (2013) Mathematical Modeling in Systems Biology. An Introduction. MIT Press, New York 55. Ingolia NT, Weissman JS (2008) Systems biology: Reverse engineering the cell. Nature 454(7208):1059–1062 56. James F (1980) Monte Carlo theory and practice. Rep Prog Phys 43(9):1145–1189 57. Joubert D, Stigter JD, Molenaar J (2018) Determining minimal output sets that ensure structural identifiability. PLoS One 13(11): e0207334. https://doi.org/10.1371/jour nal.pone.0207334 58. Kiefer J (1959) Optimum Experimental Designs. J R Stat Soc Ser B Methodol 21(2): 272–319 59. Kitano H (2002) Systems biology: a brief overview. Science 295(5560):1662–1664. https://doi.org/10.1126/science.1069492 60. Klipp E, Herwig R, Kowald A, Wierling C, Lehrach H (2005) Systems Biology in Practice. Wiley, New York 61. Koch A, Meinhardt H (1994) Biological pattern formation: from basic mechanisms to complex structures. Rev Mod Phys 66(4): 1481–1507 62. Kondo S, Miura T (2010) Reaction-diffusion model as a framework for understanding biological pattern formation. Science 329(5999):1616–1620. https://doi.org/10. 1126/science.1179047 63. Kreutz C, Timmer J (2009) Systems biology: experimental design. FEBS J 276(4): 923–942. https://doi.org/10.1111/j. 1742-4658.2008.06843.x 64. Kreutz C, Raue A, Kaschek D, Timmer J (2013) Profile likelihood in systems biology. FEBS J 280(11): 2564–2571. https://doi. org/10.1111/febs.12276 65. Lanza AM, Crook NC, Alper HS (2012) Innovation at the intersection of synthetic and systems biology. Curr Opin Biotechnol 23(5):712–717. https://doi.org/10.1016/j. copbio.2011.12.026 66. 
Le Maitre O, Knio OM (2010) Spectral methods for uncertainty quantification. In: With applications to computational fluid dynamics. Springer, Berlin
249
67. Levskaya A, Chevalier AA, Tabor JJ, Simpson ZB, Lavery LA, Levy M, Davidson EA, Scouras A, Ellington AD, Marcotte EM, Voigt CA (2005) Synthetic biology: Engineering Escherichia coli to see light. Nature 438(7067):441–442. https://doi.org/10. 1038/nature04405 68. Lim WA, Lee CM, Tang C (2013) Design principles of regulatory networks: searching for the molecular algorithms of the cell. Mol Cell 49(2):202–212. https://doi.org/10. 1016/j.molcel.2012.12.020 69. Lin JG (2003) On min-norm and min-max methods of multi-objective optimization. Math Program 103(1):1–33. https://doi. org/10.1007/s10107-003-0462-y 70. Ljung L (2010) Perspectives on system identification. Annu Rev Control 34(1):1–12. https://doi.org/10.1016/j.arcontrol.2009. 12.001 71. Ludden TM, Beal SL, Sheiner LB (1994) Comparison of the Akaike Information Criterion, the Schwarz criterion and the F test as guides to model selection. J. Pharmacokinet. Biopharm. 22(5):431–445 72. Mitra ED, Hlavacek WS (2019) Parameter estimation and uncertainty quantification for systems biology models. Curr Opin Syst Biol. 18:9–18. https://doi.org/10.1016/j.coisb. 2019.10.006 73. Mu¨ller K, Siegel D, Rodriguez Jahnke F, Gerrer K, Wend S, Decker EL, Reski R, Weber W, Zurbriggen MD (2014) A red light-controlled synthetic gene expression switch for plant systems. Mol BioSyst 10(7): 1679–1688. https://doi.org/10.1039/ c3mb70579j 74. Murray JD (2002) Mathematical biology, 3rd edn. Interdisciplinary Applied Mathematics. Springer, New York 75. Nova´k B, Tyson JJ (2008) Design principles of biochemical oscillators. Nat Rev Mol Cell Biol 9(12):981–991. https://doi.org/10. 1038/nrm2530 76. Orth JD, Thiele I, Palsson BØ (2010) What is flux balance analysis? Nat Biotechnol 28(3): 245–248. https://doi.org/10.1038/nbt. 1614 77. Othmer HG, Scriven LE (1969) Interactions of Reaction and Diffusion in Open Systems. Ind Eng Chem Fundam 8(2):302–313. https://doi.org/10.1021/i160030a020 78. 
Othmer HG, Scriven LE (1971) Instability and dynamic pattern in cellular networks. J Theor Biol 32(3):507–537 79. Patane` A, Santoro A, Costanza J, Carapezza G, Nicosia G (2015) Pareto
250
Anna Deneer and Christian Fleck
optimal design for synthetic biology. IEEE Trans Biomed Circuits Syst 9(4):555–571. https://doi.org/10.1109/TBCAS.2015. 2467214. Conference name: IEEE Transactions on Biomedical Circuits and Systems 80. Paulson JA, Martin-Casas M, Mesbah A (2019) Fast uncertainty quantification for dynamic flux balance analysis using non-smooth polynomial chaos expansions. PLoS Comput Biol 15(8):e1007308 81. Pesch M, Hu¨lskamp M (2009) One, two, three...models for trichome patterning in Arabidopsis? Curr Opin Plant Biol 12(5): 587–592. https://doi.org/10.1016/j.pbi. 2009.07.015 82. Phillips R, Kondev J, Theriot J, Garcia HG (2013) Physical biology of the cell, 2nd edn. Garland Science, New York 83. Poli R, Kennedy J, Blackwell T (2007) Particle swarm optimization. Swarm Intell 1(1): 33–57. https://doi.org/10.1007/s11721007-0002-0 84. Quail PH (2002) Phytochrome photosensory signalling networks. Nat Rev Mol Cell Biol 3(2):85–93. https://doi.org/10.1038/ nrm728 85. Quarteroni A, Sacco R, Saleri F (2007) Numerical mathematics, 2nd edn. Texts in Applied Mathematics, vol 37. Springer, Berlin 86. Rabitz H, Kramer M, Dacol D (1983) Sensitivity analysis in chemical kinetics. Annu Rev Phys Chem 34(1):419–461 87. Raue A, Kreutz C, Maiwald T, Bachmann J, Schilling M, Klingmu¨ller U, Timmer J (2009) Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics (Oxford, England) 25(15):1923–1929. https://doi.org/10.1093/bioinformatics/ btp358 88. Raue A, Karlsson J, Saccomani MP, Jirstrand M, Timmer J (2014) Comparison of approaches for parameter identifiability analysis of biological systems. Bioinformatics (Oxford, England) 30(10):1440–1448. https://doi.org/10.1093/bioinformatics/ btu006 89. Rodrigo G, Carrera J, Elena SF (2010) Network design meets in silico evolutionary biology. Biochimie 92(7):746–752. https://doi. org/10.1016/j.biochi.2010.04.003 90. 
Rollie´ S, Mangold M, Sundmacher K (2012) Designing biological systems: systems engineering meets Synthetic Biology. Chem Eng Sci 69(1):1–29. https://doi.org/10.1016/j. ces.2011.10.068
91. Saltelli A (ed) (2008) Global Sensitivity Analysis: The Primer. John Wiley, Chichester, England; Hoboken, NJ. oCLC: ocn180852094 92. Santos Moreno J, Schaerli Y (2019) Using synthetic biology to engineer spatial patterns. Adv Biosyst 3(4):1800280–15. https://doi. org/10.1002/adbi.201800280 93. Schnoerr D, Sanguinetti G, Grima R (2017) Approximation and inference methods for stochastic biochemical kinetics—a tutorial review. J Phys A Math Theor 50(9): 093001–60. https://doi.org/10.1088/ 1751-8121/aa54d9 94. Schwille P (2011) Bottom-up synthetic biology: Engineering in a Tinkerer’s world. Science (New York, NY). https://doi.org/10. 1126/science.1206843 95. Schwille P, Diez S (2009) Synthetic biology of minimal systems. Crit Rev Biochem Mol Biol 44(4):223–242. https://doi.org/10.1080/ 10409230903074549 96. Seydel R (2010) Practical Bifurcation and Stability Analysis, Interdisciplinary Applied Mathematics, vol 5. Springer, New York. https://doi.org/10.1007/978-1-44191740-9 97. Shimizu-Sato S, Huq E, Tepperman JM, Quail PH (2002) A light-switchable gene promoter system. Nat Biotechnol 20(10): 1041–1044. https://doi.org/10.1038/ nbt734 98. Slusarczyk AL, Lin A, Weiss R (2012) Foundations for the design and implementation of synthetic genetic circuits. Nat Rev Genet 13(6):406–420. https://doi.org/10.1038/ nrg3227 99. Smith RW, Helwig B, Westphal AH, Pel E, Ho¨rner M, Beyer HM, Samodelov SL, Weber W, Zurbriggen MD, Borst JW, Fleck C (2016) Unearthing the transition rates between photoreceptor conformers. BMC Syst Biol 10(1):110. https://doi.org/10. 1186/s12918-016-0368-y 100. Smith RW, van Sluijs B, Fleck C (2017) Designing synthetic networks in silico: a generalised evolutionary algorithm approach. BMC Syst Biol 11(1):118. https://doi.org/ 10.1186/s12918-017-0499-9 101. Sober E (2015) Ockham’s Razors: a user’s manual. Cambridge University Press, Cambridge 102. Soyer OS (2012) Evolutionary systems biology. Springer, Berlin 103. 
Soyer OS, O’Malley MA (2013) Evolutionary systems biology: what it is and why it matters.
Mathematical Modelling in Plant Synthetic Biology Bioessays 35(8):696–705. https://doi.org/ 10.1002/bies.201300029 104. Stigter JD, Molenaar J (2015) A fast algorithm to assess local structural identifiability. Automatica 58:118–124. https://doi.org/ 10.1016/j.automatica.2015.05.004 105. Stigter JD, Joubert D, Molenaar J (2017) Observability of Complex Systems: Finding the Gap. Sci Rep 7(1): 1–9. https://doi. org/10.1038/s41598-017-16682-x 106. Stigter JD, van Willigenburg LG, Molenaar J (2018) An Efficient Method to Assess Local Controllability and Observability for Non-Linear Systems. IFAC-PapersOnLine 51(2):535–540. https://doi.org/10.1016/j. ifacol.2018.03.090 107. Stricker J, Cookson S, Bennett MR, Mather WH, Tsimring LS, Hasty J (2008) A fast, robust and tunable synthetic gene oscillator. Nature 456(7221):516–519. https://doi. org/10.1038/nature07389 108. Strogatz SH (1994) Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering. Studies in Nonlinearity. Addison-Wesley Pub, Reading 109. Strogatz S (2001) Exploring complex networks. Nature 410(6825):268–276 110. Sullivan TJ (2015) Introduction to Uncertainty Quantification. Springer, Berlin 111. Teague BP, Guye P, Weiss R (2016) Synthetic Morphogenesis. Cold Spring Harb Perspect Biol 8(9):a023929–16. https://doi.org/10. 1101/cshperspect.a023929 112. Thattai M, van Oudenaarden A (2001) Intrinsic noise in gene regulatory networks. PNAS 98(15):8614–8619. https://doi.org/ 10.1073/pnas.151588598 113. Thomas P, Matuschek H, Grima R (2013) How reliable is the linear noise approximation of gene regulatory networks? BMC Genomics 14(Suppl 4):S5. https://doi.org/10.1186/ 1471-2164-14-S4-S5 114. Tsimring LS (2014) Noise in biology. Rep Prog Phys 77(2):026601–29. https://doi. org/10.1088/0034-4885/77/2/026601
251
115. Turing AM (1952) The Chemical Basis of Morphogenesis. Philos Trans R Soc Lond Ser B Biol Sci 237(641):37–72. https://doi. org/10.1098/rstb.1952.0012 116. Tyson JJ, Nova´k B (2010) Functional motifs in biochemical reaction networks. Annu Rev Phys Chem 61(1):219–240. https://doi. org/10.1146/annurev.physchem.012809. 103457 117. Tyson JJ, Chen KC, Novak B (2003) Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell. Curr Opin Cell Biol 15(2):221–231 118. van Kampen NG (2011) Stochastic processes in physics and chemistry. Elsevier, Amsterdam 119. Velazquez JJ, Su E, Cahan P, Ebrahimkhani MR (2018) Programming morphogenesis through systems and synthetic biology. Trends Biotechnol 36(4):415–429. https:// doi.org/10.1016/j.tibtech.2017.11.003 120. Voit EO, CRC Press LLC (2018) A first course in systems biology. Garland Science, New York. oCLC: 1066662669 121. Wang RS, Saadatpour A, Albert R (2012) Boolean modeling in systems biology: an overview of methodology and applications. Phys Biol 9(5):055001–14. https://doi.org/ 10.1088/1478-3975/9/5/055001 122. Wehrens R, Buydens L (1998) Evolutionary optimisation: a tutorial. TrAC Trends Anal Chem 17(4):193–203 123. Wolpert L (ed) (2011) Principles of Development, 4th edn. Oxford University, Oxford 124. Woolley TE, Baker RE, Maini PK (2017) Turing’s theory of morphogenesis: where we started, where we are and where we want to go. In: The Incomputable. Springer, Cham, pp 219–235. https://doi.org/10.1007/9783-319-43669-2_13 125. Yokobayashi Y, Weiss R, Arnold FH (2002) Directed evolution of a genetic circuit. PNAS 99(26):16587–16591. https://doi.org/10. 1073/pnas.252535999
Chapter 14

In Vivo Epitope Tagging of Plant Mitochondria

Franziska Kuhnert and Andreas P. M. Weber

Abstract

Mitochondria play a key role in cellular metabolism. Analyses of the genome, the proteome, and the metabolic, physiological, and biochemical functions of mitochondria frequently require the isolation of intact and functional mitochondria from various plant tissues with sufficient yield. For this purpose, we generated a transgenic Arabidopsis thaliana (Arabidopsis) line which presents a triple hemagglutinin tag on the surface of the outer mitochondrial membrane. The affinity tag enables immunocapture of the organelles in a single step. This chapter gives detailed instructions on how to generate transgenic Arabidopsis lines harboring a ubiquitously expressed 3xHA-sGFP-TOM5 mitochondrial fusion protein that is targeted to the outer mitochondrial membrane and enables purification of the organelles in a single step.

Key words Arabidopsis, Organelle isolation, Mitochondria, Immunocapture, TOM5, HA-tag, Magnetic beads
Matias D. Zurbriggen (ed.), Plant Synthetic Biology, Methods in Molecular Biology, vol. 2379, https://doi.org/10.1007/978-1-0716-1791-5_14, © Springer Science+Business Media, LLC, part of Springer Nature 2022

1 Introduction

Rapid isolation of intact mitochondria from various plant tissues is essential for proteomic applications and for the study of mitochondrial metabolism and function. However, isolating sufficient amounts of intact mitochondria is still a rather cumbersome process. A variety of protocols have been established to address this matter. Most of them are based on differential centrifugation and continuous or discontinuous density gradients. Generally, such protocols yield a sufficient amount of isolated mitochondria, but they are time-consuming: the procedure takes several hours and requires substantial amounts of starting material [1–4].

Co-immunoprecipitation is a valuable tool not only to study protein–protein interactions but also to isolate and concentrate a particular protein from a sample containing many different proteins. Recently, Chen and colleagues developed a new method called co-immunopurification, which combines rapid immunocapture of intact epitope-tagged mitochondria from transgenic HeLa cells with quantitative metabolite profiling of the isolated organelles [5]. To this end, they generated transgenic HeLa cell lines harboring a 3xHA-eGFP-OMP25 mitochondrial fusion protein in the outer mitochondrial membrane. Intact mitochondria were rapidly enriched from the transgenic lines using magnetic anti-HA beads [5]. In recent years, similar strategies were successfully applied to Saccharomyces cerevisiae, Caenorhabditis elegans, and Mus musculus [6–8].

To adapt this method for Arabidopsis, we generated a synthetic fusion protein targeted to the outer membrane of mitochondria [9]. We chose translocase of outer mitochondrial membrane 5 (TOM5) as a suitable anchor protein because it is a well-studied outer mitochondrial membrane protein with known topology and function [10, 11]. Previous work demonstrated that its N-terminus, which faces the cytosol, can be fused to a green fluorescent protein (GFP) without altering the localization of the protein [12, 13]. Hence, we constructed an N-terminal fusion protein of AtTOM5 with a synthetic GFP (sGFP) and a triple hemagglutinin epitope-tag (3xHA; Fig. 1). The fusion protein was generated as a single gene construct in the pUTKan vector under the control of the UBIQUITIN10 promoter (UB10p) from Arabidopsis [9, 14]. Each part of the construct, including the UB10p of the pUTKan vector, is flanked by unique restriction sites, enabling a rapid exchange of the single gene parts by restriction-based or Gibson cloning. Hence, it is possible to exchange the affinity tag for the co-IP, to exchange or remove the fluorescent protein if its presence interferes with downstream applications, or to express the tagging construct under the control of a cell- or organ-specific promoter such as the guard cell-specific promoter pGC1 [15], the stem cell-specific promoter Wuschel-Related Homeobox4 [16, 17], or the root cell-specific promoter Target of Monopteros5 [17, 18], to mention only a few.

Fig. 1 Schematic depiction of the 3xHA-sGFP-TOM5 tagging construct including selected unique restriction sites. sGFP synthetic green fluorescent protein, HA hemagglutinin, TOM5 Translocase of outer mitochondrial membrane 5, UB10p Ubiquitin10 promoter

In this chapter, we present detailed instructions for the construction of the 3xHA-sGFP-TOM5 gene construct and the generation of transgenic Arabidopsis lines harboring the synthetic construct in the outer membrane of mitochondria.
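Because each part of the construct is flanked by a unique restriction site, a quick in silico check of the cloning primers can catch transcription errors before oligonucleotides are ordered. The following Python sketch is an illustration only and not part of the published protocol. It uses the primer sequences listed in Subheading 2.1 and the standard recognition sequences of ApaI, SacII, SalI, and SpeI; the function and variable names are hypothetical.

```python
# Recognition sequences of the enzymes that join the parts (all palindromic).
SITES = {"ApaI": "GGGCCC", "SacII": "CCGCGG", "SalI": "GTCGAC", "SpeI": "ACTAGT"}

# Primer sequences (5'->3') as listed in Subheading 2.1,
# paired with the restriction site each primer is expected to carry.
PRIMERS = {
    "HA-fwd":   ("CACACGGGCCCATGTACCCATACGATGTTCC", "ApaI"),
    "HA-rev":   ("GTGTGCCGCGGAGCGTAATCTGGAACGT", "SacII"),
    "sGFP-fwd": ("CACACCCGCGGGGAGGGAGCGGCGTGAGCAAGGGCGA", "SacII"),
    "sGFP-rev": ("GTGTGGTCGACGCCGCTCCCTCCCTTGTACAGCTCGTCC", "SalI"),
    "TOM5-fwd": ("CACACGTCGACGTGAACAACGTTGTCTC", "SalI"),
    "TOM5-rev": ("GTGTGACTAGTTCATCAAACTCCCATGAGATC", "SpeI"),
}


def site_positions(primers, sites):
    """Return {primer name: 0-based index of its expected site, or -1 if absent}."""
    return {name: seq.find(sites[enzyme]) for name, (seq, enzyme) in primers.items()}


def gc_percent(seq):
    """GC content of a sequence in percent."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


positions = site_positions(PRIMERS, SITES)
for name, (seq, enzyme) in PRIMERS.items():
    status = "ok" if positions[name] >= 0 else "MISSING"
    print(f"{name:9s} {enzyme:6s} site at {positions[name]:2d} ({status}), "
          f"{len(seq)} nt, {gc_percent(seq):.0f}% GC")
```

In this sketch every primer is expected to start with a five-nucleotide overhang followed directly by its restriction site, and HA-fwd is expected to carry the start codon immediately after the ApaI site. A value of -1 would indicate that a site was lost when the sequence was copied.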
2 Materials

2.1 Cloning of Tagging Construct

Prepare all buffers and solutions with deionized water.

1. Plasmids: pGWB4 (GenBank accession AB289778), pGWB15 (GenBank accession AB289767), pUTKan [14].
2. Arabidopsis thaliana cDNA.
3. Phusion® HF Polymerase and HF buffer (NEB).
4. 10 mM dNTP Mix.
5. PCR primers (10 μM stock solution in water):
(a) HA-fwd (5′-CACACGGGCCCATGTACCCATACGATGTTCC-3′).
(b) HA-rev (5′-GTGTGCCGCGGAGCGTAATCTGGAACGT-3′).
(c) sGFP-fwd (5′-CACACCCGCGGGGAGGGAGCGGCGTGAGCAAGGGCGA-3′).
(d) sGFP-rev (5′-GTGTGGTCGACGCCGCTCCCTCCCTTGTACAGCTCGTCC-3′).
(e) TOM5-fwd (5′-CACACGTCGACGTGAACAACGTTGTCTC-3′).
(f) TOM5-rev (5′-GTGTGACTAGTTCATCAAACTCCCATGAGATC-3′).
(g) UB10-Prom-fwd (5′-CGATTTTCTGGGTTTGATCG-3′).
(h) RBCS-Term-rev (5′-CACAGTTCGATAGCGAAAACC-3′).
6. Restriction enzymes: ApaI, SalI-HF®, SacII, SpeI-HF®, and CutSmart® buffer (NEB).
7. T4 DNA ligase and buffer.
8. pJET PCR cloning system.
9. PCR clean-up system.
10. Plasmid DNA isolation and purification system.
11. PCR cycler and reaction tubes.
12. Competent E. coli.
13. Micro-volume spectrophotometer.
14. Liquid LB medium: 10 g L⁻¹ tryptone, 5 g L⁻¹ yeast extract, 10 g L⁻¹ NaCl.
15. Solid LB medium in Petri dishes: 10 g L⁻¹ tryptone, 5 g L⁻¹ yeast extract, 10 g L⁻¹ NaCl, 15 g L⁻¹ Bacto Agar.
16. Antibiotics: ampicillin, spectinomycin.
17. Temperature-controlled incubator at 25 °C and 37 °C.
18. Agarose gel electrophoresis equipment.
19. DNA ladder.
20. UV table.
21. 10 mg mL⁻¹ ethidium bromide solution (see Note 1).

2.2 Plant Cultivation

1. 0.5 M MES/KOH pH 5.7.
2. Plant cultivation medium in Petri dishes: 2.2 g L⁻¹ MS salts, 10 mL L⁻¹ 0.5 M MES/KOH pH 5.7, 8 g L⁻¹ Plant Agar.
3. Arabidopsis thaliana Col-0 seeds (see Note 2).
4. Surgical tape (breathable).
5. Plant growth chamber.
6. Soil for growing Arabidopsis.
7. Pots and trays for growing Arabidopsis.
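The plant cultivation medium is specified per litre, whereas plates are usually poured from smaller batches. The short Python sketch below scales the per-litre recipe given above to an arbitrary batch volume. It is offered as a convenience, not as part of the protocol; the 500 mL batch size and all names in the code are assumptions chosen for illustration.

```python
# Per-litre recipe of the plant cultivation medium (Subheading 2.2, item 2).
RECIPE_PER_LITRE = {
    "MS salts (g)": 2.2,
    "0.5 M MES/KOH pH 5.7 (mL)": 10.0,
    "Plant Agar (g)": 8.0,
}


def scale_recipe(volume_ml, recipe=RECIPE_PER_LITRE):
    """Scale every per-litre amount to the requested batch volume in mL."""
    factor = volume_ml / 1000.0
    return {name: round(amount * factor, 3) for name, amount in recipe.items()}


def final_mes_mM(stock_M=0.5, ml_per_litre=10.0):
    """Final MES concentration (mM) after diluting the stock into 1 L of medium."""
    return stock_M * ml_per_litre  # (mol/L * mL) / 1000 mL * 1000 mM per M

# Example: a 500 mL batch (batch size is an assumption, not from the protocol).
batch = scale_recipe(500)
for name, amount in batch.items():
    print(f"{name}: {amount}")
print(f"Final MES concentration: {final_mes_mM()} mM")
```

For selection plates (Subheading 2.4), kanamycin would be added on top of this recipe; its stock concentration is not given in the text, so it is left out of the sketch.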
2.3 Arabidopsis Transformation

1. pUTKan-3xHA-sGFP-TOM5 plasmid (Addgene ID 130674).
2. Electrocompetent Agrobacterium tumefaciens cells, strain GV3101::pMP90 [19].
3. Electroporator.
4. Liquid YEP medium: 10 g L⁻¹ peptone, 10 g L⁻¹ yeast extract, 5 g L⁻¹ NaCl.
5. Solid YEP medium in Petri dishes: 10 g L⁻¹ peptone, 10 g L⁻¹ yeast extract, 5 g L⁻¹ NaCl, 15 g L⁻¹ Bacto Agar.
6. Antibiotics: rifampicin, gentamycin, spectinomycin.
7. Temperature-controlled incubator at 30 °C.
8. 3% (w/v) sucrose solution.
9. Polyalkyleneoxide modified heptamethyltrisiloxane (Silwet L-77; Kurt Obermeier GmbH).
2.4 Selection of Transgenic Lines

1. 0.5 M MES/KOH pH 5.7.
2. Plant cultivation medium in Petri dishes: 2.2 g L⁻¹ MS salts, 10 mL L⁻¹ 0.5 M MES/KOH pH 5.7, 8 g L⁻¹ Plant Agar, 50 μg mL⁻¹ kanamycin.
3. Transgenic seeds.
4. Leaf material from transgenic plants.
5. gDNA isolation buffer: 250 mM Tris/HCl pH 7.5, 250 mM NaCl, 25 mM EDTA, 0.5% (w/v) SDS.
6. 3 M potassium acetate pH 5.0.
7. Isopropanol.
8. 70% (w/v) ethanol.
9. TE buffer: 10 mM Tris/HCl pH 7.5, 0.1 mM EDTA.
10. PCR primers (10 μM stock solution in water):
(a) UB10-Prom-fwd (5′-CGATTTTCTGGGTTTGATCG-3′).
(b) RBCS-Term-rev (5′-CACAGTTCGATAGCGAAAACC-3′).
11. Taq-DNA polymerase and buffer.
12. 10 mM dNTP mix.
13. PCR cycler and reaction tubes.
14. Agarose gel electrophoresis equipment.
15. DNA ladder.
16. UV table.
17. 10 mg mL⁻¹ ethidium bromide solution.
18. Protein gel and immunoblotting buffers and equipment.
19. Single-step anti-HA-HRP antibody.
20. Chemiluminescence detection system and solutions.

2.5 Confocal Laser Scanning Microscopy

1. Staining buffer: 0.5 M MES/KOH pH 5.7, 3% (w/v) sucrose.
2. 0.5 M MES/KOH pH 5.7.
3. MitoTracker™ Red CMXRos (Thermo Fisher Scientific).
4. DMSO.
5. Black reaction tubes (see Note 3).
6. Microscopy slides and coverslips.
7. Zeiss LSM 780 Confocal Microscope and Zeiss Zen software.
3 Methods
3.1 Construction of Tagging Construct

1. Amplify the HA epitope-tag (104 bp) from the Gateway binary vector pGWB15 in a PCR with the HA-fwd and the HA-rev primers, Phusion HF Polymerase, and 0.2 mM dNTP mix. The oligonucleotides were designed to introduce a start codon at the 5′ end, and the unique restriction sites ApaI and SacII at the 5′ and 3′ ends of the amplified PCR fragment.
2. Amplify the sGFP fragment (714 bp) from the Gateway binary vector pGWB4 in a PCR with the sGFP-fwd and the sGFP-rev primers, Phusion HF Polymerase, and 0.2 mM dNTP mix. The oligonucleotides were designed to omit the start and stop codon of the sGFP gene and to introduce the unique restriction sites SacII and SalI, and a linker peptide (GGSG) at the 5′ and the 3′ ends of the amplified PCR fragment.
3. Amplify the coding sequence of AtTOM5 (At5g08040, 162 bp) from A. thaliana cDNA in a PCR with the TOM5-fwd and the TOM5-rev primers, Phusion HF Polymerase, and 0.2 mM dNTP mix. The oligonucleotides were designed to omit the start codon of the TOM5 gene and to introduce the unique restriction sites SalI and SpeI at the 5′ and 3′ ends of the amplified PCR fragment.
4. Run the PCR products on a 2% (w/v) agarose gel supplemented with 0.01% of an ethidium bromide solution (see Notes 1 and 4).
5. Cut out the PCR products from the gel on a UV table (see Note 5).
6. Elute the DNA from the agarose gel slices using a PCR clean-up system.
7. Determine the concentration of the isolated DNA fragments using a micro-volume spectrophotometer.
8. Ligate each fragment into pJET1.2/blunt using the pJET PCR cloning system.
9. Transform competent E. coli cells with each ligation reaction.
10. Plate the transformation reactions on LB Agar plates supplemented with 200 μg mL⁻¹ ampicillin and incubate overnight at 37 °C.
11. Select resistant colonies of each construct and grow E. coli cultures overnight.
12. Isolate plasmid DNA from the liquid E. coli cultures using a plasmid DNA isolation and purification system.
13. Optional: Perform a restriction digest with BglII.
14. Determine the concentration of the isolated DNA fragments using a micro-volume spectrophotometer.
15. Sequence the DNA inserted into pJET1.2/blunt using pJET1.2 Forward sequencing or pJET1.2 Reverse sequencing primers to ensure that no mutations are present in the fragments.
16. Digest 3 μg of the pUTKan vector using ApaI and SpeI (see Note 6).
17. Digest 2 μg of the pJET1.2-HA construct with ApaI and SacII (see Note 6).
18. Digest 2 μg of the pJET1.2-sGFP construct with SacII and SalI-HF.
19. Digest 2 μg of the pJET1.2-TOM5 construct with SalI-HF and SpeI-HF.
20. Run the digestion reactions on a 1% (w/v) agarose gel supplemented with 0.01% of an ethidium bromide solution (see Notes 1 and 4).
21. Cut out the digested fragments (HA-tag: 99 bp, sGFP: 744 bp, TOM5: 171 bp, pUTKan: 10,813 bp) and elute the DNA from the agarose gel slices using the PCR clean-up system.
22. Ligate the fragments HA-tag, sGFP, and TOM5 into the digested pUTKan vector with T4 DNA Ligase using a molar vector-to-insert ratio of 1:3.
23. Transform competent E. coli cells with the ligation reaction.
24. Plate the transformation reaction on LB Agar plates supplemented with 200 μg mL⁻¹ spectinomycin and incubate overnight at 37 °C.
25. Optional: Perform a colony PCR using the primers UB10-Prom-fwd and RBCS-Term-rev to verify the insertion of the fragments (1430 bp).
26. Select resistant colonies and grow E. coli cultures overnight.
27. Isolate plasmid DNA from the liquid E. coli cultures using a plasmid DNA isolation and purification system.
28. Determine the concentration of the isolated DNA fragments using a micro-volume spectrophotometer.
29. Sequence the DNA inserted into the pUTKan vector using UB10-Prom-fwd and RBCS-Term-rev as sequencing primers to ensure that all fragments are inserted in the correct order and no mutations or frameshifts are present.

3.2 Cultivation of Arabidopsis

1. Prepare Petri dishes with autoclaved plant cultivation medium in a sterile environment (see Notes 7 and 8).
2. Sterilize Arabidopsis Col-0 seeds by washing with an aqueous solution containing 70% (v/v) ethanol supplemented with 0.1–1% Triton X-114, followed by washing with 100% ethanol (see Note 9).
3. Equally distribute sterilized seeds onto solid plant cultivation medium (see Notes 10 and 11).
4. Seal the plates with one layer of surgical breathable tape.
5. Incubate the plates for 2 days at 4 °C in the dark to synchronize germination.
6. Place the plates in a plant growth chamber (see Note 12).
7. After 2 weeks of growth at ambient conditions, transfer the seedlings onto soil (see Note 13).
8. Grow the plants under ambient conditions until they are ready for transformation (see Note 14).
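For the four-part ligation in step 22 of Subheading 3.1, the amount of each gel-purified insert follows from the fragment lengths given in step 21 and the 1:3 vector-to-insert ratio. The Python sketch below shows the arithmetic, mass of insert = mass of vector × (insert length / vector length) × ratio. It rests on two assumptions that the text does not state: 50 ng of digested vector per reaction, and the 1:3 ratio being applied to each insert individually. The names in the code are hypothetical.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_per_vector=3.0):
    """Mass of insert (ng) that gives the requested insert:vector molar ratio.

    For double-stranded DNA the molar amount is proportional to mass / length,
    so the average mass per base pair cancels out of the ratio.
    """
    return vector_ng * (insert_bp / vector_bp) * insert_per_vector


VECTOR_BP = 10_813                                       # digested pUTKan (step 21)
FRAGMENTS_BP = {"HA-tag": 99, "sGFP": 744, "TOM5": 171}  # digested inserts (step 21)
VECTOR_NG = 50.0                                         # ASSUMED vector input, not from the protocol

masses = {
    name: round(insert_mass_ng(VECTOR_NG, VECTOR_BP, bp), 2)
    for name, bp in FRAGMENTS_BP.items()
}
for name, ng in masses.items():
    print(f"{name}: {ng} ng per {VECTOR_NG:.0f} ng vector")
```

Under these assumptions the 99 bp HA fragment is needed in only about 1.4 ng, so the concentration measured for such short fragments deserves particular care.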
3.3 Generation of Transgenic Lines

1. Transform electrocompetent Agrobacterium tumefaciens cells, strain GV3101::pMP90, with 1 μg of the pUTKan-3xHA-sGFP-TOM5 tagging construct generated as described in Subheading 3.1.
2. Plate the transformation reaction on YEP Agar plates supplemented with 150 μg mL⁻¹ rifampicin, 50 μg mL⁻¹ gentamycin, and 100 μg mL⁻¹ spectinomycin and incubate for 2–3 days at 30 °C.
3. Select resistant colonies and perform a colony PCR using the primers UB10-Prom-fwd and RBCS-Term-rev to verify that the Agrobacteria contain the tagging plasmid (1430 bp).
4. Optional: Select resistant colonies and grow A. tumefaciens cultures overnight. Isolate plasmid DNA from the liquid A. tumefaciens cultures using a plasmid DNA isolation and purification system. Verify the plasmid by restriction digest or sequencing.
5. Transform Arabidopsis by using the floral dip method [20] (see Notes 15 and 16).
6. Grow the transformed Arabidopsis plants for two more weeks until new siliques have formed.
7. Bag the plants and wait until they are fully dried.
8. Collect the seeds of the transformed plants.
3.4 Selection of Transgenic Lines

1. Sterilize the seeds generated in Subheading 3.3 by following steps 1–8 in Subheading 3.2.
2. Select positive Arabidopsis plants on solid plant cultivation medium supplemented with 50 μg mL⁻¹ kanamycin [21].
3. Isolate gDNA of positive plants to confirm insertion of the tagging construct by PCR using UB10-Prom-fwd and RBCS-Term-rev as primers.
4. Verify expression of the tagging construct by immunoblotting using single-step anti-HA-HRP antibody (see Note 17).
3.5 Visualization

1. Prepare Petri dishes with autoclaved plant cultivation medium and transgenic tagged Col-0 seeds by following steps 1–8 in Subheading 3.2 and let them grow for approximately 2 weeks in a plant growth chamber (see Note 18).
2. Carefully remove the seedlings from the agar plate with a pair of tweezers, place them into a black reaction tube containing 1 mL staining buffer supplemented with 200 nM MitoTracker™ Red CMXRos, and incubate for 5–15 min at 28 °C with gentle rocking (see Notes 19, 20, 21 and 22).
3. Remove the staining buffer and wash the seedling at least three times with staining buffer to remove excess MitoTracker™ Red CMXRos.
4. Carefully place the seedling on top of a microscopy slide.
5. Add a little staining buffer and cover the seedling with a coverslip.
6. Take images with a confocal laser scanning microscope (e.g., Zeiss LSM 780 Confocal Microscope with the Zeiss Zen software).
7. Use the following excitation/emission wavelengths: sGFP (488 nm/490–550 nm) and MitoTracker™ Red CMXRos (561 nm/580–625 nm).
8. See Fig. 2 for a typical result.

Fig. 2 Confocal laser scanning microscopy of epitope-tagged mitochondria in tagged Col-0 leaf and root tissue. Images were taken of the epidermal layer of the first true leaf pair (a) and the epidermal and cortex layer of the main root (b) of 10-day-old Arabidopsis seedlings expressing the UB10p-3xHA-sGFP-TOM5 construct. Green color represents the fluorescent signal of the 3xHA-sGFP-TOM5 fusion protein. Red color represents the fluorescent signal of the MitoTracker™ Red CMXRos. Brightfield is shown in gray. Merged images are shown in yellow. Scale bars = 15 μm
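The 200 nM MitoTracker™ Red CMXRos working concentration used in step 2 is reached by diluting a DMSO stock into the staining buffer. The Python sketch below applies C1·V1 = C2·V2 to the 1 mL staining volume. The protocol does not state a stock concentration, so the 1 mM stock and the 10 μM intermediate dilution used here are assumptions for illustration only, as are all names in the code.

```python
def stock_volume_ul(final_nM, final_volume_ul, stock_nM):
    """Volume of stock (µL) needed for the final concentration, from C1*V1 = C2*V2."""
    return final_nM * final_volume_ul / stock_nM


FINAL_NM = 200.0              # working concentration (Subheading 3.5, step 2)
TUBE_UL = 1000.0              # 1 mL staining buffer per reaction tube (step 2)
STOCK_NM = 1_000_000.0        # ASSUMED 1 mM stock in DMSO, not from the protocol
INTERMEDIATE_NM = 10_000.0    # ASSUMED 10 µM intermediate dilution

direct_ul = stock_volume_ul(FINAL_NM, TUBE_UL, STOCK_NM)
via_intermediate_ul = stock_volume_ul(FINAL_NM, TUBE_UL, INTERMEDIATE_NM)

print(f"Directly from the assumed 1 mM stock: {direct_ul:.2f} µL per tube")
print(f"From the assumed 10 µM intermediate:  {via_intermediate_ul:.1f} µL per tube")
```

With the assumed 1 mM stock the direct volume is well below 1 μL per tube, which is hard to pipette accurately; this is the reason the sketch includes an intermediate dilution. The same function can be used when the dye concentration is varied as described in Note 21.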
4
Notes 1. Alternatively, different DNA staining methods can be used. 2. If the tagging construct is to be integrated into a specific genotype, use seeds of this genotype for transformation.
262
Franziska Kuhnert and Andreas P. M. Weber
3. MitoTracker™ Red CMXRos is light sensitive. Use black reaction tubes or cover transparent reaction tubes with aluminum foil.
4. Ethidium bromide is toxic and carcinogenic. Strictly follow safety procedures, such as protecting skin by wearing disposable gloves.
5. UV light is harmful. Protect skin and eyes.
6. The optimal temperature for ApaI is 25 °C. Digest the plasmid first for 1 h with ApaI. Afterwards, add the second restriction enzyme and digest for 1 h at 37 °C.
7. We prefer round petri dishes with a diameter of 85 mm and a height of 15 mm.
8. Plates can be prepared in advance and stored at 4 °C for at least 1 month.
9. Alternatively, seeds can be surface sterilized with chlorine gas in a desiccator.
10. Seeds have to be dried in a sterile environment before they can be placed onto the plates.
11. Carefully distribute the seeds evenly across the plate.
12. We used a growth chamber with a 12 h light/12 h dark photoperiod under 100 μmol m−2 s−1 light intensity with ambient CO2 conditions (0.038% [v/v] CO2). Alternatively, photoperiod and light intensities can be varied.
13. Carefully pull out the seedlings with a pair of tweezers. Do not rupture the roots of the plants.
14. Plants should be in the flower-producing stage (principal growth stage 6) [22].
15. Generally, we transform four pots with four plants each per construct, for a total of 16 transformed plants per construct.
16. Remove all siliques present prior to transformation.
17. Alternatively, anti-GFP antibody can be used.
18. Growth length and conditions can be varied. We usually prefer to work with 10- to 14-day-old seedlings.
19. We usually incubate three seedlings per reaction tube.
20. We found that an incubation time of 5 min is sufficient to stain mitochondria in root tissue. To image stained mitochondria in leaf tissue, we advise incubating the seedlings for 15 min in staining buffer supplemented with 200 nM MitoTracker™ Red CMXRos.
21. Staining time can be varied with different concentrations of MitoTracker™ Red CMXRos.
In Vivo Epitope Tagging of Plant Mitochondria
22. Application of a slight vacuum might increase the staining efficiency of mitochondria in leaf tissue. However, this was not tested in this study.
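The 200 nM MitoTracker™ Red CMXRos working concentration in Note 20 can be reached from a concentrated stock with the usual C1·V1 = C2·V2 relation. The sketch below is only an illustration: the stock concentration (1 mM, i.e., 1000 μM) and the function name are assumptions, since the stock used by the authors is not stated here.

```python
def stock_volume_ul(final_nm: float, final_volume_ml: float, stock_um: float) -> float:
    """Return the stock volume (in microliters) needed for a dilution.

    Uses C1 * V1 = C2 * V2. All arguments are illustrative inputs:
    final_nm        -- desired final concentration in nM (e.g., 200 nM, Note 20)
    final_volume_ml -- final volume of staining buffer in mL
    stock_um        -- stock concentration in micromolar (assumed, not from the protocol)
    """
    if final_nm <= 0 or final_volume_ml <= 0 or stock_um <= 0:
        raise ValueError("all inputs must be positive")
    final_um = final_nm / 1000.0             # nM -> uM
    if final_um > stock_um:
        raise ValueError("final concentration exceeds stock concentration")
    final_volume_ul = final_volume_ml * 1000.0  # mL -> uL
    return final_um * final_volume_ul / stock_um


if __name__ == "__main__":
    # Hypothetical example: 10 mL of staining buffer from an assumed 1 mM stock.
    print(f"{stock_volume_ul(200, 10, 1000):.2f} uL of stock")
```

Because very small volumes are hard to pipette accurately, an intermediate dilution of the stock may be preferable in practice.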
Acknowledgments
This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project-ID 267205415, SFB 1208, and under Germany’s Excellence Strategy EXC-2048/1, Project ID 390686111.
References
1. Keech O, Dizengremel P, Gardeström P (2005) Preparation of leaf mitochondria from Arabidopsis thaliana. Physiol Plant 124:403–409. https://doi.org/10.1111/j.1399-3054.2005.00521.x
2. Werhahn W, Niemeyer A, Jänsch L, Kruft V, Schmitz UK, Braun H-P (2001) Purification and characterization of the preprotein translocase of the outer mitochondrial membrane from Arabidopsis. Identification of multiple forms of TOM20. Plant Physiol 125:943–954. https://doi.org/10.1104/pp.125.2.943
3. Millar AH, Sweetlove LJ, Giegé P, Leaver CJ (2001) Analysis of the Arabidopsis mitochondrial proteome. Plant Physiol 127:1711–1727. https://doi.org/10.1104/pp.010387
4. Sweetlove LJ, Taylor NL, Leaver CJ (2007) Isolation of intact, functional mitochondria from the model plant Arabidopsis thaliana. Methods Mol Biol 372:125–136. https://doi.org/10.1007/978-1-59745-365-3_9
5. Chen WW, Freinkman E, Wang T, Birsoy K, Sabatini DM (2016) Absolute quantification of matrix metabolites reveals the dynamics of mitochondrial metabolism. Cell 166:1324–1337.e11. https://doi.org/10.1016/j.cell.2016.07.040
6. Bayraktar EC, Baudrier L, Özerdem C, Lewis CA, Chan SH, Kunchok T, Abu-Remaileh M, Cangelosi AL, Sabatini DM, Birsoy K, Chen WW (2019) MITO-tag mice enable rapid isolation and multimodal profiling of mitochondria from specific cell types in vivo. Proc Natl Acad Sci U S A 116:303–312. https://doi.org/10.1073/pnas.1816656115
7. Ahier A, Dai C-Y, Tweedie A, Bezawork-Geleta A, Kirmes I, Zuryn S (2018) Affinity purification of cell-specific mitochondria from whole animals resolves patterns of genetic mosaicism. Nat Cell Biol 20:352–360. https://doi.org/10.1038/s41556-017-0023-x
8. Liao PC, Boldogh IR, Siegmund SE, Freyberg Z, Pon LA (2018) Isolation of mitochondria from Saccharomyces cerevisiae using magnetic bead affinity purification. PLoS One 13:1–15. https://doi.org/10.1371/journal.pone.0196632
9. Kuhnert F, Stefanski A, Overbeck N, Drews L, Reichert AS, Stühler K, Weber APM (2020) Rapid single-step affinity purification of HA-tagged plant mitochondria. Plant Physiol 182:692–706. https://doi.org/10.1104/pp.19.00732
10. Werhahn W, Jänsch L, Braun HP (2003) Identification of novel subunits of the TOM complex from Arabidopsis thaliana. Plant Physiol Biochem 41:407–416. https://doi.org/10.1016/S0981-9428(03)00047-0
11. Wiedemann N, Frazier AE, Pfanner N (2004) The protein import machinery of mitochondria. J Biol Chem 279:14473–14476. https://doi.org/10.1074/jbc.R400003200
12. Dietmeier K, Hönlinger A, Bömer U, Dekker PJT, Eckerskorn C, Lottspeich F, Kübrich M, Pfanner N (1997) Tom5 functionally links mitochondrial preprotein receptors to the general import pore. Nature 388:195–200. https://doi.org/10.1038/40663
13. Horie C, Suzuki H, Sakaguchi M, Mihara K (2003) Targeting and assembly of mitochondrial tail-anchored protein Tom5 to the TOM complex depend on a signal distinct from that of tail-anchored proteins dispersed in the membrane. J Biol Chem 278:41462–41471. https://doi.org/10.1074/jbc.M307047200
14. Krebs M, Held K, Binder A, Hashimoto K, Den Herder G, Parniske M, Kudla J, Schumacher K (2012) FRET-based genetically encoded sensors allow high-resolution live cell imaging of Ca2+ dynamics. Plant J 69:181–192. https://doi.org/10.1111/j.1365-313X.2011.04780.x
15. Yang Y, Costa A, Leonhardt N, Siegel RS, Schroeder JI (2008) Isolation of a strong Arabidopsis guard cell promoter and its potential as a research tool. Plant Methods 4:1–15. https://doi.org/10.1186/1746-4811-4-6
16. Hirakawa Y, Kondo Y, Fukuda H (2010) TDIF peptide signaling regulates vascular stem cell proliferation via the WOX4 homeobox gene in Arabidopsis. Plant Cell 22:2618–2629. https://doi.org/10.1105/tpc.110.076083
17. Schürholz A-KK, López-Salmerón V, Li Z, Forner J, Wenzl C, Gaillochet C, Augustin S, Barro AV, Fuchs M, Gebert M, Lohmann JU, Greb T, Wolf S (2018) A comprehensive toolkit for inducible, cell type-specific gene expression in Arabidopsis. Plant Physiol 178:40–53. https://doi.org/10.1104/pp.18.00463
18. Schlereth A, Möller B, Liu W, Kientz M, Flipse J, Rademacher EH, Schmid M, Jürgens G, Weijers D (2010) MONOPTEROS controls embryonic root initiation by regulating a mobile transcription factor. Nature 464:913–916. https://doi.org/10.1038/nature08836
19. Koncz C, Schell J, Rip HBI (1986) The promoter of TL-DNA gene 5 controls the tissue-specific expression of chimeric genes carried by a novel type of Agrobacterium binary vector. Mol Gen Genet 204:383–396. https://doi.org/10.1007/BF00331014
20. Clough SJ, Bent AF (1998) Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J 16:735–743. https://doi.org/10.1046/j.1365-313x.1998.00343.x
21. Weigel D, Glazebrook J (2006) Kanamycin selection of transformed Arabidopsis. In: CSH protocols. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. https://doi.org/10.1101/pdb.prot4669
22. Boyes DC, Zayed AM, Ascenzi R, McCaskill AJ, Hoffman NE, Davis KR, Görlach J (2001) Growth stage-based phenotypic analysis of Arabidopsis: a model for high throughput functional genomics in plants. Plant Cell 13:1499–1510. https://doi.org/10.1105/tpc.13.7.1499
Chapter 15
Trichome Transcripts as Efficiency Control for Synthetic Biology and Molecular Farming
Richard Becker, Christian Görner, Pavel Reichman, and Nico Dissmeyer
Abstract
A variety of methods for studying glandular leaf hairs (trichomes) as multicellular micro-organs are well established for synthetic biology platforms like tobacco or tomato but rather rare for nonglandular and usually single-celled trichomes of the model plant Arabidopsis thaliana. A thorough isolation of (ideally intact) trichomes is decisive for further biochemical and genomic analyses of primary and secondary metabolic compounds, enzymes, and especially transcripts to monitor initial success of an engineering approach. While isolation of tomato or tobacco trichomes is rather easy, by simply freezing whole plants in liquid nitrogen and brushing off trichomes, this approach does not work for Arabidopsis. This is mainly due to damage of trichome cells during the collection procedure and very low yield. Here, we provide a robust method for a virtually epithelial cell-free isolation of Arabidopsis trichomes. This method is then joined with an RNA isolation protocol to perform mRNA analysis on extracts of the isolated trichomes using a semiquantitative RT-PCR setup.
Key words Synthetic biology, Molecular farming, Metabolic engineering, Trichomes, Trichofarming
1 Introduction
Several approaches exist to isolate whole leaf hairs (trichomes) from Arabidopsis thaliana, and they were tested for their suitability for mRNA and protein analysis to monitor initial success of a biotechnological engineering approach. One early method was to use fine forceps and collect individual trichomes by ripping them off the leaves. Though this was surprisingly fast and gathered intact trichomes, most of them came with basal cells attached to them, which is not desired as it easily leads to dilution of the relevant trichome cell type with cells of the upper epidermis. This elevates background signals and the risk of contamination with plant compounds of epithelial (epidermal) origin (Fig. 1a, b). The method that proved to reliably and efficiently isolate trichomes mostly followed the protocol published by David Marks
Matias D. Zurbriggen (ed.), Plant Synthetic Biology, Methods in Molecular Biology, vol. 2379, https://doi.org/10.1007/978-1-0716-1791-5_15, © Springer Science+Business Media, LLC, part of Springer Nature 2022
Richard Becker et al.
Fig. 1 Arabidopsis thaliana leaves and trichomes. (a) Arabidopsis leaf after trichome isolation using forceps. The red arrows mark holes in the epithelial tissue where trichomes were located before the collection. (b) Trichomes isolated by forceps with basal epidermal cells still attached, (c) single and (d) clustered multiple Arabidopsis trichomes that were isolated following the method in this protocol. Scale bars: (a–c): 1 mm, (d): 0.1 mm
and colleagues [1] with some adjustments and modifications. It is a solid method to isolate trichomes from Arabidopsis with few or no epithelial cells attached to them (Fig. 1c, d). Of note, any contamination with other cell types heavily influences the readout of the experiment. The isolation was first conducted in the laboratory at room temperature; later, it was moved to a cooled climate chamber at 8 °C, where the plants were cultivated, to further minimize the risk of mRNA degradation and temperature-induced transcription. The general principle of the described method is that EGTA, a chelating agent, binds calcium (Ca2+) ions, which weakens the bond between the basal cells and the trichome cell itself (see Note 1). The second experimental difficulty to overcome was the isolation of intact mRNA from trichomes. This proved to be difficult since the trichomes have a thick cell wall, which is hard to break up without risking mRNA degradation. The solution proposed is to
Trichome Transcripts in Synthetic Biology and Molecular Farming
use the established TRI reagent, which contains phenol and guanidinium thiocyanate (see Note 2), while grinding the trichomes with a mortar and pestle. Guanidinium thiocyanate is a very effective protein denaturant and should prevent the degradation of the mRNA caused by RNases during the RNA isolation process [2]. Besides this procedure, there are only very few reports on trichome transcript analysis, and studies on transcriptional profiling of mature Arabidopsis trichomes remain scarce [3]. To verify the integrity of the isolated mRNA, a semiquantitative RT-PCR approach with primers for different genes, including the housekeeping gene TUBULIN2, was used, and the resulting DNA fragments were analyzed on an agarose gel (Fig. 2) [4]. The described method was used to prove the concept of establishing switchable cell types, here trichomes, as conditional, regulatable cellular entities. The switch-like behavior is based on the so-called phenotype-on-demand system, which allows conditional complementation of a glabrous (trichome-lacking) mutant plant line with a conditional complementation construct. This phenotypic rescue construct can be stabilized by low ambient temperatures (Fig. 3) [4–7]. The underlying low-temperature (lt)-degron can be used to produce chimeric proteins of interest in a conditional manner as temperature-sensitive alleles (reviewed in [6, 8]). The system is based on conditional and highly targeted proteolysis addressing the N-degron pathway of targeted protein degradation [9–11]. Depending on the target protein that is regulated this way, developmental processes can be manipulated and phenotypes on demand established in vivo. These are, for example, conditional cell
Fig. 2 Harvested individual Arabidopsis trichomes. (a) Trichomes were harvested as described and resuspended in buffer. (b) An example for an RT-PCR on RNA samples isolated from harvested trichomes. TUBULIN2 as housekeeping control, ABCB19 (ARABIDOPSIS THALIANA ATP-BINDING CASSETTE B19), DFR (DIHYDROFLAVONOL 4-REDUCTASE), and LDOX (LEUCOANTHOCYANIDIN DIOXYGENASE). Scale bar: (a): 1 mm
Fig. 3 Application of the lt-degron system. (a) Production of proteins in conditional alleles and (b) example for the establishment of phenotypes on demand in vivo, here as example: switchable trichomes, adapted from Faden et al. [5] and Dissmeyer [6]. The method is further outlined elsewhere and used in the context of expression of toxic proteins in trichomes [7]
types such as trichomes by upregulating TRANSPARENT TESTA GLABRA1 (TTG1) or conditional tissues such as floral meristems by stabilizing CONSTANS (CO) protein [5] (Fig. 3). The aforementioned system has been applied in several contexts such as stabilizing proteins in Drosophila, yeast cells, animal and plant cell cultures and also to restrict the toxicity of deleterious target proteins in vivo [7].
2 Materials
2.1 Plant Cultivation
1. Arabidopsis thaliana seeds of line of interest.
2. Growth cabinets or climate chambers (“phytochambers”).
3. Temperature logger (see Note 3).
4. Sterilized soil (fungicides may be added to the soil, see [5, 12–14]).
5. Optional (if switchable trichome lines are used that are based on the lt-degron system): Arabidopsis prt1-1 mutant [15] or wild type, e.g., accession Columbia-0 (Col-0) as control.
2.2 Trichome Harvest and Isolation
1. Liquid nitrogen.
2. PBS buffer only containing potassium salts (PBS-K; see Note 4).
3. 50–100 mM EGTA (sc-3593, Santa Cruz Biotechnology, Inc.) in PBS-K, prepare about 150 mL per line.
4. Door mesh (Schellenberger, mesh size ca. 1.6 mm, kindly provided by David Marks) or fly screen for windows (Globus Baumarkt, Braschwitz).
5. Standard HPLC glass beads, 60–106 μm.
6. Cell strainer, mesh size of 100–150 μm (see Note 5).
7. Scissors, forceps, graduated cylinder.
8. 50 mL centrifuge tubes.
9. 2 mL safe-lock microcentrifugation tubes.
10. Vortexer.
11. Laboratory scale.
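The amounts of EGTA, leaves, and glass beads scale with the number of tubes processed per line. The following sketch tallies them from the figures given in this chapter (15 mL of EGTA solution, up to 1.5 g of leaves, and 50–70 mg of glass beads per 50 mL tube; roughly 8 tubes per 100 mg of trichomes). The molar mass of EGTA free acid (380.35 g/mol) is a standard value not stated in the protocol, and the function names and the linear scaling of yield are assumptions made for illustration only.

```python
import math

EGTA_MW_G_PER_MOL = 380.35  # EGTA free acid; standard value, not from the protocol


def egta_mass_g(conc_mm: float, volume_ml: float) -> float:
    """Mass of EGTA (g) needed for a solution of conc_mm (mM) and volume_ml (mL)."""
    return (conc_mm / 1000.0) * (volume_ml / 1000.0) * EGTA_MW_G_PER_MOL


def harvest_plan(target_trichome_mg: float) -> dict:
    """Estimate tubes and consumables for a target trichome mass.

    Assumes the yield scales linearly from the chapter's estimate of
    about 100 mg trichomes from 8 tubes with 1.5 g leaves each.
    """
    tubes = math.ceil(target_trichome_mg / 100.0 * 8)
    return {
        "tubes": tubes,
        "leaves_g": tubes * 1.5,               # upper limit of 1.5 g per tube
        "egta_solution_ml": tubes * 15,        # 15 mL of EGTA in PBS-K per tube
        "glass_beads_mg": (tubes * 50, tubes * 70),  # 50-70 mg per tube
    }


if __name__ == "__main__":
    print(f"EGTA for 150 mL at 50 mM: {egta_mass_g(50, 150):.2f} g")
    print(harvest_plan(100))
```

For the 8-tube case, the 120 mL of EGTA solution consumed in the tubes is consistent with the recommendation to prepare about 150 mL per line.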
2.3 Homogenization of Trichomes
1. Clean and autoclaved porcelain mortar and pestle, one set for each line (see Note 6).
2. TRIsure (Bioline).
3. 2 mL microcentrifugation tubes.
4. Forceps and spatula.
5. Benchtop centrifuge.
2.4 RNA Isolation with TRIsure
1. TRIsure (Bioline).
2. 2 and 1.5 mL microcentrifugation tubes.
3. Benchtop centrifuge.
2.5 RNA Cleanup (Optional)
RNA cleanup is an optional step that can easily be done using commercially available column-based kits.
1. RNeasy Mini Kit (Qiagen).
2. Benchtop centrifuge.
2.6 cDNA Synthesis of Targeted mRNA
1. Thermo Scientific RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific).
2. Oligo d(T) primers.
3. Thermocycler (LabCycler Gradient, SensoQuest).
3 Methods
3.1 Plant Cultivation
Arabidopsis plants conditionally producing unicellular trichomes (lt-degron K2:TTG1 lines [5]) were grown on soil at the permissive temperature of 8 °C and under long-day conditions. Regular plants can be grown under standard conditions, such as at room temperature. Mature trichomes were harvested at different stages of the plant life from vegetative tissue (see Note 7).
3.2 Trichome Harvest and Isolation
Ideally, the whole trichome isolation process should be done in a cold room (ca. 8 °C) in order to prevent temperature-induced transcription and reduce mRNA degradation.
1. Add 50–70 mg of glass beads to as many 50 mL centrifuge tubes as needed. Approximately 8 centrifuge tubes with 1.5 g of leaves each will yield about 100 mg of isolated trichomes (see Note 6).
2. Cut off whole plant leaves and cut bigger leaves in half (see Note 8). Add up to 1.5 g of leaves to a 50 mL centrifuge tube. It is important not to add too much, so that the leaves can move freely during vortexing. Add 15 mL of 50 mM EGTA in PBS-K to each tube and mix briefly. Incubate the leaves for 20 min on ice. After the incubation, vortex the leaves for 1 min at the highest setting, followed by 1 min of incubation on ice (Fig. 4).
3. Repeat this step 4 times.
4. Pool the leaves from all centrifuge tubes containing the same line in a large beaker.
Fig. 4 Harvesting and processing of Arabidopsis leaves for trichome isolation. (a) Harvested Arabidopsis leaves, (b) about 1.5 g of leaves were weighed into a 50 mL tube, leaving enough space for the leaves to move freely while vortexing (c)
Fig. 5 Beaker with several layers of door mesh on top. The leaves and PBS mixture is filtered into the beaker. The number of door mesh layers needs to be sufficient to prevent any leaf material from passing through. Afterwards the beaker will contain the PBS solution with the trichomes and the leaves can be transferred back to their original beaker and washed again
5. Filter the mixture through 4–8 layers of door mesh into another beaker to retain the leaves (Fig. 5).
6. Transfer the retained leaves back into their original beaker, wash them with PBS-K to free stuck trichomes, and filter again through the door mesh into the beaker without the leaves.
7. Repeat this several times (about 3) until no more trichomes appear to be washed off the leaves (Fig. 6).
8. Prewet the cell strainer with PBS-K (Fig. 5).
9. Take the beaker containing the trichomes and sieve the content through a 100–150 μm cell strainer to remove smaller cells and debris from the trichomes. This can be done by carefully pouring the content from the beaker through the strainer. Do not pour too fast, because an overflowing cell strainer would result in a loss of trichomes.
10. Rinse the trichomes inside the cell strainer with a few mL of PBS-K.
11. Invert the cell strainer into a 50 mL centrifuge tube and use a 1 mL pipette and PBS-K to dislodge the trichomes from the cell strainer into the 50 mL tube.
12. Repeat until all of the trichome suspension has been sieved through the cell strainer.
13. Wash the beaker with PBS-K to gather trichomes that might have stuck to the walls.
14. Let the trichomes settle by gravity on ice and remove the supernatant with a 200 μL pipette.
Fig. 6 Arabidopsis leaves before and after the trichome isolation. (a) Mature trichomes are large enough to be seen by naked eye on the leaf epidermal surface. (b) A significant amount of trichomes was removed from the leaf by the collection process and the leaf epithelial tissue remains largely intact. Scale bars: (a) and (b): 1 mm
15. Add 1 mL of PBS and transfer the trichomes to a 2 mL microcentrifuge tube using a 1 mL pipette with a cut-off tip.
16. Repeat this approximately 3 times to transfer most of the trichomes from the 50 mL tube to the 2 mL microcentrifuge tube.
17. Remove excess PBS-K from the 2 mL tube using a 200 μL pipette, if necessary in between and at the end.
18. If the trichomes are not processed immediately, freeze them in liquid nitrogen and store at −80 °C (Fig. 6).
3.3 Homogenization of Trichomes
To counteract degradation of mRNA, which might also occur with the cells intact, the trichomes should be handled only on ice and stored at −80 °C if they are to be used at a later date. Prior to the procedure, the mortar and pestle should be cleaned thoroughly, first with detergent and water and then with ethanol. Afterwards, they should be autoclaved, preferably multiple times, while covered in aluminum foil to reduce the amount of active RNases present on the mortar and pestle. The focus here is again to keep the RNA intact while grinding down the trichomes. Also follow general rules for working with RNA, such as using a designated set of pipette tips and tubes that are used for RNA work only, to further minimize RNase contamination. Do not precool the mortar and pestle, as this will lead to undesired condensation of water. TRIsure reagent contains phenol and thiocyanate; take necessary precautions.
1. If a frozen pellet of trichomes is used, use a pair of forceps or a spatula to transfer the pellet into the mortar. If unfrozen trichomes are used, add 800 μL of TRIsure reagent to the tube and use a pipette with a cut-off tip to transfer the mixture into the mortar; however, it might be easier to freeze the trichomes briefly so that the complete pellet can be removed.
2. Add 800 μL of TRIsure reagent onto the pellet inside the mortar and grind it sufficiently. The time needed depends on the operator.
3. Transfer the mixture into a 2 mL tube that is suitable for centrifugation at 12,000 × g.
4. Rinse the mortar with a further 200 μL of TRIsure reagent (1 mL of TRIsure in total) and add the liquid to the 2 mL tube as well.
5. Centrifuge the sample for 10 min at 12,000 × g to remove insoluble material and proceed with the protocol for RNA isolation according to the manual for the TRIsure reagent.
3.4 RNA Isolation with TRIsure
The general procedure of the RNA isolation followed the TRIsure protocol for RNA isolation. The major difficulty lies in achieving sufficient lysis and homogenization of the trichomes, as this is the first step of the RNA isolation (see Subheading 3.3; see Note 9).
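Benchtop centrifuges are often set in rpm rather than in multiples of g, so the 12,000 × g spin that closes Subheading 3.3 may need to be converted for the rotor at hand. The sketch below uses the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm² (r in cm). The rotor radius in the example is a hypothetical value; the actual radius must be taken from the rotor documentation.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) required to reach a given RCF at a given radius."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius_cm must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 8.0  # hypothetical rotor radius, not from the protocol
    rpm = rpm_from_rcf(12_000, radius_cm)
    print(f"12,000 x g at r = {radius_cm} cm corresponds to about {rpm:.0f} rpm")
```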
3.5 RNA Cleanup with RNeasy Columns (Optional)
This step should only be performed if enough RNA was isolated but the preparation is contaminated, for example with phenol. This step will not separate the RNA from the white precipitate (see Note 9). Follow the cleanup procedure in the RNeasy protocol to further purify the isolated RNA from Subheading 3.4. More details on procedures addressing proteostasis via the N-degron pathway, on construct design, and on enzymatic technology can be found in related protocols [16–19].
3.6 cDNA Synthesis of Targeted mRNA
The cDNA synthesis was performed using the Thermo Scientific RevertAid First Strand cDNA Synthesis Kit, which includes an M-MuLV reverse transcriptase. The amount of enzyme and other components needs to be adjusted to the amount of RNA available, since the yield is usually lower than required, especially when performing the DNase treatment that is recommended to remove carryover DNA. It might be useful to concentrate the RNA by evaporating some of the water in which it is dissolved at the end of the RNA isolation protocol. This must be done under low pressure, since RNA might degrade if it is heated to higher temperatures.
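For the semiquantitative RT-PCR readout mentioned in the Introduction, band intensities of the genes of interest are commonly expressed relative to the housekeeping control run on the same cDNA. The following sketch illustrates such a normalization to TUBULIN2. The intensity values are invented placeholders, and the function is an illustrative assumption rather than the authors' analysis procedure.

```python
from typing import Dict


def normalize_to_reference(intensities: Dict[str, float],
                           reference: str = "TUBULIN2") -> Dict[str, float]:
    """Divide each band intensity by the intensity of the reference gene."""
    if reference not in intensities:
        raise KeyError(f"reference gene {reference!r} missing from intensities")
    ref_value = intensities[reference]
    if ref_value <= 0:
        raise ValueError("reference intensity must be positive")
    return {gene: value / ref_value for gene, value in intensities.items()}


if __name__ == "__main__":
    # Placeholder densitometry values (arbitrary units), for illustration only.
    bands = {"TUBULIN2": 200.0, "ABCB19": 150.0, "DFR": 100.0, "LDOX": 50.0}
    for gene, ratio in normalize_to_reference(bands).items():
        print(f"{gene}: {ratio:.2f}")
```

Such ratios are only comparable between samples that were amplified with the same cycle number and imaged under the same settings.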
4 Notes
1. EGTA is used instead of EDTA because it is more selective for Ca2+ ions, whereas EDTA has a higher affinity for Mg2+ ions.
2. In principle, all “TRI reagents” equivalent to TRIsure should work with this protocol, but only TRIsure (Bioline Cat. Nr.: BIO-38032) was used to test the protocol. Further information is given elsewhere [2].
3. The website of the German Controlled Environment User Group (www.gceug.de) gives instructions on diverse aspects of controlled plant growth environments, from technology to plant growth parameters. This web resource raises awareness of details that need to be considered in instrument and equipment acquisition, from measurement devices and growth cabinets to walk-in chambers and whole facilities.
4. Only potassium salts are used because sodium phosphate might precipitate at low temperatures.
5. It might be beneficial to use a larger mesh size (up to 200 μm): Arabidopsis trichomes are relatively large and, if several are present, they entangle with each other, which prevents them from passing through the mesh, while smaller impurities are removed more easily.
6. Since the trichomes have a thick cell wall while being relatively small, they are hard to break. Bead milling of the deep-frozen trichome pellet with 4 mm diameter steel beads did not result in any noticeable rupturing of the cells, even if conducted for a prolonged time. The longest run in the bead mill was 10 min with interim freezing periods using liquid nitrogen. The trichomes were assessed under a light microscope and found to be mostly intact.
7. Trichomes will have different transcription patterns depending on their maturity [3]. This method was used to harvest mature trichomes.
8. An optional step to minimize later impurities at this stage is to wash the plants several times before harvesting the leaves, though this was not done if the plants appeared to be visually clean.
9. This method did not prove to be without flaws. Most of the time, low RNA yields and undesirable A260/A280 ratios were observed when analyzing the preparation spectrophotometrically, though enough mRNA was isolated to achieve sufficient cDNA synthesis and amplification results. In some cases, a white precipitate formed after the RNA precipitation step in the TRIsure protocol, which is suspected to consist of polysaccharides or cell wall components. These samples delivered no usable RNA, even after a cleanup step through a purification kit such as RNeasy (Qiagen). Possibly, the RNA is bound to the polysaccharides, or the white precipitate contains denatured protein that could bind the RNA and render it insoluble. A specific handling step that caused the precipitation could not be clearly identified, though the precipitate first formed when trichomes from older plants were used while the procedure of harvesting and isolation remained unchanged.
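The yield and purity checks mentioned in Note 9 can be computed from absorbance readings. The sketch below uses the widely used conversion of one A260 unit to about 40 μg/mL of single-stranded RNA (1 cm path length) and the A260/A280 ratio, which is close to 2.0 for pure RNA. The acceptance window of 1.8–2.2 and the example readings are assumptions chosen for illustration, not values from this protocol.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # standard factor for single-stranded RNA, 1 cm path


def rna_concentration_ng_per_ul(a260: float, dilution_factor: float = 1.0) -> float:
    """RNA concentration in ng/uL (numerically equal to ug/mL)."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("a260 must be >= 0 and dilution_factor > 0")
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def a260_a280_ratio(a260: float, a280: float) -> float:
    """Purity ratio; values near 2.0 are generally expected for pure RNA."""
    if a280 <= 0:
        raise ValueError("a280 must be positive")
    return a260 / a280


def looks_pure(ratio: float, low: float = 1.8, high: float = 2.2) -> bool:
    """Rule-of-thumb check; the window limits are assumed, adjust as needed."""
    return low <= ratio <= high


if __name__ == "__main__":
    a260, a280 = 0.25, 0.16  # hypothetical readings
    ratio = a260_a280_ratio(a260, a280)
    print(f"{rna_concentration_ng_per_ul(a260):.1f} ng/uL, "
          f"A260/A280 = {ratio:.2f}, pure: {looks_pure(ratio)}")
```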
Acknowledgments
We cordially thank David Marks for sending initial lab supplies (door mesh, cell strainer, glass beads) and for helpful discussions on the various protocols. This work was supported by a grant for setting up the junior research group of the ScienceCampus Halle – Plant-based Bioeconomy to N.D., a Ph.D. scholarship from the DAAD (Deutscher Akademischer Austauschdienst) to P.R., by DI 1794/3-1 of the German Research Foundation (DFG) to N.D., and by grant LSP-TP2-1 of the Research Focus Program “Molecular Biosciences as a Motor for a Knowledge-Based Economy” from the European Regional Development Fund (EFRE) to N.D. Financial support came from the Leibniz Association, the state of Saxony-Anhalt, the DFG Graduate Training Center GRK1026 “Conformational Transitions in Macromolecular Interactions” at Halle, and the Leibniz Institute of Plant Biochemistry (IPB) at Halle, Germany. N.D.’s lab was a participant in the European Cooperation in Science and Technology (COST) Action BM1307, “European network to integrate research on intracellular proteolysis pathways in health and disease (PROTEOSTASIS)” [20].
References
1. Marks MD, Betancur L, Gilding E, Chen F, Bauer S, Wenger JP, Dixon RA, Haigler CH (2008) A new method for isolating large quantities of Arabidopsis trichomes for transcriptome, cell wall and other types of analyses. Plant J 56(3):483–492. https://doi.org/10.1111/j.1365-313X.2008.03611.x
2. Chomczynski P, Sacchi N (2006) The single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction: twenty-something years on. Nat Protoc 1(2):581–585. https://doi.org/10.1038/nprot.2006.83
3. Jakoby M, Falkenhan D, Mader M, Brininstool G, Wischnitzki E, Platz N, Hudson A, Hülskamp M, Larkin J, Schnittger A (2008) Transcriptional profiling of mature Arabidopsis trichomes reveals that NOECK encodes the MIXTA-like transcriptional regulator MYB106. Plant Physiol 148(3):1583–1602. https://doi.org/10.1104/pp.108.126979
4. Faden F, Mielke S, Dissmeyer N (2018) Switching toxic protein function in life cells. bioRxiv 430439. https://doi.org/10.1101/430439
5. Faden F, Ramezani T, Mielke S, Almudi I, Nairz K, Froehlich MS, Hockendorff J, Brandt W, Hoehenwarter W, Dohmen RJ, Schnittger A, Dissmeyer N (2016) Phenotypes on demand via switchable target protein degradation in multicellular organisms. Nat Commun 7:12202. https://doi.org/10.1038/ncomms12202
6. Dissmeyer N (2017) Conditional modulation of biological processes by low-temperature degrons. Methods Mol Biol 1669:407–416. https://doi.org/10.1007/978-1-4939-7286-9_30
7. Faden F, Mielke S, Dissmeyer N (2019) Modulating protein stability to switch toxic protein function on and off in living cells. Plant Physiol 179(3):929–942. https://doi.org/10.1104/pp.18.01215
8. Faden F, Mielke S, Lange D, Dissmeyer N (2014) Generic tools for conditionally altering protein abundance and phenotypes on demand. Biol Chem 395(7–8):737–762. https://doi.org/10.1515/hsz-2014-0160
9. Dissmeyer N (2019) Conditional protein function via N-degron pathway-mediated proteostasis in stress physiology. Annu Rev Plant Biol 70:83–117. https://doi.org/10.1146/annurev-arplant-050718-095937
10. Dissmeyer N, Rivas S, Graciet E (2018) Life and death of proteins after protease cleavage: protein degradation by the N-end rule pathway. New Phytol 218(3):929–935. https://doi.org/10.1111/nph.14619
11. Perrar A, Dissmeyer N, Huesgen PF (2019) New beginnings and new ends: methods for large-scale characterization of protein termini and their use in plant biology. J Exp Bot 70(7):2021–2038. https://doi.org/10.1093/jxb/erz104
12. Dissmeyer N, Nowack MK, Pusch S, Stals H, Inze D, Grini PE, Schnittger A (2007) T-loop phosphorylation of Arabidopsis CDKA;1 is required for its function and can be partially substituted by an aspartate residue. Plant Cell 19(3):972–985. https://doi.org/10.1105/tpc.107.050401
13. Dissmeyer N, Weimer AK, Pusch S, De Schutter K, Alvim Kamei CL, Nowack MK, Novak B, Duan GL, Zhu YG, De Veylder L, Schnittger A (2009) Control of cell proliferation, organ growth, and DNA damage response operate independently of dephosphorylation of the Arabidopsis Cdk1 homolog CDKA;1. Plant Cell 21(11):3641–3654. https://doi.org/10.1105/tpc.109.070417
14. Dissmeyer N, Schnittger A (2011) Use of phospho-site substitutions to analyze the biological relevance of phosphorylation events in regulatory networks. Methods Mol Biol 779:93–138. https://doi.org/10.1007/978-1-61779-264-9_6
15. Mot AC, Prell E, Klecker M, Naumann C, Faden F, Westermann B, Dissmeyer N (2018) Real-time detection of N-end rule-mediated ubiquitination via fluorescently labeled substrate probes. New Phytol 217(2):613–624. https://doi.org/10.1111/nph.14497
16. Naumann C, Mot AC, Dissmeyer N (2016) Generation of artificial N-end rule substrate proteins in vivo and in vitro. Methods Mol Biol 1450:55–83. https://doi.org/10.1007/978-1-4939-3759-2_6
17. Klecker M, Dissmeyer N (2016) Peptide arrays for binding studies of E3 ubiquitin ligases. Methods Mol Biol 1450:85–94. https://doi.org/10.1007/978-1-4939-3759-2_7
18. Faden F, Eschen-Lippold L, Dissmeyer N (2016) Normalized quantitative western blotting based on standardized fluorescent labeling. Methods Mol Biol 1450:247–258. https://doi.org/10.1007/978-1-4939-3759-2_20
19. Reichman P, Dissmeyer N (2017) In vivo reporters for protein half-life. Methods Mol Biol 1669:387–406. https://doi.org/10.1007/978-1-4939-7286-9_29
20. Dissmeyer N, Coux O, Rodriguez MS, Barrio R, Core Group Members of P (2019) PROTEOSTASIS: a European network to break barriers and integrate science on protein homeostasis. Trends Biochem Sci 44(5):383–387. https://doi.org/10.1016/j.tibs.2019.01.007
Chapter 16
Generation of Stable, Light-Driven Co-cultures of Cyanobacteria with Heterotrophic Microbes
Amit K. Singh and Daniel C. Ducat
Abstract
Co-cultivation of an autotrophic species with one or more heterotrophic microbes is a strategy for photobiological production of high-value compounds and is relatively underexplored in comparison to cyanobacterial or microalgal monocultures. Long-term stability of such consortia is required for useful collaboration between the partners, and this property can be increased by encapsulation of phototrophic partners within a hydrogel. Encapsulated cyanobacteria have advantages relative to planktonic cultures, such as increased control over population sizes and reduced liquid handling requirements, that may be useful for exploring the potential of artificial microbial communities for targeted biomolecule synthesis. In this chapter, we describe a method for encapsulation of a genetically modified cyanobacterial strain (Synechococcus elongatus PCC 7942, CscB+) into a sodium alginate matrix, and the utilization of these encapsulated cells to construct stable, artificial autotroph/heterotroph co-cultures. This method has applications for the study of phototroph-based synthetic microbial consortia and multi-species photobiological production.
Key words Cyanobacteria, Encapsulation, Alginate, Co-cultivation, Synthetic consortia, Carbohydrates, Photobiological production
1 Introduction

Current bioproduction techniques are driven primarily by microbial consumption of plant-based, simple carbohydrates (e.g., corn ethanol), although the environmental footprint of such strategies has spurred interest in alternative feedstocks. Due to their efficient photosynthetic processes and genetic tractability, cyanobacteria are increasingly being considered for the light-driven conversion of CO2 into carbohydrates that are compatible with bioproduction microbes. Multiple research teams have developed cyanobacterial strains with enhanced secretion of sugars and sugar alcohols [1–10], including some strains which could be competitive with traditional plant crop species for the production of simple carbohydrates.
Matias D. Zurbriggen (ed.), Plant Synthetic Biology, Methods in Molecular Biology, vol. 2379, https://doi.org/10.1007/978-1-0716-1791-5_16, © Springer Science+Business Media, LLC, part of Springer Nature 2022
One possible strategy for the utilization of sugar-secreting cyanobacteria is to grow them in co-culture with heterotrophic microbes that can utilize the secreted photosynthate to power metabolic bioproduction pathways. Recently, our group and others have developed methods to construct synthetic microbial consortia in which modified cyanobacteria provide organic carbon that supports the growth and metabolic processes of a co-cultivated heterotrophic microbe. These artificial consortia have been used to produce a variety of bioproducts from solar energy, including bioplastics, enzymes, fixed nitrogen, and biofuel precursors [11–18], and are under consideration for bioremediation applications [19, 20]. We recently reported a light-driven consortium of an engineered strain of the cyanobacterium Synechococcus elongatus with an unrelated heterotroph (Halomonas boliviensis or Escherichia coli), in which the co-culture was stable for >5 months and could continuously produce appreciable amounts of the bioplastic precursor molecule polyhydroxybutyrate (PHB) [18]. In this particular report, S. elongatus was engineered to autotrophically generate and secrete sucrose through the heterologous expression of sucrose permease, cscB [1]. This transporter allows sucrose that accumulates in the cytosol of the cyanobacteria during periods of illumination to be secreted continuously into the medium, thereby supporting the growth of H. boliviensis in the absence of any supplemented organic carbon sources. The unusual stability of this artificial consortium was bolstered by encapsulation of the cyanobacterial strain within alginate hydrogels. Alginate is a polysaccharide composed of (1,4)-linked β-D-mannuronic (M block) and α-L-guluronic (G block) acids [21] that is commonly used for microbial cell encapsulation [22, 23]. Alginate can be rapidly crosslinked into a stable hydrogel through interactions with divalent ions such as Ca2+ and Ba2+.
Alginate encapsulation offers several advantages with regard to the stability of cyanobacteria/heterotroph co-cultures. First, encapsulation greatly slows growth and division of cyanobacteria without compromising cellular viability: this allows sucrose production to remain constant over time and prevents cyanobacterial overgrowth, which can impair co-cultured heterotrophs due to hyperoxia and/or competition for trace elements [11, 12]. By drastically reducing cellular growth and division, encapsulation also eliminates the possibility that the cyanobacterial population will lose its capacity to secrete sucrose over time by mutation and counter-selection against the cscB gene. S. elongatus expressing cscB were also shown to have a higher specific productivity on a per-cell basis when encapsulated, perhaps due to a redirection of carbon away from cell growth and toward sucrose secretion [18, 24]. Finally, alginate beads are easy to handle and permit rapid separation of heterotrophic biomass (grown in the liquid phase surrounding the beads) for recovery of bioproducts. Although a number of previously-reported synthetic consortia do
not make use of alginate encapsulation [2, 12, 25–27], it is likely that this strategy can provide benefits to many different artificial consortia that are composed of an autotrophic species in co-culture with a heterotroph.

In this chapter, we describe a method for establishing cyanobacteria/heterotroph co-cultures and a simple encapsulation procedure that does not require specialized equipment. The protocol we describe was developed for a strain of the cyanobacterium S. elongatus that expresses sucrose permease, CscB; however, this protocol could be used with a variety of different microalgal or cyanobacterial strains to establish stable, light-driven co-cultures.
2 Materials

Prepare all solutions using ultrapure filtered water. A Milli-Q water purification system or similar filtration device will suffice. Extra care should be taken during preparation and manipulation of all buffers and media to maintain sterility. It is therefore recommended that a biological safety cabinet be utilized for all steps involving the use of previously sterilized solutions (see Note 1). All glassware vessels for preparing solutions and storing reagents should be thoroughly cleaned and autoclaved prior to use. All media and chemical reagents can be stored at room temperature, unless otherwise indicated.
2.1 Cyanobacterial Strain
We routinely use an engineered strain of Synechococcus elongatus PCC 7942 that has been modified to express the sugar transporter, sucrose permease (cscB), under an IPTG-inducible promoter [1]. While sucrose does not accumulate appreciably in S. elongatus when cultivated under routine conditions, supplementing growth media with a low concentration of salts (e.g., 100–200 mM NaCl) will provide sufficient osmotic pressure to activate the salt-response pathway and upregulate sucrose synthesis enzymes [28]. Expression of sucrose transporters, including CscB, allows export of cytosolically accumulated sucrose into the surrounding medium in many cyanobacterial strains [1, 3, 5, 29]. For brevity, we will hereafter refer to the strain of S. elongatus PCC 7942 that expresses cscB under an IPTG-inducible promoter as “S. elongatus CscB+” [1]. As discussed above, several alternative cyanobacterial strains that secrete photosynthate could be substituted in this protocol with appropriate alterations.

We routinely cultivate cyanobacterial strains in an Infors Multitron Photobiological Incubator (32 °C, 150 rpm shaking) with constant illumination (80–100 μmol photons/m²/s of photosynthetically active radiation). Many other incubators suitable for cyanobacterial cultivation are commercially available.
2.2 Heterotrophic Strain
A variety of distinct heterotrophic microbes have been successfully co-cultivated with S. elongatus CscB+. For the sake of simplicity in this protocol, we will confine the description to the steps that would be used to form a co-culture with E. coli W strain [30]. The W (“Waksman’s”) strain of E. coli (ATCC 9637) is an isolate that is routinely used in many laboratories, which encodes a sucrose utilization operon (cscABR). See Note 2 for features to take into account when considering co-culture applications with other heterotrophic species.
2.3 Cell Culture Media and Prepared Solutions
All culture media and stock solutions can be stored at room temperature. Aliquots of stock solutions of antibiotics and IPTG should be stored at −20 °C.
2.3.1 Preparation of Cell Growth Media
1. BG-11 medium: Many model cyanobacterial strains are routinely cultivated in BG-11 medium, including S. elongatus. BG-11 can be prepared using appropriate molecular-grade chemical reagents (see Table 1 and Note 3), or concentrated stocks can be purchased for convenience (e.g., 50× BG-11 Stock, Sigma-Aldrich). In order to reduce experimental variability due to changes in pH, we routinely buffer BG-11 with 1 g/L HEPES (see Table 1).
2. BG-11 plus ammonium chloride (NH4Cl) medium: Add 0.213 g ammonium chloride to 1 L of BG-11 medium (see column 3 in Table 1).
3. BG-11 minus sulfur medium or sulfur-free BG-11: All the chemical compounds listed for BG-11 medium except magnesium sulfate are added to 1 L of deionized water (see column 4 in Table 1). The medium is sterilized by autoclaving.
4. LB medium: E. coli can be cultivated in Luria Bertani (LB) medium, which is prepared according to the manufacturer's instructions and sterilized by autoclaving. LB broth and agar plates can be stored at room temperature and 4 °C, respectively.
2.3.2 Preparation of Additional Chemical Stocks
1. 1 M IPTG: A 1 M stock solution of isopropyl β-D-1-thiogalactopyranoside (IPTG) is prepared by dissolving 238.31 mg of IPTG powder in ultrapure water to a final volume of 1 mL and is filter sterilized (0.22 μm pore size).
2. 5 M sodium chloride: A 5 M stock solution of sodium chloride (NaCl) is prepared by dissolving 14.61 g of NaCl in 50 mL deionized water and is filter sterilized.
3. 10% sucrose: A 10% stock solution of sucrose is prepared by dissolving 10 g in 100 mL deionized water and is filter sterilized.
Table 1 Media composition for cell growth [18]

Chemical compound          BG-11     BG-11 plus ammonium chloride   BG-11 minus sulfur
                           (g/L)     (g/L)                          (g/L)
NaNO3                      1.5       1.5                            1.5
K2HPO4                     0.04      0.04                           0.04
MgSO4 · 7H2O               0.05      0.05                           –
CaCl2 · 2H2O               0.036     0.036                          0.036
Citric acid                0.006     0.006                          0.006
Ferric ammonium citrate    0.006     0.006                          0.006
EDTA (disodium salt)       0.001     0.001                          0.001
Na2CO3                     0.02      0.02                           0.02
Trace metal composition
H3BO3                      2.86      2.86                           2.86
MnCl2 · 4H2O               1.81      1.81                           1.81
ZnSO4 · 7H2O               0.222     0.222                          0.222
Na2MoO4 · 2H2O             0.39      0.39                           0.39
CuSO4 · 5H2O               0.079     0.079                          0.079
Co(NO3)2 · 6H2O            0.0494    0.0494                         0.0494
Additional
HEPES                      1.0       1.0                            1.0
Ammonium chloride          –         0.213                          –
pH 8.3 titration agent     NaOH      NaOH                           NaOH
4. 1 M ammonium chloride: A stock solution of ammonium chloride (NH4Cl) is prepared by dissolving 2.67 g in 50 mL deionized water and can be sterilized in an autoclave.
5. Antibiotic: A 25 mg/mL stock solution of chloramphenicol (Cm) is prepared in ultrapure water and is filter sterilized.

2.3.3 Preparation of Sodium Alginate
Due to the time required to prepare and sterilize solutions (especially 3% sodium alginate), it is recommended that these solutions be prepared in advance of the day that cells are to be encapsulated.
1. Sodium alginate solution (3%): Add 3 g of sodium alginate (bioreagent grade) to 100 mL distilled water in a small, autoclavable bottle and mix with a magnetic stirrer (see Note 4).
2. Degas the solution using a vacuum pump for about 10 min (see Note 5).
3. Sterilize by autoclaving for 45 min at 121 °C and 15 psi. Leave the magnetic stir bar in the alginate solution during sterilization.

2.3.4 Preparation of Barium Chloride Curing Solution
1. Prepare at least 2 L of 20 mM barium chloride solution for each 100 mL of alginate beads that will be prepared (4.16 g BaCl2 per liter dH2O).
2. Dispense the BaCl2 solution into 500 mL aliquots in autoclavable bottles. Add a magnetic stir bar to each bottle, then sterilize by autoclaving for 45 min at 121 °C and 15 psi (see Note 6).
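The molar stock recipes in Subheadings 2.3.2 and 2.3.4 all follow from mass = molar mass × molarity × volume. The sketch below recomputes them as a sanity check. The molar masses are standard values supplied here as assumptions (BaCl2 is taken as the anhydrous salt), and the function name `mass_needed_g` is hypothetical.

```python
# Sanity check for the molar stock recipes in Subheadings 2.3.2 and 2.3.4.
# Molar masses (g/mol) are standard values assumed for this sketch;
# BaCl2 is assumed to be the anhydrous salt.
MOLAR_MASS = {
    "IPTG": 238.31,
    "NaCl": 58.44,
    "NH4Cl": 53.49,
    "BaCl2": 208.23,
}


def mass_needed_g(compound: str, molarity: float, volume_ml: float) -> float:
    """Grams of solute for `molarity` (mol/L) in a final volume of `volume_ml` (mL)."""
    return MOLAR_MASS[compound] * molarity * volume_ml / 1000.0


# (compound, molarity in mol/L, final volume in mL) as listed in the chapter
recipes = [
    ("IPTG", 1.0, 1.0),
    ("NaCl", 5.0, 50.0),
    ("NH4Cl", 1.0, 50.0),
    ("BaCl2", 0.020, 1000.0),
]

for compound, molarity, volume_ml in recipes:
    grams = mass_needed_g(compound, molarity, volume_ml)
    print(f"{molarity:g} M {compound} in {volume_ml:g} mL: {grams:.3f} g")
```

Under these assumptions the calculation gives roughly 0.24 g (238 mg) of IPTG for 1 mL, 14.6 g of NaCl for 50 mL, 2.7 g of NH4Cl for 50 mL, and 4.2 g of BaCl2 per liter. If a hydrated barium salt is used, the molar mass entry would need to be changed accordingly.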
3 Methods
3.1 Preparation of Cyanobacterial Culture
1. S. elongatus CscB+ cultures are routinely maintained in 100 mL of BG-11 with 25 μg/mL chloramphenicol in 250-mL baffled flasks. These flasks are grown under constant light (80–100 μmol photons/m²/s) at 32 °C, with 2% CO2 supplemented into the incubator headspace and with 150 rpm rotary shaking (see Note 7).
2. To obtain sufficient cells for the encapsulation step, it is necessary to expand the cyanobacterial culture to a sufficient density and volume. To reach the desired concentration of encapsulated cells, pre-calculate the starting culture volume using the following formula: (Desired OD) × (Volume of alginate solution) = (Stock culture OD) × (Volume of stock culture). We routinely encapsulate a minimum of 100 mL volume of S. elongatus CscB+ in alginate at a cell density of ~3.0 OD750.
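The formula in step 2 can be rearranged to give the volume of stock culture to harvest. The following is a minimal sketch of that rearrangement; the function name and the example stock OD750 of 1.5 are illustrative assumptions, not values from the protocol.

```python
def stock_culture_volume_ml(desired_od: float,
                            alginate_volume_ml: float,
                            stock_od: float) -> float:
    """Solve (desired OD) x (alginate volume) = (stock OD) x (stock volume)
    for the stock culture volume, in mL."""
    if stock_od <= 0:
        raise ValueError("stock culture OD must be positive")
    return desired_od * alginate_volume_ml / stock_od


# Example: 100 mL of alginate at OD750 ~3.0, harvested from a stock
# culture measured at OD750 = 1.5 (hypothetical reading).
volume = stock_culture_volume_ml(desired_od=3.0,
                                 alginate_volume_ml=100.0,
                                 stock_od=1.5)
print(f"Harvest {volume:.0f} mL of stock culture")
```

With these example numbers, 200 mL of stock culture would be needed, which is more than one routine 100 mL flask provides; this is why the culture must be expanded in advance.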
3.2 Preparation of Heterotrophic Culture
In order to allow heterotrophic strains to grow to sufficient density and to promote acclimation to medium conditions prior to transfer, axenic cultures should first be inoculated ~3 days in advance of the anticipated time that the heterotroph will be introduced into co-culture with the cyanobacterium.

1. Pick a single E. coli W colony from an LB agar plate and inoculate a small volume (5–30 mL) of LB broth (broth can be supplemented with antibiotics to maintain an axenic culture, when appropriate resistance genes are present in the background).
2. Grow the starter culture overnight at 32 °C.
3. Back-dilute overnight cultures ~1:100 into fresh BG-11 + 2% sucrose + 4 mM NH4Cl.
4. Grow the culture overnight at 32 °C.
5. Back-dilute overnight cultures ~1:100 into fresh BG-11 + 0.2% sucrose + 4 mM NH4Cl. This step acts to acclimate the heterotrophic cells to a low-carbon medium.
6. Grow the culture overnight at 32 °C.
7. Pellet the E. coli and resuspend in BG-11 lacking any supplementary carbon at a cell density appropriate for inoculation with the beads (see below).

3.3 Encapsulation of S. elongatus CscB+
3.3.1 Preparation of Cyanobacterial Cells

1. Measure and record the optical density (OD750; see Note 8) of the S. elongatus CscB+ culture that is to be encapsulated.
2. Centrifuge an appropriate volume (see Subheading 3.1, step 2) of S. elongatus cells grown in BG-11 medium at ~3000 × g for 20 min.
3. Carefully decant the supernatant and discard.
4. Wash the cyanobacterial cell pellet with ~5 mL of sulfur-free BG-11 to remove any residual sulfate or secreted bioproducts in the supernatant that may interfere with the formation of a robust alginate hydrogel. Discard the supernatant.
5. Vigorously resuspend the cell pellet in sulfur-free BG-11 (see column 4 in Table 1) and vortex to break up any small clumps of pelleted cells. The final resuspended volume should be 1/10th of the desired volume of encapsulated cells (the OD750 of the concentrate should therefore be 10 times the desired final value).
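As a rough illustration of step 5, the sketch below computes the resuspension volume and the nominal OD750 of the concentrate, and then the density that results if that concentrate is added on top of a full volume of alginate solution rather than making up the final volume. The function names are hypothetical, and the 100 mL alginate volume is an example taken from the amount prepared in Subheading 2.3.3.

```python
def concentrate_plan(target_od: float, target_volume_ml: float):
    """Return (resuspension volume in mL, nominal OD of the concentrate)
    for a 1/10th-volume, 10x-concentrated cell suspension."""
    return target_volume_ml / 10.0, target_od * 10.0


def od_after_mixing(concentrate_od: float,
                    concentrate_ml: float,
                    alginate_ml: float) -> float:
    """OD of the mixture when the concentrate is added to the alginate
    solution, assuming OD scales linearly with dilution (an approximation
    at high densities)."""
    return concentrate_od * concentrate_ml / (concentrate_ml + alginate_ml)


resuspend_ml, conc_od = concentrate_plan(target_od=3.0, target_volume_ml=100.0)
mixed_od = od_after_mixing(conc_od, resuspend_ml, alginate_ml=100.0)
print(f"Resuspend in {resuspend_ml:.0f} mL at OD750 ~{conc_od:.0f}")
print(f"OD750 after adding to 100 mL alginate: ~{mixed_od:.2f}")
```

In this example the mixture ends up slightly below the nominal target (about 2.7 rather than 3.0) because the concentrate adds to the total volume. Whether that difference matters depends on the application; it can be offset by harvesting proportionally more cells.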
3.3.2 Encapsulation of Cyanobacterial Cells
1. Using sterile technique, transfer resuspended cells from step 5 of Subheading 3.3.1 to the top of the previously prepared and sterilized 3% sodium alginate solution. Using a stir plate and magnetic stir bar, mix the sodium alginate solution with the concentrated cyanobacteria. Minimize the introduction of air bubbles into the alginate solution during this step to avoid inconsistencies during later steps with the syringe pump.
2. In a laminar flow hood, remove the syringe plunger from a 30 mL syringe and affix a shielded 30 G needle to the end of the syringe (see Note 9).
3. Pour 25 mL of cyanobacteria/alginate suspension into the syringe.
4. Re-insert the syringe plunger, then loosen the shielded needle to reduce internal pressure. Hold the syringe vertically with the needle pointing up. Slowly push the trapped air from the syringe using the plunger. When the air has been removed, tighten the shielded needle to again establish a seal.
[Figure 1: schematic of the encapsulation workflow. Panel labels: centrifuge S. elongatus at 3000 × g for 20 min; resuspend the cell pellet in sulfur-free BG-11 medium; mix the resuspended cells into 3% alginate on a magnetic stirrer; fill a 30 mL syringe with 25 mL of the mixture (cells + 3% alginate); drip from the syringe pump, ~30 cm above a stirred 20 mM BaCl2 solution; incubate for 30 min on a magnetic stirrer; wash Ba-alginate beads with distilled water (3×) and then BG-11 (3×); incubate Ba-alginate beads under continuous light at 30 °C.]
Fig. 1 Outline of encapsulation protocol. S. elongatus CscB+ cells are grown to a sufficiently high density, pelleted, and resuspended in 1/10th volume of sulfur-free BG-11. Resuspended cells are mixed completely into a prepared solution of 3% sodium alginate, then poured into a syringe. The syringe is loaded onto an infusion syringe pump, and the solution is dripped into a continuously stirred solution of 20 mM BaCl2. Ba-alginate beads form in this solution and are cured for 30 min before multiple rinse steps
Fig. 2 Representative alginate beads. (a) Ba-alginate beads containing S. elongatus CscB+ cells immediately after encapsulation at a density of ~3.0 OD750, or ~1 × 10⁹ cells/mL. (b) Ba-alginate beads encapsulated at the same density, but incubated in a constant light photobioreactor as indicated for 2 weeks after encapsulation. (c) Closeup of Ba-alginate beads as in (a). (d) Brightfield light microscopy of a cross section of beads shown in (a). (e) Fluorescence microscopy of Ba-alginate bead cross-section shown in (d). S. elongatus cells can be clearly visualized by their chlorophyll a autofluorescence. Scale bar = 50 μm
5. Insert the filled syringe into a vertically positioned syringe pump with the needle pointing downwards. See Fig. 1 for a schematic of the setup for encapsulation.
6. Place a sealable bottle containing 500 mL (20× volume) of 20 mM BaCl2 and a magnetic stir bar on a stir plate below the syringe needle. Approximately 30 cm of distance should be maintained between the tip of the needle and the surface of the BaCl2 solution.
7. Set appropriate parameters in a high-pressure syringe infusion pump (see Note 10).
8. Start the syringe pump and allow the mixture to be released dropwise through the 30 G needle into the 500 mL volume of 20 mM BaCl2 at an ejection rate of ~1.5 mL/min (see Note 10). The BaCl2 solution should be constantly stirred using the stir plate and the included magnetic stir bar. Drops of soluble alginate should immediately begin to gel upon striking the surface of the BaCl2 solution and should retain a spherical shape (see Fig. 2).
9. When all liquid has been dispensed from the syringe, continue to incubate the alginate beads in BaCl2 solution for at least 30 min while maintaining stirring. Stirring should be sufficiently vigorous to keep all of the beads suspended for the duration of this curing process.
10. Decant the BaCl2 solution into a separate large beaker (e.g., 2 L), label this beaker "Ba2+ waste," and set aside. Due to the toxicity of barium, it is necessary to properly treat this solution prior to disposal (see Note 11).
11. Wash the beads with sterilized distilled water (3 times) and BG-11 medium (3 times) to remove excess soluble BaCl2 around the beads.
12. Incubate the beads with shaking at 150 rpm under continuous light of ~80 μmol photons/m²/s PAR with 2% CO2 at 32 °C.

3.4 Maturation of Beads
1. Incubate cured alginate beads (see Subheading 3.3.2, step 12) under the desired growth conditions (i.e., light, CO2, temperature, shaking, and medium conditions) for at least 1 week prior to beginning evaluation of a co-culture. This allows the encapsulated cyanobacteria to acclimate to their environment within the hydrogel. Although alginate beads may become visibly more green during the first days after encapsulation (Fig. 2b), it is likely this is reflective of a higher density of chlorophyll rather than cell growth/division within the barium-alginate hydrogel—see reference [18] for additional evidence of this claim.
2. During this time, replace the medium surrounding the alginate beads with fresh medium at least every other day (see Note 12).
3. To induce increased sucrose export, add IPTG (1 mM) and NaCl (100–200 mM) to the growth medium when induction is required.
4. Heterotrophic microbes (prepared as described above) can be inoculated into the solution surrounding the matured Ba-alginate beads. We routinely seed E. coli W into co-cultures at an OD600 of 0.05 or 0.1.
5. Inoculated co-cultures can be returned to the photobioreactor and assayed at desired time points for growth of the heterotrophic strain (see below), and for other biomarkers or bioproduction (protocols dependent upon specific application).

3.5 Evaluation of Cell Concentration of Heterotrophs in Consortium
The cell concentration of E. coli in the consortium can be determined either by flow cytometry or by serial dilution and plating of cells for a viable colony forming unit (CFU) count. The CFU counting approach can be used without any special equipment and is described briefly below.

1. Prepare agar plates that are permissive for the heterotroph's growth (e.g., LB) by slightly desiccating the agar pad in order to promote absorption of liquid droplets. Agar plates can be dried by removing the lid in a sterile environment for ~30 min in air.
2. A serial dilution of co-culture supernatant can be prepared in 96-well deep-well plates (1.2 mL well volume). For example, a 1:10 dilution of co-culture supernatant can be prepared in the first row of the plate by adding 100 μL of co-culture supernatant to 900 μL diluent (e.g., BG-11 or H2O). Multi-channel pipettes can be used to conduct several serial dilutions in parallel, using sterilized pipette tips for each transfer (see Fig. 3).
3. For plating, use a multi-channel pipette to transfer a 10 μL aliquot of each dilution onto the plate. Keep the lid of the plate open in a laminar flow hood until all of the diluent has soaked into the plate or evaporated.
4. Incubate the inverted plate at 32 °C overnight.
5. In the case of E. coli, colonies appear within 24 h. Count the number of colonies, and calculate the original titer based on the total volume plated and the dilution.
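The titer calculation in step 5 can be sketched as follows, assuming the 1:10 dilution steps (100 μL into 900 μL) and 10 μL spots described above. The function names and the example colony count are hypothetical.

```python
def fold_dilution(n_steps: int,
                  transfer_ul: float = 100.0,
                  diluent_ul: float = 900.0) -> float:
    """Total fold dilution after n_steps serial transfers of transfer_ul
    into diluent_ul (defaults give a 1:10 dilution per step)."""
    per_step = (transfer_ul + diluent_ul) / transfer_ul
    return per_step ** n_steps


def cfu_per_ml(colony_count: int,
               total_fold_dilution: float,
               plated_volume_ul: float = 10.0) -> float:
    """Original titer (CFU/mL) from colonies counted in one plated spot."""
    plated_volume_ml = plated_volume_ul / 1000.0
    return colony_count * total_fold_dilution / plated_volume_ml


# Example: 23 colonies counted in the 10 uL spot from the fourth 1:10 dilution
dilution = fold_dilution(4)
titer = cfu_per_ml(23, dilution)
print(f"Dilution: {dilution:g}-fold, titer: {titer:.2e} CFU/mL")
```

Because only 10 μL is plated per spot, a single colony in the undiluted sample corresponds to 100 CFU/mL, which sets the approximate lower limit of detection for this assay format.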
Fig. 3 Outline of rapid assay for CFU-based estimation of heterotrophic titer in co-culture. A dilution gradient can be rapidly prepared in a 96-well plate format. Transfer of 5–10 μL of each dilution in the series to a rich media plate can be used to assay heterotrophic titer over a large dynamic range. Use of a multi-channel pipette increases the throughput of this approach

4 Notes

1. The largest technical barrier to establishing reproducible microbial co-cultures is the maintenance of axenic single-species cultures and the prevention of contamination of established co-cultures with foreign microbes. Use of appropriate aseptic techniques for all stages of the encapsulation process and during all routine sub-culturing is essential.
2. The primary feature to take into consideration when attempting a co-culture of S. elongatus CscB+ with a new heterotroph is whether the strain is capable of growth when utilizing sucrose as a sole carbon source. While additional photosynthates are secreted by S. elongatus, the dominant carbon source will be sucrose, and efficient utilization of this carbohydrate is paramount to the overall co-culture stability and productivity. Results of co-culture may be improved in genetic backgrounds that are more efficient at utilizing sucrose at low concentrations. For example, a knockout of the sucrose catabolism repressor, cscR, has been shown to improve sucrose utilization in E. coli W [31], and therefore, we routinely use this genetic background for S. elongatus CscB+/E. coli co-cultures. Additional features that will improve the heterotrophic performance include tolerance to hyperoxia, alkaline media (pH > 8), and mild osmotic pressure (>150 mOsm). If there are additional nutrient requirements for the heterotroph, these can be added to the base BG-11 medium (e.g., it is necessary to add 4 mM ammonium to BG-11 when growing E. coli W in BG-11 with supplemented sucrose).
3. Autoclaving BG-11 medium will often cause the formation of a small amount of iron precipitates. These precipitates are not
toxic to S. elongatus, but can reduce the bioavailability of iron to the culture. Care should be taken to resuspend any precipitates by mixing prior to using the medium. Alternatively, BG-11 can be sterilized by filtration, or ferric ammonium citrate can be added as a separate stock solution (1000×) after autoclaving to minimize the formation of precipitates.
4. Preparation of a homogeneous solution of sodium alginate requires a slow dispersion of the desiccated powder into deionized water (dH2O). When sodium alginate is added rapidly, it will form large clumps in solution, and these aggregates can take more than an hour to dissolve during rapid mixing on a magnetic stir plate. Clumping of alginate can be minimized by a very slow addition of the powder while constantly stirring or by increasing the temperature of the solution (60 °C); however, this is time-intensive, and the more efficacious approach may be to simply budget for the extended time required on a stir plate.
5. Due to the vigorous stirring required to dissolve sodium alginate, many gas bubbles are typically introduced into the viscous solution. Prior to autoclaving, it is necessary to remove the trapped air by applying a vacuum. Gas bubbles in the 3% sodium alginate will expand significantly under a vacuum, and it is necessary to monitor the solution closely to avoid the possibility of the bubbles expanding to the point where they spill over the top of the container and/or enter the vacuum line. By initially applying a vacuum in several pulses, the larger air bubbles can be removed and the risk of the solution “boiling” over will be greatly diminished. To facilitate pulsed degassing, we routinely apply a vacuum to the top of the bottle using a rubber stopper with a bore hole through it. The solution should be degassed until it bubbles much less vigorously, with only 1–2 bubbles appearing in solution at a time (~10 min).
6. Calcium is routinely used as a divalent cation to crosslink sodium alginate into a hydrogel. In our experience, the use of barium creates a more stable hydrogel and minimizes the number of cyanobacterial cells that “escape” from the alginate bead under subsequent co-culture conditions. However, soluble barium is toxic and must be precipitated and disposed of appropriately (see Note 11).
7. To minimize variation between experiments, it is important to ensure that the S. elongatus cultures are in logarithmic growth. Therefore, we back-dilute all cultures to an OD750 of 0.3 each day for at least 3 days prior to initiating an experimental protocol. This ensures that the cells within a given culture are actively dividing, standardizes the “growth history” of the culture, and
increases the reproducibility between experiments performed on different days and/or with distinct biological replicates.
8. Because cyanobacteria have pigments that absorb in the visible spectrum (400–700 nm), the optical density of these cultures should be evaluated outside of this range (e.g., at 750 nm), rather than at the standard wavelength for non-pigmented cells (e.g., 600 nm).
9. Before starting the syringe pump, make sure that the shielded 30 G needle is firmly seated on the syringe by twisting clockwise, and pull the shield straight off the needle to avoid damage to the needle point. In the bead preparation, the size of the beads can be regulated by changing either the flow rate of the infusion syringe pump or the gauge size of the needle. Usually, with a 30 G needle, an alginate bead diameter of ~1 mm is achieved. The syringe pump may need to be monitored, as the needle can clog, increasing the backpressure and causing torsion of the syringe plunger. If evidence of a clogged needle arises, apply the withdrawal function of the syringe pump to relieve pressure on the contents of the syringe. Thereafter, carefully replace the clogged needle with a new one and resume the encapsulation process as outlined.
10. While setting the syringe volume in the pump, it may be necessary to specify an infusion volume slightly greater than the real volume in order to fully expel the alginate solution from the syringe. This adjustment may be necessary because of the high viscosity of the solution and the resulting torsion of the syringe plunger. Rates higher than 1.5 mL/min may lead to increased back-pressure, torsion on the syringe, and expulsion of alginate suspension in a manner that does not allow for spherical beads to form. Without more specialized equipment, it is not advisable to use a higher flow rate.
11. Medium containing soluble barium is toxic and should not be disposed of without first precipitating the metal ions. Soluble sulfate will readily react with free barium ions and cause them to precipitate in the form of a fine white aggregate. A stock solution of 2 M Na2SO4 should be prepared as a precipitation reagent. A volume of 2 M Na2SO4 equal to at least 1/100th of the volume of spent BaCl2 curing solution should be added to that solution. A cloudy white precipitate should immediately form, indicative of soluble Ba2+ reacting with sulfate. After allowing the precipitate to settle (~30 min undisturbed), additional 2 M Na2SO4 can be added dropwise to the suspension: the visible formation of additional precipitate is indicative of incompletely cleared Ba2+. Once no additional precipitate forms when adding Na2SO4, the solution can be allowed to settle completely. Clear supernatant can be safely disposed of as with other non-toxic liquids, while the barium precipitate should be disposed of as part of
biohazardous waste, in keeping with established institutional policies.
12. It is likely that a white precipitate will form in the supernatant medium throughout the first 48–96 h after the alginate beads have been cured. This is a result of precipitation of barium ions that are released from the hydrogel and which interact with sulfate in the medium. Increasing the number of times that fresh medium is added to the culture will decrease the length of time before this precipitate ceases to form.
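The 1/100th-volume guideline in Note 11 can be checked with a short stoichiometric estimate. The sketch below assumes a 1:1 reaction of Ba2+ with sulfate and assumes that none of the barium was retained by the beads, so it gives an upper bound on the sulfate required. The function name and the optional `excess` factor are illustrative assumptions.

```python
def na2so4_volume_ml(bacl2_mm: float,
                     waste_volume_ml: float,
                     na2so4_m: float = 2.0,
                     excess: float = 1.0) -> float:
    """Volume (mL) of Na2SO4 stock needed to match the Ba2+ in the waste.

    Assumes 1:1 Ba2+:sulfate stoichiometry and that all of the barium
    originally added is still in solution (an upper bound).
    """
    ba_mmol = bacl2_mm * waste_volume_ml / 1000.0   # mM x L = mmol
    # A stock of na2so4_m mol/L contains na2so4_m mmol per mL.
    return excess * ba_mmol / na2so4_m


# One 500 mL aliquot of 20 mM BaCl2 curing solution
stoichiometric = na2so4_volume_ml(20.0, 500.0)
print(f"Stoichiometric 2 M Na2SO4: {stoichiometric:.1f} mL "
      f"(1/{500.0 / stoichiometric:.0f} of the waste volume)")
```

For 500 mL of 20 mM BaCl2 this gives 5 mL of 2 M Na2SO4, that is, exactly 1/100th of the waste volume. The guideline therefore corresponds to the stoichiometric minimum, which is consistent with the instruction to keep adding Na2SO4 dropwise until no further precipitate forms.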
Acknowledgments This work was supported by the Department of Energy (Grant: DE-FG02-91ER20021), as well as by NSF Grant CBET #1437657. References 1. Ducat DC, Avelar-Rivas JA, Way JC et al (2012) Rerouting carbon flux to enhance photosynthetic productivity. Appl Environ Microbiol 78:2660–2668 2. Hays SG, Ducat DC (2015) Engineering cyanobacteria as photosynthetic feedstock factories. Photosynth Res 123:285–295 3. Du W, Liang F, Duan Y et al (2013) Exploring the photosynthetic production capacity of sucrose by cyanobacteria. Metab Eng 19:17–25 4. Niederholtmeyer H, Wolfst€adter BT, Savage DF et al (2010) Engineering cyanobacteria to synthesize and export hydrophilic products. Appl Environ Microbiol 76:3462–3466 5. Song K, Tan X, Liang Y et al (2016) The potential of Synechococcus elongatus UTEX 2973 for sugar feedstock production. Appl Microbiol Biotechnol 100:7865–7875 6. Tan X, Du W, Lu X (2015) Photosynthetic and extracellular production of glucosylglycerol by genetically engineered and gel-encapsulated cyanobacteria. Appl Microbiol Biotechnol 99:2147–2154 7. Zhao C, Li Z, Li T et al (2015) High-yield production of extracellular type-I cellulose by the cyanobacterium Synechococcus sp. PCC 7002. Cell Discov 1:15004 8. Jacobsen JH, Frigaard NU (2014) Engineering of photosynthetic mannitol biosynthesis from CO2in a cyanobacterium. Metab Eng 21:60 9. Savakis P, Tan X, Du W et al (2015) Photosynthetic production of glycerol by a recombinant cyanobacterium. J Biotechnol 195:46
10. van der Woude AD, Perez Gallego R, Vreugdenhil A et al (2016) Genetic engineering of Synechocystis PCC6803 for the photoautotrophic production of the sweetener erythritol. Microb Cell Factories 15:60 11. Li T, Li C-T, Butler K et al (2017) Mimicking lichens: incorporation of yeast strains together with sucrose-secreting cyanobacteria improves survival, growth, ROS removal, and lipid production in a stable mutualistic co-culture production platform. Biotechnol Biofuels 10:55 12. Hays SG, Yan LLW, Silver PA et al (2017) Synthetic photosynthetic consortia define interactions leading to robustness and photoproduction. J Biol Eng 11:4 13. Smith MJ, Francis MB (2017) Improving metabolite production in microbial co-cultures using a spatially constrained hydrogel. Biotechnol Bioeng 114:1195–1200 14. Smith MJ, Francis MB (2016) A designed A. vinelandii-S. elongatus coculture for chemical photoproduction from air, water, phosphate, and trace metals. ACS Synth Biol 5:955–961 15. Lo¨we H, Hobmeier K, Moos M et al (2017) Photoautotrophic production of polyhydroxyalkanoates in a synthetic mixed culture of Synechococcus elongatus cscB and Pseudomonas putida cscAB. Biotechnol Biofuels 10:190 16. Ruiz-Gu¨ereca DA, Sa´nchez-Saavedra MP (2015) Growth and phosphorus removal by Synechococcus elongatus co-immobilized in alginate beads with Azospirillum brasilense. J Appl Phycol 28:1501–1507
17. Xue C, Wang L, Wu T et al (2017) Characterization of co-cultivation of cyanobacteria on growth, productions of polysaccharides and extracellular proteins, nitrogenase activity, and photosynthetic activity. Appl Biochem Biotechnol 181:340
18. Weiss TL, Young EJ, Ducat DC (2017) A synthetic, light-driven consortium of cyanobacteria and heterotrophic bacteria enables stable polyhydroxybutyrate production. Metab Eng 44:236–245
19. Fedeson DT, Saake P, Calero P et al (2018) Biotransformation of 2,4-dinitrotoluene in a phototrophic co-culture of engineered Synechococcus elongatus and Pseudomonas putida. Microb Biotechnol 13:997
20. Abed RMM, Köster J (2005) The direct role of aerobic heterotrophic bacteria associated with cyanobacteria in the degradation of oil compounds. Int Biodeterior Biodegrad 55:29
21. Rowley JA, Madlambayan G, Mooney DJ (1999) Alginate hydrogels as synthetic extracellular matrix materials. Biomaterials 20:45
22. Gasperini L, Mano JF, Reis RL (2014) Natural polymers for the microencapsulation of cells. J R Soc Interface 11:20140817
23. Lee KYY, Mooney DJJ, Manuscript A et al (2013) Alginate: properties and biomedical applications. Prog Polym Sci 37:106
24. Abramson B, Lensmire J, Yang-Tsung L et al (2018) Redirecting carbon to bioproduction via a growth arrest switch in a sucrose-secreting cyanobacterium. Algal Res 33:248–255
25. Junicke H, Feldman H, Van Loosdrecht MCM et al (2015) Limitation of syntrophic coculture growth by the acetogen. Biotechnol Bioeng 113:560–567
26. Kosina SM, Danielewicz MA, Mohammed M et al (2016) Exometabolomics assisted design and validation of synthetic obligate mutualism. ACS Synth Biol 5:569
27. Pande S, Shitut S, Freund L et al (2015) Metabolic cross-feeding via intercellular nanotubes among bacteria. Nat Commun 6:6238
28. Klahn S, Hagemann M (2011) Compatible solute biosynthesis in cyanobacteria. Environ Microbiol 13:551–562
29. Xuan YH, Hu YB, Chen L-Q et al (2013) Functional role of oligomerization for bacterial and plant SWEET sugar transporter family. Proc Natl Acad Sci 110:E3685–E3694
30. Archer CT, Kim JF, Jeong H et al (2011) The genome sequence of E. coli W (ATCC 9637): comparative genome analysis and an improved genome-scale reconstruction of E. coli. BMC Genomics 12:9
31. Arifin Y, Sabri S, Sugiarto H et al (2010) Deletion of cscR in Escherichia coli W improves growth and poly-3-hydroxybutyrate (PHB) production from sucrose in fed batch culture. J Biotechnol 156:275–278
INDEX A Acrylamide, 173, 174, 178 Agarose gel electrophoresis (AGE), 86 Agrobacterium-mediated enhanced seedling transformation (AGROBEST), see AGROBEST Agrobacterium tumefaciens, 8, 10, 20, 21, 38, 83, 92, 111–113, 172, 175, 254, 258 AGROBEST Agrobacterium culture for infection, 115, 117 Agrobacterium tumefaciens strain and vector, 113 Arabidopsis seedlings for transformation, 115, 117 Arabidopsis seeds, 113–115, 117, 118 co-cultivation and infection, 115, 117–118 flowchart, 113 GUS staining of transient expression of gusA-intron gene, 112–114 GUS staining stock solution, 115–116, 118 MAMP or PAMP, 112 medium for coculture of Arabidopsis seedlings and Agrobacterium, 114 T-DNA binary vector by electroporation, 115–117 Agroinfiltration, 11, 112, 113 Aleatoric uncertainty, 240 Alginate encapsulation, 276 Anhydrotetracycline (aTc), 77 Anisotropic fluorophore, 198 Anisotropy, 198, 200 Arabidopsis protoplasts, 97–108 seedlings, 113–115, 117, 118 Arabidopsis thaliana, 2, 101, 263 cDNA synthesis, 267, 271 CONSTANS (CO) protein, 266 harvesting and processing, 266, 268 homogenization of trichomes, 267, 270–271 leaves and PBS mixture, 269 leaves and trichomes, 264 lt-degron system, 266 N-degron pathway, 265 PBS solution, 269 plant cultivation, 266, 268 RNA cleanup, 267, 271 RNA isolation with TRIsure, 267, 271 trichome harvest and isolation, 267, 268, 270
using forceps, 264 AssemblX web tool, 80–85, 91 Aux/IAA-based biosensors equipment, 189–190 hormone induction, 188, 191 luminescence determination, 188, 191–192 plant material preparation, 190 protoplast isolation, 187–188, 190–191 protoplast transformation, 191 reagents, consumables, and kits, 186–187 reagent setup, 188–189 statistical analysis, 192 Auxin perception, 185 Auxin regulator proteins, 184
B Bimolecular fluorescence complementation (BiFC), 113 BioBrick system, 68–72, 76, 77, 81 Biosensors, 183, 184 Aux/IAA-based biosensors (see Aux/IAA-based biosensors) degradation-based signaling mechanism, 184 FRET, 184 phytohormone perception mechanism and, 185 2A peptide, 184 Boltzmann constant, 199 Brownian motion, 199 BsaI, 19 recognition sites, 159, 174 restriction-ligation, 161 BsmBI, 30
C CaMV35S promoter, 28 Carbon fixation, 68 Cas9-mediated targeted mutagenesis, 1–3 CRISPR-PLANT, 19 methods mutagenesis activity, 10–14 plasmid construct assembly, 8–10 sgRNAs selection and design of plasmid construct, 5–8 stable lines, production of, 17–19 targeted mutagenesis assessment, 15–17 mutagenesis activity, 4–5
Matias D. Zurbriggen (ed.), Plant Synthetic Biology, Methods in Molecular Biology, vol. 2379, https://doi.org/10.1007/978-1-0716-1791-5, © Springer Science+Business Media, LLC, part of Springer Nature 2022
Cas9-mediated targeted mutagenesis (cont.) plasmid construct assembly, 3–4 sgRNAs selection and design of plasmid construct, 3 Cauliflower mosaic virus (CaMV), 84 C58C1(pTiB6S3ΔT)H, 112–114, 118, 119 Cellular gradients, 217 Chemical master equation (CME), 221 Chemical reaction systems, 212 Chlamydomonas reinhardtii, gene editing, 45–46 algal transformation, 46–48 blue-green screen, 61 Cas9 activity, experimental design and pretest for, 48–52 cell growth, 62 contaminations, 64 crude cell extracts, storage of, 63 electroporation of algae cells, 52–55, 62–63 FLAGv3 mutation oligos, 60–61 FLAGv3 sequence, 60 gradient PCR, 50 mutants, screening for, 56–59 PCR, 63–64 PCR-based mutant screening, 48 plating, 63 primer sequences, 49 specific genomic primer, 61 tip off supernatant, 63 in vitro cleavage assay, 52, 61–62 in vitro digest, 51 Chloroplasts, 169, 170 Cloning insert OSC gene, 135–136 insert terpene synthase gene, 134–135 isoprene precursors, 133 materials, 127 multiple transcriptional units, 80–83, 86, 88, 91–93 pRhon5Hi-2-based vectors, 135 Clp protease system, 170 Coexpression, 153, 154 Co-immunoprecipitation (Co-IP), 113 Co-immunopurification, 251 Colony forming unit (CFU), 284, 285 Computer-Aided Design (CAD) software, 80, 99 Conventional restriction/ligation cloning, 76 CRISPR/Cas9, 27 Cas9 TU, 43 GB level-1 tRNA-scaffold “ts” plasmids, 31 illegal BsaI and BsmBI sites, 42 materials, 30–33 methods cloning of level 0 tRNA-protospacer-scaffold “tps” parts, 36–37 final T-DNA expression vector assembly, 37–38 gene, choice of, 33
generated plasmid for XT gene editing, functional validation, 40–42 guide RNAs design, multiplexing with goldenbraid, 34–36 level 1 polycistronic gRNA cassette assembly, 37 N. benthamiana leaves, transient expression in, 38–39 nos promoter regulation, functional characterization, 39–40 target sites, selection of, 33–34 Pnos “tps” plasmid assembly, oligonucleotides for, 32 restriction enzyme sites, 43–44 CRISPR-PLANT, 19 CRISPR RNAs, 47
D dCas9 TU, 37–38 Degradation-based fluorescent sensors, 184 Dipole orientation, 200 DNA binding domains (DBD), 98 Double-stranded break (DSB), 1, 5 Drosophila, 98, 266 Dual luciferase reporter assay, 101, 105, 106
E EF-Tu receptors (EFR), 112 Endogenous native plasmids, 68 Enhanced cyan fluorescent protein (ECFP), 84 ENLYFQ/X, 170, 171 Epistemic uncertainty, 240 Escherichia coli, 67, 68, 70, 71, 73, 76, 153, 157, 213, 276
F F-Box protein, 184, 185 Firefly Luciferase (FLuc), 38–40, 99, 100, 105, 106, 108 Floral dip method, 175 Fluorescence intensities, 197 Fluorescent proteins (FPs), 197 EGFP, 197 energy transfer, 196 FP-labeled proteins, 202 localization, 113 FokI, 1 Förster Resonance Energy Transfer (FRET) bioindirect live-cell super-resolution technique, 196 biosensors, 184 homo-FRET, 196 Fragment of interest (FOI), 174
G GAL4/UAS system, Arabidopsis protoplasts dual luciferase reporter assay, 101, 105, 106
heterodimeric transcription factor, 98 isolated protoplasts, 98 maxi-preparation of plasmid DNA, 100, 103–104 plasmid DNA purification, 99 protoplast isolation and transformation, 101, 104–105 reporter and effector plasmids, 100–103 synthetic regulatory modules, 99 transactivation assays, 107 transfected protoplast, 106, 108 GenBank format, 84 Gene editing Chlamydomonas reinhardtii, 45–46 algal transformation, 46–48 blue-green screen, 61 Cas9 activity, experimental design and pretest for, 48–52 cell growth, 62 contaminations, 64 crude cell extracts, storage of, 63 electroporation, 62–63 electroporation of algae cells, 52–55 FLAGv3 mutation oligos, 60–61 FLAGv3 sequence, 60 gradient PCR, 50 mutants, screening for, 56–59 PCR, 63–64 PCR-based mutant screening, 48 plating, 63 primer sequences, 49 specific genomic primer, 61 tip off supernatant, 63 in vitro cleavage assay, 52, 61–62 in vitro digest, 51 “Generate Sequence” tab, 84 Genes of interest (GOIs), PCR amplification of, 158–160 chemically competent cell preparation, 157 chemically competent E. coli cells, 165 cloning of, 159–161 pET-28GG plasmid, 163 polycistronic expression cassette, 164 polycistronic level vectors, 163 polymerase chain reaction, 155 positional level vectors, 161–163 protein expression and purification, 157–158 recombinant proteins, 163–165 restriction ligation, 155–157 ribosome-binding site (RBS), 162 screening of colonies, 157 transformation, 161 UbiGate toolbox, 156 Genetic engineering of cyanobacteria, see pSHDY GenoCAD, 80–86, 91
Genome editing, 2 Genome engineering, 1 G factor, 202, 205, 206 Gibberellin, 184 Gibson cloning, 252 GoldenBraid (GB) approach, 28–29 GoldenBraid system, 80 Golden Gate approach, 8, 80 Golden Gate cloning, 154, 174 Green algae Chlamydomonas reinhardtii, gene editing, 45–46 algal transformation, 46–48 blue-green screen, 61 Cas9 activity, experimental design and pretest for, 48–52 cell growth, 62 contaminations, 64 crude cell extracts, storage of, 63 electroporation, 62–63 electroporation of algae cells, 52–55 FLAGv3 mutation oligos, 60–61 FLAGv3 sequence, 60 gradient PCR, 50 mutants, screening for, 56–59 PCR, 63–64 PCR-based mutant screening, 48 plating, 63 primer sequences, 49 specific genomic primer, 61 tip off supernatant, 63 in vitro cleavage assay, 52, 61–62 in vitro digest, 51 Green fluorescent protein (GFP), 252 Guanidinium thiocyanate, 265 GUS staining stock solution, 115–116, 118 GV3101, 21
H Homo-FRET imaging anisotropy measurements, 203 fluorescence signal, 199–201 free EGFP, 204 mean fluorescence intensity, 196 photo-damage and/or acquisition bleaching, 204 PM-spanning receptor kinase CLAVATA1 (CLV1), 203 Homology-directed repair (HDR), 2, 59 Hormone induction, 185
I Intrusive spectral projection, 242
J Jasmonate, 184
L Level 0 vectors, 34, 82, 86, 87, 93 Level 1 vectors, 82, 83, 86–89
M Mach-1 T1R cells, 106 Mass spectrometry, 183 Mathematical modelling activator–inhibitor model, 216–217 bifurcation analysis, 236 Boolean models, 209, 210 chemical reaction systems, 212 enzyme kinetics/transcriptional regulation, 211 experimental data, 209 experimental design, 209 experimental errors, 209 experimental observations, 211 intra-cellular concentration gradients, 211 level of granularity, 211 model-based design, 222–223 model parameters, 209 model selection information criteria, 229–230 likelihood tests, 229 optimal experimental design, 227–228 ordinary differential equations (ODEs), 212 parameter identifiability, 225–226 partial differential equations (PDEs), 215 phase plane analysis, 231–233 PhyB–PIF interaction, 213, 214 phytochrome-PIF light signalling network, 213 reaction–diffusion models, 215, 216 sensitivity analysis, 236–240 stability analysis, 234–236 stochastic models, 219–221 thermal relaxation rate, 213 transcription factor, 208 trichome patterning model, 218 Maxi-preparation of plasmid DNA, 100, 103–104 Mendel’s laws, 17 Microalgae, 67 Microbe or pathogen associated molecular pattern (MAMP or PAMP), 112 Milli-Q water purification system, 277 mobA Y25F mutation, 76 Modular cloning (MoClo), 68, 69, 80, 154 Monte Carlo methods, 240–241 mRNA degradation, 264 Multiple transcriptional units, 80–83, 86, 88, 91–93 Multiplex CRISPR/Cas9, 27
Cas9 TU, 43 GB level-1 tRNA-scaffold “ts” plasmids, 31 illegal BsaI and BsmBI sites, 42 materials, 30–33 methods cloning of level 0 tRNA-protospacer-scaffold “tps” parts, 36–37 final T-DNA expression vector assembly, 37–38 gene, choice of, 33 generated plasmid for XT gene editing, functional validation, 40–42 guide RNAs design, multiplexing with goldenbraid, 34–36 level 1 polycistronic gRNA cassette assembly, 37 N. benthamiana leaves, transient expression in, 38–39 nos promoter regulation, functional characterization, 39–40 target sites, selection of, 33–34 Pnos “tps” plasmid assembly, oligonucleotides for, 32 restriction enzyme sites, 43–44 mVenus, 77
N N-degron pathway, 169, 170 n-Dodecane sampling, 129, 141 NEBuilder HiFi assembly (NEB), 80 NEBuilder HiFi DNA assembly method, 81, 86, 93, 94 N-end rule pathway, see N-degron pathway NeoBrick system, 69–72, 76, 77 NEPA21 electroporator, 54 NGG PAM, 6 Nicotiana benthamiana, 1, 2, 10–11, 13, 27, 28, 32, 38–41, 43, 80, 81, 91, 92, 112, 170, 200, 202–205 Non-homologous end joining (NHEJ), 2 Non-intrusive spectral projection, 242 Nopaline synthase promoter (pNOS), 84 N-recognins, 169 N-terminal amino acid, 169–171 N-termini in plastids equipment, 172 immunodetection, 174, 176–177 N-degron pathway, 169, 170 plant transformation, 172, 175 protein concentration determination, 173, 176 protein extraction, 173, 175 PRT1 and PRT6, 169 reporter construct cloning, 172, 174 reporter protein stability, 171 SDS-PAGE, 173–174, 176 specific N-terminal amino acid generation, 171 TEV protease, 170, 171, 177 transformants selection, 172–173, 175
ubiquitin, 170 western blotting, 170, 171, 174, 176, 177 NucleoSpin® Gel, 93
O Ordinary differential equations (ODEs), 212
P Partial differential equations (PDEs), 215 Pattern-recognition receptor (PRRs), 112 Pattern-triggered immunity (PTI), 112 pBISN1, 113, 114, 116, 117, 119 pCAMBIA_ASX plant X-tender expression vector, 89, 90, 92 PCR Clean-up, 76, 81, 93, 253, 257 pGREEN 0800-LUC vector, 106 Phenotype-on-demand system, 265 Phire polymerase, 63 Photinus pyralis luciferase (FLuc), 106 Photobleaching, 203 Photoselection, 197 Photosynthetic organisms, 68 Phytobricks, 28 Phytohormone analyses, 185 Phytohormones auxin, 184 pICH75055 vector, 172, 178 Plant expression vectors, 82 Plant X-tender toolkit AssemblX web tool, 81, 84, 85, 91 cloning, 81 cloning into level 0 vectors, 86, 87 level 1 vectors, 86–89 plant X-tender expression vectors, 89–91 cloning strategy, 81, 82 designing constructs and primers, 81 GenoCAD for single transcriptional unit, 83–84 testing functionality of assembled transcriptional units, 81, 91 Plasmid, 68–70, 74, 76 Polyacrylamide, 173, 176, 178 Polycistronic expression, 155 pRhon5Hi-2-based constructs, 145, 147 Promoters, 67, 72–75 Protein degradation, 169, 265 Protein of interest (POI), 170 PROTEOLYSIS1 (PRT1), 169 PROTEOLYSIS6 (PRT6), 169 Protospacer adjacent motifs (PAMs), 2 pSHDY alternative cloning sites, 69 cloning using pSHDY plasmid, 70 culturing of cyanobacteria, 71, 74–75 design and cloning of constructs, 71 microorganisms, 70 step-by-step protocol, 69 triparental mating in Synechocystis sp. PCC 6803, 70–74 pVZ321-Spec, 68
R Renilla luciferase (REN), 185 Renilla reniformis luciferase (RLuc), 38–40, 105, 106 Restriction endonuclease (REN), 15 Rhodobacter capsulatus cell harvesting and extraction, 129–130, 142–144 cloning insert OSC gene, 135–136 insert terpene synthase gene, 134–135 isoprene precursors, 133 materials, 127 pRhon5Hi-2-based vectors, 135 cultivation of, 126–129, 136–140 GC analysis, 129, 141–142 gene sequences, 130–132 gene sequences expression, 127, 130–132 intracytoplasmic membrane (ICM), 125 LC analysis, 129–130, 144–145 MEP pathway, 126 meta-codon usage table, 146 n-Dodecane sampling, 129, 141 non-sulfur purple α-proteobacteria, 125 photoheterotrophic growth, 125 pRhon5Hi-2-based constructs, 145, 147 terpene biosynthesis architecture, 124 terpene production, 148 Ribonucleoproteins, 46 Ribosome binding sites (RBS), 67, 131, 132, 155, 156, 162, 167 RNA-based regulators, 68 RSF1010-based conjugative vector systems, 68, 69, 76 RSF1010 IncQ-type plasmids, 68
S Saccharomyces cerevisiae, 67 Seamless ligation cloning extract (SLiCE), 80, 81, 83, 89–91, 93, 94 Single guide RNA (sgRNA), 2 SNRK2.2 gene, 48 Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), 173–174, 176 Spacers, 2 SpCas9, 5 Spectral methods, 242–243 26S proteasome, 184 Steady-state anisotropy, 199, 201
Streptococcus pyogenes, 5 Strigolactone, 185 SuperProtocol file, 91 Synechococcus elongatus, 276 cell concentration of E. coli, 284 cell culture media and prepared solutions barium chloride curing solution, 280 chemical stocks, 278–279 growth media preparation, 278 media composition, 279 sodium alginate preparation, 279–280 CFU counting, 284, 285 cyanobacterial cells encapsulation of, 281–283 materials, 277 methods, 280 preparation of, 281 heterotrophic microbes materials, 278 methods, 280–281 maturation of beads, 283–284 Synechocystis, 68, 70, 71, 73–77 Synthetic gRNA, 52
T Target of Monopteros5, 252 Temperature-induced transcription, 264 Terpene and physicochemical properties cell harvesting and extraction, 142–144 GC analysis, 141–142 LC analysis, 144–145 n-Dodecane sampling, 141 Time-correlated single-photon counting (TCSPC), 198 Tobacco etch virus (TEV) protease, 170, 171, 177 Transcription Activator-Like Effector (TALE), 1 Transcriptional repression, 33, 40 Transcriptional terminators, 67 Transfected protoplast, 106, 108 Transferred-DNA (T-DNA), 83, 92, 111–113, 115–117, 119 Transformation-associated recombination (TAR) cloning, 80, 93
Transient expression N. benthamiana leaves, 38–39 in protoplasts, 11–12 Transient transformation, 112, 118 Translocase of outer mitochondrial membrane 5 (TOM5), 252 TRANSPARENT TESTA GLABRA1 (TTG1), 266 Tris–acetate–phosphate (TAP), 46 Tumor-inducing (Ti)-plasmid-encoded virulence (vir) genes, 111 2A peptide, 184
U Ubigate, 155, 156 Ubiquitin (Ub) fusion technique, 170 UBIQUITIN10 promoter, 252
W Western blotting, 170, 171, 174, 176 Wizard® SV Gel, 93 Wuschel-Related Homeobox4, 252
X X-Gal, 20 3xHA-eGFP-OMP25 mitochondrial fusion, 252 arabidopsis cultivation, 258 arabidopsis transformation, 254 cloning of tagging construct, 253–257 confocal laser scanning microscopy, 255, 258–259 epitope-tagged mitochondria, 259 plant cultivation, 254 schematic diagram, 252 transgenic lines generation, 258 transgenic lines selection, 254–255, 258 UBIQUITIN10 promoter, 252
Z Zea mays, 8 Zebrafish, 98 Zinc Finger (ZF) protein, 1