Z-DNA: Methods and Protocols 1071630830, 9781071630839

This detailed volume compiles key methods and techniques used to establish some of the structural and functional aspects

English Pages 332 [333] Year 2023

Table of contents :
Preface
Contents
Contributors
Chapter 1: The Origin of Left-Handed Poly[d(G-C)]
1 Introduction
2 Postdoctoral Experience with Arthur Kornberg at Stanford
3 My Transition from Stanford to Göttingen, Germany
4 Fritz Pohl, a Visionary
5 Birth of Poly[d(G-C)] in Göttingen
6 Salt-Dependent "R-L Transition" of Poly[d(G-C)]
7 Birth of the Max Planck Institute for Biophysical Chemistry (1971)
8 Interim Period: 1972 Up to When Z-DNA Appeared in 1979
9 Z-DNA (Accompanied by B-DNA) Is Revealed and Proliferates (1979-)
10 Unfinished Business
11 Concluding Remarks
References
Chapter 2: Characterization of Z-DNA Using Circular Dichroism
1 Introduction
2 Materials
2.1 DNA Preparation
2.2 Protein Preparation
2.3 CD Spectrometer
2.4 CD Characterization of the B- to Z-DNA Transition Induced by caZαPKZ at Physiological Salt Concentration (for Subheading 3...
2.5 CD Characterization of the B- to Z-DNA Transition Induced by Chemicals (for Subheading 3.2)
3 Methods
3.1 DNA Preparation
3.2 Protein Preparation
3.3 CD Spectroscopy (See Note 2)
3.4 CD Characterization of the B- to Z-DNA Transition Induced by Zα at Physiological Salt Concentration (Fig. 1) (See Note 15)
3.4.1 Wavelength Scan
3.4.2 Time-Course Measurement
3.5 CD Characterization of the B- to Z-DNA Transition Induced by Chemicals
3.5.1 Wavelength Scan
3.5.2 Time-Course Measurement
3.6 CD Characterization of the BZ Junction-Forming DNA (Fig. 3)
3.6.1 Wavelength Scan
3.6.2 Time-Course Measurement
4 Notes
References
Chapter 3: Characterization of Z-DNA by Infrared Spectroscopy
1 Introduction
2 Materials
3 Methods
3.1 Spectral Collection
3.2 Spectral Analysis
4 Notes
References
Chapter 4: Crystallization of Z-DNA in Complex with Chemical and Z-DNA Binding Z-Alpha Protein
1 Introduction
2 Materials
2.1 Preparation of Z-DNA-Forming ODNs
2.2 Bacterial Culture for Expression of hZαADAR1
2.3 Purification of hZαADAR1
2.4 Crystallization
3 Methods
3.1 Preparation of Duplex ODNs (Annealing)
3.2 Bacterial Cell Culture and Expression of hZαADAR1
3.3 Protein Purification (Zα Proteins)
3.4 Crystallization of Z-DNA in Complex with Zα Protein or Chemical Stabilizer
4 Notes
References
Chapter 5: NMR Titration Studies in Z-DNA Dynamics
1 Introduction
2 Materials
2.1 M9 Media
2.2 Expression
2.3 Purification
2.4 Preparation of NMR Experiment
3 Methods
3.1 Expression of 15N-Labeled ZBPs
3.2 Purification of 15N-Labeled ZBPs
3.3 Preparation of a DNA Duplex
3.4 Titration of ZBP into DNA
3.5 Titration of DNA into ZBP
3.6 Analysis of Titration Data
4 Notes
References
Chapter 6: Single-Molecule Methods to Study Z-DNA Mechanics and Dynamics
1 Introduction
2 Materials
2.1 DNA Samples
2.2 Protein Samples
2.3 Sample Chamber
2.4 Buffer Solutions
2.5 Objective-Type TIRF-Based Single-Molecule Fluorescence Detection Setup with Dual Channels
2.6 Magnetic Tweezers Setup Combined with Single-Molecule FRET Setup (See Subheading 2.5)
3 Methods
3.1 DNA Sample Preparation
3.2 Protein Assay Preparation
3.3 Chamber Preparation
3.4 Imaging Buffer
3.5 smFRET Assay with Two-Fragment DNA
3.6 DNA Manipulation with smFRET-MT or Magnetic Tweezers Only
3.7 Assay Examples
4 Notes
References
Chapter 7: BZ Junctions and Its Application as Probe (2AP) to Detect Z-DNA Formation and Its Effector
1 Introduction
2 Materials
2.1 2AP-Labeled ODNs
2.2 B-to-Z Transition Monitoring
2.3 Zα Domain (See Chap. 5 for Details)
3 Methods
3.1 Design of 2AP-Labeled ODNs Containing ZFS
3.2 Preparation of Double-Stranded ODNs (ODN Annealing)
3.3 Purification of dsODNs (Option)
3.4 Purification of Z-Alpha Protein (See Chap. 4 for Details)
3.5 Fluorescent-Based Z-DNA Formation Assay
4 Notes
References
Chapter 8: Oligonucleotide Containing 8-Trifluoromethyl-2′-Deoxyguanosine as a Z-DNA Probe
1 Introduction
2 Materials
2.1 8CF3dG Phosphoramidite Synthesis
2.2 Prepare 8CF3dG Labeled Oligonucleotides
2.3 Prepare CD Samples
2.4 Prepare In Vitro and In-Cell 19F-NMR Samples
3 Methods
3.1 Synthesis of 8CF3dG Phosphoramidite
3.2 Synthesis and Purification of 8CF3dG Labeled Oligonucleotides
3.3 CD Sample Preparation and Measurements
3.4 19F NMR Preparation and Measurements
3.4.1 In Vitro 19F NMR Sample Preparation and Measurement
3.4.2 In-Cell 19F NMR Sample Preparation and Measurement
3.5 CD Analysis of 8CF3dG Modified Oligonucleotide
3.6 In Vitro 19F NMR Study of 8CF3dG Modified Oligonucleotide
3.7 In-Cell 19F NMR Study of 8CF3dG Modified Oligonucleotide
4 Notes
References
Chapter 9: Chiroptical Properties of Z-DNA Using Ionic Porphyrins and Metalloporphyrins
1 Introduction
2 Materials
3 Methods
3.1 Interaction with Cationic Porphyrins (ZnT4, H2T4, NiT4)
3.2 Z-DNA/NiTPPS System as Supramolecular Device
3.3 Short Z-DNA Sequences
3.4 BZB Sequences
3.5 Spermine Porphyrin Conjugate (ZnTCPPSpm4)
4 Notes
References
Chapter 10: Construction of a Z-DNA-Specific Recombinant Nuclease Zαα-FOK for Conformation Studies
1 Introduction
2 Materials
2.1 Expression of Zαα-FOK Protein
2.2 Purification of Zαα-FOK Protein
2.3 Z-DNA Cleavage Assay
2.4 Cell-Free Protein Expression of Zαα-FOK
3 Methods
3.1 Zαα-FOK Expression Vector
3.2 Expression of Zαα-FOK Nuclease
3.3 Purification of Zαα-FOK Nuclease
3.3.1 Cell Lysis
3.3.2 Ni-Affinity Chromatography
3.3.3 Size Exclusion Chromatography
3.3.4 Quality and Quantity Check of Zαα-FOK Nuclease
3.4 In Vitro Z-DNA Cleavage Assay
3.5 Cell-Free Protein Expression of Zαα-FOK Nuclease
4 Notes
References
Chapter 11: Human Heme Oxygenase-1 Promoter Activity Is Mediated by Z-DNA Formation
1 Introduction
2 Materials
2.1 Transfection
2.2 Z-Probe Expression Plasmids
2.3 Chromatin Immunoprecipitation
2.4 Real-Time PCR
3 Methods
3.1 Cell Culture and Transfection
3.2 Cell Lysate Preparation
3.3 Antibody-Bound Dynabeads Preparation and Chromatin Immunoprecipitation
4 Notes
References
Chapter 12: ChIP-Seq Strategy to Identify Z-DNA-Forming Sequences in the Human Genome
1 Introduction
2 Materials
2.1 Expression of Z-DNA-Binding Domain
2.2 Chromatin Immunoprecipitation
2.3 ChIP-Seq Library Preparation
2.4 ChIP-Seq Data Analysis
3 Methods
3.1 Cell Culture and the Expression of Zaa
3.2 Chromatin Immunoprecipitation
3.3 ChIP-Seq Library Preparation and Sequencing
3.4 ChIP-Seq Data Analysis
4 Notes
References
Chapter 13: Detection of Z-DNA Structures in Supercoiled Genome
1 Introduction
2 Materials
2.1 Purification of High Molecular Weight DNA
2.2 Converting Linear Genomic DNA into Supercoiled DNA
2.3 Enrichment of DNA Fragments Surrounding Single-Stranded DNA in Supercoiled Genome
2.4 Common Reagents
2.5 Buffers
2.6 Equipment
2.7 Kits
3 Methods
3.1 Purification of High Molecular Weight DNA
3.2 Converting Linear Genomic DNA into Supercoiled DNA
3.3 Enrichment of DNA Fragments Surrounding Single-Stranded DNA in Supercoiled Genome
3.4 Identification of Z-DNA Structures
4 Notes
References
Chapter 14: Thermogenomic Analysis of Left-Handed Z-DNA Propensities in Genomes
1 Introduction
1.1 Background
1.2 Energetics of Z-DNA Formation
1.3 Statistical Mechanics of the Zipper Model for the B-Z Transition
1.4 ZHUNT: A Computational Approach to Mapping Z-DNA in Genomes
1.5 Validating ZHUNT
1.6 Applications of ZHUNT for Genomic and Phylogenomic Analyses
1.7 mZHUNT for Analyses of Z-DNA in Genomes with Methylated Cytosine
2 Materials
3 Methods
4 Notes
4.1 Input File Format
4.2 Possible Errors Running ZHUNT
4.3 Running ZHUNT or mZHUNT on Local Computer
5 Conclusions and Discussion
References
Chapter 15: DeepZ: A Deep Learning Approach for Z-DNA Prediction
1 Introduction
2 The Input Data
3 Data Compression
4 Deep Learning Architectures
5 Train and Test Set
6 Whole-Genome Annotation with Z-DNA Regions
7 DeepZ Model Interpretation
8 Notes
References
Chapter 16: Methods to Study Z-DNA-Induced Genetic Instability
1 Introduction
2 Materials
2.1 Materials for Screening for Z-DNA-Induced DNA Double-Strand Breaks in Yeast Artificial Chromosomes (Exp #1)
2.2 Materials for Detecting Z-DNA-Induced DSBs on Reporter Vectors Recovered from Mammalian Cells (Exp #2)
2.3 Materials for Detecting Z-DNA-Induced Single- and Double-Strand Breaks in Cell-Free Extracts (Exp #3)
3 Methods
3.1 Methods for Screening for Z-DNA-Induced DNA Double-Strand Breaks in Yeast Artificial Chromosomes (Exp #1)
3.1.1 YAC Construction
3.1.2 Z-DNA-Induced Fragility Assay (FOA Selection of URA3 in Yeast Cells)
3.1.3 Transferring YACs to Mutant Yeast Strains Using a Kar-Cross Protocol (Liquid Method) (See Note 5)
3.2 Methods for Detecting Z-DNA-Induced DSBs on Reporter Vectors Recovered from Mammalian Cells (LM-PCR) [Exp #2]
3.3 Methods for Detecting Z-DNA-Induced Single- and Double-Strand Breaks in Cell-Free Extracts (Exp #3)
4 Notes
References
Chapter 17: Single-Molecule Visualization of B-Z Transition in DNA Origami Using High-Speed AFM
1 Introduction
1.1 Design and Construction of Direct Observation System of Rotation in B-Z Transition in the DNA Frame
1.2 Observation of the Rotation in B-Z Transition in the DNA Frame
1.3 Direct Observation of the Flag Rotation During B-Z Transition in the Equilibrium State
2 Materials
2.1 Design and Preparation of DNA Origami
2.2 Preparation of a DNA Origami Frame with B-Z Transition DNA Components
2.3 High-Speed Atomic Force Microscopy (HS-AFM)
3 Methods
3.1 Design and Preparation of a DNA Origami Frame
3.2 Assembly of the DNA Components in the DNA Frame
3.3 High-Speed AFM Imaging of the Behavior of the B-Z DNA Strands in the DNA Frame
4 Notes
References
Chapter 18: Adoption of A-Z Junctions in RNAs by Binding of Zα Domains
1 Introduction
2 Materials
2.1 General Supplies Needed
2.2 Circular Dichroism
2.3 Isothermal Titration Calorimetry
2.4 Analytical Ultracentrifugation
2.5 Nuclear Magnetic Resonance
3 Methods
3.1 Acquiring and Preparing Protein Zα Samples
3.2 Preparation of RNA and RNA/Zα Complexes
3.3 Circular Dichroism for Quantification of Z-Form
3.4 Calculation of Ez Scores from CD Data to Determine Extent of Z-RNA Formation
3.5 Isothermal Titration Calorimetry to Investigate Affinity and Thermodynamics of Binding
3.6 Sedimentation Velocity Analytical Ultracentrifugation of Z-Conformation-Containing RNA/DNAs Bound to Zα
3.7 Nuclear Magnetic Resonance to Monitor Zα-Dependent Switch from A- to Z-Form
4 Notes
References
Chapter 19: Detecting Z-RNA and Z-DNA in Mammalian Cells
1 Introduction
2 Materials
2.1 Culturing Mouse Embryonic Fibroblasts (MEFs) or L929 Cell Line
2.2 Virus Infection or Treatment with CBL0137
2.3 Immunofluorescence Microscopy
3 Methods
3.1 Influenza A Virus Infection
3.2 Immunofluorescence Detection of Z-RNA
4 Notes
References
Chapter 20: Identification of ADAR1 p150 and p110 Associated Edit Sites
1 Introduction
2 Materials
2.1 Exogenous Expression of p150, p110, and p150/p110 in ADAR1 KO Background
2.2 Total RNA Extraction and Preparation of Libraries
2.3 Sequencing, Alignment, Variant Calling, and Determination of ADAR1-Associated Mismatches and Isoform-Selective Edits (Fig....
2.4 Validation of Edit Sites by Amplicon Sequencing
3 Methods
3.1 Exogenous Expression of p150, p110, and p150/p110 in ADAR1 KO Background
3.2 Total RNA Extraction and Preparation of Libraries
3.3 Sequencing, Alignment, Variant Calling, and Determination of ADAR1-Associated Mismatches and Isoform-Selective Edits
3.4 Validation of Edit Sites by Amplicon Sequencing
4 Notes
References
Chapter 21: Z-DNA and Z-RNA: Methods-Past and Future
1 Introduction
2 A Retrospective
3 The Biology of Z-DNA
4 The Zα Family Domain Structure
5 Making a Z-DNA-Binding wHTH from One That Binds B-DNA
6 Well-Characterized Zα Proteins
7 New Approaches to Z-DNA and Z-RNA Biology
8 Final Thoughts
9 What Will We Find in the Future?
10 What Is There to Do?
References
Index

Methods in Molecular Biology 2651

Kyeong Kyu Kim and Vinod Kumar Subramani, Editors

Z-DNA Methods and Protocols

METHODS IN MOLECULAR BIOLOGY

Series Editor John M. Walker School of Life and Medical Sciences University of Hertfordshire Hatfield, Hertfordshire, UK

For further volumes: http://www.springer.com/series/7651

For over 35 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in readily reproducible step-by-step fashion, opening with an introductory overview, a list of the materials and reagents needed to complete the experiment, and followed by a detailed procedure that is supported with a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice. These hallmark features were introduced by series editor Dr. John Walker and constitute the key ingredient in each and every volume of the Methods in Molecular Biology series. Tested and trusted, comprehensive and reliable, all protocols from the series are indexed in PubMed.

Z-DNA Methods and Protocols

Edited by

Kyeong Kyu Kim and Vinod Kumar Subramani Department of Precision Medicine, Institute for Antimicrobial Resistance Research and Therapeutics, Sungkyunkwan University School of Medicine, Suwon, Korea

ISSN 1064-3745
ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-0716-3083-9
ISBN 978-1-0716-3084-6 (eBook)
https://doi.org/10.1007/978-1-0716-3084-6

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023

Chapter 1 is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/). For further details see license information in the chapter.

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature. The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.

Preface

In the 1970s, DNA with unusual characteristics, differing from those of the well-studied B-DNA, was reported. Continued studies on this unique DNA led scientists working on nucleic acids to unveil its nature. The first atomic structure of this DNA, determined by X-ray crystal diffraction, finally revealed that this DNA has a left-handed conformation with a zig-zag pattern, and thus it was named ‘Z-DNA’. After its structure was confirmed, Z-DNA studies were performed to understand the functions associated with this distinct structural form of DNA in the genome. This book, Z-DNA Methods and Protocols, is a compilation of the key methods and techniques used to establish some of the structural and functional aspects of Z-form nucleic acids. Efforts were made to recruit the original contributors themselves, whose pioneering work helped establish the various roles now associated with Z-DNA.

We begin the book with ‘The Origin of Left-Handed Poly-GC’ (Chap. 1), a personal account by the man who first obtained the salt-induced inverted circular dichroism (CD) spectrum of Poly[d(GC)] DNA that led to all that we know of Z-DNA today. Following this, we describe the protocol for the characterization of Z-DNA using circular dichroism in ‘Characterization of Z-DNA Using Circular Dichroism’ (Chap. 2). Here, we describe the CD methods, including real-time monitoring, used to study the B-to-Z DNA conformational transition in Z-DNA-forming DNA sequences as well as in sequences that bridge B-DNA and Z-DNA, called BZ junction sequences. This is the fundamental technique commonly used in Z-DNA studies. In ‘Characterization of Z-DNA by Infrared Spectroscopy’ (Chap. 3), the authors describe how infrared spectroscopy is used to study Z-DNA, including its relative content in cells.

The key technique that determines the atomic structure of Z-DNA is X-ray crystallography, which involves the methodology to prepare DNA and the Zα protein to make cocrystals and study their structural details, as described in ‘Crystallization of Z-DNA in Complex with Chemical and Z-DNA Binding Zα Protein’ (Chap. 4). Following structure determination, in ‘NMR Titration Studies in Z-DNA Dynamics’ (Chap. 5), the authors describe the methods for titration experiments of the Zα protein to study B-to-Z DNA transition dynamics using nuclear magnetic resonance (NMR) spectroscopy. In the next chapter, we move from methods used in bulk structural studies to more advanced single-molecule methods that investigate the mechanical properties of Z-DNA and the dynamics of the B-to-Z DNA transition in ‘Single-Molecule Methods to Study Z-DNA Mechanics and Dynamics’ (Chap. 6). Furthermore, as an application of the structural details of BZ junction sequences, ‘BZ Junction and Its Application as a Probe (2AP) to Detect Z-DNA Formation and Its Effectors’ (Chap. 7) describes the adaptation of the BZ junction as a fluorescent probe (2AP) to detect Z-DNA formation and its effectors. ‘Oligonucleotide Containing 8-Trifluoromethyl-2’-Deoxy-Guanosine as a Z-DNA Probe’ (Chap. 8) describes the development of an oligonucleotide modified with 8-trifluoromethyl-2’-deoxy-guanosine as a probe to investigate the left-handed Z-DNA structure in vitro and in living cells. In ‘Chiroptical Properties of Z-DNA Using Ionic Porphyrins and Metalloporphyrins’ (Chap. 9), the authors describe methods using cationic and anionic meso porphyrins and metallo derivatives with Z-DNA as probes, storage systems, and logic gates. In a further application-oriented adaptation, ‘Construction of a Z-DNA-Specific Recombinant Nuclease Zαα-FOK for Conformation Studies’ (Chap. 10) describes methods for constructing a Z-DNA-specific recombinant nuclease Zαα-FOK for Z-DNA conformational studies.

In the next set of chapters, gene- and genome-targeting methods are described. In ‘Human HO-1 Promoter Activity Is Mediated by Z-DNA Formation’ (Chap. 11), a detailed protocol for Z-DNA detection in the human heme oxygenase-1 gene (HO-1) promoter region based on chromatin immunoprecipitation (ChIP) with quantitative PCR is described. Such methods have enabled the study of Z-DNA in targeted cellular genomic regions. Expanding the scope of studying Z-DNA in targeted genomic regions, the ‘ChIP-Seq Strategy to Identify Z-DNA Forming Sequences in the Human Genome’ (Chap. 12) recounts a ChIP-Seq strategy to identify all Z-DNA-forming sequences in the human genome to provide a genome-wide perspective with regard to left-handed DNA. In the next chapter, ‘Detection of Z-DNA Structures in Supercoiled Genome’ (Chap. 13), this book features another genome-wide approach to the study of Z-DNA, which involves converting a linear genome into a supercoiled genome that promotes Z-DNA formation. Applying permanganate-based methodology and high-throughput sequencing to supercoiled genomes allows the genome-wide detection of single-stranded DNA. Single-stranded DNA is characteristic of junctions between the classical B-form of DNA and Z-DNA. Consequently, this approach provides snapshots of the Z-DNA conformation over the whole genome. Continuing the theme of genome-wide approaches to Z-DNA studies, in ‘Thermogenomic Analysis of Left-Handed Z-DNA Propensities in Genomes’ (Chap. 14), the authors present a thermogenomic analysis of left-handed Z-DNA propensities in genomes using an algorithm called ZHUNT. In ‘DeepZ – A Deep-Learning Approach for Z-DNA Prediction’ (Chap. 15), methods from a recent work using a deep-learning neural network approach for Z-DNA prediction called DeepZ are described.

To demonstrate the consequences of Z-DNA formation in cells, ‘Methods to Study Z-DNA-Induced Genetic Instability’ (Chap. 16) provides a description of the methods used to study Z-DNA-induced genetic instability in eukaryotic model systems. ‘Single-Molecule Visualization of B-Z Transition in DNA Origami Using High-Speed AFM’ (Chap. 17) presents a powerful single-molecule visualization of the B-Z transition in DNA origami structures using high-speed atomic force microscopy (AFM), which shows the B-to-Z DNA transition in real time.

The next few chapters focus on the realm of Z-RNA. The ‘Adoption of A-Z Junctions in RNAs by Binding of Zα Domains’ (Chap. 18) describes the study of A-Z junction-forming RNAs by the binding of Zα domains, while ‘Detecting Z-RNA in Mammalian Cells’ (Chap. 19) describes the procedure for detecting Z-RNA in influenza A virus (IAV)-infected cells. It also shows the application of this procedure in detecting Z-RNA produced during vaccinia virus infection, as well as Z-DNA induced by a small-molecule DNA intercalator. ADAR1 and its Zα domain are among the most structurally and functionally studied Z-DNA-binding proteins that stabilize the left-handed helical conformation. The ‘Identification of ADAR1 p150 and p110-Associated Edit Sites’ (Chap. 20) presents methods for the identification of ADAR1 isoform-associated RNA editing sites.

In the final chapter of this book, ‘Z-DNA and Z-RNA – Methods Past and Future’ (Chap. 21), the methods underlying key discoveries in the field of Z nucleic acids are summarized, along with an insight into the challenging areas awaiting exploration. This chapter provides a glimpse into future initiatives for Z nucleic acid researchers and probable enterprises from future Z nucleic acid research.

We are thankful to all the authors of this book for their contributions. We made sincere efforts to bring together most of the key contributors to the field of Z-DNA research while accounting for the methods, which in several cases were pioneered by the authors themselves. In doing so, we believe that we have covered all the key methodologies relevant to Z nucleic acid research from the past, present, and for several decades into the future. We also believe that the way this book has been compiled and arranged will inspire and equip students and researchers to become curious about this unique Z nucleic acid as their area of investigation and help the field grow, flourish, and unravel the hidden and novel roles of Z. In addition, we hope that this book will help motivate new researchers in this area to realize the ultimate role of Z-nucleic acid-targeted therapeutic and intervention strategies.

Suwon, Korea
Kyeong Kyu Kim
Vinod Kumar Subramani

Contents

Preface
Contributors

1 The Origin of Left-Handed Poly[d(G-C)]
  Thomas M. Jovin
2 Characterization of Z-DNA Using Circular Dichroism
  Vinod Kumar Subramani and Kyeong Kyu Kim
3 Characterization of Z-DNA by Infrared Spectroscopy
  Fengqiu Zhang and Qing Huang
4 Crystallization of Z-DNA in Complex with Chemical and Z-DNA Binding Z-Alpha Protein
  MinSoung Kang and Doyoun Kim
5 NMR Titration Studies in Z-DNA Dynamics
  Seo-Ree Choi, Kwang-Im Oh, Yeo-Jin Seo, and Joon-Hwa Lee
6 Single-Molecule Methods to Study Z-DNA Mechanics and Dynamics
  Hae Jun Jung, Beom-Hyeon Park, Sook Ho Kim, and Seok-Cheol Hong
7 BZ Junctions and Its Application as Probe (2AP) to Detect Z-DNA Formation and Its Effector
  MinSoung Kang and Doyoun Kim
8 Oligonucleotide Containing 8-Trifluoromethyl-2′-Deoxyguanosine as a Z-DNA Probe
  Hong-Liang Bao and Yan Xu
9 Chiroptical Properties of Z-DNA Using Ionic Porphyrins and Metalloporphyrins
  Alessandro D’Urso
10 Construction of a Z-DNA-Specific Recombinant Nuclease Zαα-FOK for Conformation Studies
  Seul Ki Lee and Yang-Gyun Kim
11 Human Heme Oxygenase-1 Promoter Activity Is Mediated by Z-DNA Formation
  Atsushi Inose-Maruyama, Shuya Kasai, and Ken Itoh
12 ChIP-Seq Strategy to Identify Z-DNA-Forming Sequences in the Human Genome
  Tae-Young Roh
13 Detection of Z-DNA Structures in Supercoiled Genome
  Fedor Kouzine, Damian Wojtowicz, Teresa M. Przytycka, and David Levens
14 Thermogenomic Analysis of Left-Handed Z-DNA Propensities in Genomes
  Ryan S. Czarny and P. Shing Ho
15 DeepZ: A Deep Learning Approach for Z-DNA Prediction
  Nazar Beknazarov and Maria Poptsova
16 Methods to Study Z-DNA-Induced Genetic Instability
  Guliang Wang, Laura Christensen, and Karen M. Vasquez
17 Single-Molecule Visualization of B–Z Transition in DNA Origami Using High-Speed AFM
  Masayuki Endo and Hiroshi Sugiyama
18 Adoption of A–Z Junctions in RNAs by Binding of Zα Domains
  Parker J. Nichols, Shaun Bevers, Morkos A. Henen, Jeffrey S. Kieft, Quentin Vicens, and Beat Vögeli
19 Detecting Z-RNA and Z-DNA in Mammalian Cells
  Chaoran Yin, Ting Zhang, and Siddharth Balachandran
20 Identification of ADAR1 p150 and p110 Associated Edit Sites
  Tony Sun, Brad R. Rosenberg, Hachung Chung, and Charles M. Rice
21 Z-DNA and Z-RNA: Methods—Past and Future
  Alan Herbert

Index

Contributors SIDDHARTH BALACHANDRAN • Blood Cell Development and Function Program, Fox Chase Cancer Center, Philadelphia, PA, USA; Lead Contact, Philadelphia, USA HONG-LIANG BAO • Division of Chemistry, Department of Medical Sciences, Faculty of Medicine, University of Miyazaki, Kiyotake, Miyazaki, Japan; Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China NAZAR BEKNAZAROV • Laboratory of Bioinformatics, Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia SHAUN BEVERS • Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, CO, USA; Colorado School of Mines, Golden, CO, USA SEO-REE CHOI • Department of Chemistry and the Research Institute of Natural Science, Gyeongsang National University, Jinju, South Korea LAURA CHRISTENSEN • Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, TX, USA HACHUNG CHUNG • Department of Microbiology and Immunology, Columbia University, New York, NY, USA RYAN S. CZARNY • Department of Biochemistry & Molecular Biology, Colorado State University, Fort Collins, CO, USA ` degli Studi di ALESSANDRO D’URSO • Dipartimento di Scienze Chimiche, Universita Catania, Catania, Italy MASAYUKI ENDO • Department of Chemistry, Graduate School of Science, Kyoto University, Sakyo-ku, Kyoto, Japan; Institute for Integrated Cell-Material Sciences, Kyoto University, Sakyo-ku, Kyoto, Japan; Organization for Research and Development of Innovative Science and Technology, Kansai University, Suita, Osaka, Japan MORKOS A. 
HENEN • Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, CO, USA; Faculty of Pharmacy, Mansoura University, Mansoura, Egypt ALAN HERBERT • Discovery, InsideOutBio, Charlestown, MA, USA P. SHING HO • Department of Biochemistry & Molecular Biology, Colorado State University, Fort Collins, CO, USA SEOK-CHEOL HONG • Center for Molecular Spectroscopy and Dynamics, Institute for Basic Science, Department of Physics, Korea University, Seoul, South Korea QING HUANG • Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui, China ATSUSHI INOSE-MARUYAMA • Division of Microbiology and Molecular Cell Biology, Nihon Pharmaceutical University, Ina-machi, Kita-adachigun, Japan KEN ITOH • Center for Advanced Medical Sciences, Department of Stress Response Science, Hirosaki University Graduate School of Medicine, Hirosaki, Japan THOMAS M. JOVIN • Max Planck Institute for Multidisciplinary Sciences, Go¨ttingen, Germany HAE JUN JUNG • Center for Molecular Spectroscopy and Dynamics, Institute for Basic Science, Department of Physics, Korea University, Seoul, South Korea

xi

xii

Contributors

MINSOUNG KANG • Department of Lifestyle Medicine, College of Environmental and Bioresource Sciences, Jeonbuk National University, Iksan, Republic of Korea; Advanced Materials Division, Korea Research Institute of Chemical Technology (KRICT), Daejeon, Republic of Korea
SHUYA KASAI • Center for Advanced Medical Sciences, Department of Stress Response Science, Hirosaki University Graduate School of Medicine, Hirosaki, Japan
JEFFREY S. KIEFT • Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, CO, USA; RNA Bioscience Initiative, University of Colorado Denver School of Medicine, Aurora, CO, USA
DOYOUN KIM • Therapeutics and Biotechnology Department, Drug Discovery Platform Research Center, Korea Research Institute of Chemical Technology (KRICT), Daejeon, Republic of Korea; Medicinal Chemistry and Pharmacology, Korea University of Science and Technology (UST), Daejeon, Republic of Korea
KYEONG KYU KIM • Department of Precision Medicine, Institute for Antimicrobial Resistance Research and Therapeutics, Sungkyunkwan University School of Medicine, Suwon, South Korea
SOOK HO KIM • Center for Molecular Spectroscopy and Dynamics, Institute for Basic Science, Department of Physics, Korea University, Seoul, South Korea; College of Veterinary Medicine, Chungbuk National University, Cheongju, South Korea
YANG-GYUN KIM • Department of Chemistry, Sungkyunkwan University, Suwon, South Korea
FEDOR KOUZINE • Laboratory of Pathology, NCI/NIH, Bethesda, MD, USA
JOON-HWA LEE • Department of Chemistry and the Research Institute of Natural Science, Gyeongsang National University, Jinju, South Korea
SEUL KI LEE • Department of Chemistry, Sungkyunkwan University, Suwon, South Korea
DAVID LEVENS • Laboratory of Pathology, NCI/NIH, Bethesda, MD, USA
PARKER J. NICHOLS • Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, CO, USA
KWANG-IM OH • Department of Chemistry and the Research Institute of Natural Science, Gyeongsang National University, Jinju, South Korea
BEOM-HYEON PARK • Center for Molecular Spectroscopy and Dynamics, Institute for Basic Science, Department of Physics, Korea University, Seoul, South Korea
MARIA POPTSOVA • Laboratory of Bioinformatics, Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia
TERESA M. PRZYTYCKA • Computational Biology Branch, NCBI/NIH, Bethesda, MD, USA
CHARLES M. RICE • Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY, USA
TAE-YOUNG ROH • Department of Life Sciences, Pohang University of Science and Technology (POSTECH), Pohang, South Korea
BRAD R. ROSENBERG • Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
YEO-JIN SEO • Department of Chemistry and the Research Institute of Natural Science, Gyeongsang National University, Jinju, South Korea
VINOD KUMAR SUBRAMANI • Department of Precision Medicine, Institute for Antimicrobial Resistance Research and Therapeutics, Sungkyunkwan University School of Medicine, Suwon, South Korea


HIROSHI SUGIYAMA • Department of Chemistry, Graduate School of Science, Kyoto University, Sakyo-ku, Kyoto, Japan; Institute for Integrated Cell-Material Sciences, Kyoto University, Sakyo-ku, Kyoto, Japan
TONY SUN • Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY, USA
KAREN M. VASQUEZ • Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, TX, USA
QUENTIN VICENS • Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, CO, USA
BEAT VÖGELI • Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, CO, USA
GULIANG WANG • Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, TX, USA
DAMIAN WOJTOWICZ • Computational Biology Branch, NCBI/NIH, Bethesda, MD, USA
YAN XU • Division of Chemistry, Department of Medical Sciences, Faculty of Medicine, University of Miyazaki, Kiyotake, Miyazaki, Japan
CHAORAN YIN • Blood Cell Development and Function Program, Fox Chase Cancer Center, Philadelphia, PA, USA
FENGQIU ZHANG • Henan Key Laboratory of Ion-Beam Bioengineering, School of Physics and Microelectronics, Zhengzhou University, Zhengzhou, China
TING ZHANG • Blood Cell Development and Function Program, Fox Chase Cancer Center, Philadelphia, PA, USA

Chapter 1
The Origin of Left-Handed Poly[d(G-C)]

Thomas M. Jovin

Abstract
The discovery of a reversible transition in the helical sense of a double-helical DNA was initiated by the first synthesis in 1967 of the alternating sequence poly[d(G-C)]. In 1968, exposure to high salt concentration led to a cooperative isomerization of the double helix manifested by an inversion in the CD spectrum in the 240–310 nm range and in an altered absorption spectrum. The tentative interpretation, reported in 1970 and then in detailed form in a 1972 publication by Pohl and Jovin, was that the conventional right-handed B-DNA structure (R) of poly[d(G-C)] transforms at high salt concentration into a novel, alternative left-handed (L) conformation. The historical course of this development and its aftermath, culminating in the first crystal structure of left-handed Z-DNA in 1979, is described in detail. The research conducted by Pohl and Jovin after 1979 is summarized, ending with an assessment of “unfinished business”: condensed Z*-DNA; topoisomerase IIα (TOP2A) as an allosteric ZBP (Z-DNA-binding protein); B–Z transitions of phosphorothioate-modified DNAs; and parallel-stranded poly[d(G-A)], a double helix with high stability under physiological conditions and potentially also left-handed.

Key words Left-handed DNA, CD left-handed DNA, R–L transition of poly[d(G-C)], Z*-DNA, TOP2A, Parallel-stranded psRR-DNA, Phosphorothioate-modified Z-DNA

1 Introduction

This account has been written at the request of the editors of this volume. I have complied with their kind invitation to relate the historical circumstances of the research initiated at the Max Planck Institute of Physical Chemistry (MPIpc) in Göttingen, Germany, on the “R–L transition,” which preceded by 11 years the publication in 1979–1980 of the first left-handed “Z-DNA” and B-DNA crystal structures. Unfortunately, my partner in the research, Fritz Pohl (“Fritz”), suffered an untimely, tragic death in 1994 at age 55. Thus, while I have tried to accurately reconstruct happenings of more than half a century ago (by relying on my lab books, publications, correspondence, and memory), the product undoubtedly contains errors and inconsistencies. My only defense is that they are unintentional due to incomplete records and the absence of Fritz’s

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_1, © The Author(s) 2023


insights and contributions to what would have been a mutual effort. The reader will hopefully forgive an inordinate degree of autobiographical detail, in my judgment required to provide a perspective of the evolving field of molecular biology during the 1960s–1970s.

2 Postdoctoral Experience with Arthur Kornberg at Stanford

I arrived in Palo Alto, California, in the fall of 1964, fresh out of medical school in Baltimore, to embark on what was to be a 3-year postdoc with Arthur Kornberg (“Arthur”) (Fig. 1) in the Department of Biochemistry of Stanford University Medical School. I was not unaware of nucleic acids and their properties. My introduction to DNA had been in 1956, as a freshman at Caltech in the chemistry class taught by Linus Pauling. He would describe features of the Watson–Crick (W–C) model of B-DNA, announced 3 years earlier, while extracting seemingly endless pairs of chains out of a small black bag. Pauling was an enthusiast of the W–C model, despite the criticism that had been leveled at his own (fundamentally flawed) alternative proposal for B-DNA.1 At Stanford I participated in the extensive research activities focused on the general field of DNA biochemistry and associated technology development. A key project was a new purification and the biochemical, biophysical, and functional characterization of the “Kornberg enzyme,” E. coli DNA polymerase I (Pol I). Part of this effort was the introduction of

Fig. 1 My two mentors, Arthur Kornberg (left) and Manfred Eigen (right; 1967). (The photo of Arthur is taken with permission from Springer Nature from Ref. [4]. The photo of Manfred was kindly provided by Ruthild Oswatitsch-Eigen.)

1 Linus Pauling was a brilliant chemist but human. In his monumental book, The Nature of the Chemical Bond, he chastised Watson and Crick for omitting the third H-bond of the G–C base pair in their 1953 publications and not citing the proposal of a complementarity principle of replication he made together with Max Delbrück in 1940 [1]. Alex Rich was a postdoc with Pauling at Caltech until 1954.


polynucleotide celluloses (PTCs) as solid-state primers and templates for polymerases [2]. Despite their demonstrable utility for this and other uses, for example, in the discovery of DNA ligase by fellow Kornberg postdoc Nick Cozzarelli (later a guru of DNA topology and Chief Editor of PNAS), I failed in 1966 to successfully employ these early solid-support reagents in three successive projects that most readers (e.g., those 350) plaques affixed to the outer walls of venerable half-timbered houses feature names such as Gauss, Weber, Dirichlet, von Haller, Courant, Hilbert, Wöhler, Koch, Wallach, Nernst, Wigner, Debye, Windaus, Franck, Born, Heisenberg, Hahn, Debye, Klein, Noether, as well as the Grimm brothers, Goethe, King George II of Great Britain (and Ireland; he was also the Duke of Hannover and a Prince-elector of the Holy Roman Empire), Bismarck, Brentano, Lichtenberg, and Benjamin Franklin. During a 2-week visit in 1766, Franklin intensively consulted with two eminent professors of German constitutional law and history, greatly influencing his federalist theories that found their way into the US Constitution. More than 40 Nobel Laureates have either studied, performed research, or taught in Göttingen.

9 We were also acutely aware of the thousands of Russian troops just over the nearby border with East Germany.


Installed in the MPIpc, I (an Argentine-American) shared a large lab with three colleagues who became lifelong friends and collaborators; each went on to very distinguished careers: Ernst Grell, a Swiss; Israel Pecht, the first scientific postdoc to Germany from Israel (the Weizmann Institute) after WWII; and Rudolf Rigler (“Rudolf”), a Swede-Austrian. We were an international cooking pot of young, hungry, iconoclastic scientists, and despite Manfred’s role as “master cook,” he didn’t stir the pot very often, having turned his attention to quantitative treatments of biological evolution.10 Yet neither he nor the rest of us ignored the hot topics in molecular biology, which at that time included allosterism, synthesis and sequencing of DNA and RNA, the identity and operation of the genetic code, DNA and RNA structure/function relationships, gene regulation in normal and in disease states, and receptor-dependent signaling in the nervous system. The resident and visiting scientific staff of the institute performed research in these areas since elucidating the underlying binding and conformational transitions was a challenge ideally matched to the new kinetic technologies with high (ns–μs–ms) temporal resolution. Experimental results and theoretical schemes were thrashed out in rapid-fire German at the notorious “Teestunde” (tea hour) sessions and the annual Winter Ski seminars in Austria and Switzerland. My self-assigned goal was to determine the kinetics of binding and conformational transitions of complexes of the E. coli DNA polymerase I with its substrates, dNTPs and DNA. Additions to the lab, such as a scintillation counter, chromatography columns, and micropipettes, were made so as to purify polymerase starting from the paste of 2–3000 liters of bacterial culture imported from Iowa (recombinant DNA was still in the future). The intermediate and purest11 fractions of the polymerase were to be essential tools in the work initiated with Fritz Pohl later in 1967. At the same time, I began acquiring expertise in the theory and practice of rapid chemical kinetics, including devising a fluorescent T-jump apparatus with Rudolf.12

4 Fritz Pohl, a Visionary

Fritz, born in Graz, Austria, in 1939, obtained a degree in physics and in 1964 joined Manfred’s team at the MPIpc, first as a postdoc

10 Israel Pecht and I wrote a retrospective of Manfred Eigen upon his death in 2019 [5].
11 The highly purified E. coli DNA polymerase I produced at the MPIpc became one of the first molecular biological offerings of Boehringer Mannheim. I also provided the enzyme to Fred Sanger upon his request; at the time he was developing his method for DNA sequencing.
12 Rudolf went on to invent fluorescence correlation spectroscopy (FCS), a key tool of single-molecule biochemistry and biophysics. Elliot Elson, a graduate student of Buzz Baldwin, was the independent co-inventor of FCS; it has been my great privilege to have shared publications, albeit not about FCS, with both Rudolf and Elliot.


and then as a research associate. He developed a passion for molecular chirality and applied spectroscopic methods for studying the kinetics and thermodynamics of transitions of proteins and nucleic acids subjected to variations in solution conditions. Fritz was not only a gifted experimentalist; he was also very competent with the theoretical issues, particularly when novel insights and approaches were required. In 1967, Fritz developed a T-jump method to study the reversible denaturation of proteases in water and mixed solvents. A review of “cooperative conformational” transitions of globular proteins appeared in 1972. But already much earlier, he had turned his attention to salt-dependent transitions in DNA. On October 12, 1967, Fritz submitted a note (in German) to the journal Naturwissenschaften featuring the difference ORD (native, heat-denatured forms) of calf thymus and T4 bacteriophage DNA in 0.2 and in 6 M NaClO4 [6]. Fritz attributed the inversion of Cotton effects (Fig. 3, left) to a reversal in the helical sense of the DNA from right to left, subject to the assumption that base stacking was being preserved under both conditions.13 This remarkable, short note also proposed how such a reversal in helical handedness might be involved in DNA synthesis, recombination (Fig. 3, right), transcription, and packing in chromosomes. The closing sentence is worth quoting: “The proposed model is one conceivable extension of the existing one (he is referring to W-C B-DNA, 1953), but is neither confirmed nor excluded by direct experiment.”

Fig. 3 Fritz Pohl first invokes the existence of a transition of right-handed DNA to an alternative left-handed conformation at high salt concentration (Figure adapted from Ref. [6]). Left: difference ORD (25°–95°) of T4 bacteriophage DNA as a function of salt concentration. Right: model for strand exchange between two DNA molecules in a segment bridging left-handed (L) and right-handed (R) helical regions; a junctional structure was also incorporated

13 It is not possible to interpret Fig. 3, left, as evidence for a reversal in helical sense because it consisted of a single measurement of a thermal difference spectrum of natural DNA. The signals are more specific for backbone chirality in the vacuum UV (<220 nm) [7, 8], as seen in Fig. 7.

5 Birth of Poly[d(G-C)] in Göttingen

In view of the above and my background in “matters DNA,” Fritz and I engaged in a lively interchange of ideas, which quickly led to a working relationship and an enduring friendship (Fig. 4). The challenge was to extend the suggestive experimental findings of Fig. 3 to better defined DNAs and a protocol in which the features expected of an intramolecular R(ight)–L(eft) transition would be observable and unambiguous: titratability, cooperativity, reversibility, and concentration independence. I had come to Göttingen with a rich assortment of synthetic DNAs, which constituted attractive samples because of their sequence uniformity, especially in the case of self-complementary dinucleotide sequences such as poly[d(A-T)]. This DNA, however, did not exhibit a perceptible transition between distinct ORD/CD spectra in high salt. It was/is also of low helical stability and capable of adopting alternative topological states such as hairpins and cruciforms. The obvious alternative to poly[d(A-T)] was poly[d(G-C)], expected to have much higher inherent stability as a double helix. Unfortunately there was a fundamental problem with this choice: poly[d(G-C)] had not been reported in the literature and was thus not available, either from research labs or commercially. The reason was that in contrast to poly[d(A-T)], neither poly[d(G-C)] nor poly[d(I-C)] had arisen spontaneously as a product of de novo (i.e., template-

Fig. 4 The author (left) and Fritz Pohl (right) at the MPIpc in 1967. Note the chiral positions. I will not reveal whether Fritz or I (or both) favored the left orientation


independent) reactions of the known DNA polymerases with dGTP (or the alternative dITP) + dCTP.

Being the birthplace after WWII of the Max Planck Society (the renaissance of the former Kaiser Wilhelm Society), Göttingen also housed a sister Max Planck Institute for Experimental Medicine (MPIem). Under the leadership of Fritz Cramer, the MPIem enjoyed international recognition in the field of nucleic acid chemistry and biochemistry, especially of RNAs. Eigen and Cramer organized periodic molecular biology symposia, attracting the luminaries in structural and molecular biology of the time to Göttingen from institutions such as the Laboratory of Molecular Biology (LMB) in Cambridge, England; the Institut Pasteur in Paris; the Weizmann Institute in Israel; and many other European, Asian, and US addresses. In the MPIem were two gifted, productive chemists sharing a passion for the element sulfur as a replacement for oxygen in the bases and sugar–phosphate backbone of nucleic acids. Karl-Heinz Scheit introduced the thioketo substitution into thymine and demonstrated that ds4TTP could function as a substrate in the enzymatic synthesis of DNA, e.g., poly[d(A-s4T)] [9]. At the same time, Fritz Eckstein created the chiral phosphorothioate (PS) modification of the DNA backbone, substituting sulfur for one (or both) of the two nonbonding oxygens of the phosphate group [10, 11].14 Both individuals and both of their innovations would play a very significant role in the work that Fritz Pohl and I would undertake with the “R–L transition” of poly[d(G-C)].

Despite certain misgivings15 the quest for a way to synthesize poly[d(G-C)] continued. During the late 1960s, the Kornberg and Khorana labs, with Robert Wells (“Bob”) as a chief protagonist, described in numerous publications the use of chemically synthesized deoxyribopolynucleotides with repeating short nucleotide sequences as templates for bacterial DNA polymerases. Unfortunately, in my hands, the d(G-C)4 oligonucleotide described earlier was inactive as a template. However, the synthesis of poly[d(G-T)·d(C-A)] and its separation into individual strands in alkaline CsCl

14 The PS substitution in oligonucleotides and polynucleotides renders them generally resistant to enzymatic degradation by nucleases. This and other properties have led to its widespread application in basic and applied chemistry and in biomedicine [12]. A recent finding is that PS occurs in nature (bacteria, archaea, etc.) and is found in the human microbiome [13].
15 My search in 1967 for a strategy enabling the synthesis of poly[d(G-C)] was not straightforward. The reader may choose not to believe me, yet I can reveal that an important consideration was whether to attempt a synthesis at all. My (in retrospect naive) hesitation was based on the notion that poly[d(G-C)] might have extraordinary stability and other properties enabling it to irreversibly “take over” the “world” of DNA, first in the test tube, but more generally. Similar considerations had arisen with respect to “polywater,” a (postulated) polymerized form of liquid water that Russian scientists had reported in the late 1960s and which was only debunked in 1973 (some research on polywater had actually been initiated at the MPIpc). The fear had been that polywater would autocatalytically convert and thereby “inactivate” the world’s supply of liquid water. In addition, although the Asilomar Conference on Recombinant DNA convoked by Paul Berg would not take place until 1975, the potential biohazards of biotechnology were already under discussion.


Fig. 5 Strategy (unpublished) for synthesizing poly[d(G-C)] in 1967; see text for more details

gradients had been reported in 1965. We (Karl-Heinz Scheit and I) exploited this information and devised a rather elaborate scheme to synthesize poly[d(G-C)] by using poly[d(G-T)·d(C-A)] as a template and replacing dTTP with ds4TTP as an initial step (Fig. 5). The first reactions (in November 1967) went well, and the product, poly[d(s4T-G)·d(A-C)], was subjected to reductive amination of the thioketo group to an amino group, resulting in the conversion of s4T to m5C. Strand separation in CsCl yielded the desired poly[d(G-m5C)]. This DNA then served as a template, albeit a poor one,16 for the synthesis of poly[d(G-C)], which after expansion (Fig. 5) was used in our first experiments in 1968 demonstrating the R–L transition. Unbeknownst to us, Bob Wells and his colleagues were also after poly[d(G-C)] at that time, and in 1972 they published its synthesis and characterization; poly[d(I-C)] was also featured [16]. Bob generously supplied us with these materials, I believe in 1970 ± 1, for use as templates and in comparison experiments. The new polynucleotides also became commercially available. It had turned out that poly[d(G-C)], after all, was not a biohazard.

16 It was of course not known to us in 1967 that poly[d(G-m5C)] undergoes the R–L transition at all and even less that it does so with greater facility than poly[d(G-C)] [14]. The interesting question arises as to why this DNA served us as a template for further poly[d(G-C)] synthesis (Fig. 5), inasmuch as the midpoint of the B–Z transition is at 0.6 mM MgCl2 compared to the much higher 6.7 mM of the enzymatic reaction, implying that the template should have been predominantly in the Z form and presumably inactive. It was finally reported in 1987 that poly[d(G-m5C)] is indeed a progressively poorer template as [Mg2+] is increased [15].

6 Salt-Dependent “R–L Transition” of Poly[d(G-C)]

In 1968 and 1969, the experiments with the new poly[d(G-C)] were yielding interesting results, and their interpretation was facilitated by parallel studies of the equilibria and kinetics of nucleic acid helix–coil transitions by Dietmar Pörschke, Manfred’s PhD student, as well as by the labs of institute “alumni” Buzz Baldwin and Don Crothers. We described the work at meetings and seminars. Our presentation at the winter’s meeting of the German Society of Biological Chemistry in 1970 was entitled (my translation from the German) “Kinetics of an ionic strength-dependent structural transition of synthetic DNA.” The published abstract [17] described a reversible, cooperative inversion of a Cotton effect (280, 300 nm) as the concentration of salt (NaCl, NH4Cl, NaClO4) was increased to 2–3 M. The reaction was temperature independent over 20–40 °C, occurred at neutral pH, and was first order with a time constant of $10^2$–$10^3$ s. The large activation energy of both the forward ($k_f$) and reverse ($k_b$) rate constants was indicative of a concerted participation of several bases; $k_f$ was independent of the concentration in contrast to $k_b$, which was highly dependent. The reaction was interpreted as an all-or-none interconversion between a right-handed double helix (R) and a left-handed double helix (L):

$$\mathrm{R} \underset{k_b}{\overset{k_f}{\rightleftharpoons}} \mathrm{L}$$
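To illustrate what a first-order, all-or-none scheme of this kind implies, the following is a minimal sketch of the standard two-state relaxation: the observed time constant is 1/(kf + kb), and the equilibrium fraction in the L form is kf/(kf + kb). The function names and the numerical rate constants are hypothetical, chosen only so that the time constant falls within the 10^2–10^3 s range quoted above; they are not values from the abstract.

```python
import math


def relaxation_time(kf: float, kb: float) -> float:
    """Time constant (s) of a reversible first-order reaction R <-> L."""
    return 1.0 / (kf + kb)


def fraction_left(t: float, kf: float, kb: float, f0: float = 0.0) -> float:
    """Fraction of helices in the L form at time t (s), starting from f0."""
    f_eq = kf / (kf + kb)  # equilibrium fraction of L
    return f_eq + (f0 - f_eq) * math.exp(-(kf + kb) * t)


if __name__ == "__main__":
    # Hypothetical rate constants (1/s), for illustration only.
    kf, kb = 4e-3, 1e-3
    print(f"time constant: {relaxation_time(kf, kb):.0f} s")
    for t in (0, 100, 200, 600, 2000):
        print(f"t = {t:5d} s  fraction L = {fraction_left(t, kf, kb):.3f}")
```

In this picture a single exponential describes the approach to equilibrium from either direction, which is one way the reversibility and first-order character reported in the abstract could be checked against time-course data.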

This abstract, and not our universally cited paper in 1972, was the first publication asserting the existence of a left-handed double-helical conformation of DNA, interconvertible in solution with right-handed Watson–Crick B-DNA. In 1971, during a sabbatical at the University of Bristol in England, Fritz made a presentation at the first European Biophysics Congress, “Isomerization of a double-stranded DNA,” which was then published [18]. The abstract stated:

The observations suggest a delicate energetic balance governing different conformations of double-stranded nucleic acids in solution which is influenced by the base sequence and the interactions with other molecules or ions. A possible cation binding site for a L-form of poly d(pur-pur) is proposed.

The related polymers poly[d(I-C)] and poly[d(G)]·poly[d(C)] did not undergo the transition. Fritz proposed that the O2 oxygens of cytosines of adjacent base pairs and two oxygens of the corresponding 3′ phosphate groups acted as equatorial ligands for a cation (Na+). However, the model did not seem to fit the left-handed double helix proposed by Bob Wells and colleagues for poly[d(I-C)], and deduced from fiber diffraction data and CD [19].

7 Birth of the Max Planck Institute for Biophysical Chemistry (1971)

I returned from a visit to Stanford, in April, 1971, accompanied by Donna Arndt, until then a postdoc in Paul Berg’s lab working on features of protein translation and SV40 cell biology. I had somehow persuaded her to join me in marriage and to further pursue her scientific career in Germany. One important selling point was the promise of life in a fourteenth-century castle, Schloss Berlepsch17 (Fig. 6). Another selling point for a career/life in Göttingen was the conception, construction, and inauguration (in 1971) of a flamboyant new Max Planck Institute for Biophysical Chemistry (MPIbpc) (Fig. 2), the realization of Manfred’s concept and dream to merge the disciplines of chemistry, physics, and biology for exploring the fundamental principles of life forms and their evolution. He somehow felt that I would fit into this scheme and asked me in 1968 whether I would consider joining the new institute (but first the MPIpc) as a Scientific Member of the Max Planck Society and Director of a new Department. Despite some hesitation (e.g., I would have to give up my childhood dream of

Fig. 6 Schloss Berlepsch. See footnote 17 for details

17 The Schloss (castle) was/is surrounded by extensive forest and agricultural holdings. Our apartment was equipped with a canopy bed and a piano and our monthly rent was about $80. Families of scientists working at the MPIpc were able to live in the castle as a benefit of Manfred’s friendship with the hereditary owner, the Graf (Count) von Berlepsch. Fritz, his wife Edda, and their three children (Fritz, Peter, and Thomas) lived there until 1970, in the apartment above ours.


being at MIT and would have to deal with German bureaucracy without mastery of the language), I accepted the offer to be considered for the position. The rather elaborate election/appointment procedures culminated in 1969 with the creation of a Department of Molecular Biology. It endured until my reaching emeritus status in 2007 and continued afterward as an Emeritus Laboratory of Cellular Dynamics. Setting up the new labs and a research program in the new institute was a challenge, yet Donna and I managed, traversing the 30 km between the Schloss and the Institute back and forth every day.

But in 1971, there was also scientific business to transact at the Schloss. Fritz returned from his sabbatical sometime in 1971 and joined our Department. It was partly in the spacious library of the castle (Fig. 6), surrounded by numerous incunabula and sometimes under candlelight, that he and I hammered out the first, lengthy paper on the R–L transition. The adjoining hall of armored knights provided additional inspiration. The title of our definitive draft went something like “The salt-dependent transition of poly[d(G-C)] from a right-handed to a left-handed double helix.” The text included numerous references and discussions of contemporary proposals for potentially left-handed as well as right-handed structures based on fiber diffraction data; the DNA alphabet soup (A, B, C, D, E, etc.) was already extensive yet still growing. We asked Manfred for his appraisal of the paper, his criticism and advice. While he was (and had been) very positive about the experiments and results, he recommended against placing emphasis on a putative but unknown left-handed structure we were assigning to the high-salt conformation. True enough, Fritz and I had no proof, but it was our call, our decision to make. This we did and submitted the manuscript to the Journal of Molecular Biology in October, 1971; the publication appeared in 1972 [20]. In it, the word “left” was absent from the title and occurred only twice in the text.18 In retrospect, the decision to back off from what we considered to be a plausible, defensible interpretation of the data was wrong, injudicious. Had we stuck to our guns, the field of “left-handed DNA” might have advanced more rapidly. The most imposing and widely reproduced figure in the paper is that of the “inverted CD” spectrum near 290 nm of the “L form” of poly[d(G-C)] in high salt (Fig. 7, left). This has become the CD

18 Yet we used the symbols “L” and “R” throughout and it took little imagination to deduce what they represented. The single letter designations of DNA helical forms are subject to ambiguity. A somewhat mysterious left-handed underwound “L-DNA” has appeared in torsion measurements of single molecules [21], and the same designation has been applied to the L-enantiomers (mirror images) of B-DNA [22]. The term Z-DNA has been subverted as well [23].


Fig. 7 CD spectra of poly[d(G-C)] in the B, L(Z), and A forms. Left: B form in 0.2 M NaCl, pH 7.2, 25 °C; Z form after addition of NaCl to 3.9 M. (Adapted from Fig. 2 of Ref. [20]). Right: B form in 0.01 M Na phosphate, pH 7, 22 °C; Z form after addition of 2 M NaClO4; A form in 80% trifluoroethanol, 0.67 mM Na phosphate, pH 7. (Adapted from Fig. 1 of Ref. [8])

signature of Z-DNA, although Fig. 7, right, of a spectrum reported in 1985 by the Tinoco group [8], revealed the much larger (~15x) differential CD signal of B-DNA and Z-DNA (difference between their respective maximal absolute values) in the vacuum UV at 5000 148

Fig. 2 Synthesis scheme of 8CF3dG phosphoramidites

3.6 In Vitro 19F NMR Study of 8CF3dG Modified Oligonucleotide

1. The trifluoromethyl group can be used as a 19F NMR probe to investigate the B–Z DNA structural transition (Fig. 4a) and DNA–protein interaction (Fig. 5a) in vitro.
2. Typical B–Z structural transition results from 19F NMR spectroscopy are shown in Fig. 4b, c. For the 6-mer DNA, in the absence of NaCl, a single peak at -61.79 ppm was observed for the B-form DNA, whereas the addition of NaCl induced the appearance of a new peak at -61.29 ppm, and the original peak of the B-form DNA (-61.79 ppm) disappeared completely at 100 mM NaCl. Based on the CD results, we assigned the new peak to a Z-form DNA structure.
3. The 8-mer oligonucleotide with two 8CF3dG modifications was also employed to confirm the assignment via 19F NMR spectroscopy (Fig. 4c). In the absence of NaCl, two 19F NMR signals (-61.68 ppm and -61.89 ppm) were observed, which result from the two asymmetric 8CF3dG residues at different positions within the 8-mer DNA sequence. With the addition of NaCl to the DNA solution, the two B-DNA peaks decreased markedly and then disappeared completely, and two new strong-intensity peaks (-61.36 ppm and -61.42 ppm) appeared for Z-DNA.


Hong-Liang Bao and Yan Xu

Fig. 3 CD spectra of natural and 8CF3dG labeled oligonucleotide DNA. (a, c) 8CF3dG labeled sequences d(CGC8CF3GCG)2 and d(C8CF3GCAC8CF3GCG)/d(CGCGTGCG); (b, d) natural DNA sequences d(CGCGCG)2 and d(CGCACGCG)/d(CGCGTGCG) in 1 mM Na-PO4 buffer (pH 7.0), at 283 K. Various NaCl concentrations are indicated. (Reproduced from Bao et al. (2020) with permission from Oxford University Press [25])

4. Using 19F NMR spectroscopy, the interaction of 8CF3dG labeled DNA with Zα protein was further investigated (Fig. 5b, c). For the 6-mer DNA, when 1 equivalent of Zα protein was added to the DNA sample, a new signal at -61.13 ppm was detected in the 19F NMR spectrum (Fig. 5b). The new signal was assigned to the DNA-Zα complex in accordance with previous reports [31]. We note that the original signal of free DNA was still present, which indicates that 1 equivalent of Zα protein is not enough to bind all of the 6-mer DNA. When 2 equivalents of Zα protein were added to the DNA sample, the original peak of the free DNA disappeared completely, and only the signal of the DNA-Zα complex remained. This is consistent with previous studies showing that the Z-DNA sequence can bind two Zα proteins [31].

Oligonucleotide Containing 8-Trifluoromethyl-2′-Deoxyguanosine. . .


Fig. 4 19F NMR experiments for the study of B–Z transition in vitro. (a) Concept for the detection of B–Z transition by 19F NMR. Two 19F resonances of different chemical shifts are expected for B-DNA and Z-DNA. (b) 19F NMR spectra of d(CGC8CF3GCG)2 in 1 mM Na-PO4 buffer (pH 7.0) and various NaCl concentrations. (c) 19F NMR spectra of d(C8CF3GCAC8CF3GCG)/d(CGCGTGCG) in 1 mM Na-PO4 buffer (pH 7.0) and various NaCl concentrations. The 19F NMR spectra were recorded at 50 μM duplex concentration. Red and black spots indicate B-form and Z-form DNA, respectively. (Reproduced from Bao et al. (2020) with permission from Oxford University Press [25])

5. We also studied the binding of the 8-mer duplex DNA to the Zα protein (Fig. 5c). As with the 6-mer DNA duplex, two new peaks appeared after the addition of 1 equivalent of Zα protein, and the original peaks of the free 8-mer duplex DNA disappeared completely after the addition of 2 equivalents of Zα protein. These results indicate that 19F NMR spectroscopy is a useful tool for studying the binding of Z-DNA to protein targets.

3.7 In-Cell 19F NMR Study of 8CF3dG Modified Oligonucleotide

1. First, prepare the DNA sample for in-cell 19F NMR as introduced in Subheading 3.4.2.

2. An in-cell 19F NMR spectroscopy strategy is shown in Fig. 6a: comparing the in-cell 19F NMR spectrum with the in vitro results as a reference can enable a reliable

Fig. 5 19F NMR experiments for the study of DNA–protein interaction. (a) Concept for the detection of DNA–protein interaction by 19F NMR. Two 19F resonances with different chemical shifts are expected for free DNA and the DNA–protein complex. (b) 19F NMR spectra of d(CGC8CF3GCG)2 in 1 mM Na-PO4 buffer (pH 7.0) at different ratios of Zα protein. (c) 19F NMR spectra of d(C8CF3GCAC8CF3GCG)/d(CGCGTGCG) in 1 mM Na-PO4 buffer (pH 7.0) at different ratios of Zα protein. The 19F NMR spectra were recorded at 15 μM duplex concentration. Red and blue spots indicate free DNA and the DNA–protein complex, respectively. (Reproduced from Bao et al. (2020) with permission from Oxford University Press [25])

determination of the intracellular Z-DNA conformation. Figure 6b shows a comparison of the in vitro and in-cell NMR spectra for the 8CF3dG-labeled 6-mer DNA. One signal is observed in the in-cell spectrum (Fig. 6b), and its chemical shift is the same as that of the corresponding Z-DNA observed in the in vitro 19F NMR spectrum.
3. After the in-cell NMR measurement, the supernatant was collected and checked by 19F NMR spectroscopy. Almost no signal was observed in the supernatant (Fig. 6b), indicating that almost all NMR signals were derived from the HeLa cells. We also generated the difference spectrum between the HeLa cells and the supernatant to eliminate the signal from the supernatant. The in-cell 19F NMR data therefore indicate that the 8CF3dG-labeled 6-mer DNA can form a Z-DNA conformation in living

Fig. 6 19F NMR experiments for the study of 8CF3dG-labeled oligonucleotide DNA in HeLa cells. (a) Concept for the detection of the structure of 8CF3dG-labeled oligonucleotide DNA in HeLa cells. Comparison of the in-cell 19F NMR spectrum with the in vitro results as a reference can enable a reliable determination of the intracellular Z-DNA conformation. (b) 19F NMR spectra for CF3dG-labeled d(CGC8CF3GCG)2: in vitro B-DNA, in vitro Z-DNA, in HeLa cells, in the supernatant, and the difference spectrum between HeLa cells and supernatant. (c) 19F NMR spectra for CF3dG-labeled d(C8CF3GCAC8CF3GCG)/d(CGCGTGCG): in vitro B-DNA, in vitro Z-DNA, in HeLa cells, in the supernatant, and the difference spectrum between HeLa cells and supernatant. (Reproduced from Bao et al. (2020) with permission from Oxford University Press [25])

cells. To the best of our knowledge, this is the first time that the Z-DNA conformation in living cells has been directly observed by NMR spectroscopy.
4. An in-cell 19F NMR experiment was also carried out on the 8CF3dG-modified 8-mer DNA sequence d(C8CF3GCAC8CF3GCG)/d(CGCGTGCG). As shown in Fig. 6c, the 8-mer DNA sequence, which requires a high concentration of NaCl to adopt a Z-DNA structure, formed both Z-DNA and B-DNA structures in HeLa cells. This sequence may offer an opportunity to further investigate the B–Z DNA structural transition in living cells.
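The comparison strategy described in this subheading (matching an in-cell peak against in vitro reference shifts, and subtracting the supernatant spectrum) can be sketched in a few lines of Python. This is only an illustrative sketch: the reference shifts are the 6-mer values quoted in Subheading 3.6, while the function names and the 0.05 ppm matching window are hypothetical choices, not part of the published protocol.

```python
# Reference 19F shifts (ppm) for d(CGC8CF3GCG)2 from the in vitro NaCl titration.
REFERENCE_SHIFTS_PPM = {"B-DNA": -61.79, "Z-DNA": -61.29}

# Hypothetical matching window; choose it from your own spectral resolution.
DEFAULT_TOLERANCE_PPM = 0.05


def assign_conformation(observed_ppm, references=REFERENCE_SHIFTS_PPM,
                        tolerance_ppm=DEFAULT_TOLERANCE_PPM):
    """Return the label of the nearest reference shift, or None if no
    reference lies within tolerance_ppm of the observed peak."""
    label, reference_ppm = min(
        references.items(), key=lambda item: abs(item[1] - observed_ppm)
    )
    if abs(reference_ppm - observed_ppm) <= tolerance_ppm:
        return label
    return None


def difference_spectrum(cell, supernatant):
    """Subtract the supernatant spectrum from the cell spectrum point by point.
    Both lists must be sampled on the same chemical-shift axis."""
    if len(cell) != len(supernatant):
        raise ValueError("spectra must have the same number of points")
    return [c - s for c, s in zip(cell, supernatant)]


if __name__ == "__main__":
    print(assign_conformation(-61.29))  # matches the Z-DNA reference
    print(assign_conformation(-61.13))  # no reference within the window
```

A peak that matches neither reference, such as the -61.13 ppm signal of the DNA–Zα complex, is deliberately left unassigned rather than forced onto the closest label.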

4 Notes

1. dA-, dG-, dC-, and dT-CE phosphoramidites and dG- and dC-CPGs were used for DNA synthesis. All phosphoramidites and CPGs were purchased from Glen Research.
2. Thin-layer chromatography was performed using TLC Silica gel 60 F254 (Merck).
3. High-resolution mass spectra (HRMS) and electrospray ionization mass spectra (ESI-MS) were recorded on a Thermo Scientific Q Exactive instrument.
4. 1H, 19F, and 31P NMR spectra were recorded on a Bruker (AV-400 M) magnetic resonance spectrometer. DMSO-d6 and CDCl3 were used as the solvents. 1H chemical shifts (δ) are reported in parts per million (ppm) referenced to the residual protonated solvent peak (DMSO-d6, δ 2.52; CDCl3, δ 7.26). Coupling constants (J values) are given in Hz and are correct to within 0.5 Hz. Signal patterns are indicated as follows: br, broad; s, singlet; d, doublet; t, triplet; m, multiplet.
5. Purification of products was also performed on a medium-pressure liquid chromatography (MPLC) system (EPCLC-AI580S, Yamazen Corporation) equipped with a silica gel column (Hi-Flash Column, Yamazen Corporation).
6. The recycling preparative HPLC system was equipped with a JAIGEL-HR column and run with CHCl3 as the mobile phase at a flow rate of 3 mL/min.
7. Ammonium hydroxide/methylamine (AMA) solution is a 1:1 mixture of ammonium hydroxide solution (28% in water) and methylamine solution (40 wt% in water). The solution should be prepared fresh before use.
8. The Zα protein was prepared according to Ref. [31].
9. A Shigemi 5 mm symmetrical NMR microtube was used for in vitro and in-cell 19F NMR.

Acknowledgments

This work is supported by JSPS KAKENHI (17H03091, 20K15402). Support from the Naito Foundation, Ichiro Kanehara Foundation, and The Yasuda Medical Foundation is also acknowledged.

References

1. Rich A, Nordheim A, Wang AH (1984) The chemistry and biology of left-handed Z-DNA. Annu Rev Biochem 53:791–846
2. Blaho JA, Wells RD (1987) Left-handed Z-DNA binding by the recA protein of Escherichia coli. J Biol Chem 262:6082–6088
3. Kmiec EB, Angelides KJ, Holloman WK (1986) Left-handed DNA and the synaptic pairing reaction promoted by Ustilago rec1 protein. Cell 40:139–145
4. Oh DB, Kim YG, Rich A (2002) Z-DNA-binding proteins can act as potent effectors of gene expression in vivo. Proc Natl Acad Sci U S A 99:16666–16671
5. Champ PC, Maurice S, Vargason JM, Camp T, Ho PS (2004) Distributions of Z-DNA and nuclear factor I in human chromosome 22: a model for coupled transcriptional regulation. Nucleic Acids Res 32:6501–6510
6. Ha SC, Lowenhaupt K, Rich A, Kim YG, Kim KK (2005) Crystal structure of a junction between B-DNA and Z-DNA reveals two extruded bases. Nature 437:1183–1186
7. Kang YM, Bang J, Lee EH, Ahn HC, Seo YJ, Kim KK, Kim YG, Choi BS, Lee JH (2009) NMR spectroscopic elucidation of the B-Z transition of a DNA double helix induced by the Z alpha domain of human ADAR1. J Am Chem Soc 131:11485–11491
8. Zhang Y, Cui Y, An R, Liang X, Li Q, Wang H, Wang H, Fan Y, Dong P, Li J, Cheng K, Wang W, Wang S, Wang G, Xue C, Komiyama M (2019) Topologically constrained formation of stable Z-DNA from normal sequence under physiological conditions. J Am Chem Soc 141:7758–7764
9. Bae S, Kim D, Kim KK, Kim YG, Hohng S (2011) Intrinsic Z-DNA is stabilized by the conformational selection mechanism of Z-DNA-binding proteins. J Am Chem Soc 133:668–671
10. Lee AR, Park CJ, Cheong HK, Ryu KS, Park JW, Kwon MY, Lee J, Kim KK, Choi BS, Lee JH (2016) Solution structure of the Z-DNA binding domain of PKR-like protein kinase from Carassius auratus and quantitative analyses of the intermediate complex during B-Z transition. Nucleic Acids Res 44:2936–2948
11. Kim SH, Lim SH, Lee AR, Kwon DH, Song HK, Lee JH, Cho M, Johner A, Lee NK, Hong SC (2018) Unveiling the pathway to Z-DNA in the protein-induced B-Z transition. Nucleic Acids Res 46:4129–4137
12. Brown BA, Lowenhaupt K, Wilbert CM, Hanlon EB, Rich A (2000) The Zα domain of the editing enzyme dsRNA adenosine deaminase binds left-handed Z-RNA as well as Z-DNA. Proc Natl Acad Sci U S A 97:13532–13536
13. Subramani VK, Kim D, Yun K, Kim KK (2016) Structural and functional studies of a large winged Z-DNA-binding domain of Danio rerio protein kinase PKZ. FEBS Lett 590:2275–2285
14. de Rosa M, Zacarias S, Athanasiadis A (2013) Structural basis for Z-DNA binding and stabilization by the zebrafish Z-DNA dependent protein kinase PKZ. Nucleic Acids Res 41:9924–9933
15. Kim K, Khayrutdinov BI, Lee CK, Cheong HK, Kang SW, Park H, Lee S, Kim YG, Jee J, Rich A, Kim KK, Jeon YH (2011) Solution structure of the Zβ domain of human DNA-dependent activator of IFN-regulatory factors and its binding modes to B- and Z-DNAs. Proc Natl Acad Sci U S A 108:6921–6926
16. Ha SC, Kim D, Hwang HY, Rich A, Kim YG, Kim KK (2008) The crystal structure of the second Z-DNA binding domain of human DAI (ZBP1) in complex with Z-DNA reveals an unusual binding mode to Z-DNA. Proc Natl Acad Sci U S A 105:20671–20676
17. Kim D, Hur J, Park K, Bae S, Shin D, Ha SC, Hwang HY, Hohng S, Lee JH, Lee S et al (2014) Distinct Z-DNA binding mode of a PKR-like protein kinase containing a Z-DNA binding domain (PKZ). Nucleic Acids Res 42:5937–5948
18. Herbert A (2019) Z-DNA and Z-RNA in human disease. Commun Biol 2:7
19. Ravichandran S, Subramani VK, Kim KK (2019) Z-DNA in the genome: from structure to disease. Biophys Rev 11:383–387
20. Jiao H, Wachsmuth L, Kumari S, Schwarzer R, Lin J, Eren RO, Fisher A, Lane R, Young GR, Kassiotis G et al (2020) Z-nucleic-acid sensing triggers ZBP1-dependent necroptosis and inflammation. Nature 580:391–395
21. Zhang T, Yin C, Boyd DF, Quarato G, Ingram JP, Shubina M, Ragan KB, Ishizuka T, Crawford JC, Tummers B et al (2020) Influenza virus Z-RNAs induce ZBP1-mediated necroptosis. Cell 180:1115–1129
22. Vongsutilers V, Gannett PM (2018) C8-Guanine modifications: effect on Z-DNA formation and its role in cancer. Org Biomol Chem 16:2198–2209
23. Balasubramaniyam T, Ishizuka T, Xiao CD, Bao HL, Xu Y (2019) 2′-O-Methyl-8-methylguanosine as a Z-form RNA stabilizer for structural and functional study of Z-RNA. Molecules 23:2572
24. Xu Y, Ikeda R, Sugiyama H (2003) 8-Methylguanosine: a powerful Z-DNA stabilizer. J Am Chem Soc 125:13519–13524
25. Bao HL, Masuzawa T, Oyoshi T, Xu Y (2020) Oligonucleotides DNA containing 8-trifluoromethyl-2′-deoxyguanosine for observing Z-DNA structure. Nucleic Acids Res 48:7041–7051
26. Bao HL, Xu Y (2020) Telomeric DNA-RNA-hybrid G-quadruplex exists in environmental conditions of HeLa cells. Chem Commun 56:6547–6550
27. Bao HL, Ishizuka T, Iwanami A, Oyoshi T, Xu Y (2017) A simple and sensitive 19F NMR approach for studying the interaction of RNA G-quadruplex with ligand molecule and protein. ChemistrySelect 2:4170–4175
28. Bao HL, Xu Y (2018) Investigation of higher-order RNA G-quadruplex structures in vitro and in living cells by 19F NMR spectroscopy. Nat Protoc 13:652–665
29. Bao HL, Ishizuka T, Sakamoto T, Fujimoto K, Uechi T, Kenmochi N, Xu Y (2017) Characterization of human telomere RNA G-quadruplex structures in vitro and in living cells using 19F NMR spectroscopy. Nucleic Acids Res 45:5501–5511
30. Bao HL, Liu HS, Xu Y (2019) Hybrid-type and two-tetrad antiparallel telomere DNA G-quadruplex structures in living human cells. Nucleic Acids Res 47:4940–4947
31. Oyoshi T, Kawai K, Sugiyama H (2003) Efficient C2′ alpha-hydroxylation of deoxyribose in protein-induced Z-form DNA. J Am Chem Soc 125:1526–1531

Chapter 9

Chiroptical Properties of Z-DNA Using Ionic Porphyrins and Metalloporphyrins

Alessandro D’Urso

Abstract

The non-covalent interaction of achiral porphyrins with nucleic acids has been extensively studied, and various macrocycles have been used as reporters of different DNA base sequences. Nevertheless, few studies have been published on the ability of these macrocycles to discriminate among the various nucleic acid conformations. Circular dichroism spectroscopy has made it possible to characterize the binding of several cationic and anionic mesoporphyrins and their metallo derivatives to Z-DNA, in order to exploit these systems as probes, storage systems, and logic gates.

Key words Porphyrin, Circular dichroism, Induced chirality, Supramolecular chemistry, Z-DNA, Molecular switch

1 Introduction

The left-handed double helix Z-DNA is a high-energy conformation under physiological conditions [1]. The biological role of Z-DNA is still unknown, since it is less common than the right-handed B-DNA [2–4]; recently, however, certain classes of proteins have been shown to bind tightly and specifically to the Z-conformation [5–8]. The two conformations have significant structural differences that can be revealed by circular dichroism (CD) spectroscopy. Indeed, the opposite handedness of B- and Z-DNA results in distinct CD spectra below 300 nm (Fig. 1). For chromophores lacking roto-reflection symmetry elements, CD is a very powerful diagnostic technique for detecting the optical activity generated within a specific electronic transition. When electronic communication between two or more chromophores occurs, the CD signal can be very informative in both intensity and shape [9]. In the case of polynucleotides, which can adopt a wide range of secondary structures, the reciprocal interactions of the transition dipole moments of the bases change with the

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_9, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023

Fig. 1 CD spectra of B- (dashed curve) and Z-DNA (solid curve)

structure and modulate both the shape and the intensity of the CD signal, which becomes highly diagnostic of the specific arrangement. Notably, the CD signal of a double-stranded DNA helix depends on the number of base pairs per turn, the inclination and distance of the bases with respect to the helix axis, the rise per base pair, and the handedness of the helix, so CD spectroscopy can be used extensively to study the chiroptical properties of nucleic acid constructs [10]. In vivo, however, this technique suffers from the simultaneous presence of different DNA conformations and/or other biomolecules such as proteins, which make the region below 300 nm of the CD spectrum difficult to analyze. Detecting unusual DNA conformations such as Z-DNA therefore requires chiroptical probes that discriminate between the Z-conformation and other structures and that absorb above 300 nm, a region free from interference, to provide characteristic induced circular dichroism (ICD) signals. An ICD signal stems from the interaction of small achiral molecules with a chiral template such as DNA and can be detected in the absorption region of the achiral ligand. Notably, the shape and intensity of the ICD signal are diagnostic of the type of interaction (e.g., external binding, intercalation, or aggregation). With this aim, porphyrinoids have been tested with DNA owing to their peculiar characteristics: (1) the high molar absorptivity of the main absorption band (called the Soret band), owing to the large aromatic structure, (2) the possibility of tuning the electronic properties of the molecules by introducing small variations in the macrocycle ring or metal ions in the core, (3) an absorption spectrum in the 360–750 nm range, far

Porphyrins as Chiroptical Probe for Z-DNA

from spectroscopic interference of nucleic acid absorption, and (4) the ability to act as photosensitizers in the presence of oxygen. In this chapter, I describe how to prepare samples for the CD characterization of porphyrin–Z-DNA systems working as probes, storage systems, and logic gates.

2 Materials

All solutions were prepared by dissolving the solid reagents in ultrapure water with a resistivity of 18 MΩ·cm at 25 °C.

Cacodylate Buffer
1 mM Na–cacodylate at pH = 7.0. To prepare 50 mL of cacodylate buffer, dissolve the required amount of Na–cacodylate salt in water, mix, and adjust the pH with HCl (see Note 1). Store the buffer solution at 4 °C.

Sodium Chloride
3 M NaCl, prepared by dissolving the required amount of solid NaCl in water to the desired volume (see Note 2). Store the solution at 4 °C.

Z-Inducers
Solutions of spermine and NiCl2 are prepared by dissolving the solid in buffer and are then stored at 4 °C (see Note 3).

Oligonucleotide Sequences
ODN stock solutions were prepared in Na–cacodylate buffer, annealed at 80 °C for 20 min, cooled at 1 °C/min, and kept at 4 °C. The concentration of the DNA stock solutions (ranging from 1 × 10⁻² M to 5 × 10⁻² M) was quantified by UV–vis absorption spectroscopy (see Note 4).

Porphyrins
Porphyrin stock solutions are prepared by dissolving a small amount of solid in water. The concentration of the porphyrin stock solutions, ranging from 2 × 10⁻⁴ M to 8 × 10⁻⁴ M (see Note 5), was quantified by UV–vis absorption spectroscopy (see Note 6).

meso-Tetrakis(N-methylpyridinium-4-yl)porphyrin (H2T4); ε = 2.61 × 10⁵ M⁻¹ cm⁻¹ at 422 nm [11].
Nickel(II) meso-tetrakis(N-methylpyridinium-4-yl)porphyrin (NiT4); ε = 1.49 × 10⁵ M⁻¹ cm⁻¹ at 420 nm [12].
Zinc(II) meso-tetrakis(N-methylpyridinium-4-yl)porphyrin (ZnT4); ε = 2.04 × 10⁵ M⁻¹ cm⁻¹ at 437 nm [12].
meso-Tetrakis(4-sulfonatophenyl)porphyrin (H2TPPS); ε = 4.8 × 10⁵ M⁻¹ cm⁻¹ at 413 nm [11].
Nickel(II) meso-tetrakis(4-sulfonatophenyl)porphyrin (NiTPPS); ε = 2.7 × 10⁵ M⁻¹ cm⁻¹ at 409 nm [12].
Zinc(II) meso-tetrakis(4-sulfonatophenyl)porphyrin (ZnTPPS); ε = 3.7 × 10⁵ M⁻¹ cm⁻¹ at 422 nm [12].
Zinc(II) meso-tetrakis(4-carboxysperminephenyl)porphyrin (ZnTCPPSpm4); MW = 1589.95.
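Quantifying a porphyrin stock by UV–vis absorption amounts to applying the Beer–Lambert law, c = A / (ε · l). The following Python sketch shows the arithmetic under stated assumptions: the extinction coefficients are those listed above, read as multiples of 10⁵ M⁻¹ cm⁻¹; the function name, the dilution-factor argument, and the example absorbance are hypothetical and only illustrate the calculation.

```python
# Soret-band wavelength (nm) and molar extinction coefficient (M^-1 cm^-1)
# for each porphyrin, as listed in Materials.
EXTINCTION = {
    "H2T4": (422, 2.61e5),
    "NiT4": (420, 1.49e5),
    "ZnT4": (437, 2.04e5),
    "H2TPPS": (413, 4.8e5),
    "NiTPPS": (409, 2.7e5),
    "ZnTPPS": (422, 3.7e5),
}


def stock_concentration(porphyrin, absorbance, dilution_factor=1.0, path_cm=1.0):
    """Beer-Lambert estimate of the stock concentration in mol/L.

    absorbance is read at the Soret maximum of the (possibly diluted) sample;
    dilution_factor is the factor by which the stock was diluted before reading.
    """
    _wavelength_nm, epsilon = EXTINCTION[porphyrin]
    return absorbance * dilution_factor / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical reading: A = 0.522 at 422 nm for H2T4 after a 100-fold dilution.
    print(f"{stock_concentration('H2T4', 0.522, dilution_factor=100):.2e} M")
```

Keeping the wavelength next to each coefficient is a small safeguard against pairing an absorbance with the wrong ε, which is the mistake Note 6 warns about.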

3 Methods

3.1 Interaction with Cationic Porphyrins (ZnT4, H2T4, NiT4)

1. Prepare the sample solution directly in a 1 cm path length quartz cuvette.
2. Add 2.5 mL of Na–cacodylate buffer solution (1 mM, pH = 7) to the cuvette; add 8.3 μL of 3 M NaCl to reach a concentration of 10 mM.
3. Set the circular dichroism (CD) instrument as follows: scanning rate 50 nm/min, data pitch 0.2 nm, digital integration time (DIT) 2 s, and bandwidth 1.0 nm. Average at least three scans for each CD spectrum. Record a CD spectrum from 195 nm to 550 nm as the blank to be subtracted from the subsequent spectra.
4. Add poly(dG-dC)2 to the solution in the cuvette to reach a concentration of 50 μM.
5. Record the CD spectrum in the range 220–350 nm to check the shape and intensity of the poly(dG-dC)2 spectrum.
6. To induce the B to Z transition, add spermine to a final concentration of 12 μM in the cuvette, in three additions: first add spermine to 6 μM and raise the temperature to 60 °C for 10 min. Check by circular dichroism whether the B to Z transition is occurring; then add more spermine to reach 9 μM. Finally, cool to 25 °C, make the last addition of spermine (a further 3 μM), and record the CD spectrum in the range 195–550 nm (see Note 7).
7. Finally, add increasing amounts of porphyrin stepwise (from 1 μM to 10 μM) to the cuvette solution, and record the CD spectrum after each addition (Fig. 2) (see Note 8).

The same procedure can be followed with anionic porphyrins (H2TPPS, ZnTPPS, NiTPPS) to study the chiroptical properties of Z-DNA [13].
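The NaCl addition in step 2 is a simple dilution, and it can be worth checking the arithmetic before pipetting. The Python sketch below uses only the volumes and concentrations given in the step; the function names are hypothetical, and ideal mixing of volumes is assumed.

```python
def final_concentration(stock_molar, added_ul, initial_ul):
    """Concentration (mol/L) after adding added_ul of stock to initial_ul of
    solvent, assuming the volumes are additive."""
    return stock_molar * added_ul / (initial_ul + added_ul)


def volume_to_add_ul(stock_molar, target_molar, initial_ul):
    """Stock volume (uL) that gives target_molar after mixing.

    Solves C_stock * V = C_target * (V_initial + V) for V.
    """
    if target_molar >= stock_molar:
        raise ValueError("target must be lower than the stock concentration")
    return target_molar * initial_ul / (stock_molar - target_molar)


if __name__ == "__main__":
    # 8.3 uL of 3 M NaCl into 2.5 mL of buffer, as in step 2.
    print(f"{final_concentration(3.0, 8.3, 2500.0) * 1e3:.2f} mM")
    # Volume that would give exactly 10 mM.
    print(f"{volume_to_add_ul(3.0, 0.010, 2500.0):.2f} uL")
```

With these numbers, the 8.3 μL specified in the protocol gives about 9.9 mM NaCl, which is the nominal 10 mM within ordinary pipetting error.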

3.2 Z-DNA/NiTPPS System as Supramolecular Device

1. Prepare the blank solution by mixing 2.5 mL of Na–cacodylate buffer (1 mM, pH = 7) and 8.3 μL of 3 M NaCl, to reach a concentration of 10 mM, in a 1 cm path length quartz cuvette.
2. Set the CD instrument as follows: scanning rate 50 nm/min, data pitch 0.2 nm, digital integration time (DIT) 2 s, and

Fig. 2 CD spectra of H2T4, ZnT4, and NiT4 in the presence of Z-DNA

bandwidth 1.0 nm. Average at least three scans for each CD spectrum. Record a CD spectrum from 195 nm to 550 nm as the blank to be subtracted from the subsequent spectra.
3. Add poly(dG-dC)2 to the solution in the cuvette to reach an ODN concentration of 50 μM, and record the CD spectrum in the range 220–350 nm to check the shape and intensity of the poly(dG-dC)2 CD signal.
4. Induce the B to Z transition by adding spermine to the DNA solution in the cuvette to a final concentration of 14 μM, in three additions: first add spermine to 6 μM and raise the temperature to 60 °C for 10 min. Check by circular dichroism whether the B to Z transition is occurring; then add more spermine to reach 9 μM. Finally, cool to 25 °C, make the last addition of spermine (a further 5 μM), and record the CD spectrum in the range 195–550 nm.
5. Add NiTPPS to a concentration of 4 μM and record the CD spectrum in the range 195–550 nm after 10 min (Fig. 3).
6. With the cuvette in the sample holder of the CD instrument, which is equipped with a Peltier temperature controller, raise the temperature to 48 °C (see Note 9).
7. Raise the pH from 7 to 9.5 by adding a small amount of 1 M NaOH and record the CD spectrum after 10 min (Fig. 3).
8. Lower the pH to 8.5 by adding a few drops of 1 M HCl and record the spectrum after 5 min.
9. Lower the pH further to 7 by adding a few drops of 1 M HCl and record the spectrum after 10 min (Fig. 3).
10. Steps 7–9 can be repeated several times while keeping the temperature at 48 °C (see Note 10).

Fig. 3 Schematic representation of AND logic gate truth table and variation of ICD signal

3.3 Short Z-DNA Sequences

1. Add 2.5 mL of Na–cacodylate buffer (1 mM, pH = 7) to a 1 cm path length quartz cuvette, followed by 8.3 μL of 3 M NaCl to reach a concentration of 10 mM.
2. Set the CD instrument as follows: scanning rate 50 nm/min, data pitch 0.2 nm, digital integration time (DIT) 2 s, and bandwidth 1.0 nm. Average at least three scans for each CD spectrum. Record a CD spectrum from 195 nm to 550 nm as the blank to be subtracted from the subsequent spectra.
3. Add the 48-mer 5′-(dG-dC)24 to reach a concentration of 80 μM, and check the CD signal of 5′-(dG-dC)24.
4. Induce the Z-form of the 48-mer 5′-(dG-dC)24 at room temperature by adding NiCl2 to a concentration of 0.5 mM in the cuvette (see Note 11).
5. Record the CD spectrum from 195 nm to 550 nm.
6. Finally, add tetracationic Zn(II) meso-tetrakis(N-methylpyridinium-4-yl)porphyrin (ZnT4) stepwise (from 1 μM to 10 μM), and record the CD spectrum after each addition of ZnT4.

The same procedure can be followed with an anionic porphyrin (NiTPPS) to study the chiroptical properties of Z-DNA [14]. In that case, however, 50 mM NiCl2 must be added at step 4 (see Note 11).
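For the stepwise porphyrin titration, the volume of stock to pipette at each point and the resulting dilution of the DNA (the concern raised in Note 5) can be planned in advance. The Python sketch below is illustrative only: the 5 × 10⁻⁴ M stock concentration is a hypothetical value within the range given in Materials, the starting volume is taken as 2.5 mL (ignoring the small volumes of NaCl, DNA, and inducer already added), and the function names are not part of the protocol.

```python
def titration_plan(stock_molar, targets_molar, initial_ul):
    """Return (target, volume to add at this step, cumulative volume) tuples.

    Cumulative volume V solves C_stock * V = C_target * (V_initial + V),
    so each target accounts for the growing sample volume.
    """
    plan = []
    added_so_far = 0.0
    for target in targets_molar:
        if target >= stock_molar:
            raise ValueError("target must be lower than the stock concentration")
        cumulative = target * initial_ul / (stock_molar - target)
        plan.append((target, cumulative - added_so_far, cumulative))
        added_so_far = cumulative
    return plan


def dilution_factor(initial_ul, added_ul):
    """Fraction of the original DNA concentration left after the additions."""
    return initial_ul / (initial_ul + added_ul)


if __name__ == "__main__":
    targets = [1e-6 * k for k in range(1, 11)]  # 1 to 10 uM in 1 uM steps
    plan = titration_plan(5e-4, targets, 2500.0)
    for target, step_ul, total_ul in plan:
        print(f"{target * 1e6:4.0f} uM: add {step_ul:5.2f} uL (total {total_ul:5.2f} uL)")
    print(f"DNA retained: {dilution_factor(2500.0, plan[-1][2]):.3f}")
```

Under these assumptions, reaching 10 μM porphyrin dilutes the DNA by about 2%; a more dilute stock would increase that figure, which is why the stock should not be too dilute.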

3.4 BZB Sequences

1. Prepare the blank solution directly in a 1 cm path length quartz cuvette by adding 2.5 mL of 1 mM Na–cacodylate buffer and 8.3 μL of 3 M NaCl to reach a concentration of 10 mM.

2. Set the CD instrument as follows: scanning rate 50 nm/min, data pitch 0.2 nm, digital integration time (DIT) 2 s, and bandwidth 1.0 nm. Average at least three scans for each CD spectrum. Record a CD spectrum from 195 nm to 550 nm as the blank to be subtracted from the subsequent spectra.
3. Add the synthetic self-complementary 42-mer sequence BZB-I (see Note 12) to reach a concentration of 100 μM, and check the CD signal in the region between 220 nm and 350 nm.
4. Add NaCl at room temperature to reach 0.1 M in the cuvette, in order to induce the B to Z transition of the central portion of the sequence (see Note 13).
5. Record the CD spectrum from 195 nm to 550 nm.
6. Finally, add 4 μM ZnT4 to the cuvette and record the CD spectrum from 195 nm to 550 nm with ten scans.

3.5 Spermine Porphyrin Conjugate (ZnTCPPSpm4)

1. Prepare the blank solution directly in a 1 cm path length quartz cuvette by adding 2.5 mL of 5 mM Na–cacodylate buffer and 8.3 μL of 3 M NaCl to reach a concentration of 10 mM.
2. Set the CD instrument as follows: scanning rate 50 nm/min, data pitch 0.2 nm, digital integration time (DIT) 2 s, and bandwidth 1.0 nm. Average at least three scans for each CD spectrum. Record a CD spectrum from 195 nm to 550 nm as the blank to be subtracted from the subsequent spectra.
3. Add poly(dG-dC)2 to the solution in the cuvette to reach an ODN concentration of 35 μM, and record the CD spectrum in the range 220–350 nm to check the shape and intensity of the poly(dG-dC)2 CD signal.
4. Add 6 μM spermine (see Note 14) and, after raising the temperature to 60 °C, record the CD spectrum.
5. Keeping the temperature at 60 °C, add increasing amounts of ZnTCPPSpm4 stepwise (up to 7 μM) to the cuvette solution, and record the CD spectrum after each addition in order to detect the B to Z transition.
6. Finally, cool to 25 °C and record the CD spectrum.
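Every procedure in this section averages at least three scans and subtracts a buffer blank recorded over 195–550 nm at a 0.2 nm data pitch. Instrument software normally does this automatically; the Python sketch below only makes the arithmetic explicit for exported data. The function names are hypothetical, and the spectra are assumed to be plain lists of ellipticity values on a shared wavelength axis.

```python
def wavelength_axis(start_nm=195.0, stop_nm=550.0, pitch_nm=0.2):
    """Wavelength points for a scan with the given range and data pitch."""
    n_points = round((stop_nm - start_nm) / pitch_nm) + 1
    return [round(start_nm + i * pitch_nm, 1) for i in range(n_points)]


def average_scans(scans):
    """Point-by-point mean of repeated scans recorded on the same axis."""
    if not scans:
        raise ValueError("at least one scan is required")
    length = len(scans[0])
    if any(len(scan) != length for scan in scans):
        raise ValueError("all scans must have the same number of points")
    return [sum(values) / len(scans) for values in zip(*scans)]


def subtract_blank(sample, blank):
    """Subtract the buffer blank from a sample spectrum point by point."""
    if len(sample) != len(blank):
        raise ValueError("sample and blank must share the same axis")
    return [s - b for s, b in zip(sample, blank)]


if __name__ == "__main__":
    axis = wavelength_axis()
    print(len(axis), axis[0], axis[-1])
    averaged = average_scans([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
    print(subtract_blank(averaged, [0.5, 1.0]))
```

The length checks matter in practice because spectra recorded over 220–350 nm and over 195–550 nm have different numbers of points and cannot be subtracted from one another without trimming.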

4 Notes

1. The initial pH of the Na–cacodylate solution is generally around 8; to reach pH 7, it is therefore advisable to add ~1 M HCl dropwise.

2. Depending on the experiment, it may be useful to prepare the 1 mM cacodylate buffer already containing 10 mM NaCl.
3. It has been reported [15] that micromolar concentrations of spermine induce the B to Z transition in poly(dG-dC)2 DNA, as do millimolar concentrations of NiCl2 [16]. However, the mechanisms of induction differ: the spermine–DNA interaction is highly cooperative and its kinetics are very slow at room temperature, whereas the interaction with nickel chloride is noncooperative and occurs within seconds.
4. Since left-handed Z-DNA is also characterized by an alternating anti–syn configuration of its base pairs, the B to Z transition in vitro is favored in alternating guanine–cytosine sequences; poly(dG-dC)2 sequences are therefore the ODNs best suited for inducing Z-DNA. Moreover, synthetic poly(dG-dC)2 is a highly stable sequence able to maintain the Z-conformation, which is a high-energy structure. Finally, the B–Z transition was historically first observed with this alternating purine–pyrimidine polymer in high-salt solutions [17].
5. Porphyrins have a well-known tendency to aggregate in aqueous solution; to avoid aggregation, it is better to prepare dilute stock solutions with concentrations below the millimolar range. At the same time, to avoid dilution effects in the sample solution, the porphyrin stock solution should not be too dilute.
6. Be careful to use the appropriate extinction coefficient when checking the concentration of the porphyrin stock solution.
7. The amount of spermine needed to induce the complete B to Z transition depends on the length of the ODN sequence used. For the first studies of porphyrin–Z-DNA interactions, poly(dG-dC)2 with an average length of 960 bases was still commercially available, and 9 μM spermine was sufficient to induce the B to Z transition. To date, however, the longest commercially available poly(dG-dC)2 has an average length of 300 bases. Therefore, since Z-DNA is a high-energy structure, a further 3 μM of spermine is needed to stabilize the Z-conformation of the poly(dG-dC)2 with an average length of 300 bases.
8. The final amount of porphyrin to be added cannot be established before performing the experiment. The titration can be ended when the added porphyrin no longer interacts with the DNA and remains unbound in solution. The binding mode can also affect the end point of the experiment: for example, if the porphyrin aggregates onto the DNA helix, unbound porphyrin will never be observed.

9. The simultaneous restoration of both the left-handed Z-DNA helix and the NiTPPS–Z-DNA complex (directly from pH 9.5 to pH 7) proved to be slow (16 h) and incomplete even at elevated temperatures, although the B to Z transition is enthalpy driven. We therefore adopted a two-step process (pH 9.5 → 8.5 → 6.8) and increased the temperature to 48 °C. Temperatures above 48 °C caused denaturation of the NiTPPS–DNA complex; indeed, the ICD signal of NiTPPS in the Soret region disappeared completely at T ≥ 60 °C.
10. Using pH and temperature as inputs and the induced CD signal (from 375 nm to 450 nm) as the output, this system behaves as a reversible AND logic gate [18]. In detail, starting from step 5, raise the temperature to 60 °C and record the CD spectrum; then raise the pH to 9.5 by adding a small amount of 1 M NaOH and record the CD spectrum; at this point, cool to 25 °C and record the CD spectrum.
11. The amount of NiCl2 could be increased in order to induce the B to Z transition completely. However, adding more than 0.5 mM NiCl2 hinders the interaction with cationic porphyrins, since Ni2+ ions shield the ODN sequence from electrostatic interactions. Conversely, to promote the interaction with anionic porphyrins, a higher concentration of NiCl2 is needed to screen the electrostatic repulsion between the anionic ODN and the anionic porphyrin.
12. To test the efficiency of ZnT4 as a chiroptical probe for Z-DNA, I designed a more competitive environment using mixed sequences in which a central portion rich in alternating G–C base pairs is embedded between peripheral portions of A–T base pairs, which cannot be induced into the Z-conformation [19]. The main idea behind the construction of three different BZB sequences (Fig. 4) was to test the capacity of ZnT4 to sense the Z-conformation under conditions similar to those found in biological samples. The three self-complementary sequences differ in length and B/Z ratio. Moreover, to favor the B to Z transition of the central portion of the two sequences with the shorter G–C segments, two guanines were replaced with 8-bromoguanines (X in the BZB-I and BZB-II sequences) [20]. The third sequence (BZB-III) was designed without 8-bromoguanines in the G–C portion and, importantly, with two A–T tracts longer than the central G–C segment.
13. A different amount and type of inducer was used for each sequence. The choice of inducer depends mainly on the length of the central G–C portion. Spermine did not prove efficient at inducing the B to Z transition of the central portions of all three sequences.

Fig. 4 Schematic representation of the mixed sequences

14. The aim of this study is to investigate whether ZnTCPPSpm4 is able to induce the B to Z transition in synthetic poly (dG-dC)2. However we found by ECD that in the range 2 μM up to 10 μM, ZnTCPPSpm4 is unable to induce any observable conversion of B- to Z-form. Therefore we need to investigate whether synergic presence in solution of both spermine (in concentration lower than 12 μM) and ZnTCPPSpm4 could induce this transition. Finally we found that the minimum amount of spermine necessary to exploit the induction power of ZnTCPPSpm4 is 6 μM. At this concentration the spermine induces only partially conversion of B to Z [21]. References 1. Belmont P, Constant JF, Demeunyck M (2001) Nucleic acid conformation diversity: from structure to function and regulation. Chem Soc Rev 30:70–81 2. Jovin TM, Soumpasis DM, McIntosh LP (1987) The transition between B-DNA and Z-DNA. Annu Rev Phys Chem 38:521–558 3. Rich A, Nordheim A, Wang AH (1984) The chemistry and biology of left-handed Z-DNA. Annu Rev Biochem 53:791–842 4. Rich A, Zhang S (2003) Timeline: Z-DNA: the long road to biological function. Nat Rev Genet 4:566–572 5. Herbert A, Alfken J, Kim Y--G, Mian IS, Nishikura K, Rich A (1997) A Z-DNA binding domain present in the human editing enzyme, double-stranded RNA adenosine deaminase. Proc Natl Acad Sci U S A 94:8421–8426 6. Kim Y--G, Lowenhaupt K, Maas S, Herbert A, Schwartz T, Rich A (2000) The Zab domain of the human RNA editing enzyme ADAR1 recognizes Z-DNA when surrounded by B-DNA. J Biol Chem 275:26828–26833

7. Kim Y-G, Lowenhaupt K, Oh D-B, Kim KK, Rich A (2004) Evidence that vaccinia virulence factor E3L binds to Z-DNA in vivo: implications for development of a therapy for poxvirus infection. Proc Natl Acad Sci U S A 101:1514–1518
8. Schwartz T, Rould MA, Lowenhaupt K, Herbert A, Rich A (1999) Crystal structure of the Zalpha domain of the human editing enzyme ADAR1 bound to left-handed Z-DNA. Science 284:1841–1845
9. Berova N, Nakanishi K, Woody RW (2000) Circular dichroism principles and application. Wiley-VCH, New York
10. Kypr J, Kejnovská I, Renčiuk D, Vorlíčková M (2009) Circular dichroism and conformational polymorphism of DNA. Nucleic Acids Res 37:1713–1725
11. Mammana A, De Napoli M, Lauceri R, Purrello R (2005) Induction and memory of chirality in porphyrin hetero-aggregates: the role of the central metal ion. Bioorg Med Chem 13:5159–5163

12. Pasternack RF, Francesconi L, Raff D, Spiro E (1973) Aggregation of nickel(II), copper(II), and zinc(II) derivatives of water-soluble porphyrins. Inorg Chem 12:2606–2611
13. Choi JK, D’Urso A, Balaz M (2013) Chiroptical properties of anionic and cationic porphyrins and metalloporphyrins in complex with left-handed Z-DNA and right-handed B-DNA. J Inorg Biochem 127:1–6
14. D’Urso A, Choi JK, Shabbir-Hussain M, Ngwa FN, Lambousis MI, Purrello R, Balaz M (2010) Recognition of left-handed Z-DNA of short unmodified oligonucleotides under physiological ionic strength conditions. Biochem Biophys Res Commun 397:329–332
15. Parkinson A, Hawken M, Hall M, Sanders KJ, Rodger A (2000) Amine induced Z-DNA in poly(dG-dC)·poly(dG-dC): circular dichroism and gel electrophoresis study. Phys Chem Chem Phys 2:5469–5478
16. Schoenknecht T, Diebler H (1993) Spectrophotometric and kinetic studies of the binding of Ni2+, Co2+, and Mg2+ to poly(dG-dC)·poly(dG-dC). Determination of the stoichiometry of the Ni2+-induced B → Z transition. J Inorg Biochem 50:283–298
17. Fuertes MA, Cepeda V, Alonso C, Perez JM (2006) Molecular mechanisms for the B-Z transition in the example of poly[d(G-C)·d(G-C)] polymers. A critical review. Chem Rev 106:2045–2064
18. D’Urso A, Mammana A, Balaz M, Holmes AE, Berova N, Lauceri R, Purrello R (2009) Interactions of a tetraanionic porphyrin with DNA: from a Z-DNA sensor to a versatile supramolecular device. J Am Chem Soc 131:2046–2047
19. (a) D’Urso A, Holmes AE, Berova N, Balaz M, Purrello R (2011) Z-DNA recognition in B-Z-B sequences by a cationic zinc porphyrin. Chem Asian J 6:3104–3109; (b) Ha SC, Lowenhaupt K, Rich A, Kim YG, Kim KK (2005) Crystal structure of a junction between B-DNA and Z-DNA reveals two extruded bases. Nature 437:1183–1186
20. (a) Moller A, Nordheim A, Kozlowski SA, Patel DJ, Rich A (1984) Bromination stabilizes poly(dG-dC) in the Z-DNA form under low-salt conditions. Biochemistry 23:54–62; (b) Nadler A, Diederichsen U (2008) Guanosine analog with respect to Z-DNA stabilization: nucleotide with combined C8-bromo and C2′-ethynyl modifications. Eur J Org Chem 9:1544–1549; (c) Kimura T, Kawai K, Tojo S, Majima T (2004) One-electron attachment reaction of B- and Z-DNA modified by 8-bromo-2′-deoxyguanosine. J Org Chem 69:1169–1173
21. Gangemi CMA, D’Urso A, Tomaselli GA, Berova N, Purrello R (2017) A novel porphyrin-based molecular probe ZnTCPPSpm4 with catalytic, stabilizing and chiroptical diagnostic power towards DNA B-Z transition. J Inorg Biochem 173:141–143

Chapter 10

Construction of a Z-DNA-Specific Recombinant Nuclease Zαα-FOK for Conformation Studies

Seul Ki Lee and Yang-Gyun Kim

Abstract

The development of FokI-based engineered nucleases has provided a platform technology for creating novel sequence-specific as well as structure-specific nucleases. Z-DNA-specific nucleases have been constructed by fusing a Z-DNA-binding domain to the nuclease domain of FokI (FN). In particular, Zαα, an engineered Z-DNA-binding domain with high affinity, is an ideal fusion partner for generating a highly efficient Z-DNA-specific cutter. Here, we describe the construction, expression, and purification of Zαα-FOK (Zαα-FN) nuclease in detail. In addition, Z-DNA-specific cleavage by Zαα-FOK is demonstrated.

Key words Z-DNA, Z-DNA-binding domain, Zαα, FokI, Conformation-specific nuclease, Zαα-FOK, Protein expression, Protein purification, DNA cleavage

1 Introduction

Z-DNA forms a left-handed double-helical structure held together by conventional Watson–Crick base pairing [1]. Its dynamic and fleeting nature makes Z-DNA elusive and difficult to detect, so an effective method for Z-DNA detection is critical. In this regard, a Z-DNA-specific nuclease is an attractive goal because it would provide a useful tool for Z-DNA studies. FokI-based synthetic nucleases have been developed by using DNA-binding domains of proteins that recognize different sequences [2, 3]. Moreover, the modular nature of the FokI restriction endonuclease is the structural basis of the first- and second-generation programmable nucleases, such as zinc finger nucleases and TAL effector nucleases, which are custom-designed sequence-specific nucleases [4]. Using the same strategy, it has been demonstrated that a nuclease with conformation specificity can be designed [5–7].

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_10, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023


To construct a FokI-based Z-DNA-specific nuclease, any Z-DNA-binding domain can, in principle, be used as the DNA-binding domain responsible for conformation specificity. The first Z-DNA-specific nuclease was built with the Z-DNA-specific binding domain (hZαADAR1) present in the human ADAR1 protein [5]. Subsequently, the construction of Zαα-FOK (Zαα-FN), which contains Zαα (previously referred to as Zaa), an engineered Z-DNA-binding domain with two hZαADAR1 domains, was reported [8]. Because Zαα binds Z-DNA tightly and with high specificity, it is ideal for Z-DNA detection. Zαα and Zαα-FOK have been used to identify Z-DNA formation in vivo as well as in vitro [9–11]. Here we describe detailed protocols for the expression and purification of Zαα-FOK and its use in a cleavage assay to detect Z-DNA. Recently, the importance of non-B-DNA structure formation in many biological processes has been emphasized [12]. Because these non-B-DNA structures are generally difficult to find and capture, reliable tools are needed to study them. Zαα-FOK could therefore provide a simple tool to directly detect Z-DNA formation in vitro as well as in vivo.

2 Materials

2.1 Expression of Zαα-FOK Protein

1. pET28a:Zαα-FOK (for detailed description, see Subheading 3.1).
2. Competent cells of E. coli BL21(DE3).
3. LB agar plate containing 40 μg/mL of kanamycin.
4. Liquid LB media containing 40 μg/mL of kanamycin.
5. Kanamycin: 40 mg/mL in ddH2O.
6. IPTG (isopropyl β-D-1-thiogalactopyranoside): 1 M in ddH2O.

2.2 Purification of Zαα-FOK Protein

1. Sonicator: VCX 750 (Sonics & Materials, Newtown, CT, USA).
2. PMSF (phenylmethylsulfonyl fluoride): 10 mM PMSF in isopropanol.
3. Benzonase (Merck Millipore, Billerica, MA, USA): 250 units/μL.
4. FPLC: AKTA Purifier 100 (GE Healthcare, Uppsala, Sweden).
5. 5 mL HiTrap chelating column (GE Healthcare, Uppsala, Sweden).
6. Buffers for Ni-affinity chromatography (see Note 1).
Bind buffer: 50 mM Tris–Cl, pH 7.9, 1 M NaCl, 5 mM imidazole, 0.1% Tween 20 (v/v), 10% glycerol (v/v), 2 mM β-mercaptoethanol.


Wash buffer: 50 mM Tris–Cl, pH 7.9, 500 mM NaCl, 60 mM imidazole, 0.1% Tween 20 (v/v), 10% glycerol (v/v), 2 mM β-mercaptoethanol.
Elution buffer: 50 mM Tris–Cl, pH 7.9, 500 mM NaCl, 600 mM imidazole, 0.1% Tween 20 (v/v), 10% glycerol (v/v), 2 mM β-mercaptoethanol.
7. Superdex 75 10/300 GL column (GE Healthcare, Uppsala, Sweden).
8. Ultrafiltration filter: Amicon Ultra-15 centrifugal filter with 30K MWCO (Merck Millipore, Billerica, MA, USA).
9. Buffer for size exclusion chromatography.
SEC buffer: 40 mM HEPES–NaOH, pH 7.9, 600 mM NaCl, 1 mM TCEP.

2.3 Z-DNA Cleavage Assay

1. pUC19-CG8: Substrate plasmid for the Z-DNA cleavage assay, which contains eight repeats of d(CG) dinucleotides as a Z-DNA-forming sequence in the polylinker region of pUC19 plasmid. DNA is prepared from E. coli with the QIAprep Spin Miniprep Kit (Qiagen, Germantown, MD, USA) and is dissolved in TE buffer (10 mM Tris–Cl and 0.1 mM EDTA, pH 8.0). The concentration of DNA is determined from the absorbance at 260 nm, taking 1 OD260 = 50 μg/mL.
2. 10× Zαα-FOK reaction buffer: 1× reaction buffer contains 10 mM Tris–Cl, pH 8.0, 75 mM NaCl, 10 mM MgCl2, 50 μg/mL BSA, and 3 mM DTT.
3. 10× CutSmart® buffer (New England Biolabs, Beverly, MA, USA): 1× CutSmart® buffer contains 50 mM potassium acetate, 20 mM Tris-acetate, pH 7.9, 10 mM magnesium acetate, and 100 μg/mL BSA.
4. Yeast tRNA (Invitrogen, Carlsbad, CA, USA): 12.5 mg/mL in ddH2O.
5. RNase A (Thermo Scientific, Carlsbad, CA, USA): 10 mg/mL in ddH2O.
6. Proteinase K (New England Biolabs, Beverly, MA, USA): 20 mg/mL in ddH2O.
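The OD260 conversion in item 1 can be turned into a quick calculation when preparing the 1 μg of plasmid used per reaction. The following is a minimal Python sketch; the function names, the example absorbance reading, and the dilution factor are illustrative assumptions and are not values from this protocol.

```python
# Sketch: estimate dsDNA concentration from A260 and the volume holding a target mass.
# The factor of 50 ug/mL per OD260 unit is the one quoted in the text (1 cm path assumed).
# The reading and dilution factor in the example are hypothetical.

DSDNA_UG_PER_ML_PER_OD = 50.0


def dna_conc_ug_per_ml(a260: float, dilution_factor: float = 1.0) -> float:
    """Concentration of the undiluted DNA stock in ug/mL."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("a260 must be >= 0 and dilution_factor must be > 0")
    return a260 * DSDNA_UG_PER_ML_PER_OD * dilution_factor


def volume_for_mass_ul(mass_ug: float, conc_ug_per_ml: float) -> float:
    """Volume of stock (uL) that contains mass_ug of DNA."""
    if conc_ug_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return mass_ug / conc_ug_per_ml * 1000.0


if __name__ == "__main__":
    # Hypothetical measurement: A260 of 0.25 read on a 1:20 dilution.
    conc = dna_conc_ug_per_ml(a260=0.25, dilution_factor=20)
    print(f"stock: {conc:.0f} ug/mL, 1 ug is in {volume_for_mass_ul(1.0, conc):.1f} uL")
```

The same helper applies to any dsDNA stock; it does not correct for RNA or protein contamination, which would have to be judged separately (for example from the A260/A280 ratio).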

2.4 Cell-Free Protein Expression of Zαα-FOK

1. TNT® T7 Quick Coupled Transcription/Translation System (Promega, Madison, WI, USA).
2. pET28a:Zαα-FOK.

3 Methods

3.1 Zαα-FOK Expression Vector

1. Zαα-FOK is used as a probe to detect Z-DNA.
2. Originally, the Zαα-FOK expression vector was constructed in the backbone of pET15b plasmid [8].
3. To improve the stability of the Zαα-FOK expression vector, the Zαα-FOK gene is cloned into pET28a plasmid, which contains a kanamycin resistance gene as a selectable marker, resulting in pET28a:Zαα-FOK (Fig. 1) (see Note 2).
4. The Zαα-FOK gene is expressed under the tight control of the T7 promoter to minimize basal expression of Zαα-FOK, which may cause host DNA damage.
5. The host E. coli strain for Zαα-FOK expression should carry the T7 RNA polymerase gene (see Note 3).
6. The Zαα-FOK gene encodes an N-terminal His6-tag, which facilitates purification of Zαα-FOK by nickel-chelating affinity chromatography (Fig. 2).

Fig. 1 Construction of Zαα-FOK expression vector. (a) Representative map of pET28a:Zαα-FOK. In this vector, the Zαα-FOK gene is placed downstream of the T7 promoter. (b) Structural organization of Z-DNA-binding domains, Za, Zab, and Zαα. In particular, Zαα has two hZαADAR1 domains. (c) Schematic representations of Zαα-FOK nuclease. A flexible (Gly4-Ser)3 linker is inserted between the Zαα and the FokI nuclease domain (FN) in addition to the N-terminal His6-tag for facile purification


Fig. 2 Nucleotide and amino acid sequences of the entire Zαα-FOK. All sequences of the His6-tag (in gray box), Zαα with two hZαADAR1 domains (bold), a (Gly4-Ser)3 linker (underlined), and the FokI nuclease domain are listed

7. Additionally, pET28a:Zαα-FOK can be used as a DNA template in a cell-free protein expression system using T7 RNA polymerase to express Zαα-FOK protein for prompt use (see Subheading 3.5).

3.2 Expression of Zαα-FOK Nuclease

1. Transform E. coli BL21(DE3) cells with pET28a:Zαα-FOK plasmid DNA (see Note 4).
2. Plate the transformed cells onto LB agar plates containing 40 μg/mL kanamycin.
3. Incubate the plates at 37 °C overnight.
4. To prepare an overnight culture, inoculate cells from a single colony into 50 mL of liquid LB media containing 40 μg/mL kanamycin.
5. Transfer cells of the overnight culture to 2 L of liquid LB media containing 40 μg/mL kanamycin to a final OD600 of less than 0.05.
6. Grow the cells at 37 °C with vigorous shaking until the cell density reaches an OD600 of 0.4.
7. Move the cell culture to 22 °C and grow further for 30 min with vigorous shaking until the OD600 reaches 0.5–0.7 (see Note 5).
8. Induce Zαα-FOK expression by adding IPTG to a final concentration of 0.5 mM.
9. Grow the cell culture for a further 4–6 h at 22 °C with vigorous shaking (see Note 6).
10. Check the soluble expression of Zαα-FOK nuclease before starting the protein purification (Fig. 3).
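The additions in the steps above all follow the same C1·V1 = C2·V2 dilution relation. As a rough aid, the sketch below computes the stock volumes for a 2 L culture from the stock and final concentrations listed in Subheading 2.1. The helper name is hypothetical, the overnight-culture OD600 of 3.0 is an assumed example value, and the small volume contributed by the added stock itself is neglected.

```python
# Sketch: C1*V1 = C2*V2 helper for the additives used during expression.
# Stock and final concentrations come from the text; the overnight-culture
# OD600 in the last example is a hypothetical value.


def stock_volume_ml(stock_conc: float, final_conc: float, final_volume_ml: float) -> float:
    """Volume of stock (mL) needed; both concentrations must use the same unit."""
    if stock_conc <= 0 or final_conc < 0 or final_volume_ml <= 0:
        raise ValueError("stock_conc and final_volume_ml must be > 0, final_conc >= 0")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc * final_volume_ml / stock_conc


if __name__ == "__main__":
    culture_ml = 2000.0
    # IPTG: 1 M stock (1000 mM) diluted to 0.5 mM.
    print("IPTG (mL):", stock_volume_ml(1000.0, 0.5, culture_ml))
    # Kanamycin: 40 mg/mL stock (40000 ug/mL) diluted to 40 ug/mL.
    print("Kanamycin (mL):", stock_volume_ml(40000.0, 40.0, culture_ml))
    # Inoculum: assumed overnight OD600 of 3.0, starting OD600 of at most 0.05.
    print("Max inoculum (mL):", round(stock_volume_ml(3.0, 0.05, culture_ml), 1))
```

Treating OD600 as proportional to cell density is itself an approximation that holds only in the linear range of the spectrophotometer, so the inoculum volume is an upper estimate rather than an exact figure.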


Fig. 3 Expression of Zαα-FOK nuclease. SDS-PAGE profiles of uninduced (Bf) and IPTG-induced (Af) samples of E. coli BL21(DE3) are compared. Total protein extract (Total) of induced E. coli cells is separated into soluble (Sol) and insoluble (Ins) protein fractions. An arrow indicates Zαα-FOK nuclease

3.3 Purification of Zαα-FOK Nuclease

All column chromatography steps for Zαα-FOK purification are performed on the FPLC system (see Note 7).

3.3.1 Cell Lysis

1. Centrifuge the cells that overexpress Zαα-FOK nuclease at 4000 × g for 20 min at 4 °C.
2. Resuspend the cells in 50 mL of ice-cold bind buffer containing 0.1 mM PMSF and 12.5 units/mL Benzonase.
3. Lyse the cells by sonication at 2-s on/off cycles for 2 h at 4 °C (see Note 8).
4. Centrifuge the cell lysate at 13,000 × g for 20 min at 4 °C.
5. Transfer the supernatant to a new tube.
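The centrifugation steps above are given in × g, whereas many centrifuges are set in rpm. A minimal sketch of the conversion is shown below, using the standard relation RCF = 1.118 × 10⁻⁵ × r (cm) × rpm². The rotor radius in the example is a hypothetical value; the actual radius should be taken from the rotor documentation.

```python
import math

# Sketch: convert between relative centrifugal force (x g) and rotor speed (rpm).
# Standard relation: RCF = 1.118e-5 * radius_cm * rpm**2.
# The 15 cm radius used in the example is an assumption, not a value from the text.

RCF_CONSTANT = 1.118e-5


def rpm_for_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) that gives the requested RCF at the given radius."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius_cm must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rcf_for_rpm(rpm: float, radius_cm: float) -> float:
    """RCF (x g) produced at the given rotor speed and radius."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be >= 0 and radius_cm must be positive")
    return RCF_CONSTANT * radius_cm * rpm ** 2


if __name__ == "__main__":
    for rcf in (4000, 13000):  # the two spins used in the cell lysis steps
        print(f"{rcf} x g at r = 15 cm: about {rpm_for_rcf(rcf, 15.0):.0f} rpm")
```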

3.3.2 Ni-Affinity Chromatography

1. Filter the supernatant through a 0.45 μm filter and load it onto a 5 mL HiTrap chelating column charged with Ni2+ ions and pre-equilibrated with bind buffer (see Note 9).
2. Wash the column with 50 mL of bind buffer at a flow rate of 2 mL/min.
3. Wash the column with 30 mL of wash buffer at a flow rate of 2 mL/min.
4. Elute the protein with 100 mL of a linear 60 mM to 600 mM imidazole gradient, formed by mixing wash buffer and elution buffer, at a flow rate of 2 mL/min.
5. Collect the eluate in 1 mL fractions.
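To relate a collected fraction to the imidazole concentration at which it eluted, the linear gradient in step 4 can be written as a simple interpolation. The sketch below uses the gradient endpoints, gradient volume, flow rate, and fraction size given above; the function names are hypothetical, and system dead volume and mixer delay are ignored, so the values are nominal only.

```python
# Sketch: nominal imidazole concentration along the linear elution gradient.
# Endpoints, gradient volume, flow rate, and fraction size are from the text.
# Dead volume and mixer delay are ignored (a simplifying assumption).

START_MM = 60.0        # imidazole in wash buffer
END_MM = 600.0         # imidazole in elution buffer
GRADIENT_ML = 100.0
FLOW_ML_PER_MIN = 2.0
FRACTION_ML = 1.0


def imidazole_mm(volume_ml: float) -> float:
    """Nominal imidazole (mM) after volume_ml of the gradient has been delivered."""
    if not 0.0 <= volume_ml <= GRADIENT_ML:
        raise ValueError("volume must lie within the gradient")
    return START_MM + (END_MM - START_MM) * volume_ml / GRADIENT_ML


def percent_elution_buffer(volume_ml: float) -> float:
    """Percentage of elution buffer in the mix at the given gradient volume."""
    if not 0.0 <= volume_ml <= GRADIENT_ML:
        raise ValueError("volume must lie within the gradient")
    return 100.0 * volume_ml / GRADIENT_ML


def fraction_midpoint_mm(fraction_number: int) -> float:
    """Nominal imidazole (mM) at the midpoint of a 1-based fraction number."""
    return imidazole_mm((fraction_number - 0.5) * FRACTION_ML)


if __name__ == "__main__":
    print("gradient duration (min):", GRADIENT_ML / FLOW_ML_PER_MIN)
    print("number of fractions:", int(GRADIENT_ML / FRACTION_ML))
    print("imidazole at mid-gradient (mM):", imidazole_mm(50.0))
```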

3.3.3 Size Exclusion Chromatography

1. Pool the fractions that contain Zαα-FOK protein (see Note 10).
2. Concentrate the pooled protein sample to a volume of less than 500 μL by ultrafiltration with an Amicon Ultra-15 filter.
3. Inject the protein sample onto the Superdex 75 10/300 GL column with a 500 μL sample loop.
4. Run the column with SEC buffer (40 mM HEPES–NaOH, pH 7.9, 600 mM NaCl, 1 mM TCEP) at a flow rate of 0.5 mL/min.
5. Collect 0.5 mL fractions throughout the separation.
6. Pool the fractions that contain Zαα-FOK nuclease.
7. Concentrate the pooled protein sample by ultrafiltration with an Amicon Ultra-15 filter.

3.3.4 Quality and Quantity Check of Zαα-FOK Nuclease

1. After the Ni-affinity and size exclusion columns, Zαα-FOK nuclease is purified to near homogeneity.
2. On SDS-PAGE, Zαα-FOK nuclease migrates with an apparent size close to 60 kDa, which is slower than expected from its theoretical size of 53 kDa (Fig. 4).
3. The concentration of Zαα-FOK nuclease is determined from the absorbance at 280 nm, assuming a molar extinction coefficient of 60,850 M⁻¹ cm⁻¹.

Fig. 4 Purification of Zαα-FOK nuclease. SDS-PAGE profiles of the soluble (Sol) fraction of total protein extract of induced E. coli, the eluted fraction from the nickel-chelating column (Ni2+), and the separated fraction from the size exclusion column (SEC) are compared. An arrow indicates Zαα-FOK nuclease


4. This expression and purification procedure typically yields 100–200 μg of Zαα-FOK nuclease per 1 L of culture.
5. The purified Zαα-FOK is stored in 50% glycerol (v/v) at -20 °C for short-term storage or at -80 °C for long-term storage (see Note 11).

3.4 In Vitro Z-DNA Cleavage Assay

1. As the substrate DNA for the Z-DNA cleavage assay, pUC19-CG8 is constructed to have eight repeats of d(CG) as a Z-DNA-forming sequence in the polylinker region of pUC19 plasmid (see Note 12).
2. Plasmid DNA substrates are isolated from E. coli with an alkaline lysis/silica-based plasmid purification kit. More than 90% of the freshly prepared plasmids are usually negatively supercoiled (see Note 13).
3. Prepare an in vitro Z-DNA cleavage reaction as described below.
Zαα-FOK: 100 nM final concentration (see Note 14).
10× Zαα-FOK reaction buffer: 3 μL.
Plasmid DNA template: 1 μg.
Nuclease-free water to a final volume of 30 μL.
4. Incubate the reaction for 4 h at 22 °C (see Note 15).
5. Terminate the reaction by heating for 30 min at 50 °C (see Note 16).
6. Add the second restriction nuclease, ScaI, to allow the determination of specific cleavage sites.
7. Incubate the reaction for an additional 3 h at 37 °C.
8. Add proteinase K to the reaction to a final concentration of 50 μg/mL.
9. Incubate the reaction for longer than 1 h at 37 °C (see Note 17).
10. Analyze the reaction by agarose gel electrophoresis (Fig. 5b).
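Setting up the reaction in step 3 requires the molar concentration of the enzyme stock, which follows from the A280 reading through the Beer-Lambert law (A = ε·c·l) with the extinction coefficient given in Subheading 3.3.4. The sketch below combines that estimate with the volumes for one 30 μL reaction. The function names, the example A280 value, and the stock concentrations are illustrative assumptions; the extinction coefficient and the 53 kDa size are the values stated in the text.

```python
# Sketch: Beer-Lambert estimate of the enzyme stock and volumes for one cleavage reaction.
# Extinction coefficient and molecular size are the values quoted in the text.
# The example A280 reading and the stock concentrations are hypothetical.

EXT_COEFF_M_CM = 60850.0   # M^-1 cm^-1
MW_G_PER_MOL = 53000.0     # theoretical size of the fusion protein


def stock_conc_um(a280: float, path_cm: float = 1.0, dilution_factor: float = 1.0) -> float:
    """Enzyme concentration (uM) from A280 using c = A / (epsilon * l)."""
    if a280 < 0 or path_cm <= 0 or dilution_factor <= 0:
        raise ValueError("invalid absorbance, path length, or dilution factor")
    return a280 / (EXT_COEFF_M_CM * path_cm) * dilution_factor * 1e6


def um_to_mg_per_ml(conc_um: float) -> float:
    """Convert uM to mg/mL (mol/L * g/mol = g/L = mg/mL)."""
    return conc_um * 1e-6 * MW_G_PER_MOL


def cleavage_mix(enzyme_stock_um: float, dna_stock_ug_per_ul: float,
                 final_enzyme_nm: float = 100.0, total_ul: float = 30.0,
                 dna_ug: float = 1.0) -> dict:
    """Volumes (uL) of each component for one reaction with 10x buffer."""
    if enzyme_stock_um <= 0 or dna_stock_ug_per_ul <= 0:
        raise ValueError("stock concentrations must be positive")
    enzyme_ul = final_enzyme_nm * total_ul / (enzyme_stock_um * 1000.0)
    buffer_ul = total_ul / 10.0
    dna_ul = dna_ug / dna_stock_ug_per_ul
    water_ul = total_ul - enzyme_ul - buffer_ul - dna_ul
    if water_ul < 0:
        raise ValueError("components exceed the final volume")
    return {"enzyme_ul": enzyme_ul, "buffer_ul": buffer_ul,
            "dna_ul": dna_ul, "water_ul": water_ul}


if __name__ == "__main__":
    conc = stock_conc_um(a280=0.6085)          # hypothetical reading
    print(f"stock: {conc:.2f} uM ({um_to_mg_per_ml(conc):.2f} mg/mL)")
    print(cleavage_mix(enzyme_stock_um=2.0, dna_stock_ug_per_ul=0.25))
```

If the stock is concentrated enough that the calculated enzyme volume falls below what can be pipetted accurately, an intermediate dilution of the enzyme is the usual workaround. Note that the enzyme is stored in 50% glycerol, which this sketch does not account for.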

3.5 Cell-Free Protein Expression of Zαα-FOK Nuclease

1. In vitro production of Zαα-FOK nuclease can be carried out with commercially available cell-free in vitro expression kits such as the TNT® T7 Quick Coupled Transcription/Translation System.
2. Prepare a cell-free protein expression reaction mix for Zαα-FOK protein as described below.
TNT® T7 Quick master mix: 40 μL.
Methionine, 1 mM: 1 μL.
Plasmid template: 1 μg of pET28a:Zαα-FOK (see Note 18).
Nuclease-free water to a final volume of 50 μL.


Fig. 5 Z-DNA cleavage assay with Zαα-FOK nuclease. (a) Map of pUC19-CG8 containing a Z-DNA region, d(CG)8, in supercoiled form. (b) Z-DNA-specific cleavage by the purified Zαα-FOK nuclease. When Zαα-FOK cleaves at the Z-DNA in pUC19-CG8 (+), the subsequent ScaI digestion results in two DNA fragments, 0.9 kb and 1.8 kb (arrows). In the absence of Zαα-FOK (-), only linearized DNAs are produced. (c) Z-DNA-specific cleavage by Zαα-FOK nuclease produced from the in vitro translation reaction. Supercoiled (Sc) and ScaI-linearized (Linear) forms of pUC19-CG8 DNAs are subjected to digestion by either Zαα-FOK or BssHI, respectively. BssHI cleaves DNA at CGCGCG. The subsequent ScaI digestion reveals that Zαα-FOK does not cleave linearized DNAs because they cannot form Z-DNA

3. Incubate the reaction for 90 min at 30 °C.
4. Further purification of Zαα-FOK is not required for the in vitro Z-DNA cleavage assay.
5. Prepare an in vitro Z-DNA cleavage reaction mix, which differs slightly from the reaction conditions for the purified Zαα-FOK nuclease, as described below.
Translation reaction: 1–3 μL (see Note 19).
10× CutSmart® buffer: 3 μL (see Note 20).
Plasmid template: 1 μg.
Nuclease-free water to a final volume of 30 μL.
6. Incubate the reaction for 4 h at 22 °C.
7. Terminate the reaction by heating for 30 min at 50 °C.
8. Add the second restriction nuclease, ScaI, to allow the determination of specific cleavage sites.
9. Incubate the reaction for an additional 3 h at 37 °C.
10. Add proteinase K to the reaction to a final concentration of 50 μg/mL.
11. Incubate the reaction for longer than 1 h at 37 °C.
12. Analyze the reaction by agarose gel electrophoresis (Fig. 5c).
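The two-fragment pattern expected after the ScaI digestion (Fig. 5) follows from cutting a circular plasmid at two positions. The sketch below computes fragment sizes for cuts on a circular molecule. The plasmid length and the two cut coordinates are approximate, illustrative assumptions for a pUC19-derived plasmid and are not taken from this chapter; with them, the result is consistent with the roughly 0.9 kb and 1.8 kb fragments described in the figure legend.

```python
# Sketch: fragment sizes produced by cuts on a circular plasmid.
# The plasmid length and cut coordinates in the example are approximate,
# hypothetical values for a pUC19-derived plasmid, not values from the text.


def circular_fragments(plasmid_bp: int, cut_positions: list[int]) -> list[int]:
    """Return sorted fragment lengths (bp) after cutting a circle at the given positions."""
    if plasmid_bp <= 0:
        raise ValueError("plasmid length must be positive")
    cuts = sorted(set(p % plasmid_bp for p in cut_positions))
    if len(cuts) < 2:
        # No cut leaves the circle intact; one cut only linearizes it.
        return [plasmid_bp]
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    fragments.append(plasmid_bp - cuts[-1] + cuts[0])  # fragment spanning the origin
    return sorted(fragments)


if __name__ == "__main__":
    plasmid_len = 2700      # assumed approximate length
    z_dna_site = 420        # assumed position of the d(CG)8 insert in the polylinker
    scai_site = 2180        # assumed position of the single ScaI site
    print("Zaa-FOK + ScaI:", circular_fragments(plasmid_len, [z_dna_site, scai_site]))
    print("ScaI only:", circular_fragments(plasmid_len, [scai_site]))
```

Because FokI-derived nucleases cut at a distance from the bound site, the actual cleavage positions are spread around the Z-DNA region, so real bands are expected near rather than exactly at the computed sizes.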

4 Notes

1. A high salt concentration in the buffers used for the purification of Zαα-FOK nuclease helps to reduce aggregation of Zαα-FOK and to remove proteins bound nonspecifically to the nickel-chelating resin. We therefore recommend including a high concentration of NaCl or KCl in the buffers at all stages of purification.
2. The Zαα-FOK expression vector reported previously [8] was modified for this protocol. The NdeI/BamHI fragment containing the Zαα-FOK gene from pET15b:Zαα nuclease was inserted into the NdeI/BamHI sites of pET28a plasmid, resulting in pET28a:Zαα-FOK.
3. Several E. coli strains that contain a copy of the T7 RNA polymerase gene under the control of the lacUV5 promoter have been tested for expression of Zαα-FOK nuclease, including BL21(DE3), NovaBlue(DE3), and Rosetta(DE3). No significant difference in the expression level or solubility of Zαα-FOK was apparent among them.
4. As in the previous reports [2, 3, 5], coexpression of E. coli ligase from pACYC184:lig may help to alleviate toxic effects caused by Zαα-FOK expression. However, its use is optional because the expression of Zαα-FOK is not significantly impaired without the constitutively expressed E. coli ligase.
5. For the induction of Zαα-FOK at a low temperature, the incubation temperature of the cell culture is shifted from 37 °C to 22 °C before the culture reaches the optimal cell density for induction. To lower the culture temperature quickly, the culture can also be cooled to the induction temperature in ice water.
6. The induction temperature for Zαα-FOK expression should be chosen carefully. Insoluble expression of recombinant proteins is frequently observed [13]. Similarly, induction of Zαα-FOK nuclease at 30 °C or higher mostly results in insoluble aggregates, apparently caused by misfolding of Zαα-FOK nuclease. To overcome this solubility problem in E. coli, the induction temperature for Zαα-FOK expression is lowered to between 16 °C and 25 °C.
As shown in Fig. 3, the majority of Zαα-FOK nuclease is soluble at low induction temperatures. The induction time may be extended beyond 6 h to increase the expression level of Zαα-FOK nuclease. However, a long induction time may result in more degradation products of Zαα-FOK nuclease, which may cause significant problems during the purification.
7. We recommend using FPLC for a good yield of Zαα-FOK nuclease. Conventional gravity-flow column chromatography can also be used for Zαα-FOK purification, although it is necessary to start with more cells, such as a 6 L LB culture, to obtain a decent amount of Zαα-FOK nuclease with relatively high purity.
8. Alternatively, cell lysis can be carried out with a French press or with protein extraction reagents such as BugBuster® (Merck Millipore, Billerica, MA, USA). For the best result from the cell lysis, it is important to minimize exposure of the cells to high temperature.
9. The HiTrap chelating column is a commercial prepacked nickel-chelating affinity column for His-tagged proteins. This column can be connected to an FPLC system. We recommend a 5 mL column for cell extracts from LB liquid cultures larger than 2 L.
10. The protein band of Zαα-FOK can be easily identified in an SDS-PAGE gel. A nonspecific nuclease activity assay can also be used to confirm the presence of Zαα-FOK by digesting DNA with 0.5 μL of each fraction in restriction enzyme digestion buffer (10 mM Tris–HCl, pH 7.5, 50 mM NaCl, 10 mM MgCl2, and 1 mM 1,4-dithioerythritol) [5]. In addition, western blotting can be used to directly detect Zαα-FOK nuclease. Antibodies against FokI restriction endonuclease are commercially available from several sources.
11. Zαα-FOK nuclease is very sensitive to high temperature. After purification, immediately concentrate Zαα-FOK to as high a concentration as possible. After it is diluted to a final concentration of 50% glycerol (v/v), the purified Zαα-FOK is divided into small aliquots to avoid repeated freeze/thaw cycles and stored at -80 °C.
12. Z-DNA forms preferentially in alternating d(CG) dinucleotides. The potential of d(CG)n to form Z-DNA increases with the number of repeats (n). The supercoiled form of pUC19-CG8 stably forms a Z-DNA region in the stretch of d(CG). In a linearized plasmid DNA without negative supercoiling, this Z-DNA region reverts to B-form DNA.
13. Z-DNA is stabilized by negative supercoiling. Hence, the quality of the substrate plasmid DNA for the in vitro Z-DNA cleavage assay is crucial.
The content of supercoiled DNA in the substrate should be checked before the Z-DNA cleavage assay. Most commercially available plasmid purification kits yield a high content of the supercoiled form; more than 90% of the plasmids were negatively supercoiled. In general, the supercoiled DNA content of plasmid DNA decreases during long-term storage. Storage for a long period after plasmid isolation may significantly reduce the Z-DNA content of the plasmid DNA, even at -20 °C. We therefore recommend using freshly isolated plasmid DNA for the assay.


14. The working concentration of Zαα-FOK for the Z-DNA cleavage assay usually depends on the individual protein preparation. In general, too high a concentration of Zαα-FOK produces nonspecific DNA cleavage. Zαα-FOK is also sensitive to salt concentration.
15. The reaction can be carried out at temperatures between 20 °C and 25 °C.
16. Zαα-FOK nuclease is quickly inactivated by heat treatment above 50 °C.
17. Proteinase K treatment is optional. However, it can prevent retarded bands, which probably result from protein/DNA complex formation.
18. pET28a:Zαα-FOK can be used as a DNA template for the cell-free protein expression system containing rabbit reticulocyte lysate with T7 RNA polymerase.
19. The amount of Zαα-FOK nuclease produced in a TNT® T7 Quick Coupled Transcription/Translation reaction varies between preparations. Thus, the amount of the translation reaction used for the Z-DNA cleavage assay needs to be adjusted by titration.
20. The Z-DNA cleavage assay with the cell-free reaction product containing Zαα-FOK is carried out in CutSmart® buffer, which gives good cleavage results.
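Note 12 states that alternating d(CG) repeats favor Z-DNA and that the potential increases with the repeat number. When designing or checking a substrate plasmid, such stretches can be located with a simple pattern search. The following is a minimal sketch; the function name, the default threshold of eight repeats (chosen to mirror the d(CG)8 insert), and the test sequence are illustrative assumptions. A match only flags a sequence with Z-forming potential and does not show that Z-DNA actually forms, which also depends on supercoiling (see Note 13).

```python
import re

# Sketch: find runs of alternating CG dinucleotides in a DNA sequence.
# The threshold of 8 repeats mirrors the d(CG)8 insert; the example sequence is made up.


def find_cg_repeats(seq: str, min_repeats: int = 8) -> list[tuple[int, int, int]]:
    """Return (start, end, n_repeats) for each run of at least min_repeats CG units.

    Coordinates are 0-based with an exclusive end.
    """
    if min_repeats < 1:
        raise ValueError("min_repeats must be at least 1")
    pattern = re.compile(r"(?:CG){%d,}" % min_repeats)
    upper = seq.upper()
    return [(m.start(), m.end(), (m.end() - m.start()) // 2)
            for m in pattern.finditer(upper)]


if __name__ == "__main__":
    # Hypothetical polylinker fragment carrying a d(CG)8 insert.
    example = "GAATTC" + "CG" * 8 + "GGATCC"
    for start, end, n in find_cg_repeats(example):
        print(f"d(CG){n} at {start}..{end}")
```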

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grants funded by the Korean government (MSIT) (No. 2021R1F1A1050830 and No. 2022R1F1A1074916).

References

1. Rich A, Zhang S (2003) Z-DNA: the long road to biological function. Nat Rev Genet 4:566–572
2. Kim YG, Chandrasegaran S (1994) Chimeric restriction endonuclease. Proc Natl Acad Sci U S A 91:883–887
3. Kim YG, Cha J, Chandrasegaran S (1996) Hybrid restriction enzymes: zinc finger fusions to Fok I cleavage domain. Proc Natl Acad Sci U S A 93:1156–1160
4. Chandrasegaran S, Carroll D (2016) Origins of programmable nucleases for genome engineering. J Mol Biol 428:963–989
5. Kim YG, Kim PS, Herbert A, Rich A (1997) Construction of a Z-DNA-specific restriction endonuclease. Proc Natl Acad Sci U S A 94:12875–12879
6. Xu S, Cao S, Zou B, Yue Y, Gu C, Chen X et al (2016) An alternative novel tool for DNA editing without target sequence limitation: the structure-guided nuclease. Genome Biol 17:186
7. Dang DT, Nguyen LTA, Truong TTT, Nguyen HD, Phan AT (2021) Construction of a G-quadruplex-specific DNA endonuclease. Chem Commun 57:4568–4571
8. Kim YG, Lowenhaupt K, Schwartz T, Rich A (1999) The interaction between Z-DNA and the Zab domain of double-stranded RNA adenosine deaminase characterized using fusion nucleases. J Biol Chem 274:19081–19086
9. Liu R, Liu H, Chen X, Kirby M, Brown PO, Zhao K (2001) Regulation of CSF1 promoter by the SWI/SNF-like BAF complex. Cell 106:309–318
10. Mulholland N, Xu Y, Sugiyama H, Zhao K (2012) SWI/SNF-mediated chromatin remodeling induces Z-DNA formation on a nucleosome. Cell Biosci 2:3
11. Shin SI, Ham S, Park J, Seo SH, Lim CH, Jeon H et al (2016) Z-DNA-forming sites identified by ChIP-Seq are associated with actively transcribed regions in the human genome. DNA Res 23:477–486
12. Kouzine F, Wojtowicz D, Baranello L, Yamane A, Nelson S, Resch W et al (2017) Permanganate/S1 nuclease footprinting reveals non-B DNA structures with regulatory potential across a mammalian genome. Cell Syst 4:344–356
13. Kapust RB, Waugh DS (1999) Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci 8:1668–1674

Chapter 11

Human Heme Oxygenase-1 Promoter Activity Is Mediated by Z-DNA Formation

Atsushi Inose-Maruyama, Shuya Kasai, and Ken Itoh

Abstract

In recent years, Z-DNA formation has been shown to play functionally significant roles in nucleic acid metabolism, such as gene expression, chromosome recombination, and epigenetic regulation. These effects were identified mainly owing to advances in methods for detecting Z-DNA in target genome regions in living cells. The heme oxygenase-1 (HO-1) gene encodes an enzyme that degrades heme, an essential prosthetic group, and environmental stimuli, including oxidative stress, lead to robust induction of the HO-1 gene. Many DNA elements and transcription factors are involved in the induction of the HO-1 gene, and Z-DNA formation in the thymine–guanine (TG) repetitive sequence in the human HO-1 gene promoter region is required for maximum gene induction. Here, we describe a detailed protocol for Z-DNA detection in the human HO-1 gene promoter region based on chromatin immunoprecipitation with quantitative PCR. We also provide some control experiments to consider in routine lab procedures.

Key words Heme oxygenase-1 (HO-1), ADAR1 Z-DNA-binding region Zα, Chromatin immunoprecipitation, Real-time polymerase chain reaction (real-time PCR), BRG1

1 Introduction

Heme (ferroprotoporphyrin IX) is a prosthetic group that is essential for life and is involved in oxygen transport and the catalytic activity of redox-based enzymes, such as catalase. It also regulates cellular activity via heme regulatory motifs [1]. Heme is highly lipophilic and causes oxidative stress; therefore, the synthesis and degradation of heme are stringently regulated in cells. Heme oxygenase (HO) is a rate-limiting enzyme that breaks down heme into biliverdin, carbon monoxide (CO), and ferrous iron (Fe2+) [2] (Fig. 1a). In mammals, two isoforms of HO, inducible HO-1 and constitutive HO-2, have been identified [2].

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_11, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023


[Fig. 1 graphic. Panel A labels: Heme, Heme oxygenase, NADPH + H+, O2, NADP+, Biliverdin, CO, Fe2+. Panel B labels: E2, E1, ARE, NRF2, BRG1, chromatin remodeling, Z-DNA formation, Pol II, TATA, HO-1.]

Fig. 1 Heme oxygenase activity and its transcriptional regulation. (a) Heme oxygenase degrades heme into biliverdin, carbon monoxide (CO), and ferrous iron (Fe2+). (b) HO-1 gene expression is induced by NRF2 via antioxidant response elements (AREs) in the E1 and E2 enhancers. NRF2 recruits the chromatin remodeling factor BRG1, which enhances the formation of Z-DNA in the TG repeats in the promoter region

Expression of the HO-1 gene is robustly induced by environmental stimuli, such as oxidative stress, heavy metals, and ultraviolet irradiation, indicating that HO-1 is a cytoprotective factor [3]. Multiple transcription factors and DNA elements identified in the regulatory region of the HO-1 gene locus are involved in the regulation of the HO-1 gene [3]. Regulatory DNA elements are enriched in the E1 and E2 enhancer regions, which are located 4 kb and 10 kb upstream of the transcription start site of the HO-1 gene, respectively [4–7]. In addition, internal enhancers in the HO-1 gene coding region; noncoding RNAs, such as enhancer RNAs from the E2 enhancer; and miRNAs have been reported to be involved in the expression of the HO-1 gene [8–10]. Previously, we showed that BRG1, a core ATPase subunit of the chromatin remodeling complex, interacts with the oxidative stress-responsive transcription activator NRF2 and selectively induces NRF2-dependent HO-1 gene expression [11]. We also demonstrated that the thymine–guanine (TG) repetitive sequence in the human HO-1 promoter region is necessary for maximum induction of the HO-1 gene by the BRG1/NRF2 complex [11] (Fig. 1b). The TG repetitive sequence of the HO-1 gene promoter is uniquely conserved in primates, and it has been proposed that length polymorphisms of this repeat determine the HO-1 gene expression level and susceptibility to diseases, including chronic pulmonary emphysema, which is induced by cigarette smoke [12, 13]. Liu et al. reported that BRG1 enhances the activation of the colony-stimulating factor 1 (CSF1) gene through the formation of Z-DNA in the TG repeats present in the promoter of the CSF1 gene [14]. To detect intracellular Z-DNA formation, specific antibodies against Z-DNA have been used to locate Z-DNA sites in the genome in fixed chromatin or agarose-encapsulated metabolically active nuclei [15–17]. Herbert and colleagues


hypothesized that if Z-DNA had a biological function, some proteins would bind to it. As a result of their intensive research, they found that the Zα domain of human adenosine deaminase acting on RNA 1 (ADAR1) specifically bound to Z-DNA [18]. By analyzing the solution and cocrystal structures of Z-DNA and the ADAR1 Zα domain, the amino acid residues of the Zα domain required for binding were determined [19, 20]. Li et al. applied this structural information to the genome-wide detection of Z-DNA formation using the Zα domain [21]. Referring to these previous studies, we developed an experimental method to detect Z-DNA in living cells by expressing a genetically designed protein that contains the Zα domain and enhanced green fluorescent protein (EGFP) (hereafter called the Z-probe), followed by chromatin immunoprecipitation (ChIP) using anti-EGFP antibodies and real-time PCR [22]. As a result, we successfully showed the correlation between Z-DNA in the human HO-1 gene promoter and gene induction in living cells [22]. Although some computational analyses have been performed to predict Z-DNA-containing regions [21, 23], the actual Z-DNA conformation in chromatin should be investigated by experiments using cultured cells or individual organisms. We consider intracellular Z-DNA detection using ChIP and Z-probes to be a convenient and powerful strategy [22, 24]. The core principle of our Z-DNA detection method is that the ADAR1 Zα domain specifically binds to Z-DNA. To exclude nonspecific protein binding, control experiments should be performed simultaneously using a probe that carries amino acid mutations at the Zα domain residues required for Z-DNA binding. Analyzing the immunoprecipitated DNA region using real-time PCR allows quantitative detection of Z-DNA in a target genome region. Furthermore, the transfection efficiency into cultured cells can be monitored by the EGFP expression level using fluorescence microscopy. An outline of our experimental procedure is shown in Fig. 2. This chapter describes our method to detect Z-DNA using the Z-probe and ChIP in cultured cells.

2 Materials

2.1 Transfection

1. Transfection reagent: FuGENE HD (Promega Corporation) (see Note 1).

2.2 Z-Probe Expression Plasmids

1. pcDNA3-Zα-Ln-NLS-EGFP: Z-probe protein expression plasmid (Fig. 3).
2. pcDNA3-Zαmut-Ln-NLS-EGFP: Z-DNA-binding activity defective Z-probe protein (Zmut probe) expression plasmid (see Note 2).

Atsushi Inose-Maruyama et al.

[Fig. 2 flowchart: Culture cells → Transfection of Z-probe coding plasmids into cells → Confirm transfection efficiency by fluorescence microscopy observation → Appropriate stimulation (if needed) → Cell lysate preparation → Confirm the size of sheared chromatin DNA by agarose gel electrophoresis → Prepare antibody-bound Dynabeads → Chromatin immunoprecipitation → DNA purification → Real-time PCR]

Fig. 2 Outline of the steps for Z-DNA detection. Flowchart of the steps for Z-DNA detection by ChIP and real-time PCR. The transfection efficiency and intensity of DNA shearing should be checked at the indicated steps. Antibody-bound Dynabeads should be prepared the day before chromatin immunoprecipitation

[Fig. 3 plasmid map labels: CMV enhancer, CMV promoter, T7 promoter, Z-probe (ADAR1 Zα, (Gly4-Ser)3 linker, SV40 NLS, EGFP), bGH poly(A) signal, SP6 promoter, f1 ori; restriction sites NdeI, HindIII, KpnI, XhoI, ApaI, SphI]

Fig. 3 Map of the Z-probe expression plasmid. Schematic representation of the Z-probe coding region in the expression vector. The Z-probe (light green) encodes the human ADAR1 Zα domain (orange), (Gly4-Ser)3 linker (yellow), SV40 nuclear localizing signal (NLS, purple), and EGFP (green). The length of the Z-probe coding region is 1272 bp. The restriction enzyme digestion sites are also shown. In the Zmut probe, the N173A and Y177A mutations, which correspond to the human ADAR1 Zα domain, are introduced. The maps were created using SnapGene®

2.3 Chromatin Immunoprecipitation

1. Phosphate-buffered saline (1× PBS): 8.1 mM disodium hydrogen phosphate (Na2HPO4), 1.47 mM potassium dihydrogen phosphate (KH2PO4), 137 mM sodium chloride (NaCl), and 2.68 mM potassium chloride (KCl).
2. 11% formaldehyde solution: Freshly dilute a 37% formaldehyde solution with 1× PBS. Store the solution at room temperature for 1 week.
3. 1.25 M glycine solution: Dissolve glycine in ultrapure water. Store the solution at room temperature.
4. ChIP cell lysis buffer: 50 mM Tris–HCl (pH 8.0), 10 mM EDTA, and 1% SDS. Store the buffer at room temperature.
5. 10 mM MG132 solution: Dissolve MG132 in dimethyl sulfoxide (DMSO). Store the solution at -20 °C.
6. Complete protease inhibitor cocktail (25×; Roche Applied Science): One tablet is dissolved in ultrapure water. Store the cocktail at -20 °C until use.
7. ChIP dilution buffer: 50 mM Tris–HCl (pH 8.0), 167 mM NaCl, 1.1% Triton X-100, and 0.11% sodium deoxycholate. Store the buffer at room temperature.
8. Antibodies (see Note 3).
9. Dynabeads Protein G (Thermo Fisher, Invitrogen).
10. Magnet apparatus, such as DynaMag™-2 (Thermo Fisher, Invitrogen).
11. ChIP wash buffer 1: 50 mM Tris–HCl (pH 8.0), 150 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, and 0.1% sodium deoxycholate. Store the buffer at 4 °C.
12. ChIP wash buffer 2: 50 mM Tris–HCl (pH 8.0), 500 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, and 0.1% sodium deoxycholate. Store the buffer at 4 °C.
13. ChIP wash buffer 3: 50 mM Tris–HCl (pH 8.0), 250 mM LiCl, 1 mM EDTA, 0.5% NP-40, 0.1% SDS, and 0.5% sodium deoxycholate. Store the buffer at 4 °C.
14. Tris–EDTA (TE, 1×) buffer: 10 mM Tris–HCl (pH 8.0) and 1 mM EDTA (pH 8.0). Store the buffer at room temperature.
15. ChIP elution buffer: 10 mM Tris–HCl (pH 8.0), 300 mM NaCl, 5 mM EDTA, and 0.5% SDS. Store the buffer at room temperature.
16. 10 mg/mL proteinase K solution: Store the solution at -20 °C.
17. 10 mg/mL RNase A solution: Store the solution at -20 °C.

2.4 Real-Time PCR

1. Region-specific primers (each primer pair is listed with the PCR premix used for it):
Z-DNA formation regions.
Human CSF1 TG repeat region: 5′-CGC AGA AGA CAG AGG GTG AC-3′ and 5′-GGC ATG TGG TTT ATG GGA AA-3′, QIAGEN QuantiTect SYBR® Green PCR Kits.
Human HO-1 TG repeat region: 5′-CTC TGG AAG GAG CAA AAT CAC A-3′ and 5′-GGC CAT AGG ACT TTT AGA GAA AAC A-3′, TaKaRa Bio SYBR® Premix Ex Taq II (perfect real-time PCR).
Non-Z-DNA formation regions.
Human CSF1_3600 upstream from the CSF1 transcription start site: 5′-ACA AGG GCA TTC AGT CCA AA-3′ and 5′-AGA CAG CGT GAA GGG TGG TA-3′, TaKaRa Bio SYBR® Premix Ex Taq II (perfect real-time PCR).
Human CSF1 exon 4 region: 5′-TCA GAG ATA ACA CCC CCA ATG-3′ and 5′-CTT CAT AAT CCT TGG TGA AGC A-3′, TaKaRa Bio SYBR® Premix Ex Taq II (perfect real-time PCR).
Human HO-1 exon 3 region: 5′-CAC CAA GTT CAA GCA GCT CTA C-3′ and 5′-CTT CTA TCA CCC TCT GCC TGA C-3′, TaKaRa Bio SYBR® Premix Ex Taq II (perfect real-time PCR).
Human TXNRD1 ARE region: 5′-TGC ACG AGG AGT GGA TTT CTG CTT-3′ and 5′-GCT GCA AAT GCC GGA GTG AAG AAA-3′, TaKaRa Bio SYBR® Premix Ex Taq II (perfect real-time PCR).
2. PCR premix enzyme: The PCR conditions vary with the target region of interest. SYBR® Premix Ex Taq II (perfect real-time PCR) (TaKaRa Bio), QuantiTect SYBR® Green PCR Kits (QIAGEN).
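Before ordering or reusing the primers listed above, it can help to keep them in a small machine-readable table and run a basic sanity check. The following is a minimal sketch, not part of the published protocol: it stores three of the primer pairs (spaces removed) and reports the length and GC content of each oligonucleotide. The labels "primer 1" and "primer 2" simply follow the order in which the sequences are listed, and the function names are hypothetical.

```python
# Primer sequences copied from the list above with the spaces removed.
# "primer 1"/"primer 2" only reflect listing order (an assumption, the
# chapter does not label forward and reverse primers).
PRIMERS = {
    "CSF1 TG repeat": ("CGCAGAAGACAGAGGGTGAC", "GGCATGTGGTTTATGGGAAA"),
    "HO-1 TG repeat": ("CTCTGGAAGGAGCAAAATCACA", "GGCCATAGGACTTTTAGAGAAAACA"),
    "HO-1 exon 3": ("CACCAAGTTCAAGCAGCTCTAC", "CTTCTATCACCCTCTGCCTGAC"),
}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError(f"not a plain DNA sequence: {seq!r}")
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers):
    """Return (region, label, length, GC percent) for every primer."""
    rows = []
    for region, pair in primers.items():
        for label, seq in zip(("primer 1", "primer 2"), pair):
            rows.append((region, label, len(seq), round(100 * gc_fraction(seq), 1)))
    return rows


for row in summarize(PRIMERS):
    print(*row, sep="\t")
```

Length and GC content are only coarse indicators; annealing conditions still need to be optimized for each target region, as stated in item 2.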

3 Methods

3.1 Cell Culture and Transfection

1. Grow cells in the appropriate cell culture medium. Split the cells the day before transfection and culture 1–2 × 10^6 cells on a 10 cm dish at 37 °C with 5% CO2 and saturated humidity.
2. Transiently or stably transfect the Z-probe expression plasmids (2.5 μg per 1–2 × 10^6 cells) into target cells using the appropriate transfection reagent (see Note 4). Then, the cells may be simultaneously transfected with some transcription factors or treated with test compounds if needed (see Notes 5 and 6).

3.2 Cell Lysate Preparation

1. To cross-link chromatin, directly add a 1/10 volume of the 11% formaldehyde solution to the culture medium, and mix thoroughly and gently on a flat experimental bench. Incubate with agitation for 10 min at room temperature (see Note 7).
2. To stop the cross-linking reaction, directly add a 1/10 volume of the 1.25 M glycine solution to the culture medium, and mix thoroughly and gently on a flat experimental bench. Incubate with agitation for 5 min at room temperature.
3. Remove as much medium as possible by decantation and pipetting, wash cells immediately with ice-cold PBS twice, and harvest cells using a scraper (see Note 8).
4. To collect cells, centrifuge them at 3000 rpm for 5 min at 4 °C using a swing rotor.
5. Remove the supernatant, and add 1 mL ChIP cell lysis buffer plus 10 μL of 10 mM MG132 and 40 μL of the 25× Complete protease inhibitor cocktail. Mix thoroughly by vortexing or pipetting until completely dispersed.
6. To shear chromatin, carry out sonication. Keep samples ice-cold during sonication (see Note 9).
7. Transfer sonicated cell lysates to new 1.5 mL tubes and centrifuge them at 13,000 rpm for 10 min at 10 °C (see Note 10).
8. Recover the supernatant, which contains sheared chromatin, and place it in a new 1.5 mL tube (see Note 11).

3.3 Antibody-Bound Dynabeads Preparation and Chromatin Immunoprecipitation

1. Wash 5 μL of Dynabeads Protein G with ChIP dilution buffer twice. Then resuspend Dynabeads Protein G in 300 μL ChIP dilution buffer and add 1 μg of the antibody solution. To bind the antibody to Dynabeads Protein G, incubate with gentle rotation at 4 °C overnight (12–16 h).
2. Before using the antibody-bound Dynabeads, collect the beads by brief centrifugation, place the tubes on a magnet apparatus, and aspirate the supernatant using a pipet.
3. Add 860 μL ChIP dilution buffer to the antibody-bound Dynabeads, followed by 100 μL cell lysates and 40 μL 25× Complete protease inhibitor cocktail. Rotate tubes using a rotator at 4 °C overnight (–12 h).
4. Remove 10 μL of the sonicated cell lysate and mix with 100 μL ChIP elution buffer; store at 4 °C. Label this sample as an input.
5. After a brief centrifugation, place the tubes on the magnet apparatus and remove the supernatant. Then, add 500 μL ChIP wash buffer 1 and rotate the tubes at 4 °C for 5 min.
6. After a brief centrifugation, place the tubes on the magnet apparatus and remove the supernatant. Then, add 500 μL ChIP wash buffer 2 and rotate the tubes at 4 °C for 5 min.
7. After a brief centrifugation, place the tubes on the magnet apparatus and remove the supernatant. Then, add 500 μL ChIP wash buffer 3 and rotate the tubes at 4 °C for 5 min.
8. After a brief centrifugation, place the tubes on the magnet apparatus and remove the supernatant. Then, add 500 μL 1× TE buffer and rotate the tubes at 4 °C for 5 min.
9. After a brief centrifugation, place the tubes on the magnet apparatus and remove the supernatant. Then, add 100 μL ChIP elution buffer and incubate the tubes at 65 °C for 4 h or more to reverse cross-linking. Reverse the cross-linking of the input samples at the same time.
10. Add 1 μL RNase A solution and incubate at 37 °C for 30 min.
11. Add 1 μL proteinase K solution and incubate at 55 °C for 1 h.
12. Purify the DNA fragments using a DNA purification kit (see Note 12).
13. Perform real-time PCR with specific primers.
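The chapter does not prescribe how the real-time PCR values from step 13 are converted into an enrichment figure. One widely used convention is the percent-input method, sketched below under explicit assumptions: the input sample (10 μL of lysate, step 4) represents 10% of the 100 μL of lysate used per immunoprecipitation (step 3), input and ChIP DNA are purified into the same final volume, equal template volumes are used per PCR reaction, and amplification efficiency is close to 2 per cycle. The function name and the example Ct values are hypothetical.

```python
import math


def percent_input(ct_ip: float, ct_input: float,
                  input_fraction: float = 0.1, efficiency: float = 2.0) -> float:
    """Express a ChIP signal as a percentage of the input chromatin.

    input_fraction: amount of lysate kept as input relative to the amount
    used per immunoprecipitation (10 uL / 100 uL = 0.1 in this protocol).
    efficiency: assumed amplification factor per cycle (2.0 = 100%).
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to the value expected if 100% of the chromatin
    # used in the immunoprecipitation had been measured.
    adjusted_input_ct = ct_input - math.log(1 / input_fraction, efficiency)
    return 100.0 * efficiency ** (adjusted_input_ct - ct_ip)


# Hypothetical Ct values for one primer pair.
z_probe = percent_input(ct_ip=28.0, ct_input=25.0)
z_mut = percent_input(ct_ip=31.0, ct_input=25.0)
print(f"Z-probe: {z_probe:.2f}% of input, Zmut probe: {z_mut:.3f}% of input")
```

Comparing the value obtained with the Z-probe against the Zmut probe and the normal IgG control, for both Z-DNA-forming and non-Z-DNA-forming primer sets, gives a simple readout of specific enrichment.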

4 Notes

1. The most effective transfection reagent may vary by cell type and laboratory conditions. Therefore, it is necessary to determine the appropriate transfection reagent and conditions for use in experiments.
2. A map of the Z-probe expression plasmid is shown in Fig. 3. The Zmut probe cannot bind to Z-DNA due to the N173A and Y177A mutations, which correspond to the human ADAR1 Zα domain. These plasmids are available from our laboratory upon reasonable request. To obtain highly purified plasmids, prepare them using plasmid purification kits, such as the QIAGEN Plasmid Midi Kit or GenElute™ Plasmid Midiprep Kit (Sigma-Aldrich).
3. Anti-GFP antibody (Clontech Laboratories, Inc., #632592). Another ChIP-grade anti-GFP antibody can be used in this experiment. For immunoprecipitation control experiments, normal rabbit IgG (Millipore, 12-370) was used. Antibody-bound Dynabeads should be prepared the day before ChIP.
4. The result of this method depends on the transfection efficiency. The transfection efficiency was confirmed by fluorescence microscopy to determine whether the Z-probes were uniformly introduced into cells. Alternatively, it may be a good idea to establish a cell line that stably expresses the Z-probe protein. However, because cytotoxicity is observed when EGFP is strongly expressed in cells, it may be better to prepare a cell line in which Z-probe expression is induced by adding reagents such as tetracycline.
5. We suggest the use of SW13 cells and the BRG1 expression plasmid as a positive control experiment for the Z-DNA detection procedure. SW13 is a human adrenal gland/cortex cell line, and Z-DNA-mediated gene expression of CSF1 and HO-1 is defective because the chromatin remodeling factors BRG1 and BRM are missing. When active BRG1 is expressed in SW13 cells, it induces Z-DNA formation in the CSF1 gene promoter region [25]. For the detection of Z-DNA in the human HO-1 gene promoter region, HeLa cells are also recommended.
6. We recommend treating the cells with 100 μM diethylmaleate as a positive control experiment to detect Z-DNA in the HO-1 gene promoter region in Z-probe-transfected HeLa cells.
7. When the original volume of culture medium was 10 mL, 1 mL of the 11% formaldehyde solution was added.
8. Cells were placed in 15 mL plastic tubes. When the volume of the culture medium was 10 mL, 10 mL of 1× PBS was added. The tubes were placed on crushed ice until the next step.
9. Because the power of the sonicator differs by device, the sonication condition needs to be determined. Check the size of the sheared chromatin. A DNA size between 200 and 1000 base pairs is recommended. To check the size of the sheared chromatin, 50 μL sonicated cell lysates were taken and diluted with 450 μL ChIP direct elution buffer. To reverse cross-linked chromatin, the sample was incubated at 65 °C for 4 h or more. The following steps were carried out according to Subheading 3.3, step 9. Purified DNA samples were visualized and analyzed using 1× TAE agarose gel electrophoresis and staining. Once the conditions for sonication are determined, DNA shearing should be performed under those conditions.
10. To avoid SDS precipitation, we performed centrifugation at 10 °C.
11. Samples can be stored at -80 °C. However, it is recommended to perform the ChIP assay with freshly prepared samples.
12. High Pure PCR Cleanup Micro Kit (Roche Applied Science) or GenElute™ PCR Clean-Up Kit (Sigma-Aldrich) is helpful for DNA purification.
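The volumes in Note 7 and in Subheading 3.2, steps 1 and 2 can be checked with a simple dilution calculation. The sketch below is illustrative only; it assumes that "1/10 volume" refers to the original medium volume for both additions (1 mL each for 10 mL of medium), which the text states explicitly only for formaldehyde. The function name is hypothetical.

```python
def final_concentration(stock_conc: float, stock_volume: float,
                        sample_volume: float) -> float:
    """Concentration after adding stock_volume of a stock to sample_volume.

    The result has the same unit as stock_conc; both volumes must share a unit.
    """
    total = stock_volume + sample_volume
    if stock_volume < 0 or sample_volume < 0 or total == 0:
        raise ValueError("volumes must be non-negative and not both zero")
    return stock_conc * stock_volume / total


medium_ml = 10.0
formaldehyde_ml = medium_ml / 10  # 1 mL of the 11% solution (Note 7)
glycine_ml = medium_ml / 10       # assumed: 1/10 of the original medium volume

formaldehyde_percent = final_concentration(11.0, formaldehyde_ml, medium_ml)
glycine_molar = final_concentration(1.25, glycine_ml, medium_ml + formaldehyde_ml)

print(f"formaldehyde: {formaldehyde_percent:.2f}% final")
print(f"glycine: {glycine_molar * 1000:.0f} mM final")
```

Under these assumptions the 11% stock gives a final formaldehyde concentration of 1%, which is why the stock is prepared at 11% rather than 10%.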

Acknowledgments

This work was supported by a Nihon Pharmaceutical University Research Grant to A.I.M.

References

1. Zhang L, Guarente L (1995) Heme binds to a short sequence that serves a regulatory function in diverse proteins. EMBO J 14:313–320
2. Zhang L (ed) (2011) HEME BIOLOGY The secret life of heme in regulating diverse biological processes. World Scientific
3. Sikorski EM, Hock T, Hill-Kapturczak N, Agarwal A (2004) The story so far: molecular regulation of the heme oxygenase-1 gene in renal injury. Am J Physiol Ren Physiol 286:F425–F441
4. Alam J (1994) Multiple elements within the 5′ distal enhancer of the mouse heme oxygenase-1 gene mediate induction by heavy metals. J Biol Chem 269:25049–25056
5. Alam J, Cai J, Smith A (1994) Isolation and characterization of the mouse heme oxygenase-1 gene. Distal 5′ sequences are required for induction by heme or heavy metals. J Biol Chem 269:1001–1009
6. Alam J, Camhi S, Choi AM (1995) Identification of a second region upstream of the mouse heme oxygenase-1 gene that functions as a basal level and inducer-dependent transcription enhancer. J Biol Chem 270:11977–11984
7. Hill-Kapturczak N, Sikorski E, Voakes C, Garcia J, Nick HS, Agarwal A (2003) An internal enhancer regulates heme- and cadmium-mediated induction of human heme oxygenase-1. Am J Physiol Ren Physiol 285:F515–F523
8. Deshane J, Kim J, Bolisetty S, Hock TD, Hill-Kapturczak N, Agarwal A (2010) Sp1 regulates chromatin looping between an intronic enhancer and distal promoter of the human heme oxygenase-1 gene in renal cells. J Biol Chem 285:16476–16486
9. Maruyama A, Mimura J, Itoh K (2014) Non-coding RNA derived from the region adjacent to the human HO-1 E2 enhancer selectively regulates HO-1 gene induction by modulating Pol II binding. Nucleic Acids Res 42:13599–13614
10. Qiu L, Fan H, Jin W, Zhao B, Wang Y, Ju Y, Chen L, Chen Y, Duan Z, Meng S (2010) miR-122-induced down-regulation of HO-1 negatively affects miR-122-mediated suppression of HBV. Biochem Biophys Res Commun 398:771–777
11. Zhang J, Ohta T, Maruyama A, Hosoya T, Nishikawa K, Maher JM, Shibahara S, Itoh K, Yamamoto M (2006) BRG1 interacts with Nrf2 to selectively mediate HO-1 induction in response to oxidative stress. Mol Cell Biol 26:7942–7952
12. Yamada N, Yamaya M, Okinaga S, Nakayama K, Sekizawa K, Shibahara S, Sasaki H (2000) Microsatellite polymorphism in the heme oxygenase-1 gene promoter is associated with susceptibility to emphysema. Am J Hum Genet 66:187–195
13. Exner M, Minar E, Wagner O, Schillinger M (2004) The role of heme oxygenase-1 promoter polymorphisms in human disease. Free Radic Biol Med 37:1097–1104
14. Liu H, Mulholland N, Fu H, Zhao K (2006) Cooperative activity of BRG1 and Z-DNA formation in chromatin remodeling. Mol Cell Biol 26:2550–2559
15. Runkel L, Nordheim A (1986) Chemical footprinting of the interaction between left-handed Z-DNA and anti-Z-DNA antibodies by diethylpyrocarbonate carbethoxylation. J Mol Biol 189:487–501
16. Wittig B, Wölfl S, Dorbic T, Vahrson W, Rich A (1992) Transcription of human c-myc in permeabilized nuclei is associated with formation of Z-DNA in three discrete regions of the gene. EMBO J 11:4653–4663
17. Müller V, Takeya M, Brendel S, Wittig B, Rich A (1996) Z-DNA-forming sites within the human beta-globin gene cluster. Proc Natl Acad Sci U S A 93:780–784
18. Herbert A, Alfken J, Kim YG, Mian IS, Nishikura K, Rich A (1997) A Z-DNA binding domain present in the human editing enzyme, double-stranded RNA adenosine deaminase. Proc Natl Acad Sci U S A 94:8421–8426
19. Kim YG, Kim PS, Herbert A, Rich A (1997) Construction of a Z-DNA-specific restriction endonuclease. Proc Natl Acad Sci U S A 94:12875–12879
20. Kim YG, Lowenhaupt K, Schwartz T, Rich A (1999) The interaction between Z-DNA and the Zab domain of double-stranded RNA adenosine deaminase characterized using fusion nucleases. J Biol Chem 274:19081–19086
21. Li H, Xiao J, Li J, Lu L, Feng S, Dröge P (2009) Human genomic Z-DNA segments probed by the Z alpha domain of ADAR1. Nucleic Acids Res 37:2737–2746
22. Maruyama A, Mimura J, Harada N, Itoh K (2013) Nrf2 activation is associated with Z-DNA formation in the human HO-1 promoter. Nucleic Acids Res 41:5223–5234
23. Beknazarov N, Jin S, Poptsova M (2020) Deep learning approach for predicting functional Z-DNA regions using omics data. Sci Rep 10:19134
24. Shin SI, Ham S, Park J, Seo SH, Lim CH, Jeon H, Huh J, Roh TY (2016) Z-DNA-forming sites identified by ChIP-Seq are associated with actively transcribed regions in the human genome. DNA Res 23:477–486
25. Liu R, Liu H, Chen X, Kirby M, Brown PO, Zhao K (2001) Regulation of CSF1 promoter by the SWI/SNF-like BAF complex. Cell 106:309–318

Chapter 12

ChIP-Seq Strategy to Identify Z-DNA-Forming Sequences in the Human Genome

Tae-Young Roh

Abstract

Unlike the canonical right-handed B-DNA, left-handed Z-DNA adopts alternating syn- and anti-base conformations along the double-stranded helix under physiological conditions. Z-DNA structure plays a role in transcriptional regulation, chromatin remodeling, and genome stability. To understand the biological function of Z-DNA and map the genome-wide Z-DNA-forming sites (ZFSs), a ChIP-Seq strategy is applied, which is a combination of chromatin immunoprecipitation (ChIP) and high-throughput DNA sequencing analysis. Cross-linked chromatin is sheared, and the fragments associated with Z-DNA-binding proteins are mapped onto the reference genome sequence. The global information on ZFS positioning can provide a useful resource for better understanding of DNA structure-dependent biological mechanisms.

Key words Z-DNA, ChIP-Seq, Z-DNA-binding domain, Z-DNA-forming site

1 Introduction

The genetic information contained in the DNA is maintained by forming three types of active double-helical conformation. Unlike the most common right-handed B-DNA, left-handed Z-DNA occupies a small part of the genome and has alternating purines in the syn-conformation and pyrimidines in the anti-conformation along the double-stranded helix under physiological conditions [1, 4]. Even though the formation of Z-DNA requires a higher free energy and is not thermodynamically favorable, Z-DNA is involved in transcriptional regulation, chromatin remodeling, and genome stability [2, 3, 6, 7]. The genome-wide locations of Z-DNA have not been easy to identify because of the instability of Z-DNA. Here we demonstrate that the genome-wide map of Z-DNA-forming sites (ZFSs) can be obtained by adopting a ChIP-Seq strategy, a combination of chromatin immunoprecipitation (ChIP) and high-throughput DNA sequencing analysis.

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_12, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023


Fig. 1 The procedures of Zaa expression, ChIP, and data analysis (Reproduced from Ref. 5 freely distributed under the terms of the Creative Commons CC BY license)

Chromatin is cross-linked and fragmentized to isolate the fractions associated only with Z-DNA-binding proteins. The locations of Z-DNA sequences are analyzed by mapping onto the reference genome sequence [5]. This genome-wide map of Z-DNA positioning can provide a useful resource for better understanding of DNA structure-dependent mechanism. The overall process for this analysis is shown in Fig. 1. From the three replicated ChIP-Seq experiments, a total of 391 ZFSs were identified and some specific examples are shown in Fig. 2. The ZFSs are located at promoter (46.2%), gene body (29.7%), and intergenic regions (24.1%). The target genes of ZFSs are associated significantly with diverse cellular functions such as nucleosome assembly, translation, cellular macromolecular complex subunit organization, translational elongation, regulation of transcription from RNA polymerase II promoter, macromolecular complex assembly, chromosome organization, and cell cycle regulation, suggesting that the Z-DNA can have multiple functions at the level of transcription. Also, ZFSs are colocalized with active histone modification marks like histone H3K4me3 and H3K9ac but not with repressive marks (Fig. 3). Our results reflect that Z-DNA tends to form at the promoters of actively transcribed genes and its formation is linked with epigenetic regulation. In this chapter, the detailed ChIP-Seq strategy is described to identify Z-DNA-forming sequences in the human genome.


Fig. 2 Z-DNA-forming sequences or Z-DNA-binding regions (ZDRs) are identified by ChIP-Seq analysis (Reproduced and modified from Ref. 5 freely distributed under the terms of the Creative Commons CC BY license)

Fig. 3 ZFSs are identified from three replicated ChIP-Seq analysis and correlated with histone modification profiles and RNA polymerase bindings (Reproduced from Ref. 5 freely distributed under the terms of the Creative Commons CC BY license)

2 Materials

2.1 Expression of Z-DNA-Binding Domain

1. Z-DNA-binding domain (ZBD) expression vector fused with FLAG-SV40 nuclear localization signal (NLS) in pEF1α vector (Fig. 4).
2. For HeLa cell culture: DMEM containing 10% FBS, 50 U/mL penicillin, and 50 μg/mL streptomycin.

2.2 Chromatin Immunoprecipitation

1. 37% formaldehyde.
2. 2.5 M glycine solution.
3. 1× PBS: NaCl (9 g/L), Na2HPO4 (0.775 g/L), KH2PO4 (0.165 g/L), pH 7.4.
4. Nuclei extraction buffer 1: 10 mM HEPES [pH 6.5], 0.25% Triton X-100, 10 mM EDTA, 0.5 mM EGTA, and 1 mM PMSF.
5. Nuclei extraction buffer 2: 10 mM HEPES [pH 6.5], 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, and 1 mM PMSF.
6. Sonication buffer: 50 mM HEPES [pH 7.9], 140 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% Na-deoxycholate, 0.1% SDS, and 1× protease inhibitor cocktail.
7. Monoclonal anti-FLAG M2 antibody (F1804, Sigma, USA) and mouse anti-IgG antibody (sc-2025, Santa Cruz, USA).
8. Pierce Protein A/G magnetic beads (#88802, Thermo Fisher, USA), normal mouse IgG (Santa Cruz, USA), and magnetic stand.
9. 5 M NaCl stock solution.
10. LiCl buffer: 1× TE, 0.25 M LiCl, 0.5% NP40, 0.5% sodium deoxycholate.
11. 20 mg/mL glycogen.
12. Buffer-saturated phenol/chloroform (1:1).
13. 3 M sodium acetate (NaOAc), pH 5.2.
14. Absolute ethanol.
15. Proteinase K.
16. 1× TE buffer: 10 mM Tris–Cl [pH 8.0], 1 mM EDTA.
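The 1× PBS recipe in item 3 is given in g/L, whereas the other buffers are specified in molar units. A minimal conversion sketch is shown below for readers who want to compare recipes. It assumes anhydrous salts (the chapter does not state the hydration state) and uses standard molar masses; the function name is hypothetical.

```python
# Molar masses in g/mol for the anhydrous salts (assumption: no hydrates).
MOLAR_MASS = {"NaCl": 58.44, "Na2HPO4": 141.96, "KH2PO4": 136.09}

# 1x PBS recipe from item 3, in g/L.
PBS_G_PER_L = {"NaCl": 9.0, "Na2HPO4": 0.775, "KH2PO4": 0.165}


def millimolar(grams_per_liter: float, molar_mass: float) -> float:
    """Convert a mass concentration (g/L) into mM."""
    if molar_mass <= 0:
        raise ValueError("molar mass must be positive")
    return 1000.0 * grams_per_liter / molar_mass


pbs_mm = {salt: millimolar(g, MOLAR_MASS[salt]) for salt, g in PBS_G_PER_L.items()}
for salt, value in pbs_mm.items():
    print(f"{salt}: {value:.2f} mM")
```

If a hydrated salt such as Na2HPO4·12H2O is used instead, the molar mass in the table has to be replaced accordingly, and the resulting molarity will be lower for the same weighed mass.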

Fig. 4 Schematic structure for overexpression of Z-DNA-binding domain. (Reproduced from Ref. [5] freely distributed under the terms of the Creative Commons CC BY license)

2.3 ChIP-Seq Library Preparation

1. End-It DNA End-Repair kit (Epicentre Biotechnologies, USA).
2. QIAquick PCR purification kit (QIAGEN Sciences, USA).
3. 1 mM dATP.
4. Taq DNA polymerase (NEB, USA).
5. T4 DNA ligase (NEB, USA).
6. Low EDTA TE buffer (10 mM Tris–HCl pH 8.0, 0.1 mM EDTA).
7. Solid Phase Reversible Immobilization (SPRI) magnetic beads (Beckman Coulter, USA).
8. PEG–NaCl solution: 20% PEG (polyethylene glycol)-8000, 2.5 M NaCl.
9. Absolute ethanol.
10. Accel-NGS 2S DNA Library Kits (Swift, USA).
11. Phusion PCR master mix (NEB, USA).
12. DNA oligo sequences:
P5 adapter: 5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3′.
P7 adapter: 5′-GATCGGAAGAGCACACGTCTGAACTCCAGTCAC-(index)-CTCGTATGCCGTCTTCTGCTTG-3′, where index can be one of CGATGTAT, TGACCAAT, ACAGTGAT, GCCAATAT, CAGATCAT, CTTGTAAT, AGTCAACA, AGTTCCGT, ATGTCAGA, CCGTCCCG, GTCCGCAC, and GTGAAACG.
PCR primer 1: 5′-AATGATACGGCGACCACCGAGATCTACAC-3′.
PCR primer 2: 5′-CAAGCAGAAGACGGCATACGAGAT-3′.

2.4 ChIP-Seq Data Analysis

1. The following programs or packages are used for sequencing data processing: FastQC-0.11.5, Cutadapt-1.11, fastx-toolkit-0.0.13, bioawk, paired_sequence_match.py, bowtie-1.1.2, bowtie2-2.3.0, BWA-0.7.15, Samtools-1.3.1, Qualimap-2.2.1, homer-4.9, bedtools-2.26.0, and MACS v2.1.0. The version of each tool is just an example, and later versions are also acceptable depending on the computer system requirements.
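Because the oligonucleotide sequences in Subheading 2.3 are easy to mistype, a quick consistency check before ordering can be useful. The sketch below is an illustration rather than part of the protocol. It assumes that the spaces inside the printed sequences are line-break artifacts and joins the fragments, assembles the twelve indexed P7 adapters, and verifies two relationships that follow from the listed sequences: PCR primer 1 is the 5′ end of the P5 adapter, and the reverse complement of PCR primer 2 ends with the constant 3′ part of the P7 adapter. All function and variable names are hypothetical.

```python
# Sequences from Subheading 2.3, joined across the printed spaces (assumption).
P5_ADAPTER = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
P7_PREFIX = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
P7_SUFFIX = "CTCGTATGCCGTCTTCTGCTTG"
INDEXES = [
    "CGATGTAT", "TGACCAAT", "ACAGTGAT", "GCCAATAT", "CAGATCAT", "CTTGTAAT",
    "AGTCAACA", "AGTTCCGT", "ATGTCAGA", "CCGTCCCG", "GTCCGCAC", "GTGAAACG",
]
PCR_PRIMER_1 = "AATGATACGGCGACCACCGAGATCTACAC"
PCR_PRIMER_2 = "CAAGCAGAAGACGGCATACGAGAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def indexed_p7_adapters() -> dict:
    """Map each index to the full P7 adapter sequence that contains it."""
    return {index: P7_PREFIX + index + P7_SUFFIX for index in INDEXES}


def check_oligos() -> dict:
    """Run simple consistency checks on the adapter and primer sequences."""
    return {
        "primer1_is_p5_prefix": P5_ADAPTER.startswith(PCR_PRIMER_1),
        "primer2_rc_ends_with_p7_suffix":
            reverse_complement(PCR_PRIMER_2).endswith(P7_SUFFIX),
        "indexes_unique": len(set(INDEXES)) == len(INDEXES),
        "index_lengths": sorted({len(index) for index in INDEXES}),
    }


for name, result in check_oligos().items():
    print(f"{name}: {result}")
```

A failed check after copying the sequences into an order form usually points to a dropped or duplicated base rather than to a problem with the design.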

3 Methods

3.1 Cell Culture and the Expression of Zaa

1. Culture 10 mL of HeLa cells in DMEM containing 10% FBS, 50 U/mL penicillin, and 50 μg/mL streptomycin at 37 °C with 5% CO2.
2. For transfection, introduce the Zaa expression construct (pEF1α-FLAG-SV40NLS-Zaa) into HeLa cells using Lipofectamine 2000 (Thermo Fisher, USA) at a DNA to Lipofectamine ratio of 1:1.

3.2 Chromatin Immunoprecipitation

1. Wash HeLa cells expressing Zaa twice with ice-cold 1× PBS and cross-link them by adding 37% formaldehyde to a final concentration of 1% for 10 min at room temperature.
2. Stop cell fixation by adding 2.5 M glycine to a final concentration of 0.1375 M and incubating for 5 min.
3. Wash cells two times with cold 1× PBS and harvest them by centrifugation.
4. Isolate the nuclei by treating cells with ice-cold nuclei extraction buffer 1, followed by centrifugation at 2000 × g for 5 min at 4 °C.
5. Suspend the pellet in ice-cold nuclei extraction buffer 2, followed by the same centrifugation as in step 4.
6. Resuspend the pellet in 500 μL of sonication buffer.
7. Sonicate the pellet until the chromatin size ranges between 100 and 300 bp and then centrifuge at 1000 × g for 5 min at 4 °C (Fig. 5).
8. Transfer the chromatin solution to a new microcentrifuge tube containing 50 μL of washed Protein A/G magnetic beads.

Fig. 5 An example of chromatin size before and after sonication

9. Incubate the tube from step 8 with rotation for 30 min for pre-clearing.
10. Use the magnetic stand to separate the magnetic beads from the chromatin solution and carefully take only the chromatin solution.
11. Incubate the chromatin solution for 2 h at room temperature with 2 μg of either anti-FLAG antibody or normal mouse IgG (one antibody per immunoprecipitation). Before adding antibody, reserve 1% of the pre-cleared chromatin as the input chromatin control.
12. Transfer the chromatin solution with antibody into a new microcentrifuge tube containing 50 μL of washed Protein A/G magnetic beads, and incubate overnight at 4 °C with rotation.
13. Pellet the magnetic beads in each immunoprecipitation by placing the tube onto the magnetic stand and waiting for 1 min.
14. Remove the supernatant and wash the beads twice with 1 mL of sonication buffer, rotating for 10 min at room temperature each time.
15. Repeat step 13.
16. Remove the supernatant and wash the beads twice with 1 mL of LiCl buffer, rotating for 10 min at room temperature each time.
17. Repeat step 13.
18. Remove the supernatant and resuspend the beads in 100 μL of 1× TE. Add 2.5 μL of 10% SDS and 5 μL of 10 mg/mL proteinase K.
19. Incubate overnight at 65 °C. Also treat the input chromatin control (step 11) in the same way.
20. Pellet the magnetic beads by placing the tube onto the magnetic stand and waiting for 1 min.
21. Transfer the supernatant to a new tube.
22. Wash the beads from step 20 with 100 μL of 1× TE and repeat step 20.
23. Take the supernatant and combine it with the supernatant from step 21.
24. Extract the solution from step 23 with phenol/chloroform twice.
25. Take the aqueous solution and add 1 μL of 20 mg/mL glycogen, 20 μL of 3 M NaOAc, pH 5.2, and 500 μL of ethanol.
26. Store the mixture at -80 °C for 1 h and centrifuge at 4 °C.
27. Wash the pellet once with 70% ethanol, spin, remove the supernatant, air-dry briefly, and resuspend DNA in 20 μL of 1× TE.

3.3 ChIP-Seq Library Preparation and Sequencing

1. Repair DNA ends to generate blunt-ended DNA using the End-It DNA End-Repair kit. Mix the DNA solution with 5 μL of 10× end repair buffer, 5 μL of 2.5 mM each dNTPs, 5 μL of 10 mM ATP, and 1 μL of End-Repair Enzyme mix (T4 DNA Pol + T4 PNK). Adjust the total reaction volume to 50 μL with H2O. Keep at room temperature for 45 min.
2. Use the QIAquick PCR purification kit to purify the DNA.
3. Add a terminal "A" nucleotide to the 3′ ends by mixing 30 μL of DNA (step 2) with 2 μL of H2O, 5 μL of 10× Taq buffer, 10 μL of 1 mM dATP, and 3 μL of 5 U/μL Taq DNA polymerase. Incubate for 30 min at 70 °C. Purify the DNA using the QIAquick PCR purification kit and elute the DNA with 20 μL of low EDTA TE buffer.
4. Ligate the DNA solution (step 3) with 5 μL of indexed P7 adapter, 3 μL of Buffer Y1, and 2 μL of Enzyme Y3 from Accel-NGS 2S DNA Library Kits at 25 °C for 15 min.
5. Mix the ligated DNA (step 4) with the same volume (30 μL) of SPRI beads (ratio 1:1) and then add 45 μL of PEG–NaCl solution (ratio 1:0.75).
6. Mix by vortexing, spin down, and incubate at room temperature for 5 min.
7. Pellet the SPRI beads by placing the tube into the magnetic stand and waiting for 1 min.
8. Remove the supernatant, add 180 μL of fresh 80% ethanol, and incubate for 30 s.
9. Repeat steps 7 and 8.
10. Remove any residual ethanol solution completely.
11. Add 30 μL of low EDTA TE to the pelleted DNA on the SPRI beads.
12. Ligate the DNA solution (step 11) with 2 μL of indexed P5 adapter, 5 μL of Buffer B1, 9 μL of Buffer B3, 1 μL of Enzyme B4, 2 μL of Enzyme B5, and 1 μL of Enzyme B6 from Accel-NGS 2S DNA Library Kits at 40 °C for 10 min.
13. Mix the ligated DNA (step 12) with 42.5 μL of PEG–NaCl solution (ratio 1:0.85).
14. Repeat steps 7 to 10.

ChIP-Seq Strategy to Identify Z-DNA-Forming Sequences in the Human Genome


Fig. 6 Typical size distribution of ChIP-Seq library DNA analyzed by Bioanalyzer

15. Add 20 μL of low EDTA TE to the pelleted DNA on the SPRI beads and incubate at room temperature for 2 min.
16. Transfer the supernatant to a new tube.
17. Amplify the ChIP-Seq library by mixing 10.5 μL of DNA from step 16, 12.5 μL of Phusion PCR master mix, and 1 μL each of PCR primers 1 and 2. Denature at 98 °C for 30 s, then run 6–15 cycles of 98 °C, 10 s; 65 °C, 30 s; 72 °C, 30 s, followed by a final extension at 72 °C for 60 s.
18. Purify the PCR product with the QIAquick PCR purification kit. Quantify the library DNA and check its size (Fig. 6).
19. Sequence the ChIP-Seq libraries on the Illumina NGS platform.

3.4 ChIP-Seq Data Analysis

The choice of programs for processing raw sequence reads is up to the user. The following procedure is a simple example of an analysis pipeline, and the overall outline is shown in Fig. 7. For the specific usage of each program, refer to the developer’s site (see Note 1).
1. Assess the raw sequence reads in .fastq format with FastQC-0.11.5 and retain only high-quality reads with Phred score > 33 for further analysis.
2. Remove the adapter sequences by using Cutadapt-1.11.
3. Align the reads onto the reference human genome (hg18 or later) with bowtie-1.1.2, bowtie2-2.3.0, or BWA-0.7.15.
4. Check the quality of mapped reads with Qualimap-2.2.1.


Fig. 7 Outline of ChIP-Seq data analysis pipeline

5. Use the HOMER program to produce a .bedGraph output file for viewing the aligned reads in a genome browser.
6. Call read-enriched regions as peaks using bedtools-2.26.0 and MACS v2.1.0, sequentially (see Note 2).
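The read-quality criterion in step 1 can be illustrated with a minimal Python sketch. It assumes Phred+33 (Sanger/Illumina 1.8+) quality encoding and filters on mean read quality; the function names and the default threshold of 30 are hypothetical choices for illustration and are not part of the pipeline described above, which relies on dedicated tools.

```python
# Minimal sketch: decode FASTQ quality strings and keep high-quality reads.
# Assumptions (not from the chapter): Phred+33 encoding, filtering on the
# mean Phred score of each read, and an illustrative threshold of 30.

def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality_line]


def filter_fastq_records(records, min_mean_quality=30):
    """Keep (header, sequence, quality) records whose mean Phred score
    is at least min_mean_quality."""
    kept = []
    for header, seq, qual in records:
        scores = phred_scores(qual)
        if scores and sum(scores) / len(scores) >= min_mean_quality:
            kept.append((header, seq, qual))
    return kept


example_records = [
    ("@read1", "ACGT", "IIII"),  # 'I' encodes Phred 40 under Phred+33
    ("@read2", "ACGT", "!!!!"),  # '!' encodes Phred 0 under Phred+33
]
high_quality = filter_fastq_records(example_records)
```

In practice, per-base trimming and filtering are usually delegated to established tools such as those listed in Table 1; the sketch only shows how a Phred score relates to the characters in a .fastq file.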

4 Notes

1. Visit the sites in Table 1 for further functions and options.
2. Peaks sometimes overlap with blacklisted regions, which are frequently overrepresented artifactual peaks that might be caused by PCR bias or sequencing error. These should be removed before further analysis. The blacklist for the human genome can be downloaded from the website (https://sites.google.com/site/anshulkundaje/projects/blacklists).
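The blacklist filtering described in Note 2 can be sketched in a few lines of Python. This is an illustrative sketch only: peaks and blacklisted regions are assumed to be BED-style (chromosome, start, end) tuples with half-open coordinates, and the function names are hypothetical. For real data, a dedicated tool such as bedtools is the more usual choice.

```python
# Minimal sketch: drop peaks that overlap any blacklisted region.
# Assumption: intervals are (chrom, start, end) tuples, 0-based, half-open.

def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def remove_blacklisted(peaks, blacklist):
    """Return the peaks that do not overlap any blacklisted interval."""
    by_chrom = {}
    for chrom, start, end in blacklist:
        by_chrom.setdefault(chrom, []).append((start, end))
    kept = []
    for chrom, start, end in peaks:
        regions = by_chrom.get(chrom, [])
        if not any(overlaps(start, end, s, e) for s, e in regions):
            kept.append((chrom, start, end))
    return kept


peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
blacklist = [("chr1", 150, 300)]
clean_peaks = remove_blacklisted(peaks, blacklist)
```

The linear scan per chromosome is adequate for a few thousand regions; an interval tree or sorted sweep would be preferable for larger inputs.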

Table 1 The programs used for ChIP-Seq data analysis and the developers’ sites

Tools                                                Site
FastQC                                               http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
FASTX-Toolkit                                        http://hannonlab.cshl.edu/fastx_toolkit/
Cutadapt                                             http://cutadapt.readthedocs.io/en/stable/index.html
Paired_sequence_match.py                             http://pydoc.net/Python/paired_sequence_utils/0.1/
Bowtie1                                              https://ccb.jhu.edu/software/tophat/index.shtml
Bowtie2                                              http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
BWA aln, BWA samse, BWA sampe                        http://bio-bwa.sourceforge.net/
Qualimap                                             http://qualimap.bioinfo.cipf.es/
MACS2                                                https://github.com/taoliu/MACS
HOMER annotatePeaks.pl, HOMER findMotifsGenome.pl    http://homer.ucsd.edu/homer/index.html

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF-2014M3C9A3064548, NRF-2017M3C9A6047625, NRF-2019M3A9H1103711). I thank So-I Shin and Insoon Jang for their part in the sequence data analysis and the production of the figures and table.

References

1. Johnston BH (1992) Generation and detection of Z-DNA. Methods Enzymol 211:127–158
2. Mulholland N, Xu Y, Sugiyama H et al (2012) SWI/SNF-mediated chromatin remodeling induces Z-DNA formation on a nucleosome. Cell Biosci 2:3
3. Muller V, Takeya M, Brendel S et al (1996) Z-DNA-forming sites within the human beta-globin gene cluster. Proc Natl Acad Sci U S A 93:780–784
4. Rich A, Zhang S (2003) Timeline: Z-DNA: the long road to biological function. Nat Rev Genet 4:566–572
5. Shin SI, Ham S, Park J et al (2016) Z-DNA-forming sites identified by ChIP-Seq are associated with actively transcribed regions in the human genome. DNA Res 23:477–486
6. Wittig B, Dorbic T, Rich A (1991) Transcription is associated with Z-DNA formation in metabolically active permeabilized mammalian cell nuclei. Proc Natl Acad Sci U S A 88:2259–2263
7. Wolfl S, Martinez C, Rich A et al (1996) Transcription of the human corticotropin-releasing hormone gene in NPLC cells is correlated with Z-DNA formation. Proc Natl Acad Sci U S A 93:3664–3668

Chapter 13
Detection of Z-DNA Structures in Supercoiled Genome
Fedor Kouzine, Damian Wojtowicz, Teresa M. Przytycka, and David Levens

Abstract
Z-DNAs are nucleic acid secondary structures that form within a special pattern of nucleotides and are promoted by DNA supercoiling. Through Z-DNA formation, DNA encodes information by dynamic changes in its secondary structure. A growing body of evidence indicates that Z-DNA formation can play a role in gene regulation, can affect chromatin architecture, and is associated with genomic instability, genetic diseases, and genome evolution. Many functional roles of Z-DNA are yet to be discovered, highlighting the need for techniques to detect genome-wide folding of DNA into this structure. Here, we describe an approach to convert a linear genome into a supercoiled genome, which promotes Z-DNA formation. Applying permanganate-based methodology and high-throughput sequencing to the supercoiled genome allows genome-wide detection of single-stranded DNA. Single-stranded DNA is characteristic of the junctions between the classical B-form of DNA and Z-DNA. Consequently, analysis of the single-stranded DNA map provides snapshots of the Z-DNA conformation over the whole genome.

Key words Non-B DNA, Z-DNA, Potassium permanganate, DNA supercoiling, High-throughput genomics

1 Introduction

Crick and Watson’s strands of DNA are twisted around each other in a right-handed double helix, resulting in the B-form of DNA. Untwisting or over-twisting of the double helix generates negative or positive supercoils, respectively, as occurs during all genomic transactions (RNA transcription, DNA replication, chromatin remodeling). The resulting DNA torsional stress in turn has regulatory roles in genome operation [1]. Among different supercoil-based mechanisms of regulation, formation of alternative DNA structures (non-B forms of DNA) is emerging in a wide variety of regulatory processes [2]. DNA possesses structural variability, and DNA elements with special patterns of nucleotide sequence might flip into non-B DNA structures. The requisite for transitions from B-DNA to alternate structures is the destabilization of the double helix.

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_13, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023


Fedor Kouzine et al.

It can be accomplished in vitro by changes of temperature or hydration, chemical treatment with denaturants, or torque in negatively supercoiled DNA. In vivo, the main sponsor of non-B DNA formation is DNA supercoiling, which is known to vary at strategic sites of the genome [3–5]. In this chapter, we focus on the transition of DNA from B-form to Z-form, a left-handed double helix [6]. Z-DNA formation requires more energy than B-DNA formation. Z-DNA is intrinsically unstable because it must compete for its existence against B-DNA. Prolonged stability of Z-DNA is supported by alternating purine–pyrimidine sequences, with a hierarchy in which d(GC)n is favored over d(AC)n, which in turn is favored over d(AT)n, and by energy supplied by negative DNA supercoiling [6]. In addition, the stability of Z-DNA depends on the length of the affected sequence. These preferences for Z-DNA formation were used to search for the potential occurrence of Z-DNA in genomic sequences. Z-Hunt predicts the propensity of Z-DNA formation using statistical mechanics to calculate the potential of each fixed DNA region to form a left-handed helix based on empirically determined energetic parameters [7]. In most cases, the analysis properly predicted the relative propensity of the sequences of interest to form Z-DNA as measured by biochemical assays. DNA is a highly polymorphic molecule, and a dozen non-B DNA structures have been discovered to date. It is expected, and often observed, that multiple transitions can occur within a close distance, creating competition for the supercoiling energy. For that reason, a computational analysis of conformational transitions in DNA should include all non-B DNA competitions. Although the algorithms that predict DNA conformation have improved in recent years, they are still not universal, as they do not yet consider the full diversity of DNA structures [8, 10].
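The sequence preference described above (alternating purine–pyrimidine tracts, with longer tracts being more stable) can be illustrated with a short Python sketch. This is not Z-Hunt and uses no energetic parameters: it only locates maximal runs in which purines (A, G) and pyrimidines (C, T) strictly alternate. The function name and the minimum tract length of 8 nt are hypothetical choices for illustration.

```python
# Minimal sketch: find maximal alternating purine-pyrimidine tracts.
# Assumption: a purely sequence-based scan, ignoring the d(GC) > d(AC) > d(AT)
# hierarchy and supercoiling energetics that real predictors account for.

PURINES = set("AG")
PYRIMIDINES = set("CT")


def base_class(base):
    """Return 'R' for purines, 'Y' for pyrimidines, None for anything else."""
    if base in PURINES:
        return "R"
    if base in PYRIMIDINES:
        return "Y"
    return None


def alternating_tracts(seq, min_length=8):
    """Return (start, end, tract) tuples (0-based, half-open) for maximal runs
    of strictly alternating purines and pyrimidines of at least min_length nt."""
    seq = seq.upper()
    tracts = []
    i, n = 0, len(seq)
    while i < n:
        if base_class(seq[i]) is None:
            i += 1
            continue
        j = i + 1
        # Extend the run while each base belongs to the opposite class
        # of the base before it.
        while (j < n and base_class(seq[j]) is not None
               and base_class(seq[j]) != base_class(seq[j - 1])):
            j += 1
        if j - i >= min_length:
            tracts.append((i, j, seq[i:j]))
        i = j
    return tracts


candidates = alternating_tracts("TTTTGCGCGCGCGCAAAA")
```

Note that the reported tract includes the flanking T and A that happen to continue the alternation; ranking such tracts by their actual propensity to form Z-DNA requires an energy-based method such as those cited in the text.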
Z-DNA-forming sequences are often found near the transcriptional start sites of genes, suggesting a role for Z-DNA in transcriptional regulation. DNA supercoiling generated by transcription and/or chromatin remodeling at promoters favors Z-DNA formation. Z-DNA enforces a specific pattern of chromatin structure associated with gene activity by influencing the occupancy and positioning of nucleosomes [11–13]. The discovery of Z-DNA interacting proteins and disease-causing mutations in these proteins provides further evidence for the biological importance of this conformation. Genetic experiments also suggest that Z-DNA has a protective effect on the genome by reducing the potential for harmful non-B DNA structure formation [14] and even provides a source of genetic adaptation in natural populations [15, 16]. Collectively, it is now evident that Z-DNA formation is tightly linked to important genomic transactions. However, the full map of Z-DNA structures in the genome is not currently available, which slows progress in the field. The problem with Z-DNA detection derives from an important feature of this structure: it is short-lived,

Z-DNA in Supercoiled Genome


appears when the right condition is achieved at particular genomic locations, performs the required function(s), and disappears without leaving a mark on the genome. Early studies with antibodies to Z-DNA had the potential to provide evidence for this structure [17, 18]. However, it has been widely discussed that the antibodies might not only detect pre-folded structures but might also induce folding into Z-DNA of the sequences to which they bind [19]. The first Z-DNA maps of the genome were generated by using the Z-DNA-binding domain of the RNA editing enzyme ADAR as a probe, followed by separation of chromatin-bound probe and DNA sequencing [20, 21]. The resulting maps were markedly different, probably due to differences in chromatin preparation and sequencing. Importantly, only a few hundred Z-DNA sites were detected in the human genome, while algorithmic prediction delivered a few hundred thousand DNA regions with the potential to form Z-DNA [13]. This raises the question of how efficiently the probe can find its target in the competitive environment of nuclei. Therefore, it was important to develop orthogonal approaches to map Z-DNA structures with a probe that does not compete with molecular partners of the left-handed helix. A combination of potassium permanganate footprinting and high-throughput sequencing allowed generation of a high-resolution map of single-stranded DNA (characteristic of the junction between B-DNA and Z-DNA). Overlapping this map with sequences computationally predicted to form Z-DNA indicates that over 10% of them (two dozen thousand Z-DNA structures) are folded into a left-handed helix in the human genome [13]. Later, a similar map based on a kethoxal-assisted single-stranded DNA sequencing approach was developed [22]. However, the authors did not specifically look for Z-DNA formation.
Fig. 1 Z-DNA structures (green bar) mapping workflow: from top, counterclockwise direction. Linear genomic DNA is converted to a population of supercoiled circular DNA. ssDNA-seq is then applied to the supercoiled genome to detect single-stranded DNA regions (yellow ribbon). Overlapping computationally predicted sequences that have the potential to form Z-DNA with the specific pattern of sequencing tags allows detection of Z-DNA. The top right corner illustrates the Z-DNA-specific sequencing tag distribution on a supercoiled plasmid bearing a sequence (green bar) with Z-DNA formation potential. Forward and reverse strands of ssDNA-seq tags are shown in red and blue, respectively

The evident limitation of these single-stranded DNA approaches is the necessity to combine experimental methods with computational algorithms to predict Z-DNA-forming regions. Considering the competition between different non-B DNA structures in the context of supercoiled DNA, single-stranded DNA maps should be supplemented with empirical evidence of Z-DNA formation potential. Here we report the method we developed to force sequences with potential for Z-DNA formation throughout the mouse genome to adopt the Z-DNA conformation in vitro [13]. Linear genomic DNA is fragmented using restriction enzymes, ligated into circles, and supercoiled with topoisomerase I in the presence of ethidium bromide (Fig. 1). Potassium permanganate footprinting and high-throughput sequencing (ssDNA-seq) is then applied to the whole genome supercoiled in vitro. In ssDNA-seq, single-stranded DNA regions are modified by potassium permanganate, making these regions susceptible to cleavage by the single-strand-specific S1 nuclease. After nuclease digestion, double-stranded breaks are produced at the sites of the DNA chemical modification. These breaks are labeled with biotinylated nucleotides. After DNA sonication, DNA fragments surrounding the Z-DNA structure are enriched with streptavidin beads. These fragments are then sequenced using the high-throughput Illumina platform. Overlapping the sequencing signal with computationally predicted Z-DNA-forming sequences delivers a high-resolution map of Z-DNA structures formed in the genome. Z-DNA is separated from the flanking B-DNA by short patches of single-stranded regions (Fig. 1). It is expected that only a single base pair flips out at the junctions between the left- and right-handed double helices [23]. Thus, the sequencing tag distribution for this structure should be pairs of peaks bracketing the Z-DNA structure (Fig. 1). By increasing the depth of sequencing, this simple and characteristic pattern associated with Z-DNA formation will allow reading of the complete “Z-ome” without the need for computational predictions. Our methods can provide the map of Z-DNA in any genome and under defined experimental conditions.
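The restriction-fragmentation step of the workflow introduced above can be explored in silico before committing material. The Python sketch below computes fragment lengths for a linear sequence and the fraction of fragments in a chosen size range. The recognition sequences (ApoI as R^AATTY, TaqαI as T^CGA) are stated here from general knowledge rather than from this chapter and should be confirmed against the supplier's documentation; the function names and the 300–2000 bp default range are illustrative assumptions.

```python
# Minimal sketch: in silico restriction digest of a linear DNA sequence.
# Assumptions (verify with the enzyme supplier): ApoI cuts R^AATTY and
# TaqI/TaqαI cuts T^CGA; both sites are palindromic, so scanning one
# strand is sufficient.
import re

ENZYMES = {
    # name: (recognition pattern as a regex, cut offset within the site)
    "ApoI": ("[AG]AATT[CT]", 1),
    "TaqI": ("TCGA", 1),
}


def cut_positions(seq, enzyme):
    """Return 0-based cut positions on the top strand, including overlapping sites."""
    pattern, offset = ENZYMES[enzyme]
    return [m.start() + offset for m in re.finditer(f"(?={pattern})", seq.upper())]


def fragment_lengths(seq, enzyme):
    """Return the lengths of fragments produced by a complete digest."""
    bounds = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def fraction_in_range(lengths, low=300, high=2000):
    """Fraction of fragments whose length lies within [low, high] bp."""
    if not lengths:
        return 0.0
    return sum(low <= length <= high for length in lengths) / len(lengths)


toy_sequence = "CCCCGAATTCGGGGTCGACC"
apoI_fragments = fragment_lengths(toy_sequence, "ApoI")
taqI_fragments = fragment_lengths(toy_sequence, "TaqI")
```

Applied chromosome by chromosome to a reference genome, the same functions give the genome-wide fragment length distribution for each enzyme, which is the quantity used to compare candidate enzymes.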

2 Materials

2.1 Purification of High Molecular Weight DNA

1. 500 μg/mL RNase, DNase-free.
2. 20 mg/mL proteinase K.

2.2 Converting Linear Genomic DNA into Supercoiled DNA

• ApoI (NEB).
• TaqαI (NEB).
• T4 DNA Ligase (NEB), supplied with 10× T4 DNA Ligase Buffer.
• Lambda Exonuclease (NEB), supplied with 10× Lambda Exonuclease Buffer.
• Chloroquine.
• Ethidium bromide.
• 2 mg/mL glycogen.
• Isobutanol.

2.3 Enrichment of DNA Fragments Surrounding Single-Stranded DNA in Supercoiled Genome

1. Terminal Transferase (NEB), supplied with 10× Terminal Transferase reaction buffer and 2.5 mM CoCl2.
2. Nuclease S1 (Thermo Fisher Scientific), supplied with 5× Nuclease S1 buffer.
3. Topoisomerase 1B.
4. 1 mM Biotin-16-dUTP.
5. 100 mM dNTPs, PCR grade.
6. Amicon Ultra-2 mL Centrifugal Filters, 30 K (Millipore Sigma).
7. 14 M 2-mercaptoethanol.

2.4 Common Reagents

1. Ethanol 100%.
2. SYBR Green Nucleic Acid Gel Stain.
3. Agarose (agarose gels are prepared in TAE buffer).
4. Molecular biology-grade water.
5. 3.0 M sodium acetate.
6. Phenol/chloroform/isoamyl alcohol 25:24:1, Tris (pH 8.0) saturated.
7. DNA Ladder Mix.

2.5 Buffers

1. EB buffer: 10 mM Tris–HCl (pH 8.0).
2. Lysis buffer: 20 mM Tris–HCl (pH 7.5), 1% SDS, 100 mM EDTA.
3. Elution buffer: 10 mM Tris–HCl (pH 7.5), 1 mM EDTA, 1 M NaCl, 2 M 2-mercaptoethanol.
4. Topoisomerase IB buffer (5×): 250 mM Tris–HCl (pH 7.5), 250 mM KCl, 50 mM MgCl2, 2.5 mM DTT, 0.5 mM EDTA.
5. Potassium permanganate reaction buffer (10×): 200 mM Tris–HCl (pH 7.5), 1 M KCl, 10 mM MgCl2.
6. Potassium permanganate solution: 100 mM KMnO4 in molecular biology-grade water.
7. TAE buffer: 40 mM Tris, 20 mM acetic acid, 1 mM EDTA.
8. PBS: 137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 2 mM KH2PO4, without calcium and magnesium, pH 7.4.

2.6 Equipment

• Gel electrophoresis apparatus.
• Eppendorf DNA LoBind tube.
• DynaMag magnet for magnetic separation of Dynabeads.
• Ultrasonic sonicator (Bioruptor, Diagenode).
• Spectrophotometer.
• Tube wheel rotator, water bath, aspirator, and thermomixer.
• DNA gel imaging system.
• Access to Sequencing Facility (Illumina).

2.7 Kits

1. QIAquick PCR Purification Kit (Qiagen).
2. MinElute Reaction Cleanup Kit (Qiagen).
3. Dynabeads kilobaseBINDER Kit (Thermo Fisher Scientific).

3 Methods

3.1 Purification of High Molecular Weight DNA

1. Transfer 40 million cells to a centrifuge tube and recover them by centrifugation at 500 × g for 5 min at room temperature (RT) (see Note 1). Resuspend the cells in 5 mL of ice-cold PBS and add 5 mL of lysis buffer (see Note 2). Gently invert the tube eight times until the solution becomes viscous.
2. Digest the sample overnight with 200 μg/mL proteinase K at 55 °C.
3. Extract DNA twice with Phenol/Chloroform/Isoamyl Alcohol. Precipitate DNA with 2 volumes of 100% ethanol in the presence of 0.3 M sodium acetate (see Note 3).
4. Centrifuge at 4000 × g for 10 min. Remove the supernatant, add 70% ethanol, mix by inverting the sample, and centrifuge at 4000 × g for 5 min. Remove the supernatant and air-dry the pellet for 15 min.
5. Add to the pellet 500 μL of EB buffer supplemented with 5 μg of DNase-free RNase, and incubate for 12 h at room temperature with gentle rotation.
6. Incubate for 1 h at 55 °C with 200 μg/mL proteinase K. Bring the volume to 10 mL with EB buffer, extract the DNA with Phenol/Chloroform/Isoamyl Alcohol, and precipitate the DNA with ethanol as above.
7. Resuspend the DNA pellet in 500 μL of EB buffer and determine the DNA concentration with a spectrophotometer.
8. Run 2 μL of DNA on a 0.6% agarose gel (see Note 4). Stain the DNA in the gel with SYBR Green and visualize it using a gel imaging system.

3.2 Converting Linear Genomic DNA into Supercoiled DNA

1. Aliquot the DNA solution into two tubes (40 μg of DNA in each tube) (see Note 5).
2. Use ApoI restriction enzyme in the first tube and TaqαI restriction enzyme in the second one. Set up the restriction enzyme digest of genomic DNA in a final volume of 500 μL according to the supplier’s instructions (see Note 6). Incubate the reactions overnight at the required temperature (see Note 7).
3. Run 2 μL of DNA from each tube on a 0.6% agarose gel to check the digest efficiency. Stain the DNA in the gel with SYBR Green and visualize it using a gel imaging system.
4. Cool the digest reactions to room temperature (RT) and add an equal volume of Phenol/Chloroform/Isoamyl Alcohol. Mix the two phases by shaking the tubes for a few minutes and separate the two phases by centrifugation at 15,000 × g for 3 min at RT.
5. Transfer the aqueous phase into Eppendorf tubes (0.5 mL in each), and add 2 volumes of ice-cold 100% ethanol in the presence of 0.3 M sodium acetate and 20 μg of glycogen. Keep the tubes at -20 °C for 30 min and precipitate the DNA by centrifugation at 15,000 × g for 10 min at 4 °C.
6. Wash the DNA pellets with 70% ethanol, air-dry for 10 min, and dissolve the DNA in 100 μL of EB buffer. Measure DNA concentrations with a spectrophotometer.
7. In 50 mL tubes, set up a DNA ligation reaction for each DNA digest separately as follows: 100 μL of DNA, 2 mL of NEB T4 DNA Ligase Buffer (10×), 17.8 mL of H2O, and 50 μL of T4 DNA Ligase (2000 units/μL). Incubate the reactions overnight at 16 °C.
8. Concentrate the DNA solution in the ligation reactions by repeated extraction with isobutanol. Add an equal volume of isobutanol to each ligation reaction. Mix well and centrifuge at 1000 × g for 2 min. Remove and discard the upper (isobutanol) phase.
9. Perform several cycles of extraction until the volume of the ligation reaction goes down to approximately 2 mL (see Note 8).
10. Add the concentrated DNA solutions to an Amicon Centrifugal Filter Unit. Centrifuge the samples according to the manufacturer’s instructions until the volume is reduced to 200 μL. Discard the flow-through, add 1.8 mL of H2O, and repeat the centrifugation. Repeat the wash with water one more time. Recover the samples into Eppendorf tubes and adjust the volume with H2O to 0.5 mL.
11. Extract the samples with Phenol/Chloroform/Isoamyl Alcohol as in Subheading 3.2, step 4.
12. Precipitate the DNA samples with ethanol as in Subheading 3.2, step 5. Dissolve the DNA in 100 μL of EB buffer. Save 5 μL of DNA from each sample to use in Subheading 3.2, step 18.
13. Set up the Lambda Exonuclease digest of the two samples as follows: 95 μL of DNA, 30 μL of NEB Lambda Exonuclease Reaction Buffer (10×), 165 μL of H2O, and 10 μL of Lambda Exonuclease (5 units/μL). Incubate at 37 °C for 30 min (see Note 9).
14. Purify the DNA as in steps 11 and 12 above. Measure DNA concentrations with a spectrophotometer (see Note 10).
15. For each sample, set up the DNA topoisomerase 1B reaction as follows: 5 μg of DNA, 100 μL of topoisomerase IB buffer (5×), 2 μL of 10 mg/mL BSA, H2O to a final volume of 444 μL, and 50 μL of 8 μg/mL ethidium bromide solution.
16. Add 10 U of topoisomerase I and incubate the mixtures at 37 °C for 1 h (see Note 11).
17. Purify the supercoiled DNA with a QIAquick PCR Purification Kit, eluting the DNA samples into 50 μL of EB buffer. Measure DNA concentrations with a spectrophotometer. Aliquot 5 μL of DNA from each sample to use in the next step.
18. Perform two-dimensional gel electrophoresis topological analysis of the genomic DNA collected in steps 12 and 17, Subheading 3.2. For the first dimension, carry out the electrophoresis of samples on a 1.8% (w/v) agarose gel (20 × 20 cm) in TAE buffer supplemented with 13.3 μM chloroquine. Perform electrophoresis for 24 h at 2 V/cm, with buffer recirculation.
19. Soak the gel for 24 h in TAE buffer without chloroquine. Change the buffer three to four times.
20. Perform electrophoresis in the second dimension at 2 V/cm for 12–14 h, with buffer recirculation.
21. Stain the DNA in the gel with SYBR Green to visualize the pattern of genomic DNA migration (Fig. 2).
22. Combine the two DNA samples.
3.3 Enrichment of DNA Fragments Surrounding Single-Stranded DNA in Supercoiled Genome

1. Prepare the DNA solution for treatment with potassium permanganate as follows: 90 μL of DNA, 20 μL of permanganate reaction buffer (10×), and 90 μL of H2O (see Note 12). Incubate the solution at 37 °C for 1 h to equilibrate conformational transitions in the supercoiled DNA.


Fig. 2 Guide for two-dimensional electrophoresis topological analysis of supercoiled genomic DNA on a hypothetical agarose gel. Genomic DNA after the ligation reaction was loaded in well 1. This DNA has two populations: un-ligated linear fragments running on the diagonal and topologically closed relaxed circles. Under these electrophoresis conditions, circular DNA runs faster in the first dimension and slower in the second dimension (indicated by green and blue arrows, respectively). Genomic DNA after Lambda Exonuclease digestion and topoisomerase I treatment was loaded in well 2. This DNA is supercoiled circular DNA. It runs the slowest in the first dimension and the fastest in the second dimension

2. Add 12 μL of potassium permanganate stock solution to the DNA sample (see Note 13). Mix quickly and incubate at 37 °C for 3 min (see Note 14).
3. Add 16 μL of 2-mercaptoethanol to quench the reaction. Purify the chemically modified DNA with a QIAquick PCR Purification Kit, eluting the DNA sample into 50 μL of EB buffer.
4. Set up the S1 nuclease digest of genomic DNA by mixing 50 μL of DNA solution with 20 μL of 5× S1 nuclease buffer and H2O to a final volume of 100 μL.
5. Add 50 U of S1 nuclease to the tube. Incubate the reactions at 37 °C for 20 min (see Note 15).
6. Purify the digested DNA with a QIAquick PCR Purification Kit, eluting the DNA samples into 50 μL of EB buffer.
7. Set up the DNA tailing reaction by mixing 50 μL of S1-digested DNA, 20 μL of 10× TdT buffer, 20 μL of CoCl2 solution, 4 μL of 10 mM dCTP, 4 μL of 10 mM dATP, 100 U of Terminal Transferase, and H2O to a final volume of 200 μL. Incubate the reactions for 3 min at 37 °C.
8. Add 10 μL of 1 mM Biotin-16-dUTP and incubate at 37 °C for 30 min.
9. Purify the DNA tailing reaction with a QIAquick PCR Purification Kit, eluting the DNA samples into 100 μL of EB buffer (see Note 16).
10. Sonicate the biotinylated DNA to generate 200–400 bp DNA fragments. Perform 20 cycles of 30 s on/30 s off in an ice bath at medium power (ultrasonic sonicator Bioruptor, Diagenode). Spin the tube after 7 and 14 cycles to ensure homogeneous sonication (see Note 17).
11. Check the DNA fragment sizes by running 2 μL of DNA solution on a 1% agarose gel. Stain the DNA in the gel with SYBR Green and visualize it using a gel imaging system.
12. Capture the biotinylated DNA fragments by using the Dynabeads kilobaseBINDER Kit according to the manufacturer’s instructions. Transfer 50 μL of thoroughly resuspended beads to an Eppendorf DNA LoBind tube. Use the magnet to separate the beads from the supernatant. Resuspend the beads in 100 μL of binding buffer (provided with the beads). Place the tubes on the magnet and remove the supernatant. Resuspend the beads in 100 μL of binding buffer and add 100 μL of sonicated DNA solution.
13. Incubate the tube for 3 h at room temperature with rotation. Aliquot 10 μL of unbound DNA solution and purify it with a MinElute Reaction Cleanup Kit. Elute this DNA into 10 μL of EB buffer to use in Subheading 3.3, step 20.
14. Wash the beads four times with washing buffer (provided with the beads). For each wash, add 200 μL of washing buffer and incubate at 50 °C for 5 min with agitation on the thermomixer. Use the magnet to separate the beads from the washing buffer. Remove the supernatant and add fresh washing buffer.
15. After the last wash, add 200 μL of elution buffer to the beads and incubate the sample at 75 °C for 2 h with agitation on the thermomixer.
16. Use the magnet to separate the supernatant from the beads. Purify the free DNA fragments from the supernatant with a QIAquick PCR Purification Kit, eluting the DNA into 30 μL of EB buffer. Aliquot 2 μL of DNA to use in Subheading 3.3, step 20.
17. To remove the biotinylated tails from the DNA, incubate the sample with 25 U of S1 nuclease in 110 μL of 1× Nuclease S1 buffer for 15 min at 37 °C.


18. Purify the DNA with a QIAquick PCR Purification Kit, eluting the DNA into 30 μL of EB buffer.
19. Measure DNA concentrations with a spectrophotometer. Aliquot 2 μL of DNA to use in Subheading 3.3, step 20.
20. Run the DNA samples aliquoted at Subheading 3.3, steps 13, 16, and 19 on a 1% agarose gel. Stain the DNA in the gel with SYBR Green in TAE buffer (see Note 18).
21. The DNA sample recovered from the biotin–streptavidin selection is ready for the downstream steps, which involve high-throughput sequencing and computational analysis. The computational data analysis is outlined below.

3.4 Identification of Z-DNA Structures

1. Process raw sequencing data from this protocol using Illumina Analysis Pipeline software (image analysis, base calling, and quality scores).
2. Perform a quality control check of the raw sequencing data before doing any further analysis, for example, with FastQC, available at https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
3. Align the short sequencing reads to the reference genome using a sequence alignment tool, for example, Bowtie2 or BWA [24, 25]. Short reads that come from ligated fragments of distal restriction sites of ApoI and TaqαI can be ignored.
4. Generate a coverage track (bigWig or bedGraph) from the aligned reads using, for example, the deepTools tool set [26]. The coverage track can be used to visualize the data in the UCSC Genome Browser or Integrative Genomics Viewer (IGV) and to inspect Z-DNA motifs of interest.
5. To find all possible occurrences of potential Z-DNA motifs, use Z-Hunt II (PMID: 3780676) or the SIBZ program [7, 9]. Alternatively, a precomputed list of Z-DNA motif locations in various genomes can be downloaded from the non-B DB database [27].
6. For each genomic occurrence of a Z-DNA sequence motif, count the number of short reads overlapping two windows of length 500 bp and 1000 bp centered at the motif using the htseq-count script [28].
7. For each of these motifs, compute a p-value for the observed number of reads in the 500 bp window within the 1000 bp window, as defined in the previous step, using the binomial distribution.
8. To find a reasonable p-value cutoff, use a permutation test, i.e., randomly shuffle read location within the 500 bp windows, and compute p-values for the number of randomized reads found in the 500 bp windows within the 1000 bp windows.


9. Use a p-value cutoff that corresponds to a false discovery rate of 5% computed from the randomized data. Z-DNA motifs with a p-value below this cutoff can be considered regions forming Z-DNA structure.
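The window-based test in steps 6 and 7 can be sketched in Python as follows. The chapter specifies a binomial distribution but not its parameters, so this sketch makes explicit assumptions: reads are represented by single positions, the null hypothesis is that reads in the 1000 bp window fall uniformly (success probability 500/1000 for the inner window), and the test is one-sided for enrichment. All function names are hypothetical; read counting in the protocol itself is done with htseq-count.

```python
# Minimal sketch: binomial enrichment test for reads near a Z-DNA motif.
# Assumptions (not specified in the protocol): uniform null within the
# outer window and a one-sided upper-tail test.
from math import comb


def count_reads_in_window(read_positions, center, width):
    """Count read positions falling in [center - width/2, center + width/2)."""
    half = width // 2
    return sum(center - half <= pos < center + half for pos in read_positions)


def binomial_upper_tail(k, n, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def motif_p_value(read_positions, motif_center, inner=500, outer=1000):
    """P-value for the number of reads in the inner window, given the
    number of reads in the outer window centered on the same motif."""
    n = count_reads_in_window(read_positions, motif_center, outer)
    k = count_reads_in_window(read_positions, motif_center, inner)
    if n == 0:
        return 1.0  # no reads, no evidence of enrichment
    return binomial_upper_tail(k, n, inner / outer)


# Five reads close to the motif at position 1000 and one farther away.
reads = [900, 950, 1000, 1050, 1100, 600]
p_value = motif_p_value(reads, motif_center=1000)
```

Small p-values indicate that reads are concentrated around the motif. The permutation step can reuse `motif_p_value` on shuffled read positions to obtain the null distribution of p-values from which the cutoff is chosen.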

4 Notes

1. This protocol is designed for mouse activated B cells grown in suspension. It can be easily modified for adherent cells or for cells from other organisms. High molecular weight DNA might also be prepared with any standard commercially available kit.
2. Lysis buffer should be kept at room temperature.
3. After addition of ethanol to the DNA solution with sodium acetate, mix by gently inverting the sample until the DNA fully precipitates (a white cloud is formed).
4. This step is performed to check the quality of the DNA, which should not show an excessive degradation pattern.
5. This amount of DNA is required to control all intermediate steps of the procedure and to visualize DNA on the agarose gel. We recommend following all the checkpoints for the first experiments. With experience, one can scale this amount down to 4 μg of DNA.
6. DNA supercoiling in vivo is expected to propagate at kilobase scale. Consequently, we chose restriction enzymes that yield a high number of DNA fragments with lengths between 300 and 2000 bp. Restriction enzymes cut double-stranded DNA at specific positions near their recognition site. The R Bioconductor function DigestDNA computes restriction enzyme cutting maps for provided DNA recognition sequences and query genomic sequences (http://www2.decipher.codes/). These maps make it possible to find the expected DNA fragments between consecutive restriction enzyme cutting positions and the genome-wide distribution of their lengths for each restriction enzyme.
7. A set of two enzymes was chosen to obtain broader genome coverage.
8. At the last cycle of extraction, addition of isobutanol can cause the aqueous phase to disappear. If this happens, gradually add H2O to the tubes and mix until an aqueous phase reappears.
9. Lambda Exonuclease digests linear or nicked double-stranded DNA. This step ensures that the only double-stranded DNAs left in the reaction are topologically closed DNA circles. Some single-stranded DNA can survive the digestion; however, it will be efficiently eliminated at the later steps of the protocol.
10. Depending on the efficiency of the ligation reaction, the yield is expected to be 6–10 μg of DNA at this step.
11. This procedure generates a mixture of DNA topoisomers with an average supercoiling density of approximately -0.06, which is the supercoiling density expected in vivo. If the amount of DNA in the reaction is different, the ethidium bromide concentration should be adjusted to keep the same ratio (w/w) of ethidium bromide to DNA.
12. Do not use DEPC-treated water, which sometimes inhibits the permanganate reaction.
13. Dissolve KMnO4 in water with constant shaking for at least 1 h. Prepare the stock solution on the day of the experiment. Keep the permanganate solution protected from light.
14. Treatment of the supercoiled genome with permanganate results in chemical modification of single-stranded regions, making them susceptible to cleavage by the single-strand-specific S1 nuclease.
15. S1 nuclease converts the sites of DNA chemical modification into DNA breaks.
16. At this step, DNA ends exposed by nuclease treatment are biotinylated.
17. If using another ultrasonic sonicator, sonication parameters should be determined in preliminary experiments to ensure the correct size of DNA fragments.
18. This step is performed to check the quality of the final DNA sample and the efficiency of the removal of biotinylated tails from the DNA fragments. The population of DNA fragments with biotinylated tails released from the beads migrates more slowly than the DNA that did not bind to the beads. Removing the tails with S1 nuclease results in similar migration of unbound and bound DNA.

References

1. Baranello L, Levens D, Gupta A, Kouzine F (2012) The importance of being supercoiled: how DNA mechanics regulate dynamic processes. Biochim Biophys Acta 1819(7):632–638. https://doi.org/10.1016/j.bbagrm.2011.12.007
2. Kouzine F, Levens D (2007) Supercoil-driven DNA structures regulate genetic transactions. Front Biosci 12:4409–4423. https://doi.org/10.2741/2398

3. Teves SS, Henikoff S (2014) Transcriptiongenerated torsional stress destabilizes nucleosomes. Nat Struct Mol Biol 21(1):88–94. https://doi.org/10.1038/nsmb.2723 4. Kouzine F, Gupta A, Baranello L, Wojtowicz D, Ben-Aissa K, Liu J, Przytycka TM, Levens D (2013) Transcriptiondependent dynamic supercoiling is a shortrange genomic force. Nat Struct Mol Biol 20(3):396–403. https://doi.org/10.1038/ nsmb.2517

192

Fedor Kouzine et al.

5. Naughton C, Avlonitis N, Corless S, Prendergast JG, Mati IK, Eijk PP, Cockroft SL, Bradley M, Ylstra B, Gilbert N (2013) Transcription forms and remodels supercoiling domains unfolding large-scale chromatin structures. Nat Struct Mol Biol 20(3):387–395. https://doi.org/10.1038/nsmb.2509 6. Rich A, Zhang S (2003) Timeline: Z-DNA: the long road to biological function. Nat Rev Genet 4(7):566–572. https://doi.org/10. 1038/nrg1115 7. Ho PS, Ellison MJ, Quigley GJ, Rich A (1986) A computer aided thermodynamic approach for predicting the formation of Z-DNA in naturally occurring sequences. EMBO J 5(10): 2737–2744 8. Beknazarov N, Jin S, Poptsova M (2020) Deep learning approach for predicting functional Z-DNA regions using omics data. Sci Rep UK 10(1):19134. https://doi.org/10.1038/ s41598-020-76203-1 9. Zhabinskaya D, Benham CJ (2011) Theoretical analysis of the stress induced B-Z transition in superhelical DNA. PLoS Comput Biol 7(1): e1001051. https://doi.org/10.1371/journal. pcbi.1001051 10. Zhabinskaya D, Benham CJ (2013) Competitive superhelical transitions involving cruciform extrusion. Nucleic Acids Res 41(21): 9610–9621. https://doi.org/10.1093/nar/ gkt733 11. Liu H, Mulholland N, Fu H, Zhao K (2006) Cooperative activity of BRG1 and Z-DNA formation in chromatin remodeling. Mol Cell Biol 26(7):2550–2559. https://doi.org/10.1128/ MCB.26.7.2550-2559.2006 12. Mulholland N, Xu Y, Sugiyama H, Zhao K (2012) SWI/SNF-mediated chromatin remodeling induces Z-DNA formation on a nucleosome. Cell Biosci 2:3. https://doi.org/10. 1186/2045-3701-2-3 13. Kouzine F, Wojtowicz D, Baranello L, Yamane A, Nelson S, Resch W, Kieffer-Kwon KR, Benham CJ, Casellas R, Przytycka TM, Levens D (2017) Permanganate/S1 nuclease footprinting reveals non-B DNA structures with regulatory potential across a mammalian genome. Cell Syst 4(3):344–356. e347. https://doi.org/10.1016/j.cels.2017.01.013 14. 
Edwards SF, Sirito M, Krahe R, Sinden RR (2009) A Z-DNA sequence reduces slippedstrand structure formation in the myotonic dystrophy type 2 (CCTG) x (CAGG) repeat. Proc Natl Acad Sci U S A 106(9):3270–3275. https://doi.org/10.1073/pnas.0807699106 15. Zhao J, Bacolla A, Wang G, Vasquez KM (2010) Non-B DNA structure-induced genetic

instability and evolution. Cell Mol Life Sci 67(1):43–62. https://doi.org/10.1007/ s00018-009-0131-2 16. Xie KT, Wang G, Thompson AC, Wucherpfennig JI, Reimchen TE, MacColl ADC, Schluter D, Bell MA, Vasquez KM, Kingsley DM (2019) DNA fragility in the parallel evolution of pelvic reduction in stickleback fish. Science 363(6422):81–84. https://doi.org/10. 1126/science.aan1425 17. Wittig B, Wolfl S, Dorbic T, Vahrson W, Rich A (1992) Transcription of human c-myc in permeabilized nuclei is associated with formation of Z-DNA in three discrete regions of the gene. EMBO J 11(12):4653–4663 18. Wolfl S, Wittig B, Rich A (1995) Identification of transcriptionally induced Z-DNA segments in the human c-myc gene. Biochim Biophys Acta 1264(3):294–302. https://doi.org/10. 1016/0167-4781(95)00155-7 19. van Holde K, Zlatanova J (1994) Unusual DNA structures, chromatin and transcription. BioEssays 16(1):59–68. https://doi.org/10. 1002/bies.950160110 20. Shin SI, Ham S, Park J, Seo SH, Lim CH, Jeon H, Huh J, Roh TY (2016) Z-DNA-forming sites identified by ChIP-Seq are associated with actively transcribed regions in the human genome. DNA Res 23(5):477–486. https:// doi.org/10.1093/dnares/dsw031 21. Li H, Xiao J, Li J, Lu L, Feng S, Droge P (2009) Human genomic Z-DNA segments probed by the Z alpha domain of ADAR1. Nucleic Acids Res 37(8):2737–2746. https:// doi.org/10.1093/nar/gkp124 22. Wu T, Lyu R, You Q, He C (2020) Kethoxalassisted single-stranded DNA sequencing captures global transcription dynamics and enhancer activity in situ. Nat Methods 17(5): 515–523. https://doi.org/10.1038/s41592020-0797-9 23. Ha SC, Lowenhaupt K, Rich A, Kim YG, Kim KK (2005) Crystal structure of a junction between B-DNA and Z-DNA reveals two extruded bases. Nature 437(7062): 1183–1186. https://doi.org/10.1038/ nature04088 24. Langmead B, Salzberg SL (2012) Fast gappedread alignment with bowtie 2. Nat Methods 9(4):357–359. https://doi.org/10.1038/ nmeth.1923 25. 
Li H, Durbin R (2009) Fast and accurate short read alignment with burrows-wheeler transform. Bioinformatics 25(14):1754–1760. https://doi.org/10.1093/bioinformatics/ btp324

Z-DNA in Supercoiled Genome 26. Ramirez F, Ryan DP, Gruning B, Bhardwaj V, Kilpert F, Richter AS, Heyne S, Dundar F, Manke T (2016) deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44(W1):W160–W165. https://doi.org/10.1093/nar/gkw257 27. Cer RZ, Bruce KH, Mudunuri US, Yi M, Volfovsky N, Luke BT, Bacolla A, Collins JR, Stephens RM (2011) Non-B DB: a database of

193

predicted non-B DNA-forming motifs in mammalian genomes. Nucleic Acids Res 39 (Database issue):D383–D391. https://doi. org/10.1093/nar/gkq1170 28. Anders S, Pyl PT, Huber W (2015) HTSeq – a python framework to work with highthroughput sequencing data. Bioinformatics 31(2):166–169. https://doi.org/10.1093/ bioinformatics/btu638

Chapter 14

Thermogenomic Analysis of Left-Handed Z-DNA Propensities in Genomes

Ryan S. Czarny and P. Shing Ho

Abstract

The initial discovery of left-handed Z-DNA was met with great excitement as a dramatic alternative to the right-handed double-helical conformation of canonical B-DNA. In this chapter, we describe the workings of the program ZHUNT as a computational approach to mapping Z-DNA in genomic sequences using a rigorous thermodynamic model for the transition between the two conformations (the B–Z transition). The discussion starts with a brief summary of the structural properties that differentiate Z- from B-DNA, focusing on those properties that are particularly relevant to the B–Z transition and the junction that splices a left- to right-handed DNA duplex. We then derive the statistical mechanics (SM) analysis of the zipper model that describes the cooperative B–Z transition and show that this analysis very accurately simulates this behavior of naturally occurring sequences that are induced to undergo the B–Z transition through negative supercoiling. A description of the ZHUNT algorithm and its validation are presented, followed by how the program had been applied for genomic and phylogenomic analyses in the past and how a user can access the online version of the program. Finally, we present a new version of ZHUNT (called mZHUNT) that has been parameterized to analyze sequences that contain 5-methylcytosine bases and compare the results of the ZHUNT and mZHUNT analyses on native and methylated yeast chromosome 1.

Key words B–Z transition, Zipper model, Statistical mechanics, 5-Methylcytosine

1 Introduction

1.1 Background

Studies to elucidate the structures of polynucleic acids accelerated with the discovery that DNA is the molecular blueprint that carries the cell’s genetic information [1, 2]. It is fair to state that Watson and Crick’s right-handed double-helical model of the hydrated form of DNA (B-DNA; Fig. 1) [3] has transformed modern biology and has become an iconic symbol of modern science. It was shocking, then, when the first atomic structure of DNA turned out to be the left-handed double helix called Z-DNA (Fig. 1) [4]. Since its discovery in 1979, research on Z-DNA has been a roller-coaster ride, but the field is seeing renewed interest in the potential role of Z-DNA in a number of human diseases, including cancer and Alzheimer’s [5, 6]. This chapter

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_14, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023


Fig. 1 Structures of B-DNA (left) and Z-DNA (right). Shown are two dodecanucleotide sequences of DNA looking into the major grooves at the center. The right-handed B-DNA structure is of the so-called Drew–Dickerson dodecamer sequence d(CGCGAATTCGCG)2 [52]. The left-handed Z-DNA structure shown is that of the dodecamer sequence (5′-CGCGCGCGCGCG-3′)2 (PDB code 4OCB [53]), which is twice the length of the original 1979 Z-DNA structure [4]

describes how the program ZHUNT is used to predict the thermodynamic propensity of genomic DNAs to adopt this left-handed form. In addition, we present mZHUNT, which we have developed to analyze the ability of genomes with 5-methylcytosines (5mC) to form Z-DNA. From the start, Z-DNA has been an enigmatic form of DNA, particularly in terms of its functional relevance in the cell. The initial evidence for a left-handed form of DNA came from studies by Robert Wells’ group on alternating d(IC) polymers, which showed an unusual fiber X-ray diffraction pattern and an inverted circular dichroism (CD) spectrum [7]. Pohl and Jovin subsequently found that polymers of alternating CG base pairs (bps) in high salt solutions also had inverted CD spectra relative to “standard” DNAs, presenting, for the first time, a possible left-handed structure of DNAs with standard Watson–Crick base pairs [8]. These spectroscopic results, however, provide only indirect evidence for an opposite handedness in the helical conformation. The definitive evidence for a left-handed double helix came from the single-crystal structure of the self-complementary hexanucleotide sequence d(CGCGCG)2 from Alexander Rich’s group (Fig. 1) [4]. The crystal structure of B-DNA was published shortly afterward by Richard Dickerson’s group [9], allowing a direct comparison between what is recognized as the canonical DNA


Fig. 2 The anti and syn conformations (a) and associated sugar conformations (b) of nucleotides. In the anti conformation, the base is rotated around χ such that it extends away from the sugar, while in the syn conformation the base is rotated to sit above the sugar. The syn conformation is associated with the C3′-endo conformation of the associated deoxyribose sugar, while the anti conformation is typically associated with the C2′-endo sugar conformation. (Adapted from Ref. [54])

relative to the alternative Z-DNA structures. In addition to the opposite handedness of the helical twists, the bases are all in the lower energy anti conformation in B-DNA, while they alternate between the anti and alternative syn conformations in Z-DNA (Fig. 2) [10]. In Z-DNA, the guanine bases adopt the energetically less favorable syn orientation, while the cytosines remain anti. The result is that the left-handed structure of the alternating sequence shows a characteristic zigzag pattern (hence the name “Z-DNA”), with the dinucleotide (dn) as the repeating unit. The preference for purines to adopt the syn conformation and pyrimidines the anti conformation, along with the alternating anti/syn motif, imposes a general preference for Z-DNA in alternating pyrimidine/purine (Py/Pu) sequences. Within the alternating Py/Pu motif, the preference is for CG > CA/TG > TA dinucleotides. This preference was shown to arise primarily from the differences in solvent interactions with C·G and T·A base pairs in the Z- versus B-DNA structures [11, 12].

1.2 Energetics of Z-DNA Formation

The initial skepticism concerning the biological relevance of Z-DNA arose from the need for high salt solutions (in the molar range of NaCl) as found in the initial spectroscopic studies of Pohl and Jovin. Subsequent solution studies have found that the high monovalent cation requirements could be mitigated by multivalent cations (including Mg2+ or polyamines) [13–15]. The more significant observations are that Z-DNA could be induced to form by negative supercoiling (NSC), either in closed circular DNAs such as plasmids [16–19], potentially through release of nucleosome


Fig. 3 Zipper model for the B-DNA to Z-DNA transition (B–Z transition). The B–Z transition is initiated by left-handed rotations (−ΔTw) of B-DNA. The initial step (nucleation) is the high energy, low probability unwinding and melting of ~4 bps (equivalent to 2 B–Z junctions). As the two junctions migrate in opposite directions, Z-DNA is extended in lower energy, higher probability propagation steps between the two junctions. (Adapted from Ref. [54])

particles [20, 21], or in the wake of a transcribing RNA polymerase [22] along genomic DNAs. Finally, methylation of cytosine bases at the 5-position was seen to help stabilize the Z-conformation [13, 14]. By inserting defined sequences into negatively supercoiled plasmids, the energetics for the transition from B-DNA to Z-DNA (the B–Z transition) were determined for all possible combinations of base pairs, including alternating and non-alternating C·G or T·A dinucleotides. The B–Z transition for any dinucleotide ranges from slightly unfavorable (+0.6 kcal/mol for alternating CG) to highly unfavorable (+4.4 kcal/mol for TT/AA) [23, 24]. The energetics of a B–Z transition within the context of a B-DNA background, however, is not simply the sum of these dinucleotide energies, but must also take into consideration the structural transition between the right- and left-handed double helices (B–Z junction; Fig. 3). The B–Z junction must not only change the direction of the helical twist but must also invert its “sense,” i.e., the directions of the major and minor grooves have been inverted [25]. The structure of the B–Z junction was determined by Ha et al., showing it to essentially be a set of unpaired or “melted” base pairs [26]. As unpaired nucleotides, the B–Z junction is inherently unstable, determined to be +5 kcal/mol/junction. For a potential Z-DNA sequence flanked on either side by B-DNA, there would be two B– Z junctions, and, therefore, a B–Z transition for a Z-sequence insert would necessarily require formation of two such junctions. Consequently, the energy for a B–Z transition in a genome-like


context would be unfavorable to highly unfavorable, even for a fully alternating CG sequence. The unfavorable energetics raises the question of whether genomic Z-DNA actually exists. The answer comes from coupling these B–Z transition energies to the NSC driver of the B–Z transition in a potential genomic setting. Since Z-DNA has a negative twist, the formation of 1 turn of Z-DNA (12 bps) relaxes ~2 turns of NSCs. Since each B–Z junction is essentially melted, it also contributes to the relaxation of NSCs (~0.4 turns per junction, consistent with the 0.8 turns relaxed by the two junctions formed at nucleation) [18]. Thus, as the length of a Z-DNA sequence increases, the B–Z transition energy costs increase, but the number of NSCs relaxed also increases. The B–Z transition energies increase linearly with length, while the supercoiling energy increases as the square of the number of supercoils. Consequently, at some point, the energy released by relaxing NSCs is enough to overcome the B–Z transition energies, thereby allowing formation of Z-DNA in a supercoiled genomic context.

1.3 Statistical Mechanics of the Zipper Model for the B–Z Transition

The question at this point is how to put the B–Z transition and NSC energies into a predictive algorithm for Z-DNA. This process starts with a basic “zipper” model derived for coil–helix transitions in proteins [27] and adapted by Peck and Wang for the B–Z transition in a simple alternating CG sequence in negatively supercoiled plasmids [18]. The concept behind the zipper model is that highly cooperative transitions in polymers start with a high energy, low probability initiation (nucleation), followed by a series of lower energy, high probability extensions (propagation). Peck and Wang’s zipper model (Fig. 3) can be described as starting the B–Z transition with formation of 2 B–Z junctions (essentially melting 4 bps of B-DNA within a Z-DNA sequence), an energetically very expensive (+10 kcal/mol) nucleation step that relaxes 0.8 turns of NSC [18]. As the junctions migrate in opposite directions, individual CG dinucleotides sandwiched between the junctions can systematically convert from B- to Z-DNA (the propagation steps), each with an energetic cost of 0.6 kcal/mol/dn and each relaxing (or changing the writhe, ΔWr, by) 0.375 turns/dn of NSC. The equilibrium free energy for any number of Z-DNA dinucleotides (dnZ) in a plasmid with a specific initial number of NSCs is now the sum of the energetic costs from the initiation step and the propagation through the number dnZ in the Z-forming insert (10 kcal/mol + dnZ × 0.6 kcal/mol/dn), minus the energy gained from the NSCs relaxed by the initiation and by each propagation step. It should be noted that the NSC energy of the system is in terms of the resulting change in the linking number (ΔLk, a measure of the overall number of supercoils of a closed circular DNA that is all B-DNA) as a result of the change in the helical twist of the Z-DNA insert (ΔTw) and the change in the writhe (ΔWr) of


the negatively supercoiled DNA (ΔLk = ΔTw + ΔWr). The ΔG°(NSC) is proportional to ΔWr², or ΔG°(NSC) = KΔWr² = K(ΔLk − ΔTw)². In order to derive the probability of Z-DNA formation for any particular CG dinucleotide at any particular starting NSC level, the overall free energies are converted to fractions of Z-DNA (fZ) based on the standard Gibbs relationship (fZ = [Z-DNA]/[B-DNA] = e^(−ΔG°(Z−B)/RT)), where ΔG°(Z−B) is the difference in free energy between the left- and right-handed conformations, taking into account the B–Z junction energy (nucleation energy), the B–Z transition energies for the CG dinucleotide(s) (depending on the number of dinucleotides that propagate through the transition), and the energy of unwinding associated with the nucleation and propagation terms. This calculation is only for a single state (i.e., a defined number of CG dinucleotides in the Z-form). For any particular NSC level, a Z-DNA sequence can have all dinucleotides fully in the B-form, fully in the Z-form, or some fraction in between. Thus, the effective fraction of Z-DNA must consider the probability that the DNA is in the left-handed form across a population of molecules. This is where we apply an SM analysis of the zipper model. In SM analysis, the probability of each state (each possible number of dinucleotides as Z-DNA) is compared to all possible states (the sum of all states, also known as the partition function, or Q). Without deriving it, the partition function for the zipper model of a B–Z transition for a sequence of n alternating CG dinucleotides is shown in Eq. 1.

$$Q = 1 + \sum_{i=1}^{n} \sum_{k=1}^{n} \sigma S^{k} e^{\{(-K/RT)[\Delta Lk - k(0.375) - 0.8]^{2}\}} \qquad (1)$$

In this equation, σ is the nucleation probability (= e^(−(10 kcal/mol)/RT)), S is the propagation probability (= e^(−(0.6 kcal/mol)/RT)), K is the proportionality constant for supercoiling (= 1100 RT/N, with N being the size of the supercoiled domain or closed circular plasmid), ΔLk is the total linking number or the starting number of NSCs of the supercoiled domain (or closed circular plasmid), k is the counter for the number of CG dinucleotides that undergo the B–Z transition (from 1 to n dinucleotides), and i is the counter for each possible state from 1 to n. The propensity for the formation of Z-DNA in a length of CG dinucleotides is calculated as the average twist (⟨ΔTw⟩) of the DNA, as defined in Eq. 2.

$$\langle \Delta Tw \rangle = \frac{\sum_{i=1}^{n} \sum_{k=1}^{n} \left(-k(0.375) - 0.8\right) \sigma S^{k} e^{\{(-K/RT)[\Delta Lk - k(0.375) - 0.8]^{2}\}}}{Q} \qquad (2)$$
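A minimal numerical sketch of the zipper model in Eqs. 1 and 2 is given below for n alternating CG dinucleotides. It is not the ZHUNT code. The function name, the value RT = 0.593 kcal/mol (about 25 °C), and entering ΔLk as a positive count of negative supercoils are assumptions. The twist change of a state with k Z-DNA dinucleotides is taken as −(0.375k + 0.8) turns, matching the unwinding term inside the exponent. One further assumption goes beyond the printed equations: the supercoiling term of each state is taken relative to the all-B state, K[(ΔLk − 0.375k − 0.8)² − ΔLk²], so that the reference weight of 1 in Q corresponds to fully B-DNA, in line with the description of the free energy as the transition cost minus the supercoiling energy relaxed. The sum over i is kept as written, which gives a factor of n because the summand does not depend on i.

```python
import math

RT = 0.593             # kcal/mol near 25 °C (assumed temperature)
DG_NUCLEATION = 10.0   # kcal/mol for the two B–Z junctions (from the text)
DG_CG = 0.6            # kcal/mol per CG dinucleotide (from the text)
TW_PER_DN = 0.375      # turns relaxed per Z-DNA dinucleotide (from the text)
TW_JUNCTIONS = 0.8     # turns relaxed by the two junctions (from the text)


def zipper_average_twist(n_dn, delta_lk, domain_bp=4363):
    """Return (Q, <ΔTw>) for n_dn alternating CG dinucleotides.

    delta_lk is the number of negative supercoils, entered as a positive
    value (an assumption of this sketch). domain_bp is N in K = 1100 RT/N.
    """
    k_over_rt = 1100.0 / domain_bp          # K/RT
    sigma = math.exp(-DG_NUCLEATION / RT)   # nucleation probability
    s = math.exp(-DG_CG / RT)               # propagation probability
    q = 1.0                                 # weight of the all-B reference state
    twist_sum = 0.0
    for k in range(1, n_dn + 1):
        unwound = k * TW_PER_DN + TW_JUNCTIONS
        # Supercoiling energy relative to the all-B state (assumption).
        exponent = -k_over_rt * ((delta_lk - unwound) ** 2 - delta_lk ** 2)
        # The factor n_dn stands for the sum over i as written in Eq. 1.
        weight = n_dn * sigma * s ** k * math.exp(exponent)
        q += weight
        twist_sum += -unwound * weight
    return q, twist_sum / q


if __name__ == "__main__":
    # Scan the supercoiling level for 12 CG dinucleotides in a 4363 bp plasmid.
    for dlk in (8, 12, 14, 16, 20):
        q, tw = zipper_average_twist(12, dlk)
        print(f"dLk = {dlk:2d}  <dTw> = {tw:6.2f}")
```

Scanning ΔLk in this way shows the cooperative behavior described above: the average twist stays near zero at low supercoiling and approaches the fully converted value of −(0.375n + 0.8) turns once enough supercoils are available.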

1.4 ZHUNT: A Computational Approach to Mapping Z-DNA in Genomes

Soon after the discovery of Z-DNA, it became important to determine where Z-DNA would occur in a genome in order to address the question of the biological relevance of the structure. Earlier studies had experimentally mapped Z-DNA using electron microscopy (EM) to visualize where antibodies raised against the left-handed conformation were localized along a genomic sequence [28]. These studies were initially plagued by questions of whether the methods to fix the DNA onto surfaces for EM visualization were inducing the DNA to flip its handedness [29]. Eventually, milder fixing conditions were developed such that antibody assays became reliable both for locating Z-DNA and estimating the probability of its formation along genomic DNAs [30, 31]. There remained, however, a number of technical challenges to using this assay routinely to map Z-DNA. In addition, there was the question of whether antibodies could in themselves be inducing a B–Z transition as a consequence of the tight binding equilibria. In 1985, Michael Ellison and the corresponding author were postdoctoral fellows in Dr. A. Rich’s laboratory [32] and considered the question of whether the occurrence of Z-DNA could be computationally predicted in genomes based on thermodynamic criteria, an approach we now call thermogenomics [33]. At the time, Dr. Ellison had applied two-dimensional gel electrophoresis analyses to determine the B–Z propagation free energies (ΔGP) of various alternating and non-alternating dinucleotides [23, 24]. These experimental ΔGP values allowed us to derive the propagation probabilities (S) for all possible dinucleotides, which could then be incorporated into a set of extended equations for the partition function (Q; Eq. 3) and average twist (⟨ΔTw⟩; Eq. 4).

$$Q = 1 + \sum_{i=1}^{n} \sum_{k=1}^{n} \sigma \left( \prod_{j=1}^{k} S_{j} \right) e^{\{(-K/RT)[\Delta Lk - k(0.375) - 0.8]^{2}\}} \qquad (3)$$

$$\langle \Delta Tw \rangle = \frac{\sum_{i=1}^{n} \sum_{k=1}^{n} \left(-k(0.375) - 0.8\right) \sigma \left( \prod_{j=1}^{k} S_{j} \right) e^{\{(-K/RT)[\Delta Lk - k(0.375) - 0.8]^{2}\}}}{Q} \qquad (4)$$

In these equations, Sj is the propagation probability of the dinucleotide at position j along the Z-DNA sequence, and K was set equal to 1100RT/N (where N is set to 4363, the size of the pBR322 plasmid used to experimentally determine the ΔGP values). Using these relationships, we can calculate the number of NSC (ΔLk) required to convert, on average, one base pair fully into Z-DNA (⟨ΔTw⟩ = −1.0) for any sequence. The SM model was put to the test when a study was published showing that the CA/TG-type sequences found in the rat somatostatin and prolactin genes showed complex behaviors indicative of left-handed conformations, but perhaps not that of Z-DNA


Fig. 4 Validation of the statistical mechanics model for the B–Z transition (a) and the program ZHUNT (b and c). (a) Simulation of the response of the promoter sequence from the rat prolactin gene to ΔLk (initial negative supercoiling, NSC), using an extension of the ZHUNT SM model. The data from two separate two-dimensional gel electrophoresis analyses are shown in the closed circles and open squares. The SM simulation of the unwinding associated with the formation of Z-DNA (ΔTw) in the sequence is shown as a solid line. Both the experiments and simulation show three distinct B–Z transitions in the sequence. (Data from Ref. [35]). (b) Antibody mapping of Z-DNA in the genome of the ϕX-174 virus. The height of each bar represents the probability of observing an antibody at that position by electron microscopy [31]. It should be noted that the origin of the sequence from this study is not the same as the starting nucleotide in the current sequence database. (c) ZHUNT prediction of Z-DNA in the ϕX-174 genome. The starting nucleotide for this analysis was aligned with that defined in the antibody studies in (b). (Data from Ref. [32])

[34]. Applying the SM analysis showed that the program could very accurately simulate the behavior of these naturally occurring and complex gene sequences (Fig. 4a) [35]. Thus, the basic thermodynamic principles were shown to be fundamentally correct and could be incorporated into a general algorithm to search for Z-DNA in genomic sequences. This extended SM model of the B–Z transition was implemented in the program ZHUNT to analyze genomic sequences for their abilities to form left-handed Z-DNA [32]. A complete SM analysis of an entire genome, such as the 3.2 billion nucleotides of the human genome, is impossible. Thus, the ZHUNT analysis uses a sliding window strategy to identify stretches of sequences with propensities to form Z-DNA (we call these Z-DNA regions, or ZDRs). Although a ZDR can be of any length, the typical window size for scanning a sequence is between 6 and 12 dns (one to two full turns of Z-DNA). The actual size of the window is set by the program by first assigning an anti (A) or syn conformation (S) to each nucleotide in all possible combinations of A and S, then


finding that combination and window with the lowest summed ΔGP. It is this set of ΔGP values and window size that are applied in the full SM calculation. The ZHUNT analysis was applied to calculate the ΔLk for ⟨ΔTw⟩ = −1.0 for 80,000 random sequences of various lengths in order to determine a normal distribution of Z-DNA propensities, which allowed for the development of a statistically meaningful definition of a significant ZDR. The ΔLk for ⟨ΔTw⟩ = −1.0 calculated for any segment of Z-DNA is compared to the standard deviation of this distribution, which then allows the Z-score (defined as the number of random base pairs that must be searched to find a sequence that is as good or better at forming Z-DNA) to be determined for the segment. It is this Z-score that is used as a measure of the thermodynamic propensity of a particular segment for forming Z-DNA in the context of a B-DNA background. A Z-score ≥ 580 (the propensity for forming one full turn of Z-DNA in a 12 bp CA/TG sequence) would typically be considered the minimum threshold to identify a significant Z-DNA-forming sequence. A Z-score is listed for the first base pair of a sequence analysis window. Thus, a Z-score of 580 at position 1000 indicates that there is a ZDR equivalent to 12 bps of CA/TG from 1000 to 1012 (or longer). The actual length of the Z-DNA-forming sequence is the length of contiguous Z-scores ≥ 580 plus the length of the analysis window (12–24 bp).
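The threshold rule above can be sketched in a few lines of Python. The function below is illustrative and not part of ZHUNT: it assumes that a list of Z-scores (one per window start, with 1-based positions) has already been obtained, uses the threshold of 580 from the text, and takes the window length in bp as a parameter (12 bp here).

```python
def call_zdrs(z_scores, threshold=580.0, window_bp=12):
    """Return 1-based (start, end) intervals of Z-DNA regions (ZDRs).

    z_scores[i] is the Z-score of the window starting at position i + 1.
    Each region covers a contiguous run of scores >= threshold, extended
    by the window length, following the rule described in the text.
    """
    regions = []
    run_start = None  # 0-based index where the current run began
    for i, score in enumerate(z_scores):
        if score >= threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # The run ended at index i - 1, i.e., at position i.
            regions.append((run_start + 1, i + window_bp))
            run_start = None
    if run_start is not None:
        # The run reaches the last window of the input.
        regions.append((run_start + 1, len(z_scores) + window_bp))
    return regions
```

With a single score at or above the threshold at position 1000, this returns the interval (1000, 1012), in agreement with the example given in the text.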

1.5 Validating ZHUNT

Predictions from the ZHUNT program were validated by comparing the calculated Z-scores against the experimental mapping of Z-DNA across various genomic sequences that were available at that time [32]. The analysis of the viral φX-174 genome showed that Z-scores aligned well both in their positions and the relative magnitudes compared to the fractions of anti-Z-DNA antibodies bound across this sequence (Fig. 4b, c). Thus, ZHUNT accurately predicts the thermodynamic propensity of sequences to form Z-DNA in genomes.

1.6 Applications of ZHUNT for Genomic and Phylogenomic Analyses

The effort to determine the sequence of the human genome (the Human Genome Project) started in 1990 [36]. By 1991, a sufficient number of sequences were released that the scientific community could start to search for specific properties that were potentially unique to genes [37, 38]. At that time, we had applied ZHUNT to analyze the sequences of 137 human genes and found a nonrandom distribution of Z-DNA sequences clustered near their transcription start sites (TSS), as compared to other features (5′-UTR, 3′-UTR, introns, and exons) [39]. These results suggested a potential role of Z-DNA in the regulation of transcription. This potential regulatory role was consistent with the model put forth by Liu and Wang that NSCs could be created in the wake of an actively transcribing RNA polymerase [22] and the finding that a strong Z-DNA sequence


was required for regulating the transcription of the colony-stimulating factor-1 (CSF-1) by an adjacent nuclear factor I (NFI) promoter [40]. A ZHUNT analysis found that positions of Z-DNA-forming regions were correlated with NFI positions along the human chromosome 22 genomic sequence and a Z-DNA sequence was positioned adjacent to the NFI site upstream of the transcriptional start site of the human CSF-1 gene [41]. The Human Genome Initiative was declared to be complete in 2003 with the release of a complete draft of the euchromatic sequence of the human genome [42]. At that same time, the complete genomic sequences became available for a number of other organisms ranging from eubacteria to archaea to invertebrates, simple vertebrates, and various mammals. A phylogenomic analysis of the genomes across this range of organisms showed correlated patterns at the TSS of CG-rich elements, including CpG islands, NFI binding sites, and Z-DNA [43]. More detailed analyses gave rise to a model for how these various CG-rich elements emerged, as the transcriptional regulatory elements evolved from prokaryotes to simple and then higher eukaryotes (Fig. 5). ZHUNT and similar thermogenomic programs [33], therefore, have become powerful tools to help identify genomic sequences for their propensities to undergo conformational transitions based on rigorous thermodynamic principles.

1.7 mZHUNT for Analyses of Z-DNA in Genomes with Methylated Cytosine

5-Methylcytosine (5mC) has different functions in prokaryotes versus eukaryotic organisms [44, 45]. In prokaryotes, 5mC serves to protect the host genome from cleavage by its own nucleases that serve to protect against viral infections. However, 5mC is now recognized as an epigenetic modification to eukaryotic DNAs and is primarily associated with regulating gene expression. The effect of 5mC on Z-DNA stability was initially demonstrated by a three-order-of-magnitude reduction in the concentration of Mg²⁺ required to induce the inversion of CD spectra in poly(dG-d5mC) compared to poly(dG-dC) [13, 14]. The confluence of this base modification and the left-handed double helix toward potential gene regulation makes an analysis of Z-DNA in 5mC-modified genomes interesting. We have modified ZHUNT to analyze for Z-DNA in methylated genomes (creating the program mZHUNT) by reducing the ΔGP of each 5mC·G base pair by 1.37 kcal/mol, resulting in, for example, a ΔGP = −0.71 kcal/mol for a 5mCG dinucleotide in the anti–syn conformation. This value comes from an analysis of solvent free energies and their effects on the B–Z transition energies [14, 46]. The mZHUNT program can be run from the ZHUNT online portal. In this case, the one-letter “C” designation of each cytosine in a sequence of the input file that is known to be methylated should be replaced by an “M” (see Subheading 4). Otherwise, the program runs in exactly the same manner as ZHUNT.
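Preparing an mZHUNT input therefore amounts to a simple substitution. The helper below is a hypothetical sketch and not part of the ZHUNT distribution: it assumes that the methylated positions are available as 1-based coordinates on the strand given in the input file, and it replaces each corresponding “C” with “M”.

```python
def mark_methylated(sequence, methylated_positions):
    """Return the sequence with 'M' at each methylated cytosine.

    methylated_positions holds 1-based coordinates. A position that lies
    outside the sequence or does not point at a cytosine raises ValueError
    rather than being changed silently.
    """
    bases = list(sequence.upper())
    for pos in sorted(set(methylated_positions)):
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position {pos} is outside the sequence")
        if bases[pos - 1] != "C":
            raise ValueError(f"position {pos} is {bases[pos - 1]}, not C")
        bases[pos - 1] = "M"
    return "".join(bases)
```

For example, `mark_methylated("acgcgt", [2, 4])` returns `"AMGMGT"`. Raising an error on a non-cytosine position is a design choice of this sketch, intended to catch coordinate mismatches (for example, 0-based versus 1-based positions) before a sequence is submitted.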


Fig. 5 Phylogenomic analysis of CG-rich transcriptional elements. The bars below the zero-line represent underrepresentation, while those above represent overrepresentation of the element as percentages relative to the average. Blue bars represent the under- or overrepresentation up to 220 bp downstream of the transcriptional start site (TSS), yellow at the TSS, and red up to 220 bp upstream of the TSS. (a) Distribution of overall CG content. (b) Distribution of CpG islands. (c) Distribution of nuclear factor I (NFI) binding sites. (d) Distribution of Z-DNA regions (ZDRs) as predicted by ZHUNT. (From Ref. [43])


Fig. 6 Comparison of results from ZHUNT (a) and mZHUNT (b) analyses of chromosome 1 from S. cerevisiae

We applied ZHUNT analyses on the yeast chromosome 1 sequence and mZHUNT on the methylated version of this sequence (Fig. 6). There were 8834 methylation sites identified by Bis-Seq analysis [47]. The ZHUNT analysis of the unmodified chromosome identified several clusters of Z-DNA, with three dominant ZDRs having Z-scores ≥ 7000. The mZHUNT analysis of the methylated chromosome identified a larger number of ZDRs and distributed them very differently across the chromosome. Thus, cytosine methylation has a dramatic effect on both the number and the distribution of predicted ZDRs. Previous ZHUNT analyses showed that ZDRs clustered near the TSS of eukaryotic genes, with the majority located immediately downstream (within 200 bps) of the TSS. An example is shown in Fig. 7a for yeast chromosome 1. Interestingly, an mZHUNT analysis of this same sequence shows an even stronger clustering at and within 50 bps of the TSS (Fig. 7b). Thus, cytosine methylation is seen to increase the overall number of potential ZDRs in a sequence and also affects where and how ZDRs cluster around the TSS of genes.
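One way to quantify such clustering is to bin the signed distance from each ZDR to its nearest TSS. The sketch below is illustrative only and is not the analysis used to produce Fig. 7: the input format (ZDR start positions, and TSSs as position and strand pairs), the 200 bp cutoff, and the 50 bp bin width are assumptions. Negative distances denote positions upstream of the TSS relative to the direction of transcription.

```python
from collections import Counter


def tss_distance_histogram(zdr_starts, tss_sites, max_dist=200, bin_bp=50):
    """Count ZDRs by signed distance (bp) to the nearest TSS.

    tss_sites is a list of (position, strand) pairs with strand '+' or '-'.
    Distances are positive downstream of the TSS. ZDRs farther than
    max_dist from every TSS are ignored. Keys are lower bin edges.
    """
    counts = Counter()
    for start in zdr_starts:
        best = None
        for pos, strand in tss_sites:
            # Flip the sign on the minus strand so that downstream is positive.
            dist = start - pos if strand == "+" else pos - start
            if best is None or abs(dist) < abs(best):
                best = dist
        if best is not None and abs(best) <= max_dist:
            counts[(best // bin_bp) * bin_bp] += 1
    return dict(counts)
```

Running the same function on ZDRs called from ZHUNT and from mZHUNT output would allow the two distributions to be compared bin by bin.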


Fig. 7 Comparison of ZDRs relative to the transcription start site (TSS) from ZHUNT (a) and mZHUNT (b) analyses of chromosome 1 from S. cerevisiae

2 Materials

The ZHUNT and mZHUNT programs are available online (http://zhunt.bmb.colostate.edu) or by request from the corresponding author (C++ code, no interface). The yeast chromosome 1 sequence used in the ZHUNT analysis is available as a FASTA file from the NCBI Genome database (https://www.ncbi.nlm.nih.gov/genome/?term=Saccharomyces%20cerevisiae[Organism]&cmd=DetailsSearch) under RefSeq number NC_001133.9, while the methylated sequence for the mZHUNT analysis on yeast chromosome 1 is available from the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/) under accession number PRJNA510813 [47].

3 Methods

The genomic analyses in this study were performed using the ZHUNT and mZHUNT online portal at http://zhunt.bmb.colostate.edu (Fig. 8). The ZHUNT homepage of this portal requests only an input file and a user’s email address. The input file can be either in simple text format (with standard designations of C, G, A, or T, as upper- or lowercase letters) or in NCBI FASTA format (as downloaded from various databases). After uploading an input file and the email address, the user simply clicks the SUBMIT button.

Fig. 8 Homepage for ZHUNT. The program ZHUNT can be accessed and run online at http://zhunt.bmb.colostate.edu

Since ZHUNT uses a rigorous SM/thermodynamic analysis, the analysis time can be fairly long. The original program in FORTRAN 77 running on a VAX 11/780 computer required ~1 month to complete the analysis of the adenovirus genome (~37 kb) [32]. The program is now ported to the C++ language [39, 41]. The current online version of ZHUNT running on a Macintosh minicomputer completes the analysis of a 230 kb sequence in ~33 s, with a linear increase in computational time with an increase in the size of the sequence being analyzed (Fig. 9).

Fig. 9 Time for ZHUNT calculations relative to the number of nucleotides in the input file

Fig. 10 Results page after completion of ZHUNT analysis

Once a successful analysis has been completed, the user is sent to the results page (Fig. 10), where the raw output can be downloaded and opened separately with various data analysis programs (Fig. 11). The first row of the raw output shows the number of bps analyzed, followed by the minimum and maximum window size used in the analysis (the online program fixes these at 6 dn and 12 dn). The remainder of the data are presented in four columns, with the first column being the ΔLk for initiating one dinucleotide as Z-DNA ( = -1.0), the second column being the slope at this transition point (as a measure of cooperativity of the transition for that ZDR), the third column being the Z-score (in kilobase pairs, kb) for the first bp of an analysis window, and the fourth column being the assignment of the base conformation (A, anti; S, syn) for the base pairs in the window. A Z-score ≥ 700 (7e-01 kb) is considered to be a potentially significant Z-DNA-forming sequence. The user can click the “View Graph” button to see the plot of Z-scores (y-axis) versus the sequence (x-axis). Alternatively, the user can download an image of this Z-score plot by clicking the “Download Graphs” button.
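For users who prefer to screen the raw output with a script rather than a spreadsheet, the four-column layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the ZHUNT distribution: the function name `significant_windows` is hypothetical, and the layout (one header line followed by whitespace-separated rows) is an assumption based on the description of Fig. 11 that should be checked against an actual output file. The threshold must be given in the same units as the third column of the file.

```python
from typing import Iterable, List, Tuple


def significant_windows(lines: Iterable[str],
                        threshold: float) -> List[Tuple[int, float, str]]:
    """Return (position, z_score, conformation) for rows at or above threshold.

    Assumed layout (not verified against a real ZHUNT file): one header
    line, then whitespace-separated rows of
    delta_Lk, slope, z_score, conformation.
    """
    hits = []
    rows = iter(lines)
    next(rows, None)  # skip the header line
    position = 0
    for row in rows:
        fields = row.split()
        if len(fields) < 4:
            continue  # ignore blank or malformed rows
        position += 1  # 1-based index of the first bp of this window
        z_score = float(fields[2])
        if z_score >= threshold:
            hits.append((position, z_score, fields[3]))
    return hits
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, so a chromosome-sized output is processed one row at a time rather than loaded into memory.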


Fig. 11 Output file of data from ZHUNT. The first line returns the input file name, the number of nucleotides in that file, and the minimum and maximum window size used for the calculation. The remainder of the file is divided into four columns, as described in the text

4 Notes

4.1 Input File Format

The input file for ZHUNT and mZHUNT is either in the “.fasta” or “.txt” format. When running mZHUNT, the “C” for each cytosine position that is methylated must be replaced by an “M” in the input file. If the information for methylated sites is saved as a list of numbered positions, these substitutions can be made manually using Microsoft Excel or another spreadsheet program by following these steps:
1. Save the sequence as a .csv formatted file with each nucleotide on a separate line.
2. Number the nucleotides from 1 to n (n being your last nucleotide).
3. Copy the methylation site information into a new column.
4. Use the COUNTIF function in each row to determine if the nucleotide sequence number matches the methylation site number.
5. Filter out the “0” values.
6. Replace all of the Cs that have “1” values with “M.”
7. Save the edited .csv formatted file as a .txt formatted file.
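The spreadsheet steps above can also be scripted. The following is a minimal sketch, assuming the methylation sites are available as 1-based positions that refer to cytosines on the strand given in the input file; the function name `mark_methylated` is hypothetical and is not part of mZHUNT.

```python
def mark_methylated(sequence: str, methylated_positions) -> str:
    """Replace C with M at the given 1-based positions.

    Assumption: every listed position refers to a cytosine on the
    strand supplied in `sequence`; anything else raises an error
    instead of being silently skipped.
    """
    bases = list(sequence.upper())
    for pos in methylated_positions:
        index = pos - 1  # convert to 0-based indexing
        if not 0 <= index < len(bases):
            raise ValueError(f"position {pos} is outside the sequence")
        if bases[index] != "C":
            raise ValueError(f"position {pos} is {bases[index]}, not C")
        bases[index] = "M"
    return "".join(bases)
```

For example, `mark_methylated("acgcgt", [2, 4])` gives `"AMGMGT"`. Raising an error on a non-cytosine position is a deliberate choice here: an off-by-one shift in the position list would otherwise go unnoticed.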

4.2 Possible Errors Running ZHUNT

There are a couple of things to look for to quickly troubleshoot any problems running the online version of ZHUNT or mZHUNT:
1. Check that the input file is in the FASTA format and/or has not been corrupted.
2. Refresh the browser and clear cookies.
3. If using mZHUNT, ensure that only the characters A/T/G/C/M are used.
4. Check the expected calculated time to determine if the run is outside of an acceptable range before quitting.
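For the last check, a rough expected run time can be extrapolated from the benchmark quoted in Subheading 3 (a 230 kb sequence in ~33 s, scaling linearly with sequence length). The sketch below assumes the line passes through the origin and that the server load is comparable to the benchmark; the function name and its defaults are illustrative only.

```python
def estimate_runtime_seconds(n_bp: int,
                             ref_bp: int = 230_000,
                             ref_seconds: float = 33.0) -> float:
    """Linear extrapolation of ZHUNT run time from one reference point.

    Assumption: time grows in direct proportion to sequence length,
    as reported for the online version (230 kb in about 33 s).
    """
    if n_bp < 0:
        raise ValueError("sequence length must be non-negative")
    return ref_seconds * n_bp / ref_bp
```

Under these assumptions a 1 Mb sequence would be expected to take roughly 2.4 min; a run that lasts many times longer than the estimate is a reasonable cue to recheck the input file.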

4.3 Running ZHUNT or mZHUNT on Local Computer

To run the ZHUNT program on a local computer, first download a copy of the ZHUNT and mZHUNT programs from the server at http://zhunt.bmb.colostate.edu. With the program code in hand, you will need to compile the program using a C++ compiler. The program can then be run in command mode. The program will ask for the name of an input file (in FASTA or TXT format) and a minimum and maximum window size (set these at 6 and 12 dn, respectively). The output from the program will be a text file similar to that shown in Fig. 11. For short sequences, the output file can be imported into a standard spreadsheet program for analysis or to plot the data. The analysis of chromosomal or genomic DNA sequences will result in very large files, which are best analyzed using the scripts available from the http://zhunt.bmb.colostate.edu server or your own custom script.

5 Conclusions and Discussion

In this chapter, we have summarized the structural and energetic differences between canonical right-handed B-DNA and the alternative left-handed Z-DNA and shown how these differences have been applied to a SM analysis of the zipper model for the B–Z transition. This thermodynamic model was the basis for the development of the program ZHUNT to predict the location of sequences in genomes with a propensity to form Z-DNA. The fundamental principle behind this thermogenomic approach was validated by its ability to simulate the behavior of naturally occurring sequences found in the promoters of two genes in rats, while the predictive power of the ZHUNT algorithm was validated by comparisons to the locations and probabilities for antibody binding along a viral genome. Application of ZHUNT to analyze sequences in and around human genes showed that Z-DNA sequences clustered near transcriptional start sites, while a phylogenomic analysis across a broad range of genomes from eubacteria to archaea to lower and higher eukaryotes resulted in a model for how CG-rich transcriptional elements may have emerged. The ZHUNT program is now made available online for users to analyze their own sequences through a simple portal. The ZHUNT program has now been extended to the analysis of 5mC-modified genomes through the program mZHUNT, which is also available through the same ZHUNT portal. A comparison of the ZHUNT and mZHUNT analyses of unmethylated and methylated chromosome 1 from yeast shows that cytosine methylation can not only affect the relative propensities of strong Z-forming sequences (as reflected in the change in order of Z-scores along the chromosome) but can also have the dramatic effect of changing a non-Z-forming sequence in the unmodified chromosome into a very strong Z-forming sequence in the modified chromosome. How this epigenetic modification affects the relative expression of these genes in comparison to the ZHUNT and mZHUNT predictions would be very interesting and could provide insight into how the modification in the DNA chemistry and its helical conformation are linked.

We note that ZHUNT is not the only computational program that predicts Z-DNA sequences in genomes. Similar thermogenomic analysis programs include SIBZ [48] and Z-CATCHER [49], both of which yield results that align generally with the ZHUNT analyses. More recently, Beknazarov et al. [50] developed a machine learning (ML) approach to identify Z-DNA sequences that form dynamically during transcription. This ML method, trained against an RNA ChIP-Seq dataset of Z-DNA in actively transcribing genes [51], was shown to be more accurate than ZHUNT in predicting the dynamic formation of Z-DNA during transcription. It should be noted, however, that ZHUNT and the ML approach were developed and designed differently and, consequently, their results should be interpreted differently, perhaps in complementary fashion. ZHUNT is a thermodynamic analysis and finds sequences with high propensities to form Z-DNA. The ML method, on the other hand, is trained on data that reflect formation of Z-DNA as a transient structure resulting from dynamic cellular processes, which is dependent on transcriptional rates and other factors that are kinetic and not strictly thermodynamic. Each provides insight into how the cell’s genome dances away from the canonical form, with one telling us how likely the DNA will take up the dance, while the other gives a snapshot during the dance between DNA and its transcriptional partners.

Acknowledgments

The studies in the Ho laboratory were supported by grants from the National Science Foundation (CHE-1905328 and MCB-2124202). We thank A. N. Ho for critical reading of the manuscript.


References

1. Avery OT, MacLeod CM, McCarty M (1944) Studies on the chemical nature of the substance inducing transformation of pneumococcal types: induction of transformation by a desoxyribonucleic acid fraction isolated from pneumococcus type III. J Exp Med 79(2):137–158. https://doi.org/10.1084/jem.79.2.137
2. Hershey AD, Chase M (1952) Independent functions of viral protein and nucleic acid in growth of bacteriophage. J Gen Physiol 36(1):39–56. https://doi.org/10.1085/jgp.36.1.39
3. Watson JD, Crick FH (1953) Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171(4356):737–738
4. Wang AHJ, Quigley GJ, Kolpak FJ, Crawford JL, Vanboom JH, Vandermarel G et al (1979) Molecular-structure of a left-handed double helical DNA fragment at atomic resolution. Nature 282(5740):680–686. https://doi.org/10.1038/282680a0
5. Gannon HS, Zou T, Kiessling MK, Gao GF, Cai D, Choi PS et al (2018) Identification of ADAR1 adenosine deaminase dependency in a subset of cancer cells. Nat Comm 9:5450. https://doi.org/10.1038/s41467-018-07824-4
6. Suram A, Rao LKS, Latha KS, Viswamitra MA (2002) First evidence to show the topological change of DNA from B-DNA to Z-DNA conformation in the hippocampus of Alzheimer’s brain. NeuroMolecular Med 2(3):289–297. https://doi.org/10.1385/Nmm:2:3:289
7. Mitsui Y, Langridge R, Shortle BE, Cantor CR, Grant RC, Kodama M et al (1970) Physical and enzymatic studies on poly D(I-C) poly D(I-C), an unusual double-helical DNA. Nature 228(5277):1166. https://doi.org/10.1038/2281166a0
8. Pohl FM, Jovin TM (1972) Salt-induced cooperative conformational change of a synthetic DNA – equilibrium and kinetic studies with poly(dg-dc). J Mol Biol 67(3):375. https://doi.org/10.1016/0022-2836(72)90457-3
9. Wing R, Drew H, Takano T, Broka C, Tanaka S, Itakura K et al (1980) Crystal-structure analysis of a complete turn of B-DNA. Nature 287(5784):755–758. https://doi.org/10.1038/287755a0
10. Ho PS, Mooers BHM (1997) Z-DNA crystallography. Biopolymers 44(1):65–90. https://doi.org/10.1002/(Sici)1097-0282(1997)44: 13.0.Co;2-Y
11. Kagawa TF, Stoddard D, Zhou GW, Ho PS (1989) Quantitative analysis of DNA secondary structure from solvent-accessible surfaces: the B- to Z-DNA transition as a model. Biochemistry 28(16):6642–6651
12. Ho PS, Kagawa TF, Tseng K, Schroth GP, Zhou GW (1991) Prediction of a crystallization pathway for Z-DNA hexanucleotides. Science 254(5034):1003–1006. https://doi.org/10.1126/science.1948069
13. Behe M, Felsenfeld G (1981) Effects of methylation on a synthetic polynucleotide – the B-Z transition in poly(dG-m5dC).poly(dG-m5dC). Proc Natl Acad Sci U S A 78(3):1619–1623. https://doi.org/10.1073/pnas.78.3.1619
14. Kagawa TF, Howell ML, Tseng K, Ho PS (1993) Effects of base substituents on the hydration of B- and Z-DNA: correlations to the B- to Z-DNA transition. Nucleic Acids Res 21(25):5978–5986. https://doi.org/10.1093/nar/21.25.5978
15. Howell ML, Schroth GP, Ho PS (1996) Sequence-dependent effects of spermine on the thermodynamics of the B-DNA to Z-DNA transition. Biochemistry 35(48):15373–15382. https://doi.org/10.1021/bi961881i
16. Peck LJ, Nordheim A, Rich A, Wang JC (1982) Flipping of cloned D(Pcpg)N.D(Pcpg)N DNA-sequences from right-handed to left-handed helical structure by salt, co(iii), or negative supercoiling. Proc Natl Acad Sci U S A 79(15):4560–4564. https://doi.org/10.1073/pnas.79.15.4560
17. Nordheim A, Peck LJ, Lafer EM, Stollar BD, Wang JC, Rich A (1983) Supercoiling and left-handed Z-DNA. Cold Spring Harb Symp Quant Biol 47(Pt 1):93–100
18. Peck LJ, Wang JC (1983) Energetics of B-to-Z transition in DNA. Proc Natl Acad Sci U S A 80(20):6206–6210. https://doi.org/10.1073/pnas.80.20.6206
19. Wang JC, Peck LJ, Becherer K (1983) DNA supercoiling and its effects on DNA structure and function. Cold Spring Harb Symp Quant Biol 47(Pt 1):85–91
20. Nickol J, Behe M, Felsenfeld G (1982) Effect of the B–Z transition in poly(dG-m5dC).poly(dG-m5dC) on nucleosome formation. Proc Natl Acad Sci U S A 79(6):1771–1775
21. Ausio J, Zhou G, van Holde K (1987) A reexamination of the reported B–Z DNA transition in nucleosomes reconstituted with poly(dG-m5dC).poly(dG-m5dC). Biochemistry 26(18):5595–5599


22. Liu LF, Wang JC (1987) Supercoiling of the DNA-template during transcription. Proc Natl Acad Sci U S A 84(20):7024–7027. https://doi.org/10.1073/pnas.84.20.7024
23. Ellison MJ, Kelleher RJ 3rd, Wang AH, Habener JF, Rich A (1985) Sequence-dependent energetics of the B-Z transition in supercoiled DNA containing nonalternating purine-pyrimidine sequences. Proc Natl Acad Sci U S A 82(24):8320–8324
24. Ellison MJ, Feigon J, Kelleher RJ 3rd, Wang AH, Habener JF, Rich A (1986) An assessment of the Z-DNA forming potential of alternating dA-dT stretches in supercoiled plasmids. Biochemistry 25(12):3648–3655
25. Dickerson RE (1992) DNA-structure from A to Z. Method Enzymol 211:67–111
26. Ha SC, Lowenhaupt K, Rich A, Kim YG, Kim KK (2005) Crystal structure of a junction between B-DNA and Z-DNA reveals two extruded bases. Nature 437(7062):1183–1186
27. Zimm BH, Bragg JK (1959) Theory of the phase transition between helix and random coil in polypeptide chains. J Chem Phys 31(2):526–535. https://doi.org/10.1063/1.1730390
28. Nordheim A, Lafer EM, Peck LJ, Wang JC, Stollar BD, Rich A (1982) Negatively supercoiled plasmids contain left-handed Z-DNA segments as detected by specific antibody-binding. Cell 31(2):309–318. https://doi.org/10.1016/0092-8674(82)90124-6
29. Pulleyblank DE, Haniford DB, Morgan AR (1985) A structural basis for S1 nuclease sensitivity of double-stranded DNA. Cell 42(1):271–280. https://doi.org/10.1016/S0092-8674(85)80122-7
30. Dicapua E, Stasiak A, Koller T, Brahms S, Thomae R, Pohl FM (1983) Torsional stress induces left-handed helical stretches in DNA of natural base sequence - circular-dichroism and antibody-binding. EMBO J 2(9):1531–1535. https://doi.org/10.1002/j.1460-2075.1983.tb01619.x
31. Revet B, Zarling DA, Jovin TM, Delain E (1984) Different Z DNA forming sequences are revealed in phi-X174 Rfi by high-resolution dark-field immuno-electron microscopy. EMBO J 3(13):3353–3358. https://doi.org/10.1002/j.1460-2075.1984.tb02303.x
32. Ho PS, Ellison MJ, Quigley GJ, Rich A (1986) A computer aided thermodynamic approach for predicting the formation of Z-DNA in naturally occurring sequences. EMBO J 5(10):2737–2744
33. Ho PS (2009) Thermogenomics: thermodynamic-based approaches to genomic analyses of DNA structure. Methods 47(3):159–167. https://doi.org/10.1016/j.ymeth.2008.09.007
34. Kladde MP, Kohwi Y, Kohwi-Shigematsu T, Gorski J (1994) The non-B-DNA structure of d(CA/TG)n differs from that of Z-DNA. Proc Natl Acad Sci U S A 91(5):1898–1902
35. Ho PS (1994) The non-B-DNA structure of d(CA/TG)n does not differ from that of Z-DNA. Proc Natl Acad Sci U S A 91(20):9549–9553
36. Collins FS (1991) The genome project and human health. FASEB J 5(1):77
37. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG et al (2001) The sequence of the human genome. Science 291(5507):1304–1351
38. Szustakowski J, Consor IHGS (2001) Initial sequencing and analysis of the human genome. Nature 409(6822):860–921. 411(6838):720
39. Schroth GP, Chou PJ, Ho PS (1992) Mapping Z-DNA in the human genome. Computer-aided mapping reveals a nonrandom distribution of potential Z-DNA-forming sequences in human genes. J Biol Chem 267(17):11846–11855
40. Liu R, Liu H, Chen X, Kirby M, Brown PO, Zhao K (2001) Regulation of CSF1 promoter by the SWI/SNF-like BAF complex. Cell 106(3):309–318
41. Champ PC, Maurice S, Vargason JM, Camp T, Ho PS (2004) Distributions of Z-DNA and nuclear factor I in human chromosome 22: a model for coupled transcriptional regulation. Nucleic Acids Res 32(22):6501–6510
42. Collins FS, Lander ES, Rogers J, Waterston RH, Conso IHGS (2004) Finishing the euchromatic sequence of the human genome. Nature 431(7011):931–945. https://doi.org/10.1038/nature03001
43. Khuu P, Sandor M, DeYoung J, Ho PS (2007) Phylogenomic analysis of the emergence of GC-rich transcription elements. Proc Natl Acad Sci U S A 104(42):16528–16533
44. Hug LA, Baker BJ, Anantharaman K, Brown CT, Probst AJ, Castelle CJ et al (2016) A new view of the tree of life. Nat Microbiol 1:16048. https://doi.org/10.1038/nmicrobiol.2016.48
45. Mulligan CJ (2018) Insights from epigenetic studies on human health and evolution. Curr Opin Genet Dev 53:36–42. https://doi.org/10.1016/j.gde.2018.06.008

46. Ho PS, Quigley GJ, Tilton RF Jr, Rich A (1988) Hydration of methylated and nonmethylated B-DNA and Z-DNA. J Phys Chem 92:939–945
47. Wang YH, Wang AQ, Liu ZJ, Thurman AL, Powers LS, Zou M et al (2019) Single-molecule long-read sequencing reveals the chromatin basis of gene expression. Genome Res 29(8):1329–1342. https://doi.org/10.1101/gr.251116.119
48. Zhabinskaya D, Benham CJ (2011) Theoretical analysis of the stress induced B-Z transition in superhelical DNA. PLoS Comput Biol 7(1):e1001051. https://doi.org/10.1371/journal.pcbi.1001051
49. Li H, Xiao J, Li JM, Lu L, Feng S, Droge P (2009) Human genomic Z-DNA segments probed by the Z domain of ADAR1. Nucleic Acids Res 37(8):2737–2746. https://doi.org/10.1093/nar/gkp124
50. Beknazarov N, Jin S, Poptsova M (2020) Deep learning approach for predicting functional Z-DNA regions using omics data. Sci Rep UK 10(1):19134. https://doi.org/10.1038/s41598-020-76203-1
51. Shin SI, Ham S, Park J, Seo SH, Lim CH, Jeon H et al (2016) Z-DNA-forming sites identified by ChIP-Seq are associated with actively transcribed regions in the human genome. DNA Res 23(5):477–486. https://doi.org/10.1093/dnares/dsw031
52. Drew HR, Wing RM, Takano T, Broka C, Tanaka S, Itakura K et al (1981) Structure of a B-DNA dodecamer: conformation and dynamics. Proc Natl Acad Sci U S A 78:2179–2183
53. Luo ZP, Dauter M, Dauter Z (2014) Phosphates in the Z-DNA dodecamer are flexible, but their P-SAD signal is sufficient for structure solution. Acta Crystallogr Sect D Struct Biol 70:1790–1800. https://doi.org/10.1107/S1399004714004684
54. Carter M, Ho PS (2011) DNA structure: alphabet soup for the cellular soul. In: Seligmann H (ed) DNA replication – current advances. InTech, London, pp 3–28

Chapter 15

DeepZ: A Deep Learning Approach for Z-DNA Prediction

Nazar Beknazarov and Maria Poptsova

Abstract

Here we describe an approach that uses deep learning neural networks such as CNN and RNN to aggregate information from DNA sequence; physical, chemical, and structural properties of nucleotides; omics data on histone modifications, methylation, chromatin accessibility, and transcription factor binding sites; and data from other available NGS experiments. We explain how, with the trained model, one can perform whole-genome annotation of Z-DNA regions and feature importance analysis in order to define key determinants for functional Z-DNA regions.

Key words Z-DNA, Machine learning, Deep learning, CNN, RNN, Omics data, DNA secondary structures

1 Introduction

Computational detection of Z-DNA regions based exclusively on sequence information is a difficult task. Though some sequences with specific patterns (such as GT repeats) are more prone to flip from the B- to the Z-conformation, the entire set of potential Z-DNA-forming sequences is far larger. Initially the general understanding was that Z-DNA is formed from alternating purine–pyrimidine repeats, but ChIP-seq data on protein binding with Z-DNA revealed that this is not always the case, and sequences that at first glance have no definite sequence pattern have been shown to adopt the Z-DNA conformation [1]. On the other hand, even if a sequence is a potential Z-forming sequence, it does not necessarily serve as a functional Z-DNA element. Often genomic functional elements are surrounded by other functional elements such as histone marks or DNA motifs for transcription factors and other DNA-binding proteins. Combinatorial patterns of different genomic and epigenomic signals should accompany functional Z-DNA regions, and it is a nontrivial task to determine those regions.

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_15, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023


Another problem that one may face when applying a machine learning approach to Z-DNA detection is the scarcity of experimental data. The experiments for detection of Z-DNA structure have many biases (see [2] for a summary), which is why only a few whole-genome maps were available at the time DeepZ was developed. The first Z-DNA map of the human genome was generated by using the Zα domain of the double-stranded RNA editing enzyme ADAR [3]. There were 186 Z-DNA hotspots found, among which 46 hotspots were located in centromeres of 13 human chromosomes. The first ChIP-seq experiment for detection of Z-DNA regions [4] used the Zaa protein with two Z-DNA-binding domains. The generated genome-wide map of Z-DNA sites contained 391 regions, with the majority of the Z-DNA located in promoter areas. Below we describe how we overcome the problem of a small training data set by taking a nucleotide-level rather than a region-based approach.

Here we take advantage of a machine learning approach that can aggregate information from multiple layers of genome organization together with information on DNA sequence and structure and predict functional elements of interest, here Z-DNA. Deep learning models were shown to be successful in predicting gene expression [5] and differential gene expression from histone modification signals [6], histone modifications from sequence information and chromatin accessibility data [7], protein–RNA binding preferences from sequence and RNA secondary structure information [8], and promoters and enhancers from histone modification and TF binding ChIP-seq, DNase-seq, FAIRE-seq, and ChIA-PET data [9]. Here we describe a deep learning approach to predict Z-DNA regions incorporating information about sequence, structure, epigenetic code, chromatin accessibility, and transcription factor and RNA polymerase binding sites.

2 The Input Data

The input data are taken from ChIP-seq experiments and are usually represented in the form of intervals (typically in .bed format). In the original study [10], where we described the DeepZ model, we used two Z-DNA data sets: one from the ChIP-seq experiment that reported 391 Z-DNA regions [4] and a second data set composed of data from Wu et al. [11] and Kouzine et al. [12]. The data sets should be filtered to remove ENCODE blacklist regions [13]. Often, for use with deep learning methods, the regions of interest are centered and adjusted to the same width and are treated as objects of the positive class. In our approach, because of the small number of items in the positive class, we propose a different method. Instead of intervals, we work at the level of nucleotides: the entire genome is represented by a Boolean array in which 1 is assigned to nucleotides in Z-DNA regions and 0 otherwise. With this approach the minority class contains enough elements to use in machine learning models (e.g., around 150,000 for 380 Z-DNA regions from ChIP-seq experiments, each approximately 400 bp long). The second class is composed of random positions in the genome.

Along with the sequence, the model allows incorporating any additional information. This can be information on physical, chemical, or structural properties of dinucleotides and any omics data from NGS experiments. We also included in the DeepZ model the B–Z transition energy that was originally used in Z-Hunt (see Table 2 in [14]) and additional information on histone marks (HM), DNase I hypersensitive sites (DNase-seq), transcription factor (TF), and RNA polymerase (RNAP) binding sites. Methylation variation maps were taken from [15]. In fact, any genomic track can be added as an informational layer (see Fig. 1). In the original DeepZ publication, the total set included 1058 markers, of which there were 100 histone marks, 947 transcription factor binding sites, 10 RNA polymerase binding sites, and DNase I hypersensitive sites. The full list of features can be found in Supplementary Table S1 in [10].
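The nucleotide-level labeling described above can be sketched as follows. This is an illustration only, not the DeepZ code: the function name `label_nucleotides` is hypothetical, and the intervals are assumed to follow the BED convention (0-based start, end exclusive).

```python
def label_nucleotides(chrom_length: int, intervals) -> bytearray:
    """Build a per-nucleotide 0/1 label array for one chromosome.

    Assumption: `intervals` holds (start, end) pairs in BED coordinates
    (0-based start, end exclusive). Overlapping intervals are allowed.
    """
    labels = bytearray(chrom_length)  # initialized to all zeros
    for start, end in intervals:
        start = max(start, 0)
        end = min(end, chrom_length)  # clip to the chromosome
        if end > start:
            labels[start:end] = b"\x01" * (end - start)
    return labels
```

The size of the positive class is then simply `sum(labels)`. As a check on the figure quoted above, 380 regions of about 400 bp each give 380 × 400 = 152,000 positive nucleotides, in line with "around 150,000".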

3 Data Compression

Each feature was normalized to the interval [0, 1]. The entire genome was mapped to a matrix of size L × N, where L is the size of the genome and N is the number of features used in the model. The total size of the human genome exceeds 3 × 10^9 nucleotides, and it requires 3 terabytes of RAM to store the entire matrix with each value encoded by a 4-byte float. To overcome this problem, we propose to compress the data with the sparse vector method. The basic idea of the method is to encode the data by two vectors. The first vector stores the values of the encoded vector directly; the second vector stores the indexes of those values in the encoded vector. This vector supports the following operations: (1) returning standard vector values for a given slice [i, j] and (2) changing vector values on a given slice [i, j]. On real data (histone data labels), the compression level exceeded 100. Thus, instead of 1 terabyte, about 100 megabytes will solve the task. For the DeepZ model described in [10] with 1058 markers, all the data for the human genome took up only about 200 megabytes. As a result, all the input data can be held in RAM permanently. This package was implemented in Python 3 using the NumPy library and is available in the repository https://github.com/Nazar1997/Sparse-vector.
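The two-vector idea can be illustrated with a small pure-Python class. This is a sketch of the principle only and does not reproduce the interface of the Sparse-vector package, which is built on NumPy; the class name, method names, and the choice to treat 0 as the implicit background value are assumptions made for this example.

```python
from bisect import bisect_left


class SparseVector:
    """Store only nonzero entries as two parallel, position-sorted lists."""

    def __init__(self, length: int):
        self.length = length
        self.indexes = []  # sorted positions of the nonzero values
        self.values = []   # values stored at those positions

    def _check(self, i: int, j: int) -> None:
        if not 0 <= i <= j <= self.length:
            raise IndexError("slice outside the vector")

    def get_slice(self, i: int, j: int) -> list:
        """Return the dense values on the half-open slice [i, j)."""
        self._check(i, j)
        dense = [0.0] * (j - i)
        lo = bisect_left(self.indexes, i)
        hi = bisect_left(self.indexes, j)
        for k in range(lo, hi):
            dense[self.indexes[k] - i] = self.values[k]
        return dense

    def set_slice(self, i: int, j: int, new_values) -> None:
        """Overwrite the half-open slice [i, j) with dense values."""
        self._check(i, j)
        if len(new_values) != j - i:
            raise ValueError("length of new_values must equal j - i")
        lo = bisect_left(self.indexes, i)
        hi = bisect_left(self.indexes, j)
        kept = [(i + off, v) for off, v in enumerate(new_values) if v != 0]
        # Replace whatever was stored inside [i, j) with the new nonzeros.
        self.indexes[lo:hi] = [idx for idx, _ in kept]
        self.values[lo:hi] = [v for _, v in kept]
```

Memory use grows with the number of nonzero entries rather than with the genome length, which is why tracks that are zero over most of the genome, such as ChIP-seq peak labels, compress so well.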


Fig. 1 General schema of deep learning models for Z-DNA prediction. (a) CNN-based deep model architectures. (b) RNN-based deep model architecture for Z-DNA prediction. The second LSTM cell takes reversed order of data and then concatenates the result with the first LSTM cell with the original order to improve the performance

4 Deep Learning Architectures

The proposed method is based on a deep learning approach. We considered three architectures comprising three types of deep neural networks: CNN, RNN, and hybrid CNN–RNN. The comparative analysis that we performed in [10] showed that all three architectures performed relatively well for the task of Z-DNA prediction. The typical CNN and RNN blocks are presented in Fig. 1. They can have different numbers of layers. The model ends with a fully connected (FC) block, which can also be represented by more than one layer. A dropout layer can be placed in between FC layers, with the probability of every dropout layer set to 0.5. The last FC layer has two output neurons corresponding to the two classes.

CNN-Based Architecture

This type of DL model consists of only CNN and FC layer blocks (Fig. 1). One and two CNN layers, with ReLU activation in between CNN layers, were tried. The number of convolutional kernels and the kernel size varied from 1 to 17. Stride was set to 1; padding was set to (kernel size - 1)/2 to keep the output the same size as the input. Every convolutional kernel is one-dimensional. The output of the CNN block is sent to the FC block, where the final prediction is made.

RNN-Based Architecture

This type of DL model consists of only RNN and FC blocks (Fig. 1). The unmodified input is sent to the RNN block. The RNN block consists of an LSTM network with different hyperparameters. We tested one and two LSTM layers and unidirectional and bidirectional LSTMs with various hidden sizes. The output of the RNN block is sent to the FC block, where the final prediction is made.

Hybrid CNN–RNN-Based Architecture

This type of DL model consists of CNN, RNN, and FC blocks. The input is first sent to the CNN block and then to the RNN block, and the final prediction is made in the FC block. The search for hyperparameters for each block was the same as described above.

In the original DeepZ publication, all models were trained using RMSprop via backpropagation (RMSprop is an unpublished adaptive learning rate method proposed by Geoff Hinton). Instead of the full-gradient calculation, the gradient was calculated on a subset of the training set, and model parameters were updated accordingly after each gradient calculation.
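The padding rule used in the CNN block, padding = (kernel size - 1)/2 with stride 1, can be checked with a small stand-alone example. The sketch below is plain Python rather than the deep learning framework used for DeepZ; the function name `conv1d_same` is hypothetical, and the operation is the cross-correlation that deep learning libraries conventionally call convolution. The rule yields an integer padding only for odd kernel sizes, so the sketch rejects even ones.

```python
def conv1d_same(signal, kernel):
    """1D convolution (cross-correlation) with stride 1 and zero padding
    of (kernel size - 1) / 2 on each side, so output length == input length."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("this padding rule needs an odd kernel size")
    pad = (k - 1) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(w * padded[pos + offset] for offset, w in enumerate(kernel))
        for pos in range(len(signal))
    ]
```

For instance, `conv1d_same([1, 2, 3, 4], [1, 1, 1])` returns four values, one per input position, which is what lets each output of the network be read as a prediction for a single nucleotide.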

5 Train and Test Set

Every chromosome is divided into a set of subsequences. We recommend not defining the boundaries of subsequences based on the sites of Z-DNA, as happens when the functional element is centered in the region (see Note 1). Every chromosome was evenly cut into pieces with a length of 5000 nucleotides. For the train and test sets, we included all subsequences containing Z-DNA and background sequences that do not contain Z-DNA, which were randomly chosen from the entire genome. Randomization was fixed for reproducibility. The number of non-Z-DNA sequences was triple the number of Z-DNA-containing sequences. Training and test sets were stratified and divided in the ratio of 4 to 1. The stratification was based on Z-DNA presence and chromosome number.
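The construction of the two sets can be sketched as below. This is a simplified illustration under stated assumptions, not the DeepZ implementation: the function name and arguments are hypothetical, a trailing piece shorter than the piece length is dropped, background pieces are sampled from the same fixed grid, and stratification is done by splitting each (chromosome, Z-DNA presence) group separately.

```python
import random


def build_train_test(chrom_labels, piece_len=5000, neg_per_pos=3,
                     test_fraction=0.2, seed=0):
    """Cut chromosomes into fixed pieces and split them 4:1.

    chrom_labels: dict mapping chromosome name to a 0/1 label sequence.
    Returns (train, test) lists of (chromosome, start, end) pieces.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    positives, negatives = [], []
    for chrom in sorted(chrom_labels):
        labels = chrom_labels[chrom]
        for start in range(0, len(labels) - piece_len + 1, piece_len):
            piece = (chrom, start, start + piece_len)
            if any(labels[start:start + piece_len]):
                positives.append(piece)  # contains Z-DNA
            else:
                negatives.append(piece)  # background
    # Keep three background pieces per Z-DNA-containing piece.
    n_neg = min(len(negatives), neg_per_pos * len(positives))
    negatives = rng.sample(negatives, n_neg)
    # Stratify by chromosome and Z-DNA presence.
    strata = {}
    for has_z, pieces in ((1, positives), (0, negatives)):
        for piece in pieces:
            strata.setdefault((piece[0], has_z), []).append(piece)
    train, test = [], []
    for key in sorted(strata):
        group = strata[key]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test
```

Because the piece boundaries come from a fixed grid rather than from the Z-DNA sites themselves, a Z-DNA region can fall anywhere inside a piece, which is the point of the recommendation above.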

6 Whole-Genome Annotation with Z-DNA Regions

Once the model is trained, it can be used to predict novel functional Z-DNA regions. The problem with training the DeepZ model was the scarcity of experimental Z-DNA data at the time the model was developed. To minimize the bias toward the available training set, we implemented a procedure similar to five-fold cross-validation, which we describe here in detail. The entire data set, which is the entire genome, is divided into 5 folds of equal size, and each fold is stratified by chromosome number and indication of Z-DNA presence/absence (1 or 0). At each successive step, one fold out of 5 is chosen as a test set and the DeepZ model is trained on the remaining 4 folds. The procedure is repeated five times. In total, five DeepZ models are trained. Each of the five models is used for predictions of the genomic regions outside of the training set. The final prediction is calculated as an average of all five models’ predictions, and these are probabilities for a nucleotide to belong to a Z-DNA region. Thus, every nucleotide from every chromosome will have a probability to belong to a Z-DNA-forming region. We assign a nucleotide as belonging to a Z-DNA region if the predicted probability is above a threshold. We recommend choosing the threshold as the value that maximizes the F1 score on the combined set of all 5 folds (see Note 2). This method can label short DNA regions as Z-DNA that are located at a short distance from each other. To avoid fragmentation, we combined short regions into longer ones based on the rule that all intervals separated by a gap of less than 11 bp can be joined together, taking into account that 11 bp is the length of one turn of the DNA helix (see Note 3).
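The last three steps (averaging the model outputs, thresholding, and joining nearby intervals) can be sketched as follows. The function name `call_regions` is hypothetical, the threshold is taken as given rather than derived from the F1 score, and regions are reported as half-open (start, end) intervals; none of this is taken from the DeepZ source code.

```python
def call_regions(fold_probabilities, threshold, max_gap=11):
    """Average per-nucleotide probabilities, threshold, and merge.

    fold_probabilities: list of equal-length probability lists,
    one per trained model. Intervals separated by a gap of fewer
    than `max_gap` nucleotides are joined.
    """
    n_models = len(fold_probabilities)
    n = len(fold_probabilities[0])
    mean = [sum(fold[i] for fold in fold_probabilities) / n_models
            for i in range(n)]
    # Collect runs of consecutive nucleotides above the threshold.
    regions, start = [], None
    for i in range(n + 1):
        above = i < n and mean[i] > threshold
        if above and start is None:
            start = i
        elif not above and start is not None:
            regions.append((start, i))
            start = None
    # Join runs that are closer than max_gap to the previous one.
    merged = []
    for begin, end in regions:
        if merged and begin - merged[-1][1] < max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((begin, end))
    return merged
```

With the default `max_gap=11`, two predicted stretches 5 bp apart are reported as one region, while stretches 12 bp apart remain separate.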

DeepZ

7


DeepZ Model Interpretation

One important aspect of a machine learning approach is the interpretability of the constructed model. The value of a machine learning model depends on whether the features that contribute to its performance can be extracted. Since RNN architectures do not lend themselves to interpretation, we used the best CNN model, whose performance was only slightly inferior to that of the best RNN model. We applied several approaches to interpretation and describe each separately.

CNNs were originally developed for image recognition, where an image is supplied as a matrix of pixel values. A CNN applies different filters (small matrices) to reveal important elements in the image regardless of their position. Here the genomic data are digitized and represented as a matrix of real values, analogous to the pixel values of an image. The idea of applying different filters is the same as for images, but here the important filters correspond to recurrent sequence motifs. This methodology of extracting important filters from a CNN trained to predict genomic regions of interest has been applied successfully in many works, including the prediction of DNA-binding sites [16] and others [8, 17]. This method uses only sequence information, which is converted into the DNA sequence motifs characteristic of the regions of interest (see Note 4).

The second method for obtaining feature importance from the DeepZ model consists of quantifying both the positive and the negative contribution of each feature derived from omics data and from the physical, chemical, or structural properties of DNA. To obtain these values, the CNN model should be trained separately with a high regularization penalty. For image classification tasks, the proposed method computes the gradient of the class score with respect to the input image [18]. Our method is similar, with the difference that the input is a 1D image of the nucleotide sequence. The CNN model is trained with L1 regularization added to the loss function with a weight of 10^-3 (or 10^-2). L1 regularization drives unnecessary model weights to zero, and all features with zero weights in the first convolutional layer are subsequently ignored. The nonzero weights of the model are frozen, and a trainable input is passed through the model again. The structure of the model allows the trainable input to be limited to nine nucleotides (Fig. 2). The most distant filter of the second layer lies two nucleotides from the target position; in turn, the most distant nucleotide lies two nucleotides from that outermost filter. Thus, the output for the target nucleotide depends on no more than four nucleotides to the left and four to the right, and a sequence of nine elements completely defines one output of the trained CNN model, as shown in Fig. 2.
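The nine-nucleotide window can be checked numerically. The sketch below assumes two stacked stride-1 convolutions with kernel width 5, which is what "a distance of two nucleotides" per layer implies but is not stated explicitly in the text. It uses toy all-ones kernels and no nonlinearity, so it only demonstrates which input positions can reach a given output, not the DeepZ model itself.

```python
def conv1d(x, kernel):
    """Zero-padded, same-length 1D convolution (cross-correlation)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def influencing_positions(length, center, kernels):
    """Input indices whose perturbation changes the output at `center`."""
    def forward(x):
        for kernel in kernels:
            x = conv1d(x, kernel)
        return x[center]

    baseline = forward([0.0] * length)
    hits = []
    for i in range(length):
        x = [0.0] * length
        x[i] = 1.0                      # unit impulse at position i
        if forward(x) != baseline:
            hits.append(i)
    return hits


# Two layers, each reaching 2 positions to either side (assumed width 5).
window = influencing_positions(21, 10, [[1.0] * 5, [1.0] * 5])
print(window)       # positions 6 through 14
print(len(window))  # 9 = 1 + 2 * (2 + 2)
```

In general, for stride-1 layers with one-sided reaches r1, r2, ..., the window is 1 + 2(r1 + r2 + ...) positions wide, which gives nine for two layers with a reach of two each.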


Nazar Beknazarov and Maria Poptsova

Fig. 2 Model interpretation scheme

However, unlike the weights of a neural network, which can take any real value, the values of this input would normally be restricted to the range from 0 to 1. To find features that, from the model's point of view, increase the probability of Z-DNA formation, the range of values was set from -1 to 1. This way we can quantify features with both positive and negative contributions. The target function maximizes the predicted probability that the central nucleotide is a Z-DNA site. RMSprop with a learning rate of 10^-2 was used to learn the input, and input values were mapped back to the interval [-1, 1] after every learning iteration. Once the input that maximizes the output of the CNN has been found, it remains difficult to identify the DNA sequence that corresponds to this maximum, since the sequence itself is one-hot encoded. This means that the four sequence input features depend on each other and, unlike the other features, maximizing them independently can give an incorrect answer. To find such a sequence, a separate maximization was performed for the encoded sequence under an additional restriction: the sum of the four features for each nucleotide equals one. With this restriction, the problem cannot be solved by ordinary gradient descent; it is solved instead by sequential least squares programming. The output is the weight matrix, which is interpretable as a Z-DNA probability (see Note 5).

Availability

The DeepZ model implementation is available at https://github.com/Nazar1997/DeepZ.
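The input-learning step described in the interpretation section (frozen weights, a trainable input, and values mapped back to [-1, 1] after each iteration) can be illustrated with a toy example. This sketch makes several substitutions that should be read as assumptions: a single logistic unit stands in for the frozen CNN, plain gradient ascent replaces RMSprop, the learning rate and step count are arbitrary, and the one-hot constraint handled by sequential least squares programming is not covered.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def maximize_input(weights, bias=0.0, lr=0.1, steps=500):
    """Find an input in [-1, 1]^n that maximizes a frozen logistic unit.

    weights, bias: frozen parameters of the toy model (hypothetical values).
    Returns the learned input; its sign pattern shows which features push
    the predicted probability up (+1) or down (-1).
    """
    x = [0.0] * len(weights)
    for _ in range(steps):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        grad = [p * (1.0 - p) * w for w in weights]   # dp/dx_i
        # Gradient ascent step, then map values back into [-1, 1].
        x = [min(1.0, max(-1.0, xi + lr * g)) for xi, g in zip(x, grad)]
    return x


learned = maximize_input([2.0, -1.0, 0.0])
print(learned)
```

In this toy case, the feature with a positive weight is driven to the upper bound, the feature with a negative weight to the lower bound, and the feature with zero weight stays at zero, which mirrors how the bounded input separates positive from negative contributions.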

8

Notes

1. Because the positive class is small and the number of features is large, the model is prone to overfitting, and every training step should be taken with this risk in mind. One way to counter overfitting is to avoid centering Z-DNA in the regions submitted for training. Otherwise the model learns the boundaries, which results in target leakage. That is why we partition each chromosome into 5000 bp intervals, so that real Z-DNA regions are randomly distributed within those intervals.
2. We can recommend two strategies for setting the threshold. The first is to choose the value that maximizes the F1 (or any other) score on the combined set of all five folds, as was done in DeepZ. The second is to fix the number of predicted intervals expected from the model, taking the model performance metrics into account. In the original DeepZ study, we used a cutoff threshold of 0.343, as this was the value that maximized the F1 score on the combined set of all five folds.
3. The accumulating data on Z-DNA binding indicate that Z-DNA-binding sites can be shorter than 11 bp. It is up to the researcher to set the minimum length of a Z-DNA region. If the value is too small, the prediction can result in many fragmented regions.
4. Extraction of important filters and their conversion into DNA motifs are done differently from approaches in which only sequence information is used, in the form of a one-hot-encoded matrix. Here we must perform optimization with boundary conditions, and for this task the standard gradient methods must be modified.
5. Feature importance analysis for deep neural network models is a developing field, and no single solution exists. Different approaches and methods can be employed. In the original DeepZ publication, we applied regularization to the first convolutional layer. Later we found that linear regression with regularization also works, provided that the linear model shows good performance.

References

1. Li H, Xiao J, Li J, Lu L, Feng S, Droge P (2009) Human genomic Z-DNA segments probed by the Z alpha domain of ADAR1. Nucleic Acids Res 37(8):2737–2746. https://doi.org/10.1093/nar/gkp124
2. Herbert A (2020) ALU non-B-DNA conformations, flipons, binary codes and evolution. R Soc Open Sci 7(6):200222. https://doi.org/10.1098/rsos.200222
3. Herbert A, Alfken J, Kim YG, Mian IS, Nishikura K, Rich A (1997) A Z-DNA binding domain present in the human editing enzyme, double-stranded RNA adenosine deaminase. Proc Natl Acad Sci U S A 94(16):8421–8426. https://doi.org/10.1073/pnas.94.16.8421

4. Shin SI, Ham S, Park J, Seo SH, Lim CH, Jeon H, Huh J, Roh TY (2016) Z-DNA-forming sites identified by ChIP-Seq are associated with actively transcribed regions in the human genome. DNA Res 23:477. https://doi.org/10.1093/dnares/dsw031
5. Singh R, Lanchantin J, Robins G, Qi Y (2016) DeepChrome: deep-learning for predicting gene expression from histone modifications. Bioinformatics 32(17):i639–i648. https://doi.org/10.1093/bioinformatics/btw427
6. Sekhon A, Singh R, Qi Y (2018) DeepDiff: DEEP-learning for predicting DIFFerential gene expression from histone modifications. Bioinformatics 34(17):i891–i900. https://doi.org/10.1093/bioinformatics/bty612
7. Yin Q, Wu M, Liu Q, Lv H, Jiang R (2019) DeepHistone: a deep learning approach to predicting histone modifications. BMC Genomics 20(Suppl 2):193. https://doi.org/10.1186/s12864-019-5489-4
8. Ben-Bassat I, Chor B, Orenstein Y (2018) A deep neural network approach for learning intrinsic protein-RNA binding preferences. Bioinformatics 34(17):i638–i646. https://doi.org/10.1093/bioinformatics/bty600
9. Li Y, Shi W, Wasserman WW (2018) Genome-wide prediction of cis-regulatory regions using supervised deep learning methods. BMC Bioinform 19(1):202. https://doi.org/10.1186/s12859-018-2187-1
10. Beknazarov N, Jin S, Poptsova M (2020) Deep learning approach for predicting functional Z-DNA regions using omics data. Sci Rep 10(1):19134. https://doi.org/10.1038/s41598-020-76203-1
11. Wu T, Lyu R, You Q, He C (2020) Kethoxal-assisted single-stranded DNA sequencing captures global transcription dynamics and enhancer activity in situ. Nat Methods 17(5):515–523. https://doi.org/10.1038/s41592-020-0797-9
12. Kouzine F, Wojtowicz D, Baranello L, Yamane A, Nelson S, Resch W, Kieffer-Kwon KR, Benham CJ, Casellas R, Przytycka TM, Levens D (2017) Permanganate/S1 nuclease footprinting reveals non-B DNA structures with regulatory potential across a mammalian genome. Cell Syst 4(3):344–356.e347. https://doi.org/10.1016/j.cels.2017.01.013
13. Amemiya HM, Kundaje A, Boyle AP (2019) The ENCODE blacklist: identification of problematic regions of the genome. Sci Rep 9(1):9354. https://doi.org/10.1038/s41598-019-45839-z
14. Ho PS, Ellison MJ, Quigley GJ, Rich A (1986) A computer aided thermodynamic approach for predicting the formation of Z-DNA in naturally occurring sequences. EMBO J 5(10):2737–2744
15. Gao Y, Li L, Yuan P, Zhai F, Ren Y, Yan L, Li R, Lian Y, Zhu X, Wu X, Kee K, Wen L, Qiao J, Tang F (2020) 5-Formylcytosine landscapes of human preimplantation embryos at single-cell resolution. PLoS Biol 18(7):e3000799. https://doi.org/10.1371/journal.pbio.3000799
16. Alipanahi B, Delong A, Weirauch MT, Frey BJ (2015) Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat Biotechnol 33(8):831–838. https://doi.org/10.1038/nbt.3300
17. Kalkatawi M, Magana-Mora A, Jankovic B, Bajic VB (2019) DeepGSR: an optimized deep-learning structure for the recognition of genomic signals and regions. Bioinformatics 35(7):1125–1132. https://doi.org/10.1093/bioinformatics/bty752
18. Simonyan K, Vedaldi A, Zisserman A (2013) Deep inside convolutional networks: visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034

Chapter 16

Methods to Study Z-DNA-Induced Genetic Instability

Guliang Wang, Laura Christensen, and Karen M. Vasquez

Abstract

Alternative DNA structures that differ from the canonical B-DNA double helix, including Z-DNA, have received much attention recently due to their impact on DNA metabolic processes, including replication, transcription, and genome maintenance. Non-B-DNA-forming sequences can also stimulate genetic instability associated with disease development and evolution. Z-DNA can stimulate different types of genetic instability events in different species, and several different assays have been established to detect Z-DNA-induced DNA strand breaks and mutagenesis in prokaryotic and eukaryotic systems. In this chapter, we will introduce some of these methods, including Z-DNA-induced mutation screening and detection of Z-DNA-induced strand breaks in mammalian cells, yeast, and mammalian cell extracts. Results from these assays should provide better insight into the mechanisms of Z-DNA-related genetic instability in different eukaryotic model systems.

Key words Z-DNA, Genetic instability, Reporter gene, Yeast artificial chromosome, Shuttle vector, DNA double-strand break, DNA single-strand break, Linker-mediated PCR

1

Introduction

Shortly after the B-DNA double helix was first described [1], an alternative DNA structure (i.e., non-B-DNA), left-handed Z-DNA, was described by Rich and his colleagues [2]. Z-DNA can form at alternating purine–pyrimidine sequences (e.g., TG or GC) and is known for the unique zigzag arrangement of the sugar–phosphate backbone, where the purines are in the syn conformation while the adjacent pyrimidines remain in the anti conformation [3]. Among these Z-DNA-forming repeats, GC repeats have the highest propensity to form Z-DNA [4, 5], and GT repeats are the most abundant dinucleotide repeat in mammalian genomes [6]. Z-DNA has been of interest due to the high occurrence and conservation of Z-DNA-forming sequences across multiple eukaryotic species [7]. Z-DNA has also been shown to play important roles in transcription, recombination, RNA editing, viral pathogenicity, and tumor development [8–10]. Moreover, despite its

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_16, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023


Guliang Wang et al.

abundance [11], the distribution of Z-DNA-forming sequences in eukaryotic genomes is not random. Z-DNA-forming sequences are enriched at human chromosomal instability hotspots, implicating Z-DNA in genetic instability and human disorders [12–15]. We have also discovered that Z-DNA-forming CG repeats can stimulate the formation of DNA double-strand breaks (DSBs), resulting in large-scale deletions in yeast, cultured mammalian cells, and genomes of transgenic mice [16–19]. Interestingly, the same CG repeat sequences in the same mutation-reporter systems resulted in expansions and contractions of repeat units in bacteria and did not affect the adjacent sequences [16], suggesting that the mechanisms of Z-DNA-induced mutagenesis in prokaryotic and eukaryotic systems are different. In all species tested, Z-DNA-induced mutation rates or frequencies are significantly higher than those of control B-DNA-forming sequences that cannot form Z-DNA, suggesting that Z-DNA structures represent an endogenous source of genetic variation and instability [16, 17, 19]. However, the absolute number of mutants induced by Z-DNA can be low (one of the reasons why these unstable elements are still present and conserved throughout genomes). Note that Z-DNA-forming sequences in the genome can facilitate important biological functions; mutations in these regions might therefore affect cell survival and growth, imposing a selection pressure on the mutants. This makes direct detection of Z-DNA-induced mutants by traditional methods (e.g., PCR and DNA sequencing) challenging. Thus, a functionally neutral mutation-reporter plasmid that is sensitive to Z-DNA-induced mutation and does not cause such biased selection pressure can be beneficial in studies of Z-DNA-induced genetic instability.
Additionally, the formation of Z-DNA is a dynamic process and is dictated by many factors including sequence context, temperature, salt, pH, the presence of binding proteins, and negative supercoiling [8, 14, 20, 21]. Since it is often not possible to manipulate these conditions in the regions of interest on genomic DNA in living cells, a negatively supercoiled plasmid substrate provides a much simpler alternative for mechanistic studies. We previously described several different shuttle vector mutation reporters that can be used in bacteria and mammalian cells [22]. Because Z-DNA predominantly stimulated DSBs leading to large deletions and rearrangements in eukaryotic cells [16, 23], we chose a lacZ’-based reporter that is sensitive for both deletions and point mutations for studying Z-DNA-induced genetic instability in both bacterial and mammalian cells [16, 22]. The step-by-step instructions can be found in [22]. We have also established a transgenic mutation-reporter mouse model carrying a recoverable mutation reporter in the genome. The reporter segments can be recovered from mouse genomes and religated to form a plasmid for quick and easy mutation screening in bacteria [17, 22]. This approach was derived from


Vijg's creative system [24, 25] and provides very efficient, sensitive, low-cost, and unbiased screening of Z-DNA-induced genetic instability in mouse genomes. We have previously described some commonly used methods, with a focus on how to select appropriate strategies for different purposes [22]. Herein we describe in detail three additional methods designed for different purposes:

[Exp #1]: How to screen for Z-DNA-induced mutation rates on yeast artificial chromosomes (YACs), as originally described by Freudenreich et al. [26], and how to transfer reporter YACs to mutant yeast strains to investigate the functions of candidate genes in Z-DNA-induced chromosomal instability. This method provides a facile and low-cost approach to screen for genes that are involved in Z-DNA-induced mutagenesis in eukaryotes. The information obtained could be used to direct further investigations in other systems.

[Exp #2]: How to detect Z-DNA-induced DSBs on reporter vectors recovered from mammalian cells. Because Z-DNA-induced breaks occur at a low frequency and are not always found at a particular nucleotide or position, detecting and mapping the breakpoints can be challenging. We have developed a modified linker-mediated PCR (LM-PCR) assay to directly map DSBs generated at or near non-B-DNA-forming sequences on plasmids recovered from mammalian cells [16, 27, 28]. DSBs are converted into blunt ends by DNA polymerase I Klenow fragment (polymerase activity for 5′-overhangs and 3′→5′ exonuclease activity for 3′-overhangs), and linkers are then ligated to the breakpoints. Using a primer that is specific to the Z-DNA region and a primer on the linker, the Z-DNA-induced breakpoints are mapped from the lengths of the PCR products (Fig. 1).

[Exp #3]: How to detect Z-DNA-induced single- and double-strand breaks in cell-free extracts. There are different mechanisms involved in Z-DNA-induced genetic instability in mammalian systems, including both DNA replication-dependent and DNA replication-independent pathways [19, 29, 30]. It is often difficult to control the replication status of plasmids in transfected cells, or of integrated reporter vectors and endogenous genomic DNA in living cells. In this case, using a cell-free extract system may be beneficial. Z-DNA structures can be recognized and processed by structure-specific enzymes in cell-free extracts with a buffer system that supports DNA repair, replication, and transcription. Processed mutation-reporter plasmids have been shown to contain mutations at or near Z-DNA similar to those from mammalian cells, and similar processing intermediates, such as those containing


Fig. 1 Schematic illustration of linker-mediated PCR (LM-PCR) to detect DSBs near Z-DNA structures. Only Z-DNA-induced DSB hotspots can result in unique PCR amplifications and exhibit sharp bands in an agarose gel. The positions of the DSB hotspots can be calculated from the lengths of the PCR products

single- and double-strand breaks [16]. Thus, use of cell-free extracts to study Z-DNA-induced DNA breaks provides a well-controlled system to assist in the elucidation of the mechanisms involved in Z-DNA-induced genetic instability.

2

Materials

2.1 Materials for Screening for Z-DNA-Induced DNA Double-Strand Breaks in Yeast Artificial Chromosomes (Exp #1)

1. Yeast strain 213: genotype MATa kar1-1 leu2-3,112 ura3-52 his7.
2. Recombination plasmid that works with yeast strain 213 to construct the reporter YAC: pRS306 (see Note 1).
3. YPD medium: Mix 50 g of YPD powder in 1 L of distilled water (add 15 g agar if making plates); autoclave for 15 min at 121 °C.
4. Minimal synthetic defined (SD) medium: Mix 26.7 g of SD base powder in 1 L of distilled water (add 20 g agar if making plates); autoclave for 15 min at 121 °C.
5. SD medium (see Note 2).
6. DO supplement powder: DO-Leu, DO-Leu-Ura, DO-Lys, DO-Arg, DO-Ura-Arg.
7. 5-Fluoroorotic acid (FOA): Add FOA (100×; store at −20 °C) to 1 L of medium after autoclaving and cooling to 50 °C.
8. Canavanine: Add 1 mL of canavanine liquid to 1 L of medium after autoclaving and cooling to 50 °C, for a final concentration of 60 μg/mL.


9. List of selection media/plates used:
SD + DO-Leu + FOA (for the mutation frequency assay).
SD + DO-Ura-Arg + Canavanine (for kar-cross selection).
SD + DO-Arg + Canavanine (for canavanine resistance selection).
YPD + G418: add G418 to YPD at a final concentration of 200 mg/L (for maintaining mutant cultures).

2.2 Materials for Detecting Z-DNA-Induced DSBs on Reporter Vectors Recovered from Mammalian Cells (Exp #2)

1. Two oligos for the DNA linker: Linker-1 CGTACATTCACAACGATAGCGACTGA and Linker-2 GCTATCGTTGTGAATGTACG.
2. Specific primer: AGATCCAGTTCGATGTAACC (see Note 3).
3. Hirt's solution: 10 mM Tris–HCl (pH 8.0), 10 mM EDTA, 0.6% SDS.

2.3 Materials for Detecting Z-DNA-Induced Single- and Double-Strand Breaks in Cell-Free Extracts (Exp #3)

1. Cell-free extracts: Prepare the cell extract according to established protocols (e.g., the NucBuster Protein Extraction Kit from Millipore, Burlington, MA, following the manufacturer's instructions).
2. DNA repair buffer: The buffer should allow for DNA repair activity. For example, we have used the following buffer [16]: 30 mM HEPES, pH 7.5; 7 mM MgCl2; 0.5 mM dithiothreitol; 4 mM ATP; 100 μM each dNTP; 50 μM each NTP; 40 mM phosphocreatine; and 0.625 units of creatine phosphokinase.

3

Methods

3.1 Methods for Screening for Z-DNA-Induced DNA Double-Strand Breaks in Yeast Artificial Chromosomes (Exp #1)

3.1.1 YAC Construction

1. Insert the Z-DNA sequence or control B-DNA sequence of interest into the pRS306 plasmid at the NsiI site between the C4A4 sequence (a telomere re-gain seeding sequence) and the URA3 gene, following standard plasmid cloning protocols.
2. Construct the YACs with the Z-DNA or control B-DNA sequences adjacent to the URA3 selection marker by homologous recombination between the parental YAC and the pRS306 plasmid linearized with the restriction enzyme AatII. Transform the pRS306 derivatives into yeast cells following the instructions provided with one of the many commercially available kits, for example, the Frozen-EZ Yeast Transformation II kit, T2001 (Zymo Research, Irvine, CA).

3.1.2 Z-DNA-Induced Fragility Assay (FOA Selection of URA3 in Yeast Cells)

1. Recover yeast from frozen stock (−80 °C) and thaw on ice. Plate 10 μL of the culture on a YPD agar plate, and incubate at 30 °C overnight (or follow the specific culture conditions for specific mutants if necessary). For any strains from the ATCC


knockout library (for kar-crossing the YACs into mutant strains, see below), add 200 mg/L G418 to the first YPD plate.
2. For yeast containing the wild-type (WT) YAC, re-streak stock cells onto an SD + DO-Leu-Ura plate and incubate at 30 °C. Yeast cells grow more slowly on SD medium with dropout supplements than on complete YPD medium. Colonies should be visible after 30 h, and the incubation can be extended to 48–72 h.
3. Select five single yeast colonies from the SD + DO-Leu-Ura plate, inoculate each into 2 mL of the same medium as broth, and incubate at 30 °C overnight.
4. The Z-DNA or control B-DNA sequences on the YACs from the colonies used in step 3 can be amplified by PCR using T7 and T3 primers. PCR products can be sequenced using either primer to confirm that the inserts are intact.
5. Inoculate 100 μL of fresh culture into 2 mL of SD + DO-Leu liquid medium and shake at 30 °C, 220 rpm, for 20 h to maintain selection for the YAC while allowing for Z-DNA-induced loss of the right arm of the YAC (see Note 4).
6. Make SD + DO-Leu plates and SD + DO-Leu + FOA plates, and cover the FOA plates to protect them from light. It is recommended to make the plates at least 1 day prior to use so that they dry more easily in step 7.
7. After 20 h of incubation, plate 50 μL of the SD + DO-Leu culture onto SD + DO-Leu + FOA plates (three plates for each culture). Replace the lids once the plates are dry. This selects for the FOA-resistant mutants.
8. Dilute the culture 10,000-fold in SD + DO-Leu (adjust according to the growth conditions; make a 1:100 dilution, then a further 1:100 dilution of the diluted culture), vortex well, and plate 50 μL onto SD + DO-Leu plates (two plates each) to obtain the number of total colonies/μL in the original culture.
9. Incubate the two sets of plates from steps 7 and 8 at 30 °C for 48 h.
10. Calculate the frequency of Z-DNA-induced FOA-resistant URA3 mutants as the number of FOA-resistant (FOA^R) colonies relative to the number of total colonies.
11. The FALCOR website (https://lianglab.brocku.ca/FALCOR/) or http://shinyflan.its.manchester.ac.uk can be used to determine the mutation rates.

3.1.3 Transferring YACs to Mutant Yeast Strains Using a Kar-Cross Protocol (Liquid Method) (See Note 5)

1. Streak donor cells (strain 213 containing the YAC of interest, verified by PCR and sequencing, MATa) onto YPD plates to allow for colony growth. Then select three colonies and re-streak them onto SD + DO-Leu-Ura plates. Streak recipient cells (without the YAC, MATα) onto SD + DO-Arg + Canavanine plates or according to their genotype (e.g., add G418 as suggested), and grow for 48 h at 30 °C.
2. Pre-warm YPD and -Leu-Ura media. Pick one colony from each plate, inoculate into 2 mL of the appropriate liquid medium (e.g., for donor 213 cells, grow in 2 mL of -Leu-Ura medium; for recipient ATCC cells, grow in 2 mL of YPD broth), and incubate the cells overnight at 30 °C.
3. Measure the OD600 of 200 μL of each culture diluted in 800 μL of the corresponding medium. Mix well by vortexing. The OD should be near 0.5.
4. Pre-warm YPD medium and inoculate equal numbers of donor and recipient cells (~50 μL each; adjust the volumes according to the OD values) into 1 mL of fresh YPD medium, and shake for 6 h at the permissive temperature (30 °C or as suggested for specific mutants) to allow for the kar-cross.
5. Meanwhile, pre-warm the SD-Ura-Arg + Canavanine plates at 30 °C.
6. From the 1 mL kar-cross culture from step 4, plate 0.1 mL, 0.3 mL, and the remaining culture (~0.7 mL; spin and concentrate into a 0.1 mL volume prior to plating) onto SD-Ura-Arg + Canavanine (60 mg/L) plates.
7. Let the plates dry, and incubate at 30 °C for 2 or 3 days until the colonies are large enough for easy detection (1–2 mm).
8. Replicate colonies from the SD-Ura-Arg + Canavanine plates onto SD-Leu-Ura and SD-Lys plates. A BY4705 background cell containing the YAC will grow on an SD-Leu-Ura plate, but will not grow on an SD-Lys plate. If not using a whole-plate replica technique, manually replicate at least 50 colonies for each.
9. To confirm the cells identified in step 8, resuspend colonies in 20 μL of ddH2O in a 96-well plate. Spot 3 μL of each onto SD-Ura-Arg + Canavanine, SD-Leu-Ura, and SD-Lys plates. YAC transfer is indicated by growth on SD-Leu-Ura plates, but not on SD-Lys plates.
10. Incubate the plates at 30 °C for 2 days and identify the correct clones (i.e., those that grow on SD-Ura-Arg + Canavanine and SD-Leu-Ura plates but do not grow on SD-Lys plates). Based on the results, select ten colonies from the SD + DO-Leu-Ura plate and re-streak onto a fresh -Leu-Ura plate. Amplify the YAC by PCR using primers surrounding the reporter region containing either the Z-DNA or the B-DNA control insert, verify by direct DNA sequencing, and freeze the colonies at −80 °C in medium with 25–30% glycerol.
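The frequency calculation in Subheading 3.1.2 (steps 7–10) amounts to comparing colony densities from the undiluted FOA plates with those from the 10,000-fold diluted non-selective plates. The sketch below shows that arithmetic; the function names and the colony counts in the usage example are hypothetical, and it computes a mutation frequency only, not the mutation rate obtained from fluctuation analysis with tools such as FALCOR.

```python
def colonies_per_ul(counts, plated_ul, dilution_factor=1):
    """Mean colony-forming units per microliter of the original culture.

    counts:          colony counts from replicate plates
    plated_ul:       volume plated on each plate, in microliters
    dilution_factor: fold dilution applied before plating (1 = undiluted)
    """
    mean_count = sum(counts) / len(counts)
    return mean_count * dilution_factor / plated_ul


def mutation_frequency(foa_counts, total_counts,
                       plated_ul=50, total_dilution=10_000):
    """FOA-resistant colonies relative to total colonies in the culture."""
    resistant = colonies_per_ul(foa_counts, plated_ul)
    total = colonies_per_ul(total_counts, plated_ul, total_dilution)
    return resistant / total


# Hypothetical counts: three FOA plates and two SD + DO-Leu plates.
freq = mutation_frequency([12, 9, 15], [180, 220])
print(f"{freq:.1e}")
```

With these invented counts, the FOA plates average 12 colonies per 50 μL and the diluted plates average 200 colonies per 50 μL, so the frequency is 12 / (200 × 10,000). If the dilution in step 8 is changed, as Note 4 anticipates, `total_dilution` must be changed to match.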


3.2 Methods for Detecting Z-DNA-Induced DSBs on Reporter Vectors Recovered from Mammalian Cells (LM-PCR) [Exp #2]

1. Anneal the double-strand DNA linker. Mix equimolar amounts of the two oligos in 1× T4 PNK buffer, heat the mixture to 95 °C, and let the tube cool down slowly (see Note 5).
2. Phosphorylate the 5′-ends using T4 PNK and ATP according to the instructions provided by the enzyme supplier.
3. Purify the double-strand linkers from unannealed oligos in an agarose gel following standard protocols (e.g., Qiagen Gel Purification Kit). Adjust the final concentration to 2.5 μM.
4. Transfect the cells with a Z-DNA-containing mutation-reporter plasmid (or the B-DNA control) according to established protocols. Plasmid DNA can be recovered 24–72 h after transfection, depending on the cell type and transfection method (see Note 6). Do not allow cells to reach >95% confluence. Passage the cells if necessary.
5. Wash the cells with PBS. For adherent cells, add 2 mL of Hirt's solution to the dish. Tilt the dish gently until the cells are evenly lysed. Avoid rough mixing to prevent shearing of genomic DNA. For cells in suspension, spin the cells (2–8 × 10^6 cells) and wash with phosphate-buffered saline (PBS). Resuspend in 1.9 mL of 10 mM Tris–HCl (pH 8.0) and 10 mM EDTA, then add 120 μL of 10% SDS, mix gently, and incubate for 20–30 min at 37 °C.
6. Add 0.5 mL (1/4 volume) of 5 M NaCl to give a final concentration of 1 M. Mix slowly and gently until the NaCl is evenly incorporated. Avoid rough mixing to prevent shearing of genomic DNA. Store at 4 °C overnight.
7. Centrifuge at >17,000 × g for 60 min at 4 °C. Transfer the supernatant to a fresh tube, add RNase A to a final concentration of 100 μg/mL, and incubate at 37 °C for 30 min; then add proteinase K to a final concentration of 100 μg/mL and incubate at 37 °C for 60 min.
8. Purify the DNA by phenol/chloroform/isoamyl alcohol (25:24:1) extraction. Add ammonium acetate to a final concentration of 1 M and 2 volumes of ethanol for precipitation.
9. Dissolve the DNA in 50 μL of 1× NEBuffer 2 (New England Biolabs, Ipswich, MA) supplemented with 33 μM each dNTP and 5 units of DNA polymerase I Klenow fragment. Incubate for 20 min at 25 °C. Stop the reaction by adding EDTA to a final concentration of 10 mM, and inactivate the enzyme by heating for 20 min at 75 °C. Purify the DNA by phenol/chloroform/isoamyl alcohol (25:24:1) extraction and precipitate with ethanol as usual.
10. Dissolve the DNA in 15 μL of ddH2O; add 10 pmol of phosphorylated linker, 3.5 μL of 10× T4 DNA ligase buffer, and 0.5 μL of T4 DNA ligase. Bring the total volume to 35 μL. Incubate the mixture at 16 °C overnight. Inactivate the enzyme by heating to 75 °C for 30 min. Let the mixture cool to room temperature.
11. Mix 6 μL of the linker-conjugated plasmid DNA from step 10 with 10 pmol of primers (one specific to the linker and the other specific to a region near the Z-DNA sequence; see Subheading 2.2) in a standard PCR reaction. The conditions should be adjusted according to the primer sequences and the expected product length, but a general example is as follows: denature at 95 °C for 30 s; anneal at 54 °C for 30 s; extend at 72 °C for 20 s; repeat for 28–32 cycles.
12. Separate the PCR products on a 1.5–2.0% agarose gel (adjust the gel concentration according to the expected length of the PCR products).
13. Calculate the position of the break: distance of the DSB from the 5′-end of the specific primer = length of the PCR product − length of the primer on the linker.
14. PCR products can be purified, cloned into a pGEM-T vector, and sequenced, as previously described [28].

3.3 Methods for Detecting Z-DNA-Induced Single- and Double-Strand Breaks in Cell-Free Extracts (Exp #3)

1. Incubate 50–75 ng of Z-DNA or control B-DNA plasmid with 200–500 μg of cell-free extract in DNA repair buffer at 37 °C for 4–6 h.
2. Add proteinase K to 100 μg/mL and incubate at 37 °C for 60 min. Purify the DNA by phenol/chloroform/isoamyl alcohol (25:24:1) extraction. Add glycogen to a final concentration of 50–500 μg/mL to facilitate precipitation and visualization of the DNA pellet. Add 1/10 volume of 3 M sodium acetate (pH 5.2) and 2.5 volumes of ice-cold ethanol, and precipitate the DNA by incubating at −20 °C for 60 min. Centrifuge the sample for 30 min at >12,000 × g at 4 °C. Discard the supernatant, and wash the pellet with 70% ethanol.
3. Dissolve the DNA in 50 μL of ddH2O and add RNase A to 100 μg/mL; incubate at 37 °C for 20 min. Repeat the phenol/chloroform/isoamyl alcohol extraction and ethanol precipitation.
4. Dissolve the DNA in 50 μL of 1× NEB CutSmart Buffer supplemented with 5 units of calf intestinal alkaline phosphatase (CIP), and incubate at 37 °C for 30 min. Repeat the phenol/chloroform/isoamyl alcohol extraction and ethanol precipitation.
5. Dissolve the DNA in 50 μL of 1× NEB T4 PNK Reaction Buffer supplemented with 50 pmol of [γ-32P]ATP and 20 units of T4 PNK, and incubate at 37 °C for 30 min. Heat inactivate by incubating at 65 °C for 20 min. Purify the DNA from the free [γ-32P]ATP using the Qiagen Nucleotide Removal Kit, following the instructions provided.


6. Restriction digest the radiolabeled plasmid DNA sample by using enzymes (A and B in Fig. 2a) that cleave on both sides of the Z-DNA or B-DNA inserts (ideally 200–800 bp with the sequence of interest located ~1/3 length from the end of the fragment).
7. Load 1/3 of the total reaction on a 1.2% agarose gel and electrophorese at 5–10 V/cm for 30–60 min. Dry (without heat) the gel and visualize the radiolabel by autoradiography or by using a PhosphorImager. A radiolabeled full-length fragment represents single-strand breaks, while DSBs will result in two products from the original fragment.
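The readout in step 7 can be illustrated with a short sketch. It assumes a restriction fragment of known length and a break at a known distance from one end; both numbers below are hypothetical inputs chosen to match the "200–800 bp, ~1/3 from the end" guidance, not values from the protocol. A DSB splits the fragment into two products whose lengths add up to the original fragment.

```python
def dsb_products(fragment_len_bp: int, break_pos_bp: int) -> tuple[int, int]:
    """Lengths of the two products released by a DSB inside a restriction fragment.

    break_pos_bp is measured from the left end of the fragment (an assumed convention).
    """
    if not 0 < break_pos_bp < fragment_len_bp:
        raise ValueError("break position must lie strictly inside the fragment")
    return break_pos_bp, fragment_len_bp - break_pos_bp


# Hypothetical example: 600-bp fragment with the insert about 1/3 from one end
fragment_len = 600
left, right = dsb_products(fragment_len, fragment_len // 3)
print(f"full-length band: {fragment_len} bp; DSB products: {left} bp and {right} bp")
```

Placing the sequence of interest off-center, as the protocol suggests, gives two products of clearly different sizes that are easy to tell apart on the gel.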

Fig. 2 Schematic illustration of radiolabeling detection of Z-DNA-induced single-strand breaks and DSBs in cell-free extracts. (a) A simple strategy that uses initial DNA amount as loading control. (b) A modified strategy that uses internal radiolabeled fragment (A–B restriction fragment) as a loading and label-efficiency control (see Notes 7 and 8)

4 Notes

1. The pRS306 plasmid has a telomere seeding sequence, C4A4, and a functional URA3 gene. The homologous sequences on both sides will direct the replacement of the mutant ura3-52 gene with the functional URA3 gene together with the Z-DNA or control B-DNA insert. The pRS306 plasmid does not replicate in yeast cells; thus the selected cells are the result of recombination events.
2. SD medium does not contain amino acids. Add corresponding dropout (DO) supplements as needed (see below; follow the manufacturer’s instructions) to make defined medium lacking the specified amino acids for selection.
3. Preparation of a double-strand DNA linker. Keeping in mind that one of the primers (Linker-2) will be designed to bind on the longer strand in the linker, it should be compatible with the specific primer that binds to the reporter vector near the Z-DNA sequence (e.g., similar melting temperatures, low self-annealing to each other). After annealing the two oligonucleotides, the duplex linker should have a blunt end on one side and an overhang on the other to direct the orientation of the linker on the breakpoint.
4. Keep the -Leu culture from step 5 in case too many or too few colonies grow on the plates from steps 7 and 8, so that alternate dilutions can be made to keep the colonies within a countable range.
5. To investigate the function of a gene of interest in Z-DNA-induced chromosomal instability, YACs containing Z-DNA or control B-DNA sequences near the URA3 gene can be transferred to mutant yeast strains and subjected to the same mutation assay as described in Subheading 3.1.2. For example, we used mutants derived from the strains BY4705 (MATα, ade2Δ::hisG his3-Δ200 leu2-Δ0 ura3-Δ0 trp1-Δ63 lys2-Δ0 met15-Δ0), BY4742 (MATα, his3-Δ1 leu2-Δ0 ura3-Δ0 lys2-Δ0), or BY4739 (MATα, leu2-Δ0 ura3-Δ0 lys2-Δ0), and YACs were transferred by kar-cross (see below) to determine the gene products involved in Z-DNA-induced genetic instability [19].
6. A standard alkaline lysis method should be avoided because linear plasmids with DSBs could be lost during purification; thus, a classic Hirt’s method [31] should be used.
7. Alternatively, another restriction digestion could be made (further away from the restriction sites used in step 6, A in Fig. 2b) prior to CIP treatment so that the restricted site will be radiolabeled together with the DNA breakpoints generated in cell extracts. Step 6 (restriction digestion with enzymes B and C in Fig. 2b) will release an additional fragment (A–B fragment, as shown in Fig. 2b) containing a radiolabel at the restriction site, which can serve as an internal loading and labeling control for the assay.
8. If fine mapping of the breakpoints is required, choose restriction sites closer to the Z-DNA sequence so the fragments are shorter. Separate the DNA using denaturing polyacrylamide gel electrophoresis and visualize by autoradiography or by using a PhosphorImager.

Acknowledgments

This study was supported by an NIH/NCI grant to K.M.V. (CA093729). The authors declare no conflict of interest.

References

1. Watson JD, Crick FH (1953) Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171(4356):737–738. https://doi.org/10.1038/171737a0
2. Wang AH, Quigley GJ, Kolpak FJ, Crawford JL, van Boom JH, van der Marel G, Rich A (1979) Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature 282(5740):680–686. https://doi.org/10.1038/282680a0
3. Harvey SC (1983) DNA structural dynamics: longitudinal breathing as a possible mechanism for the B in equilibrium Z transition. Nucleic Acids Res 11(14):4867–4878. https://doi.org/10.1093/nar/11.14.4867
4. Haniford DB, Pulleyblank DE (1983) The in-vivo occurrence of Z DNA. J Biomol Struct Dyn 1(3):593–609. https://doi.org/10.1080/07391102.1983.10507467
5. Wang AH, Hakoshima T, van der Marel G, van Boom JH, Rich A (1984) AT base pairs are less stable than GC base pairs in Z-DNA: the crystal structure of d(m5CGTAm5CG). Cell 37(1):321–331. https://doi.org/10.1016/0092-8674(84)90328-3
6. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J et al (2001) Initial sequencing and analysis of the human genome. Nature 409(6822):860–921. https://doi.org/10.1038/35057062
7. Zhao J, Bacolla A, Wang G, Vasquez KM (2010) Non-B DNA structure-induced genetic instability and evolution. Cell Mol Life Sci 67(1):43–62. https://doi.org/10.1007/s00018-009-0131-2
8. Rich A, Zhang S (2003) Timeline: Z-DNA: the long road to biological function. Nat Rev Genet 4(7):566–572. https://doi.org/10.1038/nrg1115
9. Wang G, Vasquez KM (2007) Z-DNA, an active element in the genome. Front Biosci 12:4424–4438
10. Rich A (1994) Speculation on the biological roles of left-handed Z-DNA. Ann N Y Acad Sci 726:1–16; discussion 16–17. https://doi.org/10.1111/j.1749-6632.1994.tb52792.x
11. Schroth GP, Chou PJ, Ho PS (1992) Mapping Z-DNA in the human genome. Computer-aided mapping reveals a nonrandom distribution of potential Z-DNA-forming sequences in human genes. J Biol Chem 267(17):11846–11855
12. Adachi M, Tsujimoto Y (1990) Potential Z-DNA elements surround the breakpoints of chromosome translocation within the 5′ flanking region of bcl-2 gene. Oncogene 5(11):1653–1657
13. Seite P, Leroux D, Hillion J, Monteil M, Berger R, Mathieu-Mahul D, Larsen CJ (1993) Molecular analysis of a variant 18;22 translocation in a case of lymphocytic lymphoma. Genes Chromosomes Cancer 6(1):39–44
14. Wolfl S, Wittig B, Rich A (1995) Identification of transcriptionally induced Z-DNA segments in the human c-myc gene. Biochim Biophys Acta 1264(3):294–302. https://doi.org/10.1016/0167-4781(95)00155-7
15. Rimokh R, Rouault JP, Wahbi K, Gadoux M, Lafage M, Archimbaud E, Charrin C, Gentilhomme O, Germain D, Samarut J et al (1991) A chromosome 12 coding region is juxtaposed to the MYC protooncogene locus in a t(8;12)(q24;q22) translocation in a case of B-cell chronic lymphocytic leukemia. Genes Chromosomes Cancer 3(1):24–36
16. Wang G, Christensen LA, Vasquez KM (2006) Z-DNA-forming sequences generate large-scale deletions in mammalian cells. Proc Natl Acad Sci U S A 103(8):2677–2682. https://doi.org/10.1073/pnas.0511084103
17. Wang G, Carbajal S, Vijg J, DiGiovanni J, Vasquez KM (2008) DNA structure-induced genomic instability in vivo. J Natl Cancer Inst 100(24):1815–1817. https://doi.org/10.1093/jnci/djn385
18. Kha DT, Wang G, Natrajan N, Harrison L, Vasquez KM (2010) Pathways for double-strand break repair in genetically unstable Z-DNA-forming sequences. J Mol Biol 398(4):471–480. https://doi.org/10.1016/j.jmb.2010.03.035
19. McKinney JA, Wang G, Mukherjee A, Christensen L, Subramanian SHS, Zhao J, Vasquez KM (2020) Distinct DNA repair pathways cause genomic instability at alternative DNA structures. Nat Commun 11(1):236. https://doi.org/10.1038/s41467-019-13878-9
20. Wittig B, Dorbic T, Rich A (1989) The level of Z-DNA in metabolically active, permeabilized mammalian cell nuclei is regulated by torsional strain. J Cell Biol 108(3):755–764. https://doi.org/10.1083/jcb.108.3.755
21. Li H, Xiao J, Li J, Lu L, Feng S, Droge P (2009) Human genomic Z-DNA segments probed by the Z alpha domain of ADAR1. Nucleic Acids Res 37(8):2737–2746. https://doi.org/10.1093/nar/gkp124
22. Wang G, Zhao J, Vasquez KM (2009) Methods to determine DNA structural alterations and genetic instability. Methods 48(1):54–62. https://doi.org/10.1016/j.ymeth.2009.02.012
23. Xie KT, Wang G, Thompson AC, Wucherpfennig JI, Reimchen TE, MacColl ADC, Schluter D, Bell MA, Vasquez KM, Kingsley DM (2019) DNA fragility in the parallel evolution of pelvic reduction in stickleback fish. Science 363(6422):81–84. https://doi.org/10.1126/science.aan1425
24. Boerrigter ME, Dolle ME, Martus HJ, Gossen JA, Vijg J (1995) Plasmid-based transgenic mouse model for studying in vivo mutations. Nature 377(6550):657–659. https://doi.org/10.1038/377657a0
25. Gossen JA, de Leeuw WJ, Molijn AC, Vijg J (1993) Plasmid rescue from transgenic mouse DNA using LacI repressor protein conjugated to magnetic beads. BioTechniques 14(4):624–629
26. Freudenreich CH, Stavenhagen JB, Zakian VA (1997) Stability of a CTG/CAG trinucleotide repeat in yeast is dependent on its orientation in the genome. Mol Cell Biol 17(4):2090–2098. https://doi.org/10.1128/mcb.17.4.2090
27. Wang G, Vasquez KM (2004) Naturally occurring H-DNA-forming sequences are mutagenic in mammalian cells. Proc Natl Acad Sci U S A 101(37):13448–13453. https://doi.org/10.1073/pnas.0405116101
28. Lu S, Wang G, Bacolla A, Zhao J, Spitser S, Vasquez KM (2015) Short inverted repeats are hotspots for genetic instability: relevance to cancer genomes. Cell Rep 10(10):1674–1680. https://doi.org/10.1016/j.celrep.2015.02.039
29. Wang G, Gaddis S, Vasquez KM (2013) Methods to detect replication-dependent and replication-independent DNA structure-induced genetic instability. Methods 64(1):67–72. https://doi.org/10.1016/j.ymeth.2013.08.004
30. Wang G, Vasquez KM (2017) Effects of replication and transcription on DNA structure-related genetic instability. Genes (Basel) 8(1). https://doi.org/10.3390/genes8010017
31. Hirt B (1967) Selective extraction of polyoma DNA from infected mouse cell cultures. J Mol Biol 26(2):365–369. https://doi.org/10.1016/0022-2836(67)90307-5

Chapter 17

Single-Molecule Visualization of B–Z Transition in DNA Origami Using High-Speed AFM

Masayuki Endo and Hiroshi Sugiyama

Abstract

Direct visualization of target molecules is one of the most straightforward ways to study their physical properties and reaction processes. Atomic force microscopy (AFM) enables the direct imaging of biomolecules under physiological conditions at nanometer-scale spatial resolution. In addition, DNA origami technology allows target molecules to be placed precisely in a designed nanostructure and detected at the single-molecule level. Combined with high-speed AFM (HS-AFM), which resolves the dynamic movement of biomolecules at subsecond time resolution, DNA origami can be used to visualize the detailed movement of molecules. Here, we describe the combination of the DNA origami system with HS-AFM for imaging the rotation of dsDNA that originates from the B–Z transition. The rotation of dsDNA during the B–Z transition is directly visualized in a DNA origami frame using HS-AFM. These target-oriented observation systems contribute to the detailed analysis of DNA structural changes in real time at molecular resolution.

Key words DNA nanotechnology, DNA origami, B–Z transition, Single-molecule observation, High-speed atomic force microscopy

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_17, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023

1 Introduction

Direct imaging of target biomolecules is one of the most straightforward ways to elucidate the physical properties of molecules in the various reactions involved in living systems. Single-molecule imaging using atomic force microscopy (AFM) is a practical approach for investigating the motions of biomolecules during reactions. AFM enables the direct observation of biomolecules with nanoscale spatial resolution, and the imaging can be performed under physiological conditions. Because AFM visualizes all molecules in a scanning area, an observation scaffold is required to facilitate the target-specific imaging of molecules in the dynamic state [1–3]. DNA origami, a novel DNA self-assembly system, has been developed for the design and construction of various two-dimensional and three-dimensional nanostructures [4]. DNA origami can be used as a scaffold for the incorporation of various molecules and nanoparticles at specific positions. The DNA origami system is used for the detection of target molecules and the analysis of chemical reactions using single-molecule observation with AFM [1, 2]. Therefore, the detailed dynamics of the molecules can be visualized when single-molecule imaging is performed on the DNA origami nanostructure.

In the past decade, high-speed AFM (HS-AFM) imaging has also progressed to the point where molecular movement during biological reactions can be visualized on a subsecond timescale [5–8]. For the analysis of enzyme reactions on a DNA strand, the dynamic movement of biomolecules and the reactions can be clearly visualized in a robust origami structure by combining DNA origami technology and HS-AFM imaging. The DNA origami system has expanded the visualization of dynamics in various biochemical reactions, including enzymatic reactions, DNA structural changes, DNA photoreactions, DNA catalytic reactions, and RNA interactions at the single-molecule level [2, 9]. By using designed DNA origami and improving the HS-AFM imaging technique, the observation systems can be used extensively to elucidate the physical properties of individual molecules involved in various biological and nonbiological reactions.

Right-handed B-form dsDNA containing a CG repeat sequence is known to convert to the left-handed Z-form structure with increasing salt concentration (Fig. 1a) [10]. A nanomechanical device employing the rotation of the B–Z transition was created on a DNA nanostructure, and the rotation of the nanostructure was investigated by FRET [11]. We directly visualized the rotary motion of the B–Z conformational transition of double-helical DNA in the DNA frame structure (Fig. 1) [12]. In this chapter, we describe the direct observation of the B–Z transition of dsDNA with molecular resolution.
AFM-based single-molecule imaging can visualize whole DNA strands, including a specific target region, by directly monitoring the shapes of DNA components. First, a DNA origami frame with an inner cavity and four connection sites was designed (Fig. 1). We prepared complementary strands, called staple DNA strands, mixed them, and annealed them with single-stranded M13mp18 DNA (7249 nt) [4, 9]. A dsDNA part containing the B–Z transition region was incorporated to observe the helical rotation of the dsDNA strands (Fig. 1). The B–Z transition region responds to high salt concentration by converting reversibly from right-handed B-form DNA to left-handed Z-form DNA. We prepared a dsDNA containing the B–Z transition region together with a flag-shaped three-helix-bundled DNA structure (triple crossover) (Fig. 1). A dsDNA containing a random sequence and a flag marker was also prepared as a control. These two structures were incorporated into the upper and lower sides of the cavity of the DNA frame. We incorporated a B–Z transition DNA strand in the cavity of the DNA frame to visualize the metal ion-induced helical rotation at the single-molecule level (Fig. 2).

Fig. 1 Direct observation of the rotation in the B–Z transition. (a) B-form and Z-form dsDNA structure and B–Z transition. (b) Direct observation system for the B–Z transition using three-helix-bundled dsDNA flag marker. dsDNAs with a (5meCG)6 sequence are connected to a flag marker to observe rotation during B–Z transition. (c) Direct observation system for the B–Z transition in the DNA origami frame. DNA frame structure and incorporation of two dsDNAs with a (5meCG)6 sequence (B–Z system) and a random sequence (control) with flag marker to the upper and lower side of the DNA frame, respectively. To observe dsDNA rotation, both ssDNA linkers in the left terminal of the dsDNA are fixed to the connector, while one ssDNA linker in the right terminal is attached to the connector

1.1 Design and Construction of Direct Observation System of Rotation in B–Z Transition in the DNA Frame

To visualize the B–Z transition, a dsDNA containing a 5-methyl-CG six-repeat sequence, (mCG)6, and a flag marker containing three-helix-bundled DNA connected by crossovers are introduced into the DNA frame (Fig. 1b). The 5-methyl-CG sequence can promote formation of the Z-form even at low salt concentrations [13]. One dsDNA with the (mCG)6 sequence and a flag marker is introduced to the upper side as the B–Z transition system, while the other dsDNA, containing a random sequence and a flag marker, is introduced to the lower side as a control (Fig. 1c). To allow rotation and avoid surface contact during the B–Z transition, four connector loops are introduced in the DNA frame to lift both ends of the dsDNAs from the DNA origami surface (Figs. 1c and 2a). In addition, to observe the rotation of the (mCG)6 region, both ssDNA linkers (sticky ends) at the left terminal of the dsDNA are fixed to the connector loop, while only one ssDNA linker at the right terminal is attached to the connector loop (Figs. 1c and 2a). After the two components (B–Z system and control) were assembled into the DNA frame by annealing, the assembled structure was observed by AFM (Fig. 2a). In the AFM image, two components were attached to the DNA frame. The B–Z system and control components can be distinguished by the hairpin markers placed at the top left corner of the DNA frame (Fig. 2a).

1.2 Observation of the Rotation in B–Z Transition in the DNA Frame

During the B–Z transition, the flag marker is expected to rotate around the dsDNA shaft, and the rotary motion can be observed by monitoring the position of the flag marker. In a previous report, the B–Z transition was promoted by controlling the concentration of Mg2+ ions [13]. Initially, we measured the CD spectra of the (mCG)6 duplex while changing the concentration of Mg2+ and confirmed that the B–Z transition occurred at 10 mM Mg2+ [12]. Then, the rotation of the flag marker within the B–Z system was examined by AFM observation under equilibrium conditions for the B–Z transition (Fig. 2). Both the initial down-flag and the rotated up-flag were observed in the B–Z system. Upon increasing the concentration of Mg2+ ions, the frequency of the up-flag increased in the B–Z system as compared to that in the control (Table 1). In the case of 20 mM Mg2+, 70% of the flag markers within the B–Z system rotated to the up position, whereas 76% of the flags in the control remained in the down position.

Fig. 2 Incorporation of the B–Z system and control components into the DNA origami frame. (a) By annealing from a lower temperature, two components were assembled at the upper and lower positions via selective hybridization to the connection sites in the DNA frame. AFM image of the DNA frame with two components. Using hairpin markers, B–Z system and control components can be distinguished. (b, c) Rotation of B–Z system observed by the positional change of the flag marker. In the presence of Mg2+, the frequency of up-flags in the B–Z system increased

1.3 Direct Observation of the Flag Rotation During B–Z Transition in the Equilibrium State

The sample under equilibrium conditions for the B–Z transition was observed using HS-AFM. Movement of the flag marker in the B–Z system was observed during HS-AFM scanning (Fig. 3). In successive AFM images, the flag marker moved from the down state (5 s) to the up state (20 s) via an intermediate state (10–15 s) of rotation during the B–Z transition and then returned to the down state (35 s). The successive images also showed the change in position and height of the flag marker, indicating that the rotation of the flag marker occurred around the dsDNA shaft of the B–Z transition system. From the trajectories of both flags, the static behavior of the control component and the dynamic nature of the B–Z system were observed (Fig. 3b, c).
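The frame-by-frame description above can be sketched as a simple state assignment on a distance trace. This is only an illustration: the thresholds, the assumption that a larger flag-to-frame distance corresponds to the up state, and the trace values are all hypothetical and are not measurements from this chapter. Only the 5 s frame interval (0.2 frame/s) and the time stamps are taken from the text.

```python
FRAME_INTERVAL_S = 5.0  # 0.2 frame/s, as stated for the HS-AFM scans


def classify_flag(distance_nm: float, down_max: float = 10.0, up_min: float = 20.0) -> str:
    """Label one frame from the flag-edge-to-frame distance (thresholds are hypothetical)."""
    if distance_nm <= down_max:
        return "down"
    if distance_nm >= up_min:
        return "up"
    return "intermediate"


def label_trace(distances_nm, first_frame_time_s: float = 5.0):
    """Attach a time stamp and a state label to every frame of a distance trace."""
    return [
        (first_frame_time_s + i * FRAME_INTERVAL_S, classify_flag(d))
        for i, d in enumerate(distances_nm)
    ]


# Synthetic, made-up trace that mimics the sequence described in the text
toy_trace_nm = [8.0, 14.0, 16.0, 24.0, 23.0, 15.0, 7.0]
labels = label_trace(toy_trace_nm)
for time_s, state in labels:
    print(f"{time_s:>4.0f} s  {state}")
```

In practice the thresholds would have to be chosen from the measured distance profiles of the control flag, which stays static, and the B–Z flag.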


Table 1 Statistical analysis: orientation of flag

                      (5MeCG)6 B–Z system             Control
Mg2+ concentration    down    up     Flags counted    down    up     Flags counted
5 mM                  71%     29%    70               75%     25%    68
10 mM                 40%     60%    163              70%     30%    17
15 mM                 38%     62%    84               71%     29%    86
20 mM                 30%     70%    121              76%     24%    116
25 mM                 32%     68%    62               76%     24%    63
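As a worked reading of Table 1, the following sketch computes how much more often the flag points up in the B–Z system than in the control at each Mg2+ concentration. The percentages are copied from Table 1; the helper name `up_flag_excess` is introduced here only for illustration.

```python
# Up-flag percentages from Table 1: Mg2+ (mM) -> (B-Z system up %, control up %)
UP_PERCENT = {
    5: (29, 25),
    10: (60, 30),
    15: (62, 29),
    20: (70, 24),
    25: (68, 24),
}


def up_flag_excess(table):
    """Percentage-point excess of up-flags in the B-Z system over the control."""
    return {mg_mm: bz_up - control_up for mg_mm, (bz_up, control_up) in table.items()}


excess = up_flag_excess(UP_PERCENT)
for mg_mm in sorted(excess):
    print(f"{mg_mm:>2} mM Mg2+: B-Z system exceeds control by {excess[mg_mm]} points")
```

The excess is small at 5 mM and rises sharply from 10 mM upward, which is consistent with the CD result that the B–Z transition occurs at 10 mM Mg2+.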

In conclusion, by combining the DNA origami system with HS-AFM, the rotation of dsDNA originating from the B–Z transition was observed at molecular resolution. Although single-molecule imaging with HS-AFM cannot directly visualize the rotation of dsDNA during the B–Z transition, a structural marker such as the flag marker makes the rotation observable by allowing the shapes of these DNA components to be monitored directly. These target-oriented observation systems can contribute to the detailed analysis of DNA structural changes in real time at molecular resolution.

2 Materials

2.1 Design and Preparation of DNA Origami

1. M13mp18 single-stranded DNA (New England Biolabs, USA).
2. Staple DNA strands (Sigma Genosys, Tokyo, Japan).
3. Thermal cycler.
4. Sephacryl S-300 gel-filtration column (GE Healthcare, Buckinghamshire, UK).
5. Gel-filtration column (Bio-Rad Laboratories, Hercules, CA).
6. Electrophoresis buffer: 0.5× TBE (Tris/Borate/EDTA buffer).
7. Agarose gel: 1% agarose, 5 mM MgCl2, and 0.5× TBE.


Fig. 3 Direct observation of the B–Z transition in the DNA origami frame. (a) Successive HS-AFM images of the sample recorded at a scanning rate of 0.2 frame/s. Image size 150 nm × 150 nm. Repeating rotation of the flag in the B–Z system was visualized in the DNA frame by HS-AFM during scanning. (b) Graphical explanation of the molecular events during the flag rotation. (c) Trajectories of the distance profiles of the flags. The distances between the edges of the flag in the B–Z system (red; L1) or that of the control component (blue; L2) and the origami frame (inset)

2.2 Preparation of a DNA Origami Frame with B–Z Transition DNA Components

1. DNA origami frame prepared using the procedure described in Subheading 3.1.
2. Target dsDNA assemblies prepared using the procedure described in Subheading 3.2.
3. Thermal cycler.
4. Sephacryl S-500 gel-filtration column (GE Healthcare, Buckinghamshire, UK).
5. Gel-filtration column (Bio-Rad Laboratories, Hercules, CA).

2.3 High-Speed Atomic Force Microscopy (HS-AFM)

1. Mica discs (diameter of 1.5 mm) (Furuuchi Chemical Corporation, Tokyo, Japan).
2. Solution of 10 nM DNA nanostructure prepared using the procedure described in Subheading 3.3.
3. Observation buffer: 20 mM Tris–HCl, pH 7.6, and 5–25 mM MgCl2.
4. High-speed atomic force microscope (Nano Live Vision, RIBM, Tsukuba, Japan) mounted on an Olympus IX-71 microscope-based unit.
5. Cantilever (BL-AC10EGS, Olympus, Tokyo, Japan).

3 Methods

3.1 Design and Preparation of a DNA Origami Frame

The DNA frame structure was designed according to the rules of the DNA origami method (see Note 1) [4]. The M13mp18 single-stranded DNA was used as the scaffold DNA strand, and the sequences of the complementary staple strands were extracted from the design of the DNA origami frame (Fig. 1c) [9]. These sequences are listed in Ref. [12]. The DNA frame also has four hairpin markers at its top left corner for identifying the orientation of the DNA frame structure. The DNA origami frame was prepared as follows:

1. Mix all staple DNA strands (226 strands, 100 μM each) to a final concentration of ~0.2 μM.
2. Prepare a 20 μL solution containing 10 nM M13mp18 single-stranded DNA, 40 nM premixed staple strands (4 equiv.), 20 mM Tris–HCl (pH 7.6), 1 mM EDTA, and 5–25 mM MgCl2.
3. Anneal the mixture from 85 °C to 15 °C at a rate of −1.0 °C/min using a thermal cycler.
4. Purify the sample by gel filtration chromatography (Sephacryl S-300).
5. Observe the formation of the structure by AFM using the procedure described in Subheading 3.3.
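The pipetting volumes and annealing time implied by steps 2 and 3 follow from simple arithmetic, sketched below. The staple stock of 200 nM corresponds to the ~0.2 μM mix from step 1; the 100 nM scaffold stock is a hypothetical value used only for illustration and should be replaced by the concentration of the actual M13mp18 stock.

```python
def stock_volume_ul(final_conc_nm: float, final_volume_ul: float, stock_conc_nm: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    if stock_conc_nm < final_conc_nm:
        raise ValueError("stock must be at least as concentrated as the final solution")
    return final_conc_nm * final_volume_ul / stock_conc_nm


def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float = 1.0) -> float:
    """Duration of a linear temperature ramp at the given rate."""
    return abs(start_c - end_c) / rate_c_per_min


REACTION_UL = 20
staple_ul = stock_volume_ul(40, REACTION_UL, 200)    # 40 nM staples from the ~0.2 uM mix
scaffold_ul = stock_volume_ul(10, REACTION_UL, 100)  # 10 nM scaffold; 100 nM stock is hypothetical
anneal_min = ramp_minutes(85, 15)                    # 85 C -> 15 C at 1.0 C/min

print(f"staple mix: {staple_ul} uL, scaffold: {scaffold_ul} uL, annealing: {anneal_min} min")
```

The remaining volume is made up with buffer components so that the final solution contains the Tris–HCl, EDTA, and MgCl2 concentrations given in step 2.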

3.2 Assembly of the DNA Components in the DNA Frame

1. Assemble substrate DNA components containing B–Z transition (or random) sequence and flag marker by annealing from 85 °C to 15 °C at a rate of −1.0 °C/min using a thermal cycler.
2. Mix two assembled DNA substrates (0.1 μM, 10 equiv.) with the 10 nM preassembled DNA frame in a solution containing 20 mM Tris–HCl (pH 7.6), 1 mM EDTA, and 5–25 mM MgCl2.
3. Assemble the two DNA components in the DNA frame by annealing from 45 °C to 15 °C at a rate of −1.0 °C/min.
4. Purify the sample by gel filtration chromatography (Sephacryl S-500) equilibrated with 20 mM Tris–HCl (pH 7.6), 1 mM EDTA, and 5–25 mM MgCl2.
5. Observe the assembled structures with HS-AFM using the procedure described in Subheading 3.3 by changing the Mg2+ concentration (5–25 mM).

3.3 High-Speed AFM Imaging of the Behavior of the B–Z DNA Strands in the DNA Frame

AFM images were obtained via HS-AFM (Nano Live Vision, RIBM, Tsukuba, Japan) (see Note 2) using a silicon nitride cantilever (Olympus BL-AC10EGS) (see Note 3). Samples for AFM imaging were prepared and the HS-AFM was operated as follows:

1. Attach the mica disc onto the glass scaffold.
2. Cleave the mica disc to obtain a fresh surface.
3. Dilute the DNA nanostructure sample to ~5 nM by adding observation buffer.
4. Place the sample solution (~2 μL) onto the mica surface for 5 min.
5. Rinse the surface with imaging buffer (~10 μL) three times to remove unbound molecules.
6. Place the cantilever on the cantilever holder.
7. Fill the liquid cell with observation buffer (~120 μL).
8. Place the mica disc with the glass scaffold onto the scanner stage.
9. Set up the scanner over the liquid cell in which the cantilever is immersed in the observation buffer.
10. Align the laser focusing position and the photodetector position to maximize the intensity of the laser light reflected back from the cantilever.
11. Find the resonant frequency of the cantilever using a fast Fourier transform (FFT) analyzer.
12. Excite the cantilever at the resonant frequency by applying sinusoidal AC voltage.
13. Execute the approach until the software stops the motor automatically.
14. Gradually decrease the set point voltage until the sample is clearly imaged.
15. Image the samples in the observation buffer with a scanning rate of 0.2 frame/s (fps).
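To give a feel for what the scanning rate in step 15 means for the tip, the sketch below converts a frame rate into a frame time and an average tip speed for a raster scan with trace and retrace on every line. The 150 nm scan width is the image size quoted in Fig. 3; the number of scan lines per frame is not given in this chapter, so the value of 128 used here is a hypothetical placeholder.

```python
def frame_time_s(frame_rate_fps: float) -> float:
    """Acquisition time of one frame."""
    return 1.0 / frame_rate_fps


def tip_speed_nm_per_s(scan_width_nm: float, n_lines: int, frame_rate_fps: float) -> float:
    """Average fast-axis tip speed for a raster scan.

    Each line is traversed twice (trace and retrace), so the tip travels
    2 * width * n_lines per frame.
    """
    return 2.0 * scan_width_nm * n_lines * frame_rate_fps


FRAME_RATE_FPS = 0.2   # from step 15
SCAN_WIDTH_NM = 150.0  # image size quoted in Fig. 3
N_LINES = 128          # hypothetical number of scan lines per frame

print(f"frame time: {frame_time_s(FRAME_RATE_FPS):.1f} s")
print(f"tip speed: {tip_speed_nm_per_s(SCAN_WIDTH_NM, N_LINES, FRAME_RATE_FPS) / 1000:.2f} um/s")
```

A frame time of 5 s matches the spacing of the time stamps quoted for the successive images in Fig. 3.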

4 Notes

1. The design of DNA origami structures is currently carried out using the caDNAno software (http://cadnano.org/) [14].
2. Details of the instrument are available at the homepage of RIBM (http://www.ribm.co.jp).
3. For HS-AFM imaging, small cantilevers are used. Small cantilevers (9 μm long, 2 μm wide, and 130 nm thick; BL-AC10DS, Olympus, Tokyo, Japan) made of silicon nitride with a spring constant of ~0.1 N/m and a resonant frequency of ~300–600 kHz in water are commercially available from Olympus.

References

1. Torring T, Voigt NV, Nangreave J, Yan H, Gothelf KV (2011) DNA origami: a quantum leap for self-assembly of complex structures. Chem Soc Rev 40:5636–5646
2. Rajendran A, Endo M, Sugiyama H (2012) Single-molecule analysis using DNA origami. Angew Chem Int Ed 51:874–890
3. Endo M, Yang Y, Sugiyama H (2013) DNA origami technology for biomaterials applications. Biomater Sci 1:347–360
4. Rothemund PW (2006) Folding DNA to create nanoscale shapes and patterns. Nature 440:297–302
5. Ando T, Kodera N, Takai E, Maruyama D, Saito K, Toda A (2001) A high-speed atomic force microscope for studying biological macromolecules. Proc Natl Acad Sci U S A 98:12468–12472
6. Ando T, Kodera N (2012) Visualization of mobility by atomic force microscopy. Methods Mol Biol 896:57–69
7. Uchihashi T, Kodera N, Ando T (2012) Guide to video recording of structure dynamics and dynamic processes of proteins by high-speed atomic force microscopy. Nat Protoc 7:1193–1206
8. Rajendran A, Endo M, Sugiyama H (2014) State-of-the-Art High-Speed Atomic Force Microscopy for Investigation of Single-Molecular Dynamics of Proteins. Chem Rev 114:1493–1520
9. Endo M, Katsuda Y, Hidaka K, Sugiyama H (2010) Regulation of DNA methylation using different tensions of double strands constructed in a defined DNA nanostructure. J Am Chem Soc 132:1592–1597
10. Jovin TM, Soumpasis DM, Mcintosh LP (1987) The transition between B-DNA and Z-DNA. Annu Rev Phys Chem 38:521–560
11. Mao CD, Sun WQ, Shen ZY, Seeman NC (1999) A nanomechanical device based on the B-Z transition of DNA. Nature 397:144–146
12. Rajendran A, Endo M, Hidaka K, Sugiyama H (2013) Direct and real-time observation of rotary movement of a DNA nanomechanical device. J Am Chem Soc 135:1117–1123
13. Behe M, Felsenfeld G (1981) Effects of methylation on a synthetic polynucleotide: the B-Z transition in poly(dG-m5dC).poly(dG-m5dC). Proc Natl Acad Sci U S A 78:1619–1623
14. Douglas SM, Marblestone AH, Teerapittayanon S, Vazquez A, Church GM, Shih WM (2009) Rapid prototyping of 3D DNA-origami shapes with caDNAno. Nucleic Acids Res 37:5001–5006

Chapter 18

Adoption of A–Z Junctions in RNAs by Binding of Zα Domains

Parker J. Nichols, Shaun Bevers, Morkos A. Henen, Jeffrey S. Kieft, Quentin Vicens, and Beat Vögeli

Abstract

While DNA and RNA helices often adopt the canonical B- or A-conformation, the fluid conformational landscape of nucleic acids allows for many higher energy states to be sampled. One such state is the Z-conformation of nucleic acids, which is unique in that it is left-handed and has a “zigzag” backbone. The Z-conformation is recognized and stabilized by Z-DNA/RNA binding domains called Zα domains. We recently demonstrated that a wide range of RNAs can adopt partial Z-conformations termed “A–Z junctions” upon binding to Zα and that the formation of such conformations may be dependent upon both sequence and context. In this chapter, we present general protocols for characterizing the binding of Zα domains to A–Z junction-forming RNAs for the purpose of determining the affinity and stoichiometry of interactions as well as the extent and location of Z-RNA formation.

Key words A–Z junctions, Zα domains, Circular dichroism, Ez score, Isothermal titration calorimetry, Analytical ultracentrifugation, Nuclear magnetic resonance, Z-RNA

1 Introduction

The Z-conformation is a unique higher-energy helical structure adopted by DNA and RNA that is left-handed and comprised of dinucleotide repeats termed “Z-steps” [1–3]. Although the Z-form of nucleic acids was originally discovered in 1979 [3], understanding the role of Z-DNA/RNA in biology has been a slow process with many ongoing areas of research. Z-DNA and Z-RNA have been shown to play important roles in many biological processes including DNA replication, RNA transcription, splicing, RNA editing, and the innate immune response [4–8]. Most of these processes involve proteins containing one or more winged helix–turn–helix Zα domains which recognize Z-RNA/Z-DNA by binding to and stabilizing Z-form structures [9–13] that form within the context of larger nucleic acid sequences [4–6, 14].

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_18, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023


Despite the diversity of DNA and RNA sequences that interact with Zα-containing proteins [9, 15], most biochemical and biophysical characterization of RNA formation has been limited to studies involving (CpG)n sequences, the reason being that the CpG dinucleotide repeat is the sequence most prone to adopt the Z-conformation [1, 16]. Significantly more progress has been made in this area for Z-DNA than for Z-RNA, with several studies demonstrating Z-DNA formation and Zα binding in TpA, CpA, GpC, and TpG steps [9, 17]. In addition, work on understanding the role of Z-DNA formation under torsional stress has demonstrated that Z-forming sequences within the context of larger DNA sequences form B–Z junctions, where the B-DNA regions flanking the Z-DNA site become destabilized and the nucleotides in between the B- and Z-DNA flip out of the helix in order to preserve the continuous base stacking [18–22]. The formation of such junctions can be conveniently monitored using 1D-1H NMR upon addition of Zα, as demonstrated for a variety of B–Z junction-forming DNAs [18, 23–25]. We recently applied what was known about B–Z junction formation to investigate a similar phenomenon in ribosomal hairpins [15] and predicted Z-forming regions within an Alu foldback [5] created from the pairing of an AluSx1 and an AluJo within the 3′ UTR of the CTSS gene [26]. Using biophysical techniques including circular dichroism (CD), analytical ultracentrifugation (AUC), isothermal titration calorimetry (ITC), and nuclear magnetic resonance (NMR), we characterized the binding of Zα and the formation of A–Z junctions in these RNAs. We showed that the sequence dependence of Z-RNA formation may be more complex than originally thought [26]. In this Chapter, we describe in detail our pipeline for how to carry out and analyze these biophysical experiments so they can be applied to characterize other potential A–Z junction-forming RNAs (Fig. 1).

2 Materials

2.1 General Supplies Needed

1. Glass beakers (2 L is preferable). 2. 500 mL and 1 L buffer bottles. 3. Magnetic stir bars. 4. 1.5 mL microcentrifuge tubes. 5. 0.2 mL PCR tubes. 6. 15 and 50 mL conical tubes. 7. Micropipettes and micropipette tips. 8. Gel loading tips (not required but useful for loading samples into quartz cuvettes for CD and sample cells for AUC). 9. Potassium phosphate monobasic (KH2PO4).

[Fig. 1 flowchart]

Common steps: recombinantly express Zα in E. coli and purify to high concentration (>2 mM), and in vitro transcribe RNA and purify or have it synthesized; dialyze or exchange RNA and Zα into the appropriate buffer and concentrate; refold RNA to promote correct secondary structure adoption.

ITC: dialyze RNA and Zα (separately) within the same beaker of buffer to prevent buffer mismatch; refold RNA to promote correct secondary structure adoption; concentrate RNA and Zα to the concentrations needed for ITC, using concentrations that result in a sigmoidicity factor (C) between 1 and 1000; carry out ITC experiments to investigate the thermodynamic parameters and dissociation constants of Zα interacting with A–Z junction RNAs.

CD: incubate free RNA, RNA with enough Zα to saturate all potential binding sites and induce the Z-conformation, and RNA with sodium perchlorate as a positive control, at 42 °C for 30 min; carry out CD and compare spectra to controls to determine how much of the RNA adopts the Z-conformation; parametrize the extent of Z-RNA formation by calculating EZ scores using molar ellipticity at 266, 285, and 295 nm.

AUC: determine the RNA concentration needed to give appropriate absorbance levels (typically low μM); incubate RNA with enough Zα to saturate all binding sites and induce the Z-conformation at 42 °C for 30 min; carry out the AUC experiment to determine the molecular weight of the RNA:Zα complex and the complex stoichiometry.

NMR: measure unbound RNA 1D 1H imino spectra; measure 2D 1H-1H imino NOESY to assign imino spectra; carry out a titration of Zα into RNA and measure imino spectra for each point, making sure to incubate the sample at 42 °C for 30 min after every addition of Zα; use the imino assignment and signatures of A–Z junction formation to determine which parts of the RNA are converted to the Z-conformation and destabilized.

Fig. 1 Pipeline for the biophysical characterization of Zα-dependent A–Z junction adoption in RNAs. An overview of the steps required for characterization of A–Z junction adoption upon binding to Zα by circular dichroism (CD), isothermal titration calorimetry (ITC), analytical ultracentrifugation (AUC), and nuclear magnetic resonance (NMR) is shown

10. Potassium phosphate dibasic (K2HPO4). 11. Sodium chloride (NaCl). 12. Ethylenediaminetetraacetic acid (EDTA, [CH2N(CH2CO2H)2]2). 13. 1,4-Dithiothreitol (DTT, C4H10O2S2). 14. Sodium perchlorate (NaClO4). 15. Heat block or water bath able to achieve 95 °C and 42 °C. 16. Tabletop microcentrifuge and swing-bucket rotor centrifuge (or any centrifuge that can spin 50 mL conicals). 17. 3 kDa cutoff dialysis membrane or 2 kDa cutoff Slide-A-Lyzer cassettes (Thermo Fisher, MA). 18. 3 kDa cutoff Amicon Ultra-15 and Ultra-4 centrifugal filter units (MilliporeSigma, MO). 19. Nano UV-Vis spectrometer.

2.2 Circular Dichroism

1. Access to a CD spectrometer (ours was a Jasco J-815 CD Spectropolarimeter, Jasco, RI). 2. Sample temperature controller (ours was a Jasco PTC-423L Peltier Controller, Artisan Technology Group, IL). 3. Software for controlling CD spectrometer (we used Spectra Manager version 2, Jasco, RI). 4. Quartz cuvettes (Jasco, RI). We used 1 mm path length for our studies. 5. Single cuvette washer (Fireflysci Type P65S, Thermo Fisher, MA). 6. Microsoft Excel or other data analysis software.


2.3 Isothermal Titration Calorimetry

1. Access to an isothermal titration calorimeter (ours was a Malvern ITC200, Malvern, PA). 2. Software for controlling ITC machine (ours was ITC200 version 1.26.1, Malvern, PA). 3. Sample cell syringe (Malvern Panalytical Inc ITC200 Syringe Filling, Malvern, PA). 4. Analysis software (we used Microcal Analysis version 7 SR4, Origin, MA).

2.4 Analytical Ultracentrifugation

1. Access to an analytical ultracentrifugation instrument (ours was a XL-I Beckman Coulter, Beckman, CA). 2. An-60 Ti analytical 4-place titanium rotor (Beckman Coulter, CA). 3. Standard 12 mm EPON centerpieces with quartz windows (Beckman Coulter, CA). 4. Software for controlling AUC machine (we used ProteomeLab version 6.0, Beckman Coulter, CA). 5. Method for calculating buffer density. Can be readily done using SEDNTERP version 3 (by John S. Philo at Alliance Protein Laboratories, CA). 6. Software for analyzing AUC data (we used SEDFIT version 14.7g, NIH).

2.5 Nuclear Magnetic Resonance

1. Access to a high-field NMR magnet with 1H/13C/15N cryoprobe and associated equipment (Bruker or Varian). 2. 5 mm spinner for sample insertion (Bruker or Varian). 3. Regular 5 mm NMR tube or Shigemi tubes (Wilmad-LabGlass, IL). 4. Software for controlling NMR console and data analysis. We used VNMRJ version 4.2 Revision A (Agilent, CA) for collecting data from Varian spectrometers and TopSpin version 4.0.7 (Bruker, MA) for Bruker.

3 Methods

3.1 Acquiring and Preparing Protein Zα Samples

The recombinant expression and purification of the Zα protein have been covered thoroughly elsewhere [12, 26, 27] and will not be discussed here. Once the Zα protein has been purified, it can be checked for its Z-DNA/Z-RNA-stabilizing activity by incubating it with r/d(CpG)n repeats using CD (as discussed below) (see Note 1).


3.2 Preparation of RNA and RNA/Zα Complexes


We will not cover these methods here, but RNAs can either be produced in-house through in vitro T7 transcription [28–32] or by solid-phase synthesis [33] and purified through preparative denaturing polyacrylamide gel electrophoresis [32, 34] (we usually used 20% polyacrylamide as most of the A–Z junction RNAs we tested were ~30 nts or less) or chromatography [35, 36]. Because RNAs comprising a few dozen nucleotides at most may be difficult to in vitro transcribe, they may more conveniently be obtained by chemical synthesis, which can be carried out, for example, by Integrated DNA Technologies (IDT) or Horizon Discovery (formerly known as Dharmacon). RNAs purified by denaturing polyacrylamide gel electrophoresis should be subjected to an additional round of purification (see Note 2), as described below (steps 1–4). For all other cases, start at step 5.
1. Prepare anion exchange binding/wash and elution buffers. Binding/wash buffer: 20 mM potassium phosphate (pH 6.4), 150 mM NaCl, 0.2 mM EDTA. Elution buffer: 20 mM potassium phosphate (pH 6.4), 2000 mM NaCl, 0.2 mM EDTA.
2. Equilibrate the DEAE-Sepharose into anion exchange binding buffer by pumping at least 2 column volumes (CV) of buffer through the column. A ~20 mL column should be more than sufficient. We purchase the pack of five 5 mL DEAE columns from MilliporeSigma and attach them together.
3. Inject the RNA sample onto the column and wash with at least 3 CV of wash buffer.
4. Elute the RNA using elution buffer, and monitor the ultraviolet signal at 260 nm (preferred) or 280 nm to determine when all of the RNA has been collected.
5. Prepare 20 mM potassium phosphate (pH 6.4), 25 mM NaCl, 0.5 mM EDTA, and 1 mM DTT for NMR or 20 mM potassium phosphate (pH 7.0), 25 mM NaCl, 0.5 mM EDTA, and 1 mM DTT for all other measurements (see Notes 3 and 4).
6. Dialyze RNA and Zα protein into the correct buffer.
This can be done either by standard dialysis through dialysis membrane (SpectraPor) in a 2 L beaker with buffer or by buffer exchange using Amicon Centricons (MilliporeSigma). For Centricon dialysis, centrifuging a total of about 50 mL of buffer through the Centricon is usually sufficient to effectively dialyze the sample. Keep everything at 4 °C prior to measurement (see Note 5). 7. Concentrate RNA and Zα using 3 kDa Amicon centrifugal filter units (MilliporeSigma) to concentrations required for subsequent experiments (concentrations can be determined using a NanoDrop or other methods). For NMR, aim for concentrations in excess of 1 mM in order to minimize dilution


of the sample upon addition of titrant. RNA and Zα can be frozen at this point at -20 °C (-80 °C preferred) for long-term storage.
8. Heat anneal RNAs by incubating them at 95 °C for 5 min followed by cooling on ice for 20 min (or the RNA can be cooled slowly at room temperature for 45–60 min) (see Notes 6 and 7).
9. Combine Zα with RNA at the proper concentrations and molar ratios for the planned experiment (see below for details depending on which experiment is being performed).
10. Important: For any equilibrium measurement, such as NMR, CD, or AUC, incubate Zα with the RNA at 42 °C for at least 30 min before measuring to ensure complete conversion of the RNA to the Z-conformation (see Note 8).

3.3 Circular Dichroism for Quantification of Z-Form
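The sample volumes for the CD titration set up in this subheading (Table 1) follow from simple dilution arithmetic. A minimal Python sketch of that bookkeeping is given here, assuming a fixed total volume and ignoring pipetting error; the function name `cd_sample_volumes` and its parameters are hypothetical and are not part of the published protocol.

```python
def cd_sample_volumes(rna_stock_mM, za_stock_mM, rna_final_uM, total_uL, za_per_rna):
    """Return (RNA, Zα, buffer) volumes in µL for a single CD sample.

    za_per_rna is the molar ratio of Zα to RNA: 6.0 for a 1:6 RNA/Zα
    sample, 1/6 for a 6:1 RNA/Zα sample, and 0.0 for free RNA.
    """
    # C1 * V1 = C2 * V2, with the stock concentrations converted from mM to µM.
    rna_uL = rna_final_uM * total_uL / (rna_stock_mM * 1000.0)
    za_uL = rna_final_uM * za_per_rna * total_uL / (za_stock_mM * 1000.0)
    # Buffer makes up whatever volume remains.
    buffer_uL = total_uL - rna_uL - za_uL
    if buffer_uL < 0:
        raise ValueError("stocks are too dilute for the requested sample")
    return rna_uL, za_uL, buffer_uL


if __name__ == "__main__":
    # Stock and target values taken from the worked example in the text.
    points = [("free RNA", 0.0), ("6:1 RNA/Zα", 1 / 6),
              ("1:1 RNA/Zα", 1.0), ("1:6 RNA/Zα", 6.0)]
    for label, ratio in points:
        rna, za, buf = cd_sample_volumes(1.0, 2.0, 50.0, 175.0, ratio)
        print(f"{label}: RNA {rna:.2f} µL, Zα {za:.2f} µL, buffer {buf:.2f} µL")
```

Computing the buffer volume as the remainder keeps every sample at the same total volume, so the RNA concentration is identical across titration points.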

The concentration of RNA needed to acquire good signal to noise for CD will depend on the size of the RNA (longer RNAs have a higher absorbance), the type of cuvette being used (larger cuvettes will have a longer path length and thus require more volume but less concentration), and the sensitivity of the spectrometer (see Note 9). There are two ways to carry out the CD titration. To conserve the RNA sample, the Zα stock can be concentrated to high levels, so that the dilution of the RNA is minimal, and titrated into a single RNA sample for each titration point. Note that this method requires that the sample be incubated at 42 °C for 30 min following each addition. The other method is to make a separate sample for each titration point (which is what we usually do).
1. Calculate the volume needed from the concentrated RNA stock to reach 50 μM RNA in 175 μL of 20 mM potassium phosphate (pH 7.0), 25 mM NaCl, 0.5 mM EDTA, and 1 mM DTT. Then, calculate how much volume from the Zα stock is needed to reach the desired molar ratios for each of the titration points. For example, if we were going to measure five different points (free RNA, 6 M sodium perchlorate, 6:1, 1:1, and 1:6 RNA/Zα), and the RNA stock was 1 mM and the Zα stock was 2 mM, then Table 1 would describe the volumes needed from each. Also prepare the controls needed for proper interpretation of the data (see Note 10).

Table 1 Example setup for circular dichroism titration of RNA with Zα

CD experiment/titration point   RNA stock (1 mM)   Zα stock (2 mM)   Sodium perchlorate (8 M)   Buffer
Buffer control                  –                  –                 –                          175 μL
Free RNA                        8.75 μL            –                 –                          166.25 μL
Free Zα control                 –                  26.25 μL          –                          148.75 μL
Z-RNA control                   8.75 μL            –                 131.25 μL                  35 μL
6:1 RNA/Zα                      8.75 μL            0.73 μL           –                          165.52 μL
1:1 RNA/Zα                      8.75 μL            4.38 μL           –                          161.87 μL
1:6 RNA/Zα                      8.75 μL            26.25 μL          –                          140 μL

Buffer: 20 mM potassium phosphate (pH 6.4), 25 mM NaCl, 0.5 mM EDTA, 1 mM DTT

2. Incubate the samples at 42 °C for 30 min. While waiting, turn on the nitrogen gas to the instrument, then start the Jasco J-815 CD Spectropolarimeter (Jasco) and the Jasco PTC-423L Peltier Controller (Artisan Technology Group), and start equilibrating the sample cell holder to 25 °C. The temperature will also need to be set within Spectra Manager version 2 (Jasco).
3. Set up the spectral measurement parameters within Spectra Manager. The measurement range should be set to 320 nm to 220 nm (the range which reports on the secondary structure of nucleic acids), the step size to 1 nm, and the number of scans to at least two.
4. Once the samples have incubated, pipette (using gel loading tips here is helpful) the entire 175 μL of the first sample to be measured into a 1 mm quartz cuvette, making sure that the sample settles all the way to the bottom of the cuvette and that there are no bubbles.
5. Insert the cuvette containing the sample into the sample holder and begin the measurement. Once the collection is complete, transfer the CD signal versus wavelength data into an Excel sheet or other data analysis software.
6. Wash the cuvette with buffer (or water, but buffer is preferred) two times, followed by ethanol once, and then with water four times before the next measurement. Remove as much of the residual water as possible before proceeding to the next measurement. It is useful to have a cuvette washer (Thermo Fisher) for this step.
7. Repeat for the remaining samples. Export all the data into an Excel sheet.
8. Convert the raw CD signal to molar ellipticity using the following equation:

\[ [\Theta] = \Theta_{\mathrm{obs}} \cdot \frac{M}{10 \cdot l \cdot c} \]

[Fig. 2 image: panel (a), CD spectra of the (ApU)6, A–Z junction control, and (CpG)3 RNAs plotted as molar ellipticity [Θ] versus wavelength (nm), with 266, 285, and 295 nm marked; panel (b), Ez score for the (ApU)6, A–Z junction control, AluSx1Jo, and (CpG)3 RNAs]

Fig. 2 Circular dichroism to investigate A–Z junction adoption and determine the extent of Z-RNA formation. (a) CD spectra of the (CpG)3, an A–Z junction positive control, and an (ApU)6 negative control RNA in the absence of protein (black), with 6 M sodium perchlorate (dotted black line) and with a molar ratio of 1:6 RNA/Zα (red) at which binding is saturated. (b) Extent of Z-RNA (EZ) scores quantifying the extent of Z-conformation for the (ApU)6, A–Z junction control, AluSx1Jo, and (CpG)3 RNAs

where [Θ] is the molar ellipticity, Θobs is the measured raw ellipticity in mdeg, M is the molecular weight of the molecule in g/mol, l is the path length of the cell in cm, and c is the concentration in g/L.
9. Plot the molar ellipticity of the different measurements on the y-axis versus the wavelength in nm on the x-axis to obtain the final result (Fig. 2a).

The CD results can be used to judge how much of an RNA adopts the Z-conformation when bound by Zα. For RNAs that fully adopt the Z-form, such as (CpG)n repeat sequences, the CD spectra upon saturation with Zα are characterized by a complete shift of the peak from ~266 to ~285 nm (Fig. 2a, left). In contrast, RNAs that only adopt a partial Z-conformation, such as the A–Z junction control (which is a sequence that contains a (CpG)6 followed by an A-forming region), are characterized by a population-weighted shift in the CD signal toward the Z-form values (Fig. 2a, middle), compared to an (ApU)6 RNA, which has a minor dip at 266 nm but otherwise no growth at 285 and 295 nm (Fig. 2a, right). To obtain a quantitative description of this shift, we parametrize the different wavelengths that inform on Z-RNA formation as described in Subheading 3.4.

3.4 Calculation of Ez Scores from CD Data to Determine Extent of Z-RNA Formation
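The conversion to molar ellipticity from Subheading 3.3 and the EZ score calculation described in this subheading can be sketched together in Python. This is a minimal sketch, not the authors' spreadsheet: the function names are hypothetical, the calibration constants are derived from the (CpG)3 reference data rather than hard-coded, and the numerical values are the (CpG)3 molar ellipticities listed in Tables 2 and 3.

```python
WAVELENGTHS = (266, 285, 295)

# (CpG)3 molar ellipticities from Tables 2 and 3 (free and Zα-saturated RNA).
CPG3_FREE = {295: -10799.72, 285: -10834.9, 266: 19807.86}
CPG3_BOUND = {295: 7059.54, 285: 16748.64, 266: 8806.2}


def molar_ellipticity(theta_obs_mdeg, mol_weight_g_per_mol, path_cm, conc_g_per_L):
    """[Θ] = Θobs * M / (10 * l * c), with the units defined in the text."""
    return theta_obs_mdeg * mol_weight_g_per_mol / (10.0 * path_cm * conc_g_per_L)


def raw_changes(free, bound):
    """Decay at 266 nm and growth at 285/295 nm, normalized to free [Θ] at 266 nm."""
    ref = free[266]
    return {
        266: (free[266] - bound[266]) / ref,   # decay of the A-form peak
        285: (bound[285] - free[285]) / ref,   # growth of Z-form signal
        295: (bound[295] - free[295]) / ref,   # growth of Z-form signal
    }


def calibration_constants(ref_free, ref_bound):
    """Constants that scale each change of the reference RNA to exactly 1."""
    return {wl: 1.0 / value for wl, value in raw_changes(ref_free, ref_bound).items()}


def ez_score(free, bound, constants):
    """Average of the three calibrated growth/decay values."""
    changes = raw_changes(free, bound)
    return sum(constants[wl] * changes[wl] for wl in WAVELENGTHS) / len(WAVELENGTHS)


if __name__ == "__main__":
    constants = calibration_constants(CPG3_FREE, CPG3_BOUND)
    print("calibration constants:", constants)
    print("EZ score of (CpG)3:", ez_score(CPG3_FREE, CPG3_BOUND, constants))
```

Deriving the constants from the reference data makes the calibration explicit and lets a different fully Z-forming control be substituted if needed; for any other RNA, pass its own free and saturated values to `ez_score` together with the (CpG)3-derived constants.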

Certain wavelengths within CD spectra inform on whether the RNA is in the A- or Z-form and therefore can be tracked upon addition of Zα to determine how much of an RNA adopts the Z-conformation. In particular, a decrease in the positive molar ellipticity centered around ~266 nm (which is indicative of the A-form) and a growth at 285 and 295 nm report on formation of Z-RNA [11, 37] (Fig. 2a). By parametrizing the change in the molar ellipticity at these wavelengths, we calculate what we have termed an “EZ score,” which gives a single number that is useful for comparing different Z-adopting RNAs to each other (Fig. 2b) [26].
1. Take the molar ellipticity values at 266, 285, and 295 nm for both the free form and the fully saturated complex (usually 1:6 RNA/Zα), and paste them into a new column as shown in Table 2 (example taken from (CpG)3).

Table 2 Example of Excel sheet containing final data from CD titration

Wavelength (nm)   [Θ] (free RNA)   [Θ] (saturated RNA)   Growth/decay of [Θ]   Calibrated growth/decay
295               -10799.72        7059.54
285               -10834.9         16748.64
266               19807.86         8806.2

2. Next, the decay of the molar ellipticity at 266 nm is calculated using the following equation:

\[ \mathrm{decay}_{266} = \frac{\mathrm{Int}_{\mathrm{free},266} - \mathrm{Int}_{\mathrm{bound},266}}{\mathrm{Int}_{\mathrm{free},266}} \]

where decay266 is the decay of the molar ellipticity at 266 nm, Intfree266 is the CD signal of the free RNA at 266 nm, and Intbound266 is the CD signal of the bound RNA at 266 nm.
3. The growth at 285 and 295 nm is then calculated as:

\[ \mathrm{growth}_{285} = \frac{\mathrm{Int}_{\mathrm{bound},285} - \mathrm{Int}_{\mathrm{free},285}}{\mathrm{Int}_{\mathrm{free},266}} \]

\[ \mathrm{growth}_{295} = \frac{\mathrm{Int}_{\mathrm{bound},295} - \mathrm{Int}_{\mathrm{free},295}}{\mathrm{Int}_{\mathrm{free},266}} \]

where growth285 and growth295 are the growth of the molar ellipticity at 285 and 295 nm, respectively, Intfree266 is the CD signal of the free RNA at 266 nm, Intfree285 and Intfree295 are the CD signals of the free RNA at 285 and 295 nm, respectively, and Intbound285 and Intbound295 are the CD signals of the bound RNA at 285 and 295 nm, respectively. Since the CD signal will vary for different RNA sequence contexts, it is important to always normalize the growth/decay to the signal of the free form, which is why each growth or decay is divided by Intfree266. Next, the growth/decays are calibrated to the control (CpG)3 RNA since this RNA fully adopts the Z-conformation when bound by Zα. This means that we take the growths/decays at the three wavelengths and multiply them by a constant that makes them equal to 1 for the (CpG)3 case. We then use the same constants for the other tested RNAs.
4. Multiply the decay266 by 1.80, the growth285 by 0.718, and the growth295 by 1.11 (which are the values determined empirically to calibrate the EZ score to the (CpG)3 RNA). See Table 3 for an example.

Table 3 Example of Excel sheet with calculated growth/decays of titration data

Wavelength (nm)   [Θ] (free RNA)   [Θ] (saturated RNA)   Growth/decay of [Θ]   Calibrated growth/decay
295               -10799.72        7059.54               0.90162491            1
285               -10834.9         16748.64              1.39255528            1
266               19807.86         8806.2                0.55541891            1

5. The final Ez score of the RNA is the average of the three calibrated growth/decay values. Example Ez scores from an (ApU)6 negative control, A–Z junction positive control, AluSx1Jo foldback, and (CpG)3 positive control RNA are shown (Fig. 2b).

3.5 Isothermal Titration Calorimetry to Investigate Affinity and Thermodynamics of Binding

The optimal amount of macromolecule in the calorimeter cell usually needs to be determined to acquire quality results and a sigmoidal binding curve. Ideally, we want the sigmoidicity factor (C) to be between 10 and 1000, which can be determined through the following equation:

\[ C = \frac{N\,[M]_T}{K_D} \]

where N is the stoichiometry, [M]T is the biomolecule concentration in the ITC cell, and KD is the dissociation constant of the interaction. For example, with an injection of Zα into (CpG)3 RNA at a concentration of 50 μM:

\[ C = \frac{2 \times (50 \times 10^{-6})}{241.5 \times 10^{-9}} = 414.1 \]

While the KD of the interaction between Zα and an RNA may be unknown, an initial pilot experiment can be carried out assuming a KD similar to the literature values and then adjusting the concentration from there. The stoichiometry is also usually an unknown but can be determined readily by AUC methods as described below. Aim for a titrant concentration of ~10× the concentration of the biomolecule in the cell, so if the cell concentration is 50 μM, then the concentration of titrant in the syringe should be 500 μM.
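The planning arithmetic above can be sketched in Python as follows. This is a minimal sketch that only restates the C equation and the ~10× titrant guideline from the text; the function names are hypothetical, and the stoichiometry and KD passed in are estimates (for example from the literature or a pilot run) rather than measured values.

```python
def c_value(n_sites, cell_conc_M, kd_M):
    """Sigmoidicity factor C = N * [M]T / KD (all concentrations in mol/L)."""
    return n_sites * cell_conc_M / kd_M


def cell_conc_window(n_sites, kd_M, c_min=10.0, c_max=1000.0):
    """Cell concentrations (mol/L) that keep C between c_min and c_max."""
    return c_min * kd_M / n_sites, c_max * kd_M / n_sites


def syringe_conc(cell_conc_M, fold=10.0):
    """Titrant concentration, following the ~10x guideline in the text."""
    return fold * cell_conc_M


if __name__ == "__main__":
    # Worked example from the text: Zα into 50 µM (CpG)3 RNA, N = 2, KD = 241.5 nM.
    c = c_value(2, 50e-6, 241.5e-9)
    low, high = cell_conc_window(2, 241.5e-9)
    print(f"C = {c:.1f}")
    print(f"cell concentration window: {low * 1e6:.2f} to {high * 1e6:.1f} µM")
    print(f"syringe concentration: {syringe_conc(50e-6) * 1e6:.0f} µM")
```

Inverting the equation to a concentration window is convenient when KD is only roughly known: running the same calculation for the lowest and highest plausible KD shows whether a single cell concentration can satisfy both.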


Because some A–Z junction-forming RNAs contain many Zα binding sites, it is not uncommon for the binding curve to be unsaturated by the end of the experiment. If this happens, it is possible to reload the syringe with additional titrant, carry out another injection series, and then concatenate the two data sets (Microcal Concat ITC software, Microcal). Another option is to use a higher concentration of titrant and smaller volumes per injection. Keep in mind that the volumes and best practices will depend on the type of ITC instrument used. In our case, we used a Malvern ITC200 (Malvern).
1. Dialyze Zα and the RNA(s) together (within separate 3 kDa cutoff dialysis bags (SpectraPor) or Slide-A-Lyzers (Thermo Fisher) so that the components do not mix) overnight at 4 °C in 2 L of 20 mM potassium phosphate (pH 7.0), 25 mM NaCl, and 0.5 mM EDTA. For ITC, avoid DTT as it can cause erratic baselines (2-mercaptoethanol (BME) can be used as a replacement). Make sure to dialyze enough Zα and RNA to reach the concentrations in the volumes required for ITC (see Note 11), for example, 50 μM of RNA in >280 μL (for the sample cell) and >40 μL of 500 μM Zα (for the injection syringe).
2. Concentrate Zα and the RNA(s) using 3 kDa cutoff Amicon centrifugal filters to the concentrations required for ITC. Make sure to save some buffer from the 2 L dialysis for dilutions and for the buffer control titration (ligand into buffer).
3. Make sure the sample cell is thoroughly washed with buffer before measurement. Use the sample cell syringe to completely wash the cell at least 3× with buffer before loading the ~280 μL of RNA into the cell. Avoid introducing bubbles.
4. Wash the injection syringe with 3× ~40 μL of buffer and then load the 40 μL of 500 μM Zα titrant.
5. Prepare the experimental settings in ITC200 version 1.26.1 by setting the sample cell reference temperature at 25 °C, the stirring speed at 750 rpm, and the reference power to 10 μcal/s and inputting the sample cell and syringe concentrations. Edit the injection parameters so that there are 19 total injections of 2 μL and 1 injection (the first one in the list) with a volume of 0.4 μL. Set the initial delay to 6 s and the spacing between each injection to 180 s.
6. Run the experiment to collect the ITC thermogram.
7. The ITC data can be fit and analyzed using Microcal Analysis version 7 SR4 (Origin). Load the data into the software and remove the first point from the data (the 0.4 μL injection). Next, perform a baseline correction so that the ΔH for the points at the end of the titration (when the RNA is saturated with Zα) is close to zero.

[Fig. 3a image: ITC thermograms and fits for the (CpG)3, AluSx1Jo, and h41 E. coli RNAs]

Fig. 3b table:

Interaction (cell/syringe)   Kd (nM)                            N                       ΔH (kcal/mol)            TΔS (kcal/mol)   ΔG (kcal/mol)
Zα / (CpG)3                  241.5 ± 1300.0                     0.4 ± 0.0               -6.0 ± 0.1               3.0              -9.0
Zα / AluSx1Jo                37.6 ± 103.8 / 1140.3 ± 8849.6     0.2 ± 0.0 / 0.3 ± 0.0   -0.1 ± 0.2 / 5.9 ± 0.6   10.0 / 14.0      -10.1 / -8.1
h41 E. coli / Zα             218.8 ± 574.7 / 1140.3 ± 51020.4   0.6 ± 0.0 / 1.7 ± 0.1   -0.9 ± 0.1 / 2.1 ± 0.2   8.2 / 9.1        -9.1 / -7.0

Fig. 3 Isothermal titration calorimetry to characterize the thermodynamics of A–Z junction formation by Zα binding. (a) ITC thermograms and fits from titrating Zα into the (CpG)3 and AluSx1Jo and h41 E. coli RNAs. (b) Fitted thermodynamic parameters for the three titrations

8. Choose a binding model (one-site, two-site, etc.) and fit the data to extract the ΔH, ΔS, ΔG, stoichiometry, and KD. The appropriate model to choose depends on the complexity of the interaction. The (CpG)3 RNA, for example, has two binding sites for Zα but they are equivalent, and therefore the data fit best to a one-site model. When we measured ITC for the AluSx1Jo RNA [26], there were two nonequivalent binding sites, and therefore a two-site model was more appropriate for that case. More details about fitting and analysis of ITC data can be found in [38].

Example ITC thermograms are shown for the (CpG)3 RNA as well as two A–Z junction-forming RNAs, AluSx1Jo and h41 E. coli (Fig. 3a), together with fitted binding parameters (Fig. 3b). Binding of Zα to the (CpG)3 RNA is characterized by single-site exothermic binding in the nanomolar range, suggesting that the two binding sites for Zα are equivalent and that the overall reaction is favorable (Fig. 3a, left). In contrast, Zα binding to the AluSx1Jo and h41 E. coli RNAs is characterized by two-site binding in the nanomolar–micromolar range, the first binding event being endothermic and the second being exothermic (Fig. 3a, middle, right). These data suggest that the adoption of A–Z junctions by binding to Zα is a complex process relative to the adoption of Z-RNA in the (CpG)3 repeat, involving multiple binding events and opposing thermodynamic processes. The nature of these processes can be investigated further through higher-resolution techniques such as NMR (discussed below) (see Note 12).

3.6 Sedimentation Velocity Analytical Ultracentrifugation of Z-Conformation-Containing RNA/DNAs Bound to Zα

AUC is a powerful and versatile method for the quantitative analysis of macromolecules in solution [39]. While the setup is tedious and low throughput, it can provide detailed information about the stoichiometry of Zα/RNA complexes. AUC measures the absorbance of the sample as it is spun at high speeds, and so it is important not to use buffers which absorb in the UV range. Since the absorbance of RNA is significantly higher than that of Zα in the majority of cases, the sedimentation of the RNA can be tracked without interference from free protein.
1. Estimate the concentration of RNA needed to perform the AUC experiment. On our AUC instrument (XL-I Beckman Coulter, Beckman), the ideal absorbance at 260 nm is 0.8–1.2 so that the detector is not saturated before and during the run after the sample starts to sediment. An RNA concentration of 2 μM was usually sufficient on our instrument, but this is highly dependent upon the length of the RNA. Prepare enough RNA in 20 mM potassium phosphate (pH 7.0), 25 mM NaCl, 0.5 mM EDTA, and 1 mM DTT at the concentration needed to fill the total volume of the AUC cell, which is ~420 μL. Make sure to also bring enough buffer (20 mM potassium phosphate (pH 7.0), 25 mM NaCl, 0.5 mM EDTA, 1 mM DTT) to fill the reference cell.
2. Refold the RNA as mentioned in Subheading 3.2.
3. Add to the RNA sample enough Zα from the concentrated stock to reach a molar ratio of 1:6 RNA/Zα. If more than six binding sites are predicted, adjust the molar ratio to be above the total number of sites. For example, if the RNA concentration is 2 μM, then add 12 μM Zα (which would be 3 μL of 2 mM Zα into a 500 μL volume).
4. Incubate the sample at 42 °C for 30 min. During the incubation, proceed to step 5 (assembly of the AUC cells).
5. Assemble the Standard 12 mm EPON centerpieces with quartz windows (Beckman Coulter), and load the sample (this can be done with a gel loading pipette or syringe). A good reference for how to do this can be found in [40].
6. Insert the sample cells (including the buffer-only cell) into the An-60 Ti analytical 4-place titanium rotor (Beckman Coulter), and measure one scan at 3000 rpm to check for leaks and determine whether the absorbance is at an acceptable level. A leak is easily spotted by an absorbance versus radius plot that tails off too quickly: the absorbance should stay level from a radius of ~7 to ~6 cm, and if you notice a dip to zero at values significantly before 6 cm, there is a leak and the cell will need to be rebuilt (see Note 13). If no leaking is detected and the absorbance falls within the expected range, move on to step 7.
7. Begin pulling the vacuum and wait for it to drop below 100 microns (13.3 Pa). After this point, set the temperature to 25 °C and make sure that the temperature is fully equilibrated before starting the run. Note that this process can take several hours, so the equilibration step can be performed overnight with a scheduled sample run the following morning.
8. Set up the experimental parameters. The rotor speed should be set to 50,000 rpm (this is the recommended rpm for complexes between 30 and 300 kDa, which is the range most Zα/RNA complexes will be within). The measurement time for a sedimentation velocity experiment should be 2–12 h depending on the protein size. Zα/RNA complexes are fairly small (ranging from ~18 to 100 kDa, but the complexes we studied were usually below 50 kDa), so we generally plan for longer run times to ensure full sedimentation of the sample (see Note 14).
9. Once all parameters have been set up, the experiment can be run.
10. Calculate the buffer density using SEDNTERP version 3 (by John S. Philo at Alliance Protein Laboratories). SEDNTERP will determine the density and/or viscosity of a buffer after entering its composition. Since the partial-specific volumes of RNA and protein are different, it may be necessary to calculate an average value corresponding to the composition of the Zα/RNA complex (see Note 15).
11. Load the scans from the sedimentation velocity runs and fit the data using SEDFIT version 14.7g (NIH).
Resources for how to do this can be found on the SEDFIT website (https:// sedfitsedphat.nibib.nih.gov/software/default.aspx) or here (http://www.analyticalultracentrifugation.com/sedfit.htm). Remember to input the buffer density calculated from SEDNTERP and the average partial-specific volume calculated from the percentage of Zα and RNA in the complex. 12. Once the data is fit, compare the measured molecular weight to the theoretical molecular weight from different Zα/RNA complexes to determine the stoichiometry. For example, the major peak measured from AUC for the (CpG)3 RNA bound to Zα was 19.6 kDa confirming the 2:1 complex with a

Adoption of A–Z Junctions in RNAs by Binding of Zα Domains

265

Fig. 4 AUC of heat annealed and unannealed (CpG)3 RNA bound to Zα. Sedimentation coefficient distributions obtained by AUC for the (CpG)3 RNA with a molar ratio of 1:6 RNA/Zα. The blue and red plots show the distributions acquired when the (CpG)3 RNA was not heat annealed and heat annealed prior to measurement, respectively

theoretical molecular weight of 18.8 kDa (Fig. 4). If the fitted molecular weight suggests a different complex stoichiometry than was assumed for the calculation of the partial-specific volume, recalculate the partial-specific volume using the new stoichiometry, and redo the analysis in SEDFIT. The same process can be used to investigate the stoichiometry of any Zα/RNA complex amenable to AUC and can be particularly useful when attempting to determine the stoichiometry of Zα/RNA complexes involving complex sequence contexts where the binding sites may not be obvious. Example AUC data from Zα binding to the (CpG)3, an A–Z junction control, and AluSx1Jo RNA are shown (Fig. 5a). The expected molecular weight of the Zα/RNA complexes versus the measured molecular weight is indicated (Fig. 5b).
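The stoichiometry bookkeeping in steps 10–12 can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the published protocol: the function names are hypothetical, and the component masses (7.3 kDa for Zα, 4.2 kDa for the (CpG)3 RNA) and partial-specific volumes (0.73 and 0.61 mL/g) are the values quoted in Note 15. It computes the theoretical mass and the mass-weighted partial-specific volume of an n:1 Zα/RNA complex and reports which n lies closest to a fitted mass. It does not replace the fit in SEDFIT.

```python
"""Mass and partial-specific volume bookkeeping for Zα/RNA complexes (illustrative sketch)."""

VBAR_PROTEIN = 0.73  # mL/g, generic protein value quoted in Note 15
VBAR_RNA = 0.61      # mL/g, generic RNA value quoted in Note 15


def complex_mass(n_protein, protein_kda, rna_kda):
    """Theoretical mass (kDa) of n_protein copies of Zα bound to one RNA."""
    return n_protein * protein_kda + rna_kda


def complex_vbar(n_protein, protein_kda, rna_kda):
    """Mass-weighted partial-specific volume (mL/g) of the complex."""
    total = complex_mass(n_protein, protein_kda, rna_kda)
    protein_fraction = n_protein * protein_kda / total
    return protein_fraction * VBAR_PROTEIN + (1.0 - protein_fraction) * VBAR_RNA


def closest_stoichiometry(measured_kda, protein_kda, rna_kda, max_protein=6):
    """Return n for the n:1 Zα:RNA complex whose theoretical mass is closest to the fit.

    max_protein is an arbitrary upper bound chosen for this sketch.
    """
    return min(
        range(1, max_protein + 1),
        key=lambda n: abs(complex_mass(n, protein_kda, rna_kda) - measured_kda),
    )


if __name__ == "__main__":
    za_kda, rna_kda = 7.3, 4.2  # Zα and (CpG)3 RNA masses from Note 15
    n = closest_stoichiometry(19.6, za_kda, rna_kda)
    print(f"closest complex: {n}:1 Zα:RNA")
    print(f"theoretical mass: {complex_mass(n, za_kda, rna_kda):.1f} kDa")
    print(f"partial-specific volume: {complex_vbar(n, za_kda, rna_kda):.3f} mL/g")
```

If the closest stoichiometry differs from the one assumed when the partial-specific volume was first entered, the recalculated value can be fed back into SEDFIT and the fit repeated, as described in step 12.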


Parker J. Nichols et al.

Fig. 5 Analytical ultracentrifugation to determine the stoichiometry of A–Z junction/Zα complexes. (a) Sedimentation coefficient distributions (normalized c(s) versus sedimentation coefficient, S) obtained by AUC for (CpG)3, A–Z junction control, and AluSx1Jo RNA. The insets show the raw data from the AUC run with the window position on the x-axis and the absorbance on the y-axis and individual scans over time going from left to right. (b) Predicted molecular weights for the different Zα/RNA complexes and measured molecular weights from AUC:

Complex identity              Expected MW (kDa)   Measured MW (kDa)
2:1 Zα:(CpG)3                 18.8                19.6
2:1 Zα:A–Z junction control   23.4                25.9
2:1 Zα:AluSx1Jo               22.6                21.3

3.7 Nuclear Magnetic Resonance to Monitor Zα-Dependent Switch from A- to Z-Form

While CD, ITC, and AUC can be used to determine whether an RNA adopts a Z-conformation, as well as the stoichiometry and thermodynamics of binding, relatively simple NMR experiments can be used to determine exactly which part of an RNA adopts the Z-conformation, in addition to the adjacent regions that are destabilized to accommodate A–Z junction formation [26]. For these experiments, we rely heavily on the foundational NMR characterization of B–Z junctions carried out previously, which demonstrated specific NMR signatures for Z-DNA formation and junction destabilization [18, 25]. Specifically, it was shown that at early titration points, specific imino peaks within a 1D 1H spectrum begin to disappear into the noise, indicating destabilization of the base pairs adjacent to the Z-DNA-forming region [18, 25]. This is followed by chemical shift perturbations (CSPs) and line broadening at later titration points, indicating Z-DNA formation [18, 25]. We observed similar signatures when titrating Zα into our test ribosomal hairpins and AluSx1Jo RNA foldback fragment [26], suggesting that a conformational switch from the A- to the Z-form was occurring. We were then able to determine which regions within the secondary structure of the RNAs were being bound by Zα, along with the neighboring destabilized regions [26].
1. Access to a high-field solution-state NMR spectrometer is needed, preferably with a 1H/13C/15N cryoprobe and associated equipment (Bruker or Varian).
2. Prepare the RNA sample at a reasonable concentration in 300 μL (for Shigemi tubes, Wilmad-LabGlass) or 500 μL (for 5 mm NMR tubes, Wilmad-LabGlass). For assignment of the


imino peaks, a sample concentration of at least 500 μM is recommended to acquire enough signal to noise to observe all the cross-peaks in the 2D [1H,1H]-NOESY spectra.
3. Concentrate Zα to at least 2 mM (the higher, the better, to limit dilution of the RNA sample upon titration) if it is not already at that concentration.
4. Measure a 2D [1H,1H]-NOESY spectrum, making sure not to use a water suppression method that saturates the water signal, as the imino proton signals will also be suppressed. We usually use W5, WATERGATE, or flip-back water suppression schemes. Make sure that the spectral widths are set to cover the entire imino spectral range, which is about 9–16 ppm. The NOESY mixing time depends on the size of the RNA, but a range of 160–320 ms is generally appropriate. A detailed description for measuring this experiment can be found in Ref. [41].
5. The 2D [1H,1H]-NOESY data can be processed using NMRPipe [42] (a detailed tutorial for how to do this can be found here: https://spin.niddk.nih.gov/NMRPipe/doc1/), and the cross-peaks between imino peaks can be used to assign the base-pairing pattern of the RNA through what is known as a “NOESY walk” [41] (Fig. 6a). Note that less stable parts of the RNA helix, such as terminal and noncanonical base pairs, exchange with water on a faster timescale than stable ones and therefore will have less signal. The signal of such base pairs may be improved by decreasing the temperature and increasing the concentration.
6. Once the imino peaks have been assigned, they can be used to track A–Z junction formation upon binding to Zα through a 1D imino titration (Fig. 6b). Since only 1D 1H spectra are needed for this, the concentration of the RNA can be decreased to ~100 μM, but we recommend using a 5 mm tube to limit volume loss during the titration (some volume is lost each time the plunger is removed from a Shigemi tube).
7. Measure a 1D 1H imino spectrum of the free RNA at the new concentration.
Again, make sure to use a W5, WATERGATE, or flip-back water suppression scheme and that the spectral width is large enough to cover the entire imino proton region.
8. The choice of titration points is somewhat subjective (make sure to choose a good range of RNA/Zα ratios), but we do recommend including these ratios: free, 4:1, 2:1, 1:1, 1:2, and 1:4 RNA/Zα. Titrate Zα into the RNA to reach the selected molar ratio, making sure to incubate the sample at 42 °C for 30 min afterward. Start with the largest ratio of RNA/Zα first (e.g., 8:1 RNA/Zα) and move downward to the smallest ratio.


Fig. 6 Nuclear magnetic resonance to determine regions adopting A–Z junctions, Z-conformation, or being destabilized by Zα binding. (a) The imino regions of the 2D [1H,1H]-NOESY spectrum with a mixing time of 300 ms for the A–Z junction control RNA are shown. Imino proton connectivities (the “NOESY walk”) and assignments are shown with dashed lines and illustrated as red lines on the 2D secondary structure of the A–Z junction. (b) The imino regions of the 1D 1H titration are shown for the A–Z junction control and AluSx1Jo RNAs. Imino proton assignments are indicated with dashed lines. The ratio of RNA/Zα is indicated on the right-hand side of each trace

9. Continue the process until 1D 1H imino spectra are measured for each titration point.
10. Load the 1D 1H imino spectra into data analysis software such as TopSpin version 4.0.7 (Bruker), VNMRJ version 4.2 Revision A (Agilent), NMRPipe [42], or others, to analyze the data. At early titration points, the base pairs adjacent to the Z-forming region become destabilized to allow the adoption of the A–Z junction conformation. As a result, their imino peaks disappear into the noise with increasing amounts of Zα (Fig. 6b). At later titration points (1:1, 1:2, 1:4 RNA/Zα, etc.), chemical shift perturbations and line broadening of the imino peaks are observed, indicating Z-RNA formation. By using the assignment of the imino peaks, these changes can be directly correlated to the RNA 2D structure, and the regions that adopt the Z-conformation and the regions that are destabilized can be determined [26].
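The volumes of Zα stock needed at each titration point in steps 6–9 follow from simple mole accounting, sketched below in Python. This is a planning aid under stated assumptions rather than part of the published protocol: the function name and dictionary keys are hypothetical, the example uses the ~100 μM RNA in 500 μL and 2 mM Zα stock mentioned in the steps above, and it ignores any sample lost during handling.

```python
"""Plan cumulative Zα additions for a 1D imino titration (illustrative sketch)."""


def titration_plan(rna_um, start_volume_ul, stock_mm, za_per_rna):
    """Return one dict per titration point.

    rna_um          starting RNA concentration (µM)
    start_volume_ul starting sample volume (µL)
    stock_mm        Zα stock concentration (mM, numerically equal to nmol/µL)
    za_per_rna      Zα equivalents per RNA in increasing order,
                    e.g. 0.25 for a 4:1 RNA/Zα ratio
    """
    rna_nmol = rna_um * start_volume_ul / 1000.0  # µM × µL = pmol; /1000 gives nmol
    plan = []
    dispensed_ul = 0.0
    for equivalents in za_per_rna:
        cumulative_ul = equivalents * rna_nmol / stock_mm  # total stock needed so far
        total_ul = start_volume_ul + cumulative_ul
        plan.append({
            "za_per_rna": equivalents,
            "add_ul": cumulative_ul - dispensed_ul,             # volume to add at this point
            "total_ul": total_ul,                               # sample volume after addition
            "rna_um": rna_um * start_volume_ul / total_ul,      # diluted RNA concentration
        })
        dispensed_ul = cumulative_ul
    return plan


if __name__ == "__main__":
    # RNA/Zα ratios of 8:1, 4:1, 2:1, 1:1, 1:2, and 1:4 expressed as Zα per RNA
    points = [0.125, 0.25, 0.5, 1.0, 2.0, 4.0]
    for row in titration_plan(100.0, 500.0, 2.0, points):
        print(
            f"Zα/RNA {row['za_per_rna']:>5}: add {row['add_ul']:6.2f} µL, "
            f"volume {row['total_ul']:6.1f} µL, RNA {row['rna_um']:5.1f} µM"
        )
```

The diluted RNA concentration in the last column makes it easy to see why a concentrated Zα stock is recommended: the more dilute the stock, the larger the added volume and the weaker the imino signal at late titration points.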

4 Notes

1. For the most part, Zα domains are stable and therefore relatively forgiving proteins to work with. In our hands, the Zα domain from human ADAR1 (recombinantly expressed in BL21(DE3) E. coli) can be stored in the freezer in low salt buffers (we usually use 20 mM potassium phosphate at pH 6.4, 25 mM NaCl) at high concentration (ZαADAR1 can be concentrated past 10 mM if desired) for long periods of time (about a year) without significant degradation. In addition, freeze–thaw cycles do not appear to degrade ZαADAR1 or alter its structure or function (as judged by CD and NMR). ZαADAR1 alone and bound to RNA is stable at room temperature for several weeks (as monitored by NMR), but it will begin to degrade after 3–6 months if kept unfrozen. Owing to the high sequence and structural homology of Zα domains, it is likely that the observed properties of ZαADAR1 apply to Zα domains from other species and proteins, but this should not be assumed.
2. Purifying RNA by denaturing PAGE often involves an ultraviolet (UV) shadowing step to mark the band of interest and excise it from the gel. This step can cause significant chemical damage to the RNA [43] and therefore should be avoided if possible (through chromatography purification methods), or the exposure of the RNA band to the UV light should be kept as short as possible. RNAs that are purified by denaturing PAGE [32, 34] should be subjected to an additional round of chromatography purification before high-sensitivity experiments such as NMR in order to remove any residual contamination from the gel itself (Fig. 7). This can be done in 2–3 h via weak anion exchange chromatography (using DEAE-Sepharose, MilliporeSigma) followed by buffer exchange to remove the high salt from the elution buffer.
3. CD, AUC, and NMR are, or involve, forms of absorbance spectroscopy, and therefore some buffer compositions are not appropriate.
For CD and AUC, buffering and reducing agents that absorb in the UV range should be avoided, such as TRIS, MOPS, citrates, imidazole, and DTT. For NMR, any buffer component at high concentration that contains protons will yield extremely strong peaks in the spectrum that can interfere with the analysis of the NMR data. In addition, high salt concentrations (>200 mM) will severely attenuate the NMR signals.
4. These buffers can be made by making 500 mM stocks of potassium phosphate dibasic (K2HPO4) and potassium phosphate monobasic (KH2PO4) and then diluting each into the

Fig. 7 1D NMR spectra from RNA post denaturing PAGE purification and post denaturing PAGE and anion exchange chromatography purification. Trace labels: "post PAGE purification, pre anion exchange purification" and "post PAGE and anion exchange purification". Annotations: "contamination peaks" and "RNA signal"

desired buffer volume in a ratio that gives the correct pH (as reported in potassium phosphate pH tables). For example, to get 20 mM potassium phosphate at pH 6.4 in 1 L, the ratio is 72.2% monobasic and 27.8% dibasic, so the final amounts to add from the 500 mM stocks would be 28.88 mL of monobasic and 11.12 mL of dibasic. NaCl in powder form (or from a concentrated stock) can then be added to reach the desired concentration. A concentrated stock of EDTA should be made separately, pH adjusted (using NaOH) to either 6.4 or 7.0, and diluted to reach the final concentration of 0.5 mM. Finally, add water until 1 L is reached.
5. When preparing for ITC experiments, we do not recommend exchanging buffer using Amicon centricons due to the high sensitivity of the technique to buffer mismatches between the cell and injection samples. Instead, it is best to dialyze both the Zα protein and RNA in the same 2 L beaker overnight (in separate dialysis bags) to ensure that the buffers are completely matched.
6. Before proceeding to any experiment, it is important to make sure that the RNAs to be tested are in the correct buffer and heat annealed to promote the correct formation of secondary and tertiary structure (especially if RNAs are thawed from frozen stocks). For example, we ran two AUC experiments for Zα bound to the (CpG)3 RNA repeat: one in which the RNA was not heat annealed before incubation with protein and one in which we denatured it at 95 °C for 5 min followed by cooling on ice for 20 min. For the heat-annealed condition, we observed a peak centered at 19.6 kDa which corresponds to a 2:1 protein/RNA complex (theoretical molecular weight of


18.8 kDa), whereas for the unannealed condition, two peaks were present, with the peak corresponding to a molecular weight of 35.6 kDa making up the majority of the sample (Fig. 4). This higher molecular weight complex was likely formed from tiling of the (CpG)3 RNAs to form longer helical structures, which was prevented by the refolding step. Refolding of RNAs is case-dependent and may need to be optimized. For small RNA hairpins and duplexes, it is usually sufficient to heat at 95 °C for 5 min and cool rapidly (~2 min) on ice.
7. For longer RNAs and those involving complex tertiary interactions, a slower annealing process is recommended, such as allowing the RNA to slowly cool to the correct temperature over a period of 45 min, for example, in a thermocycler. The buffer conditions may need to be optimized in addition to the annealing process to ensure good results, including optimizing divalent metal and salt concentrations [44]. In addition to AUC, RNA folding can be monitored through native polyacrylamide gel electrophoresis (PAGE) [45].
8. It is critical for any equilibrium measurement that the Zα/RNA mixtures are incubated at 42 °C for at least 30 min before measurement in order to promote formation and stabilization of the Z-conformation within the RNA. Z-RNA formation is highly temperature dependent because the conformational flip from A- to Z-form has a high activation energy barrier (for RNA, not DNA) [11]. At 25 °C, conversion of the r(CpG)6 sequence from the A- to the Z-form proceeds slowly at ~12.5% conversion per hour. In comparison, ~100% conversion is observed in about 10 min after incubation at 42 °C [11]. Note that the temperature and time dependence of non-(CpG)n Z-forming sequences has not been investigated and therefore could deviate from the standard 42 °C for 30 min. It may be important to investigate and optimize incubation times and temperatures when working with unique sequences. For example, incubating the AluSx1Jo and h43 E. coli RNAs at 42 °C for 10 min was enough to promote complex formation.
9. Keep in mind that for CD experiments, a higher concentration will only help the signal-to-noise ratio to a certain point, beyond which the difference between the right- and left-polarized light (Δε, which is measured) will become so small compared to the absorbance of the sample that it will be dominated by the noise. We carried out all our measurements on a Jasco J-815 CD Spectropolarimeter (Jasco) using a 1 mm quartz cuvette (Jasco). We found that an RNA concentration of 50 μM in a ~175 μL volume was usually sufficient to yield good


data, but this will likely have to be optimized on a case-by-case basis.
10. For circular dichroism experiments, three controls are needed in order to properly interpret the data. First, make sure that the buffer by itself contributes minimally to the CD spectrum by measuring it over the same range as the actual experiment. If there is some minor absorbance, baseline correction can be done by subtracting the buffer spectrum from the experimental one to correct for as much of the buffer absorbance as possible. Second, an 8 M stock of sodium perchlorate (NaClO4) is needed to create the positive control sample, which has the RNA in 6 M sodium perchlorate in order to induce the Z-conformation [46]. Finally, a sample with Zα by itself at the highest concentration to be used in the titration is also important, as Zα has a minor contribution to the signal in the 220–320 nm range (although this is usually negligible up to concentrations of 300 μM).
11. ITC can be carried out either with RNA in the ITC cell and Zα at a high concentration in the syringe or with the opposite configuration. Depending on availability of sample, it is usually easier and less expensive to have RNA in the cell and Zα at high concentration in the syringe since Zα can be made through recombinant expression.
12. We would like to stress that the complex thermodynamics of Zα binding to A–Z junctions uncovered by ITC also carries a risk of erroneously fitting the data, and the results should be interpreted cautiously if used in isolation. It is best to confirm the KD and stoichiometry using other techniques, such as AUC and NMR.
13. Leaky cells are most often caused by a problem with the contact between the centerpiece and the windows. Any small amount of dust, scratches, oils, etc. can easily compromise this seal, leading to leaking of the sample from the cell. If the absorbance is too high, the sample can be removed from the cell using a syringe (with a high-gauge needle), diluted with buffer to obtain an appropriate absorbance, and reloaded into the cell.
14. The number of scans to measure depends on how many samples are being measured at the same time. For a single sample, we usually measure about 270 scans (as each scan takes about 0.8 min, the duplicate measurement takes 1.6 min, and when multiplied by 270, it gives a total measurement time of ~7 h). If two samples are being measured, the number of scans needs to be halved to achieve the same measurement time. For three samples, we would measure 1/3 of the scans.


15. Since the stoichiometry is likely unknown going into an AUC experiment, start with a best guess, and then adjust the partial-specific volume depending on the results of the fit. The partial-specific volumes for proteins and RNA are 0.73 mL·g⁻¹ and 0.61 mL·g⁻¹, respectively. The partial-specific volume for a Zα/RNA complex can therefore be calculated by weighting the partial-specific volumes of protein and RNA alone by the mass fraction of Zα and RNA that makes up the complex. As an example, the fully saturated Zα/RNA complex for the (CpG)3 RNA is 2:1, meaning one complex has two Zα proteins and one RNA. The molecular weights of Zα and the (CpG)3 RNA are 7.3 kDa and 4.2 kDa, respectively, for a total combined molecular weight of 18.8 kDa. Therefore, Zα makes up 78% of the complex, while the (CpG)3 RNA makes up 22%. Multiply the partial-specific volumes for protein and RNA by these fractions and sum the products (0.78 × 0.73 mL·g⁻¹ + 0.22 × 0.61 mL·g⁻¹) to obtain the final partial-specific volume, which for this case would be approximately 0.703 mL·g⁻¹ (using the unrounded mass fractions).

References

1. D’Ascenzo L, Leonarski F, Vicens Q, Auffinger P (2016) ‘Z-DNA like’ fragments in RNA: a recurring structural motif with implications for folding, RNA/protein recognition and immune response. Nucleic Acids Res. https://doi.org/10.1093/nar/gkw388
2. Harvey SC (1983) DNA structural dynamics: longitudinal breathing as a possible mechanism for the B in equilibrium Z transition. Nucleic Acids Res 11:4867–4878
3. Wang AHJ et al (1979) Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature. https://doi.org/10.1038/282680a0
4. Rich A, Zhang S (2003) Z-DNA: the long road to biological function. Nat Rev Genet. https://doi.org/10.1038/nrg1115
5. Herbert A (2019) Z-DNA and Z-RNA in human disease. Commun Biol. https://doi.org/10.1038/s42003-018-0237-x
6. Herbert A, Rich A (1996) The biology of left-handed Z-DNA. J Biol Chem. https://doi.org/10.1074/jbc.271.20.11595
7. Chiang DC, Li Y, Ng SK (2021) The role of the Z-DNA binding domain in innate immunity and stress granules. Front Immunol. https://doi.org/10.3389/fimmu.2020.625504
8. Lushnikov AY et al (2004) Interaction of the Zα domain of human ADAR1 with a negatively supercoiled plasmid visualized by atomic force microscopy. Nucleic Acids Res. https://doi.org/10.1093/nar/gkh810
9. Herbert A et al (1998) The Zα domain from human ADAR1 binds to the Z-DNA conformer of many different sequences. Nucleic Acids Res. https://doi.org/10.1093/nar/26.15.3486
10. Schwartz T, Rould MA, Lowenhaupt K, Herbert A, Rich A (1999) Crystal structure of the Zalpha domain of the human editing enzyme ADAR1 bound to left-handed Z-DNA. Science 284:1841–1845
11. Brown BA, Lowenhaupt K, Wilbert CM, Hanlon EB, Rich A (2000) The Zα domain of the editing enzyme dsRNA adenosine deaminase binds left-handed Z-RNA as well as Z-DNA. Proc Natl Acad Sci U S A. https://doi.org/10.1073/pnas.240464097
12. Placido D, Brown BA, Lowenhaupt K, Rich A, Athanasiadis A (2007) A left-handed RNA double helix bound by the Zα domain of the RNA-editing enzyme ADAR1. Structure. https://doi.org/10.1016/j.str.2007.03.001
13. Kruse H, Mrazikova K, D’Ascenzo L, Sponer J, Auffinger P (2020) Short but weak: the Z-DNA lone-pair·π conundrum challenges standard carbon van der Waals radii. Angew Chem Int Ed. https://doi.org/10.1002/anie.202004201
14. Chung H et al (2018) Human ADAR1 prevents endogenous RNA from triggering translational shutdown. Cell. https://doi.org/10.1016/j.cell.2017.12.038
15. Feng S et al (2011) Alternate rRNA secondary structures as regulators of translation. Nat Struct Mol Biol. https://doi.org/10.1038/nsmb.1962
16. Dickerson RE et al (1982) The anatomy of A-, B-, and Z-DNA. Science. https://doi.org/10.1126/science.7071593
17. Ha SC et al (2009) The structures of non-CG-repeat Z-DNAs co-crystallized with the Z-DNA-binding domain, hZαADAR1. Nucleic Acids Res. https://doi.org/10.1093/nar/gkn976
18. Lee YM et al (2013) NMR investigation on the DNA binding and B-Z transition pathway of the Zα domain of human ADAR1. Biophys Chem. https://doi.org/10.1016/j.bpc.2012.12.002
19. Lee YM et al (2012) NMR study on the B-Z junction formation of DNA duplexes induced by Z-DNA binding domain of human ADAR1. J Am Chem Soc. https://doi.org/10.1021/ja211581b
20. Kim D et al (2018) Sequence preference and structural heterogeneity of BZ junctions. Nucleic Acids Res. https://doi.org/10.1093/nar/gky784
21. Sung CH, Lowenhaupt K, Rich A, Kim YG, Kyeong KK (2005) Crystal structure of a junction between B-DNA and Z-DNA reveals two extruded bases. Nature. https://doi.org/10.1038/nature04088
22. Kim D et al (2009) Base extrusion is found at helical junctions between right- and left-handed forms of DNA and RNA. Nucleic Acids Res. https://doi.org/10.1093/nar/gkp364
23. Lee EH et al (2010) NMR study of hydrogen exchange during the B-Z transition of a DNA duplex induced by the Zα domains of yatapoxvirus E3L. FEBS Lett. https://doi.org/10.1016/j.febslet.2010.10.003
24. Lee AR et al (2019) NMR dynamics study reveals the Zα domain of human ADAR1 associates with and dissociates from Z-RNA more slowly than Z-DNA. ACS Chem Biol. https://doi.org/10.1021/acschembio.8b00914
25. Jeong M et al (2014) NMR study of the Z-DNA binding mode and B-Z transition activity of the Zα domain of human ADAR1 when perturbed by mutation on the α3 helix and β-hairpin. Arch Biochem Biophys. https://doi.org/10.1016/j.abb.2014.06.026
26. Nichols PJ et al (2021) Recognition of non-CpG repeats in Alu and ribosomal RNAs by the Z-RNA binding domain of ADAR1 induces A-Z junctions. Nat Commun. https://doi.org/10.1038/s41467-021-21039-0
27. Schwartz T et al (1999) Proteolytic dissection of Zab, the Z-DNA-binding domain of human ADAR1. J Biol Chem. https://doi.org/10.1074/jbc.274.5.2899
28. Brunelle JL, Green R (2013) In vitro transcription from plasmid or PCR-amplified DNA. Methods Enzymol. https://doi.org/10.1016/B978-0-12-420037-1.00005-1
29. Scott LG, Hennig M (2008) RNA structure determination by NMR. Methods Mol Biol. https://doi.org/10.1007/978-1-60327-159-2_2
30. Jeng S, Gardner J, Gumport R (1992) Transcription termination in vitro by bacteriophage T7 RNA polymerase. J Biol Chem
31. Beckert B, Masquida B (2011) Synthesis of RNA by in vitro transcription. Methods Mol Biol. https://doi.org/10.1007/978-1-59745-248-9_3
32. Edwards AL, Garst AD, Batey RT (2009) Determining structures of RNA aptamers and riboswitches by X-ray crystallography. Methods Mol Biol. https://doi.org/10.1007/978-1-59745-557-2_9
33. Francis AJ, Resendiz MJE (2017) Protocol for the solid-phase synthesis of oligomers of RNA containing a 2′-O-thiophenylmethyl modification and characterization via circular dichroism. J Vis Exp. https://doi.org/10.3791/56189
34. Petrov A, Wu T, Puglisi EV, Puglisi JD (2013) RNA purification by preparative polyacrylamide gel electrophoresis. Methods Enzymol. https://doi.org/10.1016/B978-0-12-420037-1.00017-8
35. Easton LE, Shibata Y, Lukavsky PJ (2010) Rapid, nondenaturing RNA purification using weak anion-exchange fast performance liquid chromatography. RNA. https://doi.org/10.1261/rna.1862210
36. Kim I, McKenna SA, Puglisi EV, Puglisi JD (2007) Rapid purification of RNAs using fast performance liquid chromatography (FPLC). RNA. https://doi.org/10.1261/rna.342607
37. Miyahara T, Nakatsuji H, Sugiyama H (2016) Similarities and differences between RNA and DNA double-helical structures in circular dichroism spectroscopy: a SAC-CI study. J Phys Chem A. https://doi.org/10.1021/acs.jpca.6b08023
38. Freyer MW, Lewis EA (2008) Isothermal titration calorimetry: experimental design, data analysis, and probing macromolecule/ligand binding and kinetic interactions. Methods Cell Biol. https://doi.org/10.1016/S0091-679X(07)84004-0
39. Cole JL, Lary JW, Moody P, Laue TM (2008) Analytical ultracentrifugation: sedimentation velocity and sedimentation equilibrium. Methods Cell Biol. https://doi.org/10.1016/S0091-679X(07)84006-4
40. Balbo A, Zhao H, Brown PH, Schuck P (2010) Assembly, loading, and alignment of an analytical ultracentrifuge sample cell. J Vis Exp. https://doi.org/10.3791/1530
41. Fürtig B, Richter C, Wöhnert J, Schwalbe H (2003) NMR spectroscopy of RNA. ChemBioChem 4:936–962
42. Delaglio F et al (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J Biomol NMR 6:277–293
43. Kladwang W, Hum J, Das R (2012) Ultraviolet shadowing of RNA can cause significant chemical damage in seconds. Sci Rep. https://doi.org/10.1038/srep00517
44. Edelmann FT, Niedner A, Niessing D (2014) Production of pure and functional RNA for in vitro reconstitution experiments. Methods. https://doi.org/10.1016/j.ymeth.2013.08.034
45. Woodson SA, Koculi E (2009) Analysis of RNA folding by native polyacrylamide gel electrophoresis. Methods Enzymol. https://doi.org/10.1016/s0076-6879(09)69009-1
46. Klump HH, Jovin TM (1987) Formation of a left-handed RNA double helix: energetics of the A-Z transition of poly[r(G-C)] in concentrated sodium perchlorate solutions. Biochemistry 26:5186–5190

Chapter 19

Detecting Z-RNA and Z-DNA in Mammalian Cells

Chaoran Yin, Ting Zhang, and Siddharth Balachandran

Abstract

Eukaryotic cells sense and respond to virus infections by detecting conserved virus-generated molecular structures, called pathogen-associated molecular patterns (PAMPs). PAMPs are usually produced by replicating viruses, but not typically seen in uninfected cells. Double-stranded RNA (dsRNA) is a common PAMP produced by most, if not all, RNA viruses, as well as by many DNA viruses. DsRNA can adopt either the right-handed (A-RNA) or the left-handed (Z-RNA) double-helical conformation. A-RNA is sensed by cytosolic pattern recognition receptors (PRRs) such as the RIG-I-like receptor MDA-5 and the dsRNA-dependent protein kinase PKR. Z-RNA is detected by Zα domain-containing PRRs, including Z-form nucleic acid binding protein 1 (ZBP1) and the p150 subunit of adenosine deaminase RNA specific 1 (ADAR1). We have recently shown that Z-RNA is generated during orthomyxovirus (e.g., influenza A virus) infections and serves as an activating ligand for ZBP1. In this chapter, we describe our procedure for detecting Z-RNA in influenza A virus (IAV)-infected cells. We also outline how this procedure can be used to detect Z-RNA produced during vaccinia virus infection, as well as Z-DNA induced by a small-molecule DNA intercalator.

Key words Z-RNA, Z-DNA, ZBP1, ADAR1, Influenza A virus, Vaccinia virus, Necroptosis

1 Introduction

Mammalian cells have in place a number of innate immune mechanisms to sense the presence of intracellular viruses [1]. Many innate immune PRRs detect double-stranded RNA (dsRNA), a common PAMP produced during the replication of RNA and DNA viruses, but not normally found in the cytosol or nucleoplasm of uninfected cells. DsRNA can adopt two conformations, the right-handed (A-form) double helix, called A-RNA, and the left-handed (Z-form) double helix (Z-RNA) [2] (see Fig. 1). Cytosolic A-RNA is a well-known PAMP, sensed by dsRNA-binding domain (dsRBD)-containing proteins, such as PKR and oligoadenylate synthetase 1 (OAS1) [3]. A-RNA is also sensed by RIG-I-like receptors (RLRs), via their

Authors Chaoran Yin, Ting Zhang have equally contributed to this chapter. Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_19, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023


Chaoran Yin et al.

Fig. 1 Structures of A-RNA and Z-RNA. Handedness and canonical sensors of each double-helical conformation are shown below each structure. A-RNA is the right-handed conformation of dsRNA and is detected by RLRs, TLR3, and PKR, among other host sensors. Z-RNA is the left-handed dsRNA duplex, detected by ZBP1

helicase domains [3]. Less well known is Z-RNA. Because Z-RNA is not a bioenergetically favorable conformation of dsRNA, it was not thought to readily occur in nature [4, 5]. However, Z-RNA, but not A-RNA, is selectively bound by proteins with Zα domains. In vertebrates, three proteins, protein kinase Z (PKZ) [6], ADAR1 p150 [7, 8], and ZBP1 [9, 10], possess Zα domains, and all three have been implicated in antiviral responses [11]. Moreover, Z-RNA has been detected in the cytoplasm of fixed protozoan cells by immunofluorescence microscopy [12]. Together, these observations suggest that Z-RNAs are produced during virus infections and that immunofluorescence-based in situ approaches to detect Z-RNA in fixed eukaryotic cells are feasible. We have recently shown that Z-RNA is produced during orthomyxovirus [13] and vaccinia virus [14] infections. IAVs produce Z-RNA in infected nuclei during replication [13], and vaccinia virus generates cytoplasmic Z-RNA [14]. In this chapter, we describe our immunofluorescence microscopy-based procedure for detecting Z-RNA in orthomyxovirus-infected mammalian cells. We also outline how the protocol can be adapted to detect Z-RNA in the cytoplasm during poxviral infection, as well as Z-DNA triggered by the B → Z conversion of genomic dsDNA by the small molecule CBL0137 [15].

2 Materials

2.1 Culturing Mouse Embryonic Fibroblasts (MEFs) or L929 Cell Line

1. Early-passage (P < 5) primary wild-type (e.g., C57BL/6) MEFs (see Note 1).
2. L929 cell line (ATCC).
3. Complete DMEM for primary MEFs: Dulbecco’s Modified Eagle Medium supplemented with 15% heat-inactivated fetal bovine serum (FBS), 1% L-glutamine, 1% sodium pyruvate, 1% penicillin–streptomycin, and 1% amphotericin B.
4. Complete EMEM for L929 cells: Eagle’s Minimum Essential Medium (EMEM) supplemented with 5% heat-inactivated fetal bovine serum (FBS), 0.1 mM nonessential amino acid solution (GIBCO), and 50 mg/mL gentamycin.
5. Eight-well glass slides (EMD Millipore) (see Note 2).
6. Glass coverslips.

2.2 Virus Infection or Treatment with CBL0137

1. IAV strain A/Puerto Rico/8/1934 (PR8) (see Note 3).
2. Vaccinia virus strain WR (ATCC) (see Note 4).
3. CBL0137 (Cayman Biologicals) (see Note 5).

2.3 Immunofluorescence Microscopy

1. Phosphate-buffered saline (PBS), 1×: 137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, pH 7.4.
2. PBST: 0.1% Tween-20 in PBS.
3. 3% BSA: 3% BSA in PBST. Store at 4 °C.
4. Proteinase K (New England Biolabs): 0.008 U/mL in PBS.
5. MAXblock Blocking Medium (Active Motif).
6. 4% paraformaldehyde: 4% (w/v) paraformaldehyde in PBS.
7. 0.2% Triton X-100: 0.2% (v/v) Triton X-100 in PBS.
8. RNase A (Thermo Fisher Scientific).
9. RNase III (Thermo Fisher Scientific).
10. DNase I (Thermo Fisher Scientific).
11. Anti-Z-RNA antibody: rabbit monoclonal anti-Z-NA, clone Z22 (Absolute Antibody).
12. Anti-A-RNA antibody: mouse monoclonal anti-dsRNA, clone 9D5 (EMD Millipore).
13. Rabbit IgG Isotype Control antibody (Thermo Fisher Scientific).
14. Mouse IgG Isotype Control antibody (Thermo Fisher Scientific).
15. Primary antibody solution: Dilute anti-Z-RNA antibody at 1:200 with 3% BSA. Dilute anti-A-RNA antibody at 1:50 with 3% BSA. Dilute IgG control antibody at 1:1000 with 3% BSA.

280

Chaoran Yin et al.

16. Donkey anti-Rabbit IgG (H + L) Highly Cross-Adsorbed Secondary Antibody, Alexa Fluor Plus 594 (Invitrogen). 17. Donkey anti-Mouse IgG (H + L) Highly Cross-Adsorbed Secondary Antibody, Alexa Fluor Plus 488 (Invitrogen). 18. Secondary antibody solution: Prepare secondary antibody solution by diluting fluorophore-conjugated secondary antibodies at 1:500 in PBST. 19. Mounting medium: ProLong Gold antifade reagent (Thermo Fisher Scientific). 20. DAPI (Thermo Fisher Scientific): 300 nM in PBS. 21. Hoechst 33342 (Thermo Fisher Scientific): 20 mM in PBS.

3 Methods

3.1 Influenza A Virus Infection

1. Plate 4 × 10⁴ primary MEFs per well on slides with 200 μL complete DMEM, using one slide per experimental condition. Gently rock the slides back and forth to spread the cells evenly, and incubate overnight in a humidified incubator maintained at 37 °C and 5% CO2.
2. The next day, when cells have reached 60–80% confluence, replace the medium with prewarmed serum-free DMEM containing virus inoculum at the desired multiplicity of infection (MOI), in the minimum volume sufficient to cover the cell monolayer (see Note 6). Incubate cells with virus inoculum for 1 h in a humidified incubator maintained at 37 °C and 5% CO2, rocking frequently (every 5–10 min) to keep the cells from drying out.
3. Aspirate the virus inoculum and gently wash the cells once with PBS. Add 200 μL complete growth medium to the cells and incubate them for 6–18 h (see Note 7).
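The inoculum volume in step 2 follows from simple arithmetic: the number of infectious units needed is the MOI multiplied by the number of cells in the well. The sketch below illustrates this calculation. The function name and the stock titer of 1 × 10⁸ PFU/mL are hypothetical and are not taken from this protocol, and the cell number is assumed to equal the number plated (cells may have divided overnight, so counting a spare well would give a better estimate).

```python
def inoculum_volume_ul(cells_per_well, moi, titer_pfu_per_ml):
    """Volume of virus stock (in microliters) supplying MOI x cells PFU.

    All three inputs are user-supplied; the titer must be measured for
    each virus stock (it is not specified in this protocol).
    """
    if cells_per_well <= 0 or moi <= 0 or titer_pfu_per_ml <= 0:
        raise ValueError("cells, MOI, and titer must all be positive")
    pfu_needed = cells_per_well * moi           # infectious units required
    volume_ml = pfu_needed / titer_pfu_per_ml   # mL of undiluted stock
    return volume_ml * 1000.0                   # convert mL to uL


# Hypothetical example: 4e4 cells per well, MOI of 10, stock at 1e8 PFU/mL.
stock_ul = inoculum_volume_ul(4e4, 10, 1e8)
print(f"Add {stock_ul:.1f} uL of stock and bring to the final volume "
      f"with serum-free DMEM")
```

The computed stock volume is then diluted in serum-free DMEM to the per-well inoculum volume used for infection.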

3.2 Immunofluorescence Detection of Z-RNA

1. At the desired time point, aspirate the growth medium, and fix the cells by incubating with freshly prepared 4% paraformaldehyde for 10 min at room temperature. 2. Wash the samples three times, each time with 200 μL PBS. 3. Permeabilize cells with 0.2% Triton X-100 for 15 min at room temperature (see Note 8). 4. Wash samples three times, each time with 200 μL PBS. 5. Incubate fixed and permeabilized cells with proteinase K at 37 °C. Check the morphology of cells under a light microscope every 5 min until the outline of the nuclear membrane becomes sharp-edged and clear. At this stage, typically about 20–40 min after exposure to proteinase K, wash out proteinase K. (See step 6.) Avoid overexposure to proteinase K, as this will make cells detach from the slides (see Notes 9, 10, and 11).


Fig. 2 Detection of Z-RNA, A-RNA, and Z-DNA. (A) Primary wild-type MEFs were infected with IAV (PR8, MOI = 10), fixed at 6 h postinfection, and examined for presence of Z-RNA (red) or A-RNA (green) as described in this chapter. (B) Primary wild-type MEFs were treated with CBL0137 (5 μM), fixed at 12 h posttreatment, and stained for Z-DNA (red). Nuclei are stained with DAPI (blue) and outlined with dashed white lines. Scale bar represents 10 μm

6. Wash samples three times, each time with 200 μL PBS. 7. Incubate samples in MAXblock Blocking Medium for 1 h at 37 °C. 8. After blocking, aspirate blocking medium, and add 100 μL primary antibody solution to each well. 9. Incubate samples overnight (~16 h) at 4 °C. 10. Wash cells three times with 200 μL PBS. 11. Incubate cells with 100 μL secondary antibody solution for 1 h at room temperature. Minimize exposure of the secondary antibody to ambient light during this and subsequent steps. 12. Wash samples three times with 200 μL PBS. 13. Remove the chambers from the slides, according to the manufacturer’s instructions. 14. Place a drop of ProLong Gold antifade reagent (Thermo Fisher Scientific) in the middle of each cell-bearing area, and cover with a coverslip, taking care to avoid air bubbles. As the ProLong Gold antifade reagent includes DAPI, no additional DNA staining dye is needed before mounting (see Notes 12 and 13). 15. Image stained slides by confocal microscopy on a Leica SP8 or similar instrument. We quantify fluorescence intensity using Leica LAS X software (see Fig. 2).

4 Notes

1. Primary murine embryonic fibroblasts (MEFs) are prepared from day 13.5 embryos as previously described [16]. MEFs can be substituted with other cell lines permissive for IAV replication, such as the A549, HT29, and LET1 [17] cell lines.
2. Sterile cell culture coverslips are an alternative to eight-well slides from EMD Millipore. For some cell types, coating the slides with poly-L-lysine or collagen prior to plating may facilitate adhesion of cells.
3. A/Puerto Rico/8/34(H1N1) is generated by reverse genetics as previously described [18]. Briefly, seed virus is injected into 10-day-old embryonated hen’s eggs, which are incubated at 37 °C for 48 h and chilled at 4 °C for a further 24 h. Allantoic fluid containing virus is then harvested from these eggs and characterized for titer on Madin-Darby canine kidney (MDCK) cells.
4. Vaccinia virus (VACV), a member of the poxvirus family of DNA viruses, produces Z-RNA in the cytoplasm of infected cells [14]. Proteinase K treatment is not needed for detecting Z-RNA in VACV-infected cells, although limited proteinase K digestion (20–40 min) liberates Z-RNAs from the VACV-encoded E3 protein, boosting their availability for detection by the anti-Z-NA antibody [13]. Preparation of VACV stocks is described in [13]. If using VACV to generate Z-RNA signal, plate L929 cells on slides and allow cells to attach overnight. Prepare and use VACV-containing virus inoculum at an MOI of 5, and process the slides for immunofluorescence microscopy at 4 h postinfection as described above for IAV.
5. The same protocol can be used to detect Z-DNA. The second-generation curaxin family member CBL0137 is a strong inducer of Z-DNA in cells [19]. If using CBL0137 to induce Z-DNA formation, we recommend incubating cells with CBL0137 (5 μM) for 6–12 h before fixation, permeabilization, and staining. Proteinase K treatment is not required for detecting Z-DNA in CBL0137-exposed cells (see Fig. 2B).
6. We typically prepare our virus inoculum in 100 μL serum-free medium per well and infect cells at MOIs of between 5 and 10 for a robust Z-RNA signal.
7. We typically harvest IAV-infected cells 1–2 h prior to onset of strong cytopathic effect (CPE). A classic sign of early CPE is cell rounding (but not detachment), accompanied by increased refractility. At MOI = 10, we observe CPE in MEFs by 8–12 h of infection.


8. Optimal Triton X-100 concentration is dependent on cell type. We recommend testing a range of Triton X-100 concentrations between 0.1 and 0.5% to identify one that produces the strongest signal.
9. We have found that proteinase K treatment is necessary to fully unmask Z-RNAs in IAV-infected cells, likely because these Z-RNAs are bound by cellular and/or viral proteins. The duration of proteinase K treatment can vary from 10 to 40 min, depending on cell type. It is important to continuously monitor proteinase-treated cells and prevent over-digestion of cellular structures. We suggest examining cells every 5 min by light microscopy to ensure the morphological features (especially of the nuclear envelope) of the fixed cells come into sharp relief (and remain intact) before washing out the active proteinase K.
10. If co-staining for Z-RNA and other proteins, we recommend reducing the length of proteinase K treatment to unmask sufficient Z-RNA while also preserving enough polypeptide antigenicity to allow detection of the proteins of choice.
11. To distinguish between Z-RNA and Z-DNA, treat cells with RNase A (1 mg/mL), RNase III (50 U/mL), or DNase I (25 U/mL) for 1 h at 37 °C after proteinase K treatment. RNase A and RNase III will digest Z-RNA, and DNase I will reduce or eliminate the Z-DNA signal. As the Z22 antibody detects both Z-RNA and Z-DNA, treatment with these nucleases will determine if the Z22 signal emanates from Z-RNA or Z-DNA.
12. Slides can be stored in the dark at 4 °C and will remain suitable for imaging for around 6 months after mounting.
13. If cells are mounted in a reagent not containing a DNA-binding dye, incubate cells with DAPI or Hoechst 33342 before mounting to stain nuclei.

References

1. Akira S, Uematsu S, Takeuchi O (2006) Pathogen recognition and innate immunity. Cell 124(4):783–801. https://doi.org/10.1016/j.cell.2006.02.015
2. Hall K, Cruz P, Tinoco I Jr, Jovin TM, van de Sande JH (1984) ‘Z-RNA’--a left-handed RNA double helix. Nature 311(5986):584–586. https://doi.org/10.1038/311584a0
3. Hur S (2019) Double-stranded RNA sensors and modulators in innate immunity. Annu Rev Immunol 37:349–375. https://doi.org/10.1146/annurev-immunol-042718-041356
4. Brown BA 2nd, Lowenhaupt K, Wilbert CM, Hanlon EB, Rich A (2000) The zalpha domain of the editing enzyme dsRNA adenosine deaminase binds left-handed Z-RNA as well as Z-DNA. Proc Natl Acad Sci U S A 97(25):13532–13536. https://doi.org/10.1073/pnas.240464097
5. Athanasiadis A (2012) Zalpha-domains: at the intersection between RNA editing and innate immunity. Semin Cell Dev Biol 23(3):275–280. https://doi.org/10.1016/j.semcdb.2011.11.001
6. Wu CX, Wang SJ, Lin G, Hu CY (2010) The Zalpha domain of PKZ from Carassius auratus can bind to d(GC)(n) in negative supercoils. Fish Shellfish Immunol 28(5–6):783–788. https://doi.org/10.1016/j.fsi.2010.01.021
7. Samuel CE (2011) Adenosine deaminases acting on RNA (ADARs) are both antiviral and proviral. Virology 411(2):180–193. https://doi.org/10.1016/j.virol.2010.12.004
8. Herbert A, Alfken J, Kim YG, Mian IS, Nishikura K, Rich A (1997) A Z-DNA binding domain present in the human editing enzyme, double-stranded RNA adenosine deaminase. Proc Natl Acad Sci U S A 94(16):8421–8426
9. Kuriakose T, Man SM, Malireddi RK, Karki R, Kesavardhana S, Place DE, Neale G, Vogel P, Kanneganti TD (2016) ZBP1/DAI is an innate sensor of influenza virus triggering the NLRP3 inflammasome and programmed cell death pathways. Sci Immunol 1(2). https://doi.org/10.1126/sciimmunol.aag2045
10. Thapa RJ, Ingram JP, Ragan KB, Nogusa S, Boyd DF, Benitez AA, Sridharan H, Kosoff R, Shubina M, Landsteiner VJ, Andrake M, Vogel P, Sigal LJ, tenOever BR, Thomas PG, Upton JW, Balachandran S (2016) DAI senses influenza A virus genomic RNA and activates RIPK3-dependent cell death. Cell Host Microbe 20(5):674–681. https://doi.org/10.1016/j.chom.2016.09.014
11. Chiang C, Li Y, Ng SK (2020) The role of the Z-DNA binding domain in innate immunity and stress granules. Front Immunol 11:625504. https://doi.org/10.3389/fimmu.2020.625504
12. Zarling DA, Calhoun CJ, Hardin CC, Zarling AH (1987) Cytoplasmic Z-RNA. Proc Natl Acad Sci U S A 84(17):6117–6121
13. Zhang T, Yin C, Boyd DF, Quarato G, Ingram JP, Shubina M, Ragan KB, Ishizuka T, Crawford JC, Tummers B, Rodriguez DA, Xue J, Peri S, Kaiser WJ, Lopez CB, Xu Y, Upton JW, Thomas PG, Green DR, Balachandran S (2020) Influenza virus Z-RNAs induce ZBP1-mediated necroptosis. Cell 180(6):1115–1129.e1113. https://doi.org/10.1016/j.cell.2020.02.050
14. Koehler H, Cotsmire S, Zhang T, Balachandran S, Upton JW, Langland J, Kalman D, Jacobs BL, Mocarski ES (2021) Vaccinia virus E3 prevents sensing of Z-RNA to block ZBP1-dependent necroptosis. Cell Host Microbe 29:1266. https://doi.org/10.1016/j.chom.2021.05.009
15. Zhang T, Yin C, Fedorov A, Qiao L, Bao H, Beknazarov N, Wang S, Gautam A, Williams RM, Crawford JC, Peri S, Studitsky V, Beg AA, Thomas PG, Walkley C, Xu Y, Poptsova M, Herbert A, Balachandran S (2022) ADAR1 masks the cancer immunotherapeutic promise of ZBP1-driven necroptosis. Nature 606(7914):594–602
16. Conner DA (2001) Mouse embryo fibroblast (MEF) feeder cell preparation. Curr Protoc Mol Biol, Chapter 23:Unit 23 22. https://doi.org/10.1002/0471142727.mb2302s51
17. Rosenberger CM, Podyminogin RL, Askovich PS, Navarro G, Kaiser SM, Sanders CJ, McClaren JL, Tam VC, Dash P, Noonan JG, Jones BG, Surman SL, Peschon JJ, Diercks AH, Hurwitz JL, Doherty PC, Thomas PG, Aderem A (2014) Characterization of innate responses to influenza virus infection in a novel lung type I epithelial cell model. J Gen Virol 95(Pt 2):350–362. https://doi.org/10.1099/vir.0.058438-0
18. Hoffmann E, Neumann G, Kawaoka Y, Hobom G, Webster RG (2000) A DNA transfection system for generation of influenza A virus from eight plasmids. Proc Natl Acad Sci U S A 97(11):6108–6113. https://doi.org/10.1073/pnas.100133697
19. Gasparian AV, Burkhart CA, Purmal AA, Brodsky L, Pal M, Saranadasa M, Bosykh DA, Commane M, Guryanova OA, Pal S, Safina A, Sviridov S, Koman IE, Veith J, Komar AA, Gudkov AV, Gurova KV (2011) Curaxins: anticancer compounds that simultaneously suppress NF-kappaB and activate p53 by targeting FACT. Sci Transl Med 3(95):95ra74. https://doi.org/10.1126/scitranslmed.3002530

Chapter 20 Identification of ADAR1 p150 and p110 Associated Edit Sites

Tony Sun, Brad R. Rosenberg, Hachung Chung, and Charles M. Rice

Abstract

Adenosine deaminase acting on RNA 1 (ADAR1) catalyzes adenosine-to-inosine editing on double-stranded RNA molecules and is involved in regulating cellular responses to endogenous and exogenous RNA. ADAR1 is the primary A-to-I editor of RNA in humans, and the majority of edit sites are found in a class of short interspersed nuclear elements called Alu elements, many of which are located in introns and 3′ untranslated regions. Two ADAR1 protein isoforms, p110 (110 kDa) and p150 (150 kDa), are known to be coupled in expression, and decoupling the expression of these isoforms has revealed that the p150 isoform edits a broader range of targets compared to p110. Numerous methods for identification of ADAR1-associated edits have been developed, and we present here a specific method for identification of edit sites associated with individual ADAR1 isoforms.

Key words ADAR1, A-to-I editing, RNA sequencing

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_20, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023

1 Introduction

ADAR1 is expressed as two isoforms, p150 and p110, and the larger isoform contains a Z-DNA/RNA-binding domain at its N-terminal end, allowing for binding and editing of additional RNA targets [1]. Endogenous expression of p150 in isolation is complicated by its coupled expression with p110, due to leaky ribosome scanning on the canonical p150-encoding mRNA. A genetically modified p150 open reading frame that increases translation initiation in an alternate reading frame effectively eliminates the leaky expression of p110 [2]. This modified p150 sequence, called p150r, is stably expressed in an ADAR1-KO background in order to study the editome when only p150 is present. The wild-type p150 sequence is stably expressed in an ADAR1-KO background to study the editome when both isoforms are present. Finally, the p110 sequence is stably expressed in an ADAR1-KO background to study the editome when p110 is the dominant editor. Stable expression is achieved using a previously described system utilizing the CMV promoter to drive transcription of integrated transgenes that encode the ADAR1 isoform and an RFP reporter protein via IRES-mediated translation initiation [3]. The RFP enables single-cell sorting using flow cytometry for screening of expression levels by immunoblot and subsequent clonal expansion.

Our method to identify RNA edit sites involves sequencing of RNA and comparison with a reference genomic sequence [4–6] while also taking into account mismatches introduced during preparation of RNA sequencing libraries, the error rate of the sequencing platform, and even the error rate of RNA polymerase II (10–100 errors each second in human cells) [7–11]. For the three ADAR1 isoform groups mentioned above, clones are selected in triplicate following screening by immunoblot, largely to address the issue of mismatches introduced during reverse transcription of RNA in the first step of library preparation (this issue could also be minimized by using error-correction techniques such as rolling-circle reverse transcription). The error rate of murine leukemia virus reverse transcriptase (RT) is 1/37,000 [12]. We use the Invitrogen Superscript III RT, which is a genetically modified version of the Moloney murine leukemia virus RT [13]. The error rate of Superscript III is between 1/15,000 and 1/32,000, depending on temperature and other variables [14]. Downstream of reverse transcription, PCR errors are rare, but unequal amplification of cDNA fragments and sequencer base-calling errors are possible, and these issues are addressed by introducing filters during variant calling. Finally, amplicon sequencing, which makes use of unique molecular tags incorporated during reverse transcription, is used to gain additional read depth at select loci and produce exact counts of edited and non-edited bases. Our method is adapted from various sources [15–20].

2 Materials

2.1 Exogenous Expression of p150, p110, and p150/p110 in ADAR1 KO Background

1. Human embryonic kidney 293T cells (ATCC ACS-4500).
2. Culture medium: Dulbecco’s Modified Eagle Medium, 10% fetal bovine serum, 1× MEM Non-Essential Amino Acids.
3. Culture materials: T25/T75/T175 flasks, 6-well/24-well/96-well flat-bottom plates, 20 μm vacuum filter flasks, poly-L-lysine, Accutase gentle dissociation media.
4. Transfection and transduction materials: Lipofectamine 2000 Transfection Reagent, polybrene, vesicular stomatitis virus glycoprotein expression plasmid, Gag-Pol polyprotein expression plasmid, pTRIP-CMV-IRES-RFP expression plasmid.
5. Western blot materials: 1× Sample Buffer, 400 mM dithiothreitol, 26G needle, 4–12% Bis-Tris gels, 1× 3-morpholinopropane-1-sulfonic acid buffer, 1× transfer buffer, methanol, Tris–glycine buffer with 5% milk and 0.1% Tween, mouse anti-ADAR1 (Santa Cruz Biotechnology D-8), HRP goat anti-mouse secondary antibody (Abcam ab97023), SuperSignal West Pico Chemiluminescent Substrate.

2.2 Total RNA Extraction and Preparation of Libraries

1. RNA extraction: molecular biology grade purified water, TRIzol Reagent, chloroform, Zymo Research Direct-zol RNA MiniPrep kit, Invitrogen first-strand cDNA synthesis kit (random hexamer, dNTP, 10× RT buffer, MgCl2, DTT, RNase OUT, SSIII, RNaseH). 2. RNA library preparation: Illumina TruSeq Stranded Total RNA kit, AMPure XP Beads (Beckman Coulter), KOD Hot Start DNA Polymerase kit.

2.3 Sequencing, Alignment, Variant Calling, and Determination of ADAR1-Associated Mismatches and Isoform-Selective Edits (Fig. 1)

1. Software: bcl2fastq, STAR 2.5.4b, SAMtools 1.9, GATK 4.0, RStudio. 2. Analysis workflow: https://github.com/adamjdluo/ADAR1_editome/.

Fig. 1 Workflow for identification of ADAR1-associated edit sites. Panels, in order: sequence total RNA; align to reference; identify mismatches; filter by depth ≥ 5; variant depth ≥ 2 (×3); identify ADAR1-associated mismatches, ƒADAR > ƒ(–) + 2SD, where ƒ = ΣG/Σ(A+C+G+T). Total RNA is sequenced and aligned to the human reference genome. This reference is used to identify mismatch sites. Filter out sites with low read counts across the samples, and identify ADAR1-associated mismatches by filtering out mismatch sites that are present in the controls
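The filters summarized in Fig. 1 (and detailed in Subheading 3.3) can be sketched for a single site as follows. This is a minimal illustration and not the authors' R workflow. The function names and the example read counts are hypothetical, and the standard deviation is assumed to be the sample standard deviation of the per-replicate frequencies in the control (firefly luciferase) group, which the text does not specify.

```python
from statistics import stdev


def pooled_frequency(replicates):
    """Mean mismatch frequency f = sum(G) / sum(A+C+G+T) across replicates.

    `replicates` is a list of (mismatch_reads, total_reads) tuples.
    """
    total = sum(depth for _, depth in replicates)
    return sum(g for g, _ in replicates) / total if total else 0.0


def is_adar1_associated(adar, control, min_depth=5, min_variant=2):
    """Apply the depth, variant-count, and mean + 2 SD filters to one site."""
    # Total read count of at least min_depth in every sample.
    if any(depth < min_depth for _, depth in adar + control):
        return False
    # Mismatched base seen at least min_variant times in each ADAR1 replicate.
    if any(g < min_variant for g, _ in adar):
        return False
    # Assumption: SD is the sample SD of per-replicate control frequencies.
    control_sd = stdev(g / depth for g, depth in control)
    threshold = pooled_frequency(control) + 2 * control_sd
    return pooled_frequency(adar) > threshold


# Hypothetical counts for one site: (G reads, total reads) per replicate.
p150_reads = [(12, 40), (9, 35), (15, 50)]
luciferase_reads = [(0, 42), (1, 38), (0, 45)]
print(is_adar1_associated(p150_reads, luciferase_reads))
```

Pooling counts before dividing weights each replicate by its depth, which matches the ΣG/Σ(A+C+G+T) definition in the figure.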

Table 1 Index primer sequences

Primer name   Sequence
i5-UDP0001    AATGATACGGCGACCACCGAGATCTACACTCGTGGAGCGTCGTCGGCAGCGTC
i7-UDP0001    CAAGCAGAAGACGGCATACGAGATCGCTCAGTTCGTCTCGTGGGCTCGG
i5-UDP0002    AATGATACGGCGACCACCGAGATCTACACCTACAAGATATCGTCGGCAGCGTC
i7-UDP0002    CAAGCAGAAGACGGCATACGAGATTATCTGACCTGTCTCGTGGGCTCGG
i5-UDP0003    AATGATACGGCGACCACCGAGATCTACACTATAGTAGCTTCGTCGGCAGCGTC
i7-UDP0003    CAAGCAGAAGACGGCATACGAGATATATGAGACGGTCTCGTGGGCTCGG
i5-UDP0004    AATGATACGGCGACCACCGAGATCTACACTGCCTGGTGGTCGTCGGCAGCGTC
i7-UDP0004    CAAGCAGAAGACGGCATACGAGATCTTATGGAATGTCTCGTGGGCTCGG
i5-UDP0005    AATGATACGGCGACCACCGAGATCTACACACATTATCCTTCGTCGGCAGCGTC
i7-UDP0005    CAAGCAGAAGACGGCATACGAGATTAATCTCGTCGTCTCGTGGGCTCGG
i5-UDP0006    AATGATACGGCGACCACCGAGATCTACACGTCCACTTGTTCGTCGGCAGCGTC
i7-UDP0006    CAAGCAGAAGACGGCATACGAGATGCGCGATGTTGTCTCGTGGGCTCGG
i5-UDP0007    AATGATACGGCGACCACCGAGATCTACACTGGAACAGTATCGTCGGCAGCGTC
i7-UDP0007    CAAGCAGAAGACGGCATACGAGATAGAGCACTAGGTCTCGTGGGCTCGG
i5-UDP0008    AATGATACGGCGACCACCGAGATCTACACCCTTGTTAATTCGTCGGCAGCGTC
i7-UDP0008    CAAGCAGAAGACGGCATACGAGATTGCCTTGATCGTCTCGTGGGCTCGG
i5-UDP0009    AATGATACGGCGACCACCGAGATCTACACGTTGATAGTGTCGTCGGCAGCGTC
i7-UDP0009    CAAGCAGAAGACGGCATACGAGATCTACTCAGTCGTCTCGTGGGCTCGG
i5-UDP0010    AATGATACGGCGACCACCGAGATCTACACACCAGCGACATCGTCGGCAGCGTC
i7-UDP0010    CAAGCAGAAGACGGCATACGAGATTCGTCTGACTGTCTCGTGGGCTCGG
i5-UDP0011    AATGATACGGCGACCACCGAGATCTACACCATACACTGTTCGTCGGCAGCGTC
i7-UDP0011    CAAGCAGAAGACGGCATACGAGATGAACATACGGGTCTCGTGGGCTCGG
i5-UDP0012    AATGATACGGCGACCACCGAGATCTACACGTGTGGCGCTTCGTCGGCAGCGTC
i7-UDP0012    CAAGCAGAAGACGGCATACGAGATCCTATGACTCGTCTCGTGGGCTCGG
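Each primer in Table 1 consists of a constant 5′ region, a ten-base index, and a constant 3′ region. The sketch below assembles and checks primers on that basis, which can help catch transcription errors when ordering oligonucleotides. The constant regions are inferred from the shared prefixes and suffixes of the Table 1 entries, and the function names are hypothetical.

```python
# Constant flanks inferred from the shared prefix/suffix of Table 1 entries.
FLANKS = {
    "i5": ("AATGATACGGCGACCACCGAGATCTACAC", "TCGTCGGCAGCGTC"),
    "i7": ("CAAGCAGAAGACGGCATACGAGAT", "GTCTCGTGGGCTCGG"),
}
INDEX_LENGTH = 10


def build_index_primer(kind, index):
    """Return prefix + index + suffix for an i5 or i7 index primer."""
    if kind not in FLANKS:
        raise ValueError("kind must be 'i5' or 'i7'")
    index = index.upper()
    if len(index) != INDEX_LENGTH or set(index) - set("ACGT"):
        raise ValueError("index must be ten bases of A, C, G, or T")
    prefix, suffix = FLANKS[kind]
    return prefix + index + suffix


def extract_index(kind, primer):
    """Recover the ten-base index from a full primer sequence."""
    prefix, suffix = FLANKS[kind]
    primer = primer.upper()
    if not (primer.startswith(prefix) and primer.endswith(suffix)):
        raise ValueError("primer does not carry the expected constant flanks")
    index = primer[len(prefix):len(primer) - len(suffix)]
    if len(index) != INDEX_LENGTH:
        raise ValueError("index region is not ten bases long")
    return index


# Rebuild i5-UDP0001 from its index and compare with the Table 1 entry.
table_entry = "AATGATACGGCGACCACCGAGATCTACACTCGTGGAGCGTCGTCGGCAGCGTC"
print(build_index_primer("i5", "tcgtggagcg") == table_entry)
```

The same check can be run over every row of Table 1 by calling `extract_index` and confirming that no two samples share an i5/i7 index pair.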

2.4 Validation of Edit Sites by Amplicon Sequencing

1. Amplicon preparation: molecular biology grade purified water, Invitrogen first-strand cDNA synthesis kit, KOD Hot Start DNA Polymerase kit. 2. Primers: see Table 1. 3. Software: dms_tools2.

3 Methods

3.1 Exogenous Expression of p150, p110, and p150/p110 in ADAR1 KO Background

1. Plate WT 293T cells at a density of 400,000 cells/well in 6-well plates coated with PLL for transfection, and allow cells to incubate for 24 h at 37 °C and 5% CO2.
2. Transfect plated cells with plasmids encoding VSVG, Gag-Pol, and the pTRIP expression plasmid with the ADAR1 sequence of interest, using Lipofectamine 2000 Transfection Reagent, following manufacturer guidelines.
3. Allow transfected cells to incubate for 72 h at 37 °C and 5% CO2.
4. Harvest supernatant of cells containing lentivirus using polybrene, DMEM, and FBS.
5. Plate ADAR1 KO 293T cells at a density of 60,000 cells/well in 24-well plates for transduction, and allow cells to incubate for 24 h at 37 °C and 5% CO2.
6. Check RFP levels using a fluorescence microscope or flow cytometry after 48 h; the goal is about 30% RFP+ cells, corresponding to a multiplicity of infection of about 1. Select lentivirus titers to create about one integration event per cell, which correlates to about 20% or less of total transduced cells in a population of 293T cells, although this number can vary depending on the transduction conditions [18, 19].
7. Prepare 96-well flat-bottom plates containing half 20 μm filtered 293T culture supernatant and half fresh 293T media (DMEM, 10% FBS, 1× NEAA). To prevent evaporation from inner wells during subsequent incubation, add 1× PBS to the outside ring of wells, and program the BD FACSAria II so that single-cell clones will not be sorted into these wells.
8. Dissociate transduced cells gently in FACS media (0.5 mL PBS/20 mM HEPES/0.1% BSA), and add DAPI at 2 ng per 1 million cells for live/dead cell gating.
9. Gate on singlets, DAPI-, RFP+ cells, and add one cell per well.
10. Add fresh 293T media to wells 24 h and 72 h after sorting.
11. After 2 weeks of incubation, visualize single colonies under light microscopy, and transfer cells by gentle pipetting to six-well plates for expansion.
12. Following expansion to 80–90% confluency, add Accutase gentle dissociation media, and split the lifted cells in half, one half for immunoblot screening and the other half for resuspension in cryopreservation media (40% DMEM, 40% FBS, and 10% DMSO). This system of storing and screening single-cell clones allows direct correlation of genotype to phenotype and thawing of desired clones.

13. Freeze by incubating vials in isopropanol-infused Mr. Frosty containers, which are designed to achieve a temperature change of about -1 °C per minute. After 24 h of cooling in a -80 °C freezer, transfer cell clones to liquid nitrogen for storage.
14. For immunoblot, lyse cells in vials using 2× SDS-PAGE Sample Buffer with 400 mM DTT as a reducing agent. Pass lysate through a 26G needle ten times to shear DNA. Boil sample (100 °C) for 10 min and centrifuge at 10,000 × g for 10 min. Load cell lysate supernatants into 4–12% Bis-Tris gels and run at 130 V in 1× MOPS buffer for 2 h at room temperature. Transfer proteins from gels onto nitrocellulose membranes at 300 mA for 2 h at 4 °C. Block membranes for 1 h at room temperature with 5% milk in TBS with 0.1% Tween 20 detergent. Incubate membranes with Santa Cruz D-8 anti-ADAR1 antibodies diluted 1:200 in blocking buffer overnight at 4 °C. After incubation, wash three times for 15 min each with TBS/0.1% Tween 20, and incubate with secondary antibodies and anti-beta actin antibodies conjugated to horseradish peroxidase (HRP) for 1 h at room temperature. Wash again three times for 15 min each with TBS/0.1% Tween 20. Incubate membranes for 5 min using Pico chemiluminescent HRP substrate, wash for 5 min with 1× PBS, expose X-ray film to membrane in a dark room, and develop film.
15. Based on immunoblot results, select clones with similar levels of p110 and p150 protein expression for editing analysis, and thaw them in T25 flasks. For example, the triplicate clones for the p150r and p110 groups should have similar levels of protein expression; levels of p110 should be similar to leaky p110 expression in WT p150.
16. Incubate thawed clones for 24 h in T25 flasks.
17. Expand thawed clones to T75 flasks. Save supernatants for mycoplasma testing.
18. Plate transduced clones at a density of 300,000 cells/well in 6-well plates (we recommend plating in duplicate for a total of 24 wells).
19. Incubate plated cells for 48 h.
3.2 Total RNA Extraction and Preparation of Libraries

1. Use molecular biology grade purified water and keep samples and reagents on ice. Minimize repeated freeze–thaw cycles to preserve RNA sample integrity.
2. To harvest RNA, aspirate supernatant and add 1 mL of TRI reagent to dissolve cells. Incubate mix at room temperature for 5 min. Add 200 μL of chloroform per 1 mL of TRI reagent, and vortex mixture for 10 s. Incubate at room temperature for 3 min. Centrifuge mixture at 12,000 × g for 15 min at 4 °C. Remove and transfer the upper aqueous phase (about 500 μL), which contains RNA, taking care not to disturb the white layer in the middle. Continue RNA extraction as directed using the Direct-zol RNA MiniPrep kit, starting by adding to the samples a volume of pure ethanol equal to that of the removed aqueous phase. Digest genomic DNA with DNase I at 28 °C for 30 min to ensure minimal genomic DNA contamination for sequencing.
3. Prepare RNA libraries as directed using the Illumina TruSeq Stranded Total RNA kit, with two modifications to the standard protocol to enrich for larger fragment sizes (about 400 bp with average 300 bp inserts): (1) reduce fragmentation time to 2 min at the 94 °C incubation step with fragmentation buffer; (2) after adaptor ligation, reduce the volume of AMPure XP PCR cleanup beads to 60% of the standard amount. Pool, dilute, and sequence using the 150-base paired-end sequencing option on the Illumina NextSeq 500 High Output and NovaSeq S1 flow cells.
4. Convert base-calling data stored in BCL files to FASTQ files, and demultiplex using bcl2fastq Conversion Software.

3.3 Sequencing, Alignment, Variant Calling, and Determination of ADAR1-Associated Mismatches and Isoform-Selective Edits

1. Align_and_varCall.R: follow GATK best practice; annotate variants using ANNOVAR. 2. rawCount_AF_AC.R: calculate total read depth (DP), mismatch read counts (AC), and mismatch frequency (AF) for each variant. 3. gather_and_process_varInfo.R: integrate information for each variant into single table; filter and group variants. 4. report_rmd.Rmd: create visualized reports. 5. Detailed description of analysis workflow: 6. Merge and align NextSeq and NovaSeq sequencing results to hg19 using STAR 2.5.4b with 2-pass mapping to identify concordant read pairs. 7. Calculate total read counts, Phred quality scores, and alignment percentages using Picard 2.18.1. 8. Select unique mappings for variant calling. 9. Convert aligned SAM files into BAM files using SAMtools 1.9. 10. Identify reference–read mismatches for each of the 12 samples (4 groups in biological triplicates: firefly luciferase, WT p150, p150r, and p110) using Mutect2 (GATK 4.0.8). 11. Select genomic positions with a single mismatch for editing analysis. Exclude other mismatches. 12. Select genomic positions with a total read count of ≥5 in all 12 samples for downstream analysis.

13. Call a mismatch site for a group if the mismatched base has a read count ≥2 in each of the biological triplicates. 14. For each mismatch, calculate a mean mismatch frequency based on read counts from the biological triplicates: ΣG/Σ(A+C+G+T). 15. Call a mismatch site from any of the three ADAR1 groups (WT p150, p150r, or p110) if the site has a mean mismatch frequency greater than two standard deviations above the mean mismatch frequency of that site in the firefly luciferase group. 16. Annotate mismatches using ANNOVAR, parsed from RefGene. 17. Exclude mismatch sites found in the dbSNP138 database. 18. To subcategorize the list of putative ADAR1-edit sites, select mismatches that have total read counts of ≥50 in all 12 samples. 19. Exclude mismatches that are not A-to-G following addition of annotated genomic strand information. 20. Call site as edited in a group if the mismatch frequency is ≥0.1 and not edited in a group if the mismatch frequency = 0.

3.4 Validation of Edit Sites by Amplicon Sequencing

1. Create amplicons starting from total RNA using gene-specific antisense primers with unique molecular identifiers (ten-base degenerate sequences) on each primer (see Note 1). 2. Use Superscript III for reverse transcription following manufacturer guidelines for gene-specific primer-based reactions. 3. For the first round of PCR using the KOD Hot Start DNA Polymerase kit, use gene-specific sense primers paired with a common primer that binds outside the unique molecular identifier sequence. PCR conditions: 2 μL cDNA, 65 °C annealing temperature, 25 cycles, and 30 s extension. 4. For the second round of PCR using the KOD Hot Start DNA Polymerase kit, use a common set of primers that add indexes for sample multiplexing and add the Illumina adapter sequences. PCR conditions: 61 °C annealing temperature, ten cycles, and 40 s extension. We recommend using unique dual-matched indexes, which increase the accuracy of assigning reads to samples by 100 times compared to single index combinations [20]. 5. Sequence using MiSeq and analyze using the bcsubamp program from dms_tools2 [21]. 6. Adapt program arguments to fit the design of the amplicons (see Note 2).
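The idea behind counting with unique molecular identifiers (UMIs) can be sketched as follows: reads sharing a UMI derive from one cDNA molecule, so each UMI contributes a single consensus base call at the site of interest. This is a simplified illustration and not the algorithm implemented in dms_tools2. The function name, the input format (UMI and read sequence pairs already trimmed and aligned to the amplicon), and the example reads are all hypothetical.

```python
from collections import Counter, defaultdict


def count_molecules(reads, site, min_reads=1, min_concordance=1.0):
    """Count consensus base calls per UMI at one 0-based amplicon position.

    `reads` is an iterable of (umi, sequence) pairs. A UMI is counted only
    if it has at least `min_reads` reads and the most common base at `site`
    makes up at least `min_concordance` of those reads.
    """
    bases_by_umi = defaultdict(list)
    for umi, sequence in reads:
        bases_by_umi[umi].append(sequence[site])

    molecule_counts = Counter()
    for bases in bases_by_umi.values():
        if len(bases) < min_reads:
            continue  # too few reads to call this molecule
        base, n = Counter(bases).most_common(1)[0]
        if n / len(bases) >= min_concordance:
            molecule_counts[base] += 1  # one count per original molecule
    return molecule_counts


# Hypothetical reads covering an A site at position 1 (G indicates editing).
example_reads = [
    ("AAACCCGGGT", "TGCA"), ("AAACCCGGGT", "TGCA"),  # edited molecule
    ("TTTGGGCCCA", "TACA"),                          # unedited molecule
    ("CCCAAATTTG", "TACA"), ("CCCAAATTTG", "TGCA"),  # discordant, dropped
]
counts = count_molecules(example_reads, site=1)
print(dict(counts))
```

Because duplicate reads collapse onto their UMI, the resulting counts of edited (G) and non-edited (A) molecules are not inflated by unequal PCR amplification.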

4 Notes

1. Sample RT and PCR Reactions for TIMM23B Gene, Sample 1 of 12 (Gene-Specific Regions of Primers Underlined)
Antisense RT primer: 5′-GGGCTCGGAGATGTGTATAAGAGACAG-NNNNNNNNNN-GCACAACTATGCACATACAC
P5-PCR1 primer1: 5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGACTATTATTTATCAATTCATTGCCTACATGTCAGC
P7-PCR1 primer2 (identical for all genes): 5′-GTCTCGT-GGGCTCGGAGATGTGTATAAGAGACAG
P5-PCR2 primer1: 5′-AATGATACGGCGACCACCGAGATCTACAC-tcgtggagcg-TCGTCGGCAGCGTC (i5-UDP0001)
P7-PCR2 primer2: 5′-CAAGCAGAAGACGGCATACGAGAT-cgctcagttc-GTCTCGTGGGCTCGG (i7-UDP0001)

2. Amplicon Analysis Arguments for TIMM23B Samples
PATH=/Users/tonysun/Library/Python/3.7/bin:$PATH
dms2_batch_bcsubamp --refseq TIMM23B.txt --alignspecs 2,432,1,11 --batchfile Batch.csv --summaryprefix TIMM23B --bclen 0 --bclen2 10 --maxmuts 50 --minreads 1 --bcinfo --R1trim 175 --R2trim 175 --minconcur 1.0 --minfraccall 0.01 --minq 20

References

1. Chung H, Calis JJ (2018) Human ADAR1 prevents endogenous RNA from triggering translational shutdown. Cell 172:811–824.e14
2. Sun T, Yu Y, Wu X (2021) Decoupling expression and editing preferences of ADAR1 p150 and p110 isoforms. PNAS 118(12):e2021757118
3. Schoggins JW et al (2011) A diverse range of gene products are effectors of the type I interferon antiviral response. Nature 472:481–485
4. Ramaswami G, Lin W, Piskol R, Tan MH, Davis C, Li JB (2012) Accurate identification of human Alu and non-Alu RNA editing sites. Nat Methods 9:579–581. https://doi.org/10.1038/nmeth.1982
5. Peng Z, Cheng Y, Tan BC, Kang L, Tian Z, Zhu Y et al (2012) Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome. Nat Biotechnol 30:253–260. https://doi.org/10.1038/nbt.2122
6. Bahn JH, Lee JH, Li G, Greer C, Peng G, Xiao X (2012) Accurate identification of A-to-I RNA editing in human by transcriptome sequencing. Genome Res 22:142–150. https://doi.org/10.1101/gr.124107.111
7. Carey LB (2015) RNA polymerase errors cause splicing defects and can be regulated by differential expression of RNA polymerase subunits. eLife 4:1–10
8. Ji J, Loeb LA (1992) Fidelity of HIV-1 reverse transcriptase copying RNA in vitro. Biochemistry 31:954–958
9. Kotewicz ML, D’Alessio JM, Driftmier KM, Blodgett KP, Gerard GF (1985) Cloning and overexpression of Moloney murine leukemia virus reverse transcriptase in Escherichia coli. Gene 35:249–258
10. Orton RJ et al (2015) Distinguishing low frequency mutations from RT-PCR and sequence errors in viral deep sequencing data. BMC Genomics 16:1–15
11. Wu NC et al (2014) High-throughput profiling of influenza A virus hemagglutinin gene at single-nucleotide resolution. Sci Rep 4:1–8
12. Doud MB, Bloom JD (2016) Accurate measurement of the effects of all amino-acid mutations on influenza hemagglutinin. Viruses 8:1–17
13. Jabara CB, Jones CD, Roach J, Anderson JA, Swanstrom R (2011) Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID. Proc Natl Acad Sci U S A 108:20166–20171
14. Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B (2011) Detection and quantification of rare mutations with massively parallel sequencing. Proc Natl Acad Sci U S A 108:9530–9535
15. Zhang TH, Wu NC, Sun R (2016) A benchmark study on error-correction by read-pairing and tag-clustering in amplicon-based deep sequencing. BMC Genomics 17:1–9
16. Hiatt JB, Patwardhan RP, Turner EH, Lee C, Shendure J (2010) Parallel, tag-directed assembly of locally derived short sequence reads. Nat Methods 7:119–122
17. Merten OW, Hebben M, Bovolenta C (2016) Production of lentiviral vectors. Mol Ther Methods Clin Dev 3:16017
18. Charrier S et al (2011) Quantification of lentiviral vector copy numbers in individual hematopoietic colony-forming cells shows vector dose-dependent effects on the frequency and level of transduction. Gene Ther 18:479–487
19. Geraerts M, Willems S, Baekelandt V, Debyser Z, Gijsbers R (2006) Comparison of lentiviral vector titration methods. BMC Biotechnol 6:1–10
20. MacConaill LE et al (2018) Unique, dual-indexed sequencing adapters with UMIs effectively eliminate index cross-talk and significantly improve sensitivity of massively parallel sequencing. BMC Genomics 19:1–10
21. Bloom JD (2015) Software for the analysis and visualization of deep mutational scanning data. BMC Bioinf 16:168

Chapter 21

Z-DNA and Z-RNA: Methods—Past and Future

Alan Herbert

Abstract

A quote attributed to Yogi Berra makes the observation that “It’s tough to make predictions, especially about the future,” highlighting the difficulties posed to an author writing a manuscript like the present. The history of Z-DNA shows that earlier postulates about its biology have failed the test of time, both those from proponents who were wildly enthusiastic in enunciating roles that till this day still remain elusive to experimental validation and those from skeptics within the larger community who considered the field a folly, presumably because of the limitations in the methods available at that time. If anything, the biological roles we now know for Z-DNA and Z-RNA were not anticipated by anyone, even when those early predictions are interpreted in the most favorable way possible. The breakthroughs in the field were made using a combination of methods, especially those based on human and mouse genetic approaches informed by the biochemical and biophysical characterization of the Zα family of proteins. The first success was with the p150 Zα isoform of ADAR1 (adenosine deaminase RNA specific), with insights into the functions of ZBP1 (Z-DNA-binding protein 1) following soon after from the cell death community. Just as the replacement of mechanical clocks by more accurate designs changed expectations about navigation, the discovery of the roles assigned by nature to alternative conformations like Z-DNA has forever altered our view of how the genome operates. These recent advances have been driven by better methodology and by better analytical approaches. This article will briefly describe the methods that were key to these discoveries and highlight areas where new method development is likely to further advance our knowledge.

Key words Z-DNA, Z-RNA, Flipons, ADAR1, ZBP1, Interferon, Necroptosis, Genetics, Bioinformatics, DeepZ, Epigenetic

1 Introduction

Those DNA segments that form alternative conformations under physiological conditions, including Z-DNA and other alternative nucleic acid conformations, have been called flipons [1]. Since these sequences are encoded genomically, they are genetic elements that are subject to natural selection. Variation in their propensity to alter DNA and RNA structure depends on a number of factors including their length, nucleotide composition, base modification, and the local action of processive enzymes like polymerases and helicases including those present in chromatin remodeling complexes. Since their action is switch-like, existing as either one conformer or the other, flipons have the propensity to localize different sets of cellular machinery to a genomic region and alter the readout of genetic information [2, 3] by digitalizing the genome [2, 4]. We have only begun to explore this new frontier. Here I review our progress so far in the study of Z-DNA and Z-RNA, describing methods that have enabled us to gain the first glimpse of the biological functions of the left-handed DNA and RNA conformations and then make some proposals for new methods that would help further progress in this field.

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6_21, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023

2 A Retrospective

Many discoveries that appear in hindsight to be an obvious linear development of previous work are not so. The retrospective interpretation fails to capture the enthusiasm [5, 6] or skepticism [7] that shaped the earlier era, where progress was made despite the lack of funding and support. The initial proposal for a left-handed helix was based on the inverted circular dichroism spectrum of polyd(C-G) observed in 3.5 M sodium chloride, which was the first evidence that a left-handed DNA existed [8]. The spectrum of the related polyd(I-C) did not invert under those conditions [8, 9]. The biological relevance of the observations was quite unknown at the time the findings were reported. However, the circular dichroism approach pioneered in this study was to become the bedrock validation of Z-DNA formation in many subsequent investigations. No further progress was made until methods became available to synthesize short oligonucleotides for crystallization. It came as a major surprise that the first DNA crystal solved should be one for left-handed DNA, named Z-DNA because of its zigzag backbone [10]. The conformation has a number of unique features that distinguish it from the right-handed B-DNA conformation. The base pairs in Z-DNA are inverted relative to B-DNA, and the backbone shape is due to the alternation of syn and anti base conformations, with the base either lying over the deoxyribose sugar moiety of the nucleotide or pointing away from it. The syn conformation carries an energetic cost, which is lower for purines than pyrimidines. The result is that Z-DNA formation requires energy and that some sequences flip from B-DNA to Z-DNA more easily than others. Repeat sequences like d(C-G)n and d(C-A)n with alternating purines and pyrimidines, where the purine is in the syn conformation, form Z-DNA more readily than other sequences [11].
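The sequence preference described above, alternating purines and pyrimidines as in d(C-G)n and d(C-A)n, can be illustrated with a minimal Python sketch that locates maximal alternating runs in a DNA string. It is a crude pattern search, not the thermodynamic scoring used by published Z-DNA prediction programs: the function name `alternating_runs` and the `min_len` cutoff are assumptions made for illustration, and no energetic terms are modelled.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def _alternates(a, b):
    """True when one base is a purine and its neighbour a pyrimidine."""
    return (a in PURINES and b in PYRIMIDINES) or (
        a in PYRIMIDINES and b in PURINES
    )


def alternating_runs(seq, min_len=8):
    """Return (start, end, subsequence) for each maximal run in which
    purines and pyrimidines strictly alternate (end is exclusive).

    min_len is a hypothetical cutoff chosen only for this illustration.
    """
    seq = seq.upper()
    runs = []
    start = 0
    for i in range(1, len(seq) + 1):
        # A run closes at the end of the sequence or when two
        # neighbouring bases fall in the same chemical class.
        if i == len(seq) or not _alternates(seq[i - 1], seq[i]):
            if i - start >= min_len:
                runs.append((start, i, seq[start:i]))
            start = i
    return runs


cg_hits = alternating_runs("AATTCGCGCGCGCGAAAA")
ca_hits = alternating_runs("CACACACACA")
```

A run found this way is only a candidate: whether it actually flips depends on the energetic factors discussed in the text, such as the cost of the syn conformation and the available supercoiling.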
Raman spectroscopy studies confirmed that crystallized Z-DNA was the same left-handed helix observed in the circular dichroism studies [12]. The choice of sequence in both studies had been fortuitous and was based on a pyrimidine/purine dinucleotide repeat of alternating d(CG) that favored Z-DNA formation. The increased accessibility to C8 and N7 positions of purine was proposed to favor sequence-specific protein binding while also increasing susceptibility to mutagens that modified these residues. Subsequent studies revealed that Z-DNA could be stabilized under physiological conditions by chemical bromination [13]. The Z-DNA polymer produced enabled discovery of anti-Z-DNA antibodies in serum of patients with systemic lupus erythematosus [14], showing that the formation of the left-handed DNA conformation occurs in living organisms. This finding reinforced other work with polyd(G-m5C)·d(G-m5C) revealing that 5-methyl-cytosine stabilized Z-DNA formation under low salt conditions in the presence of physiological levels of cations [15]. The method that really brought Z-DNA into a biological context was the demonstration that the negative supercoiling present in closed circular DNA plasmids could power the flip from B-DNA to Z-DNA. The technique relied on two-dimensional gel electrophoresis to resolve individual topoisomers. The transition of a DNA segment from right-handed to left-handed DNA produced a characteristic hump in the arch of topoisomers that was absent when a control plasmid with no Z-DNA-forming segment was run under the same conditions [16]. The change in plasmid mobility reflected the change in twist of the DNA from +10.4 bases per helical turn to -12 bases per turn in Z-DNA, absorbing 1.65 supercoils from the plasmid with 0.05 due to formation of the B-Z DNA junction. The method provided an estimate of the energetic cost of the flip from B-DNA to Z-DNA for sequences with different nucleotide sequences and different junctions. The free energies calculated enabled a computational approach for analyzing genomic sequences, revealing that sequences with a propensity to form Z-DNA were not randomly dispersed throughout the genome. Instead, they were frequently present in promoters [11, 17].
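The way a B-to-Z flip relaxes negative supercoils follows from the topological relation Lk = Tw + Wr: in a closed circular plasmid the linking number is fixed, so a drop in twist must be matched by an equal rise in writhe. The Python sketch below is a back-of-envelope calculation using the helical repeats quoted above (10.4 bp per right-handed turn and 12 bp per left-handed turn). The function name and the example segment lengths are assumptions, junction contributions are deliberately ignored, and the output is not intended to reproduce the specific values reported from the gel experiments.

```python
def twist_change(n_bp, b_repeat=10.4, z_repeat=12.0):
    """Change in twist (in helical turns) when n_bp base pairs flip
    from right-handed B-DNA to left-handed Z-DNA.

    Junction effects are not modelled in this sketch.
    """
    twist_b = n_bp / b_repeat    # right-handed turns before the flip
    twist_z = -n_bp / z_repeat   # left-handed turns after the flip
    return twist_z - twist_b


# Hypothetical segment lengths chosen only for illustration.
for n in (12, 24, 36):
    d_tw = twist_change(n)
    # With Lk fixed, writhe changes by -d_tw, relaxing negative supercoils.
    print(f"{n} bp flipped: dTw = {d_tw:.2f} turns, dWr = {-d_tw:.2f}")
```

Because each flipped base pair removes roughly 0.18 turns of twist, even a short Z-forming segment can absorb several negative supercoils, which is why the transition is visible as a shift in topoisomer mobility.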
The search for Z-DNA-binding proteins was challenging as Z-DNA is not the only alternative conformation present in long polymers of brominated polyd(C-G). Single-stranded loops, bulges, and cross-over structures are also present. Also, it was not apparent at the time whether Z-DNA-binding proteins bound in a sequence-specific manner as initially proposed [10], or whether binding depended upon base modifications such as methylation [15]. Fortuitously, a band-shift assay using a 30–40-base pair hemi-brominated probe incorporating 5-bromo-cytosine enabled the purification of the first Z-DNA-binding protein from normal tissue and the identification of the Zα family of proteins [18–20]. The assay was designed to be stringent using a 20,000-fold excess of high molecular weight B-DNA as an unlabeled competitor. By its nature, the approach excluded any protein that bound both B-DNA and Z-DNA as these would be trapped in the wells of the gel. The assay also depended on a slow off-rate from Z-DNA as the magnesium required to induce Z-DNA formation by the probe was only present in the incubation buffer but not in the gel, so stabilization of the complex was protein dependent. The method provided an easy check for specificity of binding through the use of unlabeled Z-form brominated polyd(C-G) and B-form polyd(C-G) as competitors, as well as by plasmids with or without a Z-DNA-containing segment, or by a Z-DNA-forming plasmid in the presence or absence of a topoisomerase that relaxed the plasmid to the B-DNA conformation. The approach required isolation of sufficient Z-DNA-binding protein to allow identification of its peptide sequence by Edman degradation. Initial experiments detected the presence in chicken blood of a Z-DNA-binding domain. I called this domain Zα hoping there would be a Zβ, a Zγ, etc. [21]. The protein subsequently purified from chicken lung was identified by sequence homology as ADAR1 [22], using the human and rat entries that were luckily released by NCBI just a few weeks previously. Once the human Zα domain was mapped to the N-terminus of ADAR1, circular dichroism proved that the domain was capable of inducing the Z-DNA conformation in polyd(C-G). The domain sequence allowed identification of the Zα family that includes ZBP1 (Z-DNA-binding protein 1), poxvirus protein E3 (encoded by E3L), Z-DNA-binding protein kinase (PKZ), and the cyprinid herpesvirus 3 ORF112. Further analysis of the family revealed that the domain contained a helix–turn–helix motif [20]. By varying the probe sequence in the band-shift assay, it was possible to show that the binding of Zα was structure-specific. This finding was confirmed by construction of a Z-DNA-specific Zα nuclease [23, 24] that was structure-specific, not sequence-specific [25]. Other studies revealed that a six-base pair Z-DNA segment was bound by two Zα proteins [26]. Mutagenesis identified the Zα residues essential for binding Z-DNA [27].
The various features of Zα identified biochemically were confirmed by co-crystallization of Zα with Z-DNA [28], then with Z-RNA [29] and by solution studies of Zα by NMR [30]. Subsequent work characterized more Zα family members by crystallization, including ZBP1 [31, 32] and those from other species [33–35] and two from viruses [33, 36–38]. Co-crystals with human ADAR Zα and DNA revealed the structure of the junction between B-DNA and Z-DNA segments (the B–Z junction) [39] and two out-of-alternation Z-DNA segments (the Z–Z junction) [40]. Further characterization of Z-formation in the presence of Zα by DNA, RNA, and DNA–RNA hybrids using FRET, NMR, and magnetic tweezers has enriched the information on the energetics of the B–Z transition. Functional assays have demonstrated the ability of the Zα domain to regulate reporter gene expression [41, 42]. These studies were consistent with reports that Z-DNA formation in the c-myc and corticotrophin promoters is driven by transcription [43, 44] and that Z-DNA-forming sequences regulate CSF1, HO1, and ADAM12 gene expression in cells [45–48]. Other roles have been found for Z-RNA in influenza and vaccinia viral infections [49–51] and in the regulation of interferon and cellular stress granule-mediated responses [52–54]. Z-DNA formation also impacts fear conditioning in mice in a manner dependent on ADAR1 [55]. The toolbox is now well equipped to study this biology further, but new methods to study Z-DNA and Z-RNA formation in real time are needed to capture the dynamic nature of these events in the context that they occur.

3 The Biology of Z-DNA

Biological functions have now been identified for a number of Z-DNA- and Z-RNA-binding proteins. The list includes ADAR1, ZBP1, E3, PKZ, and ORF112. ADAR1 is involved in suppression of type I interferon responses, while ZBP1 induces the programmed cell death pathway called necroptosis and is negatively regulated by viral E3 [56, 57]. PKZ is an interferon-induced protein kinase found in fish that inhibits translation [58] through a pathway that may not involve phosphorylation of eukaryotic translation initiation factor 2α (EIF2α) [59, 60]. The cyprinid herpesvirus 3 ORF112 is an inhibitor of PKZ. The discovery of these roles for ADAR1, ZBP1, and E3 proteins resulted from a combination of approaches with confirmation using the power of human and mouse genetic methodology to answer difficult questions. Genetic approaches are also necessary to increase our understanding of the role of PKZ in fish. Through the outcomes these proteins induce, it is now possible to tie the fleeting formation of Z-DNA and Z-RNA to persistent biochemical outcomes such as adenosine to inosine editing of double-stranded RNA (dsRNA) and phosphorylation of protein effectors like receptor-interacting protein (RIP) kinase 3 and EIF2α. These long-lived modifications provide proxies for the transient induction in vivo of Z-DNA and Z-RNA by different environmental perturbations.

4 The Zα Family Domain Structure

Bioinformatic analysis reveals that the Zα domain (Pfam PF02295) is present in proteins from 236 species, while a recently proposed Z-RNA protein from T. brucei (XP_823025.1), identified by a PSI-BLAST Zα homology search, is not a family member and requires further characterization [61]. The family contains proteins where the Zα domain is fused to other domains that perform a wide array of functions. In addition to the deaminase domain present in ADAR1 and found in distant species like sea urchins, Tropilaelaps mites, and the European centipede (http://pfam.xfam.org/, accession PF02295), the RIP homotypic interaction motif (RHIM) present in ZBP1, and the protein kinase domain in PKZ (fish), other domains are fused to a PF02295 family member, including the transcriptional coactivator p15 domain from picoplanktonic green alga, the sigma54 transcriptional domain found in the DNA translocase FtsK (Lapidilactobacillus), and an adenosine monophosphate nucleotide domain from Dinoflagellate. Only ADAR1, ZBP1, PKZ, and the E3 proteins from the vaccinia and yata poxviruses and the carp herpesvirus ORF112 protein have been extensively characterized biophysically (Fig. 1). They share a number of invariant features (Fig. 2):

1. An α1–β1–α2–α3–β2–β3 winged helix–turn–helix topology (wHTH) [20, 28, 30].
2. A conserved asparagine at position “n” on the α3 recognition helix.
3. A conserved tyrosine at position “n+4” on the α3 recognition helix.
4. A CH–π interaction between the α3 tyrosine and any DNA or RNA base in the syn conformation (usually involving guanosine C8) that underlies the Z-specificity of the interaction (Fig. 1) [28].
5. A conserved tryptophan in the β2–β3 wing that both stabilizes the fold and orients the tyrosine correctly for interaction with the Z-helix through an edge-to-face contact (Fig. 3). The tryptophan can also increase the energy of binding by making water-mediated contact with the phosphate backbone. Replacement by phenylalanine significantly diminishes Z-DNA binding [63].

Features that vary both the affinity and the ability to induce a B-DNA to Z-DNA transition include:

1. Positively charged residues at “n-4,” “n-3,” and “n+8” of the α3 recognition helix (or at “n-3” of the 3₁₀ helix of ZBP1 Zα2). These residues, such as lysine and arginine, lie on the same helical face as the conserved asparagine and increase Z-DNA on-rates [27].
2. Positively charged residues at α3 “n+1” that make base-specific contacts with the same nucleic acid residue that forms a CH–π bond with a tyrosine from an adjacent Zα monomer [38]. This contact likely stabilizes the complex.
3. Positively charged residues in β1 or in the β2–β3 wing that alter the kinetics of the right-handed to left-handed flip [34, 35, 64].
4. Helix α2 contacts with the Crick strand. Longer Z-DNA substrates reveal that PKZ residues in helix α2 contact the second DNA strand [33], with evidence they stabilize the Z-DNA conformation formed by the binding of the first Zα molecule, facilitating binding of a second Zα molecule, a model consistent with kinetic data from solution studies [65, 66] (Fig. 3).

Fig. 1 The domain structure of Zα family proteins. The Zα domains whose interactions with Z-DNA have been characterized structurally are shaded in light yellow. The ADAR1 Zα variants that are causal for Aicardi–Goutières syndrome have dark yellow letters. Zβ is not known to bind Z-DNA. Its recognition helix lacks a tyrosine residue critical for high-affinity binding to Z-DNA. Both ADAR1 isoforms share three RNA-binding domains (dsRBD) and a deaminase domain that acts on adenosine in double-stranded RNA to form inosine. The p150 and p110 isoforms are transcribed from different promoters (PA for p150 and PB and PC for p110). ADAR1 null alleles exist that lead to nonsense-mediated decay of the p150 transcript but do not affect the level of p110 mRNA. These null alleles create a haploid transcriptome that enabled the mapping of Zα variants to disease outcomes in Aicardi–Goutières syndrome. The domain structure of ZBP1 is also illustrated. ZBP1 has two Zα domains and two receptor-interacting protein homotypic interaction motifs (RHIM; colored in blue) critical for the interaction with RIP kinase 3 that activates the programmed cell death pathway of necroptosis. E3 is produced by poxviruses and inhibits ZBP1. The PKZ protein from bony fish contains Zα domains fused to a protein kinase domain (shown in pale green) that phosphorylates the eukaryotic translation initiation factor 2α (EIF2α) in fish and inhibits protein translation. The action of PKZ is opposed by the cyprinid herpesvirus 3-encoded ORF112 protein that has only a Zα domain
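The positional rules for the invariant residues (asparagine at “n”, tyrosine at “n+4”, and a tryptophan further downstream in the β2–β3 wing) can be expressed as a simple scan over a protein sequence. The Python sketch below is a toy filter and not a validated domain detector: the function name, the `trp_window` bounds, and the filler sequence are all hypothetical. The window was chosen only so that it brackets the 22-residue spacing between Asn173 and Trp195 given for human ADAR1 Zα in Fig. 2.

```python
def find_recognition_helix_candidates(protein, trp_window=(10, 30)):
    """Return (asn_index, tyr_index, trp_index) tuples (0-based) where
    N sits at n, Y at n+4, and a W lies between n+lo and n+hi inclusive.

    trp_window is an assumption made for this illustration.
    """
    lo, hi = trp_window
    hits = []
    for n, residue in enumerate(protein):
        if residue != "N":
            continue
        if n + 4 >= len(protein) or protein[n + 4] != "Y":
            continue
        last = min(n + hi, len(protein) - 1)
        trp_positions = [j for j in range(n + lo, last + 1) if protein[j] == "W"]
        if trp_positions:
            hits.append((n, n + 4, trp_positions[0]))
    return hits


# Artificial sequence (not a real protein): N at index 5, Y at 9, W at 27,
# which mimics the n, n+4, n+22 spacing described for human ADAR1 Zα.
toy = "A" * 5 + "N" + "AAA" + "Y" + "A" * 17 + "W" + "AAA"
toy_hits = find_recognition_helix_candidates(toy)
```

A match from such a scan says nothing about fold or binding; the variable features listed above, and the wing structure in particular, determine whether a candidate actually recognizes the Z-conformation.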

Fig. 2 The winged helix–turn–helix fold of the human ADAR1 Zα. The conserved residues asparagine 173 (Asn), tyrosine 177 (Tyr), and tryptophan 195 (Trp) are shown along with the proline 193 (Pro) that is a frequent variant in the world population and causal for Aicardi–Goutières syndrome when paired with a null allele in compound heterozygotes [62]. (Source PDB:1qbj [28]). Other residues on α3 and the β2–β3 wing of ADAR1 Zα are involved in binding Z-DNA. Such residues vary between Zα family members

5 Making a Z-DNA-Binding wHTH from One That Binds B-DNA

Following these rules, the B-DNA-binding globular domain of histone H5 wHTH (GH5) [67] was converted to a Z-DNA-binding protein [63]. The affinity (given as Kd) for Z-DNA was measured at 1.8 ± 0.2 μM, while that for B-DNA was 31 ± 2 μM (compared to 0.60 ± 0.07 μM and 35 ± 3 μM for human ADAR1 Zα binding to Z-DNA and B-DNA, respectively, measured under the same conditions). The alteration in the binding interaction reflected three changes: the altered wing structure that sterically hindered the B-DNA but not the Z-DNA interaction; the binding to only one strand of the Z-DNA helix, in contrast to the binding to both strands of B-DNA by GH5; and the opposite orientation of the α3 helix when bound to Z-DNA compared to B-DNA. While affinities characterize the overall interaction between a protein and its ligand, they do not capture the effect of mutations on the kinetics of binding. A protein that has both fast on- and off-rates can have the same Kd as one with slow on- and off-rates.

Fig. 3 Contacts present in the cyprinid herpesvirus 3 ORF112 Zα dimer. The specificity of Zα for Z-DNA is due to the CH–π bond between the conserved α3 tyrosine and C8 of the guanine (the position is given relative to the conserved asparagine denoted by “n”). The guanosine is in the syn conformation. The tyrosine present in chain 1 is oriented by the edge-to-face contact with the conserved β3 tryptophan from chain 1. The figure illustrates how the three ring systems lie on different axes, with the plane of the guanine ring parallel to the x-axis, tyrosine to the y-axis, and tryptophan to the z-axis. The tyrosine also hydrogen bonds to the DNA backbone through its hydroxyl group (not shown). The chain 2 Zα monomer also makes base-specific contacts with the guanine ring bound by chain 1 through an arginine at n+1 of the chain 2 α3 helix that helps stabilize the interactions. (Source PDB:4WCG [38])

While superficially surprising, this fact follows from the way the Kd is defined. It is the ratio koff/kon (so a protein with an on-rate of 10⁶ M⁻¹ s⁻¹ and an off-rate of 1 s⁻¹ has the same Kd as a protein with an on-rate of 1 M⁻¹ s⁻¹ and an off-rate of 10⁻⁶ s⁻¹). So, while affinities of binding to B-DNA and Z-DNA may appear in the same range, it is possible to make a protein that goes on and comes off B-DNA quickly but binds Z-DNA slowly and dissociates slowly. The difference creates a window where a Z-DNA segment stabilized by a Zα domain may persist over time, while the B-DNA interaction permits a rapid search for sequences with a propensity to adopt the left-handed conformation. Such a multistep process is similar to the one for binding of B-DNA sequence-specific proteins to their cognate binding sites [68]. At one extreme, the Zα on-rate may be so slow that no significant docking to Z-DNA occurs. This appears true for vaccinia E3, for which no inversion of the circular dichroism spectrum due to E3 Zα has been published [63, 64]. Binding of E3 to a preformed brominated polyd(C-G) Z-DNA substrate does however occur, and the Kd is 120 nM versus 57 nM for ADAR1 Zα in that study [64]. For other Zα domains, the number of monomers required to energetically stabilize a Z-DNA segment varies with the Zα sequence, which affects the on-rate, and with the DNA sequence, which changes the energetics of the flip [66, 69]. Current models favor an initial binding of Zα to a B-DNA conformation, followed by the flip to Z-DNA rather than the capture of Z-DNA that forms stochastically [63, 65, 66, 70]. The subsequent binding of additional Zα proteins to the complex is then to a preformed substrate. The initial contact with B-DNA and initiation of binding is likely dependent on the number of residues of the binding surfaces of the recognition helix and the wing. For the goldfish C. auratus PKZ, the B-to-Z transition rates of the wild-type and mutant Zα are proportional to kon and to the number of positively charged residues interacting with Z-DNA [34]. The formation of on-DNA dimers by the monomers, as seen with ORF112, then can affect koff by further stabilizing the Z-DNA helix [38]. Further, the second monomer can directly stabilize interaction between the first monomer and Z-DNA through contacts with the guanine ring that forms the CH–π bond with the α3 tyrosine of the first monomer (Fig. 3) [38]. In ORF112, such a contact is made with an arginine that hydrogen bonds with O6 and N7 of the purine ring. These base-specific contacts would also help stabilize an inosine residue, the product of ADAR1 editing, in the Z-RNA conformation, promoting further editing of the dsRNA by promoting dimer formation by ADAR1. The roles of the invariant tyrosine that confers Z-DNA specificity (Kd = 40 nM) have been studied by mutation [30].
In human ADAR1 Zα, a Y177A mutant has a Kd = 700 nM, while for a Y177F mutant, the Kd = 350 nM [49]. In contrast, a Y145A mutant of Zα2 from ZBP1 abolishes binding completely, probably since the wing lacks prolines to structure the binding surface [32]. All the mutants cause inversion of the circular dichroism spectrum. The results suggest that this residue is key to inducing the initial flip to the left-handed conformation. Other residues also play a role. This idea is supported by the solution structure of the vaccinia E3 Zα. The conserved tyrosine is rotated away from the Z-DNA-binding face and is not oriented correctly to form the CH–π bond with the guanine ring. In the case of E3, it is likely that the engagement of the β2–β3 wing reorients the α3 tyrosine to initiate stable binding to the left-handed helices. This outcome highlights the importance of the wing in promoting Z-DNA- and Z-RNA-specific binding. In many Zα family members, one or two adjacent prolines between the β2 and β3 strands are conserved and may help structure the wing and orient the binding surface to decrease the entropic cost of binding Zα to Z-DNA or Z-RNA, increasing kon. The modifications to GH5 that convert it to a Z-DNA-binding protein provide a model to engineer other Z-DNA binders. The sequence variation and the mutational analysis of the Zα family both provide a guide for producing proteins with the desired Z-DNA and Z-RNA binding properties. Particularly useful is the example of vaccinia E3 that binds only to preformed Z-DNA but does not induce its formation. Constructs based on E3 have the potential to localize the sites of physiological Z-formation within a cell, especially those containing allosteric switches that can be activated by light or ligands [71]. The other variable that can be engineered, as shown with the work on GH5, is the relative on- and off-rates for B-DNA and Z-DNA. It is probable that nature has already performed such experiments and that dual B-DNA- and Z-DNA-binding proteins exist that bind to activated genes, where Z-DNA is stabilized by negative supercoiling, in a manner that differs from their interaction with an inactive gene trapped in the B-DNA conformation. With current approaches, only the B-DNA structural isomer of these proteins has been characterized by crystallographic approaches. It is also likely that members of this class of protein that bind both B-DNA and Z-DNA would not have been found using the assay that led to the initial discovery of Zα.
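The kinetic argument made in this section, that Kd is the ratio koff/kon and therefore cannot distinguish a fast binder from a slow one, can be checked numerically. The Python sketch below uses the two rate pairs and the Kd values quoted in the text. The function names are assumptions introduced for illustration, and the mean residence time of a complex is taken as 1/koff, the standard result for first-order dissociation.

```python
import math


def dissociation_constant(k_on, k_off):
    """Kd in M, from k_on in M^-1 s^-1 and k_off in s^-1."""
    return k_off / k_on


def mean_residence_time(k_off):
    """Mean lifetime of the complex in seconds (first-order dissociation)."""
    return 1.0 / k_off


def selectivity(kd_b, kd_z):
    """Fold preference for Z-DNA over B-DNA from two Kd values (same units)."""
    return kd_b / kd_z


# Rate pairs from the text: identical Kd, very different lifetimes.
fast = {"k_on": 1e6, "k_off": 1.0}
slow = {"k_on": 1.0, "k_off": 1e-6}
kd_fast = dissociation_constant(**fast)
kd_slow = dissociation_constant(**slow)
tau_fast = mean_residence_time(fast["k_off"])   # about 1 s
tau_slow = mean_residence_time(slow["k_off"])   # about 1e6 s

# Kd values (in micromolar) quoted in the text for B-DNA and Z-DNA.
gh5_variant_selectivity = selectivity(kd_b=31.0, kd_z=1.8)
adar1_za_selectivity = selectivity(kd_b=35.0, kd_z=0.60)
```

Both rate pairs give a Kd of 1 μM, yet the second complex persists about a million times longer, which is the window of persistence the text describes. From the quoted affinities, the engineered GH5 variant prefers Z-DNA roughly 17-fold and ADAR1 Zα roughly 58-fold.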

6 Well-Characterized Zα Proteins

Within the above framework, the family members characterized so far differ in both structure and function (Fig. 1).

1. ADAR1 (encoded by the human gene ADAR and the mouse gene Adar) is a protein that also binds dsRNA and deaminates adenosines within the dsRNA helix to form inosines that are subsequently translated as guanosines [72]. The enzyme has two protein isoforms: the long form called p150 that contains the Zα domain and the shorter p110 product. Both isoforms contain the Zβ domain, which is not known to induce Z-DNA or Z-RNA formation, possibly because the tyrosine that contacts Z-DNA (tyrosine 177 in Zα) is replaced by isoleucine [73] and because an arginine at position “n+1” that otherwise can stabilize dimer formation is lost [38] (Fig. 3). Mutation of Zβ isoleucine to tyrosine (I335Y) partially restores Z-DNA binding by increasing the rate of Z-DNA formation [49, 73]. The absence of every second band in a gel shift assay when a ZαZβ dimer is compared to a ZαZα dimer is consistent with the absence of a high-affinity interaction between Zβ and Z-DNA [25]. The two ADAR1 isoforms also share three double-stranded RNA-binding domains (dsRBD) along with the deaminase domain (Fig. 1). The p150 mRNA is transcribed from a different promoter (PA) than the p110 message (PB and PC). In humans this arrangement allows production of p110 to occur independently of p150 [74]. Neither the Z-DNA-binding domain nor the dsRBDs of ADAR are necessary for editing; instead, they act to increase its efficiency [75]. Although RNA editing was thought to have an important phenotypic role in recoding proteins, deep sequencing has revealed that the substrates most commonly edited in humans by ADAR1 are in dsRNA formed by pairs of Alu repeat elements that are in opposite orientations [76–79]. When transcribed, the inverted Alu repeats fold back on one another and base pair to form an editing substrate. Subsequent analysis using computational approaches employing the Z-Hunt algorithm revealed that Alu sequences contain a Z-Box composed of sequences prone to adopt the Z-conformation [80]. The formation of Z-RNA by the Z-Box sequences was subsequently confirmed by NMR studies [81]. A role for ADAR1 in regulation of type I interferon responses was first suggested by the observation that the p150 isoform is induced by interferon [82]. The embryonic lethality of Adar-deficient mice was later shown to be due to a failure to suppress interferon signaling [83]. The rescue by the ADAR1 p150 isoform but not p110 was consistent with a role for the Zα domain in this process [84]. In humans, ADAR loss-of-function mutations were also found to be associated with the type I interferonopathy Aicardi–Goutières syndrome (AGS) type 6 [85]. Although the Zα P193A variant was frequent in disease families, it was also observed in controls, so its significance was called into question by the investigators. A subsequent analysis that focused on families transmitting an allele that expressed p110 mRNA but not p150 provided the definitive evidence that Zα variants caused disease in humans.
In this situation, one p150 allele is null, creating a haploid transcriptome. Only the other parental allele encodes the p150 protein produced by these individuals. In this case, p150 variants on the expressed allele map directly to disease outcomes [62]. The disease-causing variants, P193A (proline is changed to alanine at position 193) and N173S (asparagine is changed to serine at position 173), alter human Zα residues directly involved in binding Z-DNA [27, 28]. In other families, haploinsufficiency of ADAR1 p150 occurs when a normal allele is combined with a loss of function variant, causing a different but milder disease called dyschromatosis symmetrica hereditaria. The high frequency of the P193A variant in northern Europeans is attributed to the protective effect of higher interferon levels against viral infections when the normal

Z-DNA and Z-RNA: Methods—Past and Future

allele is paired with a variant P193A or N173S allele. In disease patients, the trigger for interferon activation appears to be dsRNA formed by inverted repeat elements within the genome. The interferon response is driven by the binding of the helicase MDA5 (melanoma differentiation-associated protein 5 encoded by IFIH1) to dsRNA that then activates interferon gene expression [86] and likely involves a contribution from the related LGP2 protein (encoded by DHX58) [87]. For host RNAs, the most likely current hypothesis is that the Alu Z-Box prevents activation of the interferon response against self RNAs [88]. The assembly of LGP2 and MDA5 on host transcripts induces Z-RNA formation by twisting and stretching the dsRNA. The Z-RNA is then bound by ADAR p150 Zα. Subsequently, editing by ADAR1 destabilizes the dsRNA, releases MDA5, and stops induction of interferon gene expression. In this situation, Alu inverted repeats act as a marker for self RNAs. Further, as discussed below, binding of Z-RNA by ADAR prevents the initiation of inflammatory cell death by ZBP1, preventing further amplification of the immune response. In AGS, loss of function mutations in Zα disrupt both mechanisms, resulting in the induction of interferon responses by host transcripts. The recognition of Alu Z-RNA forming elements by ADAR enables self–nonself discrimination while not affecting host responses against pathogens that lack such identifiers [88]. Subsequent mouse genetic studies have confirmed that equivalent P195A Zα mutations in mice are associated with loss of interferon suppression [89–91].

2. ZBP1 binds Z-DNA through its Zα2 differently from the way ADAR1 engages Z-DNA through its Zα domain. The conserved proline residues present in the ADAR1 Zα wing formed by the β2 and β3 strand interactions are absent in ZBP1. Instead, an arginine in the β1 strand provides additional contacts with the Z-DNA backbone.
This alternative mode of binding also appears to be present in the Zα2 domain of PKZ, although there is currently no structure available for this particular peptide [32]. ZBP1 has a RHIM that is found in only three other proteins: RIP kinase 1 (RIPK1), RIP kinase 3 (RIPK3), and TRIF (toll-like receptor adaptor molecule 1 encoded by TICAM1). ZBP1 knockout mice have no reported phenotype, and the gene has no reported human disease associations. Early studies focused on ZBP1-induced IRF3-mediated type I interferon responses either to cytoplasmic DNA [92] or during human cytomegalovirus infection [93]. Through its RHIM, ZBP1 interacts with the other RHIM-containing proteins that regulate nuclear factor kappa B subunit 1 (NFκB) inflammatory responses and the programmed cell death pathway of necroptosis [57, 94,

Alan Herbert

95]. When activated by Z-RNA, ZBP1 induces necroptosis through its RHIM interaction with RIPK3, as first observed during influenza infections [50, 56]. In contrast, the ZBP1 RHIM-dependent interaction with RIPK1 inhibits ZBP1/RIPK3/MLKL-dependent cell death [96]. Mice with a loss of function RHIM of RIPK1 are born but die perinatally with enhanced cell death in the skin due to the failure to suppress Zα-dependent activation of ZBP1 by endogenous RNA elements [97–104]. In RIPK1 wild-type animals, the tissue-specific deletion of the epigenetic regulator SETDB1 in intestinal stem cells also causes ZBP1-mediated necroptosis that is Zα dependent. In this model, ZBP1 is activated by RNAs from endogenous retroviruses whose expression is normally suppressed by histone methylation [105]. Necroptosis requires a catalytically active RIPK3. When the RIPK3 kinase domain is inactivated by mutation, a noninflammatory form of cell death called apoptosis occurs instead. The pathway also requires an interaction between RIPK1 and RIPK3 [106]. It is proposed that phosphorylation of the RIPK3 kinase domain loop physiologically regulates the switch between apoptosis and necroptosis [107]. The importance of the ZBP1/RIPK3/MLKL pathway in host defenses against viruses is confirmed by the different strategies viruses use to suppress activation of cell death pathways by ZBP1. As exemplified by the vaccinia poxvirus, many viruses encode Zα family member proteins that inhibit ZBP1-induced necroptosis [51]. Viruses also encode RHIM-containing proteins that inhibit ZBP1-induced necroptosis [57]. Since the Z-DNA and Z-RNA binding activities of ZBP1 are still active, the protein could potentially be repurposed by other viral proteins to regulate host and viral gene expression or to target additional Z-dependent processes. Outcomes regulated by ZBP1 that do not involve inflammatory pathways or cell death are currently under investigation [108–110].

3.
E3 proteins from vaccinia and yata poxviruses provide complementary information about the role of these proteins in viral infection. Structural studies of the yata E3 Zα domain provide information about their interaction with the Z-conformation [36], while more information about their biological roles is available from studies of vaccinia E3 [49, 51]. The co-crystal of yata E3 with Z-DNA reveals that the overall structure is very similar to that of other Zα family members [36]. Of the poxviruses tested, only yata E3 (Kd = 60 nM) has an affinity equivalent to ADAR1 Zα. Residues in the turn of the yata β2–β3 account for some of this difference in affinity. The wild-type vaccinia E3 has not been crystallized. An NMR solution study confirmed that E3 had a typical Zα fold. However,

the α3 tyrosine and the β2β3 wing were rotated away from the Z-DNA helix, a finding consistent with the failure of E3 to flip B-DNA to Z-DNA under conditions that worked with ADAR1 Zα as discussed above [111]. However, vaccinia E3 does bind preformed Z-DNA in a BIAcore assay with brominated poly[d(C-G)] (Kd = 120 nM versus 57 nM for ADAR1 Zα), indicating that although it cannot induce Z-DNA formation, reflected in part by its slow on-rate in the BIAcore assay, it does bind tightly due to a slow off-rate [64]. The role of E3 in vaccinia pathogenesis has been investigated in murine models [49, 112]. Mutations that are predicted to inhibit Z-DNA binding by E3 lower viral infectivity and virulence, as does the deletion of the entire Zα domain. Substitution of E3 by human ADAR1 Zα in viral constructs maintains virulence, while swapping in loss of function Zα variants diminishes viral infectivity. E3 Zα-deleted viruses show full virulence in RIPK3-/- and ZBP1-/- mice, indicating that one role of E3 Zα is to inhibit ZBP1-dependent necroptosis [113]. A second reported role for E3 Zα is as an antagonist of toll-like receptor signaling [114]. Binding of E3 Zα to Z-RNA is enhanced by the E3 dsRNA-binding domain that promotes Z-RNA formation on binding its substrate, even when detected with an assay that depends on the expression of the ZBP1 Zα and the E3 RNA-binding domains from different constructs [51]. A likely mechanism is that the dsRNA tangles formed create sufficient topological strain to initiate Z-RNA formation.

4. PKZ proteins are present in zebrafish and goldfish and other fish species like trout and salmon [33–35, 59, 60, 115]. PKZ is an interferon-induced kinase that inhibits translation after activation by Z-DNA. In fish, the process is associated with eIF2α serine 51 phosphorylation.
While the dsRNA polyinosinic–polycytidylic acid (poly I:C) does induce PKZ by interferon induction, it does not activate the kinase activity of PKZ, distinguishing it from the dsRNA-dependent protein kinase PKR that is expressed from the same chromosomal locus [59]. The actions of PKZ are antiviral. D. rerio PKZ has the largest Zα wing characterized to date, and its structure has been solved by two separate groups of investigators using Z-DNAs of different lengths. Both structures show the canonical Zα fold. In the longer Z-DNA structure, additional contacts with the opposite DNA strand involving residues in helix α2 are detected [33]. Protein–protein interactions create two symmetric salt bridges between residues from the N-terminal edge of α3. The other study reveals that the wing is flexible and alters conformation on binding to Z-DNA and identifies two positively charged residues that increase the Zα on-rate [35]. Another structure was solved with Z-DNA for the

goldfish C. auratus PKZ Zα, which induces the fastest B- to Z-DNA flip known (half-transition time of 45.19 s, compared to 117.59 s for human ADAR1 Zα and about 270 s for both human ZBP1 and yata E3 Zα [34]). The first residue of β3 is a lysine and is unique among Zα domains. Another lysine at “n-4” and an arginine at “n+1” may also contribute to the fast on-rate.

5. ORF112 Zα from the cyprinid herpesvirus 3 inhibits PKZ in a manner analogous to the competition between ZBP1 and vaccinia E3 for Z-RNA [37]. Intriguingly, the Zα structures of ORF112 and the yata E3 are very similar even though the host viruses are otherwise quite dissimilar. The evolutionary reason for this convergence is unknown [38]. As with E3, the wing of ORF112 is quite flexible. The crystal was prepared with the ORF112 Zα bound to an 18-base pair DNA, leading to some novel observations. First, formation of a dimer on the surface of the Z-DNA locks the conformation of the Zα wing. The C-terminal part of the α3 recognition helix of one monomer packed tightly against the α3 of the other monomer and formed a salt bridge with its β2 strand. This interaction would potentially induce Z-DNA formation in the adjacent segment and promote assembly of additional dimers, enabling multimer formation for a single viral Zα domain. The process would inhibit activation of host responses by coating Z-DNA with a filament of viral protein. Another novel finding in this study was a base-specific contact of the α3 “n+5” arginine (where “n” represents the conserved asparagine) with O6 and N7 of a guanine on the opposite strand that forms a CH–π bond with the other monomer of the dimer (Fig. 3). Such an interaction is also possible with a lysine in the yata E3 α3 helix and raises the possibility that such strand-specific interactions stabilize the Z-conformation-dependent CH–π tyrosine contact, prolonging the off-rate and promoting extension of the left-handed helix into adjacent segments.
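The half-transition times quoted above for the B- to Z-DNA flip can be put on a common scale. The following is a minimal sketch that assumes, purely for illustration, that each flip follows single-exponential (first-order) kinetics, so that an apparent rate constant is k = ln 2 / t½. The text does not state the kinetic model used in [34], and the names `HALF_TIMES_S`, `apparent_rate_constant`, and `fold_faster` are hypothetical.

```python
import math

# Half-transition times (seconds) for the B- to Z-DNA flip, as quoted in the text [34].
HALF_TIMES_S = {
    "goldfish PKZ Zα": 45.19,
    "human ADAR1 Zα": 117.59,
    "human ZBP1 Zα": 270.0,  # "about 270 s" in the text
    "yata E3 Zα": 270.0,     # "about 270 s" in the text
}


def apparent_rate_constant(half_time_s: float) -> float:
    """Return k = ln(2) / t_half in 1/s. First-order kinetics is an assumption."""
    if half_time_s <= 0:
        raise ValueError("half-time must be positive")
    return math.log(2) / half_time_s


def fold_faster(reference: str, other: str) -> float:
    """Ratio of apparent rate constants: how much faster `reference` flips than `other`."""
    k_ref = apparent_rate_constant(HALF_TIMES_S[reference])
    k_other = apparent_rate_constant(HALF_TIMES_S[other])
    return k_ref / k_other


if __name__ == "__main__":
    for name, t_half in HALF_TIMES_S.items():
        k = apparent_rate_constant(t_half)
        print(f"{name}: t1/2 = {t_half:.2f} s, apparent k = {k:.4f} per s")
    ratio = fold_faster("goldfish PKZ Zα", "human ADAR1 Zα")
    print(f"goldfish PKZ Zα vs human ADAR1 Zα: {ratio:.2f}-fold")
```

Under this assumption the ratio of rate constants is simply the inverse ratio of the half-times, so the goldfish PKZ Zα flip comes out roughly 2.6-fold faster than that of human ADAR1 Zα. If the measured time courses are not single exponentials, only the half-times themselves should be compared.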
Collectively these discoveries were enabled by methods derived from many different research areas, including biophysics, biochemistry, bioinformatics, and genetic analyses, showing the power of multidisciplinary approaches. The success also depended heavily on the proper interrogation of freely accessible big data repositories. These resources make it possible to test hypotheses by querying against multiple databases. This “virtual” mode of experimentation is rapid. What once took a large laboratory and much research funding to find can now be refined computationally in a very time-efficient manner. It would have been a surprise if the causal nature of Zα variants in Aicardi–Goutières disease, proved in this manner and cross-validated against many data sources, had not been replicated by mouse studies [87, 89–91]. There is still no mouse

model for the human N173S variant that is more critical to the interaction of Zα with Z-DNA and Z-RNA than P193A. It is safe to predict that such virtual experimentation will be even more powerful in the future as better tools are developed, yielding predictions that will be validated by bench scientists with a high probability of success.

7 New Approaches to Z-DNA and Z-RNA Biology

There are a number of methods now available that can help further our understanding of flipon biology, but are not yet applied to the study of Z-DNA- and Z-RNA-dependent phenotypes.

1. Biochemical Studies. The existence of additional classes of Z-DNA- and Z-RNA-binding proteins is likely. These proteins would include those that regulate assembly of cellular machines on DNA associated with changes in chromatin state, such as those reported for CSF1 and HO-1 promoters that involve Z-DNA formation [45, 48, 116]. As discussed above, there may be variants of the wHTH found in Zα with different residues involved in the recognition helix and wing that optimize the function performed by an associated domain. There are many new opportunities for protein discovery that go beyond the initial successful approach used 27 years ago to discover Zα. The ability now exists to discover new Z-specific proteins in a high-throughput manner using Z-affinity probes and mass spectrometry to identify candidates for further validation. The interactions between the probe and the protein can then be mapped back to distinguish domains that are associated with B-DNA (or A-RNA) binding from those that recognize Z-DNA (or Z-RNA). In the case of unknown domains, computational modeling of an interaction with Z-DNA (or Z-RNA) based on predicted protein structure [117] can focus attention on a subset of candidates. A different class of protein interactions with DNA may involve intrinsically disordered protein regions (IDR) that only become structured on binding Z-DNA or Z-RNA, representing an initial step necessary for assembling a cellular machine on DNA within open chromatin [3]. The alternating lysine and glycine amino acid repeat is one example of an IDR that can recognize Z-DNA [118]. It is of interest that such repeats are present in variant histones like H2A.Z variant 2 [3].
It is likely that IDRs allow evolutionary fine-tuning of interactions with Z-DNA and Z-RNA, as the strength of interaction can be easily varied by changing the length of the peptide repeat.

2. Structural Studies. With the recent advances in single-molecule cryo-electron microscopy, there is the opportunity

to study protein interactions on topologically closed DNA loops and dsRNAs with immobilized ends. Studies on form V nucleic acids and mini-circles would be one example [119–122]. These approaches allow capture of alternative conformations, like Z-DNA or Z-RNA, dynamically formed as the protein engages with its substrate. They also can trap the Z-conformation to offer a static target for the protein to bind, lowering the entropic cost of engaging IDR. One example to test is whether MDA5 binding induces Z-RNA formation in Alu Z-Boxes as expected [88]. Another application is to test whether helicases in chromatin remodeling complexes are able to eject nucleosomes by inducing Z-DNA formation, unwrapping the nucleosome to generate negative supercoiling at that location [3].

3. Computational Studies. The first steps to identify epigenetic changes that alter Z-flipon conformation have been taken with the application of machine learning in the DeepZ program [123]. Other approaches based on transformers are also very promising (https://biorxiv.org/cgi/content/short/2023.01.12.523822v1). There are many possible ways in which base modifications can regulate context-specific cellular responses [2]. Generating datasets under different perturbation conditions will help improve the computational predictions generated. Related approaches are necessary to predict Z-RNA formation. The energetics are likely to differ from those of Z-DNA, as unpaired regions of dsRNA can nucleate Z-RNA formation without paying the cost of forming an A–Z junction. Also, noncanonical base-pairing is likely an important feature that favors Z-RNA formation [81].

4. Evolutionary Approaches. The variation of viral sequence under selection is likely to alter the propensity of sequences to form left-handed conformations and reflect the dependence of the viral persistence cycle on ADAR1 and ZBP1.
This process would be most evident when viruses infect a new host species for the first time. There are now rich sequence databases tracking viral sequence changes occurring when novel viral pathogens cross the species barrier to infect humans. In other evolutionary scenarios, variations in genomic Z-DNA- or Z-RNA-forming sequences may also arise under selection. An example is provided by stickleback fish when they are transferred to an environment where predators are absent. Here the loss of pelvic hindfins arises from the deletion of a Z-DNA-forming d(T-G) repeat in the Pitx gene enhancer [124]. A higher throughput approach can also be applied to understanding the role of Z-DNA in the recent evolution of species in cases where rapidly evolving regions have been identified by comparing closely related species to each other. Massively

parallel sequencing has identified genomic segments that show enhanced variation between related species coupled with differences in their ability to enhance cell-based reporter assays [125]. How many of these rapidly evolving sequences adopt alternative conformations is unknown, but the overlap with phenotypic trait loci recorded in genetic databases is of interest. The results from such discovery experiments will provide the sequence context and the conditions under which the flipons involved are active. The analysis will enable the design of sequence-specific reagents to target specific flipon conformations that act as “on/off” switches in the cell.

5. Cellular Studies. Since Z-DNA has a distinct spectroscopic signature, there is potential to follow Z-DNA formation in live cells in real time. Two studies have used Fourier transform infrared spectroscopy to detect Z-DNA formation in cells either treated with histone deacetylase inhibitors or overexpressing chromatin remodeling proteins. Both approaches required fixation of cells [126, 127]. The possibility exists that the signal-to-noise ratio can be improved in living cells by newer techniques such as field-resolved spectroscopy [128], although the long wavelengths currently employed are a limitation of the method. Cell-wide perturbations using drugs, interferon stimulation, or CRISPR/Cas9-regulated gene expression are possible ways of inducing sufficient Z-formation to initially validate these new methods and to show the real-time formation of Z structures inside living cells.

6. Genetic Studies. With the general availability of whole genome sequencing of patients and families in large cohorts with well-annotated medical records, it should be possible to map variants in ADAR1 and ZBP1 to additional disease outcomes. While those afflictions associated with interferon dysregulation are of interest, other diseases where necroptosis is a driver of pathology also require further study.
The extent to which ZBP1-dependent cell death contributes to end-stage disease is currently unknown. In particular, neurological disorders have many causes, but necroptosis may be the final common pathway shared by most. The role of the X-chromosome in autoimmunity is of interest as it is enriched in repeat elements [129] potentially capable of forming Z-RNA and activating an immune response against self. The X-chromosome also encodes a dsRNA helicase, DDX3X, that affects interferon responses. Expression of DDX3X is subject to leaky regulation by Xist [130], an X-chromosome non-coding RNA transcript that inactivates one copy of the X-chromosome in females. Whether DDX3X plays a role in the higher incidence of autoimmunity in females is worthy of further

investigation. The genetics are likely further complicated by the Y-chromosome paralog of DDX3X, DDX3Y, which could compensate in males for DDX3X loss of function variants. A different class of disease outcomes may arise from the release of nucleic acids by damaged mitochondria. The circular mitochondrial genome contains strong Z-DNA-forming sequences in the ribosomal genes and is transcribed bidirectionally to form dsRNA [80, 131], both of which have the potential to activate ZBP1 and cause cell death. These approaches can then be extended to new Z-binding proteins as they are discovered.

7. Pathogen Studies. Z-formation can influence the outcome of viral disease either through editing by ADAR1 or through apoptosis and necroptosis induced by ZBP1. The genetics of the host and pathogen interactions provide the same opportunity for studying flipons as bacteriophages did for the investigation of codons. Retrotransposons also provide a simple system for testing flipons in action as they have driven the selection of Z-DNA- and Z-RNA-triggered host defense systems [80].

8. Small Molecule Studies. It seems that many drugs currently in the clinic also have the potential to alter flipon conformation by alterations to chromatin state and RNA transcription. These include inhibitors of DNA methyltransferases (DAC), histone deacetylases (Depsi), histone demethylases (KDM1A inhibitor S2101), histone methylases (EHMT2 inhibitor UNC0638, EZH2 inhibitor GSK343) [132], and nuclear export [133]. Compounds like 5-AZA-CdR, an FDA-approved DNA methyltransferase inhibitor, can increase transcription of endogenous retroelements like SINEs that lie downstream of potential Z-DNA-forming CpG-rich islands [134]. While upregulation of ADAR in tumors can prevent activation of ZBP1 by the transcripts produced [135], deletion or mutation of the ZBP1-dependent necroptotic machinery is another way tumors can survive these interventions [136].
We were recently able to overcome both ADAR-mediated suppression of ZBP1 and loss of the ZBP1 cell death pathway in cancer cells by activating ZBP1-dependent necroptosis of tumors using the small molecule CBL0137, which induces Z-DNA in stromal fibroblasts. The combination of CBL0137 and a checkpoint blocking antibody was effective in regressing immunotherapy-refractory tumors in mice [137]. It is possible to assess pre- and posttreatment outcomes by scoring dsRNA editing using a cell-wide method called the Alu editing index [138]. Effects of these drugs on ZBP1 activity can be assayed by the extent of RIPK3 and mixed lineage kinase domain-like (encoded by MLKL) phosphorylation. The induction of Z-DNA and Z-RNA can also be scored using the Z-specific antibody Z22

[50], improving on an earlier approach where Z-DNA-specific antiserum was used to show induction of Z-DNA by a class of carbazole compounds [139]. There are also opportunities to develop new chemical entities that target both ADAR1 and ZBP1 interaction with Z-DNA and Z-RNA. Perhaps the more interesting classes of compounds are those that alter the interactions of Z-binding proteins with the complexes that they localize to specific sites within the cell to perform a particular function. Potentially altering the assembly of these condensates will produce more exact outcomes because only a subset of Z-dependent pathways will be targeted. 9. Protein Engineering Studies. From the structural and mutagenic studies of Zα proteins, we are in a position to design proteins with specific Z-binding properties. By varying residues in the α3 helix and the β sheets, we can control affinity for Z-DNA and alter its relative affinity for B-DNA. We can also change the rate that the proteins flip B-DNA to Z-DNA. At one extreme, following the example of the E3 protein, we can evolve proteins that bind only to preformed Z-DNA and Z-RNA, while at the other extreme, the potential exists to engineer sequence-specific binding proteins, where residues outside those present in the α3 helix engage B-DNA sequence elements that initiate the Zα induced flip of the adjacent segment to Z-DNA conformation. Alternatively, the α3 helix could be designed with a B-DNA specific face and a Z-DNA specific face, with one interaction sequence-specific and the other structure-specific. These types of helices exist in nature. The best known example is the yeast Rap1 protein that uses one surface to bind a B-DNA sequence motif and another to contact a G4 quadruplex [140]. It may also be possible to target a particular Zα domain to a Z-DNA-forming locus using an oligonucleotide-guided approach as has been done with miniCRISPR fusions. 
Adding domains with a defined function to the construct may then permit the assembly of a specific cellular machine at the site selected. Other uses for such fusions may be for targeting enzymatic reporters to locations where Z-formation is occurring.

10. Genetic Engineering Studies. The formation of Z-DNA by simple repeat sequences and by endogenous retroviral and latent viral elements enables targeting of Zα to particular genomic loci. The collection of Z-DNA-forming sequences involved provides a toolbox to build genetic circuits where the conformation of the element acts as a digital switch to alter outcomes. The simplest circuits would activate a reporter gene that signals the occurrence of some event that induces Z-DNA formation. More complicated circuits could perform logical operations [4]. In our toolbox, we have a range of different Zα domains

to control Z-DNA formation, or to detect it as it happens. We also have ways of regulating local helical stress within a topological domain to control Z-DNA formation. These include designs where one alternative conformation competes with another [141], where we can induce RNA polymerase activity to power the flip [16], or where we express helicases that induce the transition to Z-DNA [88]. The outcomes we can control are gene expression [45, 46, 48, 142], RNA processing [143–146], translation [58, 147], and cell death [50, 57, 96]. We can use existing modalities such as small molecules, oligonucleotides, peptides, proteins, and vectors to implement the design. Selection can be performed under conditions where there is a high mutagenic rate for the repeat sequences used, to expand the size of the library screened; the selected constructs can then be used in hosts where they are stable. The screens can be automated by using self-replicating lytic viruses where the progeny surviving selection are passed from one step to the next. During the process we will learn more about the logic of the Z-DNA- and Z-RNA-based genetic circuits used in nature.
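As a starting point for assembling the toolbox of candidate Z-DNA-forming repeats mentioned above, a sequence can be screened for runs of strictly alternating purines and pyrimidines, such as the d(C-G) and d(T-G) repeats cited in this chapter. The sketch below is a deliberately crude heuristic and is not an implementation of Z-Hunt or DeepZ, which score energetics and learned features, respectively. The function names and the `min_len` default are hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple


def base_class(base: str) -> Optional[str]:
    """Classify a base as purine ('R'), pyrimidine ('Y'), or None if ambiguous."""
    if base in "AG":
        return "R"
    if base in "CT":
        return "Y"
    return None


def alternating_runs(seq: str, min_len: int = 8) -> List[Tuple[int, int, str]]:
    """Return (start, end, subsequence) for maximal runs of at least `min_len`
    bases in which purines and pyrimidines strictly alternate (end is exclusive)."""
    seq = seq.upper()
    runs: List[Tuple[int, int, str]] = []
    start: Optional[int] = None  # index where the current run began
    prev: Optional[str] = None   # class of the previous base in the run
    for i, base in enumerate(seq):
        cls = base_class(base)
        if cls is None:
            # An ambiguous base (e.g. N) ends any open run.
            if start is not None and i - start >= min_len:
                runs.append((start, i, seq[start:i]))
            start, prev = None, None
        elif prev is None:
            start, prev = i, cls
        elif cls == prev:
            # Two purines or two pyrimidines in a row: close the run, restart here.
            if i - start >= min_len:
                runs.append((start, i, seq[start:i]))
            start, prev = i, cls
        else:
            prev = cls
    if start is not None and len(seq) - start >= min_len:
        runs.append((start, len(seq), seq[start:]))
    return runs


if __name__ == "__main__":
    example = "AAAACGCGCGCGCGTTTT"  # made-up sequence, not taken from any genome
    for start, end, sub in alternating_runs(example):
        print(start, end, sub)
```

Because the rule looks only at base class, it also reports alternating d(A-T) tracts, which are generally regarded as poor Z-formers, and it ignores supercoiling, base modification, and protein binding. Hits are therefore best treated as candidates to pass to an energy-based or learned predictor rather than as predictions in their own right.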

8 Final Thoughts

There is no doubt that the advancement of science tracks with the development of new methods that produce quantitative data previously beyond measurement. At some stage, there is more than sufficient data, but the challenge becomes how to use it effectively to provide good explanations of observed events. Rather than relying on data to generate hypotheses, the history of discovery in the Z-DNA field has demonstrated that it is more effective to use data experimentally to refine hypotheses, which can then lead to better ways to test their validity. While it is good to be skeptical, it is better to work enthusiastically to improve data collection and analytical methods to answer the question asked. As with us all, experts are bounded by their domain of knowledge. Like Lord Kelvin’s dismissal of heavier than air flight as impossible (1895), predictions made by experts outside their area of expertise often are not reliable guides to future experimental success. Even at the time Lord Kelvin made his prediction, it was well known that birds could fly, and it was easy to demonstrate that they were heavier than the medium they flew in. Lord Kelvin’s other famously wrong prediction was that Earth was between 20 and 400 million years old (1862). The estimate caused Darwin angst as it was incompatible with his theory of evolution [148], which was solidly based on the use of data to generate hypotheses that could be validated with other data. Clearly Lord Kelvin’s misapplied methodology was driven by confirmation bias. He never questioned his assumptions, basing his model on the rate at which the earth was cooling while ignoring

the convection currents within the earth’s molten core. If he had accounted for convection in his model, his estimate for the earth’s age would have been at variance with the age he calculated for the sun using a different method [148].

9 What Will We Find in the Future?

Firstly, we will understand that genomes are optimized for computation by natural selection. Organisms would not survive otherwise. Those genomes from multicellular organisms are inherently digital. They are built mostly from repeat nucleotide sequences, with only a minority of sequences that code for protein. The logic these genomes implement is based on flipons, repeat sequences that form alternative nucleic acid structures under physiological conditions. The flipons act as binary switches. They vary the readout of genomic information in response to a constantly changing environment. They direct the compilation of different transcripts from the coding regions of the genome. By doing so, they create the genetic programs that run the cell [3]. The combinatorial possibilities enabled by flipons are enormous, creating massive variability for natural selection to act upon. Flipons based on Z-DNA and Z-RNA enable rapid transcriptional responses to perturbations in the cell. They catalyze the localized transition from one chromatin state to another [149]. They also terminate the cell when the system crashes by initiating the programmed cell death pathway. Other types of flipons, like G4 quadruplexes, are more stable than Z-flipons and can act as memory elements to speed responses to reoccurring events and to enable transmission of the current state to progeny cells. Flipons based on a different set of repeat sequences switch conformation to enable responses to pH change, while others are sensitive to environmental mutagens [4]. Small RNAs that bind flipons or their junctions with A-RNA or B-DNA also offer the opportunity to modulate flipon conformation, but in a sequence-specific fashion. For example, formation of a G4 quadruplex could be modulated by binding of a small RNA to one strand or the other, with a small RNA bound to the C-rich strand springing the G-rich strand free to adopt a four-stranded fold.
Engagement of the flipon by structure-specific proteins would then determine the outcome without the need for sequence-specific B-DNA or A-RNA binding proteins. Only with time would the proteins evolve to elaborate on the sequence-specific effects set by the small RNAs that shape flipons (Fig. 4). The use of small RNAs to shape flipon conformations offers many new exciting experimental and therapeutic opportunities. Engineering approaches based on an understanding of digital genomes offer a more efficient approach for building optimal

Fig. 4 Regulating flipon conformation with small RNAs. (a) Small RNAs (such as microRNAs) can bind to a flipon motif M or to the junction J of the flipon with lower energy forms of nucleic acid. (b) Depending on the strand preference, a small RNA can promote or inhibit a sequence from flipping to an alternative conformation. For example, M1 binding to the top G-rich strand as shown would prevent formation of a G4 quadruplex, while M2 binding to the bottom C-rich strand would free up the top strand to fold into a G4 quadruplex. The possible sites for M and J small RNAs are numbered. The use of small RNAs thereby allows programming of promoter shape. The structure-specific binding proteins that dock then direct the assembly of cellular machines that control outcomes. Such a mechanism does not require sequence-specific binding proteins but can be enhanced by them

circuit design than those based on inorganic strata. While silicon solutions are best for certain classes of problems, genetic computers with their logic based on biology offer obvious advantages: cells are self-powered, capable of self-replication and repair, and able to operate in many different and unstructured environments. Given the huge design space offered by flipons embedded in DNA, the choice of a cell type with the best starting conditions is key to a successful experimental outcome. A second thing we will understand is that the inherent imprecision of genetic computers increases their adaptability. Building on the old adage that nothing in biology is 100%, we will find that many uses of flipons in genetic circuits are nonproductive, resulting in RNAs or protein products that are of no immediate use and whose output is best discarded, even though the disposal comes with a metabolic cost. Multicellular biological systems are built for fault tolerance, not for maximal energy efficiency. They are able to add more mitochondria as needed. Imprecision generates variants that at some point are beneficial under a particular set of selection pressures. That is the payoff from using a sloppy system to discover what is possible. When variants offer a survival advantage by actually producing a useful output, they are automatically transmitted to progeny because they are embedded in the genome. Over time, the frequency of this class of variant will increase unless more adaptive ones arise or the selection pressure changes. Thirdly, we will accept that the DNA damage and genomic instability associated with the alternative conformations found in

Z-DNA and Z-RNA: Methods—Past and Future


digital genomes are not an inherent feature of flipons. The bad outcomes are likely a product of errors in the transcription, translation, or replication machinery that operates downstream of flipons [150]. As such, the rate of genomic instability is a feature that can be engineered by controlling the basal error rate in those pathways. Tumors favor a high error rate to generate sufficient variability to enable their survival. Loss-of-function helicase variants promote disease by locking flipons in one conformation or the other, disrupting the normal processes that flipons regulate [151, 152]. Given the inherent error rate in downstream processes, it is not unexpected that fear conditioning is associated with both Z-DNA formation [55] and double-strand breaks [153]. The breaks are most likely due to topoisomerase errors and represent a failure of the enzyme to religate the DNA strands it cleaves. This event occurs with some frequency in actively transcribed genes that depend upon the topoisomerases to relax the superhelical stress produced by RNA polymerases as RNA is made. The negative supercoiling drives the Z-DNA formation that is observed [16, 154]. Since breaks are a normal occurrence, it is unlikely that the reported DNA damage during fear conditioning is an inherent part of memory formation. Instead, it is more likely that flipon conformation is the more critical part of learning.

10 What Is There to Do?

Hopefully, at this stage, the reader has already thought of questions in their particular field that remain unanswered and has selected a suitable method for further investigation. Alternatively, the reader may feel the need to develop a better approach, one that provides more precise measurements. In this process, it is important to search for methods that follow the problem rather than the reverse. The questions will always change as new results become available, and a new approach will always be worth considering. A lack of knowledge about a particular field is not in itself an important barrier to progress: ignorance can motivate the search for strategies unanticipated by those with much more experience, who, as a rule, rely on what has worked for them in the past. Often a new method is the best way forward, as it will provide answers faster than previous approaches could and generate results previous approaches could not. Novel methods are certainly needed to help answer a number of interesting questions in the Z-DNA and Z-RNA fields. These include the following: Are there other families of Z-DNA-binding proteins with specific functions? Can we engineer Z-DNA- and Z-RNA-binding proteins that are sequence-specific? Are there other ways to target Z-DNA and Z-RNA formation in a sequence-specific manner to enhance particular outcomes? How can we follow


Z-formation in living cells in real time? How does flipon conformation impact processes like transcription, splicing, recombination, repair, and memory formation? What kind of genetic circuits based on Z-DNA and Z-RNA flipons can we engineer and optimize by selection? How can we target Z-DNA and Z-RNA therapeutically to improve immunological responses against cancers and viruses? Can we treat autoimmune diseases by inhibiting Z-DNA- and Z-RNA-dependent responses against self-nucleic acids?

Acknowledgments

I would like to thank Sid Balachandran and Maria Poptsova for their helpful comments on the manuscript.

References

1. Herbert A (2019) A genetic instruction code based on DNA conformation. Trends Genet 35:887–890. https://doi.org/10.1016/j.tig.2019.09.007
2. Herbert A (2020) ALU non-B-DNA conformations, flipons, binary codes and evolution. R Soc Open Sci 7(6):200222. https://doi.org/10.1098/rsos.200222
3. Herbert A (2021) The simple biology of flipons and condensates enhances the evolution of complexity. Molecules 26(16):4881. https://doi.org/10.3390/molecules26164881
4. Herbert A (2020) Simple repeats as building blocks for genetic computers. Trends Genet. https://doi.org/10.1016/j.tig.2020.06.012
5. Moore SP, Rich A, Fishel R (1989) The human recombination strand exchange process. Genome 31(1):45–52. https://doi.org/10.1139/g89-012
6. Rich A, Nordheim A, Wang AHJ (1984) The chemistry and biology of left-handed Z-DNA. Annu Rev Biochem 53(1):791–846. https://doi.org/10.1146/annurev.bi.53.070184.004043
7. Morange M (2007) What history tells us IX. Z-DNA: when nature is not opportunistic. J Biosci 32(4):657–661. https://doi.org/10.1007/s12038-007-0065-5
8. Pohl FM, Jovin TM (1972) Salt-induced co-operative conformational change of a synthetic DNA: equilibrium and kinetic studies with poly (dG-dC). J Mol Biol 67(3):375–396
9. Mitsui Y, Langridge R, Shortle BE, Cantor CR, Grant RC, Kodama M, Wells RD (1970) Physical and enzymatic studies on poly d(I-C)-poly d(I-C), an unusual double-helical DNA. Nature 228(5277):1166–1169. https://doi.org/10.1038/2281166a0
10. Wang AH, Quigley GJ, Kolpak FJ, Crawford JL, van Boom JH, van der Marel G, Rich A (1979) Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature 282(5740):680–686
11. Ho PS, Ellison MJ, Quigley GJ, Rich A (1986) A computer aided thermodynamic approach for predicting the formation of Z-DNA in naturally occurring sequences. EMBO J 5(10):2737–2744
12. Thamann TJ, Lord RC, Wang AH, Rich A (1981) The high salt form of poly(dG-dC).poly(dG-dC) is left-handed Z-DNA: Raman spectra of crystals and solutions. Nucleic Acids Res 9(20):5443–5457. https://doi.org/10.1093/nar/9.20.5443
13. Moller A, Nordheim A, Kozlowski SA, Patel DJ, Rich A (1984) Bromination stabilizes poly(dG-dC) in the Z-DNA form under low-salt conditions. Biochemistry 23(1):54–62. https://doi.org/10.1021/bi00296a009
14. Lafer EM, Valle RP, Moller A, Nordheim A, Schur PH, Rich A, Stollar BD (1983) Z-DNA-specific antibodies in human systemic lupus erythematosus. J Clin Invest 71(2):314–321. https://doi.org/10.1172/jci110771
15. Behe M, Felsenfeld G (1981) Effects of methylation on a synthetic polynucleotide: the B-Z transition in poly(dG-m5dC).poly(dG-m5dC). Proc Natl Acad Sci U S A 78(3):1619–1623. https://doi.org/10.1073/pnas.78.3.1619
16. Peck LJ, Wang JC (1983) Energetics of B-to-Z transition in DNA. Proc Natl Acad Sci U S A 80(20):6206–6210
17. Champ PC, Maurice S, Vargason JM, Camp T, Ho PS (2004) Distributions of Z-DNA and nuclear factor I in human chromosome 22: a model for coupled transcriptional regulation. Nucleic Acids Res 32(22):6501–6510. https://doi.org/10.1093/nar/gkh988
18. Herbert AG, Rich A (1993) A method to identify and characterize Z-DNA binding proteins using a linear oligodeoxynucleotide. Nucleic Acids Res 21(11):2669–2672. https://doi.org/10.1093/nar/21.11.2669
19. Herbert A, Lowenhaupt K, Spitzner J, Rich A (1995) Chicken double-stranded RNA adenosine deaminase has apparent specificity for Z-DNA. Proc Natl Acad Sci U S A 92(16):7550–7554. https://doi.org/10.1073/pnas.92.16.7550
20. Herbert A, Alfken J, Kim YG, Mian IS, Nishikura K, Rich A (1997) A Z-DNA binding domain present in the human editing enzyme, double-stranded RNA adenosine deaminase. Proc Natl Acad Sci U S A 94(16):8421–8426
21. Herbert AG, Spitzner JR, Lowenhaupt K, Rich A (1993) Z-DNA binding protein from chicken blood nuclei. Proc Natl Acad Sci U S A 90(8):3339–3342
22. Herbert A, Lowenhaupt K, Spitzner J, Rich A (1995) Double-stranded RNA adenosine deaminase binds Z-DNA in vitro. Nucleic Acids Symp Ser 33:16–19
23. Kim YG, Kim PS, Herbert A, Rich A (1997) Construction of a Z-DNA-specific restriction endonuclease. Proc Natl Acad Sci U S A 94(24):12875–12879. https://doi.org/10.1073/pnas.94.24.12875
24. Kim YG, Lowenhaupt K, Maas S, Herbert A, Schwartz T, Rich A (2000) The zab domain of the human RNA editing enzyme ADAR1 recognizes Z-DNA when surrounded by B-DNA. J Biol Chem 275(35):26828–26833. https://doi.org/10.1074/jbc.M003477200
25. Herbert A, Schade M, Lowenhaupt K, Alfken J, Schwartz T, Shlyakhtenko LS, Lyubchenko YL, Rich A (1998) The Zα domain from human ADAR1 binds to the Z-DNA conformer of many different sequences. Nucleic Acids Res 26(15):3486–3493. https://doi.org/10.1093/nar/26.15.3486


26. Schade M, Behlke J, Lowenhaupt K, Herbert A, Rich A, Oschkinat H (1999) A 6 bp Z-DNA hairpin binds two Z alpha domains from the human RNA editing enzyme ADAR1. FEBS Lett 458(1):27–31
27. Schade M, Turner CJ, Lowenhaupt K, Rich A, Herbert A (1999) Structure-function analysis of the Z-DNA-binding domain Zα of dsRNA adenosine deaminase type I reveals similarity to the (alpha + beta) family of helix-turn-helix proteins. EMBO J 18(2):470–479. https://doi.org/10.1093/emboj/18.2.470
28. Schwartz T, Rould MA, Lowenhaupt K, Herbert A, Rich A (1999) Crystal structure of the Zα domain of the human editing enzyme ADAR1 bound to left-handed Z-DNA. Science 284(5421):1841–1845
29. Placido D, Brown BA 2nd, Lowenhaupt K, Rich A, Athanasiadis A (2007) A left-handed RNA double helix bound by the Z alpha domain of the RNA-editing enzyme ADAR1. Structure 15(4):395–404. https://doi.org/10.1016/j.str.2007.03.001
30. Schade M, Turner CJ, Kuhne R, Schmieder P, Lowenhaupt K, Herbert A, Rich A, Oschkinat H (1999) The solution structure of the Zα domain of the human RNA editing enzyme ADAR1 reveals a prepositioned binding surface for Z-DNA. Proc Natl Acad Sci U S A 96(22):12465–12470
31. Schwartz T, Behlke J, Lowenhaupt K, Heinemann U, Rich A (2001) Structure of the DLM-1-Z-DNA complex reveals a conserved family of Z-DNA-binding proteins. Nat Struct Biol 8(9):761–765. https://doi.org/10.1038/nsb0901-761
32. Ha SC, Kim D, Hwang HY, Rich A, Kim YG, Kim KK (2008) The crystal structure of the second Z-DNA binding domain of human DAI (ZBP1) in complex with Z-DNA reveals an unusual binding mode to Z-DNA. Proc Natl Acad Sci U S A 105(52):20671–20676. https://doi.org/10.1073/pnas.0810463106
33. de Rosa M, Zacarias S, Athanasiadis A (2013) Structural basis for Z-DNA binding and stabilization by the zebrafish Z-DNA dependent protein kinase PKZ. Nucleic Acids Res 41(21):9924–9933. https://doi.org/10.1093/nar/gkt743
34. Kim D, Hur J, Park K, Bae S, Shin D, Ha SC, Hwang HY, Hohng S, Lee JH, Lee S, Kim YG, Kim KK (2014) Distinct Z-DNA binding mode of a PKR-like protein kinase containing a Z-DNA binding domain (PKZ). Nucleic Acids Res 42(9):5937–5948. https://doi.org/10.1093/nar/gku189


35. Subramani VK, Kim D, Yun K, Kim KK (2016) Structural and functional studies of a large winged Z-DNA-binding domain of Danio rerio protein kinase PKZ. FEBS Lett 590(14):2275–2285. https://doi.org/10.1002/1873-3468.12238
36. Ha SC, Lokanath NK, Van Quyen D, Wu CA, Lowenhaupt K, Rich A, Kim YG, Kim KK (2004) A poxvirus protein forms a complex with left-handed Z-DNA: crystal structure of a Yatapoxvirus Zalpha bound to DNA. Proc Natl Acad Sci U S A 101(40):14367–14372. https://doi.org/10.1073/pnas.0405586101
37. Tome AR, Kus K, Correia S, Paulo LM, Zacarias S, de Rosa M, Figueiredo D, Parkhouse RM, Athanasiadis A (2013) Crystal structure of a poxvirus-like zalpha domain from cyprinid herpesvirus 3. J Virol 87(7):3998–4004. https://doi.org/10.1128/JVI.03116-12
38. Kus K, Rakus K, Boutier M, Tsigkri T, Gabriel L, Vanderplasschen A, Athanasiadis A (2015) The structure of the Cyprinid herpesvirus 3 ORF112-Zα Z-DNA complex reveals a mechanism of nucleic acids recognition conserved with E3L, a poxvirus inhibitor of interferon response. J Biol Chem 290(52):30713–30725. https://doi.org/10.1074/jbc.M115.679407
39. Ha SC, Lowenhaupt K, Rich A, Kim YG, Kim KK (2005) Crystal structure of a junction between B-DNA and Z-DNA reveals two extruded bases. Nature 437(7062):1183–1186. https://doi.org/10.1038/nature04088
40. de Rosa M, de Sanctis D, Rosario AL, Archer M, Rich A, Athanasiadis A, Carrondo MA (2010) Crystal structure of a junction between two Z-DNA helices. Proc Natl Acad Sci U S A 107(20):9088–9092. https://doi.org/10.1073/pnas.1003182107
41. Kwon JA, Rich A (2005) Biological function of the vaccinia virus Z-DNA-binding protein E3L: gene transactivation and antiapoptotic activity in HeLa cells. Proc Natl Acad Sci U S A 102(36):12759–12764. https://doi.org/10.1073/pnas.0506011102
42. Oh DB, Kim YG, Rich A (2002) Z-DNA-binding proteins can act as potent effectors of gene expression in vivo. Proc Natl Acad Sci U S A 99(26):16666–16671. https://doi.org/10.1073/pnas.262672699
43. Wittig B, Wolfl S, Dorbic T, Vahrson W, Rich A (1992) Transcription of human c-myc in permeabilized nuclei is associated with formation of Z-DNA in three discrete regions of the gene. EMBO J 11(12):4653–4663
44. Wolfl S, Martinez C, Rich A, Majzoub JA (1996) Transcription of the human corticotropin-releasing hormone gene in NPLC cells is correlated with Z-DNA formation. Proc Natl Acad Sci U S A 93(8):3664–3668. https://doi.org/10.1073/pnas.93.8.3664
45. Liu H, Mulholland N, Fu H, Zhao K (2006) Cooperative activity of BRG1 and Z-DNA formation in chromatin remodeling. Mol Cell Biol 26(7):2550–2559. https://doi.org/10.1128/MCB.26.7.2550-2559.2006
46. Liu R, Liu H, Chen X, Kirby M, Brown PO, Zhao K (2001) Regulation of CSF1 promoter by the SWI/SNF-like BAF complex. Cell 106(3):309–318
47. Ray BK, Dhar S, Shakya A, Ray A (2011) Z-DNA-forming silencer in the first exon regulates human ADAM-12 gene expression. Proc Natl Acad Sci U S A 108(1):103–108. https://doi.org/10.1073/pnas.1008831108
48. Maruyama A, Mimura J, Harada N, Itoh K (2013) Nrf2 activation is associated with Z-DNA formation in the human HO-1 promoter. Nucleic Acids Res 41(10):5223–5234. https://doi.org/10.1093/nar/gkt243
49. Kim YG, Muralinath M, Brandt T, Pearcy M, Hauns K, Lowenhaupt K, Jacobs BL, Rich A (2003) A role for Z-DNA binding in vaccinia virus pathogenesis. Proc Natl Acad Sci U S A 100(12):6974–6979. https://doi.org/10.1073/pnas.0431131100
50. Zhang T, Yin C, Boyd DF, Quarato G, Ingram JP, Shubina M, Ragan KB, Ishizuka T, Crawford JC, Tummers B, Rodriguez DA, Xue J, Peri S, Kaiser WJ, Lopez CB, Xu Y, Upton JW, Thomas PG, Green DR, Balachandran S (2020) Influenza virus Z-RNAs induce ZBP1-mediated necroptosis. Cell 180(6):1115–1129. https://doi.org/10.1016/j.cell.2020.02.050
51. Koehler H, Cotsmire S, Zhang T, Balachandran S, Upton JW, Langland J, Kalman D, Jacobs BL, Mocarski ES (2021) Vaccinia virus E3 prevents sensing of Z-RNA to block ZBP1-dependent necroptosis. Cell Host Microbe. https://doi.org/10.1016/j.chom.2021.05.009
52. Ng SK, Weissbach R, Ronson GE, Scadden AD (2013) Proteins that contain a functional Z-DNA-binding domain localize to cytoplasmic stress granules. Nucleic Acids Res 41(21):9786–9799. https://doi.org/10.1093/nar/gkt750
53. George CX, Ramaswami G, Li JB, Samuel CE (2016) Editing of cellular self-RNAs by adenosine deaminase ADAR1 suppresses innate immune stress responses. J Biol Chem 291(12):6158–6168. https://doi.org/10.1074/jbc.M115.709014
54. Gabriel L, Srinivasan B, Kus K, Mata JF, Joao Amorim M, Jansen LET, Athanasiadis A (2021) Enrichment of Zα domains at cytoplasmic stress granules is due to their innate ability to bind to nucleic acids. J Cell Sci 134(10). https://doi.org/10.1242/jcs.258446
55. Marshall PR, Zhao Q, Li X, Wei W, Periyakaruppiah A, Zajaczkowski EL, Leighton LJ, Madugalle SU, Basic D, Wang Z, Yin J, Liau WS, Gupte A, Walkley CR, Bredy TW (2020) Dynamic regulation of Z-DNA in the mouse prefrontal cortex by the RNA-editing enzyme Adar1 is required for fear extinction. Nat Neurosci 23(6):718–729. https://doi.org/10.1038/s41593-020-0627-5
56. Thapa RJ, Ingram JP, Ragan KB, Nogusa S, Boyd DF, Benitez AA, Sridharan H, Kosoff R, Shubina M, Landsteiner VJ, Andrake M, Vogel P, Sigal LJ, tenOever BR, Thomas PG, Upton JW, Balachandran S (2016) DAI senses influenza A virus genomic RNA and activates RIPK3-dependent cell death. Cell Host Microbe 20(5):674–681. https://doi.org/10.1016/j.chom.2016.09.014
57. Upton JW, Kaiser WJ, Mocarski ES (2012) DAI/ZBP1/DLM-1 complexes with RIP3 to mediate virus-induced programmed necrosis that is targeted by murine cytomegalovirus vIRA. Cell Host Microbe 11(3):290–297. https://doi.org/10.1016/j.chom.2012.01.016
58. Taghavi N, Samuel CE (2013) RNA-dependent protein kinase PKR and the Z-DNA binding orthologue PKZ differ in their capacity to mediate initiation factor eIF2alpha-dependent inhibition of protein synthesis and virus-induced stress granule formation. Virology 443(1):48–58. https://doi.org/10.1016/j.virol.2013.04.020
59. Bergan V, Jagus R, Lauksund S, Kileng O, Robertsen B (2008) The Atlantic salmon Z-DNA binding protein kinase phosphorylates translation initiation factor 2 alpha and constitutes a unique orthologue to the mammalian dsRNA-activated protein kinase R. FEBS J 275(1):184–197. https://doi.org/10.1111/j.1742-4658.2007.06188.x
60. Yang PJ, Wu CX, Li W, Fan LH, Lin G, Hu CY (2011) Cloning and functional analysis of PKZ (PKR-like) from grass carp (Ctenopharyngodon idellus). Fish Shellfish Immunol 31(6):1173–1178. https://doi.org/10.1016/j.fsi.2011.10.012


61. Nikpour N, Salavati R (2019) The RNA binding activity of the first identified trypanosome protein with Z-DNA-binding domains. Sci Rep 9(1):5904. https://doi.org/10.1038/s41598-019-42409-1
62. Herbert A (2019) Mendelian Disease caused by variants affecting recognition of Z-DNA and Z-RNA by the Zα domain of the double-stranded RNA Editing Enzyme ADAR. Eur J Hum Genet. https://doi.org/10.1038/s41431-019-0458-6
63. Park C, Zheng X, Park CY, Kim J, Lee SK, Won H, Choi J, Kim YG, Choi HJ (2020) Dual conformational recognition by Z-DNA binding protein is important for the B-Z transition process. Nucleic Acids Res 48(22):12957–12971. https://doi.org/10.1093/nar/gkaa1115
64. Quyen DV, Ha SC, Lowenhaupt K, Rich A, Kim KK, Kim YG (2007) Characterization of DNA-binding activity of Z alpha domains from poxviruses and the importance of the beta-wing regions in converting B-DNA to Z-DNA. Nucleic Acids Res 35(22):7714–7720. https://doi.org/10.1093/nar/gkm748
65. Kang YM, Bang J, Lee EH, Ahn HC, Seo YJ, Kim KK, Kim YG, Choi BS, Lee JH (2009) NMR spectroscopic elucidation of the B-Z transition of a DNA double helix induced by the Z alpha domain of human ADAR1. J Am Chem Soc 131(32):11485–11491. https://doi.org/10.1021/ja902654u
66. Kim SH, Lim SH, Lee AR, Kwon DH, Song HK, Lee JH, Cho M, Johner A, Lee NK, Hong SC (2018) Unveiling the pathway to Z-DNA in the protein-induced B-Z transition. Nucleic Acids Res 46(8):4129–4137. https://doi.org/10.1093/nar/gky200
67. Ramakrishnan V, Finch JT, Graziano V, Lee PL, Sweet RM (1993) Crystal structure of globular domain of histone H5 and its implications for nucleosome binding. Nature 362(6417):219–223. https://doi.org/10.1038/362219a0
68. Brodsky S, Jana T, Mittelman K, Chapal M, Kumar DK, Carmi M, Barkai N (2020) Intrinsically disordered regions direct transcription factor in vivo binding specificity. Mol Cell 79(3):459–471. https://doi.org/10.1016/j.molcel.2020.05.032
69. Kim SH, Jung HJ, Lee IB, Lee NK, Hong SC (2021) Sequence-dependent cost for Z-form shapes the torsion-driven B-Z transition via close interplay of Z-DNA and DNA bubble. Nucleic Acids Res. https://doi.org/10.1093/nar/gkab153


70. Bae S, Kim Y, Kim D, Kim KK, Kim YG, Hohng S (2013) Energetics of Z-DNA binding protein-mediated helicity reversals in DNA, RNA, and DNA-RNA duplexes. J Phys Chem B 117(44):13866–13871. https://doi.org/10.1021/jp409862j
71. Dagliyan O, Tarnawski M, Chu PH, Shirvanyants D, Schlichting I, Dokholyan NV, Hahn KM (2016) Engineering extrinsic disorder to control protein activity in living cells. Science 354(6318):1441–1444. https://doi.org/10.1126/science.aah3404
72. Nishikura K (2016) A-to-I editing of coding and non-coding RNAs by ADARs. Nat Rev Mol Cell Biol 17(2):83–96. https://doi.org/10.1038/nrm.2015.4
73. Athanasiadis A, Placido D, Maas S, Brown BA 2nd, Lowenhaupt K, Rich A (2005) The crystal structure of the Zβ domain of the RNA-editing enzyme ADAR1 reveals distinct conserved surfaces among Z-domains. J Mol Biol 351(3):496–507. https://doi.org/10.1016/j.jmb.2005.06.028
74. Kawakubo K, Samuel CE (2000) Human RNA-specific adenosine deaminase (ADAR1) gene specifies transcripts that initiate from a constitutively active alternative promoter. Gene 258(1–2):165–172
75. Herbert A, Rich A (2001) The role of binding domains for dsRNA and Z-DNA in the in vivo editing of minimal substrates by ADAR1. Proc Natl Acad Sci U S A 98(21):12132–12137. https://doi.org/10.1073/pnas.211419898
76. Athanasiadis A, Rich A, Maas S (2004) Widespread A-to-I RNA editing of Alu-containing mRNAs in the human transcriptome. PLoS Biol 2(12):e391. https://doi.org/10.1371/journal.pbio.0020391
77. Blow M, Futreal PA, Wooster R, Stratton MR (2004) A survey of RNA editing in human brain. Genome Res 14(12):2379–2387. https://doi.org/10.1101/gr.2951204
78. Kim DD, Kim TT, Walsh T, Kobayashi Y, Matise TC, Buyske S, Gabriel A (2004) Widespread RNA editing of embedded alu elements in the human transcriptome. Genome Res 14(9):1719–1725. https://doi.org/10.1101/gr.2855504
79. Levanon EY, Eisenberg E, Yelin R, Nemzer S, Hallegger M, Shemesh R, Fligelman ZY, Shoshan A, Pollock SR, Sztybel D, Olshansky M, Rechavi G, Jantsch MF (2004) Systematic identification of abundant A-to-I editing sites in the human transcriptome. Nat Biotechnol 22(8):1001–1005. https://doi.org/10.1038/nbt996

80. Herbert A (2019) Z-DNA and Z-RNA in human disease. Commun Biol 2(1):7. https://doi.org/10.1038/s42003-018-0237-x
81. Nichols PJ, Bevers S, Henen M, Kieft JS, Vicens Q, Vögeli B (2021) Recognition of non-CpG repeats in Alu and ribosomal RNAs by the Z-RNA binding domain of ADAR1 induces A-Z junctions. Nat Commun 12(1). https://doi.org/10.1038/s41467-021-21039-0
82. Patterson JB, Thomis DC, Hans SL, Samuel CE (1995) Mechanism of interferon action: double-stranded RNA-specific adenosine deaminase from human cells is inducible by alpha and gamma interferons. Virology 210(2):508–511. https://doi.org/10.1006/viro.1995.1370
83. Hartner JC, Walkley CR, Lu J, Orkin SH (2009) ADAR1 is essential for the maintenance of hematopoiesis and suppression of interferon signaling. Nat Immunol 10(1):109–115. https://doi.org/10.1038/ni.1680
84. Ward SV, George CX, Welch MJ, Liou LY, Hahm B, Lewicki H, de la Torre JC, Samuel CE, Oldstone MB (2011) RNA editing enzyme adenosine deaminase is a restriction factor for controlling measles virus replication that also is required for embryogenesis. Proc Natl Acad Sci U S A 108(1):331–336. https://doi.org/10.1073/pnas.1017241108
85. Rice GI, Kasher PR, Forte GM, Mannion NM, Greenwood SM, Szynkiewicz M, Dickerson JE, Bhaskar SS, Zampini M, Briggs TA, Jenkinson EM, Bacino CA, Battini R, Bertini E, Brogan PA, Brueton LA, Carpanelli M, De Laet C, de Lonlay P, del Toro M, Desguerre I, Fazzi E, Garcia-Cazorla A, Heiberg A, Kawaguchi M, Kumar R, Lin JP, Lourenco CM, Male AM, Marques W Jr, Mignot C, Olivieri I, Orcesi S, Prabhakar P, Rasmussen M, Robinson RA, Rozenberg F, Schmidt JL, Steindl K, Tan TY, van der Merwe WG, Vanderver A, Vassallo G, Wakeling EL, Wassmer E, Whittaker E, Livingston JH, Lebon P, Suzuki T, McLaughlin PJ, Keegan LP, O’Connell MA, Lovell SC, Crow YJ (2012) Mutations in ADAR1 cause Aicardi-Goutieres syndrome associated with a type I interferon signature. Nat Genet 44(11):1243–1248. https://doi.org/10.1038/ng.2414
86. Liddicoat BJ, Piskol R, Chalk AM, Ramaswami G, Higuchi M, Hartner JC, Li JB, Seeburg PH, Walkley CR (2015) RNA editing by ADAR1 prevents MDA5 sensing of endogenous dsRNA as nonself. Science 349(6252):1115–1120. https://doi.org/10.1126/science.aac7049
87. Maurano M, Snyder JM, Connelly C, Henao-Mejia J, Sidrauski C, Stetson DB (2021) Protein kinase R and the integrated stress response drive immunopathology caused by mutations in the RNA deaminase ADAR1. Immunity. https://doi.org/10.1016/j.immuni.2021.07.001
88. Herbert A (2021) To “Z” or not to “Z”: Z-RNA, self-recognition, and the MDA5 helicase. PLoS Genet 17(5):e1009513. https://doi.org/10.1371/journal.pgen.1009513
89. de Reuver R, Dierick E, Wiernicki B, Staes K, Seys L, De Meester E, Muyldermans T, Botzki A, Lambrecht BN, Van Nieuwerburgh F, Vandenabeele P, Maelfait J (2021) ADAR1 interaction with Z-RNA promotes editing of endogenous double-stranded RNA and prevents MDA5-dependent immune activation. Cell Rep 36(6):109500. https://doi.org/10.1016/j.celrep.2021.109500
90. Nakahama T, Kato Y, Shibuya T, Inoue M, Kim JI, Vongpipatana T, Todo H, Xing Y, Kawahara Y (2021) Mutations in the adenosine deaminase ADAR1 that prevent endogenous Z-RNA binding induce Aicardi-Goutieres-syndrome-like encephalopathy. Immunity 54(9):1976–1988 e1977. https://doi.org/10.1016/j.immuni.2021.08.022
91. Tang Q, Rigby RE, Young GR, Hvidt AK, Davis T, Tan TK, Bridgeman A, Townsend AR, Kassiotis G, Rehwinkel J (2021) Adenosine-to-inosine editing of endogenous Z-form RNA by the deaminase ADAR1 prevents spontaneous MAVS-dependent type I interferon responses. Immunity 54(9):1961–1975 e1965. https://doi.org/10.1016/j.immuni.2021.08.011
92. Takaoka A, Wang Z, Choi MK, Yanai H, Negishi H, Ban T, Lu Y, Miyagishi M, Kodama T, Honda K, Ohba Y, Taniguchi T (2007) DAI (DLM-1/ZBP1) is a cytosolic DNA sensor and an activator of innate immune response. Nature 448(7152):501–505. https://doi.org/10.1038/nature06013
93. DeFilippis VR, Alvarado D, Sali T, Rothenburg S, Fruh K (2010) Human cytomegalovirus induces the interferon response via the DNA sensor ZBP1. J Virol 84(1):585–598. https://doi.org/10.1128/JVI.01748-09
94. Kaiser WJ, Upton JW, Mocarski ES (2008) Receptor-interacting protein homotypic interaction motif-dependent control of NF-kappa B activation via the DNA-dependent activator of IFN regulatory factors. J Immunol 181(9):6427–6434. https://doi.org/10.4049/jimmunol.181.9.6427
95. Rebsamen M, Heinz LX, Meylan E, Michallet MC, Schroder K, Hofmann K, Vazquez J, Benedict CA, Tschopp J (2009) DAI/ZBP1 recruits RIP1 and RIP3 through RIP homotypic interaction motifs to activate NF-kappaB. EMBO Rep 10(8):916–922. https://doi.org/10.1038/embor.2009.109
96. Newton K, Wickliffe KE, Maltzman A, Dugger DL, Strasser A, Pham VC, Lill JR, Roose-Girma M, Warming S, Solon M, Ngu H, Webster JD, Dixit VM (2016) RIPK1 inhibits ZBP1-driven necroptosis during development. Nature 540(7631):129–133. https://doi.org/10.1038/nature20559
97. Kuriakose T, Man SM, Malireddi RK, Karki R, Kesavardhana S, Place DE, Neale G, Vogel P, Kanneganti TD (2016) ZBP1/DAI is an innate sensor of influenza virus triggering the NLRP3 inflammasome and programmed cell death pathways. Sci Immunol 1(2). https://doi.org/10.1126/sciimmunol.aag2045
98. Lin J, Kumari S, Kim C, Van TM, Wachsmuth L, Polykratis A, Pasparakis M (2016) RIPK1 counteracts ZBP1-mediated necroptosis to inhibit inflammation. Nature 540(7631):124–128. https://doi.org/10.1038/nature20558
99. Maelfait J, Liverpool L, Bridgeman A, Ragan KB, Upton JW, Rehwinkel J (2017) Sensing of viral and endogenous RNA by ZBP1/DAI induces necroptosis. EMBO J 36(17):2529–2543. https://doi.org/10.15252/embj.201796476
100. Guo H, Gilley RP, Fisher A, Lane R, Landsteiner VJ, Ragan KB, Dovey CM, Carette JE, Upton JW, Mocarski ES, Kaiser WJ (2018) Species-independent contribution of ZBP1/DAI/DLM-1-triggered necroptosis in host defense against HSV1. Cell Death Dis 9(8):816. https://doi.org/10.1038/s41419-018-0868-3
101. Sridharan H, Ragan KB, Guo H, Gilley RP, Landsteiner VJ, Kaiser WJ, Upton JW (2017) Murine cytomegalovirus IE3-dependent transcription is required for DAI/ZBP1-mediated necroptosis. EMBO Rep 18(8):1429–1441. https://doi.org/10.15252/embr.201743947
102. Devos M, Tanghe G, Gilbert B, Dierick E, Verheirstraeten M, Nemegeer J, de Reuver R, Lefebvre S, De Munck J, Rehwinkel J, Vandenabeele P, Declercq W, Maelfait J (2020) Sensing of endogenous nucleic acids by ZBP1 induces keratinocyte necroptosis and skin inflammation. J Exp Med 217(7):e20191913. https://doi.org/10.1084/jem.20191913
103. Jiao H, Wachsmuth L, Kumari S, Schwarzer R, Lin J, Eren RO, Fisher A, Lane R, Young GR, Kassiotis G, Kaiser WJ, Pasparakis M (2020) Z-nucleic-acid sensing triggers ZBP1-dependent necroptosis and inflammation. Nature. https://doi.org/10.1038/s41586-020-2129-8
104. Kesavardhana S, Malireddi RKS, Burton AR, Porter SN, Vogel P, Pruett-Miller SM, Kanneganti TD (2020) The Zα2 domain of ZBP1 is a molecular switch regulating influenza-induced PANoptosis and perinatal lethality during development. J Biol Chem. https://doi.org/10.1074/jbc.RA120.013752
105. Wang R, Li H, Wu J, Cai ZY, Li B, Ni H, Qiu X, Chen H, Liu W, Yang ZH, Liu M, Hu J, Liang Y, Lan P, Han J, Mo W (2020) Gut stem cell necroptosis by genome instability triggers bowel inflammation. Nature 580(7803):386–390. https://doi.org/10.1038/s41586-020-2127-x
106. Newton K, Dugger DL, Wickliffe KE, Kapoor N, de Almagro MC, Vucic D, Komuves L, Ferrando RE, French DM, Webster J, Roose-Girma M, Warming S, Dixit VM (2014) Activity of protein kinase RIPK3 determines whether cells die by necroptosis or apoptosis. Science 343(6177):1357–1360. https://doi.org/10.1126/science.1249361
107. Li D, Chen J, Guo J, Li L, Cai G, Chen S, Huang J, Yang H, Zhuang Y, Wang F, Wang X (2021) A phosphorylation of RIPK3 kinase initiates an intracellular apoptotic pathway that promotes prostaglandin2alpha-induced corpus luteum regression. eLife 10. https://doi.org/10.7554/eLife.67409
108. Yoon S, Kovalenko A, Bogdanov K, Wallach D (2017) MLKL, the protein that mediates necroptosis, also regulates endosomal trafficking and extracellular vesicle generation. Immunity 47(1):51–65 e57. https://doi.org/10.1016/j.immuni.2017.06.001
109. Daniels BP, Kofman SB, Smith JR, Norris GT, Snyder AG, Kolb JP, Gao X, Locasale JW, Martinez J, Gale M Jr, Loo YM, Oberst A (2019) The nucleotide sensor ZBP1 and kinase RIPK3 induce the enzyme IRG1 to promote an antiviral metabolic state in neurons. Immunity 50(1):64–76 e64. https://doi.org/10.1016/j.immuni.2018.11.017
110. Ponnusamy K, Tzioni MM, Begum M, Robinson ME, Caputo VS, Katsarou A, Trasanidis N, Xiao X, Kostopoulos IV, Iskander D, Roberts I, Trivedi P, Auner HW, Naresh K, Chaidos A, Karadimitris A (2021) The innate sensor ZBP1-IRF3 axis regulates cell proliferation in multiple myeloma. Haematol Online. https://doi.org/10.3324/haematol.2020.274480
111. Kahmann JD, Wecking DA, Putter V, Lowenhaupt K, Kim YG, Schmieder P, Oschkinat H, Rich A, Schade M (2004) The solution structure of the N-terminal domain of E3L shows a tyrosine conformation that may explain its reduced affinity to Z-DNA in vitro. Proc Natl Acad Sci U S A 101(9):2712–2717. https://doi.org/10.1073/pnas.0308612100
112. Brandt T, Heck MC, Vijaysri S, Jentarra GM, Cameron JM, Jacobs BL (2005) The N-terminal domain of the vaccinia virus E3L-protein is required for neurovirulence, but not induction of a protective immune response. Virology 333(2):263–270. https://doi.org/10.1016/j.virol.2005.01.006
113. Koehler H, Cotsmire S, Langland J, Kibler KV, Kalman D, Upton JW, Mocarski ES, Jacobs BL (2017) Inhibition of DAI-dependent necroptosis by the Z-DNA binding domain of the vaccinia virus innate immune evasion protein, E3. Proc Natl Acad Sci U S A 114(43):11506–11511. https://doi.org/10.1073/pnas.1700999114
114. Cao H, Dai P, Wang W, Li H, Yuan J, Wang F, Fang CM, Pitha PM, Liu J, Condit RC, McFadden G, Merghoub T, Houghton AN, Young JW, Shuman S, Deng L (2012) Innate immune response of human plasmacytoid dendritic cells to poxvirus infection is subverted by vaccinia E3 via its Z-DNA/RNA binding domain. PLoS One 7(5):e36823. https://doi.org/10.1371/journal.pone.0036823
115. Rothenburg S, Deigendesch N, Dittmar K, Koch-Nolte F, Haag F, Lowenhaupt K, Rich A (2005) A PKR-like eukaryotic initiation factor 2alpha kinase from zebrafish contains Z-DNA binding domains instead of dsRNA binding domains. Proc Natl Acad Sci U S A 102(5):1602–1607. https://doi.org/10.1073/pnas.0408714102
116. Zhang J, Ohta T, Maruyama A, Hosoya T, Nishikawa K, Maher JM, Shibahara S, Itoh K, Yamamoto M (2006) BRG1 interacts with Nrf2 to selectively mediate HO-1 induction in response to oxidative stress. Mol Cell Biol 26(21):7942–7952. https://doi.org/10.1128/MCB.00700-06

Z-DNA and Z-RNA: Methods—Past and Future 117. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Zidek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Petersen S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior AW, Kavukcuoglu K, Kohli P, Hassabis D (2021) Highly accurate protein structure prediction with AlphaFold. Nature 596(7873): 583–589. https://doi.org/10.1038/ s41586-021-03819-2 118. Takeuchi H, Hanamura N, Hayasaka H, Harada I (1991) B-Z transition of poly (dG-m5dC) induced by binding of Lys-containing peptides. FEBS Lett 279(2): 253–255. https://doi.org/10.1016/00145793(91)80161-u 119. Stettler UH, Weber H, Koller T, Weissmann C (1979) Preparation and characterization of form V DNA, the duplex DNA resulting from association of complementary, circular singlestranded DNA. J Mol Biol 131(1):21–40. https://doi.org/10.1016/0022-2836(79) 90299-7 120. Gruskin EA, Rich A (1993) B-DNA to Z-DNA structural transitions in the SV40 enhancer: stabilization of Z-DNA in negatively supercoiled DNA minicircles. Biochemistry 32(9):2167–2176. https://doi.org/10. 1021/bi00060a007 121. Zhang Y, Cui Y, An R, Liang X, Li Q, Wang H, Wang H, Fan Y, Dong P, Li J, Cheng K, Wang W, Wang S, Wang G, Xue C, Komiyama M (2019) Topologically constrained formation of stable Z-DNA from normal sequence under physiological conditions. J Am Chem Soc 141(19): 7758–7764. https://doi.org/10.1021/jacs. 8b13855 122. Chen H, Cheng K, Liu X, An R, Komiyama M, Liang X (2020) Preferential production of RNA rings by T4 RNA ligase 2 without any splint through rational design of precursor strand. Nucleic Acids Res 48(9): e54. https://doi.org/10.1093/nar/gkaa181 123. Beknazarov N, Jin S, Poptsova M (2020) Deep learning approach for predicting functional Z-DNA regions using omics data. Sci Rep 10(1). 
https://doi.org/10.1038/ s41598-020-76203-1 124. Xie KT, Wang G, Thompson AC, Wucherpfennig JI, Reimchen TE, MacColl ADC, Schluter D, Bell MA, Vasquez KM, Kingsley DM (2019) DNA fragility in the parallel evolution of pelvic reduction in stickleback fish.

327

Science 363(6422):81–84. https://doi.org/ 10.1126/science.aan1425 125. Girskis KM, Stergachis AB, DeGennaro EM, Doan RN, Qian X, Johnson MB, Wang PP, Sejourne GM, Nagy MA, Pollina EA, Sousa AMM, Shin T, Kenny CJ, Scotellaro JL, Debo BM, Gonzalez DM, Rento LM, Yeh RC, Song JHT, Beaudin M, Fan J, Kharchenko PV, Sestan N, Greenberg ME, Walsh CA (2021) Rewiring of human neurodevelopmental gene regulatory programs by human accelerated regions. Neuron. https://doi. org/10.1016/j.neuron.2021.08.005 126. Zhang F, Huang Q, Yan J, Chen Z (2016) Histone acetylation induced transformation of B-DNA to Z-DNA in cells probed through FT-IR spectroscopy. Anal Chem 88(8): 4179–4182. https://doi.org/10.1021/acs. analchem.6b00400 127. Li Y, Huang Q, Yao G, Wang X, Zhang F, Wang T, Shao C, Zheng X, Jing X, Zhou H (2020) Remodeling chromatin induces Z-DNA conformation detected through Fourier transform infrared spectroscopy. Anal Chem 92(21):14452–14458. https:// doi.org/10.1021/acs.analchem.0c02432 128. Pupeza I, Huber M, Trubetskov M, Schweinberger W, Hussain SA, Hofer C, Fritsch K, Poetzlberger M, Vamos L, Fill E, Amotchkina T, Kepesidis KV, Apolonski A, Karpowicz N, Pervak V, Pronin O, Fleischmann F, Azzeer A, Zigman M, Krausz F (2020) Field-resolved infrared spectroscopy of biological systems. Nature 577(7788): 52–59. https://doi.org/10.1038/s41586019-1850-7 129. Bailey JA, Carrel L, Chakravarti A, Eichler EE (2000) Molecular evidence for a relationship between LINE-1 elements and X chromosome inactivation: The Lyon repeat hypothesis. Proceedings of the National Academy of Sciences 97(12):6634–6639. https://doi. org/10.1073/pnas.97.12.6634 130. Yu B, Qi T, Li R, Shi Q, Satpathy AT, Chang HY (2021) B cell-specific XIST complex enforces X-inactivation and restrains atypical B cells. Cell 184(7):1790–1803.e17 S0092867421001653. https://doi.org/10. 1016/j.cell.2021.02.015 131. 
Dhir A, Dhir S, Borowski LS, Jimenez L, Teitell M, Rotig A, Crow YJ, Rice GI, Duffy D, Tamby C, Nojima T, Munnich A, Schiff M, de Almeida CR, Rehwinkel J, Dziembowski A, Szczesny RJ, Proudfoot NJ (2018) Mitochondrial double-stranded RNA triggers antiviral signalling in humans. Nature 560(7717):238–242. https://doi.org/10. 1038/s41586-018-0363-0

328

Alan Herbert

132. Sato T, Cesaroni M, Chung W, Panjarian S, Tran A, Madzo J, Okamoto Y, Zhang H, Chen X, Jelinek J, Issa JJ (2017) Transcriptional selectivity of epigenetic therapy in cancer. Cancer Res 77(2):470–481. https://doi. org/10.1158/0008-5472.CAN-16-0834 133. Karki R, Sundaram B, Sharma BR, Lee S, Malireddi RKS, Nguyen LN, Christgen S, Zheng M, Wang Y, Samir P, Neale G, Vogel P, Kanneganti TD (2021) ADAR1 restricts ZBP1-mediated immune response and PANoptosis to promote tumorigenesis. Cell Rep 37(3):109858. https://doi.org/ 10.1016/j.celrep.2021.109858 134. Mehdipour P, Marhon SA, Ettayebi I, Chakravarthy A, Hosseini A, Wang Y, de Castro FA, Loo Yau H, Ishak C, Abelson S, O’Brien CA, De Carvalho DD (2020) Epigenetic therapy induces transcription of inverted SINEs and ADAR1 dependency. Nature 588(7836):169–173. https://doi.org/10. 1038/s41586-020-2844-1 135. Herbert A (2019) ADAR and immune silencing in cancer. Trends Cancer 5(5):272–282. https://doi.org/10.1016/j.trecan.2019. 03.004 136. Moriwaki K, Bertin J, Gough PJ, Orlowski GM, Chan FK (2015) Differential roles of RIPK1 and RIPK3 in TNF-induced necroptosis and chemotherapeutic agent-induced cell death. Cell Death Dis 6:e1636. https:// doi.org/10.1038/cddis.2015.16 137. Zhang T, Yin C, Fedorov A, Qiao L, Bao H, Beknazarov N, Wang S, Gautam A, Williams RM, Crawford JC, Peri S, Studitsky V, Beg AA, Thomas PG, Walkley C, Xu Y, Poptsova M, Herbert A, Balachandran S (2022) ADAR1 masks the cancer immunotherapeutic promise of ZBP1-driven necroptosis. Nature 606(7914):594–602 https://doi.org/10. 1038/s41586-022-04753-7 138. Roth SH, Levanon EY, Eisenberg E (2019) Genome-wide quantification of ADAR adenosine-to-inosine RNA editing activity. Nat Methods 16(11):1131–1138. https:// doi.org/10.1038/s41592-019-0610-9 139. 
Safina A, Cheney P, Pal M, Brodsky L, Ivanov A, Kirsanov K, Lesovaya E, Naberezhnov D, Nesher E, Koman I, Wang D, Wang J, Yakubovskaya M, Winkler D, Gurova K (2017) FACT is a sensor of DNA torsional stress in eukaryotic cells. Nucleic Acids Res 45(4):1925–1945. https://doi.org/10.1093/nar/gkw1366 140. Traczyk A, Liew CW, Gill DJ, Rhodes D (2020) Structural basis of G-quadruplex DNA recognition by the yeast telomeric protein Rap1. Nucleic Acids Res 48(8):

4562–4571. https://doi.org/10.1093/nar/ gkaa171 141. Ellison MJ, Fenton MJ, Ho PS, Rich A (1987) Long-range interactions of multiple DNA structural transitions within a common topological domain. EMBO J 6(5): 1513–1522. https://doi.org/10.1002/j. 1460-2075.1987.tb02394.x 142. Mulholland N, Xu Y, Sugiyama H, Zhao K (2012) SWI/SNF-mediated chromatin remodeling induces Z-DNA formation on a nucleosome. Cell Biosci 2:3. https://doi. org/10.1186/2045-3701-2-3 143. Hui J, Hung L-H, Heiner M, Schreiner S, Neumu¨ller N, Reither G, Haas SA, Bindereif A (2005) Intronic CA-repeat and CA-rich elements: a new class of regulators of mammalian alternative splicing. EMBO J 24(11): 1988–1998. https://doi.org/10.1038/sj. emboj.7600677 144. Solomon O, Oren S, Safran M, DeshetUnger N, Akiva P, Jacob-Hirsch J, Cesarkas K, Kabesa R, Amariglio N, Unger R, Rechavi G, Eyal E (2013) Global regulation of alternative splicing by adenosine deaminase acting on RNA (ADAR). RNA 19(5):591–604. https://doi.org/10.1261/ rna.038042.112 145. Aktas T, Avsar Ilik I, Maticzka D, Bhardwaj V, Pessoa Rodrigues C, Mittler G, Manke T, Backofen R, Akhtar A (2017) DHX9 suppresses RNA processing defects originating from the Alu invasion of the human genome. Nature 544(7648):115–119. https://doi. org/10.1038/nature21715 146. Chakraborty P, Huang JTJ, Hiom K (2018) DHX9 helicase promotes R-loop formation in cells with impaired RNA splicing. Nat Commun 9(1):4346. https://doi.org/10. 1038/s41467-018-06677-1 147. Hasler J, Strub K (2006) Alu elements as regulators of gene expression. Nucleic Acids Res 34(19):5491–5497. https://doi.org/10. 1093/nar/gkl706 148. England P, Molnar P, Richter F (2007) John Perry’s neglected critique of Kelvin’s age for the earth: a missed opportunity in geodynamics. GSA Today 17(1). 4-1052-517317-1-4-25025. https://doi.org/10.1130/ gsat01701a.1 149. Herbert A (2022) Nucleosomes and flipons exchange energy to alter chromatin conformation the readout of genomic information and cell fate. 
BioEssays 44(12):e2200166. https://doi.org/10.1002/bies.202200166 150. McKinney JA, Wang G, Mukherjee A, Christensen L, Subramanian SHS, Zhao J,

Z-DNA and Z-RNA: Methods—Past and Future Vasquez KM (2020) Distinct DNA repair pathways cause genomic instability at alternative DNA structures. Nat Commun 11(1): 236. https://doi.org/10.1038/s41467019-13878-9 151. Lerner LK, Sale JE (2019) Replication of G quadruplex DNA. Genes (Basel) 10(2):95. https://doi.org/10.3390/genes10020095 152. Wang E, Thombre R, Shah Y, Latanich R, Wang J (2021) G-Quadruplexes as pathogenic drivers in neurodegenerative disorders. Nucleic Acids Res 49(9):4816–4830. https://doi.org/10.1093/nar/gkab164

329

153. Stott RT, Kritsky O, Tsai LH (2021) Profiling DNA break sites and transcriptional changes in response to contextual fear learning. PLoS One 16(7):e0249691. https://doi.org/10. 1371/journal.pone.0249691 154. Herrero-Ruiz A, Martinez-Garcia PM, Terron-Bautista J, Millan-Zambrano G, Lieberman JA, Jimeno-Gonzalez S, CortesLedesma F (2021) Topoisomerase IIalpha represses transcription by enforcing promoter-proximal pausing. Cell Rep 35(2): 108977. https://doi.org/10.1016/j.celrep. 2021.108977

INDEX

A
ADAR1 Z-DNA binding region Zα, 54, 59, 60, 70, 159, 160, 164
Adenosine deaminases acting on RNA I (ADAR1), vi, 36–38, 40, 44–46, 54, 59–63, 65, 70, 87, 105, 115, 144, 159, 160, 164, 269, 278, 285–292, 298–302, 304–310, 312–315
2-Aminopurine (2AP), 106
Analytical ultracentrifugation (AUC), 252–254, 256, 259, 261, 264–266, 269–273
A-Z junctions, vi

B
B-DNA, v, 1, 2, 7, 11, 14–19, 21, 22, 24, 25, 33, 35, 36, 43, 45, 46, 53, 54, 56, 70, 77, 81, 90, 97, 98, 105, 107–109, 116, 122, 123, 125, 127, 131, 167, 179–182, 195–198, 200, 203, 211, 227, 231–236, 252, 296–298, 300, 302–305, 309, 311, 315
Bioinformatics, 20, 299, 310
BRG1, 158, 164
B-to-Z DNA transition, v, vi, 38
BZ junction, v, 36, 50, 105–111
B-Z transition, vi, 35, 36, 38, 44, 45, 49, 50

C
CD left-handed DNA, 11, 33, 196
CD spectroscopy, 37, 132
CD spectrum, 8, 14, 18, 25, 27, 34, 35, 37, 38, 43, 44, 46, 121, 124, 131, 132, 134–137, 139, 196, 204, 244, 257, 258, 272
Chemical shift perturbation (CSP), 69
ChIP-Seq, vi, 167–177, 217–219
Chromatin immunoprecipitation (ChIP), vi, 88, 159–161, 163–165, 167, 168, 172–174
Circular dichroism (CD), v, 11, 13, 14, 16, 19, 33–41, 43–50, 118, 121–123, 131–137, 139, 196, 252–254, 256–260, 266, 269, 271, 272, 296, 298, 304
CNN, 221, 223, 224
Conformation-specific nuclease, 143

D
Deep learning, vi, 218, 220, 221
DeepZ, vi, 218, 219, 221–225, 312
DNA cleavage, 154
DNA conformations, 36–38, 57, 132, 180, 297
DNA double-strand break (DSB), 228–231, 234–237
DNA nanotechnology, 242
DNA origami, vi, 241–250
DNA–protein interaction, 77, 123, 126
DNA secondary structures, 131
DNA single-strand break, 236
DNA supercoiling, 180, 190
Dynamics, v, 5, 13, 16, 18, 19, 28, 85–102, 143, 212, 228, 241, 242, 245, 299

E
Epigenetic, 168, 204, 212, 218, 308, 312
Extrusion base structure, 17, 107
Ez score, 259, 260

F
Flipons, 20, 295, 311, 313, 314, 317–320
Fluorescence probe, v, 108
19F NMR probe, 116, 123
FokI, 143, 146, 147, 153

G
Genetic instability, vi, 54, 227–238
Genetics, 6, 17, 19, 167, 180, 195, 228, 282, 295, 296, 299, 307, 310, 313–318, 320

H
Heme oxygenase-1 (HO-1), vi, 157–165, 311
Hexaamminecobalt, 36, 37, 40, 44–46, 50
High-speed atomic force microscopy (HS-AFM), 242, 245–250
High-throughput genomics, 107, 167, 181

Kyeong Kyu Kim and Vinod Kumar Subramani (eds.), Z-DNA: Methods and Protocols, Methods in Molecular Biology, vol. 2651, https://doi.org/10.1007/978-1-0716-3084-6, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023


I
In-cell NMR spectroscopy, 122, 126
Influenza A virus (IAV), vi, 279–282
Infrared (IR) spectroscopy, v, 53, 54
Interferon, 299, 306, 307, 309, 313
Isothermal titration calorimetry (ITC), 252–254, 259, 261–263, 266, 270, 272

L
Left-handed DNA, vi, 13, 18, 21, 24, 296, 297
Linker-mediated PCR (LM-PCR), 229, 230, 234, 235

M
Machine learning (ML), 212, 218, 219, 223, 312
Magnetic tweezers (MT), 85, 86, 88, 89, 93, 95, 96, 298
5-Methylcytosine (5mC), 204

N
Necroptosis, 115, 299, 301, 307–309, 313, 314
255 nm, 35–38, 41, 43–46
295 nm, 35–37, 43, 45, 46, 122, 257–259
Non-B DNA, 144, 179–181
Nuclear magnetic resonance (NMR), v, 15, 28, 37, 38, 69, 70, 73, 75–78, 87, 116, 121–128, 252–256, 263, 266–270, 272, 298, 306, 308

O
Omics data, 219, 223
ORD, 7, 8, 34

P
Parallel-stranded psRRG-A DNA, 26, 28
Phosphorothioate-modified Z-DNA, 9, 22
PKZ, 35, 38, 40, 43, 44, 59, 60, 65, 66, 70, 105, 111, 112, 115, 278, 298–301, 304, 307, 309, 310
Porphyrin, v, 38, 131–140
Potassium permanganate, 181, 186, 187
Protein expression, 147, 150, 154, 159, 290
Protein purification, 64, 147

R
Real-time dynamics, 98
Real-time polymerase chain reaction (real-time PCR), 159–162, 164
Reporter gene, 298, 315
R-L transition of poly[d(G-C)], 1, 9–11, 13, 15, 16, 28, 33
RNA sequencing, 286
RNN, 221, 223

S
Shuttle vector, 228
Single-molecule FRET, 89
Single-molecule observation, 85–102, 241, 242, 246
Spermine, 40, 44–46, 50, 53, 60, 61, 133–135, 137–140
Statistical mechanics, 180, 198, 202
Supramolecular chemistry, 134, 135

T
Time-course measurement, 38, 44–46, 49
Titration, v, 69, 70, 75–77, 79, 80, 82, 138, 154, 256, 257, 259–262, 266–268, 272
Topoisomerase IIα (TOP2A), 21, 22
8-Trifluoromethyl-2’-deoxy-guanosine, v

V
Vaccinia virus, vi, 278, 279, 282

W
Wavelength scan, 35, 38, 41, 49

X
X-ray diffraction, 15, 54, 196

Y
Yeast artificial chromosomes (YACs), 229, 231–233, 236, 238

Z
Z-alpha, 107, 110
Zαα, 144, 146, 147, 152
Zαα-FOK (Zαα-FN), 143–154
Zα domains, 218, 269, 298, 299, 301, 303, 305, 306, 308–310, 315
Z* DNA, 18–20
Z-DNA, 1, 33, 53, 59, 70, 85, 105, 115, 131, 143, 158, 167, 180, 195, 217, 227, 251, 278, 285, 295
Z-DNA binding domain, 54, 59
Z-DNA binding protein 1 (ZBP1), 59, 60, 70, 105, 278, 298–301, 304, 307–310, 312–315
Z-DNA-binding proteins (ZBPs), vi, 20–22, 38, 59, 70, 74–78, 81, 82, 98, 105
Z-DNA crystallization, 59, 62, 63, 298
Z-DNA forming region, 36, 181, 222, 266
Z-DNA forming site (ZFS), 105, 109, 167–169
Z-DNA mechanics, 85–102
Z-DNA structure, v, 37, 115, 116, 122, 127, 182, 190, 196, 218, 309
Zipper model, 198, 200, 211
Z-RNA, vi, 17, 18, 251, 252, 257–260, 263, 268, 271, 277–283, 295–320