English Pages 300 [297] Year 2020
Mahmud Hossain
Selected Reaction Monitoring Mass Spectrometry (SRM-MS) in Proteomics: A Comprehensive View
Mahmud Hossain Sanofi Genzyme Framingham, MA, USA
ISBN 978-3-030-53432-5 ISBN 978-3-030-53433-2 (eBook) https://doi.org/10.1007/978-3-030-53433-2 © Springer Nature Switzerland AG 2020, Corrected Publication 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To my academic mentors Richard D. Smith, PhD Battelle Fellow & Chief Scientist, Biological Sciences Division, Pacific Northwest National Laboratory (PNNL), Richland, WA, USA Professor Fred W. McLafferty Peter J.W. Debye Professor, Emeritus, Dept. of Chemistry, Cornell University, NY, USA Professor Patrick A. Limbach Vice President for Research & Professor of Chemistry, University of Cincinnati, OH, USA Professor J. Ricky Cox Anna S. Brown & Ruth B. Logan Endowed Chair, Dept. of Chemistry, Murray State University, Murray, KY, USA Professor Mustafizur Rahman (in memory of) Department of Biochemistry & Molecular Biology, University of Dhaka, Bangladesh
To my virtual mentors (for targeted proteomics, especially for SRM-MS) Professor Ruedi Aebersold Dept. of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland Steven A. Carr, PhD Senior Director of Proteomics, Institute Scientist, Broad Institute of MIT & Harvard, USA To my lifelong mentors (my parents) M. Amir Hossain & Fatema Begum Hopkinton, Massachusetts, USA
The end of one journey is simply the start of another. You have to see what you missed the first time, see again what you already saw, see in springtime what you saw in summer, in daylight what you saw at night, see the sun shining where you saw the rain falling, see the crops growing, the fruit ripen, the stone which has moved, the shadow that was not there before. You have to go back to the footsteps already taken, to go over them again or add fresh ones alongside them. You have to start the journey anew. Always.
Nobel laureate José Saramago, in his Journey to Portugal
Preface
Mass spectrometry (MS)-based proteomic research, facilitated mostly by the advent of electrospray and MALDI ionization and by significant advances in MS technology, has developed tremendously over the last two decades for both qualitative and quantitative analyses. Two major approaches have been used for this purpose: shotgun (or discovery) and targeted methodologies. In the last decade, selected reaction monitoring mass spectrometry (SRM-MS, also known as multiple reaction monitoring mass spectrometry or MRM-MS) has emerged as the most mature MS-based targeted proteomics platform for the consistent detection and accurate quantification of preselected proteins in complex biological matrices. It uses the capability of triple quadrupole (QqQ) or quadrupole ion trap (QTrap) mass spectrometers to selectively isolate a target peptide ion and to monitor it together with a paired, selected fragment ion. This precursor-fragment pair, known as a transition, identifies and quantifies a peptide and, through inference, the cognate protein in protease digests of a proteome, sub-proteome, tissue, or cell. Once developed, these assays can be applied to wide-ranging biological or biomedical studies across laboratories in a highly multiplexed form, with high sensitivity and great reproducibility, for high-throughput protein detection and quantification over a wide dynamic range of up to five orders of magnitude. In recognition of its significance, Nature Methods declared SRM-MS the “Method of the Year” for 2012. Despite all its competitive advantages, significant effort is needed to develop a highly efficient SRM assay. This effort includes the selection of proteins, proteotypic peptides, and transitions; the optimization of LC-MS parameters; and the validation of transitions, among other steps. These steps are sequential, sometimes lengthy, and iterative, which often limits broader application.
As a biological mass spectrometry researcher of more than a decade, I personally felt the need for a book on SRM-MS that would cover a wide range of aspects of this “gold-standard” targeted proteomics method, including fundamentals, method development, instrumentation, data acquisition and analysis, public data storage and reuse, and future directions. This book mostly discusses SRM-MS in targeted proteomics, although other applications exist for small molecules and other biomolecules. It is aimed at academics and researchers in mass spectrometry and/or the biomedical area, from undergraduate to postgraduate level and beyond, as well as at any general mass spectrometry and/or proteomics enthusiast.
The first two chapters of this book relate to general mass spectrometry. Chapter 1 describes the history of mass spectrometry, from the time of J.J. Thomson to the present, and shows how mass spectrometry, through continued innovation, has always played a distinct role in the contemporary sciences over the years: first in physics and nuclear physics, then in chemistry, and now in biology. Chapter 2 describes the major components of mass spectrometers: ion sources, mass analyzers, detectors, and vacuum systems. It also discusses tandem mass spectrometry and the various dissociation methods in current use, including collision-induced dissociation (CID). Chapter 3 starts with different proteomics approaches and then describes and compares SRM-MS and other targeted MS-based proteomics approaches. Chapter 4 discusses the development of an SRM-MS assay, including the general workflow; the selection of proteins, peptides, and transitions and their validation; assay optimization; and quality control. Chapter 5 describes the available bioinformatics tools, both for preacquisition (method development) and for post-acquisition data analysis. SRM-MS has become a prominent technique for the precise quantification of targeted proteins. Chapter 6 covers relative quantification (label-free and label-based) and absolute quantification with SRM, the latter using AQUA, QconCAT, and PSAQ. Major applications are described in Chaps. 7 and 8. Chapter 7 cites selected proteomics applications in areas such as general proteomics, biomarkers, and the clinic. Chapter 8 discusses posttranslational modifications (PTMs) and the analysis of different PTMs using SRM, with selected examples. The book concludes with Chap. 9, which covers current challenges, contemporary trends, and future directions.
It is neither intended nor possible to describe every topic of SRM-MS in great detail in this book; however, with the chapter-end references and the list of review articles in the bibliography, interested readers will find plenty of resources for further reading. Over the last decade, the field of SRM-MS in proteomics has been shaped, enriched, and guided by the research groups of Ruedi Aebersold, Richard Smith, Steven Carr, Michael MacCoss, Christoph Borchers, and Amanda Paulovich, to name a few. I am deeply honored and privileged to have worked or held discussions with several of these fine scientists. I am a big fan of their research and ideas. Many thanks to Professor M. Waheeduzzaman of Austin Peay State University, TN, my uncle, and his family (Sabina, Nadeem, and Naveed Zaman) for their enduring love and affection for me. It took several years to prepare this manuscript. During this process, my wife Khodeja Fatema Hossain and my daughter Fariha Fardin Hossain provided tremendous encouragement and endless support and, of course, sacrificed their valuable family time for me. I sincerely appreciate the extraordinary help and continued support provided by Maria David (Project Coordinator) and, especially, Merry Stuber (Senior Editor) of Springer Nature. In fact, without their inspiration and reminders, it would probably have taken me forever to complete. Thank you Khodeja, Fariha, Maria, and Merry from the bottom of my heart! Thanks in advance to all my readers. I will be greatly honored if anyone benefits in any way from this book.
Framingham, MA, USA
Mahmud Hossain
Contents
1 Introduction
  1.1 Introduction to Mass Spectrometry
    1.1.1 Mass Spectrometer
    1.1.2 Mass Spectrum
  1.2 General Applications of Mass Spectrometry
    1.2.1 Proteomics
    1.2.2 Other Biological Applications
    1.2.3 Pharmaceuticals
    1.2.4 Clinical
    1.2.5 Environmental
    1.2.6 Geological
  1.3 History of Mass Spectrometry
    1.3.1 The Early Years of Mass Spectrometry
    1.3.2 Use of Mass Spectrometry Pre– and Post–World War II Era
    1.3.3 Beginning of Biomolecule Research by Mass Spectrometry
    1.3.4 Paradigm Shift in Mass Spectrometry Analysis
    1.3.5 The Development of Proteomics
  References
2 The Mass Spectrometer and Its Components
  2.1 Major Components of the Mass Spectrometer
    2.1.1 Ion Source
    2.1.2 Mass Analyzers
    2.1.3 Detectors of Mass Spectrometry
    2.1.4 Other Components of Mass Spectrometer: Vacuum System
  2.2 Tandem Mass Spectrometry (MS/MS)
    2.2.1 MS/MS Instrumentation
    2.2.2 Scan Modes in Tandem Mass Spectrometry
  2.3 Dissociation Methods in Mass Spectrometry
    2.3.1 Collision-Induced Dissociation (CID)
    2.3.2 High-Energy Collisional Dissociation (HCD)
    2.3.3 Electron Capture Dissociation
    2.3.4 Electron Transfer Dissociation (ETD)
  References
3 Selected Reaction Monitoring Mass Spectrometry
  3.1 Proteomics
    3.1.1 Bottom-Up, Top-Down, and Middle-Down Proteomics
    3.1.2 Shotgun/Discovery and Targeted Proteomics
  3.2 Selected Reaction Monitoring Mass Spectrometry (SRM-MS)
    3.2.1 SRM-MS in Comparison to Non-MS Analytical Technique
    3.2.2 Instrumentation for SRM-MS
    3.2.3 Commercial Triple Quadrupole Instruments
    3.2.4 Distinctive Features of SRM-MS
  3.3 Variations of the Conventional SRM-MS
    3.3.1 Constrained SRM-MS
    3.3.2 MRM3
    3.3.3 Dynamic SRM-MS
    3.3.4 Photo-SRM
    3.3.5 MALDI-SRM
    3.3.6 MSIA-SRM
  3.4 Other Targeted MS Methodologies
    3.4.1 Selected Ion Monitoring
    3.4.2 Accurate Inclusion Mass Screening (AIMS)
    3.4.3 pSRM (pseudo-Selected Reaction Monitoring)
    3.4.4 Parallel Reaction Monitoring
    3.4.5 Data-Independent Acquisition (DIA)
    3.4.6 Comparison of SRM with PRM and DIA/SWATH
  References
4 Development of SRM-MS Experiment
  4.1 General Workflow of a Proteomic SRM-MS Assay
  4.2 Selection of Protein
  4.3 Selection of Peptide
    4.3.1 Peptide Selection Criteria
    4.3.2 Peptide Selection Techniques
  4.4 Selection of Transition
    4.4.1 Transition Selection Process
    4.4.2 Validation of Selected Transitions
  4.5 Optimization of Experimental Parameters
  4.6 Assay Development and Data Acquisition
    4.6.1 Development of an SRM-MS Assay Workflow
  4.7 System Suitability Monitoring and Quality Control
  References
5 Bioinformatics Tools for SRM-MS
  5.1 SRM Preacquisition Stage
    5.1.1 MRMaid
    5.1.2 ATAQS
    5.1.3 TIQAM
    5.1.4 MaRiMba
    5.1.5 PChopper
    5.1.6 Picky
    5.1.7 Skyline
  5.2 Databases Related to SRM-MS
    5.2.1 SRMAtlas
    5.2.2 CPTAC Assay Portal
    5.2.3 MRMAssayDB
    5.2.4 ProteomeTools
  5.3 SRM-MS Data Acquisition on LC-MS
  5.4 SRM Postacquisition Stage
    5.4.1 Data Analysis with ATAQS
    5.4.2 MRMer
    5.4.3 Postacquisition Data Analysis with Skyline
    5.4.4 SRMstats and MSstats
    5.4.5 Automated SRM-MS Data Analysis Workflow for Large-Scale Studies
  5.5 Commercial Software from Instrument Vendors
    5.5.1 Pinpoint (ThermoScientific)
    5.5.2 MRMPilot (AB Sciex)
    5.5.3 TargetLynx (Waters)
    5.5.4 MassHunter (Agilent)
  References
6 Quantification by SRM-MS
  6.1 Protein Quantification by SRM-MS
  6.2 Relative Quantification
    6.2.1 Label-Free Approach
    6.2.2 Stable Isotope Labeling Approach
  6.3 Absolute Quantification
    6.3.1 Absolute Quantification (AQUA)
    6.3.2 Quantification Concatemer (QconCAT)
    6.3.3 Protein Standard Absolute Quantification (PSAQ)
  6.4 Calibration Reference Standards
  6.5 Validation of Quantitative SRM Assay
  References
7 SRM-MS Applications in Proteomics
  7.1 General Proteomic Applications
    7.1.1 Applications in Systems Biology/Network Biology
  7.2 Clinical/Biomarkers Applications
    7.2.1 Application on Cancers
    7.2.2 Biomarker Quantification Using Dried Blood Spots (DBS)
  7.3 PSA as a Model Biomarker
  References
8 SRM-MS for Posttranslational Modification Analysis
  8.1 Posttranslational Modifications (PTMs) of Proteins
    8.1.1 Posttranslational Modification
    8.1.2 Determination of Posttranslational Modifications
    8.1.3 PTM Analysis by Mass Spectrometry
  8.2 Analysis of Phosphorylation
  8.3 Analysis of Ubiquitination
  8.4 Analysis of Glycosylation
    8.4.1 Analytical Strategy of Protein Glycosylation by Mass Spectrometry
    8.4.2 Protein Glycosylation by SRM-MS
  8.5 Analysis of Acetylation
  References
9 Challenges, Current Trends, and Future Directions
  9.1 Sensitivity Issue
    9.1.1 Improvement in Sample Preparation
    9.1.2 Utilization of Various Separation Methods
    9.1.3 Advances in MS Instrumentation
  9.2 Multiplexing Capability Issue
  9.3 Reproducibility Issue
  9.4 Global SRM-MS Data Repositories
  9.5 Outlook
  References
Correction to: Selected Reaction Monitoring Mass Spectrometry (SRM-MS) in Proteomics
Glossary of SRM-MS-Related Terms
Bibliography
Index
Abbreviations
AIMS  Accurate inclusion mass screening
ASMS  American Society for Mass Spectrometry
AUC  Area under curve
AP  Affinity purification
AQUA  Absolute QUAntification
ATAQS  Automated and targeted analysis with quantitative SRM
CAD  Collision activated dissociation
CID  Collision induced dissociation
CE  Collision energy
COM  Center-of-mass
CPTAC  Clinical Proteomic Tumor Analysis Consortium
CV  Coefficient of variation
Da  Dalton
DDA  Data-dependent acquisition
DIA  Data-independent acquisition
EI  Electron ionization; Electron impact
eV  Electron volt
ECD  Electron capture dissociation
EDD  Electron detachment dissociation
ETD  Electron transfer dissociation
EID  Electron ionization dissociation
ELISA  Enzyme-linked immunosorbent assay
ESI  Electrospray ionization
FAB  Fast atom bombardment
FAIMS  High-field asymmetric waveform ion mobility spectrometry
FWHM  Full width at half maximum
FDR  False discovery rate
FT-ICR  Fourier-transform ion cyclotron resonance
GPMDB  Global Proteome Machine Database
HCD  Higher energy C-trap dissociation
HPLC  High performance liquid chromatography
HR/AM  High resolution/accurate mass
HUPO  Human Proteome Organization
IR  Infrared
ICAT  Isotope-coded affinity tag
IS  Internal standard
iSRM  Intelligent SRM
iTRAQ  Isobaric tag for relative and absolute quantification
kDa  Kilodalton
LC  Liquid chromatography
LDI  Laser desorption ionization
LOD  Limit of detection
LOQ  Limit of quantification
m  Mass of a molecule or a compound
MALDI  Matrix-assisted laser desorption ionization
MassIVE  Mass spectrometry Interactive Virtual Environment
MRM3  Multiple reaction monitoring cubed
MS  Mass spectrometry
MS/MS  Tandem mass spectrometry
MSIA  Mass spectrometric immunoassay
MudPIT  Multidimensional Protein Identification Technology
NanoESI  Nanoelectrospray ionization
NCBI  National Center for Biotechnology Information
NIST  National Institute of Standards and Technology
PTM  Posttranslational modification
PRIDE  PRoteomics IDEntification database
PRM  Parallel reaction monitoring
PRISM  High pressure, high-resolution separations with intelligent selection and multiplexing
PSAQ  Protein standard absolute quantification
PASSEL  PeptideAtlas SRM experiment library
PX  ProteomeXchange
PSM  Peptide spectrum match
PSA  Prostate specific antigen
PTP  Proteotypic peptide
QconCAT  Artificial concatemer of standard (Q) peptides
QqQ  Triple quadrupole
QToF  Quadrupole time-of-flight
QTrap  Quadrupole ion trap
Rf  Radio frequency
RP  Reversed phase
RT  Retention time
SCX  Strong cation exchange chromatography
SIM  Selected ion monitoring
SISCAPA  Stable isotope standards and capture by anti-peptide antibodies
SILAC  Stable isotope labeling by/with amino acids in cell culture
SID  Stable isotope dilution
SIL  Stable isotope-labeled
SIS  Stable isotope-labeled internal standard
S/MRM  Selected/multiple reaction monitoring
S/MRM-MS  Selected/multiple reaction monitoring-mass spectrometry
SNP  Single nucleotide polymorphism
SPE  Solid phase extraction
SWATH  Sequential windowed acquisition of all theoretical fragmentation spectra
TMT  Tandem mass tag
ToF  Time-of-flight
TPP  Trans-Proteomic Pipeline
UPLC  Ultra performance liquid chromatography
UV  Ultraviolet
z  Charge state of the ion
Chapter 1
Introduction
1.1 Introduction to Mass Spectrometry
Mass spectrometry (MS) fundamentally measures the mass-to-charge ratio (m/z) of ions. From this ratio, the molecular weight can be determined and used for the identification and quantification of molecules, even in a very complex mixture. MS is now established as an essential tool for the analysis of chemical and biological compounds. It differs from other analytical tools in several ways. Unlike in spectroscopy instruments, the analytes in mass spectrometry do not absorb any radiation (infrared (IR), ultraviolet (UV), or radio waves) from the electromagnetic spectrum; and unlike IR, Raman, and nuclear magnetic resonance (NMR) spectroscopy, it is a destructive analytical method in which analytes are unrecoverable after analysis (Smith 2004). The competitive advantages of mass spectrometry stem mainly from its high sensitivity and selectivity, high throughput, very low sample consumption, and versatility across a wide range of fields and applications.
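As a minimal illustration of how a measured m/z relates to molecular mass, the Python sketch below converts between the two. It assumes positive-ion mode in which every charge comes from one added proton, giving an [M + zH]z+ ion. That assumption, the function names, and the example mass of 1000 Da are illustrative choices and are not taken from the text.

```python
# Sketch only: assumes each charge is carried by one added proton ([M + zH]z+).
PROTON_MASS = 1.007276466  # mass of a proton in Da


def mz_from_mass(neutral_mass, charge):
    """Return the m/z of a protonated ion for a neutral mass (Da) and charge z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz, charge):
    """Return the neutral mass (Da) implied by an observed m/z and charge z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mz * charge - charge * PROTON_MASS


if __name__ == "__main__":
    # The same hypothetical 1000 Da molecule appears at a different m/z
    # for each charge state.
    for z in (1, 2, 3):
        print(z, round(mz_from_mass(1000.0, z), 4))
```

The same neutral mass therefore gives a different m/z for every charge state, which is why the charge state must be known before a molecular weight can be read from a spectrum.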
1.1.1 Mass Spectrometer
The mass spectrometer is the instrument used in mass spectrometry to measure the m/z values and the relative abundances of ions. All mass spectrometers have three basic components in common: an ion source, a mass analyzer, and a detector. A schematic is shown in Fig. 1.1. Because mass spectrometry measures the mass-to-charge ratio, the molecules first need to be converted into gas-phase ions; the ion source produces these ions by capture or loss of electrons or protons. In the mass analyzer, the ions are separated according to their m/z values by magnetic or electric fields. Ions of a specific m/z value then reach the detector, where a current signal is produced and recorded. In the case of the Orbitrap mass analyzer, the ions repeatedly pass the detector plates and produce an induction current, or image current, which can be detected and subsequently converted to a frequency spectrum using the Fourier transform (FT).
Fig. 1.1 General scheme of a mass spectrometer
© Springer Nature Switzerland AG 2020. M. Hossain, Selected Reaction Monitoring Mass Spectrometry (SRM-MS) in Proteomics, https://doi.org/10.1007/978-3-030-53433-2_1
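The conversion of an image current into a frequency spectrum can be sketched with a toy example. The Python code below simulates a transient as a sum of two cosine waves (standing in for two ion populations oscillating at different frequencies) and recovers those frequencies with a plain discrete Fourier transform. The frequencies, amplitudes, sampling rate, and function names are all hypothetical and far from real instrument values, and the sketch stops at the frequency spectrum without attempting any calibration to m/z.

```python
import cmath
import math


def simulate_transient(freqs_hz, amplitudes, sample_rate_hz, n_samples):
    """Toy image current: a sum of cosines, one per ion population."""
    return [
        sum(a * math.cos(2 * math.pi * f * n / sample_rate_hz)
            for f, a in zip(freqs_hz, amplitudes))
        for n in range(n_samples)
    ]


def magnitude_spectrum(signal):
    """Naive discrete Fourier transform; returns magnitudes below Nyquist."""
    n = len(signal)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        mags.append(abs(acc) / n)
    return mags


def top_frequencies(signal, sample_rate_hz, count):
    """Return the `count` strongest frequency components, in ascending order."""
    mags = magnitude_spectrum(signal)
    n = len(signal)
    strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)
    return sorted(k * sample_rate_hz / n for k in strongest[:count])


if __name__ == "__main__":
    transient = simulate_transient([50.0, 120.0], [1.0, 0.5], 1000.0, 200)
    print(top_frequencies(transient, 1000.0, 2))
```

A production implementation would use a fast Fourier transform rather than this quadratic-time loop; the naive form is used here only to keep the sketch free of external libraries.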
1.1.2 Mass Spectrum
A mass spectrum (plural, spectra) is a two-dimensional plot of ion signal intensity (y-axis) as a function of m/z (x-axis), in which each signal, typically known as a peak, represents the abundance of the ionic species detected at the corresponding m/z (Fig. 1.2). Peaks with lower m/z values appear first in the mass spectrum, followed by those with higher values. The peak with the highest intensity is known as the base peak. The intensity of each peak can be represented as an absolute amount, or as a relative percentage (%) normalized either to the base peak or to the sum of all peak intensities. In a spectrum, the ion signal can be represented either as profile data, where a peak is generated by a collection of signals over multiple mass spectrometry scans, or as bar-graph-type centroid data, where the signals are displayed as distinct m/z values, each a very narrow line representing a local maximum in a single spectrum, as shown in Fig. 1.3. The m/z value is always determined from the peak apex in the profile data. Both spectrum types have their own advantages and limitations. With profile data, it is convenient to separate the analyte signal from the instrument noise, whereas with centroid data, the acquired file is significantly smaller than with profile data, albeit with some information loss (Smith et al. 2014). A spectrum can also be converted to a tabular form in which only the m/z and peak intensity values are listed. A mass spectrum may contain a molecular ion (M+•) peak, derived from the intact molecular ion and having the highest m/z value; fragment ion peak(s), caused by fragmentation of a specific molecular ion; or both, along with several noise peaks originating from the instrument and/or from the sample matrix.
Fig. 1.2 A typical mass spectrum
Fig. 1.3 Profile (a) and centroid (b) data of the same mass spectrum. (From Smith et al. (2014), Springer Nature, Copyright exception)
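The two intensity normalizations and the reduction of profile data to centroid data can be sketched in a few lines of Python. The sketch below is a deliberate simplification: it picks each local maximum (peak apex) of a profile as the centroid, whereas real software typically also filters noise and may use intensity-weighted averaging. The function names and all m/z and intensity values are hypothetical.

```python
def relative_to_base_peak(intensities):
    """Express each intensity as a percentage of the base (most intense) peak."""
    base = max(intensities)
    return [100.0 * i / base for i in intensities]


def relative_to_total(intensities):
    """Express each intensity as a percentage of the summed intensity."""
    total = sum(intensities)
    return [100.0 * i / total for i in intensities]


def centroid(profile):
    """Reduce profile data to centroid data by simple apex picking.

    `profile` is a list of (m/z, intensity) points sorted by m/z.
    A point is kept when it is a local maximum with non-zero intensity.
    """
    peaks = []
    for i in range(1, len(profile) - 1):
        left = profile[i - 1][1]
        mid = profile[i][1]
        right = profile[i + 1][1]
        if mid > left and mid >= right and mid > 0:
            peaks.append(profile[i])
    return peaks


if __name__ == "__main__":
    print(relative_to_base_peak([20.0, 50.0, 200.0, 30.0]))
    profile = [
        (500.00, 0.0), (500.01, 40.0), (500.02, 100.0), (500.03, 35.0),
        (500.04, 0.0), (500.05, 10.0), (500.06, 60.0), (500.07, 12.0),
        (500.08, 0.0),
    ]
    print(centroid(profile))
```

In this toy profile, nine data points collapse to two centroids, which illustrates why centroid files are much smaller and also why peak-shape information is lost.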
1.2 General Applications of Mass Spectrometry
Mass spectrometry is a powerful technique with diverse applications in chemistry, physics, biology, and other disciplines of science. It is now used extensively in both industry and academia for routine and research purposes (Hassan 2012). With improvements in sample preparation methodology, the advent of high-resolution instruments with increased sensitivity, and the automation of data analysis, mass spectrometry has opened new avenues for the analysis of more complicated biological systems than ever before (Guengerich 2011). The importance of mass spectrometry to biological research was recognized by the award of the 2002 Nobel Prize in Chemistry to John Fenn and Koichi Tanaka for their development of soft desorption ionization methods for the analysis of biological macromolecules (www.nobel.se/chemistry). Some selected current applications of mass spectrometry are discussed below.
1.2.1 Proteomics Today one of largest applications of mass spectrometry is in proteomics. MS helps to identify unknown protein(s) and determine its sequence and higher-order structure, function, interactions with other protein/molecules, various post-translational modifications (PTMs), and relative and absolute quantification even in the complex matrices. The proteomics of human, other animals, plants, and microbes can be analyzed with mass spectrometry—at their protein level (top-down proteomics), or at the peptide level (bottom-up proteomics), or in between (middle-down approach).
1.2.2 Other Biological Applications Other biomolecules, for example, lipids, carbohydrates, oligonucleotides, and metabolites, are also routinely analyzed with mass spectrometry.
1.2.3 Pharmaceuticals Mass spectrometry applications in the pharmaceutical industry and research include drug discovery and development, drug metabolism, combinatorial chemistry, and pharmacokinetics. Recently, MS applications in the area of biopharmaceuticals have been increasing rapidly, and MS has been used in various phases of protein therapeutics development. Its use already extends to the analysis of primary sequence, intra- and inter-molecular disulfide bonds, carbohydrate structure and profile, and post-translational modifications, to in-process and in-storage characterization, and to the determination of other quality attributes in biologics and biosimilars.
1.2.4 Clinical Clinical applications include forensic and drug testing. The discovery, verification, and even validation of biomarkers is another area for mass spectrometry.
1.2.5 Environmental Mass spectrometry is routinely used for polycyclic aromatic hydrocarbons (PAHs) and polychlorinated biphenyls (PCBs) analyses, for water quality analysis, and for analysis of various pesticides in food.
1.2.6 Geological Mass spectrometry has already made significant progress in the analysis of petroleum composition and in carbon dating.
1.3 History of Mass Spectrometry Today, the capability of mass spectrometry to identify and quantify thousands of biomolecules accurately in a very complex bio-matrix makes it an indispensable tool in bioanalysis. The applications of MS continue to broaden across biology and medicine. This has not always been the case for this century-old technology. In fact, mass spectrometry has its roots in nuclear physics and chemistry. Its journey was started by its inventor J.J. Thomson, an English physicist, in the early twentieth century, for measuring the masses of atoms and discovering stable isotopes (Thomson 1897). In the early 1940s, during World War II, the mass spectrometer, with its improved design, was used in the Manhattan Project for the purification and evaluation of uranium, 235U (Nier 1940). Another use of mass spectrometry at that time was in petroleum process monitoring for quality fuel, which extended after the war to the characterization of petroleum and the development of petroleum-based products. In the early 1960s, with growing knowledge of gas-phase ionization and fragmentation processes, the use of the mass spectrometer extended to the characterization of natural products and small synthetic compounds. By the late 1960s, chemical ionization (CI) and gas chromatography (GC) coupled with MS had arrived, and in the mid-1970s the triple quadrupole (QqQ) mass spectrometer brought peptide sequencing into full swing. However, a tremendous opportunity appeared in the 1980s with the introduction of novel ionization techniques, namely fast atom bombardment (FAB), electrospray ionization (ESI), and matrix-assisted laser desorption/ionization (MALDI), for the direct analysis of large biomolecules, which ultimately paved the way for today’s vast range of applications in cell biology, in multiple omics fields, and in many more areas to come.
1.3.1 The Early Years of Mass Spectrometry The origin of mass spectrometry is linked to studies of the transmission of electricity through gases, an exciting area of research in the late nineteenth century because of its bearing on the fundamental aspects of atomic and molecular structure. In 1886, Eugen Goldstein, a German physicist working at the Berlin Observatory, identified “canal rays” (Kanalstrahlen, in German) by passing an electric discharge through a gas-filled, low-pressure Crookes discharge tube. Unlike the negatively charged cathode rays already known at that time, these rays were emitted from the anode and traveled toward the cathode, and they were unaffected by weak magnetic fields positioned near the discharge tube. He concluded that the canal rays were composed of positively charged particles (Grayson 2002). In 1898, another German physicist, Wilhelm Wien, was able to deflect the canal rays with strong parallel electric and magnetic fields. He established that these rays carried a positive electric charge and found that particles with different charge-to-mass (e/m) ratios followed different parabolic curves and that the e/m values depended on the type of gas in the discharge tube (Grayson 2002). Enthused by the experimental results of Goldstein and Wien, Joseph John (J.J.) Thomson (Fig. 1.4), a physicist in the Cavendish Laboratory at the University of Cambridge, started his own research on the transmission of electricity through gases at lower pressure. Thomson, born in Manchester in 1856, entered Trinity College in Cambridge as a minor scholar in 1876, and became a Fellow of Trinity College in 1880, lecturer in 1883, and Master in 1918. He was the Cavendish Professor of Experimental Physics at the University of Cambridge from 1884 to 1918 (Thomson-bio n.d.). Throughout his career, Thomson attempted to discover the fundamental structure of matter and its basic building blocks (Sharma 2013).
Fig. 1.4 Joseph John (J.J.) Thomson (Reprinted with permission from Sharma (2013))
Thomson, with his cathode ray tube, made by his lab assistant and glass-blower Ebenezer Everett, was able to measure e/m (charge-to-mass ratio) and e of the electron, and ultimately deduced the mass of the electron (Thomson 1897). He was awarded the 1906 Nobel Prize in Physics for his investigations of the conduction of electricity by gases, work that included this discovery of the electron. His initial work on cathode rays eventually built the basis of the field of mass spectrometry. From 1905 to 1914, Thomson focused his research on the positively charged canal rays. In his positive ray parabola apparatus, the mass spectrograph, ions generated in gas discharge tubes passed through parallel magnetic and electric fields, were deflected into parabolic trajectories, and were ultimately detected on a photographic plate (Fig. 1.5). He then measured the parabolas on the photograph and calculated the corresponding charge-to-mass ratios of the particles using mathematical equations (Thomson 1910). With his mass spectrograph, Thomson was able to identify H+ and H2+ from hydrogen; O+ and O2+ from oxygen; and C+, O+, Cl+, CO+, Cl2+, and COCl2+ from the polyatomic molecule COCl2. He also recorded ions generated from hydrocarbon molecules (Grayson 2002). Later, Thomson replaced the photographic plate of his instrument with a slit and a Faraday cup behind it, because the photographic plate had some sensitivity bias for heavy atoms, which made the quantitative determination of isotope abundance challenging. Ions of various e/m were focused sequentially onto this parabolic slit by varying the strength of the magnetic field, and the ion current was then measured as the ions hit the second metal plate attached to an electroscope. A graph of ion abundance versus mass was generated, in which peak intensity was directly related to the abundance of the corresponding ion (Grayson 2002). This was, by far, the earliest of the mass spectrometers built by Thomson and his lab associates, and the journey of the field of mass spectrometry had just begun.
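The parabolic traces described above follow from elementary mechanics, and a short derivation may help. It is a sketch under stated assumptions rather than Thomson's own working: the fields E and B are taken as uniform and parallel over a region of length L, followed by a field-free drift of length D (the symbols used in Fig. 1.5), the deflections are assumed small, and the symbols v (ion speed), y_E (electric deflection), and x_B (magnetic deflection) are notation introduced here.

```latex
\begin{align}
y_E &= \frac{e E}{m v^{2}}\, L\left(\frac{L}{2} + D\right), \\
x_B &= \frac{e B}{m v}\, L\left(\frac{L}{2} + D\right), \\
x_B^{2} &= \frac{e}{m}\,\frac{B^{2}}{E}\, L\left(\frac{L}{2} + D\right) y_E .
\end{align}
```

Eliminating the unknown speed v between the first two expressions gives the third: ions of a given e/m fall on one parabola whatever their speed, and the curvature of that parabola gives e/m once E, B, L, and D are known.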
After joining Thomson’s lab, Francis William Aston (Nobel Laureate in Chemistry, 1922) focused his research on the detection and proof of existence of stable isotopes (Aston 1920). Aston redesigned and built a series of three mass spectrometers, in 1919, 1925, and 1937, with increased intensity of the signal and
Fig. 1.5 A schematic of the positive ray parabola apparatus of Thomson. Here, ions enter at the left, travel through the deflection region of L with an electric field E and a magnetic field B and pass through the field-free region of D before being detected with a photographic plate (Reprinted with permission from Sharma (2013))
with improved resolving power that could decisively differentiate the two isotopes of neon: 20Ne and 22Ne. He utilized a gas discharge tube as an ion source. A very narrow, well-aligned ion beam was produced using two narrow slits, and the ions were deflected first with a parallel plate deflector and then with a magnetic field. The combination of both deflections brought the ions to a position on the focal plane of the instrument that depended on the charge-to-mass ratio of the ion but was independent of the ion energy (Sharma 2013). By this time, Aston had reported stable isotopes of more than 50 elements, and one of the major applications of mass spectrometry in that period was the isotope analysis of the stable elements. Physicist Arthur Dempster of the University of Chicago, a contemporary of Aston, reported the discovery of three isotopes of magnesium in 1920 with his newly built mass spectrometer, also known as a magnetic sector analyzer. This instrument configuration deflected the rays by 180° with a strong magnetic field that focused the rays of a given m/z through a narrow slit, and these were then detected with an electrometer in real time. Dempster’s mass spectrometer became the basis of later commercially developed instruments (Jarchum 2015). In 1935, he reported the isotopic analysis of the last four elements that remained to be analyzed: platinum, palladium, gold, and iridium. To obtain the mass spectra of these elements, he developed a new type of ion source in which a high-frequency spark was discharged between two electrodes made of the sample material to be analyzed; this was the progenitor of the spark source for ionization of solid metals (Grayson 2002). By 1935, Aston, Dempster, and other scientists had completed the isotope analysis of almost all of the elements listed in the periodic table of the time.
In 1929, Walker Bleakney of Princeton University developed a method of positive ray analysis and was able to measure the first four ionization energies of mercury, where he heated a tungsten wire filament to generate electron flow, and then used a magnetic field to focus the electrons into a narrow beam (Bleakney 1929). This technique, known as electron impact or electron ionization (EI), later became the gold standard ionization method for a long time (Finkelstein 2015).
1.3.2 Use of Mass Spectrometry in the Pre– and Post–World War II Era In the 1940s, more sophisticated mass spectrometers began to be built. Alfred Nier, a physicist and electrical engineer at the University of Minnesota at that time, designed and built several mass spectrometers, including the revolutionary Nier-Johnson mass spectrometer, developed with his colleague E.G. Johnson, which combined electrostatic and magnetic analyzers in a unique configuration. He also introduced a compact but efficient sector-field mass spectrometer that had a lighter electromagnet, consumed less power, and could readily meet the growing demand in industry and in academia for various isotope and gas analyses. With his improved instrument, in 1939, Nier reported third and
very rare isotope of uranium, 234U (238U having been reported previously by Aston, and 235U by Dempster), and later, with Enrico Fermi and John Dunning, showed that the uranium isotope fissionable by slow neutrons was 235U. During World War II, Nier constructed mass spectrometer-based apparatus to detect leaks in the gaseous diffusion equipment used to enrich 235U in the Manhattan Project. He also discovered the 13C isotope and enriched it for use as a tracer in studies of bacterial metabolism, and he developed the mass spectrometric measurement of the 207Pb/206Pb ratio in the planet’s crust for determining the age of the Earth (Griffiths 2008). After World War II, the use of mass spectrometry extended from academia to industry with more compact, more efficient, and higher-resolving-power instruments. Commercial mass spectrometers were also available at that time, both from Westinghouse Electric and Manufacturing Company and from Consolidated Engineering Corporation (CEC), mostly based on the designs of Alfred Nier. The petroleum industry increasingly used mass spectrometry for the development and characterization of oil and oil-based products and processes, which also led to fundamental studies of gas-phase ionization and fragmentation mechanisms.
1.3.3 Beginning of Biomolecule Research by Mass Spectrometry In the early 1960s, Klaus Biemann, a professor of organic analytical chemistry at the Massachusetts Institute of Technology (MIT), started to exploit mass spectrometry to elucidate the structure of natural products. His graduate studies on organic synthesis in Austria and his postdoctoral research at MIT on alkaloids and peptides made him interested in verifying the structure of unknown natural compounds by mass spectrometry. He was also inspired by the research of William Stahl, who was using mass spectrometry at that time to identify small organic compounds by comparison with a spectral database (Griffiths 2008). Biemann was able to fragment several compounds with electron ionization mass spectrometry and deduce their structures. In the electron ionization process, a beam of energized electrons bombards the volatilized molecules. He also developed fragmentation rules and patterns for alkaloids and small peptides (Biemann et al. 1959). A detailed understanding of the mechanisms of fragmentation pathways started to develop at this time, and site-specific stable isotope labeling was used in this intricate work. Carl Djerassi, a researcher at Stanford University, was inspired and helped by Biemann to utilize mass spectrometry for his research on steroids, alkaloids, and terpenoids. During the same period, Fred W. McLafferty was working with mass spectrometry at Dow Chemical Co. Later he moved to academia, first to Purdue University and then to Cornell University. He was mostly interested in the development of mass spectrometry instrumentation and methodology. Unlike Biemann and some other contemporary researchers, he was working on known chemical compounds,
and matched mass spectra to their known structures. In this way he deduced the mass spectrometric fragmentation mechanisms of various organic compounds, one of which is the “McLafferty rearrangement” (Gilpin and McLafferty 1957), a name coined by Djerassi. By the late 1960s, the chemical ionization (CI) method had been developed by Burnaby Munson and Frank Field. In this soft ionization technique, gas-phase ion-molecule reactions occur between a reagent ion (for example, CH5+ formed from methane) and the volatilized analyte. Unlike electron ionization, CI produces little fragmentation (Munson and Field 1966). During the 1960s, gas chromatography-mass spectrometry (GC-MS), the first hyphenated MS technique, was also developed, in which analytes are separated by chromatography and detected online by the mass spectrometer (Gohlke and McLafferty 1993). From the 1970s, peptide sequencing by mass spectrometry started to develop, with several challenges. Although polyamino alcohol- or permethyl-derivatized peptides could be detected by mass spectrometry, the approach was limited by the multiple derivatization steps, by low detection sensitivity, and, for larger peptides, by poor volatility (Yates III 2011). In 1932, Ernest Lawrence invented the particle accelerator known as the cyclotron, in which charged particles follow an outward spiral in a static magnetic field and are accelerated by a rapidly changing radiofrequency (rf) field. He was awarded the Nobel Prize in Physics in 1939 for this invention (Georgescu 2015). Later, John Hipple and co-workers combined their knowledge of the magnetic sector mass spectrometer with the principle of cyclotron acceleration to develop a new instrument, the omegatron, in which they tuned the frequency of an additional rf field into resonance with the cyclotron frequency to accelerate only the ions of a specific m/z.
The ions moved along the exact outward-spiraling trajectory to hit the detector, and this technique is known as ion cyclotron resonance (ICR) mass analysis (Hipple et al. 1949). Later, in 1974, Alan Marshall and Melvin Comisarow measured the image current induced by the ions in the detector plates, rather than directly detecting the charged particles. Turning off the rf excitation allowed ions of several m/z values to rotate at their cyclotron frequencies while they repeatedly passed the detector plates, producing a free induction current that could be detected and then converted to a frequency spectrum using the Fourier transform (FT). The technique became known as Fourier transform ICR, or simply FTICR (Comisarow and Marshall 1974). The triple quadrupole (QqQ) mass spectrometer (Yost and Enke 1978), developed in the 1970s, was one of the major instruments for small molecule and biomolecule analyses. Although this tandem MS technique was used extensively for small molecules, Donald Hunt started to use it for peptide sequencing, where a peptide ion, or a mixture of ions, within a certain m/z range was selected in the first quadrupole and fragmented by collision-induced dissociation (CID) in the second quadrupole, which served as a collision cell, and the product ions were then analyzed in the third quadrupole. In fact, with the advent of the triple quadrupole instrument, the selected reaction monitoring mass spectrometry (SRM-MS), or multiple reaction monitoring mass spectrometry (MRM-MS), technique was developed, which revolutionized the area of mass spectrometry quantification, initially of small molecules, and later of
biomolecules including peptides. This initial work on peptide sequencing by fragmentation with tandem MS technique ultimately paved the way for the current-day proteomics research.
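The ICR principle described in this section rests on the fact that, in a fixed magnetic field, the cyclotron frequency depends only on the m/z of the ion. The following minimal sketch evaluates the standard unperturbed relation f = zeB/(2πm). The function name, the 7 T field, and the m/z values are assumptions chosen for illustration, and trapping-field and space-charge effects in a real cell are ignored.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs (exact SI value)
DALTON_KG = 1.66053906660e-27           # kilograms per dalton (CODATA 2018)


def cyclotron_frequency_hz(mz, b_tesla):
    """Unperturbed cyclotron frequency f = z*e*B / (2*pi*m) for an ion of given m/z."""
    mass_per_charge = mz * DALTON_KG / ELEMENTARY_CHARGE_C  # kg per coulomb
    return b_tesla / (2.0 * math.pi * mass_per_charge)


# Assumed example values: a 7 T magnet and ions of m/z 500, 1000, and 2000.
for mz in (500.0, 1000.0, 2000.0):
    print(f"m/z {mz:7.1f} -> {cyclotron_frequency_hz(mz, 7.0) / 1e3:8.2f} kHz")
```

For the assumed 7 T field, an ion of m/z 1000 orbits at roughly 107 kHz, and doubling the m/z halves the frequency. This inverse relationship is what lets a Fourier transform of the image current be read as a mass spectrum.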
1.3.4 Paradigm Shift in Mass Spectrometry Analysis In the 1980s, the introduction of three new ionization techniques revolutionized the field of mass spectrometry. In 1981, Michael Barber introduced the fast atom bombardment (FAB) method (Barber et al. 1981). It is a soft ionization method that requires no derivatization. In FAB, fast-moving atoms with ~8 keV of energy hit the surface of a glycerol matrix and sputtered both glycerol and sample molecules into the gas phase. This extended the mass range of compounds that could be analyzed and improved peptide sequence determination, but the method was limited to compounds of about 1000 daltons. Very little ion fragmentation occurred with the FAB method, and it also required an offline chromatographic separation procedure. Later, the other two new ionization techniques, matrix-assisted laser desorption/ionization (MALDI, developed by Michael Karas and Franz Hillenkamp (1988)) and electrospray ionization (ESI, developed by John Fenn et al. (1990)), transformed the area of biological mass spectrometry. Laser desorption (LD) was introduced during the 1980s as an effective method for producing gas-phase ions from biomolecules. However, obtaining a useful mass spectrum depended critically on the physical nature of the analytes, and biomolecules with masses over ~1000 Da produced mostly fragment ions. This LD-based approach did not become a useful technique for high molecular weight biological molecules until 1987, when the introduction of an excess of a light-absorbing compound, known as the matrix, mixed with the analytes of interest changed this situation. The addition of ultrafine metal powder to an analyte solution in glycerol (suspension matrices developed by Tanaka et al. (1988)) and the co-crystallization of the analyte with an organic small molecule solution (chemical matrices developed by Karas and Hillenkamp (1988)) revolutionized the area of mass spectrometry by producing spectra of proteins of ~100,000 Da. In the late 1980s, another ionization method, electrospray ionization (ESI), developed by John Fenn of Yale University, revolutionized the area of mass spectrometry by enabling the analysis of large biological macromolecules (Fenn et al. 1990). In 2002, he was awarded the Nobel Prize in Chemistry along with Koichi Tanaka (Nobel Prize in Chemistry 2002). Research on electrospray, in fact, had started long before that. In 1968, Malcolm Dole of Northwestern University reported that large gas-phase ions could be generated by electrospraying a dilute solution into an evaporation chamber containing nitrogen and evaporating the volatile solvent from the resulting tiny droplets. Dole streamed a dilute polymer solution through a needle held at high voltage relative to the spray chamber wall; upon electrospraying into the evaporation chamber, the polymer ions were produced in the ambient gas by the interplay of evaporation and electrostatic repulsion (Dellisanti
2015). However, Dole faced several challenges, including the unavailability at that time of mass spectrometers that could detect large biological macromolecules, as well as re-solvation of ions: a substantial temperature drop occurs during adiabatic expansion, and the electrosprayed droplets rapidly saturate with solvent vapor. Therefore, the large gas-phase ions in his experiments were re-solvated and gave masses that were higher than the actual values by unknown amounts (Dellisanti 2015). Fenn was interested in this research, though initially unable to overcome some of the challenges that Dole had faced. After years of research, with a newly designed electrospray instrument coupled to a quadrupole mass analyzer, Fenn and his co-workers were able to obtain mass spectra of several small molecules, including vitamin B6, as well as mass spectra of intact ions of labile molecules, and thus the journey of electrospray ionization mass spectrometry (ESI-MS) began. Fenn later analyzed larger molecules, including gramicidin S and cyclosporine A, as singly and multiply charged ions with m/z values in the range of 800–1800, comfortably within the detection range of the mass spectrometers available at the time. Another competitive advantage of ESI was its ability to interface liquid-phase separations with a mass spectrometer (Smith and Udseth 1988). With all of these features, ESI is nowadays a technique of choice for generating ions in liquid chromatography-mass spectrometry and is widely used in many chemical and biological applications, including proteomics.
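Multiple charging is what brings large molecules into the m/z window of a conventional analyzer, and a brief sketch makes the arithmetic explicit. It assumes positive-mode ions formed purely by protonation, so that m/z = (M + z × 1.007276)/z. The function name and the 16,950 Da neutral mass are hypothetical values chosen for illustration and are not taken from Fenn's data.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mz_of_protonated_ion(neutral_mass_da, charge):
    """m/z of an [M + zH]z+ ion, assuming all charges come from added protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


# Hypothetical 16,950 Da molecule carrying 10 to 20 protons.
series = [(z, mz_of_protonated_ion(16950.0, z)) for z in range(10, 21)]
for z, mz in series:
    print(f"z = {z:2d}  m/z = {mz:9.3f}")
```

Every charge state in this assumed series falls between m/z 800 and 1800, even though the neutral mass is far above that range.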
1.3.5 The Development of Proteomics The emergence of MALDI and ESI for the analysis of large biomolecules, together with the development of computer technology, revolutionized protein research and gave rise to the area of proteomics, the study of the entire set of proteins expressed at any given time, their functions, and their transformational cascades in the living cell. Researchers found that the molecular weights of peptides from the enzymatic digestion of proteins and the tandem mass spectra of peptides could be matched to amino acid sequences in a database. Henzel and co-workers first utilized MALDI to develop a rapid peptide mass fingerprinting method for protein identification from two-dimensional gels, in which measured peptide masses were matched to molecular weights inferred from known amino acid sequences (Henzel et al. 1993). With their computational algorithm, Fragfit, they were able to identify ten isolated Escherichia coli proteins by using only three peptide masses per protein from 91,000 protein sequences (Perry 2015). Later, John Yates and his co-workers developed the SEQUEST algorithm to match empirical ESI-LC-MS/MS-based mass spectra to theoretical spectra obtained from amino acid sequences in the GenPept database, which successfully identified peptides from E. coli and yeast digests (Eng et al. 1994). This ultimately changed the contemporary approach to protein sequencing and formed the basis of the shotgun proteomics approach, in which peptides, along with their post-translational modifications (PTMs), can be identified on a large scale in complex mixtures, cells, or organelles without separation of individual proteins.
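The idea behind peptide mass fingerprinting can be sketched in a few lines. This is an illustrative toy and not the Fragfit or SEQUEST algorithm: the two-entry database, the protein names, the observed masses, and the 0.02 Da tolerance are all hypothetical, digestion is modeled as ideal tryptic cleavage (after K or R, not before P) with no missed cleavages, and the residue masses are standard monoisotopic values.

```python
import re

# Standard monoisotopic residue masses (Da) and the mass of water.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def tryptic_peptides(sequence):
    """Ideal tryptic digest: cleave after K or R unless the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tol_da=0.02):
    """Number of observed masses that lie within tol_da of any theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(o - t) <= tol_da for t in theoretical) for o in observed)


# Hypothetical two-protein database and hypothetical measured peptide masses.
toy_db = {"protA": "GASPKVTCR", "protB": "LNDQKEMHFR"}
observed = [458.25, 477.24, 1234.50]

scores = {name: count_matches(observed, seq) for name, seq in toy_db.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Ranking candidates by the number of matching peptide masses is the simplest possible scoring choice. Practical search engines add statistical scoring, missed cleavages, and modification handling.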
During this time, Matthias Mann and Matthias Wilm showed that sequence tags, peptide fragments containing a short identifiable amino acid sequence along with the masses of the regions flanking it, could also be used to match proteins with known sequences from a database (Mann and Wilm 1994). Methodology was also developed to characterize intact proteins, an area known as top-down proteomics (Kelleher et al. 1999a), in contrast to the bottom-up proteomics discussed above. Top-down proteomics provides useful information for understanding the function and structure of different protein species. However, it is challenging to fragment intact proteins in the gas phase because of their stable structures, which are maintained by intramolecular hydrogen bonding and electrostatic interactions. To overcome this challenge, a new fragmentation methodology, electron capture dissociation (ECD), which allows definite localization of even labile PTMs by extensively sequencing the entire protein, has been developed (Zubarev et al. 1998; McLafferty et al. 2001; Kelleher et al. 1999b). FTICR instrumentation, earlier the preferred platform for top-down proteomics, has had limited availability, and intact protein analysis is limited in sensitivity and throughput and faces more intrinsic challenges of protein size and dynamic range (Kelleher 2004). New MS instrumentation with improved capabilities and applications has also been developed. In 2000, Alexander Makarov developed the Orbitrap based on the concept of the Kingdon trap, an electrostatic device developed by Kenneth Kingdon in 1923 (Kingdon 1923) that consists of a cylinder with a wire along its axis and a voltage difference between the two, in which charged particles can be trapped. In the Orbitrap, Makarov used a spindle-shaped central electrode instead of the wire and a barrel-like outer electrode instead of the cylinder, so that ions follow spiral trajectories around the spindle and are trapped in an electrostatic harmonic potential.
The m/z of the analytes is determined from their harmonic axial oscillations, which are detected as an image current (Makarov 2000). This high-accuracy, high-resolution mass spectrometer is nowadays widely used in proteomics. The new generation of quadrupole time-of-flight (QToF), ion mobility, FTICR, and Orbitrap mass spectrometers offers greater selectivity, higher mass accuracy, and faster scan speed with higher resolution. Besides proteomics, other omics areas such as metabolomics and lipidomics have also benefited from these developments. Several mass spectrometry-based quantification methodologies have been developed, either label-free or incorporating stable isotope labeling. These include isotope-coded affinity tags (ICAT), in which the two protein samples being compared are chemically modified with isotope-coded affinity tags (Gygi et al. 1999); stable isotope labeling by amino acids in cell culture (SILAC), in which cells are grown on medium containing isotopically labeled amino acids (Ong et al. 2002); and isobaric tags for relative and absolute quantification (iTRAQ (Wiese et al. 2007)) and tandem mass tags (TMTs (Thompson et al. 2003)), which are chemical labeling techniques that allow multiple samples to be quantified simultaneously. With these methods, quantification today has become more sensitive, more accurate, and higher in throughput. Within a couple of decades, mass spectrometry-based proteomics has developed from a tool for single-protein analysis into a technique for near-complete proteome characterization (Peng et al. 2003).
References
Aston FW. The constitution of atmospheric neon. Philos Mag. 1920;39:449–55.
Barber M, Bordoli RS, Sedgwick RD, Tyler AN. Fast atom bombardment of solids (F.A.B.): a new ion source for mass spectrometry. J Chem Soc Chem Commun. 1981;7:325–7.
Biemann K, Gapp F, Siebl J. Application of mass spectrometry to structural problems. I. Amino acid sequence in peptides (7). J Am Chem Soc. 1959;81:2274–5.
Bleakney WA. A new method of positive ray analysis and its application to the measurement of ionization potentials in mercury vapor. Phys Rev. 1929;34:157–60.
Comisarow MB, Marshall AG. Fourier transform ion cyclotron resonance spectroscopy. Chem Phys Lett. 1974;25:282–3.
Dellisanti CD. Electrospray makes molecular elephants fly. Nature milestones | Mass Spectrometry, October 2015;15. www.nature.com/milestones/mass-spec
Eng JK, McCormack AL, Yates JR. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom. 1994;5:976–89.
Fenn JB, Mann M, Meng CK, Wong SF, Whitehouse CM. Electrospray ionization - principles and practice. Mass Spectrom Rev. 1990;9:37–70.
Finkelstein J. Development of ionization methods. Nature milestones | Mass Spectrometry, October 2015;6. www.nature.com/milestones/mass-spec
Georgescu J. Spinning ion trajectories. Nature milestones | Mass Spectrometry, October 2015;8. www.nature.com/milestones/mass-spec
Gilpin JA, McLafferty FW. Mass spectrometric analysis of aliphatic aldehydes. Anal Chem. 1957;29:990–4.
Gohlke RS, McLafferty FW. Early gas chromatography/mass spectrometry. J Am Soc Mass Spectrom. 1993;4:367–71.
Grayson M, editor. Measuring mass: from positive rays to proteins. Philadelphia: Chemical Heritage Press; 2002.
Griffiths J. A brief history of mass spectrometry. Anal Chem. 2008;80:5678–83.
Guengerich FP. Thematic mini-review series on biological applications of mass spectrometry. J Biol Chem. 2011;286:25417.
Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol. 1999;17:994–9.
Hassan BAR. Mass spectrometry (importance and uses). Pharm Anal Acta. 2012;3:1000e138.
Henzel WJ, et al. Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases. Proc Natl Acad Sci U S A. 1993;90:5011–5.
Hipple JA, Sommer H, Thomas HA. A precise method of determining the Faraday by magnetic resonance. Phys Rev. 1949;76:1877–8.
Jarchum I. Discovering the power of mass-to-charge. Nature milestones | Mass Spectrometry, October 2015;5. www.nature.com/milestones/mass-spec
Karas M, Hillenkamp F. Laser desorption ionization of proteins with molecular masses exceeding 10 000 daltons. Anal Chem. 1988;60:2299–301.
Kelleher NL. Top down proteomics. Anal Chem. 2004;76:197A–203A.
Kelleher NL, Lin HY, Valaskovic GA, Aaserud DJ, Fridriksson EK, McLafferty FW. Top down versus bottom up protein characterization by tandem high-resolution mass spectrometry. J Am Chem Soc. 1999a;121:806–12.
Kelleher NL, Zubarev RA, Bush K, et al. Localization of labile posttranslational modifications by electron capture dissociation: the case of γ-carboxyglutamic acid. Anal Chem. 1999b;71:4250–3.
Kingdon KH. A method for the neutralization of electron space charge by positive ionization at very low gas pressure. Phys Rev. 1923;21:408–18.
Mann M, Wilm M. Error-tolerant identification of peptides in sequence databases by peptide sequence tags. Anal Chem. 1994;66:4390–9.
Makarov A. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Anal Chem. 2000;72:1156–62.
McLafferty FW, Horn DM, Breuker K, et al. Electron capture dissociation of gaseous multiply charged ions by Fourier-transform ion cyclotron resonance. J Am Soc Mass Spectrom. 2001;12:245–9.
Munson MSB, Field FH. Chemical ionization mass spectrometry. I. General introduction. J Am Chem Soc. 1966;88:2621–30.
Nier AO. A mass spectrometer for routine isotope abundance measurements. Rev Sci Instrum. 1940;11:212.
Nobel Prize in Chemistry 2002. www.nobel.se/chemistry/laureates/2002/index.html
Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, Mann M. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics. 2002;1:376–86.
Peng J, Elias JE, Thoreen CC, Licklider LJ, Gygi SP. Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. J Proteome Res. 2003;2:43–50.
Perry S. Protein discovery goes global. Nature milestones | Mass Spectrometry, October 2015;19. www.nature.com/milestones/mass-spec
Sharma KS. Mass spectrometry – the early years. Int J Mass Spectrom. 2013;349–350:3–8.
Smith RM. Understanding mass spectra: a basic approach. 2nd ed. Hoboken: Wiley; 2004.
Smith RD, Udseth HR. Capillary zone electrophoresis-MS. Nature. 1988;331:639–40.
Smith R, Mathis AD, Ventura D, Prince JT. Proteomics, lipidomics, metabolomics: a mass spectrometry tutorial from a computer scientist’s point of view. BMC Bioinf. 2014;15(Suppl 7):S9.
Tanaka K, Waki H, Ido Y, Akita S, Yoshida Y, Yoshida T. Protein and polymer analyses up to m/z 100,000 by laser ionization time-of-flight mass spectrometry. Rapid Commun Mass Spectrom. 1988;2:151–3.
Thompson A, Schafer J, Kuhn K, Kienle S, Schwarz J, Schmidt G, Neumann T, Hamon C. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal Chem. 2003;75:1895–904.
Thomson JJ. Cathode rays. Philos Mag. 1897;44:293–316.
Thomson JJ. Rays of positive electricity. Philos Mag. 1910;20:752–67. Series 6.
Thomson-bio. n.d. http://www.nobelprize.org/nobel_prizes/physics/laureates/1906/thomson-bio.html
Wiese S, Reidegeld KA, Meyer HE, Warscheid B. Protein labeling by iTRAQ: a new tool for quantitative mass spectrometry in proteome research. Proteomics. 2007;7:340–50.
Yates JR III. A century of mass spectrometry: from atoms to proteomes. Nat Methods. 2011;8:633–7.
Yost RA, Enke CG. Selected ion fragmentation with a tandem quadrupole mass spectrometer. J Am Chem Soc. 1978;100:2274–5.
Zubarev RA, Kelleher NL, McLafferty FW. Electron capture dissociation of multiply charged protein cations. A nonergodic process. J Am Chem Soc. 1998;120:3265–6.
Chapter 2
The Mass Spectrometer and Its Components
2.1 Major Components of the Mass Spectrometer
A mass spectrometer is an analytical instrument that is typically used to determine the molecular weight of chemical compounds or biomolecules by separating their ions in vacuum according to their mass-to-charge ratio (m/z). Mass spectrometers generally consist of three basic components:
1. Ion source
2. Mass analyzer
3. Detector
The sample molecules are first introduced into the ion source and ionized. The newly formed gaseous ions are then passed into the mass analyzer, where they are separated in vacuum according to their m/z by electric and/or magnetic fields. Finally, the ions reach the detector, where an electric current is produced and amplified. The amplified signal is then transmitted to the attached computer for spectrum generation and further analysis.
2.1.1 Ion Source
A mass spectrometer can only detect ions. It cannot detect neutral species such as radicals or neutral molecules. Ionization of the sample is therefore the essential first step of any mass spectrometry (MS) analysis. Sample molecules are introduced into the ion source, where neutral molecules can be ionized in various ways: electron transfer (ejection, capture, or dissociative ionization), proton transfer (protonation or deprotonation), adduct formation, charge transfer, or ion pair formation. After ionization, the newly formed ions are moved toward the mass analyzer. Several methods have been developed to ionize the analyte in mass spectrometry.
© Springer Nature Switzerland AG 2020 M. Hossain, Selected Reaction Monitoring Mass Spectrometry (SRM-MS) in Proteomics, https://doi.org/10.1007/978-3-030-53433-2_2
2.1.1.1 Electron Ionization (EI)
Electron ionization (EI), previously known as electron impact, is one of the earliest ionization techniques. It was developed by Dempster (1918) in the early twentieth century and later modified and improved by Bleakney (1929). EI is still used today for the analysis of low-to-moderately polar, relatively volatile, nonionic small-molecule organic compounds in gas chromatography-mass spectrometry (GC/MS). Some highly polar or ionic compounds can also be analyzed after appropriate derivatization. Several factors contribute to the formation of product ions: the strength of the bonds to be broken, the stability of the products of the fragmentation, and the internal energy of the fragment ions themselves. The voltage difference or potential at which molecular ions are first observed in a mass spectrometer is known as the ionization potential. With increasing potential, more product ions are generated. Since EI can generate fragments in addition to intact molecular ions, it is known as a hard ionization method (Downard 2004).
2.1.1.2 Chemical Ionization (CI)
Chemical ionization (CI), initially developed by Munson and Field in the mid-1960s (Munson and Field 1966), is a soft ionization technique in which gaseous analyte molecules react with reagent ions. These reactions involve the transfer of an electron, a proton, or another charged species between the reactants. Such bimolecular reactions require a comparatively large number of ion–molecule collisions, which is typically achieved by increasing the partial pressure of the reagent gas (Gross 2004). One of the most common reagent gases is methane. When subjected to electron impact, a molecule of methane can ionize to form CH4+· by electron loss. This ion can react with a second molecule of methane to produce CH5+:
\[ \mathrm{CH_4^{+\cdot} + CH_4 \rightarrow CH_5^{+} + CH_3^{\cdot}} \]
The ion CH5+ is an efficient proton donor, so a sample molecule M that is also present in the ionization chamber can be ionized according to the equation:
\[ \mathrm{M + CH_5^{+} \rightarrow [M + H]^{+} + CH_4} \]
To prevent direct ionization of the molecule M, methane is present in the ion source at a much higher concentration than the sample. Because of this, a chemical ionization source operates at a much higher pressure (10⁻³ to 10⁻⁴ torr) than an EI source.
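The proton-transfer product [M + H]+ described above is observed at an m/z one proton mass above the neutral molecule. The following minimal Python sketch illustrates that arithmetic; the function name `mz_protonated` and the generalization to z protons are illustrative assumptions, not something defined in this chapter.

```python
# Minimal sketch (illustrative names): m/z of a protonated ion [M + zH]^z+.
PROTON = 1.007276  # approximate mass of a proton in Da


def mz_protonated(neutral_mass: float, z: int = 1) -> float:
    """Return the m/z of [M + zH]^z+ for a neutral monoisotopic mass in Da."""
    if z < 1:
        raise ValueError("z must be a positive integer")
    return (neutral_mass + z * PROTON) / z


if __name__ == "__main__":
    # A hypothetical analyte of 300.0 Da protonated once, as in methane CI.
    print(round(mz_protonated(300.0, 1), 4))
```

With z = 1 this corresponds to the [M + H]+ ion formed by proton transfer from CH5+; larger z values are included only to show how m/z scales with charge.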
Besides proton transfer, there are other chemical processes that can occur inside an ion source during chemical ionization, which include charge transfer, ion–molecule addition, and nucleophilic displacement reactions for negatively charged ions (Downard 2004).
\[ \mathrm{M^{+\cdot} + N \rightarrow M + N^{+\cdot}} \]
\[ \mathrm{M^{+} + N \rightarrow [M + N]^{+}} \]
\[ \mathrm{Nu^{-} + AB \rightarrow Nu{-}A + B^{-}} \]
CI ion sources are similar to EI ion sources. Current instruments are built with combined EI/CI ion sources that can switch between the two modes within a second. Chemical ionization is the result of several competing chemical reactions, so assay sensitivity depends on various factors, such as the primary electron energy, electron current, reagent gas, gas pressure, and ion source temperature. CI mass spectrometry has limitations similar to those of EI with respect to the volatility and thermal stability of the compound being analyzed; however, direct insertion of the sample into the source allows the analysis of relatively involatile and thermally unstable compounds (Munson 2006).
2.1.1.3 Electrospray Ionization (ESI)
Mass spectrometry measures the mass-to-charge ratio (m/z) of chemical or molecular species that are present in gaseous form and carry a net charge (Konermann et al. 2012). One of the early steps in this process is to convert the analytes into gaseous ions. Several ionization techniques, including electron ionization and chemical ionization, have been developed since the early days of mass spectrometry. These methods can generate extensive fragmentation of the molecules and were not able to measure precisely the molecular mass of intact large biomolecules, including proteins, which are polar, nonvolatile, and thermally labile. To overcome these analytical challenges, matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI) emerged as important ionization techniques that enable the characterization of a diverse set of compounds and that have revolutionized mass spectrometry-based proteomics over the past several decades. In MALDI, analytes are embedded in a matrix, and gaseous ions are formed by exposure to a laser pulse. ESI, on the other hand, converts solution-phase analytes into gas-phase ions for measurement by mass spectrometry (Konermann et al. 2012). Because they cause very little or no fragmentation, both methods are considered soft ionization techniques.
The initial electrospray ionization-mass spectrometry (ESI-MS) concept was developed by Malcolm Dole of Northwestern University in the 1960s, while he was trying to determine the molecular mass of synthetic polymers, including polystyrene. At that time, however, he lacked a suitable ionization method that could transfer these nonvolatile solids into gas-phase ions without much fragmentation. He came across electrospray by chance while visiting a car manufacturer, where cars were painted by electrospray. He then applied the idea and developed an instrument that generated charged polystyrene molecules in the gas phase with molecular masses in the kDa range (Dole et al. 1968). Later, John Fenn applied ESI to solutions of salts and small ions and, with a quadrupole mass spectrometer, generated ions in the gas phase (Yamashita and Fenn 1984). With his co-workers, he also demonstrated that ESI-MS could be used efficiently for the ionization and characterization of large biomolecules in the gas phase without breaking them apart (Whitehouse et al. 1985). For this ground-breaking discovery, Fenn was awarded the Nobel Prize in 2002. Since then, the use of electrospray ionization interfaced with mass spectrometers has grown at an enormous rate, with new areas of application and instrument development (Shibdas and Mazumdar 2012). Electrospray has several advantages over other ionization techniques: it is compatible with traditional liquid chromatography (LC)-based separation because it ionizes molecules directly from the liquid phase; it has very low chemical specificity; ESI-generated ions are very stable; it is unlimited in terms of analyte mass; and, best of all, it has high ionization efficiency. Hence, ESI is today the most widely used ionization technique in chemical and biochemical mass spectrometry (Wilm 2011).
The Mechanism of Electrospray Ionization
Electrospray ionization uses electrical energy to transfer ions from solution into the gas phase before they are analyzed by mass spectrometry.
There are three major steps in the electrospray ionization process: (1) generation of charged droplets of analytes at the high-voltage electrospray capillary tip; (2) solvent evaporation and repeated charge-induced droplet disintegration, resulting in very small, highly charged droplets that are able to produce gas-phase ions; and (3) the production of gas-phase ions from these nanodroplets (Kebarle and Verkerk 2009). Because the electrospray process had long been used for the electrostatic dispersion of liquids and the generation of aerosol particles before it was interfaced with mass spectrometry, the first two steps were already well studied. The mechanism of the third step, gas-phase ion generation, was less clear, and several models have been proposed to describe it. Electrospray of different analytes can proceed through different mechanisms. Low molecular weight analytes follow the ion evaporation model (IEM; Iribarne and Thomson 1976; Thomson and Iribarne 1979), large globular proteins are released into the gas phase via the charged residue model (CRM; Dole et al. 1968; Winger et al. 1993; Nguyen and Fenn 2007), and the chain ejection model (CEM; Ahadi and Konermann 2012; Konermann et al. 2012) applies to unfolded and disordered proteins (Konermann et al. 2013), as shown in Fig. 2.1.
Fig. 2.1 Summary of electrospray mechanism. (a) IEM: Small ion ejection from a charged nanodroplet. (b) CRM: Release of a globular protein into the gas phase. (c) CEM: Ejection of an unfolded protein. (d) Collision-induced dissociation of a gaseous multiprotein complex. Charge equilibration in panels (c) and (d) is indicated by red arrows. (Reprinted with permission from Konermann 2013, © 2013, American Chemical Society)
All three electrospray steps occur at atmospheric pressure. The analyte solution is passed through a stainless steel or quartz silica capillary tube that is held at an electrical potential of several kV; in positive ion mode, the potential is kept positive relative to ground. The solution at the capillary tip is deformed into a Taylor cone, which emits a fine mist of highly charged droplets with the same polarity as the capillary voltage. The initial ESI droplets have radii in the micrometer range. The application of a coaxial or nebulizing gas enhances the flow rate (Covey et al. 2009). In positive ion mode, each droplet is positively charged due to the presence of excess ions, including H+, NH4+, Na+, and K+. Protons are often the main contributor to the net droplet charge, as the majority of analyte solutions are acidic. Protons are also generated at the metal/solution interface inside the capillary (Van Berkel and Kertesz 2007). Through this and other charge-balancing reactions, the ESI source acts as an electrochemical cell. The current in the circuit of Fig. 2.2 is carried by ions and charged droplets that move in the gas phase, as well as by electron flow through the wires that connect the ESI capillary to the mass spectrometer (Van Berkel and Kertesz 2007).
Fig. 2.2 Schematic of electrospray ionization process. (Reprinted with permission from Konermann 2013, © 2013, American Chemical Society)
The charged droplets emitted from the Taylor cone pass through a pressure gradient and a potential gradient toward the mass analyzer. At an elevated ESI source temperature, often assisted by a stream of nitrogen drying gas, the droplets continuously shrink by solvent evaporation, so the surface charge density increases as the droplet radius decreases. In the case of aqueous/organic mixtures, the organic component usually evaporates more readily, causing a gradual increase in water percentage (Kebarle and Verkerk 2009). The charge density on the shrinking droplets builds up until surface tension is balanced by Coulombic repulsion. Droplets at this Rayleigh limit produce even smaller, highly charged nanodroplets via jet fission. Repeated evaporation/fission events ultimately yield the final generation of ESI droplets with radii of a few nanometers (Konermann et al. 2012). Gaseous analyte ions that are detected by MS are produced from these highly charged nanodroplets (Fenn 2003). Finally, the electric field strength within the charged droplet reaches a critical point at which it is energetically possible for ions at the surface of the droplet to be ejected into the gas phase. These ions are sampled by a skimmer cone and accelerated into the mass analyzer for subsequent analysis of molecular mass and measurement of ion intensity (Ho et al. 2003).
Nanoelectrospray Ionization
Nanoelectrospray ionization (nanoESI), developed by Wilm and Mann (1996), is an electrospray source that requires much smaller quantities of analyte. In nanoESI, emitter tip openings are only a few micrometers wide, compared to ~100 μm for conventional ESI, resulting in more efficient generation of gas-phase ions. The small nozzle diameter in nanoESI reduces the size of the initially produced droplets. As a result, fewer evaporation/fission cycles are required before analyte ions are released into the gas phase. The highly charged nanodroplets are, in fact, the precursors of gaseous analyte ions in both conventional ESI and nanoESI. Nanoelectrospray ionization operates at flow rates of [...]
[...] Mf+ + Mn
2.2.1 MS/MS Instrumentation
Tandem mass spectrometry instruments are of two types: (1) tandem-in-space and (2) tandem-in-time. Tandem-in-space instruments have a separate analyzer for every stage of MS analysis. Beam-type analyzers are included in this category. Around the mid-1970s, sector instruments were first used for MS/MS; later, triple quadrupole (QqQ) instruments were developed as a cheaper, user-friendly alternative to sector instruments (Yost and Enke 1978) and are still widely used today. In a QqQ instrument, the first and third quadrupoles are used as mass analyzers, whereas the second quadrupole serves as the collision cell. In the 1980s, QTof instruments were also developed for tandem mass spectrometry (Glish and Goeringer 1984). In tandem-in-time instruments, the various stages of mass spectrometry analysis are performed in the same analyzer but at separate times. Trapping instruments belong to this group. They have the advantage of being able to perform multiple stages of MS analysis (MSn) with a single analyzer; however, they cannot perform precursor-ion (parent-ion) scans or neutral-loss scans. Tandem-in-time instruments, including the quadrupole ion trap and the FT-ICR mass spectrometer, typically provide higher MS/MS efficiency because ions do not have to be transferred from one analyzer to another for every stage of analysis, and the comparatively longer time frame of the experiment gives the precursor ions ample time to dissociate. In a quadrupole ion trap, the MS/MS experiment is much simpler and faster to implement, and the instrument has a smaller footprint and lower cost (Glish and Vachet 2003).
2.2.2 Scan Modes in Tandem Mass Spectrometry
The three main scan modes of a tandem mass spectrometer are the product-ion scan, the precursor-ion scan, and the neutral-loss scan (de Hoffman 1996; Kinter and Sherman 2000). Only triple quadrupole instruments perform all three types of scans.
1. Product-ion scan: In this scan mode, ions of a specific m/z value are selected with MS1 and passed into a collision cell filled with an inert gas (helium, argon, or nitrogen). The ions are activated by collision and induced to dissociate. The resulting product ions are then analyzed with MS2, which is set to scan a mass range. Product ion spectra provide fragmentation information about a molecular ion that can be used for structural elucidation.
2. Precursor-ion scan: The precursor-ion scan is fundamentally the opposite of the product-ion scan, as precursor ions are recorded here for a selected fragment ion. In this mode, MS2 is set to pass only ions with a selected m/z value, whereas MS1 is set to scan over a selected mass range. An ion that passes through MS1 is detected only when its dissociation in the collision cell produces the specific product ion. In product-ion and precursor-ion scans, the neutral fragment is not important to the MS/MS experiment. The precursor-ion scan has been used to identify peptides containing a phosphoserine (pSer), phosphothreonine (pThr), or phosphotyrosine (pTyr) based on the formation of PO3− at m/z 79 in a negative ion experiment, which determines the molecular weight of all peptides that can produce this diagnostic ion; after identification, the specific peptides are sequenced by product-ion scan in positive ion mode (Kinter and Sherman 2000).
3. Neutral-loss scan: In the neutral-loss scan, the two stages of mass analysis are coordinated with a fixed m/z difference. Both MS1 and MS2 are scanned so that they differ by the mass of the neutral that is lost in the dissociation reaction between the two MS stages, which makes it possible to screen for a certain compound type in a complex mixture. This scan detects all ions that lose a given neutral fragment upon fragmentation. Phosphopeptides containing phosphoserine (pSer) and phosphothreonine (pThr) typically lose H3PO4 upon dissociation, a neutral species with a mass of 98 Da. These peptides can be detected by scanning MS2 to select ions that have changed mass by this amount in the dissociation between MS1 and MS2 (Glish and Vachet 2003) (Fig. 2.6).
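Because the analyzers operate on m/z rather than mass, the offset between MS1 and MS2 in a neutral-loss scan depends on the charge state of the precursor. The Python sketch below illustrates this for the loss of H3PO4 under the simplifying assumption that the product ion keeps the full charge of the precursor; the function names are hypothetical and chosen only for illustration.

```python
# Minimal sketch (illustrative names): m/z offset used in a neutral-loss scan.
# Assumption: the product ion retains the full charge z of the precursor.
H3PO4 = 97.9769  # approximate monoisotopic mass of H3PO4 in Da (nominal 98 Da)


def neutral_loss_offset(neutral_mass: float, z: int) -> float:
    """m/z difference between MS1 and MS2 for a neutral loss at charge z."""
    if z < 1:
        raise ValueError("z must be a positive integer")
    return neutral_mass / z


def expected_product_mz(precursor_mz: float, z: int,
                        neutral_mass: float = H3PO4) -> float:
    """m/z at which MS2 should transmit the product of the neutral loss."""
    return precursor_mz - neutral_loss_offset(neutral_mass, z)


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(z, round(neutral_loss_offset(H3PO4, z), 4))
```

For a doubly charged phosphopeptide, for example, the two analyzers would be offset by roughly 49 m/z units rather than 98.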
Fig. 2.6 Scan modes in tandem mass spectrometer
An additional type of tandem MS experiment is selected reaction monitoring (SRM, also known as multiple reaction monitoring, MRM), in which MS1 is fixed on the m/z of a selected precursor ion and MS2 is fixed on a specific product ion of that precursor. This method resembles the selected ion monitoring technique of a typical mass spectrometry assay; however, an ion selected by MS1 is detected only if it generates the specific product ion upon dissociation by the selected reaction. SRM provides higher sensitivity and selectivity because the mass analyzers focus only on a specific precursor and product ion pair rather than scanning a wide mass range.
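The relationship between the three scan modes and SRM can be made concrete with a toy model in which each fragmentation event is a (precursor m/z, product m/z) pair. The Python sketch below is only an illustration of the selection logic under that assumption; the event list, the 0.5 m/z tolerance, and all names are hypothetical, and singly charged ions are assumed so that a neutral loss equals the m/z difference.

```python
# Toy model of tandem MS scan modes (all names and values are illustrative).
from typing import List, NamedTuple


class Event(NamedTuple):
    precursor: float  # m/z transmitted by MS1
    product: float    # m/z transmitted by MS2 after dissociation


def matches(a: float, b: float, tol: float = 0.5) -> bool:
    """True if two m/z values agree within the (assumed) tolerance."""
    return abs(a - b) <= tol


def product_ion_scan(events: List[Event], precursor: float) -> List[float]:
    """MS1 fixed on one precursor, MS2 scanned: list its product ions."""
    return [e.product for e in events if matches(e.precursor, precursor)]


def precursor_ion_scan(events: List[Event], product: float) -> List[float]:
    """MS2 fixed on one product, MS1 scanned: list matching precursors."""
    return [e.precursor for e in events if matches(e.product, product)]


def neutral_loss_scan(events: List[Event], loss: float) -> List[float]:
    """MS1 and MS2 scanned with a fixed offset: list matching precursors."""
    return [e.precursor for e in events
            if matches(e.precursor - e.product, loss)]


def srm(events: List[Event], precursor: float, product: float) -> List[Event]:
    """MS1 and MS2 both fixed: keep only the selected transition."""
    return [e for e in events
            if matches(e.precursor, precursor) and matches(e.product, product)]


TOY_EVENTS = [
    Event(500.0, 402.0), Event(500.0, 250.0),
    Event(620.0, 522.0), Event(620.0, 79.0), Event(710.0, 79.0),
]

if __name__ == "__main__":
    print(product_ion_scan(TOY_EVENTS, 500.0))
    print(precursor_ion_scan(TOY_EVENTS, 79.0))
    print(neutral_loss_scan(TOY_EVENTS, 98.0))
    print(srm(TOY_EVENTS, 500.0, 402.0))
```

In this picture SRM is the most restrictive filter, since both analyzers are fixed, which is consistent with its higher selectivity compared with the scanning modes.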
2.3 Dissociation Methods in Mass Spectrometry
The main feature of the MS/MS experiment is the dissociation that occurs in the collision cell, most commonly unimolecular dissociation promoted by ion activation. Ion activation increases the internal energy of the precursor ion so that it dissociates before analysis at MS2. Collision-induced dissociation (CID) is a widely used dissociation technique in which the precursor ions collide with an inert neutral gas, converting some of their kinetic energy into the internal energy needed for fragmentation (Glish and Vachet 2003). CID can be used with both types of instruments, tandem-in-space and tandem-in-time. Other dissociation methods used in MS/MS include high-energy collisional dissociation (HCD); surface-induced dissociation (SID), where the kinetic energy of an ion is converted to internal energy through a collision with a surface; photodissociation, where ions are activated by photon(s); and various types of electron-based dissociation, including electron capture dissociation (ECD), electron transfer dissociation (ETD), electron detachment dissociation (EDD), and electron-induced dissociation (EID).
2.3.1 Collision-Induced Dissociation (CID)
Collision-induced dissociation (CID), initially used for the structural elucidation of ions, has become one of the most widely used fragmentation methods for characterizing biomolecules, including peptides, proteins, and carbohydrates. CID is useful in the analysis of samples in complex matrices, as well as in the sequencing of biopolymers. In the CID process, the precursor ions are isolated, accelerated to higher kinetic energies, and allowed to collide with a neutral target gas (typically nitrogen, helium, or argon) at a moderately high pressure. Part of the ion’s kinetic energy is converted into internal energy as its translational velocity decreases, causing the ion to fragment; the resulting fragment pattern can be used as a signature spectrum. Collision-induced dissociation occurs in two steps: a faster activation or energetic excitation step, followed by a comparatively slower dissociation step, as shown below:
\[ m_p^{+} + N \rightarrow m_p^{+\cdot} + N' \]
\[ m_p^{+\cdot} \rightarrow m_a^{+} + m_b \]
Here, mp+ is the precursor ion, N is the target gas, and mp+· and N′ are their post-collision states. In the dissociation step, ma+ and mb are the products of unimolecular dissociation of the activated precursor ion, mp+·. Based on the translational energy of the precursor ions, CID processes can be divided into two categories: low-energy (eV) collisions and high-energy (keV) collisions. Low-energy collisions, mostly in quadrupole and trapping instruments, occur when precursor ions have kinetic energies from a few eV up to a few hundred eV. They typically excite vibrational states of the molecule, resulting in a much narrower internal energy distribution (Cotte-Rodriguez et al. 2013). Heavier gases such as xenon or argon are used to increase the probability of observing fragments in low-energy collisions, and ion activation is most often achieved by multiple collisions. The pressure in the collision cell is also important. At higher gas pressures, both the number of ions undergoing collisions and the probability that an individual ion collides multiple times with neutral gas molecules increase, and the product ions formed can be dissociated further by subsequent collisions. High-energy collisions, common in magnetic sector or time-of-flight instruments, occur between a neutral gas molecule and a precursor ion accelerated to a kinetic energy in the keV range. Ion excitation is mainly electronic in high-energy collisions, and the ions have a wider range of internal energies; therefore, most structurally feasible fragmentations are possible. In high-energy collisions, the mass of the target gas has a smaller effect because the center-of-mass energy is a small fraction of the large kinetic energy; hence, changes in the collision conditions (the nature of the collision gas or its pressure in the collision cell) do not significantly change the fragmentation process (Downard 2004).
To simplify the dynamics of kinetic-to-internal energy conversion between the fast ion and a slow target gas molecule, a center-of-mass (COM) reference frame is used instead of the laboratory reference frame, since the total momentum in the COM frame is always zero. The total energy available for kinetic-to-internal energy conversion is the relative kinetic energy of the collision partners. This center-of-mass collision energy, ECOM, is a fraction of the laboratory kinetic energy, ELab, if the velocity of the neutral is ignored. As described in Eq. 2.1, ECOM depends on the masses of the collision partners, where m is the mass of the neutral collision gas and M is the mass of the ion to be activated.
\[ E_{\mathrm{COM}} = \frac{m}{m + M}\, E_{\mathrm{Lab}} \tag{2.1} \]
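As a rough numerical illustration of Eq. 2.1, the Python sketch below evaluates ECOM for a few common collision gases. The function name `e_com`, the gas mass table (approximate average masses in Da), and the example ion mass and laboratory energy are assumptions made only for this illustration.

```python
# Minimal sketch of Eq. 2.1: E_COM = m / (m + M) * E_Lab.
# Gas masses are approximate average masses in Da (illustrative values).
GAS_MASS = {"He": 4.0026, "N2": 28.0134, "Ar": 39.948, "Xe": 131.293}


def e_com(e_lab: float, m_gas: float, m_ion: float) -> float:
    """Center-of-mass collision energy, in the same units as e_lab."""
    if m_gas <= 0 or m_ion <= 0:
        raise ValueError("masses must be positive")
    return e_lab * m_gas / (m_gas + m_ion)


if __name__ == "__main__":
    e_lab = 30.0    # hypothetical laboratory collision energy in eV
    m_ion = 1000.0  # hypothetical precursor ion mass in Da
    for gas, m in GAS_MASS.items():
        print(f"{gas}: E_COM = {e_com(e_lab, m, m_ion):.2f} eV")
```

For a fixed ELab, the heavier target gives the larger ECOM, and doubling the ion mass roughly halves ECOM when the ion is much heavier than the gas.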
ECOM increases with the mass of the target (m) and decreases with increasing ion mass M (approximately as 1/M when M is much larger than m). Therefore, large precursor ions have less internal energy deposited for fragmentation during the collision process (Cotte-Rodriguez et al. 2013). To maximize ECOM and minimize ion–molecule reactions and charge transfer, relatively heavy atomic targets (e.g., Ar, Xe) with high ionization potentials have been used for collisional activation at electron-volt ELab.
CID of Peptide Ions
The fragmentation pattern of peptides in CID depends on multiple parameters: the amino acid composition and size of the peptide, the excitation method, the time scale of the instrument, the charge state of the ion, etc. (Paizs and Suhai 2005). Under low-energy collisions, peptide precursor ions fragment along the backbone at the amide bonds, forming structurally informative sequence ions, namely b and y ions containing the N- and C-terminus, respectively, and less useful non-sequence ions formed by the loss of small neutrals such as water and ammonia (Hunt et al. 1986; Biemann 1988; Papayannopoulos 1995). Peptide fragmentation nomenclature is shown in Fig. 2.7.
Fig. 2.7 Peptide fragmentation nomenclature. (Reprinted with permission from Paizs and Suhai 2005)
Peptide ion dissociation pathways have been studied, and the most comprehensive model is the “mobile proton model”, which describes how protonated peptides dissociate upon excitation (Wysocki et al. 2000; Summerfield et al. 1997; Dongre et al. 1996). According to this model, upon collisional activation, fragmentation of peptides is initiated by the intramolecular transfer of proton(s) to the cleavage sites among backbone protonation sites. Proton transfer is enabled by the proton affinity of the heteroatoms of the amide bond to be cleaved. When the number of ionizing protons is higher than the number of strongly basic sites, charge-directed fragmentation pathways become energetically available at several sites, with enhanced cleavage at the N-terminal side of proline, which is known as the proline effect (Wysocki et al. 2005; Barlow and O’Hair 2008; Breci et al. 2003). Enhanced cleavage is also observed C-terminal to a protonated histidine residue. This could be attributed to its ability to transfer a proton to the backbone to form a resonance-stabilized cyclic b ion. When all protons are bound to basic residues such as arginine, that is, when the number of ionizing protons is equal to or below the number of strongly basic sites, cleavage occurs selectively at the C-terminal side of aspartic or glutamic acid via a charge-remote mechanism in which the proton is derived from the acidic side chain (Wysocki et al. 2005). Peptides with basic amino acids have higher energy thresholds for fragmentation than those with less basic amino acid groups, because the basic sites that sequester the protons require more energy to facilitate fragmentation (Shukla and Futrell 2000). Even though b and y ions are the most useful ion types for sequencing, other ion types (e.g., c, x, z, a, d, w, v, and immonium ions) are used for spectral identification and database searches, but these are typically observed in high-energy CID (Shukla and Futrell 2000; Wysocki et al. 2005; Papayannopoulos 1995). One difference between low- and high-energy CID is the abundant dissociation of amino acid side chains in the latter, forming d-, w-, and v-type ions. Side-chain cleavages are useful for distinguishing isomeric and isobaric amino acids in peptide sequencing (Wysocki et al. 2005).
Another characteristic of the high-energy CID process is the abundance of immonium ions in the low mass range (Wells and McLuckey 2005; Papayannopoulos 1995). CID has several limitations (Cotte-Rodriguez et al. 2013): (1) CID sometimes causes sequence scrambling, which makes peptide and protein sequencing challenging; (2) when CID is applied to ions carrying post-translational modifications (PTMs), the PTM often dissociates before the amide bond because it is more weakly bound, which makes PTM localization challenging; (3) in structural biology, CID is limited in determining subunit topology for non-covalent protein assemblies, mainly because of significant subunit unfolding during the activation process.
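The b- and y-ion series produced by low-energy CID of peptides, as described above, can be computed directly from a peptide sequence. The Python sketch below does this for singly charged fragments using approximate monoisotopic residue masses; the function name `b_y_ions`, the example peptide, and the restriction to unmodified residues and charge 1+ are assumptions made for illustration only.

```python
# Minimal sketch (illustrative names): singly charged b and y ions of a peptide.
# Approximate monoisotopic residue masses in Da; unmodified residues only.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O
PROTON = 1.007276   # charge carrier


def b_y_ions(peptide: str):
    """Return (b, y) lists of m/z values for 1+ fragments.

    b[i] contains the first i+1 residues (N-terminal fragment);
    y[i] contains the last i+1 residues (C-terminal fragment).
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b, running = [], 0.0
    for m in masses[:-1]:            # b1 .. b(n-1)
        running += m
        b.append(running + PROTON)
    y, running = [], WATER
    for m in reversed(masses[1:]):   # y1 .. y(n-1)
        running += m
        y.append(running + PROTON)
    return b, y


if __name__ == "__main__":
    b, y = b_y_ions("PEPTIDE")       # hypothetical example sequence
    for i, (bi, yi) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {bi:.4f}   y{i} = {yi:.4f}")
```

Leucine and isoleucine have identical residue masses in this table, which is why their distinction relies on the side-chain fragments (d, w, v ions) seen in high-energy CID rather than on b and y ions.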
2.3.2 High-Energy Collisional Dissociation (HCD)
High-energy collisional dissociation (HCD) is a type of collisional dissociation that occurs in the HCD cell of Orbitrap-based instruments of the LTQ Velos and Q Exactive type. The HCD cell consists of a straight multipole mounted inside a collision gas-filled tube. A voltage offset between the C-trap and the HCD cell accelerates precursor ions into the collision gas inside the HCD cell, which causes them to fragment into product ions. The product ions are then returned to the Orbitrap analyzer for mass analysis. HCD produces triple quadrupole-like product ion mass spectra. The Orbitrap mass analyzer was introduced more than 10 years ago, and hybrid instruments based on this analyzer have already become useful in proteomics (Zubarev and Makarov 2013). These hybrids consist of a front-end mass spectrometer coupled to a C-trap, which stores and compresses the ion population prior to injection into the Orbitrap analyzer. Up to the Orbitrap Velos and Elite models, a linear ion trap was used for precursor selection and fragmentation. A few years ago, the Q Exactive mass spectrometer was developed, in which a quadrupole front end is used instead of the linear ion trap. Compared with the linear ion trap, quadrupole mass filters have the advantage that the rf field can be modulated so that only a selected set of ions has stable trajectories when passing through the rod assembly (Scheltema et al. 2014). On the LTQ Orbitrap, HCD fragments ions in a collision cell rather than in the ion trap and then transfers them back through the C-trap for high-resolution analysis in the Orbitrap. Two configurations were initially developed for HCD on the linear ion trap Orbitrap (LTQ-Orbitrap) hybrid instrument (Olsen et al. 2007). In the first approach, the C-trap was used to fragment ions that were initially isolated in the linear ion trap part of the instrument.
In regular operation, the entering ions have a low kinetic energy, and trapping efficiency is very high for all m/z range. When C-trap is used as a collision cell, and with high kinetic energies, incoming high mass ions will be lost, unless the rf amplitude on the rods is increased significantly. However, with increased rf amplitude, the low mass cut-off points for fragment ion storage change to a higher m/z value. Therefore, C-trap fragmentation requires a compromise setting of the rf amplitude, just enough to retain high mass ions at a given collision energy but accepting a change in the low mass cut-off for stored fragment ions. To overcome this drawback, the second configuration was developed where a dedicated octupole collision cell was added to the instrument for fragmentation. The octupole collision cell (120 mm length, 5.5 mm id, 2 mm rod diameter) is enclosed in a gas tight shroud aligned to the C-trap device. The collision cell is provided with an rf voltage (2.6 MHz, 500 V p-p) of which the DC offset can be varied ±250 V,
and a collision gas of choice, for example, nitrogen. In a typical HCD process, a defined number of ions, either mass selected or not, is transferred from the linear ion trap to the C-trap, which is held at ground potential. For HCD, ions are emitted from the C-trap into the octupole by setting a trap lens. In the octupole, the ions collide with the gas at their normalized collision energy, which is determined by the ion mass, the charge, and the nature of the collision gas. The product ions are then transferred back to the C-trap by raising the potential of the octupole. A short time delay of 30 ms is used to ensure that all ions are transferred. Finally, the ions are ejected from the C-trap into the Orbitrap analyzer. Both configurations are shown in Fig. 2.8. The Q Exactive, the LTQ-Orbitrap Velos, and all later models use the second configuration (Olsen et al. 2007). A major disadvantage is that HCD MS/MS spectra are acquired in the Orbitrap at about half the speed of ion trap CID spectra. In addition, HCD with Orbitrap detection has lower sensitivity than electron multiplier-based CID (Jedrychowski et al. 2011).
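The rf compromise described for C-trap fragmentation can be made concrete with a short worked relation. This is only a sketch under stated assumptions: it treats the trap as an ideal rf-only quadrupole field, and the symbols (Mathieu parameter q, zero-to-peak rf amplitude V_rf, field radius r_0, angular rf frequency Ω) are introduced here for illustration and do not come from the text. The numerical prefactor depends on how the rf amplitude is defined, so only the proportionality should be relied on.

```latex
% Assumed ideal rf-only quadrupole field; symbols are illustrative, not from the source.
q = \frac{4\, z e\, V_{\mathrm{rf}}}{m\, r_0^{2}\, \Omega^{2}},
\qquad
q < q_{\max} \approx 0.908
\;\Longrightarrow\;
\left(\frac{m}{z}\right)_{\text{cut-off}}
  = \frac{4\, e\, V_{\mathrm{rf}}}{q_{\max}\, r_0^{2}\, \Omega^{2}}
  \;\propto\; V_{\mathrm{rf}}
```

Under these assumptions, raising the rf amplitude to retain energetic high-mass precursors raises the low-mass cut-off for stored fragments in the same proportion, which is the trade-off that the dedicated octupole collision cell avoids.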
2.3.3 Electron Capture Dissociation
Electron capture dissociation (ECD) is a gas-phase ion fragmentation technique used in FT-ICR-MS. It is well established for top-down sequencing and other applications and is complementary to traditional ion dissociation techniques. ECD was first demonstrated by Roman Zubarev and co-workers in 1998 (Zubarev et al. 1998). ECD is used mostly in FT-ICR-MS instruments for two main reasons: (1) although dissociation itself is very fast and can be non-ergodic, successful capture of an electron that has come into close proximity to the charge typically takes at least several milliseconds, which exceeds the residence time of ions in time-of-flight or quadrupole instruments (Tsybin et al. 2001); (2) ECD efficiency is highest
Fig. 2.8 Schematic of the hybrid linear ion trap Orbitrap instrument. (Reprinted with permission from Olsen et al. 2007)
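for electron energies

$$[\mathrm{M} + n\mathrm{H}]^{n+} + \mathrm{A}^{\bullet -} \rightarrow [\mathrm{M} + n\mathrm{H}]^{(n-1)+\bullet} + \mathrm{A} \rightarrow \text{fragments}$$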
where M is the analyte (peptide or protein) molecule and A is the radical anion. Coon et al. later developed negative ETD for anionic peptide and protein species (Coon et al. 2005a). Here an electron is transferred from the anionic analyte to a cationic reagent, and the analyte then undergoes internal rearrangement and fragmentation. Backbone cleavage occurs at the C–Cα bond and yields mostly a- and x-type product ions, from which the structural elements of acidic peptides and proteins can be deduced (McAlister et al. 2012). A combined or complementary application of ETD and CID can significantly increase sequence coverage and add confidence to peptide and protein identification and to the characterization of their PTMs (Guthals and Bandeira 2012). Because its ion/ion reactions are highly efficient and fast, taking place in milliseconds, ETD can easily be coupled with online chromatographic separation (Coon et al. 2005b; Halim et al. 2013). Applications of ETD include determination of phosphorylation and glycosylation, assessment of glycosylation micro-heterogeneity, identification of cancer biomarkers, fragmentation analysis of monoclonal antibodies, and protein quantification (Riley and Coon 2018). Recently, ETD has also emerged as a feasible technique in top-down proteomics (Fornelli et al. 2012). However, its performance is limited by non-covalent interactions, which must be broken before individual product ions can be detected; this can be overcome by post-ETD collisional activation (ETcaD) (Swaney et al. 2006). ETcaD, a supplemental CID method, mainly fragments the non-dissociated electron transfer product species, thereby improving the ETD efficiency of doubly charged peptide ions. Efficiency can also be increased by modifying peptides with fixed charge tags (Vasicek and Broadbelt 2009). With its wide-ranging applications, ETD appears to have great prospects for further development and use.
References Aebersold R, Mann M. Mass spectrometry-based proteomics. Nature. 2003;422:198–227. Ahadi E, Konermann L. Modeling the behavior of coarse-grained polymer chains in charged water droplets: implications for the mechanism of electrospray ionization. J Phys Chem B. 2012;116:104–12.
Barlow CK, O’Hair RAJ. Gas-phase peptide fragmentation: how understanding the fundamentals provides a springboard to developing new chemistry and novel proteomic tools. J Mass Spectrom. 2008;43:1301–19. Biemann K. Contributions of mass spectrometry to peptide and protein structure. Biomed Environ Mass Spectrom. 1988;16:99–111. Bleakney W. A new method of positive ray analysis and its application to the measurement of ionization potentials in mercury vapor. Phys Rev. 1929;34:157–60. Bogdanov B, Smith RD. Proteomics by FT-ICR mass spectrometry: top down and bottom up. Mass Spectrom Rev. 2005;24:168–200. Breci LA, Tabb DL, Yates JR III, et al. Cleavage N-terminal to proline: analysis of a database of peptide tandem mass spectra. Anal Chem. 2003;75:1963–71. Busch KL. High-vacuum pumps in mass spectrometers. Spectroscopy. 2001;16:14–8. Busch KL, Glish GL, McLuckey SA. Mass spectrometry/mass spectrometry: techniques and applications of tandem mass spectrometry. New York: VCH; 1988. Chen XH, Turecek F. The arginine anomaly: arginine radicals are poor hydrogen atom donors in electron transfer induced dissociations. J Am Chem Soc. 2006;128:12520–30. Coates M, Wilkins C. Laser desorption Fourier transform mass spectra of malto-oligosaccharides. Biomed Mass Spectrom. 1985;12:424–8. Comisarow MH, Marshall AG. Frequency sweep Fourier transform ion cyclotron resonance spectroscopy. Chem Phys Lett. 1974;25:282–3. Coon JJ, Shabanowitz J, Hunt DF, Syka JE. Electron transfer dissociation of peptide anions. J Am Soc Mass Spectrom. 2005a;16:880–2. Coon JJ, Syka JEP, Shabanowitz J, Hunt DF. Tandem mass spectrometry for peptide and protein sequence analysis. BioTechniques. 2005b;38(4):519–23. Cotte-Rodriguez I, Miao Z, Zhang Y, Chen H. Introduction to protein mass spectrometry (Chapter 1). In: Chen G, editor. Characterization of protein therapeutics using mass spectrometry. New York: Springer Science+Business Media; 2013. https://doi.org/10.1007/978-1-4419-7862-2_1.
Covey TR, Thomson BA, Schneider BB. Atmospheric pressure ion sources. Mass Spectrom Rev. 2009;28:870–97. de Hoffmann E. Tandem mass spectrometry: a primer. J Mass Spectrom. 1996;31:129–37. de Hoffmann E, Stroobant V. Mass spectrometry – principles and applications. 3rd ed. West Sussex: Wiley; 2007. de la Mora F. Electrospray ionization of large multiply charged species proceeds via Dole’s charged residue mechanism. Anal Chim Acta. 2000;406:93–104. Dempster AJ. A new method of positive ray analysis. Phys Rev. 1918;11:316–25. Dole M, Mack LL, Hines RL, Mobley RC, Ferguson L, Alice MB. Molecular beams of macroions. J Chem Phys. 1968;49:2240–9. Dongre AR, Jones JL, Somogyi A, et al. Influence of peptide composition, gas-phase basicity, and chemical modification on fragmentation efficiency: evidence for the mobile proton model. J Am Chem Soc. 1996;118:8365–74. Downard K. Mass spectrometry: a foundation course. Cambridge: The Royal Society of Chemistry; 2004. Ejsing CS, Duchoslav E, Sampaio J, Simons K, Bonner R, Thiele C, Ekroos K, Shevchenko A. Automated identification and quantification of glycerophospholipid molecular species by multiple precursor ion scanning. Anal Chem. 2007;78:6202–14. Fenn JB. Electrospray wings for molecular elephants. Angew Chem Int Ed. 2003;42:3871–94. Fornelli L, Damoc E, Thomas PM, Kelleher NL, Aizikov K, Denisov E, Marakov A, Tsybin YO. Analysis of intact monoclonal antibody IgG1 by electron transfer dissociation orbitrap FTMS. Mol Cell Proteomics. 2012;11:1758–67. Glish GL, Goeringer DE. Tandem quadrupole/time-of-flight instrument for mass spectrometry/mass spectrometry. Anal Chem. 1984;56:2291–5. Glish GL, Vachet RW. The basics of mass spectrometry in the twenty-first century. Nat Rev Drug Discov. 2003;2:140–50. Gross JH. Mass spectrometry – a textbook. Heidelberg: Springer; 2004.
Guthals A, Bandeira N. Peptide identification by tandem mass spectrometry with alternate fragmentation modes. Mol Cell Proteomics. 2012;9:550–7. Haag AM. Mass analyzers and mass spectrometers. In: Mirzaei H, Carrasco M, editors. Modern proteomics – sample preparation, analysis and practical applications. Advances in Experimental Medicine and Biology. 2016;919. https://doi.org/10.1007/978-3-319-41448-5_7. Halim A, Ruetschi U, Larsen G, Nilsson J. LC-MS/MS characterization of O-glycosylation sites and glycan structures of human cerebrospinal fluid glycoproteins. J Proteome Res. 2013;12:573–84. Hillenkamp F, Karas M, Beavis RC, Chait BT. Matrix-assisted laser desorption ionization mass spectrometry of biopolymers. Anal Chem. 1991;63:A1193–202. Ho CS, Lam CWK, Chan MHM, Cheung RCK, Law LK, Lit LCW, Ng KF, Suen MWM, Tai HL. Electrospray ionization mass spectrometry: principles and clinical applications. Clin Biochem Rev. 2003;24:3–12. Hogan CJ, Carroll JA, Rohrs HW, Biswas P, Gross ML. Charge carrier field emission determines the number of charges on native state proteins in electrospray ionization. J Am Chem Soc. 2008;130:6926–7. Hossain M, Limbach PA. A comparison of MALDI matrices. In: Cole RB, editor. Electrospray and MALDI mass spectrometry: fundamentals, instrumentation, practicalities, and biological applications. 2nd ed. New York: Wiley; 2010. Hu Q, Noll RJ, Li H, Makarov A, Cooks RG. The orbitrap: a new mass spectrometer. J Mass Spectrom. 2005;40:430–43. Hunt DF, Yates JR III, Shabanowitz J, Winston S, et al. Protein sequencing by tandem mass spectrometry. Proc Natl Acad Sci U S A. 1986;83:6233–7. Iribarne J, Thomson B. On the evaporation of small ions from charged droplets. J Chem Phys. 1976;64:2287–94. Jedrychowski MP, Huttlin EL, Haas W, Sowa ME, Rad R, Gygi SP. Evaluation of HCD- and CID-type fragmentation within their respective detection platforms for murine phosphoproteomics. Mol Cell Proteomics. 2011;10:1–9. https://doi.org/10.1074/mcp.M111.009910.
Jennings KR, Dolnikowski GG. Mass analyzers. Methods Enzymol. 1990;193:37–61. Juraschek R, Dulks T, Karas M. Nanospray – more than just a minimized-flow electrospray ion source. J Am Soc Mass Spectrom. 1999;10:300–8. Karas M, Hillenkamp F. Laser desorption ionization of proteins with molecular masses exceeding 10,000 daltons. Anal Chem. 1988;60:2299–302. Kebarle P, Verkerk UH. Electrospray: from ions in solution to ions in the gas phase, what we know now. Mass Spectrom Rev. 2009;28:898–917. Khatri N, Ankit G, Ruchi T, Ajoy B, Prasant B. A review on mass spectrometry detectors. Int Res J Pharm. 2012;3:33–42. Kinter M, Sherman NE. Protein sequencing and identification using tandem mass spectrometry. New York: Wiley; 2000. Knochenmuss, R. MALDI ionization mechanisms: an overview. In: Cole RB, editor. Electrospray and MALDI mass spectrometry: fundamentals, instrumentation, practicalities, and biological applications, 2nd ed. Hoboken: Wiley; 2010. Konermann L. A minimalist model for exploring conformational effects on the electrospray charge state distribution of proteins. J Phys Chem B. 2007:6534–43. Konermann L, Rodriguez AD, Liu J. On the formation of highly charged gaseous ions from unfolded proteins by electrospray ionization. Anal Chem. 2012;84:6798–804. Konermann L, Ahadi E, Rodriguez AD, Vahidi S. Unraveling the mechanism of electrospray ionization. Anal Chem. 2013;85:2–9. Koppenaal DW, Barinaga CJ, Denton MB, Sperline RP, Hieftje GM, Schilling GD, Andrade FJ, Barnes JH IV. Mass spectrometry detectors – eye on ions. Anal Chem. 2007;77:418A–27A. Mamyrin BA. Laser-assisted reflectron time-of-flight mass spectrometry. Int J Mass Spectrom Ion Process. 1994;131:1–19. Marakov A. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Anal Chem. 2000;72:1156–62.
Marshall AG, Hendrickson CL, Jackson GS. Fourier transform ion cyclotron resonance mass spectrometry: a primer. Mass Spectrom Rev. 1998;17:1–35. McAlister GC, Russell JD, Rumachik NG, Hebert AS, Syka JE, Geer LY, Westphall MS, Pagliarini DJ, Coon JJ. Analysis of the acidic proteome with negative electron-transfer dissociation mass spectrometry. Anal Chem. 2012;84:2875–82. McLafferty FW. Tandem mass spectrometry. New York: Wiley; 1983. Medhe S. Mass spectrometry: detectors review. Chem Biomol Eng. 2018;3:51–8. Mikesh LM, Ueberheide B, Chi A, et al. The utility of ETD mass spectrometry in proteomic analysis. Biochim Biophys Acta. 2006;1764:1811–22. Munson B. Chemical ionization mass spectrometry: theory and applications. In: Encyclopedia of analytical chemistry: applications, theory and instrumentation. New York: Wiley; 2006. Munson MSB, Field FH. Chemical ionization mass spectrometry. I. General introduction. J Am Chem Soc. 1966;88:2621–30. Nguyen S, Fen JB. Gas-phase ions of solute species from charged droplets of solutions. PNAS. 2007;104:1111–7. O’Connor PB. The development of matrix-assisted laser desorption/ionization sources. In: Electrospray and MALDI mass spectrometry: fundamentals, instrumentation, practicalities, and biological applications. 2nd ed. Hoboken: Wiley; 2010. O’Connor PB, Cournoyer JJ, Pitteri SJ, et al. Differentiation of aspartic and isoaspartic acids using electron transfer dissociation. J Am Soc Mass Spectrom. 2006;17:15–9. Olsen JV, Macek B, Lange O, Marakov A, Horning S, Mann M. Higher-energy C-trap dissociation for peptide modification analysis. Nat Methods. 2007;4:709–12. Paizs B, Suhal S. Fragmentation pathways of protonated peptides. Mass Spectrom Rev. 2005;24:508–48. Papayannopoulos IA. The interpretation of collision-induced dissociation tandem mass spectra of peptides. Mass Spectrom Rev. 1995;14:49–73. Paul W, Steinwedel H. A new mass spectrometer without magnetic field. Z Naturforsch. 1953;8A:448–50. Paul W, Steinwedel HS. 
US Patent 2939952, 1960. Riley NM, Coon JJ. The role of electron transfer dissociation in modern proteomics. Anal Chem. 2018;90:40–64. Sarbu M, Ghiulai RM, Zamfir AD. Recent developments and applications of electron transfer dissociation mass spectrometry in proteomics. Amino Acids. 2014;46:1625–34. Scheltema RA, Hauschild J-P, Lange O, Hornburg D, Denisov E, Damoc E, Kuehn A, Marakov A, Mann M. The Q Exactive HF, a benchtop mass spectrometer with a pre-filter, high- performance quadrupole and an ultra-high-field Orbitrap analyzer. Mol Cell Proteomics. 2014;13(12):3698–708. Scherl A. Clinical protein mass spectrometry. Methods. 2015;81:3–14. Schmidt A, Karas M, Dulks T. Effect of different solution flow rates on analyte signals in nano- ESI-MS, or when does ESI turn into nano-ESI. J Am Soc Mass Spectrom. 2003;14:492–500. Schurenberg M, Dreisewerd K, Hillenkamp F. Laser desorption/ionization mass spectrometry of peptides and proteins with particle suspension matrices. Anal Chem. 1999;71:221–9. Scigelova M, Hornshaw M, Giannakopulos A, Makarov A. Fourier transform mass spectrometry. Mol Cell Proteomics. 2011;10:M111.009431. https://doi.org/10.1074/mcpM111.009431. Shibdas B, Mazumdar S. Electrospray ionization mass spectrometry: a technique to access the information beyond the molecular weight of the analyte. Int J Anal Chem. 2012; Article ID 282574, 40 pages. Shukla AK, Futrell JH. Tandem mass spectrometry: dissociation of ions by collisional activation. J Mass Spectrom. 2000;35:1069–90. Siuzdak G. Mass spectrometry for biotechnology. San Diego: Academic; 1996. Stafford GC, Kelley PE, Syka JEP, Reynolds WE, Todd JFJ. Recent improvements in analytical applications of advanced ion trap technology. Int J Mass Spectrom Ion Process. 1984;60:85–98. Stephens WE. A pulsed mass spectrometer with time dispersion. Phys Rev. 1946;69:691. Summerfield SG, Whiting A, Gaskell SJ. Intra-ionic interactions in electrosprayed peptide ions. Int J Mass Spectrom Ion Process. 
1997;162:149–61.
Swaney DL, McAlister GC, Wirtala, et al. Supplemental activation method for high-efficiency electron-transfer dissociation of doubly protonated peptide precursors. Anal Chem. 2006;79:477–85. Syka JE, Coon JJ, Schroeder MJ, Shabanowitz J, Hunt DF. Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proc Natl Acad Sci. 2004;101:9528–33. Syrstad EA, Turecek F. Toward a general mechanism of electron capture dissociation. J Am Soc Mass Spectrom. 2005;16:208–24. Tanaka K. The origin of macromolecule ionization by laser irradiation (Nobel lecture). Angew Chem Int Ed. 2003;42:3861–70. Tanaka K, Waki H, Ido Y, Akita S, Yoshida Y, Yoshida T. Protein and polymer analyses up to m/z 100,000 by laser ionization time-of-flight mass spectrometry. Rapid Commun Mass Spectrom. 1988;2:151–3. Tang L, Kebarle P. Dependence of ion intensity in electrospray mass spectrometry on the concentration of the analytes in the electrosprayed solution. Anal Chem. 1993;65:3654–68. Thomson B, Iribarne J. Field induced ion evaporation from liquid surfaces at atmospheric pressure. J Phys Chem. 1979;71:4451–63. Tsybin YO, Hakansson P, Budnik BA, Haselmann KF, Kjeldsen F, Gorshkov M, Zubarev RA. Improved low-energy injection systems for high rate electron capture dissociation in Fourier transform ion cyclotron resonance mass spectrometry. Rapid Commun Mass Spectrom. 2001;15:1849–54. Van Berkel GJ, Kertesz V. Anal Chem. 2007;79:5510–20. Vasicek L, Broadbelt JS. Enhanced electron transfer dissociation through fixed charge derivatization of cysteines. Anal Chem. 2009;81:7876–84. Vastola F, Mumma R, Pirone A. Analysis of organic salts by laser ionization. Org Mass Spectrom. 1970;3:101–4. Wells JM, McLuckey SA. Collision-induced dissociation (CID) of peptides and proteins. Methods Enzymol. 2005;402:148–85. Whitehouse CM, Dreyer RN, Yamashita M, Fenn JB. Electrospray interface for liquid chromatographs and mass spectrometers. Anal Chem. 1985;57:675–9.
Wiesner J, Premsler T, Sickmann A. Application of electron transfer dissociation (ETD) for the analysis of posttranslational modifications. Proteomics. 2008;8:4466–83. Wiley WC, Mclaren IH. Time-of-flight mass spectrometer with improved resolution. Rev Sci Instrum. 1955;26:1150–7. Wilm M. Principles of electrospray ionization. Mol Cell Proteomics. 2011;10(7):M111.009407. https://doi.org/10.1074/mcp.M111.009407. Wilm M, Mann M. Analytical properties of the nanospray ion source. Anal Chem. 1996;68:1–8. Winger BE, Light-Wahl KJ, Ogorzalek Loo RR, Udseth HR, Smith RD. Observation and implications of high mass-to-charge ratio ions from electrospray ionization mass spectrometry. J Am Soc Mass Spectrom. 1993;4:536–45. Wysocki VH, Tsaprailis G, Smith LL, et al. Mobile and localized protons: a framework for understanding peptide dissociation. J Mass Spectrom. 2000;35:1399–406. Wysocki VH, Resing KA, Zhang Q, et al. Mass spectrometry of peptides and proteins. Methods. 2005;35:211–22. Yamashita M, Fenn JB. Electrospray ion source. Another variation on the free-jet theme. J Phys Chem. 1984;88:4451–9. Yost RA, Enke CG. Selected ion fragmentation with a tandem quadrupole mass spectrometer. J Am Chem Soc. 1978;100:2274–5. Zubarev RA. Electron-capture dissociation tandem mass spectrometry. Curr Opin Biotechnol. 2004;15:12–6. Zubarev RA, Marakov AA. Orbitrap mass spectrometry. Anal Chem. 2013;85:5288–96. Zubarev RA, Kelleher NL, McLafferty FW. Electron capture dissociation of multiply charged protein cations. A non-ergodic process. J Am Chem Soc. 1998;120:3265–6. Zubarev RA, Horn DM, Fridriksson EK, Kelleher NL, Kruger NA, Lewis MA, Carpenter BK, McLafferty FW. Electron capture dissociation for structural characterization of multiply charged protein cations. Anal Chem. 2000;72:563–73.
Chapter 3
Selected Reaction Monitoring Mass Spectrometry
3.1 Proteomics
Proteomics has emerged as a powerful approach for studying biological processes because it directly profiles changes in cells and tissues (Link and LaBaer 2009). Characterizing the proteins present in a biological system, the proteome, provides a foundation for better understanding the complexities inherent in biology (Angel et al. 2012). The term “proteomics” was first coined in 1995 and was defined as the large-scale characterization of the entire protein complement of a cell line, tissue, or organism (Wilkins et al. 1995). Proteomics is challenging because the proteins in cells or tissues vary widely in size, shape, isoelectric point, hydrophobicity, and biological affinity. In addition, the diversity of amino acid side chains and the ability of a protein to fold into a three-dimensional (3D) conformation give each protein distinct physical, chemical, and functional properties. Alternative RNA splicing, RNA editing, proteolytic processing, posttranslational modifications, protein stability, transient protein associations, and dependence on cell type or physiological state further increase the complexity of a cellular proteome (Mallick and Kuster 2010). Protein abundance in a proteome can also range over 12 orders of magnitude (Anderson and Anderson 2002). The proteome is highly dynamic compared with the genome. An organism typically has different protein expression patterns in different cell types, stages of the cell cycle, stages of development, and environmental conditions. Protein interactions, intracellular locations, and posttranslational modifications change constantly during all these events. One of the great challenges for proteomics is to identify changes in a cell or tissue proteome accurately and comprehensively (Link and LaBaer 2009).
© Springer Nature Switzerland AG 2020. M. Hossain, Selected Reaction Monitoring Mass Spectrometry (SRM-MS) in Proteomics, https://doi.org/10.1007/978-3-030-53433-2_3
In the early years of proteomics research, two-dimensional (2D) gel electrophoresis was developed in the 1970s as a technique to resolve and visualize large numbers of proteins from whole cell or tissue extracts (O’Farrell et al. 1977). Thousands of proteins were separated by coupling isoelectric focusing with SDS-PAGE, where the intensity of the 2D gel spots represented protein abundance and was used to quantify the proteome. Reproducibility was later improved dramatically by immobilized pH gradients (IPG), standardization of the procedure, fluorescent labels for proteins, and improvements in imaging and computational analysis. The technique has been used for global protein profiling of cells and tissues. Gel spots are typically excised and digested in-gel with trypsin, and the peptides are then identified by mass spectrometry (Link and LaBaer 2009). At present, proteomics investigations are increasingly quantitative and comprehensive (Beck et al. 2011). The rapid evolution of proteomics is due largely to the rapid development of modern mass spectrometry (MS) and liquid chromatography (LC) (Mallick and Kuster 2010). LC-MS (liquid chromatography mass spectrometry) has been a major technology driving proteomics and has become the method of choice for sensitively and rapidly identifying peptides, proteins, and posttranslational modifications (Aebersold and Mann 2003). The technique can generate vast amounts of data on a large number of proteins in a short period. It enables the measurement and identification of peptides at a rate of thousands of sequences per day, with better than femtomole or sub-nanogram sensitivity, in complex biological samples. With or without stable-isotope labeling of peptides or proteins, it can also be used to quantify the abundance of a large number of proteins in the proteome. Current proteomics research is wide ranging, from protein profiling to the analysis of signaling pathways at the systems biology level, as shown in Fig. 3.1. Within each topic, diverse scientific questions are pursued with approaches that vary widely in versatility, technical capability, difficulty, and, of course, cost (Graves and Haystead 2002).
The proteomic workflow draws on different technologies at each stage, from sample preparation to final protein quantification, as shown in Fig. 3.2 (Mallick and Kuster 2010).
Fig. 3.1 Types of proteomics and their multifaceted applications
Fig. 3.2 Technologies for proteomic workflow. (Reprinted with permission from Mallick and Kuster (2010))
3.1.1 Bottom-Up, Top-Down, and Middle-Down Proteomics
Proteomic analysis by mass spectrometry is typically carried out using one of two approaches, bottom-up or top-down. In the bottom-up approach, proteins are cleaved by proteases into peptides before mass spectrometry analysis, whereas in top-down proteomics intact proteins are analyzed (Han et al. 2008). Trypsin-based bottom-up proteomics covers mostly peptides of 0.7–3 kDa, and top-down proteomics generally covers small to medium-sized intact proteins of 10 kDa to a little over 30 kDa. A third approach, named middle-down proteomics, has recently been introduced to cover the middle range of 3 kDa to 10 kDa (Cristobal et al. 2017). Separation of peptides and proteins is a critical element in all these approaches. It may involve multiple separation steps at the protein and/or peptide level, including reversed phase, size exclusion, hydrophilic interaction (HILIC), electrophoresis, tube gel electrophoresis, capillary electrophoresis, isoelectric focusing, and ion exchange, carried out either off-line (independent of the MS) or on-line (coupled directly to the MS) (Catherman et al. 2014).
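The mass ranges quoted above can be illustrated with a small in-silico digest. The following Python sketch is not taken from the source: the function names, the simplified trypsin rule (cleave after K or R unless followed by P), the example sequence, and the cut-offs used in `size_class` are all illustrative assumptions based on the 0.7–3 kDa, 3–10 kDa, and above 10 kDa ranges mentioned in the text.

```python
import re

# Monoisotopic residue masses in Da (standard values for the 20 common amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # H2O added once per peptide for the free termini


def tryptic_digest(sequence):
    """Simplified trypsin rule (assumption): cleave C-terminal to K or R,
    but not when the next residue is P. Requires Python 3.7+ because the
    split pattern is zero-width."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide in Da."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def size_class(mass_da):
    """Map a mass to the approximate ranges quoted in the text.
    The labels and hard cut-offs are illustrative, not a formal definition."""
    if mass_da < 700:
        return "below typical bottom-up range"
    if mass_da <= 3000:
        return "bottom-up"
    if mass_da <= 10000:
        return "middle-down"
    return "top-down"


if __name__ == "__main__":
    # Hypothetical sequence used only to exercise the functions.
    for pep in tryptic_digest("MKWVTFISLLLLFSSAYSRGVFRR"):
        mass = peptide_mass(pep)
        print(pep, round(mass, 3), size_class(mass))
```

Real digests also produce missed cleavages and modified residues, which this sketch ignores; it is meant only to show why tryptic peptides cluster in the low-kDa range.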
Bottom-Up Proteomics
The bottom-up approach has been widely adopted in current proteomics for large-scale, high-throughput analysis of highly complex samples. It follows one of two strategies: separation before digestion, or digestion before separation. In the former, proteins are separated or fractionated off-line, and the resulting peptides are analyzed by peptide mass fingerprinting (PMF) or separated further by LC coupled to MS/MS. In the latter, proteins are first digested into peptides, which are then separated by LC and analyzed by MS/MS, as shown in Fig. 3.3. The fragmented peptides are identified by search algorithms (SEQUEST, MASCOT, MaxQuant, X!Tandem, Andromeda, MS-GF+, MS-Amanda, or Byonic), which correlate the experimental fragment ion masses with theoretical mass spectra in protein databases derived from genomic sequencing (Tsiatsiani and Heck 2015).
Fig. 3.3 Strategies for protein characterization. (Reprinted with permission from Han et al. (2008))
Top-Down Proteomics
The top-down approach involves gas-phase ionization of the intact protein and high-resolution measurement of its intact and fragment ion masses, which enables a more comprehensive characterization of protein isoforms and posttranslational modifications (PTMs) (McLafferty et al. 2007). The development of ECD and other dissociation techniques has increased the sequence coverage of small and medium-sized proteins; however, routine top-down analysis of proteins larger than 50 kDa remains a challenge (Han et al. 2008). Although the approach is suitable for single proteins or simple protein mixtures, its proteome-wide capability at the intact protein level still lags behind the bottom-up approach in proteome coverage, sensitivity, and overall throughput (Catherman et al. 2014). The top-down approach was initially developed on FTICR mass spectrometers to take advantage of their high resolution and high mass accuracy. New instrumentation and dissociation techniques have since extended top-down MS to other instruments, including the Orbitrap, TOF, and ion trap (Han et al. 2008). As throughput and sample complexity continue to increase, top-down proteomics needs efficient software for fast, automated processing of raw data. ProSight PTM was the first widely used software platform capable of performing multiple search types on various databases for processing large amounts of top-down data (Catherman et al. 2014). Recently developed native mass spectrometry techniques (Leney and Heck 2017) also provide new opportunities for analyzing protein complexes by top-down proteomics (Zhang et al. 2011).
Middle-Down Proteomics
The relatively small size of tryptic peptides in bottom-up proteomics creates problems such as sample complexity, difficulty in assigning peptides to specific gene products rather than protein groups, and loss of single and combinatorial PTM information. Top-down proteomics overcomes these challenges by characterizing intact proteins, although its efficiency declines in the high-mass region.
Hence, middle-down proteomics has emerged as a technique that could unite the positive aspects of both bottom-up and top-down proteomics (Cong et al. 2012) and may also provide better proteome coverage, including the identification of splice variants and other isoforms (Cristobal et al. 2017). Different approaches have been followed to optimize middle-down proteomics. To generate middle-range peptides from cellular lysates, Cristobal et al. explored the use of the Asp-N and Glu-C proteases (Cristobal et al. 2017). Microwave-accelerated acid hydrolysis, which produces Asp-selective chemical cleavage, has been evaluated with several acid modifiers (Cannon et al. 2010). To increase proteome depth, a strong cation exchange (SCX) separation, carefully tuned to improve the separation of longer peptides, is combined with RP-LC on columns packed with a larger pore-size material. After evaluating the optimum MS settings, Cristobal et al. also assessed various peptide fragmentation techniques, including HCD, ETD, and EThcD, for the characterization of middle-sized peptides (Cristobal et al. 2017). Cong et al. proposed a generic approach to middle-down proteomics with two essential features: a size-dependent protein fractionation technique based on tube gel electrophoresis, and a robust but restricted proteolysis method using Lys-C, Lys-N, or chemical methods, including microwave-assisted acid hydrolysis; however, these methods generate peptides only marginally
longer than tryptic peptides in large-scale proteome studies (Cong et al. 2012). To overcome this challenge, Cong et al. used the protease OmpT to achieve robust, albeit restricted, proteolysis of a complex proteome. OmpT is known to cleave between two consecutive basic amino acid residues (Lys/Arg-Lys/Arg) with kcat/KM in the 10⁴–10⁸ s⁻¹ M⁻¹ range. Derived from the outer membrane of Escherichia coli K12, OmpT is a member of the novel omptin protease family. In the study, the authors developed and optimized OmpT into an effective protease for generating peptides of more than 2 kDa that are suitable for middle-down proteomics (Cong et al. 2012). Analysis of histone PTMs by mass spectrometry has become an essential tool for characterizing chromatin composition and dynamics. Sidoli and Garcia reviewed advances in the middle-down MS strategy applied to histones, which involves analysis of the 50–60 amino acid N-terminal tails of intact histones. Middle-down MS provides sufficient robustness and reliability and is less technically challenging than PTM quantification on intact histones by the top-down approach (Sidoli and Garcia 2017). However, very few histone and chromatin biology studies (Moradian et al. 2014) have applied the middle-down approach, possibly because of the apparent complexity of the methodology. The approach has also been applied to polyubiquitin chains (Xu and Peng 2008; Valkevich et al. 2014) and the ribosomal proteome (Cannon et al. 2010).
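The effect of restricting cleavage to dibasic sites, as described above for OmpT, can be sketched by comparing it with a simplified trypsin rule on the same sequence. This Python example is illustrative only: the function names, the regular expressions, and the made-up sequence are assumptions, and the real specificity of either protease is more nuanced than these two rules.

```python
import re


def tryptic_digest(sequence):
    """Simplified trypsin rule (assumption): cleave after K or R unless followed by P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def dibasic_digest(sequence):
    """Simplified OmpT-like rule (assumption): cleave only between two
    consecutive basic residues, i.e. between K/R and a following K/R."""
    return [p for p in re.split(r"(?<=[KR])(?=[KR])", sequence) if p]


# Hypothetical sequence, chosen only to contain both single and paired basic residues.
SEQUENCE = "GASKLVTRKEDFHKAAPLRRWYK"

tryptic_peptides = tryptic_digest(SEQUENCE)
dibasic_peptides = dibasic_digest(SEQUENCE)

print("trypsin-like:", tryptic_peptides)
print("dibasic-only:", dibasic_peptides)
```

Because every dibasic site in this model is also a trypsin-like site, the dibasic rule can never produce more peptides than the tryptic rule, which is why it tends to yield the longer peptides that middle-down proteomics needs.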
3.1.2 Shotgun/Discovery and Targeted Proteomics
Shotgun/Discovery Proteomics
Shotgun proteomics, used synonymously with discovery proteomics, is a powerful technique for identifying proteins, individually or in complex matrices (Marcotte 2007), and is named after shotgun DNA sequencing, where long DNA sequences are computationally reconstructed from many short sequences. It identifies proteins from the tandem mass spectra of their proteolytically digested peptides. The process typically starts with a complex mixture of proteins, which are digested into peptides by sequence-specific proteolysis. Each peptide is isolated in the mass spectrometer and characterized by tandem mass spectrometry, which involves breaking the peptide into many smaller fragments and measuring their mass spectrum. The cognate peptides, and ultimately their parent proteins, are identified from the tandem mass spectra with the help of a peptide spectral database (Marcotte 2007). The resolution and peak capacity of the multidimensional separation techniques are important for efficient analysis. MudPIT (Washburn et al. 2001) has already been applied to the analyses of complete cell lysates, organisms, tissue extracts, and various sub-proteomes (Delahunty and Yates III 2007), with recent advances improving peptide separation by ultra-high-pressure LC (Livesay et al. 2008) and anion and cation mixed-bed ion exchange (Motoyama et al. 2007). Shotgun proteomics is conceptually simple; however, it generates peptide mixtures of increased complexity that require highly sensitive and efficient separation, and it can lead to incorrect protein identification, as not all peptides of a protein can be observed or
3.1 Proteomics
identified correctly with mass spectrometry analysis due to unexpected modifications (Xuemei et al. 2008).
Targeted Proteomics
In targeted proteomics, the proteins to be measured are known, and therefore it is an appropriate technique for testing a specific hypothesis for a subset of proteins in many protein backgrounds. It is emerging as a significant technique (notably, it was named Method of the Year 2012 by Nature Methods) that provides more precise, quantitative, and sensitive data for detecting the protein(s) of interest. This technique is currently implemented using various mass analyzers: quadrupoles, orbitraps, and time-of-flight analyzers. Borras and Sabido classified the targeted methods used in proteomics into four groups (Borras and Sabido 2017):
(i) Targeted acquisition at MS1 (precursor) level: It includes selected ion monitoring (SIM).
(ii) Targeted acquisition at MS2 (fragment) level: It includes SIM-triggered MS2, selected reaction monitoring (SRM), and parallel reaction monitoring (PRM) MS.
(iii) Targeted data analysis at MS1 level: Here, data are acquired for nearly all peptide ions present in the test samples using high-resolution mass spectrometry (HRMS), which records the full mass range at the peptide level over the entire chromatographic time range. Data from MS1 DIA can be used as an MS1 map consisting of mass-to-charge, retention time, and intensity information for each ion at the peptide level, albeit without any sequence information. Unfortunately, this technique might exhibit higher chemical noise due to the co-elution of peptides of both high and low abundance, which can limit the dynamic range and result in signal interferences (Borras and Sabido 2017).
(iv) Targeted data analysis at MS2 level: It includes data-independent acquisition (DIA) methods at the MS2 level that acquire all fragment ions from any peptides in the samples, implemented in quadrupole-orbitrap ion traps and quadrupole time-of-flight hybrid mass analyzers.
As DIA-MS acquires data of all detectable peptides present in the samples, the targets can be selected post mass spectra acquisition with/without the use of reference spectral or chromatographic libraries. The DIA-MS method at the MS2 level can be considered targeted only when a specific hypothesis is being tested (Borras and Sabido 2017). The gold standard mass spectrometry method for targeted proteomics is the selected reaction monitoring mass spectrometry (SRM-MS). Use of SRM typically results in higher sensitivity and better signal-to-noise ratio than for quantification using shotgun proteomics (Lange et al. 2008). As mass spectrometry instrumentation and related technology and methodology develop, the sensitivity, specificity, and highly accurate quantification will move targeted proteomics further for more biological applications (Marx 2013).
3.2 Selected Reaction Monitoring Mass Spectrometry (SRM-MS)
The SRM-MS method was first described in the late 1970s, when it provided selective, sensitive, and quantitative analysis of small organic molecules (Yost and Enke 1979). It was later implemented in targeted proteomics to complement discovery/shotgun proteomics for the detection and quantification of specific, predetermined analytes with known fragmentation patterns in complex biological matrices (Picotti and Aebersold 2012). In SRM, a set of peptides is selected to ultimately quantify, by inference, a protein of interest. It is performed on a triple quadrupole mass spectrometer, where the first quadrupole is set to filter a specifically selected, predefined peptide (precursor) ion with a narrow isolation window; this ion is then fragmented by collision-induced dissociation (CID) in the second quadrupole through collision with a neutral collision gas, such as nitrogen (Fig. 3.4). The generated fragment ions are then transferred to the third quadrupole, where only a selected m/z can pass, resulting in a chromatographic trace with retention time and signal intensity as coordinates (Lange et al. 2008). In SRM-MS, a precursor-fragment ion pair is known as a transition. Three to five fragment ions per peptide are measured, whereas one to three proteotypic peptides from a protein are used for the quantification. The time that the mass analyzer spends on each transition is known as the dwell time, whereas the time between two sampling events of the same transition is called the cycle time. With a shorter dwell time, more transitions can be accommodated in a cycle while still collecting a reasonable number of data points (at least 5–8) across the chromatographic elution peak.
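The relationship between dwell time, cycle time, and data points per chromatographic peak can be made concrete with a small calculation. The sketch below assumes that all transitions are monitored continuously (unscheduled acquisition) and that the cycle time is simply the sum of the per-transition dwell times plus an optional fixed inter-transition overhead; the overhead term, the function names, and the numerical values are illustrative assumptions rather than values from the source.

```python
import math


def cycle_time_ms(n_transitions: int, dwell_ms: float, overhead_ms: float = 0.0) -> float:
    """Time between two sampling events of the same transition (assumed additive model)."""
    return n_transitions * (dwell_ms + overhead_ms)


def points_per_peak(peak_width_ms: float, cycle_ms: float) -> int:
    """Number of complete cycles that fit within one chromatographic peak."""
    return math.floor(peak_width_ms / cycle_ms)


def max_transitions(peak_width_ms: float, min_points: int,
                    dwell_ms: float, overhead_ms: float = 0.0) -> int:
    """Largest transition count that still yields min_points across the peak."""
    return math.floor(peak_width_ms / (min_points * (dwell_ms + overhead_ms)))


# Illustrative numbers: 100 transitions, 20 ms dwell, 5 ms overhead, 30 s wide peak.
cycle = cycle_time_ms(100, 20, 5)
print(cycle, points_per_peak(30_000, cycle), max_transitions(30_000, 8, 20, 5))
```

Under these assumptions, halving the dwell time roughly doubles the number of transitions that fit in a cycle, at the cost of less signal accumulated per data point.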
SRM-MS or MRM-MS
At the beginning, when a single precursor ion and a single fragment ion of it (one transition) were selected for detection, the technique was known as single reaction monitoring, whereas for multiple precursor/fragment ion pairs (multiple transitions), it was known as multiple reaction monitoring (MRM). Later, these two terms were replaced by one unique term, selected reaction monitoring (SRM), to avoid ambiguity; however, MRM and SRM are still used interchangeably by the scientific community today.
Fig. 3.4 Schematic diagram of selected reaction monitoring mass spectrometry. (Reprinted with permission from Colangelo et al. (2013))
3.2.1 SRM-MS in Comparison to Non-MS Analytical Technique
SRM-MS has become a viable alternative or complement to affinity-based (e.g., ELISA or Western blotting) or imaging-based (e.g., immunofluorescence or GFP fluorescence) techniques for protein quantification; however, the choice of an appropriate quantification method is made based on the analytical variables and the scientific question(s) of the research project concerned. Picotti et al. compared some of the shared analytical variables of SRM-MS and affinity-based and imaging-based target protein quantification methods in radar charts, as shown in Fig. 3.5, where each spoke length from the center is proportional to the magnitude of the variable relative to the maximum magnitude across the compared methodologies (Picotti and Aebersold 2012). Some immunoassays offer simplicity, lower detection limits, and rapid results; however, they are limited by the availability, selectivity, and cost of developing antibody reagents (Vidova and Spacil 2017). Besides, the development of immunoassays is time-consuming, and a majority of commercially available assays require further validation (Bordeaux et al. 2010). The immunofluorescence technique has higher sensitivity than SRM-MS and can be applied to cells or tissue sections and mass cytometry. GFP fluorescence fusion proteins make it possible to quantify the amount of the target protein in single cells, which provides information on the cell-to-cell variability in expression and on the subcellular localization of proteins. In this technique, multiplexing capability is limited, and the introduced tag may affect the localization, function, stability, and expression of the target protein. Both of these imaging techniques, however, depend on antibodies and thus share the limitations related to them (Picotti and Aebersold 2012). On the other hand, SRM-MS enables flexible, quantitative, and routine assay of hundreds of peptides within a shorter period using innovative instrument control software called intelligent-SRM (iSRM) (Kiyonami et al. 2011), allows faster assay development, and is capable of distinguishing closely related proteoforms arising from protein isoforms, posttranslational modifications (PTMs), or single nucleotide polymorphisms (SNPs) (Picotti and Aebersold 2012).
Fig. 3.5 Performance profiles of common analytical variables of SRM-, affinity-, and imaging-based methods for target protein quantification. (Reprinted with permission from Picotti and Aebersold (2012))
3.2.2 Instrumentation for SRM-MS
The triple quadrupole (QqQ) mass analyzer is predominantly used for the SRM-MS technique. A quadrupole mass analyzer consists of four rods to which DC and RF voltages are applied. For a specific DC/RF voltage combination, only ions of the selected m/z have a stable trajectory and pass through the quadrupole, which therefore acts as a mass filter, as shown in Fig. 3.6. The main advantages of QqQ in SRM-MS analysis include: (a) high selectivity of the MS/MS process, (b) a rapid duty cycle compatible with ultra-high performance liquid chromatography (UPLC or UHPLC), (c) higher sensitivity compared to other mass analyzers, and (d) data processing with minimal storage space and computational power requirements (Vidova and Spacil 2017). The limitations of SRM-MS on a QqQ instrument compared with untargeted or discovery proteomics include: (a) initial method development and refinement are needed before sample acquisition, (b) the acquisition method must include a preselected transition list relevant to the targeted peptides/proteins, and (c) data are acquired only for the selected transition list, with no possibility of post-acquisition data mining for additional target peptides/proteins (Vidova and Spacil 2017). The typical isolation window used for the Q1 filter varies between 0.7 and 1.2 m/z, with the newest instruments reaching 0.2 m/z, while the Q3 filter generally acquires at unit resolution or 0.7 m/z.
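To illustrate how the Q1 and Q3 isolation windows quoted above determine selectivity, the following sketch checks whether a co-eluting species would pass both filters of a target transition. It treats each quadrupole as an ideal rectangular window centered on the set m/z, which is a simplifying assumption (real transmission profiles are not rectangular). The class, the function names, and all m/z values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    name: str
    q1: float  # precursor m/z selected in Q1
    q3: float  # fragment m/z selected in Q3


def passes_filter(mz: float, center: float, width: float) -> bool:
    """Ideal rectangular mass filter of full width `width` centered on `center`."""
    return abs(mz - center) <= width / 2


def interferes(target: Transition, other: Transition,
               q1_width: float = 0.7, q3_width: float = 0.7) -> bool:
    """True if `other` would be transmitted by both the Q1 and Q3 windows of `target`."""
    return (passes_filter(other.q1, target.q1, q1_width)
            and passes_filter(other.q3, target.q3, q3_width))


# Hypothetical m/z values, for illustration only.
target = Transition("target", 500.30, 700.40)
near = Transition("co-eluting species", 500.55, 700.20)

print(interferes(target, near))                # 0.7 m/z Q1 window
print(interferes(target, near, q1_width=0.2))  # narrower 0.2 m/z Q1 window
```

In this toy case the species is transmitted with a 0.7 m/z Q1 window but rejected with a 0.2 m/z window, which is the kind of gain a narrower isolation window is expected to provide.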
Fig. 3.6 Schematic of a quadrupole mass analyzer. (Reprinted with permission from Savaryn et al. (2016))
3.2.3 Commercial Triple Quadrupole Instruments
The triple quadrupole (QqQ) mass spectrometer is nowadays a very popular instrument, especially for SRM-MS applications, and is available from various manufacturers in different models for diverse applications. Current instruments have higher sensitivity, resolution, and scan speed, as well as a smaller footprint. They are also accompanied by the necessary software for data analysis and compliance requirements (21 CFR Part 11) and are compatible with various LC systems (analytical, micro, and nano grade). Besides proteomics, additional application areas of these instruments include biopharma, clinical research, environmental safety, food safety, forensic toxicology, pharma, impurity quantification, metabolomics, and lipidomics. Information on different commercial triple quadrupoles can be obtained from their respective websites.
ThermoFisher Scientific (www.Thermofisher.com)
Currently, Thermo has different types of triple quadrupoles on the market under the brand name TSQ (Triple Stage Quadrupole) to meet the various needs of research/analytical laboratories worldwide; these include TSQ Fortis, TSQ Endura, TSQ Quantis, and TSQ Altis. These instruments have different levels of system sensitivity, resolution, and scan speed, and correspondingly different applications. TSQ Fortis is good for small molecule quantification, whereas the latter three are more suitable for targeted proteomics, with TSQ Altis having the higher sensitivity and resolution. They are compatible with the FAIMS Pro (differential ion mobility) interface, which reduces chemical noise and matrix interference, resulting in improved assay robustness and increased sensitivity.
AB SCIEX (www.sciex.com)
SCIEX triple quadrupoles are known by their brand names of Triple Quad and QTRAP (6500+ system, 5500+ system, 4500 system, 3500 system, etc.). In the SCIEX QTRAP system, the third quadrupole is a linear ion trap (LIT) instead of a quadrupole, which offers added features, including enhanced product ion scan and MRM3 (multiple reaction monitoring cubed), beyond the classical SRM mode.
The latest SCIEX quad systems feature multicomponent IonDrive technology, and the sensitivity, speed, and performance provided through this technology enhancement, according to the manufacturer, can enable researchers to detect all ions in a single injection, from low to high mass compounds in positive or negative polarity, with high sensitivity and accuracy. It is also compatible with SelexIon differential ion mobility technology, providing, according to the manufacturer, better separation of isobaric species, isolation of challenging co-eluting contaminants, and reduction of high background noise.
Waters (www.Waters.com)
Waters manufactures several triple quadrupole mass spectrometers designed for quantitative UPLC®-MS/MS, which include Xevo TQ-XS, Xevo TQ-S, Xevo TQ-S micro, Xevo TQ-S cronos, Xevo TQD, etc. Among these, Xevo TQ-XS is the most sensitive one, which provides, according to the manufacturer, the highest level of sensitivity, reliability, reproducibility, and performance. Its applications are in the areas of biomedical research, forensics, food safety, agrochemicals, and the environment. According to Waters, Xevo TQ-XS has a StepWave XS™ ion guide that facilitates increased, reproducible sensitivity and reliable quantification at the very lowest concentration levels. StepWave XS passively removes neutrals and
gas load, while it actively transfers ions toward the mass analyzer, resulting in improved sensitivity and robustness. The Xevo QqQs have multiple ion source options that can provide optimum ionization for each function required by a specific laboratory/research project, which include ESI, ESCI, APCI, APGC, ASAP, nanoflow ESI, and ionkey.
Agilent (www.Agilent.com)
The triple quadrupoles from Agilent include the Ultivo, 6495C, and 6470 triple quadrupole LC/MS. The 6495C triple quadrupole LC/MS system is the highest performance QqQ from Agilent, which provides, according to the manufacturer, higher sensitivity, an extended mass range of up to 3000 m/z, ease of maintenance, and direct Skyline integration to facilitate peptide quantification method development and analysis. It has several distinctive features: third-generation iFunnel technology that provides higher sensitivity and precision at low intensities; VacShield that provides quick capillary maintenance without venting the instrument, resulting in minimum maintenance downtime; Q1 ion optics that reduce contamination and achieve a lower limit of detection by efficient ion transfer; and a curved collision cell that helps in efficient collection and transmission of fragment ions without any cross-talk. Some of its applications include peptide quantification, food safety, environmental, clinical/biomedical research, and forensics.
3.2.4 Distinctive Features of SRM-MS
The major characteristic features of SRM-MS include the following:
(a) Higher specificity: In SRM-MS, the first and third quadrupoles act as filters to specifically select the targeted m/z of the precursor ions and their selected product ions. These dual levels of mass selection, with a narrow mass window of 0.20–1.0 Da, result in a higher selectivity, where the majority of the co-eluting background ions are filtered out very efficiently compared to the shotgun proteomics approach.
(b) Increased sensitivity: In SRM-MS, no full mass spectra are acquired. Instead, acquisition is concentrated only on the targeted m/z of the selected peptide–product ion pairs. This non-scanning nature of operation increases the sensitivity of the analysis by one to two orders of magnitude when compared with other data-dependent or data-independent types of analysis, where full range mass spectra are recorded (Lange et al. 2008).
(c) Wide dynamic range: SRM-MS has a linear response over a wide dynamic range of up to five orders of magnitude with standard peptides and up to four orders of magnitude with indigenous peptides in biological matrices, which greatly helps in detecting low abundance proteins in complex biological matrices for biomarker and proteomics studies.
(d) Multiplexing capability: One of the major competitive advantages of SRM-MS is its multiplexing capability, owing to which thousands of transitions (peptide–product pairs), representing hundreds of proteins, can be detected and quantified in a single LC-SRM-MS analysis, although this depends, in part, on the total cycle time and chromatographic peak width.
(e) Higher precision and reproducibility: SRM-MS has a higher precision and reproducibility. In a recent multisite SRM-MS assessment study, interlaboratory variability in terms of CV in detecting known amounts of 10 peptides in a complex trypsin digest was ~10%, whereas based on selected spiked peptides, generated from trypsin digestion of standard proteins into the plasma digest, the variability showed [...]
[...] 1042.5 transition) has high background noise and interference that affect quantification. With the use of the MRM3 method, the adjacent and distal interference peaks were significantly reduced, which resulted in increased specificity (Liu et al. 2013). Development of SRM assays of peptides targeting PTMs with low site occupancy can be achieved with pre-fractionation of the sample, depletion of the high abundance proteins, especially of serum/plasma, or enrichment of the target protein(s)/similar protein groups (e.g., phosphopeptides with immobilized metal affinity/TiO2 columns or glycopeptides with biotin-tagged reagent and streptavidin columns). Proteins with various isoforms can be enriched with immunoprecipitation (IP). The authors report quantification of three isoforms of TGF-β in mouse plasma and in human saliva from three genetic vascular disease patients by using IP with a monoclonal antibody, followed by the detection of three signature peptides representing each of the isoforms with three transitions each. IP could reduce nonspecific matrix components and dramatically increase assay sensitivity (Liu et al. 2013).
3.3.2 MRM3
Many of the serum/plasma proteins of clinical interest are in the lower abundance range. Therefore, development of a robust and highly sensitive SRM-based assay for these types of proteins is quite challenging and is achievable only via combinations of SCX fractionation, immune depletion of high abundance proteins, mixed cation exchange solid-phase extraction, and conventional micro-bore reverse phase chromatography coupled with mass spectrometry detection. However, biomarker verification in a large cohort of clinical samples requires assay robustness and high throughput. Fortin et al. developed a method, called multiple reaction monitoring cubed (MRM3), for the quantification of very low abundance proteins (at a low ng/mL level) in serum/plasma without any immunoenrichment of the targeted proteins, as shown in Fig. 3.8.
3.3 Variations of the Conventional SRM-MS
Fig. 3.8 Schematic of MRM3 method developed on a hybrid triple quadrupole linear ion trap mass spectrometer. (Reprinted with permission from Fortin et al. (2009), © 2009, American Chemical Society)
This unique methodology was developed on the latest generation of hybrid quadrupole/linear ion trap instrument, where the MRM3 method involves (i) selection at unit resolution of a proteotypic peptide ion in Q1, (ii) CID (collision-induced dissociation) fragmentation of the selected peptide ion in Q2, and (iii) trapping at unit resolution of one of the most intense fragment ions, fragmentation of this species to generate second-generation ions, and mass-selective scanning of these fragment ion patterns, all in Q3. The MRM3 chromatogram is then reconstructed by summing the extracted signals from the MS3 fragment ions (Lemoine et al. 2012). The validation of the MS3 spectrum can be done by comparing relative intensities either with proteotypic peptides from a protein standard digest or with AQUA peptide standards. The optimal number of product ions required in each stage to minimize the interference will depend on the peptide sequence, its concentration, and the complexity of the sample matrix. The lower the concentration, the higher the likelihood of contaminating interferences from nonselected peptides. Using MRM3, PSA in human serum was quantified with improved specificity and sensitivity at an LOD and LOQ of 1.5 ng/mL and 4.5 ng/mL, respectively, by using only albumin immune depletion, trypsin digestion, and solid-phase extraction of the peptide. Here, the use of conventional bore LC instead of nanoLC provided higher robustness and brought the method closer to a typical clinical laboratory setting (Fortin et al. 2009). Several research groups applied MRM3 with higher specificity in analyzing brain natriuretic peptide (Mollah et al. 2011), human protein kinases in whole cell lysates (Kusebauch et al. 2010), and C-reactive protein in human serum (Ceglarek et al. 2010). On average, the MRM3 assay achieved a three- to fivefold improvement in LOQ when compared to the conventional SRM (Lemoine et al. 2012). The drawback of MRM3 for accurate quantification using internal standards is the relatively long duty cycle of 300 ms between two acquisition points, which is incompatible with a large set of targeted peptides. This can be overcome by scheduled acquisition, where a unique peptide is targeted over ~1 min around its expected retention time. Another limitation is that the usage of MRM3 is mostly limited to hybrid triple quadrupole/linear ion trap instruments, which have the capability to trap, fragment, and analyze only the fragment ions of the targeted primary product ions.
To overcome the limitation in multiplexing capability when tens of targeted proteins with a wide concentration range must be measured across hundreds of clinical biofluid samples, the peptides from the relatively higher abundance proteins and from proteins of unknown abundance can be measured first with the conventional SRM; the undetected and lower abundance proteins can then be analyzed, as a group of 5–15 peptides separated from each other by at least 1 min, by setting up a multiplexed MRM3 method (Lemoine et al. 2012).
3.3.3 Dynamic SRM-MS
The maximum number of peptide or precursor ions that can be included in a single LC-SRM-MS analysis is limited by its total cycle time and chromatographic peak widths. One way to overcome this constraint is to use dynamic SRM (also called scheduled SRM or time-scheduled SRM), where the transitions of each targeted peptide ion are acquired only within a pre-selected retention time window (e.g., 2–4 min) around its elution time, thereby increasing the number of peptide ions detected in one LC-SRM-MS analysis (Stahl-Zeng et al. 2007). The comparison of conventional versus dynamic SRM is shown in Fig. 3.9. For dynamic SRM, two of the most critical criteria are the prior knowledge of the retention time of targeted peptide ions, which can be either derived from previous
Fig. 3.9 Schematic of dynamic SRM compared to the conventional SRM. (Reprinted with permission from Gallien et al. (2011))
empirical data or predicted with computational tools, and the reproducibility of the chromatographic elution profile. In the case of a large number of target peptides, reproducibility is a must. Reference standard peptides covering the total elution time can be used to monitor the overall chromatographic performance. For the LC separation, it is preferable to spread out the precursor peptides as widely as possible to make maximal use of the retention time window and to separate any interfering peak from the target peak (Gallien et al. 2011).
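The benefit of scheduling can be estimated by counting how many retention time windows overlap at the busiest moment of the run, since only those peptides compete for cycle time. The sketch below is a minimal illustration under stated assumptions: every peptide uses the same symmetric window around its expected retention time, every transition has the same dwell time, and instrument overhead is ignored. The function names and the retention times are hypothetical.

```python
def max_concurrent(rts_min: list[float], window_min: float) -> int:
    """Largest number of retention time windows that are open at the same time."""
    events = []
    for rt in rts_min:
        events.append((rt - window_min / 2, 1))   # window opens
        events.append((rt + window_min / 2, -1))  # window closes
    # At equal times, closings (-1) sort before openings (+1),
    # so windows that merely touch are not counted as overlapping.
    events.sort()
    open_now = peak = 0
    for _, step in events:
        open_now += step
        peak = max(peak, open_now)
    return peak


def worst_case_cycle_ms(rts_min: list[float], window_min: float,
                        transitions_per_peptide: int, dwell_ms: float) -> float:
    """Cycle time at the most crowded point of a scheduled method."""
    return max_concurrent(rts_min, window_min) * transitions_per_peptide * dwell_ms


# Hypothetical expected retention times (min) for six target peptides.
rts = [10.0, 10.5, 11.0, 20.0, 20.2, 35.0]
scheduled = worst_case_cycle_ms(rts, window_min=2.0, transitions_per_peptide=4, dwell_ms=20)
unscheduled = len(rts) * 4 * 20  # all transitions monitored throughout the run
print(scheduled, unscheduled)
```

With these toy values at most three windows overlap, so the worst-case cycle time is half that of the unscheduled method; spreading peptides more evenly across the gradient, as recommended above, lowers the peak overlap further.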
3.3.4 Photo-SRM
Quantification of low abundance molecules in complex biological samples using the conventional SRM-MS is challenging. Without the use of extensive sample fractionation or improved chromatography, interfering transitions can influence the integration of the peak areas of target transitions. This can be overcome by improving the selectivity of precursor ions using FAIMS or increased quadrupole resolution. Enjalbert et al. described a novel approach, named photo-SRM, that replaced the conventional collision-induced dissociation (CID) with laser-induced dissociation (LID) in the visible wavelength range to improve the detection specificity of SRM by selectively fragmenting only chromophore-tagged compounds (Enjalbert et al. 2011). Here, a laser beam at 532 nm was focused in Q2 of an API300 triple quadrupole MS. The laser is a 532 nm continuous-wave laser (Oxxinus Inc., Santa Clara, CA), and its output is 300 mW with a beam diameter of 0.6 mm and a divergence of 1.2 mrad. The laser beam passes through one diaphragm and enters the QqQ using two mirrors. To avoid fragmentation in Q1 and Q3, the laser beam is slightly off-axis (~2°). A schematic of the instrument setup is shown in Fig. 3.10. Most biomolecules, including proteins, do not absorb in the visible wavelength range, so only tagged species are fragmented. After optimization of the method, the peptide oxytocin was used for photo-SRM. Oxytocin was derivatized by the thiol-reactive QSY®7 C5-maleimide
Fig. 3.10 Schematic of photo-SRM instrumentation. (Reprinted with permission from Enjalbert et al. (2011))
quencher on cysteine residues to shift its absorption into the visible range and was spiked into human plasma digest. Photo-SRM chromatograms showed better detection specificity and sensitivity than the conventional SRM, and the absence of interference within the targeted transition window resulted in a ~50-fold improvement in LOD. Photo-SRM at 473 nm was evaluated for the detection specificity of cysteine-containing peptides from seven plasma proteins (afamin, albumin, C4b binding protein alpha chain, complement C3, gelsolin isoform 1, haptoglobin beta chain, and plasminogen) tagged with a dabcyl chromophore (Enjalbert et al. 2013). When comparing the best proteotypic cysteine-containing peptides, photo-SRM generated signals comparable to or up to ~10-fold higher than those of the conventional SRM. With human plasma diluted in rat plasma, photo-SRM extended response linearity across a calibration point. Photo-SRM was also utilized to detect estrogens in plasma (Enjalbert et al. 2014). In this study, both photo-SRM and the conventional SRM were used simultaneously for the analysis of estrogens along with two coagulation proteins, heparin cofactor and factor XIIa, within a single chromatographic run. In the case of single estrogen quantification, the photo-SRM method is comparable to the conventional one, without any loss of specificity or sensitivity. For simultaneous analysis of both small molecules and proteins in complex biofluids, the authors found that a combination of the conventional SRM and photo-SRM could improve the detection specificity of estrogens due to a more specific fragmentation step and the increased hydrophobicity of chromophore-tagged compounds. Photo-SRM can also be used alone for simultaneous detection of small molecules and peptides with the use of the same chromophore or two different chromophores.
Photo-SRM is by no means a replacement for the conventional SRM; however, it is a good complement, for which the LOQ for peptides is, in many cases, similar or even better (Enjalbert et al. 2013).
3.3.5 MALDI-SRM
MALDI has been coupled to SRM as an alternative for quantification, initially of small molecules and later of peptides. Corr et al. constructed a MALDI source on an API 4000 QqQ mass spectrometer for the purpose of high-speed quantification of drugs and other low molecular weight compounds (Corr et al. 2006). The ion generation and transport processes that affect analysis speed, throughput, and instrument robustness were optimized. Parameters related to desorption speed, beam spreading, ion flight times, sensitivity, signal-to-noise, ion fragmentation, sample carryover, and contamination were evaluated. A high repetition rate laser MALDI ion source coupled to a QqQ mass spectrometer was evaluated for fast quantitative analysis of small-molecule pharmaceuticals in terms of linearity and dynamic range (Gobey et al. 2005), where the high repetition rate laser improves sensitivity and precision and the QqQ MS decreases the chemical background signal from MALDI matrices. The performance of small-molecule analysis by MALDI-SRM assays was compared with the conventional SRM in terms of sensitivity and speed. Owing to the lower success rate of MALDI compared to ESI, the authors suggested that MALDI analysis can be performed first to rapidly assay large sample sets, the data can then be quickly screened for failed samples, and finally those samples can be analyzed by the conventional SRM assay. An ultrafast quantitative MALDI-SRM assay has been reported for the quantification of saquinavir, an HIV-protease inhibitor drug largely prescribed for the treatment of AIDS, in human plasma. The experiment was performed on a MALDI-4000 QTRAP equipped with a prototype MALDI source (AB/MDS Sciex, Concord, ON) (Fig. 3.11) in which a high repetition rate (1000 Hz), frequency-tripled (355 nm) Nd:YAG laser was used. The assay was found to be linear from 5 to 10,000 ng/mL using penta-deuterated saquinavir (SQV-d5) as an internal standard and from 5 to 1000 ng/mL using reserpine as the internal standard.
Accuracy and precision were in the range of 101–108% and 3.9–115% with SQV-d5 and in the range of 93–108% and 3.5–15% with reserpine, respectively. MALDI-SRM has also been extended to peptide analysis. A therapeutic human recombinant monoclonal antibody (mAb) was used, and four of its tryptic peptides were selected as quantification surrogates (Lesur et al. 2010). 4-Sulfophenyl isothiocyanate (SPITC) was used as a derivatization agent to enhance CID fragmentation; it generated nearly exclusively y-ions, although with a moderate impact on sensitivity. Good precision and accuracy could be obtained over 2.5 orders of magnitude. The advantages of using SPITC are the enhanced selectivity and the possibility of monitoring several SRM transitions, which allows transition(s) to be replaced in case of matrix interferences. The major limitations of MALDI-SRM include limited m/z range, comparatively lower resolution of most QqQ instruments, and relatively poorly reproducible signals for quantification (Shi et al. 2012).
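Accuracy and precision figures of the kind quoted above are commonly derived from replicate measurements of a sample of known concentration. The sketch below uses the conventional bioanalytical definitions, which are an assumption here because the source does not state its formulas: accuracy as the mean measured concentration expressed as a percentage of the nominal value, and precision as the coefficient of variation (sample standard deviation over mean). The replicate values are hypothetical.

```python
from statistics import mean, stdev


def accuracy_percent(measured: list[float], nominal: float) -> float:
    """Mean measured concentration as a percentage of the nominal concentration."""
    return 100.0 * mean(measured) / nominal


def precision_cv_percent(measured: list[float]) -> float:
    """Coefficient of variation (sample standard deviation / mean), in percent."""
    return 100.0 * stdev(measured) / mean(measured)


# Hypothetical replicate results (ng/mL) for a 100 ng/mL quality control sample.
replicates = [95.0, 100.0, 105.0]
print(accuracy_percent(replicates, nominal=100.0))
print(precision_cv_percent(replicates))
```

A stable-isotope-labeled internal standard such as SQV-d5 typically enters this calculation one step earlier, because the measured concentrations are obtained from analyte-to-internal-standard response ratios rather than raw signals.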
Fig. 3.11 Schematic of the MALDI-QqQLIT instrument. (a) MALDI source, (b) first mass analyzing quadrupole, (c) collision cell, and (d) linear ion trap. (Reprinted with permission from Wagner et al. (2008))
3.3.6 MSIA-SRM A mass spectrometric immunoassay (MSIA)-SRM method has been developed for high-throughput, quantitative, and highly selective mass spectrometric targeted immunoassays for clinically important proteins in human plasma/serum and to address the challenges of sensitivity across certain concentration ranges, high precision ( 0.8) to the SRM transitions. (d) Example of SRM transitions displayed as chromatographic traces (inset) or as a spectrum imitation (SRM intensities) compared to the corresponding MS/MS spectrum. Red peaks in the MS/MS spectrum indicate peaks matched to b and y ions. (Reprinted with permission from Picotti et al. (2010))
labeled and the nonlabeled ones using similar instrument conditions. This criterion is particularly useful for the detection of target peptides, even those with low abundance in samples and those coeluting with nontargeted peptides of similar retention time. It also helps to identify potential interferences from nonspecific transitions during quantification. Despite these benefits, the use of synthetic peptides is sometimes limited for a project with a large set of target proteins because of its high cost and time-intensive nature. However, Picotti et al., as illustrated in Fig. 4.4, suggested a way to minimize this cost, and especially the time, by using crude synthetic peptides during SRM assay development (Picotti et al. 2010). With this process, it is possible to extract and validate the optimal SRM transitions for ~100 peptides/h with a high degree of confidence.
(b) Using parallel acquisition of multiple transitions. For selected peptides of moderate to high abundance in a comparatively simple matrix, validation is possible by acquiring multiple transitions of a selected peptide in parallel. In theory, all the transitions from the same peptide should coelute during mass spectrometry analysis, and increasing the number of transitions reduces the probability of a random match with nonselected transitions. This process is usually applicable to transitions with a known and established LC retention time (Lange et al. 2008).
(c) Using SRM-triggered MS/MS scan. An SRM transition can be systematically validated by using a full fragmentation spectrum of the targeted peptide (Lange et al. 2008), as illustrated in Fig. 4.5. In this verification process, usually performed on a triple quadrupole instrument, a full MS/MS spectrum is automatically acquired in response to a detected transition. The acquired tandem mass spectra are then matched with the predicted peptide fragments to ascertain that the detected SRM signals come from the targeted peptide. The major disadvantage of this method is its lower sensitivity and selectivity compared with typical SRM: it uses a wider mass selection window, so the fragmentation spectra often contain peaks from multiple coeluting species, especially in complex biological matrices. In addition, SRM-triggered MS/MS scans require a much longer duty cycle, which may interrupt the typical sequence of SRM experiments in large multiplex quantitative assays even when performed on a Q-linear trap instrument (Kiyonami et al. 2011).
(d) Using composite tandem mass spectra. Instead of full MS/MS spectra, a composite spectrum generated by measuring multiple fragment ions from one targeted peptide has been introduced (Kiyonami et al. 2011). In this data acquisition technique, SRM is performed in two ways to simultaneously quantify and confirm the identity of the targeted peptides. First, a set of primary transitions (typically the two or three most intense fragments) is monitored continuously during a scheduled elution time window (2–4 min) to precisely quantify each peptide. If the signals exceed a preset threshold, a data-dependent event is triggered in which a set of six to eight transitions is acquired. These additional transitions are used to generate composite MS/MS spectra that confirm the identity of the targeted peptides. This instrument control software, named intelligent SRM (iSRM) and shown in Fig. 4.6, was initially applied to the tryptic digest of a yeast lysate to demonstrate its performance and showed an LOD down to tens of attomoles and a throughput of over 6000 transitions targeting 757 peptides in a single experiment.
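The triggering logic described for iSRM can be sketched in a few lines of Python. This is only an illustration of the decision rule stated above (all primary transitions exceeding a preset threshold), not the vendor's implementation; the class name, field names, fragment labels, and threshold value are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class IsrmTarget:
    """One targeted peptide in a hypothetical iSRM-style method."""
    peptide: str
    primary: List[str]     # transitions monitored continuously
    secondary: List[str]   # transitions acquired only after triggering
    threshold: float       # preset intensity threshold (arbitrary units)


def transitions_for_next_cycle(target: IsrmTarget,
                               primary_intensities: Dict[str, float]) -> List[str]:
    """Return the transitions to acquire in the next cycle.

    Assumption: the data-dependent event is triggered only when every
    primary transition exceeds the threshold in the current cycle.
    """
    triggered = all(primary_intensities.get(t, 0.0) > target.threshold
                    for t in target.primary)
    if triggered:
        return target.primary + target.secondary
    return list(target.primary)


# Hypothetical target with two primary and six secondary transitions.
target = IsrmTarget(
    peptide="VFAQFSSFVDSVIAK",
    primary=["y8", "y9"],
    secondary=["y4", "y5", "y6", "y7", "y10", "y11"],
    threshold=1000.0,
)
print(len(transitions_for_next_cycle(target, {"y8": 5000.0, "y9": 3000.0})))
print(len(transitions_for_next_cycle(target, {"y8": 5000.0, "y9": 200.0})))
```

In the first call both primary signals exceed the threshold, so all eight transitions are scheduled; in the second call only the two primary transitions continue to be monitored.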
4.5 Optimization of Experimental Parameters The assay sensitivity of SRM-MS can be improved in two ways: by improving peptide-related parameters and by improving fragmentation-related parameters.
4 Development of SRM-MS Experiment
Fig. 4.5 Validation of transitions. SRM-triggered MS/MS experiment for the validation of transitions for peptide VFAQFSSFVDSVIAK. (a) SRM traces of five transitions. Two peaks with coeluting transitions are apparent at 37.5 and 43.3 min. (b) MS/MS spectra triggered at the apex of SRM peaks 1 (upper panel) and 2 (lower panel). Peaks matching the respective y ions are colored in red. Even though at 43.3 min, SRM transition intensities are higher, the MS/MS spectra clearly show that the targeted peptide is eluting at 37.5 min. Utilizing transition intensities at 43.3 min without validation would lead to false quantification values for the targeted peptide. (Reprinted with permission from Lange et al. (2008))
(a) Improving peptide-related parameters. This can be done by selecting the charge state with the highest intensity and by optimizing the declustering potential (DP). For optimum sensitivity, it is essential to select the charge state of a specific peptide that gives the highest signal intensity. Charge state information can be obtained either from in-house empirical data or from data repositories. Several chromatographic conditions, for example, flow rate, solvents, chemical background, and ionization, can influence the charge state distribution (Lange et al. 2008). The desolvation and dissociation of peptide ion clusters are achieved by a voltage known as the declustering potential. This parameter can be optimized mostly on QTRAP instruments: if the DP is too high, peptides fragment in the source region, whereas if it is too low, it is difficult to obtain single (declustered) ions during the ionization process. The same optimum DP value can be used for all transitions from a specific peptide and charge state.
(b) Improving fragmentation-related parameters. The sensitivity of an SRM-MS assay can be improved by optimizing the collision energy (CE) of individual transitions. This is a time-consuming and laborious process that depends on the instrument type and experimental conditions. Some instrument manufacturers and software tools provide equations that linearly correlate the optimum CE value with the m/z of peptides of a specific charge state. Some of these functions work well only for doubly charged precursor ions, but not for triply charged ones (MacLean et al. 2010). However, several studies reported that optimizing individual transitions improves assay sensitivity by less than twofold. For a project with a large set of peptides/proteins, it is suggested that CE
Fig. 4.6 Principle of iSRM. (a) The iSRM logic in which two transitions of a given peptide are monitored continuously and trigger a data-dependent event if both signals exceed a preset threshold. (b) and (c) The primary and data-dependent (secondary) SRM events. (d) and (e) The channels (and the corresponding ion intensities) of a primary and a secondary iSRM event, respectively. (Reprinted with permission from Kiyonami et al. (2011))
of individual transitions may not need to be optimized unless they come from very low abundance peptides or have low signal intensity.
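The linear CE relation mentioned above has the general form CE = slope × (precursor m/z) + intercept, with a separate pair of coefficients per charge state. The Python sketch below illustrates the idea and a small CE ramp around the predicted value for empirical refinement. The coefficients are placeholders chosen only for illustration; they are not values from any vendor, software package, or the cited studies, and real coefficients must be taken from the instrument documentation or determined experimentally.

```python
# Placeholder (slope, intercept) pairs keyed by precursor charge state.
# These numbers are hypothetical and serve only to make the sketch runnable.
CE_COEFFICIENTS = {
    2: (0.03, 3.0),
    3: (0.04, 2.0),
}


def predicted_ce(precursor_mz, charge, coefficients=CE_COEFFICIENTS):
    """Predict a starting collision energy from a linear equation."""
    slope, intercept = coefficients[charge]
    return slope * precursor_mz + intercept


def ce_ramp(center, step=2.0, n_steps=2):
    """CE values to test around the prediction (center +/- n_steps * step)."""
    return [center + i * step for i in range(-n_steps, n_steps + 1)]


start = predicted_ce(500.0, charge=2)
print(round(start, 2))
print([round(ce, 2) for ce in ce_ramp(start)])
```

For a large target list, the predicted value alone would typically be used, with the ramp reserved for low-abundance peptides, in line with the recommendation in the text.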
4.6 Assay Development and Data Acquisition After selecting the proteotypic peptides of the targeted protein(s), their predominant charge states, and the three to five most intense transitions of each peptide ion, the SRM assay method is developed. The SRM assay can be validated by using heavy isotope-labeled protein/peptide standards, the chromatographic retention time of the precursor ions, and the relative signal intensities of the MS/MS fragments of a selected peptide. It is
of utmost interest to include the maximum number of peptides and transitions in a single LC–SRM-MS assay. In fact, multiplexing capability is an intrinsic advantage of the SRM-MS assay over other contemporary analytical techniques. However, there is a limit to the number of peptides that can be analyzed with high sensitivity and accuracy in a single experiment. The total number of transitions that can be measured in a single assay depends on the analysis cycle time. The SRM cycle time is the product of the total number of transitions included in the cycle and the time needed to acquire each transition, known as the dwell time. It can be expressed as
Cycle time = total number of transitions × dwell time of each transition
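As a worked illustration of this relation, the following minimal Python sketch computes the cycle time for a hypothetical transition list. The function name and the example values are assumptions made for illustration, and the short pause time an instrument needs between transitions is ignored.

```python
def cycle_time_s(dwell_times_ms):
    """Cycle time in seconds for one pass over all transitions.

    With identical dwell times this reduces to
    (number of transitions) x (dwell time).
    Assumption: inter-transition pause time is ignored.
    """
    return sum(dwell_times_ms) / 1000.0


# Hypothetical method: 100 transitions, each with a 10 ms dwell time.
print(cycle_time_s([10.0] * 100))  # 1.0

# Doubling the number of transitions at the same dwell time doubles the cycle time.
print(cycle_time_s([10.0] * 200))  # 2.0
```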
In an experiment, the dwell times can be kept the same for all transitions or varied depending on the target peptide(s); in particular, a longer dwell time can be used for low-abundance precursor ions than for more abundant ions. With a longer dwell time, the signal-to-noise ratio of a transition is higher and its LOD lower, but the cycle time increases. On the other hand, a shorter dwell time reduces the signal-to-noise ratio, as illustrated in Fig. 4.7, and thus hampers the detection of low-abundance ions. Including many transitions at a fixed dwell time in an SRM assay also results in a longer cycle time, so that too few data points are available to construct the chromatographic elution profile of the selected peptides, which limits accurate quantification. The dwell time, the number of transitions acquired, and the overall cycle time all ultimately depend on the chromatographic peak width. For accurate quantification in particular, at least eight to ten data points should be acquired across the LC elution peak (Lange et al. 2008). In a nano-UPLC system, typically used in general proteomics experiments, the average peak widths are 10–20 s. Therefore, the cycle time should be within 1.25 s (a 10 s peak sampled at eight points). A trade-off is needed between the total number of transitions selected and the dwell time of each in every MS run to retain acceptable sensitivity, especially for the analysis of a large set of peptides. There are several ways to make the best use of the dwell time. One is to use a comparatively longer dwell time for low-abundance precursor ions, if known beforehand, and a shorter dwell time for higher abundance species. Another strategy is to use the elution time information of specific peptide(s) and schedule their data acquisition accordingly, known as scheduled SRM (Stahl-Zeng et al. 2007). In this process, as illustrated in Fig. 4.8, the elution times of all targeted peptides are first determined during assay development. A multiplexed SRM assay is then designed by scheduling a 2–4 min time window in which only the transitions of the peptide(s) eluting in that narrow interval are acquired. Several of these short time windows, each containing its own set of precursor ions, can be accommodated in a single scheduled SRM assay. As a result, the total number of peptides measured in a single LC–MS assay is increased without compromising the limit of detection or quantitative accuracy. Two prerequisites are important for a fully utilized scheduled SRM assay. First, the LC gradient should be set in such a way that the elution times of the selected peptides are spread across the whole run time, so that peptides have less overlap between time windows. Second and
Fig. 4.7 Quantitative accuracy as a function of dwell time and cycle time. Quantification of a peptide using different settings for dwell time and cycle time. Reducing the dwell time from 50 to 5 ms decreases accuracy. With cycle times of 10 or 20 s, the peak height cannot be estimated correctly even though the accuracy of the individual data points is excellent at 500 ms dwell time. Changes in dwell time do not affect absolute signal intensity as it is plotted normalized as counts per second. (Reprinted with permission from Lange et al. (2008))
Fig. 4.8 Time-scheduled LC-SRM. Comparison of conventional and time-scheduled LC-SRM analysis. In a time-scheduled LC-SRM analysis, a time constraint is added to schedule the expected transitions within a defined time window enclosing the retention time of the peptides. Time-scheduled LC-SRM significantly increases the total number of peptides that can be measured in a single analysis. (Reprinted with permission from Gallien et al. (2011))
most critically, the chromatography should be standardized and reproducible. With highly reproducible chromatography, the scheduling windows can be narrowed and the LC run divided into more sections, so that more target peptides can be included in a single run.
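The benefit of scheduling can be made concrete with a small calculation. The Python sketch below derives the maximum cycle time from the peak width and the required number of data points, finds the largest number of transitions that are active at the same time in a scheduled method, and converts that into the dwell time available per transition. The target list, window width, and function names are hypothetical, and instrument pause times are ignored.

```python
def max_cycle_time_s(peak_width_s, points_per_peak):
    """Longest cycle time that still yields the required points per peak."""
    return peak_width_s / points_per_peak


def peak_concurrent_transitions(targets, window_min):
    """Largest number of transitions active at any time in a scheduled method.

    targets: list of (retention_time_min, number_of_transitions).
    Each peptide is acquired in a window centered on its retention time.
    The load can only increase at a window start, so it is enough to
    evaluate the load at every window start.
    """
    half = window_min / 2.0
    windows = [(rt - half, rt + half, n) for rt, n in targets]
    best = 0
    for start, _, _ in windows:
        load = sum(n for lo, hi, n in windows if lo <= start <= hi)
        best = max(best, load)
    return best


# Hypothetical assay: four peptides with three transitions each, 3 min windows.
targets = [(10.0, 3), (11.0, 3), (12.5, 3), (30.0, 3)]
cycle = max_cycle_time_s(peak_width_s=10.0, points_per_peak=8)   # 1.25 s
scheduled = peak_concurrent_transitions(targets, window_min=3.0)
unscheduled = sum(n for _, n in targets)

print(cycle, scheduled, unscheduled)
print(round(1000.0 * cycle / scheduled, 1), "ms per transition (scheduled)")
print(round(1000.0 * cycle / unscheduled, 1), "ms per transition (unscheduled)")
```

Spreading elution times more evenly, or narrowing the windows when the chromatography is reproducible, lowers the peak concurrent load and therefore leaves a longer dwell time per transition at the same cycle time.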
4.6.1 Development of an SRM-MS Assay Workflow As already discussed in this chapter, SRM assay development starts with the selection of the protein(s) of interest, followed by the selection of proteotypic peptide(s) and of the fragment(s) generated from the precursor (peptide) ion(s) upon fragmentation in the triple quadrupole collision cell. The area under the resulting SRM peak is proportional to the amount of the peptide and, ultimately, of the protein present in the sample. These assays need prior knowledge of the precursor and product ions, as well as retention time information if the assay is scheduled. Ebhardt gives a methodological overview of SRM-MS that covers its fundamentals as well as an optimized SRM pipeline from assay generation to data analysis (Ebhardt 2014). Manes et al. described a protocol used to accurately measure the absolute abundance of proteins of the chemotaxis signaling pathway in RAW 264.7 cells, a mouse monocyte/macrophage cell line, in which the quantification of Gi2, a heterotrimeric G-protein α-subunit, was discussed in detail (Manes et al. 2015). When a molecular component of a pathway in a large biological system or network is perturbed and a set of potentially affected proteins must be detected and quantified, SRM-MS is an ideal tool. Feng and Picotti presented a practical guide to the development of SRM assays for monitoring a selected set of proteins in a complex biological matrix (Feng and Picotti 2016). The overall workflow is shown in Fig. 4.9. In this workflow, the development process includes sample preparation, SRM assay design, SRM assay validation and refinement, LC-SRM/MS setup, and final quantitative analysis. In addition, Faca provided a detailed protocol for the quantification of a set of proteins from cell line extracts using the SRM-based targeted proteomics approach (Faca 2017).
4.7 System Suitability Monitoring and Quality Control SRM-MS is a powerful tool for the detection and quantification of targeted peptides in complex matrices, with the important goal of generating peptide quantifications that are (a) suitable for the investigation and (b) reproducible across laboratories and acquisitions. The former goal can be realized through system suitability tests, which verify immediately before analysis that the LC–MS instruments in use meet prespecified criteria and which depend on a series of reference materials
Fig. 4.9 Workflow of SRM-based targeted proteomic analyses. (Reprinted with permission from Feng and Picotti (2016))
and specifications to evaluate various aspects of the assay: consistency of the response, analyte carryover, retention time stability, mass accuracy, or signal-to-noise; whereas the latter goal can be achieved by quality control, which uses reference materials and/or calibration standards and specifications to provide in-process quality assurance of the assay sample profile (Dogu et al. 2017). Abbatiello et al. performed a comprehensive study to design a system suitability test for peptide analysis by SRM-MS with nano-LC across instruments, vendors, and laboratories, which contributed guidelines for the input metrics of peak area and retention time (Abbatiello et al. 2013). For targeted proteomics, including SRM-MS studies, SProCoP (Bereman et al. 2014) and AutoQC (Bereman et al. 2016) have been developed to monitor system suitability using statistical process control (SPC), which is mainly a collection of statistical methods and graphical summaries that give warning signs when undesirable deviations occur and ultimately help to identify and correct the causes of those deviations (Bramwell 2013). Statistical process control in proteomics (SProCoP) is an open-source R-based tool added to the Skyline software; it takes retention time, total peak area, full-width at half maximum (FWHM), or peak asymmetry as input, provides control charts as output for each peptide, and applies fixed decision thresholds. Similarly, AutoQC is an open-source interface between Skyline and the Panorama server that uses similar inputs, provides control charts for nonstandardized outputs, and facilitates monitoring of the instrument performance over time, thus enabling early intervention for instrument troubleshooting. AutoQC can analyze the data in real time. Dogu et al. adapt the state-of-the-art methods of longitudinal SPC (simultaneous monitoring of the mean and variation of a metric, time-weighted control charts to detect small changes, change point analysis, and maps for high-dimensional decision-making) to both system suitability and quality control of SRM experiments in their MSstatsQC (Dogu et al. 2017). The methods are implemented in open-source R-based software that can be utilized via a command line or graphical
user interface, locally or on a remote server. The method takes retention time, total peak area, and peak asymmetry, or any other user-selected quantitative metric, as input and supports decisions according to user-defined criteria or according to a selected guide set. The authors demonstrated the advantages of these methods using simulated longitudinal quality control data for individual peptides and experimental longitudinal system suitability test data from the CPTAC consortium, evaluating multiple metrics and multipeptide criteria of performance (Dogu et al. 2017). Its updated version, MSstatsQC 2.0, also supports global data-dependent and data-independent acquisition experiments with its R/Bioconductor package (Dogu et al. 2019). Monitoring some of the quality metrics during and after the analysis is helpful for quality assessment. One such metric is interference. The transitions used for quantification must have a strong signal and be free of interference, especially when biofluids are used.
Detection of Interferences
Because of the wide mass window of the triple quadrupole, interferences may arise in an SRM-MS experiment from other nontargeted peptides or from chemical noise. This is mostly true for low-abundance peptides in biological matrices. When interferences occur in one or more transitions to a significant extent, a different set of transitions should be used for SRM quantification. There are several ways by which a true signal from a targeted transition can be differentiated from interference signals (Eidhammer et al. 2013):
(i) Same retention time (RT): If heavy-labeled peptides are used, their RT can be compared with that of the transitions of the nonlabeled analyte to identify interferences. If 13C- or 15N-labeled peptides are used, the RT will be the same for the transitions of the nonlabeled and the labeled peptides.
(ii) Ratio of peak intensity: Even though the signal intensities of heavy and light peptides may differ because of their concentrations, the MS/MS fragmentation ratios among transitions should be similar for peptides of the same sequence. If the peak intensity ratio of one transition differs from that of the other transitions of the same peptide, this indicates interference, as shown in Fig. 4.10 (Percy et al. 2013).
(iii) Similar peak shape: The transitions from the same peptide should have the same peak shape in the selected ion chromatogram (SIC). Any gross deviation indicates interference or noise.
(iv) MS/MS spectra for identification: In some acquisition modes, a full MS/MS spectrum can be acquired after a transition is acquired and searched against a protein sequence database for peptide verification. This approach uses a reduced dwell time at the expense of sensitivity and accuracy (Eidhammer et al. 2013).
(v) SIC score of mProphet: In the mProphet software (Reiter et al. 2011), all of the peaks in the selected ion chromatograms cognate to a peptide are known as a “unit,” and a score is given with respect to its likelihood from a true precursor
Fig. 4.10 In the control samples, three transitions per peptide are monitored in buffer and plasma. The average relative ratios of the Q1/Q3 SRM ion pairs for the heavy standard peptide in buffer, in plasma, and the endogenous peptide in plasma are determined, and the variability and assessment of peak shape, symmetry, and retention time are performed. (Reprinted with permission from Parker and Borchers (2014))
ion, where the scores are sorted and a false discovery rate (FDR) is calculated based on a set of selected decoy transitions (Eidhammer et al. 2013).
(vi) Manual identification: Interference can also be identified through manual inspection of each transition and spectrum, but this is time-consuming and not suitable for large datasets.
An algorithm named automated detection of inaccurate and imprecise transitions (AuDIT) has been developed to automatically identify inaccurate transitions based on the observation of interfering signal or inconsistent recovery among replicates (Abbatiello et al. 2010). AuDIT (www.broadinstitute.org/cancer/software/genepattern/modules/AuDIT.html) is designed to evaluate SRM-MS data by comparing the relative fragment ion intensities of the precursor peptide ions with those of the stable isotope-labeled internal standard (SIS) peptides in the experiment, followed by a t-test to determine whether the differences are statistically significant. A coefficient of variation (CV) is also calculated from the ratio of the analyte peak area to the SIS peak area across the replicates. This software tool has demonstrated the capability of identifying problematic transitions, achieving accuracies of 94–100% for the correct identification of interferences or incorrect transitions.
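The two ideas behind this kind of check, consistency of the analyte-to-standard ratio across the transitions of a peptide and the CV of that ratio across replicates, can be illustrated with a short Python sketch. This is a simplified stand-in and not the AuDIT algorithm: instead of a t-test, it flags a transition whose mean light/heavy ratio deviates from the median ratio of the peptide's transitions. The function names, thresholds, and peak areas are hypothetical.

```python
from statistics import mean, median, stdev


def cv_percent(values):
    """Coefficient of variation in percent."""
    return 100.0 * stdev(values) / mean(values)


def flag_transitions(light_reps, heavy_reps, max_ratio_dev=0.2, max_cv=20.0):
    """Flag transitions with an inconsistent or imprecise light/heavy ratio.

    light_reps, heavy_reps: one dict per replicate, {transition: peak area}.
    Assumption: without interference, every transition of a peptide gives
    the same light/heavy area ratio.
    """
    transitions = list(light_reps[0])
    ratios = {t: [l[t] / h[t] for l, h in zip(light_reps, heavy_reps)]
              for t in transitions}
    mean_ratio = {t: mean(r) for t, r in ratios.items()}
    consensus = median(mean_ratio.values())
    flagged = {}
    for t in transitions:
        reasons = []
        if abs(mean_ratio[t] - consensus) / consensus > max_ratio_dev:
            reasons.append("ratio deviates from the other transitions")
        if cv_percent(ratios[t]) > max_cv:
            reasons.append("high CV across replicates")
        if reasons:
            flagged[t] = reasons
    return flagged


# Hypothetical peak areas from three replicates; y7 carries an interference.
light = [{"y5": 100, "y6": 200, "y7": 250},
         {"y5": 102, "y6": 198, "y7": 320},
         {"y5": 98, "y6": 202, "y7": 180}]
heavy = [{"y5": 1000, "y6": 2000, "y7": 1000}] * 3

flagged = flag_transitions(light, heavy)
print(flagged)
```

With these example data only y7 is reported, for both reasons; y5 and y6 agree with each other and vary by only a few percent between replicates.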
Recently, Eshghi et al. developed TargetMSQC, an R-based framework and toolset to facilitate high-throughput data analysis for targeted proteomics, with the core objective of automating the quality assessment of chromatographic peaks to identify poor chromatography or interference (Eshghi et al. 2018). It calculates metrics that measure several quality aspects of a chromatographic peak: symmetry, sharpness and modality, coelution and shape similarity of transitions in a peak group, transition ratio consistency between endogenous and reference peptides, and retention time consistency across acquisitions. The tool uses supervised machine learning to identify peaks with poor chromatography or interference, based on a small training dataset manually annotated beforehand by an expert analyst, which improves the throughput and accuracy of interference detection, especially in large-scale multicenter proteomic studies. The authors demonstrated the tool on an SRM-MS assay developed to study the longitudinal dynamics of cerebrospinal fluid biomarkers of Alzheimer’s disease (Eshghi et al. 2018).
Rapid Assessing with Q4SRM Software
Recently, several technical advances in mass spectrometry have greatly improved the overall throughput and multiplexing capability of SRM assays (Duriez et al. 2017; Kim et al. 2016; Matsumoto et al. 2017); however, analyzing this vast quantity of data needs significant computational effort and time. A computational tool is therefore needed for rapid quality assessment immediately after LC/MS data acquisition starts. To assist quality assessment at the point of acquisition, Gibbons et al. developed Q4SRM, an easy-to-use C#.NET application that can quickly examine the SRM signal from all heavy-labeled target peptides to identify low-quality SRM transitions (Gibbons et al. 2019). Heavy-labeled (e.g., 13C/15N) internal standard peptides are generally used in SRM assays; they help to optimize the process and serve as a reference for determining endogenous peptide abundances. The quality of measurement of these standards is thus a crucial factor in the success of the overall data acquisition. With four QC metrics, this tool identifies problems related both to individual transitions and to the group of transitions cognate to a peptide, as shown in Fig. 4.11. Two metrics are monitored for the assessment of individual transitions. The first metric, called peak position, measures the distance from the maximal peak intensity to the edge of the user-defined scheduled acquisition window. When the peak maximum is too close to the edge of the window, the tool flags the transition with a warning. This metric checks that peaks elute completely within the expected scheduled time without any truncation. The second metric, called S/N heuristic, calculates the ratio of the maximum intensity to the median intensity. This examines the S/N ratio as the intensity of the transition relative to the background intensity of unrelated signal. The other two metrics relate to the group of transitions linked to the same peptide. The third metric, called total signal intensity, is the sum of the intensities of all transitions associated with a peptide; it confirms that at least a user-defined signal threshold is reached for all transitions in the group. The fourth metric, called peak concurrence, calculates how close in elution the transitions generated from the same peptide are. Generally, the time of maximum intensity for the same
Fig. 4.11 Q4SRM for SRM-MS quality assessment. (1a) and (1b) Single-transition metrics; (2a) and (2b) multitransition metrics. (Reprinted with permission from Gibbons et al. (2019) © 2019, American Chemical Society)
peptide should be identical, and only interference should cause these values to fall out of sync. This software is open source and available at: https://github.com/PNNL-Comp-Mass-Spec/Q4SRM. It generates both tab-delimited text and graphical output, as well as a summary image that shows low- and high-quality SRM. Users can perform comparative analysis of the text outputs from long-term monitoring by using R or Jupyter analytics platforms. Q4SRM is available with both a graphical user interface and a command-line interface.
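The four metrics can be approximated in a few lines of Python to show what each one measures. The sketch below follows the verbal definitions given above; it is not the Q4SRM code (which is a C#.NET application), and the function names, the toy chromatograms, and the use of minutes as the time unit are assumptions made for illustration.

```python
from statistics import median


def apex_index(intensities):
    """Index of the most intense point of a transition trace."""
    return intensities.index(max(intensities))


def peak_position_margin(times, intensities):
    """Metric 1: distance from the peak apex to the nearest window edge."""
    apex_time = times[apex_index(intensities)]
    return min(apex_time - times[0], times[-1] - apex_time)


def sn_heuristic(intensities):
    """Metric 2: maximum intensity divided by the median intensity.

    Assumption: the median intensity is nonzero.
    """
    return max(intensities) / median(intensities)


def total_signal(group):
    """Metric 3: summed intensity over all transitions of one peptide."""
    return sum(sum(trace) for trace in group.values())


def peak_concurrence(times, group):
    """Metric 4: spread of apex times among the transitions of one peptide."""
    apex_times = [times[apex_index(trace)] for trace in group.values()]
    return max(apex_times) - min(apex_times)


# Toy data: one scheduled window (minutes) and two transitions of one peptide.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
group = {
    "y5": [10, 12, 50, 400, 60, 11, 10],
    "y6": [8, 10, 40, 300, 45, 9, 8],
}

print(peak_position_margin(times, group["y5"]))   # 1.5
print(round(sn_heuristic(group["y5"]), 1))        # 33.3
print(total_signal(group))                        # 973
print(peak_concurrence(times, group))             # 0.0
```

Each value would then be compared with a user-defined threshold; for example, a small margin in the first metric suggests a peak truncated by the edge of the scheduled window, and a nonzero spread in the fourth metric suggests interference in at least one transition.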
References
Abbatiello SE, Mani DR, Keshishian H, Carr SA. Automated detection of inaccurate and imprecise transitions in peptide quantification by multiple reaction monitoring mass spectrometry. Clin Chem. 2010;56:291.
Abbatiello SE, Mani DR, Schilling B, Maclean B, et al. Design, implementation and multisite evaluation of a system suitability protocol for the quantitative assessment of instrument performance in liquid chromatography-multiple reaction monitoring-MS (LC-MRM-MS). Mol Cell Proteomics. 2013;12:2623–39.
Addona TA, Abbatiello SE, Schilling B, et al. Multi-site assessment of the precision and reproducibility of multiple reaction monitoring-based measurements of proteins in plasma. Nat Biotechnol. 2009;27:633–41.
Bereman MS, MacLean B, Tomazela DM, Liebler DC, MacCoss MJ. The development of selected reaction monitoring methods for targeted proteomics via empirical refinement. Proteomics. 2012;12:1134.
Bereman MS, Johnson R, Bollinger J, Boss Y, Shulman N, MacLean B, Hoofnagle AN, MacCoss MJ. Implementation of statistical process control for proteomic experiments via LC MS/MS. J Am Soc Mass Spectrom. 2014;25:581–7.
Bereman MS, Beri J, Sharma V, Nathe C, Eckels J, MacLean B, Hoofnagle AN, MacCoss MJ. An automated pipeline to monitor system performance in liquid chromatography tandem mass spectrometry proteomic experiments. J Proteome Res. 2016;15:4763–9.
Bhowmick P, Mohammed Y, Borchers CH. MRMAssayDB: an integrated resource for validated targeted proteomic assays. Bioinformatics. 2018;34:3566–71.
Bramwell D. An introduction to statistical process control in research proteomics. J Proteomics. 2013;95:3–21.
Craig R, Cortens JP, Beavis RC. Open source system for analyzing, validating, and storing protein identification data. J Proteome Res. 2004;3:1234–42.
de Graaf EL, Altelaar AF, van Breukelen B, Mohammed S, et al. Improving SRM assay development: a global comparison between triple quadrupole, ion trap, and higher energy CID peptide fragmentation spectra. J Proteome Res. 2011;10:4334–41.
Deutsch EW, Lam H, Aebersold R. PeptideAtlas: a resource for target selection for emerging targeted proteomics workflows. EMBO Rep. 2008;9:429–34.
Dogu E, Mohammad-Taheri S, Abbatiello SE, Bereman MS, MacLean B, Schilling B, Vitek O. MSstatsQC: longitudinal system suitability monitoring and quality control for targeted proteomic experiments. Mol Cell Proteomics. 2017;16:1335–47.
Dogu E, Mohammad-Taheri S, Olivella R, Marty F, Lienert I, Reiter L, Sabido E, Vitek O. MSstatsQC 2.0: R/Bioconductor package for statistical quality control of mass spectrometry-based proteomics experiments. J Proteome Res. 2019;18:678–86.
Duriez E, et al. Large-scale SRM screen of urothelial bladder cancer candidate biomarkers in urine. J Proteome Res. 2017;16:1617–31.
Ebhardt HA. Selected reaction monitoring mass spectrometry: a methodology overview. Chapter 16. In: Jorrin-Novo JV, et al., editors. Plant proteomics: methods and protocols, Methods in molecular biology, vol. 1072: Springer; 2014.
Eidhammer I, Barsnes H, Eide GE, Martens L. Computational and statistical methods for protein quantification by mass spectrometry. West Sussex: Wiley; 2013.
Eshghi ST, Auger P, Mathews WR. Quality assessment and interference detection in targeted mass spectrometry data using machine learning. Clin Proteomics. 2018;15:33.
Eyers CE, Lawless C, Wedge DC, Lau KW, Gaskell SJ, Hubbard SJ. CONSeQuence: prediction of reference peptides for absolute quantitative proteomics using consensus machine learning approaches. Mol Cell Proteomics. 2011;10 https://doi.org/10.1074/mcp.M110.003384,1-12.
Eyk JEV, Liu X, Ji W, Fu Q, Grote E. Using pure protein to build a multiple reaction monitoring mass spectrometry assay for targeted detection and quantification. Methods Mol Biol. 2013;1005:199–213.
Faca VM. Selected reaction monitoring for quantification of cellular proteins. Chapter 18. In: Guest PC, editor. Multiplex biomarker techniques: methods and applications, Methods in molecular biology, vol. 1546: Springer; 2017.
Falkner JA, Andrews PC. Tranche: secure decentralized data storage for the proteomics community. J Biomol Tech. 2007;1:3.
Feng Y, Picotti P. Selected reaction monitoring to measure proteins of interest in complex samples: a practical guide. Chapter 4. In: Reinders J, editor. Proteomics in systems biology: methods and protocols, Methods in molecular biology, vol. 1394: Springer; 2016.
Fusaro VA, Mani DR, Mesirov JP, Carr SA. Prediction of high-responding peptides for targeted protein assays by mass spectrometry. Nat Biotechnol. 2009;27:190–8.
Gallien S, Duriez E, Domon B. Selected reaction monitoring applied to proteomics. J Mass Spectrom. 2011;46:298–312.
Geiger T, Clarke S. Deamidation, isomerization, and racemization at asparaginyl and aspartyl residue in peptides. Succinimide-linked reactions that contribute to protein degradation. J Biol Chem. 1987;262:785–94.
Gibbons BC, Fillmore TL, Gao Y, Moore RJ, et al. Rapidly assessing the quality of targeted proteomics experiments through monitoring stable-isotope labeled standards. J Proteome Res. 2019;18:694–9.
Griffiths JR, Unwin RD, Evans CA, Leech SH, Corfe BM, Whetton AD. The application of a hypothesis-driven strategy to the sensitive detection and location of acetylated lysine residue. J Am Soc Mass Spectrom. 2007;18:1423–8.
Kim Y, et al. Targeted proteomics identifies liquid-biopsy signatures for extracapsular prostate cancer. Nat Commun. 2016;7:11906.
Kiyonami R, Schoen A, Prakash A, Peterman S, Zabrouskov V, Picotti P, Aebersold R, Huhmer A, Domon B. Increased selectivity, analytical precision, and throughput in targeted proteomics. Mol Cell Proteomics. 2011;10(2):M110.002931. https://doi.org/10.1074/mcp.M110.002931.
Lai MC, Topp EM. Solid-state chemical stability of proteins and peptides. J Pharm Sci. 1999;88:489–500.
Lange V, Picotti P, Domon B, Aebersold R. Selected reaction monitoring for quantitative proteomics: a tutorial. Mol Syst Biol. 2008;4:222.
Lau KW, Hart SR, Lynch JA, Wong SC, Hubbard SJ, Gaskell SJ. Observations on the detection of b- and y-types ions in the collisionally activated decomposition spectra of protonated peptides. Rapid Commun Mass Spectrom. 2009;23:1508.
MacLean B, Tomazela DM, Abbatiello SE, Zhang S, Whiteaker JR, Paulovich AG, Carr SA, MacCoss MJ. Effect of collision energy optimization on the measurement of peptides by selected reaction monitoring (SRM) mass spectrometry. Anal Chem. 2010;82:10116–24.
Mallick P, Schirle M, Chen SS, Flory MR, Lee H, Martin D, Ranish J, Raught B, Schmitt R, Werner T, Kuster B, Aebersold R. Computational prediction of proteotypic peptides for quantitative proteomics. Nat Biotechnol. 2007;25:125–31.
Manes NP, Mann JM, Nita-Lazar A. Selected reaction monitoring mass spectrometry for absolute protein quantification. J Visual Exp. 2015;102:e52959.
Matsumoto M, et al. A large-scale targeted proteomics assay resource based on an in vitro human proteome. Nat Methods. 2017;14:251–8.
Meng Z, Veenstra TD. Targeted mass spectrometry approaches for protein biomarker verification. J Proteome. 2011;74:2650–265.
Mohammed Y, Domanski D, Jackson AM, Smith DS, Deelder AM, Palmblad M, Borchers CH. PeptidePicker: a scientific workflow with web interface for selecting appropriate peptides for targeted proteomics experiments. J Proteome. 2014;106:151–61.
Mollah S, Wertz IE, Phung Q, Arnott D, Dixit VM, Lill JR. Targeted mass spectrometric strategy for global mapping of ubiquitination on proteins. Rapid Commun Mass Spectrom. 2007;21:3357–64.
Paizs B, Suhai S. Fragmentation pathways of protonated peptides. Mass Spectrom Rev. 2005;24:508–48.
Pan S, Aebersold R, Chen R, Rush J, et al. Mass spectrometry based targeted protein quantification: methods and applications. J Proteome Res. 2009;8:787–97.
Parker CE, Borchers CH. Mass spectrometry-based biomarker discovery, verification, and validation – quality assurance and control of protein biomarker assays. Mol Oncology. 2014;8:840–58.
Percy AJ, Chambers AG, Smith DS, Borchers CH. Standardized protocols for quality control of MRM-based plasma proteomic workflow. J Proteome Res. 2013;12:222–33.
Picotti P, Aebersold R. Selected reaction monitoring-based proteomics: workflows, potential, pitfalls and future directions. Nat Methods. 2012;9:555–66.
Picotti P, Rinner O, Stallmach R, Dautel F, Farrah T, Domon B, Wenschuh H, Aebersold R. High-throughput generation of selected reaction-monitoring assays for proteins and proteomics. Nat Methods. 2010;7:43–6.
Rauth M. LC-MS/MS for protein and peptide quantification in clinical chemistry. J Chromatogr B. 2012;883-884:59–67.
Reiter R, Rinner O, Picotti P, Huttenhain R, Beck M, Brusniak MK, Hengartner MO, Aebersold R. mProphet: automated data processing and statistical validation for large-scale SRM experiments. Nat Methods. 2011;8:430–5.
Sanders WS, Bridges SM, McCarthy FM, Nanduri B, et al. Prediction of peptides observable by mass spectrometry applied at the experimental set level. BMC Bioinformatics. 2007;8:S23.
Shadforth I, Xu W, Crowther D, Bessant C. GAPP: a fully automated software for the confident identification of human peptides from tandem mass spectra. J Proteome Res. 2006;5:2849–52.
Sherwood CA, Eastham A, Lee LW, Risler J, Mirzaei H, Falkner JA, Martin DB. Rapid optimization of MRM-MS instrument parameters by subtle alteration of precursor and product m/z targets. J Proteome Res. 2009a;8:3746–51.
Sherwood CA, Eastham A, Lee LW, Risler J, Vitek O, Martin DB. Correlation between y-type ions observed in ion trap and triple quadrupole mass spectrometers. J Proteome Res. 2009b;8:4243–51.
Slechtova T, Gilar M, Kalikova K, Tesarova E. Insight into trypsin miscleavage: comparison of kinetic constants of problematic peptide sequences. Anal Chem. 2015;87:7636–43.
Stahl-Zeng J, Lange V, Ossola R, Eckhardt K, Krek W, Aebersold R, Domon B. High sensitivity detection of plasma proteins by multiple reaction monitoring of N-glycosites. Mol Cell Proteomics. 2007;6:1809.
Unwin RD, Griffiths JR, Leverentz MK, Grallert A, Hagan IM, Whetton AD. Multiple reaction monitoring to identify sites of protein phosphorylation with high sensitivity. Mol Cell Proteomics. 2005;4:1134–44.
Vizcaino JA, et al. A guide to the proteomic identifications database for proteomics data repository. Proteomics. 2009;9:4276–83.
Williamson BL, Marchese J, Morrice NA. Automated identification and quantification of protein phosphorylation sites by LC/MS on a hybrid triple quadrupole linear ion trap mass spectrometer. Mol Cell Proteomics. 2006;5:337–46.
Wu C, Shi T, Brown JN, He J, et al. Expediting SRM assay development for large-scale targeted proteomics experiments. J Proteome Res. 2014;13:4479–87.
Yost RA, Enke CG. Triple quadrupole mass spectrometry for direct mixture analysis and structure elucidation. Anal Chem. 1979;51:1251.
Chapter 5
Bioinformatics Tools for SRM-MS
Typically, a selected reaction monitoring-mass spectrometry (SRM-MS) experiment quantifies a specific protein in a sample by analyzing a protease-digested peptide ion and one of its selected fragment ions as a pair, known as a transition. With optimized scheduling in current instrument operating software, hundreds of these transitions can be multiplexed in a single liquid chromatography-selected reaction monitoring (LC-SRM) experiment. Most discovery bottom-up proteomics workflows require computationally intensive postacquisition data analysis, and Lazar recently reviewed such computational and bioinformatics tools (Lazar 2017). SRM-MS, however, also requires preacquisition bioinformatic analysis to determine proteotypic peptides (PTPs) and their optimal transitions so that the proteins of interest can be quantified accurately, which calls for efficient bioinformatics tools. Brusniak et al. described the overall bioinformatics processing steps in a typical SRM-MS proteomics study, shown in Fig. 5.1, where the pre- (blue boxes) and post- (green boxes) data acquisition steps are described with their associated processes (Brusniak et al. 2012). A few of the tools are involved in only the pre- or the postacquisition steps, but most current software supports both stages of SRM assays. In this chapter we mostly discuss open-access tools; commercial software is discussed briefly at the end. In the preanalysis steps of an SRM experiment, proteotypic peptides are first selected for each protein in the target set, together with their preferred charge states and the optimum collision energy (CE), which is critical for optimum fragment ion signals. Peptides and their fragment ions can be selected using rule-based prediction models or from empirical spectral libraries acquired on a similar mass spectrometer.
The original version of this chapter was revised. The correction to this chapter is available at https://doi.org/10.1007/978-3-030-53433-2_10
© Springer Nature Switzerland AG 2020, Corrected Publication 2020 M. Hossain, Selected Reaction Monitoring Mass Spectrometry (SRM-MS) in Proteomics, https://doi.org/10.1007/978-3-030-53433-2_5
Fig. 5.1 Typical bioinformatic processing steps in SRM-MS proteomics studies. (With permission from Brusniak et al. (2012))
To maximize the number of transitions in a single LC-SRM-MS run, a scheduled (dynamic) SRM experiment with optimized LC retention times is an efficient approach, and the retention times can be obtained with computational prediction models. In SRM measurements with both isotopically labeled (typically C-terminal 15N and 13C) and endogenous peptides, the transition list includes matched heavy and light pairs of precursor and fragment m/z values for each labeled and endogenous peptide pair. Postanalysis of SRM-MS data includes three steps. The first step is the extraction of chromatograms from the measured data, using the Q1 m/z, Q3 m/z, and retention time values from the transition input with their respective tolerances; a smoothing algorithm can be applied at this point. The second step is the detection of confidently identified transitions, and the third step is to select quantifiable transitions and determine either the absolute or the relative quantity of each protein. Several bioinformatics tools have already been developed to support both the pre- and postanalysis stages in either an automated or a semi-automated manner.
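To make the idea of matched light and heavy transition pairs concrete, the following minimal sketch computes Q1 and Q3 m/z values for a tryptic peptide and its C-terminally labeled counterpart. It assumes monoisotopic masses, unmodified residues, singly charged y-ions, and fully 13C/15N-labeled C-terminal lysine or arginine; the function names and the example peptide are hypothetical and are not taken from any of the tools discussed in this chapter.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
# Assumed labels: 13C6 15N2 lysine and 13C6 15N4 arginine at the C-terminus.
HEAVY_SHIFT = {"K": 8.014199, "R": 10.008269}


def _mz(residues, charge, heavy, c_term):
    """m/z of a species made of `residues` plus water at the given charge."""
    mass = sum(RESIDUE_MASS[aa] for aa in residues) + WATER
    if heavy:
        # Raises KeyError if the peptide does not end in K or R.
        mass += HEAVY_SHIFT[c_term]
    return (mass + charge * PROTON) / charge


def precursor_mz(peptide, charge, heavy=False):
    """Q1 m/z of the intact peptide ion."""
    return _mz(peptide, charge, heavy, peptide[-1])


def y_ion_mz(peptide, n, charge=1, heavy=False):
    """Q3 m/z of the y_n ion, which contains the last n residues."""
    return _mz(peptide[-n:], charge, heavy, peptide[-1])


def transition_list(peptide, charge=2, y_ions=(3, 4, 5)):
    """Return (peptide, label, Q1, ion name, Q3) rows for light and heavy forms."""
    rows = []
    for label, heavy in (("light", False), ("heavy", True)):
        q1 = round(precursor_mz(peptide, charge, heavy), 4)
        for n in y_ions:
            q3 = round(y_ion_mz(peptide, n, 1, heavy), 4)
            rows.append((peptide, label, q1, f"y{n}", q3))
    return rows


if __name__ == "__main__":
    for row in transition_list("ELVISK"):
        print(row)
```

Because the label sits on the C-terminal residue, every y-ion carries the full mass shift, so each light transition has a heavy partner offset by the same amount in Q3 and by the shift divided by the charge in Q1.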
5.1 SRM Preacquisition Stage
SRM assay development is the major stage of the pre-data acquisition part, and protein selection is its first step. The most critical part, however, is to effectively select the peptide(s) and their fragment ion(s) that are used as surrogates for the cognate protein(s). Peptide selection is challenging because of missed or partial cleavages, nonuniqueness, and post-translational modifications. Software can help to avoid such problematic peptides (Colangelo et al. 2013). Another assay development challenge is spectral interference. Discovery data are often collected on high-resolution Orbitrap instruments, whereas QqQ instruments have unit resolution [0.7 Th full width at half maximum (FWHM)], with the consequence that transitions can overlap, which leads to erroneous peak identification and measurement. To guard against possible interference, researchers use multiple transitions for each protein, which the downstream peak integration algorithm can align for accurate peak integration. Multiple transitions also help in selecting highly sensitive and interference-free peaks (Prakash et al. 2009). For assays with only a few transitions, interfered spectra can be identified visually. For an assay with a large set of transitions, however, peaks can be evaluated by dot-product scores that the software generates by matching against library spectra, and disputed spectra can be flagged for further review. To accommodate the maximum number of transitions, LC-SRM runs can be scheduled to acquire subsets of peaks based on their retention time (RT), which can be predicted using SSRCalc (Krokhin et al. 2004; Krokhin and Spicer 2010) or other software (Guo 1986a, 1986b), or determined empirically by running internal standard peptides. Several free bioinformatics tools have been developed to help researchers generate SRM assays, including MRMaid (new version, MRMaid 2.0) (Mead et al. 2009; Fan et al. 2012), ATAQS (Brusniak et al. 2011), and Skyline (Maclean et al. 2010).
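The dot-product evaluation mentioned above can be illustrated with a short sketch. It assumes the score is a normalized dot product (cosine similarity) between the measured transition peak areas and the library fragment intensities; individual tools may apply additional transformations. The review threshold of 0.9 and the data layout are hypothetical choices made only for this example.

```python
import math


def normalized_dot_product(observed, library):
    """Cosine similarity of two {fragment name: intensity} dictionaries."""
    ions = sorted(set(observed) | set(library))
    obs = [observed.get(ion, 0.0) for ion in ions]
    lib = [library.get(ion, 0.0) for ion in ions]
    dot = sum(o * l for o, l in zip(obs, lib))
    norm = math.sqrt(sum(o * o for o in obs)) * math.sqrt(sum(l * l for l in lib))
    return dot / norm if norm else 0.0


def flag_for_review(peaks, library, threshold=0.9):
    """Return peptides whose measured pattern disagrees with the library.

    `threshold` is an assumed cut-off, not a recommended value.
    """
    return [
        peptide
        for peptide, observed in peaks.items()
        if normalized_dot_product(observed, library[peptide]) < threshold
    ]


if __name__ == "__main__":
    library = {
        "ELVISK": {"y3": 100.0, "y4": 50.0, "y5": 20.0},
        "LIVEK": {"y2": 80.0, "y3": 40.0},
    }
    peaks = {
        # y5 is strongly inflated here, as an interfering signal might do.
        "ELVISK": {"y3": 100.0, "y4": 50.0, "y5": 400.0},
        "LIVEK": {"y2": 8.0, "y3": 4.0},
    }
    print(flag_for_review(peaks, library))
```

The score depends only on relative intensities, so a peptide measured at a tenth of the library intensity still scores close to 1, whereas a single distorted transition lowers the score sharply.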
5.1.1 MRMaid
MRMaid (Mead et al. 2009; Fan et al. 2012) exploits available knowledge and resources to generate reliable transitions for SRM assays, so researchers need neither theoretical prediction of fragmentation nor discovery experiments. The new version, MRMaid 2.0, uses spectra from the European Bioinformatics Institute’s (EBI’s) PRIDE database (Vizcaino et al. 2015), which significantly increases the coverage and quality of transitions and allows support for human, mouse, and yeast. MRMaid consists of three main components. The first is a relational database designed to store transitions, which is periodically populated with transitions by the second component, the transition builder pipeline. Finally, the web site (www.mrmaid.info) provides a visual interface through which users can access transitions from the database. Using this tool, one can enter one or more protein names or accession numbers to obtain a list of transitions based on user-selected filter criteria: instrument type, amino acid, transition value, maximum number of peptides per protein, and maximum number of transitions per peptide. Several quantitative metrics are used in MRMaid to measure the suitability of peptides and fragment ions: the “peptide score,” which is designed to indicate the expected performance of the peptide in SRM; the “peptide or product ion probability,” which represents the probability of observing a specific peptide in the presence of the parent protein; and the “product ion intensity,” which represents the height of the ion peak in the spectrum. An estimated RT is also calculated from hydrophobicity using the SSRCalc method. Data can be exported in CSV or TraML (Deutsch 2012) format. One drawback is that the data come from a wide range of experiments with various objectives, different sample preparation and instrument methods, and various metrics. Transitions with modified peptides are not supported by MRMaid, and it can be used only for the preacquisition, or assay development, stage of SRM.
5.1.2 ATAQS
Automated and Targeted Analysis with Quantitative SRM (ATAQS) is the first open-source software that supports both pre- and postacquisition steps in SRM-MS: project management; generation of proteins, peptides, and transitions; and validation of peptide detection (Brusniak et al. 2011). The software is designed to support multiple users, can be accessed through a web browser on a personal computer, and can connect to institution-wide computer resources for high-throughput data analysis. The ATAQS workflow consists of seven steps, as shown in Fig. 5.2. The first five steps cover preacquisition, or assay development, and the last two steps cover postacquisition data analysis. In step one, the experiment must be annotated by selecting the exact mass spectrometry instrument and organism, as well as the researchers who share the project. In step two, the user needs to provide or select a list of proteins. ATAQS provides three curated disease-specific protein lists to choose from: (a) prostate tumor, containing 1055 proteins, (b) type II diabetes, containing 954 proteins, and (c) breast cancer-related human kinase signaling, containing 32 proteins. In step three, the user can explore the characteristics of the selected proteins using PIPE2 and revise the list. In step four, the user needs to select the peptides and the transitions using the filters. In step five, the user can select decoys or heavy/light pairs based on user-selected decoy-generating algorithms and labeling methods.
Fig. 5.2 Summary of ATAQS workflow. (Reprinted with permission from Brusniak et al. (2011))
ATAQS is web-based and connected to various web services: PeptideAtlas (used to select peptide spectra) (Deutsch et al. 2008), TIQAM (to generate in silico peptides for a given protein) (Lange et al. 2008), PIPE2 (to generate a list of proteins to design an SRM assay) (Ramos et al. 2011), and PABST (PeptideAtlas Best SRM Transition) to generate optimal transitions (Brusniak et al. 2011).
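Step five of the workflow refers to decoy-generating algorithms without describing them. The sketch below shows one widely used style of decoy, assumed here purely for illustration and not taken from ATAQS: the sequence is reversed while the C-terminal residue is kept, so the decoy has the same precursor mass as the target but different fragment masses. The function name and the shuffling fallback are hypothetical.

```python
import random


def decoy_sequence(peptide, seed=0):
    """Return a decoy with the same composition and C-terminal residue.

    The body (everything except the last residue) is reversed. If the body
    reads the same in both directions, it is shuffled instead; a body made
    of a single repeated residue cannot be changed and is returned as is.
    """
    body, c_term = peptide[:-1], peptide[-1]
    decoy_body = body[::-1]
    if decoy_body == body:
        rng = random.Random(seed)  # fixed seed keeps the decoy reproducible
        chars = list(body)
        for _ in range(20):
            rng.shuffle(chars)
            if "".join(chars) != body:
                break
        decoy_body = "".join(chars)
    return decoy_body + c_term


if __name__ == "__main__":
    for target in ("ELVISK", "AGDSLPR"):
        print(target, "->", decoy_sequence(target))
```

Decoy transitions built from such sequences give the postacquisition scoring step a set of signals that should not be present in the sample, which is what allows error rates for peptide detection to be estimated.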
Fig. 5.3 Skyline user interface with results, RT prediction and spectral library views. (Reprinted with permission from Maclean et al. (2010))
5.1.3 TIQAM
Several other software tools have been developed to select proteotypic peptides (PTPs) and their best-performing transitions. Lange and co-workers developed Targeted Identification for Quantitative Analysis by MRM (TIQAM) (Lange et al. 2008). It is a bundle of three executable Java applications (TIQAM-Digestor, TIQAM-PeptideAtlas, and TIQAM-Viewer) that can be run separately. It uses PeptideAtlas and other information resources to select the PTPs of the targeted proteins and generates a list of transitions that are used for SRM-triggered MS2 spectra acquisition. TIQAM integrates the acquired data to support the selection of the best-performing transitions whose identification has been confirmed by MS2 spectra. The workflow is shown in Fig. 5.4. Using the targeted proteomic strategy together with TIQAM, the authors were able to reliably quantify low-abundance virulence factors from cultures of the human pathogen Streptococcus pyogenes exposed to increasing amounts of plasma, which allowed them to define the subset of virulence proteins that is regulated upon plasma exposure. TIQAM must be downloaded, and a local MySQL database must be set up and configured.
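The in silico digestion step that tools such as TIQAM-Digestor perform can be sketched as follows. This is not the TIQAM implementation; it assumes the common trypsin rule (cleavage C-terminal to lysine or arginine unless the next residue is proline), and the length limits of 7 to 25 residues and the example sequence are hypothetical defaults chosen for the illustration.

```python
def tryptic_digest(sequence, missed_cleavages=0, min_len=7, max_len=25):
    """Return tryptic peptides of `sequence` in N- to C-terminal order.

    Cleavage is assumed after K or R unless the following residue is P.
    Peptides with up to `missed_cleavages` internal sites are included,
    and only peptides within the given length range are reported.
    """
    # Positions at which the chain is cut (index of the residue after the cut).
    sites = [
        i + 1
        for i, aa in enumerate(sequence[:-1])
        if aa in "KR" and sequence[i + 1] != "P"
    ]
    bounds = [0] + sites + [len(sequence)]

    peptides = []
    for i in range(len(bounds) - 1):
        last = min(i + 1 + missed_cleavages, len(bounds) - 1)
        for j in range(i + 1, last + 1):
            peptide = sequence[bounds[i]:bounds[j]]
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    # Hypothetical sequence; the R followed by P is not cleaved.
    print(tryptic_digest("ACDKEFGRPHIKLMN", missed_cleavages=1, min_len=1))
```

Setting `missed_cleavages` to zero yields only fully cleaved peptides, which are usually preferred as quantification surrogates; allowing missed cleavages is mainly useful for checking whether a candidate peptide is prone to incomplete digestion.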
Fig. 5.4 Skyline support for isotope-labeled internal standards and quantification. (Reprinted with permission from Maclean et al. (2010))
5.1.4 MaRiMba
MaRiMba is an open-source application for generating spectral library-based SRM transition lists; it is operated through a graphical user interface (GUI) that is incorporated into the Trans-Proteomic Pipeline (Sherwood et al. 2009). The software can (1) generate SRM transition lists based on spectral libraries, (2) include only user-selected proteins and peptides, (3) choose precursor peptides and/or product ions based on user-selected inputs such as isoelectric point (pI) or charge state, (4) exclude peptides assigned to more than one protein in a specified proteome, (5) calculate an RT prediction for each peptide for scheduled SRM based on algorithm 3.0 of SSRCalculator (Krokhin et al. 2004), and (6) add transitions corresponding to heavy peptides for stable isotope labeling (SIL). MaRiMba lets users design more customized SRM transition lists at higher speed, which expedites the workflow of SRM assays and expands the application of targeted proteomics to systems biology. For generating transition lists, both TIQAM and MRMaid depend on data repositories (PeptideAtlas and PRIDE, respectively), whereas MaRiMba can create lists either from public spectral libraries or from spectral libraries created directly from the user’s data.
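Item (4) in the list above, excluding peptides assigned to more than one protein, amounts to a uniqueness check across the proteome. The following sketch illustrates the idea and is not MaRiMba code. Treating isoleucine and leucine as equivalent is an added assumption (the two residues have the same mass and cannot be told apart by SRM), and the protein and peptide names are hypothetical.

```python
from collections import defaultdict


def proteotypic_only(peptides_by_protein, merge_il=True):
    """Keep only peptides that map to exactly one protein.

    peptides_by_protein: {protein id: list of peptide sequences}
    merge_il: if True, I and L are treated as the same residue when
        comparing sequences (an assumption made for this sketch).
    """
    def key(peptide):
        return peptide.replace("I", "L") if merge_il else peptide

    owners = defaultdict(set)
    for protein, peptides in peptides_by_protein.items():
        for peptide in peptides:
            owners[key(peptide)].add(protein)

    return {
        protein: [p for p in peptides if len(owners[key(p)]) == 1]
        for protein, peptides in peptides_by_protein.items()
    }


if __name__ == "__main__":
    digest = {
        "P1": ["ELVISK", "SHAREDK"],
        "P2": ["SHAREDK", "LIVEK"],
        "P3": ["LLVEK"],
    }
    print(proteotypic_only(digest))
```

In practice the input would come from a digest of every protein in the specified proteome, so that a peptide is accepted as a surrogate only when no other protein can produce it.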
5.1.5 PChopper
PChopper was developed to help design studies for SRM-based phosphorylation analysis (Afzal et al. 2011). Besides phosphorylation, it can be used to target other PTMs or simply to target regions within a selected protein sequence, using single or multiple enzymes. It was implemented using a service-oriented architecture (SOA), which enables quick and easy prediction of suitable enzymes and the resulting peptides for SRM experiments.
5.1.6 Picky
Zauber et al. developed a fast and user-friendly online tool for SRM and PRM assays, named Picky. Users need to provide only protein identifiers (gene symbols or UniProt accession numbers) for the human or mouse protein(s) of interest, and the tool then selects the related tryptic peptides and their empirical retention times from the ProteomeTools data set (Zauber et al. 2018). Picky selects the transitions with the most intense fragment ions when designing the SRM method, with several other user-defined options such as isotope labels, fragmentation types, and SRM dwell time with respect to protein abundance. For retention time (RT) scheduling, users can directly upload empirical RTs of their target peptides, or the RTs can be predicted from uploaded experimental peptide RTs acquired on the user’s LC system with the help of ProteomeTools; the authors found that more than 80% of RTs were predicted within a ±3 min elution time window, a considerable improvement compared with prediction by hydrophobicity scores. When the number of peptides monitored in parallel exceeds a user-defined threshold, the peptide acquisition list is further optimized by repeatedly removing the lowest-scoring peptide from the protein with the greatest number of targeted peptides. In this way, Picky selects the best set of peptides for the targeted assay under the given chromatographic settings and parameters. Picky exports an inclusion list, which can be imported into the acquisition software of the mass spectrometer, displays annotated fragmentation spectra, and exports the related spectral library, which can be imported into Skyline to validate the acquired targeted proteomics data (Zauber et al. 2018).
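The pruning rule described above (repeatedly drop the lowest-scoring peptide from the protein with the most targeted peptides until the number of peptides monitored in parallel is acceptable) can be sketched as follows. This is an illustration of the rule as stated in the text, not Picky's code; the record layout, the symmetric scheduling window, and the tie-breaking behavior are assumptions.

```python
from collections import Counter


def max_concurrent(peptides, window):
    """Largest number of peptides whose RT windows overlap at any time."""
    events = []
    for p in peptides:
        events.append((p["rt"] - window / 2, 1))   # window opens
        events.append((p["rt"] + window / 2, -1))  # window closes
    # Closings sort before openings at equal times, so windows that only
    # touch at an end point are not counted as overlapping.
    events.sort()
    current = best = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best


def prune(peptides, window, limit):
    """Remove peptides until at most `limit` are monitored in parallel."""
    kept = list(peptides)
    while kept and max_concurrent(kept, window) > limit:
        counts = Counter(p["protein"] for p in kept)
        # Ties are resolved in favor of the protein seen first (assumption).
        crowded = max(counts, key=counts.get)
        victim = min(
            (p for p in kept if p["protein"] == crowded),
            key=lambda p: p["score"],
        )
        kept.remove(victim)
    return kept


if __name__ == "__main__":
    targets = [
        {"protein": "A", "sequence": "PEPTIDEK", "rt": 10.0, "score": 0.9},
        {"protein": "A", "sequence": "ELVISK", "rt": 10.5, "score": 0.5},
        {"protein": "A", "sequence": "LIVEK", "rt": 11.0, "score": 0.7},
        {"protein": "B", "sequence": "SAMPLER", "rt": 10.2, "score": 0.8},
    ]
    for p in prune(targets, window=2.0, limit=3):
        print(p["protein"], p["sequence"])
```

Removing peptides from the best-covered protein first keeps at least some surrogate peptides for every target for as long as possible, at the cost of reduced redundancy for well-covered proteins.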
5.1.7 Skyline
Skyline has become the most popular software for both the pre- and postanalysis of SRM-MS assays in academia and industry, with over 8700 registered users, more than 64,000 installations, and over 1100 citations so far. Skyline (MacCoss group, University of Washington, WA) is an open-source, freely available Windows client application for method generation and quantitative mass spectrometry data analysis; the Skyline document model can hold extensive mass spectrometry data from various proteomics experiments (SRM, PRM, DDA, and DIA). It is designed to accelerate proteomics method development and the sharing of methods and analysis results across instrument platforms (Thermo, Sciex, Waters, Agilent, Bruker, and Shimadzu). The software was developed under the open-source Apache 2.0 License as part of the ProteoWizard project (Maclean et al. 2010). Here, mostly the aspects of Skyline related to SRM-MS proteomics are discussed.
The initial goal of Skyline was to develop a single informatics tool with a user-friendly interface, comprehensive file compatibility, processing of MS instrument vendor data, data visualization, and optimal computational requirements that can help to develop MS methods, including SRM-MS, and to analyze the data collected in chromatography-based quantitative MS experiments (Pino et al. 2017). Besides its core function, Skyline also allows the research community to integrate their informatics tools through an external tool store that supports point-and-click installation; the tools can be run from the Skyline Tools menu without modifying the Skyline codebase. Available tools include the Biodiversity plugin, MPPReport, MS1Probe, MSstats, Population Variation, Prego, Protter, QuaSAR, and SProCoP, with applications ranging from assay development to biological inference (Broudy et al. 2014).
Skyline supports all stages of assay development, or preanalysis, of SRM-MS, including the selection of target peptides and related transitions prior to MS acquisition, validation of transitions by MS/MS spectra, optimization of collision energy, and determination of retention times for scheduled SRM-MS. Target peptides are generally selected from empirical data, public databases, or predictive algorithms. Skyline lets researchers use preliminary empirical data to evaluate the responses of selected peptides and related interference in the chromatograms, and it also supports data from online repositories: PeptideAtlas (Desiere et al. 2006), Human Proteinpedia (Mathivanan et al. 2008), GPM database (Craig et al. 2004), and PRIDE (Vizcaino et al. 2015). It also implements the open-source PREGO algorithm (Searle et al. 2015) as a plug-in, which predicts highly responding peptides using an artificial neural network trained on DIA experimental data, because peptide signals in DIA data sets are a better representation of the same peptides in SRM data sets (Wu et al. 2014). Like peptide selection, transition signal response can be assessed in Skyline with empirical data to identify high-responding, interference-free transitions in the sample matrix under defined instrument conditions.
Prior knowledge of peptide elution times is critical for developing scheduled SRM. Skyline incorporates the peptide RT prediction tool SSRCalc and the indexed retention time (iRT) method, in which a standard set of reference peptides is used to calibrate RT prediction for any number of target peptides on a new chromatography system. After retention times have been predicted with either method, Skyline can export an acquisition table with the relevant information for a scheduled SRM-MS experiment. Skyline facilitates the necessary adjustment of the scheduling windows to match the instrument’s speed and the number of transitions eluting at each time point, with a visualization option in the retention time pane that displays the number of transitions eluting over the chromatographic gradient under several potential scheduling window lengths. An automated pipeline for optimizing collision energy (CE) is integrated in Skyline to achieve maximum product ion intensity, and the optimized experimental parameters can be stored in a library for sharing or future use (Fig. 5.3). Skyline also contains BiblioSpec, a software tool for creating and searching tandem MS peptide spectrum libraries, which provides more accurate information on fragment ion intensities and efficient spectral searching. It also supports MS/MS spectral library creation, in which it takes the best-scoring PSM from a variety of supported search engines (Mascot, Byonic, SEQUEST, IDPicker, MaxQuant, Morpheus, MS-GF+, OMSSA, PEAKS DB, etc.) as a reference spectrum, picking the most intense in the event of a tie. In addition, Skyline supports several sources of reference libraries, including PeptideAtlas, NIST, and GPM.
After a Skyline document has been generated with all the necessary settings and optimization, the finalized assay can be exported as a native method for QqQ instruments. Acquired MS data can then be imported into the Skyline document for further peptide and transition validation (Fig. 5.4). The process of export, acquisition, and refinement is repeated until the assay is fully optimized.
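The iRT idea described in this section, calibrating a new chromatography system with a few reference peptides and then predicting where target peptides will elute, can be illustrated with a simple linear fit. The sketch below assumes an ordinary least-squares line between iRT values and measured retention times and a symmetric scheduling window; it is not Skyline's implementation, and all iRT values, retention times, and names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def schedule(target_irts, slope, intercept, window_min):
    """Map {peptide: iRT} to {peptide: (window start, window end)} in minutes."""
    windows = {}
    for peptide, irt in target_irts.items():
        predicted_rt = slope * irt + intercept
        windows[peptide] = (
            predicted_rt - window_min / 2,
            predicted_rt + window_min / 2,
        )
    return windows


if __name__ == "__main__":
    # Reference peptides: library iRT values and RTs measured on the new system.
    reference_irt = [0.0, 50.0, 100.0]
    measured_rt = [5.0, 20.0, 35.0]
    slope, intercept = fit_line(reference_irt, measured_rt)

    targets = {"ELVISK": 40.0, "PEPTIDEK": 72.5}
    for peptide, (start, end) in schedule(targets, slope, intercept, 4.0).items():
        print(f"{peptide}: {start:.2f} to {end:.2f} min")
```

Once the line is fitted, only the reference peptides need to be remeasured when the gradient or column changes; the library iRT values of the targets stay the same.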
5.2 Databases Related to SRM-MS
The databases and informatics resources related to SRM-MS, including SRMAtlas (Kusebauch et al. 2016), PASSEL (Farrah et al. 2012), GPMDB (Craig et al. 2004; Walsh et al. 2009), PanoramaWeb (Sharma et al. 2014), CPTAC assay portal (Whiteaker et al. 2014), PeptidePicker (Mohammed et al. 2014), MRMAssayDB (Bhowmick et al. 2018), and ProteomeTools (Zolg et al. 2017), offer the possibility of retrieving targeted proteomics assays that can be applied directly to high-throughput and sensitive quantitative analysis of targeted protein/peptide sets without the need for additional method development. Here we describe three of these databases and knowledge bases: SRMAtlas, the CPTAC assay portal, and MRMAssayDB.
5.2.1 SRMAtlas
The SRMAtlas (www.srmatlas.org) is a database of targeted proteomics assays to detect and quantify proteins in biological matrices by mass spectrometry. It contains SRM-MS transitions from multiple data sources on various organisms: the yeast Saccharomyces cerevisiae, the pathogen Mycobacterium tuberculosis, and human. It provides high-quality assays of endogenous and synthetic peptides acquired on a QqQ MS, together with the related informatics, intended for the future development of SRM-based proteomic workflows without the requirement for additional method development. Currently, SRMAtlas contains ~28,000 SRM assays that cover 97% of the yeast S. cerevisiae proteome. The data set was developed by optimizing and validating assay results from numerous SRM-triggered MS2 analyses of yeast digests, where each assay provides multiple SRM coordinates: m/z of the peptide ion, charge state, m/z of the fragment ions with the highest signal intensities, hydrophobicity and retention times, collision energy, intensity ratios of fragment ions, and the type of QqQ instrument used (Picotti et al. 2008). It covers assays for yeast proteins involved in all biological processes and across the full dynamic range of cellular abundances, down to ~50 molecules/cell. The SRMAtlas database also comprises ~13,300 SRM assays that cover 97% of the 4012 annotated proteins of the human pathogen M. tuberculosis (Mtb), the causative agent of the devastating infectious disease tuberculosis (TB), from which ~1.4 million people die each year. The resource also includes absolute abundances for 55% of all Mtb proteins, ranging over four orders of magnitude. To evaluate the utility of this database, Schubert et al. monitored the entire Mtb dormancy survival regulon (DosR), which is linked to anaerobic survival and Mtb persistence, and found that this public database can support the sensitive, precise, and reproducible quantification of virtually any Mtb protein by a robust mass spectrometric method (Schubert et al. 2013). The human SRMAtlas was developed through the generation and verification of a collection of 158,000 highly specific SRM assays covering 99.7% of the 20,277 annotated human proteins. Kusebauch et al. reported data on 166,174 proteotypic peptides that provide various independent assays for the quantification of any human protein and its spliced variants, mutations, and post-translational modifications (Kusebauch et al. 2016). The development scheme is outlined in Fig. 5.5. The human SRMAtlas development process consists of five steps: (i) defining the target proteome, (ii) selection of proteotypic peptides, (iii) selection of peptides for protein isoforms, SNPs, and N-glycosylated proteins, (iv) development of SRM assays via synthetic peptides, and (v) compiling the data into SRMAtlas as an accessible resource.
Fig. 5.5 Human SRMAtlas development scheme. (Reprinted with permission from Kusebauch et al. (2016))
5.2.1.1 Defining the Target Proteome
The 20,277 manually annotated and reviewed human protein sequences and their 14,677 isoforms in the UniProt/Swiss-Prot database were used as the reference for selecting peptides in this initiative; membrane-bound proteins, large multidomain proteins, and protein activation events resulting in nontryptic cleavage sites were included equally.
5.2.1.2 Selection of Proteotypic Peptides
At least five of the best proteotypic peptides were selected from each protein while considering various physicochemical properties: peptide length, hydrophobicity, charge state, limitations in chemical synthesis, and the presence of chemically or post-translationally modified amino acids. As Kusebauch et al. reported, peptide selection was optimized mostly with empirical data; where such data were absent, the best peptides were selected using computational prediction tools. Initially, peptides were selected from the PeptideAtlas (Desiere et al. 2006) repository using an empirical suitability score (ESS) that considers peptide probability, the number of repeated identifications, and selected physicochemical properties. However, only 9946 out of the 20,277 UniProt/Swiss-Prot proteins were detected by at least one peptide, whereas 5319 proteins (26% of the proteome) were represented by five or more unique peptides. For the remaining proteins, computational prediction tools and algorithms such as PeptideSieve (Mallick et al. 2007) and PeptideAtlas best SRM transition (PABST) were used, and at least five peptides per protein were selected if not limited by the sequence.
5.2.1.3 Peptide Selection for Isoforms, SNPs, and N-Glycosylated Proteins
About 11,309 peptides were selected that represent unique splice forms, all suitable C-terminal peptides resulting in 6820 additional peptides of 20,277 proteins and 1937 peptides for splice variants. Out of 3662 SNP-containing peptides with a population frequency greater than 30% according to NCBI dbSNP, only 3094 peptides were selected after applying the peptide selection criteria. About 5199 membrane proteins, 1748 secreted proteins, and 784 membrane proteins from 47 tissue types were used to select 10,938 peptides representing N-glycosites located in the extracellular protein domain. In addition, 142 peptides from 129 proteins were selected to provide SRM assays for both the pre/pro-hormone and the activated form. Overall, 166,174 peptides from 20,255 proteins were selected using the iterative and comprehensive selection process, which constitutes 99.9% of the predicted human proteome according to UniProt/Swiss-Prot. SRM assays could not be developed for the remaining 22 proteins, as these were inaccessible through tryptic peptides that met the selection criteria (Kusebauch et al. 2016).
5.2.1.4 Development of SRM Assays Using Synthetic Peptides
To generate optimum SRM assays, all the selected peptides were chemically synthesized, and high-resolution, high-mass-accuracy reference fragment ion spectra were generated, with five different collision energies (CEs) and at least five MS/MS spectra per CE acquired from each peptide ion. Overall, 158,015 SRM assays were successfully developed from the 166,174 chemically synthesized peptides to build the human SRMAtlas (Kusebauch et al. 2016).
5.2.1.5 Compiling the Results into SRMAtlas Database
All developed SRM assays were uploaded to SRMAtlas to provide free and unlimited access to this compendium, where researchers can query for verified assay coordinates that include peptide sequence, precursor, fragment ions, charge states, fragment ratios, optimum CE for different MS instruments, retention time, hydrophobicity, and peptide uniqueness within the human proteome. In addition, it is linked to external knowledge bases: neXtProt, PeptideAtlas, the Human Protein Atlas, Pathway Commons, and SRMCollider (Kusebauch et al. 2016).
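Two of the assay coordinates listed above, the precursor and fragment ion m/z values at a given charge state, follow directly from the peptide sequence. The sketch below illustrates that arithmetic only; it is not SRMAtlas code. The function names are hypothetical, only unmodified peptides and y-type fragments are handled, and standard monoisotopic residue masses are assumed.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added to the sum of residues
PROTON = 1.007276   # mass of one charge-carrying proton


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in sequence) + WATER


def precursor_mz(sequence, charge):
    """m/z of the [M + zH]z+ precursor ion."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


def y_ion_mz(sequence, n, charge=1):
    """m/z of the y_n fragment (the last n residues plus water)."""
    fragment = sum(MONO[aa] for aa in sequence[-n:]) + WATER
    return (fragment + charge * PROTON) / charge


def transitions(sequence, precursor_charge=2):
    """List (precursor m/z, fragment label, fragment m/z) for y1..y(n-1)."""
    q1 = precursor_mz(sequence, precursor_charge)
    return [(q1, f"y{n}", y_ion_mz(sequence, n))
            for n in range(1, len(sequence))]
```

Calling `transitions("SVSLLEER")` would list candidate Q1/Q3 pairs for the peptide shown in Fig. 5.8; which of them make a good assay still has to be decided empirically, as described above.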
5.2.2 CPTAC Assay Portal
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) of the National Cancer Institute (NCI) launched its Assay Portal (http://assays.cancer.gov/) as an open-source repository to curate and disseminate highly characterized targeted mass spectrometry-based assays by providing detailed assay performance characterization data, standard operating procedures (SOPs), and access to reagents (Whiteaker et al. 2014). The main goal is to facilitate robust quantification of all human proteins and to harmonize the quantification of targeted MS-based assays over time and across laboratories (Whiteaker et al. 2016). The overall structure of the CPTAC Portal is divided into four parts, as shown in Fig. 5.6. (1) Database of qualified assays: the Portal facilitates queries of a database of well-characterized targeted MS-based assays. Information is collected from three sources: a web-based metadata collection form, a repository of characterization data, and links to external bioinformatics sites. The metadata collection form is completed by contributing laboratories during new data upload and contains information such as instrument type, matrix type, method parameters, publications, and detailed SOPs. (2) Repository of characterization data and processing scripts: targeted MS data are analyzed in Skyline and stored in a vendor-neutral repository called Panorama, which currently supports data from six vendors (Thermo, AB Sciex, Waters, Agilent, Bruker, and Shimadzu). It enables upload
Fig. 5.6 Overview of the CPTAC assay portal (a) Structure of the portal, and (b) Assay characterization to evaluate performance. (Reprinted with permission from Whiteaker et al. (2014))
and viewing of assay characterization data and download of Skyline documents for assay implementation. (3) Links to external information and resources: bioinformatics annotations are obtained from external sources for biologically relevant queries and for mapping of peptide analytes relative to sequence domains, isoforms, SNPs, and PTMs. The Portal uses bioDBnet, UniProt, PhosphoSitePlus, KEGG, BioGRID, and GeneCards. (4) Web-based interaction tool for exploration and visualization: the Portal user interface contains a main page where users can query and filter the available assays by target protein and then go to individual assay pages to obtain assay parameters, validation data, and downloadable content including raw data and SOPs (Whiteaker et al. 2014).
A framework for fit-for-purpose SRM assay validation has been established by CPTAC, with input from the research community through a workshop jointly organized by NCI and the National Heart, Lung, and Blood Institute, which recommended five experimental setups for assay characterization. Experiments 1 and 2 are required for each assay, whereas experiments 3–5 are optional. Experiment 1 is a response curve in which peptides are spiked into a background matrix to evaluate linearity, the intra-batch precision of LC-SRM analysis, and the upper and lower limits of quantification, and to determine selectivity. Experiment 2 evaluates repeatability by spiking three replicates of peptides at three concentrations (low, medium, and high) across 5 days. Experiment 3 measures selectivity by evaluating parallelism in multiple biological replicates of the test matrix. Experiment 4 evaluates the stability of peptides after sample preparation. Experiment 5 shows that endogenous peptides can be quantified in a relevant matrix (Whiteaker et al. 2014).
The CPTAC assay portal is organized into four levels. The first level, or landing page, facilitates database queries for validated assays to quantify proteins with biological relevance.
The second level provides a protein-centric view where peptide analytes are mapped relative to sequence domains, isoforms, SNPs, or PTMs. The third level provides detailed assay characterization through Panorama. The fourth level allows users to implement selected assays in their own laboratories by downloading Skyline documents, SOPs for sample preparation, and assay parameters, and by participating in a community-based discussion board (Whiteaker et al. 2014).
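The repeatability design of Experiment 2 (three replicates per concentration on each of 5 days) lends itself to a simple variability summary. The sketch below is one way to compute it under stated assumptions: the function names are hypothetical, and combining intra-day and inter-day coefficients of variation (CV) as a root sum of squares is an assumption about how a total CV could be reported, not a statement of the portal's own scripts.

```python
from math import sqrt
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


def repeatability(days):
    """Summarize one concentration level.

    days: list of per-day lists of replicate measurements,
          e.g. 5 days x 3 replicates.
    Returns (intra-day CV, inter-day CV, total CV), all in percent.
    """
    # Intra-day: average of the CVs observed within each day.
    intra = mean(cv_percent(day) for day in days)
    # Inter-day: CV of the daily mean values.
    inter = cv_percent([mean(day) for day in days])
    # Total: root sum of squares of the two components (assumed convention).
    total = sqrt(intra ** 2 + inter ** 2)
    return intra, inter, total
```

The same function would be called once per concentration level (low, medium, high) and per transition, so that each assay ends up with a small table of CVs.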
5.2.3 MRMAssayDB
MRMAssayDB, a free web-based interface, integrates multiple public repositories and knowledge bases into one comprehensive, periodically updated resource for validated targeted proteomics assays (Fig. 5.7) (Bhowmick et al. 2018). It captures SRM-based targeted proteomics assay data from PASSEL (Farrah et al. 2012), CPTAC (Whiteaker et al. 2014), PanoramaWeb (Sharma et al. 2014), SRMAtlas (Kusebauch et al. 2016), and PeptideTracker (Mohamed et al. 2016); links to UniProtKB, KEGG pathways, and Gene Ontology (GO) terms; and provides several visualization options at the peptide and protein level. It currently contains more than 168,000 assays covering more than 34,000 proteins from 63
Fig. 5.7 MRMAssayDB homepage with search functionality and statistics. (Reprinted with permission from Bhowmick et al. (2018))
organisms. More than 13,500 of these proteins are present in ~2300 KEGG pathways representing ~300 master pathways and map to ~13,000 GO biological processes. Each SRM assay record contains multiple pieces of information: UniProtKB accession number, protein name, gene name, organism, peptide sequence, uniqueness in the proteome, isoforms, labeled internal standard used, and a link to the original data source. All assays can be searched or filtered by partial protein name, protein accession, partial peptide sequence, biological pathway, or disease. After the initial search, additional filters can be applied, and a custom report can be downloaded. In the advanced search in MRMAssayDB, multiple search parameters (for example, protein name and location or pathway) can be used simultaneously within one search. As
mutations and PTMs are important in the study of different diseases, data on known disease-causing mutations and PTMs are mapped to the assays in MRMAssayDB. The database maps SRM assays to KEGG pathways (Kanehisa and Goto 2000) and extracts known protein–protein interactions from the STRING database (Szklarczyk et al. 2015), displaying both with color-coded schemes. It also links assays to Gene Ontology (GO) terms from the QuickGO web portal for functional annotation of the proteins, covering Biological Process, Molecular Function, and Cellular Component. The protein structure viewer in MRMAssayDB allows users to explore 3D protein structures and to see where the surrogate peptide is located: in a solvent-accessible region, embedded inside the folded structure, or forming part of an alpha helix or beta sheet. The database is built with Python, Java, Django, JSmol, Jvenn, Cytoscape.js, Ajax, and Selenium WebDriver, together with a suite of integrated bioinformatics tools.
5.2.4 ProteomeTools
ProteomeTools, a project to synthesize a library of ~1.4 million individual peptides extensively covering all canonical human proteins, was initiated to build molecular and digital tools that enable and enrich human proteomic research (Zolg et al. 2017). An overview of the ProteomeTools project is shown in Fig. 5.8. For this study, three different sets of peptides were used, comprising 330,286 nonredundant peptides covering 19,840 human genes as annotated in UniProt/Swiss-Prot. First, a proteotypic peptide set consisting of 124,875 peptides from 15,855 human UniProt/Swiss-Prot annotated genes covers proteins with prior mass spectrometric evidence available in ProteomicsDB. Second, a missing gene set was generated, which contained tryptic peptides mapping to genes that lacked confident experimental identification in ProteomicsDB and comprised 140,458 peptides
Fig. 5.8 Overview of the ProteomeTools project. (a) Planned segmentation of the 1.4 million peptides. (b) Estimation of synthesis success using peptide precursor intensity information for the peptide SVSLLEER and its byproducts. (c) Boxplots for the number of MS/MS spectra identifying a given peptide with very high confidence (Andromeda score > 100, total of 11.3 million PSMs in 11 types of MS/MS). (d) Distribution of peptide and protein identifications as a function of the Andromeda score. (Reprinted with permission from Zolg et al. (2017))
covering 4818 genes. Third, a subset of the SRMAtlas peptides consists of 90,967 peptides from 19,099 genes covering both proteins with empirical evidence and missing proteins.
Tryptic peptides were individually synthesized following the Fmoc-based solid-phase synthesis strategy, combined into pools of ~100 peptides, and spiked with 66 non-naturally occurring and 15 stable-isotope-labeled peptides for retention time calibration. Near-isobaric peptides were kept out of the same pool to prevent ambiguity in the MS data, and pools were composed to cover the entire LC gradient. For each peptide pool, an inclusion list was generated to target peptides for fragmentation in LC-MS/MS assays using five dissociation techniques (HCD, CID, ETD, EThcD, and ETciD) with ion trap or Orbitrap readout. HCD spectra were recorded at six different normalized collision energies in order to identify conditions for the measurement of peptides by SRM, PRM, or DIA/SWATH. Raw data were analyzed using MaxQuant, searching individual LC-MS/MS acquisitions against pool-specific databases. Reference mass spectra, termed PROSPEC for ProteomeTools Spectrum Compendium, were then generated for the identification and quantification of human peptides and proteins.
ProteomeTools can become a valuable resource for the proteomics community, including SRM-MS users, in several ways: the project team is ready to accept proposed sets of peptides from the research community for inclusion; it is willing to provide 100 clones of the peptide library to interested researchers for further exploration using alternative MS instruments, ion mobility devices, or various chromatographic tools; existing data in ProteomeXchange and ProteomicsDB can be reused and reanalyzed by any researcher; data are updated every 6 months; and software and/or statistical tools can be developed from current and future data for large-scale SRM, PRM, or DIA/SWATH projects, including those on functionally significant proteins.
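The pooling constraint described above (pools of about 100 peptides that avoid near-isobaric members) can be illustrated with a simple greedy assignment. This is only a sketch: the source does not describe the pooling procedure actually used by the project, and the function name, the mass tolerance, and the greedy strategy are all assumptions chosen for illustration.

```python
def assign_pools(peptides, pool_size=100, min_delta=0.01):
    """Greedily distribute peptides over pools.

    peptides:  list of (sequence, mass) tuples
    pool_size: maximum number of peptides per pool
    min_delta: two peptides closer in mass than this (Da) are treated as
               near-isobaric and never share a pool (assumed tolerance)
    Returns a list of pools, each a list of (sequence, mass) tuples.
    """
    pools = []
    for sequence, mass in sorted(peptides, key=lambda p: p[1]):
        for pool in pools:
            has_room = len(pool) < pool_size
            no_clash = all(abs(mass - other) >= min_delta for _, other in pool)
            if has_room and no_clash:
                pool.append((sequence, mass))
                break
        else:
            # No existing pool can take this peptide, so open a new one.
            pools.append([(sequence, mass)])
    return pools
```

A production scheme would also have to balance retention times so that each pool spans the LC gradient, which this sketch does not attempt.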
5.3 SRM-MS Data Acquisition on LC-MS
After optimization of peptide and transition selection as well as instrument parameters, SRM-MS data are acquired using a QqQ or QTrap MS interfaced with LC. Data acquisition and instrument operating software are generally provided by the corresponding instrument vendors. As the number of transitions increases, dwell time and cycle time must be minimized to accommodate all targets in a single LC/MS acquisition, occasionally at the expense of data quality (Colangelo et al. 2013). At least 5–8 data points are needed across the chromatographic elution profile of each transition for reliable quantification. Therefore, for nano-electrospray, the cycle time should be 2–3 s, assuming a peak width of 20–25 s. Colangelo et al. discussed three approaches currently used for SRM acquisition: conventional SRM, scheduled SRM, and triggered SRM or iSRM, as shown in Fig. 5.9 (Colangelo et al. 2013). Conventional SRM is the simplest: it cycles through all selected transitions during the entire chromatographic elution. The
Fig. 5.9 Various SRM acquisition techniques. (Reprinted with permission from Colangelo et al. (2013))
scheduled SRM relies on peptide retention time and a scheduling window for a set of selected transitions during LC-SRM acquisition (Stahl-Zeng 2007). In triggered SRM or iSRM, a single SRM transition is initially scheduled for acquisition within an elution window; if its signal intensity exceeds a user-defined threshold, it triggers the monitoring of additional transitions from the same precursor ion. This maximizes the use of dwell time and cycle time while providing accurate and quantitative measurements of both primary and secondary transitions (Kiyonami 2011).
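The trade-off between number of transitions, dwell time, and cycle time described in this section can be made concrete with a few lines of arithmetic. The sketch below uses hypothetical function names and an assumed fixed pause between transitions; real instruments differ in their overheads, so the numbers are indicative only.

```python
def points_per_peak(peak_width_s, cycle_time_s):
    """Number of data points sampled across one chromatographic peak."""
    return peak_width_s / cycle_time_s


def dwell_time_ms(cycle_time_s, n_transitions, pause_ms=5.0):
    """Dwell time left per transition when n transitions share one cycle.

    pause_ms is an assumed per-transition switching overhead.
    """
    return cycle_time_s * 1000.0 / n_transitions - pause_ms


def max_concurrent(windows):
    """Largest number of scheduling windows open at the same time.

    windows: list of (start_s, end_s) retention time windows,
             one per scheduled transition.
    """
    events = []
    for start, end in windows:
        events.append((start, 1))    # window opens
        events.append((end, -1))     # window closes
    events.sort()                    # at equal times, closings sort first
    open_now = best = 0
    for _, delta in events:
        open_now += delta
        best = max(best, open_now)
    return best
```

In conventional SRM every transition competes for the cycle, so `n_transitions` is the full list; in scheduled SRM only the value returned by `max_concurrent` has to fit into one cycle, which is why scheduling allows longer dwell times for the same target list.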
5.4 SRM Postacquisition Stage
The postacquisition stage starts after LC/MS acquisition. Its major steps are chromatogram extraction, detection of transitions, and relative/absolute quantification of transitions. Additional steps involve data quality assessment/filtering, data visualization/exploratory analysis, and fold change/statistical significance analysis (Colangelo et al. 2013). Several open-source bioinformatic tools are available for postacquisition data analysis of SRM-MS, including ATAQS, MRMer, TIQAM, and Skyline.
5.4.1 Data Analysis with ATAQS
Steps 6 and 7 of the ATAQS workflow support the postacquisition stage, in which the measured data set in mzXML or mzML format is selected and the experiments are grouped according to the transition list (Brusniak et al. 2011). The smoothed transition traces are then used by the transition group algorithm to select the best peak group. A transition trace usually contains several peaks; each peak is grouped with the nearest peaks of the whole transition set, and several properties are calculated to rank each peak group: summed intensities, RT deviation, and number of transitions in the peak group. Figure 5.10 shows an example in which the ranked score selects the correct peak group over an interfering peak group. The best peak group of a selected peptide is used to calculate seven classifiers for discriminant analysis. These peak group property values are used as features for the semi-supervised learning approach implemented in mProphet. The sum of the intensity values is used as the main feature, and the other six values are used as secondary features. The transition set is divided into a training set and a test set using tenfold cross-validation, and linear discriminant analysis (LDA) determines the linear combination of features that differentiates target transitions from decoy transitions. The user can also select an FDR cutoff to determine validated peptides in each sample. At the end, verified transitions can be exported in TraML format to share with the community (Brusniak et al. 2011). A unique feature of ATAQS is that it provides an interface to select optimum transitions of given peptides as well as to select biologically relevant proteins through PIPE2 (Ramos et al. 2011).
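The two ideas in this workflow, a linear combination of peak group features and an FDR cutoff derived from decoy transitions, can be sketched in a few lines. The code below is a simplified illustration with hypothetical names: the weights are supplied by the caller rather than learned by LDA, and the FDR is estimated by plain target-decoy counting, which is simpler than the statistical model used by mProphet.

```python
def discriminant_score(features, weights):
    """Linear combination of peak group features (weights assumed given)."""
    return sum(w * x for w, x in zip(weights, features))


def score_cutoff_at_fdr(scored, max_fdr=0.01):
    """Find the lowest score cutoff that keeps the estimated FDR in bounds.

    scored:  list of (score, is_decoy) tuples for the best peak groups
    max_fdr: accepted ratio of decoys to targets at or above the cutoff
    Returns the cutoff score, or None if no cutoff satisfies max_fdr.
    """
    ranked = sorted(scored, key=lambda s: s[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything scoring at least this high.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

Target peak groups scoring at or above the returned cutoff would be reported as validated; tied scores are not treated specially in this sketch.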
5.4.2 MRMer
Martin and co-workers developed an instrument-independent software platform, called MRMer, to manage highly complex SRM assays containing quantitative analyses using heavy/light isotope peptide pairs (Martin et al. 2008) (Fig. 5.11). It extracts information from MS files encoded in the platform-independent mzXML format,
Fig. 5.10 ATAQS validator. The best peak group selection algorithm selects correct peak group in presence of a strong interference peak group. (Brusniak et al. (2011), open access article distributed under the Creative Commons attribution license)
Fig. 5.11 A screenshot of MRMer showing precursor-product pairs from a standard digest of yeast enolase. (a) precursor-product pairs, (b) co-elution of three product ions from precursor m/z 643.860, (c) summary of quantitative information. (Martin et al. (2008), open access article distributed under the Creative Commons attribution license)
and automatically extracts and groups precursor-product pairs for visual validation of co-elution and calculates absolute and relative areas under the curve (AUCs) for standard and AQUA/SILAC type experiments, respectively. MRMer supports both small- and large-scale SRM-MS studies. Its grouping and visualization tools make peaks easy to evaluate. In addition, MRMer helps users to (1) evaluate whether the selected dwell time is adequate for the intensity of each targeted peak, (2) evaluate the adequacy of a data acquisition window for scheduled studies, and (3) quickly compare studies acquired across a range of adjustable parameters such as collision energy and source voltages. MRMer allows users to interactively select the start and end RT of the curve that is used for quantification of a given transition, and to manually accept verified transitions for a given peptide ion (Brusniak et al. 2012).
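The AUC-based quantification of heavy/light pairs between user-selected start and end retention times can be sketched as follows. This is a minimal illustration with hypothetical function names, assuming trapezoidal integration over the sampled points; it is not taken from the MRMer source code.

```python
def trapezoid_area(rt, intensity, start, end):
    """Area under a chromatogram between start and end RT (trapezoid rule).

    Only sampled points lying inside [start, end] are used.
    """
    pts = [(t, i) for t, i in zip(rt, intensity) if start <= t <= end]
    return sum((t2 - t1) * (i1 + i2) / 2.0
               for (t1, i1), (t2, i2) in zip(pts, pts[1:]))


def light_heavy_ratio(rt, light, heavy, start, end):
    """Relative AUC of the light (endogenous) over the heavy (labeled) trace."""
    return (trapezoid_area(rt, light, start, end)
            / trapezoid_area(rt, heavy, start, end))


def endogenous_amount(ratio, spiked_heavy_amount):
    """Amount of light peptide implied by a known spiked heavy amount."""
    return ratio * spiked_heavy_amount
```

For an AQUA-type experiment, a ratio of 2.0 with 50 fmol of heavy peptide spiked in would correspond to 100 fmol of the endogenous peptide, provided both traces are integrated over the same boundaries.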
5.4.3 Postacquisition Data Analysis with Skyline
After mass spectrometry acquisition, vendor-specific raw files or files in mzXML or mzML format are imported into Skyline, which caches the information in a single high-performance data file. Data processing results in a calculated peak area, or area under the curve (AUC), for each targeted peptide ion, visualization of the data, and cached chromatogram information for quick retrieval, as shown in Fig. 5.12. Mass spectrometry raw files are imported directly into the Skyline document, where RT and peak intensity for a specific m/z are extracted. The settings are usually prepared beforehand and can be imported from or exported to other Skyline documents, which helps in the analysis of similar samples and instrumentation. Peak detection is performed by the Chromatogram Retention time Alignment and Warping for Differential Analysis of Data (CRAWDAD) peak algorithm. Skyline generates peak groups for each targeted modified peptide by combining the raw peaks of its chromatograms and grouping them by RT overlap. For peak picking or peptide identification, the ten results from peak grouping are evaluated for the probability that they represent the peptide, using scores derived from both the CRAWDAD statistics and raw chromatogram data features with weighted coefficients. The peak group with the highest score is identified as the correct peak for a peptide. Peak area is calculated in Skyline as the total integrated area within the peak boundaries minus the background area. Although Skyline offers several smoothing options for display (1D, 2D, Savitzky-Golay, etc.), it uses the interpolated points of the unsmoothed chromatograms to calculate peak area. Chromatograms of each peptide are generally visualized with the peak boundaries and with indicators for the retention time and dot product of each detected peak.
Fig. 5.12 Data processing pipeline in Skyline. It extracts information from native, vendor-specific file formats or from portable files, and provides peak area calculation and visualization of the data. (Reprinted with permission from Pino et al. (2017))
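Two quantities mentioned in this subsection, a background-subtracted peak area and a dot product between observed and reference fragment intensities, can be sketched as below. The function names are hypothetical, and both definitions are assumptions made for illustration: the background is modeled as a straight line between the two boundary points, and the dot product is a normalized cosine similarity. Skyline's own implementation may differ in both respects.

```python
from math import sqrt


def background_subtracted_area(rt, intensity):
    """Peak area inside the boundaries minus an assumed linear background.

    rt, intensity: points from the left to the right peak boundary.
    """
    total = sum((rt[k + 1] - rt[k]) * (intensity[k] + intensity[k + 1]) / 2.0
                for k in range(len(rt) - 1))
    # Area under the straight line joining the two boundary points.
    background = (rt[-1] - rt[0]) * (intensity[0] + intensity[-1]) / 2.0
    return total - background


def dot_product(observed, library):
    """Normalized dot product (cosine similarity) of two intensity vectors.

    observed: peak areas of the monitored transitions
    library:  reference intensities of the same fragments, same order
    """
    numerator = sum(o * l for o, l in zip(observed, library))
    norm = (sqrt(sum(o * o for o in observed))
            * sqrt(sum(l * l for l in library)))
    return numerator / norm if norm else 0.0
```

A value close to 1 indicates that the relative transition intensities match the reference spectrum, which supports the peak assignment; values near 0 suggest interference or a wrong peak.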
5.4.4 SRMstats and MSstats
SRMstats has been developed as a general statistical modeling framework for protein significance analysis in SRM-based quantitative proteomics that can be applied to diverse experimental designs and workflows, both label-based and label-free (Chang et al. 2012). A conventional SRM bioinformatics analysis workflow includes assay development, LC-MS acquisition/processing, and protein significance analysis. The first two steps are supported by the various software tools already discussed in this chapter, whereas the last step is predominantly supported by SRMstats. It combines the quantitative measurements for a targeted protein across isotope labels, peptides, charge states, transitions, samples, and biological conditions to detect proteins that change in abundance between conditions in a statistically significant manner at a controlled FDR. It is implemented as an open-source R-based package that takes quantified endogenous and reference transitions as inputs and offers functionality for data visualization, quality control, model-based protein significance analysis, and planning of future experiments. It can be used stand-alone or as a module integrated with Skyline and ATAQS. The statistical framework for protein significance analysis in SRM measurements consists of four steps (problem statement, exploratory data analysis, model-based analysis, and statistical design of future experiments), as shown in Fig. 5.13. The problem statement (step 1) includes the comparisons of interest, the experimental design, and the scope of the conclusions of the study, which guide the selection of an appropriate statistical model for the data. Exploratory data analysis (step 2) visualizes the biological variation and the technical mass spectrometry acquisition variation in the data, which helps to select the most appropriate statistical model.
Model-based analysis (step 3) uses a linear mixed-effects model that characterizes the targeted quantitative measurements to detect systematic changes in protein abundance between conditions with high sensitivity and selectivity at a user-defined FDR. A distinctive feature for label-based experiments is that the software models the individual intensities directly rather than calculating the ratios of the endogenous and the reference
Fig. 5.13 Overview of SRMstats approach. Panels of the statistical framework: problem statement, exploratory data analysis, model-based analysis, and statistical design of future experiments. (Reprinted with permission from Chang et al. (2012))
transitions within a run. Statistical design of future experiments (step 4) helps to choose between label-based and label-free workflows and to calculate the required sample size for a future experiment, which allows researchers to substantially reduce the required number of biological replicates compared with the t test. If the between-run variation is relatively small, label-based and label-free workflows provide similar results, and in this case the label-free workflow can be considered a viable alternative because it is cheaper and higher in throughput.
MSstats (Kall and Vitek 2011; Clough et al. 2012) and its updated version MSstats 2.0 (Choi et al. 2014) are open-source R-based packages that use a flexible family of linear mixed models for statistical relative quantification of proteins and peptides in mass spectrometry-based proteomics. They support both label-based and label-free workflows with DDA, targeted, and DIA/SWATH acquisitions, including assays that make arbitrarily complex comparisons of experimental conditions or time points. MSstats takes identified and quantified spectral peaks as inputs and provides as output a list of differentially abundant peptides or proteins, or summaries of peptide or protein relative abundance. MSstats 2.0 is available as an external tool within Skyline (Broudy et al. 2014) or can be downloaded from Bioconductor (www.bioconductor.org) for use in an R command line workflow, which also facilitates the development and implementation of new statistical methodologies.
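Controlling the FDR across many tested proteins means adjusting the per-protein p-values for multiple testing. The sketch below shows the Benjamini-Hochberg adjustment as one common way to do this; that SRMstats or MSstats use exactly this procedure is an assumption here, and the function name is hypothetical. The example is written in Python for consistency with the other sketches in this chapter, although the packages themselves are R-based.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(names, pvalues, fdr=0.05):
    """Names of proteins whose adjusted p-value is at or below the FDR."""
    adjusted = benjamini_hochberg(pvalues)
    return [n for n, q in zip(names, adjusted) if q <= fdr]
```

Proteins returned by `significant` would be reported as changing in abundance between conditions at the chosen FDR, typically together with their estimated log fold changes.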
5.4.5 Automated SRM-MS Data Analysis Workflow for Large-Scale Studies
With advances in mass spectrometry instruments, workflows, and resources, improved SRM experimental design and data analysis capabilities now make it possible to handle large quantitative data sets containing hundreds of proteins in hundreds of samples. Surinova et al. described a robust and automated workflow for the analysis of large quantitative SRM data sets that integrates data processing, statistical protein quantification, and dissemination of the results by using three open-access bioinformatic tools (mProphet, SRMstats, and PASSEL), as shown in Fig. 5.14 (Surinova et al. 2013). Upon mass spectrometry acquisition, the downstream data analysis
Fig. 5.14 Computational and statistical analysis of a large-scale SRM data set. (Reprinted with permission from Surinova et al. (2013))
proceeds in multiple steps. (a) Per-run analysis with mProphet, which extracts targeted peptide transition groups from the raw data, discriminates between true and false positive peak group identifications using a probabilistic scoring model, and integrates the intensities of confidently identified peptide transition groups. (b) Between-run analysis with SRMstats, which summarizes the inherent SRM data structure with a linear mixed-effects model, performs protein significance analysis comparing various conditions, and generates detailed analysis reports and plots. (c) Data storage, retrieval, and query with the PASSEL repository, which enables SRM data sets together with their accompanying analysis files to be uploaded and further used by the research community (Surinova et al. 2013).
5.5 Commercial Software from Instrument Vendors
Mass spectrometry manufacturers provide vendor-specific software tools to develop SRM methods as well as to analyze acquired data using their instruments. Major software includes Pinpoint (ThermoScientific), MRMPilot (AB Sciex), MultiQuant (AB Sciex), TargetLynx (Waters), MassHunter (Agilent), etc.
5.5.1 Pinpoint (ThermoScientific)
Pinpoint facilitates the generation of targeted quantitative assays and Thermo-specific iSRM assays, and supports TSQ series and Orbitrap series mass spectrometers. It allows preacquisition assay development and postacquisition data analysis in a single software package. It is designed to automate the initial selection of SRM transitions by predicting proteotypic peptides and determining the best transitions. The method can then be transferred to Thermo instruments for data acquisition. Peptides are quantified by single-point, normal curve, and reversed curve approaches (Colangelo et al. 2013). Single-point quantification is designed for labeled experiments and provides relative quantification of the endogenous peptide against the heavy-labeled peptide. The normal curve approach can be applied to low-abundance endogenous peptides, whereas the reversed curve approach uses the light peptide as internal standard. In Pinpoint, transitions are rated, and RT reproducibility and CE optimization can be visualized through graphical plots (Brusniak et al. 2012). It also allows users to manually validate transitions and quantify proteins.
5.5.2 MRMPilot (AB Sciex)
MRMPilot is used to develop SRM assays on AB Sciex instruments and can generate assays for peptides with chemical modifications, PTMs, and heavy labels. MRMPilot then builds the Analyst software acquisition method. It develops
assays based on experimental results and organizes assay files. It also supports the MIDAS workflow, which uses an AB Sciex QTrap to trigger an MS/MS experiment only when an SRM signal goes above a specified threshold, enabling users to confirm the identification of the peptide (Unwin et al. 2009).
MultiQuant (AB Sciex) is used to detect and quantify SRM transition abundances from AB Sciex .wiff files and supports both relative and absolute quantification with unlabeled or labeled peptide internal standards. It supports data processing from the MIDAS workflow. For data analysis, it contains two peak integration algorithms, MQ4 and SignalFinder. MQ4 can group transitions for peak detection and integration and performs Gaussian smoothing, baseline subtraction, peak splitting, and noise calculation; it includes parameters such as peak width and RT window related to scheduled SRM. SignalFinder, on the other hand, uses a confidence-based iterative algorithm that can adapt to changes in peak shape, baseline, and noise. Methods can be imported or exported as text files, whereas results from data analysis can be exported either in AB Sciex MarkerView format or as regular text files. For absolute quantification, MultiQuant generates a standard curve on heavy peptides that can be used to determine the quantity in unknown samples. After calibration, a concentration and an accuracy are determined for the target peptides (Colangelo et al. 2013).
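The calibration step described above, fitting a standard curve and then reporting a back-calculated concentration and an accuracy, can be sketched as follows. This is a generic unweighted linear least-squares fit with hypothetical function names; the regression model and weighting actually applied by MultiQuant are not specified in the text and are not reproduced here.

```python
def fit_line(concentrations, responses):
    """Unweighted least-squares line: response = slope * conc + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def back_calculate(response, slope, intercept):
    """Concentration implied by a measured response on the fitted curve."""
    return (response - intercept) / slope


def accuracy_percent(calculated, nominal):
    """Back-calculated concentration as a percentage of the nominal value."""
    return 100.0 * calculated / nominal
```

Standards with known nominal concentrations are back-calculated to report accuracy, while unknown samples are passed through `back_calculate` alone. Calibration ranges that span several orders of magnitude often call for a weighted fit, which this sketch omits.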
5.5.3 TargetLynx (Waters)
TargetLynx software is used for data acquisition, processing, and reporting of SRM results from Waters QqQ instruments. It can evaluate and flag transitions based on set quality control criteria. The QuantOptimize Application Manager automates SRM method setup, in which a maximum of five transitions can be selected and optimized for collision energy. In TargetLynx, peak integration is performed by ApexTrack, which determines peaks and baselines according to tail, shoulder, and skewness characteristics. TargetLynx builds calibration curves on standard samples, with various polynomial curve fits available, to quantify samples of unknown concentration (Colangelo et al. 2013).
5.5.4 MassHunter (Agilent)
MassHunter allows fast and sensitive identification and configuration, which helps to generate methods for triggered SRM (tSRM) acquisition, an approach similar to iSRM on Thermo instruments. It allows the selection of ten transitions that can be used in any combination as primary and secondary transitions. It also allows optimization of collision energy and selection of the best transitions from empirical data. MassHunter can integrate and process raw files from Agilent instruments and perform relative or absolute quantification using unlabeled or labeled peptide internal standards. Reports of data analysis are stored in XML format or can be exported to Excel files (Colangelo et al. 2013).
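The triggering logic shared by tSRM and iSRM, monitoring secondary transitions only while a primary transition exceeds a threshold, can be sketched as below. The function names are hypothetical and the decision is made within the same cycle for simplicity; actual instrument firmware may trigger on the following cycle or apply additional rules, so this is an illustration of the principle rather than of any vendor's implementation.

```python
def triggered_plan(primary_trace, threshold, secondary):
    """Decide, cycle by cycle, which transitions are monitored.

    primary_trace: primary transition intensity in each acquisition cycle
    threshold:     user-defined trigger intensity
    secondary:     names of the secondary transitions of the same precursor
    Returns one list of monitored transition names per cycle.
    """
    plan = []
    for intensity in primary_trace:
        monitored = ["primary"]
        if intensity > threshold:
            # Signal is above threshold: add the confirming transitions.
            monitored = monitored + list(secondary)
        plan.append(monitored)
    return plan


def dwell_slots_used(plan):
    """Total number of transition measurements across all cycles."""
    return sum(len(cycle) for cycle in plan)
```

Comparing `dwell_slots_used` for the triggered plan with the cost of monitoring every transition in every cycle shows how much dwell time triggering frees up for other targets.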
References
Afzal V, Huang JTJ, Atrih A, Crowther DJ. PChopper: high throughput peptide prediction for MRM/SRM transition design. BMC Bioinformatics. 2011;12:338.
Bhowmick P, Mohammed Y, Borchers CH. MRMAssayDB: an integrated resource for validated targeted proteomics assays. Bioinformatics. 2018;34:3566–71.
Broudy D, Killeen T, Choi M, Shulman N, et al. A framework for installable external tools in Skyline. Bioinformatics. 2014;30:2521–3.
Brusniak M, Kwok ST, Christiansen M, Campbell D, et al. ATAQS: a computational software tool for high throughput transition optimization and validation for selected reaction monitoring mass spectrometry. BMC Bioinformatics. 2011;12:78.
Brusniak MK, Chu CS, Kusebauch U, Sartain MJ, Watts JD, Moritz RL. An assessment of current bioinformatics solutions for analyzing LC-MS data acquired by selected reaction monitoring technology. Proteomics. 2012;12:1176–84.
Chang CY, Picotti P, Huttenhain R, Heinzelmann-Schwarz V, Jovanovic M, Aebersold R, Vitek O. Protein significance analysis in selected reaction monitoring (SRM) measurements. Mol Cell Proteomics. 2012;11:1–12. https://doi.org/10.1074/mcp.M111.014662.
Choi M, Chang CY, Clough T, Broudy D, et al. MSstats: an R package for statistical analysis of quantitative mass spectrometry-based proteomic experiments. Bioinformatics. 2014;30:2524–6.
Clough T, Thaminy S, Ragg S, Aebersold R, Vitek O. Statistical protein quantification and significance analysis in label-free LC-MS experiments with complex designs. BMC Bioinformatics. 2012;13:S16.
Colangelo CM, Chung L, Bruce C, Cheung KH. Review of software tools for design and analysis of large-scale MRM proteomic datasets. Methods. 2013;61:287–98.
Craig R, Cortens JP, Beavis RC. Open source system for analyzing, validating, and storing protein identification data. J Proteome Res. 2004;3:1234–42.
Desiere F, Deutsch EW, King NL, et al. The PeptideAtlas project. Nucleic Acids Res. 2006;34:D655–8.
Deutsch EW. File formats commonly used in mass spectrometry proteomics. Mol Cell Proteomics. 2012;11:1612–21.
Deutsch EW, Lam H, Aebersold R. PeptideAtlas: a resource for target selection for emerging targeted proteomics workflows. EMBO Rep. 2008;9:429–34.
Fan J, Mohareb F, Bond N, Lilley KS, et al. MRMaid 2.0: mining PRIDE for evidence based SRM transitions. OMICS. 2012;16:483–8.
Farrah T, Deutsch EW, Kreisberg R, Sun Z, et al. PASSEL: the PeptideAtlas SRM experiment library. Proteomics. 2012;12:1170–5.
Guo D, Mant CT, Taneja AK, Parker JMR, et al. Prediction of peptide retention times in reversed-phase high-performance liquid chromatography I. Determination of retention coefficients of amino acid residues of model synthetic peptides. J Chromatogr A. 1986a;359:499–518.
Guo D, Mant CT, Taneja AK, Hodge RS. Prediction of peptide retention times in reversed-phase high-performance liquid chromatography II. Correlation of observed and predicted peptide retention times and factors influencing the retention times of peptides. J Chromatogr A. 1986b;359:519–32.
Kall L, Vitek O. Computational mass spectrometry-based proteomics. PLoS Comput Biol. 2011;7:e1002277.
Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30.
Kiyonami R, et al. Increased sensitivity, analytical precision and throughput in targeted proteomics. Mol Cell Proteomics. 2011;10:M110.002931, 1–11.
Krokhin OV, Spicer V. Predicting peptide retention times for proteomics. Curr Protoc Bioinformatics, Chapter 13, 2010, Unit 13.14, Wiley.
Krokhin OV, Craig R, Spicer V. An improved model for prediction of retention times of tryptic peptides in ion pair reversed-phase HPLC: its application to protein peptide mapping by off-line HPLC-MALDI MS. Mol Cell Proteomics. 2004;3:908–19.
Kusebauch U, Campbell DS, Deutsch EW, Chu CS, et al. Human SRMAtlas: a resource of targeted assays to quantify the complete human proteome. Cell. 2016;166:766–78.
Lange V, Malmstrom JA, Didion J, King NL, et al. Targeted quantitative analysis of Streptococcus pyogenes virulence factors by multiple reaction monitoring. Mol Cell Proteomics. 2008;7:1489–500.
Lazar IM. Bioinformatics resources for interpreting proteomics mass spectrometry data. Chapter 19. In: Lazar IM, et al., editors. Proteomics for Drug Discovery: Methods and Protocols. Methods in Molecular Biology, Vol. 1647. Springer Science+Business Media LLC; 2017.
Maclean B, Tomazela DM, Shulman N, Chambers M, Finney GL, Frewen B, Kern R, Tabb DL, Liebler DC, MacCoss MJ. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics. 2010;26:966–8.
Mallick P, Schirle M, Chen SS, Flory MR, et al. Computational prediction of proteotypic peptides for quantitative proteomics. Nat Biotechnol. 2007;25:125–31.
Martin DB, Holzman T, May D, Peterson A, et al. MRMer, an interactive open source and cross-platform system for data extraction and visualization of multiple reaction monitoring experiments. Mol Cell Proteomics. 2008;7:2270–8.
Mathivanan S, Ahmed M, Ahn NG, et al. Human Proteinpedia enables sharing of human protein data. Nat Biotechnol. 2008;26:164–7.
Mead JA, Bianco L, Ottone V, Barton C, et al. MRMaid: the web-based tool for designing multiple reaction monitoring (MRM) transitions. Mol Cell Proteomics. 2009;8:696–705.
Mohammed Y, et al. Peptide tracker: a knowledgebase for collecting and storing information on protein concentrations in biological tissues. Proteomics. 2016;106:151–61.
Mohammed Y, et al. PeptidePicker: a scientific workflow with web interface for selecting appropriate peptides for targeted proteomics experiments. J Proteomics. 2014;106:151–61.
Picotti P, Lam H, Campbell D, Deutsch EW, et al. A database of mass spectrometric assays for the yeast proteome. Nat Methods. 2008;5:913–4.
Pino LK, Searle BC, Bollinger JG, Nunn B, et al. The Skyline ecosystem: informatics for quantitative mass spectrometry proteomics. Mass Spectrom Rev. 2017:1–16.
Prakash A, Tomazela DM, Frewen B, Maclean B, Peterman S, MacCoss MJ. Expediting the development of targeted SRM assays: using data from shotgun proteomics to automate method development. J Proteome Res. 2009;8:2733–9.
Ramos H, Shannon P, Brusniak MY, Kusebauch U, et al. The protein information and property explorer 2: gaggle-like exploration of biological properties data within one web page. Proteomics. 2011;11:12, 78.
Schubert OT, Mouritsen J, Ludwig C, Rost HL, et al. The Mtb proteome library: a resource of assays to quantify the complete proteome of Mycobacterium tuberculosis. Cell Host Microbe. 2013;13:602–12.
Searle BC, Egertson JD, Bollinger J, Stergachis AB, et al. Using data independent acquisition to model high-responding peptides for targeted proteomics experiments. Mol Cell Proteomics. 2015;14:2331–40.
Sharma V, Eckels J, et al. Panorama: a targeted proteomics knowledge base. J Proteome Res. 2014;13:4205–10.
Sherwood CA, Eastham A, Lee LW, Peterson LE, et al. MaRiMba: a software application for spectral library-based MRM transition list assembly. J Proteome Res. 2009;8:4396–405.
Stahl-Zeng. High sensitivity detection of plasma proteins by multiple reaction monitoring of N-glycosites. Mol Cell Proteomics. 2007;6:1809–17.
Surinova S, Huttenhain R, Chang CY, Espona L, et al. Automated selected reaction monitoring data analysis workflow for large-scale targeted proteomic studies. Nat Protoc. 2013;8:1602–19.
Szklarczyk D, Franceschini A, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43:D447–52.
Unwin RD, Griffiths JR, Whetton AD. A sensitive mass spectrometric method for hypothesis-driven detection of peptide post-translational modifications: multiple reaction monitoring initiated detection and sequencing (MIDAS). Nat Protoc. 2009;4:870–7.
Vizcaino JA, Csordas A, del Toro N, et al. Update of the PRIDE database and its related tools. Nucleic Acids Res. 2015;44:D447–56.
Walsh GM, Lin S, Evans DM, Khosrovi-Eghbal A, et al. Implementation of a data repository-driven approach for targeted proteomics experiments by multiple reaction monitoring. J Proteomics. 2009;72:838–52.
Whiteaker JR, Halusa GN, et al. CPTAC assay portal: a repository of targeted proteomic assays. Nat Methods. 2014;11:703–4.
Whiteaker JR, Halusa GN, Hoofnagle AN, Sharma V, et al. Using the CPTAC assay portal to identify and implement highly characterized targeted proteomics assays. Methods Mol Biol. 2016;1410:223–36.
Wu C, Shi T, Brown JN, He J, et al. Expediting SRM assay development for large-scale targeted proteomics experiments. J Proteome Res. 2014;13:4479–87.
Zauber H, Kirchner M, Selbach M. Picky: a simple online PRM and SRM method designer for targeted proteomics. Nat Methods. 2018;15:156–7.
Zolg DP, Wilhelm M, Schnatbaum K, Zerweck J, et al. Building proteome tools based on a complete synthetic human proteome. Nat Methods. 2017;14:259–62.
Chapter 6
Quantification by SRM-MS
6.1 Protein Quantification by SRM-MS
Proteins play many important roles in biological systems, and the study of proteins on a proteome scale has therefore received increasing attention. Quantitative proteome analysis is essential for discovering biomarkers and following their quantitative changes in disease diagnosis, treatment, and progression. Sensitive MS-based proteomics approaches can identify many proteins; however, most changes in a perturbed biological system can be detected only from accurate quantitative information. In addition, biology, and systems biology in particular, increasingly requires quantitative data as input for modeling (Ong and Mann 2005). Quantification with selected reaction monitoring mass spectrometry (SRM-MS) can be performed in two ways, relative or absolute, as shown in Table 6.1. In relative quantification, the expression level of a cellular protein is defined in relation to another measurement of the same protein in a different state or sample. In absolute quantification, the total amount of the target protein is measured, for example as molarity, copies per cell, or weight per volume. Absolute quantification also supports relative comparisons, because the ratios of the target proteins can be determined from their known amounts in the samples being compared (Ong and Mann 2005).
© Springer Nature Switzerland AG 2020 M. Hossain, Selected Reaction Monitoring Mass Spectrometry (SRM-MS) in Proteomics, https://doi.org/10.1007/978-3-030-53433-2_6

Table 6.1 Protein quantification techniques using SRM-MS
| Quantification type | Labeling strategy | Quantification method |
| --- | --- | --- |
| Relative quantification | No labeling | Label free |
| Relative quantification | Metabolic stable isotope labeling | 15N ammonium sulfate; SILAC |
| Relative quantification | Chemical stable isotope labeling | ICAT; iTRAQ; mTRAQ |
| Relative quantification | Enzymatic stable isotope labeling | 18O water |
| Absolute quantification | Chemical stable isotope labeling | AQUA |
| Absolute quantification | Metabolic stable isotope labeling | QconCAT; PSAQ |

6.2 Relative Quantification
The relative quantification of proteins expressed in biological samples under different states is of great importance for characterizing proteins with important biological functions, screening disease-related biomarkers, and identifying drug targets (Zhou et al. 2014). Relative quantification with SRM-MS can be carried out either with a label-free technique or with stable isotope labeling methods. In the label-free technique, the mass spectrometry signals of endogenous peptides in different samples are measured sequentially and then compared. This technique is used mainly because it is simple, low-cost, and straightforward; however, its quantitative accuracy is limited compared with stable isotope methods. In those methods, stable isotope tags are attached to proteins or peptides, and the signals of light and heavy (or differently labeled) peptides are measured, with the labeled peptides serving as internal standards, which minimizes experimental and instrumental variability (Picotti and Aebersold 2012). The isotope labeling approach, however, requires expensive labels, specific software, and technical expertise for data analysis, and it is limited in the number of samples that can be analyzed in a single analysis. In addition, some labeling strategies cannot easily be applied to all kinds of samples (Neilson et al. 2011).
6.2.1 Label-Free Approach
The observation of a correlation between protein abundance in a sample and peak areas (Bondarenko et al. 2002; Chelius and Bondarenko 2002) or the number of MS/MS spectra (Liu et al. 2004), together with the development of high-resolution, highly sensitive, yet robust chromatography and mass spectrometry instrumentation and software, has led to the label-free quantification approach. This technique is convenient for large-scale proteome quantification, and its dynamic range and accuracy continue to improve. Label-free quantification can be carried out in two distinct ways: (1) area under the curve (AUC), where the signal intensity measurement is based on precursor ion spectra, and (2) spectral counting (SC), where the measurement is based on counting the MS/MS spectra of peptides assigned to a protein. A typical label-free workflow is shown in Fig. 6.1.
Fig. 6.1 Overview of the label-free quantitative MS experimental workflow. (Reprinted with permission from Neilson et al. (2011))
6.2.1.1 AUC or Signal Intensity Measurement
AUC involves measuring the chromatographic peak area of a given peptide in LC-MS runs. The peak area is linearly proportional to the concentration of the measured peptide in the range of 10 fmol–100 pmol, and to that of the measured protein in biological matrices in the range of 10 fmol–1 nmol (Neilson et al. 2011). A typical AUC-based protein quantification relies on measuring ion abundances at specific retention times for the given peptide ions as they elute from the chromatography column into the mass spectrometer. Several factors can affect the accuracy and reproducibility of AUC measurements among samples: (i) broad peptide retention-time windows that may cause overlap with co-eluting peptides, (ii) biological variation resulting in multiple spectra for the same peptide, (iii) technical variation in elution time or MS signal intensity, and (iv) noise from background chemical interferences (Neilson et al. 2011). This variability can be reduced by computational methods that process raw mass spectrometry data (retention time alignment, noise suppression, optimal peak picking, and signal abundance normalization across analyses) and/or by using a high-resolution mass spectrometer, which improves quantification accuracy.
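The peak-area measurement described above can be sketched in a few lines of Python. This is a minimal illustration and not the algorithm of any particular software package: the function names, the fixed integration window, the linear baseline, and the example traces are all assumptions made for the sketch.

```python
from typing import Sequence


def peak_area(rt_min: Sequence[float], intensity: Sequence[float],
              rt_start: float, rt_end: float) -> float:
    """Baseline-corrected trapezoidal peak area inside [rt_start, rt_end]."""
    # Keep only the points that fall inside the integration window.
    pts = [(t, i) for t, i in zip(rt_min, intensity) if rt_start <= t <= rt_end]
    if len(pts) < 2:
        return 0.0
    # Trapezoidal rule over consecutive points.
    total = sum(0.5 * (ia + ib) * (tb - ta)
                for (ta, ia), (tb, ib) in zip(pts, pts[1:]))
    # Assumed linear baseline between the first and last point of the window.
    (t0, i0), (t1, i1) = pts[0], pts[-1]
    baseline = 0.5 * (i0 + i1) * (t1 - t0)
    return max(total - baseline, 0.0)


def fold_change(area_sample: float, area_reference: float) -> float:
    """Relative abundance of one peptide between two runs."""
    return area_sample / area_reference


# Hypothetical traces of the same peptide in two runs (retention time in minutes).
rt = [10.0, 10.1, 10.2, 10.3, 10.4]
control = [10.0, 10.0, 110.0, 10.0, 10.0]
treated = [10.0, 10.0, 210.0, 10.0, 10.0]
ratio = fold_change(peak_area(rt, treated, 10.0, 10.4),
                    peak_area(rt, control, 10.0, 10.4))
```

In practice the integration window would come from peak picking after retention time alignment, and the areas would be normalized across runs before ratios are compared.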
6.2.1.2 Spectral Counting (SC)
Spectral counting relies on the observation that more abundant peptides are selected for fragmentation more often in data-dependent mass spectrometry acquisition and therefore produce more MS/MS spectra, so that the number of spectra is proportional to the amount of the corresponding protein in the sample (Liu et al. 2004). The spectral counting method has diversified from summing spectra to modifying counts with a normalization factor and to combining strategies for increased accuracy (Neilson et al. 2011). An estimate of the protein abundance in a sample can be calculated using the protein abundance index (PAI), defined as the number of observed peptides in the experiment divided by the number of observable tryptic peptides for each protein within the given mass range of the instrument. PAI was later modified to emPAI, which is the exponential form of PAI minus one. The protein content can also be expressed in molar percent, by dividing the emPAI value of a protein by the sum of all emPAI values and multiplying by 100. Peptides with different physicochemical properties may introduce variability and bias in mass spectrometry analyses. This can be addressed by the Absolute Protein Expression (APEX) measurement, which takes into account both the number of observed peptide mass spectra for a protein and the probability of the peptides being detected by the mass spectrometer. A correction factor Oi, which represents the expectation of observing a tryptic peptide in the analyses, is determined by a machine learning classification algorithm based mostly on peptide length and amino acid sequence. The length of a protein may also affect SC: a longer protein can generate more peptides, and hence more MS/MS spectra, than a shorter one. To take this into account, a normalized spectral abundance factor (NSAF) has been introduced, which is calculated by dividing the SC of a protein by its length (L) and then normalizing by the sum of SC/L over all proteins in the analysis.
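The PAI, emPAI, molar percent, and NSAF calculations described above can be written out as a short Python sketch. The function names and example proteins are hypothetical. The base of 10 used for emPAI is an assumption: it follows the commonly cited definition (emPAI = 10^PAI − 1), whereas the text above only states "exponential form".

```python
from typing import Dict


def pai(n_observed: int, n_observable: int) -> float:
    """Protein abundance index: observed / observable tryptic peptides."""
    return n_observed / n_observable


def empai(n_observed: int, n_observable: int) -> float:
    """Exponentially modified PAI; base 10 is assumed here."""
    return 10 ** pai(n_observed, n_observable) - 1


def molar_percent(empai_values: Dict[str, float]) -> Dict[str, float]:
    """Protein content (mol %) = emPAI / sum of all emPAI values * 100."""
    total = sum(empai_values.values())
    return {name: value / total * 100 for name, value in empai_values.items()}


def nsaf(spectral_counts: Dict[str, int], lengths: Dict[str, int]) -> Dict[str, float]:
    """NSAF_k = (SC_k / L_k) / sum over all proteins of (SC_i / L_i)."""
    per_length = {name: spectral_counts[name] / lengths[name]
                  for name in spectral_counts}
    total = sum(per_length.values())
    return {name: value / total for name, value in per_length.items()}


# Hypothetical example: two proteins with equal spectral counts but different lengths.
counts = {"protA": 10, "protB": 10}
lengths = {"protA": 100, "protB": 400}
nsaf_values = nsaf(counts, lengths)  # the shorter protein receives the larger NSAF
```

The example shows why length normalization matters: with identical spectral counts, the shorter protein is assigned the higher relative abundance.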
The dynamic range of NSAF is about four orders of magnitude, and fold changes in protein abundance as low as 1.4 can be detected (Zybailov et al. 2006). In an effort to achieve better results from label-free techniques, three features used in spectral counting have been combined into a spectral index (SI): peptide count (the number of unique peptides identified for the protein of interest), SC (the number of MS/MS spectra per peptide), and fragment ion intensity from the MS/MS spectra. SI is then converted to the normalized spectral index (SIN) by dividing the SI for protein k by the sum of the SI for all proteins in a replicate, and is further normalized by dividing by the length of the protein. The efficacy of SIN was found to be superior to that of other spectral counting and AUC methods.
6.2.1.3 SRM-MS as a Label-Free Quantification Approach
Label-free quantification has become a viable alternative to label-based techniques because of the high cost and technical expertise that the latter require. However, it is challenging to analyze low-abundance peptides with label-free techniques. SRM-MS
has already been established as an effective technique for analyzing lower-abundance peptides in complex matrices. A label-free SRM-MS method has been introduced in which label-free quantification was combined with SRM on a triple quadrupole mass spectrometer to analyze biological changes of adipokines (well known to release regulation factors associated with metabolic disorders) under oxidative stress (Choi et al. 2010). Initially, 250 fmol–2 pmol of standard β-galactosidase from Escherichia coli was spiked into two different mixtures of adipocyte lysates: one composed of part of the hydrophilic HPLC fraction (ACN 5–35%) and the other of part of the hydrophobic HPLC fraction (ACN 40–60%). Both mixtures were digested with trypsin, and three peptides of β-galactosidase were then quantified by label-free SRM-MS for method validation, with a linearity (R²) of 0.97–0.99 and a CV of 1.8–17.3%, as shown in Fig. 6.2. The methodology was then applied to discover biological changes of adipokines under oxidative stress. Eight proteins, out of 194 identified, were selected and quantified between control and hydrogen peroxide (H2O2)-treated groups, along with the three simultaneously monitored β-galactosidase peptides for normalization. The secretion levels of matrix metalloproteinase-2 (MMP-2), stromal cell-derived factor-1 (SDF-1, CXCL12), resistin, and complement factor D (CFD, adipsin) were found to be down-regulated, whereas the tissue inhibitor of metalloproteinase-2 (TIMP-2) and aldolase A were up-regulated. This label-free SRM approach thus provided an efficient quantitative analysis of targeted low-abundance proteins in complex mixtures without requiring any internal standard.
Fig. 6.2 Label-free quantification of spiked standard β-galactosidase. (a) SRM chromatogram of 250 fmol of tryptic β-galactosidase peptides. (b) Linearity and reproducibility of label-free SRM quantification. (Reprinted with permission from Choi et al. (2010))
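The validation figures quoted for this method, linearity (R²) and coefficient of variation (CV), and the normalization to a spiked standard can be computed as in the sketch below. It is an illustration only: the function names are hypothetical, and normalizing by the mean area of the spiked standard peptides is an assumption about how such a normalization might be done, not a description of the procedure of Choi et al. (2010).

```python
from statistics import mean, stdev
from typing import Sequence


def r_squared(amount: Sequence[float], area: Sequence[float]) -> float:
    """R^2 of a simple linear fit of peak area against spiked amount.

    For a one-variable least-squares fit this equals the squared
    Pearson correlation coefficient. Undefined if either input is constant.
    """
    mx, my = mean(amount), mean(area)
    sxx = sum((x - mx) ** 2 for x in amount)
    syy = sum((y - my) ** 2 for y in area)
    sxy = sum((x - mx) * (y - my) for x, y in zip(amount, area))
    return (sxy * sxy) / (sxx * syy)


def cv_percent(replicate_areas: Sequence[float]) -> float:
    """Coefficient of variation of replicate measurements, in percent."""
    return stdev(replicate_areas) / mean(replicate_areas) * 100


def normalize_to_standard(target_area: float,
                          standard_areas: Sequence[float]) -> float:
    """Target peak area relative to the mean area of the spiked standard peptides."""
    return target_area / mean(standard_areas)
```

A dilution series such as 250 fmol to 2 pmol would be passed to `r_squared` as the `amount` argument with the corresponding peak areas, and replicate injections of one level would be passed to `cv_percent`.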
6.2.2 Stable Isotope Labeling Approach
Quantitative proteomics using stable isotope labeling allows equivalent peptides or peptide fragments to be identified through the specific mass increase introduced by isotope-containing mass tags. The typical workflow includes labeling proteome samples at either the protein or the peptide level. The labeled samples are then mixed and fractionated by chromatography, followed by mass spectrometry analysis. The ratios of the two or more isotopic variants are determined from their spectra, and these ratios are then used to determine relative protein or peptide abundances (Chahrour et al. 2015). Protein or peptide labeling is of great importance in determining the accuracy, precision, and dynamic range of relative quantification in proteomics (Zhou et al. 2014). According to the properties of the labeling reagents, isotope labeling can be divided into (i) metabolic labeling, (ii) chemical labeling, and (iii) enzyme-catalyzed labeling. Each of these methods can be combined with SRM-MS. With metabolic labeling, peptide structure, fragmentation patterns, and relative signal intensities are conserved, so assays for the heavy peptides can be derived conveniently from the fragment-ion spectrum of the peptides and from knowledge of the site and type of label incorporation. Chemically attached tags, on the other hand, typically change the precursor ion mass and the peptide fragmentation pattern, so new SRM-MS assays need to be developed for the labeled peptides (Picotti and Aebersold 2012). Stable isotope labeling can also be categorized, in terms of the mass difference between light and heavy peptides or proteins, as non-isobaric, pseudo-isobaric, or isobaric labeling. In non-isobaric labeling, a single- or double-digit mass difference (in daltons) between light and heavy peptides is preferred, mainly to limit overlap of the isotopic clusters in mass spectra, as in SILAC labeling.
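For a non-isobaric metabolic label, deriving the heavy SRM transition from the light one and computing a light-to-heavy ratio can be sketched as follows. Several assumptions are made here and are not taken from the text: the peptide is tryptic with a single labeled residue at its C-terminus (so y-ions carry the label and b-ions do not), the labels are 13C6 15N2-lysine and 13C6 15N4-arginine with their standard monoisotopic mass shifts, and the ratio is taken as the sum of light transition areas over the sum of heavy transition areas, which is one common choice among several.

```python
from typing import Sequence, Tuple

# Monoisotopic mass shifts in Da (assumed labels: 13C6 15N2-Lys, 13C6 15N4-Arg).
LABEL_SHIFT_DA = {"K": 8.014199, "R": 10.008269}


def heavy_transition(precursor_mz: float, precursor_charge: int,
                     fragment_mz: float, fragment_charge: int,
                     fragment_type: str, c_term_residue: str) -> Tuple[float, float]:
    """Return (Q1, Q3) m/z of the heavy transition for a C-terminally labeled peptide."""
    shift = LABEL_SHIFT_DA[c_term_residue]
    # The precursor always contains the labeled residue.
    q1 = precursor_mz + shift / precursor_charge
    # Only C-terminal fragments (y-ions) contain the labeled residue.
    q3 = fragment_mz + (shift / fragment_charge if fragment_type == "y" else 0.0)
    return q1, q3


def light_to_heavy_ratio(light_areas: Sequence[float],
                         heavy_areas: Sequence[float]) -> float:
    """Peptide-level ratio from the summed peak areas of matched transitions."""
    return sum(light_areas) / sum(heavy_areas)


# Hypothetical light transition: doubly charged precursor, singly charged y-ion.
q1_heavy, q3_heavy = heavy_transition(500.0, 2, 600.0, 1, "y", "K")
```

Because fragmentation patterns and relative intensities are conserved, the same collision energy and transition ranking determined for the light peptide would normally be reused for the heavy one.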
In pseudo-isobaric labeling, the mass difference between light and heavy peptides is based on the small mass defects, at the millidalton (mDa) scale, of common stable isotopes and can be resolved only by a high-resolution mass spectrometer, as in NeuCode SILAC or pIDL labeling. In isobaric labeling, the masses of the light peptides and their heavy counterparts are identical, or differ too little to be distinguished by mass spectrometry; therefore, the intensities of reporter ions (iTRAQ) or of fragment ion pairs in the MS/MS spectrum (IPTL) are used for quantification (Zhou et al. 2014).
6.2.2.1 Metabolic Labeling
The initial attempts at metabolic labeling were based on growing simple organisms on 15N-enriched media and were first introduced for bacteria (Langen et al. 2000) and yeast (Oda et al. 1999). The approach was later extended to mammalian cells (Conrads et al. 2001); however, it was found to be unsustainable for most culture protocols because a scalable source of 15N-enriched serum was lacking (Chahrour et al. 2015). To overcome this challenge, Ong and co-workers introduced the SILAC labeling method, especially for mammalian cell culture (Ong et al. 2002). Introduction of the isotope as amino acids in the growth media effectively labels
each newly synthesized protein. Metabolic labeling provides high accuracy because it introduces less variation during sample preparation (Zhou et al. 2014). The study of systems biology requires highly sensitive analytical methodology for the fast and consistent identification and quantification of target sets of proteins across multiple samples. This has historically been attempted by Western blot, ELISA, or shotgun mass spectrometry-based techniques. Methods based on affinity reagents are limited by long assay development times and low multiplexing capability; the mass spectrometry methods used, on the other hand, were non-targeted and stochastically sampled the comparatively more abundant fraction of the proteome. Thus, they fail to generate complete and consistent data sets among samples (Picotti et al. 2009). Recently, a targeted proteomics approach based on SRM-MS and 15N metabolic labeling has been developed to detect and quantify proteins expressed to a concentration of