Evolutionary Biology – Concepts, Molecular and Morphological Evolution: 13th Meeting 2009 [1 ed.] 9783642123399, 9783642123405

The annual Evolutionary Biology Meetings in Marseille aim to bring together leading scientists, promoting an exchange of


English Pages 363 [367] Year 2010


Table of contents :
Front Matter....Pages i-xiv
Front Matter....Pages 1-1
Extinct and Extant Reptiles: A Model System for the Study of Sex Chromosome Evolution....Pages 3-17
Constraints, Plasticity, and Universal Patterns in Genome and Phenome Evolution....Pages 19-47
Starvation-Induced Reproductive Isolation in Yeast....Pages 49-65
Populations of RNA Molecules as Computational Model for Evolution....Pages 67-79
Pseudaptations and the Emergence of Beneficial Traits....Pages 81-98
Front Matter....Pages 99-99
Transferomics: Seeing the Evolutionary Forest Using Phylogenetic Trees....Pages 101-114
Comparative Genomics and Transcriptomics of Lactation....Pages 115-132
Evolutionary Dynamics in the Aphid Genome: Search for Genes Under Positive Selection and Detection of Gene Family Expansions....Pages 133-142
Mammalian Chromosomal Evolution: From Ancestral States to Evolutionary Regions....Pages 143-158
Mechanisms and Evolution of Dorsal–Ventral Patterning....Pages 159-177
Evolutionary Genomics for Eye Diversification....Pages 179-186
Do Long and Highly Conserved Noncoding Sequences in Vertebrates Have Biological Functions?....Pages 187-206
Front Matter....Pages 207-207
Male-Killing Wolbachia in the Butterfly Hypolimnas bolina ....Pages 209-227
Evolution of Immunosuppressive Organelles from DNA Viruses in Insects....Pages 229-248
The Neogastropoda: Evolutionary Innovations of Predatory Marine Snails with Remarkable Pharmacological Potential....Pages 249-270
Antennal Hammers: Echos of Sensillae Past....Pages 271-282
Adaptive Radiation of Neotropical Emballonurid Bats: Molecular Phylogenetics and Evolutionary Patterns in Behavior and Morphology....Pages 283-299
Trends in Rhizobial Evolution and Some Taxonomic Remarks....Pages 301-315
Convergent Evolution of Morphogenetic Processes in Fungi....Pages 317-328
Evolution and Historical Biogeography of a Song Sparrow Ring in Western North America....Pages 329-342
Front Matter....Pages 207-207
Cave Bear Genomics in the Paleolithic Painted Cave of Chauvet-Pont d’Arc....Pages 343-356
Back Matter....Pages 357-363

Evolutionary Biology – Concepts, Molecular and Morphological Evolution


Pierre Pontarotti Editor


Editor
Dr. Pierre Pontarotti
UMR 6632
Université d’Aix-Marseille/CNRS
Laboratoire Evolution Biologique et Modélisation, case 19
Place Victor Hugo 3
13331 Marseille Cedex 03
France
[email protected]

ISBN 978-3-642-12339-9
e-ISBN 978-3-642-12340-5
DOI 10.1007/978-3-642-12340-5
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2010933958
© Springer-Verlag Berlin Heidelberg 2010

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: WMXDesign GmbH, Heidelberg, Germany
Cover illustration: An antennal tip of a female parasitic wasp (Ichneumonidae: Cryptinae: Latibulus sp.). See Fig. 16.3b
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Preface

The 13th Evolutionary Biology Meeting was held in Marseille on 22–25 September 2009. These events aim to gather leading scientists involved in research on evolutionary biology, promoting an exchange of state-of-the-art knowledge and the initiation of inter-group collaborations. Over the past years, this has been rewarded by the publication of several important review articles dealing with this subject matter. For me personally, the Evolutionary Biology Meeting is a valuable scientific exchange platform serving as a booster for the use of evolutionary-based approaches not only in biology but also in other scientific fields.

In 2009, some 100 presentations (oral, as well as “fast presentation” and traditional posters) admirably reflected the epistemological nature of the meeting. For this book, I selected the most representative one fifth of the contributions; these 21 articles are organized into three categories: Evolutionary Biology Concepts, Genome/Molecular Evolution, and Morphological Evolution/Speciation.

I would like to thank the contributors to this book, as well as all other participants who helped make this meeting such a success, and our sponsors: the Université de Provence, CNRS, GDR BIM, Conseil Général 13, and Ville de Marseille. I gratefully acknowledge the support of members of the Association pour l’Etude de l’Evolution Biologique (AEEB). In addition, I am indebted to the staff of our publisher, Springer, for their competence and help. Last but not least, I sincerely wish to thank the AEEB coordinator, Axelle Pontarotti, for the excellent organization of the meeting and the production of the book.

In terms of collaborative scientific exchange and the publication of these proceedings, the scientific output of the 13th Marseille meeting reflects the high quality not only of individual contributions but also of the Marseille way of hosting, for which Axelle Pontarotti is an outstanding ambassador.

Marseille, France, May 2010

Pierre Pontarotti


Contents

Part I Evolutionary Biology Concepts

1 Extinct and Extant Reptiles: A Model System for the Study of Sex Chromosome Evolution
Daniel E. Janes

2 Constraints, Plasticity, and Universal Patterns in Genome and Phenome Evolution
Eugene V. Koonin and Yuri I. Wolf

3 Starvation-Induced Reproductive Isolation in Yeast
Eugene Kroll, R. Frank Rosenzweig, and Barbara Dunn

4 Populations of RNA Molecules as Computational Model for Evolution
Michael Stich, Carlos Briones, Ester Lázaro, and Susanna C. Manrubia

5 Pseudaptations and the Emergence of Beneficial Traits
Steven E. Massey

Part II Genome/Molecular Evolution

6 Transferomics: Seeing the Evolutionary Forest Using Phylogenetic Trees
John W. Whitaker and David R. Westhead

7 Comparative Genomics and Transcriptomics of Lactation
Christophe M. Lefèvre, Karensa Menzies, Julie A. Sharp, and Kevin R. Nicholas

8 Evolutionary Dynamics in the Aphid Genome: Search for Genes Under Positive Selection and Detection of Gene Family Expansions
Morgane Ollivier and Claude Rispe

9 Mammalian Chromosomal Evolution: From Ancestral States to Evolutionary Regions
Terence J. Robinson and Aurora Ruiz-Herrera

10 Mechanisms and Evolution of Dorsal–Ventral Patterning
Claudia Mieko Mizutani and Rui Sousa-Neves

11 Evolutionary Genomics for Eye Diversification
Atsushi Ogura

12 Do Long and Highly Conserved Noncoding Sequences in Vertebrates Have Biological Functions?
Yoichi Gondo

Part III Morphological Evolution/Speciation

13 Male-Killing Wolbachia in the Butterfly Hypolimnas bolina
Anne Duplouy and Scott L. O’Neill

14 Evolution of Immunosuppressive Organelles from DNA Viruses in Insects
Brian A. Federici and Yves Bigot

15 The Neogastropoda: Evolutionary Innovations of Predatory Marine Snails with Remarkable Pharmacological Potential
Maria Vittoria Modica and Mandë Holford

16 Antennal Hammers: Echos of Sensillae Past
Nina Laurenne and Donald L.J. Quicke

17 Adaptive Radiation of Neotropical Emballonurid Bats: Molecular Phylogenetics and Evolutionary Patterns in Behavior and Morphology
Burton K. Lim

18 Trends in Rhizobial Evolution and Some Taxonomic Remarks
Julio C. Martínez-Romero, Ernesto Ormeño-Orrillo, Marco A. Rogel, Aline López-López, and Esperanza Martínez-Romero

19 Convergent Evolution of Morphogenetic Processes in Fungi
Sylvain Brun and Philippe Silar

20 Evolution and Historical Biogeography of a Song Sparrow Ring in Western North America
Michael A. Patten

21 Cave Bear Genomics in the Paleolithic Painted Cave of Chauvet-Pont d’Arc
Céline Bon and Jean-Marc Elalouf

Index


Contributors

Yves Bigot, Laboratoire d’Etude des Parasites Génétiques, Parc Grandmont, Université de Tours, U.F.R. des Sciences et Techniques, 37200 Tours, France

Céline Bon, CEA, IBiTec-S, F-91191 Gif-sur-Yvette cedex, France, celine.bon@cea.fr

Sylvain Brun, UFR des Sciences du Vivant, Université de Paris 7 – Denis Diderot, 75205 Paris Cedex 13, France; Institut de Génétique et Microbiologie, UMR CNRS – Université de Paris 11, UPS Bât. 400, 91405 Orsay cedex, France

Barbara Dunn, Department of Genetics, Stanford University, Stanford, CA 94305, USA

Anne Duplouy, School of Biological Sciences, The University of Queensland, Brisbane, QLD 4072, Australia, [email protected]

Jean-Marc Elalouf, CEA, IBiTec-S, F-91191 Gif-sur-Yvette cedex, France

Brian A. Federici, Department of Entomology and Interdepartmental Graduate Programs in Genetics and Microbiology, University of California, Riverside, CA 92521, USA; Laboratoire d’Etude des Parasites Génétiques, Parc Grandmont, Université de Tours, U.F.R. des Sciences et Techniques, 37200 Tours, France, [email protected]

Yoichi Gondo, Mutagenesis and Genomics Team, RIKEN BioResource Center, 3-1-1 Koyadai, Tsukuba 305-0074, Japan, [email protected]

Mandë Holford, York College and Graduate Center, and The American Museum of Natural History, The City University of New York, NY, USA, mholford@york.cuny.edu

Daniel E. Janes, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138-3899, USA, [email protected]

Eugene V. Koonin, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA, [email protected]

Eugene Kroll, Division of Biological Sciences, University of Montana, Missoula, MT 59812, USA, [email protected]

Nina Laurenne, Museum of Natural History, Entomology Division, University of Helsinki, P.O. Box 17 (P. Arkadiankatu 13), 00014 Helsinki, Finland, nina. [email protected]

Christophe M. Lefèvre, Institute for Technology Research and Innovation, Deakin University, Waurn Ponds, Geelong, VIC 3217, Australia; CRC for Innovative Dairy Products, Department of Zoology, University of Melbourne, Melbourne, VIC 3010, Australia; Victorian Bioinformatics Consortium, Monash University, Clayton, Melbourne, VIC 3080, Australia, [email protected]

Burton K. Lim, Department of Natural History, Royal Ontario Museum, 100 Queen’s Park, Toronto, Ontario M5S 2C6, Canada, [email protected]

Aline López-López, Centro de Ciencias Genómicas, UNAM, Av. Universidad, Cuernavaca, Morelos 62210, México

Julio C. Martínez-Romero, Centro de Ciencias Genómicas, UNAM, Av. Universidad, Cuernavaca, Morelos 62210, México

Esperanza Martínez-Romero, Centro de Ciencias Genómicas, UNAM, Av. Universidad, Cuernavaca, Morelos 62210, México, esperanzaeriksson@yahoo.com.mx

Steven E. Massey, Biology Department, University of Puerto Rico – Rio Piedras, P.O. Box 23360, San Juan, Puerto Rico 00931, USA, [email protected]

Karensa Menzies, Institute for Technology Research and Innovation, Deakin University, Waurn Ponds, Geelong, VIC 3217, Australia; CRC for Innovative Dairy Products, Department of Zoology, University of Melbourne, Melbourne, VIC 3010, Australia

Claudia Mieko Mizutani, Department of Biology, Case Western Reserve University, 10900 Euclid Ave, Cleveland, OH 447080, USA; Department of Genetics, Case Western Reserve University, 10900 Euclid Ave, Cleveland, OH 447080, USA, [email protected]

Maria Vittoria Modica, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy, [email protected]

Kevin R. Nicholas, Institute for Technology Research and Innovation, Deakin University, Waurn Ponds, Geelong, VIC 3217, Australia; CRC for Innovative Dairy Products, Department of Zoology, University of Melbourne, Melbourne, VIC 3010, Australia

Scott L. O’Neill, School of Biological Sciences, The University of Queensland, Brisbane, QLD 4072, Australia

Atsushi Ogura, Division of Advanced Sciences, Ochadai Academic Production, Ochanomizu University, Ohtsuka 2-1-1, Bunkyo, Tokyo 112-8610, Japan, ogura. [email protected]

Morgane Ollivier, INRA, UMR1099 BiO3P, Domaine de la Motte, F-35653 Le Rheu, France

Ernesto Ormeño-Orrillo, Centro de Ciencias Genómicas, UNAM, Av. Universidad, Cuernavaca, Morelos 62210, México

Michael A. Patten, Oklahoma Biological Survey and Department of Zoology, University of Oklahoma, 111 E. Chesapeake Street, Norman, OK 73019, USA, [email protected]

Donald L.J. Quicke, Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire SL5 7PY, UK; Department of Entomology, Natural History Museum, London, SW7 5BD, UK

Claude Rispe, INRA, UMR1099 BiO3P, Domaine de la Motte, F-35653 Le Rheu, France, [email protected]

Terence J. Robinson, Evolutionary Genomics Group, Department of Botany and Zoology, University of Stellenbosch, Private Bag X1, Matieland 7602, South Africa, [email protected]

Marco A. Rogel, Centro de Ciencias Genómicas, UNAM, Av. Universidad, Cuernavaca, Morelos 62210, México

R. Frank Rosenzweig, Division of Biological Sciences, University of Montana, Missoula, MT 59812, USA

Aurora Ruiz-Herrera, Unitat de Citologia i Histologia, Departament de Biologia Cel·lular, Fisiologia i Inmunologia, Universitat Autònoma de Barcelona, Campus Bellaterra, 08193 Barcelona, Spain; Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Campus Bellaterra, 08193 Barcelona, Spain, [email protected]

Julie A. Sharp, Institute for Technology Research and Innovation, Deakin University, Waurn Ponds, Geelong, VIC 3217, Australia; CRC for Innovative Dairy Products, Department of Zoology, University of Melbourne, Melbourne, VIC 3010, Australia

Philippe Silar, UFR des Sciences du Vivant, Université de Paris 7 – Denis Diderot, 75205 Paris Cedex 13, France; Institut de Génétique et Microbiologie, UMR CNRS – Université de Paris 11, UPS Bât. 400, 91405 Orsay cedex, France, [email protected]

Rui Sousa-Neves, Department of Biology, Case Western Reserve University, 10900 Euclid Ave, Cleveland, OH 447080, USA

Michael Stich, Dpto de Evolución Molecular, Centro de Astrobiología (CSIC-INTA), Ctra de Ajalvir, km 4, Torrejón de Ardoz, Madrid 28850, Spain, [email protected]

David R. Westhead, Institute of Molecular and Cellular Biology, University of Leeds, Garstang Building, Leeds LS2 9J, UK, [email protected]

John W. Whitaker, Institute of Molecular and Cellular Biology, University of Leeds, Garstang Building, Leeds, LS2 9J, UK, [email protected]

Yuri I. Wolf, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA

Part I Evolutionary Biology Concepts

Chapter 1

Extinct and Extant Reptiles: A Model System for the Study of Sex Chromosome Evolution

Daniel E. Janes

Abstract The evolution and functional dynamics of sex chromosomes are focuses of current biological research. Although common organismal morphologies and functions of males and females are found among amniotes, underlying sex chromosome organizations and sex-determining mechanisms are widely variable. This chapter investigates the role that reptiles play in the study of sex chromosome evolution. Reptile studies have described the coevolution of genotypic sex determination and viviparity, the adaptive significance of sex-determining mechanisms, and shared ancestry of chromosomes. Novel resources, including whole-genome sequences and mapped sex-linked markers, have allowed researchers to examine sex chromosome evolution in reptiles, an important group for this type of study for their position as the sister group to mammals. Compared with mammals, reptiles exhibit much more variability in sex chromosome organization, providing raw material for study of sex chromosome evolution across amniotes.

1.1 Introduction

Embryos develop as either male or female depending on factors that vary widely among amniotes. Broadly speaking, amniotes can be classified as either genotypically sex-determined (GSD) or temperature-dependently sex-determined (TSD). Embryos of GSD species, including all mammals, birds, snakes, and many lizards and turtles, develop as either male or female depending on chromosomal contributions from parents at conception. Many, but not all, of these species exhibit detectable

D.E. Janes Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138-3899, USA e-mail: [email protected]

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_1, # Springer-Verlag Berlin Heidelberg 2010


cytogenetic sex differences (i.e., heteromorphic sex chromosomes). The difference between heteromorphic and homomorphic sex chromosomes could be explained by the length of the interval since the origin of genotypic sex determination in a species (Ohno 1967; Janes et al. 2010b). Apparently, sex chromosomes begin to diverge from each other only after a new GSD system arises (see Sect. 1.3.1). This sex difference in karyotype is not apparent in individuals of TSD amniotes that develop as male or female primarily in response to incubation temperature, including all crocodilians, tuataras, and some turtles and lizards. In this review, I will describe the variability of sex-determining mechanisms among amniotes. This variability includes, for example, the temperatures that trigger male or female development and the timing of temperature’s effect among TSD species, as well as the presence or absence and type of sex chromosomes in GSD species. Almost all mammals exhibit male heterogamety in which females carry two X sex chromosomes of the same size and content, whereas males carry one X sex chromosome and one smaller, degenerated Y sex chromosome. In birds, females are heterogametic which means they carry the smaller, degenerated W sex chromosome and one larger, more gene-rich Z sex chromosome, whereas male birds carry two Z sex chromosomes. This difference in heterogamety affects the genomics of amniotes in ways that are discernible from genome sequencing and experimental evidence. Further, the evolutionary history of sex-determining mechanisms informs the different arrangements of amniotic sex chromosomes that have been studied using techniques that include phylogenetic inference, cytogenetic mapping, and measurements of population genetics parameters. Recent studies of sex-determining mechanisms and, specifically, the evolution of sex chromosomes have focused on extinct and extant reptiles for two reasons. 
First, nonavian reptiles exhibit a greater variety of sex-determining mechanisms and sex chromosomes than birds or mammals. Second, genomic resources for reptiles (including birds) have recently improved to an extent that previously untestable hypotheses are now open to experimentation and comparative analyses (Janes et al. 2008).

1.2 Sex-Determining Mechanisms

1.2.1 Patterns and Variability

Amniote sex-determining mechanisms are typically described as either GSD or TSD but within those categories, functional patterns vary. As described above, GSD species vary in their organization of sex chromosomes [i.e., female heterogamety (ZW system) or male heterogamety (XY system)] (Fig. 1.1a). Phylogenetic inference and comparative chromosome hybridizations suggest that male and female heterogamety have evolved more than once among amniotes although the exact number of independent origins is debated (Ezaz et al. 2009; Organ and Janes 2008). Likewise, the number of independent origins of temperature-dependent sex

[Figure 1.1. Panel (a): sex chromosome pairs under male heterogamety (XX, XY), female heterogamety (ZW, ZZ), and no heterogamety (AA, AA). Panel (b): percentage of male offspring per clutch (y-axis) against incubation temperature (x-axis) for Type Ia TSD, Type Ib TSD, Type II TSD, and GSD species with male, female, or no heterogamety.]

Fig. 1.1 (a) Pairs of sex chromosomes that consist of either a male-specific Y chromosome and an X chromosome or a female-specific W chromosome and a Z chromosome. Species that exhibit these sex chromosomes are described as either male heterogametic (XY system) or female heterogametic (ZW system). Other GSD species exhibit no detectable heterogameties or sex differences in karyotype. (b) Influence of incubation temperature on offspring sex ratios among temperature-dependently (TSD) and genotypically sex-determined (GSD) species. The y-axis models the proportion of males yielded per clutch of eggs incubated at different points on the thermal gradient indicated on the x-axis. Sex-determining response to incubation temperature follows one of three patterns (Type Ia, Ib, or II) in TSD species. GSD species produce similarly balanced offspring sex ratios regardless of incubation temperature or type of heterogamety

determination is not clear. Although the sex-determining mechanisms of two or more species may respond to incubation temperature in a similar manner, the similarity may represent convergence. Three basic patterns of sex-determining response to incubation temperature (Types Ia, Ib, and II) have been described (Fig. 1.1b) (Bull 1983). Species that exhibit Type Ia temperature-dependent sex determination, such as loggerhead (Caretta caretta), green (Chelonia mydas), and leatherback (Dermochelys coriacea) sea turtles, produce more male offspring from eggs incubated at cooler temperatures (Standora and Spotila 1985). Species with Type Ib temperature-dependent sex determination, such as all crocodilians, produce more male offspring from eggs incubated at warmer temperatures (Valenzuela 2004). Species with Type II temperature-dependent sex determination, such as leopard geckos (Eublepharis macularius), produce a maximal proportion of males from eggs incubated at an intermediate temperature, whereas cooler or warmer temperatures yield higher proportions of females (Janes and Wayne 2006; Viets et al. 1994).
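The three response patterns described above can be sketched numerically. The following is a minimal illustration only, not a fitted model: the logistic curve shape, the pivotal temperature `pivot`, the transition `width`, and the `half_range` of the Type II male-producing window are all hypothetical values chosen to reproduce the qualitative shapes of Fig. 1.1b.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def proportion_male(pattern: str, temp_c: float, pivot: float = 29.0,
                    width: float = 1.0, half_range: float = 3.0) -> float:
    """Illustrative proportion of male offspring at an incubation temperature.

    All numeric defaults are hypothetical; they are not measurements.
    """
    if pattern == "GSD":
        # Balanced sex ratio regardless of incubation temperature.
        return 0.5
    if pattern == "Ia":
        # More males from cooler incubation temperatures.
        return logistic(-(temp_c - pivot) / width)
    if pattern == "Ib":
        # More males from warmer incubation temperatures.
        return logistic((temp_c - pivot) / width)
    if pattern == "II":
        # Males peak at an intermediate temperature; females at both extremes.
        rise = logistic((temp_c - (pivot - half_range)) / width)
        fall = logistic(-(temp_c - (pivot + half_range)) / width)
        return rise * fall
    raise ValueError(f"unknown pattern: {pattern}")


if __name__ == "__main__":
    for pattern in ("Ia", "Ib", "II", "GSD"):
        row = [round(proportion_male(pattern, t), 2) for t in (24, 29, 34)]
        print(pattern, row)
```

In this sketch the Type II curve is built as the product of a rising and a falling logistic, so its maximum stays below 1; that is a property of the chosen functional form, not a claim about any particular species.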


The timing of the effect of temperature on sex-determining response also varies among TSD reptiles. Shine et al. (2007) tested two TSD lizards for the effects of fadrozole, a chemical that blocks the bioconversion of testosterone to estrogen, thereby causing male development in eggs incubated at female-producing temperatures. In this type of experiment, the stage during which fadrozole affects offspring sex ratios represents the thermally sensitive period when temperature can influence sex determination. In two TSD reptiles, jacky dragons (Amphibolurus muricatus) and Duperrey’s window-eyed skinks (Bassiana duperreyi), the thermally sensitive period in which sex could be reversed by fadrozole treatment occurred in the first half of the postoviposition incubation period. The thermally sensitive period has been shown to occur slightly later in turtles and tuataras, during only the middle third of the postoviposition incubation period (Ewert et al. 2004; Mitchell et al. 2006) and occurs even later in crocodilians, during the third quarter of the entire incubatory period (Lang and Andrews 1994). GSD amniotes exhibit a similar degree of variability (Organ and Janes 2008). In birds, snakes, and some turtles and lizards, females are the heterogametic sex. Male heterogamety is found in some turtles and lizards and throughout mammals (with exceptions). The mammalian exceptions include, among others, the mole vole (Ellobius lutescens) in which a Y sex chromosome is absent. Both males and females of this species carry one X sex chromosome (Just et al. 1995; Vogel et al. 1998). Within heterogameties, there is variation in the extent of degeneration of either the male-specific Y sex chromosome or the female-specific W sex chromosome. For example, the Z and W sex chromosomes of emus (Dromaius novaehollandiae) are virtually homomorphic, whereas in chickens (Gallus gallus), the W sex chromosome is considerably smaller than the Z sex chromosome (Janes et al. 2009; Solari 1994). 
Clearly, a single line of demarcation between genotypic and temperature-dependent sex determination is overly simplistic and does not accurately represent the evolutionary history of sex-determining mechanisms in amniotes (Sarre et al. 2004).

1.2.2 Adaptive Significance of Sex-Determining Mechanisms

The variability of reptilian sex-determining mechanisms and, among GSD species, type of heterogamety are difficult to explain. Among agamid lizards, for example, species within the same genus with no discernible differences in natural history exhibit different sex-determining mechanisms (Ezaz et al. 2009; Uller et al. 2006). However, the adaptive significance of both genotypic and temperature-dependent sex determination has been explored in theory and experimentation. Fisher (1930) argued that parents should invest equally in sons and daughters. If sons and daughters represent equivalent parental investment, genotypic sex determination is expected to balance offspring sex ratios by matching them to the balanced


probability of inheriting an X or a Y chromosome from a male parent in a male heterogametic species or the probability of inheriting a Z or a W chromosome from a female parent in a female heterogametic species. Charnov and Bull (1977) hypothesized that temperature-dependent sex determination would allow parents greater control over offspring sex ratios in environments where the costs of sons and daughters are unequal and fluctuating. However, the Charnov–Bull hypothesis has not acquired much empirical support. Parents of TSD species do not appear to control offspring sex ratios by nesting behavior. However, Freedberg and Wade (2001) suggested that offspring sex ratios are inherited as nest sites, and their unique exposures to sun and soil temperature are passed matrilineally. Also, Warner and Shine (2008) demonstrated that incubation temperature can affect reproductive success in jacky dragons. Male jacky dragons hatched from eggs incubated at the optimal male-producing temperature had greater lifetime reproductive success than males hatched from eggs incubated at a different temperature and experimentally masculinized by chemical aromatase inhibition. The same pattern of greater reproductive success was reported among females incubated at either the optimal female-producing temperature or a different temperature. This study provides evidence that, in a TSD species, incubation temperature directly influences reproductive success in a sex-differential manner. Although this study supports the Charnov–Bull hypothesis, it does not explain why some species would benefit from temperature-dependent sex determination but not other closely related species with similar life history traits. Reproductive mode, whether a species is oviparous (egg-laying) or viviparous (live-bearing), is associated with type of sex-determining mechanism. Viviparity appears to be enabled by genotypic but not temperature-dependent sex determination. 
From a sample of 94 extant amniote species for which sex-determining mechanism, reproductive mode, and phylogenetic position are known, only two, perhaps three, exhibit both temperature-dependent sex determination and viviparity. The southern water skink (Eulamprus tympanum) and its sister species (Eulamprus heatwolei) give live birth and exhibit temperature-dependent sex determination and some evidence suggests that the spotted skink (Niveoscincus ocellatus) is also TSD and viviparous (Organ et al. 2009). For TSD species including these skinks, producing both male and female offspring requires exposing different embryos to one of at least two (optimal male-producing and optimal female-producing) thermal environments. For viviparous species, this requirement entails manipulating maternal body temperature and evidence for maternal manipulation of body temperature in TSD, viviparous skinks is debated (Allsop et al. 2006; While and Wapstra 2009). Further, as explained in Sect. 1.4, fluctuations in maternal body temperatures are even less likely in thermally consistent environments such as deep oceans. Apparently, thermal consistency is not an issue for oviparous, TSD species such as crocodilians and sea turtles because their nests experience sufficient thermal variation from top to bottom to explain mixed sex ratios emerging from clutches of eggs (Georges 1992 but see Warner and Shine 2009).
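The balanced sex ratio expected under genotypic sex determination, as invoked in the Fisher argument earlier in this section, follows directly from chromosomal segregation. The sketch below enumerates the four equally likely gamete combinations for each system. It assumes simple Mendelian segregation with no transmission bias or sex-specific mortality; the names `PARENTS` and `expected_sex_ratio` are illustrative and do not come from the chapter.

```python
from fractions import Fraction
from itertools import product

# Parental sex chromosome pairs for the two heterogametic systems.
PARENTS = {
    "XY": {"mother": ("X", "X"), "father": ("X", "Y"),
           "heterogametic_sex": "male"},
    "ZW": {"mother": ("Z", "W"), "father": ("Z", "Z"),
           "heterogametic_sex": "female"},
}


def expected_sex_ratio(system: str) -> dict:
    """Expected offspring sex ratio under unbiased Mendelian segregation."""
    spec = PARENTS[system]
    hetero = spec["heterogametic_sex"]
    homo = "female" if hetero == "male" else "male"
    counts = {"male": 0, "female": 0}
    # Each parent passes one of its two sex chromosomes with equal chance.
    for egg, sperm in product(spec["mother"], spec["father"]):
        # Unlike chromosomes give the heterogametic sex.
        sex = hetero if egg != sperm else homo
        counts[sex] += 1
    total = sum(counts.values())
    return {sex: Fraction(n, total) for sex, n in counts.items()}


if __name__ == "__main__":
    for system in PARENTS:
        print(system, expected_sex_ratio(system))
```

Under these assumptions both systems yield the same 1:1 expectation; only the parent that contributes the sex-deciding chromosome differs.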

1.2.3 Genotype and Environment Interaction

The proximate differences among sex-determining mechanisms remain unclear. Controlled incubation studies in the laboratory have been used to identify species in which incubation temperatures may or may not skew offspring sex ratios. These incubation experiments that measure offspring sex ratios are challenged by the possibility that a specific temperature that elicits a sex-determining response goes inadvertently untested. Further, in a tested species, the difference between a temperature that yields a consistent offspring sex ratio and a temperature that yields lethality may be too small to tease them apart in incubation studies. In the face of such uncertainty, many experimental characterizations of sex-determining mechanisms are considered tentative (Viets et al. 1994). In addition to results from incubation studies, GSD and TSD species can be distinguished by the presence or absence of sex chromosomes. If a species has detectable sex chromosomes, then offspring sex ratios are expected to be defined by genotype. However, an exception to this rule has been presented by a study of central bearded dragons (Pogona vitticeps) (Quinn et al. 2007). Central bearded dragons exhibit clear female heterogamety, yet extreme incubation temperatures can feminize genotypically male embryos. This result suggests environmental effects on sex determination in a GSD species. Likewise, genotypic effects have been reported for leopard geckos (Eublepharis macularius), a reptile that has been classified as exhibiting TSD because incubation studies of leopard geckos demonstrate a clear and repeatable influence of incubation temperature on offspring sex ratios (Janes et al. 2007; Viets et al. 1993; Wagner 1980). Nonetheless, a quantitative genetic effect on temperature-dependent sex determination is clear from study of sex-determining response to incubation temperature in different matrilineal lines of leopard geckos. 
Janes and Wayne (2006) identified genetically dissimilar females within a captive-bred colony of leopard geckos. These females were each mated to fertile males, and the resultant offspring were placed randomly within one of three environmental chambers set to temperatures known to produce either 0%, 50%, or 70% male offspring. In this species, a 100% male-producing incubation temperature has not been identified. Although incubation temperature overwhelmingly influenced offspring sex ratios across family lines, a genotype × environment interaction was detected in the varying offspring sex ratios from different matrilineal lines exposed to the same incubation temperatures. This result suggests that families vary in their sex-determining response to incubation temperature. Genotype × environment interactions also indicate that a studied trait is polygenic (Falconer and MacKay 1996). Polygenic inheritance is relevant to conservation of TSD reptiles that may be exceptionally vulnerable to climate change because of the possibility that they are not exposed to temperatures needed to produce both sons and daughters (Huey and Janzen 2008). If there is an underlying polygenic control of sex-determining responses to temperature in TSD reptiles, then there is opportunity for microevolution and adaptation to changing climates. Recent modeling has suggested that tuataras (Sphenodon guntheri) occupy a habitat in which ambient temperature is expected to change to a degree that could negatively affect offspring sex ratios within the next century (Huey and Janzen 2008). If sex-determining responses to temperature do not change adaptively, the remaining possibilities include extinction or migration to cooler habitats, but migration is unlikely without human intervention, considering the tuataras' habitat of small islands off New Zealand.
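The idea of a genotype × environment interaction in sex ratios can be illustrated with a small sketch. It is not the analysis used by Janes and Wayne (2006); it only shows the underlying logic, namely that an interaction is what remains after additive family and temperature effects are removed. The function names, the family labels "A" and "B", the treatment labels "T1" to "T3", and all proportions are invented for illustration.

```python
from statistics import mean


def interaction_residuals(prop_male):
    """Return residuals from an additive family + treatment model.

    prop_male[family][treatment] is the proportion of male offspring.
    Residuals that are all zero mean parallel reaction norms, i.e. no
    genotype x environment interaction on this scale.
    """
    families = sorted(prop_male)
    treatments = sorted(prop_male[families[0]])
    grand = mean(prop_male[f][t] for f in families for t in treatments)
    family_mean = {f: mean(prop_male[f][t] for t in treatments) for f in families}
    treatment_mean = {t: mean(prop_male[f][t] for f in families) for t in treatments}
    return {
        f: {t: prop_male[f][t] - family_mean[f] - treatment_mean[t] + grand
            for t in treatments}
        for f in families
    }


def max_abs_residual(residuals):
    """Largest absolute interaction residual across all cells."""
    return max(abs(r) for row in residuals.values() for r in row.values())


# Invented proportions of male offspring for two hypothetical matrilines
# at three hypothetical incubation treatments; these are not study data.
parallel = {"A": {"T1": 0.0, "T2": 0.4, "T3": 0.6},
            "B": {"T1": 0.1, "T2": 0.5, "T3": 0.7}}
crossing = {"A": {"T1": 0.0, "T2": 0.6, "T3": 0.6},
            "B": {"T1": 0.0, "T2": 0.4, "T3": 0.8}}

print(max_abs_residual(interaction_residuals(parallel)))
print(max_abs_residual(interaction_residuals(crossing)))
```

In the `parallel` table the two families differ by a constant offset, so the residuals vanish; in the `crossing` table the families respond differently to the same treatments, which is the pattern described in the text. A real analysis would work with offspring counts and a formal test rather than raw proportions.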

1.3 Sex Chromosomes

1.3.1 Origins and Degeneration of Sex Chromosomes

Heteromorphic sex chromosomes arise when one of a pair of sex chromosomes degenerates to a sufficient degree that cytogenetic differences between the pair are observable. A number of different causes for this degeneration have been proposed, including the Hill–Robertson effect, background selection, Muller’s Ratchet, and hitchhiking of deleterious alleles onto favored mutations (Charlesworth and Charlesworth 2000; Charlesworth et al. 1987). The Hill–Robertson effect prevents the repair or elimination of deleterious alleles because of their close linkage to beneficial alleles and background selection explains rates of elimination or fixation by the degree to which an allele is either deleterious or beneficial. Mildly deleterious alleles are more likely to be tolerated than more seriously deleterious alleles (Charlesworth and Charlesworth 2000). If mildly deleterious alleles are permitted to accumulate on the Y chromosome as a result of reduced repair via recombination with the X, then, over time, the mean fitness of the Y chromosome declines. The accumulation of mildly deleterious alleles, known as Muller’s Ratchet, eventually causes an allele to become damaged and then eliminated. Following that, the homologous copy becomes fixed at a rate that is much faster than the fixation rate for genes that are retained as two copies (Rice 1987). Hitchhiking works in conjunction with Muller’s Ratchet to hasten the degeneration of the Y chromosome. Deleterious mutations that hitchhike with favorable alleles on the Y are less likely to be purged, further reducing the overall fitness of the chromosome. These forces drive the degeneration of sex chromosomes after an initial event that converts an ancestral pair of autosomes into sex chromosomes. Ohno (1967) described the origination of sex chromosomes from ancestral autosomes. 
Once a novel sex-determining gene is either exapted from a different function or transposed to a chromosome from elsewhere in the genome, recombination ceases in the general vicinity of the gene. This block to recombination allows parents to pass the sex-determining gene to either sons or daughters, depending on the nature of the expression of the sex-determining gene. In mammals, a single-copy gene called the sex-determining region on the Y (Sry) initiates male sexual development (Sinclair et al. 1990). Cessation of recombination around the Sry or some other ancestral sex-determining gene speeds up
Muller’s Ratchet, causing the degeneration of the mammalian Y chromosome. The evolution of avian sex chromosomes may have followed a different path. In chickens, dosage-dependent effects of a Z-linked gene, Dmrt1, appear to drive male sexual development rather than the absence of a single copy of a W-linked gene (Smith et al. 2009). Reptiles provide an excellent model for the process of sex chromosome degeneration because of the intermediate stages of chromosomal degeneration found in the group. For example, the smooth softshell turtle (Apalone mutica) is GSD but sex chromosomes have not yet been identified, most likely due to a lack of sufficient heteromorphy (Valenzuela et al. 2006). Further, micro-sex chromosomes have been found in central bearded dragons (Pogona vitticeps), common snake-necked turtles (Chelodina longicollis), and Chinese soft-shelled turtles (Pelodiscus sinensis) (Ezaz et al. 2005, 2006; Kawai et al. 2007). The variety of sex chromosome organizations has been mapped onto phylogenetic trees to investigate the number of origins of sex chromosomes and types of heterogameties in the group (Janzen and Krenz 2004; Pokorna and Kratochvil 2009). Parsimony, likelihood, Bayesian, and stochastic approaches reconstruct temperature-dependent sex determination as ancestral to archosaurs (turtles, crocodilians, and birds) (Organ and Janes 2008). Turtles are extraordinarily variable in their organizations of sex chromosomes with species exhibiting male heterogamety, female heterogamety, no detectable heterogamety, or temperature-dependent sex determination (Organ and Janes 2008). These results indicate multiple independent origins of sex chromosomes among archosaurs (Fig. 1.2). Also, Matsubara et al. (2006) demonstrated a lack of sequence similarity between the female heterogametic sex chromosomes of birds and those of snakes, indicating at least two independent origins of sex chromosomes. 
Reptiles, with such variability and rapidly improving genomic resources, provide tremendous raw material for studies of the causes and consequences of sex chromosome origination and degeneration.
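The accumulation of deleterious mutations on a non-recombining chromosome, referred to in this section as Muller's Ratchet, can be sketched with a toy simulation. This is a minimal illustration under stated assumptions, not a model taken from the cited studies: each chromosome is reduced to its count of deleterious mutations, fitness is multiplicative, and there is neither recombination nor back mutation. The function name and every parameter value (`pop_size`, `mut_prob`, `sel_coeff`) are hypothetical.

```python
import random


def ratchet(pop_size=200, mut_prob=0.3, sel_coeff=0.02,
            generations=200, seed=1):
    """Track the least-loaded class in a non-recombining population.

    Each individual is represented only by its number of deleterious
    mutations. Parents are sampled in proportion to the fitness
    (1 - sel_coeff) ** load, and each offspring gains one new mutation
    with probability mut_prob. Without recombination or back mutation,
    a lost least-loaded class can never be rebuilt.
    """
    rng = random.Random(seed)
    loads = [0] * pop_size
    min_load_history = [0]
    for _ in range(generations):
        weights = [(1.0 - sel_coeff) ** k for k in loads]
        parents = rng.choices(loads, weights=weights, k=pop_size)
        loads = [k + (1 if rng.random() < mut_prob else 0) for k in parents]
        min_load_history.append(min(loads))
    return min_load_history


history = ratchet()
print(history[0], history[-1])
```

The returned list records the smallest mutation count present in each generation. Because offspring inherit at least their parent's load, that minimum can only stay the same or increase, which is the one-way behavior that gives the ratchet its name.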

1.3.2 Detection of Sex Chromosomes

Species for which genotypic sex determination has been ascribed but sex chromosomes have not yet been identified are an important focus of research on reptile genomics (Janes et al. 2010a). For species like the smooth softshell turtle, sex chromosomes have not been reported but it is unclear if this is because they are lacking in this species or if current cytogenetic techniques are not yet sufficiently sensitive to detect them. The cytogenetic technique of C-banding, which stains the heterochromatic regions of chromosomes, has identified female-specific W sex chromosomes in central bearded dragons (P. vitticeps) (Ezaz et al. 2005) as well as eastern bearded dragons (Pogona barbata), Nobbi dragons (Amphibolurus nobbi), and Mallee dragons (Ctenophorus fordi) (Ezaz et al. 2009). Comparative genomic hybridization, Ag–NOR staining, and fluorescent in situ hybridization (FISH) are also standard techniques for identifying karyotypic sex differences (Kawai et al. 2007). As more sex chromosomes are identified, more sex-linked sequences will be cataloged for reptile species. For example, 18S–28S ribosomal RNA genes are located on both micro-sex chromosomes in the Chinese soft-shelled turtle but in more copies on the W chromosome than on the Z chromosome (Kawai et al. 2007). Comparative FISH mapping of sex-linked markers will be useful for supporting or rejecting hypotheses regarding the evolutionary history of sex-determining mechanisms. Clearly, snake and bird sex chromosomes have little or no sequence in common but the similarities and differences of sex chromosomes among birds, turtles, and possibly TSD reptiles have not yet been characterized (Fig. 1.2) (Janes et al. 2010b). However, Kawagoshi et al. (2009) identified five Z-linked markers in the Chinese soft-shelled turtle by FISH mapping cDNA fragments of the genes GIT2, NF2, SBNO1, SF3A1, and TOP3B. These markers map to chicken chromosome 15, suggesting a common origin.

[Figure 1.2 image not reproduced. Tip labels: Amphibians, Turtles, Crocodilians, Birds, Iguanids, Snakes, Lacertid lizards, Skinks, Geckos, Tuatara, Mammals; columns labeled F and M mark female and male heterogamety; time axis from 300 Mya to 0 Mya.]

Fig. 1.2 Presence or absence of male or female heterogamety across amphibians, nonavian and avian reptiles, and mammals (Organ and Janes 2008). Sex chromosomes have not been reported for crocodilians or tuataras, both exhibiting temperature-dependent sex determination. Female heterogamety is exhibited by snakes but is shaded differently in this figure to indicate that snake sex chromosomes do not share sequence with avian sex chromosomes as the two pairs of sex chromosomes most likely resulted from independent origins of female heterogamety (Matsubara et al. 2006). The characterization of similarities or differences between avian sex chromosomes and female heterogameties found in other reptiles and the estimation of the number of independent origins of sex chromosomes are focuses of reptilian genomics research (Janes et al. 2010a)
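The reasoning behind the comparative mapping result of Kawagoshi et al. (2009) can be written as a simple tally: if most sex-linked markers of one species land on a single chromosome of a reference species, a shared origin becomes plausible. The sketch below uses the five gene names and the chicken chromosome given in the text; the function name and the dictionary layout are assumptions made for illustration.

```python
from collections import Counter


def tally_homologs(marker_to_chromosome):
    """Count how many sex-linked markers land on each reference chromosome."""
    return Counter(marker_to_chromosome.values())


# Z-linked markers of the Chinese soft-shelled turtle and the chicken
# chromosome they map to, as stated in the text (Kawagoshi et al. 2009).
turtle_z_markers = {
    "GIT2": "15",
    "NF2": "15",
    "SBNO1": "15",
    "SF3A1": "15",
    "TOP3B": "15",
}

counts = tally_homologs(turtle_z_markers)
best_chromosome, n_markers = counts.most_common(1)[0]
fraction = n_markers / len(turtle_z_markers)
print(best_chromosome, n_markers, fraction)
```

With genome sequences for more species, the same tally could be extended to many markers and many reference chromosomes, and the fraction of markers sharing one chromosome would then serve as a rough indicator of conserved linkage.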

1.3.3 Heterogamety and Dosage Compensation

Hypotheses are emerging about the differences between male and female heterogamety. For example, dosage compensation appears to function differently between male heterogametic and female heterogametic species. Genes found on the X chromosome in male heterogametic species and on the Z chromosome in female heterogametic species occur in different doses between males and females. Mammals balance gene dosage by inactivating an X chromosome. X-chromosome inactivation transcriptionally silences genes on one of two X chromosomes in a female, thereby balancing gene dosage between males and females (Payer and Lee 2008). Birds, however, do not globally inactivate a Z chromosome in males. Rather, dosage compensation appears to act rarely and on small regions of avian sex chromosomes (Melamed and Arnold 2007). In fact, global dosage compensation has only been found in male heterogametic groups, including therian mammals, fruitflies (Drosophila), and nematodes (Caenorhabditis elegans), whereas local dosage compensation has been found in female heterogametic groups, including birds and lepidopterans (Mank 2009). At present, the pattern has only been described among three male heterogametic groups and two female heterogametic groups and has yet to be explored among reptiles (but see King and Lawson 1996). Inactivation or hyper-transcription of sex-linked genes and entire chromosomes should be compared between closely related male heterogametic and female heterogametic reptiles, particularly among emydid turtles, chameleons, and geckos that exhibit differences in heterogamety within families (Organ and Janes 2008).
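The contrast between global and local dosage compensation can be made concrete with a sketch that compares expression of sex-linked genes between the sex carrying two copies (XX females or ZZ males) and the sex carrying one. It is an illustration only, not a method from the cited studies: the function name, the tolerance of 0.25 on the log2 scale, the gene labels, and the expression values are all hypothetical.

```python
from math import log2


def compensated_fraction(expr_two_copy_sex, expr_one_copy_sex, tolerance=0.25):
    """Fraction of sex-linked genes with near-equal expression in both sexes.

    Both arguments map gene name -> mean expression level (positive numbers).
    A gene counts as compensated when |log2(two-copy / one-copy)| is within
    the tolerance; a fully uncompensated gene is expected near log2(2) = 1.
    """
    genes = expr_two_copy_sex.keys() & expr_one_copy_sex.keys()
    if not genes:
        raise ValueError("no genes shared between the two expression tables")
    ratios = {g: log2(expr_two_copy_sex[g] / expr_one_copy_sex[g]) for g in genes}
    n_compensated = sum(1 for r in ratios.values() if abs(r) <= tolerance)
    return n_compensated / len(genes), ratios


# Invented expression levels for four hypothetical Z-linked genes.
zz_males = {"g1": 10.0, "g2": 20.0, "g3": 8.0, "g4": 12.0}
zw_females = {"g1": 10.0, "g2": 10.0, "g3": 4.0, "g4": 6.0}

fraction, ratios = compensated_fraction(zz_males, zw_females)
print(fraction, ratios)
```

Under this simplification, a fraction near 1 would correspond to global compensation and a small fraction to local compensation acting on a few genes or regions. Comparing such fractions between closely related male heterogametic and female heterogametic reptiles is the kind of test the paragraph proposes.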

1.4 Fossil Evidence

Extinct reptiles are relevant to the study of sex chromosome evolution because of the order in which genotypic sex determination and sex chromosomes evolve. Sex chromosomes become detectable only after they have been sufficiently affected by evolutionary forces that arise subsequent to the block to recombination caused by either the novel function or novel location of a sex-determining gene. Fossils of extinct reptiles allow us to examine the history of sex-determining mechanisms and subsequently predict which extinct reptiles exhibited genotypic sex determination. Organ et al. (2009) used a reversible-jump Markov-chain Monte Carlo algorithm to establish a Bayesian posterior probability distribution for models of correlated change between different types of sex-determining mechanisms and reproductive modes in extant amniotes (see Sect. 1.2.2). Reproductive mode describes the means by which parents produce young. Among amniotes, species are either viviparous or oviparous. The Bayesian analysis yielded a significant result for correlated evolution of genotypic sex determination and viviparity. Oviparity does not effectively predict a certain sex-determining mechanism but viviparity predicts genotypic sex determination. As described above, only two, perhaps three, of 94 studied extant
amniotes are both viviparous and TSD. This correlation permitted a prediction of genotypic sex determination in extinct species known to be viviparous. In fact, fossil evidence demonstrates viviparity in several extinct marine reptiles, including sauropterygians, mosasaurs, and ichthyosaurs. The study predicted sex-determining mechanisms for seven species for which sex-determining mechanisms were known but not introduced to the algorithm. This test group included six extant reptiles and an extinct horse (Propalaeotherium) for which pregnant specimens have been found in the fossil record. The study showed that genotypic sex determination could be accurately predicted for viviparous species. All ten marine reptiles examined in the study were assigned a significant posterior probability of having genotypic sex determination. Organ et al. (2009) argued that this result is meaningful for the natural history of extinct marine reptiles. Oviparity in the open ocean would not have been possible for amniote species like ichthyosaurs because amniotic eggs require gas-exchange with the atmosphere (Andrews and Mathies 2000). Extant marine reptiles including saltwater crocodiles (Crocodylus porosus) and sea turtles nest on land but extinct marine reptiles like ichthyosaurs did not have a body plan that was likely to allow terrestrial nesting. Freed by viviparity from the requirement to nest on land, extinct marine reptiles evolved morphologies that were adaptive to pelagic existence. These morphologies included fluked tails, dorsal fins, and wing-shaped limbs. Further, if prerequisite for the evolution of viviparity, genotypic sex determination may have permitted the adaptive radiation of extinct marine reptiles since viviparity seems to be a prerequisite for the pelagic existence of those species (Caldwell and Lee 2001).
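The logic of predicting genotypic sex determination from viviparity can be sketched as a conditional probability. This is a deliberate simplification: it treats species as independent observations, whereas the analysis of Organ et al. (2009) modeled correlated change along a phylogeny. The function name and the uniform Beta(1, 1) prior are assumptions, and the count of 20 viviparous GSD species is invented for illustration; only the upper count of three viviparous TSD species comes from the text.

```python
def posterior_mean_gsd_given_viviparous(n_viviparous_gsd, n_viviparous_tsd,
                                        prior_a=1.0, prior_b=1.0):
    """Posterior mean of P(GSD | viviparous) under a Beta(prior_a, prior_b) prior.

    Species are treated as independent draws, which ignores shared ancestry.
    """
    if n_viviparous_gsd < 0 or n_viviparous_tsd < 0:
        raise ValueError("counts must be non-negative")
    return (n_viviparous_gsd + prior_a) / (
        n_viviparous_gsd + n_viviparous_tsd + prior_a + prior_b)


# 20 is a hypothetical count of viviparous GSD species (not from the study);
# 3 is the upper count of viviparous TSD species mentioned in the text.
estimate = posterior_mean_gsd_given_viviparous(20, 3)
print(estimate)
```

The fewer viviparous TSD species there are relative to viviparous GSD species, the closer this estimate moves to 1, which is the intuition behind assigning genotypic sex determination to extinct marine reptiles known to have given live birth.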

1.5 Impact of Genome Projects and Future Directions

The study of sex chromosome evolution has much to gain from current genome sequencing efforts. At present, only the green anole (Anolis carolinensis) and the painted turtle (Chrysemys picta) are focuses of genome sequencing projects (Janes et al. 2008) but the recently announced Genome 10K collection of species that has been targeted for whole-genome sequencing includes 3,297 nonavian reptiles (Haussler et al. 2009). In particular, the genome sequences of 140 turtles, 569 iguanids, and 621 geckos that have been targeted for genome sequencing will provide a window into the variability of sex-determining mechanisms and sex chromosome organizations found in these three groups. The identities and map locations of sex-linked markers will support or reject current hypotheses of common origins of sex chromosomes. For example, Kawai et al. (2009) suggested a common origin between the sex chromosome pairs of the gecko lizard (Gekko hokouensis) and chicken because they share a linkage group that consists of six markers. Following the publication of multiple reptile genomes, studies of this kind will involve more markers in more species, allowing more robust conclusions to be made regarding the number of independent origins of reptilian sex chromosomes.

Until the sequencing and mapping of sex-linked and sex-differentiating markers have reached a more advanced stage, studies of reptilian sex chromosomes will be smaller in scope. Nonetheless, sex-linked markers have been identified in birds (Bäckström et al. 2006; Hillier et al. 2004), snakes (Matsubara et al. 2006), turtles (Kawagoshi et al. 2009), and lizards (Kawai et al. 2009). These sequences provide sufficient raw material for mapping comparisons among pairs of reptilian sex chromosomes. Comparative mapping studies, in concert with ancestral reconstructions, will directly inform questions regarding the number of independent origins of sex chromosomes in reptiles and why sex chromosome systems have higher turnover in nonavian reptiles than they have in either birds or mammals.

Acknowledgments I would like to thank Miguel Alcaide, Maude Baldwin, Elena Gonzalez, June Yong Lee, Christopher Organ, and Irene Salicini for their critical reviews of this chapter. This work has benefited from conversations with Nicole Valenzuela (NV), Scott V. Edwards (SVE), Tariq Ezaz, Jennifer A.M. Graves, Arthur Georges, and Andrew Sinclair. Support in the laboratory and valuable discussions were shared by Christopher Balakrishnan, Charles Chapus, and Andrew Shedlock. Funding for this work was provided by a grant from the United States National Science Foundation (MCB0817687) to NV and SVE. Last, I would like to thank Pierre Pontarotti for the invitation to contribute to the 13th Evolutionary Biology Meeting at Marseille where this work was presented.

References

Allsop DJ, Warner DA, Langkilde T, Du W, Shine R (2006) Do operational sex ratios influence sex allocation in viviparous lizards with temperature-dependent sex determination? J Evol Biol 19(4):1175–1182
Andrews RM, Mathies T (2000) Natural history of reptilian development: constraints on the evolution of viviparity. Bioscience 50(3):227–238
Bäckström N, Brandstrom M, Gustafsson L, Qvarnstrom A, Cheng H, Ellegren H (2006) Genetic mapping in a natural population of collared flycatchers (Ficedula albicollis): conserved synteny but gene order rearrangements on the avian Z chromosome. Genetics 174(1):377–386
Bull JJ (1983) Evolution of sex determining mechanisms. Benjamin/Cummings, Menlo Park, CA
Caldwell MW, Lee MSY (2001) Live birth in Cretaceous marine lizards (mosasauroids). Proc R Soc Lond B Biol Sci 268(1484):2397–2401
Charlesworth B, Charlesworth D (2000) The degeneration of Y chromosomes. Phil Trans Roy Soc Lond B 355(1403):1563–1572
Charlesworth B, Coyne JA, Barton NH (1987) The relative rates of evolution of sex chromosomes and autosomes. Am Nat 130(1):113–146
Charnov EL, Bull J (1977) When is sex environmentally determined. Nature 266(5605):829–830
Ewert BJ, Etchberger CR, Nelson CE (2004) Turtle sex-determining modes and TSD patterns, and some TSD pattern correlates. In: Valenzuela N, Lance VA (eds) Temperature-dependent sex determination in vertebrates. Smithsonian Books, Washington, DC, pp 21–32
Ezaz T, Quinn AE, Miura I, Sarre SD, Georges A, Graves JAM (2005) The dragon lizard Pogona vitticeps has ZZ/ZW micro-sex chromosomes. Chromosome Res 13(8):763–776
Ezaz T, Valenzuela N, Grutzner F, Miura I, Georges A, Burke RL, Graves JAM (2006) An XX/XY sex microchromosome system in a freshwater turtle, Chelodina longicollis (Testudines: Chelidae) with genetic sex determination. Chromosome Res 14(2):139–150
Ezaz T, Quinn AE, Sarre SD, O’Meally D, Georges A, Graves JAM (2009) Molecular marker suggests rapid changes of sex-determining mechanisms in Australian dragon lizards. Chromosome Res 17(1):91–98
Falconer DS, MacKay TFC (1996) Introduction to quantitative genetics. Longmann Press, London, UK
Fisher RA (1930) The genetical theory of natural selection. Oxford University Press, New York, USA
Freedberg S, Wade MJ (2001) Cultural inheritance as a mechanism for population sex-ratio bias in reptiles. Evolution 55(5):1049–1055
Georges A (1992) Thermal characteristics and sex determination in field nests of the pig-nosed turtle, Carettochelys insculpta (Chelonia, Carettochelydidae), from northern Australia. Aust J Zool 40(5):511–521
Haussler D, O’Brien SJ, Ryder OA, Barker FK, Clamp M, Crawford AJ, Hanner R, Hanotte O, Johnson WE, McGuire JA, Miller W, Murphy RW, Murphy WJ, Sheldon FH, Sinervo B, Venkatesh B, Wiley EO, Allendorf FW, Amato G, Baker CS, Bauer A, Beja-Pereira A, Bermingham E, Bernardi G, Bonvicino CR, Brenner S, Burke T, Cracraft J, Diekhans M, Edwards S, Ericson PGP, Estes J, Fjelsda J, Flesness N, Gamble T, Gaubert P, Graphodatsky AS, Graves JAM, Green ED, Green RE, Hackett S, Hebert P, Helgen KM, Joseph L, Kessing B, Kingsley DM, Lewin HA, Luikart G, Martelli P, Moreira MAM, Nguyen N, Orti G, Pike BL, Rawson DM, Schuster SC, Seuanez HN, Shaffer HB, Springer MS, Stuart JM, Sumner J, Teeling E, Vrijenhoek RC, Ward RD, Warren WC, Wayne R, Williams TM, Wolfe ND, Zhang YP (2009) Genome 10K: a proposal to obtain whole-genome sequence for 10 000 vertebrate species. J Hered 100(6):659–674
Hillier LW, Miller W, Birney E, Warren W, Hardison RC, Ponting CP, Bork P, Burt DW, Groenen MAM, Delany ME, Dodgson JB, Chinwalla AT, Cliften PF, Clifton SW, Delehaunty KD, Fronick C, Fulton RS, Graves TA, Kremitzki C, Layman D, Magrini V, McPherson JD, Miner TL, Minx P, Nash WE, Nhan MN, Nelson JO, Oddy LG, Pohl CS, Randall-Maher J, Smith SM, Wallis JW, Yang SP, Romanov MN, Rondelli CM, Paton B, Smith J, Morrice D, Daniels L, Tempest HG, Robertson L, Masabanda JS, Griffin DK, Vignal A, Fillon V, Jacobbson L, Kerje S, Andersson L, Crooijmans RPM, Aerts J, van der Poel JJ, Ellegren H, Caldwell RB, Hubbard SJ, Grafham DV, Kierzek AM, McLaren SR, Overton IM, Arakawa H, Beattie KJ, Bezzubov Y, Boardman PE, Bonfield JK, Croning MDR, Davies RM, Francis MD, Humphray SJ, Scott CE, Taylor RG, Tickle C, Brown WRA, Rogers J, Buerstedde JM, Wilson SA, Stubbs L, Ovcharenko I, Gordon L, Lucas S, Miller MM, Inoko H, Shiina T, Kaufman J, Salomonsen J, Skjoedt K, Wong GKS, Wang J, Liu B, Yu J, Yang HM, Nefedov M, Koriabine M, deJong PJ, Goodstadt L, Webber C, Dickens NJ, Letunic I, Suyama M, Torrents D, von Mering C, Zdobnov EM, Makova K, Nekrutenko A, Elnitski L, Eswara P, King DC, Yang S, Tyekucheva S, Radakrishnan A, Harris RS, Chiaromonte F, Taylor J, He JB, Rijnkels M, Griffiths-Jones S, Ureta-Vidal A, Hoffman MM, Severin J, Searle SMJ, Law AS, Speed D, Waddington D, Cheng Z, Tuzun E, Eichler E, Bao ZR, Flicek P, Shteynberg DD, Brent MR, Bye JM, Huckle EJ, Chatterji S, Dewey C, Pachter L, Kouranov A, Mourelatos Z, Hatzigeorgiou AG, Paterson AH, Ivarie R, Brandstrom M, Axelsson E, Backstrom N, Berlin S, Webster MT, Pourquie O, Reymond A, Ucla C, Antonarakis SE, Long MY, Emerson JJ, Betran E, Dupanloup I, Kaessmann H, Hinrichs AS, Bejerano G, Furey TS, Harte RA, Raney B, Siepel A, Kent WJ, Haussler D, Eyras E, Castelo R, Abril JF, Castellano S, Camara F, Parra G, Guigo R, Bourque G, Tesler G, Pevzner PA, Smit A, Fulton LA, Mardis ER, Wilson RK (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432(7018):695–716
Huey RB, Janzen FJ (2008) Climate warming and environmental sex determination in tuatara: the last of the Sphenodontians? Proc R Soc Lond B Biol Sci 275(1648):2181–2183
Janes DE, Wayne ML (2006) Evidence for a genotype × environment interaction in sex-determining response to incubation temperature in the leopard gecko, Eublepharis macularius. Herpetologica 62(1):56–62
Janes DE, Bermudez D, Guillette LJ, Wayne ML (2007) Estrogens induced male production at a female-producing temperature in a reptile (Leopard Gecko, Eublepharis macularius) with temperature-dependent sex determination. J Herpetol 41(1):9–15
Janes DE, Organ C, Valenzuela N (2008) New resources inform study of genome size, content, and organization in nonavian reptiles. Integr Comp Biol 48(4):447–453
Janes DE, Ezaz T, Graves JAM, Edwards SV (2009) Recombination and nucleotide diversity in the sex chromosomal pseudoautosomal region of the emu, Dromaius novaehollandiae. J Hered 100(2):125–136
Janes DE, Fujita MK, Organ CL, Shedlock AM, Edwards SV (2010a) Genome evolution in Reptilia, the sister group of mammals. Annu Rev Genom Hum Genet (in press)
Janes DE, Organ CL, Edwards SV (2010b) Variability in sex-determining mechanisms influences genome complexity in Reptilia. Cytogenet Genome Res 127(2–4):242–248
Janzen FJ, Krenz JG (2004) Phylogenetics: which was first, TSD or GSD? In: Valenzuela N, Lance VA (eds) Temperature-dependent sex determination in vertebrates. Smithsonian Books, Washington, DC, pp 121–130
Just W, Rau W, Vogel W, Akhverdian M, Fredga K, Graves JAM, Lyapunova E (1995) Absence of Sry in species of the vole Ellobius. Nat Genet 11(2):117–118
Kawagoshi T, Uno Y, Matsubara K, Matsuda Y, Nishida C (2009) The ZW micro-sex chromosomes of the Chinese soft-shelled turtle (Pelodiscus sinensis, Trionychidae, Testudines) have the same origin as chicken chromosome 15. Cytogenet Genome Res 125:125–131
Kawai A, Nishida-Umehara C, Ishijima J, Tsuda Y, Ota H, Matsuda Y (2007) Different origins of bird and reptile sex chromosomes inferred from comparative mapping of chicken Z-linked genes. Cytogenet Genome Res 117(1–4):92–102
Kawai A, Ishijima J, Nishida C, Kosaka A, Ota H, Kohno S, Matsuda Y (2009) The ZW sex chromosomes of Gekko hokouensis (Gekkonidae, Squamata) represent highly conserved homology with those of avian species. Chromosoma 118(1):43–51
King RB, Lawson R (1996) Sex-linked inheritance of fumarate hydratase alleles in natricine snakes. J Hered 87:81–83
Lang JW, Andrews HV (1994) Temperature-dependent sex determination in crocodilians. J Exp Zool 270(1):28–44
Mank JE (2009) The W, X, Y and Z of sex-chromosome dosage compensation. Trends Genet 25(5):226–233
Matsubara K, Tarui H, Toriba M, Yamada K, Nishida-Umehara C, Agata K, Matsuda Y (2006) Evidence for different origin of sex chromosomes in snakes, birds, and mammals and step-wise differentiation of snake sex chromosomes. Proc Natl Acad Sci USA 103(48):18190–18195
Melamed E, Arnold AP (2007) Regional differences in dosage compensation on the chicken Z chromosome. Genome Biol 8(9):R202
Mitchell NJ, Nelson NJ, Cree A, Pledger S, Keall SN, Daugherty CH (2006) Support for a rare pattern of temperature-dependent sex determination in archaic reptiles: evidence from two species of tuatara (Sphenodon). Front Zool 3:9
Ohno S (1967) Sex chromosomes and sex linked genes. Springer, Berlin
Organ CL, Janes DE (2008) Evolution of sex chromosomes in Sauropsida. Integr Comp Biol 48(4):512–519
Organ CL, Janes DE, Meade A, Pagel M (2009) Genotypic sex determination enabled adaptive radiations of extinct marine reptiles. Nature 461(7262):389–392
Payer B, Lee JT (2008) X chromosome dosage compensation: how mammals keep the balance. Annu Rev Genet 42:733–772
Pokorna M, Kratochvil L (2009) Phylogeny of sex-determining mechanisms in squamate reptiles: are sex chromosomes an evolutionary trap? Zool J Linn Soc 156(1):168–183
Quinn AE, Georges A, Sarre SD, Guarino F, Ezaz T, Graves JAM (2007) Temperature sex reversal implies sex gene dosage in a reptile. Science 316(5823):411
Rice WR (1987) Genetic hitchhiking and the evolution of reduced genetic activity of the Y sex chromosome. Genetics 116(1):161–167
Sarre SD, Georges A, Quinn A (2004) The ends of a continuum: genetic and temperature-dependent sex determination in reptiles. Bioessays 26(6):639–645
Shine R, Warner DA, Radder R (2007) Windows of embryonic sexual lability in two lizard species with environmental sex determination. Ecology 88(7):1781–1788
Sinclair AH, Berta P, Palmer MS, Hawkins JR, Griffiths BL, Smith MJ, Foster JW, Frischauf AM, Lovell-Badge R, Goodfellow PN (1990) A gene from the human sex-determining region encodes a protein with homology to a conserved DNA-binding motif. Nature 346(6281):240–244
Smith CA, Roeszler KN, Ohnesorg T, Cummins DM, Fairlie PG, Doran TJ, Sinclair AH (2009) The avian Z-linked gene DMRT1 is required for male sex determination in the chicken. Nature 461:267–271
Solari AJ (1994) Sex chromosomes and sex determination in vertebrates. CRC Press, Boca Raton, FL
Standora EA, Spotila JR (1985) Temperature-dependent sex determination in sea turtles. Copeia 3:711–722
Uller T, Mott B, Odierna G, Olsson M (2006) Consistent sex ratio bias of individual female dragon lizards. Biol Lett 2(4):569–572
Valenzuela N (2004) Introduction. In: Valenzuela N, Lance VA (eds) Temperature-dependent sex determination in vertebrates. Smithsonian Books, Washington, DC, pp 1–4
Valenzuela N, LeClere A, Shikano T (2006) Comparative gene expression of steroidogenic factor 1 in Chrysemys picta and Apalone mutica turtles with temperature-dependent and genotypic sex determination. Evol Dev 8(5):424–432
Viets BE, Tousignant A, Ewert MA, Nelson CE, Crews D (1993) Temperature-dependent sex determination in the leopard gecko, Eublepharis macularius. J Exp Zool 265(6):679–683
Viets BE, Ewert MA, Talent LG, Nelson CE (1994) Sex-determining mechanisms in squamate reptiles. J Exp Zool 270(1):45–56
Vogel W, Jainta S, Rau W, Geerkens C, Baumstark A, Correa-Cerro LS, Ebenhoch C, Just W (1998) Sex determination in Ellobius lutescens: the story of an enigma. Cytogenet Cell Genet 80(1–4):214–221
Wagner E (1980) Temperature-dependent sex determination in a gekko lizard. Q Rev Biol 55:21, appendix
Warner DA, Shine R (2008) The adaptive significance of temperature-dependent sex determination in a reptile. Nature 451(7178):566–568
Warner DA, Shine R (2009) Maternal and environmental effects on offspring phenotypes in an oviparous lizard: do field data corroborate laboratory data? Oecologia 161(1):209–220
While GM, Wapstra E (2009) Snow skinks (Niveoscincus ocellatus) do not shift their sex allocation patterns in response to mating history. Behaviour 146:1405–1422

Chapter 2

Constraints, Plasticity, and Universal Patterns in Genome and Phenome Evolution Eugene V. Koonin and Yuri I. Wolf

Abstract Evolutionary genomics identifies multiple constraints that differentially affect different parts of the genomes of diverse life forms. The selective pressures that shape the evolution of viral, prokaryotic, and eukaryotic genomes differ dramatically, and substantial differences exist even between animal and bacterial lineages. Constraints on protein evolution appear to be more universal and could be determined by the fundamental physics of protein folding. Some key features of the molecular phenome such as protein abundance turn out to be unexpectedly conserved and hence strongly constrained. The constraints that shape the evolution of genomes and phenomes are complemented by the plasticity and robustness of genome architecture, expression, and regulation. Several universal “laws” of genome and phenome evolution were detected, some of which seem to be dictated by selective constraints and others by neutral process.

2.1

Introduction

In principle, the entire genome of any life form can be perceived as evolving under constraints (purifying selection), the strength of which varies from 0 (unconstrained evolution) to 1 (absolute conservation). Moreover, constraints affect evolution at all levels of biological organization, from genome sequence to genome architecture to gene expression to molecular interactions to actual organismal phenotypes (Kimura 1983; Lynch 2007c). Generally, constraints on the rates and paths of evolution can be divided into genomic, those that are manifest at the level of the genome sequence and architecture, and phenomic, those that pertain to phenotypic characteristics (although ultimately realized through genomic changes as well). Comparative genomics and systems biology produce massive amounts of diverse data that provide for previously inconceivable insights into the patterns and processes of genome and phenome evolution (Kitano 2002; Medina 2005; Koonin and Wolf 2006; Lynch 2007c; Loewe 2009; Yamada and Bork 2009). Comparative genomics allows us, at least in principle, to measure the strength of constraints that affect different classes of sites in genomes and to elucidate the biological nature of these constraints. However, genome comparison does more than that, as it gives us material to address evolutionary constraints beyond the traditional aspect of sequence conservation and to ask higher level questions such as: how constrained in evolution are gene repertoires of organisms, genome architecture, evolution rate itself, and more? The massive influx of data from systems biology takes the study of evolutionary constraints into new dimensions by allowing researchers to ask qualitatively new questions: what are the nature and strength of constraints that affect gene expression, regulatory and interaction networks, metabolic fluxes, and other characteristics of organisms that can be denoted the “molecular phenome”?

In this article, we present a broad overview of the constraints that affect gene sequences, genome architectures, and molecular phenotypic characteristics such as gene expression level and the structures of protein–protein interaction and regulatory networks. We attempt a genome-wide and organism-wide assessment of different types of constraints operative at different levels and additionally discuss the concepts of robustness and plasticity that are intimately linked to constraints. Of course, the subject we address is vast and cannot be reasonably covered in full in one relatively brief review. We leave out some important areas such as developmental constraints and only fleetingly touch upon others such as evolution of regulatory networks. Nevertheless, it is our hope that even such a sketchy discussion reveals some important general aspects of the constraints that define evolution at diverse levels of biological organization.

E.V. Koonin and Y.I. Wolf, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA; e-mail: [email protected]

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_2, © Springer-Verlag Berlin Heidelberg 2010

2.2 Evolutionary Constraints on Sequence Evolution Across Genomes and Taxa

The origins and characteristic strengths of constraints that affect different classes of sequences in genomes of different life forms are extremely diverse and certainly are not yet known in full. Typically, the constraints on sequences encoding proteins and structural RNAs (such as rRNAs and tRNAs) are stronger than the constraints on noncoding sequences although, for each type of sequence, there is a broad distribution of constraint strengths, and the ranges of the distributions overlap (Shabalina and Kondrashov 1999; Margulies et al. 2007). Obviously, constraints that affect a particular class of sites can be measured only by comparison to another class of sites that can be construed to evolve neutrally. The choice of an appropriate neutral model is a major problem in molecular evolution. In the pregenomic era, Motoo
Kimura, the founder of the neutral theory, was the first to come up with the simple but important idea that pseudogenes that are numerous in vertebrates could be used as a neutral baseline for assessing selection pressure (Kimura 1983). Despite some exceptional cases of pseudogene recruitment for specific functions (Khachane and Harrison 2009), in general, this contention still appears to hold true (Harrison and Gerstein 2002). Genomics revealed additional sources of (apparently) neutrally evolving sequences such as introns and intergenic regions in animals (Parsch et al. 2010; Resch et al. 2007). However, a general difficulty with any attempt to define a universal baseline of neutral evolution is that different parts of a genome differ in their mutation rates, and consequently, in the rate of neutral evolution for which the fixation rate equals the mutation rate (Ellegren et al. 2003). Therefore, for a reliable estimate of the strength of selection/constraints, the neutral model has to be derived from the same gene/region for which selection is being measured. Several such measures have been developed (Nielsen 2005; Charlesworth and Eyre-Walker 2008; Eyre-Walker and Keightley 2009). The most popular gage of selection pressure for protein-coding sequences naturally follows from the redundancy and nonrandom structure of the genetic code in which the same amino acid typically is encoded by codons that differ only in their third (or less commonly first) positions. This measure, Ka/Ks (dN/dS), is the ratio of the number or rate of nonsynonymous substitutions (those that change an amino acid in the encoded protein) to the number or rate of synonymous substitutions (those that occur in synonymous positions of codons and so do not affect the protein sequence) (Hurst 2002; Ellegren 2008). 
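As a rough illustration of how nonsynonymous and synonymous changes are told apart and normalized, the sketch below counts both kinds of differences between two aligned coding sequences and divides each count by the number of sites of the corresponding kind. It is a simplified counting scheme, not one of the estimators cited above: the function names and the example sequences are hypothetical, codons that differ at more than one position are skipped, and no correction for multiple substitutions is applied, so the result is an uncorrected pN/pS rather than a proper Ka/Ks estimate.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3).

    Each position contributes the fraction of its three possible
    single-nucleotide changes that leave the amino acid unchanged.
    """
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == CODE[codon]:
                total += 1 / 3
    return total


def pn_ps(seq1, seq2):
    """Uncorrected proportions of nonsynonymous and synonymous differences.

    Assumes two aligned, gap-free, in-frame sequences of equal length.
    Codon pairs with more than one difference are ignored (simplification).
    """
    syn_total = nonsyn_total = syn_diff = nonsyn_diff = 0.0
    for i in range(0, len(seq1) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        s = (syn_sites(c1) + syn_sites(c2)) / 2  # average over the two codons
        syn_total += s
        nonsyn_total += 3 - s
        if sum(a != b for a, b in zip(c1, c2)) == 1:
            if CODE[c1] == CODE[c2]:
                syn_diff += 1
            else:
                nonsyn_diff += 1
    return nonsyn_diff / nonsyn_total, syn_diff / syn_total


# Hypothetical example: two synonymous changes and one nonsynonymous change.
pn, ps = pn_ps("ATGAAACTGGAT", "ATGAAGCTAGAA")
print(pn, ps, pn / ps)
```

In this toy alignment there are far more nonsynonymous than synonymous sites, so a single nonsynonymous difference yields a much smaller proportion than two synonymous ones; this per-site normalization is what makes the ratio interpretable as a measure of constraint.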
The assumption that underpins the use of Ka/Ks as a measure of selection is that synonymous sites evolve neutrally or at least under weak selection compared with nonsynonymous sites, allowing the use of synonymous sites as the baseline to measure the constraints on protein evolution. As a crude approximation, this assumption holds because, for the great majority of protein-coding genes from any organism, Ka/Ks < 1, indicating purifying selection on the protein sequence. For a small minority of genes, Ka/Ks > 1, which is construed as evidence of evolution under positive selection. Genes evolving under positive selection encode specialized proteins for which rapid change is paramount for function that typically involves an “arms race” between competing agencies such as hosts and parasites; examples include bacterial surface proteins (Petersen et al. 2007; Muzzi et al. 2008) and proteins involved in mammalian spermatogenesis, sperm competition, and sperm–egg interaction (Nielsen et al. 2005; Turner et al. 2008). Of course, evolution under positive selection is not unconstrained, as constraints on the overall protein structure still apply (Worth et al. 2009), but evolution along the available trajectories proceeds rapidly. The fact that most protein-coding genes evolve under constraints imposed by purifying selection by no means implies that all amino acid sites are subject to the

Fig. 2.1 The distributions of evolutionary rates for nonsynonymous and synonymous sites of protein-coding genes in primates and the Ka/Ks ratios for three diverse pairs of species (Wolf et al. 2009). Panels: distributions of dN and dS distances between human and macaque orthologs (axis ticks 0.0001–10); distributions of the dN/dS ratio for orthologs (axis ticks 0.001–10) for the Human–Macaque, B. cenocepacia–B. vietnamiensis, and Aspergillus–Neosartorya pairs.

same constraints. On the contrary, the evolutionary rates of sites and, by implication, the strength of constraints affecting different sites are well described by a characteristic skewed Gamma distribution (or, more precisely, a mixture of Gamma distributions), with a small fraction of sites that are virtually unconstrained or, in some cases, subject to positive selection and the majority of the sites subject to broadly distributed constraints (Kelly and Churchill 1996; Grishin et al. 2000; Mayrose et al. 2005; Nielsen 2005). The characteristic strengths of constraints that affect evolution of protein-coding genes differ widely between organisms. Typically, prokaryotic proteins are subject to stronger constraints than eukaryotic proteins, especially those of multicellular forms (plants and animals), with the characteristic median Ka/Ks values in the range of 0.01–0.1 and 0.1–0.5, respectively (Fig. 2.1) (Jordan et al. 2002;
Novichkov et al. 2009b). The values of Ka/Ks and by inference the strength of constraints widely differ between evolutionary lineages such as diverse lineages of bacteria and archaea, and seem to be related to the specific lifestyles of the respective organisms (Novichkov et al. 2009b). The assumption that synonymous sites in protein-coding genes evolve neutrally is useful for measuring selection acting at the protein level but in itself is a rough approximation at best. The universally observed, significant positive correlation between Ka and Ks (Makalowski and Boguski 1998; Drummond and Wilke 2008, 2009; Ellegren 2008) indicates that evolution of synonymous sites is constrained as well and suggests that the evolutionary forces that shape the evolution of nonsynonymous and synonymous sites are related (see the section on protein evolution below). More accurate and powerful tests for purifying and positive selection affecting different classes of sites are variations of the classic McDonald–Kreitman test which compares the patterns of substitutions for within species variation (polymorphisms) with those for between species divergences, under the assumption that the fraction of nonneutral polymorphisms is negligible (Nielsen 2001, 2005). The overall distributions of constraints across genomes are dramatically different in life forms with distinct genome architectures, in particular, between viruses and prokaryotes, on the one hand, with their “wall-to-wall” genomes that consist mostly of protein-coding and RNA-coding genes, and multicellular eukaryotes in whose genomes the coding nucleotides are in the minority, on the other hand (Lynch and Conery 2003; Koonin 2009a) (Fig. 2.2). On a per nucleotide basis, the constraints affecting compact genomes, particularly, those of prokaryotes are orders of magnitude greater than the constraints on the larger genomes of multicellular eukaryotes. 
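The logic of the McDonald–Kreitman comparison described above can be made concrete with two commonly used summary statistics computed from its four counts: the neutrality index and the estimated fraction of nonsynonymous divergence driven by positive selection (often denoted alpha). The sketch below is illustrative only; the function name `mk_summary` and the counts in the example are hypothetical and are not taken from any of the cited studies, and a real analysis would add a significance test on the underlying 2 × 2 table.

```python
def mk_summary(dn, ds, pn, ps):
    """Summarize a McDonald-Kreitman table.

    dn, ds: nonsynonymous and synonymous fixed differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within a species.
    Returns (neutrality_index, alpha).
    """
    if min(dn, ds, pn, ps) <= 0:
        raise ValueError("all four counts must be positive for these ratios")
    # Under strict neutrality the two ratios are expected to be equal (index = 1).
    neutrality_index = (pn / ps) / (dn / ds)
    # alpha = 1 - (ds * pn) / (dn * ps): excess nonsynonymous divergence
    # relative to the expectation derived from polymorphism.
    alpha = 1.0 - neutrality_index
    return neutrality_index, alpha


# Hypothetical counts chosen only to illustrate the arithmetic.
ni, alpha = mk_summary(dn=20, ds=40, pn=10, ps=50)
print(f"neutrality index = {ni:.2f}, alpha = {alpha:.2f}")
```

A neutrality index below 1 (alpha above 0) indicates an excess of nonsynonymous divergence, as expected under positive selection, whereas an index above 1 indicates an excess of nonsynonymous polymorphism, as expected when slightly deleterious variants segregate but rarely reach fixation.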
Considering the characteristically low Ka/Ks values indicative of strongly constrained evolution of protein sequences (Fig. 2.1), there are almost no sequences whose evolution is (effectively) unconstrained in the compact viral and prokaryotic genomes. The notable exceptions are pseudogenes, which are common in some parasitic bacteria such as Rickettsia or Mycobacterium leprae (Harrison and Gerstein 2002; Darby et al. 2007; Monot et al. 2009). In typical genomes of free-living prokaryotes and especially viruses, noncoding regions constitute only 10–15% of the genome, and a considerable fraction of these sequences consists of regulatory elements (promoters, operators, terminators, and translation initiation regions) whose evolution is variably constrained (Molina and van Nimwegen 2008). The genomes of most viruses are even more compact than prokaryotic genomes, with nearly all of the genome sequence taken up by protein-coding genes (Koonin 2009a). Unicellular eukaryotes resemble prokaryotes in their overall genome architecture (notwithstanding important differences such as the absence of operons and the presence of varying numbers of introns) and show a roughly similar distribution of evolutionary constraints, although the fraction of apparently unconstrained noncoding sequences in these genomes is somewhat greater. However, the genomes of multicellular eukaryotes (plants and especially animals) present a stark contrast. These organisms have intron-rich genomes with long intergenic regions, and a substantial, albeit variable, fraction of these noncoding sequences indeed appear to

Fig. 2.2 Approximate distribution of evolutionary constraints across genomes with different architectures. The fractions of different classes of sequences subject to constraints of varying strength are shown as a rough approximation of the values that are typical of the respective class of genomes. Plot content: percentage of the genome (0–100%) occupied by ORFs, control elements, introns, and “junk” genome, shaded from strong to weak constraints, for viruses, prokaryotes, unicellular eukaryotes, and multicellular eukaryotes.

undergo unconstrained evolution (Fig. 2.2). Using McDonald–Kreitman-based approaches, it is possible to estimate the fraction of the nucleotides in a genome that are subject to evolutionary constraints (Sella et al. 2009). These estimated fractions differ substantially even among animals: in Drosophila, 70% of the sites, including 65% of the noncoding sites, appear to be subject to selection (including positive selection) (Sella et al. 2009), whereas in mammals, this fraction is estimated at only 5–6%, as determined using repeats ancestral to human and mouse as a neutral baseline (Waterston et al. 2002). An independent approach based on the deviations from the expected neutral distribution of insertions and deletions in mammalian genomes led to an even lower value of 3% of sites under constraint (Lunter et al. 2006). It is notable, however, that the absolute numbers of sites subject to selection in these animal genomes of widely different size are quite close. By contrast, in Arabidopsis, a plant that is comparable to Drosophila in terms of genome size and overall architecture, the fraction of constrained noncoding sites appears to be substantially lower. The estimate of 3–6% for the fraction of constrained sites in mammalian genomes is remarkable from two opposite standpoints. On the one hand, it appears that the great majority of the mammalian genomic DNA after all fits the early (and much maligned) definition of junk (Doolittle and Sapienza 1980). Of course,
recruitment of “junk” sequences, such as those of diverse transposable elements, for various functions is common (Jordan et al. 2003; Bowen and Jordan 2007), so yesterday’s junk can be today’s essential gene (and vice versa), but at any given time, most of the primate genome evolves without appreciable constraints. The converse aspect of these estimates is that, as protein-coding sequences comprise only 1.2% of the genome (Waterston et al. 2002), the substantial majority of the selected sites do not encode amino acids. We still do not know the actual distribution of the constrained sites among different classes of sequences or the distribution of selection pressures, but some important contributions and their approximate magnitudes have become clear. In particular, the selective pressure on 5′-terminal and especially long 3′-terminal untranslated regions of mammalian genomes is comparable to that affecting synonymous sites in coding regions, if not stronger (Duret et al. 1993; Shabalina et al. 2004; Drake et al. 2006). An even greater contribution to the noncoding part of the mammalian “selectome” (using the term in the most general sense, as the totality of sites subject to all forms of selection, as opposed to the original usage limited to positive selection; Proux et al. 2009) is the ever-growing compendium of noncoding RNA genes present in vertebrate genomes, the RNome (Costa 2005). A major and currently best characterized part of the RNome consists of thousands of regulatory microRNAs that are subject to a broad range of evolutionary constraints (Shabalina and Koonin 2008; Carthew and Sontheimer 2009). In addition, there are numerous long noncoding (macro) RNAs, the functions of which remain largely unclear, although there is striking anecdotal evidence of roles of these RNAs in gene regulation and development (Ponting et al. 2009). 
Approximately 3,000 macroRNAs were found to be conserved in mammals and are subject to a selective pressure that appears to be comparable to the constraints affecting protein-coding genes (Ponjavic et al. 2007). Beyond doubt, the known part of the RNome is the proverbial tip of the iceberg, especially considering the detection of transcripts from nearly all sequences in mammalian genomes (Bertone et al. 2004; Johnson et al. 2005). Comparative-genomic analysis reveals numerous conserved sequences (including the so-called ultraconserved elements that retained their identity throughout long evolutionary spans such as the entire course of vertebrate evolution) within introns and intergenic regions of animal and plant genomes (Dermitzakis et al. 2005; Elgar 2009), but so far transcription into a specific functional RNA has been demonstrated only for a few of these (Bejerano et al. 2004; Baira et al. 2008). Nevertheless, it has been shown that the ultraconserved sequences are subject to “ultraselection”, suggesting key functions that remain to be deciphered (Katzman et al. 2007). On the whole, the problem of evolutionarily constrained “dark matter” in animal genomes remains pertinent, as the status of the majority of constrained nucleotides is still unclear, at least in vertebrates, the organisms with the lowest known gene density. In particular, the extent of sequence conservation unrelated to transcription but rather caused by requirements of expression regulation, chromatin structure, and other factors is still a wide open question. To succinctly summarize the current understanding of the constraints affecting different types of sites across the known diversity of the genomes (Fig. 2.2), some
fundamental, straightforward conclusions appear indisputable, in particular, that nonsynonymous sites in protein-coding sequences and sequences encoding structural RNAs are among the most strongly constrained and that the characteristic distributions of constraints critically depend on genome architecture. However, beyond these basic principles, and perhaps unexpectedly, the evolutionary regimes seem to widely differ even for rather closely related lineages, and much additional work in diverse organisms is required to develop a comprehensive picture of the constraints and pressures that shape genome evolution.

2.3 Evolutionary Constraints on Gene and Genome Architectures

Beyond sequence evolution, comparative genomics yields massive amounts of data on the evolution of gene and genome organization, or architecture. An aspect of gene architecture that is common to all life forms but is particularly prominent in eukaryotes is the multidomain organization of proteins (Koonin et al. 2000). Numerous proteins consist of multiple “evolutionary domains” that may or may not correspond to structural domains but in either case show varying degrees of evolutionary mobility. The multidomain organization of some key proteins is conserved through the entire course of evolution of domains of cellular life (archaea, bacteria, and eukaryotes), as is the case of the association of polymerase domains with nuclease domains in different families of DNA polymerases (Aravind and Koonin 1998), to mention just one striking example. More generally, however, domain rearrangements at all ranges of evolutionary distances form an important resource of evolutionary plasticity which is particularly remarkable in the case of so-called promiscuous domains which combine with diverse other domains in numerous proteins and often provide connections in interaction and regulatory networks and complexes (Wuchty and Almaas 2005; Basu et al. 2008, 2009). A feature of gene architecture that is almost fully eukaryote-specific is the exon–intron organization of protein-coding genes which in eukaryotes consist of multiple exons separated by introns. A notable discovery of comparative genomics is the high level of conservation of intron positions over long evolutionary spans: indeed, up to 25–30% of the intron positions are shared between animals and plants, with the implication that most of these introns remained in the same positions throughout eukaryotic evolution (Fedorov et al. 2002; Rogozin et al. 2003; Roy and Gilbert 2006). 
Within some of the animal lineages, in particular vertebrates, there seems to be almost complete intron stasis, with minimal intron loss and virtually no gain. In sharp contrast, evolution of other lineages, such as nematodes, as well as many groups of unicellular eukaryotes, involves extensive turnover of introns (Carmel et al. 2007; Roy and Penny 2007). Thus, evolution of eukaryotic gene architecture shows a complex landscape, with a dynamic evolutionary process in some lineages but much less change in others.

Genome architecture refers to all aspects of the mapping of genetic elements onto the genome including gene order, clustering, and co-regulation of genes with related functions, allocation of genes to individual chromosomes, etc. (Carmel et al. 2007; Lynch 2007c; Roy and Penny 2007; Koonin 2009a). The very first comparisons of the order of genes in sequenced bacterial genomes revealed a remarkable lack of conservation of the long-range gene order which contrasts with the recurrent presence of partially conserved arrays of co-regulated genes, operons, in diverse prokaryotes (Mushegian and Koonin 1996a; Dandekar et al. 1998). Subsequent analysis has shown that the divergence of long-range gene orders in prokaryotes is roughly proportional to sequence divergence of protein-coding genes but evolution of gene order is extremely fast such that, for many lineages, no long-range conservation is seen even at very low levels of sequence divergence. Beyond this general pattern, the rate of gene order decay substantially differs between prokaryotic lineages (Novichkov et al. 2009b) (Fig. 2.3). The gene order in prokaryotes appears to be disrupted primarily by inversions centered at the origin of replication the frequency of which dramatically differs among prokaryotes (Eisen et al. 2000). Apparently, the origin-centered inversion is a neutral process that is not constrained (or minimally constrained) by purifying selection and depends primarily on the activity of the relevant recombination machinery. In contrast to the lack of conservation of the long-range gene order, prokaryotic operons are characterized by a combination of evolutionary resilience and plasticity, forming overlapping gene arrays that are partially shared by evolutionarily

Fig. 2.3 Divergence of large-scale genome organization vs. protein sequence conservation. The data are shown for four sets of closely related bacterial strains from the ATGC database (Novichkov et al. 2009a). The rearrangement distance (dY) is calculated as the fraction of (putative) orthologs that do not belong to regions of synteny. The dS value of 1 approximately corresponds to 93–97% identity between the compared sequences (Novichkov et al. 2009b). Axes: sequence distance (dS, 0–2.5) vs. genome rearrangement distance (dY, 0–0.3); series: Shewanella baltica, Bacillus anthracis, Burkholderia ambifaria, and Yersinia pestis.

distant organisms (Rogozin et al. 2002; Ling et al. 2009). To a large extent, the wide spread of some operons among prokaryotes (the ribosomal superoperon and membrane transport cassette operons being the prime cases in point) owes to horizontal gene transfer (HGT) as captured in the selfish operon concept (Lawrence and Roth 1996; Lawrence 1999). When a transferred piece of DNA includes an entire operon consisting of genes encoding a complete pathway or functional system, the chances of fixation dramatically increase. The lack of long-range gene order conservation notwithstanding, the gross architecture of prokaryotic genomes is not entirely unconstrained: there are substantial biases in gene localization, for instance, the preferential codirectionality of gene transcription with replication, conceivably, as a result of selection for minimization of the chance of collision between RNA polymerase and replication forks (Rocha 2008). With a few notable exceptions, such as nematodes and trypanosomes, eukaryotes have no operons; those operons that do exist have nothing to do with prokaryotic operons and seem to have evolved de novo (Blumenthal 2004; Osbourn and Field 2009). Attempts to identify nonrandomness in the eukaryotic gene order, in the form of clustering of genes with connected functions, similar expression levels, and patterns, and other similar characteristics have led to mixed results (Hurst et al. 2004; Koonin 2009a; Osbourn and Field 2009). With some striking exceptions such as the strict order of the animal Hox genes (Lemons and McGinnis 2006), the trends in gene clustering tend to be weak, so the gene order can be considered quasirandom (Koonin 2009a). Evolution of gene order in eukaryotes seems to be determined, primarily, by random chromosomal breaks, and there are no highly conserved gene arrays between distantly related forms, such as different animal phyla, let alone animals and fungi or plants. 
On the whole, evolution of genome architecture appears to be shaped by the interplay of strong constraints that determine the conservation of operons, weak constraints on other forms of functional clustering and large-scale gene organization, and extensive dynamics of genome rearrangements and HGT. These dynamics both counteract weak constraints by disrupting gene associations and reinforce the effect of stronger constraints, as in the case of horizontal spread of “selfish” operons.
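The rearrangement distance dY used in Fig. 2.3 is defined as the fraction of (putative) orthologs that do not belong to regions of synteny. The sketch below shows one very simple way such a quantity could be computed; it is not the procedure of the cited ATGC analysis. The function name is hypothetical, and so is the synteny criterion: here a gene counts as syntenic if at least one of its immediate neighbors is the same in both genomes, with gene orders treated as linear lists of unique ortholog identifiers and orientation ignored.

```python
def rearrangement_distance(order_a, order_b):
    """Toy dY: fraction of shared orthologs with no conserved neighbor.

    order_a, order_b: gene orders given as lists of unique ortholog IDs.
    Assumption (not from the source): a gene is 'in synteny' if it takes
    part in at least one adjacency present in both genomes.
    """
    shared = set(order_a) & set(order_b)
    if not shared:
        raise ValueError("the two genomes share no orthologs")
    a = [g for g in order_a if g in shared]
    b = [g for g in order_b if g in shared]

    def adjacencies(order):
        # Unordered neighbor pairs, so inverted segments keep their adjacencies.
        return {frozenset(pair) for pair in zip(order, order[1:])}

    conserved = adjacencies(a) & adjacencies(b)
    in_synteny = {gene for pair in conserved for gene in pair}
    return 1.0 - len(in_synteny) / len(shared)


# Hypothetical gene orders: g3 has moved to the end of the second genome.
genome_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
genome_b = ["g1", "g2", "g4", "g5", "g6", "g3"]
print(rearrangement_distance(genome_a, genome_b))
```

Under this criterion a single relocated gene contributes one sixth of the orthologs to dY in the example, while identical orders give 0 and fully scrambled orders approach 1.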

2.4 Evolutionary Constraints on Genome Size, Gene Number, Evolution of Orthologous Gene Lineages, and Gene Repertoires

The number of protein-coding genes in cellular life forms varies within a surprisingly narrow range compared with genome size, especially considering the difference in organizational complexity between prokaryotes and multicellular eukaryotes. Excluding, on one end of the spectrum, the extremely reduced genomes of some intracellular parasitic bacteria that seem to be on their way to becoming
organelles (Nakabachi et al. 2006) and, on the other end, polyploid plant genomes, the number of encoded proteins varies only from 500 to 25,000, less than two orders of magnitude (Koonin 2009a). The largest known bacterial genome contains only about twofold fewer protein-coding genes than the most complex eukaryotic genomes. As already mentioned above, the genome architectures are drastically different between unicellular and multicellular life forms, so that in unicellular organisms, especially in prokaryotes, the number of encoded proteins closely correlates with the genome size (roughly constant gene density, around one gene per kilobase of DNA), whereas in multicellular organisms, especially animals, the two are decoupled. What constrains the number of encoded proteins from below and from above? The low threshold of genomic complexity intuitively relates to a “minimal gene set for cellular life”, that is, the minimal set of genes sufficient to maintain a functional cell (in practice, of course, a prokaryotic cell) (Koonin 2003; Moya et al. 2009). The concept of a minimal gene set is intrinsically linked to the definition of gene orthology and orthologous gene sets and nonorthologous gene displacement. Orthologs are genes that evolved from a single ancestral gene in the last common ancestor of the compared genomes in contrast to paralogs, genes that evolved by duplication (Koonin 2005). For the majority of genes, evolution of orthologous gene lineages is constrained within a distinct trajectory so that such lineages remain unique and distinguishable from each other over long evolutionary spans. 
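Because orthologous lineages remain distinct in this way, a common practical shortcut for detecting orthologs is the bidirectional (reciprocal) best hit criterion: two genes from different genomes are paired if each is the other's highest-scoring match. The sketch below illustrates only this criterion; the function name, the gene identifiers, and the similarity scores are hypothetical, and in practice the scores would come from a sequence similarity search.

```python
def bidirectional_best_hits(scores_ab, scores_ba):
    """Return gene pairs that are each other's best hit.

    scores_ab: dict mapping each gene of genome A to a dict
               {gene_of_B: similarity_score} from an A-versus-B search.
    scores_ba: the same for the B-versus-A search.
    """
    best_ab = {a: max(hits, key=hits.get) for a, hits in scores_ab.items() if hits}
    best_ba = {b: max(hits, key=hits.get) for b, hits in scores_ba.items() if hits}
    # Keep a pair only if the best hit is reciprocated.
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical scores: a3 also matches b2 but is not b2's best hit,
# as would happen for a more divergent paralog.
scores_ab = {
    "a1": {"b1": 200, "b2": 50},
    "a2": {"b2": 120, "b1": 40},
    "a3": {"b2": 90},
}
scores_ba = {
    "b1": {"a1": 210},
    "b2": {"a2": 115, "a3": 95},
}
print(bidirectional_best_hits(scores_ab, scores_ba))
```

The reciprocity requirement is what filters out one-directional matches between a gene and a paralog of its true ortholog, which is why the criterion works well when orthologous lineages stay distinguishable.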
This evolutionary distinctness of orthologous lineages provides for the considerable effectiveness of straightforward methods for identification of orthologous gene sets based on “bidirectional best hits” and is key to comparative genomics, allowing comprehensive comparison of gene repertoires and delineation of core sets of conserved genes and putative minimal gene sets (Tatusov et al. 1997; Altenhoff and Dessimoz 2009). Minimal gene sets for cellular life derived by comparative-genomic and experimental approaches converge at 250–350 genes and seem to encode most of the essential cellular functions (Koonin 2003; Moya et al. 2009). However, an apparent paradox is that a set of 250–350 conserved orthologous genes can be derived only in comparisons of small sets of genomes of not too diverse organisms, as exemplified by the first analysis of this kind, which compared the parasitic bacteria Haemophilus influenzae and Mycoplasma genitalium and yielded a hypothetical minimal gene set of approximately 250 genes (Mushegian and Koonin 1996b). The core set of ubiquitously conserved genes is continuously shrinking with the addition of new sequenced genomes and seems to be limited to approximately 30 genes, all encoding proteins involved in translation and transcription (Charlebois and Doolittle 2004; Koonin and Wolf 2008). The explanation is nonorthologous gene displacement: most of the essential cellular functions can be performed by members of more than one orthologous gene set, and in many cases, genes or systems responsible for the same function are completely unrelated (Koonin et al. 1996; Koonin 2003). The relevant concept for defining a minimal genetic complement of a cell (the lower bound of genomic complexity) is not a unique minimal gene set but rather a unique set of indispensable functional niches that can be filled with diverse collections of genes. Minimal requirements for specific life styles can be defined similarly, for
instance, the minimal gene complement of an autotrophic organism, which includes about 1,000 essential functions (Koonin 2003). Thus, the lower bound is defined by the minimal number of functions that are necessary to support a particular life style, but even at this fundamental level of cellular organization, there is notable plasticity in terms of specific gene complements supporting these functions. The nature of the upper bound of genetic complexity is much less clear. However, the question of why, despite accelerating genome sequencing, the maximum number of genes practically does not grow seems pressing, especially considering the decoupling of gene number and genome size seen in multicellular eukaryotes. One attractive hypothesis is the “bureaucratic ceiling of complexity”. It has been noticed that different functional classes of genes scale differently with the total number of genes in a genome. Some variation notwithstanding, in prokaryotes, there seem to be three fundamental exponents that characterize these dependences: 0, 1, and 2 (van Nimwegen 2003; Koonin and Wolf 2008). Genes for proteins involved in information processing (translation, transcription, and replication) scale with a 0 exponent, i.e., the number of these genes reaches a plateau already in the smallest genomes and effectively does not depend on the overall genomic complexity; metabolic enzymes and transport proteins scale roughly proportionally to the total number of genes, whereas regulators and signal transduction system components scale quadratically (Fig. 2.4). The characteristic exponents of the three broad functional classes of genes show remarkably little variation across prokaryotic lineages, suggesting that the differential evolutionary dynamics of genes with different functions reflect fundamental “laws” of evolution of cellular organization (Molina and van Nimwegen 2009) or, in other words, distinct, strong constraints on the functional composition

Fig. 2.4 Differential scaling of four broad classes of genes with the total number of genes in prokaryotic genomes. The data are from Koonin and Wolf (2008); genes that did not belong to COGs (typically, 15–20% in each genome) were not taken into account. Axes (logarithmic): total number of proteins in COGs (100–10,000) vs. number of proteins in the class (1–10,000); series: transcriptional regulators, signal transduction, metabolism, and translation; fitted exponents shown in the plot: γ = 1.0, γ = 0.2, and γ = 1.9 (twice).

of genomes. Eukaryotic genes show similar even if less pronounced patterns of power law gene scaling, with the exponent for the regulatory genes being substantially greater than one (van Nimwegen 2003). The deep underlying causes of the superlinear scaling of the regulators remain to be understood. A simple “toolbox” model of evolution of prokaryotic metabolic networks seems to be compatible with the quadratic scaling of regulators (Maslov et al. 2009). Under this model, enzymes for utilizing new metabolites together with their dedicated regulators are added (primarily, via HGT) to a progressively versatile reaction network, and because of the growing complexity of the preexisting network that provides enzymes for intermediate reactions, the ratio of regulators to regulated genes steadily grows. Regardless of the exact underlying mechanisms, the superlinear scaling of the regulators clearly could determine the upper limit of the growth of the gene number. At some point (that is not easy to identify precisely), the cost of adding extra regulation (“inflating bureaucracy”) will inevitably become unsustainable, curbing the growth of genetic complexity. The bureaucracy ceiling hypothesis seems particularly plausible in view of the surprising lack of major gene number expansion in vertebrates where the coupling between the gene number and genome size is obviously broken (see also below). In these organisms, the cost of replication can be ruled out as the major factor determining the upper limit, and the cost of regulation, possibly, along with the cost of expression, is the most likely candidate for the role of the principal constraint. 
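Scaling exponents of the kind discussed above are typically read off as slopes of straight-line fits in log–log coordinates. The sketch below shows such a fit by ordinary least squares; it is only an illustration, the function name is hypothetical, and the gene counts are synthetic values generated from exact power laws rather than data from any genome.

```python
import math


def fit_power_law(total_genes, class_counts):
    """Fit class_counts ~ c * total_genes**gamma by least squares in log-log space.

    Returns (gamma, c). Assumes all values are positive.
    """
    xs = [math.log(n) for n in total_genes]
    ys = [math.log(k) for k in class_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gamma = sxy / sxx                       # slope = scaling exponent
    c = math.exp(mean_y - gamma * mean_x)   # intercept = prefactor
    return gamma, c


# Synthetic genomes (illustrative numbers only).
totals = [500, 1000, 2000, 4000, 8000]
regulators = [1e-5 * n ** 2 for n in totals]  # quadratic class
metabolism = [0.2 * n for n in totals]        # linear class

gamma_reg, c_reg = fit_power_law(totals, regulators)
gamma_met, c_met = fit_power_law(totals, metabolism)
print(round(gamma_reg, 3), round(gamma_met, 3))

# With an exponent of 2, the regulator share of the genome grows linearly with N.
for n in totals:
    print(n, round(c_reg * n ** gamma_reg / n, 4))
```

The last loop makes the arithmetic behind the bureaucratic ceiling argument explicit: if regulators scale as c·N², their fraction of all genes is c·N, so each doubling of gene number doubles the relative regulatory overhead.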
It is not by chance, then, that vertebrates evolved other, elaborate means of increasing the proteomic complexity, such as the pervasive alternative splicing and alternative transcription (Nilsen and Graveley 2010), and regulatory complexity (the expansive, still under-appreciated regulatory RNome) that do not involve inflation of the number of protein-coding genes. A major process of genome evolution that in eukaryotes could be the principal path to innovation is gene duplication leading to the formation of paralogous gene families (Ohno 1970; Lespinet et al. 2002). The size distribution of paralogous families in each studied genome follows a power-law-like function that is reproduced, with a high precision, by a simple gene birth and death model conditioned on the equilibrium (constant size) in genome evolution (Karev et al. 2002; Koonin et al. 2002). This process seems to underlie a fundamental constraint on gene demography that is coupled to the constraint on the total number of genes. Beyond the sheer numbers of genes, comparative genomics yields insights into the constraints on and plasticity of gene repertoires. In agreement with the findings on the small and shrinking cores of conserved genes, nonorthologous gene displacement, and extensive redundancy, gene loss has emerged as a major factor of evolution in all life forms. Gene loss is dominant over other processes in the evolution of parasites but is extensive in all lineages, in particular, in the evolution of many animal taxa as illustrated by the high level of orthology between vertebrates and primitive animals such as sea anemone and trichoplax, in contrast to much more limited orthologous relationships between vertebrates and arthropods or nematodes (Putnam et al. 2007; Srivastava et al. 2008). Individual genes show a broad distribution of propensities for gene loss (PGL) (Krylov et al. 2003), and


E.V. Koonin and Y.I. Wolf

moreover, it appears that the observed evolutionary and phenomic features of genes are compatible with a steady-state model of genome evolution under which the distribution of PGL as well as the distribution of gene loss rate remain effectively constant over extended evolutionary spans (Wolf et al. 2009). This distribution might be another important constraint governing genome evolution.
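The power-law-like distribution of paralogous family sizes mentioned above can be illustrated with a toy birth-and-death chain at equilibrium. This is a sketch in the spirit of such models, not the published model itself: it assumes that a family of size i gains a member at a rate proportional to (i + a) and loses one at a rate proportional to (i + b), with equal rate constants, and it omits all other details. The parameter values and the function name are hypothetical. Under these assumptions, detailed balance yields equilibrium frequencies whose tail decays approximately as a power law with exponent a − b − 1.

```python
import math

def stationary_family_sizes(a, b, max_size):
    """Equilibrium family-size frequencies for a linear birth-death chain.

    Birth rate at size i is taken as (i + a) and death rate as (i + b);
    the equal rate constants cancel. Detailed balance gives
    p[i + 1] / p[i] = birth(i) / death(i + 1).
    """
    weights = [1.0]  # unnormalized weight of family size 1
    for i in range(1, max_size):
        weights.append(weights[-1] * (i + a) / (i + 1 + b))
    total = sum(weights)
    return [w / total for w in weights]  # index 0 corresponds to size 1

# Hypothetical parameter values, chosen only for illustration.
a, b = 1.0, 2.0
freqs = stationary_family_sizes(a, b, max_size=2000)

# Log-log slope in the tail, to compare with the predicted a - b - 1.
i1, i2 = 500, 1000
tail_slope = (math.log(freqs[i2 - 1]) - math.log(freqs[i1 - 1])) / (
    math.log(i2) - math.log(i1)
)
print(round(tail_slope, 3), a - b - 1)
```

The chain is truncated at a maximum family size, so the frequencies are normalized over a finite range; this has little effect on the slope measured well below the cutoff.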

2.5

The Causes of Evolution of Protein-Coding Genes

Protein-coding genes, or at least the nonsynonymous positions that determine the amino acid identity, are among the most strongly constrained sequences in all genomes. However, the distribution of the rates of evolution among orthologous genes in any pair of compared genomes spans 3–4 orders of magnitude and is much broader than the distribution of the rates for synonymous sites (Fig. 2.1). Remarkably, the shapes of the rate distributions for orthologous proteins are highly similar for all studied cellular life forms, from bacteria to archaea to mammals (Wolf et al. 2009) (Fig. 2.5). Another universal of genomic and phenomic evolution is the anticorrelation between the rate of evolution of a protein-coding gene and its expression level: highly expressed genes evolve slowly, a dependence that was invariably observed in all model organisms for which expression data are available (Pal et al. 2001, 2006; Krylov et al. 2003; Drummond and Wilke 2008). Given the aforementioned positive correlation between Ka and Ks, it is not surprising that both rates show the same dependence; more unexpectedly, this anticorrelation with the evolutionary rate was detected also for 3′ UTRs but not for 5′ UTRs (Jordan et al. 2004).

[Figure 2.5: distributions of relative evolution rate (horizontal axis, logarithmic scale from 0.01 to 10) for Burkholderia, Salinispora, Methanococcus, Homo, Aspergillus, and the model]

Fig. 2.5 The universal distribution of evolutionary rates across orthologous gene sets. The evolutionary rates for five pairs of closely related organisms from different branches of life were calculated as nucleotide distances for the complete sets of orthologous genes (Wolf et al. 2009). The relative evolution rate for each gene was obtained by dividing its evolution rate by the median rate for the respective pair of organisms. “Model” refers to estimated transition rates in 134 mutationally connected networks for simulated robustly folding 18-mer protein-like molecules (Lobkovsky et al. 2010). Original model rates were normalized by their median value and scaled to standard deviation of 0.25 to match the width of the distributions derived from biological data

The existence of these universals of genomic evolution and their fundamental link with phenomic characteristics suggest that the primary causes of protein evolution could have more to do with fundamental principles of protein folding than with unique biological functions. It has been proposed that the principal selective factor underlying the evolution of proteins is robustness to misfolding, owing to the deleterious effect of misfolded proteins that, in addition to the expenditure of energy, can be toxic to the cell (Drummond et al. 2005; Drummond and Wilke 2008, 2009). Moreover, under this model, evolution of synonymous sites is constrained, at least in part, by the same factors as the evolution of proteins owing to the pressure for the preferential use of optimal codons in highly expressed proteins and in specific sites that are important for protein folding (Drummond and Wilke 2008; Zhou et al. 2009), and evolution of the 3′ UTRs could follow the same trend (Jordan et al. 2004) as these regions are involved in the regulation of translation. A recent modeling study of misfolding-dominated protein evolution that employed a simple off-lattice model of protein folding and produced estimates of evolutionary rates under the assumption that protein misfolding was the only source of fitness cost (Lobkovsky et al. 2010) reproduced the universal distribution of protein evolutionary rates as well as the dependence between evolutionary rate and expression with considerable accuracy (Fig. 2.5).
These findings suggest that the universal rate distribution indeed might be a consequence of fundamental physics of proteins and provide for a general model of protein evolution under which evolution of a given protein is determined, primarily, by its intrinsic robustness to misfolding which also determines the attainable level of translation (Fig. 2.6) (Wolf et al. 2010). In general, the robustness of a protein to misfolding and accordingly the rate of evolution are determined by the size of the (nearly) neutral network, that is, the network of sequences that have approximately the same robustness and accordingly the same fitness as the original sequence (Wagner 2008). Under the model (Wolf et al. 2010), the nearly neutral network size is (roughly) inversely proportional to the robustness of the original sequence, i.e., in the fitness landscape, robust, highly expressed proteins occupy tall, steep peaks, with small areas of high fitness, hence slow evolution; in contrast, proteins with lower robustness occupy lower and wider peaks, with larger areas of high fitness, allowing faster evolution (Fig. 2.6). The original hypothesis on misfolding-dominated evolution of protein-coding genes held that misfolding was largely induced by mistranslation of the coding sequence (Drummond and Wilke 2008, 2009). The latest analysis of the relative contributions of structural–functional constraints and translation rate to protein evolution implies that stochastic misfolding of the native sequence could be even more common and consequential than mistranslation-induced misfolding (Wolf et al. 2010). Nevertheless, mistranslation (somatic mutation), which is relatively frequent [10⁻⁴–10⁻⁵ per codon (Kramer and Farabaugh 2007)], is likely to be an important factor affecting the instantaneous shape of the robustness landscape by temporarily expanding the nearly neutral network (Fig. 2.6).
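To give a sense of scale for the mistranslation rates quoted above, the fraction of protein molecules that carry at least one translation error can be estimated with simple arithmetic. The sketch below assumes that errors occur independently and with the same rate at every codon, and it uses a protein length of 400 codons; both the independence assumption and the length are illustrative choices, not values taken from the text.

```python
def fraction_with_translation_error(error_rate_per_codon, protein_length):
    """Probability that at least one codon is mistranslated, assuming
    independent errors with the same rate at every codon."""
    return 1.0 - (1.0 - error_rate_per_codon) ** protein_length

# A length of 400 codons is an assumed, illustrative value.
length = 400
high = fraction_with_translation_error(1e-4, length)
low = fraction_with_translation_error(1e-5, length)
print(f"{high:.4f} {low:.4f}")
```

Under these assumptions, roughly 4% of the molecules of such a protein would contain at least one error at the higher rate and roughly 0.4% at the lower rate.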

[Figure 2.6: schematic robustness/fitness landscapes in sequence space for protein family X and protein family Y, each shown at low expression (higher evolution rate) and at high expression (lower evolution rate); vertical axis: folding robustness; color scale: fitness, from low to high]

Fig. 2.6 A conceptual model of misfolding-driven protein evolution. The cartoon schematically shows the robustness/fitness landscapes for two protein families at high and low expression levels. The high fitness/robustness area (green) reflects the size of the nearly neutral network in the sequence space

The view of protein evolution under which the primary constraints have to do more with the maintenance of the native folding as well as intermolecular interactions than with unique protein functions seems to be compatible with the recent large-scale analysis of protein family evolution (Worth et al. 2009).

2.6

Constraints on Molecular Phenotypes

The advances of systems biology provide for direct evolutionary study of molecular phenomic variables, such as gene expression, protein abundance, and architecture of interaction networks. In other words, it is now possible to assess evolutionary variance and constraints by directly comparing gene expression profiles and networks, protein abundances, and other features of the molecular phenotype between different organisms and evolutionary lineages.


Molecular phenomic variables, such as gene expression level and number of interaction partners of a protein, show a distinct structure of dependences among themselves and with evolutionary variables such as sequence evolution rate and the rate of gene loss (Wolf et al. 2006). The correlations between phenomic variables are typically positive, i.e., highly expressed proteins also tend to interact with many other proteins, to have many paralogs, etc., whereas the correlations between the phenomic and evolutionary variables are negative; for instance, highly expressed genes on average evolve slower than those expressed at a low level. Thus, as exemplified by the model of protein evolution discussed above, constraints on the ranges of phenomic variables, in part, appear to constrain evolution of gene sequences, gene repertoires, and genome architectures. Several studies suggested that gene expression in animals is not strongly constrained during evolution or at least has a major neutral component (Jordan et al. 2004; Khaitovich et al. 2004). However, subsequent analyses revealed clear signatures of selective constraints that affect gene expression (Denver et al. 2005; Jordan et al. 2005; Gilad et al. 2006). Recently, it has been shown that the abundances of orthologous proteins are strongly correlated even among distantly related animals. A correlation coefficient greater than 0.8 was observed for approximately 3,000 orthologous genes from the nematode C. elegans and the fly D. melanogaster, a value that is in sharp contrast with the correlation coefficients in the range of 0.2–0.4 that are typically seen in comparisons of genomic and molecular phenomic variables (Wolf et al. 2006).
Strikingly, the correlation between protein abundances was found to be substantially greater than the correlation between mRNA expression rates and between the rates of coding sequence evolution (measured by comparison of orthologous genes from pairs of closely related species) within the same set of genes (Schrimpf et al. 2009; Wolf et al. 2010). Thus, assuming there are no unrecognized biases in the measurements, protein abundance appears to be constrained during evolution to a substantially greater extent than gene expression and even more strongly than sequence evolution itself. The global architectures of protein interaction and gene coexpression networks appear to be universal across all life forms, with the characteristic power law distribution of the network node degree (number of connections) (Barabasi and Oltvai 2004). Local network structures seem to be much less strongly constrained and differ even among closely related organisms (Bergmann et al. 2004; Tsaparas et al. 2006). However, a comparison of gene coexpression networks from the so-called mutation accumulation lineages of C. elegans, in which the selective constraints are effectively removed (Denver et al. 2005), with those of the natural isolate suggests that it is the local wiring of the coexpression network that is constrained by selection, whereas the global properties are not affected by the removal of constraints (Jordan et al. 2008). Thus, the similar global network properties seen in widely different organisms might reflect “neutral” rather than selective constraints, that is, could have evolved via simple, stochastic, nonselective processes as exemplified by birth-and-death models of genome and network evolution (Koonin et al. 2002; Lynch 2007a).

2.7

Constraints on Evolutionary Trajectories: What Happens When the Tape of Evolution Is Rewound?

An intriguing, deep question in evolutionary biology is how constrained the course of evolution itself is, or in other words, to what extent the evolutionary process is free to explore different trajectories between the given initial and end states (Kassen 2009). In theory, mutational trajectories in sequence space are considered to be fundamentally stochastic (Mani and Clarke 1990). However, experimental evolution studies indicate that paths of adaptive evolution are substantially constrained by interactions between mutations (epistasis and pleiotropy) although not to the point of becoming deterministic. A series of experiments on evolution of bacterial antibiotic resistance resulting from five point mutations in the β-lactamase gene showed that, of the 120 (5!) trajectories across the sequence space, 102 were inaccessible to evolution, and of the remaining 18 trajectories, several had negligible probability of realization (Weinreich et al. 2006). Even stronger constraints were identified in a subsequent study that explored a more complex fitness landscape by simultaneously evolving resistance to two antibiotics (Novais et al. 2010). The remarkable long-term study of bacterial evolution under controlled conditions by Lenski and coworkers provides examples of both parallel emergence of the same mutations under a particular selective pressure and the realization of multiple trajectories (Barrick et al. 2009; Barrick and Lenski 2009; Kassen 2009; Stanek et al. 2009). For instance, it has been explicitly shown that evolution of the same, extremely rare phenotype, the ability to grow on citrate, proceeded along distinct trajectories in different Escherichia coli populations (Blount et al. 2008).
Direct studies of evolutionary trajectories in the sequence space are still very limited but they have already made it clear that, although historical contingency is crucial in the evolutionary process (Jacob 1977), the exploration of the sequence space is strongly constrained so that only a minority of theoretically possible trajectories are accessible. The extent of these constraints depends on the shape of the fitness landscape: the more rugged the landscape, the stronger the constraints. The shape of the landscape itself depends on the nature, strength, and interactions of the relevant selective factors and evolves with time, which makes it more of a seascape (Mustonen and Lassig 2009, 2010).
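The counting of accessible trajectories described above can be made concrete with a toy calculation. Five mutations can be acquired in 5! = 120 different orders, and an order is treated here as accessible only if fitness increases at every step. The two fitness functions below are hypothetical landscapes invented for illustration (one purely additive, one with a single instance of sign epistasis); they are not the measured resistance values of the experiments cited above, which also weighted trajectories by their probability of realization.

```python
from itertools import permutations

N_MUTATIONS = 5

def additive_fitness(genotype):
    """Hypothetical landscape: every mutation adds the same benefit."""
    return float(sum(genotype))

def epistatic_fitness(genotype):
    """Hypothetical landscape with sign epistasis: mutation 0 is
    deleterious unless mutation 1 is already present."""
    penalty = 2.0 if genotype[0] and not genotype[1] else 0.0
    return float(sum(genotype)) - penalty

def count_accessible(fitness):
    """Count mutation orders along which fitness rises at every step."""
    accessible = 0
    total = 0
    for order in permutations(range(N_MUTATIONS)):
        total += 1
        genotype = [0] * N_MUTATIONS
        current = fitness(genotype)
        monotonic = True
        for site in order:
            genotype[site] = 1
            new = fitness(genotype)
            if new <= current:
                monotonic = False
                break
            current = new
        if monotonic:
            accessible += 1
    return accessible, total

additive_result = count_accessible(additive_fitness)
epistatic_result = count_accessible(epistatic_fitness)
print(additive_result, epistatic_result)
```

In this toy setting, a single epistatic interaction already removes every order in which mutation 0 precedes mutation 1, that is, half of the trajectories; more rugged landscapes would leave fewer still.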

2.8

Robustness, Plasticity, and Evolutionary Constraints

The aspects of evolution that are orthogonal to constraints are the plasticity of genomic and phenomic characteristics and the robustness of molecular phenotypes (Wagner 2005). In many groups of organisms, large-scale genome organization seems to be only weakly constrained so that gene order substantially differs even between closely related organisms, especially, among prokaryotes (Koonin 2009a; Novichkov et al. 2009b) (Fig. 2.6). The gene repertoire of many organisms,


especially prokaryotes, shows plasticity that may even exceed the plasticity of genome architecture as dramatically illustrated by rapid genome reduction in parasitic bacteria (Darby et al. 2007) and by acquisition of pathogenicity islands that may comprise over 30% of the recipient genome in bacterial pathogens (Dobrindt et al. 2003). The plasticity of genome organization and composition is paralleled by the evolutionary flexibility of regulatory networks and complements the more strongly constrained evolution of individual genes (Lozada-Chavez et al. 2006; Kazakov et al. 2009). Evolutionary plasticity and the strength of evolutionary constraints are tightly linked to robustness of biological systems, that is, resistance of phenotypes to genetic perturbation (mutations, recombination, etc.). Robustness seems to be an evolved property as demonstrated by the study of specialized buffering mechanisms (for instance, those mediated by molecular chaperones of the HSP90 family), the impairment of which (often by environmental stress) reveals hidden genetic variation and accordingly enhances the evolutionary potential of the organism (Queitsch et al. 2002; Wagner 2008; Masel and Siegal 2009). Recently, the concept of variation stabilization has been extended to include numerous genes that are not molecular chaperones but possess extremely diverse functions; it seems that stabilization is a general property of interaction networks, so that disruption of almost any highly connected node reduces robustness of the system and leads to increased variation (Bergman and Siegal 2003). A comprehensive study of such “capacitor” properties of yeast mutants revealed approximately 300 genes (about 6% of the total) whose disruption significantly decreased the robustness of yeast to environmental perturbations (Levy and Siegal 2008).
Thus, robustness might be a major, selectable mechanism that counteracts evolutionary constraints, in particular, those caused by the interaction between mutations, and enhances plasticity.

2.9

Effective Population Size as the General Determinant of Evolutionary Constraints and Distinction Between Constraints and Neutral Conservation

The classic population genetics theory asserts that the effectiveness of purifying selection is proportional to the effective population size of the given organism (assuming a uniform mutation rate for simplicity). In other words, only those mutational changes can be fixed or efficiently eliminated during evolution for which s > 1/Ne, where s is the selection coefficient and Ne is the effective population size (Lynch 2007c). Conversely, mutations with s < 1/Ne are effectively “invisible” to selection. This simple dependence seems to be an important, possibly, the primary determinant of the constraints that affect different aspects of genome and phenome evolution. In particular, differences in Ne seem to underlie the qualitative difference in the genome architectures of unicellular and multicellular organisms


described above (Lynch and Conery 2003; Lynch 2007b). Substantial genome expansion seems to be attainable only in organisms with small populations and the attendant weak selection, such as plants and animals. In these organisms, the deleterious effect of propagation of nonfunctional sequences is often too small to allow their “detection” and elimination by purifying selection. Accordingly, evolutionary conservation does not automatically imply that the conserved feature is constrained by purifying selection but rather, somewhat paradoxically, can reflect weak purifying selection that is insufficient to eliminate nonadaptive ancestral features. Evolution of the exon–intron gene structure in eukaryotes provides an excellent case in point for this population-genetic paradigm. Most of the introns do not appear to possess a distinct function but do require distinct splicing signals for transcript maturation to occur accurately. Thus, approximately 25 nucleotides per intron are subject to purifying selection of varying strength (Lynch 2006a). Because of the associated cost of selection and also owing to the expenditure of time and energy on replication and transcription of intronic sequences, functionless introns are weakly deleterious for the respective organisms. However, a simple estimate taking into account the characteristic mutation rates in eukaryotes shows that the deleterious effect of introns is “visible” to purifying selection only in relatively large populations with Ne on the order of 10⁷ or greater. This is the characteristic range of effective population sizes of unicellular eukaryotes, whereas multicellular eukaryotes typically have smaller populations (Lynch and Conery 2003; Lynch 2006a, 2007c). The effect of these differences on the evolution of genome architecture in eukaryotes is dramatic.
Unlike genomes of unicellular forms that typically contain less than one intron per gene, and in many cases, only a few introns in the entire genome, plants and animals possess numerous introns, up to 8 per gene in vertebrates (Roy and Gilbert 2006). The positions of many introns are conserved in orthologous genes of animals and plants (see above), that is, most likely, since the time of existence of the last common ancestor of the extant eukaryotes. However, there seems to be no reason to claim that, in general, the positions of introns are constrained during evolution. The conservation of intron positions appears to be due to the weak purifying selection that precludes efficient elimination of introns in organisms with small characteristic values of Ne. Beyond the sheer number of introns, the features of introns themselves drastically differ: all the introns in intron-poor genomes of unicellular eukaryotes are short, with tightly controlled lengths and highly conserved, optimized splice signals at exon–intron junctions (Irimia et al. 2007; Irimia and Roy 2008). By contrast, introns in intron-rich genomes, such as those of plants and animals, are often long (especially in vertebrates) and are bounded by relatively weak, suboptimal splice signals owing to the relatively weak selection favoring strong splicing signals (Irimia et al. 2009). The existence of these long introns with weak splice signals, which yield relatively inaccurate splicing, provides for the evolution of alternative splicing and nested gene structures, the crucial factors of structural and regulatory diversification of proteins and RNAs in multicellular eukaryotes.
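The population-genetic argument above combines two ingredients: the criterion s > 1/Ne and the number of constrained nucleotides per intron. A minimal sketch of the arithmetic follows. It assumes that the selective disadvantage of an intron is approximately the per-generation rate at which mutations hit its constrained sites, so that s is about the number of sites times the per-site mutation rate. This approximation, the mutation rate of 4 × 10⁻⁹ per site per generation (chosen only so that the threshold falls on the order of magnitude quoted in the text), the example population sizes, and the function names are all assumptions made for illustration.

```python
def visible_to_selection(s, effective_population_size):
    """Criterion from the text: selection is effective when s > 1/Ne."""
    return s > 1.0 / effective_population_size

def intron_selection_coefficient(sites_per_intron, mutation_rate_per_site):
    """Assumed approximation: s is about the per-generation rate at which
    mutations hit the sites required for accurate splicing."""
    return sites_per_intron * mutation_rate_per_site

# 25 constrained sites per intron is from the text;
# the mutation rate is an assumed, illustrative value.
s_intron = intron_selection_coefficient(25, 4e-9)
threshold_ne = 1.0 / s_intron

# Hypothetical effective population sizes for comparison.
large_population = visible_to_selection(s_intron, 1e8)
small_population = visible_to_selection(s_intron, 1e5)
print(f"{threshold_ne:.3g}", large_population, small_population)
```

Under these assumptions, the cost of an intron is visible to selection at an effective population size of 10⁸ but not at 10⁵, consistent with the qualitative contrast between unicellular and multicellular eukaryotes drawn in the text.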


The case of intron evolution illustrates the crucial interplay of constraints and plasticity that is central to the evolution of genomes and molecular phenomes (Fig. 2.7). Effective population size determines the background strength of purifying selection (constraints). When Ne is small, as in multicellular eukaryotes, constraints are relatively weak, so plasticity is enhanced such that nonfunctional genomic elements like introns can be retained, the result being a system that is relatively inefficient and vulnerable to random factors that can cause extinction, but also possesses a high potential for evolutionary innovation. Conversely, when Ne is large, as in most prokaryotes, many aspects of evolution are strongly constrained although there is still much plasticity in the evolution of these organisms thanks to dynamic, effectively neutral processes, in particular, HGT. Its fundamental importance notwithstanding, it is important to keep in mind that Ne determines the course of evolution only on a coarse grain scale. Thus, a comparative analysis of the Kn/Ks values among prokaryotic lineages failed to detect a negative correlation between selective constraints and genome size, as implied by the straightforward population genetic perspective (Lynch 2006b). On the contrary, larger genomes tend to evolve under stronger constraints (even when only free-living microbes are analyzed) suggesting that lifestyle could be a critical determinant of genome evolution (favoring, in particular, gene acquisition via HGT in variable environments) independent of Ne (Jordan et al. 2002; Novichkov et al. 2009b).

[Figure 2.7 (image not reproduced). Axis labels: constraints (strong to weak), plasticity (low to high), level of organization. Other labels recoverable from the figure: molecular structure and dynamics; local genome context; genome-scale gene order; genome architecture; molecular phenomics; functional and folding-critical sites; intron donor and acceptor sites; typical protein sites; typical regulatory sites; protein function; operons and gene clusters; protein abundance; synonymous sites in CDS; gene islands and superoperons; mRNA abundance; gene neighborhoods; disordered segments; introns; functional and regulatory networks; “junk” genome]

Fig. 2.7 Genomic and phenomic constraints operative at different levels of biological organization. The scales are rough approximations

2.10

Conclusions: Selective and Neutral Constraints and Evolutionary Universals

The prevailing theme that emerges from the recent advances of evolutionary genomics and evolutionary systems biology is the plurality of constraints that affect the evolution of different types of sequences in any genome, genome architectures, and molecular phenomes (Fig. 2.7) along with major differences of evolutionary regimens between taxa. Nevertheless, beyond this diversity, comparative-genomic and molecular phenomic analysis reveals universal patterns that at least in some cases are compatible with relatively simple and general models of evolution. As discussed here, such models start to suggest simple, fundamental causes underlying important aspects of evolution such as the constraints on evolution of proteins and evolution of gene repertoire (Table 2.1). In this context, it seems appropriate to expand the notion of constraints to include not only selective but also “neutral” constraints that are determined by nonselective, stochastic properties of biological systems and are often amenable to modeling using techniques borrowed from statistical physics (Table 2.1) (Frank 2009; Koonin 2009b). Evolutionary trajectories in the sequence space seem to be strongly constrained, thus substantially limiting the “tinkering potential” of evolution, using the famous metaphor of Jacob (Jacob 1977). 
The evolutionary process thus appears to be a compromise “between design and bricolage” (Wilkins 2007), the design aspect brought about by constraints (certainly having nothing to do with any intelligence) and the bricolage stemming from the evolved robustness and the ensuing plasticity of evolving organisms. Comparative genomics and systems approaches transform evolutionary biology into a much more complex but also more precise, quantitative field than it was in the twentieth century. Next generation sequencing, quantitative proteomics, and other systemic approaches, combined with more specific approaches of experimental evolution, can be expected to reveal the specific, precise constraints affecting diverse aspects of genome and phenome evolution.

Table 2.1 Universals of genome and molecular phenome evolution

Universal pattern | Putative underlying process/model | Nature of relevant constraints | References
Approximately log-normal distribution of evolutionary rates of protein-coding genes | Protein folding | Selective: protein robustness to misfolding | (Wolf et al. 2009; Lobkovsky et al. 2010)
Anticorrelation between evolution rate and expression level (translation rate) of protein-coding genes | Protein folding | Selective: protein robustness to misfolding dependent on translation rate | (Drummond and Wilke 2008, 2009; Wolf et al. 2010)
Distinct scaling laws for different functional classes of genes | “Toolbox”-like growth of metabolic networks | Neutral | (van Nimwegen 2003; Maslov et al. 2009; Molina and van Nimwegen 2009)
Power law like distribution of paralogous gene family size | Birth and death process of gene family evolution | Neutral | (Karev et al. 2002; Koonin et al. 2002)
Power law like distribution of node degree in interaction and coexpression networks | Network evolution by preferential attachments | Neutral | (Barabasi and Oltvai 2004; Tsaparas et al. 2006)

References

Altenhoff AM, Dessimoz C (2009) Phylogenetic and functional assessment of orthologs inference projects and methods. PLoS Comput Biol 5:e1000262
Aravind L, Koonin EV (1998) Phosphoesterase domains associated with DNA polymerases of diverse origins. Nucleic Acids Res 26:3746–3752
Baira E, Greshock J, Coukos G, Zhang L (2008) Ultraconserved elements: genomics, function and disease. RNA Biol 5:132–134
Barabasi AL, Oltvai ZN (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5:101–113
Barrick JE, Lenski RE (2009) Genome-wide mutational diversity in an evolving population of Escherichia coli. Cold Spring Harb Symp Quant Biol 16:345–355
Barrick JE, Yu DS, Yoon SH, Jeong H, Oh TK, Schneider D, Lenski RE, Kim JF (2009) Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature 461:1243–1247
Basu MK, Carmel L, Rogozin IB, Koonin EV (2008) Evolution of protein domain promiscuity in eukaryotes. Genome Res 18:449–461
Basu MK, Poliakov E, Rogozin IB (2009) Domain mobility in proteins: functional and evolutionary implications. Brief Bioinform 10:205–216
Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS, Haussler D (2004) Ultraconserved elements in the human genome. Science 304:1321–1325
Bergman A, Siegal ML (2003) Evolutionary capacitance as a general feature of complex gene networks. Nature 424:549–552
Bergmann S, Ihmels J, Barkai N (2004) Similarities and differences in genome-wide expression data of six organisms. PLoS Biol 2:E9
Bertone P, Stolc V, Royce TE, Rozowsky JS, Urban AE, Zhu X, Rinn JL, Tongprasit W, Samanta M, Weissman S, Gerstein M, Snyder M (2004) Global identification of human transcribed sequences with genome tiling arrays. Science 306:2242–2246
Blount ZD, Borland CZ, Lenski RE (2008) Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. Proc Natl Acad Sci USA 105:7899–7906
Blumenthal T (2004) Operons in eukaryotes. Brief Funct Genomic Proteomic 3:199–211
Bowen NJ, Jordan IK (2007) Exaptation of protein coding sequences from transposable elements. Genome Dyn 3:147–162
Carmel L, Rogozin IB, Wolf YI, Koonin EV (2007) Patterns of intron gain and conservation in eukaryotic genes. BMC Evol Biol 7:192
Carthew RW, Sontheimer EJ (2009) Origins and mechanisms of miRNAs and siRNAs. Cell 136:642–655
Charlebois RL, Doolittle WF (2004) Computing prokaryotic gene ubiquity: rescuing the core from extinction. Genome Res 14:2469–2477

Charlesworth J, Eyre-Walker A (2008) The McDonald–Kreitman test and slightly deleterious mutations. Mol Biol Evol 25:1007–1015
Costa FF (2005) Non-coding RNAs: new players in eukaryotic biology. Gene 357:83–94
Dandekar T, Snel B, Huynen M, Bork P (1998) Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem Sci 23:324–328
Darby AC, Cho NH, Fuxelius HH, Westberg J, Andersson SG (2007) Intracellular pathogens go extreme: genome evolution in the Rickettsiales. Trends Genet 23:511–520
Denver DR, Morris K, Streelman JT, Kim SK, Lynch M, Thomas WK (2005) The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nat Genet 37:544–548
Dermitzakis ET, Reymond A, Antonarakis SE (2005) Conserved non-genic sequences – an unexpected feature of mammalian genomes. Nat Rev Genet 6:151–157
Dobrindt U, Agerer F, Michaelis K, Janka A, Buchrieser C, Samuelson M, Svanborg C, Gottschalk G, Karch H, Hacker J (2003) Analysis of genome plasticity in pathogenic and commensal Escherichia coli isolates by use of DNA arrays. J Bacteriol 185:1831–1840
Doolittle WF, Sapienza C (1980) Selfish genes, the phenotype paradigm and genome evolution. Nature 284:601–603
Drake JA, Bird C, Nemesh J, Thomas DJ, Newton-Cheh C, Reymond A, Excoffier L, Attar H, Antonarakis SE, Dermitzakis ET, Hirschhorn JN (2006) Conserved noncoding sequences are selectively constrained and not mutation cold spots. Nat Genet 38:223–227
Drummond DA, Wilke CO (2008) Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell 134:341–352
Drummond DA, Wilke CO (2009) The evolutionary consequences of erroneous protein synthesis. Nat Rev Genet 10:715–724
Drummond DA, Bloom JD, Adami C, Wilke CO, Arnold FH (2005) Why highly expressed proteins evolve slowly. Proc Natl Acad Sci USA 102:14338–14343
Duret L, Dorkeld F, Gautier C (1993) Strong conservation of non-coding sequences during vertebrates evolution: potential involvement in post-transcriptional regulation of gene expression. Nucleic Acids Res 21:2315–2322
Eisen JA, Heidelberg JF, White O, Salzberg SL (2000) Evidence for symmetric chromosomal inversions around the replication origin in bacteria. Genome Biol 1(6):RESEARCH0011
Elgar G (2009) Pan-vertebrate conserved non-coding sequences associated with developmental regulation. Brief Funct Genomic Proteomic 8:256–265
Ellegren H (2008) Comparative genomics and the study of evolution by natural selection. Mol Ecol 17:4586–4596
Ellegren H, Smith NG, Webster MT (2003) Mutation rate variation in the mammalian genome. Curr Opin Genet Dev 13:562–568
Eyre-Walker A, Keightley PD (2009) Estimating the rate of adaptive molecular evolution in the presence of slightly deleterious mutations and population size change. Mol Biol Evol 26:2097–2108
Fedorov A, Merican AF, Gilbert W (2002) Large-scale comparison of intron positions among animal, plant, and fungal genes. Proc Natl Acad Sci USA 99:16128–16133
Frank SA (2009) The common patterns of nature. J Evol Biol 22:1563–1585
Gilad Y, Oshlack A, Rifkin SA (2006) Natural selection on gene expression. Trends Genet 22:456–461
Grishin NV, Wolf YI, Koonin EV (2000) From complete genomes to measures of substitution rate variability within and between proteins. Genome Res 10:991–1000
Harrison PM, Gerstein M (2002) Studying genomes through the aeons: protein families, pseudogenes and proteome evolution. J Mol Biol 318:1155–1174
Hurst LD (2002) The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet 18:486
Hurst LD, Pal C, Lercher MJ (2004) The evolutionary dynamics of eukaryotic gene order. Nat Rev Genet 5:299–310

2

Constraints, Plasticity, and Universal Patterns in Genome and Phenome Evolution

43

Irimia M, Roy SW (2008) Evolutionary convergence on highly-conserved 3’ intron structures in intron-poor eukaryotes and insights into the ancestral eukaryotic genome. PLoS Genet 4: e1000148 Irimia M, Penny D, Roy SW (2007) Coevolution of genomic intron number and splice sites. Trends Genet 23:321–325 Irimia M, Roy SW, Neafsey DE, Abril JF, Garcia-Fernandez J, Koonin EV (2009) Complex selection on 5’ splice sites in intron-rich organisms. Genome Res 19:2021–2027 Jacob F (1977) Evolution and tinkering. Science 196:1161–1166 Johnson JM, Edwards S, Shoemaker D, Schadt EE (2005) Dark matter in the genome: evidence of widespread transcription detected by microarray tiling experiments. Trends Genet 21:93–102 Jordan IK, Rogozin IB, Wolf YI, Koonin EV (2002) Microevolutionary genomics of bacteria. Theor Popul Biol 61:435–447 Jordan IK, Rogozin IB, Glazko GV, Koonin EV (2003) Origin of a substantial fraction of human regulatory sequences from transposable elements. Trends Genet 19:68–72 Jordan IK, Marino-Ramirez L, Wolf YI, Koonin EV (2004) Conservation and coevolution in the scale-free human gene coexpression network. Mol Biol Evol 21:2058–2070 Jordan IK, Marino-Ramirez L, Koonin EV (2005) Evolutionary significance of gene expression divergence. Gene 345:119–126 Jordan IK, Katz LS, Denver DR, Streelman JT (2008) Natural selection governs local, but not global, evolutionary gene coexpression networks in Caenorhabditis elegans. BMC Syst Biol 2:96 Karev GP, Wolf YI, Rzhetsky AY, Berezovskaya FS, Koonin EV (2002) Birth and death of protein domains: a simple model of evolution explains power law behavior. BMC Evol Biol 2:18 Kassen R (2009) Toward a general theory of adaptive radiation: insights from microbial experimental evolution. Ann N Y Acad Sci 1168:3–22 Katzman S, Kern AD, Bejerano G, Fewell G, Fulton L, Wilson RK, Salama SR, Haussler D (2007) Human genome ultraconserved elements are ultraselected. 
Science 317:915 Kazakov AE, Rodionov DA, Alm E, Arkin AP, Dubchak I, Gelfand MS (2009) Comparative genomics of regulation of fatty acid and branched-chain amino acid utilization in proteobacteria. J Bacteriol 191:52–64 Kelly C, Churchill GA (1996) Biases in amino acid replacement matrices and alignment scores due to rate heterogeneity. J Comput Biol 3:307–318 Khachane AN, Harrison PM (2009) Assessing the genomic evidence for conserved transcribed pseudogenes under selection. BMC Genomics 10:435 Khaitovich P, Weiss G, Lachmann M, Hellmann I, Enard W, Muetzel B, Wirkner U, Ansorge W, Paabo S (2004) A neutral model of transcriptome evolution. PLoS Biol 2:E132 Kimura M (1983) The neutral theory of molecular evolution. Cambridge University Press, Cambridge Kitano H (2002) Computational systems biology. Nature 420:206–210 Koonin EV (2003) Comparative genomics, minimal gene-sets and the last universal common ancestor. Nat Rev Microbiol 1:127–136 Koonin EV (2005) Orthologs, paralogs, and evolutionary genomics. Annu Rev Genet 39:309–338 Koonin EV (2009a) Evolution of genome architecture. Int J Biochem Cell Biol 41:298–306 Koonin EV (2009b) Darwinian evolution in the light of genomics. Nucleic Acids Res 37:1011–1034 Koonin EV, Wolf YI (2006) Evolutionary systems biology: links between gene evolution and function. Curr Opin Biotechnol 17:481–487 Koonin EV, Wolf YI (2008) Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world. Nucleic Acids Res 36(21):6688–6719 Koonin EV, Mushegian AR, Bork P (1996) Non-orthologous gene displacement. Trends Genet 12:334–336 Koonin EV, Aravind L, Kondrashov AS (2000) The impact of comparative genomics on our understanding of evolution. Cell 101:573–576

44

E.V. Koonin and Y.I. Wolf

Koonin EV, Wolf YI, Karev GP (2002) The structure of the protein universe and genome evolution. Nature 420:218–223 Kramer EB, Farabaugh PJ (2007) The frequency of translational misreading errors in E. coli is largely determined by tRNA competition. RNA 13:87–96 Krylov DM, Wolf YI, Rogozin IB, Koonin EV (2003) Gene loss, protein sequence divergence, gene dispensability, expression level, and interactivity are correlated in eukaryotic evolution. Genome Res 13:2229–2235 Lawrence J (1999) Selfish operons: the evolutionary impact of gene clustering in prokaryotes and eukaryotes. Curr Opin Genet Dev 9:642–648 Lawrence JG, Roth JR (1996) Selfish operons: horizontal transfer may drive the evolution of gene clusters. Genetics 143:1843–1860 Lemons D, McGinnis W (2006) Genomic evolution of Hox gene clusters. Science 313:1918–1922 Lespinet O, Wolf YI, Koonin EV, Aravind L (2002) The role of lineage-specific gene family expansion in the evolution of eukaryotes. Genome Res 12:1048–1059 Levy SF, Siegal ML (2008) Network hubs buffer environmental variation in Saccharomyces cerevisiae. PLoS Biol 6:e264 Ling X, He X, Xin D (2009) Detecting gene clusters under evolutionary constraint in a large number of genomes. Bioinformatics 25:571–577 Lobkovsky AE, Wolf YI, Koonin EV (2010) Universal distribution of protein evolution rates as a consequence of protein folding physics. Proc Natl Acad Sci USA 107(7):2983–2988, doi: 10.1073/pnas.0910445107 Loewe L (2009) A framework for evolutionary systems biology. BMC Syst Biol 3:27 Lozada-Chavez I, Janga SC, Collado-Vides J (2006) Bacterial regulatory networks are extremely flexible in evolution. Nucleic Acids Res 34:3434–3445 Lunter G, Ponting CP, Hein J (2006) Genome-wide identification of human functional DNA using a neutral indel model. PLoS Comput Biol 2:e5 Lynch M (2006a) The origins of eukaryotic gene structure. Mol Biol Evol 23:450–468 Lynch M (2006b) Streamlining and simplification of microbial genome architecture. 
Annu Rev Microbiol 60:327–349 Lynch M (2007a) The evolution of genetic networks by non-adaptive processes. Nat Rev Genet 8:803–813 Lynch M (2007b) The frailty of adaptive hypotheses for the origins of organismal complexity. Proc Natl Acad Sci USA 104(Suppl 1):8597–8604 Lynch M (2007c) The origins of genome architecture. Sinauer Associates, Sunderland, MA Lynch M, Conery JS (2003) The origins of genome complexity. Science 302:1401–1404 Makalowski W, Boguski MS (1998) Synonymous and nonsynonymous substitution distances are correlated in mouse and rat genes. J Mol Evol 47:119–121 Mani GS, Clarke BC (1990) Mutational order: a major stochastic process in evolution. Proc R Soc Lond B Biol Sci 240:29–37 Margulies EH, Cooper GM, Asimenos G, Thomas DJ, Dewey CN, Siepel A, Birney E, Keefe D, Schwartz AS, Hou M, Taylor J, Nikolaev S, Montoya-Burgos JI, Loytynoja A, Whelan S, Pardi F, Massingham T, Brown JB, Bickel P, Holmes I, Mullikin JC, Ureta-Vidal A, Paten B, Stone EA, Rosenbloom KR, Kent WJ, Bouffard GG, Guan X, Hansen NF, Idol JR, Maduro VV, Maskeri B, McDowell JC, Park M, Thomas PJ, Young AC, Blakesley RW, Muzny DM, Sodergren E, Wheeler DA, Worley KC, Jiang H, Weinstock GM, Gibbs RA, Graves T, Fulton R, Mardis ER, Wilson RK, Clamp M, Cuff J, Gnerre S, Jaffe DB, Chang JL, Lindblad-Toh K, Lander ES, Hinrichs A, Trumbower H, Clawson H, Zweig A, Kuhn RM, Barber G, Harte R, Karolchik D, Field MA, Moore RA, Matthewson CA, Schein JE, Marra MA, Antonarakis SE, Batzoglou S, Goldman N, Hardison R, Haussler D, Miller W, Pachter L, Green ED, Sidow A (2007) Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res 17:760–774 Masel J, Siegal ML (2009) Robustness: mechanisms and consequences. Trends Genet 25:395–403 Maslov S, Krishna S, Pang TY, Sneppen K (2009) Toolbox model of evolution of prokaryotic metabolic networks and their regulation. Proc Natl Acad Sci USA 106:9743–9748

2

Constraints, Plasticity, and Universal Patterns in Genome and Phenome Evolution

45

Mayrose I, Friedman N, Pupko T (2005) A gamma mixture model better accounts for among site rate heterogeneity. Bioinformatics 21(Suppl 2):ii151–ii158 Medina M (2005) Genomes, phylogeny, and evolutionary systems biology. Proc Natl Acad Sci USA 102(Suppl 1):6630–6635 Molina N, van Nimwegen E (2008) Universal patterns of purifying selection at noncoding positions in bacteria. Genome Res 18:148–160 Molina N, van Nimwegen E (2009) Scaling laws in functional genome content across prokaryotic clades and lifestyles. Trends Genet 25:243–247 Monot M, Honore N, Garnier T, Zidane N, Sherafi D, Paniz-Mondolfi A, Matsuoka M, Taylor GM, Donoghue HD, Bouwman A, Mays S, Watson C, Lockwood D, Khamispour A, Dowlati Y, Jianping S, Rea TH, Vera-Cabrera L, Stefani MM, Banu S, Macdonald M, Sapkota BR, Spencer JS, Thomas J, Harshman K, Singh P, Busso P, Gattiker A, Rougemont J, Brennan PJ, Cole ST (2009) Comparative genomic and phylogeographic analysis of Mycobacterium leprae. Nat Genet 41:1282–1289 Moya A, Gil R, Latorre A, Pereto J, Pilar Garcillan-Barcia M, de la Cruz F (2009) Toward minimal bacterial cells: evolution vs. design. FEMS Microbiol Rev 33:225–235 Mushegian AR, Koonin EV (1996a) Gene order is not conserved in bacterial evolution. Trends Genet 12:289–290 Mushegian AR, Koonin EV (1996b) A minimal gene set for cellular life derived by comparison of complete bacterial genomes [see comments]. Proc Natl Acad Sci USA 93:10268–10273 Mustonen V, Lassig M (2009) From fitness landscapes to seascapes: non-equilibrium dynamics of selection and adaptation. Trends Genet 25:111–119 Mustonen V, Lassig M (2010) Fitness flux and ubiquity of adaptive evolution. Proc Natl Acad Sci USA 107(9):4248–4253 Muzzi A, Moschioni M, Covacci A, Rappuoli R, Donati C (2008) Pilus operon evolution in Streptococcus pneumoniae is driven by positive selection and recombination. 
PLoS ONE 3:e3660 Nakabachi A, Yamashita A, Toh H, Ishikawa H, Dunbar HE, Moran NA, Hattori M (2006) The 160-kilobase genome of the bacterial endosymbiont Carsonella. Science 314:267 Nielsen R (2001) Statistical tests of selective neutrality in the age of genomics. Heredity 86:641–647 Nielsen R (2005) Molecular signatures of natural selection. Annu Rev Genet 39:197–218 Nielsen R, Bustamante C, Clark AG, Glanowski S, Sackton TB, Hubisz MJ, Fledel-Alon A, Tanenbaum DM, Civello D, White TJ, Sninsky JJ, Adams MD, Cargill M (2005) A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol 3(6):e170 Nilsen TW, Graveley BR (2010) Expansion of the eukaryotic proteome by alternative splicing. Nature 463:457–463 Novais A, Comas I, Baquero F, Canton R, Coque TM, Moya A, Gonzalez-Candelas F, Galan JC (2010) Evolutionary trajectories of beta-lactamase CTX-M-1 cluster enzymes: predicting antibiotic resistance. PLoS Pathog 6(1):e1000735 Novichkov PS, Ratnere I, Wolf YI, Koonin EV, Dubchak I (2009a) ATGC: a database of orthologous genes from closely related prokaryotic genomes and a research platform for microevolution of prokaryotes. Nucleic Acids Res 37:D448–D454 Novichkov PS, Wolf YI, Dubchak I, Koonin EV (2009b) Trends in prokaryotic evolution revealed by comparison of closely related bacterial and archaeal genomes. J Bacteriol 191:65–73 Ohno S (1970) Evolution by gene duplication. Springer-Verlag, Berlin-Heidelberg-New York Osbourn AE, Field B (2009) Operons. Cell Mol Life Sci 66:3755–3775 Pal C, Papp B, Hurst LD (2001) Highly expressed genes in yeast evolve slowly. Genetics 158:927–931 Pal C, Papp B, Lercher MJ (2006) An integrated view of protein evolution. Nat Rev Genet 7:337–348 Parsch J, Novozhilov S, Saminadin-Peter SS, Wong KM and Andolfatto P (2010) On the utility of short intron sequences as a reference for the detection of positive and negative selection in Drosophila. Mol Biol Evol [Epub ahead of print]

46

E.V. Koonin and Y.I. Wolf

Petersen L, Bollback JP, Dimmic M, Hubisz M, Nielsen R (2007) Genes under positive selection in Escherichia coli. Genome Res 17:1336–1343 Ponjavic J, Ponting CP, Lunter G (2007) Functionality or transcriptional noise? Evidence for selection within long noncoding RNAs. Genome Res 17:556–565 Ponting CP, Oliver PL, Reik W (2009) Evolution and functions of long noncoding RNAs. Cell 136:629–641 Proux E, Studer RA, Moretti S, Robinson-Rechavi M (2009) Selectome: a database of positive selection. Nucleic Acids Res 37:D404–D407 Putnam NH, Srivastava M, Hellsten U, Dirks B, Chapman J, Salamov A, Terry A, Shapiro H, Lindquist E, Kapitonov VV, Jurka J, Genikhovich G, Grigoriev IV, Lucas SM, Steele RE, Finnerty JR, Technau U, Martindale MQ, Rokhsar DS (2007) Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science 317:86–94 Queitsch C, Sangster TA, Lindquist S (2002) Hsp90 as a capacitor of phenotypic variation. Nature 417:618–624 Resch AM, Carmel L, Marino-Ramirez L, Ogurtsov AY, Shabalina SA, Rogozin IB, Koonin EV (2007) Widespread positive selection in synonymous sites of mammalian genes. Mol Biol Evol 24:1821–1831 Rocha EP (2008) The organization of the bacterial genome. Annu Rev Genet 42:211–233 Rogozin IB, Makarova KS, Murvai J, Czabarka E, Wolf YI, Tatusov RL, Szekely LA, Koonin EV (2002) Connected gene neighborhoods in prokaryotic genomes. Nucleic Acids Res 30:2212–2223 Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV (2003) Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr Biol 13:1512–1517 Roy SW, Gilbert W (2006) The evolution of spliceosomal introns: patterns, puzzles and progress. Nat Rev Genet 7:211–221 Roy SW, Penny D (2007) Patterns of intron loss and gain in plants: intron loss-dominated evolution and genome-wide comparison of O. sativa and A. thaliana. 
Mol Biol Evol 24:171–181 Schrimpf SP, Weiss M, Reiter L, Ahrens CH, Jovanovic M, Malmstrom J, Brunner E, Mohanty S, Lercher MJ, Hunziker PE, Aebersold R, von Mering C, Hengartner MO (2009) Comparative functional analysis of the Caenorhabditis elegans and Drosophila melanogaster proteomes. PLoS Biol 7:e48 Sella G, Petrov DA, Przeworski M, Andolfatto P (2009) Pervasive natural selection in the Drosophila genome? PLoS Genet 5:e1000495 Shabalina SA, Kondrashov AS (1999) Pattern of selective constraint in C. elegans and C. briggsae genomes. Genet Res 74:23–30 Shabalina SA, Koonin EV (2008) Origins and evolution of eukaryotic RNA interference. Trends Ecol Evol 23:578–587 Shabalina SA, Ogurtsov AY, Rogozin IB, Koonin EV, Lipman DJ (2004) Comparative analysis of orthologous eukaryotic mRNAs: potential hidden functional signals. Nucleic Acids Res 32:1774–1782 Srivastava M, Begovic E, Chapman J, Putnam NH, Hellsten U, Kawashima T, Kuo A, Mitros T, Salamov A, Carpenter ML, Signorovitch AY, Moreno MA, Kamm K, Grimwood J, Schmutz J, Shapiro H, Grigoriev IV, Buss LW, Schierwater B, Dellaporta SL, Rokhsar DS (2008) The Trichoplax genome and the nature of placozoans. Nature 454:955–960 Stanek MT, Cooper TF, Lenski RE (2009) Identification and dynamics of a beneficial mutation in a long-term evolution experiment with Escherichia coli. BMC Evol Biol 9:302 Tatusov RL, Koonin EV, Lipman DJ (1997) A genomic perspective on protein families. Science 278:631–637 Tsaparas P, Marino-Ramirez L, Bodenreider O, Koonin EV, Jordan IK (2006) Global similarity and local divergence in human and mouse gene co-expression networks. BMC Biol 6:70 Turner LM, Chuong EB, Hoekstra HE (2008) Comparative analysis of testis protein evolution in rodents. Genetics 179:2075–2089

2

Constraints, Plasticity, and Universal Patterns in Genome and Phenome Evolution

47

van Nimwegen E (2003) Scaling laws in the functional content of genomes. Trends Genet 19:479–484 Wagner A (2005) Robustness, evolvability, and neutrality. FEBS Lett 579:1772–1778 Wagner A (2008) Neutralism and selectionism: a network-based reconciliation. Nat Rev Genet 9:965–974 Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, An P, Antonarakis SE, Attwood J, Baertsch R, Bailey J, Barlow K, Beck S, Berry E, Birren B, Bloom T, Bork P, Botcherby M, Bray N, Brent MR, Brown DG, Brown SD, Bult C, Burton J, Butler J, Campbell RD, Carninci P, Cawley S, Chiaromonte F, Chinwalla AT, Church DM, Clamp M, Clee C, Collins FS, Cook LL, Copley RR, Coulson A, Couronne O, Cuff J, Curwen V, Cutts T, Daly M, David R, Davies J, Delehaunty KD, Deri J, Dermitzakis ET, Dewey C, Dickens NJ, Diekhans M, Dodge S, Dubchak I, Dunn DM, Eddy SR, Elnitski L, Emes RD, Eswara P, Eyras E, Felsenfeld A, Fewell GA, Flicek P, Foley K, Frankel WN, Fulton LA, Fulton RS, Furey TS, Gage D, Gibbs RA, Glusman G, Gnerre S, Goldman N, Goodstadt L, Grafham D, Graves TA, Green ED, Gregory S, Guigo´ R, Guyer M, Hardison RC, Haussler D, Hayashizaki Y, Hillier LW, Hinrichs A, Hlavina W, Holzer T, Hsu F, Hua A, Hubbard T, Hunt A, Jackson I, Jaffe DB, Johnson LS, Jones M, Jones TA, Joy A, Kamal M, Karlsson EK, Karolchik D, Kasprzyk A, Kawai J, Keibler E, Kells C, Kent WJ, Kirby A, Kolbe DL, Korf I, Kucherlapati RS, Kulbokas EJ, Kulp D, Landers T, Leger JP, Leonard S, Letunic I, Levine R, Li J, Li M, Lloyd C, Lucas S, Ma B, Maglott DR, Mardis ER, Matthews L, Mauceli E, Mayer JH, McCarthy M, McCombie WR, McLaren S, McLay K, McPherson JD, Meldrim J, Meredith B, Mesirov JP, Miller W, Miner TL, Mongin E, Montgomery KT, Morgan M, Mott R, Mullikin JC, Muzny DM, Nash WE, Nelson JO, Nhan MN, Nicol R, Ning Z, Nusbaum C, O’Connor MJ, Okazaki Y, Oliver K, Overton-Larty E, Pachter L, Parra G, Pepin KH, Peterson J, Pevzner P, Plumb R, Pohl CS, Poliakov A, Ponce TC, 
Ponting CP, Potter S, Quail M, Reymond A, Roe BA, Roskin KM, Rubin EM, Rust AG, Santos R, Sapojnikov V, Schultz B, Schultz J, Schwartz MS, Schwartz S, Scott C, Seaman S, Searle S, Sharpe T, Sheridan A, Shownkeen R, Sims S, Singer JB, Slater G, Smit A, Smith DR, Spencer B, Stabenau A, Stange-Thomann N, Sugnet C, Suyama M, Tesler G, Thompson J, Torrents D, Trevaskis E, Tromp J, Ucla C, Ureta-Vidal A, Vinson JP, Von Niederhausern AC, Wade CM, Wall M, Weber RJ, Weiss RB, Wendl MC, West AP, Wetterstrand K, Wheeler R, Whelan S, Wierzbowski J, Willey D, Williams S, Wilson RK, Winter E, Worley KC, Wyman D, Yang S, Yang SP, Zdobnov EM, Zody MC, Lander ES (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420:520–562 Weinreich DM, Delaney NF, Depristo MA, Hartl DL (2006) Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312:111–114 Wilkins AS (2007) Between “design” and “bricolage”: genetic networks, levels of selection, and adaptive evolution. Proc Natl Acad Sci USA 104(Suppl 1):8590–8596 Wolf YI, Carmel L, Koonin EV (2006) Unifying measures of gene function and evolution. Proc Biol Sci 273:1507–1515 Wolf YI, Novichkov PS, Karev GP, Koonin EV, Lipman DJ (2009) The universal distribution of evolutionary rates of genes and distinct characteristics of eukaryotic genes of different apparent ages. Proc Natl Acad Sci USA 106:7273–7280 Wolf YI, Gopich IV, Lipman DJ, Koonin EV (2010) Relative contributions of intrinsic structuralfunctional constraints and translation rate to the evolution of protein-coding genes. Genome Biol Evol 2010:190–199 Worth CL, Gong S, Blundell TL (2009) Structural and functional constraints in the evolution of protein families. Nat Rev Mol Cell Biol 10:709–720 Wuchty S, Almaas E (2005) Evolutionary cores of domain co-occurrence networks. BMC Evol Biol 5:24 Yamada T, Bork P (2009) Evolution of biomolecular networks: lessons from metabolic and protein interactions. 
Nat Rev Mol Cell Biol 10:791–803 Zhou T, Weems M, Wilke CO (2009) Translationally optimal codons associate with structurally sensitive sites in proteins. Mol Biol Evol 26:1571–1580

Chapter 3

Starvation-Induced Reproductive Isolation in Yeast

Eugene Kroll, R. Frank Rosenzweig, and Barbara Dunn

Abstract Speciation in eukaryotes is one of the central issues in evolutionary biology. Retrospective studies of existing species may not reveal the molecular events underlying speciation, as it is frequently impossible to distinguish changes that preceded speciation from those that occurred afterward. We propose a model for experimental speciation using a well-studied eukaryotic organism, the yeast Saccharomyces cerevisiae, and starvation as an agent of speciation. Starvation can be viewed as a general and widespread consequence of catastrophic environmental change that leads to a decrease in survival or reproductive success. We find that yeast populations subjected to a month-long starvation exhibit a drastic increase in genomic rearrangements compared with a modest increase in point mutation. We subsequently find that starved yeast populations become reproductively isolated from their ancestor, which we attribute to chromosomal abnormalities in the starved clones’ genomes. Our model provides direct molecular evidence that speciation can occur rapidly without the precondition of geographic separation or divergent selection.

E. Kroll and R.F. Rosenzweig
Division of Biological Sciences, University of Montana, 32 Campus Dr., Missoula, MT 59812, USA

B. Dunn
Department of Genetics, Stanford University, Stanford, CA 94305, USA

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_3, © Springer-Verlag Berlin Heidelberg 2010

3.1 Continuing Uncertainty over Species Definitions Among the Eukarya

Two central questions in eukaryotic evolutionary biology are: how do new species emerge, and how are they perpetuated? We can provisionally define a species as a group of organisms that shares a complex genetic network of interacting alleles and preserves its integrity by restricting the exchange of genetic material with other such networks (Mayr 1966). The processes by which new networks emerge, i.e., speciation, appear to be diverse, and their relative contributions remain the subject of considerable controversy. While Darwin explicitly linked the process of speciation to the adaptation of organisms to novel environments (Darwin 1859, Ch. 4), neo-Darwinists have emphasized the role of interpopulation isolation (Fisher 1930; Dobzhansky 1937; Muller 1940; Mayr 1942). Uncertainty persists as to which of these emphases is correct (Lande 1989; Vulic et al. 1999; Orr and Presgraves 2000; Schilthuizen 2000; Turelli et al. 2001; Sinervo and Svensson 2002; Herrmann et al. 2003), largely due to the dearth of knowledge about the specific molecular mechanisms that underlie eukaryotic speciation and the fact that species can be defined in various ways.

The most widely used definition of speciation is based on the biological species concept, i.e., the cessation of gene flow between groups of organisms, or “reproductive isolation” (Dobzhansky 1937; Mayr 1942, 1996; Lande 1989; Coyne and Orr 1998). Though this definition is not universally accepted (Darwin 1859, Ch. 8; Schilthuizen 2000 and refs. therein), it is, due to its inherently measurable nature, the most amenable framework for experimental investigation. Relative reproductive isolation between two species is a quantitative trait that can be measured as a ratio between the fertilities of interspecific hybrids and those of their conspecific parentals. Importantly, “relative reproductive isolation” can be used as a proxy to assess divergence between closely related organisms.

Reproductive isolation in sexual species can be both pre- and postzygotic. To date, efforts to explain incipient speciation in eukaryotes have focused on prezygotic isolation mechanisms such as spatial/temporal or behavioral separation (Orr and Presgraves 2000). In many cases, the former arises in allopatry, whereas the latter is viewed as a reinforcing mechanism. That said, much theoretical and experimental work now indicates that postzygotic mechanisms, e.g., the inviability or infertility of interspecific hybrids, can play crucial roles in initiating reproductive isolation, and that prezygotic mechanisms might therefore evolve at a later stage (Lande 1989; Schliewen et al. 1994; Dieckmann and Doebeli 1999; Schilthuizen 2000; Turelli et al. 2001; Via 2001). Hence, it may be difficult to uncover, using existing species as evidence, the important transformative events that initiate speciation, as genetic divergence following reproductive isolation is likely to obscure the initial steps in the process. In other words, most of the genetic differences that separate contemporary species by enforcing isolation may not be the differences that originally caused speciation.

Overall, our research goal is to elucidate the exact molecular mechanisms that bring about speciation in a model eukaryote and to do so in real time under controlled laboratory conditions. We contend that an experimental rather than a comparative approach is more likely to enable us to clarify the role of postzygotic mechanisms in the initial stages of speciation.
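The ratio-based measure of relative reproductive isolation described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: fertility is taken here to be the fraction of viable spores (a common proxy in yeast crosses), the function names are hypothetical, and the counts are invented for the example rather than taken from the chapter. Reporting isolation as one minus the fertility ratio is likewise a convention assumed here, not one stated in the text.

```python
def fertility(viable: int, total: int) -> float:
    """Fraction of viable offspring (e.g., spores) in a cross."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= viable <= total:
        raise ValueError("viable must lie between 0 and total")
    return viable / total


def relative_fertility(hybrid: float, parental: float) -> float:
    """Ratio of hybrid fertility to conspecific parental fertility."""
    if parental <= 0:
        raise ValueError("parental fertility must be positive")
    return hybrid / parental


def isolation_index(hybrid: float, parental: float) -> float:
    """Assumed convention: 0 means no isolation, 1 means complete isolation."""
    return 1.0 - relative_fertility(hybrid, parental)


# Hypothetical counts, for illustration only.
hybrid_fertility = fertility(viable=45, total=200)
parental_fertility = fertility(viable=180, total=200)

ratio = relative_fertility(hybrid_fertility, parental_fertility)
index = isolation_index(hybrid_fertility, parental_fertility)
print(f"relative fertility: {ratio:.2f}, isolation index: {index:.2f}")
```

Normalizing by the parental fertility separates isolation from any general drop in fertility that affects hybrids and parents alike.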

3.2 The Nature of Postzygotic Reproductive Isolation in Eukaryotes

Postzygotic reproductive isolation manifests in the inviability or infertility of hybrid progeny (Orr and Presgraves 2000). While hybrid inviability can be caused by developmental incompatibilities or dysgenesis (Hartl et al. 1997), hybrid infertility is likely a consequence of defective hybrid meiosis. The establishment of postzygotic reproductive isolation in eukaryotes has been explained by one of two competing theories. One, “the chromosomal theory of speciation,” holds that chromosomal changes (genomic rearrangements) disrupt recombination and segregation of homologues in meiosis I, and/or that fine-scale mutations disrupt meiotic recombination via the action of mismatch repair (White 1978; King 1993; Radman and Wagner 1993; Chambers et al. 1996; Searle 1998; Britton-Davidian et al. 2000; Rieseberg 2001). The other, “the genic theory of speciation” (“speciation genes”), holds that genic changes, e.g., functional incompatibilities between diverged alleles, result in lower hybrid fitness (Bateson 1909; Dobzhansky 1937; Muller 1940; Coyne and Orr 1998). A third possibility, that postzygotic isolation arises from a combination of the mechanisms proposed by these two theories, has also been put forward (Henikoff et al. 2001; Noor et al. 2001; Rieseberg 2001).

The chromosomal theory of speciation did not gain acceptance during the early studies of postzygotic reproductive isolation for two reasons: first, pioneering experiments in Drosophila by Dobzhansky appeared to demonstrate the genic nature of postzygotic isolation (Dobzhansky 1933); second, chromosomal speciation appeared implausible because of the “underdominance” effect, wherein an individual that sustains a chromosomal rearrangement is rendered less fertile and thus would not be able to form a new species (Livingstone and Rieseberg 2004).

The archetypal experiments by T. Dobzhansky first demonstrated that the sterility of male hybrids between two races of Drosophila pseudoobscura distinguished by several chromosomal rearrangements was due to mis-segregation of homologous chromosomes in meiosis I (Dobzhansky 1933). Dobzhansky further noted that in the rare instances when tetraploid spermatogonia were found in these interracial hybrids, chromosomes also mis-segregated in meiosis I. From this observation, Dobzhansky deduced that because every chromosome in tetraploid hybrid meioses is furnished with its exact homologue, tetraploidization should have restored faithful segregation of homologues if mis-segregation had been caused by chromosomal rearrangements and not by genic incompatibilities. Thus, he concluded that genic incompatibilities, not chromosomal changes, caused hybrid sterility in the male hybrids of the two races of D. pseudoobscura. The ensuing rush for “speciation genes” or, rather, incompatible alleles, did yield some tangible results, notably the cloning of Odysseus, a gene encoding a homeobox protein responsible for interspecific incompatibilities in Drosophila (Ting et al. 1998; Greenberg et al. 2003), and of several more genes that control hybrid infertility (Lee et al. 2008; Phadnis and Orr 2009).


However, there is no way to be certain that such incompatible alleles were the actual cause of speciation and not merely the product of species divergence; in other words, finding speciation genes does not in itself prove that speciation ultimately has a genic nature. Intriguingly, in a footnote to his pioneering paper on the genic nature of reproductive isolation mentioned above, Dobzhansky acknowledges that he did not report the results of the reciprocal cross, which is “different in many important details” and would be “published elsewhere” (Dobzhansky 1933). In stark contrast to the aforementioned studies of Dobzhansky, Noor and colleagues used the very same species of Drosophila to directly implicate large chromosomal inversions in the reproductive isolation between sympatric D. pseudoobscura and D. persimilis populations (Noor et al. 2001). Indeed, inversions and other small rearrangements that may have a deleterious effect on meiosis have been shown to be abundant between related species of yeast (Seoighe et al. 2000; Kellis et al. 2003; Fischer et al. 2006), as well as in roundworms (Hutter et al. 2000), mice (Hauffe and Searle 1998), plants (Blanc et al. 2000), and a variety of other organisms (for a review: Eichler and Sankoff 2003). Experiments on tetraploidization in several species of plants showed that certain types of chromosomal rearrangements were responsible for postzygotic reproductive isolation (Anderson 1949; White 1978; Searle 1998; Pialek et al. 2001; Rieseberg 2001). Chromosomal rearrangements have also been implicated in human evolution, acting to decrease gene flow in the chromosomal regions that harbor inversions (Navarro and Barton 2003). In Saccharomyces cerevisiae, chromosomal inversions have been shown to directly and efficiently impair the progression of meiosis (Dresser et al. 1994; Jinks-Robertson et al. 1997; Chen and Jinks-Robertson 1999).
As for the objection that underdominance (a decrease in, or lack of, the ability to go through meiosis due to one or more heterozygous rearrangements) undermines the chromosomal speciation theory, it is fair to say that different genomic rearrangements may have very different effects on meiosis, ranging from irrelevant to prohibitive, with all shades in between. Clearly, an organism that contains a chromosomal rearrangement that abrogates meiosis is not going to form a new species; however, a partial restriction of gene flow resulting from a rearrangement could allow for faster rates of sequence and functional divergence (Lande 1989; Noor et al. 2001; Rieseberg 2001; Navarro and Barton 2003), increasing the probability of speciation. Finally, by the same logic that is applied to epistasis between speciation genes (Bateson 1909), genomic rearrangements can also form incompatible pairs, further destabilizing meiosis in hybrid organisms. Assuming that the experimental observations supporting both theories of postzygotic isolation are correct, should one conclude that these opposing results reflect variation in experimental techniques, or are they more readily explained as variation between diverse taxa? And is it then reasonable to assume that both the genic and chromosomal models of speciation (acting in concert or separately in different taxa) can act in the process of speciation? To address these questions experimentally, we have developed a laboratory assay using the yeast S. cerevisiae to isolate reproductively isolated clones during the course of prolonged starvation.
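The underdominance objection discussed in this section can be made concrete with a standard textbook one-locus selection model. The sketch below is not the authors' model: it assumes an infinite, randomly mating population, equal fitness for the two homozygous karyotypes, and a heterozygote whose fitness is reduced by a hypothetical fraction `s`. Under those assumptions a rare rearrangement declines in frequency, and the decline is slower when `s` is small, which is consistent with the point that mildly underdominant rearrangements are much less strongly opposed by selection.

```python
def next_frequency(p: float, s: float) -> float:
    """Frequency of a rearrangement after one generation of selection.

    Assumed fitnesses: both homozygotes 1, heterozygote 1 - s.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a frequency between 0 and 1")
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be between 0 and 1")
    q = 1.0 - p
    # Mean fitness under Hardy-Weinberg proportions: p^2 + 2pq(1 - s) + q^2.
    mean_fitness = 1.0 - 2.0 * p * q * s
    return (p * p + p * q * (1.0 - s)) / mean_fitness


def trajectory(p0: float, s: float, generations: int) -> list[float]:
    """Frequencies from generation 0 through the requested generation."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


# Hypothetical parameters: a rearrangement starting at 10% frequency.
for s in (0.3, 0.01):
    final = trajectory(0.1, s, 50)[-1]
    print(f"s = {s}: frequency after 50 generations = {final:.4g}")
```

In a finite population, genetic drift can still carry a weakly underdominant rearrangement past the unstable midpoint; that stochastic step is deliberately left out of this deterministic sketch.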

3 Starvation-Induced Reproductive Isolation in Yeast

3.3 A Starvation-Based Experimental Model May Help Resolve Uncertainties Concerning the Molecular Basis for Speciation

Comparative analyses of existing species may poorly discriminate between changes that cause speciation and those that arise secondarily (Schilthuizen 2000). However, experimental evidence obtained under conditions physiologically close to optimal may be difficult to acquire, as these conditions typically result in low and constant mutation rates (Drake et al. 1998) that make speciation less likely to occur (Rice and Hostert 1993). We have therefore developed an experimental laboratory model to study speciation that uses prolonged starvation as a proxy for sudden and severe environmental change. This treatment effectively disrupts normal living conditions, disintegrating a population’s niche and, over time, diminishing its mean fitness, measured as both survivorship and reproductive capacity. Furthermore, starvation is a condition that virtually all species experience and that many contend with regularly in the wild (Koch 1971; Death and Ferenci 1994). All manner of environmental change, such as wildfire, flood, sudden transfer to a new habitat, or even the invasion of a competing species, can bring about starvation. We hypothesize that because starvation is universally experienced in the wild owing to a plethora of circumstances, natural selection has brought about mechanisms that respond to this generic signal in ways that may increase population diversity via increased mutation, including large-scale genome rearrangements.

3.4 Starvation Responses That Could Increase Population Genetic Diversity

Escherichia coli’s SOS system activates multiple responses to DNA damage, nutrient starvation, and low temperature that are both mutagenic and recombinogenic (Witkin and Wermundsen 1979; Dri and Moreau 1993; Friedberg et al. 1995; McKenzie et al. 2000). Following activation of the SOS system, bacteria sustain a high frequency of random mutation, rearrangements, and transposition (Radman 1975; Witkin 1976; Petit et al. 1991; Guerin et al. 2009), revealing a genetic link between stress caused by highly challenging environmental conditions and variability (Taddei et al. 1997). In fact, it has been shown that starvation-induced mutagenesis in bacteria is directly controlled by the SOS system (Taddei et al. 1995; Hastings et al. 2000; McKenzie et al. 2000; Finkel 2006; He et al. 2006) as well as by global stress response (Zinser and Kolter 1999; Bjedov et al. 2003; Lombardo et al. 2004). Eukaryotes possess a combination of genetic pathways that may be functionally analogous to those of bacteria, such as checkpoint adaptation, translesion synthesis,


E. Kroll et al.

stress signaling, and others (Toczyski et al. 1997; Kai and Wang 2003; Smets et al. 2010). However, although the causal connection between environmental stress and an increase in adaptively significant variation has been well studied, the molecular basis for such connection in eukaryotes remains obscure. By employing starvation to mimic severe stress, we hope to model conditions in nature with which all populations must contend (Death and Ferenci 1994) and to discover molecular mechanisms that link catastrophic environmental change with the types of genetic variation that could lead to speciation.

3.5 Advantages of Using Yeast as a Model to Study Speciation in Real Time

Several factors contributed to our choice of S. cerevisiae as a model organism. S. cerevisiae is a well-studied organism that possesses most of the major signal transduction (Smets et al. 2010) and DNA maintenance pathways (San Filippo et al. 2008) found in other eukaryotes. Also, the genomes of multiple strains of S. cerevisiae and more than ten related species have been sequenced. Lastly, yeast genetics, especially as it relates to DNA maintenance, cell cycle, checkpoints, and stress resistance, is well understood. In S. cerevisiae, as in higher eukaryotes, the controlled occurrence of DNA double-strand breaks early in meiotic prophase is essential for the maturation of the synaptonemal complex as well as for chiasmata formation in diplotene and for faithful homologue segregation at anaphase I (Peoples et al. 2002; Page and Hawley 2003). This dependence is reinforced by the pachytene checkpoint (Roeder and Bailis 2000), which ensures that meiotic recombination and homologue synapsis are completed before cells proceed to metaphase I. In contrast, the chromosomes in another well-studied yeast species, Schizosaccharomyces pombe, do not form synaptonemal complexes in meiosis (Davis and Smith 2003), while in the popular multicellular model organisms C. elegans and D. melanogaster, double-strand breaks are not required for chromosome synapsis to occur (Dernburg et al. 1998; Jang et al. 2003). Moreover, meioses of the heterogametic sex in Drosophila and other Diptera (males) and in Lepidoptera (females) occur in the complete absence of recombination (Hawley 2002). Thus, among favored model systems, the processes of meiosis in S. cerevisiae most resemble those found within meioses of mouse and human spermatocytes (Lichten 2001; Page and Hawley 2003). Finally, in S. cerevisiae, reproductive isolation manifests as a quantitative trait that can be scored as the efficiency of producing viable spores, or spore yield (a combination of sporulation efficiency and spore viability).
We chose an S288c strain [BY4743 (Brachmann et al. 1998)] for our speciation studies because this diploid, unlike other laboratory strains, does not spontaneously sporulate when starved, and thus starved diploids that have not gone through meiosis can be reliably obtained.


3.6 Three Modes of Postzygotic Isolation in Yeast: Sequence, Chromosome, Breakpoint-Recombination

The six nonhybrid species that comprise the sensu stricto group of Saccharomyces (S. cerevisiae, S. paradoxus, S. mikatae, S. cariocanus, S. kudriavzevii, and S. bayanus) show large genomic rearrangements relative to each other, as detected by pulsed-field gel analysis, with the exception of S. cerevisiae and S. paradoxus, which are almost identical. Fischer et al. showed that these rearrangements did not in fact correspond to a phylogenetic tree based on sequence divergence of rRNA (Fischer et al. 2000, 2006), and thus concluded that genomic rearrangements were unimportant in the speciation of yeast. Interestingly, the restoration of the colinearity of gene order between two sensu stricto species, S. cerevisiae and S. mikatae, did lead to a partial restoration of the interspecific hybrid fertility (Delneri et al. 2003), indicating that genomic rearrangements are important for the maintenance of the postzygotic reproductive isolation in yeast. Mutational load and the action of the mismatch repair system also affect, albeit partially, reproductive isolation between S. cerevisiae and S. paradoxus (Chambers et al. 1996; Chen and Jinks-Robertson 1999), as crossing-over in yeast is dependent on sequence homology between homeologous chromosomes (Hunter et al. 1996). However, experiments suggesting these possibilities were conducted with extant species, where genetic changes such as sequence divergence (proposed as a possible cause for a reproductive barrier) may actually have occurred after the speciation event and thus might not be a reason for the initial reproductive barrier. Additionally, dominant epistatic incompatibilities between two sensu stricto species of Saccharomyces have been shown not to be important for speciation by either tetraploidization experiments (Greig et al. 2002) or directly checking for speciation genes (Greig and Leu 2009). Although one pair of incompatible alleles has been recently identified between S. cerevisiae and S. bayanus (Lee et al. 2008), it is again unclear whether this incompatibility was a driving force, or a secondary consequence, of the initial speciation event.

Chromosome rearrangements are plentiful in yeast genomes. Genomic rearrangements, such as reciprocal translocations, transpositions, insertions, deletions, and inversions, are ubiquitous features of even closely related species. Studies using pulsed-field gel analysis and hybridization, such as Fischer et al. (Fischer et al. 2000), identified only a small subset of all rearrangements and inversions among the sensu stricto species, as shown by subsequent whole genome sequencing, because smaller rearrangements and inversions simply cannot be resolved by pulsed-field gels. Remarkably, of all the syntenic breakpoints between S. cerevisiae and S. bayanus, fewer than 10% are large-scale rearrangements (Fischer et al. 2001). Sequence data from the S. bayanus, S. mikatae, and S. paradoxus genomes have revealed many more genomic rearrangements than were previously known, especially at chromosome ends (Kellis et al. 2003). The nine inversions that exist between the genomes of these three species and S. cerevisiae are flanked by tRNA genes, usually of the same isoacceptor type (Kellis et al. 2003). This finding suggests that inversions and perhaps other rearrangements that have accumulated in


the genomes of the Saccharomyces spp. arose via homologous recombination. An alternative hypothesis is that rearrangements may have been caused by yeast retrotransposons (Ty), as the tRNA genes are hotspots for Ty1, Ty3, and Ty5 transposition (Natsoulis et al. 1989). In addition, nonhomologous end-joining may have played a role in creating some of the rearrangements, as has been observed among flor yeast used in fortified winemaking (Infante et al. 2003). Thus, in our opinion, certain genomic rearrangements, including small and large inversions, small translocations, and small insertion–deletions that escape detection by pulsed-field gel analysis (but are discovered later by sequencing), may be a ubiquitous feature of evolving genomes. We further suggest that such rearrangements may play a key role in incipient speciation among yeasts and other eukaryotes.

3.7 Starved Yeast Cultures Sustain High Frequencies of Genomic Rearrangements

In extant species of Saccharomyces yeast, the rates of genomic rearrangements are highly variable (Fischer et al. 2006). We contend that starvation as a result of environmental change can affect the rates of genomic variation. Moreover, we have already shown that a champagne strain, DB146, sustains a massive amount of change in genomic architecture after prolonged starvation (Coyle and Kroll 2008). To appraise the effect of prolonged starvation on genomic change, we starved multiple random clones of the laboratory yeast diploid BY4743 (Brachmann et al. 1998), essentially as described (Coyle and Kroll 2008). During a 1-month-long starvation treatment, and accounting for diminished viability, the starving cultures underwent an average of ten generations. At no point did we observe sporulating cells in starving cultures. For comparison, we established a control by growing BY4743 cells in rich medium for approximately twice the number of generations that starved cultures underwent. Because, strictly speaking, the cells obtained at the end of these 20 generations are neither ancestral nor “wild-type” to the starved cultures, we chose to call them “nonstarved” cultures.

3.7.1 Starved Cultures Sporulate at a Lower Level Than the Nonstarved Cultures

Genomic rearrangements may create a reproductive barrier between two populations, as discussed previously. If a reproductive barrier existed between our starved and ancestral populations, it would manifest as decreased fertility of starved cultures in backcrosses between haploid progeny of the starved and ancestral populations when compared with the values for nonstarved to ancestral backcrosses. Both efficiency of sporulation (the frequency at which yeast cells form gametes or spores) and spore viability (colony-forming units per number of spores plated) could be


Fig. 3.1 Starved and nonstarved cultures of BY4743 sporulated for 2 days. Arrows denote spore sacks (asci) that contain three or four spores. (a) Starved diploid culture. Only one misshapen spore sack (ascus) is shown (arrow). (b) Nonstarved culture. The majority of cells have formed asci

expected to affect hybrid fertility. Generally, only a partial measure of fertility, spore viability, is measured in crosses between separate yeast species (Naumov et al. 2000). Since different species usually require different conditions for optimal sporulation, sporulation efficiency of the interspecific hybrid lacks an obvious control. However, in our case, we used only one ancestral strain, and thus we were able to assess both sporulation efficiency and spore viability of the backcross hybrids. To score these traits, we incubated the cells overnight in fresh rich medium to minimize the fraction of dead cells in starved cultures, then sporulated them using conditions optimized for the ancestral strain. We scored sporulation efficiency and the viability of the resultant spores. For all comparisons we used nonparametric statistical tests, as we could not assume a normal distribution for our data. Nonstarved diploid cultures sporulated at the efficiency characteristic of the BY4743 ancestor, and spore viability was nearly 100%. In contrast, starved BY4743 cultures sporulated at about half the frequency of the nonstarved cultures, even after prolonged sporulation (Coyle et al. in preparation). Nevertheless, spore viability among sporulated starved cultures was almost as high as that of spores derived from the nonstarved cultures (Fig. 3.1). The fact that starved cultures exhibited significantly lower sporulation efficiency than the nonstarved control suggests the possibility that accumulated changes in the genomes of starved cultures alter their fertility. Viable spores derived from such starved cells might be wholly or partially reproductively isolated from each other and from the ancestral population.
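
The two fertility measures described above can be written out explicitly. The following Python sketch is illustrative only: the function names and all counts are hypothetical, and treating spore yield as the product of sporulation efficiency and spore viability is an assumption about how the two measures combine, not a formula given in the text.

```python
def sporulation_efficiency(asci_counted, cells_counted):
    """Fraction of counted cells that formed asci (spore sacs)."""
    if cells_counted <= 0:
        raise ValueError("cells_counted must be positive")
    return asci_counted / cells_counted


def spore_viability(colonies_formed, spores_plated):
    """Colony-forming units per spore plated."""
    if spores_plated <= 0:
        raise ValueError("spores_plated must be positive")
    return colonies_formed / spores_plated


def spore_yield(efficiency, viability):
    """Combined fertility score.

    Assumption: the combination is taken to be a simple product.
    """
    return efficiency * viability


# Hypothetical counts, chosen only to mirror the qualitative pattern
# reported in the text (about half the sporulation, similar viability).
nonstarved_yield = spore_yield(sporulation_efficiency(120, 200),
                               spore_viability(98, 100))
starved_yield = spore_yield(sporulation_efficiency(60, 200),
                            spore_viability(96, 100))

print(f"nonstarved spore yield: {nonstarved_yield:.3f}")
print(f"starved spore yield:    {starved_yield:.3f}")
```

Under this assumed product, a culture whose spores are fully viable can still show a low spore yield if few of its cells sporulate, which is the pattern the text reports for starved cultures.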

3.7.2 A Subset of Starved Backcrosses Show Lower Fertility Than the Nonstarved Backcrosses

To test how reproductive isolation was distributed within starved cultures we assessed the fertility of the backcrossed hybrids. We isolated rare spores from


1-month starved cultures, germinated those spores into haploid strains, or “starved isolates,” and performed backcrosses. We then sporulated the resultant backcross hybrids and measured their sporulation efficiency and spore viability; finally, we compared their hybrid fertility with that of the nonstarved isolates. The results recapitulate the previous findings for starved diploids: multiple backcrossed hybrids exhibited significantly lower average sporulation efficiency than the nonstarved backcrosses (Mann–Whitney U test). Specifically, about one-third of starved isolates used for the backcross analysis showed a sporulation efficiency that was significantly lower than that of their respective nonstarved intercrosses (Coyle et al. in preparation). In contrast to sporulation efficiency, spore viability in all cases was indistinguishable from the ancestral (Coyle et al. in preparation).
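
For readers unfamiliar with the Mann–Whitney U test used in these comparisons, the following is a minimal pure-Python sketch of the statistic. It is not the authors' analysis pipeline: the sporulation percentages are invented for illustration, the function names are hypothetical, and the p-value uses the large-sample normal approximation without tie or continuity correction, which is only rough for samples this small.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples, using average ranks for ties."""
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


def approximate_two_sided_p(u_a, u_b, n_a, n_b):
    """Two-sided p-value from the normal approximation (no tie correction)."""
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (min(u_a, u_b) - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical sporulation efficiencies (percent) for illustration only.
starved_backcrosses = [22, 31, 18, 27, 35, 25]
nonstarved_backcrosses = [58, 61, 49, 66, 54, 60]

u_starved, u_nonstarved = mann_whitney_u(starved_backcrosses, nonstarved_backcrosses)
p_value = approximate_two_sided_p(u_starved, u_nonstarved,
                                  len(starved_backcrosses),
                                  len(nonstarved_backcrosses))
print(u_starved, u_nonstarved, p_value)
```

Because the test works on ranks rather than raw values, it does not require normally distributed data, which is the reason the text gives for choosing nonparametric tests.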

3.7.3 Starved Isolates Reproductively Isolated from the Ancestral Population Are Self-Fertile

Complete inability to undergo meiosis would prevent the establishment of a new species. This might be caused either by mutations in genes important for meiosis or by a chromosome aberration that prohibits meiosis. To ensure that the starved isolates could found a new lineage capable of sexual reproduction, we selfed starved isolates that exhibited lower fertility in backcrosses. To do this, we made haploid progeny of those starved isolates homothallic and isolated their selfed diploid progeny. After sporulating these selfed diploids, we found that their sporulation efficiency was significantly higher than that of the backcross hybrid (Coyle et al. in preparation). We concluded that starved isolates reproductively isolated from the ancestral population were self-fertile and able to form new sexually reproducing lineages, that is, new biological species. These results confirm bona fide incipient speciation arising in a yeast population within a 1-month period of starvation.

3.7.4 Molecular Basis of the Reproductive Barrier in a Starved Isolate

To discover the molecular mechanism of reproductive isolation, we further studied several of the reproductively isolated starved isolates. Our experiments showed that forward mutation frequency increased only two times in starved populations compared with the nonstarved control, which could not account for the widespread reproductive isolation. In contrast, pulsed-field gel analysis revealed a 6.6% total frequency of new chromosomal variants in the starved BY4743 cultures, with no rearrangements detected in nonstarved cultures (Coyle et al. in preparation). This frequency is orders of magnitude higher than can be estimated for a typical laboratory yeast strain (Schmidt et al. 2006). Finally, using microarray-based comparative genomic hybridization (Dunn et al. 2005) we showed that all starved isolates contained deletions and additions of genomic DNA (Coyle et al. in preparation).


In particular, one isolate contained a duplication of the whole of Chromosome I (Coyle et al. in preparation). We decided to examine this disomic haploid isolate further to determine whether chromosomal abnormalities that arose during starvation could explain this strain’s reproductive isolation. As has been reasoned before, in tetraploid hybrid meioses every chromosome is furnished with its exact homologue (Dobzhansky 1933); therefore, tetraploidization should restore faithful segregation of homologues if mis-segregation in the diploid hybrid were caused by chromosomal rearrangements and not by genic incompatibilities. In our case, when we crossed the disomic starved isolate to its haploid ancestor, we obtained a diploid hybrid with trisomy for Chromosome I (two copies of the chromosome from the starved isolate and one from the ancestor). If the Chromosome I trisomy were responsible for the lowered fertility of the backcross hybrid, because there was no homologue furnished for the extra Chromosome I, we would expect tetraploidization of this hybrid to restore its fertility. If the fertility of the backcross hybrid were not restored, then we would have to assume that an epistatic interaction between incompatible alleles underlies the reproductive barrier between this isolate and its ancestor. To test for this possibility, we obtained tetraploid versions of the trisomic backcross hybrid by deleting one of the two MAT loci in the hybrid. We identified hybrids expressing either MATa or MATalpha and crossed such strains using a micromanipulator to produce several independent tetraploid versions of the backcross hybrid. We repeated this procedure with the nonstarved isolates to obtain control tetraploids. After tetraploidy was confirmed by tetrad dissection, we sporulated the resulting diploid hybrids and their tetraploid derivatives and measured the sporulation efficiency as before. The results are shown in Fig. 3.2.


Fig. 3.2 Relative sporulation efficiency (vertical axis, 0–100%) of (a) diploid starved backcross hybrid with extra Chromosome I, (b) tetraploid starved backcross hybrid with extra Chromosome I, (c) diploid nonstarved backcross hybrid, (d) tetraploid nonstarved backcross hybrid. Ancestral sporulation efficiency is assumed to be 100%. Spore viability in all strains was indistinguishable from the ancestral


Independently obtained tetraploid derivatives of the trisomic hybrid showed a dramatic increase in sporulation efficiency compared with the diploid hybrid (Mann–Whitney U test; Coyle et al. in preparation). In contrast, the sporulation efficiency of the nonstarved tetraploidized backcross hybrids was indistinguishable from that of the nonstarved diploid backcross, indicating that tetraploidization does not generally result in increased sporulation efficiency in the nonstarved clones. Our results indicate that reproductive isolation in the starved disomic isolate cannot be a consequence of allelic incompatibilities between the disomic isolate and the nonstarved ancestor. Rather, these results support the hypothesis that chromosomal rather than genic differences underlie the reduced fertility of the starved isolate.
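
The logic of the tetraploidization test can be illustrated with simple chromosome bookkeeping. The Python sketch below is a toy model, not a simulation of meiosis: it assumes only that a chromosome present in an odd number of copies leaves one copy without a pairing partner, and all names in it are hypothetical. The sixteen chromosome labels follow the standard S. cerevisiae numbering.

```python
CHROMOSOMES = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII",
               "IX", "X", "XI", "XII", "XIII", "XIV", "XV", "XVI"]


def haploid_karyotype(extra_copies=()):
    """Copy number per chromosome for a haploid, plus any extra (disomic) copies."""
    karyotype = {name: 1 for name in CHROMOSOMES}
    for name in extra_copies:
        karyotype[name] += 1
    return karyotype


def mate(parent_a, parent_b):
    """Copy numbers in the cell formed by fusing two parents."""
    return {name: parent_a[name] + parent_b[name] for name in CHROMOSOMES}


def chromosomes_without_partner(karyotype):
    """Toy assumption: an odd copy number leaves one copy unpaired."""
    return [name for name in CHROMOSOMES if karyotype[name] % 2 == 1]


ancestor = haploid_karyotype()
starved_isolate = haploid_karyotype(extra_copies=["I"])  # disomic for Chromosome I

# Backcross: diploid hybrid carrying three copies of Chromosome I.
diploid_hybrid = mate(starved_isolate, ancestor)
# Tetraploid made by crossing two mating-competent derivatives of that hybrid.
tetraploid_hybrid = mate(diploid_hybrid, diploid_hybrid)

print(chromosomes_without_partner(diploid_hybrid))     # Chromosome I is unpaired
print(chromosomes_without_partner(tetraploid_hybrid))  # every copy has a partner
```

In this bookkeeping, doubling the genome resolves the odd copy number, which is why restored fertility in the tetraploid points to a chromosomal cause; a genic incompatibility would be present in both the diploid and the tetraploid and would not be resolved by doubling.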

3.8 Conclusions

The experiments described here provide insight into the phenomenon of starvation-associated genomic rearrangements and its possible role in establishing reproductive isolation. Starvation is a condition that most natural organisms frequently contend with in the wild. Because a variety of changes in the external milieu can result in starvation, we contend that starvation is a generic “interpreter” of catastrophic environmental change. Organisms that evolved mechanisms to harness starvation as a signal to increase population diversity could be expected to leave more descendants in the wake of such catastrophes. These mechanisms represent an alternative population-level evolutionary response to the many individual-level responses that enable organisms to persist under severe stress (e.g., spores, hibernation, aestivation, extreme desiccation resistance, etc.). Eukaryotes possess genetic mechanisms able to respond to stressful conditions; however, no connection between starvation, starvation-induced genetic variation, and speciation has been experimentally established in eukaryotes. Our experiments provide evidence for this connection by showing that starved yeast populations sustain genomic rearrangements at a dramatically higher frequency than nonstarved populations, and that certain clones that survive starvation are reproductively isolated from their ancestors. These newly evolved clones may represent incipient species. Genomic rearrangements have been shown to occur in yeast during chemical treatment (Hughes et al. 2000) and growth in nutrient-limiting conditions (Adams et al. 1992; Dunham et al. 2002). In fact, Dunham et al. note that several of their parallel cultures grown in continuous culture under glucose limitation failed to sporulate, a phenomenon similar to the one observed here (Dunham et al. 2002). This phenotype arose after 250–500 generations of continuous growth, unlike our cultures, which underwent only 10 generations during the course of starvation.
Recently, another study has shown that adaptation to diverse environments leads to incipient speciation in yeast (Dettman et al. 2007), echoing the classic experiments in Drosophila (Rice and Hostert 1993). The authors attempted to examine the


molecular nature of de novo speciation, using correlation between hybrid fitness and fertility. Interestingly, in contrast to findings in extant yeast species (Greig 2009), their yeast hybrids, like ours, retained almost 100% spore viability but exhibited lower sporulation efficiency (Dettman et al. 2007). We contend that genomic rearrangements arising during starvation may contribute to reproductive isolation, supporting the chromosomal theory of speciation (White 1978). When the rate of genomic rearrangements is very low and the effective population size is high, the chromosomal theory of speciation cannot plausibly explain the process of speciation (Rieseberg 2001). However, the stress of complete starvation circumvents these problems by dramatically increasing the rate of chromosomal rearrangements in starving populations and simultaneously decreasing the effective population size (because of the lower chances of having enough resources to mate and also because of lower viability). Thus, environmental conditions leading to starvation may favor the establishment of small, reproductively isolated, inbred subpopulations that harbor restructured genomes poised to undergo rapid speciation without a requirement for any other type of prezygotic isolation.

Acknowledgments We would like to acknowledge technical help from S. Coyle. This work was supported by NSF grant 0134648 to E.K., NASA grant NNX07AJ28G to R.F.R., and NSF ADVANCE grant DBI-0340856 to B.D.

References

Adams J, Puskas-Rozsa S, Simlar J, Wilke CM (1992) Adaptation and major chromosomal changes in populations of Saccharomyces cerevisiae. Curr Genet 22:13–19
Anderson E (1949) Introgressive hybridization. Chapman & Hall, London
Bateson W (1909) Heredity and variation in modern lights. Darwin and modern science. Cambridge University Press, Cambridge, UK
Bjedov I, Tenaillon O, Gerard B, Souza V, Denamur E, Radman M, Taddei F, Matic I (2003) Stress-induced mutagenesis in bacteria. Science 300:1404–1409
Blanc G, Barakat A, Guyot R, Cooke R, Delseny M (2000) Extensive duplication and reshuffling in the Arabidopsis genome. Plant Cell 12:1093–1101
Brachmann CB, Davies A, Cost GJ, Caputo E, Li J, Hieter P, Boeke JD (1998) Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14:115–132
Britton-Davidian J, Catalan J, da Graca Ramalhinho M, Ganem G, Auffray JC, Capela R, Biscoito M, Searle JB, da Luz Mathias M (2000) Rapid chromosomal evolution in island mice. Nature 403:158
Chambers SR, Hunter N, Louis EJ, Borts RH (1996) The mismatch repair system reduces meiotic homeologous recombination and stimulates recombination-dependent chromosome loss. Mol Cell Biol 16:6110–6120
Chen W, Jinks-Robertson S (1999) The role of the mismatch repair machinery in regulating mitotic and meiotic recombination between diverged sequences in yeast. Genetics 151:1299–1313
Coyle S, Kroll E (2008) Starvation induces genomic rearrangements and starvation-resilient phenotypes in yeast. Mol Biol Evol 25:310–318
Coyle S, Dunn B, Rosenzweig RF, Kroll E (in preparation) The molecular basis of starvation-associated reproductive isolation in yeast


Coyne JA, Orr HA (1998) The evolutionary genetics of speciation. Philos Trans R Soc Lond B Biol Sci 353:287–305
Darwin C (1859) On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. J. Murray, London
Davis L, Smith GR (2003) Nonrandom homolog segregation at meiosis I in Schizosaccharomyces pombe mutants lacking recombination. Genetics 163:857–874
Death A, Ferenci T (1994) Between feast and famine: endogenous inducer synthesis in the adaptation of Escherichia coli to growth with limiting carbohydrates. J Bacteriol 176:5101–5107
Delneri D, Colson I, Grammenoudi S, Roberts IN, Louis EJ, Oliver SG (2003) Engineering evolution to study speciation in yeasts. Nature 422:68–72
Dernburg AF, McDonald K, Moulder G, Barstead R, Dresser M, Villeneuve AM (1998) Meiotic recombination in C. elegans initiates by a conserved mechanism and is dispensable for homologous chromosome synapsis. Cell 94:387–398
Dettman JR, Sirjusingh C, Kohn LM, Anderson JB (2007) Incipient speciation by divergent adaptation and antagonistic epistasis in yeast. Nature 447:585–588
Dieckmann U, Doebeli M (1999) On the origin of species by sympatric speciation. Nature 400:354–357
Dobzhansky T (1933) On the sterility of the interracial hybrids in Drosophila pseudoobscura. Proc Natl Acad Sci USA 19:397–403
Dobzhansky T (1937) Genetics and the origin of species. Columbia Press, New York
Drake JW, Charlesworth B, Charlesworth D, Crow JF (1998) Rates of spontaneous mutation. Genetics 148:1667–1686
Dresser ME, Ewing DJ, Harwell SN, Coody D, Conrad MN (1994) Nonhomologous synapsis and reduced crossing over in a heterozygous paracentric inversion in Saccharomyces cerevisiae. Genetics 138:633–647
Dri AM, Moreau PL (1993) Phosphate starvation and low temperature as well as ultraviolet irradiation transcriptionally induce the Escherichia coli LexA-controlled gene sfiA. Mol Microbiol 8:697–706
Dunham MJ, Badrane H, Ferea T, Adams J, Brown PO, Rosenzweig F, Botstein D (2002) Characteristic genome rearrangements in experimental evolution of Saccharomyces cerevisiae. Proc Natl Acad Sci USA 99:16144–16149
Dunn B, Levine RP, Sherlock G (2005) Microarray karyotyping of commercial wine yeast strains reveals shared, as well as unique, genomic signatures. BMC Genomics 6(1):53–57
Eichler EE, Sankoff D (2003) Structural dynamics of eukaryotic chromosome evolution. Science 301:793–797
Finkel SE (2006) Long-term survival during stationary phase: evolution and the GASP phenotype. Nat Rev Microbiol 4:113–120
Fischer G, James SA, Roberts IN, Oliver SG, Louis EJ (2000) Chromosomal evolution in Saccharomyces. Nature 405:451–454
Fischer G, Neuveglise C, Durrens P, Gaillardin C, Dujon B (2001) Evolution of gene order in the genomes of two related yeast species. Genome Res 11:2009–2019
Fischer G, Rocha EP, Brunet F, Vergassola M, Dujon B (2006) Highly variable rates of genome rearrangements between hemiascomycetous yeast lineages. PLoS Genet 2:e32
Fisher RA (1930) The Genetical theory of natural selection. Oxford, UK
Friedberg E, Walker G, Siede W (1995) DNA repair and mutagenesis. Am Soc Microbiol, Washington, DC
Greenberg AJ, Moran JR, Coyne JA, Wu CI (2003) Ecological adaptation during incipient speciation revealed by precise gene replacement. Science 302:1754–1757
Greig D (2009) Reproductive isolation in Saccharomyces. Heredity 102:39–44
Greig D, Leu JY (2009) Natural history of budding yeast. Curr Biol 19:R886–R890
Greig D, Borts RH, Louis EJ, Travisano M (2002) Epistasis and hybrid sterility in Saccharomyces. Proc R Soc Lond B Biol Sci 269:1167–1171


Guerin E, Cambray G, Sanchez-Alberola N, Campoy S, Erill I, Da Re S, Gonzalez-Zorn B, Barbe J, Ploy MC, Mazel D (2009) The SOS response controls integron recombination. Science 324:1034
Hartl DL, Lohe AR, Lozovskaya ER (1997) Regulation of the transposable element mariner. Genetica 100:177–184
Hastings PJ, Bull HJ, Klump JR, Rosenberg SM (2000) Adaptive amplification. An inducible chromosomal instability mechanism. Cell 103:723–731
Hauffe HC, Searle JB (1998) Chromosomal heterozygosity and fertility in house mice (Mus musculus domesticus) from Northern Italy. Genetics 150:1143–1154
Hawley RS (2002) Meiosis: how male flies do meiosis. Curr Biol 12:R660–R662
He AS, Rohatgi PR, Hersh MN, Rosenberg SM (2006) Roles of E. coli double-strand-break-repair proteins in stress-induced mutation. DNA Repair 5:258–273
Henikoff S, Ahmad K, Malik HS (2001) The centromere paradox: stable inheritance with rapidly evolving DNA. Science 293:1098–1102
Herrmann RG, Maier RM, Schmitz-Linneweber C (2003) Eukaryotic genome evolution: rearrangement and coevolution of compartmentalized genetic information. Philos Trans R Soc Lond B Biol Sci 358:87–97, discussion 97
Hughes TR, Roberts CJ, Dai H, Jones AR, Meyer MR, Slade D, Burchard J, Dow S, Ward TR, Kidd MJ, Friend SH, Marton MJ (2000) Widespread aneuploidy revealed by DNA microarray expression profiling. Nat Genet 25:333–337
Hunter N, Chambers SR, Louis EJ, Borts RH (1996) The mismatch repair system contributes to meiotic sterility in an interspecific yeast hybrid. EMBO J 15:1726–1733
Hutter H, Vogel BE, Plenefisch JD, Norris CR, Proenca RB, Spieth J, Guo C, Mastwal S, Zhu X, Scheel J, Hedgecock EM (2000) Conservation and novelty in the evolution of cell adhesion and extracellular matrix genes. Science 287:989–994
Infante JJ, Dombek KM, Rebordinos L, Cantoral JM, Young ET (2003) Genome-wide amplifications caused by chromosomal rearrangements play a major role in the adaptive evolution of natural yeast. Genetics 165:1745–1759
Jang JK, Sherizen DE, Bhagat R, Manheim EA, McKim KS (2003) Relationship of DNA double-strand breaks to synapsis in Drosophila. J Cell Sci 116:3069–3077
Jinks-Robertson S, Sayeed S, Murphy T (1997) Meiotic crossing over between nonhomologous chromosomes affects chromosome segregation in yeast. Genetics 146:69–78
Kai M, Wang TS (2003) Checkpoint activation regulates mutagenic translesion synthesis. Genes Dev 17:64–76
Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES (2003) Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423:241–254
King M (1993) Species evolution: the role of chromosome change. Cambridge University Press, Cambridge
Koch AL (1971) The adaptive responses of Escherichia coli to a feast and famine existence. Adv Microb Physiol 6:147–217
Lande R (1989) Fisherian and Wrightian theories of speciation. Genome 31:221–227
Lee HY, Chou JY, Cheong L, Chang NH, Yang SY, Leu JY (2008) Incompatibility of nuclear and mitochondrial genomes causes hybrid sterility between two yeast species. Cell 135:1065–1073
Lichten M (2001) Meiotic recombination: breaking the genome to save it. Curr Biol 11:R253–R256
Livingstone K, Rieseberg L (2004) Chromosomal evolution and speciation: a recombination-based approach. New Phytol 161:107–112
Lombardo MJ, Aponyi I, Rosenberg SM (2004) General stress response regulator RpoS in adaptive mutation and amplification in Escherichia coli. Genetics 166:669–680
Mayr E (1942) Systematics and the origins of species. Columbia University Press, New York
Mayr E (1966) Animal species and evolution. Harvard University Press, Cambridge
Mayr E (1996) What is a species and what is not? Philos Sci 63:262–277
McKenzie GJ, Harris RS, Lee PL, Rosenberg SM (2000) The SOS response regulates adaptive mutation. Proc Natl Acad Sci USA 97:6646–6651

64

E. Kroll et al.

Muller HJ (1940) Bearing of the Drosophila work on systematics. In: Huxley J (ed) The new systematics. Clarendon, Oxford, pp 185–268 Natsoulis G, Thomas W, Roghmann MC, Winston F, Boeke JD (1989) Ty1 transposition in Saccharomyces cerevisiae is nonrandom. Genetics 123:269–279 Naumov GI, James SA, Naumova ES, Louis EJ, Roberts IN (2000) Three new species in the Saccharomyces sensu stricto complex: Saccharomyces cariocanus,Saccharomyces kudriavzevii and Saccharomyces mikatae. Int J Syst Evol Microbiol 50(Pt 5):1931–1942 Navarro A, Barton NH (2003) Chromosomal speciation and molecular divergence–accelerated evolution in rearranged chromosomes. Science 300:321–324 Noor MA, Grams KL, Bertucci LA, Reiland J (2001) Chromosomal inversions and the reproductive isolation of species. Proc Natl Acad Sci USA 98:12084–12088 Orr HA, Presgraves DC (2000) Speciation by postzygotic isolation: forces, genes and molecules. Bioessays 22:1085–1094 Page SL, Hawley RS (2003) Chromosome choreography: the meiotic ballet. Science 301:785–789 Peoples TL, Dean E, Gonzalez O, Lambourne L, Burgess SM (2002) Close, stable homolog juxtaposition during meiosis in budding yeast is dependent on meiotic recombination, occurs independently of synapsis, and is distinct from DSB-independent pairing contacts. Genes Dev 16:1682–1695 Petit MA, Dimpfl J, Radman M, Echols H (1991) Control of large chromosomal duplications in Escherichia coli by the mismatch repair system. Genetics 129:327–332 Phadnis N, Orr HA (2009) A single gene causes both male sterility and segregation distortion in Drosophila hybrids. Science 323:376–379 Pialek J, Hauffe HC, Rodriguez-Clark KM, Searle JB (2001) Raciation and speciation in house mice from the Alps: the role of chromosomes. Mol Ecol 10:613–625 Radman M (1975) SOS repair hypothesis: phenomenology of an inducible DNA repair which is accompanied by mutagenesis. Basic Life Sci 5A:355–367 Radman M, Wagner R (1993) Mismatch recognition in chromosomal interactions and speciation. 
Chromosoma 102:369–373 Rice W, Hostert E (1993) Laboratory experiments on speciation: what have we learned in 40 years? Evolution 47:1637–1653 Rieseberg LH (2001) Chromosomal rearrangements and speciation. Trends Ecol Evol 16:351–358 Roeder GS, Bailis JM (2000) The pachytene checkpoint. Trends Genet 16:395–403 San Filippo J, Sung P, Klein H (2008) Mechanism of eukaryotic homologous recombination. Annu Rev Biochem 77:229–257 Schilthuizen M (2000) Dualism and conflicts in understanding speciation. Bioessays 22:1134–1141 Schliewen UK, Tautz D, Paabo S (1994) Sympatric speciation suggested by monophyly of crater lake cichlids. Nature 368:629–632 Schmidt KH, Pennaneach V, Putnam CD, Kolodner RD (2006) Analysis of gross-chromosomal rearrangements in Saccharomyces cerevisiae. Methods Enzymol 409:462–476 Searle JB (1998) Speciation, chromosomes, and genomes. Genome Res 8:1–3 Seoighe C, Federspiel N, Jones T, Hansen N, Bivolarovic V, Surzycki R, Tamse R, Komp C, Huizar L, Davis RW, Scherer S, Tait E, Shaw DJ, Harris D, Murphy L, Oliver K, Taylor K, Rajandream MA, Barrell BG, Wolfe KH (2000) Prevalence of small inversions in yeast gene order evolution. Proc Natl Acad Sci USA 97:14433–14437 Sinervo B, Svensson E (2002) Correlational selection and the evolution of genomic architecture. Heredity 89:329–338 Smets B, Ghillebert R, De Snijder P, Binda M, Swinnen E, De Virgilio C, Winderickx J (2010) Life in the midst of scarcity: adaptations to nutrient availability in Saccharomyces cerevisiae. Curr Genet 56:1–32 Taddei F, Matic I, Radman M (1995) cAMP-dependent SOS induction and mutagenesis in resting bacterial populations. Proc Natl Acad Sci USA 92:11736–11740 Taddei F, Vulic M, Radman M, Matic I (1997) Genetic variability and adaptation to stress. EXS 83:271–290

3 Starvation-Induced Reproductive Isolation in Yeast

65

Ting CT, Tsaur SC, Wu ML, Wu CI (1998) A rapidly evolving homeobox at the site of a hybrid sterility gene. Science 282:1501–1504 Toczyski DP, Galgoczy DJ, Hartwell LH (1997) CDC5 and CKII control adaptation to the yeast DNA damage checkpoint. Cell 90:1097–1106 Turelli M, Barton NH, Coyne JA (2001) Theory and speciation. Trends Ecol Evol 16:330–343 Via S (2001) Sympatric speciation in animals: the ugly duckling grows up. Trends Ecol Evol 16:381–390 Vulic M, Lenski RE, Radman M (1999) Mutation, recombination, and incipient speciation of bacteria in the laboratory. Proc Natl Acad Sci USA 96:7348–7351 White MJD (1978) Modes of speciation. W.H. Freeman& Co, SanFrancisco Witkin EM (1976) Ultraviolet mutagenesis and inducible DNA repair in Escherichia coli. Bacteriol Rev 40:869–907 Witkin EM, Wermundsen IE (1979) Targeted and untargeted mutagenesis by various inducers of SOS functions in Escherichia coli. Cold Spring Harb Symp Quant Biol 43(Pt 2):881–886 Zinser ER, Kolter R (1999) Mutations enhancing amino acid catabolism confer a growth advantage in stationary phase. J Bacteriol 181:5800–5807

Chapter 4

Populations of RNA Molecules as Computational Model for Evolution

Michael Stich, Carlos Briones, Ester Lázaro, and Susanna C. Manrubia

Abstract We consider populations of RNA molecules as a computational model for molecular evolution. Based on a large body of previous work, we review some recent results. First, we study the sequence–structure map, its implications for the structural repertoire of a pool of random RNA sequences, and its relevance for the RNA world hypothesis of the origin of life. In a scenario where template replication is possible, we discuss the internal organization of evolving populations and its relationship with robustness and adaptability. Finally, we explore how the effect of the mutation rate on fitness changes depends on the degree of adaptation of an RNA population.

4.1 Introduction

Molecular evolution covers a huge area of research, ranging from prebiotic chemistry and questions on the origin of life, through many aspects related to the origin of and the relationships among species, the study of viral and bacterial evolution and their medical implications up to the artificial design and in vitro selection of molecules, with all their applications in nano- and biotechnology. In this chapter, we do not aim to give a complete overview of that wide research field, but focus on the use of populations of RNA molecules as a model to understand evolution of prebiotic replicators in the RNA world. As RNA viruses share many characteristics with primitive RNA molecules with replicative ability, these studies can also be used to tackle many aspects of viral evolution. Although a large body of our work is inspired by experiments, in this chapter we focus on theoretical approaches for understanding evolutionary processes.

M. Stich, C. Briones, E. Lázaro, and S.C. Manrubia: Dpto de Evolución Molecular, Centro de Astrobiología (CSIC-INTA), Ctra de Ajalvir, km 4, 28850 Torrejón de Ardoz (Madrid), Spain

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_4, © Springer-Verlag Berlin Heidelberg 2010


RNA molecules are a very well suited model for studying evolution because they incorporate, in a single molecular entity, both genotype and phenotype. While errors in the replication process introduce mutations in the RNA sequence (genotype), selection acts upon the function (phenotype) of the molecule. Since in many cases the spatial structure of the molecule is crucial for its biochemical function, the structure of an RNA molecule can be considered as a minimal representation of the phenotype. In current biology, RNA viruses are the paradigmatic example for evolving populations: replication is fast, it takes place with a relatively high error rate, and population sizes are large. This has made RNA viruses an often used example for quasispecies, a concept originally proposed by Eigen (1971) and developed over the last decades in the context of virology (Domingo 2006). It states that a population of replicators, e.g., an RNA virus evolving within an infected host, cannot be represented by only one, fittest, genome, but by the spectrum of related mutants that are present in the population. The quasispecies evolves under a certain error (mutation) rate and the cloud of mutants enables the population to adapt quickly to new environmental situations, such as population bottlenecks and changed selective pressures. Under constant external conditions, a quasispecies approaches a dynamic equilibrium between selection of favorable sequences (what we mean by favorable, will be specified below) and the diversity constantly introduced by mutation. Therefore, the mutation rate is of crucial importance in the study of such heterogeneous populations in molecular evolution (Huynen et al. 1996; Biebricher and Eigen 2005): if the mutation rate becomes too large, selection becomes inefficient, the correlations between the genomes within the population decay, and the whole population may even become extinct. 
There are many reported examples of the extinction of RNA virus populations when replication takes place at increased error rates due to the presence of mutagenic agents (Sierra et al. 2000; Domingo 2005; Cases-González et al. 2008). These results have inspired a promising new antiviral strategy named lethal mutagenesis (Loeb et al. 1999).

Another field of research within molecular evolution is the quest for understanding the origin and early evolution of life. One of the most appealing theories in this context is the so-called RNA world hypothesis. It is based on the facts that RNA can not only encode genetic information, like DNA in present-day cells, but can also act as a catalyst of biochemical reactions, like present-day enzymes. Therefore, a single RNA molecule could have been endowed with the two main features of living matter, providing the genome (i.e., the blueprint for replication) and the primordial machinery for replication and metabolism. One of the open questions in this context is how the first template-dependent RNA polymerase ribozyme could have emerged. Experimentally, a minimum size of approximately 165 nucleotides has been established for such a molecule (Johnston et al. 1999; Joyce 2004), a length three to four times that of the longest RNA oligomers obtained by random polymerization (Huang and Ferris 2003, 2006). Hence, one of the main challenges within the RNA world scenario is to convincingly bridge this gap.

In this chapter, we will review some recent results obtained in our lab (Manrubia and Briones 2007; Stich et al. 2007, 2008, 2010; Briones et al. 2009) and put them into the context of the aforementioned issues. The first part of this chapter tries to deepen our understanding of the sequence–structure map, relevant for the RNA world model. Then, we discuss the internal organization of evolving populations and its relevance for robustness and adaptability. Subsequently, we explore the relationship between microscopic mutation rate and the fractions of beneficial and deleterious mutations, as observed in experiments or used in phenomenological models.

4.2 Structural Repertoire of RNA Pools

RNA structure is crucial for the biochemical function of an RNA molecule. Much research effort is dedicated to the folding process that relates RNA sequences to RNA structures. For our purpose, it is sufficient to consider two-dimensional secondary structures as a good approximation of real three-dimensional structures. Two fundamental properties of the sequence–structure map are that (1) the number of different sequences is much higher than the number of structures and (2) not all possible structures are equally probable (Fontana et al. 1993; Schuster et al. 1994). In this context, common structures are those which have many different sequences folding into them and rare structures are those which have only few sequences folding into them.

In this section, we explore the structural repertoire of a pool of random sequences. We first describe the results of the folding of $10^8$ RNA molecules of length 35 nt consisting of random sequences composed of the four types of nucleotides A, C, G, and U (Stich et al. 2008). As the secondary structure of each molecule, we take the minimum free energy structure as given by the fold() routine from the Vienna RNA Package (Hofacker et al. 1994). RNA secondary structures consist of stems, where base pairing (A–U, G–C, G–U) between nucleotides occurs, and unpaired regions. In standard bracket notation, nucleotides paired with each other are denoted by "(" and ")", while unpaired nucleotides are represented by ".". Among unpaired regions, we can distinguish dangling ends and different kinds of loops: hairpin loops, bulges, interior loops, and multiloops. The simplest structure is called a stem–loop; it consists of one hairpin loop and one stem, and possibly one or two dangling ends.

While there are $4^n$ sequences of length $n$ (the so-called sequence space), the number $S_n$ of different structures (the structure space) is much smaller. Based on theoretical studies (Waterman 1978), the expression $S_n \approx 0.7131 \times n^{-3/2} (2.2888)^n$ has been given (Grüner et al. 1996). Therefore, different sequences will actually fold into the same secondary structure, grouping into neutral networks of genomes (Grüner et al. 1996; Huynen et al. 1996). Neutral networks are formed by genomes sharing the same phenotype, here secondary structure, and which are connected by (single) mutational events. The sequence–structure map turns out to be very complex. Two sequences that are just one mutation apart may fold into structures very different from each other. At the same time, in a relatively small neighborhood of any sequence, almost all common structures can be found (Fontana et al. 1993).
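To make the mismatch between sequence space and structure space concrete, the following minimal Python sketch evaluates both expressions quoted above for the chain length used in this section. The function names are our own illustrative choices, and the structure count is only the asymptotic estimate, not an exact enumeration.

```python
def sequence_space_size(n):
    """Number of RNA sequences of length n over the alphabet A, C, G, U."""
    return 4 ** n


def structure_space_estimate(n):
    """Asymptotic estimate S_n ~ 0.7131 * n**(-3/2) * 2.2888**n from the text."""
    return 0.7131 * n ** (-1.5) * 2.2888 ** n


n = 35  # chain length of the random pool discussed in this section
sequences = sequence_space_size(n)
structures = structure_space_estimate(n)

print(f"sequences of length {n}:  {sequences:.3e}")
print(f"estimated structures:     {structures:.3e}")
print(f"sequences per structure:  {sequences / structures:.3e}")
```

For $n = 35$ this gives roughly $10^{21}$ sequences against roughly $10^{10}$ structures, so on average an enormous number of sequences must share each structure.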


In our case, $10^8$ sequences folded into 5,163,324 structures (Stich et al. 2008). A way to visualize the uneven distribution of sequences into structures is the frequency–rank diagram. In Fig. 4.1a, we have ranked the structures according to the number of sequences folding into them. One can see that there are around a thousand common structures, each of them obtained from about $10^4$ different sequences. On the other hand, we also find a few million rare structures yielded by only one or two sequences. Similar behavior, although for much smaller pools, has been reported before (Schuster et al. 1994; Grüner et al. 1996; Schuster and Stadler 1994; Tacker et al. 1996).

In order to study the distribution of common vs. rare structures in more detail, we have proposed a classification where we characterize a structure in terms of three numbers (Stich et al. 2008): (a) the number of hairpin loops, $H$, (b) the sum of bulges and interior loops, $I$, and (c) the number of multiloops, $M$. For example, a simple stem–loop structure, denoted as SL, is characterized by $(H,I,M) = (1,0,0)$, and all stem–loop structures found in the pool are grouped into that structure family. Other important families are the hairpin structure family, HP, with one interior loop or bulge $(1,1,0)$, the double stem–loop, DSL, represented by $(2,0,0)$, and the simple hammerhead structure, HH, by $(2,0,1)$. Of course, there exist more complicated structure families, as detailed in Stich et al. (2008). For the pool that we have folded, we find that only 21 structure families are enough to cover all the 5.2 million structures identified. Our analysis, displayed in Fig. 4.1b, shows that the vast majority of sequences fold into simple structure families. For example, 79.0% of all sequences belong to only three structure families (HP, HP2, SL, in decreasing abundance), and 92.1% of all sequences fold into simple structures with at most 3 stems (HP, HP2, SL, DSL, DSL2, HH).
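The $(H,I,M)$ classification can be illustrated with a short Python sketch that counts loop types directly from a structure in bracket notation. This is our own minimal reading of the definitions given above, not the code used in Stich et al. (2008); in particular, we assume that the exterior loop is not counted and that a base pair stacked directly on its only enclosed pair closes no loop.

```python
def pair_table(structure):
    """Map each opening position to its closing position in a dot-bracket string."""
    stack, pairs = [], {}
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs[stack.pop()] = pos
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def classify(structure):
    """Return (H, I, M): hairpin loops, bulges plus interior loops, multiloops."""
    pairs = pair_table(structure)
    hairpins = interiors = multiloops = 0
    for i, j in pairs.items():
        # Collect the base pairs directly enclosed by the pair (i, j).
        enclosed, k = [], i + 1
        while k < j:
            if k in pairs:
                enclosed.append((k, pairs[k]))
                k = pairs[k] + 1
            else:
                k += 1
        if not enclosed:
            hairpins += 1
        elif len(enclosed) == 1:
            # A directly stacked pair continues the stem and closes no loop.
            if enclosed[0] != (i + 1, j - 1):
                interiors += 1
        else:
            multiloops += 1
    return hairpins, interiors, multiloops


print(classify("((((....))))"))              # stem-loop (SL)
print(classify("((...))..((...))"))          # double stem-loop (DSL)
print(classify("((..((...))..((...))..))"))  # simple hammerhead (HH)
```

A fully unpaired (open) chain yields $(0,0,0)$ under this scheme.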
Note that 2.1% of all sequences remain open and do not fold. Our data are in agreement with other findings on the structural repertoire of RNA sequence pools where the influence of the sequence length (Sabeti et al. 1997; Gevertz et al. 2005), the nucleotide composition (Knight et al. 2005; Kim et al. 2007), and pool size (Gevertz et al. 2005) has been studied.

Fig. 4.1 (a) Frequency–rank diagram of the 5,163,324 different secondary structures, obtained by folding $10^8$ RNA sequences of length 35 nt. (b) Distribution of the sequences in structure families according to their frequency. Higher-order hairpins, HPx, are defined as $(H,I,M) = (1,x,0)$ with $x \geq 2$, higher-order double stem–loops, DSLx, as $(H,I,M) = (2,x-1,0)$, and higher-order hammerheads, HHx, as $(2,x-1,1)$. (c) Frequency–rank diagram according to the structural family (binned absolute frequency vs. rank). The upper thick solid curve denotes the same curve as in (a). Parts (a) and (c) after Stich et al. (2008)

Now, we can reconsider the frequency–rank diagram. We sum up all structures of a given structure family within a rank interval. Through this binning procedure, we obtain for each structure family a curve which describes its relative frequency compared with that of the other families. The curves for the most frequent families are shown in Fig. 4.1c. We immediately see that the most frequent structures belong to the stem–loop family, followed by the hairpin family, double stem–loops, higher-order hairpin families, and hammerheads. For low ranks, the SL curve is identical with the curve describing all structures. For ranks between $4 \times 10^3$ and $10^4$, it is the HP curve which practically coincides with the total curve. Interestingly, the position of the bump around rank $10^3$ coincides with the location where the SL and HP families are equally present. Hence, we conclude that the bumps in the frequency–rank diagram correspond to the succession of different structural families and are not smoothed by better sampling of the sequence space.

What implications do these findings have for the RNA world scenario? The standard view of the RNA world hypothesis states that the first chains of polymerized polynucleotides consisted of random sequences. Therefore, it is important to study the structural and subsequently the functional repertoire of such short sequences. We have seen that a random pool is very rich in simple structures. However, as already mentioned above, short molecules cannot perform template-dependent replication. Therefore, we devised a four-step model of modular evolution as a possible pathway for the emergence of functional and progressively longer molecules starting with a random pool of RNA oligomers (Briones et al. 2009).
The first step is the random polymerization of RNA molecules up to 40-mers. The second step is the folding of these sequences, leading to high fractions of simple structures like hairpins, as just shown. The third step is based on the observation that simple hairpin structures, similar to those formed by short random sequences in huge amounts, are actually known to show catalytic activity, leading to RNA–RNA ligation (Puerta-Fernández et al. 2003). If a certain fraction of the hairpin molecules thus formed is capable of displaying ligase activity, longer molecules may be formed. Even though the majority of the long molecules may not show ligase activity, some of them will keep the modular structure of their building blocks and remain active to catalyze further RNA–RNA ligations (Manrubia and Briones 2007). This suggests that hairpin ribozymes, both in individual modules and in combined structures, could have catalyzed the synthesis of progressively longer RNA molecules from short and structurally simpler modules (Briones et al. 2009). Finally, the fourth step of the model consists of a maturation of these ligating RNA molecules of intermediate length into self-replicating RNA ligase networks, which could coexist and even compete with each other, leading eventually to a molecule long and complex enough to perform template-dependent RNA replication [further details in Briones et al. (2009)]. It is important to emphasize that the whole model relies strongly on the observation that simple structures like hairpins – with potential ligase activity – are ubiquitous in pools of random RNA sequences.

4.3 Internal Organization of Evolving Populations

Above, we have discussed the static picture of the sequence–structure map. Once replication within a population is possible, evolution through Darwinian selection is triggered. Here, RNA serves as a model to study the interplay between mutation, selection, and the diversity sustained in populations of fast mutating replicators (Stich et al. 2007).

First, we briefly describe the evolutionary algorithm. Our system consists of a population of $N$ replicating RNA sequences, each of length $n$ nucleotides. At the beginning of the simulation, every molecule is initialized with a random sequence. Every time that a sequence replicates, each of its nucleotides has a probability $\mu$ (mutation rate) to be replaced by another nucleotide, randomly chosen among the four possibilities A, C, G, U. At each generation, the sequences are folded into secondary structures as described above. We define a target structure that represents in a simple way optimal performance in a given environment. It can be a hairpin, hammerhead, or any other structure: the qualitative behavior of the system does not depend on this choice. We compare every folded structure with the target structure by means of the base pair distance $d_i$, defined as the number of base pairs that have to be opened and closed to transform a given structure into the target structure (Hofacker et al. 1994). The closer a secondary structure is to the target structure, the higher the probability $p(d_i)$ that the corresponding sequence $i$ replicates:

$$p(d_i) = \frac{\exp(-\beta d_i)}{\sum_{j=1}^{N} \exp(-\beta d_j)}. \qquad (4.1)$$

The parameter $\beta$ denotes the selective pressure and is here chosen as $\beta = 2/n$. Generations in our simulations are nonoverlapping and the offspring generation is calculated according to Wright–Fisher sampling.

Two relevant quantities to characterize the state of the population are the average distance $d = \sum_{i=1}^{N} d_i / N$ to the target structure and the fraction $\rho$ of structures in the population folding exactly into the target structure. Because of the persisting action of mutation, both quantities fluctuate in time even after reaching the asymptotic regime. Therefore, we perform averages over long time intervals (and different realizations, starting from distinct initial RNA populations), obtaining mean values denoted by $\bar{d}$ and $\bar{\rho}$, respectively.

In order to quantify collective properties of the molecular ensemble, we first determine the consensus sequence of the population, given by, for each position along the sequence, the most frequent type of nucleotide found within the population. In real RNA molecular and viral quasispecies, the consensus sequence is obtained by means of population sequencing (Thurner et al. 2004; Simmonds et al. 2004; Domingo 2006), and it does not necessarily correspond to any of the individual sequences present in the population. It is straightforward to fold the consensus sequence and obtain the structure of the consensus sequence, whose coincidence with the target structure can then be determined. At each time step we count either one, corresponding to coincidence, or zero otherwise. Averages over time (and realizations) of this binary variable yield $\bar{\rho}_C$, which corresponds to the probability that, at a randomly chosen time step, the structure of the consensus sequence coincides with the target structure.

We further define a consensus structure. It is calculated by determining, for each position along the molecule, the most frequent structural state found within the population, i.e., unpaired ".", paired upstream "(", or paired downstream ")". Due to this definition, the consensus structure does not necessarily represent a valid secondary structure of an RNA molecule. This procedure is hence fundamentally different from assigning a consensus structure to an alignment of sequences (Hofacker et al. 2002). Averages over time (and realizations) of the coincidence between the consensus structure and the target structure yield the probability $\bar{\rho}_S$.

Within this model, evolution takes place in the following way: sequences which fold into structures similar to the target structure are more likely to replicate and their fraction in the population increases. Mutation introduces diversity and enables the system to find structures that are closer to the target, and finally to find and fix the target structure. Starting from a random set of sequences, we can distinguish several phases of evolution. The first is the search phase, where $d$ decreases while $\rho = 0$. This phase finishes at generation $g_A$ when a molecule folds into the target structure for the first time. Then, the phase of fixation begins, where, on average, $d$ still decreases and $\rho$ increases. However, due to the stochastic nature of mutation, and hence in particular for large mutation rates as will be explored further below, the population may lose the target structure again (and $\rho$ drops down to zero).
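One generation of the evolutionary algorithm described in this section (mutation, selection with exponentially decaying replication probability, and Wright–Fisher sampling) might be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the simulation code behind the results reported here: all function names are our own, the folding routine is passed in as a hypothetical callable `fold` (for instance a wrapper around a secondary-structure predictor), and a mutated nucleotide is drawn uniformly from all four letters.

```python
import math
import random

NUCLEOTIDES = "ACGU"


def base_pairs(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def base_pair_distance(structure_a, structure_b):
    """Number of base pairs to open or close to turn one structure into the other."""
    return len(base_pairs(structure_a) ^ base_pairs(structure_b))


def replication_probabilities(distances, beta):
    """Weight exp(-beta * d_i) for each molecule, normalized over the population."""
    weights = [math.exp(-beta * d) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def mutate(sequence, mu, rng):
    """Copy a sequence, redrawing each nucleotide with probability mu.

    Assumption: the new nucleotide is drawn uniformly from all four letters,
    so it may coincide with the old one.
    """
    return "".join(
        rng.choice(NUCLEOTIDES) if rng.random() < mu else nucleotide
        for nucleotide in sequence
    )


def next_generation(population, target, fold, mu, rng):
    """One nonoverlapping generation with Wright-Fisher sampling.

    `fold` is a user-supplied callable (hypothetical here) that maps a
    sequence to its secondary structure in dot-bracket notation.
    """
    beta = 2.0 / len(target)  # selective pressure, beta = 2/n
    distances = [base_pair_distance(fold(seq), target) for seq in population]
    probabilities = replication_probabilities(distances, beta)
    parents = rng.choices(population, weights=probabilities, k=len(population))
    return [mutate(parent, mu, rng) for parent in parents]


# Replication probabilities of three molecules at distances 0, 5, and 10
# from a target of length n = 30.
example = replication_probabilities([0, 5, 10], beta=2.0 / 30)
print([round(p, 3) for p in example])
```

Because $\beta = 2/n$ is small, the replication probabilities differ only moderately between molecules, so selection acts gradually over many generations rather than removing all suboptimal structures at once.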
If $\rho$ does not drop to zero for 500 consecutive generations, we say that the target structure has been fixed at generation $g_F$. Then, the asymptotic regime is reached, where $d$ and $\rho$ fluctuate around constant values and which corresponds to a mutation–selection equilibrium. If the mutation rate $\mu$ is too large, the population is unable to maintain the target structure within the population. In the absence of an analytic theory for the system we are studying, we determine the fixation threshold as the value $\mu_F$ at which the curve $g_F(\mu)$ diverges.

Having defined the main quantities to describe the population, we show the results in Fig. 4.2. They were obtained from simulations for a system of $N = 1{,}000$ RNA molecules of length $n = 30$ nt evolving toward a hairpin structure. In (a) we show the curves for $\bar{\rho}$, $\bar{\rho}_C$, and $\bar{\rho}_S$. The quantity $\bar{\rho}$ describes the fundamental property of a quasispecies at mutation–selection equilibrium. For small $\mu$, $\bar{\rho}$ takes maximal values. This means that a population contains the largest fraction of correctly folded molecules if it evolves at small mutation rates. As $\mu$ increases, $\bar{\rho}$ decreases monotonically until it approaches zero.

To determine the fixation threshold, we look at Fig. 4.2b where we show the curves of the search time and search plus fixation time. The solid curve represents the search time. We observe that for small $\mu$ finding the target structure is difficult because only little diversity is introduced and the search process is slow. Therefore, fixation takes a long time. As $\mu$ increases, the introduced diversity in the population becomes larger and both search and search plus fixation times decrease. However, fixation turns out to be a difficult task if $\mu$ is too large, and the curves for search and search plus fixation start to deviate. The search plus fixation time $g_F$ (dotted curve) diverges around $\mu \approx 0.045$, where we approximately locate the fixation threshold for this $n$ and target structure. This means that while the population shows the largest $\bar{\rho}$ for small $\mu$ and the highest degree of diversity close to the fixation threshold, the search and fixation times are optimized for intermediate mutation rates around $\mu \approx 0.025$, well below the fixation threshold.

Fig. 4.2 (a) Asymptotic properties of a population of size $N = 1{,}000$ and molecules of length $n = 30$ nt as function of the mutation rate $\mu$. Displayed are the average fraction of correctly folded structures $\bar{\rho}$, and the quantities $\bar{\rho}_C$ and $\bar{\rho}_S$. Averaging has been performed over 4,000 generations and 20 realizations, disregarding the first 2,000 generations. (b) Search time $g_A$ and search plus fixation time $g_F$. We locate the fixation threshold where $g_F$ diverges. Averaging has been performed over 200 realizations. The population evolves toward a hairpin target structure given by ..((((((...(((...)))....)))))) in bracket notation

Coming back to Fig. 4.2a, we now have a look at the curves for $\bar{\rho}_C$ and $\bar{\rho}_S$. The curve of $\bar{\rho}_C$ lies for all considered mutation rates above the curve of $\bar{\rho}$. This means that based upon the information of the consensus sequence only, one may overestimate the evolutionary success. This effect is observed both below and above the fixation threshold. For example, for $\mu = 0.05$, where only 0.5% of the sequences fold into the target structure, and only in an intermittent way, the probability that the consensus sequence folds into the target structure is still 18%. Consequently, the population remains close to sequences that actually fold into the target structure although it is unable to fix it. Obviously, this is related to the fact that at least part of the population are descendants of the same sequence and hence are closely related to each other. Note that the probability that a sequence of the population folds into the target structure is different from the probability that the consensus sequence does. Since consensus sequences are readily obtained from molecular or viral quasispecies, one should take this difference into account.

Considering now the curve for $\bar{\rho}_S$, we observe a qualitatively different behavior: for $\mu < 0.025$, the probability that the consensus structure coincides with the target structure is practically one, while for $\mu > 0.025$, it approaches zero.
For small $\mu$, this effect can be easily explained: the weight of all the correctly folded molecules is strong enough to keep $\bar{\rho}_S$ high. But in Stich et al. (2007), we showed that even neglecting the correctly folded molecules and for large mutation rates, among the remaining sequences there is a sufficiently large fraction of those molecules which have a similar structure to the target structure. An analogous effect is known for random sequences: in a small neighborhood of a given sequence, the most probable structures are identical or very similar to the structure of the reference sequence (Fontana et al. 1993). Even where $\rho_S = 0$, the distribution of the structure states along the chain may still resemble the target structure and the positions where the concordance is broken correspond to positions that are actually less stable.

While $\bar{\rho}_C$ senses the similarity among the sequences and $\bar{\rho}_S$ the similarity among the structures, both quantities take higher values than $\bar{\rho}$ for most of the mutation rates in spite of the fact that selection is actually acting upon structure (not sequence) and that the corresponding fitness landscape is rough. This means that the population retains relevant structural information in a distributed fashion even above the fixation threshold. This represents a strong structural robustness and suggests that certain functional RNA secondary structures may effectively withstand high mutation rates (Stich et al. 2007).
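The two collective quantities used in this section, the consensus sequence and the consensus structure, are both position-wise majority votes and can be sketched with the same helper. The following Python sketch uses our own function names and toy populations; ties are resolved in favor of the symbol encountered first, which is an arbitrary choice of ours.

```python
from collections import Counter


def consensus(strings):
    """Most frequent symbol at each position across equally long strings."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*strings))


def is_valid_structure(structure):
    """Check that every bracket in a dot-bracket string has a partner."""
    depth = 0
    for symbol in structure:
        if symbol == "(":
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


# Consensus sequence: need not coincide with any sequence in the population.
sequences = ["AAGG", "ACGU", "UCGG"]
consensus_sequence = consensus(sequences)
print(consensus_sequence, consensus_sequence in sequences)

# Consensus structure: need not be a valid secondary structure.
structures = ["(..)....", "(....)..", "(......)"]
consensus_structure = consensus(structures)
print(consensus_structure, is_valid_structure(consensus_structure))
```

In the second toy population all three molecules open a base pair at the first position but close it at different positions, so the majority vote keeps the opening bracket and loses its partner.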

4.4 Phenotypic Effect of Mutations

In the last section, we have already discussed the optimal mutation rate to promote adaptation in an evolving system. Here, we calculate the distribution of the effects of mutations on fitness and the relative fractions of beneficial and deleterious mutations (Stich et al. 2010). It is important to recall that the effect of mutations on the phenotype depends on the genomic and populational context. We explore two different situations: the mutation–selection equilibrium (equilibrated population) and the first stages of the adaptation process (adapting population). Here, we consider a population of $N = 1{,}000$ molecules of length $n = 50$ nt evolving toward a hairpin target structure. The change in fitness of an RNA sequence under replication is quantified by the change of distance to the target structure, i.e., by $\Delta_{ij} = d_i - d_j$, where $i$ denotes the mother and $j$ the daughter sequence. Hence, for $\Delta_{ij} > 0$ ($\Delta_{ij} < 0$), the mutations lead to an increase (decrease) of fitness and hence are beneficial (deleterious). If $\Delta_{ij} = 0$, either no mutation occurred or the mutations had no effect on fitness (were neutral). As we sum up over $N$ values of $\Delta_{ij}$ at each generation (and over generations and realizations as specified below), we obtain a probability distribution $\Pi(\Delta)$ of the changes in fitness.

In Fig. 4.3a, we show for three different mutation rates the distributions $\Pi(\Delta)$, obtained for populations at mutation–selection equilibrium. The part of the distribution with the largest weight represents replication events with no or neutral mutations ($\Delta = 0$). For a very low mutation rate, negative fitness events strongly dominate over the positive ones and hence beneficial mutations are rare. As the mutation rate increases, the curves move up for positive and negative $\Delta$ since there are more mutation events. Although beneficial mutations in particular occur more often, negative fitness effects still dominate in absolute numbers.

[Figure 4.3, panels a–d. Panels a and c: Π(Δ) on a logarithmic vertical axis, horizontal axis from −30 to 30, curves for μ = 5×10⁻⁴, μ = 1×10⁻², and μ = 4×10⁻². Panels b and d: p, q, and Π(0) on a logarithmic vertical axis versus μ on a logarithmic horizontal axis, with the fixation threshold μ_F marked.]

Fig. 4.3 Phenotypic changes of mutations for optimized (a, b) and adapting (c, d) populations. (a) Probability distribution Π(Δ) obtained from 300 generations in the asymptotic regime and for three different values of μ. (b) Beneficial (q) and deleterious (p) phenotypic mutation rates as function of the microscopic mutation rate μ for optimized populations. Replication events without fitness change are given by Π(0). (c) As (a), but for adapting populations (probability distributions obtained from the first 50 generations and 6 different realizations). (d) As (b), but for adapting populations. The thin curves denote the curves from (b). The target structure is ((((.......(((((.(((((......))))).))))).......)))) in bracket notation. After Stich et al. (2010)

From the distribution Π(Δ) we can calculate the fraction of beneficial changes q and deleterious changes p in the following way:

$$ q = \int_{0}^{\infty} \Pi(\Delta)\, d\Delta, \tag{4.2} $$

$$ p = \int_{-\infty}^{0} \Pi(\Delta)\, d\Delta. \tag{4.3} $$

The point Δ = 0 is excluded from both integrals; its weight is Π(0). These quantities represent the beneficial and deleterious phenotypic mutation rates, which should not be confused with the microscopic mutation rate μ. By definition, p + q + Π(0) = 1.
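As a minimal sketch of how p, q, and Π(0) follow from recorded fitness changes, the snippet below builds an empirical distribution from a list of Δ values and splits its weight by sign. The function name `phenotypic_rates` and the sample data are hypothetical and invented for illustration; they are not taken from the simulations of Stich et al. (2010). Because Δ takes discrete values here, the integrals reduce to sums.

```python
from collections import Counter


def phenotypic_rates(deltas):
    """Return (p, q, pi0, distribution) for a list of fitness changes.

    Each entry is one replication event with delta = d_mother - d_daughter,
    so delta > 0 is beneficial and delta < 0 is deleterious.
    """
    if not deltas:
        raise ValueError("need at least one replication event")
    total = len(deltas)
    # Empirical probability of each observed delta value
    dist = {d: c / total for d, c in sorted(Counter(deltas).items())}
    q = sum(w for d, w in dist.items() if d > 0)   # beneficial weight
    p = sum(w for d, w in dist.items() if d < 0)   # deleterious weight
    pi0 = dist.get(0, 0.0)                         # no change in fitness
    return p, q, pi0, dist


# Hypothetical replication events, invented for illustration only
example = [0, 0, 0, 0, 0, 0, -1, -3, -1, 2]
p, q, pi0, dist = phenotypic_rates(example)
print(f"p = {p:.2f}, q = {q:.2f}, Pi(0) = {pi0:.2f}, sum = {p + q + pi0:.2f}")
```

In a simulation, the list of Δ values would be accumulated over all replication events in the generations and realizations of interest before calling the function.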


How q and p depend on μ is depicted in Fig. 4.3b. For low mutation rates, p is more than two orders of magnitude larger than q. As μ increases, both p and q increase, although p > q for all μ, in particular for mutation rates below the fixation threshold, which for this n and target structure is located at approximately μ_F = 0.02. As μ increases, the fraction of replication events with no change in fitness, given by Π(0), decreases. The ratio p/q decreases from more than two orders of magnitude to less than one order of magnitude close to μ_F. This reflects the fact that the higher the mutation rate at which a population has reached mutation–selection equilibrium, the lower the fraction of correctly folded molecules, and hence the more probable beneficial mutations become. However, these beneficial mutations do not increase the degree of adaptation of the population, because they are difficult to fix at high error rates.

In Fig. 4.3c,d, we show the distribution Π(Δ) and the functional behavior of (p, q) = f(μ) for adapting populations. In this case, fitness changes are measured before the target structure has been found. The distributions Π(Δ) behave in a qualitatively similar way, although quantitative differences to Fig. 4.3a can be seen, e.g., for μ = 0.0005: the range of negative Δ is smaller than for an equilibrated population, so very deleterious mutations are not present, and the overall level of deleterious mutations is also lower. At the same time, beneficial mutations are more common. This observation can be explained by the fact that, since the population is still relatively far from the target, mutations that drive a sequence even further away are less likely. For the same reason, mutations that have a positive effect on fitness are more probable. Figure 4.3d summarizes the results: in an adapting population, p is smaller than at equilibrium and q is larger, although these differences become much smaller as the error rate increases. However, in all cases there are still more deleterious mutations than beneficial ones. Again, both phenotypic mutation rates increase as μ increases, while replication events without phenotypic change decrease.

4.5

Summary

Here, we have presented recent results with RNA populations as a computational model to explore and understand evolutionary processes, using the complex underlying sequence–structure–function relationship of RNA molecules. In the first section, we showed some observations on the structural repertoire of random RNA sequences (Stich et al. 2008). One important result is that simple structures like stem–loops and hairpins are dominant in pools of short sequences. This finding, together with other results and arguments, allowed us to devise a stepwise model of modular evolution for the origin of the RNA world (Briones et al. 2009).

In the second section, we introduced an algorithm of RNA evolution in silico (Stich et al. 2007). After characterizing the asymptotic state of the population (at mutation–selection equilibrium), we showed that search and fixation times are optimized for intermediate mutation rates, far from the fixation threshold where the creation of diversity is maximal and far from the regime of low mutation rates where evolutionary success is optimized (in terms of correctly folded molecules). These results have important implications for the adaptability of virus and replicator populations that, due to the changes in the selective pressures that they continuously experience, need to have the capability to adapt rapidly, which can be obtained by the selection of high mutation rates. However, the difficulties for the fixation of beneficial mutations, together with the low fitness values attained when replication takes place at mutation rates close to the error threshold, suggest that viral quasispecies operate at mutation rates considerably below that threshold.

Furthermore, close to and even beyond the fixation threshold, RNA populations show clear signatures of the target structure they try to approach (Stich et al. 2007). For example, even a population that contains practically no molecule that folds into the correct structure may, as a whole, harbor the target structure as the structure of its consensus sequence. This demonstrates that the evolutionary success of the population is more robust than suggested by the spectrum of its mutants alone.

Finally, we have established a connection between the microscopic mutation rate μ and the phenotypic mutation rates p and q (Stich et al. 2010). These mutation rates are used in phenomenological models of population dynamics and also in fitting models of data obtained from experiments (Eyre-Walker and Keightley 2007). We find that adapting populations have a much larger fraction of beneficial mutations than equilibrated ones, especially for small mutation rates. Furthermore, we have shown that increases in μ do not cause linearly proportional increases in p and q, as often assumed in simple models of population evolution. In summary, our results encourage the combined approach of experimental research and computational modeling for studying molecular evolution.
Acknowledgments The authors acknowledge support from Spanish MICINN through projects FIS2008-05273 and BIO2007-67523, from INTA, and from Comunidad Autónoma de Madrid, project MODELICO (S2009/ESP-1691).

References

Biebricher CK, Eigen M (2005) The error threshold. Virus Res 107:117–127
Briones C, Stich M, Manrubia SC (2009) The dawn of the RNA world: toward functional complexity through ligation of random RNA oligomers. RNA 15:743–749
Cases-González C, Arribas M, Domingo E, Lázaro E (2008) Beneficial effects of population bottlenecks in an RNA virus evolving at increased error rate. J Mol Biol 384:1120–1129
Domingo E (ed) (2005) Virus entry into error catastrophe as a new antiviral strategy. Virus Res 107:115–228
Domingo E (ed) (2006) Quasispecies: concept and implications for virology. Springer, Berlin
Eigen M (1971) Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften 58:465–523
Eyre-Walker A, Keightley PD (2007) The distribution of fitness effects of new mutations. Nat Rev Genet 8:610–618
Fontana W, Konings DAM, Stadler PF, Schuster P (1993) Statistics of RNA secondary structures. Biopolymers 33:1389–1404
Gevertz J, Gan HH, Schlick T (2005) In vitro RNA random pools are not structurally diverse: a computational analysis. RNA 11:853–863
Grüner W, Giegerich R, Strothmann D, Reidys C, Weber J, Hofacker IL, Stadler PF, Schuster P (1996) Analysis of RNA sequence structure maps by exhaustive enumeration. I. Neutral networks. Monatsh Chem 127:355–374
Hofacker IL, Fontana W, Stadler PF, Bonhoeffer LS, Tacker M, Schuster P (1994) Fast folding and comparison of RNA secondary structures. Monatsh Chem 125:167–188
Hofacker IL, Fekete M, Stadler PF (2002) Secondary structure prediction for aligned RNA sequences. J Mol Biol 319:1059–1066
Huang W, Ferris JP (2003) Synthesis of 35–40 mers of RNA oligomers from unblocked monomers. A simple approach to the RNA world. Chem Commun 12:1458–1459
Huang W, Ferris JP (2006) One-step, regioselective synthesis of up to 50-mers of RNA oligomers by montmorillonite catalysis. J Am Chem Soc 128:8914–8919
Huynen MA, Stadler PF, Fontana W (1996) Smoothness within ruggedness: the role of neutrality in adaptation. Proc Natl Acad Sci USA 93:397–401
Johnston WK, Unrau PJ, Lawrence MS, Glasner ME, Bartel DP (1999) RNA-catalyzed RNA polymerization: accurate and general RNA-templated primer extension. Science 292:1319–1325
Joyce GF (2004) Directed evolution of nucleic acid enzymes. Annu Rev Biochem 73:791–836
Kim N, Gan HH, Schlick T (2007) A computational proposal for designing structured RNA pools for in vitro selection of RNAs. RNA 13:478–492
Knight R, De Sterck H, Markel R, Smit S, Oshmyansky A, Yarus M (2005) Abundance of correctly folded RNA motifs in sequence space, calculated on computational grids. Nucleic Acids Res 33:5924–5935
Loeb LA, Essigmann JM, Kazazi F, Zhang J, Rose KD, Mullins JI (1999) Lethal mutagenesis of HIV with mutagenic nucleoside analogs. Proc Natl Acad Sci USA 96:1492–1497
Manrubia SC, Briones C (2007) Modular evolution and increase of functional complexity in replicating RNA molecules. RNA 13:97–107
Puerta-Fernández E, Romero-López C, Barroso-delJesús A, Berzal-Herranz A (2003) Ribozymes: recent advances in the development of RNA tools. FEMS Microbiol Rev 27:75–97
Sabeti PC, Unrau PJ, Bartel DP (1997) Accessing rare activities from random RNA sequences: the importance of the length of molecules in the starting pool. Chem Biol 4:767–774
Schuster P, Stadler PF (1994) Landscapes: complex optimization problems and biopolymer structures. Comput Chem 18:295–324
Schuster P, Fontana W, Stadler PF, Hofacker IL (1994) From sequences to shapes and back: a case study in RNA secondary structures. Proc R Soc Lond B Biol Sci 255:279–284
Sierra S, Dávila M, Lowenstein PR, Domingo E (2000) Response of foot-and-mouth disease virus to increased mutagenesis. J Virol 74:8316–8323
Simmonds P, Tuplin A, Evans DJ (2004) Detection of genome-scale ordered RNA structure (GORS) in genomes of positive-stranded RNA viruses: implication for virus evolution and host persistence. RNA 10:1337–1351
Stich M, Briones C, Manrubia SC (2007) Collective properties of evolving molecular quasispecies. BMC Evol Biol 7:110
Stich M, Briones C, Manrubia SC (2008) On the structural repertoire of pools of short, random RNA sequences. J Theor Biol 252:750–763
Stich M, Lázaro E, Manrubia SC (2010) Phenotypic effect of mutations in evolving populations of RNA molecules. BMC Evol Biol 10:46
Tacker M, Stadler PF, Bornberg-Bauer EG, Hofacker IL, Schuster P (1996) Algorithm independent properties of RNA secondary structure predictions. Eur Biophys J 25:115–130
Thurner C, Witwer C, Hofacker IL, Stadler PF (2004) Conserved RNA secondary structures in Flaviviridae genomes. J Gen Virol 85:1113–1124
Waterman MS (1978) Secondary structure of single-stranded nucleic acids. In: Rota G-C (ed) Studies in Foundation and Combinatorics, vol 1 of: Advances in Mathematics Supplementary Studies. Academic Press, New York, pp 167–212

Chapter 5

Pseudaptations and the Emergence of Beneficial Traits

Steven E. Massey

Abstract There is increasing evidence for the emergence of some beneficial traits in biological systems in the absence of direct selection. Many of these encompass mutational robustness, which increasingly appears to arise as a byproduct of natural selection, as a consequence of the biased incremental change of complex biological systems. Understanding the emergence of robustness in disparate biological systems is facilitated by the use of graph theory and the concept of connectivity. A particular case that is explored here is that of the standard genetic code (SGC). The SGC is arranged so that mutations tend to result in conservative as opposed to radical amino acid changes, a property termed “error minimization”. A commonly cited explanation for this property is the “Adaptive Code” hypothesis, which proposes that error minimization has been directly selected for. However, it is shown that direct selection of the error minimization property is mechanistically difficult. In addition, it is apparent that error minimization may arise simply as a result of code expansion; this is termed the “emergence” hypothesis. The emergence of error minimization in the genetic code is likened to other biological examples, where mutational robustness arises from the innate dynamics of complex systems; these include neutral networks and a variety of subcellular networks. The concept of “biased incrementalism” is introduced to account for the emergence of robustness in these diverse systems, while the term “pseudaptation” is used for such traits that are beneficial to fitness, but are not directly selected for.

S.E. Massey, Biology Department, University of Puerto Rico – Rio Piedras, P.O. Box 23360, San Juan, Puerto Rico 00931, USA

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_5, © Springer-Verlag Berlin Heidelberg 2010

5.1

Adaptive Evolution and Natural Selection

The modern definition of an adaptation is tautological in relation to natural selection: according to Mayr’s book “What Evolution Is” (Mayr 2001), adaptations are beneficial traits that arise by natural selection or, if they occur by chance, are maintained by natural selection. From a panselectionist perspective, all beneficial phenotypes are to be regarded as adaptations, arising from natural selection. However, it may be argued that the definition of “adaptation” is not inviolate; indeed, it is worth remembering that until the modern synthesis natural selection was not widely accepted as the predominant force behind adaptive evolution, a period known as the “eclipse of Darwinism” (Huxley 1942). The aim of this work is to clarify the definition of adaptation in the context of natural selection, to examine examples of beneficial traits that have arisen in the absence of direct selection, and to consider how they should be defined.

5.2

Emergence as a By-Product of Natural Selection

Emergence is a term used in studies of complexity to describe properties that arise from the summation of numerous individual interactions. Diverse examples include the emergence of nonrandom network topologies (in biological networks such as metabolic networks, social interaction networks such as sexual contact and scientific collaboration networks, infrastructure networks such as the Internet and power networks, and chemical networks; Gleiss et al. 2001; Albert and Barabasi 2002), weather features such as hurricanes, and Adam Smith’s “invisible hand” that self-regulates the market. In biological systems emergent properties may be directly selected for; examples of this include termite mounds, the shoaling behavior of fish, and the ability of ant colonies to solve geometric problems, such as finding the shortest route to a food source. In contrast, this chapter is devoted to addressing cases where beneficial traits emerge in the absence of direct selection.

5.3

“Pseudaptation” as a Descriptor of Beneficial Traits That Arise in the Absence of Direct Selection

The term “spandrel” was coined to describe phenotypes that arise without the direct agency of natural selection (Gould and Lewontin 1979; Gould 1997). However, the definition leaves unclear whether these traits are beneficial to fitness. Therefore, it is proposed that the term “spandrel” should be used to refer to phenotypes that arise nonadaptively, as a side-product of natural selection, but are not clearly beneficial to fitness. This work is devoted to discussing beneficial traits that are not directly selected for, hence a term is required for such phenomena; it is suggested that the term “pseudaptation” be used for such traits (Massey 2010).


The prefix “pseud” is used to indicate the potential tendency to misinterpret such traits as true adaptations resulting from natural selection. In contrast, therefore, “adaptations” are beneficial traits that result from the agency of natural selection. The vast majority of beneficial traits are expected to be true adaptations.

5.4

The Genetic Code as a Case Study

Fig. 5.1 The influence of the structure of the standard genetic code on the proportions of conservative or radical amino acid substitutions. There are 75 different amino acid substitutions that can result from a single point mutation, due to the structure of the SGC. The similarity of the amino acids separated by a single point mutation was defined according to the Grantham matrix. The proportion of substitutions that corresponded to different Grantham values was binned accordingly. The chart shows a strong skew toward conservative substitutions

The standard genetic code (SGC) will be used as a case study for illustrating how a pseudaptation may emerge in a complex system. The arrangement of amino acids to codons in the SGC is such that proteins are remarkably robust to the deleterious effects of mutations and transcriptional/translational errors, in comparison to randomly generated genetic codes (Alff-Steinberger 1969; Di Giulio 1989; Haig and Hurst 1992; Ardell 1998; Freeland et al. 2000; Gilis et al. 2001; Goodarzi et al. 2004, etc.). This property of the SGC is termed “error minimization” (EM) and results in a tendency for conservative as opposed to radical amino acid substitutions (Fig. 5.1). EM can be expressed mathematically by the “EM value”. This is a parameter that calculates the average difference between two amino acids arising from a nonsynonymous mutation and is defined as follows:

$$ \mathrm{EM} = \left( \sum_{i=1}^{61} \sum_{N=1}^{N_t} d_{Ni} / N_t \right) \Big/ 61 \quad \text{(Massey 2008)}, $$

where i runs over the 61 sense codons, N_t is the total number of sense codons separated by a single point mutation from the ith codon under consideration, and d_Ni is the physicochemical distance between the amino acid coded for by the ith sense codon and the amino acid coded for by its Nth sense point-mutation neighbor, according to the 20 × 20 Grantham physicochemical similarity matrix (Grantham 1974). The smaller the value between two amino acids in the Grantham matrix, the more similar they are; thus the smaller the EM value, the larger the extent of EM in a genetic code. The EM value of the SGC is 60.7, while the EM value of a computationally randomly generated code is 74.5. Only 0.03% of computationally randomly generated genetic codes possess EM values equal to or better than that of the SGC, which is an indication of the remarkable optimization of the SGC (Massey 2008a). Thus, the code is near optimal for the property of EM.

The EM value of the SGC can be understood as representing the average connectivity of all the codons. Figure 5.2 shows how a typical codon may be represented as the node of a graph, with edges representing point mutations to different codons. Each codon may be represented this way, thus the SGC may be envisaged as a graph composed of 64 nodes. The EM value represents the average connectivity of the SGC, thus robustness arises from a maximization of the average connectivity of the code in terms of neutrality. Another way of putting this is that the amino acids are assigned to codon blocks so that the likelihood of an amino acid substitution being selectively neutral is high.

The property of EM is beneficial in that it limits the deleterious effects of mutations and transcriptional/translational errors. Thus, the “Adaptive Code” hypothesis proposes that the EM property is a beneficial trait that has been selected via natural selection (Freeland et al. 2000). The Adaptive Code hypothesis implies that “code space”, the space of alternative genetic codes, was “searched” by natural selection until a near-optimal code was reached (the SGC). However, there are problems with this scenario, discussed next.
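The EM definition and the codon-graph picture can be sketched in a few lines of Python. This is an illustrative sketch, not the code used in the cited work: the names `SGC`, `sense_neighbors`, `em_value`, `substitution_pairs`, and `toy_distance` are hypothetical, and `toy_distance` is a placeholder that stands in for the Grantham matrix, which is not reproduced here. Reproducing the reported EM value of 60.7 would require passing a function that looks up the real Grantham distances.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering ('*' marks stop codons)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
SGC = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def sense_neighbors(codon, code=SGC):
    """Sense codons reachable from `codon` by a single point mutation."""
    neighbors = []
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if code[mutant] != "*":
                    neighbors.append(mutant)
    return neighbors


def em_value(distance, code=SGC):
    """Average, over sense codons, of the mean distance to their sense neighbors."""
    sense = [c for c, aa in code.items() if aa != "*"]
    total = 0.0
    for codon in sense:
        nbrs = sense_neighbors(codon, code)  # N_t neighbors of this codon
        total += sum(distance(code[codon], code[m]) for m in nbrs) / len(nbrs)
    return total / len(sense)


def substitution_pairs(code=SGC):
    """Unordered pairs of distinct amino acids linked by a single point mutation."""
    return {
        frozenset((aa, code[m]))
        for c, aa in code.items() if aa != "*"
        for m in sense_neighbors(c, code)
        if code[m] != aa
    }


def toy_distance(a, b):
    """Placeholder distance (NOT the Grantham matrix): 0 if identical, else 1."""
    return 0.0 if a == b else 1.0


print(len(substitution_pairs()))  # 75, matching the Fig. 5.1 caption
print(em_value(toy_distance))     # toy value; not comparable to the reported 60.7
```

With the placeholder distance, `em_value` returns the average fraction of sense neighbors that encode a different amino acid, which is one simple way to read "connectivity in terms of neutrality" off the codon graph.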

5.4.1

Challenges for the Adaptive Code Hypothesis to Explain the Origin of the EM Property of the SGC

There are several challenges for explaining how the EM property was directly selected for by natural selection. First, in order to “find” an optimal or near-optimal genetic code for the property of EM, the code space of alternative codes needs to be searched. This necessitates the occurrence of “codon reassignments”, in which the amino acid identity of one or more codons is reassigned from one of the 20 amino acids to another. While these have occurred in nature, mainly in mitochondrial


Table 5.1 The emergence of EM in simulations of genetic code evolution

[Diagram panels (i) to (v), showing numbered codon positions and the amino acids V, A, D, and G, are not recoverable from this copy.]

Column headings: Selective criteria (amino acid difference according to the Grantham matrix) | Average EM value of alternate codes | Average percentage optimization of alternate codes compared with the standard genetic code | Percentage of alternate codes that have equal or superior error minimization than the standard genetic code

[The table rows are not recoverable from this copy.]

I.D. | Length^a | % Identity | Human (hg18) Chr. | Start | End | Mouse (mm9) Chr. | Start | End | Location
LCNS426 | 503 | 95.6 | chrX | 39,829,216 | 39,829,717 | chrX | 11,645,035 | 11,645,535 | Intron
LCNS358 | 502 | 96.2 | chr10 | 77,543,699 | 77,544,199 | chr14 | 23,468,812 | 23,469,312 | Intron
LCNS411 | 502 | 96.0 | chr18 | 43,323,216 | 43,323,716 | chr18 | 76,812,642 | 76,813,143 | Intergene
LCNS113 | 502 | 95.4 | chr2 | 177,211,340 | 177,211,838 | chr2 | 74,976,214 | 74,976,715 | Intergene
LCNS114 | 502 | 95.8 | chr2 | 177,393,500 | 177,394,001 | chr2 | 75,155,446 | 75,155,947 | Intergene
LCNS153 | 502 | 98.8 | chr6 | 97,651,729 | 97,652,230 | chr4 | 24,596,593 | 24,597,094 | Intron
LCNS424 | 502 | 95.2 | chr9 | 75,991,328 | 75,991,829 | chr19 | 19,511,315 | 19,511,816 | Intergene
LCNS433 | 502 | 95.2 | chrX | 147,900,026 | 147,900,525 | chrX | 67,145,332 | 67,145,833 | Intergene
LCNS295 | 501 | 96.2 | chr17 | 32,268,766 | 32,269,265 | chr11 | 84,424,428 | 84,424,928 | Intergene
LCNS403 | 501 | 99.2 | chr18 | 22,169,478 | 22,169,978 | chr18 | 15,003,269 | 15,003,769 | Intron
LCNS291 | 501 | 95.0 | chr2 | 57,958,773 | 57,959,273 | chr11 | 26,714,982 | 26,715,482 | Intergene
LCNS277 | 501 | 96.2 | chr2 | 60,709,281 | 60,709,781 | chr11 | 23,915,928 | 23,916,426 | Intergene
LCNS197 | 501 | 95.8 | chr4 | 85,619,277 | 85,619,777 | chr5 | 102,074,332 | 102,074,830 | Intergene
LCNS264 | 500 | 96.2 | chr11 | 115,737,826 | 115,738,325 | chr9 | 46,468,340 | 46,468,838 | Intergene
LCNS035 | 500 | 96.0 | chr9 | 127,959,649 | 127,960,155 | chr2 | 33,888,639 | 33,889,144 | Intergene

Distance^b (values as extracted; row alignment lost): >10 kb, >10 kb, >10 kb, >10 kb, >100 kb, >10 kb, >100 kb, >10 kb, >100 kb, >10 kb, >100 kb, >10 kb, >10 kb

a Length is in bp
b Distance from the nearby protein-coding sequence in the mouse genome

Table 12.3 Twenty most and least similar LCNS

I.D. | Length^a | % Identity | Human (hg18) Chr. | Start | End | Mouse (mm9) Chr. | Start | End | Location
LCNS438 | 962 | 99.8 | chrX | 24,918,245 | 24,919,206 | chrX | 90,555,381 | 90,556,342 | Intron
LCNS269 | 596 | 99.7 | chr3 | 138,466,129 | 138,466,724 | chr9 | 100,171,994 | 100,172,686 | Intergene
LCNS441 | 819 | 99.5 | chrX | 24,804,732 | 24,805,549 | chrX | 90,674,418 | 90,675,236 | Intron
LCNS637 | 667 | 99.3 | chr10 | 102,437,335 | 102,438,001 | chr19 | 44,773,397 | 44,774,063 | Intergene
LCNS344 | 785 | 99.2 | chr7 | 20,970,118 | 20,970,902 | chr12 | 119,958,510 | 119,959,293 | Intergene
LCNS403 | 501 | 99.2 | chr18 | 22,169,478 | 22,169,978 | chr18 | 15,003,269 | 15,003,769 | Intron
LCNS414 | 581 | 99.1 | chr18 | 43,024,590 | 43,025,170 | chr18 | 77,101,560 | 77,102,140 | Intron
LCNS103 | 557 | 99.1 | chr2 | 174,904,641 | 174,905,197 | chr2 | 73,106,824 | 73,107,380 | Intergene
LCNS592 | 551 | 99.1 | chr5 | 77,183,641 | 77,184,191 | chr13 | 95,472,039 | 95,472,588 | Intergene
LCNS400 | 559 | 98.9 | chr18 | 20,946,991 | 20,947,549 | chr18 | 13,897,326 | 13,897,883 | Intron
LCNS152 | 538 | 98.9 | chr6 | 97,769,812 | 97,770,349 | chr4 | 24,471,724 | 24,472,261 | Intron
LCNS477 | 616 | 98.9 | chr3 | 181,919,488 | 181,920,103 | chr3 | 33,781,632 | 33,782,247 | Intergene
LCNS472 | 525 | 98.9 | chr9 | 134,485,104 | 134,485,628 | chr2 | 28,740,382 | 28,740,906 | Intron
LCNS039 | 516 | 98.8 | chr9 | 127,696,600 | 127,697,115 | chr2 | 34,100,300 | 34,100,813 | Intron
LCNS153 | 502 | 98.8 | chr6 | 97,651,729 | 97,652,230 | chr4 | 24,596,593 | 24,597,094 | Intron
LCNS506 | 739 | 98.8 | chr7 | 114,117,479 | 114,118,217 | chr6 | 15,388,331 | 15,389,069 | 3′ UTR
LCNS640 | 550 | 98.7 | chr10 | 102,405,068 | 102,405,616 | chr19 | 44,745,421 | 44,745,970 | Intergene
LCNS634 | 603 | 98.7 | chr5 | 139,475,193 | 139,475,792 | chr18 | 36,448,133 | 36,448,659 | Intergene
LCNS585 | 664 | 98.6 | chr5 | 81,183,117 | 81,183,780 | chr13 | 91,510,428 | 91,511,091 | Intergene
LCNS242 | 504 | 98.6 | chr8 | 37,357,379 | 37,357,882 | chr8 | 27,787,978 | 27,788,481 | Intergene
LCNS620 | 627 | 95.1 | chr2 | 44,024,538 | 44,025,163 | chr17 | 85,148,509 | 85,149,135 | Intron
LCNS249 | 546 | 95.1 | chr16 | 49,663,946 | 49,664,490 | chr8 | 91,497,355 | 91,497,899 | Intergene
LCNS328 | 586 | 95.1 | chr14 | 33,182,370 | 33,182,955 | chr12 | 55,017,916 | 55,018,500 | Intron
LCNS342 | 586 | 95.1 | chr14 | 98,953,026 | 98,953,611 | chr12 | 109,360,726 | 109,361,309 | Intron
LCNS361 | 889 | 95.1 | chr10 | 78,060,662 | 78,061,550 | chr14 | 23,913,918 | 23,914,806 | Intergene
LCNS140 | 786 | 95.0 | chr8 | 59,976,155 | 59,976,939 | chr4 | 6,717,506 | 6,718,290 | Intron
LCNS350 | 583 | 95.0 | chr6 | 1,723,057 | 1,723,639 | chr13 | 32,062,621 | 32,063,203 | Intron
LCNS280 | 603 | 95.0 | chr2 | 60,370,645 | 60,371,246 | chr11 | 24,212,663 | 24,213,264 | Intergene
LCNS381 | 723 | 95.0 | chr13 | 99,409,271 | 99,409,993 | chr14 | 122,852,051 | 122,852,772 | Intergene
LCNS406 | 522 | 95.0 | chr18 | 35,155,674 | 35,156,195 | chr18 | 27,787,891 | 27,788,411 | Intergene
LCNS072 | 522 | 95.0 | chr2 | 147,006,391 | 147,006,912 | chr2 | 47,158,094 | 47,158,615 | Intergene
LCNS442 | 522 | 95.0 | chrX | 71,478,279 | 71,478,798 | chrX | 99,489,220 | 99,489,740 | Intergene
LCNS067 | 803 | 95.0 | chr2 | 146,405,657 | 146,406,459 | chr2 | 46,521,971 | 46,522,769 | Intergene
LCNS057 | 542 | 95.0 | chr2 | 144,471,545 | 144,472,086 | chr2 | 44,480,335 | 44,480,871 | Intron
LCNS432 | 642 | 95.0 | chrX | 147,827,152 | 147,827,793 | chrX | 67,071,319 | 67,071,960 | Intron
LCNS291 | 501 | 95.0 | chr2 | 57,958,773 | 57,959,273 | chr11 | 26,714,982 | 26,715,482 | Intergene
LCNS088 | 601 | 95.0 | chr2 | 163,813,314 | 163,813,913 | chr2 | 63,413,486 | 63,414,085 | Intergene
LCNS549 | 621 | 95.0 | chr15 | 65,691,994 | 65,692,614 | chr9 | 63,159,161 | 63,159,778 | Intron
LCNS621 | 661 | 95.0 | chr2 | 44,661,079 | 44,661,739 | chr17 | 85,694,024 | 85,694,683 | Intron
LCNS255 | 680 | 95.0 | chr16 | 50,492,105 | 50,492,780 | chr8 | 92,233,969 | 92,234,648 | Intergene

Distance^b (values as extracted; row alignment lost): >10 kb, >10 kb, >10 kb, ≤10 kb, >10 kb, >100 kb, >10 kb, >10 kb, >10 kb, ≤10 kb, >10 kb, >10 kb, >10 kb, >100 kb, >100 kb, >10 kb, >100 kb, ≤10 kb, >10 kb, >10 kb, >100 kb, ≤10 kb, >10 kb, >100 kb, >10 kb, >100 kb, >10 kb, >10 kb, >10 kb, >100 kb

a Length is in bp
b Distance from the nearby protein-coding sequence in the mouse genome

Table 12.4 Summary of LCNS locations

Location | Category | Number | % of total | Subtotal %
UTR | 5′ | 6 | 1.0% | 3.6%
UTR | 3′ | 16 | 2.6% |
Intron | >100 kb | 3 | 0.5% | 41.1%
Intron | >10 kb | 119 | 19.5% |
Intron | ≤10 kb | 129 | 21.1% |
Intergenic | >100 kb | 147 | 24.1% | 55.3%
Intergenic | >10 kb | 132 | 21.6% |
Intergenic | ≤10 kb | 59 | 9.7% |
Total | | 611 | |

The locations of the 611 LCNS are each classified into one of eight categories based on the distance from the nearby protein-coding sequence in the mouse genome

12.5

Working Hypotheses for Genomic Sequence Conservation

To understand the biological function(s) of the highly conserved noncoding sequences, it is necessary to consider plausible mechanisms that could produce conserved sequences in many different species. Four working hypotheses that would create and/or maintain highly conserved sequences in coding sequences as well as in noncoding sequences are discussed below. These working hypotheses are not mutually exclusive: a combination of two or more of the mechanisms may contribute to maintaining the conservation of genomic sequences among various species. Among the four working hypotheses, only the first one (Sect. 12.5.1) requires significant biological function for the maintenance of conserved sequences, whereas the other three do not necessarily need such functions to explain the highly identical sequences among various species.

12.5.1

Functional Constraint

As described above, the primary working hypothesis for maintenance of LCNS is that evolutionary constraints keep the functionally important genomic DNA sequence from changing. Such functional genomic sequences may be protected from accumulation of spontaneous and/or induced mutations by natural selection. Mutations usually disrupt and disturb the normal function of the gene (or genomic sequence), since mutations are random with respect to the base-pair array in the genome. This is why radiation, chemical mutagens, and other genotoxic agents are usually harmful to organisms and cause various genetic disorders including tumorigenesis, genetic diseases, and predispositions to various genetic risk factors in individuals. Such detrimental mutations are eliminated from natural populations by Darwinian selection. Thus, the more significant its function, the higher the degree of conservation a genomic sequence tends to exhibit among various species, due to the evolutionary constraints.

To directly test this hypothesis, in vivo assays have been conducted (Poulin et al. 2005; Pennacchio et al. 2006; Visel et al. 2008). By using a transgenic mouse enhancer assay with reporter genes, highly conserved elements have been experimentally examined for cis-regulatory enhancer activity. For instance, Pennacchio et al. (2006) tested 167 highly conserved sequences and found that 45% of the sequences had tissue-specific cis-regulatory function at mouse embryonic day 11.5. Furthermore, Visel et al. (2008) used the transgenic approach to compare such enhancer activities between UCEs and sequences that are highly conserved but not 100% identical. They confirmed the enhancer activity not only in UCEs but also in the other highly conserved sequences, suggesting that UCEs may be a part of a larger enhancer family in the genome.

Derti et al. (2006) proposed another possible function of UCEs. They proposed that the UCEs and/or flanking sequences might maintain the diploid karyotype through dosage sensitivity. Mammalian UCEs are highly depleted among segmental duplications and copy number variants. This hypothesis seems to be concordant with the fact that UCEs were not found on the Y chromosome, human chromosome 21, or in the syntenic regions of the mouse genome. The Y chromosome is the only nondiploid region in mammals. Human chromosome 21, in which trisomy causes Down syndrome, might be under a weaker diploid constraint. We, however, found three LCNS on human chromosome 21 and the syntenic region in the mouse (Sakuraba et al. 2008 and Sect. 12.4.2.3).

Knockout (KO) mouse studies of UCEs have produced controversial findings related to the functional constraint hypothesis. For instance, Ahituv et al. (2007) disrupted four UCEs independently and analyzed the KO mice. None of the four KO mouse strains exhibited any anomalies, indicating that such UCEs should be dispensable. In contrast, McLean and Bejerano (2008) found that ultraconserved-like elements were over 300-fold less likely than neutral DNA to have been lost during rodent evolution. If UCEs were dispensable, they should have been lost from the population, similar to neutral sequences. The mutagenesis analysis of highly conserved sequences is also discussed in Sect. 12.5.2.

12.5.2 Mutational Cold Spots

If a genomic sequence is a mutational cold spot, meaning that little or no mutation occurs in it, the sequence might keep the same array of base pairs over many generations and consequently be conserved in many different species. Since many mutagens directly target genomic DNA to modify or break it, a tightly packed chromatin structure, e.g., in heterochromatic regions, may prevent mutagens from attacking the DNA, so that few mutations accumulate. Alternatively (or in addition), an enhanced DNA repair system acting on particular genomic sequences would be another mechanism giving rise to mutational cold spots. Whatever the mechanism, if mutational cold spots exist in the genome, they would be highly conserved portions of it.


Bejerano et al. (2004) found far fewer SNPs in human UCEs than the genome average, but some were present. Thus, some mutations have occurred in UCEs. Several analyses of genotype data from human SNP projects (Drake et al. 2006; Katzman et al. 2007) indirectly suggested that UCEs and highly conserved sequences were not mutational cold spots. We therefore tested experimentally whether LCNS are mutational cold spots by using ENU mutagenesis (Sakuraba et al. 2008). We had produced 10,000 ENU-mutagenized G1 mice and extracted DNA from each (Sakuraba et al. 2005). Using a high-throughput mutation discovery system that combines PCR amplification and heteroduplex detection (Sakuraba et al. 2005), several LCNS as well as nonLCNS were screened for ENU-induced mutations (Sakuraba et al. 2008). We found 12 and 136 ENU-induced mutations by screening a total of 16.5 and 181.0 Mb of LCNS and nonLCNS, respectively. Thus, ENU-induced mutations were found at one per 1.371 Mb in LCNS and one per 1.331 Mb in nonLCNS. This nearly equivalent ENU-induced mutation frequency was reproduced with a new enhanced mutation discovery system, in which we found 23 and 207 ENU-induced mutations by screening 24.2 and 223.9 Mb of LCNS and nonLCNS, respectively (Sakuraba et al. 2008). Thus, the mutational cold spot hypothesis is unlikely to explain the maintenance of highly conserved sequences in vertebrates during evolution.

All the G1 mice examined in the ENU mutagenesis study above are maintained as frozen sperm (Sakuraba et al. 2005); it is therefore possible to analyze live mice carrying an ENU-induced mutation in an LCNS. A total of 35 mouse strains carrying an ENU-induced mutation in an LCNS are listed on our website (http://www.brc.riken.go.jp/lab/mutants/genedriven.htm) and are freely available upon request from the RIKEN BioResource Center (BRC) (http://www.brc.riken.jp/lab/animal/en/depo.shtml).
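The comparison of ENU-induced mutation frequencies in LCNS and nonLCNS is simple arithmetic, and the following minimal Python sketch recomputes it from the counts and screened lengths quoted above. The function names `mb_per_mutation` and `rate_ratio` are illustrative and are not taken from Sakuraba et al. (2008). Because the screened lengths are quoted to one decimal place, the recomputed LCNS value for the first system comes out at 1.375 Mb rather than the 1.371 Mb given in the text; the difference presumably reflects rounding of the input.

```python
# (mutations found, Mb screened) as quoted in the text from Sakuraba et al. (2008).
SCREENS = {
    "first system": {"LCNS": (12, 16.5), "nonLCNS": (136, 181.0)},
    "enhanced system": {"LCNS": (23, 24.2), "nonLCNS": (207, 223.9)},
}


def mb_per_mutation(mutations, screened_mb):
    """Screened sequence length (Mb) per detected mutation."""
    if mutations <= 0:
        raise ValueError("at least one mutation is required")
    return screened_mb / mutations


def rate_ratio(lcns, non_lcns):
    """Mutation rate per Mb in LCNS divided by the rate in nonLCNS."""
    (lcns_mut, lcns_mb), (non_mut, non_mb) = lcns, non_lcns
    return (lcns_mut / lcns_mb) / (non_mut / non_mb)


for system, screen in SCREENS.items():
    lcns, non_lcns = screen["LCNS"], screen["nonLCNS"]
    print(
        f"{system}: one mutation per {mb_per_mutation(*lcns):.3f} Mb (LCNS), "
        f"{mb_per_mutation(*non_lcns):.3f} Mb (nonLCNS), "
        f"rate ratio {rate_ratio(lcns, non_lcns):.2f}"
    )
```

A rate ratio close to 1 in both screens is what the text describes as an equivalent mutation frequency; a mutational cold spot would be expected to give a ratio well below 1.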

12.5.3 Horizontal Transfer

Another mechanism that could produce a highly conserved genomic sequence among various species is a recent transfer of DNA from one species to another. Interspecies-active transposition and retroposition would be plausible mechanisms. If a DNA segment was horizontally transferred to many species at one time very recently, the transferred portion of the genomic DNA would have very similar sequences in the affected species. One discrepancy is that horizontal transfer by a transposon, for instance, usually gives rise to multiple copies in the genome, which form part of the repetitive sequences. Also, if horizontal transfer happened very recently, the degree of conservation should not be inversely proportional to the evolutionary distance. As described in Sect. 12.4.2.4, however, the degree of LCNS conservation was inversely proportional to the evolutionary distance. Moreover, a simple transposon hypothesis does not explain the syntenic localization of UCE and LCNS pairs in human and mouse.

A combination of functional constraint and horizontal transfer may have occurred. At the beginning of the adaptive radiation of vertebrate species, horizontal transfer might have been very active via various transposons and spread to many radiating ancestors of vertebrates. If the transposons originated not from the direct ancestor species but from, e.g., fungi, viruses, and/or bacteria, it is reasonable that neither UCEs nor LCNS would be found in any invertebrate species. In this model, various sequences could have been horizontally transmitted to various loci in the genomes of many vertebrate ancestors. Bottleneck and founder effects then reduced the number of ancestors, and a few lineages may have undergone further adaptive radiations. Each lineage would then maintain the syntenic localization of highly conserved sequences like UCEs and LCNS in human, mouse, and rat. Functional constraints might have maintained only the highly conserved sequences like UCEs and LCNS, while the flanking sequences diversified. Bejerano et al. (2006) showed some evidence of retroposon-like origins of UCEs.

12.5.4 Concerted Evolution and Gene Conversion

Some portions of genomic sequences have been homogenized to identical or similar sequences, resulting in concerted evolution (Nenoi et al. 1998; Gondo et al. 1998; Nei et al. 2000; Okada Y, Gondo Y, Ikeda JE, unpublished). An example has been found in the genomic sequences of the ubiquitin gene among very diverse species of fungi, plants, and animals including human. Ubiquitin is encoded by gene families, and a head-to-tail tandem structure together with unequal crossing over seems to maintain the identical genomic DNA sequence of the polyubiquitin gene (Nenoi et al. 1998; Nei et al. 2000). A deubiquitinase gene coding for USP17 in human (Gondo et al. 1998; Saitoh et al. 2000) was also found to be highly conserved among the mammalian species tested (Gondo et al. 1998; Okada Y, Gondo Y, Ikeda JE, unpublished). The USP17 gene was found on human chromosomes 4 and 8, with 50–100 head-to-tail tandem copies and a few copies, respectively (Gondo et al. 1998; Okada et al. 2002). The USP17 gene was also identified in a head-to-tail tandem repeat structure in many mammalian species, except the mouse (Gondo et al. 1998; Okada Y, Gondo Y, Ikeda JE, unpublished). The copy number on human chromosome 4 was highly polymorphic (Gondo et al. 1998), but the 4.7-kb unit sequence of the USP17 gene with its flanking sequences was nearly identical between copies (99%). The degree of homology (> 99%) between the 4.7-kb repeating units was at the level of the UCEs and LCNS. The extremely high similarity was found not only within the tandem repeat on chromosome 4 but also in the few copies on chromosome 8. Thus, simple unequal crossing over that homogenizes the unit sequence may not be enough to explain the highly conserved 4.7-kb sequences in human and other mammalian species. Some unknown gene-conversion mechanism might have homogenized the 4.7-kb unit sequences between the tandemly repeated sequences on chromosome 4 as well as between unit sequences on chromosomes 4 and 8.
If the homogenization mechanism of the ubiquitin genes and of the 4.7-kb unit including the USP17 gene is revealed, it might provide another working hypothesis for how highly conserved sequences arise.
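The degree of homology between repeat units quoted above is a percent identity over aligned positions. The following Python sketch shows one simple way such a value could be computed; it assumes the two sequences are already aligned to equal length, and the two short sequences are made up for illustration rather than taken from the USP17 repeat. For a 4.7-kb unit, an identity above 99% corresponds to fewer than 47 mismatched positions.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions carrying the same base.

    Both sequences must already be aligned to the same length.
    A gap character ('-') never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


# Hypothetical 20-bp aligned fragments, invented for illustration only.
unit_1 = "ATGGCGTACCTGAAGTCCGA"
unit_2 = "ATGGCGTACCTGAAGTCCGT"
print(percent_identity(unit_1, unit_2))
```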

12.6 Conclusions

Highly conserved sequences have been found in vertebrates. The growing body of knowledge about them raises various questions and working hypotheses, but the answers are yet to be determined. One of the most critical issues in this field is the lack of highly conserved sequences like UCEs and LCNS in invertebrate species. Invertebrates may have their own highly conserved sequences, and it is necessary to survey other clades for other classes of highly conserved sequences. The horizontal transfer hypothesis emphasizes the importance of genomic sequence data not only from species closely related to vertebrates but also from more distantly related organisms including fungi, bacteria, and viruses. Even metagenomics of lower eukaryotes and prokaryotes may provide key genomic sequence data sets to explain the presence of highly conserved sequences in vertebrates. New-generation sequencing technologies should enhance such surveys. Extensive surveys of highly conserved sequences in all kingdoms may provide clues to their nature, such as their origin, their mechanism of conservation, and their function, if any.

Acknowledgments The author thanks Dr. Daniel E. Janes for constructive discussions and critical reading of this manuscript. The author thanks Dr. Yoshiyuki Sakaki and his colleagues at RIKEN Genomic Sciences Center and Dr. Masayuki Yamamura and his colleagues at Tokyo Institute of Technology for the extraction of LCNS and useful discussions. The author also acknowledges Dr. Yoshiyuki Sakuraba and the members of the Population and Quantitative Genomics Team at RIKEN Genomic Sciences Center, where most of the LCNS work described in this chapter was conducted. This work is partly supported by Grants-in-Aid for Scientific Research (A) (KAKENHI 15200032 and KAKENHI 21240043).

References

Ahituv N, Zhu Y, Visel A, Holt A, Afzal V, Pennacchio LA, Rubin EM (2007) Deletion of ultraconserved elements yields viable mice. PLoS Biol 5(9):e234
Bantle JA, Hahn WE (1976) Complexity and characterization of polyadenylated RNA in the mouse brain. Cell 8:139–150
Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS, Haussler D (2004) Ultraconserved elements in the human genome. Science 304(5675):1321–1325
Bejerano G, Lowe CB, Ahituv N, King B, Siepel A, Salama SR, Rubin EM, Kent WJ, Haussler D (2006) A distal enhancer and an ultraconserved exon are derived from a novel retroposon. Nature 441(7089):87–90
Britten RJ, Kohne D (1968) Repeated sequences in DNA. Science 161(841):529–540


Chikaraishi DM, Deeb SS, Sueoka N (1978) Sequence complexity of nuclear RNAs in adult rat tissue. Cell 13:111–120
Dermitzakis ET, Reymond A, Lyle R, Scamuffa N, Ucla C, Deutsch S, Stevenson BJ, Flegel V, Bucher P, Jongeneel CV, Antonarakis SE (2002) Numerous potentially functional but nongenic conserved sequences on human chromosome 21. Nature 420(6915):578–582
Dermitzakis ET, Reymond A, Scamuffa N, Ucla C, Kirkness E, Rossier C, Antonarakis SE (2003) Evolutionary discrimination of mammalian conserved non-genic sequences (CNGs). Science 302(5647):1033–1035
Derti A, Roth FP, Church GM, Wu CT (2006) Mammalian ultraconserved elements are strongly depleted among segmental duplications and copy number variants. Nat Genet 38(10):1216–1220
Drake JA, Bird C, Nemesh J, Thomas DJ, Newton-Cheh C, Reymond A, Excoffier L, Attar H, Antonarakis SE, Dermitzakis ET, Hirschhorn JN (2006) Conserved noncoding sequences are selectively constrained and not mutation cold spots. Nat Genet 38(2):223–227
Gondo Y, Okada T, Matsuyama N, Saitoh Y, Yanagisawa Y, Ikeda JE (1998) Human megasatellite DNA RS447: copy-number polymorphisms and interspecies conservation. Genomics 54(1):39–49
International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921
International Human Genome Sequencing Consortium (2004) Finishing the euchromatic sequence of the human genome. Nature 431:932–945
Judd BH, Shen MW, Kaufman TC (1972) The anatomy and function of a segment of the X chromosome of Drosophila melanogaster. Genetics 71(1):139–156
Katzman S, Kern AD, Bejerano G, Fewell G, Fulton L, Wilson RK, Salama SR, Haussler D (2007) Human genome ultraconserved elements are ultraselected. Science 317(5840):915
McLean C, Bejerano G (2008) Dispensability of mammalian DNA. Genome Res 18(11):1743–1751
Mouse Genome Sequencing Consortium (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420:520–562
Mukai T (1964) The genetic structure of natural populations of Drosophila melanogaster I. Spontaneous mutation rate of polygenes controlling viability. Genetics 50:1–19
Mukai T (1978) Population genetics. Kodansha Scientific, Tokyo, in Japanese
Mukai T, Chigusa SI, Mettler LE, Crow JF (1972) Mutation rate and dominance of genes affecting viability in Drosophila melanogaster. Genetics 72(2):335–355
Nei M, Rogozin IB, Piontkivska H (2000) Purifying selection and birth-and-death evolution in the ubiquitin gene family. Proc Natl Acad Sci USA 97(20):10866–10871
Nenoi M, Mita K, Ichimura S, Kawano A (1998) Higher frequency of concerted evolutionary events in rodents than in man at the polyubiquitin gene VNTR locus. Genetics 148(2):867–876
Nowak R (1994) Mining treasures from “junk DNA”. Science 263:608–610
O’Brien SJ, Menotti-Raymond M, Murphy WJ, Nash WG, Wienberg J, Stanyon R, Copeland NG, Jenkins NA, Womack JE, Marshall Graves JA (1999) The promise of comparative genomics in mammals. Science 286(5439):458–481
Ohnishi O (1977) Spontaneous and ethyl methanesulfonate-induced mutations controlling viability in Drosophila melanogaster. II. Homozygous effect of polygenic mutations. Genetics 87(3):529–545
Okada T, Gondo Y, Goto J, Kanazawa I, Hadano S, Ikeda JE (2002) Unstable transmission of the RS447 human megasatellite tandem repetitive sequence that contains the USP17 deubiquitinating enzyme gene. Hum Genet 110(4):302–313
Pennacchio LA, Ahituv N, Moses AM, Prabhakar S, Nobrega MA, Shoukry M, Minovitsky S, Dubchak I, Holt A, Lewis KD, Plajzer-Frick I, Akiyama J, De Val S, Afzal V, Black BL, Couronne O, Eisen MB, Visel A, Rubin EM (2006) In vivo enhancer analysis of human conserved non-coding sequences. Nature 444(7118):499–502


Poulin F, Nobrega MA, Plajzer-Frick I, Holt A, Afzal V, Rubin EM, Pennacchio LA (2005) In vivo characterization of a vertebrate ultraconserved enhancer. Genomics 85(6):774–781
Saitoh Y, Miyamoto N, Okada T, Gondo Y, Showguchi-Miyata J, Hadano S, Ikeda JE (2000) The RS447 human megasatellite tandem repetitive sequence encodes a novel deubiquitinating enzyme with a functional promoter. Genomics 67(3):291–300
Sakuraba Y, Sezutsu H, Takahasi KR, Tsuchihashi K, Ichikawa R, Fujimoto N, Kaneko S, Nakai Y, Uchiyama M, Goda N, Motoi R, Ikeda A, Karashima Y, Inoue M, Kaneda H, Masuya H, Minowa O, Noguchi H, Toyoda A, Sakaki Y, Wakana S, Noda T, Shiroishi T, Gondo Y (2005) Molecular characterization of ENU mouse mutagenesis and archives. Biochem Biophys Res Commun 336(2):609–616
Sakuraba Y, Kimura T, Masuya H, Noguchi H, Sezutsu H, Takahasi KR, Toyoda A, Fukumura R, Murata T, Sakaki Y, Yamamura M, Wakana S, Noda T, Shiroishi T, Gondo Y (2008) Identification and characterization of new long conserved noncoding sequences in vertebrates. Mamm Genome 19(10–12):703–712
Visel A, Prabhakar S, Akiyama JA, Shoukry M, Lewis KD, Holt A, Plajzer-Frick I, Afzal V, Rubin EM, Pennacchio LA (2008) Ultraconservation identifies a small subset of extremely constrained developmental enhancers. Nat Genet 40(2):158–160
Wetmur J, Davidson N (1968) Kinetics of renaturation of DNA. J Mol Biol 31(3):349–370

Part III Morphological Evolution / Speciation

Chapter 13

Male-Killing Wolbachia in the Butterfly Hypolimnas bolina

Anne Duplouy and Scott L. O’Neill

Abstract Maternally inherited insect symbionts often manipulate host reproduction for their own benefit. Symbionts are transmitted to the next host generation through the female hosts, and as such males represent dead ends for transmission. Natural selection therefore favors symbiont-induced phenotypes that provide a reproductive advantage to infected females, regardless of possible negative selective effects on males. Male-killing (MK) is one such phenotype, in which symbionts kill the male progeny of infected females. Compared with other symbiont-associated reproductive phenotypes, MK is relatively unexplored mechanistically as well as ecologically. A male-killing Wolbachia bacterium strain named wBol1 has been described in the tropical butterfly Hypolimnas bolina. By reviewing the different features of this association it is possible to summarize what is already known about the biology and evolution of MK symbionts, as well as highlight the current gaps in our understanding of this striking reproductive phenotype.

A. Duplouy and S.L. O’Neill
School of Biological Sciences, The University of Queensland, Brisbane, QLD 4072, Australia
e-mail: [email protected]

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_13, © Springer-Verlag Berlin Heidelberg 2010

13.1 Introduction

There are numerous symbiotic associations known to occur within nature; however, few associations are more complex than those involving endosymbiosis. The study of endosymbionts challenges the scientific community with questions about how each member of the symbiosis coexists and how they maximize their reproductive fitness. Endosymbionts are extremely common and over the course of evolution have arisen in very different taxonomic groups. In insects, although endosymbiotic eukaryotic microorganisms are common (e.g., the yeast-like endosymbiont
Symbiotaphrina buchneri infecting anobiid beetles, Noda and Kodama 1996; or the fungal symbiont of the brown-banded cockroach species Supella longipalpa, Gibson and Hunter 2009), most described endosymbionts are bacteria, including members of the Proteobacteria (e.g., Buchnera and Wolbachia), Flavobacteria (e.g., Blattabacterium), and Mollicutes (e.g., Spiroplasma) (Werren and O’Neill 1997; Bourtzis and Miller 2003, 2006), amongst others. Insect endosymbionts also show diversity in their modes of transmission, being either vertically (maternally) transmitted from mother to offspring or horizontally transmitted. In the latter case, the symbiont may be infectious within a single species or between different species. Examples of occasional horizontal transfer of maternally transmitted symbionts have been reported (Werren and O’Neill 1997). “Primary endosymbionts” are usually obligate endosymbionts, needed for host reproduction and/or survival. For example, Moran et al. (2005) showed that Buchnera aphidicola provides essential nutrients that are deficient in the aphid host’s diet. Some primary endosymbionts have been shown to display phylogenetic concordance with their hosts over millions of years, demonstrating long-term coevolution (Moran et al. 1993, 1994; Bandi et al. 1994). Facultative endosymbionts, often referred to as “secondary endosymbionts,” infect individuals already carrying a primary symbiont. A classic example is the pea aphid Acyrthosiphon pisum, which harbors multiple secondary symbionts such as Hamiltonella defensa in addition to the primary symbiont Buchnera sp. (Moran et al. 2005; Oliver et al. 2007). The functional roles of secondary symbionts within the host are not always well defined, as any effect can be hidden by the action of the primary symbionts (Chen et al. 2000; Moran et al. 2005; Ruan et al. 2006). 
Finally, “reproductive symbionts,” also termed “guest microbes” (Bourtzis and Miller 2003), were first described as symbionts able to enhance their own fitness by manipulating host reproduction (Taylor and Hoerauf 1999). Some of these distortions involve sex ratio manipulation of the host. Spiroplasma for example kills males in Drosophila species (Hurst et al. 1999a), while Cardinium sterilizes certain males of the wasp Encarsia pergandiella (Hunter et al. 2003). However, recent studies have revealed additional capabilities of reproductive symbionts that enhance their fitness without affecting the host’s reproductive system (Brownlie et al. 2009).

13.2 Wolbachia pipientis

Wolbachia pipientis is a species of obligate intracellular alpha-Proteobacteria closely related to Rickettsia. Wolbachia were first discovered in the early 1920s in the ovaries of the mosquito Culex pipiens (Hertig and Wolbach 1924). Based on genetic variation, Wolbachia strains were divided into eight highly divergent supergroups named A through H (Bandi et al. 1998; Zhou et al. 1998; Bourtzis and Miller 2003; Lo et al. 2007). The two most studied and described Wolbachia supergroups, A and B, diverged approximately 50–70 million years ago (Werren et al. 1995; Werren and O’Neill 1997). Wolbachia belonging to these two groups, known as the
“arthropod Wolbachia,” are mostly harbored by insects but are also described from other host phyla such as Crustacea or Arachnida. Supergroups A and B Wolbachia are mostly parasitic and induce a broad range of reproductive distortions in their hosts. In comparison, Wolbachia belonging to both the C and D supergroups are mutualistic strains required for fertility and development of their filarial nematode hosts (Bandi et al. 1998). Within the C and D clusters, Wolbachia phylogeny is concordant with host phylogeny, suggesting long-term coevolution. The remaining four clusters (E–H) infect various arthropods or nematodes; however, these associations are often poorly described and symbiont-induced effects are not always known (Vandekerckhove et al. 1999; Lo and Evans 2007; Covacin and Barker 2007). W. pipientis, the most extensively studied reproductive endosymbiont to date, has the greatest diversity of host interactions including mutualism and all types of known reproductive manipulations – cytoplasmic incompatibility, feminization, parthenogenesis, or male-killing (O’Neill et al. 1997).

13.2.1 Reproductive Distortions

Maternally transmitted endosymbionts, such as Wolbachia, can enhance their transmission rate by manipulating their host’s reproduction (O’Neill et al. 1997; Bourtzis and Miller 2003). To understand the benefits they gain from these manipulations, it is worthwhile summarizing what is known about the most common symbiont-induced reproductive phenotypes. The first reproductive manipulation to be attributed to Wolbachia was cytoplasmic incompatibility (CI). In the 1950s, Ghelelovitch (1952) and Laven (1959) described crosses between strains of the mosquito Culex pipiens that sometimes failed to produce progeny. Later, Yen and Barr (1971) showed that Wolbachia was the causative agent of these reproductive failures. Crosses between Wolbachia-infected males and uninfected females failed, whereas all other possible crosses (crosses between uninfected individuals, and between infected females and either uninfected males or males carrying the same infection) resulted in normal reproductive output. Cytological studies have linked the mechanistic basis of this incompatibility to abnormalities during fertilization (Tram and Sullivan 2002). Abnormal behavior of chromosomal material from infected males causes incompatibility with female pronuclei and later the death of the progeny. CI provides an advantage to infected females, as they can successfully mate with both infected and uninfected males. As a result, the maternally transmitted symbiont spreads rapidly through the host population. CI is not unique to Wolbachia: Cardinium also induces CI in the parasitoid wasp Encarsia pergandiella (Hunter et al. 2003; Perlman et al. 2008), and CI has been described as the most common endosymbiont-induced reproductive manipulation in arthropods. 
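The crossing pattern that defines CI can be written as a small rule, sketched here in Python. This is only an illustration of the cross outcomes listed above, not a model of the underlying mechanism. The function name `cross_is_compatible` is hypothetical, `None` stands for an uninfected individual, and the treatment of a male carrying a strain that the female lacks as incompatible is an assumption that goes beyond the single-strain crosses described in the paragraph.

```python
from typing import Optional


def cross_is_compatible(male_strain: Optional[str], female_strain: Optional[str]) -> bool:
    """Return True if a cross is expected to give normal progeny under CI.

    None means uninfected. The cross fails only when the male carries an
    infection that the female does not carry (assumption: a different strain
    in the female does not rescue the cross).
    """
    if male_strain is None:
        # Uninfected males are compatible with any female.
        return True
    return male_strain == female_strain


# All four crosses involving a single strain, here labelled "w" for illustration.
for male in (None, "w"):
    for female in (None, "w"):
        outcome = "compatible" if cross_is_compatible(male, female) else "incompatible"
        print(f"male={male}, female={female}: {outcome}")
```

Only the cross between an infected male and an uninfected female is reported as incompatible, which is why infected females have a reproductive advantage in a mixed population.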
As Wolbachia are maternally transmitted, some strains distort the sex ratio of their host population in favor of females, creating populations where males are sometimes extremely rare. Three mechanisms, feminization, parthenogenesis, and
male-killing (MK), cause an imbalanced sex ratio in the host population. Feminizing symbionts such as Cardinium and Wolbachia have been found in numerous arthropod hosts including the isopod Armadillidium vulgare (Cordaux et al. 2004), the lepidopteran species Ostrinia furnacalis and Eurema hecabe (Narita et al. 2007; Kageyama et al. 2008), and the spider mite Brevipalpus phoenicis (Weeks et al. 2001). During feminization, genetic males reproduce as functional females, which therefore transmit Wolbachia to their progeny (Rigaud 1997; Stouthamer et al. 1999). Feminization is often mistaken for parthenogenesis, as both mechanisms produce female-biased populations. Whereas feminization requires sexual reproduction, parthenogenesis allows the production of viable progeny without the need for a male partner. Two types of parthenogenesis have been described: in arrhenotokous parthenogenesis (or arrhenotoky), diploid females arise from fertilized eggs and haploid males from unfertilized eggs, whereas in thelytokous parthenogenesis (or thelytoky), females are produced from unfertilized eggs. In Trichogramma wasps, thelytoky is induced by Wolbachia (Stouthamer and Kazmer 1994), which restores diploidy by enhancing the fusion of the two nuclei of the first mitotic division (Stouthamer and Kazmer 1994; Huigens et al. 2000). Finally, a wide range of endosymbiont-infected arthropods produce only daughters because male offspring die at an early developmental stage. Males are usually killed embryonically, but deaths can also occur much later, typically in fourth-instar larvae (Hurst 1991). This common reproductive manipulation is known as male-killing (MK). MK is caused by at least nine different bacteria from four taxonomic groups: Mollicutes, Flavobacteria, Rickettsiaceae, and Enterobacteriaceae (Hurst et al. 1997, 2003). However, there are still very few studies investigating the underlying cytogenetic and genetic mechanisms of this phenotype.

13.3 Male-Killing Wolbachia in the Butterfly Hypolimnas bolina

Although MK systems are diverse, a review of the association between the MK Wolbachia strain wBol1 and H. bolina provides a general overview of this reproductive phenotype. H. bolina, also known as the common or great egg-fly (Australia), or blue-moon butterfly (New Zealand), was first described by Linnaeus in 1758. This species has a vast subtropical distribution from Sri Lanka to French Polynesia and a latitudinal range from Hong-Kong to Canberra, Australia. Occasional reports describe H. bolina in Japan and New Zealand since the 1970s (Ramsay 1971; Clarke and Sheppard 1975; Morishita and Kazuhiko 2002; Patrick 2004), but it is suspected that these regions do not support endemic populations (Common and Waterhouse 1972). Individuals observed in Japan and New Zealand were probably migratory individuals using favorable meteorological conditions (Ryan and Harris 1990; Christensen 2004) to invade from close neighboring regions such as South East Asia (SEA) or Australia, where stable populations exist (Gibbs 1961; Ramsay and Ordish 1966; Ramsay 1971; Christensen 2004).

13.3.1 All-Female Broods in the Butterfly H. bolina

A strong female-biased sex ratio distortion has been described in numerous H. bolina populations throughout their wide geographical distribution (Simmonds 1926; Clarke et al. 1975; Dyson et al. 2002; Charlat et al. 2005). All-female broods were first described in the 1920s (Poulton 1923; Simmonds 1926). This reproductive trait was shown to be transmitted exclusively by females and therefore to be due to a cytoplasmic factor (Clarke et al. 1975). It was reported not to be parthenogenesis, as males were dying at early stages of development (Clarke et al. 1975, 1983). Dyson et al. (2002) identified W. pipientis as the causative agent of male rarity in H. bolina, using PCR amplification and sequence analysis of a bacterial surface protein gene (Zhou et al. 1998). This Wolbachia strain, termed wBol1, was shown to kill the male progeny of infected female butterflies at an early embryonic stage, before caterpillars hatch from the eggs (Dyson et al. 2002, Fig. 13.1). First identified in Fiji, wBol1 was found to be present in most H. bolina populations across the South Pacific (Charlat et al. 2005). One intriguing feature of the wBol1/H. bolina association is the variation in wBol1 infection prevalence among different host populations.

Fig. 13.1 Life cycle of wBol1-infected Hypolimnas bolina: (1) a wBol1-infected female mates with an uninfected male, (2) all males die during embryogenesis, only female eggs hatch 4 days after being laid, (3) caterpillars develop in 20 days through 5 larval instars, (4) wBol1-infected females emerge from 7-day old pupae, (5) and 4 days after emerging from the pupae, females are reproductively mature

Fig. 13.2 Wolbachia infection frequencies in 12 H. bolina populations. (1) Philippines, (2) Thailand, (3) Vanuatu, (4) Fiji, (5) New Caledonia: Ile des Pins, (6) Australia: Brisbane, (7) Independent Samoa, (8) American Samoa, French Polynesia: (9) Moorea, (10) Tahiti, (11) Rurutu, and (12) Tubuai. Less than 65% of the females are wBol1-infected in islands with low and medium infection frequencies, and 65–100% of the females are wBol1-infected in islands with high infection frequency. Map legend: island not infected by wBol1; low and medium infection rate; high infection rate

wBol1 infections were absent from
Australian and the Tubuai (French Polynesia, Austral Islands Archipelago) H. bolina populations, while wBol1 infection frequencies of up to 50% in Fijian populations and more than 85% in both the Independent Samoan and Tahitian populations were recorded (Fig. 13.2; Charlat et al. 2005, 2006).
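The map categories of Fig. 13.2 can be reproduced from the female infection percentages in Table 13.1, as in the following Python sketch. The 65% boundary comes from the Fig. 13.2 caption. Summing the wBol1-a and wBol1-b columns to obtain an overall wBol1 frequency, and treating a value of exactly 0% as "not infected", are assumptions made here for illustration; the names `FEMALE_WBOL1_PERCENT` and `infection_category` are hypothetical.

```python
# Percentage of females carrying wBol1 (wBol1-a plus wBol1-b), read from Table 13.1.
FEMALE_WBOL1_PERCENT = {
    "Philippines": 100,
    "Thailand": 100,
    "Ile des Pins": 83,
    "Fiji": 50,
    "Vanuatu": 30,
    "Australia": 0,
    "Independent Samoa": 100,
    "American Samoa": 0,
    "Moorea": 83,
    "Tahiti": 96,
    "Rurutu": 69,
    "Tubuai": 0,
}

HIGH_THRESHOLD = 65  # boundary given in the Fig. 13.2 caption


def infection_category(percent_infected):
    """Assign a female infection frequency (0-100) to a Fig. 13.2 map category."""
    if not 0 <= percent_infected <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if percent_infected == 0:
        return "not infected"  # assumption: no infected female in the sample
    if percent_infected < HIGH_THRESHOLD:
        return "low/medium"
    return "high"


for population, percent in FEMALE_WBOL1_PERCENT.items():
    print(f"{population}: {percent}% -> {infection_category(percent)}")
```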

13.3.2 Competition Between Wolbachia Infections

A number of possible reasons have been suggested to explain the heterogeneity in wBol1 infection rates (Fig. 13.2, Table 13.1). In the extreme case of Tubuai (Austral Islands Archipelago, French Polynesia), no butterflies were found to be infected by the male-killer strain wBol1, while on the closest neighboring island of Rurutu, only 210 km away, the female wBol1 infection rate was more than 75% (Charlat et al. 2005, 2006). It was found that butterflies from Tubuai were infected with another Wolbachia strain, named wBol2. The wBol2 strain is an A-group Wolbachia that is phylogenetically distant from wBol1, a B-group Wolbachia.

Table 13.1 Percentage of males and females (given as male/female) in different populations naturally uninfected (column 2) or infected by the different Wolbachia strains (columns 3–5). MK repressor gene presence is shown in column 6 (Charlat et al. 2005, 2006, 2007b; Hornett et al. 2006)

Populations    % Uninfected   % wBol1-a infected   % wBol1-b infected   % wBol2 infected   MK repressor gene
Philippines    0/0            100/100              0/0                  0/0                Present
Thailand       0/0            100/100              0/0                  0/0                Present
Ile des Pins   100/17         0/83                 0/0                  0/0
Fiji           100/50         0/50                 0/0                  0/0
Vanuatu        100/70         0/30                 0/0                  0/0
Australia      100/100        0/0                  0/0                  0/0
Ind. Samoa     0/0            100/100              0/0                  0/0                Present
Am. Samoa      0/0            0/0                  0/0                  100/100
Moorea         98/17          0/80                 0/3                  2/0
Tahiti         100/4          0/90                 0/6                  0/0
Rurutu         98/29          0/69                 0/0                  2/2
Tubuai         2/2            0/0                  0/0                  98/98

Crosses between wBol1-infected females and wBol2-infected males were fully incompatible and
led to inviable progeny. This phenotype was the result of wBol2-induced CI in H. bolina (Charlat et al. 2006). The competition between wBol2 and wBol1 and the strong CI observed between the two Wolbachia strains make the invasion of Tubuai by the MK strain, wBol1, extremely unlikely. The presence of wBol2 was reported on several other islands of the South Pacific where wBol1 was not shown to occur (Charlat et al. 2006).

13.3.3 When the MK Phenotype Is Repressed, wBol1 Induces CI

At the other extreme, all H. bolina from South East Asian populations were infected by wBol1, including males (Charlat et al. 2005; Hornett et al. 2006). Under the strong selection pressure exerted by the wBol1 infection, these butterflies have evolved resistance to the MK phenotype. The underlying repressor mutation led to the survival of male offspring and restored a balanced sex ratio (Hornett et al. 2006, Table 13.1). If wBol1 from host populations carrying the MK repressor gene were shown to retain the ability to induce MK in nonresistant hosts, this would suggest either (1) that the repressor gene was the result of an extremely recent mutation in the host or (2) that the MK character was linked to a desirable trait providing an advantage to the repressed wBol1. Otherwise, long-term evolution in a host population that represses MK might result in the loss of wBol1’s MK virulence, a character no longer able to spread in the population. Hornett and co-workers (2008) conducted crosses between MK-resistant H. bolina from SEA and nonresistant populations from French Polynesia (Moorea and Tahiti, Society Islands Archipelago) and tested whether wBol1 from SEA could induce MK. The SEA wBol1 infection was able to distort host
reproduction when transferred into a French Polynesian background, indicating that wBol1 from SEA can still induce the MK phenotype in nonresistant hosts. The study also revealed a complete failure in egg hatch when SEA males carrying both the MK infection and MK repressor gene(s) were crossed with uninfected females. Control crosses showed that the females were not sterile, suggesting that in addition to MK wBol1 also induces CI in this population of H. bolina (Hornett et al. 2008).

13.3.4 MK Wolbachia Diversity in H. bolina

More recently, Charlat et al. (2009) showed that the MK phenotype in H. bolina was induced by two substrains, wBol1-a and wBol1-b. Although they are extremely closely related phylogenetically (genetic variation between them has been found at only two loci), wBol1-a and wBol1-b show phenotypic differences that make them interesting candidates for comparative analysis (Charlat et al. 2009). wBol1-a and wBol1-b seem to differ in their sensitivity to the MK repressor from SEA. Preliminary results suggest that wBol1-a MK was repressed when transferred into a SEA background, while wBol1-b showed a persistent MK phenotype in this novel host background (Charlat pers. comm. 2007). These results suggest small variations in the genetic basis of MK between these two substrains. These two substrains also differ in their transmission level. The wBol1-b infection, which has been found only in French Polynesia and Vanuatu (Charlat et al. 2009; Table 13.1), was associated with mitochondrial haplotypes (mitotypes) 3 and 6. These mitotypes were also found in Wolbachia-free butterflies, suggesting imperfect vertical transmission of wBol1-b. In contrast, the most common strain, wBol1-a, was present on all the islands where wBol1 was previously described (Charlat et al. 2005, 2006; Table 13.1) and was strictly associated with mitotype 1. The almost complete absence of uninfected butterflies carrying mitotype 1 suggests a very high transmission efficiency of wBol1-a. More recent investigations into wBol1-a genetic variation have found no evidence that wBol1-a prevalence was related to genetic differences between wBol1-a populations (Duplouy et al. 2009). The age of the infection in the South Pacific islands may vary; for example, the wBol1-a invasion of Fiji could be more recent than that of Tahiti, where a larger proportion of females carried the infection.
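The comparisons drawn from Table 13.1 in this section can be checked with a short script. The following is only an illustrative sketch: the percentages are transcribed from Table 13.1, while the data layout and the function names (`prevalence`, `populations_with_infected_males`, `rank_by_female_prevalence`) are assumptions made for this example and do not come from the cited studies.

```python
# Table 13.1 transcribed as (male %, female %) pairs per infection class.
# Column order follows the table: uninfected, wBol1-a, wBol1-b, wBol2.
# The boolean records whether the table marks the MK repressor as "Present";
# False only means the cell is blank in the table, not proven absence.
STRAINS = ("uninfected", "wBol1-a", "wBol1-b", "wBol2")

TABLE_13_1 = {
    "Philippines":  (((0, 0), (100, 100), (0, 0), (0, 0)), True),
    "Thailand":     (((0, 0), (100, 100), (0, 0), (0, 0)), True),
    "Ile des Pins": (((100, 17), (0, 83), (0, 0), (0, 0)), False),
    "Fiji":         (((100, 50), (0, 50), (0, 0), (0, 0)), False),
    "Vanuatu":      (((100, 70), (0, 30), (0, 0), (0, 0)), False),
    "Australia":    (((100, 100), (0, 0), (0, 0), (0, 0)), False),
    "Ind. Samoa":   (((0, 0), (100, 100), (0, 0), (0, 0)), True),
    "Am. Samoa":    (((0, 0), (0, 0), (0, 0), (100, 100)), False),
    "Moorea":       (((98, 17), (0, 80), (0, 3), (2, 0)), False),
    "Tahiti":       (((100, 4), (0, 90), (0, 6), (0, 0)), False),
    "Rurutu":       (((98, 29), (0, 69), (0, 0), (2, 2)), False),
    "Tubuai":       (((2, 2), (0, 0), (0, 0), (98, 98)), False),
}

SEX_INDEX = {"male": 0, "female": 1}


def prevalence(population, strain, sex):
    """Percentage of one sex falling in the given infection class."""
    pairs, _ = TABLE_13_1[population]
    return pairs[STRAINS.index(strain)][SEX_INDEX[sex]]


def populations_with_infected_males(strain="wBol1-a"):
    """Populations in which at least some males carry the strain."""
    return sorted(p for p in TABLE_13_1 if prevalence(p, strain, "male") > 0)


def populations_with_repressor():
    """Populations marked 'Present' in the MK repressor column."""
    return sorted(p for p, (_, present) in TABLE_13_1.items() if present)


def rank_by_female_prevalence(strain="wBol1-a"):
    """Populations ordered from highest to lowest female prevalence."""
    return sorted(TABLE_13_1,
                  key=lambda p: prevalence(p, strain, "female"),
                  reverse=True)


print(populations_with_infected_males("wBol1-a"))
print(rank_by_female_prevalence("wBol1-a")[:6])
```

On these values, the populations in which wBol1-a-infected males are recorded are exactly those marked as carrying the MK repressor, and Tahiti (90% of females) ranks above Fiji (50% of females) for wBol1-a prevalence.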

13.3.5 A Rapidly Evolving System

The association between wBol1 and H. bolina has proved to be highly dynamic. In 2001, Samoan H. bolina populations were shown to have at most a single male per hundred females. In a 2006 survey, Charlat and colleagues (2007a) reported equal sex ratios and a second case of MK repression in the South Pacific. It was not known whether the genetic basis of MK resistance was similar in both SEA and Samoan populations. Nonetheless, the shift in population sex ratio from 100:1 to 1:1 in fewer than ten generations seemed to be one of the fastest ever recorded (Charlat et al. 2007a). A more ancient but similar evolution of a MK repressor gene has also been described in butterflies from Malaysian Borneo (Hornett et al. 2009). The spread of wBol1 through SEA and the South Pacific was estimated to have taken less than 3,000 years (Duplouy et al. 2009). However, in some populations, local invasions were suggested to have occurred more rapidly, on the scale of a century. Museum samples from different South Pacific islands were tested for both the infection type and prevalence in previous butterfly generations. In the 120 years from 1883 until 2002, the infection frequencies in the French Polynesian Islands of Ua Huka and Tahiti varied from very low prevalence (0% and less than 20%, respectively) to very high prevalence (more than 80%) (Hornett et al. 2009).

13.4 Open Questions in wBol1 Research

13.4.1 wBol1 Biogeography

The biogeography of the wBol1-a infection in the South Pacific was one of the most intriguing aspects of this system. The presence of butterfly populations on numerous South Pacific islands provided clear evidence of natural migrations occurring between islands; however, the range of these exchanges remained an unknown factor. Butterfly populations infected with the CI-inducing strain wBol2 were almost as common as wBol1-infected populations in the South Pacific. In contrast, populations where the two infections coexist have rarely been recorded, and doubly infected butterflies have never been found (Charlat et al. 2005). Models predicted that, in this system, a CI-infected population would resist MK invasion. Under the same conditions, a MK-infected population would only resist invasion by CI-inducing Wolbachia if the latter did not reach a certain frequency threshold (Freeland and McCabe 1997; Engelstädter et al. 2004). If the limit was exceeded, then the CI-inducing strain became more competitive and therefore spread into the population, driving the former MK infection to extinction (Engelstädter et al. 2004). Butterfly populations where wBol1 and wBol2 were in competition were rare in the South Pacific islands (Charlat et al. 2005; Engelstädter et al. 2008). This rarity suggested a low migration rate between islands, allowing MK-infected populations to resist wBol2 invasion.
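The notion of a frequency threshold for a CI-inducing strain can be illustrated with a deliberately simplified sketch. The code below is not the two-strain MK/CI model of Engelstädter et al. (2004); it is a textbook-style, single-strain recursion in which a CI strain with perfect maternal transmission invades an uninfected population. The parameters `s_f` (fecundity cost of infection) and `s_h` (fraction of eggs lost in incompatible crosses), and their values, are arbitrary assumptions chosen for illustration.

```python
def next_ci_frequency(p, s_f, s_h):
    """One generation of a single-strain CI recursion.

    p   : frequency of infected adults (same in both sexes)
    s_f : relative fecundity cost to infected females (assumed)
    s_h : hatch-rate reduction when uninfected females mate with
          infected males (assumed)
    Maternal transmission is assumed to be perfect.
    """
    # Infected mothers contribute p * (1 - s_f) offspring, all infected.
    # Uninfected mothers lose a fraction s_h of eggs in crosses with
    # infected males, which occur with probability p.
    total_offspring = 1.0 - s_f * p - s_h * p * (1.0 - p)
    return p * (1.0 - s_f) / total_offspring


def run_ci(p0, s_f, s_h, generations=200):
    """Iterate the recursion from starting frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_ci_frequency(p, s_f, s_h)
    return p


S_F, S_H = 0.1, 0.9          # arbitrary illustrative values
THRESHOLD = S_F / S_H        # unstable equilibrium of this recursion

print(f"threshold frequency: {THRESHOLD:.3f}")
print(f"start below threshold (0.05): {run_ci(0.05, S_F, S_H):.6f}")
print(f"start above threshold (0.20): {run_ci(0.20, S_F, S_H):.6f}")
```

In this simplified setting the infection is lost when it starts below `s_f / s_h` and sweeps to fixation when it starts above it, which is the qualitative behavior the models cited above describe for a CI strain entering a population; low migration between islands would keep an immigrant CI strain below such a threshold.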

13.4.2 Effects of MK Infection on Host Fitness

Endosymbiotic infections are generally costly to maintain as the symbionts exploit resources that are destined for their host (Haine 2008). In order to be maintained and spread within host populations, symbionts may develop strategies that enhance the fitness of infected hosts relative to uninfected individuals. Wolbachia strains have developed very intimate relationships with their hosts and stress treatments have shown that some strains are beneficial to their hosts (Hedges et al. 2008; Brownlie et al. 2009). Modeling predicts that MK fixation would lead to population extinction because of a severe shortage of males (Hamilton 1967; Hurst 1991; Randerson et al. 2000); however, wBol1 infection sometimes exceeds 75% of host individuals in a population. The success of infected individuals over uninfected ones suggests that wBol1 infection may confer a fitness advantage to its hosts, but the nature of this benefit has not yet been characterized in H. bolina.
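Why a high prevalence points toward a fitness advantage can be made concrete with a minimal recursion. This is a generic, textbook-style sketch and not a model taken from the works cited above: infected females are assumed to leave `1 + b` daughters for every daughter of an uninfected female, and to pass the infection to a fraction `alpha` of those daughters. Both parameter names and all numerical values are assumptions for illustration only.

```python
def next_mk_frequency(p, alpha, b):
    """Infection frequency among females in the next generation.

    p     : current frequency of infected females
    alpha : maternal transmission efficiency (assumed, 0..1)
    b     : relative advantage of infected females in daughters
            produced (assumed, must be > 0)
    """
    infected_daughters = alpha * (1.0 + b) * p
    all_daughters = (1.0 + b) * p + (1.0 - p)
    return infected_daughters / all_daughters


def mk_equilibrium(alpha, b):
    """Equilibrium p* = (alpha * (1 + b) - 1) / b, clipped to [0, 1]."""
    return min(1.0, max(0.0, (alpha * (1.0 + b) - 1.0) / b))


def run_mk(p0, alpha, b, generations=5000):
    """Iterate the recursion from starting frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_mk_frequency(p, alpha, b)
    return p


# Arbitrary illustrative parameter sets.
print(run_mk(0.01, alpha=0.99, b=0.04))   # settles near 0.74
print(run_mk(0.01, alpha=1.00, b=0.04))   # approaches fixation
print(run_mk(0.50, alpha=0.95, b=0.04))   # infection is lost
```

Under these assumptions the infection persists only if `alpha * (1 + b) > 1`, so without some advantage `b` a male killer with imperfect transmission cannot be maintained, whereas perfect transmission combined with any advantage drives the recursion toward fixation, the scenario that models associate with a shortage of males.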

13.4.2.1 Benefits from the Infection in Other Host/MK Wolbachia Associations

Direct benefits from the infection, such as an increased size, fecundity, or longevity, have been recorded in different associations with MK Wolbachia (Ikeda 1970; Majerus and Hurst 1997; Fry et al. 2004). These observations contrast with the wBol1/H. bolina system where no benefit of this type has been shown (Dyson and Hurst 2004; Charlat et al. 2007b). Similarly, although indirect benefits from MK infection have been described in several other MK systems, none has yet been associated with a fitness increase in wBol1-infected butterflies. Werren (1987) suggested that MK endosymbionts could reduce sibling inbreeding, thereby favoring infected females. This explanation makes sense for species that lay many eggs on the same plant and are not very mobile after hatching. H. bolina, however, lays few eggs per plant and is a good migrant: individuals have frequently invaded New Zealand from Australia, a journey of 2,000 km (Ramsay 1971; Ryan and Harris 1990; Patrick 2004). Majerus and Hurst (1997) suggested that the success of MK strains in ladybirds (e.g., Adalia bipunctata) was correlated with particular host characteristics. These include cannibalism at various developmental stages, so that infected females gain nutrition from feeding on their dead brothers, and large clutch sizes, as MK reduces sibling competition for food by halving the number of siblings. H. bolina does not exhibit the characteristics of a host in which a MK Wolbachia would be successful. This butterfly is strictly herbivorous during its larval stages and as an adult feeds exclusively on nectar. As such, male death would provide no direct nutritional benefit to infected sisters, and, as females lay only 1–2 eggs per plant, food competition would be limited (Nafus 1993; Kemp 1998).

13.4.2.2 Alternative Hypotheses

wBol1-a may confer a “hidden” selective advantage to infected hosts (Duplouy et al. 2009). Insects are often infected with entomopathogenic agents (fungi, viruses or bacteria). Phytophagous insects, such as butterflies, also have to avoid plant defenses, such as toxic compounds, developed by their host plants to fight against natural enemies (Lindroth 1989; Li et al. 2003; Wen et al. 2006). Caterpillars are common prey for parasitoid or predatory wasps such as Cotesia spp. or Polistes spp. (Stamp and Bowers 1988; Nafus 1993; Beckage et al. 1994; van Nouhuys and Hanski 2005). These selective pressures allow the survival of only resistant or adapted individuals (Hochberg 1991; Russell and Moran 2005; Moran 2006; Haine 2008). Wolbachia may confer a benefit on their hosts when these are exposed to toxins and/or parasites, and thereby increase their prevalence within host populations. Recent studies have shown that Wolbachia-infected flies show delayed mortality after virus infection (Hedges et al. 2008; Teixeira et al. 2008). Investigating the effect of wBol1 infection in a metacommunity involving the host, the symbiont, and at least a third party such as a virus or a parasitoid wasp could provide insights into the fitness benefit(s) that this infection provides to the butterfly host. Such benefit(s) could help explain the striking success of wBol1-a in H. bolina.

13.4.3 Mechanisms of MK

13.4.3.1 Cytology of MK

Two types of MK have been characterized based on the timing of male death (Hurst 1991). “Early MK” occurs during embryogenesis while “late MK” takes effect during larval or pupal stages. Both early and late MK were observed in Wolbachia-infected insects (Hurst et al. 1999b; Fialho and Stevens 2000; Jiggins et al. 2000; Dyson et al. 2002; Jaenike 2007); however, the underlying mechanisms of either phenomenon have not yet been elucidated. Studies on MK Spiroplasma-infected Drosophila have shown that male embryo death was associated with abnormal mitoses, while later death was caused by degeneration of cell nuclei (pycnosis) (Counce and Poulson 1962). In a similar system, modification of the dosage compensation complex (DCC), which is involved in sex differentiation, can also rescue males from MK symbionts. This indicates that the DCC may be involved in expression of the MK phenotype (Veneti et al. 2005). Although MK in Wolbachia-infected insects must also involve host sex determination, similar mechanisms to those in Spiroplasma associations have not yet been identified. One study showed that treatment of wBol1-a-infected butterflies with bacteriostatic antibiotics delayed the MK effect. This demonstrates that wBol1 was able to identify male individuals and induce MK at different time points during host development (Charlat et al. 2007c). However, it is unknown whether the basic mechanisms of the MK phenotype are identical at each time point. As suggested, MK could be expressed through different pathways (Hurst and Jiggins 2000), which would complicate the identification of the mechanistic basis of these MK phenotypes.

13.4.3.2 Genomics of MK

To date, the genomes of one mutualistic and three CI-inducing Wolbachia strains have been sequenced (Wu et al. 2004; Foster et al. 2005; Klasson et al. 2008, 2009) and several others are underway. Wolbachia’s intracellular biology has hampered the completion of whole genome-sequencing projects. The genome sequence of the MK strain wBol1 is nearing completion (Duplouy pers. comm.), and analysis of the first chromosomal DNA sequence of a MK Wolbachia strain will certainly be of great value. Comparative genomic analysis of wBol1 with the closely related and fully sequenced wPip strain, which induces CI in Culex mosquitoes, should provide a unique opportunity to investigate the evolution of Wolbachia genomes across relatively short evolutionary timescales. This first genomic comparison between a MK strain and a CI-inducing strain offers opportunities to test hypotheses concerning the evolution and induction of the MK phenotype, such as identifying candidate genes involved in both MK and CI. Previous whole genome analyses have attempted to link genetic elements, such as ankyrin coding genes, to the induction of different reproductive manipulations (Iturbe-Ormaetxe et al. 2005; Duron et al. 2007; Walker et al. 2007; Klasson et al. 2008). Ankyrin repeat domains are believed to be involved in cellular and molecular functions via protein–protein interactions (Caturegli et al. 2000; Mosavi et al. 2004). Twenty-three, 29, and 60 ankyrin genes have been annotated in the Wolbachia strains wMel, wRi, and wPip, respectively (Wu et al. 2004; Klasson et al. 2009; Walker et al. 2007), while wBm seems to contain only 5 ankyrin coding genes (Foster et al. 2005). wBol1 is phylogenetically close to the wPip strain, and it is therefore expected that the MK strain also contains a large number of ankyrin coding genes.

The number and density of ankyrin coding genes in pathogenic strains make them good candidates in the search for genes likely to play a role in the interactions between Wolbachia and its host (Iturbe-Ormaetxe et al. 2005; Duron et al. 2007; Walker et al. 2007). Despite intensive efforts, Wolbachia transformation is currently not an available technique. While waiting for an efficient transformation protocol for Wolbachia, genomic comparison of Wolbachia strains may provide extremely valuable data. If the mechanisms of MK are similar across strains, the genetic basis of this phenotype should be conserved between these strains and putative MK genes could potentially be identified. Genome comparisons of phylogenetically related strains such as wBol1 and wPip or wBol1-a and wBol1-b, which induce different phenotypes in their hosts, may identify highly variable genes or genetic features potentially involved in the induction of the observed phenotypic differences. To date, only protein coding genes have been investigated as potential genetic mechanisms underlying Wolbachia-induced phenotypes (Sinkins et al. 2005; Walker et al. 2007; Duron et al. 2007). Small RNA molecules (sRNAs) are known in other systems to act through RNA interference (RNAi) to regulate translation of targeted genes (Tjaden et al. 2006 and references therein). Similarly, MK Wolbachia could use sRNAs, rather than proteins, to distort their hosts’ reproductive system. Comparative projects should therefore not only focus on protein coding genes present in Wolbachia genomes, but also on the diversity of sRNA sequences, as they could also play a key role in the distortion of host reproductive systems.
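The logic of such a pairwise genome comparison can be sketched in a few lines. The example below is purely illustrative: the gene identifiers and sequences are hypothetical placeholders, not real wBol1 or wPip annotations, and the function `compare_genomes` is an assumption made for this sketch rather than an existing tool. It simply flags genes found in only one strain and shared genes whose sequences differ, the two classes of candidates discussed above.

```python
def compare_genomes(genome_a, genome_b):
    """Flag candidate genes from two {gene_id: sequence} dictionaries.

    Returns genes present in only one genome and shared genes whose
    sequences are not identical.
    """
    ids_a, ids_b = set(genome_a), set(genome_b)
    shared = ids_a & ids_b
    return {
        "only_in_a": sorted(ids_a - ids_b),
        "only_in_b": sorted(ids_b - ids_a),
        "divergent": sorted(g for g in shared
                            if genome_a[g] != genome_b[g]),
    }


# Hypothetical toy annotations (placeholder identifiers and sequences).
MK_STRAIN_TOY = {
    "ank01": "ATGGCA",
    "ank02": "ATGTTT",
    "hyp03": "ATGCCC",
}
CI_STRAIN_TOY = {
    "ank01": "ATGGCA",
    "ank02": "ATGTAT",
    "ank04": "ATGAAA",
}

candidates = compare_genomes(MK_STRAIN_TOY, CI_STRAIN_TOY)
print(candidates)
```

A real analysis would rely on orthology assignment and sequence alignment rather than exact string matching, and could be applied to sRNA loci as well as protein coding genes.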

13.4.3.3 Role of the Host in the Expression of MK

We have already described the symbiont-induced effects on different aspects of host biology; however, biological interactions are rarely unidirectional. Hosts can also act to mitigate any negative fitness effects associated with the symbiont. These interactions have been highlighted in different Wolbachia associations; however, the molecular mechanisms that underlie these interactions are not understood. In SEA and Samoa, H. bolina evolved resistance to the MK phenotype of wBol1-a, saving males from embryonic death (Hornett et al. 2006; Charlat et al. 2007a). Although the investigation of the butterfly genetics is in progress and should soon provide answers (Hornett pers. comm. 2009), it is not yet known whether the resistance mechanism involves one or several genes, and whether this resistance is identical in both the SEA and Samoan populations (Charlat et al. 2007a). This repression, however, confirms the active involvement of the host in the phenotype induced by its symbiont. More interestingly, butterfly resistance to MK resulted in wBol1-a shifting to inducing CI (Hornett et al. 2008). In general, the reproductive phenotype observed in the natural host has been maintained in transfected hosts (Braig et al. 1994; Riegler et al. 2004; Sakamoto et al. 2005; McMeniman et al. 2008); however, immediate phenotypic shifts after transfection have been reported (Sasaki et al. 2002, 2005; Jaenike 2007). Phylogenetic studies of Wolbachia have demonstrated that very closely related strains express different phenotypes in their native hosts (Baldo et al. 2006), suggesting that shifts in phenotype expression are probably more common than originally thought. It also suggests that MK and CI might share a similar molecular basis that is differently expressed depending on host genotype (Jaenike 2007). Both phenotypes could be mechanistically similar; however, MK has evolved to be more extreme in its outcome than CI.

13.5 Conclusion

Wolbachia have attracted the attention of a large scientific community hoping to understand the biology of this bacterium, which induces such a wide range of host phenotypes and has great potential as a biological control agent of insect pests and human diseases (Brelsfoard et al. 2009; McMeniman et al. 2009; Moreira et al. 2009). Many discoveries have been made in the last decade, but a multitude of questions still remain to be answered. MK is one of the least known Wolbachia phenotypes. Although we have a relatively good understanding of how MK Wolbachia affect host populations, genetics, and dynamics, the cytological and genomic aspects underlying the MK phenotype remain poorly understood. We may come closer to finding answers with projects such as whole genome comparison of MK strains, but we are still far from having resolved all of Wolbachia’s mysteries.

Acknowledgments We would like to thank Dr. I. Iturbe-Ormaetxe, Dr. M. Woolfit and Dr. P. Cook for very constructive comments on the manuscript. We are grateful to the Australian Research Council (DP0772992) and to The University of Queensland (UQCS and UQIRTA) for provision of the funds.

References

Baldo L, Hotopp JCD, Jolley KA, Bordenstein SR, Biber SA, Choudhury RR, Hayashi C, Maiden MCJ, Tettelin H, Werren JH (2006) Multilocus sequence typing for Wolbachia. Appl Environ Microbiol 72(11):7098–7110
Bandi C, Damiani G, Magrassi L, Grigolo A, Fani R, Sacchi L (1994) Flavobacteria as intracellular symbionts in cockroaches. Proc Biol Sci 257:43–48
Bandi C, Anderson TJC, Genchi C, Blaxter ML (1998) Phylogeny of Wolbachia in filarial nematodes. Proc Biol Sci 265:2407–2413
Beckage NE, Tan FF, Schleifer KW, Lane RD, Cherubin LL (1994) Characterization and biological effects of Cotesia congregata polydnavirus on host larvae of the tobacco hornworm, Manduca sexta. Arch Insect Biochem Physiol 26:165–195
Bourtzis K, Miller TA (eds) (2003) Insect symbiosis. CRC Press, New York, NY
Bourtzis K, Miller TA (eds) (2006) Insect symbiosis, vol 2. CRC Press, New York, NY
Braig HR, Guzman H, Tesh RB, O’Neill SL (1994) Replacement of the natural Wolbachia symbiont of Drosophila simulans with a mosquito counterpart. Nature 367:453–455
Brelsfoard CL, StClair W, Dobson SL (2009) Integration of irradiation with cytoplasmic incompatibility to facilitate a lymphatic filariasis vector elimination approach. Parasit Vectors 2:38
Brownlie JC, Cass BN, Riegler M, Witsenburg JJ, Iturbe-Ormaetxe I, McGraw EA, O’Neill SL (2009) Evidence for metabolic provisioning by a common invertebrate endosymbiont, Wolbachia pipientis, during periods of nutritional stress. PLoS Pathog 5:6
Caturegli P, Asanovich KM, Walls JJ, Bakken JS, Madigan JE, Popov VL, Dumler JS (2000) ankA: an Ehrlichia phagocytophila group gene encoding a cytoplasmic protein antigen with ankyrin repeats. Infect Immun 68(9):5277–5283
Charlat S, Hornett EA, Dyson EA, Ho PPY, Thi-Loc N, Schilthuizen M, Davies N, Roderick GK, Hurst GDD (2005) Prevalence and penetrance variation of male-killing Wolbachia across Indo-Pacific populations of the butterfly Hypolimnas bolina. Mol Ecol 14:3525–3530
Charlat S, Engelstädter J, Dyson E, Hornett E, Duplouy A, Tortosa P, Davies N, Roderick G, Wedell N, Hurst G (2006) Competing selfish genetic elements in the butterfly Hypolimnas bolina. Curr Biol 16:2453–2458
Charlat S, Hornett EA, Fullard JH, Davies N, Roderick GK, Wedell N, Hurst GDD (2007a) Extraordinary flux in sex ratio. Science 317:214
Charlat S, Reuter M, Dyson EA, Hornett EA, Duplouy A, Davies N, Roderick GK, Wedell N, Hurst GDD (2007b) Male-killing bacteria trigger a cycle of increasing male fatigue and female promiscuity. Curr Biol 17:273–277
Charlat S, Davies N, Roderick GK, Hurst GDD (2007c) Disrupting the timing of Wolbachia-induced male-killing. Biol Lett 3:154–156
Charlat S, Duplouy A, Hornett EA, Dyson EA, Davies N, Roderick GK, Wedell N, Hurst GDD (2009) The joint evolutionary histories of Wolbachia and mitochondria in Hypolimnas bolina. BMC Evol Biol 9:64


Chen D-Q, Montllor CB, Purcell AH (2000) Fitness effects of two facultative endosymbiotic bacteria on the pea aphid, Acyrthosiphon pisum, and the blue alfalfa aphid, A. kondoi. Entomol Exp Appl 95:315–323
Christensen B (2004) Tracking of migrant blue moon butterfly, Hypolimnas bolina nerina, using web-based software. Weta 28:47–48
Clarke C, Sheppard PM (1975) The genetics of the mimetic butterfly Hypolimnas bolina (L.). Philos Trans R Soc Lond B Biol Sci 272(917):229–265
Clarke C, Sheppard P, Scali V (1975) All-female broods in the butterfly Hypolimnas bolina (L.). Proc Biol Sci 189:29–37
Clarke SC, Jonhson G, Jonson B (1983) All-female broods in Hypolimnas bolina (L.). A re-survey of West Fiji after 60 years. Biol J Linn Soc 19:221–235
Common IFB, Waterhouse DF (1972) Butterflies of Australia. Angus and Robertson, Sydney
Cordaux R, Michel-Salzat A, Frelon-Raimond M, Rigaud T, Bouchon D (2004) Evidence for a new feminizing Wolbachia strain in the isopod Armadillidium vulgare: evolutionary implications. Heredity 93:78–84
Counce SJ, Poulson DF (1962) Developmental effects of the sex-ratio agent in embryos of Drosophila willistoni. J Exp Zool 151:17–31
Covacin C, Barker SC (2007) Supergroup F Wolbachia bacteria parasite lice (Insecta: Phthiraptera). Parasitol Res 100:479–485
Duplouy A, Hurst GDD, O’Neill SL, Charlat S (2009) Rapid spread of male-killing Wolbachia in the butterfly Hypolimnas bolina. J Evol Biol. doi:10.1111/j.1420-9101.2009.01891.x
Duron O, Boureux A, Echaubard P, Berthomieu A, Berticat C, Fort P, Weill M (2007) Variability and expression of ankyrin domain genes in Wolbachia infecting the mosquito Culex pipiens. J Bacteriol 189(12):4442–4448
Dyson EA, Hurst GDD (2004) Persistence of an extreme sex-ratio bias in a natural population. PNAS 101(17):6520–6523
Dyson E, Kamath M, Hurst G (2002) Wolbachia infection associated with all-female broods in Hypolimnas bolina (Lepidoptera: Nymphalidae): evidence for horizontal transmission of a butterfly male killer. Heredity 88:166–171
Engelstädter J, Telschow A, Hammerstein P (2004) Infection dynamics of different Wolbachia-types within one host population. J Theor Biol 231:345–355
Engelstädter J, Telschow A, Yamamura N (2008) Coexistence of cytoplasmic incompatibility and male-killing-inducing endosymbionts, and their impact on host flow. Theor Popul Biol 73:125–133
Fialho RF, Stevens L (2000) Male-killing Wolbachia in a flour beetle. Proc Biol Sci 267:1469–1474
Foster J, Ganatra M, Kamal I, Ware J, Makarova K, Ivanova N, Bhattacharyya A, Kapatral V, Kumar S, Posfai J, Vincze T, Ingram J, Moran L, Lapidus A, Omelchenko M, Kyrpides N, Ghedin E, Wang S, Goltsman E, Joukov V, Ostrovskaya O, Tsukerman K, Mazur M, Comb D, Koonin E, Slatko B (2005) The Wolbachia genome of Brugia malayi: endosymbiont evolution within a human pathogenic nematode. PLoS Biol 3:599–614
Freeland SJ, McCabe BK (1997) Fitness compensation and the evolution of selfish cytoplasmic elements. Heredity 78:391–402
Fry AJ, Palmer MR, Rand DM (2004) Variable fitness effects of Wolbachia infection in Drosophila melanogaster. Heredity 93:379–389
Ghelelovitch S (1952) Sur le determinisme genetique de la sterilite dans les croisements entre differentes souches de Culex autogenicus Roubaud. C R Acad Sci III 234:2386–2388
Gibbs GW (1961) New Zealand butterflies. Tuatara J Biol Soc 9:65–76
Gibson CM, Hunter MS (2009) Inherited fungal and bacterial endosymbiont of a parasitic wasp and its cockroach host. Microb Ecol 57(3):542–549
Haine ER (2008) Symbiont-mediated protection. Proc Biol Sci 275:353–361
Hamilton WD (1967) Extraordinary sex ratios. Science 156(774):477–488


Hedges LM, Brownlie JC, O’Neill SL, Johnson KN (2008) Wolbachia and virus protection in insects. Science 322:702
Hertig M, Wolbach SB (1924) Studies on Rickettsia-like microorganisms in insects. J Med Res 44:329–374
Hochberg ME (1991) Viruses as costs to gregarious feeding behaviors in the Lepidoptera. Oikos 61(3):291–296
Hornett EA, Charlat S, Duplouy AMR, Davies N, Roderick GK, Wedell N, Hurst GDD (2006) Evolution of male killer suppression in natural population. PLoS Biol 4(9):e283
Hornett EA, Duplouy AMR, Davies N, Roderick GK, Wedell N, Hurst GDD, Charlat S (2008) You can’t keep a good parasite down: evolution of a male-killer suppressor uncovers cytoplasmic incompatibility. Evolution 62(5):1258–1263
Hornett EA, Charlat S, Wedell N, Jiggins CD, Hurst GDD (2009) Rapidly shifting sex ratio across a species range. Curr Biol 19:1628–1631
Huigens ME, Luck RF, Klaassen RHG, Maas MFPM, Timmermans MJTN, Stouthamer R (2000) Infectious parthenogenesis. Nature 405:178–179
Hunter MS, Perlman SJ, Kelly SE (2003) A bacterial symbiont in the Bacteroidetes induces cytoplasmic incompatibility in the parasitoid wasp Encarsia pergandiella. Proc Biol Sci 270:2185–2190
Hurst L (1991) The incidences and evolution of cytoplasmic male killers. Proc Biol Sci 244:91–99
Hurst GDD, Jiggins FM (2000) Male-killing bacteria in insects: mechanisms, incidence, and implications. Emerg Infect Dis 6(4):329–336
Hurst GDD, Hurst LD, Majerus MEN (1997) Cytoplasmic sex ratio distorters. In: O’Neill SL, Hoffmann AA, Werren JH (eds) Influential passengers, inherited microorganisms and arthropod reproduction. Oxford University Press Inc, New York, pp 125–154
Hurst GDD, van der Schulenburg JHG, Majerus TMO, Bertrand D, Zakharov IA, Baungaard J, Volkl W, Stouthamer R, Majerus MEN (1999a) Invasion of one insect species, Adalia bipunctata, by two different male-killing bacteria. Insect Mol Biol 8(1):133–139
Hurst GDD, Jiggins FM, van der Schulenburg JHG, Bertrand D, West SA, Goriacheva II, Zakharov IA, Werren JH, Stouthamer R, Majerus MEN (1999b) Male-killing Wolbachia in two species of insect. Proc Biol Sci 266(1420):735–740
Hurst GDD, Jiggins FM, Majerus MEN (2003) Inherited microorganisms that selectively kill male hosts: the hidden players of insect evolution? In: Bourtzis K, Miller TA (eds) Insect symbiosis. CRC Press, New York, NY, pp 177–197
Ikeda H (1970) The cytoplasmic-inherited ‘sex-ratio-condition’ in natural and experimental populations of Drosophila bifasciata. Genetics 65:311–333
Iturbe-Ormaetxe I, Riegler M, O’Neill SL (2005) New names for old strains? Wolbachia wSim is actually wRi. Genome Biol 6:401
Jaenike J (2007) Spontaneous emergence of a new Wolbachia phenotype. Evolution 61(9):2244–2252
Jiggins FM, Hurst GDD, Jiggins CD, von der Schulenburg JHG, Majerus MEN (2000) The butterfly Danaus chrysippus is infected by a male-killing Spiroplasma bacterium. Parasitology 120:439–446
Kageyama D, Narita S, Noda H (2008) Transfection of feminizing Wolbachia endosymbionts of the butterfly, Eurema hecabe, into the cell culture and various immature stages of the silkmoth, Bombyx mori. Microb Ecol 56(4):733–741
Kemp DJ (1998) Oviposition behaviour of post-diapause Hypolimnas bolina (L.) (Lepidoptera: Nymphalidae) in tropical Australia. Aust J Zool 46:451–459
Klasson L, Walker T, Sebaihia M, Sanders MJ, Quail MA, Lord A, Sanders S, Earl J, O’Neill SL, Thomson N, Sinkins SP, Parkhill J (2008) Genome evolution of Wolbachia strain wPip from the Culex pipiens group. Mol Biol Evol 25(9):1877–1887
Klasson L, Westberg J, Sapountzis P, Naslund K, Lutnaes Y, Darby AC, Veneti Z, Chen L, Braig HR, Garrett R, Bourtzis K, Andersson SGE (2009) The mosaic genome structure of the Wolbachia wRi strain infecting Drosophila simulans. PNAS 106(14):5725–5730


Laven H (1959) Speciation by cytoplasmic isolation in the Culex pipiens complex. Cold Spring Harb Symp Quant Biol 24:166–175
Li W, Schuler MA, Berenbaum MR (2003) Diversification of furanocoumarin-metabolizing cytochrome P450 monooxygenases in two papilionids: specificity and substrate encounter rate. PNAS 100(Suppl 2):14593–14598
Lindroth RL (1989) Host plant alteration of detoxication activity in Papilio glaucus glaucus. Entomol Exp Appl 50:29–35
Lo N, Evans TA (2007) Phylogenetic diversity of the intracellular symbiont Wolbachia in termites. Mol Phylogenet Evol 44:461–466
Lo N, Paraskevopoulos C, Bourtzis K, O’Neill SL, Werren JH, Bordenstein SR, Bandi C (2007) Taxonomic status of the intracellular bacterium Wolbachia pipientis. Int J Syst Evol Microbiol 57:654–657
Majerus MEN, Hurst GDD (1997) Ladybirds as a model for the study of male-killing symbionts. Entomophaga 42(1/2):13–20
McMeniman CJ, Lane AM, Fong AW, Voronin DA, Iturbe-Ormaetxe I, Yamada R, McGraw EA, O’Neill SL (2008) Host adaptation of a Wolbachia strain after long-term serial passage in mosquito cell lines. Appl Environ Microbiol 74(22):6963–6969
McMeniman CJ, Lane RV, Cass BN, Fong AWC, Sidhu M, Wang Y-F, O’Neill SL (2009) Stable introduction of a life-shortening Wolbachia infection into the mosquito Aedes aegypti. Science 323:141–144
Moran NA (2006) Symbiosis. Curr Biol 16(20):866–871
Moran NA, Munson MA, Baumann P, Ishikawa H (1993) A molecular clock in endosymbiotic bacteria is calibrated using the insect hosts. Proc Biol Sci 253:167–171
Moran NA, Baumann P, von Dohlen C (1994) Use of DNA sequences to reconstruct the history of the association between members of the Sternorrhyncha (Homoptera) and their bacterial endosymbionts. Eur J Entomol 91:79–83
Moran NA, Dunbar HE, Wilcox JL (2005) Regulation of transcription in a reduced bacterial genome: nutrient-provisioning genes of the obligate symbiont Buchnera aphidicola. J Bacteriol 187(12):4229–4237
Moreira LA, Iturbe-Ormaetxe I, Jeffery JAL, Lu G, Pyke AT, Hedges LM, Rocha BC, Hall-Mendelin S, Day A, Riegler M, Hugo LE, Johnson KN, Kay BH, McGraw EA, van der Hurk AF, Ryan PA, O’Neill SL (2009) A Wolbachia symbiont in Aedes aegypti limits infection with dengue, chikungunya and Plasmodium. Cell 139(7):1268–1278
Morishita and Kazuhiko (2002) A migrant from an oceanic island – Hypolimnas bolina, 6 days stay near Zushi Beach, Kanagawa, Japan. Butterflies 32:24–26
Mosavi LK, Cammett TJ, Desrosiers DC, Peng Z-Y (2004) The ankyrin repeat as molecular architecture for protein recognition. Protein Sci 13:1435–1448
Nafus DM (1993) Movement of introduced biological control agents onto nontarget butterflies, Hypolimnas spp. (Lepidoptera: Nymphalidae). Environ Entomol 22(2):265–272
Narita S, Kageyama D, Nomura M, Fukatsu T (2007) Unexpected mechanism of symbiont-induced reversal of insect sex: feminizing Wolbachia continuously acts on the butterfly Eurema hecabe during larval development. Appl Environ Microbiol 73(13):4332–4341
Noda H, Kodama K (1996) Phylogenetic position of yeast-like endosymbionts of Anobiid beetles. Appl Environ Microbiol 62(1):162–167
O’Neill SL, Hoffmann AA, Werren JH (1997) Influential passengers, inherited microorganisms and arthropod reproduction. Oxford University Press Inc., New York
Oliver KM, Campos J, Moran NA, Hunter MS (2007) Population dynamics of defensive symbionts in aphids. Proc Biol Sci 275:293–299
Patrick BH (2004) Invasion of the blue moon butterfly in Taranaki. Weta 28:45–46
Perlman SJ, Kelly SE, Hunter MS (2008) Population biology of cytoplasmic incompatibility: maintenance and spread of Cardinium symbionts in a parasitic wasp. Genetics 178:1003–1011
Poulton EB (1923) All female families of Hypolimnas bolina, bred in Fiji by HW Simmonds. Proc R Ent Soc Lond 1923:9–12

226

A. Duplouy and S.L. O’Neill

Ramsay GW (1971) The blue moon butterfly Hypolimnas bolina nerina in New Zealand during autumn, 1971. N Z Entomol 5:73–75 Ramsay GW, Ordish RG (1966) The Australian blue moon butterfly Hypolimnas bolina nerina (F.) in New Zealand. NZ J Sci 9:719–729 Randerson JP, Smith NGC, Hurst LD (2000) The evolutionary dynamics of male-killers and their hosts. Heredity 84:152–160 Riegler M, Charlat S, Stauffer C, Mercot H (2004) Wolbachia transfer from Rhagoletis cerasi to Drosophila simulans: investigating the outcomes of host-symbiont coevolution. Appl Environ Microbiol 70(1):273–279 Rigaud T (1997) Inherited microorganisms and sex determination of arthropod hosts. In: O’Neill SL, Hoffmann AA, Werren JH (eds) Influential passengers, inherited microorganisms and arthropod reproduction. Oxford University Press Inc, New York, pp 81–101 Ruan Y-M, Xu J, Liu S-S (2006) Effects of antibiotics on fitness of the B biotype and a non-B biotype of the whitefly Bemisia tabaci. Entomol Exp Appl 121:159–166 Russel JA, Moran NA (2005) Horizontal transfer of bacterial symbiont: heritability and fitness in a novel aphid host. Appl Environ Microbiol 71(12):7987–7994 Ryan PA, Harris AC (1990) A note of recent records of Australian butterflies in New Zealand. N Z Entomol 13:40–41 Sakamoto H, Ishikawa Y, Sasaki T, Kikuyama S, Tatsuki S, Hoshizaki S (2005) Transinfection reveals the crucial importance of Wolbachia genotypes in determining the type of reproductive alteration in the host. Genet Res 85:205–210 Sasaki T, Kubo T, Ishikawa H (2002) Interspecific transfer of Wolbachia between two lepidopteran insects expressing cytoplasmic incompatibility: a Wolbachia variant naturally infecting Cadra cautella causes male-killing in Ephesia kuehniella. Genetics 162:1313–1319 Sasaki T, Massaki N, Kubo T (2005) Wolbachia variant that induces two distinct reproductive phenotypes in different hosts. Heredity 95:389–393 Simmonds HW (1926) Sex ratio of Hypolimnas bolina in Viti Levu, Fiji. 
Proc R Ent Soc Lond 1:29–32 Sinkins SP, Walker T, Lynd AR, Steven AR, Makepeace BL, Godfray HC, Parkhill J (2005) Wolbachia variability and host effects on crossing type in Culex mosquitoes. Nature 14:257–260 Stamp NE, Bowers MD (1988) Direct and indirect effects of predatory wasps (Polistes sp.: Vespidae) on gregarious caterpillars (Hemileuca lucina: Saturniidae). Oecologia 75:619–624 Stouthamer R, Kazmer D (1994) Cytogenetics of microbe-associated parthenogenesis and its consequences for gene flow in Trichogramma wasps. Heredity 73:317–327 Stouthamer R, Breeuwer JAJ, Hurst GDD (1999) Wolbachia pipientis: microbial manipulator of arthropod reproduction. Annu Rev Microbiol 53:71–102 Taylor MJ, Hoerauf A (1999) Wolbachia bacteria of filarial nematodes. Parasitol Today 15 (11):437–442 Teixeira L, Ferreira A, Ashburner M (2008) The bacterial symbiont Wolbachia induces resistance to RNA viral infections in Drosophila melanogaster. PLoS Biol 6(12):2753–2763 Tjaden B, Goodwin SS, Opdyke JA, Guillier M, Fu DX, Gottesman S, Storz G (2006) Target prediction for small, noncoding RNAs in bacteria. Nucleic Acids Res 34(9):2791–2802 Tram U, Sullivan W (2002) Role of delayed nuclear envelope breakdown and mitosis in Wolbachia-induced cytoplasmic incompatibility. Science 296:1124–1126 van Nouhuys S, Hanski I (2005) Metacommunities of butterflies, their host plant, and their parasitoids. In: Holyoak M, Leibold MA, Holt RD (eds) Metacommunities spatial dynamics and ecological communities. University of Chicago Press, USA Vandekerckhove TTM, Watteyne S, Willems A, Swings JG, Mertens J, Gillis M (1999) Phylogenetic analysis of the 16 S rDNA of the cytoplasmic bacterium Wolbachia from the novel host Folsomia candida (Hexpoda, Collembola) and its implications for Wolbachia taxonomy. FEMS Microbiol Lett 180:179–286

13

Male-Killing Wolbachia in the Butterfly Hypolimnas bolina

227

Veneti Z, Bentley JK, Koana T, Braig HR, Hurst GDD (2005) A functional dosage compensation complex required for male-killing in Drosophila. Science 307:1461–1463 Walker T, Klasson L, Sebaihia M, Sanders MJ, Thomson NR, Parkhill J, Sinkins SP (2007) Ankyrin repeat domain-encoding genes in the wPip strain of Wolbachia from the Culex pipiens group. BMC Biol 5(39):1–9 Weeks AR, Marec F, Breeuwer JAJ (2001) A mite species that consists entirely of haploid females. Science 292:2479–2482 Wen Z, Rupasinghe S, Niu G, Berenbaum MR, Schuler MA (2006) CYP6B1 and CYP6B3 of the Black Swallowtail (Papilio polyxenes): adaptative evolution through subfunctionalization. Mol Biol Evol 23(12):2434–2443 Werren JH (1987) The coevolution of autosomal and cytoplasmic sex ratio factors. J Theor Biol 124:317–334 Werren JH, O’Neill SL (1997) The evolution of heritable symbionts. In: O’Neill SL, Hoffmann AA, Werren JH (eds) Influential passengers, inherited microorganisms and arthropods reproduction. New York, Oxford University Press Inc., pp 1–41 Werren JH, Windsor D, Guo L (1995) Distribution of Wolbachia among neotropical arthropods. Proc Biol Sci 262:197–204 Wu M, Sun LV, Vamathevan J, Riegler M, Deboy R, Brownlie JC, McGraw EA, Martin W, Esser C, Ahmadinejad N, Wiegand C, Madupu R, Beanan MJ, Brinkac LM, Daugherty SC, Durkin AS, Kolonay JF, Nelson WC, Mohamoud Y, Lee P, Berry K, Young MB, Utterback T, Weidman J, Nierman WC, Paulsen IT, Nelson KE, Herve Tettelin, O’Neill SL, Eisen JA (2004) Phylogenomics of the reproductive parasite Wolbachia pipientis wMel: a streamlined genome overrun by mobile genetic elements. PLoS Biol 2:327–341 Yen JH, Barr AR (1971) New hypothesis of the cause of cytoplasmic incompatibility in Culex pipiens L. Nature 232:657–658 Zhou W, Rousset F, O’Neill SL (1998) Phylogeny and PCR-based classification of Wolbachia strains using wsp gene sequences. Proc Biol Sci 265(1395):509–515

Chapter 14

Evolution of Immunosuppressive Organelles from DNA Viruses in Insects

Brian A. Federici and Yves Bigot

Abstract Endoparasitic wasps inject particles into their lepidopteran hosts that enable these parasitoids to evade or directly suppress the hosts’ innate immune response, especially encapsulation by hemocytes. For decades, these particles have been considered virions produced by DNA viruses known as polydnaviruses (family Polydnaviridae). Structurally, there are two main types of particles, resembling virions of baculoviruses and ascoviruses, respectively. These particles contain double-stranded DNA in the form of multiple small circular molecules that are transcribed but not replicated in cells of the lepidopteran hosts. Instead, particle DNA is replicated from the wasp genome and selectively amplified for packaging into the particles in the reproductive tract of female wasps. Once assembled and secreted into the calyx lumen, the particles become mixed with eggs and are injected into caterpillars during wasp oviposition. Particle DNA, referred to as the “viral genome,” has now been sequenced for several polydnaviruses. Annotation shows that most of this DNA consists of noncoding DNA or wasp genes, not viral genes. More significantly, recent studies have shown that particle structural proteins are coded by the wasp genome, not by particle DNA, but are of viral origin. Together, these findings provide strong evidence that these particles originated from viruses but, through symbiogenesis followed by gene deletion and acquisition, evolved into transducing organelles that shuttle wasp immunosuppressive genes into their hosts, thereby enhancing wasp progeny survival and species radiation.

B.A. Federici
Department of Entomology, University of California, Riverside, 900 University Avenue, Riverside, California 92521, USA
Laboratoire d’Etude des Parasites Génétiques, Parc Grandmont, Université de Tours, U.F.R. des Sciences et Techniques, 37200 Tours, France

Y. Bigot
Laboratoire d’Etude des Parasites Génétiques, Parc Grandmont, Université de Tours, U.F.R. des Sciences et Techniques, 37200 Tours, France

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_14, © Springer-Verlag Berlin Heidelberg 2010

14.1 Introduction

14.1.1 Background

George Salt at the University of Cambridge published a series of pioneering studies during the 1960s aimed at understanding how endoparasitic wasps circumvented the innate immune response of their caterpillar hosts. Based on studies of the ichneumonid parasitoid Venturia (then Nemeritis) canescens and its lepidopteran host, larvae of the Mediterranean flour moth, Ephestia kuehniella, he determined that parasitoid eggs gained protection as they passed through the calyx (egg storage region) of the female wasp’s reproductive tract (Salt 1965, 1966, 1968). This protection was due to a coating added to the eggs in the calyx. Subsequently, Susan Rotheram, one of Salt’s graduate students, determined that this coating contained masses of enveloped virus-like particles about 130 nm in diameter. After assembly in calyx cell nuclei, these were secreted into the calyx lumen, where they adhered to a fibrillar matrix on the egg surface (Rotheram 1967). In later studies, Rotheram (1973a, b) showed that the particles contained protein and complex sugars, but no DNA. Then another of Salt’s graduate students found that a major particle glycoprotein was responsible for the immunoprotection (Bedwin 1979a, b). Following on these studies, Otto Schmidt and his collaborators in Germany showed that this protein was encoded in the wasp genome, but likely originated from basal lamina proteins found in the caterpillar host (Schmidt and Schuchmann-Feddersen 1989; Schmidt and Theopold 1991; Schmidt et al. 2001). After Salt and Rotheram’s studies, Vinson and colleagues as well as others found that particles in the calyx fluid of the endoparasitic ichneumonids Campoletis sonorensis and Cardiochiles nigriceps also suppressed the immune response of their caterpillar hosts (Vinson 1972; Vinson and Scott 1975; Vinson 1990). These particles were also produced in the nuclei of calyx cells, but though morphologically similar to V. canescens particles, they contained DNA. 
These findings stimulated numerous investigations of the calyx gland and secretions of many endoparasitic wasps of the families Ichneumonidae and Braconidae, revealing two major particle types, one in ichneumonids and another in braconids (see Stoltz and Vinson 1979; Vinson 1990; Webb et al. 2005). When first discovered, the ichneumonid particles were not typical of virions of any known type of insect virus (Fig. 14.1). They were bound by two unit membranes, were oblong to globular in shape, and ranged from 130 to 150 nm in diameter by 300–400 nm in length, with a fusiform nucleocapsid (Webb et al. 2005). Later, viruses of a new family, the ascoviruses (family Ascoviridae), were discovered that attacked caterpillars, replicating and producing progeny virions in various host tissues. The virions produced by ascoviruses are structurally similar to the ichneumonid particles and are transmitted by parasitic wasps (Federici 1983; Federici et al. 2005). In contrast to the ichneumonid particles, those produced by braconid wasps resembled nudivirus virions and similar virions of the occluded form of baculoviruses (Burand 1998; Wang and Jehle 2009). They consisted primarily of one or more cylindrical


Fig. 14.1 Transmission electron micrographs of immunosuppressive particles produced by endoparasitic braconid and ichneumonid wasps. (a) Bracovirus particles. (b) Ichneumonid particles. The bracovirus particles resemble nudivirus and baculovirus virions, and molecular evidence now indicates that these particles have their origin in an ancestral nudivirus. The ichneumonid particles resemble ascovirus virions, but their origin remains uncertain at present. Bars = 200 nm. Original micrographs by D.B. Stoltz

particles surrounded by a single envelope (Fig. 14.1b). The cylindrical inner particle varied in length from 30 to 100 nm, even within the same wasp species. Similar particles have been identified in more than 50 wasp species. In these, unlike the genomes of most viruses of insects, the DNA does not occur as a single circular molecule, but as numerous circular molecules. These vary in size from a few to many kbp and are referred to as segmented, polydispersed, or multipartite DNA (Stoltz 1993; Webb et al. 2005). Most evidence indicates these particles do not have a genome per se; rather, their DNA is part of the wasp genome (Espagne et al. 2004; Webb et al. 2006; Desjardins et al. 2008). Moreover, as far as is known, although genes contained in the particles are expressed in nuclei of the parasitoid’s caterpillar host cells, no particle DNA replication occurs in these cells, nor do the particles produce any progeny. From the standpoint of a viral life cycle, they are a dead end.

14.1.2 Establishment of the Family Polydnaviridae

Based on the unusual physical and biological properties of these particles and their obligate symbiotic relationship with wasps (Edson et al. 1981), a new virus family, Polydnaviridae (“Poly” referring to the polydispersed DNA), was established to accommodate these newly discovered viruses (Stoltz et al. 1984). Establishment of this family formalized the recognition of two genera, the genus Ichnovirus (ichnoviruses) for particles produced by ichneumonid wasps, and the genus Bracovirus (bracoviruses) for particles produced by braconids (Webb et al. 2005). At the time these genera were erected, the particles were considered to be infective viruses capable of replication (at least in calyx cells), much like other types of viruses. Although molecular data were not


sufficient at that time to undertake meaningful comparisons of these viruses, available information as well as the significant structural differences between the particles of these two virus types suggested that the association of each with its corresponding wasp family arose independently. Thus, their similar functional roles in parasite biology and success were and are considered a result of convergent evolution.

14.1.3 Particle Function: General Mechanisms of Viral Immunosuppression

Detailed studies of several polydnavirus/parasitoid systems have shown that the virus-like particles produced by these wasps in major braconid and ichneumonid lineages (Whitfield 2002a, b) are required for suppression of the host immune system in all species studied to date (Stoltz 1993; Vinson 1990; Webb et al. 2005, 2006). Suppression, depending on the specific system, occurs by molecular mimicry, where the surfaces of the egg and early instars are coated with particles not recognized as foreign; by hemocyte inactivation through expression of particle genes after oviposition; or by both mechanisms. Many of the genes encoded by these wasp particles also inhibit components of innate immune pathways, including the Toll and Imd pathways. Detailed knowledge of how the particle genes elude or incapacitate innate immune responses varies considerably from one wasp species to another, and thus our understanding of these processes is still in the early stages of development. Our purpose in this chapter, therefore, is not to discuss specific particle functions, but rather to summarize the key data that support the concept that these particles, though derived from virions, are a novel type of organelle that arose by lateral gene transfer/symbiogenesis. Those interested in detailed discussions of particle functions as well as their similarities and differences are referred to the excellent articles by Webb et al. (2006) and Tanaka et al. (2007).

14.2 Polydnavirus Particles as Organelles Rather Than Virions – the Concept

The structural similarity of braconid particles to baculovirus virions, and ichneumonid particles to ascovirus virions, made these viruses obvious choices as the evolutionary sources of these two types of immunosuppressive particles (Federici 1991; Federici and Bigot 2003). At the time braconid particles were discovered, the baculoviruses consisted of two main types, referred to as “occluded,” meaning that the virions were occluded in a protein matrix, and “nonoccluded,” meaning that they were not. Subsequently, the nonoccluded baculoviruses were reclassified into


a new type known as the nudiviruses. The nudiviruses are a small and very diverse group of nonoccluded viruses from insects and crustaceans that share 33 core genes with baculoviruses (out of more than 100), but differ in host range and pathology (Wang and Jehle 2009). Of significant evolutionary importance is that one of these nudiviruses, HzNV-2, replicates in the reproductive tract of the lepidopteran Heliothis zea, a host used commonly by many braconid and ichneumonid wasps. Of particular significance is the recent finding that an ancestral nudivirus is the likely source of the structural proteins encoded by braconid wasps that compose their immunosuppressive particles (Bézier et al. 2009). While current evidence for the origin of the ichneumonid immunosuppressive particles is not nearly as strong as that for the braconids, recent molecular analyses suggest these originated from ascovirus virions or a related ancestor virus (Bigot et al. 2008). Data supporting these origins are discussed in more detail below. Although the braconid and ichneumonid particles clearly resemble nudivirus and ascovirus virions, even early studies indicated they lacked important properties characteristic of all viruses. For example, once within a lepidopteran host cell, there was no replication of DNA. Moreover, in no case was there any production of progeny virions to disseminate the virus and infect the next host or cell. Other evidence indicating that the particles were not virions of a virus was that the so-called infection of host cells and particle production in the wasp tissues were strictly under control of the wasp. In all viruses, although they interact in various ways with host cells, it is the virus, not the host cell, that controls the synthesis of virus proteins and replication of DNA. 
Yet in the case of the braconid and ichneumonid particles, they were only produced in female wasps, and only in a narrow region of the reproductive tract, and only in pupal and adult tissues as eggs were being produced (Webb et al. 2006). Adding to these problems in classifying the particles as those of a virus was the occurrence of similar immunosuppressive particles that contained no DNA, such as those produced by the ichneumonid V. canescens, discussed above (Rotheram 1967), and more recently found in other parasitic wasps (Barratt et al. 1999). Given that even before the DNA in particles was sequenced there was substantial evidence that they were not virions, the question became: what are they? The most obvious correlates were something like mitochondria and plastids, organelles that originated from bacteria through the fusion of genomes, i.e., symbiogenesis followed by gene loss and acquisition (Margulis and Fester 1991; Margulis 1992; Khakhina 1992). The evidence is now indisputable that mitochondria and chloroplasts, for example, originated from bacteria that became endosymbionts and subsequently evolved into organelles. By analogy, the same evolutionary processes occurred, although much more recently, with endoparasitic braconid and ichneumonid wasps and at least two different types of viruses, an ancestral nudivirus in the case of the braconids, and for the ichneumonids, probably an ancestral ascovirus or iridovirus (the latter being the ancestor of the ascoviruses). Whereas the molecular evidence is still weak for the origin of the ichneumonid particles from ascoviruses, the evidence that bracoviruses originated from an ancestral nudivirus is now very strong (Bézier et al. 2009).


At present, polydnavirus researchers continue to refer to the braconid and ichneumonid particles as, respectively, bracovirus or ichnovirus virions, despite overwhelming evidence from their own studies to the contrary (Webb et al. 2006; Tanaka et al. 2007; Bézier et al. 2009). Alternatively, based on the molecular data regarding their evolution, current genetic complements, and functions, we argue that these interesting immunosuppressive particles should be recognized for what they are: organelles that evolved from viruses. Continuing to view these organelles as “symbiotic viruses” masks a much more interesting biological and evolutionary phenomenon. It also contravenes the definition of such fundamental concepts as a virus, a genome, and symbiosis. If these particles are viruses, we have a tripartite system: a virus, a wasp, and its lepidopteran host (Webb et al. 2006). Viewing the particles as organelles makes it a bipartite system: a wasp with a novel organelle encoded in its genome, and a lepidopteran host (Federici and Bigot 2003). We think that this new paradigm better explains their biological properties and diversity and leads to better hypotheses for testing how they evolved and facilitated the evolution of wasps and their insect hosts. Below we elaborate on some of the key evidence for the likely evolutionary pathways that led to these novel organelles. We move from the braconid system, for which the most molecular data are available, to the ichneumonid system. We finish with a description of several other types of endoparasitic wasp/insect host systems which putatively represent various phases of the symbiotic evolutionary process. These range from (1) tripartite systems consisting of a wasp, a true virus, and an insect host, to (2) bipartite systems consisting of a wasp with an organelle that has a DNA complement, and an insect host, to (3) bipartite systems consisting of a wasp with an organelle lacking a DNA complement, and an insect host.

14.3 The Evolution of Braconid Particles from Nudiviruses

14.3.1 Early Studies of Nudiviruses in Braconid Wasps and Their Hosts

Several viruses that have the structural features of nudiviruses have been known for many years. For example, the nudivirus of the braconid Microplitis croceipes is transmitted vertically, replicates in hemocytes and other tissues, and causes significant pathology and mortality in adult wasps (Hamm et al. 1988). A more interesting nudivirus is the so-called filamentous virus (FV) of the braconid Cotesia marginiventris. CmFV is apparently a benign virus that is transmitted vertically by C. marginiventris and replicates in cells of both the wasp’s lateral and common oviduct, the latter near the calyx, and in cells of its lepidopteran hosts, including Helicoverpa zea and Spodoptera frugiperda (Hamm et al. 1990). Structurally, the virions of these wasp-transmitted viruses resemble the nudiviruses Hz-I and the Gonad-Specific Virus, which occur, respectively, in cell lines derived from H. zea


and in the gonadal tissues of this species (Burand 1998). The Microplitis and CmFV nudiviruses are apparently maintained in host populations by vertical transmission. An even more interesting nudivirus is Hz-NV1, a large virus with a genome of 228 kbp (Wang and Jehle 2009). This virus has been shown to integrate into the chromosomes of Trichoplusia ni (TN 368) and S. frugiperda (SF21AE and SF9) cells, in which it can establish a latent infection (Lin et al. 1999). This is particularly relevant to symbiogenesis because it demonstrates that a large circular ds DNA genome can integrate into the chromosomes of its insect host. This provides a possible mechanism for the evolutionary entry of full or partial nudivirus genomes into wasp genomic DNA. The above examples are very limited, but they do at least illustrate the types of viral/host systems that could lead over evolutionary time to the integration of nudivirus or baculovirus genomes into those of their wasp hosts. Fortunately, owing to the studies by Espagne et al. (2004), and more recently Bézier et al. (2009), we now have very strong evidence that such an integration actually occurred, and given the estimates of Whitfield (2002a), a little less than 100 mya.

14.3.2 Molecular Evidence for the Evolution of Braconid Particles from a Nudivirus

One of the predictions of a viral paradigm is that the DNA in the virions would encode virion structural proteins and enzymes needed for the various replication and assembly processes. An organelle paradigm, on the other hand, would predict a significant reduction in genome size and that many, if not most, of the original genes would be transferred to the nuclear genome or lost during evolution. Thus, before any braconid or ichneumonid particle genomes, the so-called “viral genomes,” were sequenced, we predicted that most of the DNA in the particle would consist of wasp genes, that is, DNA originating from wasp chromosomes (Federici 1991; Federici and Bigot 2003). The first significant confirmation of the organelle paradigm came from sequencing the DNA in the particles produced by the braconid wasp Cotesia congregata (Espagne et al. 2004). In this important study, it was shown that fewer than 2% of the genes were related to those of any known virus. Most of the genes encoded proteins with physiological functions, such as protein tyrosine phosphatases, ankyrins, cysteine-rich proteins, and cystatins. Some of the genes were related to the genes found in the particles produced by other braconid species, but nevertheless, none of these was related to any known virion structural protein. Similar findings have now been reported for the “genomes” of particles produced by other braconids, including those of Glyptapanteles indiensis and G. flavicoxis (Desjardins et al. 2008). The DNA in all the particles sequenced to date consists mostly of noncoding DNA of wasp origin, and DNA that codes for wasp proteins. Some of these genes may well have originated


from viruses or bacteria, but they likely have been part of wasp genomes for millions of years, and therefore are now in essence wasp genes. Even though the structural characteristics of the particles made it probable they originated from a baculovirus or nudivirus, these results made it clear that the “genomic” DNA, unlike in the case of any other known virus, could not be used to find the viral origin from which the particles evolved. Nor would these “genomes” be very useful for polydnavirus systematics, because if the particle DNA is wasp DNA, the sequences would likely reflect the relationships of the wasps. In fact, evidence for this was already apparent years ago for braconid particle DNAs for several Cotesia species (Whitfield 2002b). As it had been known for many years that the braconid particles were produced in calyx cells, a way to get at more meaningful data regarding the origin of the braconid particles was to clone and sequence the transcripts from reproductive tissues at the time of particle production. Thus, in another important and insightful paper, Bézier et al. (2009) sequenced 5,000 expressed sequence tags from the ovaries of two braconid wasps, Chelonus inanitus and C. congregata, and one ichneumonid, Hyposoter didymator. The sequences from the ichneumonid wasp did not show any relationship to known viral proteins, but analysis of the braconid sequences proved very profitable. They identified 22 sequences related to nudiviruses, and 13 of these were core genes shared with baculoviruses. The genes identified correlated with nudivirus and baculovirus virion structural proteins, proteins involved in virion assembly, and subunits of viral RNA polymerases. 
No polymerases involved in DNA replication were detected, indicating wasp polymerases were likely responsible for synthesis of braconid particle “genomes.” Aside from providing excellent data regarding the origin of crucial particle components and proteins needed for particle assembly, these data show clearly that these proteins are all encoded in the wasp genome and are under strict regulation by the wasp genome, again a property not characteristic of any known virus.

14.4 Origin and Evolution of Ichneumonid Particles

As noted above for braconid particles, the DNA in ichneumonid particles consists primarily of noncoding ichneumonid wasp DNA and genes coding for ichneumonid proteins involved in immunosuppression. Therefore, although this DNA is of some value for suggesting the possible viral origins of these particles, as discussed below, we do not currently have the type of information from these wasps that corresponds to the data described above for the braconid particles. The structure of the ichneumonid particles suggests they originated from ascoviruses, and fortunately we do have reasonably good molecular data for the evolution of ascoviruses from iridoviruses (Stasiak et al. 2003). So we first review here pertinent key features of iridoviruses and ascoviruses, and then review the limited molecular evidence suggesting the ichnoviruses evolved from an ascovirus or from an iridovirus ancestor of the ascoviruses.

14.4.1 Family Iridoviridae

The family Iridoviridae comprises a diverse group of enveloped, double-stranded (ds) DNA viruses which produce large icosahedral virions that typically range from 125 to 160 nm in diameter (Fig. 14.2). These viruses are commonly found in invertebrates, particularly insects, but also occur among vertebrates (Chinchar et al. 2005). Iridoviruses have a broad tissue tropism in insects, and infect and replicate in most tissues, with the unusual exception of the midgut epithelium, a tissue that most insect viruses attack readily. Consistent with this tissue tropism, iridoviruses are poorly infectious per os (Federici 1993). Once within a cell, iridovirus DNA replication, formation of the virogenic stroma, and virion assembly all take place in the cytoplasm. Iridoviruses have been reported from diverse lepidopteran hosts, including the rice stem borer, Chilo suppressalis (Pyralidae), the American armyworm, Heliothis armigera (Noctuidae), and the fall armyworm, S. frugiperda (Noctuidae). Relevant to the possibility that an ancestral iridovirus or ascovirus is the source of the ichneumonid particles, the ichneumonid Eiphosoma vitticolle, which parasitizes larvae of the fall armyworm, S. frugiperda, is also infected by an iridovirus, and transmits this virus to fall armyworm populations in the field (Lopez et al. 2002).

Fig. 14.2 Electron micrographs of iridovirus and ascovirus virions. Iridovirus virions observed in negatively stained preparations (a) and by transmission electron microscopy (b), respectively. Ascovirus virions as observed in negatively stained preparations (c) and by transmission electron microscopy (d), respectively. Despite the marked difference in virion structure, molecular evidence indicates these two types of viruses are closely related, and that the ascoviruses evolved from iridoviruses. Bar = 100 nm

14.4.2 Family Ascoviridae

The ascoviruses (family Ascoviridae) are ds DNA viruses that attack lepidopterans and are characterized by large, enveloped virions, 130 × 400 nm, which vary, depending on the species, from allantoid to bacilliform in shape (Federici et al. 2005). Structural studies of ascovirus virions suggest that these contain two unit membranes, one that is part of the inner particle that surrounds the DNA core, and a second that makes up part of the outer virion envelope (Fig. 14.2). There are significant differences between ascovirus and ichneumonid particles, but nevertheless they correspond in size and general morphology (Figs. 14.1 and 14.2). Each ascovirus virion contains a single ds DNA genome, which, depending on the species, ranges from 138 to 180 kb. Four species of ascoviruses are recognized: Spodoptera frugiperda ascovirus (SfAV-1a), Trichoplusia ni ascovirus (TnAV-2a), Heliothis virescens ascovirus (HvAV-3a), and Diadromus pulchellus ascovirus (DpAV-4a). The first three occur in noctuid species such as the cabbage looper, T. ni, cotton budworms and bollworms of Heliothis and Helicoverpa species, and armyworms, Spodoptera species, in the United States. These viruses are pathogens that kill the wasp’s host and, as a result, wasp larvae as well. The fourth, noted earlier, occurs in France, where it attacks the pupa of the leek moth, Acrolepiopsis assectella (family Yponomeutidae). This ascovirus is a true symbiotic virus that enhances the parasitic success of its wasp vector. All ascoviruses replicate genomic DNA, producing large numbers of progeny virions in their caterpillar or pupal hosts. Ascoviruses differ from all other viruses in that after they invade a cell, they destroy the nucleus and direct the cell to cleave into numerous vesicles in which virion assembly proceeds. These vesicles are liberated from tissues into the hemolymph, where female wasps acquire them mechanically during oviposition and transmit them to new caterpillar hosts. 
Aside from structural similarities, ascovirus virions and ichneumonid particles depend on parasitic wasps for transmission. Much like insect iridoviruses, ascoviruses are very difficult to transmit per os, but are highly infectious when transmitted by parasitoids or by injection (Hamm et al. 1985). Even more importantly with respect to the organelle paradigm and symbiogenesis, the genome of the D. pulchellus ascovirus (DpAV-4a) is carried in a nonintegrated form in the nuclei of males and females of its ichneumonid wasp vector, D. pulchellus (Bigot et al. 1997a, b). If one were looking for evolutionary intermediates between ascoviruses and ichnoviruses, this would be a type that would be expected.

14.4.3 Molecular Evidence for the Evolution of Ascoviruses from Iridoviruses

As noted above, the molecular evidence that ichnovirus particles evolved from ascoviruses is very limited. We therefore first discuss the data that exist for the evolution of the ascoviruses from iridoviruses. These data provide an important

14 Evolution of Immunosuppressive Organelles from DNA Viruses in Insects


foundation for the hypothesis that ichneumonid particles evolved from ascoviruses, because ascoviruses differ so much from iridoviruses in their cytopathology and in the morphology of their virions. Thus, if ascoviruses, which, recall, are transmitted by parasitoids, evolved from iridoviruses, the possibility that ichnoviruses evolved from ascoviruses, where at least the changes in virion structure are less substantial, becomes more plausible. The molecular evidence that ascoviruses evolved from iridoviruses is based on analyses of four proteins that occur among a diversity of vertebrate and invertebrate ds DNA viruses. These proteins are the major capsid protein (MCP), DNA polymerase, thymidine kinase (TK), and ATPase III. Our analyses, performed using Parsimony and Neighbor-Joining programs, indicate that all of these evolved from the same virus ancestor (Stasiak et al. 2000, 2003). Although there are variations in the topologies of the trees that emerged from our analyses of these proteins, two significant patterns are apparent. First, ascoviruses and iridoviruses are more closely related to each other than to the algal or vertebrate viruses in this viral lineage. Second and more significantly, the TK and ATPase trees show the lepidopteran Chilo iridovirus (CIV) clustering more closely with ascoviruses than with any of the vertebrate iridoviruses (Stasiak et al. 2000, 2003). That the CIV and ascovirus MCPs do not cluster on the same branch is not surprising given the marked differences in virion shape (Fig. 14.2). Another important feature that emerged from these analyses is that the ascoviruses that are mechanically vectored by wasps, i.e., SfAV-1a, TnAV-2a, and HvAV-3a, cluster together on one branch of the ascovirus tree, whereas DpAV-4a, which is vertically transmitted by its wasp host, is found on a separate branch. This difference correlates with the important difference in biology, specifically, the more intimate association that DpAV-4a has with its wasp vector.
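To make the distance-based part of such an analysis concrete, the following is a minimal sketch of how a Neighbor-Joining program chooses the first pair of sequences to join. It is not the software or the data used in the analyses cited above: the labels A to E and the distance matrix are a hypothetical toy example, and the function name nj_first_join is an illustrative assumption.

```python
from itertools import combinations

# Hypothetical toy data: five taxa and a symmetric matrix of pairwise distances.
TOY_LABELS = ["A", "B", "C", "D", "E"]
TOY_DIST = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]


def nj_first_join(labels, dist):
    """Return the pair Neighbor-Joining would merge first, with its Q value.

    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k);
    the pair with the smallest Q is joined.
    """
    n = len(labels)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i, j in combinations(range(n), 2):
        q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        if best_q is None or q < best_q:
            best_pair, best_q = (labels[i], labels[j]), q
    return best_pair, best_q


if __name__ == "__main__":
    pair, q = nj_first_join(TOY_LABELS, TOY_DIST)
    print(pair, q)
```

Note that the pair with the smallest raw distance (D and E in the toy matrix) is not necessarily the pair joined first, because the Q criterion also accounts for how far each taxon is from all the others. A full program repeats this step, replacing the joined pair with a new internal node, until the tree is resolved.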
In summary, while the data indicating ascoviruses evolved from iridoviruses must be considered preliminary, as the genes analyzed represent a small portion of those encoded by these viruses, the results are nevertheless important because they reflect patterns consistent with the biology of virus transmission by parasitic wasps. More recent molecular studies, specifically the sequencing of the DpAV-4a genome, suggest that the ichneumonid particles may in fact have originated from an ancestral iridovirus. We noted above that the ichneumonid E. vitticolle, a parasite of noctuid caterpillars, is capable both of transmitting and of being infected by an iridovirus (Lopez et al. 2002). Annotation of the DpAV-4a genome showed that it shares more core genes with lepidopteran iridoviruses than with the more common, highly pathogenic ascoviruses, e.g., SfAV-1, TnAV-2, and HvAV-3 (Bigot et al. 2009). These findings again illustrate the need for more genomic sequence data on iridoviruses and ascoviruses that infect lepidopteran insects.

14.4.4 Molecular Data Supporting an Iridovirus/Ascovirus Origin for Ichneumonid Particles

Though the molecular evidence at this stage is minimal, and despite the findings regarding the DpAV-4 genome noted above, BLAST results obtained with several


Fig. 14.3 Map of the 13-kbp region of the DpAV4 genome (EMBL Acc. No. CU469068 and CU467486) that contains the gene cluster with direct homologs in the genome of the Glypta fumiferanae ichnovirus. DpAV-4 ORF with well-characterized direct homologs among other ascovirus and iridovirus genomes are represented by white arrows. Homologous ORF of the GfIV genes are represented by black arrows (from Bigot et al. 2008). Below, the graph is scaled in kbp

ORFs in this genome provide evidence that certain ichnovirus ORFs have their closest relatives in ascovirus genomes. Specifically, we identified a 13 kbp region that contains a cluster of three genes (Fig. 14.3; ORF90, 91, and 93; Bigot et al. 2008) that have close homologs in a GfIV gene family composed of seven members (Lapointe et al. 2007). All contain a domain similar to a conserved domain found in the pox-D5 family of NTPases. To date, this pox-D5 domain has been identified as an NTP-binding domain of about 250 amino acid residues found only in viral proteins encoded by poxvirus, iridovirus, ascovirus, and mimivirus genomes. These genes seem to be specific to GfIV, as they are absent in the three sequenced genomes of other ichnoviruses, namely CsIV, Tranosema rostrale ichnovirus (TrIV), and Hyposoter fugitivus ichnovirus (HfIV). More specifically, in DpAV-4, ORF90 encodes a protein of 925 amino acid residues that is 40% similar from position 140 to 925 to a protein of 972 amino acid residues encoded by ORF1 in segment C20 of the GfIV genome. These two proteins can therefore be considered putative orthologs. The 480 C-terminal residues of this DpAV-4 protein are also 42% similar to the C-terminal domain of the protein homologs encoded by ORF1 of the D1 and D4 GfIV segments, 36% similar to the N-terminal and the C-terminal domains of the proteins encoded by ORFs 184R and 128L of the iridoviruses CIV and LCDV, and 30% similar to those encoded by ORFs 119, 99, and 78 in the ascovirus genomes of HvAV-3e, SfAV-1a, and TnAV-2c, respectively. Overall, this indicates that this DpAV-4 protein is more closely related to that of GfIV than to those found in other ascovirus and iridovirus genomes currently available in databases. ORF91 encodes a protein of 161 amino acid residues that is similar only to the C-terminal domain of three proteins encoded by ORFs 1, 1, and 3, contained, respectively, in GfIV segments D1, D4, and D3.
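As an illustration of what a statement such as "40% similar over a region" summarizes, the sketch below computes percent similarity over two already aligned sequences. How the similarity values reported here were actually calculated is not stated in the text, so everything in the sketch is an assumption: the residue groups treated as similar, the exclusion of gapped columns, and the function name percent_similarity are all hypothetical, and the example sequences are invented.

```python
# Hypothetical grouping of amino acids counted as "similar" to one another.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def percent_similarity(aligned_a, aligned_b, groups=SIMILAR_GROUPS):
    """Percent of ungapped aligned columns that are identical or in one group."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    group_of = {res: idx for idx, grp in enumerate(groups) for res in grp}
    compared = 0
    similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # columns with a gap are skipped (an assumption)
        compared += 1
        same_group = a in group_of and group_of.get(a) == group_of.get(b)
        if a == b or same_group:
            similar += 1
    return 100.0 * similar / compared if compared else 0.0


if __name__ == "__main__":
    # Invented toy alignment; three of four columns match, so this prints 75.0.
    print(percent_similarity("ACDE", "ACDF"))
```

The choice of denominator matters: counting only ungapped columns, as here, gives a higher figure than dividing by the full length of either protein, which is one reason similarity values are best compared only when computed the same way.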
In contrast, ORF93 is closer to iridovirus and ascovirus genes than to GfIV genes. The protein it encodes, of 849 amino acid residues, is 43% similar over its entire length to the CIV ORF184R orthologs in all iridoviral and ascoviral genomes and is only 36% similar over 350 amino acid residues to the C-terminal domain of the GfIV protein homologs encoded by ORFs 1, 2, 1, 1, 1, and 1 in, respectively, the C20, C21, D1, D2, D3, and D4 segments of this virus. Since the three DpAV-4 genes have relatives in all ascovirus and iridovirus genomes sequenced so far, their presence in the DpAV-4 genome cannot result


from a lateral transfer that occurred from an ichnovirus genome related to GfIV to DpAV-4. Thus, as these DpAV-4 genes are the closest relatives of the pox-D5 gene family present in GfIV identified so far, they could be considered a landmark of the symbiogenic ascovirus origin of the ichnovirus lineage to which this polydnavirus belongs. An alternative explanation is that the presence of DpAV-4-like genes in the genome of GfIV resulted from a lateral transfer from viral genomes closely related to those of GfIV and DpAV-4. Indeed, this might have happened when a Glypta wasp was infected by an ancestral virus related to DpAV-4. Nevertheless, the symbiogenic origin of GfIV from ascoviruses is also supported by morphological features of its virions (Lapointe et al. 2007), which, aside from similarities in shape, also show reticulations on their surface in negatively stained preparations, a characteristic of the virions of all ascovirus species examined to date (Federici et al. 2005).

14.4.5 Relationships Between Ascovirus Virion and Ichneumonid Particle Proteins

Because ascovirus virions and ichnovirus particles display structural similarities, we developed an approach to search for homologs of virion structural proteins in ichnoviruses. To date, only two virion proteins from the Campoletis sonorensis ichnovirus (CsIV) have been characterized (Webb et al. 2006). The first is P44, a structural protein that appears to be located as a layer between the outer envelope and the nucleocapsid, and the second is P12, a capsid protein. Presently, there are more than one hundred ascoviral or iridoviral MCP sequences in databases. BLAST searches using these sequences failed to detect any similarities between CsIV virion proteins and ascoviral or iridoviral MCPs, or any other proteins. To evaluate the possibility that homology between ichnovirus and ascovirus virion proteins may simply not be detectable by conventional Blastp searches, we used a different method, WAPAM (weighted automata pattern matching). The models were designed on the basis of a previous study (Stasiak et al. 2003) demonstrating that the MCPs encoded by ascovirus, iridovirus, phycodnavirus, and asfarvirus genomes are related, and that all contain seven conserved domains separated by hinges of very variable size. We investigated these conserved domains further using hydrophobic cluster analysis (HCA). This analysis revealed that most conservation occurred at the level of hydrophobic residues, as expected for structural proteins. The size variability of the hinges between conserved domains and the conservation of hydrophobic residues might explain why BLAST searches using iridoviral and ascoviral MCP sequences have limited ability to detect MCP orthologs in phycodnavirus and asfarvirus genomes. We designed two syntactic models which together were able to specifically align all MCP sequences of the four virus families. Importantly, WAPAM aligned the CsIV ichnovirus P44 structural protein with both models.
Complementary structural and HCA analyses confirmed the presence of the seven conserved domains in this CsIV structural protein (Fig. 14.4a).
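The idea of a syntactic model, conserved domains in a fixed order separated by hinges of variable length, can be sketched with an ordinary regular expression. This is only a simplified stand-in: WAPAM uses weighted automata, whereas Python's re module is unweighted, and the three motifs, the hinge bounds, and the function names below are hypothetical placeholders rather than the models described above.

```python
import re

# Hypothetical short motifs standing in for conserved domains, in order.
TOY_MOTIFS = ["[ILV]G[DE]", "W..[FY]", "P[ST]G"]


def build_domain_pattern(domain_motifs, min_hinge=0, max_hinge=200):
    """Compile ordered motifs joined by hinges of min_hinge..max_hinge residues."""
    hinge = f"[A-Z]{{{min_hinge},{max_hinge}}}?"  # lazy, variable-length spacer
    return re.compile(hinge.join(f"({motif})" for motif in domain_motifs))


def find_domains(sequence, pattern):
    """Return the (start, end) span of each domain, or None if the model fails."""
    match = pattern.search(sequence)
    if match is None:
        return None
    return [match.span(i) for i in range(1, pattern.groups + 1)]


if __name__ == "__main__":
    model = build_domain_pattern(TOY_MOTIFS)
    # Invented sequence: three motifs separated by hinges of 5 and 2 residues.
    print(find_domains("MAIGDQQQQQWAAFKKPSGT", model))
```

Because the hinge length is a range rather than a fixed value, the same model accepts sequences whose conserved domains are spaced very differently, which is the property that a local-alignment search handles poorly. Tightening max_hinge makes the model more specific at the cost of missing divergent relatives.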


Fig. 14.4 Sequence (lanes 1–3) and secondary structure (lanes 4–6) comparisons among (a) MCP and (b) SfAV1a ORF061 orthologs from CsIV (lanes 1 and 4, typed in black), DpAV4 (lanes 2 and 5, typed in blue), and SfAV1a (lanes 3 and 6, typed in purple). Positions conserved between the amino acid sequence of CsIV and those of DpAV4 and SfAV1a are highlighted in gray. Secondary structures in the three SfAV1a ORF061 orthologs were calculated with the Network Protein Sequence Analysis website (http://npsa-pbil.ibcp.fr/), and the statistical relevance of the secondary structures was evaluated with Psipred (http://bioinf.cs.ucl.ac.uk/psipred/). C, E, and H in lanes 4–6 indicate, respectively, that an amino acid is involved in a coil, a β sheet, or an α helix structure. With the default parameters of Psipred, uppercase letters indicate that the predicted secondary structure is statistically significant. Significant secondary structures are highlighted in yellow. In (a), the comparisons were limited to three of the seven conserved domains, 2, 5, and 7. Indeed, classical in silico methods appeared to be inappropriate for predicting statistically significant secondary structures in conserved structural proteins rich in β strands, such as iridovirus and ascovirus major capsid proteins. In contrast, a complete and coherent domain comparison was obtained with HCA profiles (see Bigot et al. 2008)

In addition to the above analysis, ten syntactic models were developed using proteins conserved in the three sequenced ascovirus species (SfAV-1a, TnAV-2c, and HvAV-3a) and twelve iridoviruses. None of these models detected homologs among ichnovirus proteins available in databases, except for one, developed from small proteins encoded by the DpAV-4 ORF041, SfAV1a ORF061, HvAV-3a ORF74, and TnAV-2c ORF118 in the ascovirus genomes, and iridovirus CIV ORF347L and mimivirus MIV ORF096R genomes, respectively. Importantly, these proteins have orthologs in vertebrate iridoviruses, phycodnaviruses, and asfarvirus. In SfAV1a, the peptide encoded by ORF061 is one of the virion components. In ascoviruses, iridoviruses, phycodnaviruses, and the asfarvirus,


they have been annotated as thioredoxins, proteins that play a role in initiating viral infection. Database mining with our model revealed four hits with CsIV sequences (Acc. Nos. M80623, S47226, AF236017, AF362508), each a homolog

1. Chromosomal integration of an ascovirus genome in the genome of the wasp ancestors of the Banchinae and Campopleginae lineages.

2a. Conservation, translocation, and loss of ascovirus genes.

2b. Translocation, duplication, and diversification of host genes in the proviral genome of ascoviral origin.

3a. Resulting proviral ichnovirus genomes (monolocus solution).

3b. Resulting proviral ichnovirus genomes (multilocus solution obtained after fragmentation of the proviral genome by recombination).

Fig. 14.5 Hypothetical mechanism for the integration and evolution of ascovirus genomes in endoparasitic wasps. Schematic representation of the three-step process of symbiogenesis, and DNA rearrangements that putatively occurred in the germ line of the wasp ancestors in the Banchinae and Campopleginae lineages, from the integration of an ascoviral genome to the proviral ichnoviral genome. Sequences that originate from the ascovirus are in blue, those of the wasp host and its chromosomes are in pink. Genes of ascoviral origin are surrounded by a thin black or white line, depending on their final chromosomal location. Two solutions can account for the final chromosomal organization of the proviral ichnovirus genome, monolocus or multilocus, since this question is not fully understood in either wasp lineage. More complex alternatives to this three-step process might also be proposed and would involve, for example, the complete de novo creation of a mono or multi locus proviral genome from the recruitment by recombination or transposition of ascoviral and host genes located elsewhere in the wasp chromosomes. This model for the chromosomal organization of proviral DNA in polydnaviruses is consistent with published data (Desjardins et al. 2007)


of SfAV-1a ORF061. In fact, these sequences correspond to several variants of a single region contained in the B segment of the CsIV genome. To date, these have not been annotated in the final CsIV genome, probably because they overlap a recombination site. HCA confirmed that the hydrophobic cores were conserved (Fig. 14.4b). Confirmation of the apparent relationship of iridoviruses, ascoviruses, and the ichneumonid particles awaits the sequencing of more of the viral genomes and sequencing of the wasp genes that code for at least the structural proteins that make up the ichneumonid particles. Nevertheless, the significant biological relationships of endoparasitic ichneumonid wasps with iridoviruses, ascoviruses, and their caterpillar hosts, and especially the unique relationship of DpAV-4 with its vector, provide all the reagents for the development of symbiotic relationships that lead to symbiogenesis. The evolutionary progression of these relationships, and the benefits certain lineages of symbiotic viruses provided the wasps, likely account for the origin of ichneumonid (and braconid) particles. In Fig. 14.5, we illustrate a possible evolutionary scenario and mechanism that may have yielded these interesting immunosuppressive organelles.

Table 14.1 Examples of viruses vertically transmitted by parasitoids and their possible viral origins

Virus | Evolutionary origin | Parasitoid family | Parasitoid host | Reference

Produce virions in parasitoid’s host
Diadromus pulchellus ascovirus (a) | Iridovirus (c) | Ichneumonidae | Lepidoptera | Bigot et al. 1997a
Diachasmimorpha longicaudata poxvirus (a) | Poxvirus | Braconidae | Diptera | Lawrence 2002
Microctonus aethiopoides virus (a) | Ascovirus (c) | Braconidae | Coleoptera | Barratt et al. 1999
Cotesia melanoscela virus | Ascovirus (c) | Braconidae | Lepidoptera | Stoltz et al. 1988
Cotesia marginiventris nudivirus | Nudivirus | Braconidae | Lepidoptera | Hamm et al. 1990
Microplitis croceipes nudivirus | Nudivirus | Braconidae | Lepidoptera | Hamm et al. 1988
Diadromus pulchellus cypovirus (b) | Reovirus | Ichneumonidae | Lepidoptera | Rabouille et al. 1994
Diachasmimorpha longicaudata rhabdovirus (b) | Rhabdovirus | Braconidae | Diptera | Lawrence and Akin 1990

No virions produced in parasitoid’s host
Campoletis sonorensis ichnovirus | Ascovirus (c) | Ichneumonidae | Lepidoptera | Webb et al. 2000
Cotesia marginiventris bracovirus | Nudivirus (c) | Braconidae | Lepidoptera | Webb et al. 2000
Bathyplectes anurus virus | Poxvirus (c) | Ichneumonidae | Coleoptera | Hess et al. 1980

(a) Involved in immunosuppression
(b) RNA virus
(c) Ancestral viruses from which the respective parasitic particles originated

14.5 Examples of the Diversity of Immunosuppressive Wasp Viruses and Organelles

While the focus here has been on the origin and evolution of braconid and ichneumonid particles, there are several other known endoparasitic wasp/virus associations that range from symbiotic (i.e., involving true viruses) to organelles that likely originated from viruses. These associations, along with several others that have been discussed above, are listed in Table 14.1 to show the diversity of these relationships, most of which have received very little study. Of particular interest are the ascoviruses and poxviruses that replicate in both the parasitoid and its insect host, produce progeny virions, and play a role in immunosuppression. These include the D. pulchellus ascovirus, D. longicaudata entomopoxvirus, the pox-like particles of Bathyplectes anurus, an ichneumonid parasite of a coleopteran, and the asco-like “virus” of M. aethiopoides, a braconid parasite of a coleopteran.

14.6 Summary

During the last 100 million years, the genomes of at least two different types of DNA viruses were integrated into the genomes of, respectively, endoparasitic braconid and ichneumonid wasps. These viral genes thus became part of the wasp genome. Over time, many of the original viral genes were deleted from the DNA packaged into the virions and replaced by wasp genes involved in suppressing the immune response of their caterpillar hosts, thereby transforming the original virions into a novel type of transducing immunosuppressive organelle that enhanced the survival of wasp progeny. The principal original viral genes that were selectively maintained in a functional state in the wasp genomes were those involved in producing critical structural proteins and enzymes essential for organelle assembly and for trafficking wasp immunosuppressive genes into caterpillar host cells and nuclei for transcription. There are marked structural differences between the braconid and ichneumonid organelles and their transducing wasp DNAs, yet their common role in immunosuppression demonstrates a high degree of convergent evolution. This relatively recent example of symbiogenesis, through which two DNA viruses evolved into immunosuppressive organelles, likely accounts for much of the species radiation characteristic of endoparasitic braconids and ichneumonids, two of the largest groups of higher eukaryotic organisms.

Acknowledgments

This research was supported by grants from the CNRS and the N.A.T.O. to Y. Bigot, and U.S. National Science Foundation Grant INT-9726818 to B. A. Federici. The photographs used in Fig. 14.1 are by D.B. Stoltz, of Dalhousie University, Halifax, Canada.


References

Barratt BIP, Evans AA, Stoltz DB, Vinson SB, Easingwood R (1999) Virus-like particles in the ovaries of Microctonus aethiopoides Loan (Hymenoptera: Braconidae), a parasitoid of adult weevils (Coleoptera: Curculionidae). J Invertebr Pathol 73:182–188
Bedwin O (1979a) The particulate basis of the resistance of a parasitoid to the defense reaction of its insect host. Proc Biol Sci 205:267–270
Bedwin O (1979b) An insect glycoprotein; a study of the particles responsible for the resistance of a parasitoid’s egg to the defense reactions of its insect hosts. Proc Biol Sci 205:271–286
Bézier A, Annaheim M, Herbiniere J, Wetterwald C, Gyapay G, Bernard-Samain S, Wincker P, Roditi I, Heller M, Belghazi M, Pfister-Wilhelm R, Periquet G, Dupuy C, Huguet E, Volkoff A-N, Lanzrein B, Drezen J-M (2009) Polydnaviruses of braconid wasps derive from an ancestral nudivirus. Science 323:926–930
Bigot Y, Rabouille A, Sizaret P-Y, Hamelin M-H, Periquet G (1997a) Particle and genomic characterisation of a new member of the Ascoviridae, Diadromus pulchellus ascovirus. J Gen Virol 78:1139–1147
Bigot Y, Rabouille A, Doury G, Sizaret P-Y, Delbost F, Hamelin M-H, Periquet G (1997b) Biological and molecular features of the relationships between Diadromus pulchellus ascovirus, a parasitoid hymenopteran wasp (Diadromus pulchellus) and its lepidopteran host, Acrolepiopsis assectella. J Gen Virol 78:1149–1163
Bigot Y, Samain S, Augé-Gouillou C, Federici BA (2008) Molecular evidence for the evolution of ichnoviruses from ascoviruses by symbiogenesis. BMC Evol Biol. doi:10.1186/1471-2148-8-253
Bigot Y, Renault S, Nicolas J, Moundras C, Demattei MV, Samain S, Bideshi DK, Federici BA (2009) Symbiotic virus at the evolutionary intersection of three types of large DNA viruses: Iridoviruses, Ascoviruses, and Ichnoviruses. PLoS One. doi:10.1371/journal.pone.000639
Burand JP (1998) Nudiviruses. In: Miller LK, Bell LA (eds) The insect viruses. Plenum Press, New York, pp 69–90
Chinchar VG, Essbauer S, He JG, Hyatt A, Miyazaki T, Seligy V, Williams T (2005) Family Iridoviridae. In: Fauquet CM, Mayo MA, Maniloff J, Desselberger U, Ball LA (eds) Virus taxonomy: eighth report of the international committee on virus taxonomy. Elsevier/Academic Press, London, pp 145–162
Deng L, Stoltz DB, Webb BA (2000) A gene encoding a polydnavirus structural polypeptide is not encapsidated. Virology 269:440–450
Desjardins CA, Gundersen-Rindal DE, Hostetler JB, Tallon LJ, Fuester RW, Schatz MC, Pedroni MJ, Fadrosh DW, Haas BJ, Toms BS, Chen D, Nene V (2007) Structure and evolution of a proviral locus of Glyptapanteles indiensis bracovirus. BMC Microbiol. doi:10.1186/1471-2180-7-61
Desjardins CA, Gundersen-Rindal DE, Hostetler JB, Tallon LJ, Fadrosh DW, Fuester RW, Pedroni MJ, Haas BJ, Schatz MC, Jones LM, Crabtree J, Forberger H, Nene V (2008) Comparative genomics of mutualistic viruses of Glyptapanteles parasitic wasps. Genome Biol. doi:10.1186/gb-2008-9-12-r183
Edson KM, Vinson SB, Stoltz DB, Summers MD (1981) Virus in a parasitoid wasp: suppression of the cellular immune response in the parasitoid’s host. Science 211:582–583
Espagne E, Dupuy C, Huguet E, Cattolico L, Provost B, Martins N, Poire M, Periquet G, Drezen JM (2004) Genome sequence of a polydnavirus: insights into symbiotic virus evolution. Science 306:286–289
Federici BA (1983) Enveloped double stranded DNA insect virus with novel structure and cytopathology. Proc Natl Acad Sci USA 80:7664–7668
Federici BA (1991) Viewing polydnaviruses as gene vectors of endoparasitic hymenoptera. Redia 74:387–392
Federici BA (1993) Viral pathology in relation to insect control. In: Beckage NE, Thompson SN, Federici BA (eds) Parasites and pathogens of insects, vol 2. Academic Press, New York, pp 81–101


Federici BA, Bigot Y (2003) Origin and evolution of polydnaviruses by symbiogenesis of insect DNA viruses in endoparasitic wasps. J Insect Physiol 49:419–432
Federici BA, Bigot Y, Granados RR, Hamm JJ, Miller LK, Newton I, Stasiak K, Vlak JM (2005) Family Ascoviridae. In: Fauquet CM, Mayo MA, Maniloff J, Desselberger U, Ball LA (eds) Virus taxonomy: eighth report of the international committee on virus taxonomy. Elsevier/Academic Press, London, pp 269–274
Hamm JJ, Nordlund DA, Marti OG (1985) Effects of a nonoccluded virus of Spodoptera frugiperda (Lepidoptera: Noctuidae) on the development of a parasitoid, Cotesia marginiventris (Hymenoptera: Braconidae). Environ Entomol 14:258–261
Hamm JJ, Styer EL, Lewis WJ (1988) A baculovirus pathogenic to the parasitoid Microplitis croceipes (Hymenoptera: Braconidae). J Invertebr Pathol 52:189–191
Hamm JJ, Styer EL, Lewis WJ (1990) Comparative virogenesis of filamentous virus and polydnavirus in the female reproductive tract of Cotesia marginiventris (Hymenoptera: Braconidae). J Invertebr Pathol 55:357–360
Hess RT, Poinar GO Jr, Etzel L, Merritt CC (1980) Calyx particle morphology of Bathyplectes anurus and B. curculionis (Hymenoptera: Ichneumonidae). Acta Zool (Stockholm) 61:111–114
Khakhina LN (1992) Concepts of symbiogenesis. In: Margulis L, McMenamin M (eds) Historical and critical study of the research of Russian botanists. Yale University Press, New Haven
Lapointe R, Tanaka K, Barney WE, Whitfield JB, Banks JC, Beliveau C, Stoltz D, Webb BA, Cusson M (2007) Genomic and morphological features of a banchine polydnavirus: comparison with bracoviruses and ichnoviruses. J Virol 81:6491–6501
Lawrence P (2002) Purification and partial characterization of an entomopoxvirus (DLEPV) from a parasitic wasp of tephritid fruit flies. J Insect Sci 2:10
Lin C-L, Lee JC, Chen SS, Wood HA, Li M-L, Li C-F, Chao Y-C (1999) Persistent Hz-1 virus infection in insect cells: evidence for insertion of viral DNA into host chromosomes and viral infection in a latent status. J Virol 73:128–139
Lopez M, Rojas JC, Vandame R, Williams T (2002) Parasitoid mediated transmission of an iridescent virus. J Invertebr Pathol 80:160–170
Margulis L (1992) Biodiversity: molecular biological domains, symbiosis and kingdom origins. Biosystems 27:39–51
Margulis L, Fester R (1991) Symbiosis as a source of evolutionary innovation. MIT Press, Cambridge, Massachusetts
Rabouille A, Bigot Y, Drezen JM, Sizaret P-Y, Hamelin M-H, Periquet G (1994) A member of the reoviridae (DpRV) has a ploidy-specific genomic segment in the wasp Diadromus pulchellus (Hymenoptera). Virology 205:228–237
Rotheram S (1967) Immune surface of eggs of a parasitic insect. Nature 214:700
Rotheram S (1973a) The surface of the egg of a parasitic insect. I. The surface of the egg and first instar larvae of Nemeritis. Proc Biol Sci 183:179–194
Rotheram S (1973b) The surface of the egg of a parasitic insect. II. The ultrastructure of the particulate coat on the egg of Nemeritis. Proc Biol Sci 183:195–204
Salt G (1965) Experimental studies in insect parasitism XIII. The haemocytic reaction of a caterpillar to the eggs of its habitual parasite. Proc Biol Sci 162:303–318
Salt G (1966) Experimental studies in insect parasitism XIII. The haemocytic reaction of a caterpillar to the eggs of its habitual parasite. Proc Biol Sci 165:155–178
Salt G (1968) The resistance of insect parasitoids to the defense reactions of their hosts. Biol Rev 43:200–232
Schmidt O, Schuchmann-Feddersen I (1989) Role of virus-like particles in parasitoid-host interaction of insects. Subcell Biochem 15:91–119
Schmidt O, Theopold U (1991) Immune defense and suppression in insects. BioEssays 13:343–346
Schmidt O, Theopold U, Strand M (2001) Innate immunity and its evasion and suppression by hymenopteran endoparasitoids. BioEssays 23:344–351


Stasiak K, Demattei M-V, Federici BA, Bigot Y (2000) Phylogenetic position of the DpAV-4a ascovirus DNA polymerase among viruses with a large double-stranded DNA genome. J Gen Virol 81:3059–3072
Stasiak K, Renault S, Demattei MV, Bigot Y, Federici B (2003) Evidence for the evolution of ascoviruses from iridoviruses. J Gen Virol 84:2999–3009
Stoltz DB (1993) The polydnavirus life cycle. In: Beckage NE, Thompson SN, Federici BA (eds) Parasites and pathogens of insects, vol 1. Academic Press, New York, pp 167–187
Stoltz DB, Faulkner G (1978) Apparent replication of an unusual virus-like particle in both a parasitoid wasp and its host. Can J Microbiol 24:1509–1514
Stoltz DB, Vinson SB (1979) Viruses and parasitism in insects. Adv Virus Res 24:125–171
Stoltz DB, Krell P, Summers MD, Vinson SB (1984) Polydnaviridae – a proposed family of insect viruses with segmented, double-stranded, circular DNA genomes. Intervirology 21:1–4
Stoltz DB, Krell PJ, Cook D, MacKinnon EA, Lucarotti CJ (1988) An unusual virus from the parasitic wasp Cotesia melanoscela. Virology 162:311–320
Tanaka K, Lapointe R, Barney WE, Makkay AM, Stoltz D, Cusson M, Webb BA (2007) Shared and species-specific features among ichnovirus genomes. Virology 263:26–35
Vinson SB (1972) Factors involved in successful attack on Heliothis virescens by the parasitoid Cardiochiles nigriceps. J Invertebr Pathol 20:118–123
Vinson SB (1990) How parasitoids deal with the immune system of their host: an overview. Arch Insect Biochem Physiol 13:2–27
Vinson SB, Scott JR (1975) Particles containing DNA associated with the oocyte of an insect parasitoid. J Invertebr Pathol 25:375–378
Wang Y, Jehle JA (2009) Nudiviruses and other large, double-stranded circular DNA viruses of invertebrates: new insights into an old topic. J Invertebr Pathol 101:187–193
Webb BA, Beckage NE, Hayakawa Y, Lanzrein B, Stoltz DB, Strand MR, Summers MD (2005) Family Polydnaviridae. In: Fauquet CM, Mayo MA, Maniloff J, Desselberger U, Ball LA (eds) Virus taxonomy: eighth report of the international committee on virus taxonomy. Elsevier/Academic Press, London, pp 255–265
Webb BA, Strand MR, Dickey SE, Beck MH, Hilgarth RS, Barney WE, Kadash K, Kromer JA, Lindstrom KG, Rattanadechakul E, Shelby KS, Thoetkiattikul H, Turnbull MS, Witherell RA (2006) Polydnavirus genomes reflect their dual roles as mutualists and pathogens. Virology 347:160–174
Whitfield JB (2002a) Estimating the age of the polydnavirus/braconid wasp symbiosis. Proc Natl Acad Sci USA 99:7508–7513
Whitfield JB (2002b) Phylogeny of microgastroid braconid wasps, and what it tells us about polydnavirus evolution. In: Austin AD, Dowton M (eds) Hymenoptera, evolution, biodiversity, and biological control. CSIRO Publishing, Collingwood, Australia, pp 97–105

Chapter 15

The Neogastropoda: Evolutionary Innovations of Predatory Marine Snails with Remarkable Pharmacological Potential

Maria Vittoria Modica and Mandë Holford

Abstract The Neogastropoda include many familiar molluscs, such as cone snails (Conidae), purple dye snails (Muricidae), mud snails (Nassariidae), olive snails (Olividae), oyster drills (Muricidae), tulip shells (Fasciolariidae), and whelks (Buccinidae). Due to their amazing predatory specializations, neogastropods are often dominant members of the benthic community at the top of the food chain. In a dazzling display that ranges from boring holes to darting harpoons, neogastropods have developed several prey hunting innovations with specialized compounds pharmaceutical companies could only dream about. It has been hypothesized that evolutionary innovations related to feeding were the main drivers of the rapid neogastropod radiation in the late Cretaceous. The anatomical, behavioral, and biochemical specializations of neogastropod families that are promising targets in drug discovery and development are addressed within an evolutionary framework in this chapter.

15.1 Introduction

15.1.1 The Neogastropoda

Neogastropoda is an order of gastropod molluscs that are well characterized morphologically and are traditionally viewed as monophyletic (Ponder 1973; Taylor and Morris 1988; Ponder and Lindberg 1996, 1997; Kantor 1996; Strong 2003).

M.V. Modica
Dipartimento di Biologia Animale e dell’Uomo, “La Sapienza”, University of Rome, Viale dell’Università 32, 00185 Rome, Italy
e-mail: [email protected]

M. Holford
The City University of New York – York College & Graduate Center, and The American Museum of Natural History, 94–20 Guy R. Brewer Blvd, Jamaica, NY 11451, USA
e-mail: [email protected]

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_15, © Springer-Verlag Berlin Heidelberg 2010


This characterization of the Neogastropoda persists even though contrasting interpretations have been proposed (see, e.g., Colgan et al. 2007; Kantor and Fedosov 2009). Strong (2003) provided the most up-to-date account of potential neogastropod synapomorphies. Anatomical characteristics of neogastropods include a very peculiar anterior foregut with a proboscis (pleurembolic or intraembolic), a valve of Leiblein, a gland of Leiblein (or a venom gland in Toxoglossa), paired primary and accessory salivary glands, an anal gland, and several radular peculiarities (Ponder 1973; Kantor 2002; Strong 2003). Figure 15.1 illustrates a generalized scheme of neogastropod anatomy. The order Neogastropoda includes up to 25 families (Bouchet and Rocroi 2005), traditionally split into three superfamilies, Cancellarioidea, Conoidea, and Muricoidea, on the basis of anatomical features of the anterior foregut, including the radula. Cancellarioidea, also called Nematoglossa, comprise the single family Cancellariidae and are perceived to be the basal offshoot of neogastropods (Kantor 1996; Strong 2003; Oliverio and Modica 2009; Modica et al. 2009). They are characterized by a nematoglossan radula with a complex mechanism of interlocking of the distal cusps (viewed as an adaptation to suctorial feeding: Petit and Harasewych 1986) and a mid-oesophageal gland that is generally not separated from the oesophagus (Fig. 15.2a). Conoidea, also referred to as Toxoglossa, include Conidae, Terebridae, and the “turrids”, which are estimated to have more than 10,000 extant species and whose taxonomy is under revision (Puillandre et al. 2008). In Conoidea, the radula is modified to varying degrees, in the extreme forming a harpoon (toxoglossan radula), and the dorsal mid-oesophageal gland is separated from the oesophagus and develops into a venom apparatus, with a muscular bulb and a secretory tubule producing neurotoxins (Fig. 15.2b).
Muricoidea (also termed Rachiglossa) include the vast majority of neogastropod families, and their monophyly is currently debated (Kantor 1996, 2002; Oliverio and Modica 2009). The muricoidean radula is rachiglossate (Fig. 15.2c) and their anatomy is similar to the generalized model proposed in Fig. 15.1, but there are many modifications at different taxonomic levels. Variations include the presence or absence of the radula, accessory salivary glands, valve and gland of Leiblein, and anal gland, as well as a number of other foregut, renal, and reproductive features. According to the fossil record, the adaptive radiation of neogastropods was particularly rapid (Taylor et al. 1980) and may be attributed to the evolution of a predatory lifestyle and diversification into a number of different trophic strategies. Such attributes allowed neogastropods to fully diversify their niches and to efficiently exploit their alimentary resources. In this scenario, the evolutionary role played by chemical innovations in feeding is unquestionable. The Cancellarioidea, Conoidea, and Muricoidea possess a bountiful reservoir of bioactive compounds routinely used to sedate or capture prey. These compounds are potential starting points for future drug discovery.

Outlined in this chapter are the anatomical features, specialty feeding strategies, and potential bioactive compounds found in the families of the Neogastropoda. Specific attention is given to the discovery and characterization of bioactive compounds from the Conoidea. Based on the successful characterization and implementation of cone snail toxins in pharmacological approaches (Favreau and Stöcklin 2009; Twede 2009; Olivera and Teichert 2007; Fox and Serrano 2007), several groups within the Neogastropoda are highlighted as potential biodiversity targets for drug discovery.

Fig. 15.1 Generalized scheme of neogastropod anatomy (male). Mantle longitudinally dissected, body wall not shown. Abbreviations are as follows: a anus; ag anal gland; asg accessory salivary gland; ct ctenidium; dg digestive gland; ft foot; hg hypobranchial gland; lg gland of Leiblein; lv valve of Leiblein; mo mouth; op operculum; os osphradium; pe penis; pg prostate gland; pr proboscis; sd salivary duct; sg salivary gland; st stomach; t testis. Modified after Ponder (1998a)

15.1.2 Discovery and Characterization of Cone Snail Toxins

The gold standard for investigating toxins from marine snails is the discovery and characterization of neurotoxins from cone snails (Conus) (Fig. 15.2b). This extremely diversified group of marine snails comprises active predators that use biochemical substances to subdue their prey. Characterization of cone snail toxins began about half a century ago (Kohn 1956; Kohn et al. 1960; Endean et al. 1974), starting from empirical observations of envenomation episodes, and has blossomed into a successful research field (reviewed in Norton and Olivera 2006). The characterization of conotoxins provides scientists with new, powerful tools to manipulate the function of ion channels and receptors governing the physiology of the nervous system. The pharmacological use of ion channels and receptors as drug development targets for the treatment of neurological and cardiovascular diseases is rapidly gaining momentum. The discovery of Prialt (Ziconotide) (Miljanich 2004), the synthetic form of the Conus magus peptide ω-conotoxin MVIIA, an N-type calcium channel blocker, highlights the potential of toxins from marine snails. Prialt was approved by the Food and Drug Administration of the United States in December 2004 for analgesic use in HIV and cancer patients.

Although Prialt is a significant breakthrough, Conus represents only a very small fraction of the diversity of Neogastropoda. Conus belongs to one of the 20–30 recognized neogastropod families and includes ca. 400–500 species out of the 10,000–15,000 estimated in the Conoidea (Bouchet and Rocroi 2005). The pharmacological potential of neogastropods as a source of bioactive compounds is largely unrealized. Like cone snails, several other neogastropods have evolved specialized compounds as a result of their feeding ecology that may have potential in pharmacological applications.

Fig. 15.2 The Neogastropoda radiation. Three major superfamilies of the Neogastropoda are shown: (a) Cancellarioidea, (b) Conoidea, and (c) Muricoidea. The grey triangles are proportional to the number of species included in each lineage. Shown for each superfamily are the radula, a scheme of the foregut, and some shell representatives. Shells shown, from left to right, by genus: (a) Scalptia. (b) Conus, Terebra, Thatcheria, Gemmula. (c) Murex, Oliva, Vexillum, Melongena, Cymbiola, Fusinus, Volutopsius. (d) Schematic arrangement of the foregut (modified after Kantor 1996). Shell images courtesy of Guido and Philippe Poppe. Radula pictures courtesy of Yuri Kantor (b) and Alisa Kosyan (c).

15.2 Feeding Strategies in the Neogastropoda

From what is known about the diets of neogastropod families, the vast majority of neogastropods are carnivorous, with predatory activity ranging from actively seeking prey, to grazing on sessile invertebrates, to scavenging. Some neogastropod families, such as Buccinidae and Muricidae, include many generalist species, which can feed on a variety of living and dead organisms. Most Muricidae feed on living bivalves, gastropods, polychaetes, bryozoans, sipunculids, barnacles, and other small crustaceans, but a few also feed on carrion. A species of Drupa has also been observed feeding on holothurians (Wu 1965), while Drupella (Ergalataxinae) and all Coralliophilinae feed on corals (Taylor 1976; Ward 1965; Haynes 1990) (Fig. 15.4a). Some neogastropod families appear to be highly specialized, such as the Mitridae, which feed exclusively on sipunculids (Taylor et al. 1980) and possess peculiar anatomical adaptations to this kind of prey (Harasewych 2009). An interesting feeding strategy is also displayed by the Volutidae, which have been reported to feed on bivalves, gastropods, and, in some deep-water species, echinoderms (Darragh and Ponder 1998). Members of the Volutidae use their large foot to engulf the prey in a semiclosed environment, in which anesthetic substances are apparently released (Bigatti et al. 2009). The following paragraphs describe neogastropod feeding strategies that involve bioactive substances that may have pharmacological utility.

15.2.1 Harpooning

Cone snails, terebrids, and turrids make up the superfamily Conoidea (or Toxoglossa, “poisoned tongued”). Toxoglossans are a megadiverse group of hunting snails in which the rapid evolution of venom peptide genes has led to an amazing molecular diversity. They feed on molluscs, polychaetes, acorn worms, and fish (Kohn 1959, 1968; Kohn and Nybakken 1975; Leviten 1980). The key evolutionary innovation enabling conoideans to hunt prey is a conspicuous venom apparatus made up of highly modified radular teeth (harpoon), a venom duct (a glandular duct connected to the oesophagus), and a muscular venom bulb (Fig. 15.2b). The radular tooth, held at the proboscis tip, is inserted into the prey and used much like a hypodermic needle (Olivera 2002). Envenomation involves the contraction of the muscular venom bulb, which forces the secretion of the venom duct through the proboscis until it reaches the tooth. A single cone snail specimen may produce between 50 and 200 different peptides, which are known to target different ion channels (Terlau and Olivera 2004).

15.2.2 Shell Drilling

Shell drilling is the most common feeding technique in muricids, and it is achieved by the concerted action of the radula and a specialized glandular pad (the accessory boring organ) located on the sole of the foot (Carriker 1961) (Fig. 15.3a). The drilling process may last up to 1 week (Palmer 1990; Dietl and Herbert 2005). Drilling is not restricted to muricids and has been observed in other rachiglossans, such as the marginellid genus Austroginella (Ponder and Taylor 1992), the buccinid Cominella (Peterson and Black 1995), and the nassariid Nassarius festivus (Morton and Chan 1997). Other feeding strategies developed by the muricids include opening the prey shell with the foot (Wells 1958), cracking the shell close to the apertural margin followed by proboscis insertion (Radwin and D’Attilio 1976), and using shell projections on the outer lip (labial spines) to force open the valves (Marko and Vermeij 1999).

15.2.3 Shell Wedging and Proboscis Insertion

As noted above, drilling has been reported for a few species of Buccinidae, but the majority of buccinids use the strengthened margin of their shells to wedge open bivalve shells (Nielsen 1975) in order to insert their proboscis (Fig. 15.3b). Buccinidae eat polychaetes and small crustaceans, and some species have been observed feeding on unusual prey (e.g., Neptunea antiqua on priapulids; Taylor 1978). Buccinids can also insert their proboscis into the aperture of gastropod shells. Similar strategies of proboscis insertion with mild radular rasping or use of shell margins have been reported in families related to buccinids, such as the Nassariidae, which feed on polychaetes, barnacles, and carrion; the Fasciolariidae, which feed on bivalves, gastropods, sedentary polychaetes, and carrion; the Melongenidae, which feed on gastropods and bivalves; and the Columbellidae, which feed on ascidians, hydroids, small crustaceans, polychaetes, and algae (Taylor et al. 1980).

15.2.4 Suctorial Feeding

Suctorial feeding, or sucking the innards of prey organisms, is an evolutionarily advanced feeding technique found in several neogastropod families. This form of feeding does not always result in the death of the prey, and several neogastropod species coexist with their prey. Two kinds of suctorial feeding are described here: haematophagy and corallivory.

Fig. 15.3 Examples of neogastropod feeding strategies. (a) An ocenebrine muricid drilling the shell of a venerid bivalve (photo G. Herbert). (b) A Muricanthus sp. (Muricidae) using the shell margin to wedge open a bivalve shell (photo G. Herbert). (c) Colubraria muricata (Colubrariidae) feeding on a clownfish in aquarium; the proboscis is inserted under the pectoral fins (photo M. Oliverio). (d) Coralliophila meyendorffi (Coralliophilinae) feeding on Actinia equina (photo P. Mariottini)

15.2.4.1 Haematophagy

Three different neogastropod families, Cancellariidae, Marginellidae, and Colubrariidae, have independently evolved haematophagous feeding on fish (Fig. 15.3c). The buccinoidean family Colubrariidae includes at least six species involved in a parasitic association with different species of fish, mainly belonging to the family Scaridae (Johnson et al. 1995; Bouchet and Perrine 1996). Colubraria specimens can extend their proboscis to a length exceeding three times the shell length. When the extended Colubraria proboscis is in contact with the skin of the prey, a scraping action with its minute radula allows access to the blood vessels of the fish. The snail then apparently takes advantage of the blood pressure of the fish to ingest its meal (Oliverio and Modica 2009). Experimental observations on different Colubraria species (Modica and Oliverio, unpublished) suggest that adaptation to haematophagy involves the use of anesthetic and anticoagulant compounds. In fact, the fish appears to be anesthetized while the snail is feeding. Anesthetization is reversible, and the fish usually recovers its full mobility within a few minutes after contact with the snail is interrupted. The anesthetic compounds used are not lethal, as the prey recovers, in agreement with field observations that Colubraria usually feeds on fish sleeping in crevices of the reef (M. Oliverio pers. comm.; Bouchet and Perrine 1996; Johnson et al. 1995).

A similar strategy has been reported for the cancellariid Cancellaria cooperi (Cancellarioidea), which has been observed using its proboscis to ingest blood from open injuries on the body of the electric ray Torpedo californica (O’Sullivan et al. 1987). Cancellariidae are likely to include exclusively suctorial feeders, as inferred from foregut and radular characteristics. Dissection of Cancellaria cooperi revealed a peculiar oesophageal structure (M.V. Modica, J. Biggs, and M. Holford, unpublished observations). In fact, the mid-oesophagus is extremely long (up to 5 times the shell length) and glandular, similar to what is found in Colubraria, suggesting a convergent adaptation to haematophagy. Other examples of haematophagous feeding are the very minute species of Marginellidae, Kogomea ovata, Hydroginella caledonica, and Tateshia yadai, which live attached to the pectoral fins of their host (Kosuge 1986; Bouchet 1989).

15.2.4.2 Corallivory

Feeding on the living tissues of corals and other anthozoans is reported in Muricidae for Drupella (Ergalataxinae) and for the subfamily Coralliophilinae (Taylor 1976; Ward 1965; Haynes 1990). Coralliophilinae includes over 200 marine tropical to temperate species, from shallow to deep waters. The few species for which alimentary preferences are known (about 10% of the shallow water species, Oliverio et al. 2008) feed exclusively on anthozoans (Fig. 15.3d). This group displays a variety of feeding strategies and preferences. Some species are stenophagous, with very strict host specificity; they are mostly sessile on corals, and many groups have developed interesting eco-morphological adaptations. For example, while Quoyula has a limpet-like shell suitable for external life on stony corals, Rhizochilus lives and feeds on antipatharians with the shell deformed to adhere to the black coral branch. A second group lives embedded in the host skeleton: Rapa lives inside alcyonarian octocorals, Magilopsis and Leptoconchus have ovoid shells and bore holes into corals, while Magilus is sessile inside corals and possesses an uncoiled adult shell (Robertson 1970). Others are mobile, such as Latiaxis, which is probably associated with deep-water gorgonians, or Babelomurex, which mostly feeds on shallow water hexacorals. In a few cases, mobile euryphagous species can feed on anthozoans belonging to different orders, such as some species of Coralliophila associated with sea anemones, scleractinians, and zoanthids (M. Oliverio, unpublished observations). Among coralliophilines, some anatomical modifications related to parasitism on corals are widespread, such as the loss of the radula and jaws, viewed as an adaptation to suctorial feeding, and brooding of embryos in capsules kept in the pallial cavity (Richter and Luque 2002).

The amazing display of feeding strategies developed by neogastropods is made possible by the diversity of innovative anatomical features and chemical compounds that can be readily employed to overcome their prey.

15.3 Neogastropod Specialized Anatomy and Predatory Chemical Substances

Most neogastropod snails have developed specialized glands or other anatomical features that enable them to produce and use chemical substances to subdue their prey. It can be argued that the development of specialized foregut glands, such as the venom gland in Conoidea, or salivary and accessory salivary glands in other neogastropod groups, has led to the successful radiation of neogastropods. The biochemical weaponry developed in the foregut and other glands is an evolutionary advantage that has enabled neogastropods to thrive.

15.3.1 Foregut Glands

The foregut glands described here include the venom gland and the primary and accessory salivary glands (Figs. 15.1 and 15.2). Toxins may be produced in a specific venom gland, as is the case with most conoideans, or in primary and/or accessory salivary glands (Andrews 1991) in species that do not have a venom gland. In some cases, the production of toxins might involve other foregut organs or tissues, such as the glandular mid-oesophagus of the haematophagous Colubraria and Cancellaria.

15.3.1.1 Venom Gland

The presence of a venom apparatus is characteristic of the Conoidea (Fig. 15.2b). Generally it is a conspicuous organ, consisting of a proximal muscular bulb and a very long, convoluted duct (the gland itself). The tubular gland always passes through the nerve ring and opens into the buccal cavity, posterior to the radular sac opening. The active exocrine secretion of the venom is due to a single cell type: cuboidal ciliated cells, which accumulate venom granules at their apex until they are discharged into the lumen (Smith 1967). The venom gland may be lined with such secretory cells for its whole length or, as happens in some species, the secretory tissue may be confined to the region posterior to the nerve ring, while the anteriormost region is a simple ciliated duct (Taylor et al. 1993). The terminal muscular bulb usually consists of two muscular layers, internal and external, separated by connective tissue; the relative thickness and development of these layers vary between species. According to Ponder (1973), the tubular venom gland originated from the dorsal glandular folds of the oesophagus, while the gland of Leiblein gave rise to the muscular bulb. Some conoideans, mostly radula-less species, do not possess a venom apparatus.

All cone snails (Conus) have a venom apparatus, and the toxins found in their venom glands have led the field in characterizing peptide toxins from marine snails. When venom is injected into a prey, the conotoxins work in a concerted manner to shut down the prey’s nervous system. Conotoxins are potent neurotoxins that target ion channels and receptors. The complement of peptides found in any one Conus venom is strikingly different from that found in the venom of any other Conus specimens (Romeo et al. 2008). Thus, in the whole genus, many tens of thousands of distinct active peptides have evolved. A question that immediately arises is why individual cone snails should need so many different peptides. It has been speculated that the complement of peptides in a venom may be used for at least three general purposes. An individual peptide may play a role in (1) prey capture, directly or indirectly; (2) defense and escape from predators; or (3) other biological processes, such as interaction with potential competitors.

Not all terebrids and turrids have a venom apparatus, but those that do also produce toxins to subdue their prey. Compared with conotoxins, much less is known about terebrid and turrid toxins (teretoxins and turritoxins, respectively). Preliminary characterization of terebrid and turrid toxins (Imperial et al. 2003, 2007; Watkins et al. 2006; Heralde et al. 2008) indicates a three-domain structure similar to that of conotoxins, consisting of a highly conserved signal sequence, a more variable pro-region, and a hypervariable mature toxin sequence. While conotoxins have been identified as potent neuropeptides, no molecular target has yet been identified for teretoxins or turritoxins. However, given their similarities to conotoxins, they are expected to be effective modifiers of ion channels and receptors in the nervous system as well.

15.3.1.2 Primary Salivary Glands

Primary salivary glands are usually acinous, with a very small lumen and a system of narrow branched ducts (Fig. 15.1). In some species, the paired glands may be fused together in a single glandular mass, but two salivary ducts are always present and run along the oesophagus (or, in some groups, embedded in the oesophageal walls) until opening into the roof of the buccal cavity. Two cell types have been identified in the secretory epithelium, mixed with one another: (1) basal cells with apocrine secretion and (2) superficial ciliated cells secreting mucus (Andrews 1991). Ciliary movement is responsible for delivering the secretion, as the outer layer of muscle fibers is poorly developed (Andrews 1991). Acinous salivary glands are present in all neogastropods, although their role in toxin production may vary, depending on whether other secreting structures, such as a venom gland or accessory salivary glands, are present.

Only acinous salivary glands are present in Buccinidae and related families, such as Nassariidae, Melongenidae, Fasciolariidae, and Columbellidae (accessory salivary glands are missing). Species of the buccinid genus Neptunea (e.g., N. antiqua) have very large salivary glands containing large quantities of tetramine (Fänge 1960; Asano and Itoh 1959, 1960; Saitoh et al. 1983; Fujii et al. 1992; Shiomi et al. 1994; Watson-Wright et al. 1992; Power et al. 2002), which blocks nicotinic acetylcholine receptors (Emmelin and Fänge 1958). A number of human intoxications caused by consumption of snails of these species have been reported (Fleming 1971; Millar and Dey 1987; Reid et al. 1988). Further studies have shown the presence of three additional unidentified toxins in the salivary glands of N. antiqua that appear to inhibit neuronal Ca²⁺ channels (Power et al. 2002). Other whelks are known to produce histamine, choline, and choline esters (Endean 1972). Nassariidae possess three types of secreting cells in their salivary glands, one of which secretes a glycoprotein rich in disulphide groups, like the accessory salivary glands of the muricid Nucella lapillus (Fretter and Graham 1994; Minniti 1986; Martoja 1964).

The finding that conopeptides are expressed in the salivary gland of Conus pulicarius (Biggs et al. 2008) suggests that salivary glands may play a role in the envenomation process. Crude extracts of salivary glands of the haematophagous Colubraria reticulata have been observed to increase the coagulation time of human blood (S. Rufini, M.V. Modica, and M. Oliverio, unpublished). Research by Modica and colleagues is underway to identify the anticoagulant transcript using cDNA analysis.

15.3.1.3 Accessory Salivary Glands

Accessory salivary glands are considered to be an informative synapomorphy of Neogastropoda, although they are missing in several families. Accessory salivary glands are present in the basal family Cancellariidae (Fig. 15.2a) and in several Toxoglossa, where in some vermivorous cones they coexist with the venom gland (Marsh 1971). Two pairs of accessory salivary glands are also found in Muricidae, Mitridae, Costellariidae, Volutidae, and Olividae, while in Volutomitridae only one gland is found. In Marginellidae, Harpidae, and in the buccinoideans, accessory salivary glands are generally missing, but are present in Busycon (Andrews 1991).

A common anatomical organization of the glands is shared by all neogastropods. The paired glands are tubular in shape, with a lumen lined by a columnar secretory epithelium surrounded by a richly innervated subepithelial muscular coat. External to the muscle layer there is an outer layer of gland cells, with long necks opening into the central lumen of the gland (Ponder 1973; Andrews 1991) and producing a peculiar granular secretion (Andrews 1991). Exceptions to this model include olives, volutids, and some mitriform species (Marcus and Marcus 1959; Ponder 1970, 1972). The structure is very similar to that of the venom gland of Conoidea (West et al. 1996). The accessory salivary glands open at the tip of the buccal cavity through nonciliated ducts.

In Muricidae, accessory salivary glands are usually large and well developed. In Nucella lapillus and Stramonita haemastoma, the only muricids studied so far at the biochemical level, accessory salivary glands produce a glycoprotein rich in cysteines (Martoja 1971; McGraw and Gunter 1972), similar to conotoxins. Extracts of the glands are able to elicit flaccid paralysis in Mytilus edulis, which may or may not be drilled, and, in the case of S. haemastoma, in barnacles, which are never drilled (Carriker 1981; Huang and Mir 1972; Andrews 1991; West et al. 1996; Andrews et al. 1991). S. haemastoma also produces a toxic secretion in the primary salivary glands that decreases cardiac activity in mammals and induces vasodilatation, hypotension, and smooth muscle contraction (Huang and Mir 1972). A similar response was demonstrated with a combined primary/accessory salivary gland extract of another muricid, Acanthina spirata (Hemingway 1978). N. lapillus extracts also disrupt neuromuscular transmission in rat phrenic nerve–hemidiaphragm preparations (West et al. 1996). In some Volutidae, the accessory salivary glands have been reported to produce a narcotizing compound, with a very low pH, that induces muscular relaxation in the prey (Bigatti et al. 2009).

15.3.2 Hypobranchial Gland

The hypobranchial gland consists of a thickening of the epithelium in the roof of the pallial cavity and produces large amounts of mucus. Its primary function is currently thought to be the cleaning of the mantle cavity; the mucous secretion binds together particulate matter, which is then eliminated from the mantle cavity. However, the hypobranchial gland comprises at least three different cell types that may correspond to distinct chemical activities, which have only been partially identified (Naegel and Aguilar-Cruz 2006). In many muricid species, the hypobranchial gland produces chromogens, which, when exposed to light and oxygen, develop into a purple pigment that has been used for centuries as a dye (Tyrian purple). Similarly, in the Mitridae, the hypobranchial secretion becomes yellowish once exposed to air, then purple, and finally dark brown (Harasewych 2009), while in Costellariidae it remains predominantly yellow-green (Ponder 1998b). The production of small compounds, mainly choline esters but also biogenic amines, has been detected in the hypobranchial gland of several species of muricids and buccinids. These substances elicit neuromuscular blocking, with paralyzing effects in both invertebrates and vertebrates (Roseghini et al. 1996). Because these toxic compounds are found in the snails at low concentrations, it is uncertain how effective they are in prey hunting (West et al. 1996). The functions of the hypobranchial gland and the role it played in the evolution and diversification of the Neogastropoda are still to be clarified; nevertheless, hypobranchial secretions may have useful pharmacological properties.

15.4 Neurotoxins, Anesthetics, and Anticoagulants: Prominent Bioactive Compounds from Neogastropod Snails

As stated in the introduction of this chapter, conotoxins, with the approval of the analgesic drug Prialt, have demonstrated the utility of translating basic research on marine snail compounds into drug development targets. Novel neurotoxins, anesthetics, and anticoagulants are three areas in which harvesting the bioactive compounds of the Neogastropoda could prove very fruitful. The following section highlights the success of conotoxins as neurotoxins and outlines the potential of identifying anesthetic and anticoagulant compounds from neogastropod snails.

15.4.1 Neurotoxins

In the Conoidea, the best-characterized venom components are small, highly structured disulfide-rich peptides, each encoded by a separate gene. Every Conus species has its own distinct repertoire of 50–200 venom peptides, with each peptide presumably having a physiologically relevant target in prey or potential predators/competitors (Olivera 2002). Most conotoxins are small peptides (6–40 amino acids in length), with the majority being in the size range of 12–30 amino acids (Olivera et al. 1990; Terlau and Olivera 2004). Conotoxin precursors share a highly conserved organization: a signal sequence, followed by a propeptide region and then the mature toxin, which is cleaved from the prepropeptide. The mature toxins are highly disulfide rich and are classified according to their cysteine framework. Cone snails practice combinatorial drug therapy in that it is not one conotoxin that attacks the prey, but instead a cocktail of the 50–200 venom peptides working together to shut down the prey’s nervous system. The conotoxin cocktail contains ion channel and receptor modifiers that can affect neuronal signaling. For example, conotoxins that inhibit Na⁺ channel function prevent the formation of action potentials, while conotoxins that target Ca²⁺ channels prevent vesicle fusion, which impedes the release of neurotransmitters. There are presently more than 3,000 different Conus venom proteins reported in the literature (Conoserver: http://research1t.imb.uq.edu.au/conoserver/). Less than 10% of the described conotoxins have been functionally characterized. Of those characterized, at least 25 different functions have been described (Olivera 2006; Conoserver). Several conotoxins are at various stages of drug development, the more promising examples being MrIA (active on norepinephrine transporters), Vc1.1 (active on nicotinic receptors), and Conantokin-G (active on NMDA receptors) (Olivera 2006).
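As an illustration of what classification by cysteine framework involves, the following minimal Python sketch reduces a mature peptide sequence to its cysteine spacing pattern. It is only a sketch under stated assumptions: the function names are hypothetical, the one-letter sequence used for ω-conotoxin MVIIA is quoted from memory and should be checked against Conoserver, and no mapping from patterns to named frameworks is attempted.

```python
import re


def cysteine_pattern(peptide: str) -> str:
    """Return the cysteine spacing pattern of a mature peptide.

    Adjacent cysteines stay together ("CC"); any run of other residues
    between cysteines is collapsed into a single "-" separator.
    """
    seq = peptide.strip().upper()
    # Collapse every run of non-cysteine residues into one separator.
    collapsed = re.sub(r"[^C]+", "-", seq)
    # Drop separators produced by residues before the first or after the last cysteine.
    return collapsed.strip("-")


def cysteine_count(peptide: str) -> int:
    """Count the cysteine residues in a peptide sequence."""
    return peptide.upper().count("C")


# Assumed sequence of omega-conotoxin MVIIA (ziconotide); verify before reuse.
MVIIA = "CKGKGAKCSRLMYDCCTGSCRSGKC"

if __name__ == "__main__":
    print(cysteine_pattern(MVIIA), cysteine_count(MVIIA))
```

Peptides that yield the same pattern share the same arrangement of cysteines, which is the property the framework classification relies on; disulfide connectivity and pharmacological target are not captured by this string and would have to be determined experimentally.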
While the majority of conotoxins in therapeutic development are analgesic compounds, conotoxins are also being considered as candidate therapeutics for epilepsy and myocardial infarction, as well as for their neuroprotective and cardioprotective properties (Twede et al. 2009).

Another promising group to investigate in order to discover new neurotoxins and/or substances capable of inactivating toxins is the corallivorous subfamily Coralliophilinae (Muricidae). The Anthozoa, such as sea anemones and stony and soft corals, which are included in the Cnidaria along with the jellyfishes (Scyphozoa), sea-wasps (Cubozoa), hydrocorals, and hydromedusae (Hydrozoa), are known to produce a neurotoxin-rich venom as well as other toxic defensive compounds, to which the Coralliophilinae appear to be immune. Envenomation by cnidarians represents a significant public health problem. An estimated 40,000–50,000 marine envenomations occur annually due to several species of Cnidaria. Cubozoans alone have been responsible for over 5,000 human deaths in the last 130 years (Brinkman and Burnell 2009). Antivenom is available only for a very limited number of species. If, as is suggested by reported observations, coralliophilines have antivenom-type compounds, these may be useful in cases of cnidarian envenomation. The immunity of Coralliophilinae raises a number of interesting evolutionary questions, such as: What are the physiological adaptations related to corallivory? Do corallivorous species secrete bioactive compounds that interact with and inactivate anthozoan toxins? Are there specialized organs involved in the production of the antivenom (e.g., salivary glands)? Are host switching in euryphagous species and host specificity in stenophagous species correlated with biochemical variations in the secretion? The answers to these questions may translate into a modern physiological and biochemical understanding of gastropod innovations related to feeding.

15.4.2

Anesthetic and Anticoagulant Compounds

As pointed out in Sect. 15.3, three different neogastropod families include haematophagous species, which produce anesthetic and anticoagulant compounds that may be useful in elucidating cellular communication in the nervous system and as antithrombotic agents. In Colubrariidae, anticoagulants are produced in the salivary glands, but the anatomical structures responsible for anesthetic secretion are not yet known. In addition to the salivary glands, it may be worthwhile to investigate the glandular mid-posterior oesophagus, a peculiar derived structure that may be related to the haematophagous lifestyle (Oliverio and Modica 2009). Furthermore, the peculiar mid-oesophagus of Cancellaria cooperi is a promising tissue to test for the production of bioactive compounds, as the cancellariid mid-oesophagus may be homologous to toxoglossan venom glands (Ponder 1973; Kantor 1996, 2002). Another issue of interest is the presence in Cancellariidae of both primary and accessory salivary glands. The roles these anatomical structures play in subduing prey and in the production of bioactive substances, as well as their interactions, are still to be investigated. Are the bioactive substances the same in the different haematophagous lineages? Intriguing evolutionary questions may be addressed by studying and comparing anticoagulant and anesthetic molecules in Colubrariidae and Cancellariidae.

15.5

Investigating Genetic Evolution and Expression of Neogastropod Toxins

The early evolution and first diversification of venom toxins have been interpreted as the result of a process of neofunctionalization, in which strong positive selection acts on redundant genes produced in duplication events, giving rise to new functions (Ohno 1970). This evolutionary mechanism has also been reported for conotoxins (Duda and Palumbi 1999). The evolutionary pressure promoting the variability of these “specialty genes” (also called exogenes, as their products act outside the organism; Olivera 2006) is related to a predator–prey arms race in which the availability of a particular kind of prey may produce an evolutionary force acting on ecologically important genetic loci. Conotoxins are particularly prone to rapid genetic variation, due to their extremely reduced size. It is still unclear to what extent the results reported for Conus can be generalized across the neogastropods, but it is at least plausible to hypothesize that the same organs produce the same types of bioactive substances across the entire order Neogastropoda. The amount of variation detected at the different taxonomic levels in neogastropods will make it possible to clarify the evolutionary patterns acting at each level. In snakes, where the same neofunctionalization mechanism is responsible for the evolution of the toxin gene families, the genes that have been recruited to constitute the venom proteome have been partly identified (Fry 2005). In neogastropods, including cone snails, the origin of the toxin sequences has yet to be investigated. The role of differential gene expression and posttranscriptional modifications in modulating toxin diversity is also an intriguing area requiring further investigation. This line of research could be addressed at different levels: (1) Between different species: a particular focus should be dedicated to host specificity, to verify whether the inverse correlation between the degree of specialization and the diversity of the venom in Conus leopardus (Remigio and Duda 2008) can be generalized to other neogastropod groups. (2) Among individuals of the same species: the high levels of intraspecific variability observed in Conus ventricosus (Romeo et al. 2008) raise the possibility that fine-scale modulatory mechanisms may act in response to environmental and ecological variations. (3) At different ontogenetic stages: juvenile neogastropods often have a largely different diet from the adults, implying a different suite of toxins. How, and through which mechanisms, does venom composition change during ontogenesis? To address these and other topics in toxin evolution and expression, a robust phylogenetic hypothesis and an integrated strategy for the characterization of bioactive compounds are required.
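As a rough illustration of how the positive selection on duplicated toxin genes discussed in this section might first be screened for, the Python sketch below tallies synonymous and nonsynonymous codon differences between two aligned coding sequences using the standard genetic code. It is only a crude first-pass count under stated assumptions (gap-free, in-frame, equal-length sequences; each differing codon counted once); it is not a dN/dS estimate, which additionally normalizes by the number of synonymous and nonsynonymous sites, and it is not presented as the method of any study cited here. The function name and the toy sequences in the usage note are hypothetical.

```python
from itertools import product
from typing import Tuple

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def tally_codon_changes(seq_a: str, seq_b: str) -> Tuple[int, int]:
    """Count (synonymous, nonsynonymous) codon differences.

    Assumes two aligned, gap-free, in-frame coding sequences of equal length.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, gap-free and in frame")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        # Same amino acid: silent change; different amino acid: replacement.
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous
```

For the hypothetical pair `TGTTGCAAA` and `TGCTGCAGA`, the first codon differs silently (both encode cysteine), the second is identical, and the third changes lysine to arginine. An excess of replacement over silent changes in the mature toxin region of paralogous genes would be a reason to follow up with a proper site-normalized analysis.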

15.6

Conclusion: Integrated Strategies for Building a More Robust Evolutionary Framework and Effective Drug Development Methods

The major challenges in characterizing bioactive compounds in snails are the complexity of sampling, the scarcity of the biological material, and the absence of databases for determination of peptide and protein sequences. Venom profiling may thus prove an elusive target unless molecular biology techniques are coupled with biochemical analysis of polypeptide composition. A multidisciplinary platform, combining modern genomic and proteomic techniques with phylogeny and descriptive approaches to ecology and anatomy, is necessary to increase the rate of pharmacological characterization of new bioactive compounds. Genomic libraries can be obtained from tissues of interest, and their analysis can be integrated with proteomic techniques such as venom fractionation, peptide purification, mass spectrometry, and sequence analysis using automated Edman degradation. Spider venoms have recently been analyzed by a three-dimensional approach, combining calculated, predicted, and measured data obtained with different techniques such as cDNA sequencing and LC-MALDI analysis (Escoubas et al. 2006). Such “venom landscapes” may constitute a significant improvement in venom profiling and can also serve as molecular markers in taxonomic and phylogenetic studies. A similar strategy has been applied to scorpion venoms (Nascimento et al. 2006). Molecular phylogeny, combined with anatomical and ecological data, can guide us through the maze of snail biodiversity toward the species or groups of species that are likely to possess bioactive compounds worth investigating in the search for new therapeutics (Fig. 15.4). This strategy was successfully applied to the Terebridae, outlining particular genera and species important for teretoxin discovery (Holford et al. 2009a, b).

Fig. 15.4 Integrated research strategies for investigating biodiversity. The integration of different approaches to diversity may lead to a more complete evolutionary framework and enhance the rate of drug discovery and development. [Diagram with the column headings “Research fields”, “Integrative approach”, and “Output”; box labels: Ecology, Anatomy & Physiology, Phylogeny, Chemical ecology, Integrated evolutionary framework, Pharmacology, Comparative phylogeny, Genomics & Proteomics, Enhanced drug development]
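The pairing of predicted and measured data that underlies the venom profiling described above can be sketched in a few lines. The Python snippet below is a minimal illustration, not the pipeline of Escoubas et al. (2006): it computes the monoisotopic mass of a peptide predicted from a cDNA sequence and reports which measured masses fall within a relative tolerance. The residue masses are standard monoisotopic values; the function names, the default tolerance of 50 ppm, and the decision to model only disulfide bonds (ignoring amidation and other posttranslational modifications that are common in venom peptides) are assumptions made for the sketch.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056     # added once per linear peptide (free termini)
HYDROGEN = 1.007825  # each disulfide bond removes two hydrogens


def predicted_mass(sequence, disulfide_bonds=0):
    """Monoisotopic mass of a peptide with the given number of disulfides."""
    mass = sum(RESIDUE_MASS[residue] for residue in sequence.upper()) + WATER
    return mass - 2 * HYDROGEN * disulfide_bonds


def match_masses(predicted, measured, tolerance_ppm=50.0):
    """Pair predicted peptides with measured masses within a ppm tolerance.

    predicted: dict mapping a peptide name to its predicted mass.
    measured:  iterable of masses observed by mass spectrometry.
    Returns a list of (name, measured_mass) tuples.
    """
    matches = []
    for name, mass in predicted.items():
        for observed in measured:
            if abs(observed - mass) / mass * 1e6 <= tolerance_ppm:
                matches.append((name, observed))
    return matches
```

In practice the tolerance would be set by the accuracy of the instrument, and unmatched measured masses are as informative as matched ones, since they point to peptides missing from the cDNA library or to modifications not modeled here.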

Interestingly, the relationship between drug discovery and phylogeny is a two-way street. Exogenes mostly belong to gene superfamilies with highly conserved sequence elements, enabling the use of standard molecular techniques. In what has been called a “concerted discovery strategy”, venom toxins prove to be useful characters for the taxonomy and phylogenetic relationships of their producers (Olivera 2006; Olivera and Teichert 2007; Bulaj 2008). This integrated approach has been used in non-molluscan toxin-producing groups such as snakes, to gain insight into the molecular evolution of snake venoms and to correlate it with the appearance of other morphological evolutionary novelties (Fry and Wüster 2004). For the Neogastropoda, whose phylogeny cannot be readily elucidated using standard taxonomic approaches, an integrated approach offers several possibilities. Proteomics of the venom, together with the characterization of its biochemical and functional properties, successfully separated two closely related, morphologically indistinguishable pit-viper species (Angulo et al. 2007). The use of genomic analysis and venom profiling techniques, along with more traditional approaches such as anatomical and physiological studies, will allow a better understanding of the correlation between venom composition, trophic preferences, and adaptive radiation of the Neogastropoda, creating the basis for a modern integrated evolutionary framework and an effective drug discovery strategy (Fig. 15.4).

Acknowledgments The authors thank Marco Oliverio for invaluable advice and helpful comments on the manuscript. Yuri Kantor, Alisa Kosyan, Gregory Herbert, Paolo Mariottini, Marco Oliverio, and Guido and Philippe Poppe are acknowledged for images used in the figures. MH acknowledges support from NIH grant GM088096-01.

References

Andrews EB (1991) The fine structure and function of the salivary glands of Nucella lapillus (Gastropoda: Muricidae). J Moll Stud 57:111–126
Andrews EB, Elphick MR, Thorndyke MC (1991) Pharmacologically active constituents of the accessory salivary and hypobranchial glands of Nucella lapillus. J Moll Stud 57:136–138
Angulo Y, Escolano J, Lomonte B, Gutiérrez JM, Sanz L, Calvete JJ (2007) Snake venomics of Central American pitvipers: clues for rationalizing the distinct envenomation profiles of Atropoides nummifer and Atropoides picadoi. J Proteome Res 7(2):706–719
Asano M, Itoh M (1959) Occurrence of tetramine and choline compounds in the salivary gland of a marine gastropod Neptunea arthritica (Bernardi). J Agric Res 10:209
Asano M, Itoh M (1960) Salivary poison of a marine gastropod, Neptunea arthritica Bernardi, and the seasonal variation of its toxicity. Ann N Y Acad Sci 90:675–688
Bigatti G, Sanchez Antelo CJM, Miloslavich P, Penchaszadeh PE (2009) Feeding behavior of Adelomelon ancilla (Lighfoot, 1786): a predatory neogastropod (Gastropoda: Volutidae) in Patagonian benthic communities. The Nautilus 123(3):159–165
Biggs JS, Olivera BM, Kantor YI (2008) α-Conopeptides specifically expressed in the salivary gland of Conus pulicarius. Toxicon 52:101–105
Bouchet P (1989) A marginellid gastropod parasitizes sleeping fishes. Bull Mar Sci 45:76–84
Bouchet P, Perrine D (1996) More gastropods feeding at night on parrotfishes. Bull Mar Sci 59(1):224–228

Bouchet P, Rocroi JP (2005) Classification and nomenclator of gastropod families. Malacologia 47(1–2):1–397
Brinkman DL, Burnell JN (2009) Biochemical and molecular characterisation of cubozoan protein toxins. Toxicon 54:1162–1173
Bulaj G (2008) Integrating the discovery pipeline for novel compounds targeting ion channels. Curr Opin Chem Biol 12:441–447
Carriker MR (1961) Comparative functional morphology of boring mechanisms in gastropods. Am Zool 1(2):263–266
Carriker MR (1981) Shell penetration and feeding by naticacean and muricacean predatory neogastropods: a synthesis. Malacologia 20:403–422
Colgan DJ, Ponder WF, Beacham E, Macaranas JM (2007) Molecular phylogenetics of Caenogastropoda (Gastropoda: Mollusca). Mol Phylogenet Evol 42(3):717–737
Conoserver: http://research1t.imb.uq.edu.au/conoserver/
Darragh TA, Ponder WF (1998) Family Volutidae. In: Beesley PL, Ross JGB, Wells A (eds) Mollusca: the Southern synthesis. Fauna of Australia, vol 5. CSIRO Publishing, Melbourne, pp 833–835, part B
Dietl GP, Herbert GS (2005) Influence of alternative shell-drilling behaviours on attack duration of the predatory snail Chicoreus dilectus. J Zool 265:201–206
Duda TFJ, Palumbi SR (1999) Molecular genetics of ecological diversification: duplication and rapid evolution of toxin genes of the venomous gastropod Conus. Proc Natl Acad Sci USA 96:6820–6823
Emmelin N, Fänge R (1958) Comparison between biological effects of neurine and a salivary glands extract of Neptunea antiqua. Acta Zool 39:47–52
Endean R (1972) Aspects of molluscan pharmacology. In: Florkin M, Scheer BT (eds) Chemical zoology, vol 7, Mollusca. Academic Press, New York, pp 421–466
Endean R, Parrish G, Gyr P (1974) Pharmacology of the venom of Conus geographus. Toxicon 12:131
Escoubas P, Sollod B, King GF (2006) Venom landscapes: mining the complexity of spider venoms via a combined cDNA and mass spectrometric approach. Toxicon 47:650–663
Fänge R (1960) The salivary gland of Neptunea antiqua. Ann N Y Acad Sci 90:689–694
Favreau P, Stöcklin R (2009) Marine snail venoms: use and trends in receptor and channel neuropharmacology. Curr Opin Pharmacol 9:594–601
Fleming C (1971) Case of poisoning from red whelks. Br Med J 3:250–251
Fox JW, Serrano SM (2007) Approaching the golden age of natural product pharmaceuticals from venom libraries: an overview of toxins and toxin-derivatives currently involved in therapeutic or diagnostic applications. Curr Pharm Res 13:2927–2934
Fretter V, Graham A (1994) British prosobranch molluscs. Revised and updated edition, Ray Society, London
Fry BG (2005) From genome to “venome”: molecular origin and evolution of the snake venom proteome inferred from phylogenetic analysis of toxin sequences and related body proteins. Genome Res 15:403–420
Fry BG, Wüster W (2004) Assembling an arsenal: origin and evolution of the snake venom proteome inferred from phylogenetic analysis of toxin sequences. Mol Biol Evol 21(5):870–883
Fujii R, Moriwaki N, Tanaka K, Ogawa T, Mori E, Saitou M (1992) Spectrophotometric determination of tetramine in carnivorous gastropods with tetrabromophenolphthalein ethyl ester. J Food Hyg Soc Japan 33(3):237–240
Harasewych MG (2009) Anatomy and biology of Mitra cornea Lamarck, 1811 (Mollusca, Caenogastropoda, Mitridae) from the Azores. Açoreana 6:121–135
Haynes JA (1990) Distribution movement and impact of the corallivorous gastropod Coralliophila abbreviata (Lamarck) in a Panamanian patch. J Exp Mar Biol Ecol 142:25–42
Hemingway GT (1978) Evidence for a paralytic venom in the intertidal snail Acanthina spirata (Neogastropoda: Thaisidae). Comp Biochem Physiol 60C:79–81

Heralde FM, Imperial J, Bandyopadhyay P, Olivera BM, Concepcion GP, Santos AD (2008) A rapidly diverging superfamily of peptide toxins in venomous Gemmula species. Toxicon 51:890–897
Holford M, Puillandre N, Modica MV, Watkins M, Collin R, Bermingham E, Olivera BM (2009a) Correlating molecular phylogeny with venom apparatus occurrence in panamic auger snails (Terebridae). PLoS ONE 4(11):e7667. doi:10.1371/journal.pone.0007667
Holford M, Puillandre N, Terryn Y, Cruaud C, Olivera BM, Bouchet P (2009b) Evolution of the Toxoglossa venom apparatus as inferred by molecular phylogeny of the Terebridae. Mol Biol Evol 26(1):15–25
Huang CL, Mir GN (1972) Pharmacological investigation of salivary gland of Thais haemastoma (Clench). Toxicon 10:111–117
Imperial JS, Watkins M, Chen P, Hillyard DR, Cruz LJ, Olivera BM (2003) The augertoxins: biochemical characterization of venom components from the toxoglossate gastropod Terebra subulata. Toxicon 42:391–398
Imperial JS, Kantor YI, Watkins M, Heralde FM, Stevenson B, Chen P, Hansson K, Stenflo J, Ownby J-P, Bouchet P, Olivera BM (2007) Venomous auger snail Hastula (Impages) hectica (Linnaeus, 1758): molecular phylogeny, foregut anatomy and comparative toxinology. J Exp Zool 308B:744–756
Johnson S, Johnson J, Jazwinski S (1995) Parasitism of sleeping fish by gastropod mollusks in the Colubrariidae and Marginellidae at Kwajalein, Marshall Islands. The Festivus 27(11):121–126
Kantor YI (1996) Phylogeny and relationships of Neogastropoda. In: Taylor J (ed) Origin and evolutionary radiation of the Mollusca. Oxford University Press, Oxford, pp 221–230
Kantor YI (2002) Morphological prerequisite for understanding neogastropod phylogeny. Boll Malacol Suppl 4:161–174
Kantor YI, Fedosov A (2009) Morphology and development of the valve of Leiblein: possible evidence for paraphyly of the Neogastropoda. The Nautilus 123(3):73–82
Kohn AJ (1956) Piscivorous gastropods of the genus Conus. Proc Natl Acad Sci USA 42:168–171
Kohn AJ (1959) The ecology of Conus in Hawaii. Ecol Monogr 29:47–90
Kohn AJ (1968) Microhabitats, abundance and food of Conus (Gastropoda) on atoll reefs in the Maldive and Chagos islands. Ecology 49:1046–1062
Kohn AJ (1978) Ecological shift and release in an isolated reefs: the significance of prey size. Ecology 59:614–631
Kohn AJ, Nybakken JW (1975) Ecology of Conus on eastern Indian ocean fringing reefs: diversity of species and resource utilization. Mar Biol 29:211–234
Kohn AJ, Saunders PR, Wiener S (1960) Preliminary studies on the venom of the marine snail Conus. Ann N Y Acad Sci 90:706–725
Kosuge S (1986) Description of a new species of ecto-parasitic snail on fish. Bull Inst Malacol 2(5):77
Leviten PJ (1980) The foraging strategy of vermivorous conid gastropods. Ecol Monogr 46:157–178
Marcus E, Marcus E (1959) Studies on Olividae. Bol Fac Fil Ciênc Let Univ S Paulo Zool 22:99–188
Marko PB, Vermeij GJ (1999) Molecular phylogenetics and the evolution of labral spines among eastern pacific ocenebrine gastropods. Mol Phylogenet Evol 13(2):275–288
Marsh M (1971) The foregut glands of some vermivorous cone shells. Aust J Zool 19:313–326
Martoja M (1964) Contribution a l’étude de l’appareil digestif et la digestion chez les gastéropodes carnivores de la famille Nassaridés. Cell 64:237–334
Martoja M (1971) Données histologiques sur les glandes salivaires et oesophagiennes de Thais lapillus (L.) (= Nucella lapillus. Prosobranche Néogastropode) Arch Zool Exp Gen 112:249–291
McGraw KA, Gunter G (1972) Observations on killing of the Virginia oyster by the Gulf oyster borer, Thais haemastoma, with evidence for a paralytic secretion. Proc Natl Shellfish Assoc 62:95–97

Miljanich GP (2004) Ziconotide: neuronal calcium channel blocker for treating severe chronic pain. Curr Med Chem 11:3029–3040
Millar JG, Dey A (1987) Food poisoning due to the consumption of red whelks Neptunea antiqua. Comm Dis Scotl Wkly Rep 21(38):5–6
Minniti F (1986) Morphological and histochemical study of pharynx of Leiblein, salivary glands and gland of Leiblein in the carnivorous Gastropoda Amyclina tinei Maravigna and Cyclope neritea Lamarck (Nassariidae: Prosobranchia Stenoglossa). Zool Anz 217:14–22
Modica MV, Kosyan A, Oliverio M (2009) The relationships of the enigmatic gastropod Tritonoharpa: new data on early neogastropod evolution? The Nautilus 123(3):177–188
Morton B, Chan K (1997) The first report of shell-boring predation by a representative of the Nassariidae (Gastropoda). J Moll Stud 63:480–482
Naegel LCA, Aguilar-Cruz CA (2006) The hypobranchial gland from the purple snail Plicopurpura pansa (Gould, 1853) (Prosobranchia, Muricidae). J Shellfish Res 25(2):391–394
Nascimento DG, Rates B, Santos DM, Verano-Braga T, Barbosa-Silva A, Dutra AAA, Biondi I, Martin-Euclaire MF, De Lima ME, Pimenta AMC (2006) Moving pieces in a taxonomic puzzle: venom 2D-LC/MS and data clustering analyses to infer phylogenetic relationships in some scorpions from the Buthidae family (Scorpiones). Toxicon 47:628–639
Nielsen C (1975) Observations on Buccinum undatum L. attacking bivalves and on prey responses, with a short review on attacking methods of other prosobranchs. Ophelia 13:87–108
Norton RS, Olivera BM (2006) Conotoxins down under. Toxicon 48:780–798
O’Sullivan JB, McConnaughey RR, Huber ME (1987) A blood-sucking snail: the Cooper’s nutmeg Cancellaria cooperi Gabb, parasitizes the California electric ray, Torpedo californica Ayres. Biol Bull 172:362–366
Ohno S (1970) Evolution by gene duplication. Springer, Berlin
Olivera BM (2002) Conus venom peptides: Reflections from the biology of clades and species. Annu Rev Ecol Syst 33:25–47
Olivera BM (2006) Conus peptides: biodiversity-based discovery and exogenomics. J Biol Chem 281:31173–31177
Olivera BM, Teichert RW (2007) Diversity of the neurotoxic Conus peptides: a model for concerted pharmacological discovery. Mol Interv 7(5):253–262
Olivera BM, Rivier J, Clark C, Ramilo CA, Corpuz GP, Abogadie FC, Mena EE, Woodward SR, Hillyard DR, Cruz LJ (1990) Diversity of Conus neuropeptides. Science 249:257–263
Oliverio M, Modica MV (2009) Relationships of the haematophagous marine snail Colubraria (Rachiglossa, Colubrariidae), within the neogastropod phylogenetic framework. Zool J Linn Soc 158:779–800
Oliverio M, Barco A, Modica MV, Richter A, Mariottini P (2008) Ecological barcoding of corallivory by ITS2 sequences: hosts of coralliophiline gastropods detected by the cnidarian DNA in their stomach. Mol Ecol Resour 9(1):94–103
Palmer AR (1990) Effect of crab effluent and scent of damaged conspecifics on feeding, growth, and shell morphology of the Atlantic dogwhelk, Nucella lapillus (L.). Hydrobiologia 193:155–182
Peterson CH, Black R (1995) Drilling by buccinid gastropods of the genus Cominella in Australia. The Veliger 38:37–42
Petit RE, Harasewych MG (1986) New Philippine Cancellariidae (Gastropoda: Cancellariacea), with notes on the fine structure and function of the nematoglossan radula. The Veliger 28(4):436–443
Ponder WF (1970) The morphology of Alcithoe arabica (Mollusca: Volutidae). Malacol Rev 3:127–165
Ponder WF (1972) The morphology of some mitriform gastropods with special reference to their alimentary and reproductive system (Neogastropoda). Malacologia 11(2):295–342
Ponder WF (1973) The origin and evolution of the Neogastropoda. Malacologia 12:295–338
Ponder WF (1998a) Infraorder Neogastropoda. In: Beesley PL, Ross JGB, Wells A (eds) Mollusca: the Southern synthesis. Fauna of Australia, vol 5. CSIRO Publishing, Melbourne, p 819, part B

Ponder WF (1998b) Family Costellariidae. In: Beesley PL, Ross JGB, Wells A (eds) Mollusca: the Southern synthesis. Fauna of Australia, vol 5. CSIRO Publishing, Melbourne, pp 843–845, part B
Ponder WF, Lindberg DR (1996) Gastropod phylogeny – challenges for the 90s. In: Taylor J (ed) Origin and evolutionary radiation of the Mollusca. Oxford University Press, London, pp 135–154
Ponder WF, Lindberg DR (1997) Towards a phylogeny of gastropod molluscs: an analysis using morphological characters. Zool J Linn Soc 119:83–265
Ponder WF, Taylor JD (1992) Predatory shell drilling by two species of Austroginella (Gastropoda: Marginellidae). J Zool 228:317–328
Power AJ, Keegan BF, Nolan K (2002) The seasonality and role of the neurotoxin tetramine in the salivary glands of the red whelk Neptunea antiqua L. Toxicon 40:419–425
Puillandre N, Samadi S, Boisselier M-C, Sysoev AV, Kantor YI, Cruaud C, Couloux A, Bouchet P (2008) Starting to unravel the toxoglossan knot: molecular phylogeny of the “turrids” (Neogastropoda: Conoidea). Mol Phylogenet Evol 47:1122–1134
Radwin GE, D’Attilio A (1976) Murex shells of the world. Stanford University Press, Stanford
Reid TMS, Gould IM, Mackie IM, Ritchie AH, Hobbs G (1988) Food poisoning due to the consumption of red whelks Neptunea antiqua. Epidemiol Infect 101:419
Remigio EA, Duda TFJ (2008) Evolution of ecological specialization and venom of a predatory marine gastropod. Mol Ecol 17:1156–1162
Richter A, Luque AA (2002) Current knowledge on Coralliophilidae (Gastropoda) and phylogenetic implication of anatomical and reproductive characters. Boll Malacol 38:5–19
Robertson R (1970) Review of the predators and parasites of stony corals, with special reference to symbiotic prosobranch gastropods. Pac Sci 24:43–54
Romeo C, Di Francesco L, Oliverio M, Palazzo P, Raybaudi Massilia G, Ascenzi P, Polticelli F, Schininà ME (2008) Conus ventricosus venom peptides profiling by HPLC-MS: a new insight in the intraspecific variation. J Sep Sci 31:488–498
Roseghini M, Severini C, Falconieri Erspamer G, Erspamer V (1996) Choline esters and biogenic amines in the hypobranchial gland of 55 molluscan species of the neogastropod Muricoidea superfamily. Toxicon 34(1):33–55
Saitoh H, Oikawa K, Takano T, Kamimura K (1983) Determination of tetramethylammonium ion in shellfish by ion chromatography. J Chromatogr 281:397
Shiomi K, Mizukami M, Shimakura K, Nagashima Y (1994) Toxins in the salivary gland of some marine carnivorous gastropods. Comp Biochem Physiol 107B:427–432
Smith EH (1967) The neogastropod midgut, with notes on the digestive diverticula and intestine. Trans R Soc Edinburgh 67:23–42
Strong EE (2003) Refining molluscan characters: morphology, character coding and a phylogeny of the Caenogastropoda. Zool J Linn Soc 137:447–554
Taylor JD (1976) Habitats, abundance and diets of muricacean gastropods at Aldabra Atoll. Zool J Linn Soc 59:155–193
Taylor JD (1978) Habitats and diet of predatory gastropods at Addu Atoll, Maldives. J Exp Mar Biol Ecol 31:83–103
Taylor JD, Morris NJ (1988) Relationships of neogastropoda. Malacol Rev 4:167–179
Taylor JD, Morris NJ, Taylor CN (1980) Food specialization and the evolution of predatory prosobranch gastropods. Palaeontology 23(2):375–409
Taylor JD, Kantor YI, Sysoev AV (1993) Foregut anatomy, feeding mechanisms, relationships and classification of the Conoidea (=Toxoglossa) (Gastropoda). Bull Br Mus Nat Hist 59:125–170
Terlau H, Olivera BM (2004) Conus venoms: a rich source of novel ion channel-targeted peptides. Physiol Rev 84:41–68
Twede VD, Miljanich GP, Olivera BM, Bulaj G (2009) Neuroprotective and cardioprotective conopeptides: an emerging class of drug leads. Curr Opin Drug Discov Dev 12:231–239

Ward J (1965) The digestive tract and its relation to feeding habits in the stenoglossan prosobranch Coralliophila abbreviata (Lamarck). Can J Zool 43:447–464
Watkins M, Hillyard DR, Olivera BM (2006) Genes expressed in a turrid venom duct: divergence and similarity to conotoxins. J Mol Evol 62:247–256
Watson-Wright WM, Sims GG, Smyth C, Gillis M, Maher M, Trottier T, Van Sinclair DE, Gilgan M (1992) Identification of tetramine as toxin causing food poisoning in Atlantic Canada following consumption of whelks Neptunea decemcostata. In: Gopalakrishnakone P, Tan CK (eds) Recent advances in toxinology research, vol 2. University of Singapore, Singapore, pp 551–561
Wells HW (1958) Feeding habits of Murex fulvescens. Ecology 39:556–558
West DJ, Andrews EB, Bowman D, McVean AR, Thorndyke MC (1996) Toxins from some poisonous and venomous marine snails. Comp Biochem Physiol 113C:1–10
Wu SK (1965) Comparative functional studies of the digestive system of the muricid gastropods Drupa ricina and Morula granulata. Malacologia 3:211–233

Chapter 16

Antennal Hammers: Echos of Sensillae Past

Nina Laurenne and Donald L.J. Quicke

Abstract Many hosts of parasitoids live in concealed environments, such as within plant tissue and wood, and are therefore difficult to find. This is likely to be especially true when concealed hosts are in the pupal stage and thereby silent and immobile. Cryptine ichneumonids collectively have a wide host range including members of several insect orders with different degrees of concealment. Many cryptine genera show a morphological adaptation to finding concealed hosts: their antennal tips are modified into hammer-like structures that are used to tap the substrate. This vibrational sounding (= echolocation through solid media) is typical of the tribe Cryptini and has multiple origins within the subfamily. We show that vibrational sounding is associated with antennal modification and with the use of wood-boring buprestid and cerambycid beetles as hosts, and suggest, based on an apparent transition series, that the hammers are derived from mechano-sensilla within the Cryptinae.

16.1

Introduction

The Ichneumonidae is one of the largest insect families, with more than 20,000 described species (Yu et al. 2005), though, according to Gaston and Gauld (1993), the real number of species may exceed half a million. Ichneumonid wasps are cosmopolitan, and whereas most species are parasitoids of other insects and in some cases spiders, their way of life varies remarkably. Unlike simple parasites, parasitoids always kill their hosts, which are typically larvae or pupae of various Lepidoptera, Coleoptera and Diptera. Parasitoid life history strategies are commonly divided into two classes, the koinobionts and idiobionts (Askew and Shaw 1986; Godfray 1994; Quicke 1997). These two life strategies differ from each other considerably, but the defining difference between them is that idiobionts do not permit their host to carry on developing after parasitisation. In those cases in which the host is a larval stage, it is typically paralysed by the female wasp’s venom. In contrast, the hosts of koinobionts are allowed to continue their development after parasitisation until they reach a suitable stage to be consumed by the parasitoid larvae. Several other features are associated with these life strategies. For example, koinobionts are most usually endoparasitoids with relatively narrow host ranges, as they have to be able to adapt to the host’s immunological defenses. Idiobionts are typically ectoparasitoids and generalists with a wide host range, though those that attack pupal hosts are usually endoparasitoids. For idiobionts, host range is often largely determined by the potential hosts that are encountered. The hosts of koinobionts are often exposed or only slightly concealed. Paralysed hosts would be very prone to predation if they were exposed, and therefore hosts of idiobionts tend to live in concealed conditions (e.g. leaf-rolls or leaf mines, plant stems, under bark or inside wood). The trait of exploiting concealed hosts is regarded as the ancestral state in the Ichneumonidae, and transitions from idiobiosis to koinobiosis appear to have happened multiple times within the family (Belshaw et al. 1998; Whitfield 1998).

N. Laurenne
Museum of Natural History, Entomology Division, University of Helsinki, P.O. Box 17, (P. Arkadiankatu 13), 00014 Helsinki, Finland
e-mail: [email protected]

D.L.J. Quicke
Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire SL5 7PY, UK
Department of Entomology, Natural History Museum, London SW7 5BD, UK

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_16, © Springer-Verlag Berlin Heidelberg 2010

16.1.1

Host Location

According to Vinson (1988), host location consists of several stages beginning with finding a suitable habitat. Then, a parasitoid must locate a potential host therein, followed by examining it for suitability (species and the developmental stage) and finally, oviposition. Parasitoids use many modalities in host location; scent, vision, sound and vibration are involved (Wertheim et al. 2003; Fischer et al. 2004; Fatouros et al. 2005). Wasps lead to a host by several cues, for example, parasitoids can recognise shape, colour and a movement of a host (Fischer et al. 2001, 2004). Volatile chemicals from host frass and damaged plant material are shown to be attractive to parasitoids (Gohole et al. 2003; Bukovinszky et al. 2005). Some species have even evolved to detect host sex pheromones and other kairomones and use them in host searching (Wertheim et al. 2003; Jumean et al. 2005). In general, multiple cues are involved in host-searching process and their efficiency is affected by environmental factors, such as temperature (Fischer et al. 2001; Kro¨der et al. 2007a, b). The female ovipositor and antennae are both important for host examination and acceptance as they have various sensillae for detecting the suitability of a potential

host (Mackauer et al. 1996; Ignacimuthu and Dorn 2000; Isidoro et al. 2001; Romani et al. 2002). Many mobile hosts of ichneumonid wasps live in concealed places such as within wood. Such host larvae cause vibrations when they chew wood and move, and some parasitoid groups have evolved the ability to detect these host-generated vibrations. However, not all potential concealed hosts create their own vibrations, e.g. pupal and prepupal stages or larvae that are about to moult. To locate these, some parasitic wasps have evolved an active, vibration-based method called vibrational sounding. This form of echolocation occurs in one non-apocritan group, the Orussidae, which have highly modified antennae and massively enlarged subgenual (hearing) organs in the forelegs (Vilhelmsen et al. 2001). Females tap the substrate with their antennae and detect the echoes with their subgenual organs. This idea was originally suggested by Cooper (1953), and later Powell and Turner (1975) made similar observations of female behaviour supporting Cooper’s conjecture. Use of vibrational sounding as a means of host location has also evolved on a number of separate occasions within the Ichneumonidae. Amongst the parasitic apocritan wasps, vibrational sounding has been most thoroughly investigated in the pimpline ichneumonid genus Pimpla and relatives (Henaut and Guerdoux 1982; Henaut 1990; Meyhöfer and Casas 1999; Fischer et al. 2001, 2003). The success of echolocation is dependent on several factors, and Kröder et al. (2006, 2007b) have shown it to be more efficient in warmer conditions and the role of vision to be more important in cooler conditions. Parasitoids can adjust the intensity of echolocation according to the temperature, which shows adaptation to environmental conditions in temperate regions. The ability to adjust to the microhabitat and its varying environmental factors involves a complicated interaction. According to Otten et al. (2001), larger females are better at finding concealed hosts than smaller ones: a larger body mass is capable of transmitting vibration better than a smaller one. Apart from in the pimplines, females of a number of other ichneumonid genera are hypothesised to use vibrational sounding based on their morphology: antennal tips modified into hammer-like structures suitable for “hammering” the substrate and enlarged subgenual organs in their fore tibiae for detecting substrate-borne vibrations (Broad and Quicke 2000). Additionally, the antennal pegs of female Xorides (Xoridinae) are solid (Quicke unpublished observations) and therefore likely to act as antennal hammers. The largest subfamily of Ichneumonidae is the Cryptinae with 4,659 species belonging to 394 genera (Yu et al. 2005). The cryptines are an appropriate model group as vibrational sounding has multiple origins and losses within them and there is a detailed molecular phylogenetic analysis (Laurenne et al. 2006). We tested the association between the occurrence of hammer-like terminal antennal segments within the Cryptinae and the exploitation of wood-boring buprestids and cerambycids within a comparative phylogenetic framework. Traditionally, the Cryptinae has been divided into three tribes: Cryptini, Phygadeuontini and Hemigasterini, and molecular studies largely support this classification

N. Laurenne and D.L.J. Quicke

(Laurenne et al. 2006; Quicke et al. 2009). Most cryptines are idiobiont ectoparasitoids and their hosts usually belong to the largest insect orders (Coleoptera, Lepidoptera, Hymenoptera and Diptera), but spider egg predation occurs in some cryptine genera, and a few other insect orders are occasionally attacked. Although the host groups of the subfamily as a whole cover several orders, individual cryptine species can be quite host specific or have a narrow host range (Askew and Shaw 1986; Gauld 1988; Schwarz and Shaw 1998, 2000).

16.2

Material and Methods

We examined the terminal antennal flagellomeres of species representing 122 genera of the subfamily Cryptinae, six of Ichneumoninae and one species each of the Alomyinae, Eucerotinae and Pedunculinae. Scanning electron microscopy (SEM) was used for the vast majority, though light microscopy was occasionally relied upon for larger specimens of some groups. For males we included 32 genera (26 cryptines, 2 hemigasterines and 4 phygadeuontines). Female antennal tips were classified into five categories according to the degree of modification, from unmodified antennae with a tapered tip to ones forming a large flat surface. In the intermediate stages, individual setae become thicker and form a cluster (Laurenne et al. 2009).

16.2.1

Comparative Analysis

Comparative analysis by independent contrasts (CAIC; Purvis and Rambaut 1995) was carried out to test the statistical significance of the association between antennal modification and the use of wood-boring beetles (buprestids and cerambycids). The degree of antennal modification was treated as a continuous variable and the coleopteran hosts were treated as a categorical variable. Evolutionary rate was assumed to be the same for each taxon. The trees used in the comparative analysis were taken from Laurenne et al.’s (2006) molecular study of cryptine phylogeny, which was based on the length-variable D2 (+D3) region of the nuclear 28S rDNA gene, but taxa without host record information were pruned from the tree as missing values are not allowed in CAIC. Two cryptine genera (Mallochia and Schreineria) with host records were added to the tree and, in the absence of molecular data, their placements were based on Townes’s (1969) classification. To avoid biased results, the comparative analyses were carried out using five different gap cost ratios and two different alignment methods (POY and Clustal W + PAUP*). Details of the methods are described in Laurenne et al. (2009).
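The independent-contrasts calculation on which CAIC is built can be illustrated with a minimal Python sketch. It assumes a fully bifurcating tree and sets every branch length to 1, mirroring the equal-rate assumption stated above. The four-taxon tree, the taxon names and the antennal scores are hypothetical, and the sketch covers only contrasts in a continuous trait; the way CAIC handles a categorical predictor such as host use is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A node in a bifurcating tree; tips carry a trait value."""
    name: str
    value: Optional[float] = None      # trait value (tips only)
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    branch_length: float = 1.0         # assumption: equal branch lengths


def contrasts(node: Node, out: List[Tuple[str, float]]) -> Tuple[float, float]:
    """Post-order pass computing standardized independent contrasts.

    Returns (estimated trait value, adjusted branch length) for `node`
    and appends one (node name, standardized contrast) pair per
    internal node to `out`.
    """
    if node.left is None and node.right is None:
        return node.value, node.branch_length
    x1, v1 = contrasts(node.left, out)
    x2, v2 = contrasts(node.right, out)
    # Contrast between the two daughters, scaled by its expected s.d.
    out.append((node.name, (x1 - x2) / (v1 + v2) ** 0.5))
    # Weighted average of the daughters estimates the ancestral value.
    x = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    # Lengthen the branch below to reflect uncertainty in that estimate.
    v = node.branch_length + v1 * v2 / (v1 + v2)
    return x, v


# Hypothetical four-taxon tree with made-up antennal modification scores.
example_tree = Node(
    "root",
    left=Node("n1", left=Node("taxon_a", 0.0), right=Node("taxon_b", 2.0)),
    right=Node("n2", left=Node("taxon_c", 4.0), right=Node("taxon_d", 4.0)),
)
example_contrasts: List[Tuple[str, float]] = []
contrasts(example_tree, example_contrasts)
for node_name, contrast in example_contrasts:
    print(node_name, round(contrast, 4))
```

Each contrast is an independent comparison, so a test of association can be run on the set of contrasts rather than on the raw, phylogenetically non-independent taxon values.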

16.3

Results

The percentage of genera showing each degree of antennal modification is shown in Fig. 16.1. Figure 16.2 presents the occurrence of antennal development on a phylogenetic tree. Figures 16.3 and 16.4 show the transformation series from a simple antennal tip with no especially modified sensilla to a large united structure with a virtually uniform surface. Surculus (Fig. 16.3a) displays a simple antennal tip without obvious modification. Figure 16.3b, c shows thickening of some apical setae in the genera Latibulus (Fig. 16.3b) and Hidryta (Fig. 16.3c). Setae are modified into truncate structures forming a cluster in the genus Camera (Fig. 16.3e), and in Cryptanura (Fig. 16.3f) the modified structures have started to fuse in the middle. In Fig. 16.4, fused structures form a more or less flat surface in females of Acroricnus (Fig. 16.4a), and Buathra (Fig. 16.4b) shows a smooth face of modified and fused structures. The antennal tip of Osprynchotus (Fig. 16.4c) forms a large uniform flat surface, a truly hammer-like antenna. Some genera have different types of specialisation of the antennal tip; for example, Meringopus (Fig. 16.4d) has thickened “setae” originating from sockets inside the antennal surface. Terminal antennal structures of cryptines are often sexually dimorphic characters, as males typically do not display any particular antennal modification. However, some specialisations do occur in males of a few genera. For example, males of Gabunia (Fig. 16.4e) have two peg-like structures on their antennal tip and those of Eurycryptus have one smaller structure (Fig. 16.4f).

Fig. 16.1 The percentage of occurrence of each degree of antennal hammer development in each tribe of Cryptinae

Fig. 16.2 The phylogeny of cryptine wasps (Laurenne et al. 2009). Black circles indicate taxa that attack buprestid/cerambycid beetles and have strongly modified antennae (categories 4–5). Grey circles indicate the occurrence of slightly modified antennae (categories 1–3)

Fig. 16.3 Female antennal tips showing antennal modification. (a) Surculus, not modified. (b and c) Some thickened setae on the tip: (b) Latibulus and (c) Hidryta. (d) Diapetimorpha, thickened structures form a cluster. (e) Camera, dense cluster of truncate structures forms a patch. (f) Cryptanura, a cluster of short apically flattened structures with a fusion in the middle

The CAIC analysis showed a significant association between the degree of antennal development and the usage of wood-boring buprestid and cerambycid beetles in the Cryptini. Thirteen genera of the tribe Cryptini exploit wood-boring beetle larvae and have modified antennae. Within the Phygadeuontini, only five genera have this association. The p-values indicated a significant association (0.0080–0.0397) in all analyses except the one using the alignment obtained with the highest gap:substitution cost ratio (4:1, p-value = 0.0707). Detailed results are presented in Laurenne et al. (2009).

16.4

Discussion

Possession of an antennal hammer is clearly a homoplastic character at a higher level as it is also found in other ichneumonid subfamilies (Labeninae, Xoridinae, Claseinae and Pimplinae) (Broad and Quicke 2000) as well as in the Orussidae (Cooper 1953; Broad and Quicke 2000; Vilhelmsen et al. 2001). This structure is

Fig. 16.4 Antennal tips of females and males. (a) Acroricnus female, apical structures form a clear patch. (b) Buathra female, structures form a smooth patch. (c) Osprynchotus female, a hammer-like antennal tip. (d) Meringopus female, thickened antennal setae originating from deep sockets. (e) Gabunia male, two pegs on the antennal tip. (f) Eurycryptus male, one antennal peg

associated with deeply concealed cerambycid and buprestid beetle hosts and we have shown by comparative analysis that it is also highly homoplastic within the single but large subfamily Cryptinae. Behavioural observations of Echthrus and of a Gabunia sp. (Quicke et al. 2003) support the hypothesis that antennal hammers in the Cryptini are associated with host searching. In 2004, we video recorded the host-searching behaviour of a female Echthrus reluctator on a pile of pine logs in Hungary (Quicke 2001). The wasp walked along the log tapping the substrate with the antennae repeatedly sweeping symmetrically in inwardly directed arcs. Similar behaviour was also observed in an unidentified Afrotropical species of Gabunia (tribe Cryptini) in Kibale Forest National Park in Uganda.

16.4.1

Hosts of Cryptine Wasps

Most cryptine wasps are ectoparasitoids and they do not need to adapt to the host’s immunological defense. This may explain why some genera attack hosts from

several insect orders. The essential ability in host usage might be to find concealed hosts of suitable size.

16.4.1.1

Hosts of the Phygadeuontini

Species of the tribe Phygadeuontini typically parasitise exposed or weakly concealed hosts and this is considered to be the ground-plan biology for the Cryptinae (Gokhman 1996). The comparative analysis using the phylogeny (Laurenne et al. 2006) shows that modified antennal tips have multiple origins within the Phygadeuontini and that the host range covers several insect orders. Antennal modification was found in three genera, all of which attack wood-boring beetles (Fig. 16.4).

16.4.1.2

Hosts of the Cryptini

In the tribe Cryptini, all the taxa that exploit wood-boring beetles have antennal hammers. This is probably the ground-plan for the tribe. Strongly modified antennal structures are also found in genera that attack other insect groups such as aculeate Hymenoptera larvae in their nests. Parasitoids probably locate cells with a suitable host using vibrational sounding. Aculeate larvae are probably largely silent and do not chew wood, though they move inside the cell when they need to be fed by adults. Members of the genera Acroricnus, Eurycryptus, Messatoporus, Osprynchotus and Photocryptus exploit aculeate larvae (Genaro 1996) and they all have modified antennal tips. According to Gauld (1988) there may have been a host shift from Coleoptera hosts to the young of nest-building aculeate Hymenoptera, but this is only a hypothesis and cannot be tested at present due to the lack of sufficiently detailed host information for the vast majority of Cryptinae genera. Unlike most other subtribes, the Gabuniini form a well-supported monophyletic group (Laurenne et al. 2006) comprising 12 genera. Ten of these have strong antennal modifications and the four available host records indicate that these species exploit cerambycid or buprestid beetle hosts. The cylindrical body shape of gabuniines and their long ovipositors probably enable them to reach their hosts and are perhaps constrained by the shape of the host boring (Townes and Townes 1962); the enlarged subgenual organs found in the forelegs of females are assumed to be for detecting echoes during host location (Broad and Quicke 2000). Most of the available host records for cryptine wasps concern phygadeuontines, many of which attack rather weakly concealed hosts, especially ones in cocoons, or spider egg masses. The spider egg “parasitoids” attack exposed egg masses; vibrational sounding therefore probably has no role in locating them, and the antennal tips of the spider egg “parasitoids” examined are typically simple. 
Hyperparasitism of cocooned parasitoid hosts occurs more commonly in the Cryptini than in the Phygadeuontini, though there are numerous examples within the latter. Some genera have modified antennal tips, but that could possibly be explained by the adaptation to exploit other insect groups as well. Within the Cryptini, males of six out of the ten genera examined had either one or two terminal flagellomere pegs. The females of the same genera also had

antennal modifications except in the case of Chrysocryptus. Structures of male terminal flagellomeres are probably not related to the echolocation role of female antennal hammers. Their co-occurrence suggests that there might be homologous genetic control in the tribe Cryptini. Whether, and in what way, they may be functional has yet to be determined. Field observations of mate location and mating are sadly largely lacking. Considering the size of the subfamily, very few host records are available for cryptine genera, and when records exist, they are often vague. In particular, records typically lack information about the host’s precise developmental stage. Field records are largely lacking, and the host-location behaviour is usually referred to as “antennation” without describing what part of the antennae is used. We hope that this paper will encourage more detailed observation and reporting in the future.

16.4.2

Postulated Derivation of Hammers from Sensilla

If the states shown in Fig. 16.3a–f represent various stages in the evolution of antennal hammers, as seems likely, then the individual components of the hammer surface would appear to be derived from sensilla. The unmodified terminal flagellomere of Surculus has many thin curved sensilla chaetica, with a lower number of more erect obliquely ended chaetica (on right), and one visible blunt sensillum. In Latibulus (Fig. 16.3b), there are numerous blunt trichoid sensilla in relatively small sockets plus several longer more pointed chaetica in rather large sockets. In Fig. 16.3c, there is a similar grouping of socketed and less conspicuously socketed blunt sensilla but with their apices curving towards the antennal tip and interspersed with small trichoid sensilla. In Fig. 16.3d, the apical cluster comprises a dense central area of T-shaped pegs that lack sockets at least on the basal side, though on the side of the antennal apex there appears to be a well-developed basal socket; these are surrounded by curved, socketed robust trichoid sensilla. Socketed trichoid sensilla are typically involved in mechanoreception. If, as the above suggests, the antennal hammers of cryptines, and possibly other ichneumonid wasps, evolved from mechanoreceptory sensilla, this raises the question of what function the intermediate evolutionary stages served, and what substrates the hosts occupied during those intermediate phases. Certainly, more detailed behavioural, microscopic and ultrastructural observations of living representatives of apparent intermediate stages are needed.

References

Askew RR, Shaw MR (1986) Parasitoid communities: their size, structure and development. In: Waage J, Greathead D (eds) Insect parasitoids. Academic, London, pp 225–264
Belshaw R, Fitton M, Herniou E, Gimeno C, Quicke DLJ (1998) A phylogenetic reconstruction of the Ichneumonoidea (Hymenoptera) based on the D2 variable region of 28S ribosomal RNA. Syst Entomol 23:109–123

Broad GR, Quicke DLJ (2000) The adaptive significance of host location by vibrational sounding in parasitoid wasps. Proc R Soc Lond B Biol 267:2403–2409
Bukovinszky T, Gols R, Posthumus MA, Vet LEM, van Lenteren JC (2005) Variation in plant volatiles and attraction of the parasitoid Diadegma semiclausum (Hellen). J Chem Ecol 31:461–480
Cooper KW (1953) Egg gigantism, oviposition, and genital anatomy: their bearing on the biology and phylogenetic position of Orussus (Hymenoptera: Siricoidea). Proc R Acad Sci 10:38–68
Fatouros NE, Huigens ME, van Loon JJA, Dicke M, Hilker M (2005) Butterfly antiaphrodisiac lures parasitic wasps. Nature 433:704
Fischer S, Samietz J, Wäckers FL, Dorn S (2001) Interaction of vibrational and visual cues in parasitoid host location. J Comp Physiol A 187:785–791
Fischer S, Samietz J, Dorn S (2003) Efficiency of vibrational sounding in parasitoid host location depends on substrate density. J Comp Physiol A 189:723–730
Fischer S, Samietz J, Wäckers FL, Dorn S (2004) Perception of chromatic cues during host location by the pupal parasitoid Pimpla turionellae (L.) (Hymenoptera: Ichneumonidae). Environ Entomol 33:81–87
Gaston KJ, Gauld ID (1993) How many species of pimplines (Hymenoptera: Ichneumonidae) are there in Costa Rica? J Trop Ecol 9:491–499
Gauld ID (1988) Evolutionary patterns of host utilization by ichneumonoid parasitoids hymenoptera Ichneumonidae and Braconidae. Biol J Linn Soc 35:351–378
Genaro JA (1996) Nest parasites (Coleoptera, Diptera, Hymenoptera) of some wasps and bees (Vespidae, Sphecidae, Colletidae, Megachilidae, Anthophoridae) in Cuba. Caribb J Sci 32:239–240
Gohole LS, Overholt WA, Khan ZR, Vet LEM (2003) Role of volatiles emitted by host and nonhost plants in the foraging behaviour of Dentichasmias busseolae, a pupal parasitoid of the spotted stemborer Chilo partellus. Entomol Exp Appl 107:1–9
Godfray HCJ (1994) Parasitoids: behavioral and evolutionary ecology. Princeton University Press, Princeton, NJ
Gokhman VE (1996) Trends of biological evolution in the subfamily Ichneumoninae and related groups (Hymenoptera Ichneumonidae): an attempt of phylogenetic reconstruction. Russ Entomol J 4:91–103
Henaut A, Guerdoux J (1982) Location of a lure by the drumming insect Pimpla instigator (Hymenoptera, Ichneumonidae). Experientia 38:346–347
Henaut A (1990) Study of the sound produced by Pimpla instigator (Hymenoptera, Ichneumonidae) during host selection. Entomophaga 35:127–139
Ignacimuthu S, Dorn S (2000) Mechano- and chemoreceptors and their possible role in host location behaviour of parasitoid Anisopteromalus calandrae Howard (Hymenoptera: Pteromalidae). Entomon 25:179–184
Isidoro N, Romani R, Bin F (2001) Antennal multiporous sensilla: their gustatory features for host recognition in female parasitic wasps (Insecta, Hymenoptera: Platygastroidea). Microsc Res Tech 55:350–358
Jumean Z, Unruh T, Gries R, Gries G (2005) Mastrus ridibundus parasitoids eavesdrop on cocoon-spinning codling moth, Cydia pomonella, larvae. Naturwissenschaften 92:20–25
Kröder S, Samietz J, Dorn S (2006) Effect of ambient temperature on mechanosensory host location in two parasitic wasps of different climatic origin. Physiol Entomol 31:299–305
Kröder S, Samietz J, Dorn S (2007a) Temperature affects interaction of visual and vibrational cues in parasitoid host location. J Comp Physiol 193:223–231
Kröder S, Samietz J, Schneider D, Dorn S (2007b) Adjustment of vibratory signals to ambient temperature in a host-searching parasitoid. Physiol Entomol 32:105–112
Laurenne NM, Broad GR, Quicke DLJ (2006) Direct optimization and multiple alignment of 28S D2–D3 rDNA sequences: problems with indels on the way to a molecular phylogeny of the cryptine ichneumon wasps (Insecta: Hymenoptera). Cladistics 22:442–473

Laurenne NM, Karatolos N, Quicke DLJ (2009) Hammering homoplasy: multiple gains and losses of vibrational sounding in cryptine wasps (Insecta: Hymenoptera: Ichneumonidae). Biol J Linn Soc 96:82–102
Meyhöfer R, Casas J (1999) Vibratory stimuli in host location by parasitic wasps. J Insect Physiol 45:967–971
Mackauer M, Michaud JP, Völkl W (1996) Host choice by aphidiid parasitoids (Hymenoptera: Aphidiidae): host recognition, host quality, and host value. Can Entomol 128:959–980
Otten H, Wäckers F, Battini M, Dorn S (2001) Efficiency of vibrational sounding in the parasitoid Pimpla turionellae is affected by female size. Anim Behav 61:671–677
Powell JA, Turner WJ (1975) Observations on oviposition behaviour and host selection in Orussus occidentalis (Hymenoptera: Siricoidea). J Kans Entomol Soc 48:299–307
Purvis A, Rambaut A (1995) Comparative analysis by independent contrasts (CAIC): an Apple Macintosh application for analysing comparative data. Comput Appl Biosci 11:247–251
Quicke DLJ (1997) Parasitic wasps. Chapman & Hall, London, New York
Quicke DLJ (2001) Movie of host searching Echthrus. http://www.imperial.ac.uk/imedia/vid/fons/biology/quicke//Echthrus.mp4. Accessed 7 Dec 2009
Quicke DLJ, Laurenne NM, Broad GR, Barclay MVL (2003) Host location behaviour and a new host record for Gabunia aff. togoensis Krieger (Hymenoptera: Ichneumonidae: Cryptinae) in Kibale Forest National Park, West Uganda. Afr Entomol 11:308–310
Quicke DLJ, Laurenne NM, Fitton MG, Broad GR (2009) A thousand and one wasps: a 28S rDNA and morphological phylogeny of the Ichneumonidae (Insecta: Hymenoptera) with an investigation into alignment parameter space and elision. J Nat Hist 43:1305–1421
Romani R, Isidoro N, Bin F, Vinson SB (2002) Host recognition in the pupal parasitoid Trichopria drosophilae: a morpho-functional approach. Entomol Exp Appl 105:119–128
Schwarz M, Shaw MR (1998) Western Palaearctic Cryptinae (Hymenoptera: Ichneumonidae) in the National Museums of Scotland, with nomenclatural changes, taxonomic notes, rearing records and special reference to the British check list. Part 1. Tribe Cryptini. Entomologist’s Gaz 49:101–127
Schwarz M, Shaw MR (2000) Western Palaearctic Cryptinae (Hymenoptera: Ichneumonidae) in the National Museums of Scotland, with nomenclatural changes, taxonomic notes, rearing records and special reference to the British check list. Part 3. Tribe Phygadeuontini, subtribes Chiroticina, Acrolytina, Hemitelina and Gelina (excluding Gelis), with descriptions of new species. Entomologist’s Gaz 51:147–186
Townes H (1969) The genera of Ichneumonidae, part 1. Mem Am Entomol Inst 11:1–300
Townes H, Townes M (1962) Ichneumon-flies of America north of Mexico: 3. Subfamily Gelinae, tribe Mesostenini. United States National Museum Bulletin 216:1–602
Vinson SB (1988) Comparison of host characteristics that elicit host recognition behavior of parasitoid Hymenoptera. In: Gupta VK (ed) Advances in parasitic Hymenoptera research: proceedings of the II conference on the taxonomy and biology of parasitic Hymenoptera. E. J. Brill, Leiden, pp 285–291
Vilhelmsen L, Isidoro N, Romani R, Basibuyuk HH, Quicke DLJ (2001) Host location and oviposition in a basal group of parasitic wasps: the subgenual organ, ovipositor apparatus and associated structures in the Orussidae (Hymenoptera, Insecta). Zoomorphology 121:63–84
Wertheim B, Vet LEM, Dicke M (2003) Increased risk of parasitism as ecological costs of using aggregation pheromones: laboratory and field study of Drosophila–Leptopilina interaction. Oikos 100:269–282
Whitfield JB (1998) Phylogeny and evolution of host–parasitoid interactions in Hymenoptera. Ann Rev Entomol 43:129–151
Yu D, van Achterberg K, Horstmann K (2005) World Ichneumonoidea 2004. Taxonomy, biology, morphology and distribution. CD/DVD, Taxapad, Vancouver, Canada

Chapter 17

Adaptive Radiation of Neotropical Emballonurid Bats: Molecular Phylogenetics and Evolutionary Patterns in Behavior and Morphology

Burton K. Lim

Abstract A phylogenetic analysis of loci from the four genetic transmission pathways in mammals (mitochondrial, autosomal, X, and Y sex chromosomes) was used to investigate the evolution of bats in the pantropically distributed family Emballonuridae. The nuclear data sets support a monophyletic clade of species found in the New World. Character optimization of distributional areas suggests that the most recent common ancestor colonized South America from Africa. Molecular dating with fossil calibrations estimated that a basal split occurred approximately 27 million years ago followed by primary intergeneric diversification 19.4–18.0 million years ago. An analysis of historical biogeography identified the northern Amazon as the ancestral area where there was speciation by taxon pulses from a stable core area in the Guiana Shield. Range contractions followed by expansions during the Early Miocene suggest an adaptive radiation in cluttered forest and open savannah habitats. A correlation of ear morphology, echolocation, and foraging behavior indicates a phylogenetic basis for these complex character systems.

17.1

Introduction

South America was an insular continent from the Late Cretaceous to the Early Pliocene; nevertheless, it has high levels of biodiversity for many groups of organisms compared with other parts of the world. For example, bats account for 20% of the mammalian faunal diversity (Wilson and Reeder 2005) and are unique in being the only order of mammals that can fly. This gives bats an advantage for over-water dispersal, but there have been no studies investigating the evolutionary mechanisms for the successful radiation of bats, especially in the rainforests of the Amazon. As with most taxa, this has been hindered by a lack of comprehensive species-level phylogenies, a dearth of fossils in the paleontological record, and a paucity of ecological data. Herein, I synthesize data on New World emballonurid bats in the tribe Diclidurini as one of the first detailed studies of an adaptive radiation of mammals in the Neotropics. I begin by giving general background information on the biology of the family Emballonuridae. The primary objective of this study is to hypothesize the processes involved in the biotic diversification of New World emballonurid bats by inferring a robust phylogeny of the group using a molecular phylogenetic approach, estimating times of divergence based on molecular dating with fossil calibration points, examining the historical biogeography with the incorporation of both temporal and spatial information, and investigating patterns of evolution in morphology and behavior as inferred from the phylogeny.

B.K. Lim, Department of Natural History, Royal Ontario Museum, 100 Queen’s Park, Toronto, Ontario M5S 2C6, Canada; e-mail: [email protected]

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_17, © Springer-Verlag Berlin Heidelberg 2010

17.1.1

Emballonurid Bats

The family Emballonuridae is characterized by a tail that emerges mid-dorsally from the interfemoral membrane, which is the origin of its common name of sheath-tailed bats. The family is pantropical in distribution, and the New World emballonurids occur from Mexico through Central America into South America to southeastern Brazil, including the off-shore islands of Trinidad, Tobago, and Grenada (Koopman 1994). Most species are uncommonly encountered in Neotropical rainforests using traditional methods of capture such as mist nets set in the understory because they typically fly in or over the canopy. Consequently, New World emballonurid bats are typically poorly studied and incompletely sampled in terms of taxonomic and geographic coverage. However, this apparent rarity is associated with a sampling bias that may be partially corrected by supplemental surveying with novel methods such as flap trapping (Borissenko 1999; Lim 2009), acoustic monitoring (Jung et al. 2007), and systematic searching for roosts (Simmons and Voss 1998).

17.1.2

Taxonomy

There are 16 genera of emballonurid bats with 13 extant (eight in the New World and five in the Old World) and three extinct (all Old World) that are represented by 63 species with 52 extant (22 New World and 30 Old World) and 11 extinct (all Old World; McKenna and Bell 1997; Simmons 2005; Lim et al. 2010). Four previous phylogenies have been proposed for Emballonuridae including studies on cranial morphology (Barghoorn 1977), protein electrophoresis and immunology (Robbins and Sarich 1988), hyoid morphology (Griffiths and Smith 1991), and morphology

and behavior (Dunlop 1998). All of these studies were at the taxonomic rank of genus except for the species-level analysis of Dunlop (1998). However, the only taxonomic congruence among the topologies is the higher-level recognition of subfamilies (Emballonurinae and Taphozoinae). The lack of consensus in other parts of these trees was confounded by a combination of incomplete taxonomic sampling and poor resolution. A recent molecular phylogenetic analysis of DNA sequence variation supported this taxonomic classification (Lim et al. 2008). Although the New World emballonurid species were comprehensively surveyed, there were only exemplar samples of the two Old World tribes, which are still poorly represented by tissue collections.

17.2

Molecular Phylogenetic Analyses

The data set for New World emballonurid bats included 99 specimens representing all of the eight recognized genera and 21 of the 22 species (Simmons 2005; Lim et al. 2010). The only missing species is Saccopteryx antioquensis, which is endemic to the northern Andes of Colombia and known from only two specimens without tissue samples (Muñoz and Cuartas 2001). Outgroup taxa included nine specimens representing two genera of Old World emballonurids and four genera of other bat species (Lim et al. 2008). Loci from the four genomic components of mammalian transmission genetics were used to hypothesize the evolutionary history of New World emballonurid bats. Each of these genetic transmission pathways has different properties associated with effective population size, mutation rate, and recombination that should be conducive to recovering a robust estimate of phylogeny. The mitochondrial marker was the complete protein-coding gene cytochrome b (Cytb); the autosomal marker was intron 26 of the protein-coding gene Chd1 (found on chromosome 5 in humans); the Y sex chromosome marker was intron 7 of the protein-coding gene Dby; and the X sex chromosome marker was intron 18 of the protein-coding gene Usp9x (Lim et al. 2008). There were a total of 3,176 aligned basepairs (bp) including 1,140 bp of Cytb, 624 bp of Chd1, 750 bp of Dby, and 662 bp of Usp9x. The phylogenetic analyses of individual and combined nucleotide data sets incorporated both an explicit model of DNA evolution using a statistical Bayesian approach and a model-free methodology using a maximum parsimony approach as corroboration of topological robustness. Bayesian inference was implemented in the program MrBayes (Ronquist and Huelsenbeck 2003) and parsimony reconstruction was implemented in the program PAUP* (Swofford 2001) as outlined by Lim et al. (2008). 
Branch supports of the resultant trees were calculated by the posterior probability distribution in the Bayesian analysis and by 1,000 bootstrap replications in the parsimony analysis. The trees were compared for topological congruence using the Approximately Unbiased (AU) test (Shimodaira and Hasegawa 2001). Each data set was reciprocally constrained to the individual gene trees to determine if one was better than another.
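
The aligned lengths reported for the four loci can be converted into coordinate ranges within a concatenated alignment, which is how partitions are usually declared for a combined analysis. The following is a minimal sketch only: the concatenation order simply follows the order in which the loci are mentioned here and is an assumption, and the helper name `partition_ranges` is hypothetical.

```python
# Aligned lengths (bp) as reported in the text.
# The concatenation order is an assumption made for illustration.
LOCUS_LENGTHS = [
    ("Cytb", 1140),
    ("Chd1", 624),
    ("Dby", 750),
    ("Usp9x", 662),
]


def partition_ranges(lengths):
    """Return {locus: (start, end)} using 1-based inclusive coordinates."""
    ranges = {}
    start = 1
    for locus, length in lengths:
        end = start + length - 1
        ranges[locus] = (start, end)
        start = end + 1
    return ranges


RANGES = partition_ranges(LOCUS_LENGTHS)
TOTAL_BP = sum(length for _, length in LOCUS_LENGTHS)

if __name__ == "__main__":
    for locus, (start, end) in RANGES.items():
        print(f"{locus}: {start}-{end}")
    print("total aligned bp:", TOTAL_BP)
```

The summed length reproduces the 3,176 bp stated for the combined data set, which is a quick consistency check on the reported partition sizes.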

17.2.1

B.K. Lim

Tree Topology

Parsimony and Bayesian analyses of each of the individual data sets gave congruent topologies with high bootstrap proportions and posterior probabilities for monophyletic clades representing the currently recognized genera and species of New World emballonurid bats (Fig. 17.1; Lim et al. 2008). However, the mitochondrial

Fig. 17.1 Phylogenetic tree from a Bayesian analysis of combined DNA sequences of three nuclear genes for New World emballonurid bats, tribe Diclidurini (Lim et al. 2008). The first number along the branch is the Bayesian posterior probability percentage, and the second number is the bootstrap percentage from a parsimony analysis. Numbers in parentheses are the corresponding branch-support values from a phylogenetic analysis after the removal of the outgroup taxon Nycteris javanicus, which was missing data for two of the genes. Intrageneric support values are the same for both analyses and branches with an asterisk (*) have 100% support. Peropteryx macrotis has two divergent populations from Central America (CA) and South America (SA)

17 Adaptive Radiation of Neotropical Emballonurid Bats: Molecular Phylogenetics

gene had significantly faster rates of nucleotide substitution, higher levels of homoplasy, and a greater degree of saturation of transitions than any of the three nuclear genes. These factors contributed to the loss of phylogenetic signal at deeper branches of the cytochrome b tree including the monophyly of the New World emballonurids. In contrast, there was better resolution and branch support for the more slowly evolving nuclear introns. However, the intergeneric relationships within the two subtribes were poorly resolved and supported by only a few nucleotide changes. This suggests a hard polytomy resulting from a lack of phylogenetic signal in each of the different genetic transmission pathways because of rapid speciation as opposed to a soft polytomy due to conflicting phylogenetic signal. Based on topological congruence, linear accumulation of substitutions, and high consistency index, the three nuclear genes were combined to lessen the effects of random sequence errors among nucleotide sites and ensure the recovery of phylogenetic signal from a robust species tree. A monophyletic New World clade was recovered in the individual and combined nuclear data sets indicating a single origin of emballonurid bats in the Neotropics (Fig. 17.1). Similarly, there was a basal split in the New World tribe Diclidurini that was congruent and well supported in the nuclear trees.
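
Saturation of transitions is usually examined by counting transitions and transversions between pairs of aligned sequences and comparing the counts with overall divergence. As a minimal sketch of the counting step only (the function name `count_ts_tv` and both toy sequences are invented for illustration and are not data from this study):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq1, seq2):
    """Count transitions and transversions between two aligned sequences.

    Sites containing gaps or ambiguity codes are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        # A change within purines (A<->G) or within pyrimidines (C<->T)
        # is a transition; any purine<->pyrimidine change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy sequences, invented for illustration only.
TOY_A = "ATGACCAACATTCG-A"
TOY_B = "ATGGCCAATATACGTA"
TS, TV = count_ts_tv(TOY_A, TOY_B)

if __name__ == "__main__":
    print("transitions:", TS, "transversions:", TV)
```

When transitions stop accumulating with increasing divergence while transversions continue to rise, the transition signal is considered saturated, which is the pattern described for cytochrome b at the deeper branches.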

17.3

Divergence Times

The combined nuclear data set for the tribe Diclidurini was used in a Bayesian relaxed clock approach to approximate the times of divergence (Thorne and Kishino 2002). Two fossil constraints were used as calibration points. The first was a minimum age of 13 million years ago (mya) for the split of Cyttarops and Diclidurus, based on the only pre-Pleistocene record of an extant New World emballonurid genus (Czaplewski 1997). The second was a maximum age of 30 mya for the split of the Old and New World emballonurids, based on a molecular dating analysis with fossil calibrations for all families of bats (Teeling et al. 2005). The basal split in the New World emballonurids occurred in the Late Oligocene approximately 27 mya. Six of the eight currently recognized genera diversified relatively rapidly in the Early Miocene 19.4–18.0 mya, and most intrageneric differentiation (16 of 21 species) occurred before the Pliocene 5 mya (Fig. 17.2; Lim 2007).
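
The epoch names attached to these dates can be cross-checked against epoch boundaries. The sketch below assumes approximate boundary ages taken from the International Chronostratigraphic Chart; those values are not given in this chapter, and the function name `epoch_of` is hypothetical.

```python
# Approximate epoch boundaries in millions of years ago: (name, older, younger).
# Assumed from the International Chronostratigraphic Chart, not from this chapter.
EPOCHS = [
    ("Eocene", 56.0, 33.9),
    ("Oligocene", 33.9, 23.03),
    ("Miocene", 23.03, 5.333),
    ("Pliocene", 5.333, 2.58),
    ("Pleistocene", 2.58, 0.0117),
]


def epoch_of(age_mya):
    """Return the epoch whose interval contains the given age."""
    for name, older, younger in EPOCHS:
        if younger <= age_mya < older:
            return name
    raise ValueError(f"age {age_mya} mya is outside the tabulated epochs")


if __name__ == "__main__":
    # Dates quoted in the text: basal split, generic diversification,
    # and the minimum-age fossil calibration.
    for age in (27.0, 19.4, 18.0, 13.0):
        print(age, "mya ->", epoch_of(age))
```

Under these assumed boundaries, the 27 mya basal split falls in the Oligocene and the 19.4–18.0 mya generic diversification falls in the Miocene, in agreement with the epochs named above.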

17.4

Historical Biogeography

Character optimization (Farris 1970) of distributional areas onto the phylogeny for the superfamily Emballonuroidea indicates that the ancestor of New World emballonurid bats has its origins in Africa (Fig. 17.3; Lim 2007). This biogeographic scenario was previously suggested from phylogenetic studies of interfamilial relationships of

Fig. 17.2 Molecular dating based on a relaxed clock Bayesian analysis with fossil calibrations of New World emballonurid bats (Lim 2007). Nodes are labeled with divergence time estimates (millions of years ago) and standard deviations. Intergeneric and most intrageneric diversification occurred in the Miocene (shaded). Peropteryx macrotis has two divergent populations from Central America (CA) and South America (SA)

bats (Eick et al. 2005; Teeling et al. 2005). The paleoenvironment during the Early Oligocene was drier than today with more open habitats such as woodlands and savannahs as suggested by the prevalence of large hypsodont mammals in the fossil record (Flynn and Wyss 1998). Colonization of South America by trans-Atlantic dispersal and subsequent speciation in allopatry has been reported for three other groups of placental mammals based on fossil records from the Oligocene including molossid bats (Legendre 1984), caviomorph rodents (Wyss et al. 1993), and platyrrhine primates (Takai et al. 2000). These range expansions probably occurred earlier in the Eocene (Poux et al. 2006). The phylogenies of each of the eight genera of New World emballonurid bats were incorporated in an historical biogeographic analysis using the algorithm Phylogenetic Analysis for Comparing Trees (PACT; Wojcicki and Brooks 2005). In constructing the area cladogram, temporal information from the molecular dating

Fig. 17.3 Phylogenetic tree for the superfamily Emballonuroidea with the ancestral areas mapped onto each node (AF Africa, EU Europe, NA North America, SA South America) following Lim (2007). Lineage splits, other than the extant New World emballonurids (tribe Diclidurini), are based on the minimum age of the fossil record (black bars). The basal divergence at 52 million years ago (mya) of the families Nycteridae and Emballonuridae is the molecular approximation by Teeling et al. (2005). Extinct taxa are indicated by an asterisk (*)

analysis (Lim 2007) was also used in conjunction with spatial information based on the current distribution of each species (Table 17.1). There were nine biogeographic areas identified in Central and South America for New World emballonurids (Fig. 17.4). The final area cladogram identified the Northern Amazon as the

Table 17.1 Biogeographic areas identified for species of New World emballonurid bats based on current species distributions (Lim 2008)

Species                                Biogeographic area
                                       A B C D E F G H I
Balantiopteryx infusca                     C
Balantiopteryx io                        B
Balantiopteryx plicata                 A
Centronycteris centralis                 B C D       H
Centronycteris maximiliani                       F G   I
Cormura brevirostris                     B   D E F G H
Cyttarops alecto                         B       F G
Diclidurus albus                       A B   D E F G   I
Diclidurus ingens                              E F G
Diclidurus isabellus                             F
Diclidurus scutatus                              F G   I
Peropteryx kappleri                      B C D E F G H I
Peropteryx leucoptera                            F G H I
Peropteryx macrotis (Central America)  A B
Peropteryx macrotis (South America)          D E F G H I
Peropteryx pallidoptera                          F G
Peropteryx trinitatis                          E F
Rhynchonycteris naso                   A B C D E F G H I
Saccopteryx antioquensis                     D
Saccopteryx bilineata                  A B C D E F G H I
Saccopteryx canescens                        D E F G H
Saccopteryx gymnura                              F G
Saccopteryx leptura                    A B C D E F G H I

A = Pacific versant of Central America; B = Atlantic versant of Central America; C = Choco region of northwestern South America; D = northern Andes and valleys of Colombia; E = north coast of Venezuela and offshore islands; F = north of the Amazon River; G = south of the Amazon River; H = eastern slope of the Andes in the western Amazon basin; and I = southeastern South America (Fig. 17.4)
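
The distributions in Table 17.1 can be held as a simple mapping from taxon to area codes, which makes it easy to tally how many taxa occur in each biogeographic area. This is a minimal sketch: the area codes are transcribed from the table, while the variable and function names are hypothetical, and the tally is only a descriptive summary rather than part of the PACT analysis.

```python
from collections import Counter

# Area codes per taxon, transcribed from Table 17.1.
DISTRIBUTIONS = {
    "Balantiopteryx infusca": "C",
    "Balantiopteryx io": "B",
    "Balantiopteryx plicata": "A",
    "Centronycteris centralis": "BCDH",
    "Centronycteris maximiliani": "FGI",
    "Cormura brevirostris": "BDEFGH",
    "Cyttarops alecto": "BFG",
    "Diclidurus albus": "ABDEFGI",
    "Diclidurus ingens": "EFG",
    "Diclidurus isabellus": "F",
    "Diclidurus scutatus": "FGI",
    "Peropteryx kappleri": "BCDEFGHI",
    "Peropteryx leucoptera": "FGHI",
    "Peropteryx macrotis (Central America)": "AB",
    "Peropteryx macrotis (South America)": "DEFGHI",
    "Peropteryx pallidoptera": "FG",
    "Peropteryx trinitatis": "EF",
    "Rhynchonycteris naso": "ABCDEFGHI",
    "Saccopteryx antioquensis": "D",
    "Saccopteryx bilineata": "ABCDEFGHI",
    "Saccopteryx canescens": "DEFGH",
    "Saccopteryx gymnura": "FG",
    "Saccopteryx leptura": "ABCDEFGHI",
}


def taxa_per_area(distributions):
    """Count how many taxa occur in each area (one letter per area)."""
    counts = Counter()
    for areas in distributions.values():
        counts.update(areas)
    return counts


AREA_COUNTS = taxa_per_area(DISTRIBUTIONS)
RICHEST_AREA, RICHEST_COUNT = AREA_COUNTS.most_common(1)[0]

if __name__ == "__main__":
    for area in sorted(AREA_COUNTS):
        print(area, AREA_COUNTS[area])
```

In this tally, area F (north of the Amazon River) holds the largest number of taxa in the table.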

ancestral area for the basal node and for most internal nodes based on character optimization (Fig. 17.5). This indicates that most lineage splits were within-area speciation events. However, there were three range expansions from the Northern Amazon followed by vicariant contractions including (1) a peripheral isolation in the Pacific slope of northwestern South America and subsequent colonization of Proto-Central America during the Middle Miocene; (2) colonization of northern Colombia and vicariant isolation after the uplift of the Andes during the Late Miocene; and (3) overland dispersal into Central America during the Pleistocene after the establishment of the Panamanian land bridge connection, followed by extinction in the intervening area of the northern Andes in Colombia that resulted in allopatric speciation (Lim 2008). As is the case for most species of New World emballonurid bats, widely distributed species typically are not conducive to recovering biogeographic patterns. However, the optimization of the Northern Amazon at most nodes of the area cladogram indicates repeated within-area speciation events. Tectonic uplifting of

Fig. 17.4 Map of the nine biogeographical areas in Central America and South America that were identified based on current species distributions in Table 17.1 (Lim 2008): (A) Pacific versant; (B) Atlantic versant; (C) Choco; (D) Northern Andes; (E) North Coast; (F) Northern Amazon; (G) Southern Amazon; (H) Western Amazon; (I) Southeastern South America

the northern Andes (Hoorn et al. 1995) combined with fluctuations in temperature and sea levels (Haq et al. 1987; Miller et al. 2005), and changes in vegetation (Janis 1993) contributed to a heterogeneous paleoenvironment in South America during the Miocene (Lundberg et al. 1998). This scenario is similar to the taxon-pulse hypothesis of biotic diversification with recurring adaptive shifts over time to different habitats centered on a stable core area (Erwin 1979, 1981). For New World emballonurid bats, there were repeated episodes of range expansions and contractions from a stable core area such as the ancient Guiana Shield of the Northern Amazon. Mapping the area cladogram (Fig. 17.5) onto the chronogram (Fig. 17.3) suggests that other than an earlier colonization in the Miocene that was associated with the genus Balantiopteryx (Lim 2008; Lim et al. 2004), range expansion from South America into Central America probably did not occur until later in the Pliocene. Although Centronycteris split vicariantly in the Late Miocene with Centronycteris maximiliani speciating in the Northern Amazon and Centronycteris centralis in the Northern Andes, C. centralis did not colonize Central America until a later date. Similarly, Saccopteryx bilineata and Saccopteryx leptura split during the Late Miocene in the Northern Amazon before both species became widely distributed throughout the continental mainland. Even more recently, Diclidurus albus and Diclidurus ingens split during the Early Pleistocene in the Northern Amazon before D. albus dispersed into Central America. Although the topology forms a trichotomy with Peropteryx kappleri, the allopatrically distributed Central and South American populations of Peropteryx macrotis split in the Late Pleistocene. Three other

Fig. 17.5 Final area cladogram from an historical biogeographic analysis of New World emballonurid bats (Lim 2008). Ancestral areas at nodes are derived from character optimization. Three nodes marked with roman numerals in parentheses identify biotic expansions followed by vicariant isolation. All other nodes are within-area taxon pulses of biotic diversification in the Northern Amazon (F)

species (Cormura brevirostris, Cyttarops alecto, and Rhynchonycteris naso) are also widely distributed but their range expansions cannot be discerned from the area cladogram. Likewise, patterns of range expansion from the Northern Amazon southwards are not explicitly discernible because no speciation events involve the Southern Amazon. However, C. maximiliani, S. bilineata, S. leptura, Saccopteryx canescens, Saccopteryx gymnura, Diclidurus scutatus, and Peropteryx pallidoptera dispersed from the Northern to the Southern Amazon sometime after they speciated in the Late Miocene. This timing coincides with the uplifting of the eastern

cordillera of the Andes, which created the Amazon River and primary drainage of South America east toward the Atlantic Ocean as we know it today (Hoorn et al. 1995).

17.5

Evolutionary Patterns

17.5.1

Morphological Data

The most comprehensive morphological study of the family Emballonuridae incorporated 141 external, cranial, and skeletal characters from 43 of 52 extant species including 18 of 22 New World species (Dunlop 1998; Lim and Dunlop 2008). However, the phylogeny was poorly supported with the exception of the genera within the tribe Diclidurini. Assessments of topological congruence using the KH (Kishino and Hasegawa 1989), Wilcoxon signed ranks (Templeton 1983), and winning sites (Prager and Wilson 1988) tests indicated that the morphological data set constrained to each of the molecular trees was significantly worse than its own tree (p < 0.02), except for Usp9x (p < 0.07). Similarly, all three of the molecular data sets were significantly worse (p < 0.01) when constrained to the morphological tree as opposed to their own tree. In terms of character congruence, the incongruence length difference test (Farris et al. 1995) identified the morphological data set as significantly different from the molecular data sets. Taxonomic congruence summarizes these topological and character differences because the three nuclear gene trees corroborate the split of the New World taxa into the subtribes Diclidurina and Saccopterygina, which are clades not recovered by the morphological tree. Except for a collapse to a polytomy at the basal node of the subtribe Saccopterygina in the parsimony tree, combining the morphological and molecular data sets resulted in the same topology as the nuclear tree for both Bayesian and parsimony analyses. This indicates that the morphological data set contains extensive homoplasy and very little phylogenetic signal.
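
Of the three topology tests named above, the winning sites test is the simplest to illustrate: each character "votes" for whichever tree explains it in fewer steps, and the split of votes is assessed with a two-sided sign test. The sketch below is a generic version of that idea under stated assumptions; the function name and the per-character step counts are invented for illustration and are not values from the analyses reported here.

```python
from math import comb


def winning_sites_test(steps_tree1, steps_tree2):
    """Two-sided sign test on per-character step counts for two trees.

    Returns (wins for tree 1, wins for tree 2, p-value). A character
    "wins" for the tree on which it requires fewer steps; ties are
    uninformative and are dropped.
    """
    if len(steps_tree1) != len(steps_tree2):
        raise ValueError("both trees need a step count for every character")
    wins1 = sum(1 for a, b in zip(steps_tree1, steps_tree2) if a < b)
    wins2 = sum(1 for a, b in zip(steps_tree1, steps_tree2) if b < a)
    n = wins1 + wins2
    if n == 0:
        return wins1, wins2, 1.0
    k = min(wins1, wins2)
    # Probability of a split at least this uneven under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins1, wins2, min(1.0, 2.0 * tail)


# Hypothetical step counts for ten characters on two competing trees.
STEPS_OWN_TREE = [1, 1, 2, 1, 1, 2, 1, 1, 3, 1]
STEPS_CONSTRAINED = [2, 2, 2, 2, 2, 3, 2, 1, 4, 2]
WINS_OWN, WINS_CONSTRAINED, P_VALUE = winning_sites_test(
    STEPS_OWN_TREE, STEPS_CONSTRAINED
)

if __name__ == "__main__":
    print(WINS_OWN, WINS_CONSTRAINED, P_VALUE)
```

A small p-value means the characters consistently favor one topology, which is the sense in which a data set is "significantly worse" when constrained to another tree.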

17.5.2

Ecological Data

The most comprehensive ecological study incorporated 28 characters primarily associated with roosting and foraging behavior; however, data for most of the species were unknown (Dunlop 1998; Lim and Dunlop 2008). A phylogenetic analysis of this incomplete dataset resulted in a largely unresolved topology. A combined analysis of morphological and behavioral characters resulted in a slightly better but still poorly resolved consensus tree of 509 equally parsimonious trees. The only higher level relationships recovered were the subfamilies Taphozoinae and Emballonurinae.

Although there is a lack of resolving power because of high levels of homoplasy and large amounts of missing data, characters can be optimized onto the robust molecular phylogeny to hypothesize evolutionary patterns in morphology and behavior. Three examples are detailed herein that are associated with the diversification of genera of New World emballonurid bats.

17.5.3

Wing Sacs

Species of Balantiopteryx, Cormura, Peropteryx, and Saccopteryx have a sac-like structure in the propatagium between the shoulder and forearm that is uniquely structured in each of the genera in terms of location in the wing membrane, direction of the opening, and size. However, only the wing sac in S. bilineata has been thoroughly studied. It is well developed in males and acts as a storage container without glandular cells (Scully et al. 2000) for bodily secretions used in a salting behavior to mark females in the harem (Voigt and von Helversen 1999). Based on both parsimony and likelihood methods of ancestral state reconstruction as implemented in Mesquite (Maddison and Maddison 2006), the wing sac character states map onto the molecular phylogeny as independent origins (Fig. 17.6; Lim and Dunlop 2008). An alternative hypothesis of a single origin of wing sacs for New World emballonurid bats is less parsimonious with two additional losses and it is also not supported by the likelihood method of ancestral state reconstruction, which predicts no wing sac at the base of this clade. However, because of multiple occurrences of sac-like structures in different genera, there is a possibility of a phylogenetic predisposition (Soltis et al. 1995) whereby the genetic components underlying the structure originated once on the tree (Lim and Dunlop 2008).
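
Parsimony reconstruction of a discrete character such as wing sac presence can be illustrated with the Fitch algorithm for unordered states, which returns the candidate ancestral states at the root and the minimum number of changes on the tree. This is a generic textbook sketch, not the Mesquite implementation used in the study; the five-tip topology, tip names, and state assignments are hypothetical.

```python
def fitch(tree, tip_states):
    """Fitch downpass on a rooted binary tree.

    `tree` is either a tip name (str) or a 2-tuple of subtrees.
    Returns (set of most-parsimonious states at this node, changes below it).
    """
    if isinstance(tree, str):
        return {tip_states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, tip_states)
    right_set, right_changes = fitch(right, tip_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical topology and states, for illustration only.
TOY_TREE = ((("t1", "t2"), "t3"), ("t4", "t5"))
TOY_STATES = {
    "t1": "present",
    "t2": "absent",
    "t3": "absent",
    "t4": "present",
    "t5": "absent",
}
ROOT_STATES, N_CHANGES = fitch(TOY_TREE, TOY_STATES)

if __name__ == "__main__":
    print(ROOT_STATES, N_CHANGES)
```

In this toy case the root is reconstructed as lacking the structure and two changes are required, the same kind of result as the independent origins inferred for wing sacs. A single-origin scenario would be scored by counting the extra losses it implies on the same tree.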

17.5.4

Roosts and Pelage

Most species of emballonurids and many bats in general have brown fur but some genera have atypical appearances including paler pelage that is white, as in the ghost bat Diclidurus, gray as in the smoky bat Cyttarops, or a pelage pattern with two dorsal pale lines as in Rhynchonycteris and Saccopteryx. In terms of primary roosting sites, most emballonurid bats occupy relatively sheltered areas such as caves and crevices in rocky outcrops, or in man-made structures such as tombs and buildings. Some species are primarily found in other forms of concealed roosts including tree hollows and rotted-out logs. A few genera, however, predominately roost in more exposed situations including in leaves at the tops of palm trees (Cyttarops and Diclidurus), or on sloping tree trunks overhanging rivers (R. naso), vertical tree trunks within forest (S. leptura), and within exposed cavities on the outside of buttressed roots of trees (S. bilineata). Although Saccopteryx is also known to roost in other places such as tree hollows, caves, and man-made

Fig. 17.6 Chronogram of New World emballonurid bats with the primary characters defining the basal diversification during the Late Oligocene and Early Miocene. Echolocation call design: C1 – frequency high (41.3–98.2 kHz), call duration low (4.8–7.6 ms), and pulse interval low (58–119 ms); C2 – frequency low (23.5–42.6 kHz), call duration high (8.1–9.7 ms), and pulse interval high (100–317 ms). Ear morphology: E1 – medial edge of ears arise from between the eyes; E2 – medial edge of ears are connected between the eyes; E3 – medial edge of ears arise above the inner portion of the eyes; E4 – medial edge of ears arise above the middle portion of the eyes; E5 – medial edge of ears arise above the outer portion of the eyes. Pelage pattern: P1 – fur typically a uniformly medium or dark brown color; P2 – fur has 2 wavy pale lines on the dorsum; P3 – fur is pale gray; and P4 – fur is brownish white or white. Roost site: R1 – lives in shelter area; R2 – lives in exposed areas on tree trunks; and R3 – lives in exposed areas under palm leaves. Wing sacs: W1 – no wing sacs; W2 – large-sized wing sacs located along the forearm of the propatagia; W3 – medium-sized wing sacs located in the middle of the propatagia; W4 – small-sized and conspicuous wing sacs located near the leading edge of the propatagia; and W5 – small-sized and inconspicuous wing sacs located near the leading edge of the propatagia

structures, they regularly use the exposed surfaces of trees, unlike other genera that occupy sheltered areas (Bradbury and Emmons 1974; Bradbury and Vehrencamp 1976). Pelage and roosting behavior map consistently and are correlated on the phylogeny suggesting a phylogenetic basis to these character systems and an association of camouflage for genera that roost on exposed substrate such as tree trunks and leaves at the tops of palm trees (Fig. 17.6; Lim and Dunlop 2008).

17.5.5

Ear Morphology and Echolocation

Although bats are not the only mammals that echolocate, they have the most sophisticated system of high frequency emission, sound reception, and neural processing for navigating and foraging in the dark. Ear shape and position are important factors for receiving returning echoes. The position of the medial edge of the ear in relation to the eye dictates the degree of forward or lateral orientation of the ear on the head of the bat. The direction of the ear may in turn influence the ecological adaptation of flying behavior. The more basal nodes for extant bats are equivocal for ear position because of polymorphic states in most families and the lack of comprehensive intrafamilial phylogenies (Lim and Dunlop 2008). Nonetheless, under accelerated character transformation, one possible ancestral state reconstruction places ears directed more forward, with the medial edge located between the eyes, at the base of the New World emballonurid tree (Fig. 17.6). More laterally directed ears, as seen in the subtribe Saccopterygina, would be considered derived states.

New World emballonurid bats are all aerial insectivores with an echolocation search call consisting of a central quasi-constant frequency band with short frequency modulated components and multiharmonics with most of the energy in the second harmonic. Peak echolocation frequency increases as flying distance to forest clutter decreases (a negative correlation), whereas pulse interval and call duration decrease as distance to clutter decreases (a positive correlation) (Jung et al. 2007). These acoustic parameters map consistently on the phylogeny, suggesting that foraging habitat and echolocation call design reflect phylogenetic relationships. Species within the subtribe Saccopterygina (Centronycteris, Rhynchonycteris, and Saccopteryx) fly in more cluttered environments within the forest or near the edge of forest and have higher frequencies, shorter pulse intervals, and shorter call durations (Fig. 17.6). In contrast, species within the subtribe Diclidurina (Balantiopteryx, Cormura, Cyttarops, Diclidurus, and Peropteryx) fly in less cluttered environments in open spaces near the forest or above the canopy and have lower frequencies, longer pulse intervals, and longer call durations. If ear positioning is linked to echolocation parameters and flying behavior, foraging near forest clutter would be considered a derived ecological adaptation for Saccopterygina because forward directed ears are considered ancestral for New World emballonurids and are also found in Diclidurina.

17.6

Conclusions

The most recent common ancestor of New World emballonurid bats colonized an insular South America from Africa during the Early Oligocene 30 million years ago when savannah was more prevalent than today. A basal split occurred approximately 27 million years ago in the Northern Amazon with the speciation of the subtribes Saccopterygina in forested habitats and Diclidurina in savannah. There

was relative stasis until a rapid differentiation of genera 19.4–18.0 million years ago during the Early Miocene when marine incursions from the Caribbean into the northwestern Amazon region resulted in heterogeneous environments in a forest-savannah mosaic. The uplands of the Guiana Shield acted as a stable core area during range contractions. Subsequent range expansions back into favorable lowland habitats completed episodes of taxon pulses of biotic diversification. These changing paleoenvironments in the Early Miocene resulted in an adaptive radiation occurring in forested habitats that gave rise to the differentiation of the genera in Saccopterygina. The association of ear morphology and echolocation call design suitable for foraging within cluttered environments supports a phylogenetic basis to the evolution of these complex character systems. A similar radiation occurred in savannah habitats giving rise to the diversification of genera in Diclidurina that were adapted to foraging in more open environments. More detailed study of morphology, ecology, and echolocation of emballonurids at the species-level in a phylogenetic context will give further insights into the remarkable evolutionary history and adaptive radiation of bats.

Acknowledgments I thank Mark Engstrom for critical comments throughout the formulation of the ideas presented herein. Primary funding for fieldwork and research was secured through the generous support of the Royal Ontario Museum Governors and Department of Natural History.

References

Barghoorn SF (1977) New material of Vespertiliavus Schlosser (Mammalia, Chiroptera) and suggested relationships of emballonurid bats based on cranial morphology. Am Mus Novit 2618:1–29
Borissenko AV (1999) A mobile trap for capturing bats in flight. Plecotus et al 2:10–19
Bradbury JW, Emmons LH (1974) Social organization of some Trinidad bats: 1. Emballonuridae. Z Tierpsychol 36:137–183
Bradbury JW, Vehrencamp SL (1976) Social organization and foraging in emballonurid bats. Behav Ecol Sociobiol 1:337–381
Czaplewski NJ (1997) Chiroptera. In: Kay RF, Madden RH, Cifelli RL, Flynn JJ (eds) Vertebrate paleontology in the neotropics: the Miocene fauna of La Venta, Colombia. Smithsonian Institution Press, Washington, DC, pp 410–431
Dunlop JM (1998) The evolution of behavior and ecology in Emballonuridae (Chiroptera). PhD dissertation, York University, North York, Ontario
Eick GN, Jacobs DS, Matthee CA (2005) A nuclear DNA phylogenetic perspective on the evolution of echolocation and historical biogeography of extant bats (Chiroptera). Mol Biol Evol 22:1869–1886
Erwin TL (1979) Thoughts on the evolutionary history of ground beetles: hypotheses generated from comparative faunal analyses of lowland forest sites in temperate and tropical regions. In: Erwin TL, Ball GE, Whitehead DR (eds) Carabid beetles: their evolution, natural history, and classification. Dr W. Junk, The Hague, pp 539–592
Erwin TL (1981) Taxon pulses, vicariance, and dispersal: an evolutionary synthesis illustrated by carabid beetles. In: Nelson G, Rosen DE (eds) Vicariance biogeography: a critique. Columbia University Press, New York, pp 159–196
Farris JS (1970) Methods for computing Wagner trees. Syst Zool 19:83–92
Farris JS, Kallersjo M, Kluge AG, Bult C (1995) Testing significance of incongruence. Cladistics 10:315–319
Flynn JJ, Wyss AR (1998) Recent advances in South American mammalian paleontology. Trends Ecol Evol 13:449–454
Griffiths TA, Smith AL (1991) Systematics of emballonuroid bats (Chiroptera: Emballonuridae and Rhinopomatidae) based on hyoid morphology. Bull Am Mus Nat Hist 206:62–83
Haq BU, Hardenbol J, Vail PR (1987) Chronology of fluctuating sea levels since the Triassic. Science 235:1156–1167
Hoorn C, Guerrero J, Sarmiento GA, Lorente MA (1995) Andean tectonics as a cause for changing drainage patterns in Miocene northern South America. Geology 23:237–240
Janis CM (1993) Tertiary mammal evolution in the context of changing climates, vegetation, and tectonic events. Ann Rev Ecol Syst 24:467–500
Jung K, Kalko EKV, von Helversen O (2007) Echolocation calls in Central American emballonurid bats: signal design and call frequency alternation. J Zool 212:125–137
Kishino H, Hasegawa M (1989) Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J Mol Evol 29:170–179
Koopman KF (1994) Chiroptera: systematics Part 60 of Mammalia, vol 8, Handbook of Zoology. Walter de Gruyter, New York
Legendre S (1984) Étude odontologique des représentants actuels du groupe Tadarida (Chiroptera, Molossidae): implications phylogéniques, systématiques et zoogéographiques. Rev Suisse Zool 91:399–442
Lim BK (2007) Divergence times and origin of neotropical sheath-tailed bats (tribe Diclidurini) in South America. Mol Phylogenet Evol 45:777–791
Lim BK (2008) Historical biogeography of New World emballonurid bats (tribe Diclidurini): taxon pulse diversification. J Biogeogr 35:1385–1401
Lim BK (2009) Environmental assessment at the Bakhuis Bauxite Concession: small-sized mammal diversity and abundance in the lowland humid forests of Suriname. Open Biol J 2:42–57
Lim BK, Dunlop JM (2008) Evolutionary patterns of morphology and behavior as inferred from a molecular phylogeny of New World emballonurid bats (tribe Diclidurini). J Mammal Evol 15:79–121
Lim BK, Engstrom MD, Simmons NB, Dunlop JM (2004) Phylogenetics and biogeography of least sac-winged bats (Balantiopteryx) based on morphological and molecular data. Mamm Biol 69:225–237
Lim BK, Engstrom MD, Bickham JW, Patton JC (2008) Molecular phylogeny of New World emballonurid bats (Tribe Diclidurini) based on loci from the four genetic transmission systems in mammals. Biol J Linn Soc 93:189–209
Lim BK, Engstrom MD, Reid FA, Simmons NB, Voss RS, Fleck DW (2010) A new species of Peropteryx (Chiroptera: Emballonuridae) from western Amazonia with comments on phylogenetic relationships within the genus. Am Mus Novit 3686:1–20
Lundberg JG, Marshall LG, Guerrero J, Horton B, Malabarba MCSL, Wesselingh F (1998) The stage for Neotropical fish diversification: a history of tropical South American rivers. In: Malabarba LR, Reis RE, Vari RP, Lucena ZMS, Lucena CAS (eds) Phylogeny and classification of Neotropical fishes. Edipucrs, Porto Alegre, Brazil, pp 13–48
Maddison WP, Maddison DR (2006) Mesquite: a modular system for evolutionary analysis, version 1.12. http://mesquiteproject.org. Accessed 23 Sept 2006
McKenna MC, Bell SK (1997) Classification of mammals above the species level. Columbia University Press, New York
Miller KG, Kominz MA, Browning JV, Wright JD, Mountain GS, Katz ME, Sugarman PJ, Cramer BS, Christie-Blick N, Pekar SF (2005) The Phanerozoic record of global sea-level change. Science 310:1293–1298
Muñoz J, Cuartas CA (2001) Saccopteryx antioquensis n. sp. (Chiroptera: Emballonuridae) del noroeste de Colombia. Actual Biol 23:53–61
Poux C, Chevret P, Huchon D, de Jong WW, Douzery EJP (2006) Arrival and diversification of caviomorph rodents and platyrrhine primates in South America. Syst Biol 55:228–244
Prager EM, Wilson AC (1988) Ancient origin of lactalbumin from lysozyme: analysis of DNA and amino acid sequences. J Mol Evol 27:326–335
Robbins LW, Sarich VM (1988) Evolutionary relationships in the family Emballonuridae (Chiroptera). J Mammal 69:1–13
Ronquist F, Huelsenbeck JP (2003) MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574
Scully WMR, Fenton MB, Saleuddin ASM (2000) A histological examination of the holding sacs and glandular scent organs of some bat species (Emballonuridae, Hipposideridae, Phyllostomidae, Vespertilionidae, and Molossidae). Can J Zool 78:613–623
Shimodaira H, Hasegawa M (2001) CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics 17:1246–1247
Simmons NB (2005) Order Chiroptera. In: Wilson DE, Reeder DM (eds) Mammal species of the world: a taxonomic and geographic reference, 3rd edn. Johns Hopkins University Press, Baltimore, pp 312–529
Simmons NB, Voss RS (1998) The mammals of Paracou, French Guiana: a neotropical lowland rainforest fauna. Part 1, bats. Bull Am Mus Nat Hist 237:1–219
Soltis DE, Soltis PS, Morgan DR, Swensen SM, Mullin BC, Dowd JM, Martin PG (1995) Chloroplast gene sequence data suggest a single origin of the predisposition for symbiotic nitrogen fixation in angiosperms. Proc Natl Acad Sci USA 92:2647–2651
Swofford DL (2001) PAUP*: phylogenetic analysis using parsimony (*and other methods), version 4.0b10. Sinauer Associates, Sunderland, MA
Takai M, Anaya F, Shigehara N, Setoguchi T (2000) New fossil materials of the earliest New World monkey, Branisella boliviana, and the problem of platyrrhine origins. Am J Phys Anthropol 111:263–281
Teeling EC, Springer MS, Madsen O, Bates P, O’Brien SJ, Murphy WJ (2005) A molecular phylogeny for bats illuminates biogeography and the fossil record. Science 307:580–584
Templeton AR (1983) Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and the apes. Evolution 37:221–244
Thorne JL, Kishino H (2002) Divergence time and evolutionary rate estimation with multilocus data. Syst Biol 51:689–702
Voigt CC, von Helversen O (1999) Storage and display of odour by male Saccopteryx bilineata (Chiroptera, Emballonuridae). Behav Ecol Sociobiol 50:29–40
Wilson DE, Reeder DM (eds) (2005) Mammal species of the world: a taxonomic and geographic reference, 3rd edn. Baltimore, Johns Hopkins University Press
Wojcicki M, Brooks DR (2005) PACT: an efficient and powerful algorithm for generating area cladograms. J Biogeogr 32:755–774
Wyss AR, Flynn JJ, Norell MA, Swisher CC, Charrier R, Novacek MJ, McKenna MC (1993) South America’s earliest rodent and recognition of a new interval of mammalian evolution. Nature 365:434–437

Chapter 18

Trends in Rhizobial Evolution and Some Taxonomic Remarks

Julio C. Martínez-Romero, Ernesto Ormeño-Orrillo, Marco A. Rogel, Aline López-López, and Esperanza Martínez-Romero

Abstract Bacteria that establish nitrogen-fixing symbiosis in specialized plant structures belong to only three of over 100 bacterial phyla. Among these, rhizobial symbioses are the best known, and nodulation genes (nod) have been described in many species. nodA phylogenies revealed a larger diversity in Bradyrhizobium than in other genera and suggest that bradyrhizobial nod genes are the oldest, in agreement with the proposal that nod genes evolved in Bradyrhizobium (Plant Soil 161:11–20, 1994). In many cases, rhizobial symbiotic and housekeeping genes have different evolutionary histories, owing to the lateral transfer of symbiotic genes among bacteria. Misclassified Rhizobium strains were identified. To properly identify rhizobial species, we propose the use of fragments of the rpoB and dnaK genes which, according to probability analyses, reflect the behavior of the whole genes. On the basis of these analyses, several rhizobial species related to Agrobacterium tumefaciens may be reclassified to a genus other than Rhizobium.

18.1 Introduction

Legume plants are widespread and diverse, with a large number of species. They profit from symbiosis with nitrogen-fixing bacteria (collectively designated as rhizobia and comprising different, not closely related genera, such as Bradyrhizobium, Mesorhizobium, Azorhizobium, Sinorhizobium, Rhizobium, and others) that induce the formation of nodules on roots, and rarely on stems, and provide nitrogen that allows the plants to grow in nitrogen-poor soils. Rhizobia are used as inoculants in agriculture, a practice that has been in use for over a hundred years, substituting for fertilizers and saving millions of dollars in some cases (Hungria et al. 2000, 2005).

J.C. Martínez-Romero, E. Ormeño-Orrillo, M.A. Rogel, A. López-López, and E. Martínez-Romero
Centro de Ciencias Genómicas, UNAM, Av. Universidad, Cuernavaca, Morelos 62210, México
e-mail: [email protected]

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_18, © Springer-Verlag Berlin Heidelberg 2010


Rhizobial evolution and diversity (reviewed in Terefework et al. 2000; Wang and Martínez-Romero 2000; Sprent 2001; Sessitsch et al. 2002; Provorov and Vorobyov 2008; Martinez-Romero 2009) and molecular mechanisms mediating their interaction with legume hosts (Barnett and Fisher 2006; Jones et al. 2007) have been studied for a small proportion of legume-rhizobial symbioses (López-López et al. 2010). The coevolution of Rhizobium and legumes in symbiosis has been critically analyzed (Sprent 1997; Martinez-Romero 2009).

18.2 Nitrogen-Fixing Symbioses with Plants

[Fig. 18.1 phylum labels: Proteobacteria*, Cyanobacteria*, Actinobacteria*, Chlorobi, Spirochaetes, Firmicutes]

Fig. 18.1 Bacterial phyla. Names correspond to phyla containing nitrogen-fixing species. Asterisks (*) indicate phyla containing bacteria that establish symbiosis in specialized structures

In plants with nitrogen-fixing symbiosis, special structures are involved (Fig. 18.1), indicating a sort of “convergent evolution” and suggesting a need to contain (in specialized structures) large numbers of selected bacteria to provide enough nitrogen for plants and/or to confine, control, or protect the bacteria. Few bacterial genera, belonging to only three phyla (out of over 100 current bacterial phyla), are capable of forming these nitrogen-fixing symbioses with plants (Fig. 18.1). There are more phyla with nitrogen-fixing bacteria than with nodulating bacteria, suggesting that nodulating bacteria evolved from nitrogen-fixing bacteria. Other bacteria from the complex community found associated with plants, such as Azoarcus (Hurek and Reinhold-Hurek 2003) and Herbaspirillum (Roncato-Maccari et al. 2003), fix low levels of nitrogen not in nodules but as endophytes (inside plants); rhizobial nitrogen fixation may have started similarly, as low-level nitrogen fixation. Applied research with some of these plant-associated bacteria aims to achieve, with rice, corn, sugarcane, and potatoes, levels of nitrogen fixation similar to those obtained with the well-recognized nitrogen-fixing symbioses of plants.

Bacteria induce the formation of nodules on actinorhizal plants and on legumes (as well as on the nonlegume Parasponia), whereas cyanobacteria do not induce the coralloid roots of cycads (an older symbiosis than those of legumes and actinorhizal plants) and seemingly do not induce the specialized cavities of Azolla and Gunnera either; these structures are formed normally by the plants and are subsequently colonized by cyanobacteria. Rhizobia and actinobacteria become intracellular in nodules, as do cyanobacteria in Gunnera. Interestingly, in Casuarina glauca, an actinorhizal plant, a legume symbiotic gene (symRK) that is required for nodulation has been found, suggesting a common genetic basis for nodule formation in legumes and actinorhizal plants (Gherbi et al. 2008).

A landmark in symbiotic research in Rhizobium was the discovery of the nodule-inducing molecules (Lerouge et al. 1990), the Nod factors (produced by enzymes encoded by nod genes), which have a unique structure in biology, are active at nanomolar concentrations, and are capable of inducing nodules in the absence of bacteria (Dénarié et al. 1996; Relic et al. 1994). Great interest and much effort have been devoted to identifying nodulation factors in actinobacteria, but results have not been reported yet. Genetic approaches in the 1980s led to the discovery of nodulation mutants in Rhizobium, and nod genes were described then (Long et al. 1983; Kondorosi et al. 1984). With the exception of photosynthetic Bradyrhizobium nodulating some Aeschynomene species on stems (Giraud et al. 2007), all other rhizobial species use Nod factors to induce nodules on legume roots. Furthermore, the acquisition of nod genes by some nonsymbiotic bacteria enables them to form nodules (see later).

The nodABC genes constitute an operon in most rhizobia. Exceptions are Rhizobium etli biovar phaseoli, with nodA separated from nodBC (Vazquez et al. 1991), and Mesorhizobium loti, where nodB does not form an operon with nodA and nodC (Sullivan et al. 2002). The nodABC genes encode the enzymes that synthesize the core of the Nod factor: nodC encodes an N-acetylglucosaminyltransferase, nodB a chitooligosaccharide deacetylase, and nodA specifies the N-acylation of the aminosugar backbone by different fatty acids (Atkinson et al. 1994; Debellé et al. 1996a; Roche et al. 1996). Other nod gene products add chemical modifications to the Nod factor (Relic et al. 1994; Ferro et al. 2000), mediate its secretion (Evans and Downie 1986), provide precursors (Baev et al. 1991), or regulate nod gene expression (Mulligan and Long 1985; Kondorosi et al. 1991).

18.3 nod Gene Evolution

Where do nod genes originally come from? A hyaluronate synthase from Streptococcus (hyaluronic acid is a polymer of alternating N-acetylglucosamine and glucuronic acid) has sequence similarities to NodC, to DG42 from Xenopus, and to chitin synthases from yeast. Some bacterial xylanases (which catalyze the hydrolysis of linked xylose oligomeric and polymeric substrates) contain domains homologous to NodB proteins (Laurie et al. 1997). A Bacillus strain produces a molecule, seemingly structurally related to Nod factors, that stimulates plant proliferation (Lian et al. 2001). Interestingly, some plant mutants affecting rhizobial nodulation are defective in the mycorrhization process (Oldroyd et al. 2005), and it is suggested that a common signaling pathway exists for Nod factor perception and mycorrhizal symbiosis (Catoira et al. 2000; Gianinazzi-Pearson and Dénarié 1997). Mycorrhizal symbiosis occurs in around 80% of all plants and is considered as old as the first plants that evolved on Earth. The Nod factor may be considered as a very small chitin molecule that subsequently acquired other chemical modifications, some of them involved in protecting the molecule from plant chitinases (Staehelin et al. 1994). Mycorrhizal fungi contain chitin, so rhizobia may have mimicked the mycorrhizal symbiosis (Debellé et al. 1996b).

nod gene phylogenies have been reported in Bradyrhizobium, Rhizobium, Mesorhizobium, and Sinorhizobium (Moulin et al. 2004; Steenkamp et al. 2008; Stepkowski et al. 2007; Han et al. 2008; Rincon-Rosales et al. 2009). A correlation between host and nod genes has been recognized (Suominen et al. 2001), and Nod factor fucosylation and acetylation have been correlated to bacterial phylogenies and specificities (Moulin et al. 2004); bacteria with sulfate modifications are scattered in rhizobial phylogenies (Martínez et al. 1995). We constructed a phylogenetic tree with available reported nodA sequences (Fig. 18.2). There seems to be a larger diversity of nodA sequences in Bradyrhizobium than in β-Proteobacteria or Sinorhizobium. In 1994, we proposed the hypothesis that nod genes evolved in Bradyrhizobium and that they were later transferred to other genera such as Rhizobium (Martinez-Romero 1994). In Bradyrhizobium, an ancestral nod group has been identified from bacteria nodulating several diverse legumes (indicated in Fig. 18.2); supposedly, this group of legumes extended over many parts of the world during the Eocene after the origin of legumes north of the Tethys Sea (Steenkamp et al. 2008). Bradyrhizobium are the main nodule bacteria of tropical tree legumes (Qian et al. 2003; Moreira et al. 1998; Parker 2004; Ormeño-Orrillo et al. 2006), with a low degree of specificity, and tropical legumes are considered older than temperate legumes. We found 23 novel lineages of Bradyrhizobium in the rain forest of Los Tuxtlas in Veracruz, Mexico, and they exhibited low specificity (Ormeño-Orrillo submitted). Specificity is a characteristic of many temperate legumes and of few tropical legumes and may have been acquired later in bacteria (Perret et al. 2000; Young et al. 2003).

Most nodule-forming bacteria belong to the α-Proteobacteria and few to the β-Proteobacteria (Moulin et al. 2001; Chen et al. 2003). Lateral transfer of nod genes was considered to account for the existence of nodulation in the β-proteobacterial genera Burkholderia and Cupriavidus (Moulin et al. 2001; Amadou et al. 2008), as well as in Devosia (Rivas et al. 2002) and in Phyllobacterium (Valverde et al. 2005).

18.4 Different Evolutionary Histories of Chromosomal and Symbiotic Genes

In Rhizobium, Sinorhizobium, and in β-Proteobacteria, symbiotic genes, including nod and nif (nitrogen fixation) genes, are located on plasmids (Amadou et al. 2008) that may be transferred among species both in the laboratory and in nature.

Fig. 18.2 NodA gene phylogeny in different rhizobial genera [clade labels: Rhizobium/Sinorhizobium, Burkholderia/Cupriavidus, Azorhizobium, Mesorhizobium, Sinorhizobium, B. tuberum, M. nodulans, Bradyrhizobium]

In Mesorhizobium (except Mesorhizobium amorphae; Wang et al. 1999b), in Azorhizobium, in Methylobacterium, and in Bradyrhizobium, symbiotic genes are on the chromosome. Symbiotic islands have been found to be transferable among mesorhizobia in the environment (Sullivan et al. 1995; Sullivan and Ronson 1998; Nandasena et al. 2007). Evidence that transfer and recombination occur in nature is obtained by comparing housekeeping and nod gene phylogenies, which reveal different evolutionary histories for symbiotic and housekeeping genes (Haukka et al. 1998; Steenkamp et al. 2008). In the laboratory, plant pathogens such as Agrobacterium tumefaciens and opportunistic human pathogens such as Ochrobactrum may become fully symbiotic by acquiring symbiotic plasmids from Rhizobium tropici, albeit with reduced levels of nitrogen fixation (Martinez et al. 1987; Rogel et al. 2006). Two highly divergent lineages of R. tropici (types A and B) harbor very similar symbiotic plasmids that we suppose are exchanged among these lineages (Martínez-Romero 1996).

Biovars were defined in Rhizobium as the different symbiotic specificities (mainly plasmid encoded) that could be exhibited in a single chromosomal background (species). Accordingly, three biovars were recognized in Rhizobium leguminosarum (viciae, trifolii, and phaseoli) (Jordan 1984); however, a more complicated situation has recently been revealed, and some R. leguminosarum strains have been assigned to different species: Rhizobium pisi (Ramírez-Bahena et al. 2008) and Rhizobium fabae (Tian et al. 2008). The symbiotic plasmid from biovar phaseoli in R. etli is highly conserved (González et al. 2010), which may reflect a recent evolutionary origin (Martinez-Romero 2009), perhaps as recent as that of Phaseolus vulgaris, dated at around 2–3 million years ago (Delgado-Salinas et al. 2006). We identified a new biovar in R. etli, biovar mimosae, and supposed that its plasmid was more ancient than the phaseoli plasmid (Wang et al. 1999a); nod gene phylogenies seem to support this hypothesis.

Nonrandom association between plasmid and chromosome markers (Young et al. 2003) and limited plasmid transfer have been observed in nodule bacteria (Wernegreen and Riley 1999); however, different evolutionary histories of symbiotic and metabolic genes or chromosomal markers have been recognized in some cases in rhizobia (Silva et al. 2005; Tian et al. 2007; Han et al. 2008; Rincon-Rosales et al. 2009). Two sympatric species of Sinorhizobium nodulating wild Acaciella species in Mexico seem to contain the same symbiotic plasmid, and incongruences between symbiotic and housekeeping phylogenies have been repeatedly observed in sinorhizobia (Haukka et al. 1998; Toledo et al. 2003; Lloret et al. 2007). African Sinorhizobium terangae is a close relative of these American sinorhizobia, but not on the basis of symbiotic genes (Rincon-Rosales et al. 2009) (Fig. 18.3). In symbionts of Galega orientalis and Galega officinalis (two legumes native to the Caucasus), there is evidence of transfer of symbiotic information (Andronov et al. 2003). In Bradyrhizobium japonicum, a biovar with symbiotic genes specific for genistoid wild legumes is also found in another species, B. canariense (Vinuesa et al. 2005). Lateral transfer of symbiotic genes is recognized to have occurred in Bradyrhizobium nodulating a diversity of wild legumes (Steenkamp et al. 2008).
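The comparison of housekeeping and symbiotic phylogenies discussed in this section can be illustrated with a minimal sketch. It assumes two rooted trees written as nested tuples and uses hypothetical strain names (strain_A to strain_D); neither the topologies nor the names come from the analyses cited here. Clades present in one tree but absent from the other point to the kind of incongruence that lateral transfer of symbiotic genes could explain.

```python
def clades(tree):
    """Collect every clade of a rooted tree as a frozenset of leaf names.

    A tree is a nested tuple of strings, e.g. (("a", "b"), "c").
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf: a single strain name
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)                 # record the clade rooted at this node
        return members

    leaves(tree)
    return found


def incongruent_clades(tree_a, tree_b):
    """Return (clades found only in tree_a, clades found only in tree_b)."""
    a, b = clades(tree_a), clades(tree_b)
    return a - b, b - a


# Hypothetical toy topologies, for illustration only.
housekeeping_tree = ((("strain_A", "strain_B"), "strain_C"), "strain_D")
symbiotic_tree = ((("strain_A", "strain_C"), "strain_B"), "strain_D")

only_housekeeping, only_symbiotic = incongruent_clades(housekeeping_tree, symbiotic_tree)
for clade in only_housekeeping:
    print("supported only by the housekeeping tree:", sorted(clade))
for clade in only_symbiotic:
    print("supported only by the symbiotic tree:", sorted(clade))
```

Comparing clade sets is one simple way to flag conflict between two trees; it does not by itself distinguish lateral transfer from weakly supported branches, so support values would still need to be considered.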

Fig. 18.3 Schematic comparison of chromosomal and symbiotic gene phylogenies in Sinorhizobium [panels: rpoB and nodA; taxa labeled: S. americanum, S. fredii, S. fredii bv. mediterranense, S. saheli, S. mexicanum, S. terangae, S. chiapanecum, S. kostiense, S. arboris, S. meliloti, S. medicae, S. adhaerens, S. morelense, and Mesorhizobium from acacias]

Symbiotic plasmids in rhizobia are repABC plasmids. repABC plasmids are characteristic of α-Proteobacteria, and differences in repA, repB, and repC gene evolution have been reported (Castillo-Ramirez et al. 2009), supporting the occurrence of high recombination rates in plasmids. Genomic analyses have revealed mosaicism in symbiotic plasmids (Gonzalez et al. 2006). Genetic information in plasmids has been described as the accessory or mobile genome (Young et al. 2006). Plasmid (and maybe also genomic island) plasticity may have been instrumental for the adaptation of rhizobia to legume evolution and specificity (Martinez-Romero 2009).

18.5 Chromosomal Evolution and Molecular Markers

Rhizobial lineages have been estimated to be nearly as old as plants; for example, the last common ancestor of Rhizobium and Bradyrhizobium was dated as being over 400 million years old, whereas legumes evolved around 100–65 million years ago (Sprent 2001). Nodulation seemingly evolved (Young and Johnston 1989) in only one group of bacteria that were associated with plants (maybe as endophytes; Martinez-Romero 2009). Further spread of nod genes by lateral gene transfer may have conferred nodulating capacity on diverse genera.


In 1989, it was suggested that “We will eventually need many genera to accommodate all the root-nodule bacteria” (Young and Johnston 1989); up to now, 13 genera and over 50 species have been described that establish symbioses with the small sample of legumes analyzed. Small subunit ribosomal RNA (16S rRNA) gene sequences have been commonly used to identify and propose species in rhizobia (Wang and Martínez-Romero 2000). It is remarkable that, in spite of the large divergence of nod gene sequences found in Bradyrhizobium, this genus exhibits only a very limited diversity of 16S rRNA genes (Barrera et al. 1997; Vinuesa et al. 2005), and species delineation is not clear with this marker. Several molecular markers have been used to establish phylogenies and identify new species, not only in Bradyrhizobium but in rhizobia in general. Genomic information provides large numbers of genes for these analyses (Young et al. 2006; Gonzalez et al. 2006; Crossman et al. 2008), and congruent bacterial relationships have been reported using indel analyses (Gupta 2005). Alternative phylogenetic relationships are encountered in multiple-gene analyses of the reported complete genomes of Agrobacterium, Rhizobium, and Sinorhizobium (Young et al. 2006); this suggests that the divergence of these lineages occurred within a very short time, as has been concluded for other α-Proteobacteria (Castillo-Ramírez and González 2008).

18.6 Probability Estimates to Distinguish Rhizobial Species

Representative molecular markers are being sought that reflect species phylogenies rather than single-gene phylogenies; in this regard, dnaJ was found to reproduce accepted phylogenetic relationships (Alexandre et al. 2008). rpoB gene sequences have been considered for diversity studies in very different habitats or communities (Planet et al. 1995; Dahlloef et al. 2000; Case et al. 2007; Sachman-Ruiz et al. 2009). We have used partial sequences of rpoB as part of the phylogenetic studies to characterize new Sinorhizobium species (Lloret et al. 2007; Rincon-Rosales et al. 2009) and a new species of Klebsiella (Rosenblueth et al. 2004). rpoB is a large gene (more than 4,140 bp in Rhizobium) and, usually, only fragments of the gene sequence are available. Different studies report sequences of different fragments, hampering direct comparisons. Sequencing a common fragment will facilitate comparisons and diminish misclassifications.

Up to now, several genomes of species within the genus Rhizobium have been completely sequenced. A practical use of defined gene divergence ranges is to facilitate the proper identification of novel species and of strains belonging to a single species. When describing Sinorhizobium (Ensifer) mexicanum (Lloret et al. 2007) and Sinorhizobium chiapanecum (Rincon-Rosales et al. 2009), we proposed a probability range of inter- and intraspecies gene differences that allowed the distinction between different species and bacteria belonging to the same species. Comparing full rpoB gene sequences from seven Rhizobium genomes, we calculated that the 95% confidence interval for identities ranges from 0.898 to 1.000 for the sequences within this genus. The 0.898 threshold provides a useful criterion to determine whether a new isolate belongs to this genus: an identity of less than 0.898 excludes it from being a Rhizobium. Nevertheless, this is not a practical approach to classify new isolates because of the large size of the rpoB gene, which can hardly be expected to be sequenced in full in diversity studies that consider a large number of strains. Thus, we examined 700 bp fragments that covered the entire 4,140 bp sequence and found that the identities of one 700 bp fragment, spanning positions 2,800 to 3,500, closely match the distribution of the identities of the entire gene sequence (Kolmogorov–Smirnov test, p = 0.05), in contrast to all other fragments analyzed. This fragment would provide a molecular marker for studying rhizobial phylogenies that is not only dependable but also practical.

In both the full gene and the 700 bp fragment (positions 2,800–3,500), it can be stated with 95% confidence that, while Agrobacterium radiobacter is within the range of Rhizobium, the identities of A. tumefaciens and Agrobacterium vitis to the members of the group do not fall within the limits of the genus in the distribution that describes the dispersion of their differences. The same analysis was performed for dnaK. For this gene, the 95% confidence interval for identities ranges from 0.896 to 1.000 for the sequences within Rhizobium. Considering this interval, A. radiobacter and Agrobacterium rhizogenes are within the range of Rhizobium (and therefore should be considered Rhizobium radiobacter and Rhizobium rhizogenes, as has been proposed by Young et al. 2001), whereas the identities of A. tumefaciens and A. vitis to the members of the group do not fall within the limits of the genus (Fig. 18.4). Thus, by rpoB and by dnaK analyses, Agrobacterium could stand as a genus independent from Rhizobium, as has been claimed before (Farrand et al. 2003); in consequence, Rhizobium galegae, Rhizobium huautlense, Rhizobium cellulosilyticum, Rhizobium selenireducens, and Rhizobium daejeonense, all related to A. tumefaciens, should be reclassified.
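The screening logic described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure actually used for the published intervals: the function names and the toy alignment are hypothetical, the lower limit is taken as a simple empirical quantile of pairwise identities (the text does not say how the interval was fitted), and only the Kolmogorov–Smirnov D statistic is computed, without a p-value. Sequences are assumed to be already aligned to equal length.

```python
from bisect import bisect_right
from itertools import combinations


def identity(a, b):
    """Fraction of identical positions between two aligned sequences (gaps skipped)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(x == y for x, y in pairs) / len(pairs)


def pairwise_identities(seqs, start=0, end=None):
    """Identities for every pair of sequences, restricted to columns [start, end)."""
    return [identity(a[start:end], b[start:end]) for a, b in combinations(seqs, 2)]


def lower_bound(values, coverage=0.95):
    """Empirical one-sided lower limit: about (1 - coverage) of the values lie below it."""
    ordered = sorted(values)
    return ordered[int((1.0 - coverage) * len(ordered))]


def mean_identity_to_group(isolate, group):
    """Average identity of one isolate to all members of a reference group."""
    return sum(identity(isolate, member) for member in group) / len(group)


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    return max(
        abs(bisect_right(xs, v) / len(xs) - bisect_right(ys, v) / len(ys))
        for v in xs + ys
    )


def closest_window(seqs, width, step):
    """Return (D, start) for the window whose identities best match the full alignment."""
    whole = pairwise_identities(seqs)
    starts = range(0, len(seqs[0]) - width + 1, step)
    return min(
        (ks_statistic(whole, pairwise_identities(seqs, s, s + width)), s) for s in starts
    )


# Hypothetical toy alignment (20 columns); real input would be aligned rpoB or dnaK genes.
genus = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGAACGTACGT",
    "ACGTACCTACGTACGTACGA",
    "ACGAACGTACGTACGTTCGT",
]
new_isolate = "ACGTACGTACGTACGTACGA"

threshold = lower_bound(pairwise_identities(genus))
inside = mean_identity_to_group(new_isolate, genus) >= threshold
d_stat, window_start = closest_window(genus, width=10, step=5)
print(threshold, inside, d_stat, window_start)
```

With real data, `width` would be 700 and the window reported by `closest_window` could be compared with the 2,800–3,500 fragment mentioned in the text; using the mean identity of an isolate to the genus members follows the description in the caption of Fig. 18.4.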
It is clear from many published phylogenetic trees that Rhizobium is not monophyletic. We encountered several examples of misclassified Rhizobium strains in a 16S rRNA gene phylogenetic tree (Fig. 18.5), probably because many new isolates are characterized only by their 16S rRNA genes and are named after the closest relative, which is frequently identified only as the best BLAST hit, without further characterization. Rhizobium mongolense and Rhizobium lusitanum are polyphyletic (Fig. 18.5). Such misclassifications should be emended.

Fig. 18.4 95% confidence intervals for identities of species within the genus Rhizobium for the rpoB and dnaK genes. The arrows indicate the average identity of Agrobacterium tumefaciens or A. rhizogenes to the members of the genus Rhizobium [panels: rpoB and dnaK; labels: Rhizobium, Agrobacterium rhizogenes, Agrobacterium tumefaciens]

[Fig. 18.5 tree, sequences shown (accession, species, strain); scale bar 0.002]
EU399697 Rhizobium mongolense CCBAU 05122
AF008130 Rhizobium gallicum R602sp
U89819 Rhizobium mongolense USDA 1844T
U89817 Rhizobium mongolense USDA 1877
U89822 Rhizobium mongolense USDA 2377
AY509212 Rhizobium mongolense S110*
EU256432 Rhizobium sullae CCBAU 85011
DQ196418 Rhizobium leguminosarum bv. viciae PEPSM13
EF141340 Rhizobium leguminosarum bv. phaseoli ATCC 14482
AY998046 Rhizobium etli bv. phaseoli IE4804
DQ648575 Rhizobium etli bv. mimosae Mim 7-4
U28916 Rhizobium etli CFN 42
AY509209 Rhizobium mongolense S152*
EU074200 Rhizobium lusitanum CCBAU 03301*
X67234 Rhizobium tropici IIA LMG9517
EF035070 Rhizobium multihospitium CCBAU 83435
U89832 Rhizobium tropici CIAT899
AY738130 Rhizobium lusitanum P1-7
CP000628 Agrobacterium radiobacter K84
AY945955 Agrobacterium rhizogenes ATCC 11325
EF522124 Agrobacterium rhizogenes CU10

Fig. 18.5 Rhizobium 16S rRNA gene phylogenies. Misclassified strains are indicated by asterisks (*)

Acknowledgments To PAPIIT IN200709 and to Michael Dunn for reading the manuscript. Partial financial support for this project was from GEF PNUMA, TSBF-CIAT. E.M. is grateful to DGAPA UNAM for a postdoctoral fellowship during her sabbatical year at UC Davis in California.

References

Alexandre A, Laranjo M, Young JPW, Oliveira S (2008) dnaJ is a useful phylogenetic marker for alphaproteobacteria. Int J Syst Evol Microbiol 58:2839–2849
Amadou C, Pascal G, Mangenot S, Glew M, Bontemps C, Capela D, Carrere S, Cruveiller S, Dossat C, Lajus A, Marchetti M, Poinsot V, Rouy Z, Servin B, Saad M, Schenowitz C, Barbe V, Batut J, Medigue C, Masson-Boivin C (2008) Genome sequence of the beta-Rhizobium Cupriavidus taiwanensis and comparative genomics of rhizobia. Genome Res 18:1472–1483
Andronov EE, Terefework Z, Roumiantseva ML, Dzyubenko NI, Onichtchouk OP, Kurchak ON, Dresler-Nurmi A, Young JPW, Simarov BV, Lindstroem K (2003) Symbiotic and genetic diversity of Rhizobium galegae isolates collected from the Galega orientalis gene center in the Caucasus. Appl Environ Microbiol 69:1067–1074
Atkinson EM, Palcic MM, Hindsgaul O, Long SR (1994) Biosynthesis of Rhizobium meliloti lipooligosaccharide Nod factors: NodA is required for an N-acyltransferase activity. Proc Natl Acad Sci USA 91:8418–8422
Baev N, Endre G, Petrovics G, Banfalvi Z, Kondorosi A (1991) Six nodulation genes of nod box locus 4 in Rhizobium meliloti are involved in nodulation signal production: nodM codes for D-glucosamine synthetase. Mol Gen Genet 228:113–124


Barnett MJ, Fisher RF (2006) Global gene expression in the rhizobial-legume symbiosis. Symbiosis 42:1–24
Barrera LL, Trujillo ME, Goodfellow M, Garcia FJ, Hernandez-Lucas I, Davila G, van Berkum P, Martinez-Romero E (1997) Biodiversity of bradyrhizobia nodulating Lupinus spp. Int J Syst Bacteriol 47:1086–1091
Case RJ, Boucher Y, Dahlloef I, Holmstroem C, Doolittle WF, Kjelleberg S (2007) Use of 16S rRNA and rpoB genes as molecular markers for microbial ecology studies. Appl Environ Microbiol 73:278–288
Castillo-Ramírez S, González V (2008) Factors affecting the concordance between orthologous gene trees and species tree in bacteria. BMC Evol Biol 8:300
Castillo-Ramirez S, Vazquez-Castellanos JF, Gonzalez V, Cevallos MA (2009) Horizontal gene transfer and diverse functional constrains within a common replication-partitioning system in Alphaproteobacteria: the repABC operon. BMC Genomics 10:536
Catoira R, Galera C, De Billy F, Penmetsa RV, Journet E-P, Maillet F, Rosenberg C, Cook D, Gough C, Denarie J (2000) Four genes of Medicago truncatula controlling components of a Nod factor transduction pathway. Plant Cell 12:1647–1666
Chen W-M, Moulin L, Bontemps C, Vandamme P, Bena G, Boivin-Masson C (2003) Legume symbiotic nitrogen fixation by β-Proteobacteria is widespread in nature. J Bacteriol 185:7266–7272
Crossman LC, Castillo-Ramírez S, McAnnula C, Lozano L, Vernikos GS, Acosta JL, Ghazoui ZF, Hernández-González I, Meakin G, Walker AW, Hynes MF, Young JPW, Downie JA, Romero D, Johnston AWB, Dávila G, Parkhill J, González V (2008) A common genomic framework for a diverse assembly of plasmids in the symbiotic nitrogen fixing bacteria. PLoS ONE 3(7):e2567
Dahlloef I, Baillie H, Kjelleberg S (2000) rpoB-based microbial community analysis avoids limitations inherent in 16s rRNA gene intraspecies heterogeneity. Appl Environ Microbiol 66:3376–3380
Debellé F, Plazanet C, Roche P, Pujol C, Savagnac A, Rosenberg C, Prome J-C, Denarie J (1996a) The NodA proteins of Rhizobium meliloti and Rhizobium tropici specify the N-acylation of Nod factors by different fatty acids. Mol Microbiol 22:303–314
Debellé F, Yang GP, Ferro M, Truchet G, Promé JC, Dénarié J (1996b) Rhizobium nodulation factors in perspective. In: Legocki A, Bothe H, Pühler A (eds) Biological fixation of nitrogen for ecology and sustainable agriculture. Springer, Heidelberg, Germany, pp 15–24
Delgado-Salinas A, Bibler R, Lavin M (2006) Phylogeny of the genus Phaseolus (Leguminosae): a recent diversification in an ancient landscape. Syst Bot 31:779–791
Dénarié J, Debellé F, Promé JC (1996) Rhizobium lipo-chitooligosaccharide nodulation factors: signaling molecules mediating recognition and morphogenesis. Annu Rev Biochem 65:503–535
Evans IJ, Downie JA (1986) The nodI gene product of Rhizobium leguminosarum is closely related to ATP-binding bacterial transport proteins; nucleotide sequence analysis of the nodI and nodJ genes. Gene 43:95–101
Farrand SK, van Berkum PB, Oger P (2003) Agrobacterium is a definable genus of the family Rhizobiaceae. Int J Syst Evol Microbiol 53:1681–1687
Ferro M, Lorquin J, Ba S, Sanon K, Promé JC, Boivin C (2000) Bradyrhizobium sp. strains that nodulate the leguminous tree Acacia albida produce fucosylated and partially sulfated Nod factors. Appl Environ Microbiol 66:5078–5082
Gherbi H, Markmann K, Svistoonoff S, Estevan J, Autran D, Giczey G, Auguy F, Peret B, Laplaze L, Franche C, Parniske M, Bogusz D (2008) SymRK defines a common genetic basis for plant root endosymbioses with arbuscular mycorrhiza fungi, rhizobia, and Frankia bacteria. Proc Natl Acad Sci USA 105:4928–4932
Gianinazzi-Pearson V, Dénarié J (1997) Red carpet genetic programmes for root endosymbioses. Trends Plant Sci 2:371–372
Giraud E, Moulin L, Vallenet D, Barbe V, Cytryn E, Avarre J-C, Jaubert M, Simon D, Cartieaux F, Prin Y, Bena G, Hannibal L, Fardoux J, Kojadinovic M, Vuillet L, Lajus A, Cruveiller S, Rouy Z, Mangenot S, Segurens B, Dossat C, Franck WL, Chang W-S, Saunders E, Bruce D, Richardson P, Normand P, Dreyfus B, Pignol D, Stacey G, Emerich D, Vermeglio A, Medigue C, Sadowsky M (2007) Legumes symbioses: absence of nod genes in photosynthetic bradyrhizobia. Science 316:1307–1312
Gonzalez V, Santamaria RI, Bustos P, Hernandez-Gonzalez I, Medrano-Soto A, Moreno-Hagelsieb G, Janga SC, Ramirez MA, Jimenez-Jacinto V, Collado-Vides J, Davila G (2006) The partitioned Rhizobium etli genome: genetic and metabolic redundancy in seven interacting replicons. Proc Natl Acad Sci USA 103:3834–3839
González V, Acosta JL, Santamaría RI, Bustos P, Fernández JL, Hernández González IL, Díaz R, Flores M, Palacios R, Mora J, Dávila G (2010) Conserved symbiotic plasmid DNA sequences in the multireplicon pangenomic structure of Rhizobium etli. Appl Environ Microbiol 76:1604–1614
Gupta RS (2005) Protein signatures distinctive of α-Proteobacteria and its subgroups and a model for α-proteobacterial evolution. Crit Rev Microbiol 31:101–135
Han TX, Wang ET, Han LL, Chen WF, Sui XH, Chen WX (2008) Molecular diversity and phylogeny of rhizobia associated with wild legumes native to Xinjiang, China. Syst Appl Microbiol 31:287–301
Haukka K, Lindstrom K, Young JPW (1998) Three phylogenetic groups of nodA and nifH genes in Sinorhizobium and Mesorhizobium isolates from leguminous trees growing in Africa and Latin America. Appl Environ Microbiol 64:419–426
Hungria M, Vargas MAT, Campo RJ, Chueire LMO, Andrade DS (2000) The Brazilian experience with the soybean (Glycine max) and common bean (Phaseolus vulgaris) symbioses. In: Pedrosa FO, Hungria M, Yates G, Newton WE (eds) Nitrogen fixation: from molecules to crop production. Kluwer Academic Publishers, Netherlands, p 515
Hungria M, Franchini JC, Campo RJ, Graham PH (2005) The importance of nitrogen fixation to soybean cropping in South America. In: Werner D, Newton WE (eds) Nitrogen fixation in agriculture, forestry, ecology, and the environment. Springer, Dordrecht, pp 25–42
Hurek T, Reinhold-Hurek B (2003) Azoarcus sp. strain BH72 as a model for nitrogen-fixing grass endophytes. J Biotechnol 106:169–178
Jones KM, Kobayashi H, Davies BW, Taga ME, Walker GC (2007) How rhizobial symbionts invade plants: the Sinorhizobium-Medicago model. Nat Rev Microbiol 5:619–633
Jordan DC (1984) Family III. Rhizobiaceae Conn 1938, 321AL. In: Krieg NR, Holt JG (eds) Bergey’s manual of systematic bacteriology, vol 1. The Williams and Wilkins Co., Baltimore, pp 234–254
Kondorosi E, Banfalvi Z, Kondorosi A (1984) Physical and genetic analysis of a symbiotic region of Rhizobium meliloti: identification of nodulation genes. Mol Gen Genet 193:445–452
Kondorosi E, Pierre M, Cren M, Haumann U, Buire M, Hoffmann B, Schell J, Kondorosi A (1991) Identification of NolR, a negative transacting factor controlling the nod regulon in Rhizobium meliloti. J Mol Biol 222:885–896
Laurie JI, Clarke JH, Ciruela A, Faulds CB, Williamson G, Gilbert HJ, Rixon JE, Millward-Sadler J, Hazlewood GP (1997) The NodB domain of a multidomain xylanase from Cellulomonas fimi deacetylates acetylxylan. FEMS Microbiol Lett 148:261–264
Lerouge P, Roche P, Faucher C, Maillet F, Truchet G, Promé JC, Dénarié J (1990) Symbiotic host-specificity of Rhizobium meliloti is determined by a sulphated and acylated glucosamine oligosaccharide signal. Nature 344:781–784
Lian B, Prithiviraj B, Souleimanov A, Smith DL (2001) Evidence for the production of chemical compounds analogous to nod factor by the silicate bacterium Bacillus circulans GY92. Microbiol Res 156:289–292
Lloret L, Ormeño-Orrillo E, Rincón R, Martínez-Romero J, Rogel-Hernández MA, Martínez-Romero E (2007) Ensifer mexicanus sp. nov. a new species nodulating Acacia angustissima (Mill.) Kuntze in Mexico. Syst Appl Microbiol 30:280–290
Long SR, Buikema WJ, Ausubel FM (1983) Cloning of Rhizobium meliloti nodulation genes by direct complementation of Nod-mutants. Nature 298:485–487
López-López A, Rosenblueth M, Martínez J, Martínez-Romero E (2010) Rhizobial symbioses in tropical legumes and non-legumes. In: Dion P (ed) Soil biology and agriculture in the tropics. Springer Heidelberg, pp. 163–184







Chapter 19

Convergent Evolution of Morphogenetic Processes in Fungi

Sylvain Brun and Philippe Silar

Abstract Eumycetes fungi are a diverse group of organisms whose evolution is characterized by frequent changes in nutritional strategy and in the corresponding developmental programs. The reasons for this versatility are unknown. We previously discovered that the NADPH oxidase Nox2 and the tetraspanin Pls1 are used in two radically different cell types to achieve the same purpose, exiting from a reinforced cell, suggesting that convergent evolution of morphogenetic processes could account for the repeated switches in trophic mode during fungal evolution. However, we recently observed that saprobic fungi are also able to differentiate appressorium-like structures closely resembling those of phytopathogenic species, arguing that the ability to differentiate such cells is an ancient property of filamentous fungi. The adaptation of parasitic and mutualistic fungi to plants may thus not reside solely in their ability to penetrate their host.

19.1 Introduction

Fungi belonging to the Eumycetes (Opisthokonta) are a great evolutionary success. Their ancestors switched from phagotrophy, the original eukaryotic trophic mode, to osmotrophy, likely a billion years ago (McLaughlin et al. 2009). Since then, they have diversified into hundreds of thousands of species, and possibly many more (Hawksworth 1991). They have invaded nearly all biotopes around the globe, from the deepest oceans to the tops of the highest mountains. They are even found in arctic soils that remain frozen for most of the year (Schadt et al. 2003). Their total biomass is huge and they greatly affect their environment. They live either in parasitic or in mutualistic symbiosis with other organisms, or as free-living saprobes. The saprobes participate in the global carbon cycle; in particular, they degrade highly recalcitrant materials that no other organism can, and they regulate soil health by producing humic acids. As mutualistic symbionts, the mycorrhizal and endophytic fungi increase plant fitness, and those present inside the digestive tract enable many insects and mammalian herbivores to use hard-to-digest plant materials as food. Similarly, the mutualistic lichens are an important component of many extreme biotopes. Parasitic fungi are known for nearly all organisms (even fungi!), but they are especially important for plants and insects. They have a tremendous impact on the dynamics of natural populations, and also on domesticated plants and animals. The feeding, dispersal, and "behavioral" diversity of fungi is such that entire books are required to describe it (Webster 2007).

Because of their importance, scientific programs aimed at better understanding the evolution and biology of fungi have been launched. The AFTOL (Assembling the Fungal Tree of Life) project used multigene trees to resolve their phylogeny (James et al. 2006) and proposed a new classification (Hibbett et al. 2007). Numerous genomic programs have established sequences from a great diversity of fungi (see, for example, http://genome.jgi-psf.org/, http://www.broadinstitute.org/science/projects/fungal-genome-initiative/current-fgi-sequence-projects, http://www.genoscope.cns.fr/spip/Fungi-sequenced-at-Genoscope.html). The data show that fungi are highly diverse (McLaughlin et al. 2009). For example, the genetic diversity of fungi belonging to related families, or even to the same family, may exceed that of animals from different classes (Dujon 2005; Espagne et al. 2008).

S. Brun and P. Silar
UFR des Sciences du Vivant, Université de Paris 7 – Denis Diderot, 75205 Paris Cedex 13, France
Institut de Génétique et Microbiologie, UMR CNRS – Université de Paris 11, UPS Bât. 400, 91405 Orsay cedex, France
e-mail: [email protected]

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_19, © Springer-Verlag Berlin Heidelberg 2010

19.2 The Versatility of Fungal Development

An important point that emerges from phylogenetic studies is the versatility with which fungi may switch their trophic modes and repeatedly "invent" the same structures (James et al. 2006). For instance, saprobic and symbiotic fungi may exist within the same genus, and saprotrophy, plant pathogeny, lichen symbiosis, and other trophic modes may all evolve within the same class. Similarly, plant pathogens and mutualists invade their host plant by many means, one of which involves forcibly breaking through the plant cuticle and/or cell wall. To do this, fungi differentiate special cells called appressoria (Deising et al. 2000). These come in different sizes and shapes, and their origins may be quite different. For example, in Magnaporthe grisea, a hemi-biotrophic parasite of rice and barley, the appressorium develops at the extremity of a dedicated hypha that is produced by a three-celled spore issued from asexual reproduction. In this species, appressoria are heavily melanized round cells with a very well-defined structure, from which the penetration peg emerges (Fig. 19.1). In Botrytis cinerea, appressorium-like structures are also produced at the extremity of a hypha that originates from a spore issued from asexual reproduction, but this spore has only one cell and the appressorium is no more than a specialized hypha, slightly reinforced at its tip, which is able to orient its growth toward the plant cell wall and to penetrate it by means of a penetration peg (Fig. 19.1).


Fig. 19.1 Ontogeny of ascospores, appressoria, and appressorium-like structures. Sexual reproduction results in one-celled hyaline ascospores in B. cinerea, four-celled hyaline ascospores in M. grisea, and two-celled melanized ascospores with a germ pore in P. anserina. In the latter species, a cell death event occurs during ascospore differentiation. The appressorium is a roundish, heavily melanized structure in M. grisea, while it is no more than a reinforced hypha in B. cinerea. The similarities between the ontogenies of the P. anserina ascospore and the M. grisea appressorium are highlighted by arrows


These structures are thus qualified as "appressorium-like" rather than as true appressoria. M. grisea and B. cinerea belong to two different classes of ascomycetes, the Sordariomycetes and Leotiomycetes, respectively. In these classes, numerous species are known to live as saprobes, which seemingly do not differentiate appressoria as they do not need to penetrate host plants. Thus, the question is whether the use of appressoria to penetrate plants is the result of convergent evolution in plant pathogens, or whether it reflects an ancient ability of fungi to differentiate penetration structures, an ability that would have been lost in saprobes.

The spore is another fungal structure (along with the fruiting body) that exhibits many cases of convergent evolution. Spores are issued either from sexual reproduction (basidiospores, ascospores, etc.) or from asexual reproduction (conidia, etc.) and constitute an important part of the life cycle, since they enable fungi to disperse efficiently and to resist adverse conditions. They come in many shapes, sizes, and colors and have been used in the past to classify fungi. For example, Podospora anserina, a model ascomycete, produces heavily melanized ascospores that germinate in a regulated manner through a germ pore (Fig. 19.1). These ascospores are in fact made up of two cells, one of which has undergone cell death. Neurospora crassa produces one-celled striated ascospores with two germ pores located at opposite poles, while M. grisea ascospores are composed of four hyaline cells and lack a germ pore (Fig. 19.1). Those of B. cinerea are composed of a single hyaline cell (Fig. 19.1). Yet, spore evolution appears to be filled with convergence. For example, in some Sordariomycetes, the fruiting body wall is a better descriptor of evolution than ascospore shape (Miller and Huhndorf 2005). Similarly, among both basidiomycetes and ascomycetes, a germ pore is present in some species and absent in others.

The molecular basis for the versatility of fungi in switching trophic modes and developmental programs is unknown. The only documented instance concerns a change from mycoparasitism to saprotrophy in the genus Trichoderma. Indeed, there is evidence for a horizontal transfer of a cluster of genes involved in nitrate assimilation from a basidiomycete related to Ustilago maydis to the ascomycete Trichoderma reesei, whereas the other members of the genus Trichoderma appear to lack the cluster. This has been correlated with the fact that T. reesei is the only Trichoderma living as a saprobe in woody materials, while the other members of the genus are mycoparasites (Slot and Hibbett 2007). The nitrate assimilation cluster would enable T. reesei to scavenge nitrogen efficiently in wood, while the other Trichoderma species must obtain it from their host, which would account for the trophic change. Trichoderma species may parasitize basidiomycetes, which perhaps favored the gene transfer in the ancestors of T. reesei.

19.3 Are Appressoria and Appressorium-Like Structures the Result of Convergent Evolution?

We serendipitously discovered a possible convergent evolution of morphogenetic processes affecting trophic strategy in filamentous fungi while studying the roles of the Pls1 tetraspanin and the Nox2 NADPH oxidase (Nox) in the saprobic fungus P. anserina. Tetraspanins are membrane-bound proteins whose roles are not yet completely clear (Veneault-Fourrey et al. 2006b). In fungi, tetraspanins of the Pls1 family were first identified as virulence factors in three different plant pathogenic species. In M. grisea, B. cinerea, and Colletotrichum lindemuthianum, the Pls1 mutants are blocked at the penetration step; the appressorium appears normal but penetration pegs are not produced (Clergeot et al. 2001; Gourgues et al. 2004; Veneault-Fourrey et al. 2005). This was taken as an indication of a specific role of Pls1 tetraspanins in phytopathogenic fungi. Yet, orthologues of Pls1 are present in saprobic fungi, including P. anserina (Lambou et al. 2008).

Tetraspanins share the same membrane localization as Nox. Nox are membrane-bound enzymes that generate superoxide ions at the expense of NADPH. Several years ago, we proposed that the ancient role of Nox (and of the ROS they produce) was the sensing of the environment and cell-to-cell communication (Lalucque and Silar 2003). Indeed, these enzymes have now been shown to play key roles in development, pathogeny, symbiosis, and defense in a broad range of eukaryotes (Lara-Ortiz et al. 2003; Malagnac et al. 2004; Aguirre et al. 2005; Silar 2005; Takemoto et al. 2007). There are presently three Nox isoforms known in fungi (see Table 19.1 for an update on Nox genes in fungal genomes), and all data argue that they do not fulfill redundant roles (Takemoto et al. 2007). In particular, in two saprobic fungi, P. anserina and N. crassa, the Nox2 isoform seems to be more specifically dedicated to regulating the germination of melanized ascospores (Malagnac et al. 2004; Cano-Dominguez et al. 2008). Indeed, both fungi produce melanized ascospores and, in both species, Nox2 mutant ascospores do not germinate. Furthermore, when P. anserina ascospore melanin is removed, the mutant ascospores germinate efficiently but in a nonregulated manner (Malagnac et al. 2004). Accordingly, Nox2 appears dispensable for the germination of B. cinerea ascospores, which are not melanized (Segmuller et al. 2008).

When we deleted the PaPls1 gene of P. anserina, we discovered that the ΔPaPls1 mutants had the same ascospore germination defects as the PaNox2 mutants (Lambou et al. 2008). Again, removal of melanin in PaPls1 mutant ascospores suppressed the germination defect, leading to unregulated germination. Interestingly, the Nox2 isoforms are necessary for plant penetration in M. grisea and B. cinerea (Egan et al. 2007; Segmuller et al. 2008). Additionally, Pls1 is dispensable for the germination of the nonmelanized ascospores of M. grisea (Lambou et al. 2008). These data suggest that Pls1 and Nox2 may act together. This hypothesis is supported by the fact that the two proteins are either both present or both absent in fungal genomes (Table 19.1, Fig. 19.2). In lower fungi, the coevolution is not clear. However, Pls1 tetraspanins are small proteins that evolve rapidly, which impairs their detection in very divergent genomes with ordinary tools. In the "higher fungi", i.e., Ascomycetes and Basidiomycetes, the distribution of Pls1 and Nox2 is best accounted for by at least nine independent losses of both genes during evolution (Fig. 19.2). As the Pls1 and Nox2 genes are not linked in the genomes, these data provide a strong argument for their acting in the same processes (Loganantharaj and Atwi 2007). Both proteins may act together in a complex located at the plasma membrane and, despite varying fungal habitats and/or physiological diversity, the function of this complex might have been conserved in the different lineages.

The second striking conclusion is that the germination of melanized ascospores requires the same proteins as the formation of the penetration peg from appressoria. When compared (Fig. 19.1), these two processes appear noticeably similar in P. anserina and M. grisea. Indeed, during the ontogeny of both appressoria and ascospores, there is a programmed cell death event (Beckett et al. 1968; Veneault-Fourrey et al. 2006a). When the structures are formed, they are both heavily melanized and both contain a pore from which a peg is produced (Beckett et al. 1968; Deising et al. 2000). We thus speculated that the same program was used by the two species to achieve the same end (exiting from a melanized structure). This provides a nice example of the reuse of the same proteins to achieve a similar morphogenetic goal in two different cell types. We also speculated that this process could be recruited repeatedly during evolution to achieve the same end, i.e., to penetrate plants. If so, appressoria from different fungi would be due to convergent evolution. However, we recently obtained data that call this hypothesis into question. Indeed, we discovered that Nox2 and Pls1 are involved in a novel developmental stage in P. anserina: the development of appressorium-like cells involved in the penetration of plant material (Brun et al. 2009).

Table 19.1 Occurrence of Nox1, Nox2, Nox3, and Pls1 in Eumycota

Fungal species                            Nox1/NoxA  Nox2/NoxB  Nox3/NoxC  Pls1

Ascomycota, Pezizomycotina, Sordariomycetes
  Podospora anserina                      1          1          1          1
  Sporotrichum thermophile                1          1          0          1
  Thielavia terrestris                    1          1          0          1
  Chaetomium globosum                     1          1          0          1
  Neurospora tetrasperma                  1          1          0          1
  Neurospora discreta                     1          1          0          1
  Neurospora crassa                       1          1          0          1
  Magnaporthe grisea                      1          1          1          1
  Cryphonectria parasitica                1          1          0          1
  Grosmannia clavigera                    1          1          0          1
  Fusarium graminearum                    1          1          1          1
  Fusarium verticillioides                1          1          1          1
  Fusarium oxysporum                      1          1[b]       1          1
  Haematonectria (Nectria) haematococca   2          1          1          1
  Epichloë festucae                       1          1          0          1
  Trichoderma atroviride                  1          1          0          1
  Trichoderma reesei                      1          1          0          1
  Trichoderma virens                      1          1          0          1
  Verticillium dahliae                    1          1          1          1
  Verticillium albo-atrum                 2          1          1          1
  Colletotrichum graminicola              1          1          1          1
Ascomycota, Pezizomycotina, Leotiomycetes
  Sclerotinia sclerotiorum                1          1          0          1
  Botrytis cinerea                        1          1          0          1
  Blumeria graminis                       1[b]       1          0          1
Ascomycota, Pezizomycotina, Eurotiomycetes
  Aspergillus oryzae                      1          0          0          0
  Aspergillus flavus                      1 + 1[b]   0          0          0
  Aspergillus terreus                     1          0          1          0
  Aspergillus carbonarius                 1          0          0          0
  Aspergillus niger                       1          0          0          0
  Aspergillus fumigatus                   1          0          0          0
  Neosartorya fischeri                    1          0          0          0
  Aspergillus clavatus                    1          0          0          0
  Aspergillus nidulans                    1          0          0          0
  Penicillium chrysogenum                 1          0          0          0
  Talaromyces stipitatus                  1          1          0          1
  Penicillium marneffei                   1          1          0          1
  Histoplasma capsulatum                  1          1          0          1
  Paracoccidioides brasiliensis           1          1          0          1
  Blastomyces dermatitidis                1          1          0          1
  Uncinocarpus reesii                     1          1          0          1
  Coccidioides posadasii                  1          1          0          1
  Coccidioides immitis                    1          1          0          1
  Arthroderma gypseum                     1          1          0          1
  Microsporum canis                       1          1          0          1
  Trichophyton tonsurans                  1          1          0          1
  Trichophyton rubrum                     1          1          0          1
  Trichophyton equinum                    1          1          0          1
  Ascosphaera apis                        0          0          0          0
Ascomycota, Pezizomycotina, Dothideomycetes
  Mycosphaerella graminicola              1          0          1          0
  Mycosphaerella fijiensis                1          0          1          0
  Cochliobolus heterostrophus             1          1          1          1
  Alternaria brassicicola                 1          1          0          1
  Pyrenophora tritici                     1          1          1          1
  Stagonospora nodorum                    1[b]       1          1          1
Ascomycota, Saccharomycotina
  Saccharomyces cerevisiae                0          0          0          0
  Candida glabrata                        0          0          0          0
  Zygosaccharomyces rouxii                0          0          0          0
  Saccharomyces kluyveri                  0          0          0          0
  Kluyveromyces thermotolerans            0          0          0          0
  Kluyveromyces lactis                    0          0          0          0
  Ashbya gossypii                         0          0          0          0
  Candida albicans                        0          0          0          0
  Debaryomyces hansenii                   0          0          0          0
  Yarrowia lipolytica                     0          0          0          0
Ascomycota, Taphrinomycotina
  Schizosaccharomyces japonicus           0          0          0          0
  Schizosaccharomyces pombe               0          0          0          0
  Schizosaccharomyces octosporus          0          0          0          0
  Pneumocystis carinii                    0          0          0          0
Basidiomycota, Ustilaginomycotina
  Ustilago maydis                         0          0          0          0
  Malassezia globosa                      0          0          0          0
Basidiomycota, Agaricomycotina, Agaricomycetes
  Heterobasidion annosum                  1          1          0          1
  Schizophyllum commune                   1          1          0          1
  Coprinopsis cinerea                     1          1          0          1
  Laccaria bicolor                        1          1          0          1
  Postia placenta [a]                     1          1          0          1
  Pleurotus ostreatus                     1          1          0          1
  Phanerochaete chrysosporium             1          1          0          1
Basidiomycota, Agaricomycotina, Tremellomycetes
  Cryptococcus neoformans                 0          0          0          0
  Tremella mesenterica                    1          0          0          0
Basidiomycota, Pucciniomycotina
  Sporobolomyces roseus                   1          0          0          0
  Melampsora larici-populina              3          2          0          1
  Puccinia graminis                       1          1          0          1
"Lower fungi", Mucoromycotina
  Rhizopus oryzae                         0          0          0          1?
  Mucor circinelloides                    0          0          0          1?
  Phycomyces blakesleeanus                0          0          0          1?
"Lower fungi", Microsporidia
  Encephalitozoon cuniculi                0          0          0          0
  Antonospora locustae                    0          0          0          0
  Nosema ceranae                          0          0          0          0
"Lower fungi", Blastocladiomycetes
  Allomyces macrogynus                    1 (1)      1 (4)      0          ?
"Lower fungi", Chytridiomycetes
  Spizellomyces punctatus                 1          1          0          ?
  Batrachochytrium dendrobatidis          1          1          0          ?

[a] BLAST analysis detects two very similar copies for this species. However, the P. placenta project sequenced the genome of a dikaryon (http://genome.jgi-psf.org/Pospl1/Pospl1.home.html). The two copies are likely the different alleles present in each haploid genome
[b] Genome sequence with an incomplete or erroneous gene sequence. Pseudogenes are in parentheses

[Fig. 19.2 tree diagram not reproduced. Groups labeled on the tree: Ascomycota (Pezizomycotina: Sordariomycetes with Sordariales, Magnaporthales, Diaporthales, Ophiostomatales, and Hypocreales; Leotiomycetes; Eurotiomycetes with Ascosphaera, Onygenales, and Eurotiales; Dothideomycetes with Capnodiales and Pleosporales; Saccharomycotina; Taphrinomycotina), Basidiomycota (Agaricomycotina with Agaricomycetes and Tremellomycetes; Ustilaginomycotina; Pucciniomycotina with Microbotryomycetes and Pucciniomycetes), and lower fungi (Mucoromycotina, Microsporidia, Blastocladiomycota, Chytridiomycota). Marks on the tree: asterisks at Sordariales and Magnaporthales, "P.c", "R.o", "?", and "appressorium-like structures".]

Fig. 19.2 Phylogenetic tree of Eumycetes. The tree shows the fungal groups for which complete genome sequences are available. The nine vertical arrows locate the losses of Pls1 and Nox2. Asterisks (*) indicate the two groups in which the Pls1 and Nox2 proteins have been recruited for the same goal (exiting a reinforced structure) in two cell types: the ascospores in Sordariales (P. anserina and N. crassa) and the appressorium in Magnaporthales (M. grisea). Appressorium-like structures possibly appeared very early during fungal evolution, but at a moment that is not yet defined. Fungi unable to differentiate appressorium-like structures are indicated by P.c (Penicillium chrysogenum) and R.o (Rhizopus oryzae)
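The co-occurrence argument made in this section (Pls1 and Nox2 being present or absent together across genomes) can be illustrated with a small presence/absence comparison. The following Python sketch is only an illustration under stated assumptions: the twelve species are an arbitrary subset of Table 19.1 chosen for brevity, the names `PROFILES`, `gene_profile`, and `matching_fraction` are hypothetical, and the simple matching fraction used here is a stand-in, not necessarily the statistic used by Loganantharaj and Atwi (2007).

```python
# Presence (1) / absence (0) calls for an illustrative subset of species,
# transcribed from Table 19.1 in the order (Nox1, Nox2, Nox3, Pls1).
# The choice of species is arbitrary and made only for this sketch.
GENES = ("Nox1", "Nox2", "Nox3", "Pls1")

PROFILES = {
    "Podospora anserina":       (1, 1, 1, 1),
    "Neurospora crassa":        (1, 1, 0, 1),
    "Magnaporthe grisea":       (1, 1, 1, 1),
    "Botrytis cinerea":         (1, 1, 0, 1),
    "Aspergillus nidulans":     (1, 0, 0, 0),
    "Penicillium chrysogenum":  (1, 0, 0, 0),
    "Saccharomyces cerevisiae": (0, 0, 0, 0),
    "Ustilago maydis":          (0, 0, 0, 0),
    "Laccaria bicolor":         (1, 1, 0, 1),
    "Cryptococcus neoformans":  (0, 0, 0, 0),
    "Tremella mesenterica":     (1, 0, 0, 0),
    "Puccinia graminis":        (1, 1, 0, 1),
}


def gene_profile(gene):
    """Return the presence/absence vector of one gene across all species."""
    idx = GENES.index(gene)
    return [calls[idx] for calls in PROFILES.values()]


def matching_fraction(gene_a, gene_b):
    """Fraction of species in which both genes have the same state
    (both present or both absent)."""
    a, b = gene_profile(gene_a), gene_profile(gene_b)
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / len(a)


if __name__ == "__main__":
    for gene in ("Nox1", "Nox2", "Nox3"):
        print(f"{gene} vs Pls1: {matching_fraction(gene, 'Pls1'):.2f}")
```

On this subset, the Nox2 and Pls1 profiles are identical in every species, whereas Nox1 and Nox3 each disagree with Pls1 in several species. A real analysis would use all genomes in the table and would need to account for the shared ancestry of the species, which a plain matching fraction ignores.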

19.4 Differentiating Appressorium-Like Structures Could Be an Ancient Property of Fungi

During our studies on Nox2 and Pls1, we noticed that in addition to their ascospore germination default, the null mutants of both genes presented a defect in the production of fruiting bodies, specifically when grown on cellulose as sole carbon source (Malagnac et al. 2008). This prompted us to investigate in more details the cellulose degradation process in P. anserina (Brun et al. 2009). When cellophane is provided as food source, P. anserina is able to orient its growth toward the cellophane layer. Upon contacting cellophane, it differentiates a structure that greatly resembles B. cinerea pseudo-appressorium. Even more striking is the similarity between the appressorium-like phenotypes of B. cinerea and P. anserina Pls1 and Nox2 mutants (Segmuller et al. 2008; Brun et al. 2009). In both species, these mutants are impaired at the reorientation step toward the substrate (onion skin and cellophane, respectively), which is a prerequisite for penetration. In both species, mutant hyphae tend to “hesitate” in the direction to grow. Then, they establish loose contacts with the substrate and finally are completely defective in penetrating it. Nonetheless, the setting up of fully functional penetration structures is not only under the control of Nox2 and Pls1 but also require the Nox1 isoform (Egan et al. 2007; Giesbert et al. 2008; Brun et al. 2009). In the view of this new finding, we speculate that the ability to differentiate cellular structure dedicated to penetrate plant materials might be an ancient property of filamentous fungi (at least ascomycetes and basidiomycetes), which is used in saprobes to efficiently degrade dead plants, and more aggressively in phytopathogens to penetrate their hosts. To test this possibility, we have evaluated the ability of several additional fungi to differentiate penetration structures on cellophane (see Fig. 19.3 for an example). 
A variety of structures that permit breaching of the cellophane were indeed produced by a wide spectrum of fungi (several Sordariomycetes and Agaricomycetes; S. Brun and P. Silar, unpublished data). So far, we have not detected such structures in two species, Penicillium chrysogenum and Rhizopus oryzae (Fig. 19.3). Significantly, both fungi lack Nox2 and Pls1 (Table 19.1, Fig. 19.2), confirming the crucial role of the two proteins in the differentiation of appressorium-like cells. Therefore, a wide range of fungi seem to possess the toolkit necessary to breach the plant cell wall. The patchy phylogenetic distribution of species known to produce appressoria and related structures could thus be due to sampling biased toward parasitic and mutualistic plant symbionts in studies dealing with appressorium formation. However, some species may truly be unable to differentiate these structures: those that have lost Pls1 and Nox2. In other words, there is no need to invoke complex convergent evolution of fungal structures to explain the recurrent change in trophic lifestyle. Evidence is emerging that confirms a role of ROS and Nox in polarized hyphal growth (Semighini and Harris 2008), and we believe that the ability of fungi to attack and penetrate plant materials may simply rely on sensing the glucose gradient created by the enzymatic degradation of the polysaccharides composing the plant cell wall, i.e., cellulose and hemicellulose (Brun et al. 2009). More generally, we believe that if this simple model is true, penetration structures under the control of Nox2/Pls1 should be found not only in phytopathogens and saprobes, but also in entomopathogens (for cuticle breaching), in fungal parasites such as Trichoderma sp. (for penetration of chitin-based cell walls), and possibly in human pathogens. We thus now need to confirm on a larger sample whether the correlation between the ability to build these structures and the conservation of Nox2/Pls1 holds true.

Fig. 19.3 Cellophane breach. Four-day-old mycelia of P. anserina (P. a), Trichoderma species (T. sp), Penicillium chrysogenum (P. c), and Rhizopus oryzae (R. o) were observed as described (Brun et al. 2009). Numbers indicate the distance from the first picture in mm, as depicted by the arrows on the schemes on the right. In the first column, mycelia of all the strains are growing horizontally on the cellophane layer. In the second column, mycelia of P. anserina and T. species reorient their growth toward the cellophane and establish bulging contacts (some examples are indicated by arrows). In P. chrysogenum and R. oryzae, there is no reorientation toward cellophane, though rare contacts may occur. In the third column, needle-like hyphae (some examples are indicated by asterisks) are emitted in P. anserina and T. species, which allow both fungi to penetrate into the cellophane layer. In contrast, P. chrysogenum and R. oryzae cannot penetrate cellophane. The fourth column gives a schematic representation of the structures; the arrows point toward the approximate focal plane of the first three columns, and the eye indicates the direction of observation

Acknowledgments This work was supported by ANR grant no. ANR-05-Blan-0385-02.

References

Aguirre J, Rios-Momberg M, Hewitt D, Hansberg W (2005) Reactive oxygen species and development in microbial eukaryotes. Trends Microbiol 13:111–118
Beckett A, Barton R, Wilson IM (1968) Fine structure of the wall and appendage formation in ascospores of Podospora anserina. J Gen Microbiol 53:89–94
Brun S, Malagnac F, Bidard F, Lalucque H, Silar P (2009) Functions and regulation of the Nox family in the filamentous fungus Podospora anserina: a new role in cellulose degradation. Mol Microbiol 74:480–496
Cano-Dominguez N, Alvarez-Delfin K, Hansberg W, Aguirre J (2008) NADPH oxidases NOX-1 and NOX-2 require the regulatory subunit NOR-1 to control cell differentiation and growth in Neurospora crassa. Eukaryot Cell 7:1352–1361
Clergeot PH, Gourgues M, Cots J, Laurans F, Latorse MP, Pepin R, Tharreau D, Notteghem JL, Lebrun MH (2001) PLS1, a gene encoding a tetraspanin-like protein, is required for penetration of rice leaf by the fungal pathogen Magnaporthe grisea. Proc Natl Acad Sci USA 98:6963–6968
Deising HB, Werner S, Wernitz M (2000) The role of fungal appressoria in plant infection. Microbes Infect 2:1631–1641
Dujon B (2005) Hemiascomycetous yeasts at the forefront of comparative genomics. Curr Opin Genet Dev 15:614–620
Egan MJ, Wang ZY, Jones MA, Smirnoff N, Talbot NJ (2007) Generation of reactive oxygen species by fungal NADPH oxidases is required for rice blast disease. Proc Natl Acad Sci USA 104:11772–11777
Espagne E, Lespinet O, Malagnac F, Da Silva C, Jaillon O, Porcel BM, Couloux A, Aury JM, Segurens B, Poulain J, Anthouard V, Grossetete S, Khalili H, Coppin E, Dequard-Chablat M, Picard M, Contamine V, Arnaise S, Bourdais A, Berteaux-Lecellier V, Gautheret D, de Vries RP, Battaglia E, Coutinho PM, Danchin EG, Henrissat B, Khoury RE, Sainsard-Chanet A, Boivin A, Pinan-Lucarre B, Sellem CH, Debuchy R, Wincker P, Weissenbach J, Silar P (2008) The genome sequence of the model ascomycete fungus Podospora anserina. Genome Biol 9:R77
Giesbert S, Schurg T, Scheele S, Tudzynski P (2008) The NADPH oxidase Cpnox1 is required for full pathogenicity of the ergot fungus Claviceps purpurea. Mol Plant Pathol 9:317–327
Gourgues M, Brunet-Simon A, Lebrun MH, Levis C (2004) The tetraspanin BcPls1 is required for appressorium-mediated penetration of Botrytis cinerea into host plant leaves. Mol Microbiol 51:619–629
Hawksworth DL (1991) The fungal dimension of biodiversity: magnitude, significance, and conservation. Mycol Res 95:641–655
Hibbett DS, Binder M, Bischoff JF, Blackwell M, Cannon PF, Eriksson OE, Huhndorf S, James T, Kirk PM, Lucking R, Thorsten Lumbsch H, Lutzoni F, Matheny PB, McLaughlin DJ, Powell MJ, Redhead S, Schoch CL, Spatafora JW, Stalpers JA, Vilgalys R, Aime MC, Aptroot A, Bauer R, Begerow D, Benny GL, Castlebury LA, Crous PW, Dai YC, Gams W, Geiser DM, Griffith GW, Gueidan C, Hawksworth DL, Hestmark G, Hosaka K, Humber RA, Hyde KD, Ironside JE, Koljalg U, Kurtzman CP, Larsson KH, Lichtwardt R, Longcore J, Miadlikowska J, Miller A, Moncalvo JM, Mozley-Standridge S, Oberwinkler F, Parmasto E, Reeb V, Rogers JD, Roux C, Ryvarden L, Sampaio JP, Schussler A, Sugiyama J, Thorn RG, Tibell L, Untereiner WA, Walker C, Wang Z, Weir A, Weiss M, White MM, Winka K, Yao YJ, Zhang N (2007) A higher-level phylogenetic classification of the fungi. Mycol Res 111:509–547
James TY, Kauff F, Schoch CL, Matheny PB, Hofstetter V, Cox CJ, Celio G, Gueidan C, Fraker E, Miadlikowska J, Lumbsch HT, Rauhut A, Reeb V, Arnold AE, Amtoft A, Stajich JE, Hosaka K, Sung GH, Johnson D, O’Rourke B, Crockett M, Binder M, Curtis JM, Slot JC, Wang Z, Wilson AW, Schussler A, Longcore JE, O’Donnell K, Mozley-Standridge S, Porter D, Letcher PM, Powell MJ, Taylor JW, White MM, Griffith GW, Davies DR, Humber RA, Morton JB, Sugiyama J, Rossman AY, Rogers JD, Pfister DH, Hewitt D, Hansen K, Hambleton S, Shoemaker RA, Kohlmeyer J, Volkmann-Kohlmeyer B, Spotts RA, Serdani M, Crous PW, Hughes KW, Matsuura K, Langer E, Langer G, Untereiner WA, Lucking R, Budel B, Geiser DM, Aptroot A, Diederich P, Schmitt I, Schultz M, Yahr R, Hibbett DS, Lutzoni F, McLaughlin DJ, Spatafora JW, Vilgalys R (2006) Reconstructing the early evolution of fungi using a six-gene phylogeny. Nature 443:818–822
Lalucque H, Silar P (2003) NADPH oxidase: an enzyme for multicellularity? Trends Microbiol 11:9–12
Lambou K, Malagnac F, Barbisan C, Tharreau D, Lebrun MH, Silar P (2008) A crucial role for the Pls1 tetraspanin during ascospore germination of the saprophytic fungus Podospora anserina. Eukaryot Cell 7:1809–1818
Lara-Ortiz T, Riveros-Rosas H, Aguirre J (2003) Reactive oxygen species generated by microbial NADPH oxidase NoxA regulate sexual development in Aspergillus nidulans. Mol Microbiol 50:1241–1255
Loganantharaj R, Atwi M (2007) Towards validating the hypothesis of phylogenetic profiling. BMC Bioinformatics 8(Suppl 7):S25
Malagnac F, Bidard F, Lalucque H, Brun S, Lambou K, Lebrun MH, Silar P (2008) Convergent evolution of morphogenetic processes in fungi: role of tetraspanins and NADPH oxidases 2 in plant pathogens and saprobes. Commun Integr Biol 1:180–181
Malagnac F, Lalucque H, Lepere G, Silar P (2004) Two NADPH oxidase isoforms are required for sexual reproduction and ascospore germination in the filamentous fungus Podospora anserina. Fungal Genet Biol 41:982–997
McLaughlin DJ, Hibbett DS, Lutzoni F, Spatafora JW, Vilgalys R (2009) The search for the fungal tree of life. Trends Microbiol 17:488–497
Miller AN, Huhndorf SM (2005) Multi-gene phylogenies indicate ascomal wall morphology is a better predictor of phylogenetic relationships than ascospore morphology in the Sordariales (Ascomycota, Fungi). Mol Phylogenet Evol 35:60–75
Schadt CW, Martin AP, Lipson DA, Schmidt SK (2003) Seasonal dynamics of previously unknown fungal lineages in tundra soils. Science 301:1359–1361
Segmuller N, Kokkelink L, Giesbert S, Odinius D, van Kan J, Tudzynski P (2008) NADPH oxidases are involved in differentiation and pathogenicity in Botrytis cinerea. Mol Plant Microbe Interact 21:808–819
Semighini CP, Harris SD (2008) Regulation of apical dominance in Aspergillus nidulans hyphae by reactive oxygen species. Genetics 179:1919–1932
Silar P (2005) Peroxide accumulation and cell death in filamentous fungi induced by contact with a contestant. Mycol Res 109:137–149
Slot JC, Hibbett DS (2007) Horizontal transfer of a nitrate assimilation gene cluster and ecological transitions in fungi: a phylogenetic study. PLoS ONE 2:e1097
Takemoto D, Tanaka A, Scott B (2007) NADPH oxidases in fungi: diverse roles of reactive oxygen species in fungal cellular differentiation. Fungal Genet Biol 44:1065–1076
Veneault-Fourrey C, Barooah M, Egan M, Wakley G, Talbot NJ (2006a) Autophagic fungal cell death is necessary for infection by the rice blast fungus. Science 312:580–583
Veneault-Fourrey C, Lambou K, Lebrun MH (2006b) Fungal Pls1 tetraspanins as key factors of penetration into host plants: a role in re-establishing polarized growth in the appressorium? FEMS Microbiol Lett 256:179–184
Veneault-Fourrey C, Parisot D, Gourgues M, Lauge R, Lebrun MH, Langin T (2005) The tetraspanin gene ClPLS1 is essential for appressorium-mediated penetration of the fungal pathogen Colletotrichum lindemuthianum. Fungal Genet Biol 42:306–318
Webster J (2007) Introduction to fungi, 3rd edn. Cambridge University Press, UK

Chapter 20

Evolution and Historical Biogeography of a Song Sparrow Ring in Western North America

Michael A. Patten

Abstract The Song Sparrow, Melospiza melodia (Aves: Emberizidae), exhibits a greater degree of geographic variation than does any other North American bird species. Detailed morphological work has demonstrated that a subset of the 25 diagnosable subspecies forms a classic ring species in the western United States. The ring’s center is the Sierra Nevada and Mojave Desert in California and adjacent Nevada, and its connecting point is in southeastern California, where an olive and black subspecies of the coastal slope interbreeds sporadically with a gray and rufous subspecies of the arid interior. However, song differences associated with habitat segregation lead to assortative mating between the two subspecies that meet in the Coachella Valley at the southern base of San Gorgonio Pass. Moving clockwise around the ring from the connecting point, one finds a gradation of subspecies that become paler, rustier, and grayer. Standard models of ring species evolution imply the connecting point is the region occupied most recently, in this case after sparrows would have spread southward down either side of the mountains and desert. This scenario is plausible given molecular evidence of a glacial refugium on the Queen Charlotte Islands, British Columbia, suggesting that ancestral birds could have moved south in this pattern. By contrast, another postulated refugium is what is now the arid desert of southeastern California or northeastern Baja California, Mexico. This refugium’s location, coupled with a recent meta-analysis of North American hybrid zones that identifies the San Gorgonio Pass region as an ancestral contact zone of coastal and desert fauna, implies that the connecting point is the region occupied earliest, an alternative that would mean the Song Sparrow ring differs fundamentally from one that would have evolved via the standard model. Biogeographical and morphological data support the latter, more radical interpretation, but genetic, vocal, ecological, and behavioral data are needed around the ring to determine conclusively which model is best supported.

M.A. Patten
Oklahoma Biological Survey and Department of Zoology, University of Oklahoma, 111 E. Chesapeake Street, Norman, OK 73019, USA
e-mail: [email protected]

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_20, © Springer-Verlag Berlin Heidelberg 2010

20.1 Ring Species as a Biogeographic Pattern

A concrete bridge between microevolution and macroevolution, including speciation, continues to elude evolutionary biologists (Mayr 1982; Jablonski 2000; Reznick and Ricklefs 2009). Some researchers have concluded that macroevolution is no more than the accumulated effects of microevolution (Hansen and Martins 1996; Simons 2002), whereas others have concluded that macroevolution requires a fundamentally different mechanism (Stanley 1998; Erwin 2000). Ring species may prove to be that crucial bridge (Irwin et al. 2001b). A ring species consists of multiple subspecies whose contiguous geographic ranges encircle a geographic barrier and whose terminal subspecies behave as good biological species where their ranges meet (Cain 1954; Irwin and Irwin 2002; Coyne and Orr 2004). Subspecies around the ring that connect the terminal subspecies grade into each other to form a continuous set of intermediate forms. Because reproductive isolation evolves in the face of gene flow, Mayr (1942:180) referred to ring species as “the perfect demonstration of speciation”, and Cain (1954:141) referred to them as “the clearest evidence of geographical speciation”. But as Coyne and Orr (2004:102) noted, ring species do not demonstrate geographical (=allopatric) speciation but rather speciation that occurs “through the attenuation of gene flow with distance”. Thus, ring species remain a key to understanding the evolution of reproductive isolation and, therefore, of speciation, and they demonstrate how “small changes can lead to species-level differences” (Irwin et al. 2001b).

Lost or conflated in this argument about whether ring species are examples of geographic speciation is a clear distinction between pattern and process. To fit the pattern of a ring species, three conditions must be met (Irwin and Irwin 2002; Joseph et al. 2008; Patten and Pruett 2009): (1) geographic ranges of neighboring subspecies must meet, (2) phenotype and genotype of neighbors must exhibit the effects of intergradation, except for (3) the two subspecies that form the terminal points, which must exhibit a sharp break in phenotype, genotype, ecology, and behavior, enough so that these subspecies behave as good biological species where their ranges meet. Few proposed ring species meet these criteria (Irwin et al. 2001b; Coyne and Orr 2004), and even a weaker criterion that replaces (1) and (2) above, namely that “a series of progressively intermediate forms must be arranged in a ring” (Patten and Pruett 2009), still excludes many of the proposed ring species. Regardless, if a geographically variable species were found to fit the above criteria, it would be fair to dub it a ring species, irrespective of how the pattern came to be. It also seems fair to conclude that the pattern of phenotypic variation exhibited by a ring species demonstrates that the microevolutionary processes that lead to population differentiation are akin to the processes that lead to speciation, whatever differences there are being only a matter of degree (Irwin et al. 2001b).
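The three pattern criteria can be read as a simple checklist applied to each contact zone around the ring. The following Python sketch is one way to make that checklist explicit; it is an illustration only and not a method used in the studies cited. The class `Contact`, the function `check_ring_pattern`, and the subspecies labels A to D are hypothetical names, and each criterion is reduced to a yes/no judgment that in practice would rest on morphological, genetic, ecological, and behavioral data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """Hypothetical summary of one contact zone between ring neighbors."""
    ranges_meet: bool   # criterion (1): geographic ranges meet
    intergrades: bool   # criterion (2): phenotype and genotype intergrade
    sharp_break: bool   # criterion (3): pair behaves as good biological species


def check_ring_pattern(ring_order, contacts, terminal_pair):
    """Return a list of problems; an empty list means the pattern criteria hold."""
    n = len(ring_order)
    if n < 3:
        return ["a ring needs at least three subspecies"]
    problems = []
    terminal = frozenset(terminal_pair)
    # Neighbors are consecutive entries, with the last wrapping to the first.
    pairs = [frozenset((ring_order[i], ring_order[(i + 1) % n])) for i in range(n)]
    if terminal not in pairs:
        problems.append("terminal subspecies are not neighbors in the ring")
    for pair in pairs:
        label = "/".join(sorted(pair))
        contact = contacts.get(pair)
        if contact is None or not contact.ranges_meet:
            problems.append(f"{label}: ranges do not meet")
            continue
        if pair == terminal:
            if not contact.sharp_break:
                problems.append(f"{label}: terminal pair lacks a sharp break")
        elif not contact.intergrades or contact.sharp_break:
            problems.append(f"{label}: neighbors do not intergrade")
    return problems


# Placeholder ring of four subspecies; A and D are the terminal taxa.
ring = ["A", "B", "C", "D"]
contacts = {
    frozenset(("A", "B")): Contact(True, True, False),
    frozenset(("B", "C")): Contact(True, True, False),
    frozenset(("C", "D")): Contact(True, True, False),
    frozenset(("D", "A")): Contact(True, False, True),
}
print(check_ring_pattern(ring, contacts, ("A", "D")))  # prints []
```

A case such as the tsetse fly, in which the terminal points do not meet, would be represented by setting `ranges_meet` to `False` for the terminal contact, and the function would then report that pair as failing criterion (1).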

20.2 The Evidence for Ring Species

Whether any claimed ring species fits all three criteria outlined above is debatable or unlikely (Coyne and Orr 2004; Martens and Päckert 2007; Joseph et al. 2008). For example, Irwin et al. (2001b) and Irwin and Irwin (2002) reviewed 23 ring species reported in the scientific literature. Almost all were found wanting in some way, often because reproductive isolation of the terminal points had not been studied but sometimes because gene flow around the ring was unlikely or was known not to occur. In the case of the tsetse fly, Glossina morsitans, the terminal points did not meet in sympatry. Even the two most widely studied examples of putative ring species, the salamander Ensatina eschscholtzii (Stebbins 1957; Wake and Yanev 1986; Wake 2006; Kuchta et al. 2009) and the warbler Phylloscopus trochiloides (Mayr 1942; Irwin et al. 2001a, 2005), do not meet the criteria fully (Coyne and Orr 2004; Martens and Päckert 2007), although they nonetheless display enough characteristics to be considered ring species by most evolutionary biologists. Just over half of the examples of ring species Irwin et al. (2001b) considered pertained to bird species, although they did not consider Mayr’s (1942) examples of the Zosterops white-eyes in the Lesser Sunda Islands or the Pernis honey buzzards in the Philippines, to say nothing of Stejneger’s (in Jordan 1905) speculation regarding Lanius shrikes around the Baltic Sea. Perhaps there are no additional pertinent data on these systems. To these examples can be added two avian ring species described recently: the Willow Warbler (Phylloscopus trochilus) complex encircling the Baltic Sea (Bensch et al. 2009) and subspecies of the Song Sparrow (Melospiza melodia) encircling the Sierra Nevada and Mojave Desert of the southwestern United States (Patten and Pruett 2009).
The Willow Warbler varies in plumage color, body size, AFLPs (amplified fragment length polymorphisms), microsatellite markers, and migratory behavior to the extent that it “shares many features with the classic examples of ring species”, albeit one that evolved recently relative to nearly all other examples (Bensch et al. 2009). The Song Sparrow varies considerably in plumage color and pattern around the ring (Table 20.1), with phenotypically intermediate populations present in all contact zones, implying gene flow and intergradation where ranges meet (Fig. 20.1; Patten and Pruett 2009). The terminal points are two subspecies (the pale, rufescent M. m. fallax of the desert Southwest and the dark, olivaceous M. m. heermanni of southern and central California) that meet in the Coachella Valley, which lies between San Gorgonio Pass and the Salton Sea. The terminal taxa hybridize only rarely; instead, there is evidence that females choose mates assortatively, males respond more strongly to their own subspecies’ songs, and song structure is shaped by habitat structure, which differs between the subspecies (Patten et al. 2004b). Although genetic variation has not yet been studied around the ring, the terminal taxa differ in frequency of microsatellite markers, and these differences are associated with plumage differences (Patten et al. 2004b). Moreover, a recent study of Song Sparrows along the whole of the Pacific Coast, from the western Aleutian Islands of Alaska to southernmost California, found, in many cases, that microsatellite variation and plumage variation (subspecies) were correlated significantly (Pruett et al. 2008; cf. Zink 2010). This finding suggests that a detailed genetic survey around the ring holds the promise of yielding a pattern that corroborates the pattern evident in the analysis of plumage variation.

Table 20.1 Patterns of phenotypic variation around the Song Sparrow Melospiza melodia ring in western North America

Subspecies | Mantle color | Mantle fringe | Underparts | Streak color | Streak fringe | Malar | Supercilia
heermanni | Grayish olive-brown | Gray, thin | White | Fuscous | Ruddy | Reddish fuscous | Ashy
gouldii | Reddish olive-brown | Absent | White | Black | Olive | Blackish | Ashy
cleonensis | Dark reddish brown | Gray, thin | Grayish | Dark brown | Brown | Fuscous | Grayish
montana | Grayish brown | Gray, broad | White | Brown | Chestnut | Chestnut brown | Whitish
fallax | Brownish gray | Reddish gray, broad | White | Warm brown | Chestnut | Chestnut | Whitish

Fig. 20.1 The Song Sparrow (Melospiza melodia) ring in western North America (from Patten and Pruett 2009). The northwestern portion of the center of the ring is the Sierra Nevada, the tallest mountain range in the conterminous United States. The remainder of the gap is the Mojave Desert (southern California) and southern Great Basin desert (southern Nevada). The large lake in southeastern California is the Salton Sea, which sits at the southern edge of where the terminal taxa meet, and San Gorgonio Pass lies at the northwestern edge of Coachella Valley

20.3 Models for the Evolution of Ring Species

The two recently proposed ring species need more study, but at the least the criteria for establishing the pattern appear to have been met as convincingly as in the two better-studied examples of Ensatina eschscholtzii and Phylloscopus trochiloides. But determining that a species or subspecies complex fits the pattern of a ring species is only half the battle. How a ring pattern came to be is a question of process, and the stringent criteria Coyne and Orr (2004:103) set forth for determining if a ring species is valid focused equally on process and pattern. Although these authors agreed that criterion (1) above must hold, they modified (2) to state that geographic continuity must have been present always; i.e., no geographic barriers to gene flow could have existed in the past, during ring formation. They further imposed two criteria related to the process by which the ring formed: (A) there must be historical information that the ring was formed by a single population (i.e., not from two or more genetically distinct lines), with all subspecies around the ring descended from that single line, and (B) one of the terminal points must be represented by a population that expanded its range most recently. Criterion (A) may be justified if we wish to hold up a ring species as a solid example of speciation either in the face of gene flow or with geographic distance. Criterion (B), by contrast, implies that the ring must have formed in a certain way, which ignores other plausible ways in which a ring could evolve.

The model inherent in criterion (B) is consistent with the first model put forth for the evolution of a ring species (Stejneger, in Jordan 1905), a half-century before the term “ring species” was coined. In one of several published responses to Jordan’s review of geographic speciation, Stejneger postulated that two subspecies might breed in sympatry, but only under specific circumstances. Using Lanius shrikes in northern Europe as an example, he asked readers to imagine that two trajectories of range expansion split from a common stock in Asia, with one heading west through central Europe to reach the Scandinavian Peninsula by way of Denmark and the other heading northwest through Finland to colonize the Scandinavian Peninsula from the north. The ranges of these subspecies would meet in the southern part of the peninsula. Stejneger (p. 552) proposed that “it is then not unnatural to conclude that in the specimens meeting there the characters might have become so fixed that the two forms would react on each other as two distinct species, though at their original dividing line they might still remain in the imperfectly differentiated stage”. This scenario corresponds with the classic conceptual model of how a ring forms (Fig. 20.2, “classical I”; Martens and Päckert 2007). An alternative model (Fig. 20.2, “classical II”) yields the same pattern and still invokes forming a ring that would meet criterion (B).

Fig. 20.2 Competing models for the evolution of a species ring. The “classical I” model corresponds to Leonard Stejneger’s (in Jordan 1905) conception of how a ring formed (see also Martens and Päckert 2007). A ring may also form in the classical sense by encircling the geographic barrier back to the starting point (see Kuchta et al. 2009 for similar examples). The “in situ” model relies on repeated, simultaneous ecological speciation, whereas the “ecological divergence” model combines aspects of a classical ring model (e.g., differentiation during range expansion) with ecological speciation

Using current snapshots to distinguish between various iterations of these “classical” models can be challenging (Kuchta et al. 2009), but alternative models that would yield the pattern of a ring species and conform to conceptual specifications of the “ring species hypothesis” (sensu Joseph et al. 2008) have not been explored. Yet there are alternative models in which a ring pattern evolves by means of a process that retains the concept’s emphasis on divergence with gene flow, a possibility increasingly recognized as plausible (Nosil 2008; Thorpe et al. 2008; Milá et al. 2009). One such model is a simple scenario invoking in situ divergence across various ecotones around a ring (Fig. 20.2), with divergence being especially pronounced across one moderately steep, but not too steep, environmental gradient (Doebeli and Dieckmann 2003; Leimar et al. 2008). Taxa on either side of this gradient diverge by the process of ecological speciation, “the evolution of reproductive isolation between populations by divergent natural selection arising from differences between ecological environments” (Schluter 2009). These taxa become the terminal points of the ring. Because geographic ranges were always and are still continuous, and intergradation persists at other contact points where gradients are shallower, a ring species pattern forms in the face of gene flow.

Another model for the evolution of a ring species also invokes ecological speciation across an environmental gradient (Fig. 20.2, “ecological divergence”). In this case, ranges expand around a geographic barrier, just as in the classical models; however, ranges split initially from the parent population across an ecotone with a moderately steep gradient, an area conducive to divergence (Endler 1977). As ranges expand around either side of the barrier, time elapsed at the initial branch point is sufficient for divergence to occur there, but the expanding front does not diverge at this same rate. Indeed, the two fronts remain undifferentiated enough that when the fronts meet, the populations interbreed readily, forming a broad hybrid zone of secondary contact. The end result would again be a ring species pattern in the face of gene flow. The chief differences from the classical models are that terminal points occur at an ecotone and are at the opposite end of the ring from where the expanding fronts met.

It is important to note that a variety of other scenarios may lead to a ring species pattern. For example, a species may have spread from multiple glacial refugia and in doing so formed multiple zones of secondary contact (Bensch et al. 2009). Or a set of subspecies may have arisen by a process of vicariant (allopatric) divergence, but all barriers between resultant forms have since eroded, leaving a ring of connected forms with intergradation where ranges meet (Joseph et al. 2008). We therefore ought to predict the existence of a ring species pattern in situations that cannot teach us about speciation in the face of gene flow, an oft-cited hallmark of the ring species hypothesis. Such examples only add to the abundant evidence for allopatric speciation, although they will prove suitable for studies of the maintenance of geographic variation in the face of gene flow (e.g., hybrid zone dynamics; Barton and Hewitt 1989).

20.4 Evolution of the Song Sparrow Ring

The Song Sparrow currently ranges across North America, with populations occurring north to southwestern Alaska and to southern Canada east to Newfoundland and contiguous populations south to northwestern Mexico. There are also geographically isolated populations on the Channel Islands and Islas Coronados off of California and Baja California, respectively, and at various locales in mainland Mexico, south to the Trans-Mexican volcanic belt (Patten and Pruett 2009). So wide a geographic range may hinder interpretation of the evolution of geographic variation. We thus need to consider whether the species was always so widespread or, more likely, if the species expanded its range considerably in the wake of the most recent glaciation 12,000 ybp. In the case of the Song Sparrow, two genetic analyses (Zink and Dittmann 1993; Fry and Zink 1998) identified two or three Pleistocene refugia, respectively. That is, extant populations of the Song Sparrow carry a genetic signature that implies range expansion away from either two or three regions that harbored the species’ ancestors during the last glacial maximum (Fig. 20.3; see Sommer and Zachos 2009). Two refugia identified by mtDNA restriction sites (Zink and Dittmann 1993) were Newfoundland and the Queen Charlotte Islands, British Columbia (Fig. 20.3). Because Newfoundland was covered by a sheet of ice, it seems an implausible site for a refugium. This concern was alleviated by a follow-up study of mtDNA sequence (Fry and Zink 1998), who found evidence for a “model of Song Sparrow

336

M.A. Patten

Fig. 20.3 Approximate extent of the North American ice sheets during the last glacial maximum (Ehlers and Gibbard 2004). On the basis of mitochondrial DNA restriction sites and sequences (Zink and Dittmann 1993; Fry and Zink 1998), three glacial refugia (dashed circles) for the Song Sparrow (Melopsiza melodia) have been proposed. A fourth (solid circle) was proposed initially but later discarded

population history involving multiple Pleistocene refugia and colonization of some formerly glaciated regions from multiple sources”. Their study identified three refugia: the Queen Charlotte Islands, the Atlantic Coast of the northeastern United States, and, likely, southern California (Fig. 20.3). Southern California was considered a likely location for a refugium, but it could not be identified conclusively because sample size was small. Nevertheless, a genetic survey across a suite of terrestrial vertebrate taxa – but not including the Song Sparrow – identified southeastern California as a Pleistocene refugium (Waltari et al. 2007), lending support to Fry and Zink’s (1998) finding. Waltari et al. (2007) also presented evidence for a refugium in the central or southern Baja

20

Evolution and Historical Biogeography of a Song Sparrow Ring

337

California peninsula, a location Fry and Zink (1998) could not have detected because they lacked samples of the Song Sparrow from the peninsula. The Baja California peninsula nonetheless corresponds to a common Pleistocene refugium incorporated into a meta-analysis of North American hybrid zones (Fig. 20.4; Swenson and Howard 2005). That the sparrow occurs currently in all three (or four, if we include Baja California as separate from southern California) putative refugia (Fig. 20.3) raises the possibility of future screening for ancestral haplotypes, preferably in the nuclear genome. The issue of hybrid or contact zones is an additional crucial consideration when piecing together the evolutionary and biogeographic history of the Song Sparrow. The contact zone of the terminal points of the sparrow ring occurs in the Coachella Valley, at the southeastern base of San Gorgonio Pass (Fig. 20.1). The San

Fig. 20.4 Proposed routes of range expansion away from glacial refugia (squares) in North America (after Swenson and Howard 2005)


M.A. Patten

Gorgonio Pass divides the north end of the north–south Peninsular Ranges from the east–west Transverse Ranges and is an area of faunal transition (Patten et al. 2004a; Leavitt et al. 2007). It has been identified as a “hot spot” for phylogeographic breaks (Swenson and Howard 2005), locations where there are deep splits in phylogenetic history. The Transverse Ranges themselves figure prominently in phylogenetic breaks: animal taxa (invertebrate and vertebrate) either north or south of that line of mountains tend to be in separate phylogenetic clusters (Calsbeek et al. 2003; Burns et al. 2007), further emphasizing the prominence of the San Gorgonio Pass region as a contact zone hot spot. That the terminal points of the Song Sparrow ring occur in this region of faunal transition is likely not a coincidence. If we accept that the Song Sparrow’s ancestors persisted in a glacial refugium in southern California or in Baja California and spread north from there (Figs. 20.3 and 20.4), a split in the expanding fronts of the geographic range would be at the San Gorgonio Pass. The moderately steep environmental gradient in the pass – from a Mediterranean climate at the northwest end to an extreme desert climate at the southeast end – is conceivably ideal for ecological speciation. If speciation occurred while the expanding fronts differentiated, via isolation by distance, enough to be recognized as subspecies but not enough to yield reproductive isolation, then the result would be a true ring species that evolved by a process that best fits the “ecological divergence” model (Fig. 20.2). Conversely, Lapointe and Rissler (2005) examined congruent phylogeographies across California of seven vertebrates, an invertebrate, and a plant and found general patterns that corresponded broadly to the ranges of the subspecies of the Song Sparrow that constitute the ring (Fig. 20.1). 
If these regions, each of which has a distinct environment (i.e., general climate and vegetation), tend to promote divergence via an ecological speciation model, then the San Gorgonio Pass still might be the site of speciation when other contact zones represent areas where locally adapted populations meet. Such a scenario would yield a true ring species, but one that evolved by means of the “in situ” model (Fig. 20.2). Morphologically, the California subspecies of the Song Sparrow form a distinct group, as do the subspecies in the desert Southwest and the mesic Pacific Northwest (Patten and Pruett 2009). It therefore seems unlikely that postglacial range expansion was solely from the Queen Charlotte refugium, a requisite for the ring to conform to a “classical I” model (Fig. 20.2). Evolution by means of a “classical II” model may be more likely, if the ancestral taxon expanded north to encircle the Sierra Nevada and Mojave Desert counterclockwise, yet such a pattern would not jibe with general tracks of postglacial expansion in other species (Fig. 20.4; Swenson and Howard 2005). Moreover, the subspecies M. m. rivularis of Baja California Sur is morphologically most like M. m. fallax of the Sonoran Desert, one of the terminal points of the ring; indeed, they are nearly identical in plumage – the principal difference is the diagnostically longer bill of M. m. rivularis (Patten and Pruett 2009). If phenotype corresponds to evolutionary relatedness and the Pleistocene refugium was in the Baja California peninsula, then the ancestral form expanded northward only on the east side of the peninsula, an unlikely scenario


given presumably spotty suitable habitat in the far more xeric portion of Baja California east of the Peninsular Ranges.

20.5 Conclusions

Morphological variation in the Song Sparrow in the southwestern United States creates a ring species pattern around the Sierra Nevada and Mojave Desert (Patten and Pruett 2009). A detailed study of two subspecies that differ most strikingly in plumage implies that they are terminal points of the ring (Patten et al. 2004b). These subspecies meet at the base of the San Gorgonio Pass, a well-known area of faunal transition (Leavitt et al. 2007). Yet prima facie evidence suggests that neither of the classical models for the evolution of a ring species (Fig. 20.2) holds in this case. A glacial refugium for the Song Sparrow likely existed in the desert Southwest (Fry and Zink 1998), and postglacial range expansion from this region tended to be of a northward trajectory (Swenson and Howard 2005). It thus would appear that an “ecological divergence” model is the most plausible. This model requires ecological speciation of M. m. heermanni and M. m. fallax, the terminal points, across the San Gorgonio Pass while the species expanded its range northward on either side of the Sierra Nevada and Mojave Desert (Fig. 20.5). At this stage an “in situ” model cannot be eliminated, and distinguishing between these models requires detailed genetic, ecological, and behavioral research around the ring. Even so, Occam’s razor would argue in favor of the “ecological divergence” model, if only because it invokes ecological speciation (or subspeciation) at only one location instead of a minimum of four (the number of contact zones between Song Sparrow subspecies that form the ring). There are additional wrinkles in the formation of the Song Sparrow ring. For example, M. m. cleonensis is morphologically intermediate between subspecies in the “California group” and those in the “Alaska and Pacific Northwest group” (sensu Patten and Pruett 2009). 
I suggest that this intermediacy reflects a historical merging of a northward expanding front from the refugium in southern California and the southward expanding front from the Queen Charlotte Islands. That M. m. montana, the northern “cap” to the species ring, shares characters of both California and “Eastern” subspecies also implies extensive gene flow, but it remains to be determined whether eastward and southward fronts merged to leave a ring species pattern without divergence in the face of gene flow or by distance. Only in-depth studies that combine morphology, genetics (especially nuclear DNA), ecology, and geological history will be able to distinguish among the various models for the evolution of a ring species or to confirm the “ring species hypothesis” (Joseph et al. 2008; Bensch et al. 2009). Regardless, an important starting point for any investigation of a putative ring species is full consideration of all plausible models that could have led to a ring species’ evolution, not just an


Fig. 20.5 Hypothesized postglacial expansion of the Song Sparrow (Melospiza melodia) from an identified (but nonetheless postulated) glacial refugium in the Sonoran Desert (dashed circle). Such range expansion would yield a ring species pattern, but in this species’ case the terminal points are in the vicinity of the San Gorgonio Pass, meaning the ring evolved by a combination of “divergence by distance” and ecological speciation (the “ecological divergence” model of Fig. 20.2), a process heretofore not considered in studies of ring species

expectation of conformity to classical models. Consideration of alternative models not only promises to provide deeper insight in how ring species evolve but also promises to build a stronger bridge between micro- and macroevolution.


Acknowledgments

I thank Pierre Pontarotti for the opportunity to speak at the 13th Evolutionary Biology Meeting and Axelle Pontarotti for her excellent guidance both pre and post meeting. John T. Rotenberry, Leonard Nunney, and Marlene Zuk advised during early stages of this study, and Christin L. Pruett has been a sounding board during later stages. I am grateful to Lukas F. Keller and his research group and colleagues at Universität Zürich for their feedback following my September 2008 seminar there. Brenda D. Smith-Patten has been a limitless source of support throughout this research; she also helped prepare Fig. 20.2 and commented on a draft of this chapter.

References

Barton NH, Hewitt GM (1989) Adaptation, speciation, and hybrid zones. Nature 341:497–503
Bensch S, Grahn M, Müller N, Gay L, Åkesson S (2009) Genetic, morphological, and feather isotope variation of migratory Willow Warblers show gradual divergence in a ring. Mol Ecol 18:3087–3096
Burns KJ, Alexander MP, Barhoum DN, Sgariglia EA (2007) Statistical assessment of congruence among phylogeographic histories of three avian species in the California Floristic Province. Ornithol Monogr 63:96–109
Cain AJ (1954) Animal species and their evolution. Princeton University Press, Princeton, NJ
Calsbeek R, Thompson JN, Richardson JE (2003) Patterns of molecular evolution and diversification in a biodiversity hotspot: the California Floristic Province. Mol Ecol 12:1021–1029
Coyne JA, Orr HA (2004) Speciation. Sinauer Assoc, Sunderland, MA
Doebeli M, Dieckmann U (2003) Speciation along environmental gradients. Nature 421:259–264
Ehlers J, Gibbard PL (2004) Quaternary glaciations – extent and chronology, part 2: North America. Elsevier, Amsterdam
Endler JA (1977) Geographic variation, speciation, and clines. Princeton Monogr Pop Biol 10:1–246
Erwin DH (2000) Macroevolution is more than repeated rounds of microevolution. Evol Dev 2:78–84
Fry AJ, Zink RM (1998) Geographic analysis of nucleotide diversity and Song Sparrow (Aves: Emberizidae) population history. Mol Ecol 7:1303–1313
Hansen TF, Martins EP (1996) Translating between microevolutionary process and macroevolutionary patterns: the correlation structure of interspecific data. Evolution 50:1404–1417
Irwin DE, Irwin JH (2002) Circular overlaps: rare demonstrations of speciation. Auk 119:596–602
Irwin DE, Bensch S, Price TD (2001a) Speciation in a ring. Nature 409:333–337
Irwin DE, Irwin JH, Price TD (2001b) Ring species as bridges between microevolution and speciation. Genetica 112–113:223–243
Irwin DE, Bensch S, Irwin JH, Price TD (2005) Speciation by distance in a ring species. Science 307:414–416
Jablonski D (2000) Micro- and macroevolution: scale and hierarchy in evolutionary biology and paleobiology. Paleobiology 26(suppl):15–52
Jordan DS (1905) The origin of species through isolation. Science 22:545–562
Joseph L, Dolman G, Donnellan S, Saint KM, Berg ML, Bennett ATD (2008) Where and when does a ring start and end? Testing the ring-species hypothesis in a species complex of Australian parrots. Proc Biol Sci 275:2431–2440
Kuchta SR, Parks DS, Mueller RL, Wake DB (2009) Closing the ring: historical biogeography of the salamander ring species Ensatina eschscholtzii. J Biogeogr 36:982–995
Lapointe F-J, Rissler LJ (2005) Congruence, consensus, and the comparative phylogeography of codistributed species in California. Am Nat 166:290–299
Leavitt DH, Bezy RL, Crandall KA, Sites JW Jr (2007) Multi-locus DNA sequence data reveal a history of deep cryptic vicariance and habitat-driven convergence in the desert night lizard Xantusia vigilis species complex (Squamata: Xantusiidae). Mol Ecol 16:4455–4481
Leimar O, Doebeli M, Dieckmann U (2008) Evolution of phenotypic clusters through competition and local adaptation along an environmental gradient. Evolution 62:807–822
Martens J, Päckert M (2007) Ring species – do they exist in birds? Zool Anz 246:315–324
Mayr E (1942) Systematics and the origin of species. Columbia University Press, New York
Mayr E (1982) Speciation and macroevolution. Evolution 36:1119–1132
Milá B, Wayne RK, Fitze P, Smith TB (2009) Divergence with gene flow and fine-scale phylogeographical structure in the wedge-billed Woodcreeper, Glyphorynchus spirurus, a neotropical rainforest bird. Mol Ecol 18:2979–2995
Nosil P (2008) Speciation with gene flow could be common. Mol Ecol 17:2103–2106
Patten MA, Pruett CL (2009) The Song Sparrow as a ring species: patterns of geographic variation, a revision of subspecies, and implications for speciation. System Biodivers 7:33–62
Patten MA, Erickson RA, Unitt P (2004a) Population changes and biogeographic affinities of the birds of the Salton Sink, California/Baja California. Studies Avian Biol 27:24–32
Patten MA, Rotenberry JT, Zuk M (2004b) Habitat selection, acoustic adaptation, and the evolution of reproductive isolation. Evolution 58:2144–2155
Pruett CL, Arcese P, Chan YL, Wilson AG, Patten MA, Keller LF, Winker K (2008) Concordant and discordant signals between genetic data and described subspecies of Pacific coast Song Sparrows. Condor 110:359–364
Reznick DN, Ricklefs RE (2009) Darwin’s bridge between microevolution and macroevolution. Nature 457:837–842
Schluter D (2009) Evidence for ecological speciation and its alternative. Science 323:737–741
Simons AM (2002) The continuity of microevolution and macroevolution. J Evol Biol 15:688–701
Sommer RS, Zachos FE (2009) Fossil evidence and phylogeography of temperate species: ‘glacial refugia’ and post-glacial recolonization. J Biogeogr 36:2013–2020
Stanley SM (1998) Macroevolution: pattern and process. Johns Hopkins University Press, Baltimore
Stebbins RC (1957) Intraspecific sympatry in the lungless salamander Ensatina eschscholtzii. Evolution 11:265–270
Swenson NG, Howard DJ (2005) Clustering of contact zones, hybrid zones, and phylogeographic breaks in North America. Am Nat 166:581–591
Thorpe RS, Surget-Groba Y, Johansson H (2008) The relative importance of ecology and geographic isolation for speciation in anoles. Phil Trans R Soc Lond B Biol Sci 363:3071–3081
Wake DB (2006) Problems with species: patterns and processes of species formation in salamanders. Ann Mo Bot Gard 93:8–23
Wake DB, Yanev KP (1986) Geographic variation in allozymes in a “ring species”, the plethodontid salamander Ensatina eschscholtzii of western North America. Evolution 40:702–715
Waltari E, Hijmans RJ, Peterson AT, Nyári AS, Perkins SL, Guralnick RP (2007) Locating Pleistocene refugia: comparing phylogeographic and ecological niche model predictions. PLoS ONE 2(7):e563
Zink RM (2010) Drawbacks with the use of microsatellites in phylogeography: the Song Sparrow Melospiza melodia as a case study. J Avian Biol 41:1–7
Zink RM, Dittmann DL (1993) Gene flow, refugia, and evolution of geographic variation in the Song Sparrow (Melospiza melodia). Evolution 47:717–729

Chapter 21

Cave Bear Genomics in the Paleolithic Painted Cave of Chauvet-Pont d’Arc

Céline Bon and Jean-Marc Elalouf

Abstract Caves are reservoirs of fossils, some of which belong to species now extinct. Paleogenetics explores ancient DNA that may have survived in these fossils to better understand the phylogeny of Pleistocene species and the paleoenvironment. The Chauvet-Pont d’Arc Cave, which displays the earliest known human drawings, contains thousands of animal remains, making this cave a rich resource for genetic analysis. We focused on the extinct cave bear, Ursus spelaeus, and showed that Chauvet-Pont d’Arc samples still contain enough DNA for genetic studies. One of them yielded well-preserved DNA and allowed us to sequence the complete cave bear mitochondrial genome. We used this molecular information to establish bear phylogeny and the tempo of Ursidae speciation. Widening our analysis to cave bear samples from Chauvet-Pont d’Arc and a closely located cave, we showed that the Pleistocene ursine population was highly homogeneous at the regional level.

21.1 The Chauvet-Pont d’Arc Cave, a Well-Preserved Paleolithic Site

21.1.1 The Earliest Rock Art Recorded to Date

In 1994, the three cavers Jean-Marie Chauvet, Eliette Brunel, and Christian Hillaire made a major discovery in the field of archeology: they found a cave containing hundreds of Paleolithic rock art pictures. This cave, located near Vallon-Pont d’Arc (Ardèche, Southeastern France) at the entrance of the Ardèche Gorge, is now named Chauvet-Pont d’Arc after one of its discoverers, Jean-Marie Chauvet.

C. Bon and J.-M. Elalouf, CEA, IBiTec-S, F-91191 Gif-sur-Yvette cedex, France

P. Pontarotti (ed.), Evolutionary Biology – Concepts, Molecular and Morphological Evolution, DOI 10.1007/978-3-642-12340-5_21, © Springer-Verlag Berlin Heidelberg 2010


Since some of the pictures were drawn with charcoal, dating analysis was possible using the radiocarbon method. Several paintings returned a radiocarbon age between 30,000 and 32,000 years Before Present (BP), which makes them about twice as old as the age currently proposed for the Lascaux Cave paintings. The Chauvet-Pont d’Arc rock art comprises the oldest Paleolithic drawings known to date (Valladas et al. 2001). The cave displays three kinds of rock art pictures: charcoal- and ochre-made drawings and engravings. As dating is only feasible for charcoal-made pictures, some of the other pictures might be older than 32,000 years BP. The cave also contains other remains of human occupation. The track of a male child was found in a deep part of the cave, in the Gallery of the Crosshatches. During his trip, the child regularly rubbed his torch against the wall, leaving numerous sooty marks. These marks were radiocarbon dated to 26,000 years BP (Garcia 2005). Huge hearths were found in other cave sectors and were most probably used by Paleolithic artists for the production of charcoal pencils. The cave also contains about 20 flint tools as well as an ivory assegai point (Geneste 2005). Other anthropogenic features, such as stone blocks grouped together by humans or a cave bear skull deposited on a large rock, remain enigmatic. Due to the rich overall archeological content and, especially, the great age of the rock art pictures, the Chauvet-Pont d’Arc Cave has been protected since the very day of its discovery (Baffier 2005). As soon as they saw the first rock art pictures, the three discoverers took care to protect the ancient soil. Afterwards, footbridges were installed throughout the cave. Access to the cave is restricted to a handful of people who are granted authorization by the prefect. A permanent watch was set up to detect microbial pollution as well as changes in the local climate. Even scientific research is strictly monitored to ensure preservation of the site. 
Thus, there are only two short study campaigns each year, no more than 12 people are allowed inside the cave, no direct contact with the archeological remains or the walls is allowed, and the retrieval of samples requires special authorization from the curator (Baffier 2005). Despite these constraints, the cave provides a unique basis for scientific research because its preserved state gives us access to a Paleolithic site untouched since the entrance of the cave collapsed some 20,000 years ago.

21.1.2 The Chauvet-Pont d’Arc Cave, a Bear Cave

Even without such anthropogenic remains, Chauvet-Pont d’Arc would still have been a major paleontological discovery since it displays thousands of animal remains, most of which consist of Ursus spelaeus bones (Fig. 21.1) (Fosse and Philippe 2005). Among the 3,844 bones scattered over the ground, 3,703 are ascribed to the cave bear. The brown bear (Ursus arctos) has been identified through a single skull, which contrasts with the 200 cave bear skulls that are present in Chauvet-Pont d’Arc. Other species, such as the wolf, the extinct cave hyena, fox, ibex, and deer, are represented by a few specimens. Canidae coprolites and footprints are also present in the cave.


Fig. 21.1 Topography of the Chauvet-Pont d’Arc Cave. Blue areas correspond to places with cave bear wallows; purple circles indicate cave bear footprints; thick green lines on walls indicate that cave bear claw marks are present. Radiocarbon ages are given as years BP. Topography: Y. Le Guillou and F. Maksud. Paleontological data: P. Fosse and M. Philippe


But the cave is not only a bear grave, for it also displays much evidence of occupation by live animals. The ground is warped by the numerous wallows in which bears used to hibernate; the walls are scratched by claw marks and polished by their roaming; bear footprints can be seen in every chamber. Whereas the brown bear is still an extant species, the cave bear became extinct about 25,000 years ago (Pacher and Stuart 2009). Ursus spelaeus was a robustly built bear that weighed 200 kg more than the sturdiest extant bears, i.e., the Kodiak and polar bears. Sexual dimorphism was strong, as was intraspecific variability (Kurtén 1976). It is currently estimated that the cave bear was confined to Europe, even though cave-bear-like bears that may belong to some cave bear subspecies were found in Crimea, the Caucasus, and Siberia (Knapp et al. 2009). It has been considered that the cave bear was mostly herbivorous, but two recent studies (Richards et al. 2008; Peigne et al. 2009) showed that it was omnivorous at least during the prehibernation period. Since the cave bear is an extinct species, its phylogenetic relationship with other bears was long known only through paleontological data. The direct ancestor of the cave bear is considered to be Ursus deningeri, because Ursus spelaeus succeeds Ursus deningeri without discontinuity (Mazza and Rustioni 1994). It is estimated that the transition between the two species occurred around the beginning of the last interglacial, but drawing a limit between these two chronospecies may be awkward (Argant 2001). Views diverge about the origins of the Ursus arctos and the Ursus spelaeus lineages. Whereas most paleontologists assume that these two lineages emerged from Ursus etruscus, Mazza and Rustioni proposed that Ursus etruscus is a dead end, and that Ursus deningeri appeared among extremely polymorphic Ursus arctos lineages. 
This issue was first addressed in 1994 by analyzing mitochondrial DNA fragments from Pleistocene remains (Hanni et al. 1994). This initial study and subsequent work (Loreille et al. 2001) yielded sequence data for the mitochondrial control region and cytochrome b (CYTB) gene. However, when we initiated our studies the information available consisted of less than 10% of the mitochondrial genome. As increasing evidence suggests that long sequences are necessary to obtain robust phylogenies and to accurately date the divergence events between lineages (Rohland et al. 2007), a complete cave bear mitochondrial genome sequence was highly desirable (Bon et al. 2008).

21.2 Sequencing the Mitochondrial Genome of the Extinct Cave Bear

21.2.1 The Challenge of Retrieving and Sequencing Ancient DNA

The study of ancient DNA is tricky. Although enzymatic processes continuously repair DNA in the living cell, endogenous nucleases and exogenous fungi or bacteria begin degrading DNA as soon as an organism dies. Under rare circumstances (such as rapid desiccation or adsorption on a mineral matrix), the DNA may escape


the onslaught, so that chemical processes remain its only source of deterioration (Hofreiter et al. 2001b; Paabo et al. 2004). Thus ancient DNA is scarce and displays a number of chemical alterations. This has several consequences. The length of the DNA molecules is reduced by strand breaks. In addition, depurination and crosslinking between strands or between a DNA strand and another molecule impede PCR amplification. As the initial amount of ancient DNA is extremely low, the amplification stage is sensitive to contamination, not only from modern DNA but also from previously amplified products. Another problem is the deamination of cytosine and adenine, leading to mutations such as T instead of C, and G instead of A in the retrieved sequence. Finally, the samples often contain a variety of organic molecules that may act as PCR inhibitors. This prevents the use of a large amount of extract in the PCR mix. Considering the care taken to protect the Chauvet-Pont d’Arc Cave from contamination, we turned to it to select an eligible cave bear sample for the sequencing of the mitochondrial genome. After screening several samples, we chose US18 because of its biomolecular preservation. It still contained enough collagen for radiocarbon dating, and the extent of amino-acid racemization was quite low. After DNA extraction, a 117 bp mitochondrial sequence was amplified over a wide range of extract amounts (from 0.1 to 2%), which shows that we retrieved large amounts of DNA and few PCR inhibitors. Since independent replication is required in ancient DNA studies, another group of investigators from another institute performed extraction and analysis. The same primer pair and a second, overlapping pair were used, and both confirmed the sequence initially obtained. Both extracts were employed in the subsequent experiments.
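The damage pattern described above can be illustrated with a short Python sketch. It is not part of the original study: the function name, the category labels, and the toy sequences are assumptions made for illustration. It tallies mismatches between a reference and an aligned clone read, separating the substitutions expected from cytosine deamination (C to T, or G to A when the damaged base sat on the complementary strand) from those expected from adenine deamination (A to G, or T to C) and from all other changes.

```python
from collections import Counter

# Substitutions (reference base -> observed base) expected from deamination.
# Complementary changes are included because damage on either strand can
# end up in the sequenced clone.
CYTOSINE_DEAMINATION = {("C", "T"), ("G", "A")}
ADENINE_DEAMINATION = {("A", "G"), ("T", "C")}


def classify_mismatches(reference, read):
    """Count mismatch classes between two aligned, equal-length sequences."""
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for ref_base, obs_base in zip(reference.upper(), read.upper()):
        # Ignore matches, gaps, and ambiguous bases.
        if ref_base == obs_base or ref_base not in "ACGT" or obs_base not in "ACGT":
            continue
        pair = (ref_base, obs_base)
        if pair in CYTOSINE_DEAMINATION:
            counts["cytosine_deamination_like"] += 1
        elif pair in ADENINE_DEAMINATION:
            counts["adenine_deamination_like"] += 1
        else:
            counts["other"] += 1
    return counts


# Toy example with made-up sequences (not data from the study).
reference = "ACGTACGTAC"
clone = "ATGTACGTGC"
print(classify_mismatches(reference, clone))
```

A tally of this kind cannot by itself tell damage from genuine polymorphism, since both produce transitions; that is why replication across independent amplifications matters.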

21.2.2 Obtaining the Complete Cave Bear Mitochondrial Sequence

When this analysis began, only a few fragments of the cave bear mitochondrial genome were known: a portion of the control region had been sequenced from several samples (Hanni et al. 1994; Hofreiter et al. 2002, 2007; Orlando et al. 2002; Rohland et al. 2004). A single gene, namely CYTB, had been characterized throughout its coding region from one sample found in the Balme-à-Collomb Cave (Loreille et al. 2001). We designed an iterative experimental strategy to determine the cave bear mitochondrial genome. First, we aligned the mitochondrial genomes of the extant brown bear (Ursus arctos), polar bear (Ursus maritimus), and American black bear (Ursus americanus) (Delisle and Strobeck 2002). From this alignment, conserved regions were identified and used to design a first series of primers for amplifying DNA fragments ranging from 100 to 200 bp. These 147 primer pairs spanned the entire genome. Only 64 primer pairs out of 147 succeeded; the 83 failures may result from mispairing between the template cave bear DNA and the primers. As a consequence, in


the following rounds, we used the sequence obtained from previous runs to design cave bear specific primers. In the end, nine rounds were required and we successfully used 245 primer pairs. To avoid contamination, pre-PCR steps were done in a dedicated laboratory facility, in a building free from molecular biology research. Each primer pair was designed to amplify DNA fragments shorter than 200 bp. For each fragment, at least two PCR amplifications were performed. As differences caused by ancient DNA damage were usually detected, a third amplification was often carried out, and the consensus sequence was retained. In the worst case scenario, this strategy is expected to lead to a 0.06% error rate (Hofreiter et al. 2001a). PCR products were cloned and a minimum of 12 colonies were sequenced on both strands. In the end, 570 successful PCR amplifications and more than 14,000 sequencing reactions were required to cover the entire mitochondrial genome. To check the accuracy of the sequence, we analyzed each fragment individually by BLAST to validate that the best GenBank match was an Ursidae sequence. Specifically, we verified that previously analyzed cave bear mitochondrial sequences (control region and CYTB gene) displayed the best BLAST score with our analogous sequences. The control region sequence of the US18 cave bear belongs to the B haplotype as defined in Orlando et al. (2002) and is identical to that of the Scladina Cave samples SC3500 and SC3800. Our CYTB sequence and the published one differ by only four transitions (0.35% of all CYTB nucleotides), two of them being located at the third base position of codons. Furthermore, as the two specimens belong to different mitochondrial haplotypes, these differences may reflect intraspecific polymorphism. We obtained a 16,810 bp long mitochondrial genome, which is in the range of the extant Ursidae mitochondrial genomes. These genomes vary in length between 16,723 bp (Ursus maritimus) (Arnason et al. 
2002) and 17,044 bp (Ursus thibetanus formosanus). The variation of the mitochondrial genome length is mainly due to a domain of the control region, which displays a highly variable number of repeats of a 10 bp motif (Yu et al. 2007). This domain is longer than 200 bp and therefore cannot be retrieved through a single PCR from ancient cave bear extracts. Thus, we designed two primer pairs to target the 5′ and the 3′ ends of the domain. Afterwards, all fragments were assembled into a 350 bp repeat sequence. Another group has sequenced a second cave bear mitochondrial genome from a sample found in Gamssulzen cave, Austria (Krause et al. 2008). This sample is a 44,000-year-old bone and its sequence belongs to the D haplogroup as defined in Orlando et al. (2002). The experimental strategy was slightly different from ours as they used a two-step multiplex PCR approach. As we did, they confirmed their data by at least two independent amplifications, cloning of the PCR product, and sequencing of multiple clones. The two cave bear sequences are very similar. Without taking into account the 350 bp repeat region, 16,227 bp out of 16,448 are identical. As expected, the 221 differences are mostly transitions (216) rather than transversions (5), giving a transition/transversion ratio of 43.2. As these two sequences belong to different haplogroups, it is not surprising that they display 1.3% differences.
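The figures quoted for the comparison of the two cave bear genomes can be checked with a few lines of Python. This is a minimal sketch: the `compare` helper is a hypothetical function showing how transitions and transversions would be counted from two aligned sequences, while the numbers at the bottom are the counts reported in the text, not recomputed from the sequences themselves.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """True when a != b and both bases are purines or both are pyrimidines."""
    return a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)


def compare(seq_a, seq_b):
    """Return (compared sites, transitions, transversions) for aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if is_transition(a, b):
            transitions += 1
        else:
            transversions += 1
    return sites, transitions, transversions


# Counts reported in the text for the two cave bear genomes
# (350 bp repeat region excluded).
compared_sites = 16_448
reported_transitions = 216
reported_transversions = 5

differences = reported_transitions + reported_transversions
ts_tv_ratio = reported_transitions / reported_transversions
percent_divergence = 100 * differences / compared_sites

print(differences)                    # 221
print(round(ts_tv_ratio, 1))          # 43.2
print(round(percent_divergence, 1))   # 1.3
```

The reported values are internally consistent: 16,448 minus 221 leaves the 16,227 identical positions given in the text.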


Our aim was to determine the phylogenetic position of the cave bear, especially with respect to the two main brown bear lineages (Taberlet and Bouvet 1994). As only one brown bear mitochondrial genome had been published, we decided to sequence the mitochondrial genome of a brown bear belonging to the western lineage. We analyzed a submodern bone sample from a French Pyrenean site (Guzet, Ariège, France). This work was conducted in a third building and after the cave bear mitochondrial genome had been obtained, to avoid cross-species contamination. The same experimental strategy was followed, except that the first series of primers (designed on a brown bear sample) was already highly specific, and that, as submodern DNA is still well conserved, fewer primer pairs were needed (only 52 primer pairs). As for the cave bear sequence, each PCR was performed at least twice, several clones were sequenced, and the consensus sequence was checked using BLAST.
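The replication strategy used for both genomes (several clones per amplification, at least two amplifications per fragment, consensus retained) can be sketched as a majority-rule consensus. The sketch below is an illustration under stated assumptions rather than the authors' actual procedure: the `consensus` function, the strict-majority threshold, and the use of `N` for unresolved positions are all choices made here, and the clone sequences are invented.

```python
from collections import Counter


def consensus(reads, min_fraction=0.5):
    """Majority-rule consensus of aligned, equal-length reads.

    A position is called only if one base is carried by more than
    `min_fraction` of the reads; otherwise it is reported as 'N'.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must be aligned to the same length")
    called = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count / len(reads) > min_fraction else "N")
    return "".join(called)


# Toy clone sequences from two hypothetical amplifications of one fragment.
amplification_1 = ["ACGTTACG", "ACGTTACG", "ATGTTACG"]  # one clone carries C->T
amplification_2 = ["ACGTTACG", "ACGTTGCG", "ACGTTACG"]  # one clone carries A->G

per_pcr = [consensus(amplification_1), consensus(amplification_2)]
print(per_pcr)             # ['ACGTTACG', 'ACGTTACG']
print(consensus(per_pcr))  # ACGTTACG
```

With this threshold, two amplifications that disagree at a site yield `N` there, which corresponds to the situation in which a third amplification would be needed to settle the base.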

21.2.3 Resolving the Phylogeny of the Extinct Cave Bear

In order to obtain the Ursidae phylogeny, we aligned the cave bear and the Pyrenean brown bear mitochondrial sequences (EU327344 and EU497665, respectively) with sequences retrieved from GenBank for other bear species, using the MEGA 4.0.2 alignment tool with the default parameters. The giant panda was set as an outgroup. The domain of the control region containing the 10 bp repeat motif was removed prior to the phylogenetic analyses. First, we tested the mutational saturation of our dataset, in order to check that homoplasy remains low and does not alter the results. We calculated the patristic distance using Patristic software (Fourment and Gibbs 2006) and plotted the genetic distance against the patristic distance. These distances are almost equal, indicating that mutational saturation is weak and that few reversions affect the dataset. We also calculated the transition/transversion ratio, which is equal to 19:1. As this ratio is rather high, it confirms that saturation is rare. Phylogenetic trees were reconstructed from this dataset by Neighbor Joining (NJ), Maximum Parsimony (MP), and Maximum Likelihood (ML), using the PhyML (Guindon and Gascuel 2003) and MEGA 4.0.2 (Tamura et al. 2007) software, as appropriate. PhyML was run with a GTR + Γ4 substitution model with some invariable sites, and for the NJ reconstruction method, we used the Tamura 3-parameter model and the gamma-distribution shape parameter estimated with PhyML. The robustness of the phylogenetic trees was estimated with the bootstrap method (1,000 replicates for NJ and MP, 100 replicates for ML). Almost the same topology was recovered whatever the algorithm used (Fig. 21.2). The only difference concerns the relationships among Ursus thibetanus subspecies. Our results confirm the basal position of the spectacled bear (i.e., Tremarctos ornatus) (Waits et al. 1999; Yu et al. 2004, 2007; Pages et al. 2008). Ursinae is a monophyletic group in which Melursus ursinus is the most basal bear. 
Then Ursinae split into two clades, one leading to Ursus spelaeus, Ursus arctos, and Ursus maritimus and the other leading to Ursus thibetanus, Ursus americanus, and Helarctos malayanus. Whereas the first group is highly robust (all bootstrap values equal 100%), the second one has weaker statistical support. Moreover, this clade is not always recovered when shorter datasets are analyzed (Talbot and Shields 1996a; Waits et al. 1999; Yu et al. 2004, 2007; Bon et al. 2008; Pages et al. 2008). As most of the internal branches are very short, we conclude that ursine speciation was very rapid. Because of this radiation, it is difficult to retrieve the branching order, except for the brown-polar-cave bear clade. Relationships within this group are always consistent and are supported by maximal bootstrap values. The cave bear stands as a sister species to the brown and polar bear clade. The brown bear is paraphyletic with respect to Ursus maritimus, as the polar bear emerges from the western brown bear lineage (Talbot and Shields 1996b). Therefore, mitochondrial genome data disagree with Mazza and Rustioni’s late speciation hypothesis and confirm that the cave bear and brown bear lineages split before the radiation of the brown bear species.

Fig. 21.2 Molecular phylogeny inferred from complete mitochondrial genomes. Tree reconstruction was performed by NJ analysis using the giant panda (Ailuropoda melanoleuca) as an outgroup. The same tree topology was obtained using two other methods, except for the relationships between Ursus thibetanus subspecies. Bootstrap values are indicated for NJ (regular), MP (bold), and ML (italic) analyses. The two sequences from this study are displayed in bold. GenBank accession numbers for the other sequences are: Ailuropoda melanoleuca, FM177761, EF212882, EF196663, and AM711896; Tremarctos ornatus, FM177764 and EF196665; Melursus ursinus, EF196662; Ursus thibetanus, EF1966362, EF667005, FM177759, EF587265, EF076773, and EF196661; Ursus americanus, AF303109; Helarctos malayanus, FM177765 and EF196664; Ursus maritimus, AF303111 and AJ428577; Ursus arctos (east), AF303110; Ursus spelaeus, FM177760

The robust phylogeny obtained with a complete mitochondrial genome offers the opportunity to evaluate the divergence times between species. We used the BEAST software (Drummond et al. 2005; Drummond and Rambaut 2007) with the complete mitochondrial genome dataset. Calibration was performed with the divergence between the giant panda and Ursidae, and between Ursinae and Tremarctinae, set at 12 ± 1 MY and 6 ± 0.5 MY (million years), respectively, assuming normal distributions. We chose a relaxed uncorrelated lognormal molecular clock, a GTR + Γ4 substitution model with some invariable sites, and a Yule process of speciation. Two independent chains of 10,000,000 points each were run, and the burn-in was set to 10,000. To highlight the benefits of analyzing long DNA sequences in molecular dating, we randomly created alignments of various lengths from whole mitochondrial genome sequences and calculated node ages using the parameters described above. Short sequences yield different node ages and wider credibility intervals than longer sequences. The alignment has to reach at least 10 kb for the node ages to stabilize. A long sequence alignment is therefore required to obtain accurate molecular dating (Bon et al. 2008).
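To give a feel for how tightly the calibration settings described above constrain the two reference nodes, the following minimal sketch computes the central 95% interval of each normal prior. It assumes that the "±" values are the standard deviations of the priors, which the text does not state explicitly. The names `CALIBRATIONS_MY` and `central_interval` are hypothetical, and the sketch does not reproduce the BEAST analysis itself.

```python
from statistics import NormalDist

# Assumption: each calibration is (mean, standard deviation) in million years.
CALIBRATIONS_MY = {
    "giant panda / Ursidae": (12.0, 1.0),
    "Ursinae / Tremarctinae": (6.0, 0.5),
}


def central_interval(mean, sd, mass=0.95):
    """Return the central interval holding `mass` of a normal prior."""
    prior = NormalDist(mu=mean, sigma=sd)
    tail = (1.0 - mass) / 2.0
    return prior.inv_cdf(tail), prior.inv_cdf(1.0 - tail)


for split, (mean, sd) in CALIBRATIONS_MY.items():
    low, high = central_interval(mean, sd)
    print(f"{split}: {low:.2f} to {high:.2f} MY")
```

Under this assumption, the priors place most of their weight within roughly two standard deviations of the stated calibration ages.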
According to the results obtained with complete mitochondrial genomes (Fig. 21.3), Tremarctinae diverged from Ursinae 6.3 MY ago, shortly before the appearance of Ursus boeckhi, the first ursine representative. The bear radiation occurred about 4 million years later, between 2 and 3 MY ago. The short time span over which five bear groups appeared explains the difficulty in determining the branching order of bears. These speciation events happened during the Pliocene, when Ursus minimus was the most common bear in Europe. As this fossil species is assumed to be the last common ancestor of Ursus spelaeus, Ursus arctos, and Ursus thibetanus, our results agree with paleontological data. We date the divergence between the arctoid and speleoid lineages to 1.6 MY, during the Villafranchian stage, when Ursus etruscus was the main bear in Europe. Most paleontologists consider that Ursus etruscus was the last common ancestor of the brown and cave bears.

Fig. 21.3 Phylogeny and divergence times determined using the mitochondrial genome sequence of the cave bear and of eight extant bears. Divergence times were calculated using BEAST software with the splits between the giant panda and Ursidae and between Ursinae and Tremarctinae set to 12 and 6 MY, respectively. Ages for each node and 95% credibility intervals are as follows: 1, 6.3 MY (5.4–7.2); 2, 3.0 MY (2.2–3.8); 3, 2.8 MY (2.1–3.5); 4, 2.4 MY (1.7–3); 5, 2.1 MY (1.4–2.7); 6, 1.6 MY (1–2.1); 7, 0.6 MY (0.3–0.8); and 8, 0.4 MY (0.2–0.5). The extinct cave bear is displayed by a picture from Chauvet-Pont d’Arc

In conclusion, our approach proved successful for sequencing the complete mitochondrial genome of a species extinct for more than 20,000 years. The cave bear mitochondrial genome shares high similarities with other bear mitochondrial genomes. In addition, the phylogenetic analysis robustly confirms that the cave bear is a sister species to the brown and polar bear clade. The amount of data obtained made it possible to evaluate the tempo of bear history during the Pliocene and Pleistocene and to compare our conclusions with paleontological ones.

The cave bear mitochondrial genome sequence opens up possibilities for advancing DNA analysis of extinct bears. First, this sequence will help rescue poorly preserved samples by targeting different regions of the mitochondrial genome. We studied Chauvet-Pont d’Arc bear samples that failed to yield any DNA when analyzed for the mitochondrial control region. We targeted 112 bp in the 16S gene and obtained successful amplification for 48% of the 23 samples, instead of 17% when the control region was queried. Second, sequence data provided by extant bears may not be sufficient to analyze DNA sequences of species that existed before Ursus spelaeus, such as Ursus deningeri. The availability of the cave bear mitochondrial genome is expected to provide a better template for exploring very ancient bear species.

21.3 Genetic Diversity Among Chauvet-Pont d’Arc Cave Bears

We explored the genetic diversity of cave bears from the Chauvet-Pont d’Arc Cave by analyzing several samples from the cave. For comparison, we turned to another cave from the same area, the Deux-Ouvertures Cave. This cave is located at the end of the Ardèche Gorge, approximately 15 km from Chauvet-Pont d’Arc, and displays rock art. It also contains numerous cave bear remains and, apart from Chauvet-Pont d’Arc, is the most striking bear cave in the area. We collected 39 and 17 samples from the Chauvet-Pont d’Arc and Deux-Ouvertures caves, respectively. DNA was extracted, and we attempted to amplify a 117 bp fragment of the mitochondrial control region. Most of the Chauvet-Pont d’Arc samples (32/39) and some of the Deux-Ouvertures samples (3/17) failed to yield the queried fragment. We conclude that this fragment was no longer present or that the samples contained too many PCR-inhibitory compounds to be successfully amplified. The samples that gave positive results belong to the same haplogroup (haplogroup B) and to two different haplotypes, which we named HT1 and HT2. HT1 is also found in Scladina (AY149268, AY149267) and Gigny (AY149264) Caves (Orlando et al. 2002). HT2 differs from HT1 only at position 16,550 and is found in the Cova-Linares Cave (AY149271, AY149272) (Loreille et al. 2001). It is not surprising to find haplogroup B in these two caves since it is widespread throughout Western Europe. HT1 and HT2 were both found in Chauvet-Pont d’Arc: two samples displayed the HT2 haplotype (US08 and US21), and the five samples that yielded the HT1 haplotype are US17, US18, US19, US34, and US39. In contrast, all Deux-Ouvertures Cave samples gave the same haplotype, HT1. To verify that this homogeneity is not due to biased sampling, with different bones belonging to the same individual, we sampled five humeri from five different individuals. We obtained the HT1 sequence for each of them, confirming that HT1 is widespread in this cave. Thus, we observed high genetic homogeneity within the bear population of each cave, as well as from one cave to another. This provides evidence for frequent female genetic exchange along the Ardèche Gorge and contrasts with the hypothesis of a highly subdivided cave bear population (Hofreiter et al. 2002, 2007).
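The homogeneity reported above can be put into numbers with a standard summary statistic, Nei's haplotype diversity, H = n/(n − 1) · (1 − Σ pᵢ²). This measure is not used in the chapter; it is brought in here purely as an illustration. The counts below are read from the text (five HT1 and two HT2 samples at Chauvet-Pont d’Arc), and the figure of 14 typed Deux-Ouvertures samples is an assumption derived from 17 collected minus 3 failures. The function name `haplotype_diversity` is hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two typed samples")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Counts taken from the text; 14 = 17 collected - 3 failures (assumption).
chauvet = ["HT1"] * 5 + ["HT2"] * 2
deux_ouvertures = ["HT1"] * 14

print("Chauvet-Pont d'Arc:", round(haplotype_diversity(chauvet), 3))
print("Deux-Ouvertures:", haplotype_diversity(deux_ouvertures))
```

A value of zero means every typed sample carries the same haplotype. With so few samples, such figures are only indicative.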


In parallel, several Chauvet-Pont d’Arc samples were dated and returned radiocarbon ages between 37,300 ± 340 years BP and 29,560 ± 160 years BP. Most of them range from 30,000 to 32,000 years BP, indicating that cave bears were present at Chauvet-Pont d’Arc for a relatively brief period of time. It is worth noting that the Scladina and Cova-Linares samples, which belong to the HT1 and HT2 haplotypes, display approximately the same age as the Chauvet-Pont d’Arc samples. The Scladina bones belong to an archeological layer estimated at 40,000–45,000 years, and the Cova-Linares bones are from a 35,000-year-old layer.

In conclusion, the genetic studies carried out in Chauvet-Pont d’Arc provided a complete mitochondrial genome for the extinct cave bear, which enabled us to obtain robust phylogenetic trees for Ursidae. The amount of data also offers the opportunity to evaluate the divergence dates between species and to compare genetic and paleontological results. Widening our studies to several samples from this cave and from another cave allowed us to explore the genetic diversity of the area. We established that the mitochondrial genetic landscape in two caves 15 km apart in the Ardèche Gorge is almost homogeneous. As other bear caves lie along the river, extending this analysis to additional sites may allow a more precise description of the genetic pattern of the area. This study also demonstrates that well-preserved DNA still remains in the Chauvet-Pont d’Arc Cave and establishes this painted cave as a reservoir for ancient DNA research. Other species from the Chauvet-Pont d’Arc Cave can now be analyzed to better characterize the Pleistocene environment.

References

Argant A (2001) Los antepasados del oso de las cavernas. Cad Lab Xeol Laxe 26:9
Arnason U, Adegoke JA, Bodin K, Born EW, Esa YB, Gullberg A, Nilsson M, Short RV, Xu X, Janke A (2002) Mammalian mitogenomic relationships and the root of the eutherian tree. Proc Natl Acad Sci USA 99:8151–8156
Baffier D (2005) La Grotte Chauvet: conservation d’un patrimoine. Bulletin de la société préhistorique française 102:11–16
Bon C, Caudy N, de Dieuleveult M, Fosse P, Philippe M, Maksud F, Beraud-Colomb E, Bouzaid E, Kefi R, Laugier C, Rousseau B, Casane D, van der Plicht J, Elalouf JM (2008) Deciphering the complete mitochondrial genome and phylogeny of the extinct cave bear in the paleolithic painted cave of Chauvet. Proc Natl Acad Sci USA 105:17447–17452
Delisle I, Strobeck C (2002) Conserved primers for rapid sequencing of the complete mitochondrial genome from carnivores, applied to three species of bears. Mol Biol Evol 19:357–361
Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214
Drummond AJ, Rambaut A, Shapiro B, Pybus OG (2005) Bayesian coalescent inference of past population dynamics from molecular sequences. Mol Biol Evol 22:1185–1192
Fosse P, Philippe M (2005) La faune de la grotte Chauvet: paléobiologie et anthropozoologie. Bulletin de la société préhistorique française 102:89–102
Fourment M, Gibbs MJ (2006) PATRISTIC: a program for calculating patristic distances and graphically comparing the components of genetic change. BMC Evol Biol 6:1
Garcia MA (2005) Ichnologie générale de la grotte Chauvet. Bulletin de la société préhistorique française 102:103–108
Geneste JM (2005) L’archéologie des vestiges matériels dans la grotte Chauvet-Pont-d’Arc. Bulletin de la société préhistorique française 102:135–144
Guindon S, Gascuel O (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52:696–704
Hanni C, Laudet V, Stehelin D, Taberlet P (1994) Tracking the origins of the cave bear (Ursus spelaeus) by mitochondrial DNA sequencing. Proc Natl Acad Sci USA 91:12336–12340
Hofreiter M, Jaenicke V, Serre D, von Haeseler A, Paabo S (2001a) DNA sequences from multiple amplifications reveal artifacts induced by cytosine deamination in ancient DNA. Nucleic Acids Res 29:4793–4799
Hofreiter M, Serre D, Poinar HN, Kuch M, Paabo S (2001b) Ancient DNA. Nat Rev Genet 2:353–359
Hofreiter M, Capelli C, Krings M, Waits L, Conard N, Munzel S, Rabeder G, Nagel D, Paunovic M, Jambresic G, Meyer S, Weiss G, Paabo S (2002) Ancient DNA analyses reveal high mitochondrial DNA sequence diversity and parallel morphological evolution of late pleistocene cave bears. Mol Biol Evol 19:1244–1250
Hofreiter M, Munzel S, Conard NJ, Pollack J, Slatkin M, Weiss G, Paabo S (2007) Sudden replacement of cave bear mitochondrial DNA in the late Pleistocene. Curr Biol 17:R122–R123
Knapp M, Rohland N, Weinstock J, Baryshnikov G, Sher A, Nagel D, Rabeder G, Pinhasi R, Schmidt HA, Hofreiter M (2009) First DNA sequences from Asian cave bear fossils reveal deep divergences and complex phylogeographic patterns. Mol Ecol 18:1225–1238
Krause J, Unger T, Nocon A, Malaspinas AS, Kolokotronis SO, Stiller M, Soibelzon L, Spriggs H, Dear PH, Briggs AW, Bray SC, O’Brien SJ, Rabeder G, Matheus P, Cooper A, Slatkin M, Paabo S, Hofreiter M (2008) Mitochondrial genomes reveal an explosive radiation of extinct and extant bears near the Miocene–Pliocene boundary. BMC Evol Biol 8:220
Kurtén B (1976) The cave bear story: life and death of a vanished animal. Columbia University Press, New York
Loreille O, Orlando L, Patou-Mathis M, Philippe M, Taberlet P, Hanni C (2001) Ancient DNA analysis reveals divergence of the cave bear, Ursus spelaeus, and brown bear, Ursus arctos, lineages. Curr Biol 11:200–203
Mazza P, Rustioni M (1994) On the phylogeny of Eurasian bears. Palaeontographica 230:38
Orlando L, Bonjean D, Bocherens H, Thenot A, Argant A, Otte M, Hanni C (2002) Ancient DNA and the population genetics of cave bears (Ursus spelaeus) through space and time. Mol Biol Evol 19:1920–1933
Paabo S, Poinar H, Serre D, Jaenicke-Despres V, Hebler J, Rohland N, Kuch M, Krause J, Vigilant L, Hofreiter M (2004) Genetic analyses from ancient DNA. Annu Rev Genet 38:645–679
Pacher M, Stuart AJ (2009) Extinction chronology and palaeobiology of the cave bear (Ursus spelaeus). Boreas 38:189–206
Pages M, Calvignac S, Klein C, Paris M, Hughes S, Hanni C (2008) Combined analysis of fourteen nuclear genes refines the Ursidae phylogeny. Mol Phylogenet Evol 47:73–83
Peigne S, Goillot C, Germonpre M, Blondel C, Bignon O, Merceron G (2009) Predormancy omnivory in European cave bears evidenced by a dental microwear analysis of Ursus spelaeus from Goyet, Belgium. Proc Natl Acad Sci USA 106:15390–15393
Richards MP, Pacher M, Stiller M, Quiles J, Hofreiter M, Constantin S, Zilhao J, Trinkaus E (2008) Isotopic evidence for omnivory among European cave bears: late pleistocene Ursus spelaeus from the Pestera cu Oase, Romania. Proc Natl Acad Sci USA 105:600–604
Rohland N, Siedel H, Hofreiter M (2004) Nondestructive DNA extraction method for mitochondrial DNA analyses of museum specimens. Biotechniques 36:814–816, 818–821
Rohland N, Malaspinas AS, Pollack JL, Slatkin M, Matheus P, Hofreiter M (2007) Proboscidean mitogenomics: chronology and mode of elephant evolution using mastodon as outgroup. PLoS Biol 5:e207
Taberlet P, Bouvet J (1994) Mitochondrial DNA polymorphism, phylogeography, and conservation genetics of the brown bear Ursus arctos in Europe. Proc Biol Sci 255:195–200
Talbot SL, Shields GF (1996a) A phylogeny of the bears (Ursidae) inferred from complete sequences of three mitochondrial genes. Mol Phylogenet Evol 5:567–575
Talbot SL, Shields GF (1996b) Phylogeography of brown bears (Ursus arctos) of Alaska and paraphyly within the Ursidae. Mol Phylogenet Evol 5:477–494
Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol 24:1596–1599
Valladas H, Clottes J, Geneste JM, Garcia MA, Arnold M, Cachier H, Tisnerat-Laborde N (2001) Palaeolithic paintings. Evolution of prehistoric cave art. Nature 413:479
Waits LP, Sullivan J, O’Brien SJ, Ward RH (1999) Rapid radiation events in the family Ursidae indicated by likelihood phylogenetic estimation from multiple fragments of mtDNA. Mol Phylogenet Evol 13:82–92
Yu L, Li QW, Ryder OA, Zhang YP (2004) Phylogeny of the bears (Ursidae) based on nuclear and mitochondrial genes. Mol Phylogenet Evol 32:480–494
Yu L, Li YW, Ryder OA, Zhang YP (2007) Analysis of complete mitochondrial genome sequences increases phylogenetic resolution of bears (Ursidae), a mammalian family that experienced rapid speciation. BMC Evol Biol 7:198
